WO2010135678A1 - Nucleic acids of pichia pastoris and use thereof for recombinant production of proteins - Google Patents
- Publication number
- WO2010135678A1 (PCT/US2010/035825)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- protein
- sequence
- genes
- set forth
- gene
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/37—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from fungi
- C07K14/39—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from fungi from yeasts
- C07K14/395—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from fungi from yeasts from Saccharomyces
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/67—General methods for enhancing the expression
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/80—Vectors or expression systems specially adapted for eukaryotic hosts for fungi
- C12N15/81—Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts
Definitions
- This invention relates generally to novel nucleic acids and recombinant expression technology.
- the invention relates to determination and assembly of the genome sequence of Pichia pastoris.
- the invention provides novel nucleic acids, proteins, and related expression vectors useful for genetic engineering of methylotrophic yeast strains, as well as engineered methylotrophic yeast strains particularly Pichia pastoris and use thereof for recombinant production of heterologous proteins.
- the methylotrophic yeast Pichia pastoris is by far the most often used yeast species in the production of recombinant proteins and is used in thousands of laboratories worldwide for the production of proteins for fundamental studies, as drug targets and for therapeutic use, and as a model for peroxisomal proliferation and methanol assimilation.
- the P. pastoris expression technology is available from Invitrogen (Carlsbad, CA) and from Research Corporation Technologies (Tucson, AZ), making it accessible for academic and commercial purposes alike.
- P. pastoris grows to high cell density, provides tightly controlled methanol-inducible transgene expression and efficiently secretes heterologous proteins in defined media. Indeed, several P.
- the present invention provides the determination, assembly and manually curated annotation of the 9.43 Mbp genomic sequence of the GS115 strain of Pichia pastoris.
- the invention provides isolated nucleic acid molecules that encode a protein as set forth in any one of SEQ0001-0025, 0027-0126, 0128-0165, 0172-0174, 0176-0200, or 0202-0212, or a protein substantially homologous thereto.
- the nucleic acid molecules contain the coding sequence of an ORF as set forth in any one of SEQ0001-0025, 0027-0126, 0128-0165, 0172-0174, 0176-0200, or 0202-0212, or contain a nucleotide sequence that is substantially homologous to the coding sequence of an ORF as set forth in any one of SEQ0001-0025, 0027-0126, 0128-0165, 0172-0174, 0176-0200, or 0202-0212.
- the present invention provides isolated proteins that have an amino acid sequence as set forth in any one of SEQ0001-0025, 0027-0126, 0128-0165, 0172-0174, 0176-0200, or 0202-0209, or an amino acid sequence substantially homologous thereto.
- the present invention provides 53 peptides, shown in SEQ0001-0025 and 0027-0054, which are signal peptides of secreted proteins of Pichia pastoris. Nucleic acids that encode any of these signal peptides are also parts of the present invention.
- the present invention provides a set of vectors useful for identification of the most effective choice of signal peptide for any given heterologous protein. Each vector contains a promoter, the coding sequence of one of the signal peptides identified in SEQ0001-0025 and 0027-0054, and a linker sequence or cloning site for inserting or receiving the coding sequence of the heterologous protein.
- the linker sequence or cloning site simply includes a restriction endonuclease site.
- the heterologous coding sequence, with the same restriction site at its 5' end, can be joined to the signal peptide coding sequence via restriction enzyme digestion and subsequent ligation.
- the linker sequence is an intron sequence functional in Pichia pastoris and includes a restriction endonuclease site.
- the heterologous coding sequence, with the same intron sequence and restriction site at its 5' end, can be joined to the signal peptide coding sequence via the intron sequence.
- a library of such fusion clones can be transformed into Pichia pastoris to select for the most effective choice of signal peptide for any given heterologous protein.
- the invention provides isolated nucleic acid molecules composed of a promoter sequence of any one of the Pichia pastoris genes set forth in SEQ0055-0165, 0169-0200, or 0202-0212.
- Specific promoters of the present invention include those identified for SEQ0060-0085 (genes involved in the glycolysis pathway), SEQ0096-0124 (genes showing high expression levels), SEQ0125-0128 (homologs of S. cerevisiae genes whose promoters are frequently used for recombinant expression), SEQ0169-0178 (methanol metabolism genes), and SEQ0202-0208 (genes involved in xylose, arabinose or trehalose metabolism).
- the promoters of these genes are located within the 1000 bp of the 5' region provided herein, and generally are located within the 500 bp immediately before the start codon of the gene, in some embodiments within 250 bp, 200 bp, 150 bp, 125 bp, 100 bp, 75 bp, 50 bp, 40 bp, or even 25 bp immediately before the start codon of the gene.
- the promoters include a TATA element identified herein in Table 6.
- Use of any of the newly identified promoters for expression of a heterologous gene is encompassed by the present invention, including the related expression vectors and cells transformed with any such expression vectors.
- the present invention is directed to expression vectors and engineered methylotrophic yeast strains for increased expression (or overexpression) of one or more Pichia proteins involved in the secretory pathway, e.g., those as set forth in SEQ0055-0059, in order to achieve increased protein secretion.
- the present invention is directed to expression vectors and engineered methylotrophic yeast strains for overexpression of Pichia glycosylation precursor synthesis enzymes or transporters, e.g., UDP-Gal or UDP-GlcNAc transporters and UDP-Glc-4-epimerase for ER or Golgi localization.
- Such expression vectors are constructed to contain, from 5' to 3', a promoter functional in the recipient strain, operably linked to a coding sequence as set forth in any one of SEQ0129-0132, and a transcription termination sequence.
- the encoded protein is preferably also designed to include an ER or Golgi localization signal.
- Such an expression vector can be introduced into a methylotrophic yeast strain by transformation.
- the resulting engineered strains capable of increased expression of a protein encoded by any one of SEQ0129-0132 constitute another embodiment of the invention.
- the present invention provides a methylotrophic yeast strain in which at least one (i.e., one or more) native gene encoding an O-mannosyl transferase or a beta-mannosyl transferase has been inactivated.
- the strain is a Pichia strain, preferably, a P. pastoris strain.
- An O- or beta-mannosyl transferase knockout P. pastoris strain can be generated by inactivating at least one gene as set forth in SEQ0133-0137 or SEQ0210-0212, which can reduce or eliminate unwanted O- or beta-glycosylation of a heterologous protein.
- the present invention provides a methylotrophic yeast strain in which one or more native genes encoding enzymes involved in the ER glycosylation pathway have been inactivated.
- the native yeast STT3 gene has been inactivated, and optionally the Leishmania STT3 gene has been introduced in place thereof.
- the methylotrophic yeast strain is P. pastoris in which the native Pichia gene as set forth in SEQ0163 has been inactivated, and the Leishmania STT3 gene has been introduced in place thereof.
- the present invention is directed to methylotrophic yeast strains which overexpress one or more of the MNN4 homologs as set forth in SEQ0166-0168, and related expression vectors.
- the present invention provides a protease-deficient methylotrophic yeast strain.
- the strain is a Pichia strain, e.g., a P. pastoris strain.
- the protease-deficient Pichia strain can be generated by inactivating at least one (i.e., one or more) protease-encoding genes as set forth in SEQ0179 and SEQ0181-0186, such that there is no functional protease produced from a disrupted gene.
- protease-deficient Pichia strains allow more stable accumulation of heterologous proteins.
- the present invention provides a methylotrophic yeast strain engineered to overexpress a protease inhibitor protein, encoded by the gene set forth in SEQ0180, at an elevated level compared to an unmodified strain.
- the strain is a Pichia strain, e.g., a P. pastoris strain. Expression vectors created for making such strains form another embodiment of the invention.
- the present invention is directed to expression vectors and engineered methylotrophic yeast strains for overexpression of at least one Pichia chaperone involved in secreted protein folding in the ER.
- Such expression vectors are constructed to contain, from 5' to 3', a promoter functional in the recipient strain, operably linked to a coding sequence as set forth in any one of SEQ0187-SEQ0200, and a transcription termination sequence.
- Such an expression vector can be introduced into a methylotrophic yeast strain by transformation.
- Another embodiment of the invention is directed to engineered methylotrophic yeast strains capable of overexpressing a protein encoded by any one of SEQ0187-SEQ0200.
- methylotrophic yeast strains capable of overexpressing a combination of multiple chaperones are provided, which combination can be selected as most effective for recombinant production of a particular heterologous protein.
- the present invention provides an isolated nucleic acid molecule containing the nucleotide sequence as set forth in SEQ0200, which encodes the 5S rRNA.
- Use of this nucleic acid in creating vectors to achieve multi-copy integration of a heterologous gene and generate strains having a heterologous gene stably integrated in the genome in multiple copies is also contemplated by the present invention.
- FIGS. 1A-1C Pichia pastoris genome sequencing and overview.
- 1A Genome sequencing and assembly strategy.
- 1B P. pastoris chromosomes and positions of known markers. Genes that had been previously mapped to the chromosomes through PFGE are indicated in blue and rDNA repeats in yellow; the 5S rRNA is indicated in yellow with the red arrow.
- 1C Phylogenetic tree. The phylogenetic tree was built on the concatenated sequence of 200 single-copy ortholog genes present in all 6 species. Numbers next to each branch correspond to the number of Pfam domains uniquely present in the corresponding lineage.
- FIGS 2A-2B Pichia pastoris codon usage.
- 2A Codon usage in the P. pastoris ORFeome. The relative abundance of a codon is represented as a percentage of the total codon usage for the amino acid.
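- The relative-abundance measure described for panel 2A can be illustrated with a minimal sketch. It assumes the ORFeome is available as a list of in-frame coding sequences; the function name `relative_codon_usage` and the two toy sequences are hypothetical and are not taken from the Sequence File. The codon table is the standard genetic code.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G base ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def relative_codon_usage(orfs):
    """Return {codon: percentage of all codons used for the same amino acid}."""
    counts = Counter()
    for orf in orfs:
        seq = orf.upper()
        # Walk the sequence codon by codon, ignoring any trailing partial codon.
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1
    # Total codon count per amino acid (stop codons are grouped under '*').
    per_aa = defaultdict(int)
    for codon, n in counts.items():
        per_aa[CODON_TABLE[codon]] += n
    return {c: 100.0 * n / per_aa[CODON_TABLE[c]] for c, n in counts.items()}


# Toy input (hypothetical sequences): alanine is encoded 3x by GCT and 1x by GCC.
usage = relative_codon_usage(["ATGGCTGCTGCCTAA", "ATGGCTTAA"])
```

With this toy input, GCT accounts for three of the four alanine codons and GCC for the remaining one.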
- FIGS. 3A-3B Pichia pastoris pathways.
- 3A Methanol utilisation pathway in Pichia pastoris.
- A detailed table with the genes coding for the respective enzymes is shown in Table 5A.
- 1 AOX alcohol oxidase
- 2 FLD formaldehyde dehydrogenase
- 3 FGH S-formylglutathione hydrolase
- 4 FDH formate dehydrogenase
- 5 CAT catalase
- 6 DAS dihydroxyacetone synthase
- 7 DAK dihydroxyacetone kinase
- 8 TPI triosephosphate isomerase
- 9 FBA fructose-1,6-bisphosphate aldolase
- 10 FBP fructose-1,6-bisphosphatase
- DHA dihydroxyacetone
- GAP glyceraldehyde-3-phosphate
- DHAP dihydroxyacetone phosphate
- F1,6BP fructose-1,6-bisphosphate
- the N-glycans are further processed to the yeast-typical hypermannosyl-type glycans.
- the hypermannosylation is abolished and the glycans are processed to Gal2GlcNAc2Man3GlcNAc2.
- the protein is secreted in the growth medium, where it may be a substrate for yeast proteases.
- FIGS 5A-5C Chromosome assembly. 5A. By PFGE and Southern blot detection,
- H. wingei chromosomes were used as markers for the PFGE, but they also gave a signal on the blot with the conserved c2 probe.
- FIG. 6 Distribution of gene ontology terms assigned to P. pastoris.
- a total of 4,262 P. pastoris genes were assigned with gene ontology (GO) terms: 3,142 genes with molecular function assignment, 3,647 genes with cellular component assignment and 3,182 genes with biological process assignment.
- GO gene ontology
- FIG. 7 P. pastoris secretion signals. 53 SignalP-predicted signal peptides were manually curated to be secretion signals based on the function of orthologs. The predicted site of signal peptidase cleavage is indicated by the red triangle. Alignment of these peptides shows a hydrophobic consensus sequence (poly Leu), and a small amino acid residue at positions -1 and -3 from the cleavage site.
- FIG. 8 Protease Gene Insertional Inactivation Strategy.
- a drug resistance marker or other auxotrophic marker can be used for this method.
- the present inventors have determined and assembled the 9.43 Mbp genomic sequence of the GS115 strain of P. pastoris, and manually curated the annotation of 5,313 protein-coding genes.
- the invention provides novel protein-encoding genes from Pichia pastoris, including identification of the 5' upstream region (including promoter), open reading frame (ORF) and 3' downstream region of these genes.
- the present invention also provides novel Pichia pastoris proteins, and certain signal peptides of some of these P. pastoris proteins.
- nucleic acids and encoded proteins identified herein, as well as the promoters and signal peptides can be used in engineering methylotrophic yeast strains, particularly Pichia strains, for recombinant production of proteins, including but not limited to glycoproteins having glycoforms suitable for therapeutic use in mammals especially humans.
- the determination and annotation of the genome sequence of Pichia pastoris also permit a more complete, overall understanding of Pichia pastoris in respect to its protein modification and secretion system as well as methanol metabolism, by providing a complete set of Pichia pastoris genes coding for enzymes involved in methanol assimilation, a complete catalog of Pichia pastoris orthologs to the S.
- ORF Open Reading Frame
- Coding sequence - This term is used herein to refer to a contiguous sequence of codons of the protein encoded by the ORF and does not include introns.
- 5' upstream region This term (or "5' region” or “upstream region” in abbreviation) is used herein to refer to the genomic region 5' relative to the ORF of a gene.
- the Sequence file provided herein has set forth a 5' upstream region of approximately 1000 nucleotides, which, in some cases, includes the start and/or stop codons of the previous gene.
- the 5' upstream region of a gene includes the promoter of the gene. The extent of the 5' region provided herein for each gene is sufficient for targeting recombination to this site, e.g.
- Promoter - This term refers to a portion of the 5' upstream region of a gene that directs the transcription of the gene. Promoters are located within the 1000 bp of the 5' upstream region of yeast genes, with a "TATA box” sequence most commonly located at 10-120 bp upstream from the start codon of a gene. A "TATA box” is a DNA sequence (cis-regulatory element) found in the promoter region, and has the core DNA sequence 5'-TATAAA-3' or a variant.
- a TATA box is usually located 25 base pairs upstream of the transcription start site.
- TATAA: 50% of the elements are found 60 to 90 bp upstream of ATG (75% are found 40 to 110 bp upstream of ATG); TACAA: 78% of the elements are found 50 to 70 bp upstream of ATG; TATA: 50% of the elements are found 10 to 40 bp upstream of ATG; TATATA: 50% of the elements are found 80 to 90 bp upstream of ATG; and TATATATA: 50% of the elements are found 50 to 60 bp upstream of ATG. Details of the TATA boxes of the promoters provided by the present invention are set forth in Table 6.
- the promoter of a Pichia pastoris gene provided herein is located within the 500 bp immediately before the start codon of the gene; preferably within 250 bp, 200 bp, 150 bp, 125 bp, 100 bp, 75 bp, 50 bp, 40 bp, or even 25 bp immediately before the start codon of the gene; and in particular embodiments, the promoter includes a TATA element identified herein in Table 6.
- the precise location and composition of a promoter can be determined by using well known techniques including deletion mapping and site-directed mutagenesis, as further described below.
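- As an illustration of the promoter definition above, the following minimal sketch takes a 5' upstream region that ends immediately before the start codon, keeps the last N bp as a candidate promoter window, and reports TATA-like elements with their distance from the ATG. The function names, the choice to measure distance from the first base of the motif, and the toy sequence are assumptions made for illustration; the motif list is the one named in the text, and Table 6 remains the authoritative record of the TATA elements.

```python
import re

# TATA-like motifs named in the text; shorter motifs also match inside longer ones.
TATA_MOTIFS = ["TATATATA", "TATATA", "TATAA", "TACAA", "TATA"]


def promoter_window(upstream, size=500):
    """Return the last `size` bp of a 5' upstream region that ends just before the ATG."""
    return upstream[-size:].upper()


def find_tata_elements(upstream, size=500):
    """List (motif, distance) pairs; distance is bp from the motif's first base to the ATG."""
    window = promoter_window(upstream, size)
    hits = []
    for motif in TATA_MOTIFS:
        # A lookahead pattern lets overlapping occurrences be reported as well.
        for m in re.finditer(f"(?={motif})", window):
            hits.append((motif, len(window) - m.start()))
    return sorted(hits, key=lambda h: h[1])


# Toy upstream region (hypothetical): one TATAAA element starting 66 bp before the ATG.
toy_upstream = "G" * 40 + "TATAAA" + "C" * 60
elements = find_tata_elements(toy_upstream)
```

A computational scan of this kind only proposes candidates; as the text notes, the precise location and composition of a promoter are established experimentally, for example by deletion mapping.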
- 3' downstream region This term (sometimes “downstream region” or “3' region” in abbreviation) is used herein to refer to the genomic region 3' relative to the ORF of a gene.
- the Sequence file provided herein has set forth a 3' downstream region of approximately 1000 nucleotides, which, in some cases, includes the start and/or stop codons of the next gene.
- the 3' downstream region of a gene includes the transcription termination sequence (or 3' termination sequence) of the gene. The extent of the 3' region provided herein is sufficient to target recombination to this site in the chromosome.
- Selectable marker - This term refers either to a dominant drug resistance marker or similar dominant marker, or to a more limited prototrophic selection such as HIS, applicable to a host that is defective for the key enzyme supplied by a selectable marker gene.
- Signal peptide and “mature proteins” -
- the term “signal peptide” or “signal sequence” refers to the short peptide sequence within a protein precursor synthesized in the cytoplasm that targets the precursor form to the endoplasmic reticulum. Signal peptides are typically cleaved from the precursor form by signal peptidase after the proteins are transported to the ER, and the resulting proteins move along the secretory pathway to their intracellular or extracellular location. For some proteins, cleavage of the signal peptide results in the mature form (i.e., the final, biologically active form) of the protein, while for other proteins, additional proteolytic processing may be required in order to generate the mature form of the protein.
- Substantially homologous amino acid sequences When two or more amino acid sequences are said to be substantially homologous, it is meant that the sequences share a significant degree of similarity, for example, at least 85%, 90%, 95%, 98% or even 99% similarity.
- the term "similarity" includes identity.
- Substantially homologous proteins can perform or possess substantially the same function; i.e., the enzymatic activities of the proteins differ by not more than 20%, 15%, 10%, or even 5% under a same set of conditions applicable for measuring enzymatic activity.
- a protein that is substantially homologous to a Pichia pastoris protein identified herein is, in some embodiments, a protein of methylotrophic yeast; for example, a protein of Pichia.
- Substantially homologous nucleotide sequences When two or more nucleotide sequences are said to be substantially homologous, it is meant that the sequences share a significant degree of identity, for example, at least 85%, 90%, 95%, 98% or even 99% identity. The degree of homology is also reflected by hybridization characteristics. As defined herein, a first nucleic acid sequence that is substantially homologous to a second nucleic acid sequence molecule also hybridizes to the complement of the second nucleic acid sequence under high stringency conditions.
- High stringency conditions include, for example, hybridization at 42°C in 50% v/v formamide, 1 M NaCl, and 1% w/v SDS, and washing at 65°C in 0.1-2X SSC (e.g., 0.1, 0.2, 0.5, 1 or 2X SSC) and 1% w/v SDS.
- a nucleic acid that is substantially homologous to a Pichia pastoris nucleic acid identified herein is, in some embodiments, a nucleic acid of methylotrophic yeast; for example, a nucleic acid of Pichia.
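- The percent-identity thresholds in the homology definitions above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length with "-" as the gap character, and it counts identity over aligned columns with gap columns treated as mismatches; the text does not prescribe a particular alignment method or counting convention, so both are assumptions, as are the function names. The sketch covers sequence identity only and does not model similarity scoring for amino acids or the hybridization criterion.

```python
def percent_identity(a, b):
    """Percent of aligned columns with identical residues (gapped columns count as mismatches)."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    # Drop columns where both sequences carry a gap; they hold no information.
    columns = [(x, y) for x, y in zip(a.upper(), b.upper())
               if not (x == "-" and y == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def is_substantially_homologous(a, b, threshold=85.0):
    """Apply one of the thresholds named in the text (85, 90, 95, 98 or 99 percent)."""
    return percent_identity(a, b) >= threshold


# Toy aligned pair (hypothetical): 9 of 10 positions are identical.
identity = percent_identity("ATGCATGCAT", "ATGCATGCAA")
```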
- Heterologous versus “native”/“endogenous” -
- the term “heterologous” is used herein in several different contexts to reflect the fact that a molecule is placed in a genetic, molecular or cellular environment that is different than its native environment. For example, when the promoter of a gene is being utilized to drive the expression of a different gene, the promoter will be taken out of its native genetic context and placed in an operable linkage to a heterologous gene. As another example, the signal peptide of a protein can be used to direct the localization of a different protein, i.e., a heterologous protein.
- a methylotrophic yeast strain such as Pichia can be transformed with a heterologous nucleic acid, i.e., a nucleic acid which the non-engineered Pichia strain does not have.
- the resulting engineered strain will express the protein encoded by the heterologous nucleic acid, i.e., a heterologous protein.
- Gene overexpression - Overexpression of a gene in a methylotrophic yeast is achieved by genetic modification of the yeast such that the expression of the gene is increased (measurable either at the mRNA level or the protein level), as compared to the unmodified yeast.
- the extent of increase in expression is at least 35%, or at least 50%, or preferably at least 150%, 200%, 250%, 300%, 400% or more.
- Overexpression of a gene can be achieved by introducing an additional expression cassette carrying the gene (as a plasmid vector or integrated into the chromosome), or by replacing the native promoter of the gene with a stronger constitutive or inducible promoter.
- Gene inactivation - Inactivation of a gene in a methylotrophic yeast is achieved by genetic modification of the yeast such that substantially no functional protein is produced from the strain.
- By "substantially" it is meant that the level of functional protein produced from the modified strain is not more than 20%, or 15%, or 10% or even 5% or less of the level of functional protein produced from an unmodified strain.
- Inactivation of a gene can be achieved by disrupting the genomic ORF of the gene in the strain, or disrupting the native promoter, or replacing the native promoter with a repressible promoter (e.g., repressible by methanol).
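- The quantitative thresholds in the overexpression and inactivation definitions above can be expressed as a minimal sketch. It assumes expression has been measured for the modified and the unmodified strain in the same units (mRNA or protein level); the function names and default thresholds (the least stringent values named in the text) are chosen for illustration.

```python
def percent_increase(modified, unmodified):
    """Increase in expression of the modified strain relative to the unmodified strain, in percent."""
    return 100.0 * (modified - unmodified) / unmodified


def is_overexpressed(modified, unmodified, min_increase=35.0):
    """True if expression rose by at least `min_increase` percent (35% is the lowest value named)."""
    return percent_increase(modified, unmodified) >= min_increase


def is_inactivated(modified, unmodified, max_residual=20.0):
    """True if functional protein is at most `max_residual` percent of the unmodified level."""
    return 100.0 * modified / unmodified <= max_residual
```

For example, a modified strain producing 1.5 units against 1.0 unit in the unmodified strain shows a 50% increase and meets the 35% and 50% thresholds but not the 150% threshold.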
- Methylotrophic yeast - Methylotrophic yeasts are those capable of growth on methanol, and include yeasts of the genera Candida, Hansenula (such as H. polymorpha, now classified as Pichia angusta), Torulopsis, and Pichia (e.g., Pichia pastoris, Pichia methanolica, Pichia angusta (formerly Hansenula polymorpha), Pichia stipitis, and Pichia anomala).
- Chaperones - Chaperones are proteins that assist in the non-covalent folding or unfolding of proteins, mediate the redox potential to assist the formation of disulphide bonds within or between protein subunits, assist in the assembly or disassembly of other macromolecular structures or complexes, and/or translocation of proteins across membranes.
- novel protein-encoding genes identified by the present invention have been grouped based on the functions of the encoded proteins and are discussed in detail below. Where the utility of a particular group or a particular gene is not specifically discussed, its utilities are apparent based on the function and utility of its homolog(s) from other yeast species such as S. cerevisiae which have been characterized.
- 53 genes have been identified as encoding secreted proteins with a signal peptide.
- the ORF and the amino acid sequence (with the signal peptide portion shown in bold) of each of the 53 genes are set forth in SEQ0001-0025 and 0027-0054.
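- The count of 53 follows directly from the identifier ranges: 25 entries in 0001-0025 plus 28 entries in 0027-0054. A minimal sketch that expands such range specifications is given below; the helper name `expand_seq_ranges` and the `SEQ####` output format are assumptions made for illustration.

```python
def expand_seq_ranges(spec):
    """Expand a specification such as '0001-0025, 0027-0054' into a list of SEQ identifiers."""
    ids = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        # A part is either a single number ('0163') or an inclusive range ('0001-0025').
        start, _, end = part.partition("-")
        end = end or start
        ids.extend(f"SEQ{n:04d}" for n in range(int(start), int(end) + 1))
    return ids


signal_peptide_ids = expand_seq_ranges("0001-0025, 0027-0054")
```

The same helper can be applied to the other identifier lists in this document, for example "0055-0059" for the secretion-related genes.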
- the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0001-0025 or 0027-0054, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0001-0025 or 0027-0054.
- the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0001-0025 or 0027-0054, or a nucleotide sequence that is substantially homologous to the coding sequence of an ORF as set forth in any of SEQ0001-0025 or 0027-0054.
- the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0001-0025 or 0027-0054, or an amino acid sequence substantially homologous thereto.
- the invention is directed to the signal peptides of these 53 newly identified genes and uses thereof for recombinant expression of heterologous proteins.
- These signal peptides are summarized and aligned in Figure 7. Alignment of these peptides shows a hydrophobic consensus sequence (poly Leu), and a small amino acid residue (such as A, C, G and S) at position -1 and -3 from the cleavage site.
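- The small-residue pattern at positions -1 and -3 can be checked with a minimal sketch. It assumes the cleavage site is described by the length of the signal peptide, so that position -1 is the last residue of the signal peptide; the function name and the toy precursor sequences are hypothetical and are not taken from Figure 7 or the Sequence File.

```python
# Small residues named in the text for positions -1 and -3.
SMALL_RESIDUES = set("ACGS")


def follows_small_residue_rule(precursor, signal_length):
    """True if the residues at -1 and -3 relative to the cleavage site are small (A, C, G or S).

    `signal_length` is the number of residues in the signal peptide; cleavage
    occurs after that residue, so position -1 is precursor[signal_length - 1].
    """
    if signal_length < 3 or signal_length > len(precursor):
        return False
    return (precursor[signal_length - 1] in SMALL_RESIDUES
            and precursor[signal_length - 3] in SMALL_RESIDUES)


# Toy precursors (hypothetical): a 14-residue signal peptide followed by 'QVDE'.
ok = follows_small_residue_rule("MKLLLLLLALAASA" + "QVDE", 14)      # A at -1, A at -3
not_ok = follows_small_residue_rule("MKLLLLLLALAKSA" + "QVDE", 14)  # K at -3
```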
- Proteins destined for translocation into the endoplasmic reticulum carry a signal peptide that is recognized upon ribosomal translation by the signal recognition particle, which docks to the translocon, upon which translation continues and the protein is delivered into the ER lumen.
- the signal peptide is removed by signal peptidase.
- When over-expressing a protein destined for secretion, one can use the protein's native signal peptide if the translocation and signal peptidase machinery of the host cell can efficiently recognize and then process it.
- the coding sequence of a heterologous signal peptide can be fused to the coding sequence for the mature form of the protein.
- the signal sequence which has been frequently used is the prepro signal of S. cerevisiae alpha mating factor. Whereas this signal works in many cases, processing of the propeptide can be problematic as it requires Kex2p protease cleavage followed by polishing of the newly created N-terminus by Ste13p diaminopeptidase.
- Pichia signal peptides available is particularly useful. For example, one can screen for effective signal peptide(s) for a given protein desired to be expressed in Pichia.
- a consensus artificial signal peptide can be designed which can be used to efficiently secrete multiple different heterologous proteins.
- an intron functional in Pichia (which can be selected from those shown in the Sequence File provided herein) can be cloned before the ORF or coding sequence for the target protein, wherein the intron contains either a unique restriction site or a recognition site for a recombinase.
- the same intron can be cloned behind (i.e., 3' of) each signal peptide coding sequence in the library, enabling rapid generation of an expression library of signal peptide-target ORF fusions through classical cloning using the unique restriction site or through recombinational cloning.
- the intron is removed by the Pichia splicing machinery, resulting in an expressed library of in-frame fusion between the coding sequence for one of the signal peptides and the target ORF.
- secretion of the target protein by individual members of the expressed library can be evaluated by a suitable technique such as SDS-PAGE analysis, coomassie blue staining, or Western blot. In this way, the suitability of a given Pichia signal peptide can be evaluated for secretion of any target protein.
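- The intron-mediated library construction described above can be modeled at the sequence level with a minimal sketch. Each library member is represented as signal peptide coding sequence, then the shared intron, then the target ORF; splicing removes the intron and should leave an in-frame fusion. All sequences below are invented placeholders (the intron is not one of those in the Sequence File), the names are hypothetical, and the sketch treats splicing as plain removal of one exact copy of the intron.

```python
def build_fusion(signal_cds, intron, target_orf):
    """Unspliced construct: signal peptide CDS, shared intron, then the target ORF."""
    return signal_cds + intron + target_orf


def splice(construct, intron):
    """Remove the single expected copy of the intron, mimicking splicing."""
    if construct.count(intron) != 1:
        raise ValueError("expected exactly one copy of the intron")
    return construct.replace(intron, "")


def is_in_frame(signal_cds, target_orf):
    """Both parts must be whole numbers of codons for the spliced fusion to stay in frame."""
    return len(signal_cds) % 3 == 0 and len(target_orf) % 3 == 0


# Placeholder sequences (hypothetical). The toy intron carries a GAATTC restriction site.
INTRON = "GTAAGTGAATTCTACTAACTTTAG"
TARGET_ORF = "GCTCCAGTTTAA"
signal_library = {"sp1": "ATGAAATTGCTT", "sp2": "ATGCGTTTTCTG"}

# One spliced, in-frame fusion per signal peptide in the library.
expressed_library = {
    name: splice(build_fusion(cds, INTRON, TARGET_ORF), INTRON)
    for name, cds in signal_library.items()
    if is_in_frame(cds, TARGET_ORF)
}
```

Because the restriction site lies inside the intron, it is absent from the spliced fusion, so no extra codons are introduced between the signal peptide and the target protein.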
- the present invention is directed to a library of expression vectors, wherein each of the 53 signal peptides is represented in the library.
- Each expression vector contains, from 5' to 3', a promoter functional in Pichia, operably linked to a coding sequence for a signal peptide, and an intron sequence containing a restriction endonuclease recognition site.
- the expression vector is designed to accommodate insertion of the coding sequence of a target protein, linked in its 5' to the same intron sequence containing the same restriction endonuclease recognition site on the expression vector.
- the present invention provides an expression vector capable of directing the expression and secretion of a heterologous protein in Pichia pastoris or another methylotrophic yeast.
- the expression vector contains, from 5' to 3', a promoter functional in the recipient strain, operably linked to a coding sequence for the fusion of a signal peptide (identified in SEQ0001-0025 and 0027-0054) and the heterologous protein.
- Host cells transformed with such an expression vector constitute another embodiment of the present invention.
- the signal sequences from some of the secreted proteins are not cleaved off in the ER, but remain linked to the protein. This is usual for a number of those enzymes involved in the glycosylation pathway (for example those for mannosyl transferases), for secretion chaperones, and for other factors that help secretion in a particular compartment, for example.
- These signal sequences are useful for the localization of heterologous proteins to target them to particular compartments of the secretory pathway, which may be advantageous to various aspects of posttranslational modification. This may correspond to the same or equivalent compartment in which a heterologous enzyme or protein normally acts in its native cell. Alternatively, it may be a different compartment, but, is effective because it is the same as that in which another enzyme acts or is immediately downstream or upstream of the compartment in which a second enzyme acts, effectively channeling and coordinating the sequence of a metabolic process.
- 5 genes have been identified as encoding proteins potentially involved in secretion, including P. pastoris homologs of S. cerevisiae SEC1, SEC11, and subunits SPC1, SPC2 and SPC3 of signal peptidase complex.
- P. pastoris homologs of S. cerevisiae SEC1, SEC11, and subunits SPC1, SPC2 and SPC3 of signal peptidase complex are set forth in SEQ0055 to SEQ0059.
- the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0055 to SEQ0059, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0055 to SEQ0059.
- the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0055 to SEQ0059, or a nucleotide sequence that is substantially homologous to the coding sequence of an ORF as set forth in any of SEQ0055 to SEQ0059.
- the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0055 to SEQ0059, or an amino acid sequence substantially homologous thereto.
- the invention is directed to a methylotrophic yeast strain, for example a Pichia strain such as Pichia pastoris, which overexpresses the Pichia pastoris SEC1 gene as set forth in SEQ0055.
- Expression vectors for achieving such overexpression are also contemplated by the present invention.
- the invention is directed to a methylotrophic yeast strain, for example a Pichia strain such as Pichia pastoris, which overexpresses at least one of the Pichia pastoris SEC11, SPC1, SPC2 or SPC3 genes as set forth in SEQ0056-0059.
- Expression vectors for achieving such overexpression are also provided by the present invention.
- Efficient processing of the signal peptide of a secreted protein is essential for high yield and to eliminate the presence of additional amino acids in the secreted protein. Overexpression of one or more subunits of the signal peptidase complex may increase the efficiency and the quality of the processing. To evaluate whether the efficiency of signal peptide cleavage is improved as a result of overexpression of a subunit of the signal peptidase complex, the quantity and the amino acid sequence of a heterologous secreted glycoprotein can be analyzed.
- 26 genes have been identified as encoding proteins involved in the glycolysis pathway.
- the nucleotide sequences of the 5' upstream region (including promoter), the ORF and the 3' downstream region, as well as the amino acid sequence, of each of these 26 genes are set forth in SEQ0060 to SEQ0085.
- the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0060 to SEQ0085, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0060 to SEQ0085.
- the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0060 to SEQ0085, or a nucleotide sequence that is substantially homologous thereto.
- the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0060 to SEQ0085, or an amino acid sequence substantially homologous thereto.
- the present invention is directed to the promoters of each of the genes as set forth in SEQ0060 to SEQ0085. Because glycolysis is a central metabolic pathway, the promoters of the genes encoding proteins involved in this pathway are believed to be strong promoters, which can be used for driving overexpression of heterologous genes in methylotrophic yeast such as P. pastoris. Accordingly, the present invention provides expression vectors containing any one of the promoters of the genes set forth in SEQ0060 to SEQ0085, operably linked to the coding sequence of a heterologous protein. Methylotrophic yeast strains transformed with such an expression vector are also provided by the invention.
- P. pastoris homologues of genes involved in homologous recombination
- the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0086 to SEQ0095, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0086 to SEQ0095.
- the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0086 to SEQ0095, or a nucleotide sequence that is substantially homologous thereto.
- the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0086 to SEQ0095, or an amino acid sequence substantially homologous thereto.
- the invention is directed to methylotrophic yeast strains, especially Pichia pastoris strains, wherein one or more of the genes as set forth in SEQ0086 to SEQ0095 have been inactivated. Inactivation of these genes involved in recombination may prevent out-recombination of an expression unit or cassette containing a heterologous gene integrated in the chromosome, and may therefore "lock" or stabilize the insertion, especially multicopy insertions as further discussed hereinbelow.
- 29 genes have been identified as encoding proteins and showing expression levels 20x higher than GAP1 based on microarray analysis.
- the nucleotide sequences of the 5' upstream region (including promoter), the ORF and the 3' downstream region, as well as the amino acid sequence, of each of these 29 genes are set forth in SEQ0096 to SEQ0124.
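The selection criterion stated above (expression at least 20x that of a reference gene) can be illustrated with a minimal sketch. It assumes normalized microarray intensities are available as a simple mapping; the function name, the gene names other than GAP1, and all intensity values are hypothetical and are not taken from the application.

```python
def highly_expressed(signal_by_gene, reference_gene, min_fold=20.0):
    """Return genes whose signal is at least min_fold times the reference signal."""
    reference = signal_by_gene[reference_gene]
    if reference <= 0:
        raise ValueError("reference signal must be positive")
    return sorted(
        gene
        for gene, signal in signal_by_gene.items()
        if gene != reference_gene and signal / reference >= min_fold
    )


# Hypothetical normalized intensities (arbitrary units), for illustration only.
signals = {"GAP1": 100.0, "geneA": 2500.0, "geneB": 1900.0, "geneC": 2000.0}
selected = highly_expressed(signals, "GAP1")
print(selected)
```

Here geneB falls just under the 20-fold cutoff and is excluded, while a gene at exactly 20-fold is kept; whether the boundary is inclusive is an assumption of this sketch.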
- the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0096 to SEQ0124, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0096 to SEQ0124.
- the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0096 to SEQ0124, or a nucleotide sequence that is substantially homologous thereto.
- the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0096 to SEQ0124, or an amino acid sequence substantially homologous thereto.
- the present invention is directed to the promoters of each of these 29 genes showing high expression levels. These promoters can be used for driving overexpression of heterologous genes in methylotrophic yeast such as P. pastoris. Accordingly, the present invention provides expression vectors containing any one of the promoters of the 29 genes set forth in SEQ0096 to SEQ0124, operably linked to the coding sequence of a heterologous protein. Methylotrophic yeast strains transformed with such an expression vector are also provided by the invention.
- Homologs of promoters used for expression of proteins in S. cerevisiae
- S. cerevisiae whose promoters have been used for recombinant expression in S. cerevisiae, including SEQ0125 (homolog of S. cerevisiae glycerol-3-phosphate dehydrogenase 1 (GPD1) and GPD2), SEQ0126 (homolog of S. cerevisiae alcohol dehydrogenase 1 (ADH1) and ADH2), SEQ0127 (homolog of S.
- SEQ0125 homolog of S. cerevisiae glycerol-3-phosphate dehydrogenase 1 (GPD1) and GPD2
- SEQ0126 homolog of S. cerevisiae alcohol dehydrogenase 1 (ADH1) and ADH2
- SEQ0127 homolog of S.
- SEQ0128 homolog of S. cerevisiae sulfite reductase beta subunit ECM17.
- the nucleotide sequences of the 5' upstream region (including promoter), the ORF and the 3' downstream region, as well as the amino acid sequence, of each of these 4 genes are set forth in SEQ0125 to SEQ0128.
- the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0125, SEQ0126 or SEQ0128, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0125, SEQ0126 or SEQ0128.
- the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0125, SEQ0126 or SEQ0128, or a nucleotide sequence that is substantially homologous thereto.
- the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0125, SEQ0126 or SEQ0128, or an amino acid sequence substantially homologous thereto.
- the present invention is directed to the promoters of each of these 4 genes.
- These promoters can be used for driving expression of heterologous genes in Pichia pastoris or another methylotrophic yeast species.
- Expression vectors containing any of these promoters, operably linked to a heterologous coding sequence, and methylotrophic yeast strains transformed with any such expression vectors, are contemplated by the invention.
- UDP-GlcNAc transporter SEQ0129
- UDP-glucose-4-epimerase SEQ0130
- HUT1 putative role of transporting UDP-galactose into Golgi
- SEQ0132 putative UDP-galactose transporter
- the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0129 to SEQ0132, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0129 to SEQ0132.
- the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0129 to SEQ0132, or a nucleotide sequence that is substantially homologous thereto.
- the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0129 to SEQ0132, or an amino acid sequence substantially homologous thereto.
- the present invention is directed to expression vectors to achieve increased expression in Pichia of native glycosylation precursor synthesis enzymes or transporters, e.g. UDP-Gal or UDP-GlcNAc transporters and UDP-Glc-4-epimerase for ER or Golgi localization. It is believed that increased expression of these proteins can increase the homogeneity of the final hybrid or complex type glycoform on the heterologous protein designed to be expressed in the same strain.
- the expression vectors contain the coding sequence of one of the desirable glycosylation precursor synthesis enzymes or transporters, which is placed in operable linkage to a promoter functional in recipient host cells.
- the promoters that control the expression of the precursor synthesis enzymes or transporters can be selected to be induced by similar conditions to those directing the expression of the heterologous protein.
- the invention provides methylotrophic yeast strains, especially Pichia strains, which overexpress one or more native or Pichia pastoris glycosylation precursor synthesis enzymes or transporters, e.g. UDP-Gal or UDP-GlcNAc transporters and UDP-Glc-4-epimerase for ER or Golgi localization.
- 5 genes have been identified as encoding proteins involved in O-glycosylation.
- the nucleotide sequences of the 5' upstream region (including promoter), the ORF and the 3' downstream region, as well as the amino acid sequence, of each of these 5 genes are set forth in SEQ0133 to SEQ0137.
- the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0133 to SEQ0137, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0133 to SEQ0137.
- the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0133 to SEQ0137, or a nucleotide sequence that is substantially homologous thereto.
- the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0133 to SEQ0137, or an amino acid sequence substantially homologous thereto.
- In addition to N-glycosylation, yeasts also O-glycosylate secreted proteins with oligomannosyl-glycans that differ from the mucin-type O-glycosylation in humans. No robust engineering approach had been developed to overcome this issue prior to this invention. The identification herein of the Pichia protein-O-mannosyltransferases that initiate this modification in the ER permits genetic modification of Pichia to reduce O-glycosylation.
- the present invention is directed to methylotrophic yeast strains, e.g., Pichia such as P. pastoris strains, in which one or more of the identified O-mannosyl transferase genes are inactivated to reduce or eliminate unwanted O-glycosylation of a heterologous protein.
- 18 genes have been identified as encoding mannosyltransferases, including 14 α-mannosyltransferases.
- the nucleotide sequences of the 5' upstream region (including promoter), the ORF and the 3' downstream region, as well as the amino acid sequence, of each of these 18 genes are set forth in SEQ0138-0152 and 0210-0212.
- the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0138-0152 or 0210-0212, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0138-0152 or 0210-0212.
- the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0138-0152 or 0210-0212, or a nucleotide sequence that is substantially homologous thereto.
- the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0138 to SEQ0152, or an amino acid sequence substantially homologous thereto.
- methylotrophic yeasts such as P. pastoris modify proteins with a range of heterogeneous high-mannose glycans, which introduce a large amount of heterogeneity in the protein (reducing downstream processing efficiency and complicating product characterization) and induce fast clearance from the bloodstream.
- the highly immunogenic terminal alpha-1,3-mannosyl glycotypes that are abundantly produced by S. cerevisiae are not detected on Pichia-produced glycoproteins. Consistently, no ortholog of the S. cerevisiae MNN1 gene (encoding the alpha-1,3-mannosyltransferase) has been found in the Pichia genome.
- Pichia glycoproteins can in some cases be modified with beta-1,2-mannose residues, reminiscent of antigenic epitopes on the Candida albicans cell wall.
- P. pastoris AMR2 beta-mannosyltransferase, which has been documented in the art, has been identified in the genome. 3 homologs of AMR2 beta-mannosyltransferase have also been identified (SEQ0210-0212), thus providing the basis for reducing the levels of undesired beta-mannosylation.
- the present invention is directed to methylotrophic yeast strains, particularly Pichia strains, in which one or more of the identified P. pastoris mannosyltransferase genes are inactivated, for example, the homolog genes of AMR2 beta-mannosyltransferase as set forth in SEQ0210-0212, to reduce or eliminate unwanted beta-mannosylation of a heterologous protein.
- 13 genes have been identified as encoding proteins involved in the ER glycosylation pathway.
- the nucleotide sequences of the 5' upstream region (including promoter), the ORF and the 3' downstream region, as well as the amino acid sequence, of each of these 13 genes are set forth in SEQ0153 to SEQ0165.
- the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0153 to SEQ0165, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0153 to SEQ0165.
- the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0153 to SEQ0165, or a nucleotide sequence that is substantially homologous thereto.
- the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0153 to SEQ0165, or an amino acid sequence substantially homologous thereto.
- the invention is directed to methylotrophic yeast strains, especially Pichia strains such as P. pastoris, in which one or more of the genes as set forth in SEQ0153-0165 have been inactivated.
- An alternative to modification of glycans after transfer to a protein is the modification of the glycan precursor before transfer to the protein. Inactivation of enzyme activities in the synthesis of the glycan precursor in the ER can result in glycosylation of proteins with modified glycan structures.
- the invention is directed to the use of the Pichia pastoris STT3 gene sequence as set forth in SEQ0163 to disrupt the chromosomal STT3 gene, and optionally, to further insert a heterologous STT3 gene, such as the Leishmania STT3 gene, into the Pichia pastoris STT3 locus.
- STT3 is a part of the oligosaccharyl-transferase (OT) complex which transfers the lipid linked oligosaccharide to the protein.
- Leishmania major has four STT3 paralogues, of which 3 could complement a yeast stt3 deletion.
- LmSTT3 does not work in the OT-complex but is active as a dimeric complex. It is suggested that the various LmSTT3 dimeric complexes display different protein substrate specificities at the level of individual glycosylation sequences.
- the LmSTT3D dimeric complex has a relaxed specificity with respect to the lipid linked oligosaccharide substrate.
- In contrast to the homogenous OT-complex, LmSTT3 OTase has no reduced transfer efficiency of glycans lacking α-1,2-linked mannoses on the B and C branch (ref. 12). Replacing the Pichia pastoris STT3 by LmSTT3D can provide glycosylation flexibility when the native OT-complex of Pichia pastoris is unable to transfer modified lipid-linked oligosaccharides to the protein.
- the plasmid containing the LmSTT3 expression cassette was digested with restriction endonuclease(s), excising the LmSTT3 cassette containing the PpSTT3 sequences. Transformants are selected by using the selection marker present in the expression cassette. To evaluate whether glycosylation is increased, N-glycans derived from secreted glycoproteins are analyzed by DSA-FACE capillary electrophoresis.
- the invention is directed to methylotrophic yeast strains which overexpress one or more of the MNN4 homologs to promote the core type phosphorylation of N-glycans.
- Increased phosphorylation of recombinant proteins can be useful in directing the protein for uptake through the mannose-6-phosphate receptor (ref. 13).
- Expression vectors for achieving such elevated expression are also part of the present invention.
- N-glycans derived from secreted glycoproteins after 48 hours culture in YPD medium are analyzed by DSA-FACE capillary electrophoresis.
- Man8GlcNAc2 will be drastically reduced in favor of two structures that migrate faster (compared to Man8GlcNAc2) and that are likely to contain one (P) and two (PP) phosphate residues, respectively. Assuming that both peaks derive from the Man8GlcNAc2 peak, the amount of Man8GlcNAc2 converted to phosphorylated glycans can be quantitated.
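One way to express the quantitation described above is as the share of the combined peak area found in the two faster-migrating peaks. The sketch below assumes that integrated peak areas are available from the electropherogram and that the P and PP peaks derive only from the Man8GlcNAc2 peak, as the text states; the function name and the example areas are hypothetical.

```python
def phosphorylated_fraction(man8_area, p_area, pp_area):
    """Fraction of the original Man8GlcNAc2 pool found in the P and PP peaks.

    Assumes both phosphorylated peaks derive solely from Man8GlcNAc2, so the
    pre-conversion pool equals the sum of the three peak areas.
    """
    total = man8_area + p_area + pp_area
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return (p_area + pp_area) / total


# Hypothetical integrated peak areas (arbitrary units), for illustration only.
fraction = phosphorylated_fraction(man8_area=20.0, p_area=50.0, pp_area=30.0)
print(f"{fraction:.0%} of Man8GlcNAc2 converted")
```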
- nucleotide sequences of the 5' upstream region (including promoter), the ORF and the 3' downstream region, as well as the amino acid sequence, of each of these 10 genes are set forth in SEQ0169 to SEQ0178. While some of these genes have been documented in the art, the promoters have not been identified prior to the present application.
- the commonly used methanol-inducible promoters in P. pastoris, the alcohol oxidase I (AOXI) promoter and the formaldehyde dehydrogenase (FLD) promoter, drive the production of enzymes needed for methanol assimilation and therefore produce extremely high levels of these transcripts upon switching the carbon source to methanol.
- the P. pastoris genome sequence has now allowed identification of all genes coding for enzymes involved in methanol assimilation (Fig. 3A and Table 5A) and their promoters, which are useful for driving transgene expression in P. pastoris.
- the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0172 to SEQ0174 or SEQ0176 to SEQ0178, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0172 to SEQ0174 or SEQ0176 to SEQ0178.
- the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0172 to SEQ0174 or SEQ0176 to SEQ0178, or a nucleotide sequence that is substantially homologous thereto.
- the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0172 to SEQ0174 or SEQ0176 to SEQ0178, or an amino acid sequence substantially homologous thereto.
- the present invention is directed to the promoters of the genes disclosed in SEQ0169 to SEQ0178. These promoters can be placed in an operable linkage to a heterologous gene for methanol-inducible recombinant expression in Pichia pastoris. Therefore, expression vectors, host cells and methods of recombinant expression by utilizing any of the promoters disclosed in SEQ0169 to SEQ0178 are also embodiments of the present invention.
- 8 genes have been identified as encoding protein homologs of S. cerevisiae proteases or protease inhibitors.
- the nucleotide sequences of the 5' upstream region (including promoter), the ORF and the 3' downstream region, as well as the amino acid sequence, of each of these 8 genes are set forth in SEQ0179 to SEQ0186.
- SEQ0179 sets forth a Pichia pastoris gene encoding a serine-type peptidase.
- SEQ0180 sets forth a Pichia pastoris gene encoding a serine-type endopeptidase inhibitor.
- SEQ0181-0186 set forth Pichia pastoris genes coding for aspartic-type endopeptidases.
- the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0179 to SEQ0186, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0179 to SEQ0186.
- the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0179 to SEQ0186, or a nucleotide sequence that is substantially homologous thereto.
- the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0179 to SEQ0186, or an amino acid sequence substantially homologous thereto.
- the present invention is directed to vectors for inactivating one or more of the Pichia proteases, and to Pichia strains having one or more of the protease
- the present invention is directed to expression vectors capable of expressing the serine-type endopeptidase inhibitor (SEQ0180) in a methylotrophic yeast strain such as P. pastoris, and methylotrophic yeast strains such as P. pastoris transformed with such an expression vector to produce the endopeptidase inhibitor at an elevated level.
- protease-deficient Pichia strains and strains that produce the endopeptidase inhibitor at an elevated level are believed to allow more stable accumulation of a heterologous protein expressed in these strains.
- Such strains may be useful especially for producing recombinant immunoglobulins.
- the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0187 to SEQ0200, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0187 to SEQ0200.
- the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0187 to SEQ0200, or a nucleotide sequence that is substantially homologous thereto.
- the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0187 to SEQ0200, or an amino acid sequence substantially homologous thereto.
- the present invention has now provided a complete catalog of orthologs of the S. cerevisiae ER folding machinery. This information is especially useful for design of an efficacious folding system.
- the present invention is directed to expression vectors and engineered Pichia strains for increased expression of one or more native Pichia chaperones.
- the chaperone coding sequences can be placed under the control of a promoter selected to be induced by similar conditions to those for heterologous protein expression.
- Expression libraries of multiple chaperones can be screened to identify the most effective combination of chaperone expression for a particular heterologous protein. For example, a series of libraries of chaperones can be created, with each library having a different drug resistance marker incorporated for selection, such that either successive or combinatorial introduction of members of these libraries into a strain expressing a heterologous protein may be selected.
- the P. pastoris 5S ribosomal RNA gene has been identified and set forth in SEQ0201.
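The combinatorial introduction described above can be pictured as taking one member from each marker-specific library. The following sketch only enumerates the resulting strain designs; the marker labels and chaperone names are hypothetical placeholders, not entries from the sequence listing.

```python
from itertools import product

# Hypothetical libraries, each keyed by its own drug resistance marker.
libraries = {
    "marker_1": ["chaperone_A", "chaperone_B"],
    "marker_2": ["chaperone_C", "chaperone_D", "chaperone_E"],
}


def combinatorial_strains(libs):
    """List every design that carries exactly one member from each library."""
    markers = list(libs)
    return [
        dict(zip(markers, combo))
        for combo in product(*(libs[marker] for marker in markers))
    ]


designs = combinatorial_strains(libraries)
print(len(designs), "candidate combinations")
```

Because each library has its own marker, selecting on all markers at once would retain only strains that received one member from every library, which is the design choice this enumeration assumes.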
- One strategy for optimizing protein expression levels in a recipient cell is to increase copy number of the expression cassette. This effort has been hampered by the absence of knowledge regarding sequences which occur multiple times in the Pichia genome and could be used for stable multi-copy strain generation through homologous recombination-mediated targeting of such multi-copy sequences.
- the 5S rRNA coding sequence of Pichia pastoris provides the basis for multi-copy targeting. Contrary to the situation in S. cerevisiae, the 5S rRNA coding sequence is not a part of the rDNA repeat locus in Pichia pastoris, and many copies of the 5S rRNA coding sequence are spread over the 4 chromosomes of Pichia pastoris (Fig. 1B), thus providing an ideal targeting site for multi-copy integration.
- the 5S rRNA coding sequence can be placed in a vector which also carries an expression cassette of interest and a selectable marker.
- the selectable marker can be a dominant selection marker that confers drug resistance, or a marker that confers a phenotype selectable based on prototrophy.
- a unique restriction site is made or designed to be available within the 5S rRNA coding sequence.
- the vector is linearized by using the restriction enzyme that cleaves in the 5S rRNA coding sequence.
- the linearized vector is transformed into Pichia pastoris, and drug-resistant clones are isolated at increasing drug concentrations. Those clones that are resistant to the highest drug concentrations are expected to be those that have taken up the largest number of expression cassettes.
- where the selectable marker supplies an enzyme (e.g., HIS4), the clones that grow faster in the appropriate selectable media are identified as having multicopy integration. This procedure can be repeated until an optimal number of expression cassettes in the strain is obtained. Protein production in the selected strains is evaluated using methods known in the art.
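The requirement that a restriction site be unique and lie within the 5S rRNA coding sequence can be checked computationally before a vector is built. The sketch below is a simplified illustration under stated assumptions: it scans a single strand (sufficient for the palindromic sites listed), ignores sites that span the origin of a circular vector, and uses short toy sequences in place of the real vector and SEQ0201. The function names are hypothetical.

```python
# Recognition sequences of three common restriction enzymes (all palindromic,
# so scanning one strand is sufficient).
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def find_sites(sequence, site):
    """Return the 0-based start of every (possibly overlapping) occurrence."""
    sequence, site = sequence.upper(), site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def unique_cutters(vector, target_start, target_end, enzymes=ENZYMES):
    """Enzymes cutting the vector exactly once, inside [target_start, target_end)."""
    result = {}
    for name, site in enzymes.items():
        hits = find_sites(vector, site)
        if (len(hits) == 1 and target_start <= hits[0]
                and hits[0] + len(site) <= target_end):
            result[name] = hits[0]
    return result


def linearize(circular_sequence, cut_index):
    """Open a circular sequence at cut_index and return the linear form."""
    return circular_sequence[cut_index:] + circular_sequence[:cut_index]


# Toy sequences for illustration only (not from the sequence listing).
backbone_left = "ACGTACGGATCCACGT"
target = "TTGAATTCTT"  # stands in for the 5S rRNA coding sequence
backbone_right = "ACGGATCCAA"
vector = backbone_left + target + backbone_right
start = len(backbone_left)
cutters = unique_cutters(vector, start, start + len(target))
print(cutters)
```

In this toy vector BamHI is rejected because it cuts twice in the backbone, and only the EcoRI site inside the target region qualifies.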
- the present invention also provides expression vectors capable of mediating multi-copy integration of an expression cassette onto the chromosomes of Pichia pastoris, as well as Pichia pastoris containing multiple copies of the expression cassette, stably integrated into the chromosomes.
- An expression cassette refers to a nucleic acid that includes, from 5' to 3', a promoter, the coding sequence of a heterologous protein of interest, and a 3' downstream sequence including a transcription termination sequence.
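The 5' to 3' layout in this definition maps naturally onto a small record type. The sketch below is illustrative only: the class name, the field names, and the basic coding-sequence checks are assumptions of this example, and the sequences are toy strings.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass(frozen=True)
class ExpressionCassette:
    promoter: str         # 5' element
    coding_sequence: str  # heterologous protein of interest
    terminator: str       # 3' downstream sequence with transcription terminator

    def sequence(self):
        """Concatenate the elements in 5' to 3' order."""
        return self.promoter + self.coding_sequence + self.terminator

    def validate(self):
        """Return a list of problems found in the coding sequence."""
        cds = self.coding_sequence.upper()
        problems = []
        if not cds.startswith("ATG"):
            problems.append("coding sequence does not start with ATG")
        if len(cds) % 3 != 0:
            problems.append("coding sequence length is not a multiple of 3")
        if cds[-3:] not in STOP_CODONS:
            problems.append("coding sequence does not end with a stop codon")
        return problems


# Toy sequences for illustration only.
cassette = ExpressionCassette("TTTT", "ATGGCTTAA", "CCCC")
print(cassette.sequence(), cassette.validate())
```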
- the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0202 to SEQ0209, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0202 to SEQ0209.
- the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0202 to SEQ0209, or a nucleotide sequence that is substantially homologous thereto.
- the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0202 to SEQ0209, or an amino acid sequence substantially homologous thereto.
- the present invention is directed to the promoters of the genes disclosed in SEQ0202 to SEQ0209. These promoters are expected to be induced by specific sugars (the xylose and arabinose metabolism pathway by C5 sugars, and the trehalose pathway by α-1,1-disaccharides) and can be placed in an operable linkage to a heterologous gene for inducible recombinant expression in Pichia pastoris. Therefore, expression vectors, host cells and methods of recombinant expression by utilizing any of the promoters disclosed in SEQ0202 to SEQ0209 are also embodiments of the present invention.
- the genes encoding these enzymes are inactivated. This is achieved through standard yeast genetics techniques. Examples of such techniques include gene replacement through double homologous recombination, in which homologous regions flanking the gene to be inactivated are cloned in a vector flanking a selectable marker gene (such as an antibiotic resistance gene or a gene complementing an auxotrophy of the yeast strain). Alternatively, the homologous regions can be PCR-amplified and linked through overlapping PCR to the selectable marker gene.
- DNA fragments are transformed into Pichia pastoris through methods known in the art, e.g., electroporation.
- Transformants that then grow under selective conditions are analyzed for the gene disruption event through standard techniques, e.g. PCR on genomic DNA or Southern blot.
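The gene-replacement strategy described above (homologous regions flanking a selectable marker) can be illustrated with a minimal Python sketch. The function name `build_replacement_cassette`, the 0-based coordinates, the default flank length and the toy sequences are all assumptions made for illustration; they are not taken from the disclosure.

```python
def build_replacement_cassette(genome: str, gene_start: int, gene_end: int,
                               marker: str, flank_len: int = 500) -> str:
    """Return 5' flank + marker + 3' flank for the gene at genome[gene_start:gene_end].

    Coordinates are 0-based and end-exclusive (an assumption of this sketch).
    """
    if not (0 <= gene_start < gene_end <= len(genome)):
        raise ValueError("gene coordinates fall outside the genome sequence")
    # Homologous region upstream of the gene to be inactivated
    five_prime = genome[max(0, gene_start - flank_len):gene_start]
    # Homologous region downstream of the gene to be inactivated
    three_prime = genome[gene_end:gene_end + flank_len]
    return five_prime + marker + three_prime


# Toy example: a 9 bp "gene" embedded in a 49 bp "genome" (hypothetical data)
toy_genome = "A" * 20 + "ATGCCCTAA" + "T" * 20
cassette = build_replacement_cassette(toy_genome, 20, 29, marker="GGGG", flank_len=10)
print(cassette)
```

After double homologous recombination, the region between the two flanks in the genome would be replaced by the marker, which is what the PCR or Southern blot analysis of transformants is meant to confirm.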
- gene inactivation can be achieved through single homologous recombination, in which case, e.g. the 5' end of the gene's ORF is cloned on a promoterless vector also containing a selectable marker gene.
- transposon mutagenesis is used to inactivate the target gene.
- a library of such mutants can be screened through PCR for insertion events in the target gene.
- the functional phenotype (i.e., deficiencies) of an engineered/knockout strain can be assessed using techniques known in the art.
- a deficiency of an engineered strain in protease activity can be ascertained using any of a variety of methods known in the art, such as an assay of hydrolytic activity of chromogenic protease substrates, band shifts of substrate proteins for the select protease, among others.
- a deficiency in protein O-mannosylation can be detected by mass changes of expressed O-glycosylated proteins and through mass spectrometric techniques designed to detect the sites of O-glycosylation (such as beta-elimination in 18O-H2O), used in a comparative experiment with the same protein expressed in a non-engineered strain.
- Beta-mannosyltransferase deficiency can be detected through glycan analysis of expressed proteins that are beta-mannosylated in non-engineered strains or through loss of signal with beta-mannosyl specific antibodies.
- the ORFs or coding sequences of such genes are cloned under the control of a promoter of desired strength and regulation (such as the methanol-inducible AOX1, AOX2 or FLD promoters, or the constitutively expressed GAP or TEF1alpha promoter), in a vector also containing a selectable marker gene functional in Pichia pastoris.
- the vector may be linearized in a genome-homologous sequence, although this is not essential. Subsequently, such vector is transformed into Pichia pastoris using techniques known in the art.
- the promoter sequence is amplified from Pichia pastoris genomic DNA with primers designed to the 5' and 3' regions of the sequences provided herein.
- the corresponding PCR fragment is cloned in an appropriate vector by blunt or TA ligation or by restriction-ligation cloning using restriction sites, not present in the promoter, adapted to the primer.
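The requirement that the restriction sites added to the primers are not present in the promoter can be checked with a short Python sketch. The enzyme panel is an illustrative selection (the recognition sequences are the standard ones for these enzymes), and the helper names and the toy promoter are hypothetical.

```python
# Recognition sequences of a few common enzymes (illustrative panel only)
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "NotI": "GCGGCCGC",
    "XhoI": "CTCGAG",
    "SalI": "GTCGAC",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def usable_enzymes(promoter: str, sites: dict = SITES) -> list:
    """Names of enzymes whose site is absent from both strands of the promoter."""
    promoter = promoter.upper()
    rc = reverse_complement(promoter)
    return sorted(name for name, site in sites.items()
                  if site not in promoter and site not in rc)


# Toy promoter containing an EcoRI and a BamHI site (hypothetical sequence)
toy_promoter = "TTGAATTCAAGGATCCTT"
print(usable_enzymes(toy_promoter))
```

Only enzymes returned by `usable_enzymes` would be candidates for the primer tails in this sketch; a real design would also have to check the destination vector.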
- the promoter is subsequently cloned upstream of a gene of interest in a yeast expression vector.
- the expression vector is transferred into the Pichia pastoris strain according to protocols described by Cregg and Russell.
- the vector is cleaved in a P. pastoris DNA segment (e.g., the P. pastoris promoter) with unique restriction sites. This allows integration into the genome by single crossover type insertion 15.
- Transgenic yeast can be obtained on medium containing the selection marker or on medium lacking the complemented amino acid.
- Identifying promoter sequence and cis-acting elements: For the identification of the promoter sequence, a 5'-deletion series is generated by PCR with P. pastoris genomic DNA as template. For this deletion series, a set of forward primers is designed that hybridize at different distances from the start. As reverse primer, a primer is used that hybridizes to the 3' end of the suggested promoter sequence. In a first screen, the promoter deletions can be made in steps of 100 bp, which gives a rough estimate of the promoter size. A second screen of this region can then be performed with smaller deletion steps. The promoter deletions are cloned in an expression vector, in front of a marker gene that allows expression to be quantified. Comparing these expression levels with that of the whole sequence will allow identification of the promoter.
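The two-stage deletion screen can be sketched in Python as a list of upstream positions at which the forward primers would anneal. The 1000 bp candidate region, the 20 bp fine step and the function names are assumptions chosen for illustration; only the 100 bp coarse step comes from the text.

```python
def deletion_series(promoter_len: int, step: int) -> list:
    """Upstream start positions (bp before the 3' end of the candidate promoter)
    for a 5'-deletion series, from the longest fragment to the shortest."""
    return list(range(promoter_len, 0, -step))


def refine(upper: int, lower: int, step: int) -> list:
    """Finer deletion series between two positions that bracket the loss of activity."""
    return list(range(upper, lower - 1, -step))


# First screen: 100 bp steps over a hypothetical 1000 bp candidate region
coarse = deletion_series(1000, 100)
# Second screen: 20 bp steps, assuming activity was lost between 400 and 300 bp
fine = refine(400, 300, 20)
print(coarse)
print(fine)
```

Each position in `coarse` or `fine` corresponds to one forward primer, all paired with the same reverse primer.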
- the present invention provides additional Pichia pastoris genes involved in the glycolysis pathway, methanol assimilation, and xylose, arabinose and trehalose metabolism.
- the promoters of these genes can be used for driving overexpression of heterologous genes.
- the promoters of the genes in the methanol utilization pathway, the xylose metabolism, the arabinose metabolism and the trehalose metabolism can be used for inducible expression of a heterologous gene.
- the expression of their endogenous gene can be compared by qPCR of cells grown in medium with or without addition of the respective carbon sources. For example, P. pastoris wild type cells are grown in BMGY (uninduced condition), centrifuged and induced in BMMY (for methanol induction), in media containing xylose, in media containing arabinose or in media containing trehalose, respectively. Samples are taken at different time points. At every time point, RNA is isolated from the yeast cells and treated with DNase to remove all genomic DNA, and the mRNA is then converted into cDNA. For all genes to be tested, primers are designed (with Primer3Plus software) and tested for amplification of the gene. To be able to compare the expression of the different genes, a good reference gene (e.g., actin, GAPDH or PDA1) is included.
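One common way to turn such qPCR measurements into a fold induction is the 2^-ΔΔCt calculation, sketched below in Python. The text does not state which quantification method is intended, so this method, the assumption of 100% amplification efficiency, the function name and the Ct values are all illustrative.

```python
def relative_expression(ct_target_induced: float, ct_ref_induced: float,
                        ct_target_uninduced: float, ct_ref_uninduced: float) -> float:
    """Fold change of a target gene (induced vs. uninduced), normalised to a
    reference gene, using the 2^-ddCt method (assumes 100% PCR efficiency)."""
    delta_induced = ct_target_induced - ct_ref_induced
    delta_uninduced = ct_target_uninduced - ct_ref_uninduced
    return 2.0 ** -(delta_induced - delta_uninduced)


# Hypothetical Ct values: the target appears 5 cycles earlier after induction,
# while the reference gene (e.g. actin) is unchanged
fold = relative_expression(20.0, 18.0, 25.0, 18.0)
print(fold)
```

A promoter whose endogenous gene shows a high fold change on the respective carbon source would be a candidate for inducible expression.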
- the present invention contemplates the use of a methylotrophic yeast strain, including but not limited to Pichia strains such as Pichia pastoris, as the host for genetic engineering and recombinant production of proteins including glycoproteins. While the native Pichia pastoris genes identified herein and their constituents (such as promoters, signal peptides, and proteins including enzymes and chaperones) are preferred choices for engineering a Pichia pastoris strain, these choices are believed to also work for other methylotrophic yeast strains, especially other Pichia strains closely related to Pichia pastoris.
- Embodiments described herein can be combined as appropriate.
- genetic modifications of a strain such as usage of inducible promoters, usage of signal peptides, inactivation of one or more protease genes, overexpression of nucleotide sugar synthesis and transport, inactivation of O- or beta-mannosylation, expression of chaperones, and multicopy integration, for example, can be combined in any manner desirable.
- This Example describes the methods employed to identify, assemble and annotate the full genomic sequence of the P. pastoris GS115 strain.
- P. pastoris GS115 (Invitrogen, Carlsbad, CA) is a strain derived from the wild type strain NRRL-Y 11430 (Northern Regional Research Laboratories, Peoria, IL). It has a mutation in the histidinol dehydrogenase gene (HIS4) and was generated by nitrosoguanidine mutagenesis at Phillips Petroleum Co. It is the most frequently used Pichia strain for heterologous protein production.
- P. pastoris genomic DNA was prepared according to a published protocol with minor modifications. Instead of vortexing, the samples were shaken in a Mixer Mill (Retsch) for 2 minutes.
- the shotgun library of P. pastoris for sequencing on the Genome Sequencer FLX was prepared from five micrograms of intact genomic DNA. Based on random cleavage of the genomic DNA 16 with subsequent removal of small fragments with AMPure™ SPRI beads (Agencourt, Beverly, MA), the resulting single-stranded (sst) DNA library showed a fragment distribution between 300 and 900 bp with a maximum of 574 bp.
- the optimal amount of sstDNA library input for the emulsion PCR 16 (emPCR) was determined empirically through two small-scale titrations, leading to 1.5 molecules per bead being used for the large-scale approach.
- a total of 64 individual emPCRs were performed to generate 3,974,400 DNA-carrying beads for two two-region-sized 70x75 PicoTiterPlates (PTP), and each region was loaded with 850,000 DNA-carrying beads.
- Each of the two sequencing runs was performed for a total of 100 cycles of nucleotide flows 16 (flow order TACG) and the 454 Life Sciences/Roche Diagnostics software Version 1.1.03 was used to perform the image and signal processing.
- the information about read flowgram (trace) data, basecalls and quality scores of all high quality shotgun library reads was stored in a Standard Flowgram Format (SFF) file which was used by the subsequent computational analysis (see below).
- the initial assembly contained 1,154 contigs with 9.6 Mbp sequence and 20x sequencing depth.
- the contig N/L50 was 40/77 kbp. Assembly of the contigs was performed manually, based on homology between the contig ends. 13 contigs were assigned to chromosomes by identification of the chromosomal markers previously described 10 (Chromosome 1: HIS4, ARG4, OCH1, PAS5, PRB1, PRC1; Chromosome 2: PAS8, GAP; Chromosome 3: DAS1, URA3, PEP4; Chromosome 4: AOX1, AOX2).
- contigs with homologous contig ends were identified by BLASTN search with 500-1000 bp of the contig ends against a database of the contig sequences. Contigs sharing homology at a P-value threshold of e-20 were assumed to be linked. Pools of potentially linked contigs were assembled into supercontigs with the SeqMan assembly software (DNASTAR Inc., Madison, WI, USA). The resulting contig junctions were curated by removing the low-coverage ends of either joined contig. In the cases where the BLASTN P-value was >e-50, the junction was PCR-amplified and Sanger-sequenced (primer sequences: Table 1). This resulted in 10 supercontigs, with 9.1 Mbp of sequence, and a remaining 7 unassembled contigs. The supercontig N/L50 was 3/1.544 Mbp.
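The N/L50 statistic quoted for the contigs and supercontigs can be computed as in the following Python sketch. It follows this document's own convention, in which N50 is the number of contigs that collectively cover at least 50% of the assembly and L50 is the length of the last contig counted (many later tools swap the two names). The function name and the toy contig lengths are hypothetical.

```python
def n_l50(contig_lengths: list) -> tuple:
    """Return (N50, L50): the smallest number of contigs, taken longest first,
    that cover at least half of the assembly, and the length of the last one."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half:
            return count, length
    raise ValueError("no contigs supplied")


# Toy assembly with contig lengths in kbp (hypothetical data)
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n_l50(toy_contigs))
```

Read this way, the reported supercontig value of 3/1.544 Mbp means that the three largest supercontigs cover at least half of the assembled sequence and the third of them is 1.544 Mbp long.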
- the mitochondrial genome was also assembled and had extremely high coverage (859.9 fold), indicating the presence of approx. 43 mitochondrial genomes per cell in P. pastoris when grown on glucose as carbon source.
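The copy-number estimate follows from the ratio of mitochondrial to nuclear sequencing depth. A minimal Python sketch, assuming the 20x nuclear depth reported for the assembly is the appropriate denominator:

```python
mito_coverage = 859.9     # fold coverage of the mitochondrial genome (from the text)
nuclear_coverage = 20.0   # assumed nuclear sequencing depth (20x, as reported for the assembly)

# Each nuclear genome copy yields about 20x coverage, so the coverage ratio
# approximates the number of mitochondrial genomes per nuclear genome
copies_per_cell = mito_coverage / nuclear_coverage
print(round(copies_per_cell))
```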
- Gaps were PCR-amplified using primers flanking these regions (Table 1) and sequenced by Sanger sequencing for finishing.
- rDNA repeat regions were detected by Southern blot on all four PFGE-separated chromosomes (Fig. 5A). The Southern signal on chromosomes 1 and 4 was as strong as that on chromosomes 2 and 3 combined. Through PCR, the location and orientation of the rDNA locus was determined to be at one end of chromosomes 2 and 3 (Fig. 1B). The attempts at verification of the rDNA locus position on chromosomes 1 and 4 (still containing 1 gap) were inconclusive.
- a BioRad contour-clamped homogeneous electric field CHEF DRIII system was used for PFGE.
- Chromosomal DNA was prepared in agarose plugs with the CHEF Genomic DNA Plug kit (BioRad) following the instructions of the manufacturer.
- a 0.8% agarose gel in 1x modified TBE (0.1 M Tris, 0.1 M boric acid, 0.2 mM EDTA) was used to separate the chromosomes. The gel was electrophoresed at a 106° angle at 14 °C and 3 V/cm for 32 h with a switch interval of 300 s, followed by 32 h with a switch interval of 600 s and 24 h with a switch interval of 900 s.
- chromosomes were visualized with ethidium bromide and the different contigs were mapped onto the chromosomes by Southern blot. To this end, the gel was incubated in 0.25 M HCl for 30 minutes, followed by capillary alkali transfer of the DNA onto a Hybond N+ membrane (Amersham).
- the probes were prepared by PCR on an open reading frame. For chromosome-specific probes, a part of the coding sequence of HIS4 (chromosome 1), GAP (chromosome 2), URA3 (chromosome 3) and AOX1 (chromosome 4) was used. The probes were randomly labelled with α-32P dCTP, using the High Prime kit (Roche).
- Protein-coding genes were predicted by the integrative gene prediction platform EuGene (Fig. 6). A specific EuGene version was trained based on 108 manually checked P. pastoris genes. Documented genes from P. stipitis and S. cerevisiae were used to build P. pastoris orthologous gene models, allowing the training of P. pastoris-specific Interpolated Markov Models for coding sequences and introns. Splice sites were predicted by NetAspGene 19, and gene predictions from GeneMarkHMM-ES 20 trained for P. pastoris and from AUGUSTUS (Pichia stipitis model) were used to provide alternative gene models for EuGene prediction. The UniProt and the fungi RefSeq protein databases were searched against the supercontig sequence by BLASTX to identify the coding area. The DeCypher-TBLASTX program was used to search the conserved sequence area between the P. pastoris, P. stipitis and Candida guilliermondii genomes.
- BOGAS also provides a search function where users can search for genes by sequence similarity (BLAST), gene id, gene name or InterPro domain.
- Each predicted Pichia gene's structure and the similarity search result were visually inspected through an embedded stripped-down version of ARTEMIS. The splice sites of each gene were carefully checked and compared with S. cerevisiae and P. stipitis loci. A functional description of each gene was added to the gene annotation when a closely related homologous gene was available.
- CEGs core eukaryotic genes
- the CEGs contains 248 genes across six model organisms (H. sapiens, D. melanogaster, C. elegans, A. thaliana, S. cerevisiae and S. pombe) of which ⁇ 90% are single copy in D. melanogaster, C. elegans, S. cerevisiae and S. pombe.
- the protein-coding genes identified in this invention were checked with the HMM profile from the CEGs dataset by the HMMER package. All of the 248 CEGs were present in our curated gene set with full HMM domain coverage.
- FUNYBASE FUNgal phYlogenomic dataBASE
- Ribosomal RNAs were detected automatically by INFERNAL 1.0 (INFERence of RNA ALignment) against the Rfam 29 database and manually confirmed by BLASTN search with S. cerevisiae homologs against the P. pastoris genome sequence. Localization of the ribosomal DNA locus was assayed by PFGE and PCR.
- Transfer RNAs were automatically predicted by tRNAscan-SE 1.21 and manually confirmed by BLASTN search with the S. cerevisiae homologs against the P. pastoris genome sequence.
- Nucleotide sequences of the predicted P. pastoris ORFeome were analyzed with ANACONDA 1.5 31.
- the analysis by ANACONDA generates a codon-pair context map for the ORFeome. This map shows one colored square for each codon-pair, the first codon corresponds to rows and the second corresponds to columns in the map. Favored codon pairs are shown in green, underrepresented ones are shown in red.
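The idea behind a codon-pair context map can be illustrated with a simplified Python sketch that counts adjacent codon pairs and compares each observed count with the count expected from the individual codon frequencies. This observed/expected ratio is a stand-in chosen for illustration and is not ANACONDA's own statistic; the function name and the toy ORF are hypothetical.

```python
from collections import Counter


def codon_pair_ratios(orfs: list) -> dict:
    """Map each adjacent codon pair (first, second) to observed/expected,
    where 'expected' assumes the two codons occur independently."""
    codon_counts = Counter()
    pair_counts = Counter()
    for orf in orfs:
        usable = len(orf) - len(orf) % 3          # ignore a trailing partial codon
        codons = [orf[i:i + 3] for i in range(0, usable, 3)]
        codon_counts.update(codons)
        pair_counts.update(zip(codons, codons[1:]))
    total_codons = sum(codon_counts.values())
    total_pairs = sum(pair_counts.values())
    ratios = {}
    for (first, second), observed in pair_counts.items():
        expected = (codon_counts[first] / total_codons) \
            * (codon_counts[second] / total_codons) * total_pairs
        ratios[(first, second)] = observed / expected
    return ratios


# Toy ORFeome with a single short ORF (hypothetical data)
ratios = codon_pair_ratios(["ATGGCTGCTTAA"])
for pair, ratio in sorted(ratios.items()):
    print(pair, round(ratio, 2))
```

In a map like the one described, ratios above 1 would correspond to favored (green) pairs and ratios below 1 to underrepresented (red) pairs, with rows indexed by the first codon and columns by the second.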
- the phylogenetic tree was based on 200 single-copy genes which were present in 12 sequenced fungal genomes. A multiple sequence alignment was constructed using the MUSCLE program, with gap removal by an in-house script based on the BLOSUM62 scoring matrix.
- the maximum likelihood tree reconstruction program TREE-PUZZLE 32 (quartet puzzling, WAG model, estimated gamma distribution rate with 1000 puzzling steps) was used for phylogenetic tree reconstruction. The tree was well supported by 1000 bootstraps in each node.
- the predicted proteomes used in this study were those of six hemiascomycetes (P. pastoris, S. cerevisiae, K. lactis, P. stipitis, C. lusitaniae and Y. lipolytica).
- a similarity search of all protein sequences from the 6 fungi (all-against-all BLASTP, e-value 1e-10) was performed. Gene families were constructed by Markov clustering 3 based on the BLASTP result. All predicted protein sequences from the six genomes were searched against the Pfam 36 database to obtain the protein domain occurrence in each species. The protein domain loss and acquisition was counted based on the Dollo parsimony principle by the DOLLOP program from the PHYLIP package 37.
- This Example describes the results of assembly and annotation of the full genomic sequence of the P. pastoris GS115 strain.
- chromosome assembly was completed according to the strategy shown in Fig. 1A.
- 454/Roche sequencing 16 (GS-FLX version) was utilized to highly oversample the genome (20x coverage) and generated 70,500 paired-end sequence tags, to enable the assembly of all but 7 contigs into only 9 "supercontigs" (plus the mitochondrial genome) using automated shotgun assembly and BLASTN-based contig end joining (see Example 1 and Fig. 4).
- the order of the supercontigs was determined through PCR and Sanger sequencing of the amplificates.
- the rDNA locus contains the 18S, 5.8S and 26S rRNA coding sequences. Unlike the S. cerevisiae 5S rRNA, which is present in the repeated rDNA locus, the 21 copies of the P. pastoris 5S ribosomal RNA (rRNA) were found to be spread across the chromosomes. While the chromosomes of P. pastoris GS115 were estimated to be (from chromosome 1 to 4) 2.9, 2.6, 2.3 and 1.9 Mbp based on pulsed-field gel electrophoresis (PFGE), they were estimated after assembly in this invention to be 2.88 (2.8 + 0.08), 2.39, 2.24 and 1.8 (1.78 + 0.017) Mbp. Including the estimated 0.12 Mbp of rRNA repeats, the genome size of P. pastoris was determined to be 9.43 Mbp. The P. pastoris genomic sequences are deposited in GenBank.
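The stated genome size can be reproduced from the assembled chromosome sizes and the estimated rRNA repeat content, as in this small Python check (variable names are illustrative):

```python
# Assembled chromosome sizes in Mbp, chromosomes 1 to 4 (from the text)
chromosomes_mbp = [2.88, 2.39, 2.24, 1.8]
# Estimated contribution of the rRNA repeats in Mbp (from the text)
rrna_repeats_mbp = 0.12

genome_size_mbp = round(sum(chromosomes_mbp) + rrna_repeats_mbp, 2)
print(genome_size_mbp)
```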
- the Sanger sequences confirmed the genome sequence of this invention, and thus the error rate was estimated to be 1 in 35,147 bp.
- all open reading frames encoding proteins with at least one clear homolog in the databases were analyzed. If an interrupted ORF was found to have clear homology to the 5' part of the homologs, immediately followed by a coding sequence with clear homology to the 3' part, the most logical interpretation would be that there was a frame-shift error mutation in the genome sequence (i.e. both coding sequences are extremely likely to be linked into one open reading frame (ORF)).
- Phylogenetic analysis shows that P. pastoris diverged before the formation of the CTG clade (yeasts which translate the CUG codon into serine instead of leucine 13 ).
- Protein-coding genes were automatically predicted using the EuGene 18 prediction platform (Example 1) and these gene models were manually curated for functional annotation, accurate translational start and stop assignment, and intron location. This resulted in a set of 5,313 protein-coding genes, of which 3,997 (75.2%) were found to have at least one homolog in the National Center for Biotechnology Information (NCBI) protein database (BLASTP e-value 1e-5, sequence length difference within 20 % and sequence similarity >50 %). The protein-coding genes were found to occupy 80% of the genome sequence.
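The homolog criteria can be expressed as a simple filter over BLASTP hits, sketched below in Python. The `Hit` record, the choice to measure the length difference relative to the query, and the inclusive handling of the e-value and length thresholds are assumptions of this sketch; the text gives only the threshold values.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Minimal BLASTP hit record (hypothetical structure for illustration)."""
    evalue: float
    query_len: int
    subject_len: int
    similarity_pct: float


def is_homolog(hit: Hit, max_evalue: float = 1e-5,
               max_len_diff: float = 0.20, min_similarity: float = 50.0) -> bool:
    """Apply the three criteria: e-value, relative length difference, similarity."""
    len_diff = abs(hit.query_len - hit.subject_len) / hit.query_len
    return (hit.evalue <= max_evalue
            and len_diff <= max_len_diff
            and hit.similarity_pct > min_similarity)


# Hypothetical hits: the second one fails the length criterion
print(is_homolog(Hit(1e-30, 300, 330, 62.0)))
print(is_homolog(Hit(1e-30, 300, 400, 62.0)))

# Fraction of the gene set with at least one homolog, using the counts in the text
print(round(100 * 3997 / 5313, 1))
```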
- 1,285 genes were assigned to the Kyoto Encyclopedia of Genes and Genomes (KEGG) metabolic pathways 40 , and 4,262 of genes were annotated with Gene Ontology (GO) terms 3 ' .
- the GO slim categories of P. pastoris are presented in Fig. 6.
- a secretion signal peptide was predicted in 9% of the genes 42 and 4,274 proteins were predicted to contain InterPro domains. These include 2,320 distinct Pfam domains.
- 32 domains in 32 genes were identified as specific to P. pastoris.
- the two fungi in the CTG clade of which the genomes have been sequenced (P. stipitis and C. lusitaniae) share 71 gene families which were found to be absent in P. pastoris.
- Codon (pair) optimization of transgenes to the expression host organism often yields substantial improvements in recombinant protein yield .
- P. pastoris' codon usage is shown in Fig. 3A. Overall, the codon usage is similar to the one for S. cerevisiae (the same codons being preferred by both organisms for all amino acids). Some synonymous codon pairs are also more or less frequently used than expected (the "codon pair bias"). As previously reported for S. cerevisiae 44 , underrepresented and overrepresented codon pair clusters were observed.
- tRNA coding genes were automatically predicted and manually confirmed by BLASTN with S. cerevisiae homologs, which identified 123 nuclear tRNA genes (Table 4), compared to 274 in the S. cerevisiae genome 45 .
- P. pastoris has three tRNA families not present in S. cerevisiae (tR(UCG), tL(CAG) and tP(CGG)), but also lacks one tRNA family (tL(GAG)).
- This Example describes experiments conducted to test whether overexpression, in engineered GlycoSwitch Pichia pastoris, of an endogenous UDP-GlcNAc transporter, UDP-Gal transporter and UDP-Glc-4-epimerase can increase the glycan conversion efficiency of these strains, thus resulting in more uniform glycosylation patterns.
- E. coli was cultivated in LB medium containing the appropriate antibiotics for selection.
- the Pichia pastoris GlycoSwitch GnMan5 and GalGnMan5 strains expressing recombinant human alpha-antitrypsin were used in this analysis.
- The GnMan5 strain is a GS115 (Invitrogen, Carlsbad, CA, USA)-derived strain that modifies its glycoproteins predominantly with GlcNAcMan5GlcNAc2 N-glycans due to inactivation of Och1 and overexpression of an ER-localized T. reesei α-1,2-mannosidase and a Golgi-localized human GlcNAc transferase.
- the GalGnMan5 strain was derived from the GnMan5 strain by transformation of the latter with a Golgi-localized human galactosyl transferase fused to the S. cerevisiae UDP-Glc-4-epimerase Gal10. This strain predominantly modifies its glycoproteins with GalGlcNAcMan5GlcNAc2 N-glycans.
- Pichia pastoris cultures were grown in YPD medium, minimal medium or BMGY and induced in BMMY medium. All Pichia pastoris media were prepared as described in the Pichia instruction manual (Invitrogen, Carlsbad, CA, USA).
- the primers for the PCR reaction were designed to amplify the complete gene with the stop codon and to incorporate a SalI site downstream of UDP-GlcNAcT, and a BstBI restriction site upstream and a NotI site downstream of the UDP-GalT and UDP-Glc-4-epimerase genes, to allow subcloning. After amplification, the genes were cloned into pCR®4Blunt-TOPO® (Invitrogen, Carlsbad, CA, USA) and sequenced.
- UDP-GalT and UDP-Glc-4-epimerase genes were ligated into the pPICHygMnn2DmManII vector in which the Mnn2-ManII fusion gene was removed by BstBI/NotI digest.
- UDP-GlcNAcT was cut from the TOPO clone by BstBI/SalI digest and ligated into the pPICKanMnn2SpGal10hGalT vector in which the Mnn2-Gal10-GalT fusion gene was removed by BstBI/SalI digest.
- the expression vectors were opened by PmeI or SalI digest for transformation of Pichia pastoris strains. PmeI-opened vectors were expected to integrate at the AOX1 promoter locus of the Pichia pastoris genome, while SalI-opened vectors were expected to insert in the gene of interest.
- Yeast treatments - Cells were grown in 5 ml BMGY medium at 30 °C. After 48 h of incubation, the medium of the cultures was replaced by BMMY. The induction was performed for 48 h by spiking the cultures twice a day with 1% methanol. The cultures were harvested by centrifugation for 5 minutes at 3000 g. Medium and cell pellet were frozen at -20 °C.
- IL10 or IFN- ⁇ expressing GSM5 strains were transformed with pGAPNORPpROT1 or with pGAPNORPpSHR3, both linearized with EcoRV. Individual clones were selected on nourseothricin plates and used for induction experiments.
- the resulting PCR-DNA was then cut with NotI and NsiI (CNE1) or with NotI and AvrII (LHS1), ligated with the vector fragment cut with the same enzymes, and transformed into E. coli.
- Clones were checked by colony PCR for the presence of chaperone-containing plasmids. Once completed, the coding sequences for these chaperones would be fused with a C-terminal myc/His6 tag and placed under control of the GAP promoter. The resulting plasmids would also confer resistance to nourseothricin.
- E. coli was cultivated in LB medium containing the appropriate antibiotics for selection.
- the Pichia pastoris GlycoSwitch GnMan5 and GalGnMan5 strains expressing recombinant human alpha-antitrypsin were used in this analysis.
- The GnMan5 strain is a GS115 (Invitrogen, Carlsbad, CA, USA)-derived strain that modifies its glycoproteins predominantly with GlcNAcMan5GlcNAc2 N-glycans due to inactivation of Och1 and overexpression of an ER-localized T. reesei α-1,2-mannosidase and a Golgi-localized human GlcNAc transferase.
- the GalGnMan5 strain was derived from the GnMan5 strain by transformation of the latter with a Golgi-localized human galactosyl transferase fused to the S. cerevisiae UDP-Glc-4-epimerase Gal10.
- Pichia pastoris cultures were grown in YPD medium, minimal medium or BMGY and induced in BMMY medium. All Pichia pastoris media were prepared as described in the Pichia instruction manual (Invitrogen, Carlsbad, CA, USA).
- the genes were cloned into pCR®4Blunt-TOPO® (Invitrogen, Carlsbad, CA, USA) and sequenced. After EcoRV/NotI digestion the chaperone genes were inserted into the pGAPNOURCre vector in which the Cre recombinase gene was removed by EcoRV/NotI digest. This created a transcriptional fusion of the chaperone gene with a myc and His6 tag.
- the pGAPNOURCre vector was generated by deletion of the Pichia autosomal replication sequence (PARS) from the pGAPNorCre IPARS 1 vector by NsiI digest and self-ligation.
- the chaperone expression vectors were opened by AvrII or EcoRV digest for transformation of Pichia pastoris strains. AvrII-opened vectors were targeted to the GAP promoter locus of the Pichia pastoris genome, while EcoRV-opened vectors were inserted randomly.
- a set of 54 P. pastoris genes was identified that contain a signal sequence, as discussed hereinabove.
- the expression of these genes in P. pastoris was assessed using microarray data published by Graf et al. (2008) 48 , the presence of these proteins in P. pastoris grown in glucose containing medium (Mattanovich et al., 2009) and for the protein abundance of their homologues in S. cerevisiae (Brockmann et al., 2007 46 ; Ghaemmaghami et al., 2003 47 ; Liu et al., 2004 49 ; Newman et al., 2006 51 ) (Table 8).
- the initializing step in yeast O-glycosylation is known to be catalysed by a family of protein mannosyltransferases (PMT).
- 5 orthologs of the PMT genes were annotated, with representatives in the 3 subfamilies.
- in S. cerevisiae, deletion of only one of these genes was found to be insufficient to abolish O-glycosylation.
- the double and triple knockouts resulted in a loss of O-glycosylation but showed a severe defect in growth. In this Example, two approaches are described for generating PMT-deficient Pichia pastoris strains, which could result in an O-glycosylation-deficient strain.
- disruptions of the PMT ORFs are made through the use of a knock-in vector by single homologous recombination.
- PMT knock-outs are made by double homologous recombination.
- E. coli is cultivated in LB medium containing the appropriate antibiotics for selection.
- the Pichia pastoris GS115 strain (Invitrogen, Carlsbad, CA, USA) and glycoengineered strains, which can express a protein of interest, are used in this analysis.
- Pichia pastoris cultures are grown in YPD medium, minimal medium or BMGY and induced in BMMY medium. All Pichia pastoris media are prepared as described in the Pichia instruction manual (Invitrogen, Carlsbad, CA, USA).
- for the knock-in vectors for the PMT genes, a fragment of these genes is amplified by PCR.
- the primers for the PCR reaction are designed to amplify a fragment that has low similarity to other regions in the genome, is not the full length gene, and is located so the protein fragment preceding the fragment is not functional.
- the primers contain restriction sites for subcloning of the fragment. In this fragment a restriction site, unique in the final vector, is incorporated to allow later linearization.
- the genes are cloned into pCR ® 4Blunt-TOPO ® (Invitrogen, Carlsbad, CA, USA) and sequenced.
- This fragment is cloned into a Pichia vector containing a selectable marker.
- the PMT knock-in vectors are opened by restriction digest using the unique site in the PMT fragment for transformation of Pichia pastoris strains. Insertion of the vectors in the Pichia pastoris genome will be targeted to the respective PMT loci.
- for the knock-out vectors for the PMT genes, two fragments of the PMT loci are amplified by PCR, cloned into the pCR®4Blunt-TOPO® vector (Invitrogen, Carlsbad, CA, USA) and sequenced.
- the primers for the PCR reaction are designed to amplify fragments within the PMT ORF, promoter or 5'UTR in a way that when these two fragments recombine with the PMT allele it will create an inactive allele.
- the primers contain restriction sites for subcloning of the fragments and a unique restriction site upstream of the 5' fragment and downstream of the 3' fragment. These fragments will be cloned into a P.
- the PMT knock-out vectors are cut by restriction digest using the unique site incorporated into the PMT fragment, after which the excised fragment will be used for transformation of Pichia pastoris strains.
- the stability of certain proteins expressed in Pichia pastoris has been observed to be influenced by the action of proteases such as protease A and B.
- a series of orthologs of novel protease genes were identified and annotated, with representatives in the serine protease, aspartyl protease and cysteine protease subfamilies.
- One strategy is described in this example to generate a series of strains of Pichia pastoris deficient in one or more of the identified protease activities. This strategy can be applied to any strain of Pichia pastoris that expresses a heterologous protein to compare the stability of that heterologous protein with or without the particular protease being active.
- One application of this strategy is to take a strain that expresses a protein of interest and use a set or "kit” of the insertional inactivation ("knock-in”) vectors to generate a series of derivative strains that each lack activity of one of the endogenous proteases of Pichia.
- E. coli is cultivated in LB medium containing the appropriate antibiotics for selection.
- the Pichia pastoris GS115 strain (Invitrogen, Carlsbad, CA, USA) or glycoengineered strains, which can express a protein of interest, are used in this analysis.
- Pichia pastoris cultures are grown in YPD medium, minimal medium or BMGY and induced in BMMY medium. All Pichia pastoris media are prepared as described in the Pichia instruction manual (Invitrogen, Carlsbad, CA, USA).
- the primers contain restriction sites for subcloning of the fragment.
- a restriction site unique in the final vector, is incorporated to allow later linearization.
- the genes are cloned into pCR ® 4Blunt-TOPO ® (Invitrogen, Carlsbad, CA, USA) and sequenced.
- This fragment is cloned into a Pichia vector containing a selectable marker.
- the protease knock-in vectors are opened by restriction digest using the unique site in the protease fragment for transformation of Pichia pastoris strains. Insertion of the vectors in the Pichia pastoris genome is targeted to the respective protease gene loci.
- Figure 8 shows the DNA sequences for one strategy of protease inactivation by knock-in.
- for the knock-out vectors for the protease genes, two fragments of the protease loci are amplified by PCR, cloned into the pCR®4Blunt-TOPO® vector (Invitrogen, Carlsbad, CA, USA) and sequenced.
- the primers for the PCR reaction are designed to amplify fragments within the protease ORF, promoter or 5'UTR in such a way that when these two fragments recombine with the protease allele it creates an inactive allele.
- the primers contain restriction sites for subcloning of the fragments and a unique restriction site upstream of the 5' fragment and downstream of the 3' fragment.
- protease knock-out vectors are cut by restriction digest using the unique site incorporated into the protease fragment, after which the excised fragment is used for transformation of Pichia pastoris strains. In this way, more stable gene knock-out constructs are generated.
- N50 number of contigs that collectively cover at least 50% of the assembly
- Coding genes 16 tRNA genes: 31 Table 3. Resequencing of selected known P. pastoris ORFs.
- GalGnMan5::UDP-GalT 21,53 ± 1,91 9,49 ± 3,75 68,98 ± 4,69 GalGnMan5::UDP-Glc-4-epi 20,69 ± 1,72 9,18 ± 0,41 70,13 ± 1,66
- Glycoprotein involved in cell wall beta-glucan assembly leads to severe growth defects, aberrant multibudded morphology, and mating defects
- Annotated localization extracellular region (IDA)
- S. cerevisiae null mutant inviable chr3_0960
- Protein of unknown function has similarity to Pry1p and Pry3p and to the plant PR-1 class of pathogen related proteins
- S. cerevisiae null mutant viable, decreased osmotic stress resistance
- S. cerevisiae null mutant viable, no cytokinesis
- GFP green fluorescent protein
- ATPase involved in protein import into the ER also acts as a chaperone to mediate protein folding in the ER and may play a role in ER export of soluble proteins; regulates the unfolded protein response via interaction with Ire1p
- S. cerevisiae null mutant decreased metal resistance; abnormal nuclear fusion during mating; inviable
- Alpha-1,2-mannosyltransferase responsible for addition of the first alpha-1,2-linked mannose to form the branches on the mannan backbone of oligosaccharides, localizes to an early Golgi compartment
- S. cerevisiae null mutant budding pattern: abnormal; cell shape: abnormal; cell size: decreased; glycogen accumulation: increased; chitin deposition: increased; competitive fitness: decreased; resistance to hygromycin B: decreased; resistance to Calcofluor White: decreased; viable
- Lectin-like protein with similarity to Flo1p thought to be expressed and involved in flocculation
- S. cerevisiae null mutant filamentous growth: decreased; haploid invasive growth: absent chr1-4_0584
- Lectin-like protein involved in flocculation cell wall protein that binds to mannose chains on the surface of other cells, confers floc-forming ability that is chymotrypsin sensitive and heat resistant; similar to Flo5p
- S. cerevisiae null mutant flocculation: absent; oxidative stress resistance: absent; resistance to ethanol: decreased; toxin resistance: absent
- TGTEPGTVI IETPESYVTTTQPWTGTYETTYTVPPTGTEPGTVI IETPESYVTTTQPWTGTYETTYSVPPSGTEP
- S. cerevisiae null mutant abnormal polyphosphate accumulation chr3_0419
- Essential ER membrane protein may be involved in protein folding; mutation causes defects in cell wall synthesis and in lysis of autophagic bodies, suppresses tor2 mutations, and is synthetically lethal with kar2-1 and with rot2 mutations
- S. cerevisiae null mutant cell shape: abnormal; chitin deposition: increased; inviable; killer toxin resistance: increased; liquid culture appearance: abnormal; resistance to hygromycin B: decreased; resistance to sodium dodecyl sulfate: decreased
- S. cerevisiae reduction of function mutant growth rate in exponential phase: decreased; resistance to tunicamycin: decreased; vegetative growth: decreased
- Peptidyl-prolyl cis-trans isomerase cyclophilin
- cyclophilin Peptidyl-prolyl cis-trans isomerase of the endoplasmic reticulum, catalyzes the cis-trans isomerization of peptide bonds N-terminal to proline residues; transcriptionally induced in response to unfolded proteins in the ER
- S. cerevisiae overexpression mutant decreased vegetative growth, viable
- Aspartic protease attached to the plasma membrane via a glycosylphosphatidylinositol (GPI) anchor
- S. cerevisiae null mutant viable; decreased competitive fitness
- Aspartic protease attached to the plasma membrane via a glycosylphosphatidylinositol (GPI) anchor
- Protein disulfide isomerase multifunctional protein resident in the endoplasmic reticulum lumen, essential for the formation of disulfide bonds in secretory and cell-surface proteins, unscrambles non-native disulfide bonds
- S. cerevisiae conditional mutant increased heat sensitivity; delayed endocytosis
- Sporulation-specific exo-1,3-beta-glucanase contributes to ascospore thermoresistance
- S. cerevisiae misexpression mutant decreased resistance to 1,4-dithiothreitol chr1-1_0011
- Vacuolar carboxypeptidase Y proteinase C
- broad-specificity C-terminal exopeptidase involved in non-specific protein degradation in the vacuole member of the serine carboxypeptidase family
- S. cerevisiae null mutant viable; decreased phytochelatin accumulation chr1-4_0013
- GPI-anchored membrane protein required for cell wall biosynthesis in bud formation; homologous to Dfg5p
- Endo-beta-1,3-glucanase major protein of the cell wall, involved in cell wall maintenance
- S. cerevisiae overexpression mutant decreased vegetative growth chr1-4_0426
- Cell wall protein with similarity to glucanases may play a role in conjugation during mating based on its regulation by Ste12p chr2-1_0052
Landscapes
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Engineering & Computer Science (AREA)
- Mycology (AREA)
- Wood Science & Technology (AREA)
- Zoology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- Biotechnology (AREA)
- Biomedical Technology (AREA)
- Biochemistry (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Microbiology (AREA)
- Physics & Mathematics (AREA)
- Plant Pathology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Medicinal Chemistry (AREA)
- Gastroenterology & Hepatology (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Preparation Of Compounds By Using Micro-Organisms (AREA)
Abstract
The present invention provides the genome sequence of Pichia pastoris and manually curated annotation of protein-coding genes. The invention provides novel nucleic acids, proteins, and related expression vectors useful for genetic engineering of methylotrophic yeast strains, as well as engineered methylotrophic yeast strains, particularly Pichia pastoris, and use thereof for recombinant production of heterologous proteins, including glycoproteins suitable for use in mammals, including humans.
Description
Nucleic Acids Of Pichia pastoris And Use Thereof For Recombinant Production Of Proteins
CROSS REFERENCE TO RELATED APPLICATION
[001] This application claims the benefit of priority from U.S. Provisional Application 61/180,502, filed May 22, 2009.
FIELD OF THE INVENTION
[002] This invention relates generally to novel nucleic acids and recombinant expression technology. In particular, the invention relates to determination and assembly of the genome sequence of Pichia pastoris. The invention provides novel nucleic acids, proteins, and related expression vectors useful for genetic engineering of methylotrophic yeast strains, as well as engineered methylotrophic yeast strains, particularly Pichia pastoris, and use thereof for recombinant production of heterologous proteins.
BACKGROUND OF THE INVENTION
[003] The methylotrophic yeast Pichia pastoris is by far the most often used yeast species in the production of recombinant proteins and is used in thousands of laboratories worldwide for the production of proteins for fundamental studies, as drug targets and for therapeutic use, and as a model for peroxisomal proliferation and methanol assimilation. The P. pastoris expression technology is available from Invitrogen (Carlsbad, CA) and from Research Corporation Technologies (Tucson, AZ), making it accessible for academic and commercial purposes alike. P. pastoris grows to high cell density, provides tightly controlled methanol-inducible transgene expression and efficiently secretes heterologous proteins in defined media. Indeed, several P. pastoris-produced biopharmaceuticals that are either not glycosylated (such as human serum albumin) or for which glycosylation is only needed for proper folding (such as several vaccines) are already on the market. P. pastoris strains with small, homogeneous N-glycans have also been generated, which were then further engineered into human-type N-glycosylation. Glyco-engineered products are now moving to clinical development³. Moreover, monoclonal antibodies can be made at gram per liter scale in glycosylation-homogeneous strains⁴. For further strain engineering, a better understanding of all aspects of the yeast's protein production machinery is desired, and a number of studies relating to Pichia's secretory system and engineered promoters have been forthcoming⁵,⁶. However, although P. pastoris is widely used for protein production, relatively few genetic tools, engineered strains and data on the biology of this organism are available.
SUMMARY OF THE INVENTION
[004] The present invention provides the determination, assembly and manually curated annotation of the 9.43 Mbp genomic sequence of the GS115 strain of Pichia pastoris.
[005] In one embodiment, the invention provides isolated nucleic acid molecules that encode a protein as set forth in any one of SEQ0001-0025, 0027-0126, 0128-0165, 0172-0174, 0176-0200, or 0202-0212, or a protein substantially homologous thereto. In specific embodiments, the nucleic acid molecules contain the coding sequence of an ORF as set forth in any one of SEQ0001-0025, 0027-0126, 0128-0165, 0172-0174, 0176-0200, or 0202-0212, or contain a nucleotide sequence that is substantially homologous to the coding sequence of an ORF as set forth in any one of SEQ0001-0025, 0027-0126, 0128-0165, 0172-0174, 0176-0200, or 0202-0212.
[006] In another embodiment, the present invention provides isolated proteins that have an amino acid sequence as set forth in any one of SEQ0001-0025, 0027-0126, 0128-0165, 0172-0174, 0176-0200, or 0202-0209, or an amino acid sequence substantially homologous thereto.
[007] In still another embodiment, the present invention provides 53 peptides, shown in SEQ0001-0025 and 0027-0054, which are signal peptides of secreted proteins of Pichia pastoris. Nucleic acids that encode any of these signal peptides are also parts of the present invention.
[008] In a further embodiment, the present invention provides a set of vectors useful for identification of the most effective choice of signal peptide for any given heterologous protein. Each vector contains a promoter and the coding sequence of one of the signal peptides identified in SEQ0001-0025 and 0027-0054, and a linker sequence or cloning site for inserting or receiving the coding sequence of the heterologous protein. In a specific embodiment, the linker sequence or cloning site simply includes a restriction endonuclease site. The heterologous coding sequence, with the same restriction site at its 5' end, can be joined to the signal peptide coding sequence via restriction enzyme digestion and subsequent ligation. In an alternative embodiment, the linker sequence is an intron sequence functional in Pichia pastoris and includes a restriction endonuclease site. The heterologous coding sequence, with the same intron sequence and restriction site at its 5' end, can be joined to the signal peptide coding sequence via the intron sequence. Transcription from the expression vector and subsequent RNA processing in a recipient cell lead to the generation of mRNAs without the intron, from which signal peptide-heterologous protein fusions are produced upon translation. A library of such fusion clones can be transformed into Pichia pastoris to select for the most effective choice of signal peptide for any given heterologous protein.
[009] In one embodiment, the invention provides isolated nucleic acid molecules composed of a promoter sequence of any one of the Pichia pastoris genes set forth in SEQ0055-0165, 0169-0200, or 0202-0212.
[010] Specific promoters of the present invention include those identified for SEQ0060-0085 (genes involved in glycolysis pathway), SEQ0096-0124 (genes showing high expression levels), SEQ0125-0128 (homologs of S. cerevisiae genes whose promoters are frequently used for recombinant expression), SEQ0169-0178 (methanol metabolism genes), and SEQ0202-0208 (genes involved in xylose, arabinose or trehalose metabolism). The promoters of these genes are located within the 1000 bp of the 5' region provided herein, and generally are located within the 500 bp immediately before the start codon of the gene, in some embodiments within 250 bp, 200 bp, 150 bp, 125 bp, 100 bp, 75 bp, 50 bp, 40 bp, or even 25 bp immediately before the start codon of the gene. In certain embodiments, the promoters include a TATA element identified herein in Table 6.
[011] Use of any of the newly identified promoters for expression of a heterologous gene is encompassed by the present invention, including the related expression vectors and cells transformed with any such expression vectors.
[012] In one embodiment, the present invention is directed to expression vectors and engineered methylotrophic yeast strains for increased expression (or overexpression) of one or more Pichia proteins involved in the secretory pathway, e.g., those as set forth in SEQ0055-0059, in order to achieve increased protein secretion.
[013] In another embodiment, the present invention is directed to expression vectors and engineered methylotrophic yeast strains for overexpression of Pichia glycosylation precursor synthesis enzymes or transporters, e.g. UDP-Gal or UDP-GlcNAc transporters and UDP-Glc-4-epimerase for ER or Golgi localization. Such expression vectors are constructed to contain, from 5' to 3', a promoter functional in the recipient strain, operably linked to a coding sequence as set forth in any one of SEQ0129-0132, and a transcription termination sequence. The encoded protein is preferably also designed to include an ER or Golgi localization signal. Such an expression vector can be introduced into a methylotrophic yeast strain by transformation. Thus, the resulting engineered strains capable of increased expression of a protein encoded by any one of SEQ0129-0132 constitute another embodiment of the invention.
[014] In a further embodiment, the present invention provides a methylotrophic yeast strain in which at least one (i.e., one or more) native gene encoding an O-mannosyl transferase or a beta-mannosyl transferase has been inactivated. In a specific embodiment, the strain is a Pichia strain, preferably, a P. pastoris strain. An O- or beta-mannosyl transferase knockout P. pastoris strain can be generated by inactivating at least one gene as set forth in SEQ0133-0137 or SEQ0210-0212, which can reduce or eliminate unwanted O- or beta-glycosylation of a heterologous protein.
[015] In one embodiment, the present invention provides a methylotrophic yeast strain in which one or more native genes encoding enzymes involved in the ER glycosylation pathway have been inactivated. In a specific embodiment, the native yeast STT3 gene has been inactivated, and optionally the Leishmania STT3 gene has been introduced in place thereof. In another specific embodiment, the methylotrophic yeast strain is P. pastoris in which the native Pichia gene as set forth in SEQ0163 has been inactivated, and the Leishmania STT3 gene has been introduced in place thereof.
[016] In another embodiment, the present invention is directed to methylotrophic yeast strains which overexpress one or more of the MNN4 homologs as set forth in SEQ0166-0168, and related expression vectors.
[017] In still another embodiment, the present invention provides a protease-deficient methylotrophic yeast strain. In a specific embodiment, the strain is a Pichia strain, e.g., a P. pastoris strain. The protease-deficient Pichia strain can be generated by inactivating at least one (i.e., one or more) protease-encoding gene as set forth in SEQ0179 and SEQ0181-0186, such that there is no functional protease produced from a disrupted gene. Protease-deficient strains allow more stable accumulation of heterologous proteins.
[018] In yet another embodiment, the present invention provides a methylotrophic yeast strain engineered to overexpress a protease inhibitor protein, encoded by the gene set forth in SEQ0180, at an elevated level compared to an unmodified strain. In a specific embodiment, the strain is a Pichia strain, e.g., a P. pastoris strain. Expression vectors created for making such strains form another embodiment of the invention.
[019] In one embodiment, the present invention is directed to expression vectors and engineered methylotrophic yeast strains for overexpression of at least one Pichia chaperone involved in secreted protein folding in the ER. Such expression vectors are constructed to contain, from 5' to 3', a promoter functional in the recipient strain, operably linked to a coding sequence as set forth in any one of SEQ0187-SEQ0200, and a transcription termination sequence. Such an expression vector can be introduced into a methylotrophic yeast strain by transformation. Another embodiment of the invention is directed to engineered methylotrophic yeast strains capable of overexpressing a protein encoded by any one of SEQ0187-SEQ0200. In still another embodiment, methylotrophic yeast strains capable of overexpressing a combination of multiple chaperones are provided, which combination can be selected as most effective for recombinant production of a particular heterologous protein.
[020] In a further embodiment, the present invention provides an isolated nucleic acid molecule containing the nucleotide sequence as set forth in SEQ0200, which encodes the 5S rRNA. Use of this nucleic acid in creating vectors to achieve multi-copy integration of a heterologous gene and generate strains having a heterologous gene stably integrated in the genome in multiple copies is also contemplated by the present invention.
BRIEF DESCRIPTION OF THE DRAWINGS
[021] Figures 1A-1C. Pichia pastoris genome sequencing and overview. 1A. Genome sequencing and assembly strategy. 1B. P. pastoris chromosomes and positions of known markers. Genes that had been previously mapped to the chromosomes through PFGE are indicated in blue, and rDNA repeats in yellow; the 5S rRNA are indicated in yellow with the red arrow. 1C. Phylogenetic tree. The phylogenetic tree was built on the concatenated sequence of 200 single-copy ortholog genes in all of the 6 species. Numbers next to each branch correspond to the number of Pfam domains uniquely present in the corresponding lineage.
[022] Figures 2A-2B. Pichia pastoris codon usage. 2A. Codon usage in the P. pastoris ORFeome. The relative abundance of a codon is represented as a percentage of the total codon usage for the amino acid. 2B. Correlation of tRNA genes and codon usage. Graph shows correlation between the codon usage in relation to the number of genes coding for tRNAs recognising this codon (Spearman rho=0.88, P<0.0001).
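The relative codon abundance described for Figure 2A can be illustrated with a minimal Python sketch. It is not the method used to produce the figure. The function name `relative_codon_usage`, the abbreviated codon table, and the toy input sequences are assumptions introduced here for illustration only.

```python
from collections import Counter, defaultdict

# Illustrative excerpt of the standard genetic code (assumption: only three
# amino acids are listed to keep the sketch short).
CODON_TO_AA = {
    "GCT": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "AAA": "Lys", "AAG": "Lys",
    "GAA": "Glu", "GAG": "Glu",
}


def relative_codon_usage(cds_list):
    """Return {codon: percent of all codons used for the same amino acid}."""
    counts = Counter()
    for cds in cds_list:
        # Walk each coding sequence in frame, three bases at a time.
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3].upper()
            if codon in CODON_TO_AA:
                counts[codon] += 1
    # Total codon count per amino acid, used as the denominator.
    totals = defaultdict(int)
    for codon, n in counts.items():
        totals[CODON_TO_AA[codon]] += n
    return {c: 100.0 * n / totals[CODON_TO_AA[c]] for c, n in counts.items()}


# Hypothetical toy sequences, not P. pastoris data.
usage = relative_codon_usage(["GCTGCTGCCAAAAAG", "GCTGAAGAA"])
```

In this toy input, GCT accounts for three of the four alanine codons, so its relative abundance is 75%. Correlating such values with tRNA gene copy numbers, as in Figure 2B, would require a separate rank-correlation step that is not shown here.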
[023] Figures 3A-3B. Pichia pastoris pathways. 3A. Methanol utilisation pathway in Pichia pastoris. A detailed table with the genes coding for the respective enzymes is shown in Table 5A. Abbreviations: (1) AOX: alcohol oxidase, (2) FLD: formaldehyde dehydrogenase, (3) FGH: S-formylglutathione hydrolase, (4) FDH: formate dehydrogenase, (5) CAT: catalase, (6) DAS: dihydroxyacetone synthase, DAK: dihydroxyacetone kinase, TPI: triosephosphate isomerase, (9) FBA: fructose-1,6-bisphosphate aldolase, (10) FBP: fructose-1,6-bisphosphatase; DHA: dihydroxyacetone, GAP: glyceraldehyde-3-phosphate, DHAP: dihydroxyacetone phosphate, F1,6BP: fructose-1,6-bisphosphate, F6P: fructose-6-phosphate, Pi: phosphate, Xu5P: xylulose-5-phosphate, GSH: glutathione. 3B. Protein secretion pathway. Schematic representation of the secretion pathway in P. pastoris. A detailed table with the genes coding for the components involved in the represented complexes or processes is shown in Table 5B. The nascent protein is translocated to the ER by the Sec61 complex and N-glycosylation sites are glycosylated with the dolichol-linked Glc3Man9GlcNAc2 oligosaccharide precursor by the OST complex. After processing of the signal peptide, the protein is folded with the aid of chaperones. ER N-glycan processing results in Man8GlcNAc2 type glycan. O-glycosylation is also initiated in the ER by the protein-O-mannosyltransferases. After transport to the Golgi apparatus, the N-glycans are further processed to the yeast-typical hypermannosyl-type glycans. In GlycoSwitch-engineered strains, the hypermannosylation is abolished and the glycans are processed to Gal2GlcNAc2Man3GlcNAc2. After processing of the pro-domain, the protein is secreted into the growth medium, where it may be a substrate for yeast proteases.
[024] Figure 4. Contig overlap processing. Conflicts at contig overlaps, due to low coverage of the sequence represented by "XXX", are resolved by trimming the extreme ends of the contig. Contig coverage is represented by a heatmap. Low fidelity overlap regions (P-value >e°°) were PCR amplified and Sanger sequenced.
[025] Figures 5A-5C. Chromosome assembly. 5A. By PFGE and Southern blot detection, 2 supercontigs (FragB and FragD), 4 contigs (c121, c34, c131, c157, c159) and the contig containing the rDNA repeats (c2) were located on the different chromosomes. Every lane of the blot was incubated with a probe on an open reading frame of the indicated genome fragment. A probe on HIS4, GAP, URA3 and AOX1 was chosen to detect chromosome 1, 2, 3 and 4, respectively. The H. wingei chromosomes were used as marker for the PFGE, but they also gave a signal on the blot with the conserved c2 probe. The rightmost 2 lanes derived from a different gel than the rest of the figure, and chromosomes 2 and 3 were not well resolved on this gel. Presence of an rDNA locus (corresponding to contig c2) on both of these chromosomes was ascertained through PCR. 5B. Result of the PCRs performed to join the supercontigs and contigs. Lanes 1-8 are PCR with primers 1&4, 2&5, 6&25, 12&26, 8&9, 23&62, 15&60 and 17&21, respectively. 5C. Representation of the chromosomes assembled by the supercontigs and contigs. The numbers in blue represent PCR primers that were chosen on each end of the supercontigs and contigs (~200 bp from the end). The size of the gap is depicted between each supercontig and contig.
[026] Figure 6. Distribution of gene ontology terms assigned to P. pastoris. A total of 4,262 P. pastoris genes were assigned with gene ontology (GO) terms: 3,142 genes with molecular function assignment, 3,647 genes with cellular component assignment and 3,182 genes with biological process assignment.
[027] Figure 7. P. pastoris secretion signals. 53 SignalP predicted signal peptides were manually curated to be secretion signals based on the function of orthologs. The predicted site of signal peptidase cleavage is indicated by the red triangle. Alignment of these peptides shows a hydrophobic consensus sequence (poly Leu), and a small amino acid residue at position -1 and -3 from the cleavage site.
[028] Figure 8. Protease Gene Insertional Inactivation Strategy. A drug resistance marker or other auxotrophic marker can be used for this method.
DETAILED DESCRIPTION OF THE INVENTION
[029] The present inventors have determined and assembled the 9.43 Mbp genomic sequence of the GS115 strain of P. pastoris, and manually curated the annotation of 5,313 protein-coding genes. On this basis, the invention provides novel protein-encoding genes from Pichia pastoris, including identification of the 5' upstream region (including promoter), open reading frame (ORF) and 3' downstream region of these genes. The present invention also provides novel Pichia pastoris proteins, and certain signal peptides of some of these P. pastoris proteins. The nucleic acids and encoded proteins identified herein, as well as the promoters and signal peptides, can be used in engineering methylotrophic yeast strains, particularly Pichia strains, for recombinant production of proteins, including but not limited to glycoproteins having glycoforms suitable for therapeutic use in mammals, especially humans. In addition to the values and utilities provided by each of the novel molecules individually, the determination and annotation of the genome sequence of Pichia pastoris also permit a more complete, overall understanding of Pichia pastoris with respect to its protein modification and secretion system as well as methanol metabolism, by providing a complete set of Pichia pastoris genes coding for enzymes involved in methanol assimilation, a complete catalog of Pichia pastoris orthologs to the S. cerevisiae endoplasmic reticulum (ER) folding machinery, and a full collection of genes involved in the ER glycosylation pathway of Pichia pastoris. These findings of the present invention enable more efficacious overall design and engineering for recombinant production of heterologous proteins in methylotrophic yeasts.
[030] The various aspects and embodiments of the present invention are described in detail below.
General Definitions:
[031] Open Reading Frame (ORF) - An ORF refers to the portion of a gene that begins with the start codon and ends with a stop codon and that encodes a protein. An ORF may include one or more intron sequences.
[032] Coding sequence - This term is used herein to refer to a contiguous sequence of codons of the protein encoded by the ORF and does not include introns.
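The distinction between an ORF (which may contain introns) and its coding sequence (which does not) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `coding_sequence`, the 0-based end-exclusive intron coordinates, and the toy sequence are hypothetical and are not taken from the Sequence file.

```python
def coding_sequence(orf, introns):
    """Remove introns from an ORF and return the contiguous coding sequence.

    orf     -- genomic ORF sequence, from the start codon through the stop codon
    introns -- list of (start, end) intervals within orf, 0-based, end-exclusive
               (an assumed coordinate convention)
    """
    parts = []
    pos = 0
    for start, end in sorted(introns):
        if start < pos or start > end or end > len(orf):
            raise ValueError("introns must be non-overlapping and inside the ORF")
        parts.append(orf[pos:start])  # keep the exon preceding this intron
        pos = end                     # skip over the intron
    parts.append(orf[pos:])           # keep the final exon
    return "".join(parts)


# Hypothetical toy ORF: two exons separated by one 12-nt intron.
orf = "ATGGCT" + "GTAAGTTTTTAG" + "AAATAA"
cds = coding_sequence(orf, [(6, 18)])
```

Here the resulting coding sequence is ATGGCTAAATAA, a whole number of codons beginning with the start codon and ending with a stop codon.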
[033] 5' upstream region - This term (or "5' region" or "upstream region" in abbreviation) is used herein to refer to the genomic region 5' relative to the ORF of a gene. Generally speaking, the Sequence file provided herein has set forth a 5' upstream region of approximately 1000 nucleotides, which, in some cases, includes the start and/or stop codons of the previous gene. The 5' upstream region of a gene includes the promoter of the gene. The extent of the 5' region provided herein for each gene is sufficient for targeting recombination to this site, e.g. to delete the entire ORF when used in combination with the 3' downstream region sequence, or to insert a different regulatory region or promoter immediately upstream of the ORF when used in combination with the ORF sequence itself.
[034] Promoter - This term refers to a portion of the 5' upstream region of a gene that directs the transcription of the gene. Promoters are located within the 1000 bp of the 5' upstream region of yeast genes, with a "TATA box" sequence most commonly located at 10-120 bp upstream from the start codon of a gene. A "TATA box" is a DNA sequence (cis-regulatory element) found in the promoter region, and has the core DNA sequence 5'-TATAAA-3' or a variant. A TATA box is usually located 25 base pairs upstream of the transcription start site. In about 76% of the Pichia pastoris genes, at least one of the following TATA-elements has been located - TATAA: 50% of the elements are found 60 to 90 bp upstream of ATG (75% 40 to 110 bp upstream of ATG); TACAA: 78% of the elements are found 50 to 70 bp upstream of ATG; TATA: 50% of the elements are found 10 to 40 bp upstream of ATG; TATATA: 50% of the elements are found 80 to 90 bp upstream of ATG; and TATATATA: 50% of the elements are found 50 to 60 bp upstream of ATG. Details of the TATA boxes of the promoters provided by the present invention are set forth in Table 6. Accordingly, in certain specific embodiments, the promoter of a Pichia pastoris gene provided herein is located within the 500 bp immediately before the start codon of the gene; preferably within 250 bp, 200 bp, 150 bp, 125 bp, 100 bp, 75 bp, 50 bp, 40 bp, or even 25 bp immediately before the start codon of the gene; and in particular embodiments, the promoter includes a TATA element identified herein in Table 6. The precise location and composition of a promoter can be determined by using well known techniques including deletion mapping and site-directed mutagenesis, as further described below.
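As an illustration of how the TATA elements listed above might be located relative to the start codon, the following Python sketch scans a 5' upstream region for the five motifs. It is a simplified string search, not the procedure used to generate Table 6. The function name `find_tata_elements`, the default 500 bp window, and the distance convention (counted from the first base of the element to the A of ATG) are assumptions made for this example.

```python
# The five TATA elements named in the definition above.
TATA_ELEMENTS = ("TATATATA", "TATATA", "TATAA", "TACAA", "TATA")


def find_tata_elements(upstream, window=500):
    """List (element, distance) pairs found in the last `window` bp of `upstream`.

    `upstream` is the 5' region ending immediately before the start codon.
    Distance is counted from the first base of the element to the A of ATG
    (an assumed convention). Nested motifs, such as TATA inside TATAA, are
    reported separately.
    """
    region = upstream[-window:].upper()
    hits = []
    for element in TATA_ELEMENTS:
        start = region.find(element)
        while start != -1:
            hits.append((element, len(region) - start))
            start = region.find(element, start + 1)
    # Closest elements to the start codon come first.
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical 86-nt upstream region with one TATAAA box.
upstream = "G" * 20 + "TATAAA" + "C" * 60
hits = find_tata_elements(upstream)
```

For this toy region the TATAA element starts 66 bp upstream of the start codon, which falls inside the 60 to 90 bp interval reported for that element.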
[035] 3' downstream region - This term (sometimes "downstream region" or "3' region" in abbreviation) is used herein to refer to the genomic region 3' relative to the ORF of a gene. Generally speaking, the Sequence file provided herein has set forth a 3' downstream region of approximately 1000 nucleotides, which, in some cases, includes the start and/or stop codons of the next gene. The 3' downstream region of a gene includes the transcription termination sequence (or 3' termination sequence) of the gene. The extent of the 3' region provided herein is sufficient to target recombination to this site in the chromosome.
[036] Selectable marker - This term refers either to a dominant drug resistance marker or similar dominant marker, or to a more limited prototrophic selection such as HIS, applicable to a host that is defective for the key enzyme supplied by a selectable marker gene.
[037] "Signal peptide" and "mature proteins" - The term "signal peptide" or "signal sequence" refers to the short peptide sequence within a protein precursor synthesized in the cytoplasm that targets the precursor form to the endoplasmic reticulum. Signal peptides are typically cleaved from the precursor form by signal peptidase after the proteins are transported to the ER, and the resulting proteins move along the secretory pathway to their intracellular or extracellular location. For some proteins, cleavage of the signal peptide results in the mature form (i.e., the final, biologically active form) of the protein, while for other proteins, additional proteolytic processing may be required in order to generate the mature form of the protein.
[038] Substantially homologous amino acid sequences - When two or more amino acid sequences are said to be substantially homologous, it is meant that the sequences share a significant degree of similarity, for example, at least 85%, 90%, 95%, 98% or even 99% similarity. The degree of similarity can be determined, for example, as the index calculated using the BLAST or the Lipman-Pearson Protein Alignment program with the following choice of parameters for the latter program: Ktuple = 2, Gap Penalty = 4, and Gap Length Penalty = 12. The term "similarity" includes identity. Substantially homologous proteins can perform or possess substantially the same function; i.e., the enzymatic activities of the proteins differ by not more than 20%, 15%, 10%, or even 5% under a same set of conditions applicable for measuring enzymatic activity. A protein that is substantially homologous to a Pichia pastoris protein identified herein is, in some embodiments, a protein of methylotrophic yeast; for example, a protein of Pichia.
[039] Substantially homologous nucleotide sequences - When two or more nucleotide sequences are said to be substantially homologous, it is meant that the sequences share a significant degree of identity, for example, at least 85%, 90%, 95%, 98% or even 99% identity. The degree of homology is also reflected by hybridization characteristics. As defined herein, a first nucleic acid sequence that is substantially homologous to a second nucleic acid sequence also hybridizes to the complement of the second nucleic acid sequence under high stringency conditions. "High stringency conditions", as defined herein, include, for example, hybridization at 42°C in 50% v/v formamide, 1 M NaCl, and 1% w/v SDS, and washing at 65°C in 0.1-2X SSC (e.g., 0.1, 0.2, 0.5, 1 or 2X SSC) and 1% w/v SDS. A nucleic acid that is substantially homologous to a Pichia pastoris nucleic acid identified herein is, in some embodiments, a nucleic acid of methylotrophic yeast; for example, a nucleic acid of Pichia.
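The identity thresholds in the definition above can be illustrated with a small Python sketch that computes percent identity for two sequences that have already been aligned. It is only a sketch under stated assumptions: the function names are hypothetical, gaps are written as "-", and identity is taken over all alignment columns except those where both sequences have a gap (other denominators are also in common use). The hybridization criterion is not modeled.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumed convention: gaps are '-', and the denominator is the number of
    alignment columns in which at least one sequence has a residue.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a, aligned_b, threshold=85.0):
    """True if the pair reaches the identity threshold (85% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy alignment with one mismatch in 12 columns.
identity = percent_identity("ATGGCTAAAGGT", "ATGGCTAAGGGT")
```

With one mismatch in twelve columns the pair is about 91.7% identical, so it would clear the 85% and 90% thresholds but not the 95% threshold.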
[040] "Heterologous" versus "native'V'endogenous" - The term "heterologous" is used herein in several different contexts to reflect the fact that a molecule is placed in a genetic, molecular or cellular environment that is different than its native environment. For example, when the promoter of a gene is being utilized to drive the expression of a different gene, the promoter will be taken out of its native genetic context and placed in an operable linkage to a heterologous gene. As another example, the signal peptide of a protein can be used to direct the localization of a different protein, i.e., a heterologous protein. Additionally, a methylotrophic yeast strain such as Pichia can be transformed with a heterologous nucleic acid, i.e., a nucleic acid which the non-engineered Pichia strain does not have. The resulting engineered strain will express the protein encoded by the heterologous nucleic acid, i.e., a heterologous protein.
[041] Gene overexpression - Overexpression of a gene in a methylotrophic yeast is achieved by genetic modification of the yeast such that the expression of the gene is increased (measurable either at the mRNA level or the protein level), as compared to the unmodified yeast. The extent of increase in expression is at least 35%, or at least 50%, or preferably at least 150%, 200%, 250%, 300%, 400% or more. Overexpression of a gene can be achieved by introducing an additional expression cassette carrying the gene (as a plasmid vector or integrated into the chromosome), or by replacing the native promoter of the gene with a stronger constitutive or inducible promoter.
[042] Gene inactivation - Inactivation of a gene in a methylotrophic yeast is achieved by genetic modification of the yeast such that substantially no functional protein is produced from the strain. By "substantially", it is meant that the level of functional protein produced from the modified strain is not more than 20%, or 15%, or 10% or even 5% or less of the level of functional protein produced from an unmodified strain. Inactivation of a gene can be achieved by disrupting the genomic ORF of the gene in the strain, or disrupting the native promoter, or replacing the native promoter with a repressible promoter (e.g., repressible by methanol).
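The numerical criteria in the two definitions above can be expressed as a short sketch. The function names are hypothetical, and the inputs are assumed to be expression measurements (mRNA or functional protein level) taken on the same scale for the modified and the unmodified strain.

```python
def percent_increase(modified: float, unmodified: float) -> float:
    """Increase in expression of the modified strain relative to the unmodified one."""
    if unmodified <= 0:
        raise ValueError("unmodified expression level must be positive")
    return 100.0 * (modified - unmodified) / unmodified


def is_overexpressed(modified: float, unmodified: float,
                     min_increase_pct: float = 35.0) -> bool:
    """Overexpression: increase of at least 35% (the lowest threshold listed)."""
    return percent_increase(modified, unmodified) >= min_increase_pct


def is_inactivated(modified: float, unmodified: float,
                   max_residual_pct: float = 20.0) -> bool:
    """Inactivation: not more than 20% of the unmodified functional protein level."""
    if unmodified <= 0:
        raise ValueError("unmodified expression level must be positive")
    return 100.0 * modified / unmodified <= max_residual_pct
```

The stricter thresholds named in the text (for example a 150% increase, or a 5% residual level) can be passed through the keyword arguments.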
[043] Methylotrophic yeast - Methylotrophic yeasts are those capable of growth on methanol, and include yeasts of the genera Candida, Hansenula (such as H. polymorpha, now classified as Pichia angusta), Torulopsis, and Pichia (e.g., Pichia pastoris, Pichia methanolica, Pichia angusta (formerly Hansenula polymorpha), Pichia stipitis, and Pichia anomala).
[044] Chaperones - Chaperones are proteins that assist in the non-covalent folding or unfolding of proteins, mediate the redox potential to assist the formation of disulphide bonds within or between protein subunits, assist in the assembly or disassembly of other macromolecular structures or complexes, and/or translocation of proteins across membranes.
[045] For convenience of discussion, the novel protein-encoding genes identified by the present invention have been grouped based on the functions of the encoded proteins and are discussed in detail below. Where the utility of a particular group or a particular gene is not specifically discussed, its utilities are apparent based on the function and utility of its homolog(s) from other yeast species such as S. cerevisiae which have been characterized.
Secreted proteins with a signal peptide
[046] In accordance with the present invention, 53 genes have been identified as encoding secreted proteins with a signal peptide. In the annotated Sequence file provided herein, the ORF and the amino acid sequence (with the signal peptide portion shown in bold) of each of the 53 genes are set forth in SEQ0001-0025 and 0027-0054.
[047] In one embodiment, the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0001-0025 or 0027-0054, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0001-0025 or 0027-0054.
[048] In a specific embodiment, the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0001-0025 or 0027-0054, or a nucleotide sequence that is substantially homologous to the coding sequence of an ORF as set forth in any of SEQ0001-0025 or 0027-0054.
[049] In another embodiment, the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0001-0025 or 0027-0054, or an amino acid sequence substantially homologous thereto.
[050] In still another embodiment, the invention is directed to the signal peptides of these 53 newly identified genes and uses thereof for recombinant expression of heterologous proteins. These signal peptides are summarized and aligned in Figure 7. Alignment of these peptides shows a hydrophobic consensus sequence (poly Leu), and a small amino acid residue (such as A, C, G and S) at positions -1 and -3 relative to the cleavage site.
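The two features noted in the alignment (a leucine-rich hydrophobic stretch and small residues at positions -1 and -3) can be checked programmatically. The following is a minimal sketch, assuming the peptide string ends exactly at the cleavage site so that its last residue is position -1; the example peptide is made up for illustration and is not taken from the Sequence file.

```python
SMALL_RESIDUES = frozenset("ACGS")  # the small residues named in the text


def has_small_residues_at_minus1_minus3(signal_peptide: str) -> bool:
    """Check positions -1 and -3, counting back from the cleavage site."""
    sp = signal_peptide.upper()
    if len(sp) < 3:
        raise ValueError("signal peptide must have at least three residues")
    return sp[-1] in SMALL_RESIDUES and sp[-3] in SMALL_RESIDUES


def longest_leucine_run(signal_peptide: str) -> int:
    """Length of the longest uninterrupted stretch of leucines."""
    best = run = 0
    for residue in signal_peptide.upper():
        run = run + 1 if residue == "L" else 0
        best = max(best, run)
    return best


# Hypothetical peptide, for illustration only.
EXAMPLE_PEPTIDE = "MKLLLLLALASA"
```

For the hypothetical peptide above, positions -1 and -3 are both alanine and the longest leucine run is five residues.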
[051] Proteins destined for translocation into the endoplasmic reticulum carry a signal peptide that is recognized upon ribosomal translation by the signal recognition particle, which docks to the translocon, upon which translation continues and the protein is delivered into the ER lumen. The signal peptide is removed by signal peptidase.
[052] When over-expressing a protein destined for secretion, one can use the protein's native signal peptide if the translocation and signal peptidase machinery of the host cell can efficiently recognize and then process it. Alternatively, the coding sequence of a heterologous signal peptide can be fused to the coding sequence for the mature form of the protein. In Pichia pastoris, the signal sequence which has been frequently used is the prepro signal of S. cerevisiae alpha mating factor. Whereas this signal works in many cases, processing of the propeptide can be problematic as it requires Kex2p protease cleavage followed by polishing of the newly created N-terminus by Ste13p diaminopeptidase. Therefore, having a large library of Pichia signal peptides available is particularly useful. For example, one can screen for effective signal peptide(s) for a given protein desired to be expressed in Pichia. In addition, based on the library of signal peptides provided herein, a consensus artificial signal peptide can be designed which can be used to efficiently secrete multiple different heterologous proteins.
[053] To screen the library for a suitable Pichia signal peptide for a given target protein desired to be expressed in Pichia, an intron functional in Pichia (which can be selected from those shown in the Sequence File provided herein) can be cloned before the ORF or coding sequence for the target protein, wherein the intron contains either a unique restriction site or a recognition site for a recombinase. The same intron can be cloned behind (i.e., 3' of) each signal peptide coding sequence in the library, enabling rapid generation of an expression library of signal peptide-target ORF fusions through classical cloning using the unique restriction site or through recombinational cloning. Upon transcription, the intron is removed by the Pichia splicing machinery, resulting in an expressed library of in-frame fusion between the coding sequence for one of the signal peptides and the target ORF.
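The intron-mediated fusion strategy described above can be modeled in silico. The sketch below assumes that, after cloning, each transcript carries a single reconstituted intron between the signal peptide coding sequence and the target coding sequence, and that splicing removes exactly that intron. All sequences and names are hypothetical placeholders, not sequences from the Sequence File.

```python
def splice(pre_mrna: str, intron: str) -> str:
    """Remove the single copy of the intron from the primary transcript."""
    if pre_mrna.count(intron) != 1:
        raise ValueError("intron must occur exactly once in the transcript")
    start = pre_mrna.index(intron)
    return pre_mrna[:start] + pre_mrna[start + len(intron):]


def build_fusion_library(signal_cds_by_name: dict, intron: str,
                         target_cds: str) -> dict:
    """Return the spliced, in-frame fusion for every signal peptide."""
    if len(target_cds) % 3 != 0:
        raise ValueError("target coding sequence must be a multiple of 3 nt")
    library = {}
    for name, signal_cds in signal_cds_by_name.items():
        if len(signal_cds) % 3 != 0:
            raise ValueError(f"{name}: signal peptide CDS is not a multiple of 3 nt")
        pre_mrna = signal_cds + intron + target_cds
        library[name] = splice(pre_mrna, intron)
    return library


# Hypothetical toy sequences; the intron carries a GAATTC (EcoRI) site.
TOY_INTRON = "GTAAGTGAATTCTTTACTAACTTTAG"
TOY_SIGNALS = {"sp_a": "ATGAAACTGCTGGCT", "sp_b": "ATGCTGCTGTCTGCT"}
TOY_TARGET = "GCTCCAGTTTAA"
```

Because the intron is excised as a unit, the reading frame of each fusion depends only on the lengths of the two coding sequences, which is why the same intron can be shared across the whole library.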
[054] Subsequently, secretion of the target protein by individual members of the expressed library can be evaluated by a suitable technique such as SDS-PAGE analysis, coomassie blue staining, or Western blot. In this way, the suitability of a given Pichia signal peptide can be evaluated for secretion of any target protein.
[055] Accordingly, in another embodiment, the present invention is directed to a library of expression vectors, wherein each of the 53 signal peptides is represented in the library. Each expression vector contains, from 5' to 3', a promoter functional in Pichia, operably linked to a coding sequence for a signal peptide, and an intron sequence containing a restriction endonuclease recognition site. The expression vector is designed to accommodate insertion of the coding sequence of a target protein, linked at its 5' end to the same intron sequence containing the same restriction endonuclease recognition site as on the expression vector.
[056] In a further embodiment, the present invention provides an expression vector capable of directing the expression and secretion of a heterologous protein in Pichia pastoris or another methylotrophic yeast. The expression vector contains, from 5' to 3', a promoter functional in the recipient strain, operably linked to a coding sequence for the fusion of a signal peptide (identified in SEQ0001-0025 and 0027-0054) and the heterologous protein. Host cells transformed with such an expression vector constitute another embodiment of the present invention.
[057] In another embodiment, the signal sequences from some of the secreted proteins are not cleaved off in the ER, but remain linked to the protein. This is usual for a number of those enzymes involved in the glycosylation pathway (for example those for mannosyl transferases), for secretion chaperones, and for other factors that help secretion in a particular compartment. These signal sequences are useful for the localization of heterologous proteins to target them to particular compartments of the secretory pathway, which may be advantageous to various aspects of posttranslational modification. This may correspond to the same or equivalent compartment in which a heterologous enzyme or protein normally acts in its native cell. Alternatively, it may be a different compartment, but is effective because it is the same as that in which another enzyme acts or is immediately downstream or upstream of the compartment in which a second enzyme acts, effectively channeling and coordinating the sequence of a metabolic process.
Proteins potentially involved in secretion
[058] In accordance with the present invention, 5 genes have been identified as encoding proteins potentially involved in secretion, including P. pastoris homologs of S. cerevisiae SEC1, SEC11, and subunits SPC1, SPC2 and SPC3 of signal peptidase complex. In the annotated Sequence file provided herein, the nucleotide sequences of the 5' upstream region (including promoter), the ORF, and the 3' downstream region, as well as the amino acid sequence, of each of these 5 genes are set forth in SEQ0055 to SEQ0059.
[059] In one embodiment, the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0055 to SEQ0059, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0055 to SEQ0059.
[060] In a specific embodiment, the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0055 to SEQ0059, or a nucleotide sequence that is substantially homologous to the coding sequence of an ORF as set forth in any of SEQ0055 to SEQ0059.
[061] In another embodiment, the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0055 to SEQ0059, or an amino acid sequence substantially homologous thereto.
[062] In one embodiment, the invention is directed to a methylotrophic yeast strain, for example a Pichia strain such as Pichia pastoris, which overexpresses the Pichia pastoris SEC1 gene as set forth in SEQ0055. Expression vectors for achieving such overexpression are also contemplated by the present invention.
[063] Analysis of mutants of S. cerevisiae shows SEC1 to be required for the SNARE mediated docking and fusion of exocytic vesicles. Therefore, overexpression of the native SEC1 gene in P. pastoris may facilitate the exocytosis of secretory vesicles. The coding sequence of the P. pastoris SEC1 gene can be cloned into an expression vector and introduced into a methylotrophic yeast strain. To evaluate whether secretion levels are increased as a result of such overexpression, the quantity of secreted (glyco)proteins per cell can be analyzed.
[064] In another embodiment, the invention is directed to a methylotrophic yeast strain, for example a Pichia strain such as Pichia pastoris, which overexpresses at least one of the Pichia pastoris SEC11, SPC1, SPC2 or SPC3 genes as set forth in SEQ0056-0059. Expression vectors for achieving such overexpression are also provided by the present invention.
[065] Efficient processing of the signal peptide of a secreted protein is essential for high yield and to eliminate the presence of additional amino acids in the secreted protein. Overexpression of one or more subunits of the signal peptidase complex may increase the efficiency and the quality of the processing. To evaluate whether the efficiency of the signal peptide cleavage is improved as a result of overexpression of a subunit of signal peptidase complex, the quantity and the amino acid sequence of a heterologous secreted glycoprotein can be analyzed.
P. pastoris homologues of genes involved in the glycolysis pathway
[066] In accordance with the present invention, 26 genes have been identified as encoding proteins involved in the glycolysis pathway. In the annotated Sequence file provided herein, the nucleotide sequences of the 5' upstream region (including promoter), the ORF and the 3' downstream region, as well as the amino acid sequence, of each of these 26 genes are set forth in SEQ0060 to SEQ0085.
[067] In one embodiment, the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0060 to SEQ0085, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0060 to SEQ0085.
[068] In a specific embodiment, the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0060 to SEQ0085, or a nucleotide sequence that is substantially homologous thereto.
[069] In another embodiment, the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0060 to SEQ0085, or an amino acid sequence substantially homologous thereto.
[070] In a further embodiment, the present invention is directed to the promoters of each of the genes as set forth in SEQ0060 to SEQ0085. Because glycolysis is a central metabolic pathway, the promoters of the genes encoding proteins involved in this pathway are believed to be strong promoters, which can be used for driving overexpression of heterologous genes in methylotrophic yeast such as P. pastoris. Accordingly, the present invention provides expression vectors containing any one of the promoters of the genes set forth in SEQ0060 to SEQ0085, operably linked to the coding sequence of a heterologous protein. Methylotrophic yeast strains transformed with such an expression vector are also provided by the invention.
P. pastoris homologues of genes involved in homologous recombination
[071] In accordance with the present invention, 10 genes have been identified as encoding proteins involved in homologous recombination. In the annotated Sequence file provided herein, the nucleotide sequences of the 5' upstream region (including promoter), the ORF and the 3' downstream region, as well as the amino acid sequence, of each of these 10 genes are set forth in SEQ0086 to SEQ0095.
[072] In one embodiment, the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0086 to SEQ0095, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0086 to SEQ0095.
[073] In a specific embodiment, the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0086 to SEQ0095, or a nucleotide sequence that is substantially homologous thereto.
[074] In another embodiment, the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0086 to SEQ0095, or an amino acid sequence substantially homologous thereto.
[075] In a further embodiment, the invention is directed to methylotrophic yeast strains, especially Pichia pastoris strains, wherein one or more of the genes as set forth in SEQ0086 to SEQ0095 have been inactivated. Inactivation of these genes involved in recombination may prevent out-recombination of an expression unit or cassette containing a heterologous gene integrated in the chromosome, and may therefore "lock" or stabilize the insertion, especially multicopy insertions as further discussed hereinbelow.
Genes with high expression levels
[076] In accordance with the present invention, 29 genes have been identified as encoding proteins and showing expression levels 20x higher than GAP1 based on microarray analysis. In the annotated Sequence file provided herein, the nucleotide sequences of the 5' upstream region (including promoter), the ORF and the 3' downstream region, as well as the amino acid sequence, of each of these 29 genes are set forth in SEQ0096 to SEQ0124.
[077] In one embodiment, the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0096 to SEQ0124, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0096 to SEQ0124.
[078] In a specific embodiment, the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0096 to SEQ0124, or a nucleotide sequence that is substantially homologous thereto.
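Selection of highly expressed genes relative to a reference gene can be sketched as a simple ratio filter. The sketch assumes normalized microarray signal intensities keyed by gene name and treats "higher" as a signal ratio of at least twenty-fold; the gene names and values used to exercise it are hypothetical and do not come from the microarray analysis described herein.

```python
def genes_above_fold(signal_by_gene: dict, reference_gene: str,
                     fold: float = 20.0) -> list:
    """Genes whose signal is at least `fold` times that of the reference gene."""
    reference_signal = signal_by_gene[reference_gene]
    if reference_signal <= 0:
        raise ValueError("reference signal must be positive")
    return sorted(
        gene
        for gene, signal in signal_by_gene.items()
        if gene != reference_gene and signal / reference_signal >= fold
    )
```

The promoters of the genes returned by such a filter are the candidates for driving strong heterologous expression.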
[079] In another embodiment, the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0096 to SEQ0124, or an amino acid sequence substantially homologous thereto.
[080] In a further embodiment, the present invention is directed to the promoters of each of these 29 genes showing high expression levels. These promoters can be used for driving overexpression of heterologous genes in methylotrophic yeast such as P. pastoris. Accordingly, the present invention provides expression vectors containing any one of the promoters of the 29 genes set forth in SEQ0096 to SEQ0124, operably linked to the coding sequence of a heterologous protein. Methylotrophic yeast strains transformed with such an expression vector are also provided by the invention.
Homologs of promoters used for expression of proteins in S. cerevisiae
[081] In accordance with the present invention, 4 genes have been identified as encoding P. pastoris homologs of S. cerevisiae proteins whose promoters have been used for recombinant expression in S. cerevisiae, including SEQ0125 (homolog of S. cerevisiae glycerol-3-phosphate dehydrogenase 1 (GPD1) and GPD2), SEQ0126 (homolog of S. cerevisiae alcohol dehydrogenase 1 (ADH1) and ADH2), SEQ0127 (homolog of S. cerevisiae PHO5, although partial promoter and the ORF of the Pichia pastoris gene have been reported in the art), and SEQ0128 (homolog of S. cerevisiae sulfite reductase beta subunit ECM17). In the annotated Sequence file provided herein, the nucleotide sequences of the 5' upstream region (including promoter), the ORF and the 3' downstream region, as well as the amino acid sequence, of each of these 4 genes are set forth in SEQ0125 to SEQ0128.
[082] In one embodiment, the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0125, SEQ0126 or SEQ0128, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0125, SEQ0126 or SEQ0128.
[083] In a specific embodiment, the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0125, SEQ0126 or SEQ0128, or a nucleotide sequence that is substantially homologous thereto.
[084] In another embodiment, the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0125, SEQ0126 or SEQ0128, or an amino acid sequence substantially homologous thereto.
[085] In a further embodiment, the present invention is directed to the promoters of each of these 4 genes. These promoters can be used for driving expression of heterologous genes in Pichia pastoris or another methylotrophic yeast species. Expression vectors containing any of these promoters, operably linked to a heterologous coding sequence, and methylotrophic yeast strains transformed with any such expression vectors, are contemplated by the invention.
Genes involved in nucleotide sugar synthesis and transport
[086] In accordance with the present invention, 4 genes have been identified as encoding proteins involved in nucleotide sugar synthesis and transport, including UDP-GlcNAc transporter (SEQ0129), UDP-glucose-4-epimerase (SEQ0130), HUT1 (putative role of transporting UDP-galactose into Golgi) (SEQ0131), and a putative UDP-galactose transporter (SEQ0132). In the annotated Sequence file provided herein, the nucleotide sequences of the 5' upstream region (including promoter), the ORF and the 3' downstream region, as well as the amino acid sequence, of each of these 4 genes are set forth in SEQ0129 to SEQ0132.
[087] Efforts have been made to re-engineer the entire glycosylation pathway of Pichia pastoris in order to produce complex or hybrid type N-glycans (N-glycosylation humanization technology (refs. 1, 2); see also Fig. 3b). The heterologous glycosyltransferases needed for this engineered glycosylation processing utilize the sugar-nucleotides UDP-GlcNAc and UDP-Gal as monosaccharide donors. The identification of a UDP-GlcNAc transporter in the Pichia genome is consistent with the fact that UDP-GlcNAc is known to be synthesized in yeasts for the synthesis of cell wall chitin. However, no galactosylated glycoconjugates in P. pastoris have been previously described. It has been shown that the mere overexpression of a Pichia Golgi-targeted version of human beta-1,4-galactosyltransferase I is sufficient to achieve galactosylation of secreted glycoproteins (ref. 7). While this finding may suggest that Pichia produces UDP-Gal and transports it into the Golgi apparatus, the present invention provides the molecular basis for the first time by identifying an endogenous cytoplasmic UDP-Glc-4-epimerase, and clear homologs of Golgi UDP-Galactose transporters. See also Table 5B. Researchers have previously overexpressed a heterologous UDP-Glc-4-epimerase in fusion to the galactosyltransferase to achieve higher levels of UDP-Gal in the yeast Golgi apparatus. The identification of the Pichia pastoris proteins involved in nucleotide sugar synthesis and transport permits more effective glycan engineering in this and other methylotrophic yeasts.
[088] In one embodiment, the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0129 to SEQ0132, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0129 to SEQ0132.
[089] In a specific embodiment, the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0129 to SEQ0132, or a nucleotide sequence that is substantially homologous thereto.
[090] In another embodiment, the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0129 to SEQ0132, or an amino acid sequence substantially homologous thereto.
[091] In a further embodiment, the present invention is directed to expression vectors to achieve increased expression in Pichia of native glycosylation precursor synthesis enzymes or transporters, e.g. UDP-Gal or UDP-GlcNAc transporters and UDP-Glc-4-epimerase for ER or Golgi localization. It is believed that increased expression of these proteins can increase homogeneity of final hybrid or complex type glycoform on the heterologous protein designed to be expressed in the same strain. The expression vectors contain the coding sequence of one of the desirable glycosylation precursor synthesis enzymes or transporters, which is placed in operable linkage to a promoter functional in recipient host cells. The promoters that control the expression of the precursor synthesis enzymes or transporters can be selected to be induced by similar conditions to those directing the expression of the heterologous protein.
[092] In another embodiment, the invention provides methylotrophic yeast strains, especially Pichia strains, which overexpress one or more native or Pichia pastoris glycosylation precursor synthesis enzymes or transporters, e.g. UDP-Gal or UDP-GlcNAc transporters and UDP-Glc-4-epimerase for ER or Golgi localization.
Genes involved in O-glycosylation
[093] In accordance with the present invention, 5 genes have been identified as encoding proteins involved in O-glycosylation. In the annotated Sequence file provided herein, the nucleotide sequences of the 5' upstream region (including promoter), the ORF and the 3' downstream region, as well as the amino acid sequence, of each of these 5 genes are set forth in SEQ0133 to SEQ0137.
[094] In one embodiment, the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0133 to SEQ0137, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0133 to SEQ0137.
[095] In a specific embodiment, the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0133 to SEQ0137, or a nucleotide sequence that is substantially homologous thereto.
[096] In another embodiment, the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0133 to SEQ0137, or an amino acid sequence substantially homologous thereto.
[097] In addition to N-glycosylation, yeasts also O-glycosylate secreted proteins with oligomannosyl-glycans that differ from the mucin-type O-glycosylation in humans. No robust engineering approach has yet been developed to overcome this issue prior to this invention. The identification herein of the Pichia protein-O-mannosyltransferases that initiate this modification in the ER permits genetic modification of Pichia to reduce O-glycosylation.
[098] Specifically, in a further embodiment, the present invention is directed to methylotrophic yeast strains, e.g., Pichia such as P. pastoris strains, in which one or more of the identified O-mannosyl transferase genes are inactivated to reduce or eliminate unwanted O-glycosylation of a heterologous protein.
Genes encoding mannosyltransferases
[099] In accordance with the present invention, 18 genes have been identified as encoding mannosyltransferases, including 14 α-mannosyltransferases. In the annotated Sequence file provided herein, the nucleotide sequences of the 5' upstream region (including promoter), the ORF and the 3' downstream region, as well as the amino acid sequence, of each of these 18 genes are set forth in SEQ0138-0152 and 0210-0212.
[0100] In one embodiment, the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0138-0152 or 0210-0212, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0138-0152 or 0210-0212.
[0101] In a specific embodiment, the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0138-0152 or 0210-0212, or a nucleotide sequence that is substantially homologous thereto.
[0102] In another embodiment, the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0138 to SEQ0152, or an amino acid sequence substantially homologous thereto.
[0103] In terms of N-glycosylation, methylotrophic yeasts such as P. pastoris modify proteins with a range of heterogeneous high-mannose glycans, which introduce a large amount of heterogeneity in the protein (reducing downstream processing efficiency and complicating product characterization) and induce fast clearance from the bloodstream. The highly immunogenic terminal alpha-1,3-mannosyl glycotypes that are abundantly produced by S. cerevisiae are not detected on Pichia-produced glycoproteins. Consistently, no ortholog of the S. cerevisiae MNN1 gene (encoding the alpha-1,3-mannosyltransferase) has been found in the Pichia genome. However, Pichia glycoproteins can in some cases be modified with beta-1,2-mannose residues, reminiscent of antigenic epitopes on the Candida albicans cell wall. P. pastoris AMR2 beta-mannosyltransferase, which has been documented in the art, has been identified in the genome. Three homologs of AMR2 beta-mannosyltransferase have also been identified (SEQ0210-0212), thus providing the basis for reducing the levels of undesired beta-mannosylation.
[0104] Specifically, in a further embodiment, the present invention is directed to methylotrophic yeast strains, particularly Pichia strains, in which one or more of the identified P. pastoris mannosyltransferase genes are inactivated, for example, the homolog genes of AMR2 beta-mannosyltransferase as set forth in SEQ0210-0212, to reduce or eliminate unwanted beta-mannosylation of a heterologous protein.
Genes encoding proteins involved in ER glycosylation pathway
[0105] In accordance with the present invention, 13 genes have been identified as encoding proteins involved in the ER glycosylation pathway. In the annotated Sequence file provided herein, the nucleotide sequences of the 5' upstream region (including promoter), the ORF and the 3' downstream region, as well as the amino acid sequence, of each of these 13 genes are set forth in SEQ0153 to SEQ0165.
[0106] In one embodiment, the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0153 to SEQ0165, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0153 to SEQ0165.
[0107] In a specific embodiment, the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0153 to SEQ0165, or a nucleotide sequence that is substantially homologous thereto.
[0108] In another embodiment, the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0153 to SEQ0165, or an amino acid sequence substantially homologous thereto.
[0109] In a further embodiment, the invention is directed to methylotrophic yeast strains, especially Pichia strains such as P. pastoris, in which one or more of the genes as set forth in SEQ0153-0165 have been inactivated. An alternative to modification of glycans after transfer to a protein is the modification of the glycan precursor before transfer to the protein. Inactivation of enzyme activities in the synthesis of the glycan precursor in the ER can result in glycolysation of proteins with modified glycan structures. In the event that inactivation of one of the genes as set forth in SEQO 153-0165 results in a lethal phenotype as the generated
glycan-precursor is a non-optimal substrate for further processing steps, overexpression of a downstream enzyme, or expression of a modified or alternative downstream enzyme (e.g., LmSTT3 as further discussed below), may overcome this defect.
[0110] In still another embodiment, the invention is directed to the use of the Pichia pastoris STT3 gene sequence as set forth in SEQO 163 to disrupt the chromosomal STT3 gene, and optionally, to further insert a heterologous STT3 gene, such as the Leishmania STT3 gene into the Pichia pastoris STT3 locus.
[0111] In yeast, STT3 is a part of the oligosaccharyl-transferase (OT) complex which transfers the lipid-linked oligosaccharide to the protein. Leishmania major has four STT3 paralogues, of which 3 could complement a yeast stt3 deletion. Furthermore, it has been reported that LmSTT3 does not work in the OT-complex but is active as a dimeric complex. It is suggested that the various LmSTT3 dimeric complexes display different protein substrate specificities at the level of individual glycosylation sequences. In addition, the LmSTT3D dimeric complex has a relaxed specificity with respect to the lipid-linked oligosaccharide substrate. In contrast to the homogenous OT-complex, LmSTT3 OTase has no reduced transfer efficiency of glycans lacking α-1,2-linked mannoses on the B and C branch12. Replacing the Pichia pastoris STT3 by LmSTT3D can provide glycosylation flexibility when the native OT-complex of Pichia pastoris is unable to transfer modified lipid-linked oligosaccharides to the protein.
[0112] "Knock-in" of LmSTT3 - The coding sequence of the LmSTT3D gene is amplified by PCR using specific primers. The coding sequence is subsequently cloned into the expression plasmid using unique restriction sites, incorporated in the primers, which place the coding sequence under control of a promoter of the expression plasmid. The expression vector contains a sequence of the Pichia pastoris STT3 gene ("PpSTT3"). Prior to transformation into the P. pastoris strain, the plasmid containing the LmSTT3 expression cassette is digested at a unique site in the PpSTT3 sequence to facilitate integration. Transformants are selected by using the selection marker present on the expression plasmid. To evaluate whether glycosylation is increased, N-glycans derived from secreted glycoproteins are analyzed by DSA-FACE capillary electrophoresis.
[0113] "Knock-out" of PpSTT3 - The coding sequence of the LmSTT3D gene is amplified by PCR using specific primers. The coding sequence is subsequently cloned into the expression plasmid using unique restriction sites, incorporated in the primers, which place the coding sequence under control of a promoter of the expression plasmid. The expression vector contains two Pichia pastoris sequences flanking the LmSTT3 expression cassette; these sequences lie upstream and downstream of the PpSTT3 gene. Prior to transformation into the P. pastoris strain, the plasmid containing the LmSTT3 expression cassette is digested with restriction endonuclease(s), excising the LmSTT3 cassette together with the flanking PpSTT3 sequences. Transformants are selected by using the selection marker present in the expression cassette. To evaluate whether glycosylation is increased, N-glycans derived from secreted glycoproteins are analyzed by DSA-FACE capillary electrophoresis.
Other genes encoding proteins involved in the glycosylation pathway
[0114] In accordance with the present invention, 3 genes have been identified as encoding additional proteins involved in the glycosylation pathway (S. cerevisiae MNN4 homologs). In the annotated Sequence file provided herein, the nucleotide sequences of the 5' upstream region (including promoter), the ORF and the 3' downstream region, as well as the amino acid sequence, of each of these 3 genes are set forth in SEQ0166 to SEQ0168.
[0115] In one embodiment, the invention is directed to methylotrophic yeast strains which overexpress one or more of the MNN4 homologs to promote the core type phosphorylation of N-glycans. Increased phosphorylation of recombinant proteins can be useful in directing the protein for uptake through the mannose-6-phosphate receptor13. Expression vectors for achieving such elevated expression are also part of the present invention. To evaluate whether manno-phosphorylation is increased in such modified strains, N-glycans derived from secreted glycoproteins after 48 hours of culture in YPD medium are analyzed by DSA-FACE capillary electrophoresis. The amount of Man8GlcNAc2 will be drastically reduced in favor of two structures that migrate faster than Man8GlcNAc2 and that are likely to contain one (P) and two (PP) phosphate residues, respectively. Assuming that both peaks derive from the Man8GlcNAc2 peak, the amount of Man8GlcNAc2 converted to phosphorylated glycans can be quantitated.
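As an illustration of the quantitation mentioned above, the following minimal sketch computes the fraction of the Man8GlcNAc2 pool converted to phosphorylated glycans from electropherogram peak areas. It assumes, as the text does, that the P and PP peaks derive only from Man8GlcNAc2; the function name and the peak areas are hypothetical and are not taken from the application.

```python
def phosphorylated_fraction(man8_area, p_area, pp_area):
    """Fraction of the original Man8GlcNAc2 pool found in the P and PP peaks.

    Assumes the P and PP peaks derive only from Man8GlcNAc2, so the
    original pool equals the sum of the three peak areas.
    """
    total = man8_area + p_area + pp_area
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return (p_area + pp_area) / total


# Hypothetical peak areas in arbitrary units (illustrative only, not measured data)
fraction = phosphorylated_fraction(man8_area=20.0, p_area=50.0, pp_area=30.0)
print(f"{fraction:.2f}")
```

Comparing this fraction between the parent strain and the MNN4-overexpressing strain would indicate whether manno-phosphorylation has increased.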
Genes encoding proteins involved in methanol metabolism
[0116] In accordance with the present invention, 10 genes have been identified as encoding proteins involved in methanol metabolism. In the annotated Sequence file provided herein, the nucleotide sequences of the 5' upstream region (including promoter), the ORF and the 3' downstream region, as well as the amino acid sequence, of each of these 10 genes are set forth in SEQ0169 to SEQ0178. While some of these genes have been documented in the art, the promoters have not been identified prior to the present application.
[0117] The commonly used methanol-inducible promoters in P. pastoris, the alcohol oxidase I (AOXI) promoter and the formaldehyde dehydrogenase (FLD) promoter, drive the production of enzymes needed for methanol assimilation and therefore produce extremely high levels of these transcripts upon switching the carbon source to methanol. The P. pastoris genome sequence has now allowed identification of all genes coding for enzymes involved in methanol assimilation (Fig. 3A and Table 5A) and their promoters, which are useful for driving transgene expression in P. pastoris.
[0118] In one embodiment, the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0172 to SEQ0174 or SEQ0176 to SEQ0178, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0172 to SEQ0174 or SEQ0176 to SEQ0178.
[0119] In a specific embodiment, the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0172 to SEQ0174 or SEQ0176 to SEQ0178, or a nucleotide sequence that is substantially homologous thereto.
[0120] In another embodiment, the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0172 to SEQ0174 or SEQ0176 to SEQ0178, or an amino acid sequence substantially homologous thereto.
[0121] In a further embodiment, the present invention is directed to the promoters of the genes disclosed in SEQ0169 to SEQ0178. These promoters can be placed in an operable linkage to a heterologous gene for methanol-inducible recombinant expression in Pichia pastoris. Therefore, expression vectors, host cells and methods of recombinant expression by utilizing any of the promoters disclosed in SEQ0169 to SEQ0178 are also embodiments of the present invention.
Genes encoding protein homologs of S. cerevisiae proteases and protease inhibitors
[0122] In accordance with the present invention, 8 genes have been identified as encoding protein homologs of S. cerevisiae proteases or protease inhibitors. In the annotated Sequence file provided herein, the nucleotide sequences of the 5' upstream region (including promoter), the ORF and the 3' downstream region, as well as the amino acid sequence, of each of these 8 genes are set forth in SEQ0179 to SEQ0186. SEQ0179 sets forth a Pichia pastoris gene encoding a serine-type peptidase. SEQ0180 sets forth a Pichia pastoris gene encoding a serine-type endopeptidase inhibitor. SEQ0181 to SEQ0186 set forth Pichia pastoris genes coding for aspartic-type endopeptidases.
[0123] In one embodiment, the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0179 to SEQ0186, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0179 to SEQ0186.
[0124] In a specific embodiment, the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0179 to SEQ0186, or a nucleotide sequence that is substantially homologous thereto.
[0125] In another embodiment, the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0179 to SEQ0186, or an amino acid sequence substantially homologous thereto.
[0126] In a further embodiment, the present invention is directed to vectors for inactivating one or more of the Pichia proteases, and to Pichia strains having one or more of the protease genes inactivated.
[0127] In another embodiment, the present invention is directed to expression vectors capable of expressing the serine-type endopeptidase inhibitor (SEQ0180) in a methylotrophic yeast strain such as P. pastoris, and to methylotrophic yeast strains such as P. pastoris transformed with such an expression vector to produce the endopeptidase inhibitor at an elevated level.
[0128] The protease-deficient Pichia strains and strains that produce the endopeptidase inhibitor at an elevated level (i.e., overexpression) are believed to allow more stable accumulation of a heterologous protein expressed in these strains. Such strains may be useful especially for producing recombinant immunoglobulins.
Genes encoding chaperones
[0129] In accordance with the present invention, 14 genes have been identified as encoding chaperones. In the annotated Sequence file provided herein, the nucleotide sequences of the 5' upstream region (including promoter), the ORF and the 3' downstream region, as well as the amino acid sequence, of each of these 14 genes are set forth in SEQ0187 to SEQ0200.
[0130] In one embodiment, the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0187 to SEQ0200, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0187 to SEQ0200.
[0131] In a specific embodiment, the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0187 to SEQ0200, or a nucleotide sequence that is substantially homologous thereto.
[0132] In another embodiment, the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0187 to SEQ0200, or an amino acid sequence substantially homologous thereto.
[0133] The present invention has now provided a complete catalog of orthologs of the S. cerevisiae ER folding machinery. This information is especially useful for design of an efficacious folding system. In one embodiment, the present invention is directed to expression vectors and engineered Pichia strains for increased expression of one or more native Pichia chaperones. The chaperone coding sequences can be placed under the control of a promoter selected to be induced by conditions similar to those used for heterologous protein expression. Expression libraries of multiple chaperones can be screened to identify the most effective combination of chaperone expression for a particular heterologous protein. For example, a series of libraries of chaperones can be created, with each library having a different drug resistance marker incorporated for selection, such that either successive or combinatorial introduction of members of these libraries into a strain expressing a heterologous protein may be selected.
5S ribosomal RNA gene
[0134] In accordance with the present invention, the P. pastoris 5S ribosomal RNA gene has been identified and set forth in SEQ0201.
[0135] One strategy for optimizing protein expression levels in a recipient cell is to increase copy number of the expression cassette. This effort has been hampered by the absence of knowledge regarding sequences which occur multiple times in the Pichia genome and could be used for stable multi-copy strain generation through homologous recombination-mediated targeting of such multi-copy sequences. The 5S rRNA coding sequence of Pichia pastoris provides the basis for multi-copy targeting. Contrary to the situation in S. cerevisiae, the 5S rRNA coding sequence is not a part of the rDNA repeat locus in Pichia pastoris, and many copies of the 5S rRNA coding sequence are spread over the 4 chromosomes of Pichia pastoris (Fig. 1B), thus providing an ideal targeting site for multi-copy integration.
[0136] To practice this aspect of the invention, the 5S rRNA coding sequence can be placed in a vector which also carries an expression cassette of interest and a selectable marker. The selectable marker can be a dominant selection marker that confers drug resistance, or a marker that confers a phenotype selectable based on prototrophy. A unique restriction site is made or designed to be available within the 5S rRNA coding sequence. The vector is linearized by using the restriction enzyme that cleaves in the 5S rRNA coding sequence. The linearized vector is transformed into Pichia pastoris, and drug-resistant clones are isolated at increasing drug concentrations. Those clones that are resistant to the highest drug concentrations are expected to be those that have taken up the largest number of expression cassettes. Alternatively, when the selectable marker supplies an enzyme (e.g., HIS4), the clones that grow faster in the appropriate selective media are identified as having multicopy integration. This procedure can be repeated until an optimal number of expression cassettes in the strain is obtained. Protein production in the selected strains is evaluated using methods known in the art.
[0137] Accordingly, the present invention also provides expression vectors capable of mediating multi-copy integration of an expression cassette onto the chromosomes of Pichia pastoris, as well as Pichia pastoris containing multiple copies of the expression cassette, stably integrated into the chromosomes. An expression cassette, as used in the present context, refers to a nucleic acid that includes, from 5' to 3', a promoter, the coding sequence of a heterologous protein of interest, and a 3' downstream sequence including a transcription termination sequence.
Genes encoding proteins involved in xylose, arabinose and trehalose metabolism
[0138] In accordance with the present invention, 8 genes have been identified as encoding proteins involved in xylose, arabinose or trehalose metabolism. In the annotated Sequence file provided herein, the nucleotide sequences of the 5' upstream region (including promoter), the ORF and the 3' downstream region, as well as the amino acid sequence, of each of these 8 genes are set forth in SEQ0202 to SEQ0209.
[0139] In one embodiment, the present invention is directed to isolated nucleic acid molecules comprising a nucleotide sequence that encodes a protein (e.g., a full-length protein) as set forth in any of SEQ0202 to SEQ0209, or encodes a protein that is substantially homologous to a full-length protein as set forth in any of SEQ0202 to SEQ0209.
[0140] In a specific embodiment, the present invention is directed to isolated nucleic acid molecules comprising the coding sequence of an ORF as set forth in any of SEQ0202 to SEQ0209, or a nucleotide sequence that is substantially homologous thereto.
[0141] In another embodiment, the invention is drawn to an isolated protein comprising an amino acid sequence (e.g., a full-length protein sequence) as set forth in any of SEQ0202 to SEQ0209, or an amino acid sequence substantially homologous thereto.
[0142] In still another embodiment, the present invention is directed to the promoters of the genes disclosed in SEQ0202 to SEQ0209. These promoters, which are expected to be induced by specific sugars (the xylose and arabinose metabolism pathways by C5 sugars, and the trehalose pathway by α-1,1-disaccharides), can be placed in an operable linkage to a heterologous gene for inducible recombinant expression in Pichia pastoris. Therefore, expression vectors, host cells and methods of recombinant expression by utilizing any of the promoters disclosed in SEQ0202 to SEQ0209 are also embodiments of the present invention.
General Methodology
[0143] Gene knockouts
[0144] To reduce the protease, protein-O-mannosyltransferase or beta-mannosyltransferase activity in Pichia pastoris, the genes encoding these enzymes are inactivated. This is achieved through standard yeast genetics techniques. Examples of such techniques include gene replacement through double homologous recombination, in which homologous regions flanking the gene to be inactivated are cloned in a vector on either side of a selectable marker gene (such as an antibiotic resistance gene or a gene complementing an auxotrophy of the yeast strain). Alternatively, the homologous regions can be PCR-amplified and linked through overlapping PCR to the selectable marker gene. Subsequently, such DNA fragments are transformed into Pichia pastoris through methods known in the art, e.g., electroporation. Transformants that then grow under selective conditions are analyzed for the gene disruption event through standard techniques, e.g., PCR on genomic DNA or Southern blot. In an alternative experiment, gene inactivation can be achieved through single homologous recombination, in which case, e.g., the 5' end of the gene's ORF is cloned on a promoterless vector also containing a selectable marker gene. After linearization through digestion with a restriction enzyme that cuts the vector only in the target-gene homologous fragment, the vector is transformed into Pichia pastoris. Integration at the target gene site is confirmed through PCR on genomic DNA or Southern blot. In this way, a duplication of the gene fragment cloned on the vector is achieved in the genome, resulting in two copies of the target gene locus: a first copy in which the ORF is incomplete, thus resulting in the expression (if at all) of a shortened, inactive protein, and a second copy which has no promoter to drive transcription.
[0145] Alternatively, transposon mutagenesis is used to inactivate the target gene. A library of such mutants can be screened through PCR for insertion events in the target gene.
[0146] The functional phenotype (i.e., deficiencies) of an engineered/knockout strain can be assessed using techniques known in the art. For example, a deficiency of an engineered strain in protease activity can be ascertained using any of a variety of methods known in the art, such as an assay of hydrolytic activity on chromogenic protease substrates or band shifts of substrate proteins for the selected protease, among others. A deficiency in protein O-mannosylation can be detected by mass changes of expressed O-glycosylated proteins and through mass spectrometric techniques designed to detect the sites of O-glycosylation (such as beta-elimination in 18O-H2O), used in a comparative experiment with the same protein expressed in a non-engineered strain. Beta-mannosyltransferase deficiency can be detected through glycan analysis of expressed proteins that are beta-mannosylated in non-engineered strains or through loss of signal with beta-mannosyl specific antibodies.
[0147] Gene over-expression
[0148] To increase the expression level in Pichia pastoris of desirable genes such as polypeptide-folding-promoting proteins, UDP-Gal or UDP-GlcNAc transporters and UDP-Glc-4-epimerase, the ORFs or coding sequences of such genes are cloned under the control of a promoter of desired strength and regulation (such as the methanol-inducible AOXI, AOXII or FLD promoters, or the constitutively expressed GAP or TEF1alpha promoter), in a vector also containing a selectable marker gene functional in Pichia pastoris. To increase transformation efficiency and/or target the vector to a selected genomic locus, the vector may be linearized in a genome-homologous sequence, although this is not essential. Subsequently, such vector is transformed into Pichia pastoris using techniques known in the art.
[0149] Promoter analysis and usage
[0150] Cloning of the promoters and generating engineered yeast strains - The promoter sequence is amplified from Pichia pastoris genomic DNA with primers designed to the 5' and 3' regions of the sequences provided herein. The corresponding PCR fragment is cloned in an appropriate vector by blunt or TA ligation or by restriction-ligation cloning using restriction sites, not present in the promoter, incorporated in the primers. The promoter is subsequently cloned upstream of a gene of interest in a yeast expression vector. The expression vector is transferred into the Pichia pastoris strain according to protocols described by Cregg and Russel. To generate stable transformants via homologous recombination, the vector is cleaved in a P. pastoris DNA segment (e.g., the P. pastoris promoter) with unique restriction sites. This allows integration into the genome by single crossover type insertion15. Transgenic yeast can be obtained on medium containing the selection marker or on medium lacking the complemented amino acid.
[0151] Analyzing promoter activity - To analyze the promoter activity, stable strains are generated, for example a strain containing a marker gene under control of the promoter of interest and a reference strain with the marker gene controlled by a reference promoter. Quantitative analysis of expression levels of this marker can determine the relative activity of the promoter.
[0152] Identifying promoter sequence and cis-acting elements - For the identification of the promoter sequence, a 5'-deletion series is generated by PCR with P. pastoris genomic DNA as template. For this deletion series, a set of forward primers is designed which hybridize at different distances from the start. As reverse primer, a primer is used that hybridizes to the 3' end of the suggested promoter sequence. In a first screen, the promoter deletions can be made in steps of 100 bp. This gives a rough estimate of the promoter size. A second screen in this region can then be performed with smaller deletion steps. The promoter deletions are cloned in an expression vector, in front of a marker gene which allows the expression to be quantified. Analysis of expression levels compared to the whole sequence will allow identification of the promoter.
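The first-pass deletion series described above can be sketched as follows. This is only an illustration of how the 5' truncation points might be enumerated; the function names and the 450 bp placeholder sequence are hypothetical, and actual forward primers would still have to be designed at each truncation point.

```python
def deletion_series_starts(promoter_length, step=100):
    """Return 0-based 5' start offsets for a deletion series.

    Offset 0 is the full-length fragment; each further offset removes
    another `step` bp from the 5' end. The 3' end is shared by all fragments.
    """
    if promoter_length <= 0 or step <= 0:
        raise ValueError("promoter_length and step must be positive")
    return list(range(0, promoter_length, step))


def deletion_fragments(promoter_seq, step=100):
    """Return the 5'-truncated fragments, longest first."""
    return [promoter_seq[start:]
            for start in deletion_series_starts(len(promoter_seq), step)]


# Hypothetical 450 bp placeholder sequence (not a real promoter)
promoter = "A" * 450
for fragment in deletion_fragments(promoter, step=100):
    print(len(fragment))
```

Re-running the same function on the narrowed region with a smaller `step` corresponds to the second, finer screen.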
[0153] To identify cis-acting elements, a series of 3'-deletion fragments of the promoter are obtained next to the 5'-deletions in a similar fashion. In addition, a minimal promoter is fused to the 3'-deletion fragments to compensate for the loss of the TATA-box. Further, combinations of the 3' deletions (without minimal promoter) and 5' deletions are made by overlapping PCR. Expression vectors are generated containing the promoter fragments controlling the expression of a marker gene. Analysis of the activity of these promoter fragments can identify regulatory elements in the promoter.
[0154] In addition to the methanol-inducible promoters of the Alcohol Oxidase I gene and the formaldehyde dehydrogenase I gene, and the constitutively active promoter of the glyceraldehyde-3-phosphate dehydrogenase gene, the present invention provides additional Pichia pastoris genes involved in the glycolysis pathway, methanol assimilation, and xylose, arabinose and trehalose metabolism. The promoters of these genes can be used for driving overexpression of heterologous genes.
[0155] The promoters of the genes in the methanol utilization pathway, the xylose metabolism, the arabinose metabolism and the trehalose metabolism can be used for inducible expression of a heterologous gene. To assess the inducibility of these promoters by methanol, xylose, arabinose or trehalose, the expression of their endogenous gene can be compared by qPCR of cells grown in medium with or without addition of the respective carbon sources. For example, P. pastoris wild type cells are grown in BMGY (uninduced condition), centrifuged and induced in BMMY (for methanol induction), in media containing xylose, in media containing arabinose or in media containing trehalose, respectively. Samples are taken at different time points. At every time point, RNA is isolated from the yeast cells and treated with DNase to remove all genomic DNA, and the mRNA is then converted into cDNA. For all genes to be tested, primers are designed (with Primer3Plus software) and tested for amplification of the gene. To be able to compare the expression of the different genes, a good reference gene (e.g., actin, GAPDH or PDA1) is included. All reference genes are tested in uninduced and induced conditions by qPCR and the best reference gene is chosen by the geNorm software. After selecting the best reference gene, qPCR with primers on the reference gene and primers on the gene of interest can be performed on every sample. Thereafter, the relative quantity (RQ = 2^ΔCq, with ΔCq the difference in PCR cycles between induced and non-induced condition) of every gene can be calculated (qBasePlus software). From these RQs, the ratio of the gene of interest (GOI) relative to the reference gene (NRQ = RQ_GOI/RQ_REF) is determined. When different genes are investigated, the gene that is most upregulated in the induced condition can be identified. The promoter driving this gene can then be used to drive expression of a heterologous test gene.
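The RQ and NRQ calculation described above can be illustrated with a short sketch. It assumes 100% amplification efficiency (a base of 2) and takes ΔCq as Cq(non-induced) minus Cq(induced), so that an induced gene gives RQ greater than 1; this sign convention, the function names and the Cq values are assumptions made for illustration and do not reproduce the qPCR analysis software mentioned in the text.

```python
def relative_quantity(cq_uninduced, cq_induced):
    """RQ = 2^dCq, with dCq = Cq(non-induced) - Cq(induced).

    Assumes 100% amplification efficiency (assumption of this sketch).
    """
    return 2.0 ** (cq_uninduced - cq_induced)


def normalized_relative_quantity(goi_cq, ref_cq):
    """NRQ = RQ_GOI / RQ_REF; each argument is (cq_uninduced, cq_induced)."""
    return relative_quantity(*goi_cq) / relative_quantity(*ref_cq)


# Hypothetical Cq values, for illustration only
goi = (28.0, 22.0)   # gene of interest: 6 cycles earlier when induced
ref = (18.0, 17.0)   # reference gene: 1 cycle earlier when induced
print(normalized_relative_quantity(goi, ref))  # 2**6 / 2**1 = 32.0
```

Ranking candidate genes by NRQ at each time point would then point to the most strongly induced promoter.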
[0156] Counterpart genes from other methylotrophic yeast species
[0157] Given the availability of the Pichia pastoris genome and annotation, counterpart genes from other methylotrophic yeast species, e.g., other Pichia species, can be isolated. Isolation of such counterpart genes can be useful in generating engineered methylotrophic yeast strains other than Pichia pastoris for recombinant production of heterologous glycoproteins.
[0158] Other methylotrophic yeast species as genetic engineering host
[0159] The present invention contemplates the use of a methylotrophic yeast strain, including but not limited to Pichia strains such as Pichia pastoris, as the host for genetic engineering and recombinant production of proteins including glycoproteins. While native Pichia pastoris genes and their constituents (such as promoters, signal peptides, and proteins including enzymes and chaperones) are preferred choices for engineering a Pichia pastoris strain, these choices are believed to also work for other methylotrophic yeast strains, especially other Pichia strains closely related to Pichia pastoris.
[0160] Combination of genetic modifications
[0161] Embodiments described herein can be combined as appropriate. For example, genetic modifications of a strain, such as usage of inducible promoters, usage of signal peptides, inactivation of one or more protease genes, overexpression of nucleotide sugar synthesis and transport, inactivation of O- or beta-mannosylation, expression of chaperones, and multicopy integration, can be combined in any manner desirable.
[0162] The present invention is further supported and illustrated by the following examples.
EXAMPLE-I
[0163] This Example describes the methods employed to identify, assemble and annotate the full genomic sequence of the P. pastoris GS115 strain.
DNA preparation
[0164] P. pastoris GS115 (Invitrogen, Carlsbad, CA) is a strain derived from the wild type strain NRRL-Y 11430 (Northern Regional Research Laboratories, Peoria, IL). It has a mutation in the histidinol dehydrogenase gene (HIS4) and was generated by nitrosoguanidine mutagenesis at Phillips Petroleum Co. It is the most frequently used Pichia strain for heterologous protein production.
[0165] P. pastoris genomic DNA was prepared according to a published protocol with minor modifications. Instead of vortexing, the samples were shaken in a Mixer Mill (Retsch) for 2 minutes.
Sample preparation and sequencing with Roche/454 Genome Sequencer FLX
[0166] The shotgun library of P. pastoris for sequencing on the Genome Sequencer FLX (GS FLX) was prepared from five micrograms of intact genomic DNA. Based on random cleavage of the genomic DNA16 with subsequent removal of small fragments with AMPure™ SPRI beads (Agencourt, Beverly, MA), the resulting single-stranded (sst) DNA library showed a fragment distribution between 300 and 900 bp with a maximum of 574 bp. The optimal amount of sstDNA library input for the emulsion PCR16 (emPCR) was determined empirically through two small-scale titrations, leading to 1.5 molecules per bead used for the large-scale approach. A total of 64 individual emPCRs were performed to generate 3,974,400 DNA-carrying beads for two two-region-sized 70x75 PicoTiterPlates (PTP), and each region was loaded with 850,000 DNA-carrying beads. Each of the two sequencing runs was performed for a total of 100 cycles of nucleotide flows16 (flow order TACG) and the 454 Life Sciences/Roche Diagnostics software Version 1.1.03 was used to perform the image and signal processing. The information about read flowgram (trace) data, basecalls and quality scores of all high quality shotgun library reads was stored in a Standard Flowgram Format (SFF) file which was used by the subsequent computational analysis (see below).
[0167] Within this sequencing project, a paired end library of P. pastoris (strain GS115) was prepared for subsequent ordering and orienting of contigs (see computational analysis below). Six micrograms of intact genomic DNA was sheared hydrodynamically (Hydroshear - Genomic Solutions, Ann Arbor, MI) and purified with AMPure™ SPRI beads (Agencourt, Beverly, MA) into DNA fragments of about 3 kbp in length. After methylation of EcoRI restriction sites, a biotinylated hairpin adaptor was ligated to the ends of the P. pastoris DNA fragments, followed by EcoRI digestion with a subsequent circularization. The restriction of the circularized DNA fragments with MmeI, the subsequent ligation of paired end adaptors and the amplification of the remaining DNA fragments resulted in a double-stranded paired end library 130 bp in length. For the following eight individual emPCRs of the paired end library, 1.5 molecules per bead were used to generate 339,480 DNA-carrying beads, of which 280,000 were loaded onto a region of a four-region-sized 70x75 PTP. The subsequent sequencing run with the GS FLX was performed for a total of 42 cycles of nucleotide flow (see above) and the 454 Life Sciences/Roche Diagnostics software Version 1.1.03 was used to perform the image and signal processing. The information about read flowgram (trace) data, basecalls and quality scores of all high quality shotgun library reads was also stored in an SFF file which was used by the subsequent computational analysis.
Computational analysis of GS FLX shotgun and paired end reads
[0168] An automatic assembly pipeline (in-house software, Eurofins MWG Operon) was used to de novo assemble the generated shotgun and paired end reads.
[0169] For de novo assembly of the P. pastoris genome sequence, a total of 897,197 good quality base called, clipped shotgun reads with an average read length of 243 bp and a total of 70,500 good quality base called, clipped 20 bp-Paired End tag reads were used.
[0170] Within this pipeline, the information about all sequences and their quality was extracted from the SFF file into a FASTA file and subsequently converted into CAF format, the input format of choice of the assembler used for contig creation, MIRA (version 2.9.26x3). The provided mate and size information (i.e., forward and reverse read and the 3 kbp of length) of the paired end reads was used to scaffold the resulting contigs from the de novo assembly.
Assembly (Fig. 1A and Fig. 4)
[0171] The initial assembly contained 1,154 contigs with 9.6 Mbp sequence and 20x sequencing depth. The contig N/L50 was 40/77 kbp. Assembly of the contigs was performed manually, based on homology between the contig ends. 13 contigs were assigned to chromosomes by identification of the chromosomal markers previously described10 (Chromosome 1: HIS4, ARG4, OCH1, PAS5, PRB1, PRC1; Chromosome 2: PAS8, GAP; Chromosome 3: DAS1, URA3, PEP4; Chromosome 4: AOX1, AOX2). Starting from these contigs, contigs with homologous contig ends were identified by BLASTN search with 500-1000 bp of the contig ends to a database with the contig sequences. Contigs sharing homology with a P-value < e-20 were assumed to be linked. Pools of potentially linked contigs were assembled to supercontigs by the SeqMan assembly software (DNASTAR Inc., Madison, WI, USA). The resulting contig junctions were curated by removing the low-coverage ends of either joined contig. In the cases where the BLASTN P-value was >e-50, the junction was PCR-amplified and Sanger-sequenced (primer sequences: Table 1). This resulted in 10 supercontigs, with 9.1 Mbp of sequence, and a remaining 7 unassembled contigs. The supercontig N/L50 was 3/1.544 Mbp.
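The N/L50 statistic quoted above can be illustrated with a short sketch. It assumes the convention used in this paragraph, in which N is a number of contigs and L is a length; the function name and the toy lengths are hypothetical and are not data from this disclosure.

```python
def n_l50(lengths):
    """Return (N, L): the smallest number of contigs N whose combined length
    covers at least half of the assembly, and the length L of the shortest
    contig in that set."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return count, length


# Hypothetical toy assembly (lengths in kbp), for illustration only.
toy_contigs = [100, 80, 60, 40, 20]
print(n_l50(toy_contigs))  # (2, 80)
```

Read this way, a contig N/L50 of 40/77 kbp means that the 40 longest contigs, each at least 77 kbp long, together hold at least half of the assembled sequence.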
[0172] The mitochondrial genome was also assembled and had extremely high coverage (859.9 fold), indicating the presence of approx. 43 mitochondrial genomes per cell in P. pastoris when grown on glucose as carbon source.
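The copy-number estimate above follows from a ratio of sequencing depths. The sketch below assumes that the nuclear genome is covered about 20x, as stated for the initial assembly, and that read depth scales linearly with copy number; the function name is illustrative.

```python
def copies_from_coverage(element_coverage, genome_coverage):
    """Estimate copies per genome equivalent from relative read depth."""
    if genome_coverage <= 0:
        raise ValueError("genome coverage must be positive")
    return element_coverage / genome_coverage


# 859.9-fold mitochondrial coverage against roughly 20-fold nuclear coverage.
mito_copies = copies_from_coverage(859.9, 20.0)
print(round(mito_copies))  # 43
```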
Gap joining and finishing
[0173] Supercontigs were linked by mapping contigs to paired-end scaffolds (n=1). In addition, automated prediction of protein-coding sequences revealed a partial ORF at the end of a supercontig, homologous to a WD40 domain protein in other yeasts (including the Pichia guilliermondii homolog PGUG_04385). Finding the other part of this ORF on one of the unassembled contigs allowed joining of this supercontig to one of the as yet unassembled contigs. This was confirmed by PCR and Sanger sequencing.
[0174] Seven of the 9 thus generated supercontigs could be assigned to a specific chromosome when they contained one or more of the 13 genes for which chromosomal location had been previously established10 (Fig. 1B and Fig. 5C). For those two supercontigs and the 6 unassembled contigs where this was not the case, Southern blot analysis of pulsed-field gel electrophoresis-separated Pichia pastoris chromosomes (see below) was used for the assignment (Fig. 5A). After assignment to the chromosomes, orientation of the supercontigs and contigs on the chromosomes was determined by PCR analysis with primers on the contig ends (Table 1). Gaps were PCR-amplified using primers flanking these regions (Table 1) and sequenced by Sanger sequencing for finishing.
[0175] rDNA repeat regions were detected by Southern blot on all four PFGE-separated chromosomes (Fig. 5A). The Southern signal on chromosomes 1 and 4 was as strong as that on chromosomes 2 and 3 combined. Through PCR, the location and orientation of the rDNA locus was determined to be at one end of chromosomes 2 and 3 (Fig. 1B). The attempts at verification of the rDNA locus position on chromosomes 1 and 4 (each still containing 1 gap) were inconclusive.
Pulsed-field Gel electrophoresis (PFGE)
[0176] A BioRad contour-clamped homogeneous electric field CHEF DRIII system was used for PFGE. Chromosomal DNA was prepared in agarose plugs with the CHEF Genomic DNA Plug kit (BioRad) following the instructions of the manufacturer. A 0.8% agarose gel in 1x modified TBE (0.1 M Tris, 0.1 M Boric Acid, 0.2 mM EDTA) was used to separate the chromosomes. The gel was electrophoresed with a 106° angle at 14°C at 3 V/cm for 32 h, with a switch interval of 300 s, followed by 32 h with a switch interval of 600 s and 24 h with a switch interval of 900 s. After separation, the chromosomes were visualized with ethidium bromide and the different contigs were mapped onto the chromosomes by Southern blot. To this end, the gel was incubated in 0.25 M HCl for 30 minutes, followed by capillary alkali transfer of the DNA onto a Hybond N+ membrane (Amersham). The probes were prepared by PCR on an open reading frame. For chromosome-specific probes, a part of the coding sequence of HIS4 (chromosome 1), GAP (chromosome 2), URA3 (chromosome 3) and AOX1 (chromosome 4) was used. The probes were random labelled with α-32P dCTP, using the High Prime kit (Roche).
Automatic gene structure prediction and functional annotation
[0177] Protein-coding genes were predicted by the integrative gene prediction platform EuGene (Fig. 6). A specific EuGene version was trained based on 108 manually checked P. pastoris genes. Documented genes from P. stipitis and S. cerevisiae were used to build P. pastoris orthologous gene models allowing the training of P. pastoris-specific Interpolated Markov Models for coding sequences and introns. Splice sites were predicted by NetAspGene19, and gene predictions from GeneMarkHMM-ES20 trained for P. pastoris and AUGUSTUS (Pichia stipitis model) were used to provide alternative gene models for EuGene prediction. The UniProt and the fungi RefSeq protein database were searched against the supercontig sequence by BLASTX to identify the coding area. The DeCypher-TBLASTX program was used to search the conserved sequence area between the P. pastoris, P. stipitis and Candida guilliermondii genomes.
[0178] All predicted protein-coding genes were searched against the yeast protein database, UniProt22 and RefSeq23 fungi protein database by BLASTP. Protein domains were detected by InterProScan with various databases (BlastProDom, FPrintScan, PIR, Pfam, Smart, HMMTigr, SuperFamily, Panther and Gene3D) through the European Bioinformatics Institute Web Services SOAP-based web tools. Signal peptides and transmembrane helices were predicted by SignalP and TMHMM, respectively. GO (Gene Ontology) terms25 were derived from the InterProScan result and the KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway and EC (Enzyme Commission) numbers were annotated by the annot8r pipeline.
Expert gene structure/functional annotation
[0179] The gene structure prediction and the database search results from various databases were formatted and stored in a MySQL relational database. A multiple alignment of each protein-coding gene with the top 10 best hits against the UniProt, RefSeq fungi and yeast protein database was built by MUSCLE2 . A BOGAS (Bioinformatics Online Genome Annotation System) P. pastoris annotation website was set up as the workspace for expert annotators. The initial aim of BOGAS is to provide a workspace for gene structure and functional annotation. Edits to a gene structure or gene function assignment are written directly to the MySQL relational database through the web interface. All modifications made by expert annotators are traceable and reversible by the database system. Once an expert annotator modifies a gene structure and thereby changes the translated protein product, the system automatically triggers the update function to re-check the protein domain and protein database searches. BOGAS also provides a search function where users can search for genes by sequence similarity (BLAST), gene id, gene name or InterPro domain. Each predicted Pichia gene's structure and the similarity search result were visually inspected through an embedded stripped-down version of ARTEMIS. The splice sites of each gene were carefully checked and compared with S. cerevisiae and P. stipitis loci. A functional description of each gene was added to the gene annotation when a closely related homologous gene was available.
Estimate of the gene space completeness
[0180] Parra et al. proposed a set of core eukaryotic genes (CEGs) to estimate the completeness of genome sequencing and assembly programmes. The CEG set contains 248 genes across six model organisms (H. sapiens, D. melanogaster, C. elegans, A. thaliana, S. cerevisiae and S. pombe), of which ~90% are single copy in D. melanogaster, C. elegans, S. cerevisiae and S. pombe. The protein-coding genes identified in this invention were checked with the HMM profiles from the CEG dataset using the HMMER package. All of the 248 CEGs were present in our curated gene set with full HMM domain coverage. Further, FUNYBASE (FUNgal phYlogenomic dataBASE) provides 246 single-copy ortholog clusters in 21 sequenced fungal genomes. These single-copy protein sequences were extracted from the FUNYBASE website and used to build an HMM for each cluster. The corrected P. pastoris protein sequences were searched with the FUNYBASE HMM database. All of the FUNYBASE models were present in our gene catalog with complete domain coverage.
Detection of rRNA and tRNA loci
[0181] Ribosomal RNAs were detected automatically by INFERNAL 1.0 (INFERence of RNA ALignment) against the Rfam29 database and manually confirmed by BLASTN search with S. cerevisiae homologs to the P. pastoris genome sequence. Localization of the ribosomal DNA locus was assayed by PFGE and PCR.
[0182] Transfer RNAs were automatically predicted by tRNAscan-SE 1.21 and manually confirmed by BLASTN search with the S. cerevisiae homologs to the P. pastoris genome sequence.
Codon usage
[0183] Nucleotide sequences of the predicted P. pastoris ORFeome were analyzed with ANACONDA 1.531. In addition to calculation of the codon use, the analysis by ANACONDA generates a codon-pair context map for the ORFeome. This map shows one colored square for each codon-pair, the first codon corresponds to rows and the second corresponds to columns in the map. Favored codon pairs are shown in green, underrepresented ones are shown in red.
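The counting step that underlies a codon usage table and a codon-pair context map can be sketched as follows. This is a minimal illustration that only tallies codons and adjacent codon pairs in an in-frame ORF; it does not reproduce the statistics ANACONDA uses to decide which pairs are favored or underrepresented, and the function name and example sequence are hypothetical.

```python
from collections import Counter


def codon_counts(orf):
    """Count codons and adjacent codon pairs in one in-frame ORF."""
    orf = orf.upper()
    if len(orf) % 3:
        raise ValueError("ORF length must be a multiple of 3")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    singles = Counter(codons)
    # Each pair is (first codon, second codon): rows and columns of the map.
    pairs = Counter(zip(codons, codons[1:]))
    return singles, pairs


# Hypothetical 5-codon ORF: ATG GCT GCT AAA TAA
singles, pairs = codon_counts("ATGGCTGCTAAATAA")
print(singles["GCT"], pairs[("GCT", "GCT")])  # 2 1
```

Summing such counters over every ORF in the ORFeome gives the raw observed counts from which usage frequencies and pair biases can then be derived.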
Phylogenetic tree reconstruction of fungal genomes
[0184] The phylogenetic tree was based on 200 single-copy genes which were present in 12 sequenced fungal genomes. A multiple sequence alignment was constructed using the MUSCLE program, and gaps were removed by an in-house script based on the BLOSUM62 scoring matrix. The maximum likelihood tree reconstruction program TREE-PUZZLE32 (quartet puzzling, WAG model, estimated gamma distribution rate with 1000 puzzling steps) was used for phylogenetic tree reconstruction. The tree was well supported by 1000 bootstraps in each node.
Comparative analysis of gene family and protein domain
[0185] The predicted proteomes used in this study were those of six hemiascomycetes (P. pastoris, S. cerevisiae, K. lactis, P. stipitis, C. lusitaniae and Y. lipolytica). In order to obtain the gene families, a similarity search of all protein sequences from the 6 fungi (all-against-all BLASTP, e-value 1e-10) was performed. Gene families were constructed by Markov clustering3 based on the BLASTP result. All predicted protein sequences from the six genomes were searched against the Pfam36 database to obtain the protein domain occurrence in each species. The protein domain loss and acquisition was counted based on the Dollo parsimony principle by the DOLLOP program from the PHYLIP package37.
EXAMPLE-2
[0186] This Example describes the results of assembly and annotation of the full genomic sequence of P. pastoris GS115 strain.
Genome sequencing and assembly
[0187] Prior to this invention, very little was known about the genome features of P. pastoris. Ohi et al.38 reported that the P. pastoris genome was organized in 4 chromosomes with a total estimated size of 9.7 Mbp (+/- a few hundred kbp, as the pulsed field gel electrophoresis technique has a relatively poor accuracy). In addition, Ohi et al. assigned 13 P. pastoris genes to the different chromosomes.
[0188] During the course of this invention, chromosome assembly was completed according to the strategy shown in Fig. 1A. 454/Roche sequencing16 (GS-FLX version) was utilized to highly oversample the genome (20x coverage) and generated 70,500 paired-end sequence tags, to enable the assembly of all but 7 contigs into only 9 "supercontigs" (plus the mitochondrial genome) using automated shotgun assembly and BLASTN-based contig end joining (see Example 1 and Fig. 4). Upon assigning these supercontigs to the 4 chromosomes (see Example 1 and Figs. 5A-5C), the order of the supercontigs was determined through PCR and Sanger sequencing of the amplificates. Through these finishing experiments, the 4 chromosomal sequences were reconstructed (Fig. 1B; results summarized in Table 2), with only 2 gaps remaining (one each on chromosomes 1 and 4). A ribosomal DNA (rDNA) repeat sequence was present in the assembly as a separate contig of 7450 bp, with exceptionally high coverage (328.8 fold). Given that sequence coverage all over the assembly very closely approximates 20x, it has been interpreted that there are approximately 16 copies of the rDNA repeat region, thus accounting for about 119 kbp in sequence. These rDNA loci were detected on all chromosomes (Example 1, Fig. 1B and Fig. 5A). The rDNA locus contains the 18S, 5.8S and 26S rRNA coding sequences. Unlike the S. cerevisiae 5S rRNA, which is present in the repeated rDNA locus, the 21 copies of the P. pastoris 5S ribosomal RNA (rRNA) were found to be spread across the chromosomes. While the chromosomes of P. pastoris GS115 were estimated to be (from chromosome 1 to 4) 2.9, 2.6, 2.3 and 1.9 Mbp based on pulsed-field gel electrophoresis (PFGE), their sizes were estimated after assembly in this invention to be 2.88 (2.8 + 0.08), 2.39, 2.24 and 1.8 (1.78 + 0.017) Mbp. Including the estimated 0.12 Mbp of rRNA repeats, the genome size of P. pastoris was determined to be 9.43 Mbp. The P. pastoris genomic sequences are deposited in Genbank.
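The rDNA copy number and the total genome size given in this paragraph can be retraced with simple arithmetic. The sketch below uses only figures stated in the text and assumes that read depth is proportional to copy number; the variable and function names are illustrative.

```python
def repeat_copies(repeat_coverage, genome_coverage):
    """Estimate repeat copy number from its read depth relative to the genome."""
    return repeat_coverage / genome_coverage


copies = repeat_copies(328.8, 20.0)        # about 16.4, i.e. roughly 16 copies
rdna_kbp = round(copies) * 7450 / 1000     # 16 copies x 7450 bp = 119.2 kbp

# Assembled chromosome sizes (Mbp) plus the estimated 0.12 Mbp of rDNA repeats.
chromosomes_mbp = [2.88, 2.39, 2.24, 1.8]
genome_mbp = sum(chromosomes_mbp) + 0.12

print(round(copies, 1), rdna_kbp, round(genome_mbp, 2))  # 16.4 119.2 9.43
```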
Genome sequence accuracy estimation
[0189] A concern with genome sequences largely generated through 454 sequencing was the potential for "indel errors" at homopolymeric sequences39. An analysis of the occurrence of such sequences in the P. pastoris genome was conducted. Two approaches were followed to estimate the accuracy of the genome sequence. First, 39 peer-reviewed Genbank coding sequences of P. pastoris strain GS115 were retrieved (Table 3; a total sequence length of 70,295 bp). These sequences were compared to the genome sequence of this invention, and 84 differences were encountered. To determine which sequence was correct, PCR was performed on GS115 genomic DNA and the amplificates were Sanger-sequenced. In all but 2 cases, the Sanger sequences confirmed the genome sequence of this invention, and thus the error rate was estimated to be 1 in 35,147 bp. In an alternative approach, all open reading frames encoding proteins with at least one clear homolog in the databases were analyzed. If an interrupted ORF was found to have clear homology to the 5' part of the homologs, immediately followed by a coding sequence with clear homology to the 3' part, the most logical interpretation would be that there was a frame-shift error in the genome sequence (i.e. both coding sequences are extremely likely to be linked into one open reading frame (ORF)). On this premise, frameshift errors were found in 2.7% (106) of the 3,997 genes for which such analysis could be made, totaling 6.11 Mbp of coding sequence. Assuming (fairly conservatively) that such an error would have been detected if it had occurred in the first 2/3 of the ORF, a frame-shift error rate in the coding sequences was determined to be 1 in 37,716 bp. Both of the approaches above show that high-coverage 454 sequencing indeed yielded highly accurate genome sequences.
Pichia pastoris phylogenetic position
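The first accuracy estimate, and the fraction of genes flagged in the second approach, can be retraced from the counts given above. This sketch uses only those counts; the function name is illustrative, and it does not attempt to rederive the 1 in 37,716 bp frame-shift figure, which depends on how the analyzed coding length is counted.

```python
def bases_per_error(total_bp, errors):
    """Return the average number of bases per confirmed error."""
    if errors <= 0:
        raise ValueError("need at least one error to compute a rate")
    return total_bp / errors


# 84 differences over 70,295 bp, of which only 2 were errors in the genome sequence.
sanger_rate = bases_per_error(70_295, 2)
print(sanger_rate)  # 35147.5, reported as 1 in 35,147 bp

# Share of analyzable genes with an apparent frame-shift.
frameshift_pct = 100 * 106 / 3_997
print(round(frameshift_pct, 1))  # 2.7
```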
[0190] Phylogenetic analysis (Fig. 1C; Example 1) shows that P. pastoris diverged before the formation of the CTG clade (yeasts which translate the CUG codon into serine instead of leucine13).
Genome sequence annotation: protein-coding genes
[0191] Protein-coding genes were automatically predicted using the EuGene18 prediction platform (Example 1) and these gene models were manually curated for functional annotation, accurate translational start and stop assignment, and intron location. This resulted in a 5,313 protein-coding gene set of which 3,997 (75.2%) were found to have at least one homolog in the National Center for Biotechnology Information (NCBI) protein database (BLASTP e-value 1e-5, sequence length <20% difference and sequence similarity >50%). The protein-coding genes were found to occupy 80% of the genome sequence. According to recently proposed measures for genome completeness, the genome was searched for highly conserved single (or low) copy gene sets: CEG with 248 genes across six model organisms15, and FUNYBASE with 246 genes with orthologs in 21 fungi. All genes from both gene sets were found to be present in our proteome with full domain coverage.
[0192] In accordance with this invention, 1,285 genes were assigned to the Kyoto Encyclopedia of Genes and Genomes (KEGG) metabolic pathways40, and 4,262 genes were annotated with Gene Ontology (GO) terms. The GO slim categories of P. pastoris are presented in Fig. 6. A secretion signal peptide was predicted in 9% of the genes42 and 4,274 proteins were predicted to contain InterPro domains. These include 2,320 distinct Pfam domains. In comparing the presence and absence of protein domains with five other yeast proteomes, 32 domains in 32 genes were identified as specific to P. pastoris. The two fungi in the CTG clade of which the genomes have been sequenced (P. stipitis and C. lusitaniae) share 71 gene families which were found to be absent in P. pastoris.
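The homolog criteria stated above can be expressed as a simple filter on a BLASTP hit. The sketch below is an assumption-laden illustration: the function name and argument layout are hypothetical, and the length difference is taken relative to the query length, which the text does not specify.

```python
def is_homolog_hit(evalue, query_len, hit_len, similarity_pct):
    """Apply the three thresholds: e-value, length difference and similarity."""
    # Assumption: length difference is measured relative to the query protein.
    length_diff = abs(query_len - hit_len) / query_len
    return evalue <= 1e-5 and length_diff < 0.20 and similarity_pct > 50.0


# Hypothetical hits, for illustration only.
print(is_homolog_hit(1e-30, 400, 380, 62.0))  # True
print(is_homolog_hit(1e-30, 400, 200, 62.0))  # False (length differs by 50%)

# Share of the 5,313 predicted genes with at least one such homolog.
print(round(100 * 3_997 / 5_313, 1))  # 75.2
```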
[0193] Codon (pair) optimization of transgenes to the expression host organism often yields substantial improvements in recombinant protein yield. The codon usage of P. pastoris is shown in Fig. 3A. Overall, the codon usage is similar to that of S. cerevisiae (the same codons being preferred by both organisms for all amino acids). Some synonymous codon pairs are also more or less frequently used than expected (the "codon pair bias"). As previously reported for S. cerevisiae44, underrepresented and overrepresented codon pair clusters were observed.
Genome sequence annotation: tRNA genes
[0194] tRNA coding genes were automatically predicted and manually confirmed by BLASTN with S. cerevisiae homologs, which identified 123 nuclear tRNA genes (Table 4), compared to 274 in the S. cerevisiae genome45. P. pastoris has three tRNA families not present in S. cerevisiae (tR(UCG), tL(CAG) and tP(CGG)), but also lacks one tRNA family (tL(GAG)).
[0195] Interestingly, a positive correlation was found between the number of tRNA genes for a given codon and the frequency of use of this codon (Spearman rho=0.88; P<0.0001, Fig. 3C).
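A Spearman rank correlation of the kind reported above can be computed as the Pearson correlation of the ranks, with tied values given their average rank. The following is a minimal pure-Python sketch; the function names and the example numbers are hypothetical and do not come from Table 4 or Fig. 3C.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rho: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)  # undefined (division by zero) if x or y is constant


# Hypothetical tRNA gene counts and codon frequencies (per thousand codons).
trna_genes = [1, 2, 4, 7, 10]
codon_freq = [5.0, 9.5, 14.2, 30.1, 41.3]
print(round(spearman_rho(trna_genes, codon_freq), 6))  # 1.0
```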
EXAMPLE-3
[0196] This Example describes experiments conducted to test whether overexpression, in engineered GlycoSwitch Pichia pastoris, of an endogenous UDP-GlcNAc transporter, UDP-Gal transporter and UDP-Glc-4-epimerase can increase the glycan conversion efficiency of these strains, thus resulting in more uniform glycosylation patterns.
Results
[0197] Expression vectors for these three genes were generated as described in the experimental procedures below. In these vectors, the transporter or epimerase gene was placed under control of the AOX1 promoter. Selection of these vectors was based on G418 (for UDP-GlcNAc transporter) or hygromycin (for UDP-Gal transporter and UDP-Glc-4-epimerase).
[0198] These vectors were transformed into GSGnM5 (UDP-GlcNAc transporter) or GSGalGnM5 (UDP-Gal transporter and UDP-Glc-4-epimerase). Clones for all three strains were methanol induced and the relative contribution of the glycans was assessed (Table 7). Similar results were obtained using clones from a separate transformation.
[0199] It is possible that overexpression of these genes can have a beneficial influence on the glycan conversion efficiency in fermentation conditions.
Experimental procedures:
[0200] Strains and growth conditions - Escherichia coli MC1061 was used as the host strain for all molecular experiments. E. coli was cultivated in LB medium containing the appropriate antibiotics for selection. The Pichia pastoris GlycoSwitch GnMan5 and GalGnMan5 strains expressing recombinant human alpha-antitrypsin were used in this analysis. The GnMan5 strain is a GS115 (Invitrogen, Carlsbad, CA, USA)-derived strain that modifies its glycoproteins predominantly with GlcNAcMan5GlcNAc2 N-glycans due to inactivation of Och1 and overexpression of an ER-localized T. reesei α-1,2-mannosidase and a Golgi-localized human GlcNAc transferase. The GalGnM5 strain was derived from the GnM5 strain by transformation of the latter with a Golgi-localized human Galactosyl transferase fused to the S. cerevisiae UDP-Glc-4-epimerase Gal10. This strain predominantly modifies its glycoproteins with GalGlcNAcMan5GlcNAc2 N-glycans. Pichia pastoris cultures were grown in YPD medium, minimal medium or BMGY and induced in BMMY medium. All Pichia pastoris media were prepared as described in the Pichia instruction manual (Invitrogen, Carlsbad, CA, USA).
[0201] Vector construction - All molecular experiments were carried out according to standard procedures (Sambrook et al., 1989). Phusion polymerase was used in PCR reactions according to manufacturer's instructions (Finnzymes, Espoo, Finland). To construct the expression vectors for the activated sugar transporters (UDP-GlcNAc transporter and UDP-Gal transporter) and the UDP-Glucose-4-epimerase, the genes were amplified by PCR on genomic DNA. The primers for the PCR reaction were designed to amplify the complete gene with the stop codon and to incorporate a SalI site downstream of UDP-GlcNAcT and a BstBI restriction site upstream and a NotI site downstream of the UDP-GalT and UDP-Glc-4-epimerase genes to allow subcloning. After amplification, the genes were cloned into pCR®4Blunt-TOPO® (Invitrogen, Carlsbad, CA, USA) and sequenced. After BstBI/NotI digestion the UDP-GalT and UDP-Glc-4-epimerase genes were ligated into the pPICHygMnn2DmManII vector in which the Mnn2-ManII fusion gene was removed by BstBI/NotI digest. Similarly, UDP-GlcNAcT was cut from the TOPO clone by BstBI/SalI digest and ligated into the pPICKanMnn2SpGal10hGalT vector in which the Mnn2-Gal10-GalT fusion gene was removed by BstBI/SalI digest. After selection of good clones, the expression vectors were opened by PmeI or SalI digest for transformation of Pichia pastoris strains. PmeI-opened vectors were expected to integrate into the Pichia pastoris genome at the AOX1 promoter locus, while SalI-opened vectors were expected to insert in the gene of interest.
[0202] Yeast treatments - Cells were grown in 5 ml BMGY medium at 30°C. After 48 h of incubation, the medium of the cultures was replaced by BMMY. The induction was performed for 48 h by spiking the cultures twice a day with 1% methanol. The cultures were harvested by centrifugation for 5 minutes at 3000 g. Medium and cell pellet were frozen at -20°C.
[0203] Glycosylation analysis - N-linked glycans were analysed using DNA sequencer-assisted fluorophore-assisted carbohydrate electrophoresis (DSA-FACE) as described previously (Laroy et al., 2006).
EXAMPLE-4
[0204] This Example describes experiments conducted to test the effect of overexpression of chaperone genes identified hereinabove on yield of heterologous proteins.
Results
[0205] Of the endogenous chaperone genes in the genome of Pichia pastoris, three ER-resident chaperones were selected for testing: ROT1, SHR3 and SIL1. Expression vectors for these genes were generated as described in the experimental procedures. In these vectors, the chaperone genes are under control of the GAP promoter.
[0206] For transformation of these vectors into the GlycoSwitch Man5 strain, two approaches were used. In the first approach, the constructs were targeted to the GAP promoter by linearization of the vectors with AvrII. In the second approach, random integration was achieved by linearization of the vectors with EcoRV. Transformants were selected on medium containing 100 μg/ml nourseothricin. Clones for all three chaperone constructs, in both transformations, were grown and intracellular proteins were extracted and run on an SDS-PAGE gel. For detection of chaperone expression, a Western blot analysis was performed using an antibody against the His-tag. Expression was shown for ROT1 (31 kDa) and SHR3 (28 kDa). No expression could be observed for SIL1 (44 kDa).
[0207] To analyse the effect of the chaperones on heterologous expression and secretion of IL10 and IFN-β, two approaches were followed. In the first approach, the GSM5 strains expressing the chaperone were transformed with an IL10- or IFN-β-expression plasmid. In the second approach, GSM5 strains expressing IL10 or IFN-β were transformed with expression vectors for ROT1 or SHR3. ROT1- or SHR3-expressing GSM5 strains were transformed with pPIC92mIL10, linearized with StuI, or with pPIC9MFhIFNb2, linearized with BstBI. Individual clones were isolated on minimal medium and used for expression experiments. Secretion of IL10 or IFN-β in the medium upon methanol induction was assessed by SDS-PAGE and Coomassie Brilliant Blue staining.
[0208] IL10 or IFN-β expressing GSM5 strains were transformed with pGAPNORPpROT1 or with pGAPNORPpSHR3, both linearized with EcoRV. Individual clones were selected on nourseothricin plates and used for induction experiments.
[0209] Upon overexpression of ROT1 or SHR3, no significant enhancement of the secretion of IL10 or IFN-β was observed, in comparison with control clones expressing IL10 or IFN-β.
[0210] In the ROT1-expressing strains, somewhat more of a 31 kDa protein was detected in the medium. To test whether some of the tagged ROT1 proteins were secreted upon overexpression, Western blot-immunodetection was performed using anti-His antibodies. In both the media of the control strain and the ROT1- or SHR3-expressing strains, a faint band of approximately 31 kDa could be detected. Moreover, upon deglycosylation with PNGaseF, part of this band shifted to a faster migrating band. It was therefore concluded that this band is not related to ROT1.
[0211] Generation of expression vectors for the P. pastoris homologues of LHS1, CNE1 and EPS1 was also underway. Upon PCR amplification of the chaperone genes on P. pastoris genomic DNA, the PCR product for EPS1 (1945 bp) was cut with EcoRI and NotI and ligated into the vector fragment of pGAPNORPpROT1, cut with the same enzymes and transformed to E. coli. Because the genes for CNE1 and LHS1 contain EcoRI sites, the CNE1 (1753 bp)- and LHS1 (2700 bp)-PCR products were fused by PCR with the GAP promoter DNA of pGAPNORPpROT1 (500 bp). The resulting PCR-DNA was then cut with NotI and NsiI (CNE1) or with NotI and AvrII (LHS1) and ligated with the vector fragment cut with the same enzymes and transformed to E. coli. Clones were checked by colony-PCR for the presence of chaperone-containing plasmids. Once completed, the coding sequences for these chaperones would be fused with a C-terminal myc/His6 tag and placed under control of the GAP promoter. The resulting plasmids would also confer resistance to nourseothricin.
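The reason for switching cloning strategy above is that an insert containing an internal recognition site would be cut by the enzyme used for subcloning. A minimal sketch of such a check is shown below. The recognition sequences are the standard ones for these enzymes and are all palindromic, so scanning one strand is sufficient; the function names and the toy sequence are hypothetical and are not gene sequences from this disclosure.

```python
# Standard recognition sequences (all palindromic, so one strand suffices).
SITES = {
    "EcoRI": "GAATTC",
    "NotI": "GCGGCCGC",
    "AvrII": "CCTAGG",
    "NsiI": "ATGCAT",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for every recognition site."""
    seq = seq.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        hits[enzyme] = positions
    return hits


def usable_enzymes(seq, sites=SITES):
    """Enzymes with no internal site, i.e. safe to use at the insert ends."""
    return sorted(e for e, pos in find_sites(seq, sites).items() if not pos)


# Hypothetical toy insert carrying one internal EcoRI site.
toy_insert = "ATGAAAGAATTCCCCTAA"
print(find_sites(toy_insert)["EcoRI"])  # [6]
print(usable_enzymes(toy_insert))       # ['AvrII', 'NotI', 'NsiI']
```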
Experimental procedures
[0212] Strains and growth conditions - Escherichia coli MC1061 was used as the host strain for all molecular experiments. E. coli was cultivated in LB medium containing the appropriate antibiotics for selection. The Pichia pastoris GlycoSwitch GnMan5 and GalGnMan5 strains expressing recombinant human alpha-antitrypsin were used in this analysis. The GnMan5 strain is a GS115 (Invitrogen, Carlsbad, CA, USA)-derived strain that modifies its glycoproteins predominantly with GlcNAcMan5GlcNAc2 N-glycans due to inactivation of Och1 and overexpression of an ER-localized T. reesei α-1,2-mannosidase and a Golgi-localized human GlcNAc transferase. The GalGnM5 strain was derived from the GnM5 strain by transformation of the latter with a Golgi-localized human Galactosyl transferase fused to the S. cerevisiae UDP-Glc-4-epimerase Gal10. This strain predominantly modifies its glycoproteins with GalGlcNAcMan5GlcNAc2 N-glycans. Pichia pastoris cultures were grown in YPD medium, minimal medium or BMGY and induced in BMMY medium. All Pichia pastoris media were prepared as described in the Pichia instruction manual (Invitrogen, Carlsbad, CA, USA).
[0213] Vector construction - All molecular experiments were carried out according to standard procedures (Sambrook et al., 1989). Phusion polymerase was used in PCR reactions according to manufacturer's instructions (Finnzymes, Espoo, Finland). To construct the expression vectors for the chaperone genes Sil1, Rot1 and Shr3, the chaperone genes were amplified by PCR on genomic DNA. The primers for the PCR reaction were designed to amplify the complete gene without the stop codon and to incorporate an EcoRI restriction site upstream and a NotI site downstream of the gene to allow subcloning. After amplification, the genes were cloned into pCR®4Blunt-TOPO® (Invitrogen, Carlsbad, CA, USA) and sequenced. After EcoRI/NotI digestion the chaperone genes were inserted into the pGAPNOURCre vector in which the Cre recombinase gene was removed by EcoRI/NotI digest. This created a transcriptional fusion of the chaperone gene with a myc and His6 tag. The pGAPNOURCre vector was generated by deletion of the Pichia autosomal replication sequence (PARS) from the pGAPNorCre IPARS 1 vector by NsiI digest and self-ligation. After selection of good clones, the chaperone expression vectors were opened by AvrII or EcoRV digest for transformation of Pichia pastoris strains. AvrII-opened vectors were targeted to the GAP promoter locus of the Pichia pastoris genome, while EcoRV-opened vectors were inserted randomly.
[0214] Yeast treatments - Cells were grown in 5 ml BMGY medium at 30°C. After 48 h of incubation, the medium of the cultures was replaced by BMMY. The induction was performed for 48 h by spiking the cultures twice a day with 1% methanol. The cultures were harvested by centrifugation for 5 minutes at 3000 g. Medium and cell pellet were frozen at -20°C.
EXAMPLE-5
[0215] A set of 54 P. pastoris genes was identified that contain a signal sequence, as discussed hereinabove. These genes were assessed for their expression in P. pastoris using microarray data published by Graf et al. (2008)48, for the presence of the encoded proteins in P. pastoris grown in glucose-containing medium (Mattanovich et al., 2009) and for the protein abundance of their homologues in S. cerevisiae (Brockmann et al., 200746; Ghaemmaghami et al., 200347; Liu et al., 200449; Newman et al., 200651) (Table 8).
[0216] Based on these data ten genes (13 P. pastoris sequences) were selected for further analysis. The efficiency of these signal sequences is assessed in comparison with the prepro sequence of α-mating factor (α-MF) to drive the secretion of N-terminally tagged human IL10 upon methanol induction. The human IL10 expression plasmid pKai61EA-hIL10 is used. In this plasmid the mature hIL10 sequence is preceded by the prepro sequence from α-MF, followed by a His6-tag and a DEVD cleavage site. The His6-tag facilitates the purification of the acid-labile IL10 protein and the DEVD cleavage site makes it possible to remove this tag by incubation with purified caspase-3.
[0217] To facilitate the exchange of the α-MF prepro sequence with the signal sequences from the candidate genes, 13 synthetic genes are obtained that can be cut out using BstBI and KpnI and that contain the signal sequence of the candidate gene followed by the His6-tag, the DEVD site and part of the mature hIL10.
[0218] In a first attempt the BstBI-KpnI fragment of the synthetic genes was PCR amplified with specific primers and subsequently cut with BstBI and KpnI and ligated into the vector fragment of pKai61EA-hIL10, cut with the same enzymes. The ligation mixture was then used for transformation of E. coli. Individual clones selected on zeocin-containing plates were checked by colony PCR. Candidate clones were already identified for chr1-4_0611 and chr1-4_0426.
EXAMPLE-6
[0219] The initializing step in yeast O-glycosylation is known to be catalysed by a family of protein mannosyltransferases (PMT). As described hereinabove, in the Pichia pastoris genome, 5 orthologs of the PMT genes were annotated, with representatives in the 3 subfamilies. In S. cerevisiae, deletion of only one of these genes was found to be insufficient to abolish O-glycosylation. The double and triple knockouts resulted in a loss of O-glycosylation but showed a severe defect in growth. In this Example, two approaches are described for generating PMT deficient Pichia pastoris strains which could result in an O-glycosylation deficient strain.
[0220] In one approach, disruptions of the PMT ORFs are made through the use of a knock-in vector by single homologous recombination. In the second approach, PMT knock-outs are made by double homologous recombination.
Experimental procedures
[0221] Strains and growth conditions - Escherichia coli competent strains are used as the host strain for all molecular experiments. E. coli is cultivated in LB medium containing the appropriate antibiotics for selection. The Pichia pastoris GS115 strain (Invitrogen, Carlsbad, CA, USA) and glycoengineered strains, which can express a protein of interest, are used in this analysis. Pichia pastoris cultures are grown in YPD medium, minimal medium or BMGY and induced in BMMY medium. All Pichia pastoris media are prepared as described in the Pichia instruction manual (Invitrogen, Carlsbad, CA, USA).
[0222] Vector construction - All molecular experiments are carried out according to standard procedures (Sambrook et al., 1989). Phusion polymerase is used in PCR reactions according to manufacturer's instructions (Finnzymes, Espoo, Finland).
[0223] To construct knock-in vectors for the PMT genes, a fragment of these genes is amplified by PCR. The primers for the PCR reaction are designed to amplify a fragment that has low similarity to other regions in the genome, is not the full length gene, and is located so that the protein fragment encoded upstream of it is not functional. In addition, the primers contain restriction sites for subcloning of the fragment. In this fragment a restriction site, unique in the final vector, is incorporated to allow later linearization. After amplification, the fragment is cloned into pCR®4Blunt-TOPO® (Invitrogen, Carlsbad, CA, USA) and sequenced. It is then cloned into a Pichia vector containing a selectable marker. After selection of good clones, the PMT knock-in vectors are opened by restriction digest using the unique site in the PMT fragment for transformation of Pichia pastoris strains. Insertion of the vectors in the Pichia pastoris genome will be targeted to the respective PMT loci.
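The outcome of the knock-in by single homologous recombination can be sketched at the string level. This is a toy model under stated assumptions: the locus is represented as 5' part + internal fragment + 3' part, the vector carries only the internal fragment plus a backbone with the selectable marker, and all sequences and names are hypothetical.

```python
def knock_in(chromosome: str, fragment: str, backbone: str) -> str:
    """Integrate a vector (backbone + fragment) by one crossover at `fragment`."""
    pos = chromosome.find(fragment)
    if pos == -1:
        raise ValueError("fragment not found at the target locus")
    if chromosome.find(fragment, pos + 1) != -1:
        raise ValueError("fragment is not unique in the chromosome")
    end = pos + len(fragment)
    # The fragment is duplicated and the backbone sits between the two copies.
    return chromosome[:end] + backbone + fragment + chromosome[end:]


# Toy locus: flank, 5' part of the ORF, internal fragment, 3' part, flank.
five, frag, three = "aaaa", "CCCC", "gggg"
locus = "nn" + five + frag + three + "nn"
disrupted = knock_in(locus, frag, "[marker]")
print(disrupted)
```

In the result, the first copy (`aaaaCCCC`) lacks the 3' end of the ORF and the second copy (`CCCCgggg`) lacks its 5' end, which is why the fragment must be internal rather than the full length gene.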
[0224] To construct knock-out vectors for the PMT genes, two fragments of the PMT loci are amplified by PCR, cloned into the pCR®4Blunt-TOPO® vector (Invitrogen, Carlsbad, CA, USA) and sequenced. The primers for the PCR reaction are designed to amplify fragments within the PMT ORF, promoter or 5'UTR in such a way that when these two fragments recombine with the PMT allele, an inactive allele is created. The primers contain restriction sites for subcloning of the fragments and a unique restriction site upstream of the 5' fragment and downstream of the 3' fragment. These fragments will be cloned into a P. pastoris vector upstream and downstream of a selectable marker. After selection of good clones, the PMT knock-out vectors are cut by restriction digest using the unique site incorporated into the PMT fragment, after which the excised fragment will be used for transformation of Pichia pastoris strains.
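The knock-out by double homologous recombination can be sketched in the same toy fashion: the region between the two homology fragments is replaced by the selectable marker. All sequences and names below are hypothetical and chosen only to show the replacement.

```python
def knock_out(chromosome: str, arm5: str, arm3: str, marker: str) -> str:
    """Replace the region between two homology arms with a marker cassette."""
    i = chromosome.find(arm5)
    if i == -1:
        raise ValueError("5' homology arm not found")
    j = chromosome.find(arm3, i + len(arm5))
    if j == -1:
        raise ValueError("3' homology arm not found downstream of the 5' arm")
    # Keep both arms; everything between them is lost.
    return chromosome[: i + len(arm5)] + marker + chromosome[j:]


# Toy locus: flank, 5' arm, region to delete, 3' arm, flank.
target = "nn" + "aaaa" + "TTTTTT" + "gggg" + "nn"
deleted = knock_out(target, "aaaa", "gggg", "[marker]")
print(deleted)
```

Unlike the single crossover, no duplicated fragment remains, so the construct cannot revert by recombination between repeats.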
EXAMPLE-7
[0225] The stability of certain proteins expressed in Pichia pastoris has been observed to be influenced by the action of proteases such as protease A and B. As described above, in the Pichia pastoris genome, a series of orthologs of novel protease genes were identified and annotated, with representatives in the serine protease, aspartyl protease and cysteine protease subfamilies. One strategy is described in this example to generate a series of strains of Pichia pastoris deficient in one or more of the identified protease activities. This strategy can be applied to any strain of Pichia pastoris that expresses a heterologous protein to compare the stability of that heterologous protein with or without the particular protease being active. One application of this strategy is to take a strain that expresses a protein of interest and use a set or "kit" of the insertional inactivation ("knock-in") vectors to generate a series of derivative strains that each lack activity of one of the endogenous proteases of Pichia.
[0226] To generate protease deficient P. pastoris strains, we describe two approaches. The first generates a disruption of the individual protease ORFs through knock-in of the vector by single homologous recombination. The second approach generates true protease knock-outs by double homologous recombination.
Experimental procedures
[0227] Strains and growth conditions - Escherichia coli competent strains are used as the host strain for all molecular experiments. E. coli is cultivated in LB medium containing the appropriate antibiotics for selection. The Pichia pastoris GS115 strain (Invitrogen, Carlsbad, CA, USA) or glycoengineered strains are used in this analysis; these strains can express a protein of interest. Pichia pastoris cultures are grown in YPD medium, minimal medium or BMGY and induced in BMMY medium. All Pichia pastoris media are prepared as described in the Pichia instruction manual (Invitrogen, Carlsbad, CA, USA).
[0228] Vector construction - All molecular experiments are carried out according to standard procedures (Sambrook et al., 1989). Phusion polymerase is used in PCR reactions according to manufacturer's instructions (Finnzymes, Espoo, Finland). To construct the knock-in vectors for the protease genes, a fragment of the corresponding gene is amplified by PCR. The primers for the PCR reaction are designed to amplify a fragment that has low similarity to other regions in the genome, is not the full length gene, and is located so that the protein fragment encoded upstream of it is not functional. This is readily achieved by using segments of the gene that lie within the highly conserved DNA/protein motifs that correspond to essential amino acid residues in the active site of that particular class of protease. In addition, the primers contain restriction sites for subcloning of the fragment. In this fragment a restriction site, unique in the final vector, is incorporated to allow later linearization. After amplification, the fragment is cloned into pCR®4Blunt-TOPO® (Invitrogen, Carlsbad, CA, USA) and sequenced. It is then cloned into a Pichia vector containing a selectable marker. After selection of good clones, the protease knock-in vectors are opened by restriction digest using the unique site in the protease fragment for transformation of Pichia pastoris strains. Insertion of the vectors in the Pichia pastoris genome is targeted to the respective protease gene loci. Figure 8 shows the DNA sequences for one strategy of protease inactivation by knock-in.
[0229] To construct the knock-out vectors for the protease genes, two fragments of the protease loci are amplified by PCR, cloned into the pCR®4Blunt-TOPO® vector (Invitrogen, Carlsbad, CA, USA) and sequenced. The primers for the PCR reaction are designed to amplify fragments within the protease ORF, promoter or 5'UTR in such a way that when these two fragments recombine with the protease allele, an inactive allele is created. The primers contain restriction sites for subcloning of the fragments and a unique restriction site upstream of the 5' fragment and downstream of the 3' fragment. These fragments are cloned into a P. pastoris vector upstream and downstream of a selectable marker. After selection of good clones, the protease knock-out vectors are cut by restriction digest using the unique site incorporated into the protease fragment, after which the excised fragment is used for transformation of Pichia pastoris strains. In this way, more stable gene knock-out constructs are generated.
References:
1. Hamilton, S. R. & Gerngross, T.U. Glycosylation engineering in yeast: the advent of fully humanized yeast. Curr. Opin. Biotechnol. 18, 387-392 (2007).
2. Jacobs, P.P., Geysens, S., Vervecken, W., Contreras, R. & Callewaert, N. Engineering complex-type N-glycosylation in Pichia pastoris using GlycoSwitch technology. Nat. Protoc. 4, 58-70 (2009).
3. Ratner, M. Pharma swept up in biogenetics gold rush. Nat. Biotechnol. 27, 299-301 (2009).
4. Potgieter, T.I. et al. Production of monoclonal antibodies by glycoengineered Pichia pastoris. J. Biotechnol. 139, 318-325 (2009).
5. Mogelsvang, S., Gomez-Ospina, N., Soderholm, J., Glick, B.S. & Staehelin, L.A. Tomographic evidence for continuous turnover of Golgi cisternae in Pichia pastoris. Mol. Biol. Cell 14, 2277-2291 (2003).
6. Hartner, F.S. et al. Promoter library designed for fine-tuned gene expression in Pichia pastoris. Nucleic Acids Res. 36, e76 (2008).
7. Vervecken, W. et al. In vivo synthesis of mammalian-like, hybrid-type N-glycans in Pichia pastoris. Appl. Environ. Microbiol. 70, 2639-2646 (2004).
8. Bobrowicz, P. Engineering of an artificial glycosylation pathway blocked in core oligosaccharide assembly in the yeast Pichia pastoris: production of complex humanized glycoproteins with terminal galactose. Glycobiology 14, 757-766 (2004).
9. Trimble, R.B. et al. Characterization of N- and O-linked glycosylation of recombinant human bile salt-stimulated lipase secreted by Pichia pastoris. Glycobiology 14, 265-274 (2004).
10. Mille, C. et al. Identification of a new family of genes involved in beta-1,2-mannosylation of glycans in Pichia pastoris and Candida albicans. J. Biol. Chem. 283, 9724-9736 (2008).
11. Dalle, F. et al. Beta-1,2- and alpha-1,2-linked oligomannosides mediate adherence of Candida albicans blastospores to human enterocytes in vitro. Infect. Immun. 71, 7061-7068 (2003).
12. Nasab et al., Mol Biol Cell. 19:3758-68, 2008.
13. Glycobiology 12(12): 821-828, 2002.
14. Pichia Protocols. Methods in Molecular Biology 103, 27-39, 1998.
15. Higgins and Cregg, Pichia Protocols. Methods in Molecular Biology 103, 1-16, 1998.
16. Margulies, M. et al. Genome sequencing in microfabricated high-density picolitre reactors. Nature 437, 376-380 (2005).
17. Pop, M., Kosack, D. S. & Salzberg, S. L. Hierarchical scaffolding with Bambus. Genome Res. 14, 149-159 (2004).
18. Foissac, S. et al. Genome Annotation in Plants and Fungi: EuGene as a Model Platform. Current Bioinformatics 3, 89-97 (2008).
19. Wang, K., Ussery, D.W. & Brunak, S. Analysis and prediction of gene splice sites in four Aspergillus genomes. Fungal Genet. Biol. 46 Suppl 1, S14-18 (2009).
20. Ter-Hovhannisyan, V., Lomsadze, A., Chernoff, Y. & Borodovsky, M. Gene prediction in novel fungal genomes using an ab initio algorithm with unsupervised training. Genome Res., 13 (2008).
21. Stanke, M. et al. Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sources. BMC Bioinformatics 7, 62 (2006).
22. The Universal Protein Resource (UniProt) 2009. Nucleic Acids Res. 37, D169-174 (2009).
23. Pruitt, K.D., Tatusova, T. & Maglott, D.R. NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 35, D61-65 (2007).
24. Mulder, N.J. et al. New developments in the InterPro database. Nucleic Acids Res. 35, D224 (2007).
25. Ashburner, M. et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25, 25-29 (2000).
26. Edgar, R.C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792-1797 (2004).
27. Parra, G., Bradnam, K., Ning, Z., Keane, T. & Korf, I. Assessing the gene space in draft genomes. Nucleic Acids Res. 37, 289-297 (2009).
28. Marthey, S. et al. FUNYBASE: a FUNgal phYlogenomic dataBASE. BMC Bioinformatics 9, 456 (2008).
29. Griffiths-Jones, S. et al. Rfam: annotating non-coding RNAs in complete genomes. Nucleic Acids Res. 33, D121-124 (2005).
30. Lowe, T.M. & Eddy, S. R. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25, 955-964 (1997).
31. Pinheiro, M. et al. Statistical, computational and visualization methodologies to unveil gene primary structure features. Methods Inf. Med. 45, 163-168 (2006).
32. Schmidt, H.A., Strimmer, K., Vingron, M. & von Haeseler, A. TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 18, 502-504 (2002).
33. Rossignol, T. et al. CandidaDB: a multi-genome database for Candida species and related Saccharomycotina. Nucleic Acids Res. 36, D557-D561 (2007).
34. Jeffries, T. et al. Genome sequence of the lignocellulose-bioconverting and xylose-fermenting yeast Pichia stipitis. Nat. Biotechnol. 25, 319-326 (2007).
35. Enright, A.J., Van Dongen, S. & Ouzounis, C.A. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res. 30, 1575 (2002).
36. Finn, R. et al. The Pfam protein families database. Nucleic Acids Res. 36, D281 (2008).
37. Felsenstein, J. Inferring phylogenies from protein sequences by parsimony, distance, and likelihood methods. Methods Enzymol. 266, 418-427 (1996).
38. Ohi, H., Okazaki, N., Uno, S., Miura, M. & Hiramatsu, R. Chromosomal DNA patterns and gene stability of Pichia pastoris. Yeast 14, 895-903 (1998).
39. Huse, S.M., Huber, J.A., Morrison, H.G., Sogin, M.L. & Welch, D.M. Accuracy and quality of massively parallel DNA pyrosequencing. Genome Biol. 8, R143 (2007).
40. Kanehisa, M. et al. KEGG for linking genomes to life and the environment, Nucleic Acids Res. 36, D480 (2008).
41. Schmid, R. & Blaxter, M. annot8r: GO, EC and KEGG annotation of EST datasets. BMC Bioinformatics 9, 180 (2008).
42. Emanuelsson, O., Brunak, S., von Heijne, G. & Nielsen, H. Locating proteins in the cell using TargetP, SignalP and related tools. Nat. Protoc. 2, 953-971 (2007).
43. Hu, S. et al. Codon optimization, expression, and characterization of an internalizing anti-ErbB2 single-chain antibody in Pichia pastoris. Protein Expr. Purif. 47, 249-257 (2006).
44. Friberg, M., von Rohr, P. & Gonnet, G. Limitations of codon adaptation index and other coding DNA-based features for prediction of protein expression in Saccharomyces cerevisiae. Yeast 21, 1083-1093 (2004).
45. Hani, J. & Feldmann, H. tRNA genes and retroelements in the yeast genome. Nucleic Acids Res. 26, 689-696 (1998).
46. Brockmann et al. (2007) PLoS Computational Biology 3, e57.
47. Ghaemmaghami et al. (2003) Nature 425, 737.
48. Graf et al. (2008) BMC Genomics 9, 390.
49. Liu et al. (2004) Anal Chem 76, 4193.
50. Mattanovich et al. (2009) Microbial Cell Factories 8, 29.
51. Newman et al. (2006) Nature 441, 840.
Table 1: Primers used in the genome assembly process.
5' Primers used for sequencing 3' 5' Primers used for probes 3'
13 CCTATTCTCTCTTATCCCGACG Chr1 ATCGACTGAGCTTCAAGTTAATGTGC
16 GAAGCTTTATCGGCAACATC TTCAAGCAGTCTCCATCACAATCC
29 GGAAGATGCAGAGGATGTTAGG Chr2 ACTCTGGTGGAGTAACCGTACTCG
30 AGGTTATACTGTGTGTAGTACAGATGG ATTGTCTCCAATGCTTCTTGTACTACC
33 CATCTATAACGAGTCAAAGGATGA Chr3 CTACGATTATGATGTCAGTGCCAGTG
35 TGGAGTGAATGGAAGTTCAGAG GTGAGAACAACTAAAGAATTATTGGAGC
36 GATTAGCTAAGTTGCCCTATTAGG Chr4 TAGCACGGTAGATCTTGTGACTTGG
37 AAATTGTCGTTCTCACAACAGG GTAGAAGAGCCATTGTTCCATGTG
38 CATCTTAAGAATACATCATATGCACC FragB GCTTGCAGAAGATGATTACAAACTTGC
39 CCATGTTGCACTCACACTTGG GACTGAACTTGTTGGATGTCTAAGGAG
40 CGTTGAGTAGCTTTCTCCTGTCC FragD TCTATACAATGTCCAAGTGGACCAAG
41 TATCACTCCATATGGCTCTTGG GTTTCTTCGACGTTTCGATGATCC
42 GTAGAGTAGCTGACCTAAAGTCTTGG C2 AGCACGACGGAGTCTAAGATTCC
43 AAGCCGTAGAGGAGTGTAAACG GTCAGAGGTGAAATTCTTGGATCG
48 GAACTTCAACTGTCATTATTGATGG C34 TATGTCTTCTCCTGTTGTTCCTACCTCC
49 CAGGCATCTTCAGAGGGTACTC CGGAATCAATGATTAGCTCATTGG
50 TTCCTGTGGCAGATATCAGCT C121 GAATTCAGATGCTGATTTACTTGCAC
51 GAGAGTACTAACGATTCTTCACTCACT TGGATAAGCAATGGAGGAAGTTCTG
52 TTACCAATCACGTAGTTCGCTC C131 CATTAGTAGGAGCTTCCAAGGTACTGT
53 GACTTCTAACGGGTGAGATATCC AGTCCGTGGCAATAGAGCGATTG
54 TAGGTCAGTGCTTGATTGTAATCTC C157 TCACCAAGAAAGCTCTTATTTGGAGAG
55 AATCGACAGGGATTGCTCAA ATTTCAGTCTCGCATTGTCCGACT
56 ACACAAGGAGTTAGGGTACTAAGC C159 CAAAGAAGCTAGCTGTGGATGGAG
57 CCACTTTGAGTTCCTTGATCTCAC AACTGTAGGAACTCCATTCATCGGTG
58 CATTGACACTCCTTACAATGTGC
63 TTTGCGAGTTCATGTTTGAAGAGG
64 GAGTGGTATCAGAAACTTCAGATG
65 GACCCTGGTAGTCTCAATCTAAG Primers used for fragment joining
66 CTCACCATATAAGATTTATTCTGATAAAGC 1 CGTTTAACGTCGATAGGCTAG
67 GCTCAACTTTCTGAACTTGCTCA 2 CAGCAGTCAGACCCTTAATGTC
68 CATCAAAGTCCAGTAAGCAACCC 4 GCAGAAAAGTTACCAAGTGAGC
69 GGAGCTGTGTTATTTACTATTGAGG 5 TCGTGCCATGGTAGTATTCCT
70 ATGATTTCGAAAGGTTCTACCCC 6 TCTCAAACTGTTCAAAGACTTGTAC
71 TACTAACTTTGAGTTTAGAGGATGG 8 TAATGAAGGACAAATAATTTCAGC
72 AGGGAAGTCTAAGTTCTTTCGC 9 TGTTCACATAGCCAACTGATACTG
73 GATGATAACGGTGCCTGGTTCA 62 GGCCGGGAATGCTATTTGGT
74 GTATACCAGTTTCCCAACAATCTTAG 60 TTTGGCATTATGCGAAAAGAATAGGAC
75 GGGGTTGGAATTTGCTCGATAA 12 GCAAGCTATTGGAAACAACC
76 TACAGATGAATCTGGGTCGGTG 15 ATTTTGATAGGAAGCTTAGATTGC
77 CTGGTACTTACGAAACCACGTA 17 CCAACTCGTGACAACAACCTC
78 TTTCGTAAGTACCAGTCCATGG 21 GCCACCAACTTCAGTAGTCCAC
79 GTGGTAGCTGGGGTGGTT 23 CCTTAGAAGTCGTTATATGGAAGTG
80 TCTGTCTAATGCTTACCATCAGC 25 ATACGCATGTGGAGATTCCTC
81 CCCTCTGTCGATCTCAGGTTT 26 CTCTCATCCAACCAGTTGACC
82 CTGAACAGCATTGAAGATAAAAAAACG
83 CAGAAGTTTATGGCTGATATGATGC
84 GATAGTGGTGTTGGGAGGATATG
85 AGAATGATTTGGGAGATACGAGC
86 GCTGCCCTTTGAGTATCAGATC
87 GCTGGCCAATCTCATCTCATG
88 GGAAGGATTTGGGCTAATAAACTG
115 CAACTATCCACTTGCTCATAACAGAC
116 CCTACAGGGTTTCCATGTGATG
117 GGTAAGATGAACTAATTCACAAGGTG
Table 2
A. Genome sequencing and assembly statistics
454 Sequencing
Sequenced reads: 897,197
Sequenced length (bp): 218,602,026
Paired-end reads: 11,538
MIRA assembly
Assembled reads: 885,659
Assembled contigs: 1,154
Contigs (> 500 bp): 230
Length (bp): 9,658,092
N50: 40
L50 (Kbp): 77
Average coverage: 20
Contig joining
Joined contigs: 203
Supercontigs: 10
Length (Mbp): 9.3
Chromosomes: 4
N50: number of contigs that collectively cover at least 50% of the assembly
L50 length of the shortest contig among those that collectively cover 50% of the assembly
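The two statistics can be computed as sketched below, following the definitions given in this table (N50 as a contig count, L50 as a contig length). Note that many assembly tools use the two names the other way round. The contig lengths in the example are invented for illustration.

```python
def assembly_stats(contig_lengths):
    """Return (N50, L50) as defined in Table 2.

    N50: smallest number of contigs, taken from longest to shortest,
         that together cover at least 50% of the assembly.
    L50: length of the shortest contig in that set.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half:
            return count, length
    raise ValueError("no contigs supplied")


# Hypothetical contig lengths (arbitrary units), total 300.
example = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = assembly_stats(example)
print(n50, l50)
```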
B. Genome Contents Overview
General information
Size (Mbp): 9.3 (not including rDNA loci, estimated at 0.12 Mbp)
Genome GC content (%): 41.1
Assembled chromosomes: 4
Coding genes
Coding genes: 5,313
Coding %: 79.6
Coding GC (%): 41.6
Mean gene length (bp): 1,442
Single exon genes: 4,680
RNA genes
tRNA genes: 123
5S rRNA genes: 21
Mitochondrial genome
Size (bp): 36,119
Genome GC content (%): 22
Coding genes: 16
tRNA genes: 31
Table 3. Resequencing of selected known P. pastoris ORFs.
Comparison of sequence differences between our 454 sequence and the ORF of genes present in GenBank. The genes containing sequence differences and our 454 sequence are given in the last column
Differences Differences
Gene Name Locus id CDS (bp) Genbank Sanger
1 Cyc chr4_0018 333 0
2 Delta 15-fatty acid desaturase chr4_0743 1248 5 0
3 YPS2 chr3_0299 1584 0
4 PFK3 chr4_0943 1056 1 0
5 FET3 chr2-1_0787 1890 5 0
6 EF-1 FragB_0052 1380 5 0
7 GAS1 chr1-3_0226 1617 0
8 PMR1 chr1-4_0325 2775 0
9 Methionine synthase chr2-1_0160 2307 10 0
10 Sphingolipid C9-methyltransferase chr4_0465 1471 0
11 DES chr3_0939 1083 2 1
12 SLD1 chr1-1_0013 1629 0
13 VAC8 chr1-4_0101 1671 0
14 FTR1 chr1-4_0040 1098 0
15 KEX2 chr2-1_0304 2334 0
16 YPS1 chr4_0584 1800 2 0
17 PRC1 chr1-4_0013 1572 0
18 PDI chr4_0844 1554 2 1
19 YPT-like chr1-4_0627 919 0
20 SEC4 chr3_0143 615 0
21 FAD2 chr4_0052 1263 1 0
22 ALG3 chr4_0712 1398 6 0
23 EF-2 chr2-1_0812 2605 0
24 PNO1 chr1-4_0410 2334 0
25 GSA10 chr2-1_0641 2391 1 0
26 CBS chr2-2_0137 1506 2 0
27 ceramide glucosyltransferase chr3_0357 1530 0
28 GDM chr3_0531 1344 0
29 GSA9 chr1-4_0555 3941 0
30 SEC12 chr4_0606 3117 0
31 SAR1 chr1-1_0180 648 0
32 SEC18 chr3_0342 2277 7 0
33 ACT1 chr3_1169 1728 0
34 UGT51B1 chr4_0167 3636 0
35 GSA12 chr3_0931 1632 1 0
36 GSA11 chr1-3_0168 5580 24 0
37 SEC17 chr2-1_0644 936 7 0
38 SEC13 chr1-3_0057 943 3 0
39 AOX2 promoter chr4_0821 1550 0
Total length 70295
Total differences 84 2
Accuracy 1/35147 bp
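The accuracy figure appears to follow directly from the totals in the table: the total resequenced length divided by the number of differences that remain against the Sanger data. A small sketch of that arithmetic, using only the totals listed above (the function name is illustrative):

```python
total_length_bp = 70295
differences_vs_genbank = 84
differences_vs_sanger = 2


def bp_per_difference(length_bp: int, differences: int) -> int:
    """Whole number of base pairs per observed difference."""
    if differences <= 0:
        raise ValueError("at least one difference is required")
    return length_bp // differences


# One residual difference per 35147 bp against the Sanger resequencing.
print(bp_per_difference(total_length_bp, differences_vs_sanger))
# For comparison, the GenBank entries differ far more often.
print(bp_per_difference(total_length_bp, differences_vs_genbank))
```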
Table 4. Nuclear tRNA genes identified in the P. pastoris genome
Overview of P. pastoris tRNA genes
tRNA species Anticodon P. pastoris intron S. cerevisiae¹ intron
tRNA-Ala AGC 5 11
tRNA-Ala UGC 2 5
tRNA-Arg ACG 2 6
tRNA-Arg CCG 1 1
tRNA-Arg CCU 1 1
tRNA-Arg UCG 1 0
tRNA-Arg UCU 4 11
tRNA-Asn GUU 4 10
tRNA-Asp GUC 6 15
tRNA-Cys GCA 2 4
tRNA-Gln CUG 2 1
tRNA-Gln UUG 3 9
tRNA-Glu CUC 4 2
tRNA-Glu UUC 5 14
tRNA-Gly CCC 1 2
tRNA-Gly GCC 5 16
tRNA-Gly UCC 3 3
tRNA-His GUG 3 7
tRNA-Ile AAU 5 13
tRNA-Ile UAU 1 2
tRNA-Leu AAG 2 0
tRNA-Leu CAA 3 I 10
tRNA-Leu CAG 1/1 -/I 0
tRNA-Leu GAG 0 1
tRNA-Leu UAA 1 7
tRNA-Leu UAG 1 3
tRNA-Lys CUU 5 14
tRNA-Lys UUU 3 7
tRNA-Met initiator CAU 2 5
tRNA-Met elongator CAU 2 5
tRNA-Phe GAA 5 10
tRNA-Pro AGG 2 2
tRNA-Pro CGG 1 0
tRNA-Pro UGG 4 10
tRNA-Ser AGA 4 11
tRNA-Ser CGA 1 1
tRNA-Ser GCU 2 4
tRNA-Ser UGA 2 3
tRNA-Thr AGU 5 11
tRNA-Thr CGU 1 1
tRNA-Thr UGU 1 4
tRNA-Trp CCA 3 6
tRNA-Tyr GUA 4
tRNA-Val AAC 5 14
tRNA-Val UAC 1 2
tRNA-Val CAC 1 2
total 123 274
different tRNAs 45 42
¹ Hani and Feldmann (1998) Nucleic Acids Research 26 (3), 689
Table 5A. Methanol pathway genes in P. pastoris
Overview of P. pastoris genes involved in methanol metabolism (shown in Fig. 3A)
Reference Gene EC code Locus id
1 AOX 1.1.3.13 chr4_0152, chr4_0821
2 FLD 1.2.1.1 chr3_1028
3 FGH 3.1.2.12 chr3_0867
4 FDH 1.2.1.2 chr3_0932
5 CAT 1.11.1.6 chr2-2_0131
6 DAS 2.2.1.3 chr3_0832, chr3_0834
7 DAK 2.7.1.29 chr3_0841
8 TPI 5.3.1.1 chr3_0951
9 FBA 4.1.2.13 chr1-1_0072, chrM 0319
10 FBP 3.1.3.11 chr3_0868
Table 5B. Protein secretion pathway in P. pastoris
Overview of P. pastoris genes involved in secreted (glyco)protein post-translational processes (shown in Fig. 3B)
Reference Process or complex Gene Locus id
1 Sec61 complex SEC61 chr1-3_0202
SBH1 chr2-2_0210
SSS1 chr1-1_0023
2 OST complex STT3 chr1-4_0685
" chr1-4_0496
SWP1 chr1-3_0248
WBP1 chr2-1_0423
OST1 chr3_0741
OST2 chr2-2_0346
OST3 chr4_0610
OST4 chr2-2 62421-62516
OST6 chr3_1142
3 Signal Peptidase complex SPC1 chr1-1_0491
SPC2 chr2-1_0589
SPC3 chr4_0874
SEC11 chr1-4_0187
4 Quality control
Chaperones ROT1 FragB_0048
LHS1 chr1-3_0063
CNE1 chr2-1_0322
YDJ1 chr2-2_0066
EPS1 chr2-1_0421
SHR3 chr1-3_0116
KAR2 chr2-1_0140
SIL1 chr1-1_0237
Folding sensors HTM1 (MNL1) chr3_0891
UGGT chr1-3_0114
" chr3_0929
5 Early N-glycan processing GLS1 chr1-1_0215
GLS2 chr2-1_0778
MNS1 chr2-1_0753
GTB1 chr3_0179
6 N-glycan precursor synthesis SEC59 chr2-1_0498
ALG7 chr2-1_0727
ALG13 chr1-4_0448
ALG14 chr3_0944
ALG1 chr2-1_0759
ALG2 c121_0002
ALG12 chr4_0544
ALG11 chr1-4_0417
ALG3 chr4_0712
ALG9 chr2-2_0036
ALG6 chr2-1_0549
ALG10 chr1-4_0475
ALG8 chr3_0999
O-Glycosylation PMT1 chr2-1_0212
PMT2 chr2-1_0256
PMT3 "
PMT4 chr1-4_0033
PMT5 chr1-1_0286
PMT6 chr4_0777
Nucleotide sugar synthesis
GDP-Mannose synthesis PMI40 chr3_1115
MPG1 (PSA1) chr3_0870
" chr2-1_0093
ALG4 chr2-2_0053
UDP-Glc synthesis UGP1 chr1-3_0122
(QRI1) chr3_0676
UDP-GlcNAc synthesis GFA1 chr2-1_0626
GNA1 chr4_0066
PCM1 chr1-1_0067
QRI1/UAP1 chr3_0676
UDP-Gal synthesis GAL10 chr4_0839
Nucleotide sugar transport
GDP-Mannose transport VRG4 chr3_0916
GDA1 chr4_0021
UDP-GlcNAc transport YEA4 chr1-3_0163
UDP-Gal transport HUT1 chr2-1_0692
Pisti UDP-GalT chr4_0810
(VRG4) chr3_0916
(YEA4) chr1-3_0163
Golgi N-glycan processing
Hyperglycosyl and core type OCH1 chr1-3_0251
MNN9 chr4_0103
VAN1 chr2-1_0772
MNN10 chr2-2_0185
MNN11 chr2-2_0125
HOC1 chr3_0620
ANP1 chr3_0515
MNN2 chr1-4_0037
MNN5 chr3_0370
MNN6 chr3_1162
" chr3_0215
MNN4 chr1-4_0409
" chr2-1_0718
" chr2-1_0706
PNO1 chr1-4_0410
β-Mannosyltransferase chr1-4_0696 chr4_0471 chr4_0450 chr4_0451
Pro-peptidase KEX2 chr2-1_0304
STE13 chr2-2_0310
DAP2 chr3_0896
Protease
Aspartic-type endopeptidase YPS1 chr4_0584
YPS2 chr3_0299
" chr3_1157
YPS3 chr3_0303
" chr3_0866
YPS7 chr3_0394
MKC7 chr1-1_0379
PEP4 chr3_1087
Cysteine-type peptidase ATG4 chr1-4_0522
GPI8 chr4_0261
HSP31 chr3_0691
HSP32
HSP33
SNO4 "
Table 6. Location of TATA elements in certain Pichia pastoris promoters.
Table 7. Relative glycan contribution
GlycoSwitch strain relative glycan contribution
Man5 GnMan5 GalGnMan5
GnMan5 21,26 ± 0,52 78,74 ± 0,52
GnMan5::UDP-GlcNAcT 30,52 ± 2,27 69,48 ± 2,27
GalGnMan5 19,62 ± 0,64 8,30 ± 0,38 72,08 ± 1,01
GalGnMan5::UDP-GalT 21,53 ± 1,91 9,49 ± 3,75 68,98 ± 4,69
GalGnMan5::UDP-Glc-4-epi 20,69 ± 1,72 9,18 ± 0,41 70,13 ± 1,66
Table shows average ± standard deviation (n=12)
Table 8
Pichia pastoris columns: To check, Pipas annotation, Graf wild-type, Mattanovich Secreted *
Saccharomyces cerevisiae columns: Name, Newman, Ghaemmaghami, Liu, Brockmann (relative)
* chr2-1_0140 54500 KAR2 197,0 336941,9 156,0 48808,8
chr1-1_0160/chr4_0844 249 PDI1 407,0 #N/A 195,0 32771,3
* chr4_0559 CRH1 163,5 29521,4 #N/A 29521,4
chr1-4_0426 120000 * BGL2 450104 27,0 28279,0
chr1-4_0013 16700 PRC1 44049,8 21,0 27096,5
chr2-1_0052 12100 * SCW11 127,5 22621,3 #N/A 22621,3
chr1-1_0267/chr4_0545 3260 CPR5 112,5 #N/A 37,0 13586,4
chr1-3_0226/chr1-3_0227 51600 * GAS1 6241,5 12359,8 27,0 11953,7
chr1-4_0611 16000 APE3 4738,8 70,0 11809,0
chr2-2_0148 186000 PST1 181,0 11676,9 #N/A 11676,9
chr3_0179 64,5 9755,8 #N/A 9755,8
chr3_1003 1750 * CTS1 194,0 6348,4 #N/A 6348,4
chr1-3_0229 * SCW4 575,0 6185,6 8,0 6175,1
chr3_0419 329 ECM14 468,4 26,0 5896,7
chr3_0299/chr4_0584 14900 YPS1 5436,2 #N/A 5436,2
chr2-1_0454 46000 * EXG1 171,5 4277,0 8,0 5220,8
chr1-4_0037/chr3_0370/chr3_0767 2300 MNN2 129,5 6725,3 1,0 4416,7
chr3_0107 977 CTS2 3049,1 #N/A 3049,1
chr1-1_0147 DFG5 343,0 2948,6 #N/A 2948,6
chr3_0120 1760 BIG1 64,0 3464,9 1,0 2786,6
FragB_0048 ROT1 635,5 2360,9 2,0 2687,8
chr1-4_0242 DCW1 403,0 2583,7 #N/A 2583,7
chr1-1_0293 21800 UTR2 684,0 #N/A 1,0 2108,2
chr3_0633 7380 1417,9 1,0 1763,1
chr1-1_0130 64,3 2,0 1539,5
chr1-4_0017 20.4 PPN1 319,3 1,0 1213,8
chr1-1_0379 MKC7 538,4 #N/A 538,4
chr2-1_0156 1870 NHX1 121,0 521,1 #N/A 521,1
chr3_0394 9960 YPS7 47,0 148,8 #N/A 148,8
chr4_0305 4810 PIR1 538,0
chr1-4_0164/chr3_0076 61300 * PRY2 #N/A #N/A #N/A
chr3_0306 985 GAS2 #N/A #N/A #N/A
chr3_0303/chr3_0866 YPS3 50,5 #N/A #N/A #N/A
chr3_0517 ZPS1 #N/A #N/A #N/A
P. pastoris Sequences
1 Potentially secreted P. pastoris genes and genes involved in secretion
1.1 Putative secreted proteins with a predicted signaling peptide
1.2 Proteins potentially involved in secretion
2 P. pastoris homologues of genes involved in the glycolysis pathway
3 P. pastoris homologues of genes involved in homologous recombination
4 P. pastoris genes with high expression levels
5 P. pastoris homologs of promoters used for expression of proteins in S. cerevisiae
6 P. pastoris genes involved in glycosylation
6.1 Nucleotide sugar synthesis and transport
6.2 O-glycosylation
6.3 Mannosyltransferases
6.3.1 α-Mannosyltransferases
6.3.2 x-mannosyltransferases
6.4 ER glycosylation pathway
6.5 Remaining genes of the glycosylation pathway
7 Methanol metabolism
8 Annotation of homologues of S. cerevisiae proteases
8.1 Serine-type peptidases
8.2 Serine-type endopeptidase inhibitor
8.3 Aspartic-type endopeptidases
9 Chaperones
10 5S ribosomal RNA gene
11 Xylose metabolism
12 Arabinose metabolism
13 Trehalose metabolism
14 β-mannosyltransferases
1 Potentially secreted P. pastoris genes and genes involved in secretion
1.1 Putative secreted proteins with a predicted signaling peptide
SEQ0001
P. pastoris homolog of S. cerevisiae KRE9 (YJL174W)
Glycoprotein involved in cell wall beta-glucan assembly; null mutation leads to severe growth defects, aberrant multibudded morphology, and mating defects
Annotated localization: extracellular region (IDA)
S. cerevisiae null mutant: inviable
Chr3, 0960
5' region (in bold start and stop codon previous genes) agcagggaccctaacagcactatttggaaagtttgcaatgttgttcaatctggcaaactg ccttgtgaatctgcccgtgggcagtttgggaaacatctgggacatgatgatttgagtcgg tgatgaaaagttttgaggttggtagcgaaatctgactacaatgcgatggtttccctccta agagcgaatccgcccagtaaaccaacggtagaaccacctatattacttcctatttacagc ttatatactcatcacaacgtcctacacgaccttgggcgccaacgatactattttgttaaa gttcttgtcacgcagctccttggctacaacgctactgattctggtggctatgggttcatc attcttgttcaacagaacacatgcgttatcatcaaatctgatcactgaaccatctggtct ctgagtttccttctttgttctcacaataactgcccggactatgtctccccttttgactct gttagaagccgaagcaccagtgatcgtggaagtggaaggtctggccttctgcaccacaca gacaatcttgtctccaatggaagcaaagttcttcgggttctttctcaacaccttgatgca ttcaaccatctgcgcaccagaattgtcgatgcatttaagcatcgtcttaaggtatatcat tgtgggtgtgaatgttttccagtgaggtaatgaaaaaatgagaaattcgatacctgaaat cttagaactggaggagcagtagagggagaggactagtgggcggaggtgcctccgcgctcc ttccacggtcatgtaggtttcttttcttctttcgctaacattctttgccaagttctattc ttaacaccatttgttttgggcgtgtactaagcagcatcgcctcacgctgctgaatatgaa atctatcctgtgctattctgtctttcactttgtcaacaactgcaaccagatcgaacacga acgtgattccattcgtttactttctccattcgacacttcaa
Downstream (in bold start and stop codon next genes) gaagttctgcctcactcttaaaaaccatagtaacaatgtatattttattttactctacca caatatttttttccctgtttcctctgcggtactgtacctgctgtgtttcgaatatcaata aaacagctcactgtgagcatggacttgttgcaccagtactttcaaggcccgctcatcaag tgcgttacccatggaagttctgctgaactcttgcacactgccatcgtctttgaccttgcc gacaaactcgtaaaagtagccattttgaagatcctcgtctctctcacttgctaggttaac agtcacctttccattcgactctacttgcgctgttcgggaatcatgatgtacaactttacc tataagtttaattggagaggttgtaaaccgtcccagcacggaagcatctactctgatggc gttcatatcagttctggttgtgaacgcgttagctgtaattgatttgaagaagggaatacg
cgacagacataattttttagaagacacttagtagcgagcccttccactgatgttttctct tcgtctgttagactcatctatcaaaacccttcgcgaaatagtaatacgtagcaccaagtt ccctgactttcttttcagtaattttcactatgactaaacgagtccttttgattgataact acgattccttcacttggaacttgtaccagtatctgtgccaggagggtgcccatgttgatg tgtataggaatgacgagattacattggagcaagtctctcaactgaatcccgatgttgtcg tcatctccccaggccctggtcatccaacaactgatagtggaatttcagtagacgtcatca agacattcaagggtaagatacctatatttggtgtctgcatgggccagcaatgtatgattg cagctttcggaggtgtggtagagtacgctggcgagattgttcatgggaagaccagcccta tcgctcatgacggcaaagggctgttccaacatattccggac
AA (bold underlined signal peptide)
MFWLLVLSLISQALAWEFTSPEGGETF SVSGSTVT IP IEFKDDESFPS IADASNLVISLCTGPNGDI SCSE IET TPPADLLSGSTYEYDASVLATFGRSGSYYLQI YSFYSGGYAIVYSDRFSLQSMTGTLTPSGSGSPPANDI SMSNA QNAE ISKSFTVPYTLQTGRTRYAPMQTQPSGTVSATGHTRRYPTSSVSYYSTLVPSPAVYST ITPGWDYS FTSAV NFATPAPYPSQNGGWYAASRRI SRPF IQATTALQRRWAY
P. pastoris homolog of S. cerevisiae PRY2 (YKR013W)
Protein of unknown function, has similarity to Pry1p and Pry3p and to the plant PR-1 class of pathogen related proteins
Annotated localization: extracellular region (IDA)
S. cerevisiae null mutant: viable, decreased osmotic stress resistance
2 homologues
SEQ0002 Chr3, 0076
5' region (in bold stop codon previous gene) gcatatctcaaggcatcaaaggttctggatctctcgttgatacattccgcgatgggttgc tctattcaaaggaacttgctgacctgaaaaggtttgaatatgctgaggatctggagtttt atgatgctgaggaagaagaacagaactttaagcaagggatcactggcaacggaaaaacta gcaaacaccagagagtcttatcccaagatatctccaataattccgtgaaaatgaccccag taaatggtatcaacagtgattcgtatgtatcggtgaatacatagtaccctcctatttagg caaatgtcatgcttcaaatataatacccacgtttgttattgttacccctatccttttgtc tagaacacgtttgtcgcacaccaactatgcctagaaccctgtagatcactctcattggct cgaacgatggggatacagaatgaaacctggtggaaagttgtttacatttggaattgagtt agcccctgtgattatgacaatctgcatgctgatgcatttgatgcatgtgcattctaatct gagaattccactcagtagatcgcaacaaagaagacaaaaaagtaggacaatttctctggt aaagcgagttgaatgatcaaagttaagataagctagcctatttacacctgttgtttctct gtttgaaaaggccttttttgtgatgattttttttgactcatcacgttattccaaatttca ctcgaaacaataggtcgtgcagatgcattttcatacacttaaaatcagtcttcttgtcta atattaagaaacgcgttatttacgtctccccctttttggaaccctggcactcagggtttg ttagagaagttataagactcctgtggttccctgaaatcgtgttgtttgtgattctttttg atacatcaagtctttttcaaaaaacaaaagaagatactcttgagccgttttcttcaaatc ccaactatctgtcactcctttaaaacacgttcactattcca
ORF
ATCAGCCAGACACTTCTGGATACTCACAATGATAAGCGTGCTTTGCACGGCGTCCCAGACCTTACTTGGTCTACC
TCTGCCGGCTACTTCGAAGACAACGTCCTGCCTCCTGTTTGA
Downstream (in bold start codon next gene) tcggttcgttcttgcttcaacgtttacgtaatatttcatgtattccattctttaaattgt ttgattatgtaagttattgtttcaacaccagtagtgtcttcgctgaaaatgtttccagtt gggaaatttttgactggcagaagaatctcaggatccaactgactcaaaatcttagactag gcaaacatactaaagaaccctaccgaatctactaataatgagcacatcaacttgccggtt ctacgaaaacaaatatcctgaggtggatgatgttgtcatggtgaatgtacaacagatcgc cgagatgggtgcctatgtgaaactattggagtatgacaatattgaaggaatggtgttact ctctgagttatccagaagacgtatcagatctatccagaagcttatcagagtgggtaagaa tgaagtagtggtggtactgagagttgataaggaaaaaggatacattgacctgtccaagag acgagtctcgtcggaggatatagcaaaatgtgaggagagatacacaaagtccaagtctgt ccattctatcttgaggcattgtgctgaaaagtttaacatgcctctggaggagctgtaccg tactataggatggccattaagcaaggaatttggccacgcctacgatgccttcaagatatc aatcactgacagcactgtttttgacaaggtccaacccccatcacaagaggtcttagaaga actgaaagtttacatctcccggaggctaacgccgcaggccattaaatgtcgtgctgacgt tgaggtctcatgcttcagctatgagggaattgaagccatcaagagttcactaaaagctgc tgaggatatctccaccgaggctaatcaagtcaaggtcaagctagtagccgctccattgta cgtcattacaacccaatccttagacaagacgcaaggcattgaaatcttagaacgtgcaat caaaatcatcgaagactccatcaccaaacagggcggatcct
AA (bold underlined: signal peptide)
TKKGKGSTTHSGAPGATSGAPTDDTTSTSGSVGLPTSATSVTSSTSSASTTSSGTSATSTGTGTSTSTSTGTGTG TTGTGTTSSSTSSSATSTPTGS IDAISQTLLDTHMDKRALHGVPDLTWSTELADYAQGYADSYTCGSSLEHTGGP YGENLASGYSPAGSVEAWYNEISDYDFSNPGYSAGTGHFTQWWKSTTQLGCGYKECSTDRYYI ICEYAPRGNIV SAGYFEDNVLPPV
SEQ0003 chr1-4_0164
ORF
CCACCAGGAAATTATGTCAACGAGGGATACTTCGAAGCCAATGTGTTACCACTGGTAGATTAA
AA (bold underlined: signal peptide)
MRLLHISLLSIISVLTKANAECCYTNTHTTTEVWYTTVYARDVSEETSSTLAGGSATVSSEVSSTIESSVATSAT TESSSETSGSTSGSTSATESSTGSSSLATSSS ITSSESSTITQTTGQESTSPTPSSSETGSSTTTPYDISPTASS DFDAFKYQILDEHNIKRALHGVDGLEWDEEVYAAAQAYADAYTCDGTLVHSGNSLYGENLAYGYSTRGTVDAWYS EIEYYDFNNPGYTPGVGHFTQWWKSTTKLGCAFKYCNDYYGAYWCNYSPPGNYVNEGYFEANVLPLVD
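The ORF fragments in this listing show only the 3' end of each coding sequence, so one quick consistency check is to translate the fragment and compare it with the C-terminus of the AA entry. The following is a minimal sketch of that check for SEQ0003, assuming the standard nuclear genetic code applies; the names `translate`, `orf_tail` and `aa_c_terminus` are illustrative and not part of the listing.

```python
# Standard genetic code, laid out in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; '*' marks a stop codon."""
    dna = "".join(dna.split()).upper()   # drop extraction whitespace
    usable = len(dna) - len(dna) % 3     # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# 3' end of the SEQ0003 ORF as printed in the listing.
orf_tail = "CCACCAGGAAATTATGTCAACGAGGGATACTTCGAAGCCAATGTGTTACCACTGGTAGATTAA"
# Last 20 residues of the SEQ0003 AA entry.
aa_c_terminus = "PPGNYVNEGYFEANVLPLVD"

translated = translate(orf_tail)
# The fragment should encode the protein's C-terminus followed by a stop codon.
assert translated == aa_c_terminus + "*"
print(translated)
```

The same function can be pointed at any other ORF fragment in the listing; a mismatch would more likely indicate transcription damage in the text than a real discrepancy in the sequences.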
SEQ0004
P. pastoris homolog of Phanerochaete chrysosporium ABB73028
Mannose-6-phosphatase
Ref: Computational analysis of the Phanerochaete chrysosporium v2.0 genome database and mass spectrometry identification of peptides in ligninolytic cultures reveal complex mixtures of secreted proteins (Vanden Wymelenberg et al., 2006)
Chr1-1_0071
5' region tgtgtaaaaaatatgttcaattgatatgtggggaactaaggggagaagaaaaaggatatt gtcagcgggaactggcaccagttaaatagtacgcagccattatttcgaagcgctaggagg gttccaatgagagggggaagaggcctggtgagctccggggtaatcgcgtgttaagtgtag cacaactgggacatagttgctggttcaagtcacgttatttcactttatacagcccggacc cctatctgttcaggattgctctggatgccattggtccgttcacgaccttgatctggtgat tgttgtatgagaaatatctggcaatagcggatccatgtacaccggagaccggtgagagta cgtaagcaacggaggtgtgaaatgaattcaagtaattcggcaaggacgggatgagttgaa gactcaattagtcgagaaaaagagcgtttttttgacgggggtagtctggacagtgtgcta gaggtaacggttttcaattacgaagtgtgctttggggatgtcaggctagcatcagcttgg tccctctatagtgttctctggtgggaattcccaatcgaaaccttattatctcccgaaaat gataccaccagtgaaggctgaaaaagctctcaatttggtgggggaccatgctttccggat tatgtaagggccttctactgtagcccgcacgaattgtattgaactaggcaataaacctag gcatccaatccaggcagctcgaatgtcatctaggaggtgtggactagaagaggatcaaag aaaaccttaattaggaacaatcgagataacagataaactacattgtatccccactatggt agatactccaggttcgactgtattactgatccttgcacccttccccatacgagcacacaa tggtaaactcttagccgaataaatcgaacagatggaaagctgcacccaagaccaataact tcattcttcagacaaacttgtagaacttggacaactgtaccaccatggtcaacggccaag gacgtctgtaaagagtccaacctatatggtgattctttcctatgttgttgtctgcctgtt aaatttgcgttaagaagagaaagagtctggtctgaaaaattttatagctagtcagtcgtg tcctaaccatatagcaggcgttccaaactttctagggatagaagaccgtcaagaaatgca aattaaaggatgaaagagggggagaaaggggacaggggatctaggaggagcatgtgaaga atggtagttcaacattaacttgaggtataagcaaaatggtagtttgatcaactcaatggg ggtgcattccgctatccaggtggctattgtggatgggcaccttaattggtcagctcaaat gtcaataatatcaggaaatgacgtcaggtttgtacccgccaagtgcggttctttgtccta gcacgcatcgctggatgcacgtcctattctcttcttaaatgaggtagcaaaaccactcct ttaggcatcgactttgctcatcgcgctatttcttcttccccggcataaccttaaatcage ccctttaaggtcaaagtgca
ORF
CGTCCAGTAGTTACAGAACTTCTATTATAA
Downstream (in bold stop codon previous gene) tctgtcactagcttatctgtatcgttgcatatctttgaggtccttctatttcaaagacag gctgtttcatgtatggggctaacaactttgaaaatcgcttttctacagtctccacctgtt ctgcgagtggcacttgttcagctaatgtcttaagaatggttgggataactagtttatcaa cattcttaccccatagtagatccatgtgttcataatcttgcactgggacgatatccagga cattttcatgaggcagatatggtagcatcgtaggtatgttcattagagtgtcattggtac cgtacatcaatagaagcttagtagaaatattattagttggaaaaggagttggcaaatgcc ccttgagaccaaagataacattgagaggggagctatcgtggtccttctcgtcaaacattt caaaacgttgatttttgattatctggaaccaatgcagaatcgatttaactgaactattgg aaaagagatggggatatccgatttgcttctgctgtgatgagatgttatcgccttgccaat tgaacaaataaatcaatgagagatcaacaacacgctgatagtattcctgtcccattgccc atttccaaaaatgaactgagggcattatagccctatttccaagaaccagtggaagatagt ggtacgcattcactgcaaacgaggcaactgggtgcctcaaattaggtggaatgaatgcag
gtgaaagacctatcattaactgaattttgtcgttcaaatatggattcaaagacagactgg ctagtccttgagcagttccctgactgaacccaatataggccagtattttgtgtttattgg taatgtgtagaatggtacggatggaatctggtatgttgtaaatggcatactcattcaacg aaaagttccaaaatcgttttgatttagtagatattgatgtgtggtttctggagtatttgt tacctctattatttcccagccatacgtcatatccgttatct
AA (in bold underlined signal peptide)
MKNLAKPKINYPFLILLPFLLFFIWFPNFKPFNREMPEVQAPSKSLIRLQTWNLRFDSQPNGESVQDTISRMERK VPQDNKISYYETSGEVPWSKRRISIANHWFNGAGIFAVQEALYRQVQDLNELLNLFDSSSWDWIGLGRDDGKLG GEFEAIFYDTERFELIEWDNFWLSKTPFEPSRYPGAGSLRTVTIGLFKFKGEESKNPFI IMNTHYDEQSENQRRL ASSLIKYRASYEFERLKNEYGVENPS 11LCGDLNSHSFNENSGGYRIITGNQPLETDEISSEFAERFRSANLSKF NFRDLMEETPPQFRFGHLSTFTGFSKIGDTSKFSRIDFQFGATWEEDGKTNHGWKALKYKVDEIFYDNEYPLSDH RPWTELLL
SEQ0005
P. pastoris homolog of C. albicans SC5314 potential secreted Cu/Zn superoxide dismutase
Chr1-1_0109
5' region (in bold start and stop codon next gene) agaaatcatgtacaggttacttccaagatcatataacagcgctttgtctaggactttgtg ttctagctccattgctcgctctacttataagacgttaagacccaccgttgctggcgacag gcaggatgactttttcaattcattgcagtttggagaagcctctaagtggtcgaaacattc actaccacgcagtaggaggttgaaggacaagaattccaacgaggagaaagacgtttctgc tgcttctgaagatagttccaaaccagcttcagtcacctccaattctacatcgtctcccaa actatccacatccacagctacatccacattgacaacttcgaacaacactgttccacctac acagaaaacacagaagaaacgccgtgcgttgcgcccccgcaaggctttgatcactctatc gccaagcgcagtatcacatctacgagcattatcagatctacccgaaccaaagttgactag aatcggtgtacgtaacagagggtgctcaggactcacctaccacttggaatacgtatccga accgggtaagtttgatgagctggtggaacaagacggtgtaaaagttctgatagactctaa ggctttgtttagtattgttgggtcagagatggactgggttgatgacaagctcagctccaa gtttgtgtttaagaaccccaattccaaaggaacttgcggatgtggggaatcgttcatggt ttagatgataaaccqaactqactcacattcctctccqagatcqgatttqagacaacaaaa gaaaagggttcttttttccttcttcgcgaaaggtggaatgccttgctatgcgacatcaaa catctcacaaccaacatcatttctgcatcaaactcaaactctgcggcaccacctttgtgc cttatagacacaaattcagcatatttaaactagcaagactgttatacacttcagaattgt actcactcaatcgactttttacacattctatcacatttata
ORF
CACATTCATAAGAAACCACTCAAGTCTGGAGCTTCTTGTGAGTCCACCAGTACTCATTTCAACCCGTATAATGCA
GCCATTATTTCCTACCTGGTCTAG
Downstream (in bold start and stop codon previous gene) actttcattaactgagctaaatacgcactaaatatatgactaaagaaaagattaagagta gatgaactaatagtgagccgcatacaagtcgaatgcacgacaaagagtttgtcgcgacgc atcatctacagacagaagctcctcactaacagagaattcggtgccatgtgaaactgattg ctgactaggctatcacatggcactaaactgtatataagccagccagtggccaattttatt ctctcttctagtcaagaaaatgtcattcaaatcagagccattgttgcagggaatagctga caccctcagcaaaaatgaggcgctaaaagaaagaacccttcaaaagaccagagcaatttt
tgtctttaatgtaaaaaataaggagggtaaagccaaggcatgggccttagatctgaaaaa ggcaggaactttaacagaactgaagagtactgaggaattggaggaattgaagcctgacgc tactatcacgatcaaagatggcgatcttcgtagtttgatccggggaaagtccaaccctca aaaactttttatgagtgggaagctgaaaatcaagggaaatgtggctaaagccgcttcaat cgaaacagtgttgaagagcactaggccagccaagtccaaactatgagattcttatttaca aaaaaatgaataaattaactaactattttcagtaagtgtctcccttcgaactcaagaatt gatactacccttatacctgatcttaaaatttttaacctttttctcctcacgtggattctt tccccaaaggaggatccccacggagattctgttcactagcgtcttttgctgaaccccggt gtgtactgcgaaatggcctcaccgaattctgtggtgcccaagcttctccgataagtgggt gttctagcgactgctgcatcataggccctacttgagaataaaaagaaccgagaaaaagaa acgcgcaaacacagtgtggttgcgccaaatccgaagagact
AA (bold underlined: signal peptide)
MKLQLFTLTLSLMASSTLAAKAPKNKDSPSGAVARADFDDDKLHGIIEFSTAKNKSVEVHVDITGLPETGGPFYY HIHKKPLKSGASCESTSTHFNPYNAPADCLKTKDDSYCQVGDLSGKHGYINTTCFEDSYYDPYISLNPKNKAYPV GLAINLHYANQERFACANIELVDKKVKREFDDVDAEFVGQIPESDNWSAESVKEDTKVKSSKKDGSSTSKPHVS
KLSSKKNVTNGTSPLLSKGKYYTNSTSKYGNFTNGTSYPTDDDYDDDETIEVSSSYANAAGYLLNGNTVYGATLA
AI ISYLV
SEQ0006
P. pastoris homolog of S. cerevisiae DSE4 (YNR067C)
Daughter cell-specific secreted protein with similarity to glucanases, endo-1,3-beta-glucanase, degrades cell wall from the daughter side causing daughter to separate from mother
Annotated localization: extracellular region (ISS)
S. cerevisiae null mutant: viable, no cytokinesis
Chr1-1_0130
5' region (in bold stop codon previous gene) tcgaccctacactttacgccgcctttgaagccggtgagcagagcaaactcaatggaccct atcccacaaccagtaaaagtccccgtggaggatttttttatcccaccaccggggctagtg cggaggtcaacatcaaattacaagaacgagatgtccaaggcaaagcttcgcaggcaagga agcgcaggatccaaggaagaggagaaagcaatatttgaccacatgtcgcctcaaaatctc aaaagaaggtgctccaccgagctgaacagtgcgctcccaaatatggacccaagctcaacc acaattgcagagcgcaatggggttgagtttacaggtcatacaaagaataattctctgttg gatcaggttaaggaaaagggagaacacatgtacaataaaattcacgacaaatgggaaaac cgatttgaatttggattgaagagttgcgatgagtcttccgcatctgtcaacgcacccaat tcagatgaagaagaagatccgttgacctcaaggaagggcagttttgattttaacgagtat aagggagtaatgtacaaaaaggggaaataaataggcgaaataacagtgcatatcatagtt tacgatcgtattgattgcgatcagatcagattgaatcaatagtaatgaaattaacaccct tctcagagtggactgaatcgacctaaaaatggtttggaagggcatggctggtacgtgatt gctcgatttggcaataaccttgcatttcaacggttttttttcgagctttttattattagt ttttgctgcatccgtgtcgctgtgcatcagtagatttctaaagtagaaatgaaaccaacc ttctttgttaatgtgcctgattcacgggtatttgaggtaaaaaacaggacctaatgatca atggctggtgtcatccaccatccaacggttatatatatgagtagcatagcattcaggttg ttccgtaatcatatcaaacctttctctcgttagcttcccaa
ORF
CCCAACAGGGTTGCGGGAATTCTGTTTGAAAACAAAATGGATTACACCACTTATTTTGGTACTAATAAGGAATAC ATCCACGGTATCCACATGCTGCCTATCACACCAGCCTCCTCGTTGATCAGGACGCCCCAATTTGTCTCTCAGGAG
CAAAGTCGTACATGGGCATACGCATTTTCTGCAGGACTGGCTAATTCCGTTTAA
Downstream agaggtgtaatttcatatataaaatttggcgatatttttaataaggtaccacacatagtt gacgttgattgtgctgttcttttcgcgataggacaggtgcacgggagatcatccttgtat cttttctttttgcatgtgggtctttttttttttctcttgtcgcctagaattgttatttca ttccattccatttcattttgacttctgacagtccctccccactggtaacaatctgccagt aaaagtttcgtcagtttctttcttggtccaactgggaatggaagggtgcagcctctcatt tctgttgtgccctttcaaggcgtgtttaccattttcggctgatgggagtggactaaaagg aaattaccctattgacacaatccgactttcgtggatcttgttttagtaaacccatttttg ggttgcttttgaacaaaaaaaaaagacacaagtctgcatattactaagaattaccatggt aacatattgcacttgcatgcgcagaatgcattaacaacaagcagctcagaaaggctgaaa ttgaggatacgcgcttcattaaaaaatgctgcgtttgaaaatgtgtgcaacaactttgca tcagcacagaaaacacgacgcagtagtcctagtagggtgatatttggccgccatcttgct ctggagtgtcgagttaattattttgccccttgatttcaaaacgttactttgtttcctttt tttctagttcaccttttgtacgggttagtaactttgcatatctgattgttacaaaagtgg atggcaatgcaagctggatgcgcatgaacagtgttccaggcactaattttaacgtgatgt gtctcttctcgaagatgcattattttatacacagcatcttggttgcattacattatgtat gcacctatcatgtatatatgtacgcccgtaagtaatcgtaggtctaataagcaaacgtat gcagaggttgtaaatttttgaaacgctcagtcggagcatca
AA (bold underlined: signal peptide)
MSFSSNVPQLFLLLVLLTNIVSGAVISVWSTSKVTKTFTAAPRTVTVYADGDGVVPSKSADGDSTKAATGEISNG NSNEGSNPNELPAFSTIYGPSGRYELYLSTSYSTYATTLYYTSTATYGTIRSSAAPAKETQSVSETEKIISVNPI EASTAPTTTSTVFSATTSIPTSSLGRISTGLETSISSNPLVTEPVATSSASRTIETATLTESVQISTGLVNGTRS TSTYETLKTVYSSFNSSTIIGTATATQTSLGSTFTTLSTLYSSFNSSSAIGTETATETSKVSPTQTTVPSPDVGV VDLFETIDTAPPPSVFSRQDLPLPLLSSVSNNGKPYQTNKFYQNIVIGDQKGAVITYPYSVFYTSENHLGLGISH YTEΞDKVFGPTNSNGASSYFLNPTLVGHLVLSAKEFTΞSGLSLQVSELKELΞVLATLQSSSGSVHAPLVEGMGFV SGVYEGSLTLRLSSQVGFKTLTQETSKTLPSGVLKFRVTLFSDIEWLVYVTLPEGTSSDYKLEITDNYTLQGTES VDGLIVQVAKAAEKEESEIYYDYAAGMYPESAWTGSTDISTNVASYSIVYDTKGKSSSGSTLVFALPHQVDSID ESTNARDTGVQLESTTKGKLRAYLTNVLGLQETVTSQIQFLPWASFMSKTLSYTAQQLQELAQIANDEISQDIKN AVTGLDSNYFSGKVLDKYAYILLVISDI IQDDEVALSLLETLKETIGVFTSNQQYYPLMYDTRYGGVTSTGSQSG DTGVDFGSAYYNDHHFHYGYFIHAAAVIGHVDNKIGNGTWAQANKDWVNSLVRDVANPSDGDSYFPVSRSFDWYQ
GHSWASGLFSAYDGRNQESSSEDYNFAYGMKLWGAVIGDTSMEQRGNLMLSIMSRAIGTYFLFEDSNAWPSEII PNRVAGILFENKMDYTTYFGTNKEYIHGIHMLPITPASSLIRTPQFVSQEWSSVISNILPDVNSGWTGILRLNQA LSDPVSAFSFFSNDDVWNNNYLDNGQSRTWAYAFSAGLANSV
SEQ0007
Homolog of S. cerevisiae YHR202W
Putative protein of unknown function; green fluorescent protein (GFP)-fusion protein localizes to the vacuole, while HA-tagged protein is found in the soluble fraction, suggesting cytoplasmic localization
S. cerevisiae null mutant: viable
Chr4_0932
ORF
TAA
AA
MGLLEKRVITLSIVTLLVLVNLLDFRSVSNGSTSQDYRANNILFKLNLRHADAIDSRQQLQSRFINYYDQLRSFN TGELNFLHTTDTHGWYMGHPTQPQYSSDWGDFLSFVQRLKDLLFQEGKDLILVDSGDRHDGNGLSDCTSPNGLRS NNIFSAIDFDLLTVGNHELYSEAVSELEFSTVIPHYGDRYISTNVEYLTGDGEWVPFGNSQYRVWKSKNTRRTIL GLSFMFDFRYTGNERVRVTSISTMIQTGSLDAIIETIKSNHRPEIIWVGHLPVHHVWAESRMLHDYLRSKFPDK IIQYFGGHSHIRDFWHDDLΞTGLQSGRFCETIGFLSMNNIDSATSKDTSNWAWDNIDRKYIDFNLHSMLFHTNI
SKVSEFNTEEGVTLSDKI ISTVSELNLTKEYGRVPRNYYLAGRPYPHHQNLYSLISDEILPTLPYRVKLGDEPFE PSNARVVIINTGAIRYDLYKGIFSENTKYTISPFLNQWRVIPAVPKKIALQILPTLNQLDYIISNLDKNADQDFS LDIMSDYYNGDHSSRDRLFGQDIVDLTTLMSPFARSNLIEKLIMMGKFQSDVRNAKEKEQQPFLPHHEQHVLVKL LEFFNWNKLTHGYVTNDDFGDEGDDTKHKQISFYEPPCVIQSYQRHYNDSISNI IFDLVIPDDEDEVIDVVFYDF IEPYILYALRQILNVDLDKRΞWVQPREPVYLQSATKVGNTTLYQSFLRSIQVYNDVENEWNVGEMLKRYVTENWA
Peptide
MGLLEKRVITLS IVTLLVLVNLLDFRSVSNGS
SEQ0008
Homolog of YDR452W; PPN1
Vacuolar endopolyphosphatase with a role in phosphate metabolism; functions as a homodimer
S. cerevisiae null mutant: viable
Chr1-4_0017
ORF
CACCTTTGGAAGAGTTTTCTTTCCCGGGCTTTTGTCCAGACAGACTACGAAGATACAGGAATATAG
AA
MMVFGCLFYPETLTLNFQTTVNMSHPLFIAIVSLVIFLTAYGNREF IFGKFEGNWFDYSKNSDILSQLGLVGSS YASKCHGKWKNKS QQNHTAΞRYGDALSGCDSPIELMQSTLDWIKDNLLDKI DFWWTGDNVRHDNDRDHPRLE S
QIFEMNQVVADLIHSTFLTDKDRENQENIIDNSSRDIKIIPSLGNNDVYPHNMFSPGPTLQIREFYRIWRNFVPE EQLHVFGRGAYFFVEVIKGKLAVLSFNTLYLYTANPLVDNCDSRKQPGYQLFRWLGWLEEMRKRDMKIWLTGHV PPVPKNYDESCLRKYI IWTHEYRDVI IGGLYGHMNIDHWIPLDSEWAYNSLAAEVFPSES IPFDLYTGFPELDIK DADGMSHRYDGVPVGKVRYMNSVREDLYARIAESNDSGIFSDRYSIAHVSTSVIPTYNPGIRVWEYNITGLVDED EKRAEIQRHPPWSTFFEDLEVEMAEDEETDEFAFPTKDPSLPVKMPSKTPLGPAYVSQTFSPLRYVQLFANLTAI NSGQSPFQYELEYTTDDSPYQMNSLLVDDWIKLGRKLGASTDDQCPSQYKHLWKSFLSRAFVQTDYEDTGI
Peptide
MMVFGCLFYPETLTLNFQTTVNMSHPLF IAIVSLVIFLTAYGN
SEQ0009
Homolog of YDR371W; CTS2
Sporulation-specific chitinase
S. cerevisiae null mutant: viable
Chr3_0107
ORF
GGCATATACAATTACAATATGCTACCTCAAAATTTCAAGGATAATGAAAAGTACGACCCTGAGGCTGTCGCTGCG TACAGTTATGAACCCATATCTAGAACGTTTGTTTCATATGATACTCCAAGAGTTGTACGAGAAAAAGCTAAGTTT
GATAATAATACCAAGAGTGCATTCTACAAAAAAAGTAACACCCTGAAAGGACTGGTTTAG
AA
MKIAQLFLRNHLDFVLISGTGWLFTLIALFGDISTLDSNNRQTQKVLF DSMKALDKEQKEQQQ^IHGSPHAPLS YRTCAYFSYWSVYEPRNF TPSDLP ISQL TNVFYAFFDIDCETGKVWI DKWSSLEMPLKFSPRSDD IKTLYADGT AEAFKLKQLHFSKKTVGALKQL FQMKLIKPSLQTSMS IGGWGRTDGFKALMFDESKVEQFVSSCVSAMQEFGFDG VDLDWEYPSNDEEGQFLLHLVSSLKRKLSQLEAHYELPPDTFLITLATPASLYTLQHYPLSRLDLFVSFWNVMTY DF A.GSWΞPQTEYHSNL YSPKGEMLSVKDAMAYYVRNAIPPNKLIMGLPNYGRGFGNTDGYNKKFKGVGKGTSDEE GI YN YNMLPQNF KDNEKYDPEAVAAYSYEPISRTFVSYDTPRWRE KAB FVKNNGYGGAMWWEACGDFYFDNPEQ SLLLNFVDELGGTQTLLFVPSSQHQDNNTKSAFYFKSNTLKGLV
Peptide
MKIAQLFLRNHLDFVLISGTGVVLFTLIALFGDI STLDSN
SEQ0010
Homolog of YJL034W; KAR2
ATPase involved in protein import into the ER, also acts as a chaperone to mediate protein folding in the ER and may play a role in ER export of soluble proteins; regulates the unfolded protein response via interaction with Ire1p
S. cerevisiae null mutant: decreased metal resistance; abnormal nuclear fusion during mating; inviable
Chr2-1_0140
ORF
AA
MLSLKPSWLTLAALMYAMLLVWPFAKPVRADDVESYGTVIGIDLGTTYSCVGVMKSGRVEILANDQGNRITPSY VSFTEDERLVGDAAKNLAASNPKNTIFDIKRLIGMKYDAPEVQRDLKRLPYTVKSKNGQPWSVEYKGEEKSFTP EEISAMVLGKMKLIAEDYLGKKVTHAWTVPAYFNDAQRQATKDAGLIAGLTVLRIVNEPTAAALAYGLDKTGEE RQIIVYDLGGGTFDVSLLSIEGGAFEVLATAGDTHLGGEDFDYRWRHFVKIFKKKHNIDISNNDKALGKLKREV EKAKRTLSSQMTTRIEIDSFVDGIDFSEQLSRAKFEEINIELFKKTLKPVEQVLKDAGVKKSEIDDIVLVGGSTR IPKVQQLLEDYFDGKKASKGINPDEAVAYGAAVQAGVLSGEEGVDDIVLLDVNPLTLGIETTGGVMTTLINRNTA IPTKKSQIFSTAADNQPTVLIQVYEGERALAKDNNLLGKFELTGIPPAPRGTPQVEVTFVLDANGILKVSATDKG TGKSESITINNDRGRLSKEEVDRMVEEAEKYAAEDAALREKIEARNALENYAHSLRNQVTDDSETGLGSKLDEDD KETLTDAIKDTLEFLEDNFDTATKEELDEQREKLSKIAYPITSKLYGAPEGGTPPGGQGFDDDDGDFDYDYDYDH
DEL
Peptide
MLSLKPSWLTLAALMYAMLLWVPFAKPVRAD
Homolog of YBR015C; MNN2
Alpha-1,2-mannosyltransferase, responsible for addition of the first alpha-1,2-linked mannose to form the branches on the mannan backbone of oligosaccharides, localizes to an early Golgi compartment
S. cerevisiae null mutant: budding pattern: abnormal; cell shape: abnormal; cell size: decreased; glycogen accumulation: increased; chitin deposition: increased; competitive fitness: decreased; resistance to hygromycin B: decreased; resistance to Calcofluor White: decreased; viable
3 homologs
SEQ0011
Chr3_0370
ORF
AACTTGAAATATTTGGAGAACGCTAACGTGAAGCCTCAAGATTTATGTATGTTCATCAAGGAAGAGCTAAACTTT TTACAGAATAACCCAATACAATTGACGTGA
AA
MFGKRRQVRKLLIWWLLLIVYFFGLQFRAKNSAHQSS IRSFYADNKEFFDRQYSRYDEYDI IDNMNSHNELLQE
QFRNGKLAAGLRGVAEEPNSDEVTDDTAIEEDEQAAMINFPKRSPQREKSLVELRKFYKNVLSIIINNKPAMPIE
NPRDPTPNENALKRKFGKSGIINIALHDTDPSLPILSEAYLRDSLQLSPSFIASLSKSHSAVVKAFPPSFPANAY
NGTGIVFIGGQKFSWLSLLS IENLRKTGSKVPVELI IPFAHEYEPQLCEEILPKLNATCVLLQETVGIDLLKSGH
LKGYQFKSLALLASSFEQVLLVDSDNIIVENPDPIFDSEVFQRTGLVLWPDFWRRVTHPDYYKIAGIKLGSERVR
HVVDSYTDPSLYTSSSEDPFTDIPLHDREGAIPDGSTESGQILISKTKHCQTILLSLYYNFFGPDYYYPLFTQGA
SGEGDKETFLAAANYYKLPFYNIKKGVDVIGYWKPDQSAYQGCGMLQYDPIVDYQNLQTFLKTHKGSRVNKLEQS
ELDKPGLLSRLIPKFFFRKTFDEHQLQSHFTKDRSKIMFIHSNFPKLDPFGLKLHNYLFVDQDTHKPRIRMYADQ
TGLΞFDFELRQWII IHEYFCEYPDFNLKYLENANVKPQDLCMFIKEELNFLQNNPIQLT
Peptide
MFGKRRQVRKLLIWWLLLIVYFFGLQF
SEQ0012
Chrl-4_0037
ORF
GACAATCCAATTAGGATCGAGGGCTAA
AA
MLFGLIRHSRRQLLFLGALVTVIVLIFTLPNTSPIEANGVKSEEGSITPIIPVLESPANSLEKIVDTASEERIGG ATLEEGHENNKEEQALENAERAKEKEKTEAIAAEEEKLKAAELLRQQETTREKEAAKEDDSKKPNQELVEQDTYL DDIPDDVΈDNIIISEQDRKKIILPSYTPKTDPAYSKRATALKIFYNDFFIKVADSGPNTAPITKKTRKKGKSKLK GDVΞSGDKYEGPVLTEDFLRFMEIYSDEFIDAVΞESHΞKIVNLMPESFPKGMYQGDGIVI IGGGVYSWYGLLAIR NLRDGGNTLPVELMLPSDNEYEPQLCEQILPSLNAKCIMLSDIVDQDVLKKLDFKGYQFKALSLLASSFENVLSL DSDNIPVANVSHLFDHEPFSETGLVSWPDFWRRTTNPRYYEAAGIKIGEYQVRNCLDGFVPESDFVHIGLKDIPL
HDRNGTIPDASTESGQLLVNKNKHAKTLMLMFYYNFYGPGYYYPLLSQGMAGEGDKETFLAAANFFGLPFYQVKA GPGILGHHDSTGAFTGVAIVQYDPIADYELTKENFVGEKRKGIEAPKAFYGNNNKSPLFHHCNFPKLDPVKLIKE KKLIDNKTHKFNRMYGPNTKLKYDFEERQWKYTKEYLCEKKYNLLYFTEQYKNYGQGYSQERICKFSDRFLKFLS DNPIRIEG
Peptide
MLFGLIRHSRRQLLFLGALVTVIVLIFTL
SEQ0013
Chr3_0787
ORF
TCCACGGGGGATTAG
AA
MFNSLAPMRIKKLLKVFCASWLLAATSWLFFHFGGQIIIPIPERTVTLSTPPANDTWQFQQFFNGYLDALLEN NLSYPIPERWNHEVTNVRFFNRIGELLSESRLQELIHFSPEFIEDTSDKFDNIVEQIPAKWPYENMYRGDGYVIV GGGRHTFLALLNINALRRAGNKLPVEWLPTYDDYEEDFCENHFPLLNARCVILEERFGDQVYPRLQLGGYQFKI FAIAASSFKNCFLLDSDNIPLRKMDKIFSSELYKNKTMITWPDFWLRSTSPHYYHNITKTPIGDKRVRYFNDFYT NPNEYYYGDEDPRSEIPFHDREGTIPDWTTESGQLVINKEVHFPAILLGLFYNFNGPMGFYPLLSQGGAGEGDKD TFVAASHYYNLPYYQVYKNCEMLYGWVDHANSGRIEHSAIVQYNPIVDYENLQSVKAKAE IILKNHEPDSRKKSS KPKSYSKTRLSTHVKGSIYSYRRLFRDSFNKANSDEMFLHCHTPKIEPYRIMEDDLTLGRNKEAKQRWYGGRKNR VRFGYDVELYIWELIDQYICDKNIQYKIFEGKDRDALCGSFMREQLGFLRSTGD
Peptide
MFNS LAPMRLKKLLKVFCAS WLLAATSV
SEQ0014
Homolog of YAL063C; FLO9
Lectin-like protein with similarity to Flo1p, thought to be expressed and involved in flocculation
S. cerevisiae null mutant: filamentous growth: decreased; haploid invasive growth: absent
chr1-4_0584
ORF
GCCCAAACTTGCACGCAGGCTACTATTGTGACTGGAGAAATTCTGCAAACTACTGTTGTGGACTCTGGTTCTACA ACTGTTGTTCCAAAGTATGTTCCGGTGGAGACGCATGAACCAACATTTGAATTGAGTACTCTTTAA
AA
MFEKSKFWSFLLLLQLFCVLGVHGQESGNGTTSDTAYACDIGATPFDGFNATI YQ YQASDDNS IQDPVFMSTGY LQRNQLHSTTGVTNPGFNIFTAGVATTTLYGIPNVNYQNMLLELKGYFRADASGNYGLSLRNIDDSAI LFFGRET AFECCNENLIPLDEAPTDYSLFTIKEGEAS TNPDSYTYTQYLEAGRYYPVRTFFANIRTRAVFNFTMTLPDGSEL TDFQNYIFQFGALNQQQCQAE IVTRENYTTTTEPWTGTFEATTTVIPSGTEPGTVIVQTPYSTI DS TS TWTGTF T TFTTDADGSTIAWPSST IDDHFASTETVLTDTAISTTVI TVTSCGTSKCTKTTALTGVTQRTLTI DDRTTVVTT YCPLPTDVATIKTASVSGSEWQT IYTAKHSQAVSYVHPSTVTITREVCDAQTCTQATIVTGE I LQTTWDSGST TVVPKYVPVETHEPTFELSTL
Peptide
MFEKSKFVVSFLLLLQLFCVLGVHGQ
SEQ0015
Homolog of FLO1 (YAR050W)
Lectin-like protein involved in flocculation, cell wall protein that binds to mannose chains on the surface of other cells, confers floc-forming ability that is chymotrypsin sensitive and heat resistant; similar to Flo5p
S. cerevisiae null mutant: flocculation: absent; oxidative stress resistance: absent; resistance to ethanol: decreased; toxin resistance: absent
S. cerevisiae overexpression: increased flocculation
chr3_1145
ORF
AA
MKFPVPLLFLLQLFFIIATQGDESGNGDESDTAYGCDITSNAFDGFDATIYEYNANDLKLIRDPVFMSTGYLGRN VLNKISGVTVPGFNIWNPRSRTATVYGVQNVNYYNMVLELKGYFKAAVSGDYKLTLSNIDDSSMLFFGKNTAFQC CDTGSIPVDQAPTDYSLFTIKPSNQVNSEVISSTQYLEAGKYYPVRIVFVNALERALFNFKLTIPSGTVLDDFQD YIYQFGALDENSCYETTVSKITEWTTYTTPWTGTFETTRTITPTGTEGTVVIETPESYVTTTQPWTGTYETTYTV
PPTGTEPGTVI IETPE I I DCEAVCCGPFLTAFSFRKREECQCENICCPGDTNCETYVTTTQPWTGTYETTYTVPP
TGTEPGTVI IETPESYVTTTQPWTGTYETTYTVPPTGTEPGTVI IETPESYVTTTQPWTGTYETTYSVPPSGTEP
GTWIETPESYVTTTQPWTGTYETTYSVPPSGTEPGTVVIETPEASTARTKFTTVTSSWTGVFTTTKTLPASGTE
PATIVIQTPTGYFNTSSLVS TRTKTNVD TVTRVI PCPICTAPKTITWPEEPNE SVSVI I SQPQSSSTDTTLSKP
DSVRVI SQPETASQMDTSLSKTDSAVISTETAGNNI IPLAGSHSYNTIVTTVTDSPQVAQSTTATSSSNVHLTI S
TQTTTPSLVYSSSLSTVHQVΞPSNGGFRSS ITVHPLLSVIGAIFGALFM
Peptide
MKFPVPLLFLLQLFFIIATQGD
SEQ0016
Homolog of YHR132C; ECM14
Putative metalloprotease with similarity to the zinc carboxypeptidase family, required for normal cell wall assembly
S. cerevisiae null mutant: abnormal polyphosphate accumulation
chr3_0419
ORF
TGTTCTTTTGTGCTGAATTTAGAAGGCTAA
AA
MLYKTTLSIAHTSVILLSLITAISCFELHLPQKVSHIVDSLQYTCGQFLQKQQIFALYNKQNFTEIVNQNIKGIE ERVLSELLEERLENESQNDYYTANSQNWPIDLDQYSEΞFVIRITSEDEFIKYLIFKEAKALHISIWEQSVGLIDL KVDRDQMHRLLYNVESRILERRTRSVDSPVSEYKVQLMIGDLPQRIYETYPΞTKVTSLQALGEFPSFQNLSNAFF EDFRTLETIYDWFEEIQKEFPKLVSINWIGQTYEGRDLKALHVRGKHSGNKTWVTGGMHAREWISVTSACYAVH KLLQNYADGHHKEAKYLDKLDFLFVPVLNPDGYEYSFNEDRLWRKNRQETYMPRCFGIDIDHSFDYHFVKSEDLP CGEEYSGESPFESIESEVWNNFLNRTKEEHKIYGYIDLHSYSQTVLYPYAYSCE ILPRDEENLIELGYGIARAIR KSTGKKYQVLKACEDRDADLLPDLGGGTALDYMYHNRAYWAFQIKLRDSGNHGFLLPKKF IYPVGTEVYASIQYF CSFVLNLEG
Peptide
MLYKTTLS IAHTSVILLSLI TAISCF
SEQ0017
Homolog of YMR200W; ROT1
Essential ER membrane protein; may be involved in protein folding; mutation causes defects in cell wall synthesis and in lysis of autophagic bodies, suppresses tor2 mutations, and is synthetically lethal with kar2-1 and with rot2 mutations
S. cerevisiae null mutant: cell shape: abnormal; chitin deposition: increased; inviable; killer toxin resistance: increased; liquid culture appearance: abnormal; resistance to hygromycin B: decreased; resistance to sodium dodecyl sulfate: decreased
S. cerevisiae conditional mutant: increased heat sensitivity
S. cerevisiae reduction of function mutant: growth rate in exponential phase: decreased; resistance to tunicamycin: decreased; vegetative growth: decreased
FragB_0048
ORF
TTCCTGTATTAA
AA
MVLIQNFLPLFAYTLFFNQRAALADDSVPESELTIVGTWSSKSNTVFTGSGFYDPVDELLIEPDLPGISYSFTDD GYFEEALYQVAGNAKDHHCPTAVLIFQHGTYRELDNGTLVLEPYDVDGRQLLSQPCEDKGISTYSRYNQTEVFRN YEVSLNTYHGRIQLQLYASDGVKQPPLYLAYRPPNMLPTIVLNPTAASDEAQATATGASAKIKRSLENRYRTNAV KESSLNYTLWWWLGAILMGAGSTIYFLY
Peptide
MVLIQNFLPLFAYTLFFNQRAALAD
Homolog of YDR304C; CPR5
Peptidyl-prolyl cis-trans isomerase (cyclophilin) of the endoplasmic reticulum, catalyzes the cis-trans isomerization of peptide bonds N-terminal to proline residues; transcriptionally induced in response to unfolded proteins in the ER
S. cerevisiae null mutant: viable
S. cerevisiae overexpression mutant: decreased vegetative growth, viable
2 homologues
SEQ0018 chr1-1_0267
ORF
AA
MKLLNFLLSFVTLFGLLSGSVFAQELKDLPE I TQKVYFDINEGDKFLGRIVIGLFGDVVPRTVENF YQLAISEDP DFGYKGSKFHRVIKDFMIQGGDFTSGDGRGGKSIYGSRFKDENFKLKHDRPGLLSMANAGKDTNGSQFF I TTWT SWLDNKHWFGYWDGFDWKKIESTKTSGSAPKED IVIVDCGEVKESVE DVIKDDAEDYAEAIEEWKDEL
Peptide
MKLLNFLLSFVTLFGLLSGSVFAQ
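Each entry gives the signal peptide separately from the full AA sequence, so the peptide should be an exact N-terminal prefix of the protein, and removing it yields the mature sequence. The sketch below illustrates that relationship for SEQ0018 using only the first part of its AA entry; the function name `split_signal_peptide` and the variable names are hypothetical, and stripping whitespace is an assumption made to cope with the spacing left by text extraction.

```python
def split_signal_peptide(protein: str, signal: str) -> tuple[str, str]:
    """Return (signal, mature) after checking that signal is a prefix of protein."""
    protein = "".join(protein.split())   # remove spaces left by text extraction
    signal = "".join(signal.split())
    if not protein.startswith(signal):
        raise ValueError("signal peptide is not an N-terminal prefix of the protein")
    return signal, protein[len(signal):]


# Start of the SEQ0018 AA entry and its listed signal peptide.
protein_start = "MKLLNFLLSFVTLFGLLSGSVFAQELKDLPE I TQKVYFDINEGDKF"
signal_peptide = "MKLLNFLLSFVTLFGLLSGSVFAQ"

signal, mature = split_signal_peptide(protein_start, signal_peptide)
print(len(signal), mature)
```

Raising an error on a mismatch is a deliberate choice here: in a transcribed listing, a peptide that fails the prefix test is a useful flag that one of the two lines was damaged in extraction.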
SEQ0019
chr4_0545
ORF
GGCACCGCTCATACTGGTTTATGGGCATTGGTATTTTTTGCAGCAATTGGAATTGTAGCTTTCCTGATTTACAAG AAGTCTATCCGCCACACTCTTTACACCAAATTGAACAAATGA
AA
MNLTLIFTLISLLLGVWSAP IEEPPYTHFVTFHI TQDDQPLGDLVLGLYGTVVPKTVKNFYQLAAMTPGFGYDGS
LFHRVIEDFMIQGGDFTSGDGKGGRS IYGENKGDFPDENFVLKHDRLGRVSMANSGKDTNGSQFF I TAKATSWLD
GHHWFGQLVDGFEVFSS I IKTPTNSRDNKPTAKIMIKNATVKEVDESDIKSKQTDSPKAEESNDKGSESKQSVE
GTAHTGLWALVFFAAIGIVAFL IYKKSIRHTLYTKLNK
Peptide
MNLTLIFTLISLLLGVWSA
Homolog of YLR120C; YPS1
Aspartic protease, attached to the plasma membrane via a glycosylphosphatidylinositol (GPI) anchor
S. cerevisiae null mutant: viable; decreased competitive fitness
2 homologues
SEQ0020 chr4_0584
ORF
AGCGAGTCCGGGCATAACACGGTTGAAAAACGAGATGCCAAAAACGTTGTTGGCGTTCAACAGTTGGACTTCAGC
AA
MLKDQFLLWVALIASVPVSGVMAAPSESGHNTVEKRDAKNWGVQQLDFSVLRGDSFESASSENVPRLVRRDDTL EAELINQQSFYLSRLKVGSHQADIGILVDTGSSDLWVMDSVNPYCSSRSRVKRDIHDEKIAEWDPINLKKNETSQ
NKNFWDWLVGTSTSSPSTATATGSGSGSGSGSGSGSAATAVSVSSAQATLDCSTYGTFDHADSSTFHDNNTDFF I SYADTTFASGIWGYDDVIIDGIEVKELSFAVADMTNSSIGVLGIGLKGLESTYASASSVSEMYQYDNLPAKMVTD GLINKNAYSLYLNSKDASSGSILFGGVDHEKYSGQLLTVPVINTLASSGYREAIRLQITLNGIDVKKGSDQGTLL QGRFAALLDSGATLTYAPSSVLNSIGRNLGGSYDSSRQAYTIRCVSASDTTSLVFNFGGATVEVSLYDLQIATYY TGGSATQCLIGIFSSGSDEFVLGDTFLRSAYWYDLDGLEVSLAQANFNETDSDVEAITSSVPSATRASGYSSTW SGSASGTVYTSVQMESGAASSSNSSGSNMGSSSSSSSΞSSSTSSGDEEGGSSANRVPFSYLSLCLWILGVCIV
Peptide
MLKDQFLLWVALIASVPVSGVMAA
SEQ0021 chr3_0299
ORF
CCCAGCAAGTTACCTCTCAAAAAACATCGTGATTCTTCTTCCCCGCATGAACGATTTCTTAAACGAGATGGACCC
AACATTAGGGATATTAACAAAAAAATCATAGAGTTTCAATTTGGTGACGAGATTGTGATACATTCTCCCTTATCA
TTCCTTTAG
AA
MNPSSLILLALSIGYSIAESNFSFKPSKLPLKKHRDSSSPHERFLKRDGPYHPLEADAYFYYTTSILVGSEEEKV EVTVDLGTSDLWWDYNTGLCDRSFDETYLKRSLDTSEEDYSAGDLGSSVGVRSARKFLRKRDTNQTEVNEANYG ACPNSITFNPENSSSFQSNDTAFNISYFDGTSASGFWATDTIYFGDLEVSEQFFGLANLTISYGGVLGLGPSNLQ TTNANPNGEEFIYSGVLDSMRDQGLINSASFS IYLNPENFRDEDNYSNEGAILFGAIDNAKIDGSLKLLPYVTSG GHSQIDANFTYITLNNIAVADNDTALIVETNPQLAMLNPKFIYTYFPNEVLTRLVNSIDNLEYDPVEGLYRIRRT NIRDINKKI IEFQFGDEIVIHSPLSNYLSDTWVPSTNYTYLE IQDSREDFFILGNAFFKSAYLFFDNDNSEVGIG QLKVTDKEDIVPVGEFSLDQDSGYSSTWSTFSYETGSAPLGTSTFETSTKTSSDGAAPSVSHINTSSYLFAFVLL FL
Peptide
MNPSSLILLALS IGYSIAE
Homolog of YLR121C; YPS3
Aspartic protease, attached to the plasma membrane via a glycosylphosphatidylinositol (GPI) anchor
S. cerevisiae null mutant: viable
2 homologues
SEQ0022 chr3_0303
AA
MLPIRLSKLLLLLSLKLKLGTAEEKYQKLDLKRIDKDYYAVDVKVGSDEQEIKEVLIDTGSSDFWILDKSFCNSP
TSEEEENSNGRSNKESCGVYGSFDSNKSETFQATGQVFDAAYGDTTAESTGSSGVRGIDQLRVGDIHIEELYFGL VTNTTSLPPVLGIAQLSEEFSNNSYPNFPYQMKEEGLIDVVAYSLSLGQSKGELLFGAMDHSKYNGTLLKAPILQ AGTPGMQVLLTGVALTNGSSSVFNETDNKGFIYFDSGTTASTLPSEHFDDLFNHHGWAYDGDTLTYSIQCDSEGE KSLLDFTLEYTIAGNIVIKVPFEDI IMKNENDGECLSTVMVSNQTSFSYSDDTPFFVAGDEVLLNAYVVYNLETQ
FAAATLKPFTFWGFVLFFFHFLI
Peptide
MLPIRLSKLLLLLSLKLKLGTAE
SEQ0023 chr3_0866
ORF
AAGACATTATTAACATTTATCATATTGTATATTTTTTAG
AA
MLVAVALVLLLSTGYAGIVAIDTEYEFTIGFLSTIE IGFPPQSITAQWDTGSSDLLVNSVTNSQCAQDGCSFGAF AFNKSTTYSNITNPNNLHVQFSFASGSVVDDKLVSDTIFVDSKVIPRFNFALVSKGDLYGDNIFGIGPRGNQGTF DSNGTPAFYDSFPYHLKALGLIKRLAYSFYTGPTQGKVVFGGVDHGKYDGCLEKLE IVHDSAFYTLLEAIDADDT SVLDEQIHVLFDTGTALTLFPSFIAEQLADFLKATYSDEYNTFWPCDQDFDFEYLHFGFRNIKLSVRFKDLFLV IDDSVCAVGFDQGADANKITFGSSLLRNYYTLYDLDSKEILIADVKPDGPDDIE ILSGPVQRICDEKGVSSTSLW SSLS IESTIEPDTFTTKPSISQTRYSTSSIGPQNISNSLGEYPSVSVTLSEHHNTTSIASNSSLEGKPATPTVTD QSYQNNKTTSTVIAVNLITHSTTHSTTHSPTYSTTHSSNGSRSTLEYTSTKESSVKMPCALI ISDTIPYNASGGN SSYGSLISTSTVNNVEENNSNTVRPRKRQTFVSGTTSTILLYSSTTTQAYQMLSSTSIPRPS IKASSNAGSRKTS KTLLTFIILYIF
Peptide
MLVAVALVLLLSTGYAG
Homolog of YCL043C; PDI1
Protein disulfide isomerase, multifunctional protein resident in the endoplasmic reticulum lumen, essential for the formation of disulfide bonds in secretory and cell-surface proteins, unscrambles non-native disulfide bonds
S. cerevisiae null mutant: inviable
S. cerevisiae conditional mutant: increased heat sensitivity; delayed endocytosis
2 homologues
SEQ0024 chr4_0844
ORF
GTCGAGGAAGAAAAGGAAGCTGAAGAAGAAGCTGAAAGTGAGGCAGACGCTCACGACGAGCTTTAA
AA
MQFNWNIKTVASILSALTLAQASDQEAIAPEDSHWKLTEATFESF ITSNPHVLAEFFAPWCGHCKKLGPELVSA
AEILKDNEQVKIAQIDCTEEKELCQGYEIKGYPTLKVFHGEVEVPSDYQGQRQSQSIVSYMLKQSLPPVSEINAT
KDLDDTIAEAKEPVIVQVLPEDASNLESNTTFYGVADETHLVHWIDIESKPLFGDIDGSTFKSYAEANIPLAYYF
YENEEQRAAAADIIKPFAKEQRGKINFVGLDAVKFGKHAKNLNMDEEKLPLFVIHDLVSNKKFGVPQDQELTNKD
VTELIEKFIAGEAEPIVKSEPIPEIQEEKVFKLVGKAHDEWFDESKDVLVKYYAPWCGHCKRMAPAYEELATLY
ANDEDASSKWIAKLDHTLNDVDNVDIQGYPTLILYPAGDKSNPQLYDGSRDLESLAEFVKERGTHKVDALALRP
VEEEKEAEEEAESEADAHDEL
Peptide
MQFNWNIKTVASILSALTLAQAS
SEQ0025 chr1-1_0160
ORF
CGCAATTTTATTGAGGCCAAAACTGAGTCAAAACCCCAGTTATTACACCAAGAGCTATAA
AA
MKILSALLLLFTLAFAEVIELTNKNFDDWLKSGKYTLVKFYADWCSHCKRMNPEYEKLAEELKPKSDLIQIAAI
DANKYSKYMKVYDIDGFPTMKLFTPKDISHPIEFSGSRDSESFLNFLESTTGLKLKKKAEVNEPSLVQSIDDSTI
DDLVGKDRFIAVTASWCGYCKRLHPEWEKLAKAFGNDDIVIGNWTDVVEGENIKAKYKVQSFPTILYFTAGSDE
PIRYESPDRTVEGLVKFVNEQAGLFRDPDGTLNFNAGLIPGVSDKLTNYIKEKDQSLLESTLDLLSNHEHIKDKF
SVKYHKKVIEKLLKGENEFLNNEVERLSKMLNTKLSANNSDSVIKRLNILRNFIEAKTESKPQLLHQEL
Peptide
MKILSALLLLFTLAFAE
SEQ0027
Homolog of YPR026W; ATH1
Acid trehalase required for utilization of extracellular trehalose
S. cerevisiae null mutant: viable
chr4_0342
ORF
CCCCAAGATTATTCATATGGAGCCCAAGTAGCCGAAGTTGTTCTCTACTAA
AA
MLNRVLLVALSCWFFHLVTTFPVGTSSDSLQIRNLLSHNFTRANISEGLSSGATYFVDEDTETYYDKELKVLRT TRFPRYNNYQLQPYVANGYIGSRIPRVGSGFTYDTSDNKTSENLKNGWPLFNKRYSGAFIAGFFNSQPTVPETNF EELEKDGYESIIASIPQWTSLELTVNVNGTNQTLKADDVDITHISDYSQQLSLLDGIVTTNYTWLGLVNVSISVL AHRDIVSLGFVSLELSSQKNITVSVTDILDFATSTRCSYLDSGVNEQS IFMFVQPSNVPTNATIYSSLMSSNSTS SLLKQNQTVSQTLRVNLSKNQAASFQKYVGWSDDYLDSIETNLTSYQFARETAKFAEIKGRSWILKSHKEAWNE LLNGKS IVFHDNDFLTLASDSS IYHLMANTRSEANGGTSALGVSGLSSDSYGGMVFWDTDFWMLPSVQAFSPRHA VSLSKFRDHTHDQAKKNAQTRDMNGAVYPWTSGRFGNCTSTGPCYDYEYHINIDIAFMFWKLYLGGAIDDDYMKE FGYPIIEDVASFFVDYVDYNSTLDKYTTRNLTDPDEYAEFKNNAAFTNVGISQLMKWALILGKHLKVGNERSYDK WEDIMTKMYLPVNHAGDVTLEYTGMNNSIEVKQADVVLISYPLDDEDGALQEYFDYDEDRAISDVRYYSDKQTDE GPAMTFΞVYSAVNAKFNKEGCSSQTYLLKSVEPYFRFPFGQMSEQSTDQYDTNGGTHPAFPFLTGHGAFLQSSIY GLTGLRFSYIYNDTDKSIKRRLAFDPLQLPCLPGGFSINGFVYMNQTLDITVNDTYATIAHRGNATTINVYVDSR NEMGGKEHKIQPGKSLSIPLYQTEQNIPGSFIECTVKNVTALQPGVVGDPIQAVADGDNSTIWKIESREEPTHLI FDLGDELDIEGGLWWGTYPAESFSVSVLRDFNSTNYRVINNVENYDLIYESGNVTASSPFDESHIKKVQILPHN CTNFTFΞELTASRYVLFEFTDVLGYPQDYSYGAQVAEWLY
Peptide
MLNRVLLVALSCWFFHLVTTF
SEQ0028
Homolog of YOR190W; SPR1
Sporulation-specific exo-1,3-beta-glucanase; contributes to ascospore thermoresistance
S. cerevisiae null mutant: viable
chr4_0692
ORF
GCTTTGGAATGGGATATGGAGAAGCTCTTACATTATACTCTTTTTCCCTCGGGTTGGGTTTAA
AA
MMISAFVWSSLIVGLISGLLALEHPQFKEFSNRTIIQHNDLLERNRSTSRFDKPPKLYGVALGGWLVLEPYITPS LFNDTVEETVDEYTLCHKLGKQKATEVLKKHWSTFITESDIIKIKNVGLNSVRIPIGYWAYDLLEDDPYIQGQDE FLSQCIDWCAKHGLSVWIDLHGAPSSQNGFDNSGRRGRAGWQDEQRYIDKTLYVLETIAKRHGNKSNVIGIEILN EPFGPVLNIEKLKQFYQKGITVIRNTGYSKDIVISDAFQGIFYWNDFQPSDSNLILDRHHYEVFSDGQLRSSFEG HLRGIEAFGRAIAIEKPTWVGEWSAAITDCAPWVNGAGRPSRYHGMVLEDGTVGDCSGATNIEQWAGKRREEFL KIIETSLKAYNAADGWFFWCWKTESALEWDMEKLLHYTLFPSGWV
Peptide
MMISAFVWSSLIVGLISGLLAL
SEQ0029
Homolog of YML130C; ERO1
Thiol oxidase required for oxidative protein folding in the endoplasmic reticulum
S. cerevisiae null mutant: inviable
S. cerevisiae overexpression mutant: decreased vegetative growth
S. cerevisiae misexpression mutant: decreased resistance to 1,4-dithiothreitol
chr1-1_0011
ORF
GACTTGTAA
AA
MRIVRSVAIAIACHCITALANPQIPFDGNYTEIIVPDTEVNIGQIVDINHEIKPKLVELVNTDFFKYYKLNLWKP CPFWNGDEGFCKYKDCSVDFITDWSQVPDIWQPDQLGKLGDNTVHKDKGQDENELSSNDYCALDKDDDEDLVYVN LIDNPERFTGYGGQQSESIWTAVYDENCFQPNEGSQLGQVEDLCLEKQIFYRLVSGLHSSISTHLTNEYLNLKNG AYEPNLKQFMIKVGYFTERIQNLHLNYVLVLKSLIKLQEYNVIDNLPLDDΞLKAGLSGLISQGAQGINQSSDDYL FNEKVLFQNDQNDDLKNEFRDKFRNVTRLMDCVHCERCKLWGKLQTTGYGTALKILFDLKNPNDSINLKRVELVA LVNTFHRLSKSVESIENFEKLYKIQPPTQDRASASSESLGLFDNEDEQNLLNSFSVDQAVISSKEAPEEIKSKPV GKAAYKQNSCPSLGSKSIKEAFHEELHAFIDAIGFILNSYRTLPKLLYTLFLVKSSELWDIFIGTQRHRDTTYRV
DL
Peptide
MRIVRSVAIAIACHCITALAN
SEQ0030
Homolog of YMR297W; PRC1
Vacuolar carboxypeptidase Y (proteinase C), broad-specificity C-terminal exopeptidase involved in non-specific protein degradation in the vacuole; member of the serine carboxypeptidase family
S. cerevisiae null mutant: viable; decreased phytochelatin accumulation
chr1-4_0013
AA
MILHTYIILSLLTIFPKAIGLSLQMPMALEASYASLVEKATLAVGQEIDAIQKGIQQGWLEVETRFPTIVSQLSY STGPKFAIKKKDATFWDFYVESQELPNYRLRVKRNNPEVLKVDFTKQYSGYLDVEADDKHFFYWFFESRNDPQND PI ILWLNGGPGCSSLTGLFFELGSSRINENLKPIFNPYSWNGNASI IYLDQPVNVGFSYSSSSVSNTVVAGEDVY AFLQLFFQHFPEYQTNDFHIAGESYAGHYIPVFADE ILSQKNRNFNLTSVLIGNGLTDPLTQYRYYEPMACGEGG APSVLPADECENMLVTQDKCLSLIQACYDSQSAFTCAPAAIYCNNAQMGPYQRTGKNVYDIRKECDGGSLCYKDL EFIDTYLNQKFVQDALGAEVDTYESCNFEINRNFLFAGDWMKPYHEHVSSLLNKGLPVLIYAGDKDFICNWLGNR AWTDVLPWVDADGFEKAEVQDWLVNGRKAGEFKNYSNFTYLRVYDAGHMAPYDQPENSHEMVNRWISGDFSFH
Peptide
MILHTYIILSLLTIFPKAIGL
SEQ0031
Homolog of YKL046C; DCW1
Putative mannosidase, GPI-anchored membrane protein required for cell wall biosynthesis in bud formation; homologous to Dfg5p
S. cerevisiae null mutant: viable
S. cerevisiae overexpression mutant: decreased vegetative growth
chr1-4_0242
AA
MWRPLVLVLALLRVLSPAKGLELELSDLSSLQHATSLVADGLMDYYEGFHLGGTIGMFTNPYYWWQSGAAFGSML
DYWWYMENDTYHDAIMQAIVYQAGDNADFMPLNQTTTEGNDDQGFWGITAMAAVERNFTNPPEDEPQWLYLVQAT
VNTMWERWDLEHCNGGLRWQIFQWNAGYNYKNTVSNACLFQLSARLARYTANDTYITLAEEAFDWMYGAGFLTEG DWWFVYDGAFVDDNCTEIVMLQWTYNAGLMVSGCAYLANYTGDEMWLDRTENFLHGIQVFTNQSVFFEAACQGSG NCNTDQRSFKAYLARFLGLTAQMVPSTAETIMNWMNTSAVAVAQSCSGGTDGHTCGMNWLADGWDGFYGLGEQMS ALETLQNTRALVRPAPYTAQTGGSSQGDPAAGLGVKTEAVPPLKLTNADVAGAAIITAIIGLSVIAGAIWLLL
Peptide
MWRPLVLVLALLRVLSPAKGL
SEQ0032
Homolog of YGR282C; BGL2
Endo-beta-1,3-glucanase, major protein of the cell wall, involved in cell wall maintenance
S. cerevisiae null mutant: viable
S. cerevisiae overexpression mutant: decreased vegetative growth
chr1-4_0426
ORF
AAGCTGGACTCATTAGGTTGCGACTTTTCTTCTTGA
AA
MIFNLKTLAAVAISISQVSAVSSLGFALGNKNVDGTCKYLADYEADLDTIRGGSEAVAIRAYSAEDCNTLQYLGP AVEEKGFKLVLSVRPLDESYYQAEKNALSEYLPQLSVSTLQFLSVGSEALYRDDLPASDLADKIRDMKEFLAGLT DKNGDSYSSVPVGTIDSWNVLVDASAAPAIEASDAVYANAFSYWQGQGPSNSTYSFFDDIMQALQVIQTIKGSTD IDFWVGETGWPTDGDNFGDAVPSIENADNFWKEAICGIRGWGINTFVFQAFDEDWKEEDDAVENHFGVWDSSRQL KLDSLGCDFSS
Peptide
MIFNLKTLAAVAISISQVSAV SEQ0033
Homolog of YGL028C; SCW11
Cell wall protein with similarity to glucanases; may play a role in conjugation during mating based on its regulation by Ste12p chr2-1_0052
ORF
TATGGCATTGAGCAGTTTTTCGGAGCTATCGATCTGTTTAGTTAA
AA
MLSTILNIFILLLFIQASLQAPIPWTKYVTEGIAVVTETNVRWTKT IPIVQVLI SDGATYTHTLTTVSTAEEN GNFQPI TTTS IVNKEVWPTSVTPNTQQTRPTQVDTTQNNADTPAAPTPSPTTSΞNNGVF TTYSTTRSWTSVW VGPDGSPIENTGQTANPTTTAPTTSTTAARTTSSTSTTPTASSTPGGNHPRS IVYSPYSDSSQCKDATTIETDLE F IASKGISAVRI YGNDCNYLTVVLPKCASLGLKVNQGFWIGPSGVDSI DDAVQEF IQA VNGNNGFNWDLFELITV GNEAISAGYVSASSLISKIKEVSS ILSSAGYTGP ITTAEPPNVYEDYGDLCSTDVMS IVGVNAHSYFNTLFAASD SGSFVKΞQIEWQKACSRSD IT I IETGYPSQGATNGKNVPSKENQKTAIFΞ IFEWGTDVTI LSTYDDLWKDPGP YGIEQFFGAIDLFS
Peptide
MLSTILNIFILLLFIQASLQA SEQ0034
Homolog of YOL154W; ZPS1
Putative GPI-anchored protein; transcription is induced under low-zinc conditions, as mediated by the Zap1p transcription factor, and at alkaline pH chr3_0517
ORF
TACGCTGATACGTATGAAGAATGCTTAGAATTGGCTATTGAGAGTCCAGATGAAGCAGTTAGAAACACCGCTTCC
CACACACACTCAGATGGCGAAGTCCACTGTGCTTAA
AA
MRFINLTITSLLALASRTTAEDVTVTQLVSSPTSASEAGDWSANWVKGFPIHSSCNQTKFNQLSLGLEEAQIMAA HARDHTLRYGNESEFFTQYFGNASTAEVIGWFSVWDADKSNLLFRCDDIDGNCKFDGWAGHWRGENGSDETVIC DLSFQTRKFLAQMCSNGYNVANYPNNLYWASDLLHRLYHTTTIGQLTVDHYADTYEECLELAIESPDEAVRNTAS LRFYALDVYAYDIAVPGEGCSGEIGEDDEESTSSGSATVTTTSTSAGTECHTHSDGEVHCA
Peptide
MRFINLTITSLLALASRTTAE SEQ0035
Homolog of YGR189C; CRH1
Putative chitin transglycosidase, cell wall protein that functions in the transfer of chitin to beta(1-6)glucan; localizes to sites of polarized growth; expression is induced under cell wall stress conditions
S. cerevisiae null mutant: resistance to ketoconazole: normal; resistance to miconazole: normal; viable
S. cerevisiae overexpression mutant: resistance to ketoconazole: increased; resistance to geneticin: normal; resistance to miconazole: increased; resistance to benomyl: increased chr4_0559
ORF
GCCGCAACGTCTAGTGTTTCCTCCCACTCCCCAGAGGTTTCTGCATCCAACGGAGCCGACAGGAACATTTTACAA ATCAGAGCTGCGTACTTGGTAGGAATACCGTTCTCAGTTCTTTCATTGTTAATGATATGA
AA
MVSLTRLLITGIATALQVNALPVADSNPAGTPWKRETATSAECNPLLTADCPPVKALATSISDDFTSASDNYHI VSFEDQVTYDEDGLTLTLAERWDNPSLKSNFYIMFGKVEVVLKAATGQGIISSFYLQSDDLDEIDLEWFGGDPYE VQTNFFSKGNVTTYDRGGYHNIADPTANFHNYTI DWNYDKLTFYFDGNVIRELANDTSSGYPQSPMYLRTGIWAG GDPSNQPGTIEWAGGLTDYSQAPFSMHIKHLVVSDYSSGTEYKYTDQSGAWTSIEAEGGEVLGRVDQANEEFALL VSGQELDDTSSSESSSSSSSSSSSSSSSSSSSSSTSSSSSSTTASSSSESSSTSESSSSVISSTTASSSTTTSAN STTHESTTTRPTSESTTQSTATSSHVSETSQSAHSTTTGPSRNSSIAAESAATSSVSSHSPEVSASNGADRNILQ IRAAYLVGIPFSVLSLLMI
Peptide
MVSLTRLLITGIATALQVNAL
Homolog of YMR307W; GAS1
Beta-1,3-glucanosyltransferase, required for cell wall assembly; localizes to the cell surface via a glycosylphosphatidylinositol (GPI) anchor
S. cerevisiae null mutant: alkaline pH resistance: decreased; budding pattern: abnormal; cell shape: abnormal; chitin deposition: increased; competitive fitness: decreased; freeze-thaw resistance: decreased; resistance to doxorubicin: decreased; resistance to Calcofluor White: decreased; resistance to calcium dichloride: decreased; resistance to caffeine: decreased; resistance to hygromycin B: decreased; resistance to mycophenolic acid: decreased; resistance to rapamycin: decreased; viable; oxidative stress resistance: decreased; pheromone sensitivity: abnormal; Bud8p-GFP distribution: abnormal; resistance to FK506: decreased; resistance to Congo Red: decreased
2 homologies
SEQ0036 chr1-3_0226
ORF
AACATGTACGAATGGTGTGGAAAGGCTACTTTTTCCAACTCTGGATATAAGGACAGAACCGCAGAGTTCAAAAAC
AGCAACTGGAAGGCCTCTACTGATCTGCCACCAGTCCCAGAACAGGCTGCTTGCCAGTGCATGGCTGATGCGCTA TCCTGCGTTGTGTCTGAAGATGTCGACACTGATGATTACTCTGACCTATTTAGCTACGTCTGTGAGAATGTCTCC
GTGTCAATTTCCGTAGCTCTTGGAATGATTATGTCATTCTAA
AA
MFKSLCMLIGSCLLSSVLAADFPTIEVTGNKFFYSNNGSQFYIKGVAYQKDTSGLSSDATFVDPLADKSTCERDI PYLEELGTNVIRVYAVDADADHDDCMQMLQDAGIYVIADLSQPNNSIITTDPEWTVDLYDGYTAVLDNLQKYDNI LGFFAGNEVITNKSNTDTAPFVKAAIRDMKTYMEDKGYRSIPVGYSANDDELTRVASADYFACGDSDVKADFYGI NMYEWCGKATFSNSGYKDRTAEFKNLSIPVFFSEYGCNEVQPRLFTEVQSLYGDDMTDVWSGGIVYMYFEETNNY GLVTIKΞDGDVSTLEDFNNLKTELASISPSIATQSEVΞATATEIDCPATGSNWKASTDLPPVPEQAACQCMADAL
SCWSEDVDTDDYSDLFSYVCENVSSCDGVSADSESGEYGSYSFCSSKEKLΞFLLNLYYSENGAKSSACDFSGSA TLVSGTTASECSSILSAAGTAGTGSITGITGSVEAATQSGSNSGSSKSSSAΞQSSSSNAGVGGGASGSSWAMTGL VSISVALGMIMSF
Peptide
MFKSLCMLIGSCLLSSVLAA
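In the entries of this listing, the sequence under "Peptide" appears to be the N-terminal signal peptide of the sequence under "AA". A minimal sketch of that relationship, assuming the peptide is an exact prefix of the protein and that blanks inside a sequence are extraction artifacts, is given below. The function name split_signal_peptide is hypothetical and is not part of the listing; the protein string is only the first residues of the AA entry immediately above.

```python
def split_signal_peptide(protein: str, peptide: str) -> tuple[str, str]:
    """Return (signal_peptide, mature_protein).

    Assumption: the 'Peptide' entry is an exact N-terminal prefix of the
    'AA' entry. Whitespace is removed first because the listing contains
    stray blanks inside some sequences.
    """
    protein = "".join(protein.split()).upper()
    peptide = "".join(peptide.split()).upper()
    if not protein.startswith(peptide):
        raise ValueError("peptide is not an N-terminal prefix of the protein")
    return peptide, protein[len(peptide):]


# First residues of the AA entry above and its Peptide entry.
aa_start = "MFKSLCMLIGSCLLSSVLAADFPTIEVTGNKFFYSNNGSQFYIKGVAYQKD"
signal, mature = split_signal_peptide(aa_start, "MFKSLCMLIGSCLLSSVLAA")
print(len(signal), mature[:10])
```

The same check can be run on any other entry to flag records where the peptide and the AA sequence disagree at the N-terminus.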
SEQ0037 chr1-3_0227
ORF
TCCGATGATGCTGCATTTATTCACAAGCCAAAGTCTGGATCAGAATTATTAATGGCCGCATTCTTTGTAACATTT GTGACCTCAGGATTCACTTACACTCTTTTATAG
AA
MLSILSALTLLGLSCASDLTPPIEVTGNKFFFSNNGSQFYMKGIAYQQDTSNSTTGDNFIDPLANVEACRRDFPY LEELHTNVIRVYAIDVDADHDECMQLLEEAGIYVIADLSEPGDSI ITTDPAWDVHLYDRYTSVIDELQQYDNVLG FFAGNEVITNRSNTDAAPFVKAAIRDMKAYMEDQGYRQIPIGYSANDDALTRISSANYFACGDSDVRADFYGYNI YSWCGRSSFTTSGYDDRTEDFRNLSVPLFFSEYGCNEVQPRLFTEVQALYGENMTDVWSGGIVYMYFEEENNYGL VTIEPDGDVSTYSDFNYLRSELATISPSLATQSEVSATATELECPPTGNNWRASTDLPPTPNKEVCECLQNTLAC WADDVEPEDYADIFNYVCSRIECGDIGTDAEIGEYGPYSFCDAKEKLSFVLNLYYEDQNGASSACDFDGSASIV STTAVSSCVSILQSASTVGTAPSSSRTATGSGSSGTGSGSGSSADATSTESDDAAF IHKPKSGSELLMAAFFVTF VTSGFTYTLL
Peptide
MLSILSALTLLGLSCAS
SEQ0038
Homolog of YDR055W; PST1
Cell wall protein that contains a putative GPI-attachment site; secreted by regenerating protoplasts; up-regulated by activation of the cell integrity pathway, as mediated by Rlm1p; upregulated by cell wall damage via disruption of FKS1
S. cerevisiae null mutant: viable; increased budding index chr2-2_0148
ORF
TTGGCTCCTGCTTCTCTCTTCACTTTGTTAGCCTCAATTGTTCTCGGATTCTTGTAG
AA
MQFGKVLFAISALAVTALGETTSSLSATLTSTTRIS IASGCSLEDFTATAQSNLDELSDCEAVRGDIHIAGSLGS
AAIANVKAIYGSLIVKNATSLVSLTADSLTTITEQLALSELTILDTLSFAQLTSVGSIYFVTLPALEETGFHTGV
VASVSLNNLTTVNASLGF IEΞGFQKLSFPGITRVGGSFSIVDNDDLEE IDFSNVQS IGGGLI IANNSKLTDFSEW
EDLQTVGGALVLEGSFDNGSFPSLRAVRGAFSLESDGDISCDDFSDIRGDTAGEYQCSAASFSTSASAQSSSSST
STGGSSTHTGSSTATSSSSEDAGVALAPASLFTLLASIVLGFL
Peptide
MQFGKVLFAISALAVTALGE SEQ0039
Homolog of YDR221W; GTB1
Glucosidase II beta subunit, forms a complex with alpha subunit Rot2p, involved in removal of two glucose residues from N-linked glycans during glycoprotein biogenesis in the ER
S. cerevisiae null mutant: viable chr3_0179
ORF
AGCCCAGTTGGATGTTTCAGTGAAGACAATGTTTATCTGGTAAAGGAGATTGCTGAAGGAGAGTCACATGCCCCT TTCAAGGTGATACCTTTCAAAGATGAACTCTAA
AA
MLRLLTIGSIAVSLFPASAEIPPLRGVAPDLLEKYVPDKDGNWKCLGHPEIVLHFDQVNDDYCDCPDGSDEPGTA ACENGKFYCANEGFEPNF IPTFLVDDGVCDYKVCCDGΞDEKSGKCPNRCLELAEKAELLRLERKARLENGLKAKR
DLIVQSQNKLREISQRRAELEKTIQLEQQQLNYLNEQDPNDSALRKEVDPQLEKWQILKVQKDEILLLKTKADS LERLLKELEKNYNPNFNDAAVKAAVQGFRDYSSNSNDDSYSDLDSVEPENIVKDIQKTINTFAFDTSNDSKYESI WSFWKEVLADKIVKIYQEFTAVPLKTASKPLKPISGTAQEIQKSIDSHNKEIVKIDSDLENSESRYGLEDIFRAY DGRCFVEKIGDYDYEYCFTGSLTQISSNGQRVSIGTRDKIEVLEDEQSVGGYSYRVYYEKGAKCWNGPVRKAIAV
VQCGDVEQLVSVSEPEKCEYHLWRSPVGCFSEDNVYLVKEIAEGESHAPFKVIPFKDEL
Peptide
MLRLLTIGSIAVSLFPASAE SEQ0040
Homolog of YLR343W; GAS2
1,3-beta-glucanosyltransferase, involved with Gas4p in spore wall assembly; has similarity to Gas1p
S. cerevisiae null mutant: viable chr3_0306
ORF
ATTATTTCCAAATCCAGCCTTTTGACCAATCTTATTGCTATGTTCTTAGTAATATTATTTACTTAG
AA
MLGFKDFFRLLILFVTAFASDSYEFSSSVKHIASPTTVERIRKLTPTIEAYGSKFFYSNNGSQFFIKGIAYQQDF TEYLRRPESLVDVDAPKTQTLCELNFRNGRHFVDPLAWPDICTRDLPYLQELGVNTLRVYSIDPRANHDVCMERL AEAGIYVILDLSNVDCSINRDDPSWDLE IMDCYTSVIDTMAPYDNVLGFFAGNEVNTDVANTYTMPFVKSSIRDI KSYIKRKNIRKIPVGYSSNDDSLTRNNLADYLVCGENDLAKIDFFGINMYEWCGYSSYATSGYKERTLEMAGYPV PVFLSEFGCNLKRPRPFTEVEAIFNKPMTNVWSGAIMYMYFEERNEYGWKVKGDYITDPVEKLPDFNLLKAQYR NASPKGVNKQDYSSHGSKDVQCPVASEDWKVSEELPSTPRAPKCDCMMAALKCSIEEVPNRSELAEIVGYICGEI
SCHNIViWGTTGWYGEFSDCSFHQKLSYVAQLYYNYHNSTPDACYFEGYGRINAYPNTLEALAMFFTLKGTRCKE ELGPVKEISRLITGEPFGSQNRDSPRRILVDSRLFNTNLDTSESKAITAKIISKSSLLTNLIAMFLVILFT
Peptide
MLGFKDFFRLLILFVTAFAS SEQ0041
Homolog of YBR139W
Putative serine type carboxypeptidase with a role in phytochelatin synthesis, green fluorescent protein (GFP)-fusion protein localizes to the vacuole; expression induced by nitrogen limitation in a GLN3, GAT1-independent manner
S. cerevisiae null mutant: viable chr3_0633
ORF
GACCAAAAGCTACTGAAGCCGTACACTGACACTGGCTTGAATGTCTATGATATTCGTACAATGTGCGATGAAGGG
ACACACGGTGATTTCTCATTTGGTTATTAA
AA
MKSVIWSLLSLLALSQALTIPLLEELQQQTFFSKFTVPQQVAELVGTHYSKDEIISLWKDIELDVPREKIQEAFD KFVKQSTATSPVRNEFPLSQQDWVTVTNTKFDNYQLRVKKSHPEKLNIDKVKQSSGYLDI IDQDKHLFYWFFESR NDPSTDPI ILWLNGGPGCSS ITGLLFEKIGPSYITKEIKPEHNPYSWNNNAΞVIFLEQPVGVGFSYSSKKVGDTA TAAFDTYVFLELFFQKFPQFLTSNLHIAGESYAGHYLPKIASEIVSHADKTFDLSGVMIGNGLTDPLIQYKYYQP MACGKGGYKQVISDEECDELDRVYPRCERLTRACYEFQNSVTCVPATLYCDQKLLKPYTDTGLNVYDIRTMCDEG TDLCYKELEYVEKYMNQPEVQEAVGSEVSSYKGCDDDVFLRFLYSGDGSKPFHQYITDVLNASIPVLIYAGDKDY ICNWLGNQAWVNELEWNLSEEFQATPIRPWFTLDNNDYAGNVQTYGNFSFLRVFDAGHMVPYNQPVNALDMVVRW THGDFSFGY
Peptide
MKSVIWSLLSLLALSQALTI SEQ0042
Homolog of YKL164C; PIR1
O-glycosylated protein required for cell wall stability, attached to the cell wall via beta-1,3-glucan; mediates mitochondrial translocation of Apn1p; expression regulated by the cell integrity pathway and by Swi5p during the cell cycle
S. cerevisiae null mutant: viable; increased budding index chr4_0305
ORF
AA
MKLAALSTIALTILPVALAGYAPPDDWSTLTAKGVYPGAFSSYSNTFGIIVEPLTSSVILTPATTTHVVSQIDDG QIQHTNTAYVGTAHQVVSQIGDGQIQATASAVPLPTELASQIADGQIQATTPAGAPATPASQIQDGQVQATSSAD AHPTAHSQAEDIGAHSLSSTGLIPGTLTTVLTSTGSDTTLTLVTVETEWTYTPEVTVTVNRNAAFVKRDNIESA CLTPQALGLTLKDSVLLDLQGRVGSIVANRQFQFDGPPPQAGTIYAVGWSITPNGYLALGDSEVFYQCLSGSFYN LYDQHIAEQCEAVHLKAVDLISC
Peptide
MKLAALSTIALTILPVALAG SEQ0043
Homolog of YDL037C; BSC1
Protein of unconfirmed function, similar to cell surface flocculin Muc1p; ORF exhibits genomic organization compatible with a translational readthrough-dependent mode of expression
S. cerevisiae null mutant: viable; increased glycogen accumulation
FragB_0067
GACCCTTCTAGTTGGTCAACTACAATCAGAGTGAATGCTAAAGATTCTGCTGACGGCTGCCATGTTGAGATGTAT
TCAGTGGCTAGTGGAGTTGCGGCATTGATTGTTAATGCATTGTTCTTATGA
AA
MLSLTKLFIGIILGAFVAAEQSDVEPACYARQWENTFPPSNIYVSSATWVRDNVYDVTLRYEAESLHLRSLGELK
VIGLNSPVSGTKWYSRNSKIYEINDPSSWSTTIRVNAKDSADGCHVEMYPFQIQVEWCQAGADTDECSSWPYPS
VYDYDIGCDNMQDGVSRF HHAVFRWLKHSVSGCSAASTVAAISSQTTSSSQTSTSSQSSSSTAISSVTTQTTQTT
TSGVPIWGQCGGNGHSGSTSCAEGVCVYVNDWYYQCQPGASTTTTSQASSSQVFSQLTTTEESISSEESIGLTST
VETTSSAEWESTTSGNQAIPTIEFDVEIEFPAAEITEEETTETEEQVPLHGKCGGIGHTGSTTCANGICTYVND
WYYECLPREAESEEEVPLYGQCGGINYSGATKCAEGVCTYVNDWYYQCLIGPATGTPQSTSSTDSQSVSESSAAE
DSTPTEEAEESAEESTSTEDAEESTEDFATTEEVEESTSTEDAEESTSTEEAEESTSTEDAEESTSTEEAEESTE
ESTSTDEVEESTSTEEVEESTEESTSTEDAEESTSTEEAEESTEESTSTDEVEESTSTEEVEESTEESTSTDEVE
ESTSTEEVEESTEESTSTDEVEESTSTEEVEESTEESTSTEDAEESTSTEEAEEΞTEESTSTDEV/DESTSTEEAE
EGTEQFSSTDVPQGRPGFENPTEEVESSSTEEFEEPTSTDETDESTEEATSTEEAEESISTDDVEQSTΞVEEAEE STEESTSTEALEESTSTGDFENISAVDEELEESTEESTSTEEVEESTSTEDAEESTSTEEAEESTEESTSTEDAE ESTSTEEAEESTEESTSTEDAEESTSSDEVEESTSTEEVEESTEESTSTEDAEESTSTEEAEESTEESTSTEDAE ESTSTEDAEESTSTEYVEEΞTSIPSSTSEPSLSTSIΞPSSSSSFETATVTTDVSTTVLTTTDCGEQTCYKSLIIT GVTEKTITTSGRTTVITTYCPLPTETVIPTPVWTSTIYADETITETTCYVTGAIEKTVTVGGSPTWTVHTTLP
IEAIKPTPVTVTSTIYADETVTKTTCYETGAIEKTYTIGGSSTIVTIHTPLΞTVQPTPVTVTSTIYADETVTKTT CY\7TGAIEKTYTVGGSASVATVfTPLPTEPSKTS IVTLTSVISNSGSHPLTTSIVTGAVEKTIWDGTAS IVTVH TPIASVYSSAHEHSKVTKAITPSTTTIIRDVCDSHSCTQATIETGVVSKTITIGQSVTTVVPSY^PLEHEQQEVH TTAATPVDSSVHSQSGSPVPPAITTQSYSAQASΞVPVASYYEGSANNLVFSVASGVAALIVNALFL
Peptide
MLSLTKLF IGI ILGAFVAAE SEQ0044
Homolog of YDR144C; MKC7
GPI-anchored aspartyl protease (yapsin) involved in protein processing; shares functions with Yap3p and Kex2p
S. cerevisiae null mutant: viable chr1-1_0379
ATCAGATATCTGGACAACTCTTTTGCCAATGGATCGTGGGTGAGGGATACGGTTTATGTTGGTGATTTTGAAATT GACCAGCAAAGTTTTGCATTGGTTGATATCACAAATAACTACATGGGAATTCTGGGCCTTGGTCCTTCTAGTCAG
GTTGTATCTTTGAGTGTTGGTCCCTGCATTATTGCCTTCCTACTACTCATCTCTTAA
AA
MFVIQLAFLCLGVSLTTAQPΞSPFKANKFPFKKVHYSSNPSDRLIKRDNYKKLDLRHLGVLYTAEIEIGSGKTE I EVIVDTGSADLWVIDSNAAVCDCPILRYKFQENHLLKRNEALNFDVDLNKPICDQFGSFNPQSSRTFQΞNDTAFS IRYLDNSFANGSWVRDTVYVGDFEIDQQSFALVDITNNYMGILGLGPSSQQTTNSDPTDNSFTYLGILDSLRAQG FINSASYSVYLAPDGKTDDTDHDDGEILFGAIDEAKINGQLKLFPYVNPYKSVYPDQYASYITVSSITVASYFSS RLVERIPQLALLDTGATFSYLPTYTLIRLAYAIHPGFEYVRQLGLF IIESNVLSΞARQSTIDFRFGKDWIRSNV SDHLLDVSQYFTSGHYLALTIHESVDGLLILGDTFIKSTYLFFDNDNSELGIGQIKITNDEDIQEVGEFTLERDS DYSSTWSIYSYETSLDPLSTGTGTGSTYSPTRSTTARSEPTTSRRSTTLQPRTTVIPSIDRLSLNS ITSHGSSTN GTSPTNETSFAEDGGTLTPEEASLTTSLNSATISETTFVDVETSTTNGASWSLSVGPCIIAFLLLIS
Peptide
MFVIQLAFLCLGVSLTTAQ SEQ0045
Homolog of YGR279C; SCW4
Cell wall protein with similarity to glucanases; scw4 scwlO double mutants exhibit defects in mating
S. cerevisiae null mutant: viable chr1-3_0229
ORF
AA
MQVKSIVNLLLACSLAVARPLEHAHHQHDKRGWWTKTIWDGSTVEATAAAQVQEHAETFAESTPSAVVSSSS APSSASSASAPASSGSFSAGTKGVTYSPYQAGGGCKTAEEVASDLSQLTGYE I IRLYGVDCNQVEMVFKAKAPGQ KLFLGIFFVDAIESGVSAIASAVKSYGSWDDVHTVSVGNELVNNGEATVSQIGQYVSTAKSALRSAGFTGPVLSV DTFIAVINNPGLCDFADEYVAVNAHAFFDGGIAASGAGDWAAEQIQRVSSACGGKDVLIVESGWPSKGDTNGAAV PSKSNQQAAVQSLGQKIGSSCIAFNAFNDYWKADGPFNAEKYWGILDS
Peptide
MQVKSIVNLLLACSLAVAR SEQ0046
Homolog of YLR286C; CTS1
Endochitinase, required for cell separation after mitosis; transcriptional activation during late G and early M cell cycle phases is mediated by transcription factor Ace2p
S. cerevisiae null mutant: viable; decreased competitive fitness
S. cerevisiae overexpression mutant: decreased vegetative growth chr3_1003
ORF
GACCGTTTCTTCTAA
AA
MKFFYFAGFISLLQLIFAFDNSAKNNVALYWGQNSAGSQERLSYYCQSDSVDIVLLSFLYIFPANPLGLDFSNAC
GDQFPSGLLKCDTIAEDIQTCQSLGKKVLLSLGGATGTYGFSSDSEAEDFAEVLWDTFLGGSTDERPFGDSILDG
IDYDAENNNPTGYTALSAKLREFYASDPSRTYYIAAAPQCPYPDASVGDVLANADVDFVF IQFYNNYCALASTSF
NWATWLDYAQNTSPNPNVKLYVGLPGGPTGASSGYVGTDVVKQRIDEIGASSSLGGIMLWDASQGFSNQVDGGNY
VDAMKS ILNGLGSVDASTTSSSQAAATSQTTSTLATSISSTPGSSSTVSSSSSLΞSSSLPLIFI ILYHDLNSHGE
VDSEGTTLTGTSTIVJWTPSEAQSYETSSLSSVSSIPTGNKDVSSILVITDVTDSLTSTKESSDSALTISTSLSSS
PSLADSSRDGETSTWQVTSSTTTWGGNTGNTASPSSTASGLSSFVFDPTTLSTKITSSELWSSTLENSLGTT
KIVLTPVSTVPSSTDLGTTTDSSPAPTASTTGTYLDCSSLSGKAKAACLNKNFANGFFLSGLQGCKDGDHACSSD
GYFΞVCDHGQWVFFNCPAGTACYASNQDDETFVGCNFΞELKDKFTKRSWLDRFF
Peptide
MKFFYFAGFISLLQLIFAF SEQ0047
Homolog of YIL099W; SGA1
Intracellular sporulation-specific glucoamylase involved in glycogen degradation; induced during starvation of a/a diploids late in sporulation, but dispensable for sporulation
S. cerevisiae null mutant: viable; decreased resistance to sulfanilamide chr4_0579
ORF
GTTTGGAGTTGTCTCCGTTGGAGGAACAGAATACTAAACTTACTTGCAACAATTAATGAATAG
AA
MKLLDGLTISLCISMATSLVLPGFDQLRLQQQNVLSLLPKEKIQIQTLSEIWSYVDCAIASLGPKE IVIDSDLS I LKPVQASS IPRSEFNQWLLYQRNVSFHGVLNNIGGYGYNADNVSVGCI IASPSKSSPNYFYQWVRDSAITINTVI
EYLYDDNIDDSIDKSELLEAIDGYINNIHHLQRQDNRSGKFKDGYASLGEPKFMVNGDPFNDSWGRPQRDGPGLR ALSVANYIDLLDKLGIQKDSERLSFIYNEVLKPDLTYVTLYWDHDGFDLWEETNGLHFFTSLTQLKALRRGIELA LRFEDYEFHRDLKNAYAQLRNFIVGRDSGFQDPRFPHIIEHPNLVDAPAHIRRNGLDIATIIAALRSHDVDDAGD IVNIPFGVDNARVLNSLTYLVNDMKFRYPLNQGRIVPGLTLGMALGRYPEDTYNGVGTSEGNPWFISTSSAGELI YKLLYLQYKYQQDFVIDSSNREFYEMFLDLPKDSARGIQIRIPFGSDTYRALSVSLIQYADSFLDVIREHVDNEG QMSEQFNRYNGYMQGAEKLTWSFGSVWSCLRWRNRILNLLATINE
Peptide
MKLLDGLTISLCISMATSL SEQ0048
Homolog of YEL040W; UTR2
Cell wall protein that functions in the transfer of chitin to beta(1-6)glucan; putative chitin transglycosidase; glycosylphosphatidylinositol (GPI)-anchored protein localized to the bud neck; has a role in cell wall maintenance
S. cerevisiae null mutant: viable chr1-1_0293
ORF
AA
MRPVLSLLLLLASSVLADEVIECDADNKCPEDKPCCSQYGVCGTGVNCLGGCDPRHSFNASACLPMPVCRDVDLK ASTDAFEIDTNYLGDANETDWVYNGYLIDYDDSVLLAMPKESYGTWSSTFYVWYGKITATLKTSRGAGWTSFI LFSNVHDE IDWEFVGYNLSQVETNYYYQGVLNYTNGRNVSLEEDVNSFEYFHDYEIDWKEDVITWS IDGDVVRTL KKEDTYNETTDKYMFPQTPSRVQLSIWPAGAESNAIGTVSWAGGNVDWDSEDIQDPGYFYYTLKELTVECYDVPD GTEEDGELAYYFKESDAFDQGDIIITNNSTKIKSLDDTGFDPDEDDDDDESSSSSSSSSRSSSSSSRTGSSSTSS ATSTSTSNSNDDDDNNNSSPTTSSGTSSAASGFVQNMSQTSGSSSATSNNAAASLSAGFLTTISFFASVLGFL
Peptide
MRPVLSLLLLLASSVLAD SEQ0049
Homolog of YOL132W; GAS4
1,3-beta-glucanosyltransferase, involved with Gas2p in spore wall assembly; has similarity to Gas1p; localizes to the cell wall
S. cerevisiae null mutant: viable; normal sporulation chr3_0184
ORF
AAAAAAATCAGAGCAATCAGGCAAAGGCAACAGGAATGGAACTTTAAGGTTTGA
AA
MLYLVTVLLFLVHWLGYIHPITIKGKHFYDSLTGELFMVKGVDYQPGGASGVSTDKDPLSE IEECARDILLFQE LGINTIRVYSVNPDLNHDKCMSLLASAGIYLILDVNSPLPNQSINRYEPWSSYNHDYLYHIFKWEQFSHYNNTL AFFAGNEVVNDKVSATHSSNYMKAWRDLKTYITHHΞDRPIPVGYSAADDLΞFRIPLAKYLECSNSTAELDSVDF YGVNSYQWCGYQNMQTSGYDQLVAHYADYTKPIMFSEYGCNEVTPRIFQEVEAIYSSKMSSVFNGGLAYEFAQEP NNYGMVEYLSDGRVKLLKDFETFKNQLAKASETYHIKQMNENVRSAPLKCQGTYENLGDKLWPQSLGMAFITQG VKGEKGSYVELSDDDFWKKEIFLSDGTKFQKSKIVATHNLNGPDTQQTVPEKSQGSKNGQITNPNPKPKPQTRK IPNKKTKPKQNKQEKSKVQKQIAREKKIRAIRQRQQEWNFKV
Peptide
MLYLVTVLLFLVHVVLGY
SEQ0050
Homolog of YBR286W; APE3
Vacuolar aminopeptidase Y, processed to mature form by Prb1p
S. cerevisiae null mutant: viable; decreased telomere length; decreased acid pH resistance chr1-4_0611
ORF
AAGTACCACGGTCCAAAACTTGTCCTTTAG
AA
MKYLPLVATLASSALAAGINFAQLLDQKPLDIADNVKWELKPEVDSAALQSAVNELDLKIEASYLFKVAHGSVFE YGHPTRVIGSPGHWSTINHVLDTLHNFKHYYDVDVQPFEAFTGILKSFSLTINGVAPKSAEALDLTPPTPGGFPV TGPVVLVDNYGCQASDYPFNVTNGIALIQRGSCSFGQKSELAGLRGAKAALIYNNVPGSAKGTLGAPTPHQVPSL SLSQEDGEAVKRQLLTSGSVIATVAVDSYVKKFKTKNV IATTRYGNDSNIVMLGAHSDSVAAGPGINDDGSGTIS LLNVAKYLTKFKVNNKVRFAWWAAEEEGLLGSDYYVΞKLTPKEKSQIRLFMDYDMMASPNYAYQVYNATNSENPV GSEELKNLYIDWYVEQGLNfTLVPFDGRSDYDGFIKSGIPGGGIATGAEGLKTEEEAELFGGEAGVAYDPCYHSL CDDLANPDYVPWWNTKLIAHSVATYAKSLDGFPLREEPSPFKMTAQSNFKYHGPKLVL
Peptide
MKYLPLVATLASSALAA SEQ0051
Homolog of YLR300W; EXG1
Major exo-1,3-beta-glucanase of the cell wall, involved in cell wall beta-glucan assembly; exists as three differentially glycosylated isoenzymes
S. cerevisiae null mutant: viable chr2-1_0454
ORF
CAGCCATTGGACGACAGACAGTATCCAAATCAATGTGGGTTCTAA
AA
MNLT
ALDRLQQHWSTFYDEKDFQDIAAYGLNFVRIPIGYWAFQLLDDDPYVQGQEEYLDKALEWSRKHGLKVWIDLHGA
PGSQNGFDNSGKRDSWDFQNGNNVQVTLDVLKYISKKYGTTDYYDVVIGIQLLNEPLGPILDMDNLRQFYADGYD
LVRDVGNNFVVIHDAFYQAPEYWGDDFTSAEGYWNVVLDHHHYQVFDADELQRS IDEHIEAACDWGRDANKEYHW
NLCGEWSAALTDCTPWLNGVGKGTRYEGQLDNSPWIGSCENSQDPSKLSSERICEYRRYVEAQLDAFLHGKSAGF
IFWCFKTEASLEWDFKRLVNAGIMPQPLDDRQYPNQCGF
Peptide
MNLYLITLLFASLCSAI
SEQ0052
Homolog of YDR349C; YPS7
Putative GPI-anchored aspartic protease, located in the cytoplasm and endoplasmic reticulum
S. cerevisiae null mutant: viable; abnormal budding pattern; abnormal cell shape; decreased cell size; decreased resistance to hygromycin B chr3_0394
ORF
ACTGTGAGCTGTCTACTACTGTAA
AA
MYQALLVLSLICPSSANFVKLRSNAGMFYDTMAGVPRΞDEEFWLRLDINQGLSWTLDSSYYSCNGSNVE SAQNVYDASNSPTADFVDVYANTTVNNTDEASAERVNLTNNLFADGVYMEDNFYVTLNNGARMTATDLKFLNAHN SSAAVGSLALGSYTSQDVPTFLQRLQSGGLIESNSFSLALNEIDSSYGELYLGTINSTKYVEPLVEFDFIPVSDP NGVFGFDWEDTFPTVPISGLSMSSNDKQRTVFFPNEWNNTVLTGTYPLPMMLDSRNIFIHLPFSSI IHIAVQLNA LYLDTLHKWAVNCSVGQLDATLNFHMGNLTVHAPIKELIYPAYQGDKRLSFANGEDVCILAMAPDVYIGYPLLGT PFLRNAWAVNHDSKKVAVANLNRDS IPPASNVSVSEΞMGVYVPPPVSTSRTSERPSTLDETSTANFDKREESAI SSSSVTNSSSRNSSTITSSGTQTEQTSGIATIETDS IPGALGNNLTDYSTLTLT IYTNSEVDELNPNIATAFISN GSIYSEPYPFSGTAVAESFSASPSQAEGSNSSSSGSSLVLCFFTSLASLLTVSCLLL
Peptide
MYQALLVLSLICFSSAN
SEQ0053
Homolog of YDR456W; NHX1
Endosomal Na+/H+ exchanger, required for intracellular sequestration of Na+; required for osmotolerance to acute hypertonic shock
S. cerevisiae null mutant: viable; decreased hyperosmotic stress resistance; decreased respiratory growth rate; decreased chloride accumulation; decreased resistance to gentamycin, hygromycin B and 1,2-bis(2-aminophenoxy)ethane-N,N,N',N'-tetraacetic acid
S. cerevisiae overexpression mutant: decreased vegetative growth chr2-1_0156
ORF
CTAACTGAAAACAACAACAGCAAGGACAAAATAGAGGACTAA
AA
MIRLLALFFARQILANEITDPTDENPVLVGPEAPEEETNPLTEEIFSSWALF IVLLLVVSALWSSYYLQQRRVKS IHETVLSIFYGMFVGLILRVTPGHYIQDAVKFNSGYFFNFLLPPI ILNSGYELHQANFFRNIGSILTFAIPGTF I SAIVLGVILFIWTKLGLDGIDVSLVDALSVGATLSATDPVTILSIFNSYKVDPKLYTI IFGESLLNDAICIVMFE TCQKFHGQAVSVSSVLKGIGLFLMTFTVSLLIGVWGVFIALVLKHSLIRRYPQIETCLVLLFAYESYFFSNGCH MSGIVSLLFCGITMKHYAYFNMSRRTQIATKYIFQLLAQLSENFIF IYLGLSLFTEVELVFRPMLI IVTTISICI SRWCAVFPLSRLINWTTRAKHKGGSSAINYTQDE IPPNYQMMIFWAGLRGAVGVALAMGLQGEAKSSLLATVLVV WLTVILFGGTTAGMLETLNIRVGVIDEQESDDEFDIEAPKPMQLNQI IPGATTPVYSIYSDAAGSRSRTGSQQS FYNNEDEDAADAPPDMNDDEMTSDLDDIPPMANQAQSSKINLQSMSFNNLLSMDDHAKWFTHFDEQVLKPVLLDN LTENNNSKDKIED
Peptide
MIRLLALFFARQILAN SEQ0054
Homolog of YMR238W; DFG5
Putative mannosidase, essential glycosylphosphatidylinositol (GPI)-anchored membrane protein required for cell wall biogenesis in bud formation, involved in filamentous growth, homologous to Dcw1p
S. cerevisiae null mutant: viable; competitive fitness: decreased; resistance to rapamycin: decreased; resistance to wortmannin: increased; resistance to oleate: decreased; sporulation: decreased chr1-1_0147
ORF
GCCACTACACATATTATAAATGGTATGATGGACTACTATGAAGGAACGAGGTATGGAGGTACTGTGGGAATGTTC
ACAGTGTTGGTACTCGGCGTTTTCATTGGAGGTCTGGTATGGATAGTGCTCTAG
AA
MWILWLLTTVSALSLDVDSKDS ICEATTHI INGMMDYYEGTRYGGTVGMFQTPIfYWWQAGEAFGGMIENWFMCDN NTYEEI IYDALLHQTGDNFNYIPANQSTTEGNDDQGFWGFAAMGAAERNFTDPPEDYPSWVALTQAVYNTMWARW DNATCSGGLRWQIFTWNSGfDYKNTISNGCLLHIAARLARFTQNETYGETAEQVWQWLDDVGFINDQGGTfLIYD GANIVSGECTDΪTTIEWTYNYGVLMAGCAYMYDFTHDEVWLTRTTTLLSSIS IFLNDNVIYEQQCQAAMNCNNDQ RSFFSIFSRCLGYTAKLVPSLTDQIMTILTASAEGAAKSCSGGTDGVTCGLNWNIGAYDNMYGLGEQMSALEVIT QLLVLDSPAPYSQNESSSKSDPEAGIGSSTRINMNDLDITGKDKAGAAILTVLVLGVFIGGLVWIVL
Peptide
MWILWLLTTVSAL
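Several entries above give only the 3' end of the ORF together with the full AA sequence. A rough consistency check, assuming the standard genetic code applies and that the ORF fragment is in frame and ends with its stop codon, is to translate the fragment and compare it with the C-terminus of the protein. The helper names translate and orf_tail_matches are hypothetical; the fragment and the protein tail are taken from the entry immediately above.

```python
from itertools import product

# Standard genetic code (assumption: it applies to these ORFs).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string; '*' marks a stop codon."""
    dna = "".join(dna.split()).upper()  # drop blanks left by text extraction
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def orf_tail_matches(orf_tail: str, protein: str) -> bool:
    """True if the translated ORF 3' fragment ends in a stop codon and the
    residues before it equal the C-terminus of the protein."""
    aa = translate(orf_tail)
    if not aa.endswith("*"):
        return False
    return "".join(protein.split()).upper().endswith(aa[:-1])


# Last ORF fragment and last AA residues of the entry above.
tail = "ACAGTGTTGGTACTCGGCGTTTTCATTGGAGGTCTGGTATGGATAGTGCTCTAG"
protein_end = "KDKAGAAILTVLVLGVFIGGLVWIVL"
print(translate(tail), orf_tail_matches(tail, protein_end))
```

A mismatch from such a check would not say which of the two sequences is wrong; it only marks the entry for inspection against the original filing.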
1.2 Proteins potentially involved in secretion
SEQ0055
P. pastoris homolog of S. cerevisiae SEC1 (YDR164C)
Sm-like protein involved in docking and fusion of exocytic vesicles through binding to assembled SNARE complexes at the membrane; localization to sites of secretion (bud neck and bud tip) is dependent on SNARE function
S. cerevisiae null mutant: inviable
S. cerevisiae conditional mutant: increased heat sensitivity; abnormal subcellular morphology
chr4_0134
5' region (in bold start codon next gene) ctccatggacagaatttgaaccaatgatagaattggaacctcttcttgcaaactgaattc tttggggactattgacgtataaagtccaaacagagtcgtcaaagagctcaattctgtatt tgtcaataccatttcttggggcgatgtctccagcgaccgatctaatacccacagcagagt gaaagtaagggatcaagttatagtatacaccagtgatgaatcccattccttgtacaacag ggaattgtataaacgaggaccccgaggtgagattaacattgatagagaactttcgtggag tatccagatcgatatcaatgttggattcaaactctcttgcactcattacgaaatacttga ttccaactggagaatagaagaactgtggaggattaccatcagcgaaatgacgttggctga tgctggtttgggagatcgccaatccattgaatccgtccccagtatttttggaataccaca atgcatacggattggtgtagattggagttgtctgatcaccgaagaagaatattcccgtta agaaattattagtctgcaccgcagtcaaaaggttctcttaagttacgtgatagacctgca
ggaagacctacaggggaaggagctaaagggaacctagttggcgtggaagttgcaactggt tggaaaagattaccggaggtgtcagacatgtgaattcactgtggcacaagacgatgactg gagatgaggaagagaaaccgctggagaaaaaagaagaaggaagaggaagtcggctttgta tctgaatccatccgattagtccctgttgcacagaactataatacgcatgcaacagcgcat ttgttgttctcctgttattctttgctggtgccttgcagggcaacgttttcctcgaatttt ctgtatcttatgtaatcttttttgaagacccgctcctgtaatcccaatatttaccgctag aaaagacaaccagggtacttgaagcaaaactatgagcaata
ORF
CGAAAAAGGGATAAGCTGAAGAAATTTTGGAAATAG
Downstream (in bold start and stop codons previous genes) aatgtactcttatcgaaattaatattcaatcttgggtcttttcctagcctttctacggtt atttttaggctgtatcactcctacagccacgggttgcggcggtggtggtaatggcggagg aggaggtggtggtggtgctgaaaggggaaatgcaaatgacaactggaatgggggtacgaa ggtttggttaggaccagaattagattggattgtaggattaatagaagaaaaatgctgtga tgttggtggtggtggtggtggtgccaacgaatgatggcccggtgtgggcaaatgttgtac agtagattcagaattctcaggtttctggatgacaggggatattggttcgatttctaagtt ctgtccctcgttggataatttctgaaacccatatggagttaccgaatccttgaatccccc ggatcccctaattttctttaattgtaagacgtatcgggaccttgcattatgttgtctagt gatgtctgaatttccagttttcgcctgtgctaatcggtttgccattctaccaaaattatc atccttcggtataagatctacccttcccttttctttccattgtgctatttggtcctgctt atatctttcacttctgaggattctatcatttctaagacgttcatatcgtcgtattgtatc tgggtctgaggaagtggcagctttgacgagagccttatttctcaacctaactaacttcct agtttctttttccacgtctattaacaaacctgagtcttcatgggttggtgtttccgctgg cgttcctgtgctactcatattatcatgccataggatgggcttaagagtgatgatcagata tcaaaggtaggatcaaaaaaaattcgagagatatqagcattctgcatgaagatagacctt taagtaacgattcatgaagtttgtggctctggtttcaggaggaaaagattcatgttttaa catatggcattgcttgagccagggacatgaactcatagcct
AA
MASDLINLQRDYLLKLIGSVETSNGLKCLVLDANSERLVNSLIDSNTLLRYVTTVERIDKKRKIRLSMEGVYLIG PTKFSVNCLLADFQINPTRYKKAHLLFLSPLARELTNLIMGNKQLEANTITRRTVDFTLLPLESHVFLSDAPDSL PTLYNENCLDLIRYQASRAVQTLMNLCI ITGEYPLVRYYSPQNPINKSSVLPRMIAQEFQSTLDDYCRIKQDFPG
DNPRPRSIFIITDRTMDLLAPLMHDFTYEAMCFDLLEFAENVDGDYPNTYRYSVENENGELLDREASLKPPIDDY WEELRNMHILDASNQLDVKLNKLITNNPMMVDRDFASGTRDFLFIVAHLHGFDEERRKIMLHKKLTEELLVINNE RHLAECADFEQNCAAFGVSYDGEKIKDMASFLLSWISLDYFTTSDKIRLILIYAIYRGGLIRADVSKLVKFAGLA SAEEHVMTLFENFSLLGFQLLKAHPKDKSFKKQFWHKIDSNAVLNTSRYKPAIQAIVELASKGILDEASFPYIKD KPLEVSETNPDSATSLKNPRYRAAWSRKGSSYSPPKQRIVVYSAGGITYSEMKAGYDAGCLLNKDVFIGSDEVIT PRMFVNNVIDLTSDRASLSLFYDRRRAAGESAPKVLFEQESHHRPSIGGPVDSSASLASTTSQSHEPPTNDKEKH RKRDKLKKFWK
SEQ0056
P. pastoris homolog of S. cerevisiae SPC3 (YLR066W)
Subunit of signal peptidase complex (Spc1p, Spc2p, Spc3p, Sec11p), which catalyzes cleavage of N-terminal signal sequences of proteins targeted to the secretory pathway; homologous to mammalian SPC22/23
S. cerevisiae null mutant: inviable
chr4_0874
5' region catggattggaatctatgcaaggtactacacaaaactcgagacactgctcggaatccgga tgtatatgaaattcctgtcaatgctttgaatttttgtaatgttgctactcgagctgaaat gattgcaactccagcatctttgtcatcggaaagttttgatatttatgttgtcagtaacca ccggttagagcgcttgcattcgaacgtagatatatgtcaaatcttacagcaaaatcagcg ttcaaatgaaaccgacgaaaaccgaaaatttggtgctatcatgcaaatcacttggacggg agatttccagttagctgtaggttacgagagtggaattgttgcttgtattcagttgcaaga gagacatactcaagttgtctggatagacagttctcatctcacacaccctatattagggat tgtgtcaaatttggatagggtctttacctgttctgcaagcgatagaattgtcgtcaatag ccttcaaactggaaagactgttgcaatctctaagatcaaacataaaggtatttcaagcct agatgtaactattaataacagcatttgccttgtctccgtgactacttgggatgggttttc aaggatttacagatatgatgaagattcgtcagctgctgtggatgaggctctaaaaggtgt tttcattccctggactaaattgagaagaattccaccaaaagttattcctaacttcaaaag ctcctccaatcatcaaacgattgatcttcctgatttaaagggtagagtcatcaagttgtc aaaaactcaaaatggatgtaataatgcactagtggagagaagtgatggtcagagcaaaat gatccttcgcagttcagagaagcgtgcattaacacgatacgcgtttgtaggttatcaaga tggtcgtgtagcaacctatattctttgtgattcatagtttttttctacgattgttcggta tgaacaaaattcaaggtatcatcatatcacgatcgccaggc
ORF
GTGGCCTATCCTGCAACATAA
Downstream gagcattgcgatgcatccaggtctattaattgtgtctagcttctatatgtacaggtgata caggctccccatcatcgtttgcagtattcttctttgtctgcaacacatgtggtaccaatc gttctggtctaactgcagggtttttctttttgattttgaaatcaagcgacctgttctcct gcgttatccgttcccattcgtcaagttgcgatatattatcatcatatcgggccatagtca tgggatataaatgtttaaacctcatgagaacttgagcacgctccttaccttcaacgcgct gagatataaatttctctatcgcacgcagtgtctgaaagtatgttgctccgctcaagtaat aggaagtatgcggcctgaggcgcaacatagtttggagagctgacctgtccaactcggatt gtttttctcttaccaattttctcttagatttcagaaaaagataacctttgatcaaacgac cttcaataaactctaaggcatcttccaaagtggtgtgatgtatcaaaagcgatctagctg tttcattcatggcatagttcgaaagcttgaagccgattgattctaaatgacgtaagagag gaatgatagatttgtagttcttaacctgtatcatatgcctcagcttaacggcccacacag tagacaacctatttgccacaagataagctgcacgaacattcgaatctgaaaagtccgaat ctttttcaacggggcggtttaactgtgtcgccaaacttttgacatacagatcgaaaatta
gctcaaaaatttgccaattggcttcttctgcatagattgcaagtagagcttctaagaata aataactggaaatgctatcagggggcagcatctttataaactttcagcaaatgtctttgc ctcacctgttttttctacggaggccaactttttaataaaaggtaccaccaactgcaaagg aagatatttaataggctgagacagagtctctttcttctttt
AA
MFNI IQRIQSLSNFYLTVSILLCIVTTVVSI ISMFLDETSSIPAQLSNY7VISTNLKYSRSFGSVGGRPKENSKIL
FDLDMDLAPLFNWNTKQLFVQLVAEYPTSVADDGAKVTYWDS I ITEKKYARVHVNKQRGKYSVWDVSDSFQGRNA
TVKLKWNLQPYVGFLFFGQTKGEIEVAYPAT
SEQ0057
P. pastoris homolog of S. cerevisiae SEC11 (YIR022W)
18kDa catalytic subunit of the Signal Peptidase Complex (SPC; Spc1p, Spc2p, Spc3p, and Sec11p) which cleaves the signal sequence of proteins targeted to the endoplasmic reticulum
S. cerevisiae null mutant: inviable chr1-4_0187
5' region ctgcgtttctcaaggcatgggcaaccgcttacggaaatcttgctcaaatcctgataaacg ctgaatgggcagagttccagaagcagaagtggcatgatttcaaagagtttaaagtgacca aaatcgaaaatgaatgtgacgatgtgaagtctgtttattteactccagtagaaggagaga tagcaaaacctttggatgggcaatacgtctgcatcagatggaaattacctggggaaaagt ttgagaaatctagggagtactctctttcttccaggccaaacaacaatacctacagaatct ctgtcaggttgttggaaaatggtaaaatttcgacatttgttcacaatcaactgaaggtgg gagacataatcactgtagcaccacccgctggacaattgttatacgaagaatctcaaaagg atgctgttttttttatcggaggtattggtatcactccagtagtatcaatcatggaaactg cccttgaaaggggtcagagagttacattattttactccaatagaacctcaaagagtactg cctttagaggttggttgaaagaactaaagagtaaattcaatttgcaactgacagtgaagg aattcgtctctgaagaacaagttactgaaggtgtagaccaaatcaattctgctcaattac agagatcagacatccaaacggtgtcgccggagaatgaagtttatctggttggacctgttc catacatgcagtttgttagctctgagctcaacaaactgggagtccaaaatattcattccg agttctttggaccaactgtggttgcttgaacgagaaaacgatatttattatatagtgaat taatgcttacgatatgcagtgtaagtcactggatcccttctcttggtagctagttagtta cttcgtacctttacatccaacatcttacaaaaatatttcgcggctacgagtatcattaaa ccatcattttacgtggatcatccaactataatttcaccacg
ORF
AATAAGTATGCTAAATTGGGACTATTAGGATTAATGGCGTTGAGCACGTTATTGACTCGAGAATGA downstream atatgacgattattaaaaagatccatagccaaaagtattgattttctttttgggtcgtga aacttcaggggtaattttccgttttttagattgcaactctttttccttctccacttggaa cattccaaagagaatatttgagttcttatcagagacttgataatcgaggtcagcaatctt tcttttgaatgattcaagatcagatatattttccatttccggctttattaataactgaat caatatttgttgcagtccataattttcactatccagcaagtcccaaaatacatcaatagg ggaggatccatttcgacccaaaatttctttaaactctggtaagtctttgatttgctggat gtactcataatatgttaaatttgcattctgaatctcttttagtttcaaagaaattgcctt tttaaacgaatctcttgcttgcctgtctttcctgtagttaagcttttgattgagtttgat agcttggtctaattgctttacatggaggtccatttcttgttcatagcagcccaactgctc catagatgaaagatgagggaacttttccatcaattgacctgatatatctttccattttga
ttcaatagttaagcttgattttaaatatgtcttcagatcttgcagttcgttagatttaag ttcattagtttgtctttcgtgttcctctctgagactgctagtgtactcttcaaacaatct tcttttcaatttctcggatccaagagagtatataggctcgtcaatgattaatctggaaca ggttttccaccttgtataatattttatctctgggtaattagataagactttgagaaatgt gtttttaaatgactgctctgcatcctgttgctctttaaagacttcctgctgtttggatac ttgatatgactggtataggtcatttctcttcagggaatcctccacgttccaatatctgga atcttgaatgattaatgaaatagctttggtaaatggtaact
AA
MNIRQQLVQLLNLAMVLSTAFMFWKGLGLVTNSNSP IVWLSGSMEPAFQRGDI LFLWNRDKYVDIGDVWYEVK GKPIPIVHRVLREHKVTNKDRKVRQLLLTKGDNNPTDDLSLYAHKSNYLDRDEDVLGTVKAYLPKVGYVT ILITE NKYAKLGLLGLMALSTLLTRE
SEQ0058
P. pastoris homolog of S. cerevisiae SPC1 (YJR010C)
Subunit of the signal peptidase complex (SPC), which cleaves the signal sequence from proteins targeted to the endoplasmic reticulum (ER), homolog of the SPC12 subunit of mammalian signal peptidase complex
S. cerevisiae null mutant: viable chr1-1_0491
5' region (in bold stop codon previous gene) ggtagtgatgtccatttctacacagatggatgctgtggatcattcaaacttgagtcaccg atggttcttgggcacgagtctgcgggaattgtcgttgaagttggctccgaggttaagtcg cttagggtgggtgacaaagttgcttgtgaaccgggtatcccgtctcgttacagcaatgcc tataaatcaggacactacaatttgtgtccagagatggcgtttgctgctactccacctatt gatggtaccctatgtcgctactttttacttcctgaagatttctgcgtcaagcttcctgaa catgtgtcattagaggaaggtgcattagtagaacctttgagtgttgetgtccatgctgca agacttgcgaagattacctttggagacagtgtagtagtttttggagctggncctgttggt ctgctggttgctgctacggctagagcttacggcgcaaccaacgttctcatcgtggatatt ttcgatgataaactgacactcgcaaaggacaccttacaagtggccacgcatagtttcaac tcaaagaatggtatggataatcttttggaatcatttgaaggaaagcatccaaacgtttcc attgattgtactggagttgaatcgtgtattgcggcaggtatcaatgcactggctccaagg ggagtgcacgttcaagtcggaatgggaaaatccgaatataacaacttcccattgggactt atatgtgagaaggaatgtatcgtaaagggtgttttcagatactgttacaacgattacaac ttagcagttgaactgatagcttcaggaaaagtcgaagtgaaaggattagtaacccacagg ttcaaatttaccgaagcagtagatgcctatgatactgttaggcaaggtaaggctatcaag gctatcattgacggcccagagtaaacggagacatatataataataaaattgtaaggttct cttagcattgcaccccaccctaggtttctttttagactgtt
AAAAAGAACCCTGTAACATGGTTACCAAAGAAATCCAAAATAGAGATTCAGCATTGA downstream ccgaatcagttccgaatgacctttaccttgggtttgatctaagtacccaacaattgaaaa tcacctctttcgaagggcgttcattgacacatttcaagacttatagagttgatttcgatg aggagttgagtgtttatggaatcaataatggagtttatgtcaatgaagagactggagaga tcaacgctccagtggcaatgtgggtcgaagcgttagatctgatcttttctaagatgcaga aggacaagtttccatttggcatcgtgaaagggatgtctggatcttgtcaacagcatggct cagtttattggtctaaagatgccccagatctcttatcatctctctctccatcgaaagatt tgaagtcccaactgtgtccgaaggctttcacattcgaaaagtctcccaattggcaagatc actcaaccggagaagagctggaaatatttgaaagaaaagcaggatctcctgagaatctgt ctaagattacaggatccagggctcattatcgatttacgggatctcagattagaaaattgg ccaaacgcgtaaatccagaactttataaggaaacatacagaatttctttaatcagtagtt
ttctaagttctctactatgcggaagaataactaaaattgaagagagtgatggatgtggaa tgaatatctacgatattcagaactcgaggtatgatgaagacttgttagctgttacagcag ctgttgatccagagatcgacggggccactgaacatgaacgtcaagaaggcgtagctaggc tgaaagataaactgcaagatttggaaccagtcgggtatcgttcaattggcactatagctg catattttgtggaaaaatatggtttcagtgaagatagcaaagtattctcctttacaggtg acaacttggctaccattctttccttaccactgcataacgatgacattcttgtttctttgg gtacatctacaacagtccttttggttacagagacttattgg
AA
MHSKFRWVCVDTQFCTHHQNLSPFSYISNPSPMSFS YLEGNIDFKGQELANRITKKLITFGAI ISFLVGFLSDNI
LYTVYTFAAFGLLTASLVIPPFSFYKKNPVTWLPKKSKIEIQH
SEQ0059
P. pastoris homolog of S. cerevisiae SPC2 (YML055W)
Subunit of signal peptidase complex (Spc1p, Spc2p, Spc3p, Sec11p), which catalyzes cleavage of N-terminal signal sequences of proteins targeted to the secretory pathway; homologous to mammalian SPC25
S. cerevisiae null mutant: viable
S. cerevisiae overexpression mutant: abnormal cell cycle progression; abnormal nuclear morphology chr2-1_0589
5' region (in bold start codon next gene) cagctctaggaacctcaacaaacatgaaaaacaaagaacatgtccacatttgatcatcct cggagcaattggttcttcattcaagcaaataggacattgagcatcgtttttggagatgat gactcgtaaaatatctctttgcttcaaaggaacattaggatctaaaacttgagctctgaa atctcccgtagaattcactataaatttgaaattcacattgatatattccatattgcttag atgaaccgaatcggatcttctccttggcctaggtagtcgtttgttaagtggtctgttatt gctggaataatcattaggtagtgtgtactcgactaaatggcttatattaaccaagttctt cctgttgctgggaaactttccaacatgatcatctaataaaagatcatcagctgatacctt agagtgttttgaccgacctttcctggacatcctgatggtttggacggtcccctgatttat ccagtacgcagagtagagatggagaaaggagacacatgcgtccatattggttggtgcact ttgttatcgggaactgtacagattaggatggtcaaagttcagaaggttaagaatagtaag cagtccaaacaagagaagacgaggactcttggttttgtggtacttgaccagaaaatgact tctagagctgtttgagacgtttaggtgtaaattacatacgatttgcccaagacttgagca ttatcgggcttacagcgcagtacacagttaatggacagaagagctaaaaaagcagcaaaa agagctacactagatggtagttctactgaacctcgttctcctcatgtcttctaacccttc atcaccagcgctgaacgttatatacacacttactggatgaatgatgaaactgtcattttt tgtcacgtgatcatctgaatatgtgatttcctaactcacgcttcttgatcctctaacctg cttcttctcaacacatcatcagtcacggacagagtttaaaa
GACATTAAATACGTTGGAATCAACAAGAATAACCAAAAGAAGATTTCAATTGAGGCCCGGGTTGAAAATAATACC
ACGC
TTTG
TGZ^ downstream (in bold stop codon previous gene) aattattgtattatctgattgtccatgccactgaactgtctcgcccattcgacaatagta aaaggcagtatacgtagaatgggaatggcccatgtatacataggccacatgaatgtgcct tctcgagtgtataaacaggcatctactattctagtagcaagcttttcagcctctaccact ggagccagaaagttatgtgtaacagaaacatccgaaaacatggctgtatctaattgtcca ggtaagaatgttgcaaaactgacgtcttgtgagcgaacttcatgtgataaactttcaata aaggctagaagtgcactcttggtggcagaatacatggataaattcctaggacccaccaat cccagtatagaacttatagttatgatatgaaggtttttccgtggtatatttggattcata
tagttactgctgtgcattttgatcatctttctcatcaatccaacatgtgacaaaaagttg acattaaatccatgcaggaaatgttcttgactggaattcataaaagcacctgaacctctt attgcagcattatttatcagtatatccaactggcctgcgtctcgtattatcttatcaaaa cttatgctcaactcatgagtatctgctatgttacatttatggtattccaatttttcatgg tgaaacttgggcttttcgacgtctaagatgtacaccttatcggatactgggactaatttt ttgactacttccattcctaaaccttgcgagccaccggttattagcactttgatcttatcc ctcttgaccttctttcttccactagcccatataatcagccatattatatctactgtcaaa cagaaacatgctggtaatccaagcagccagaacacccaggtcagcgcaggatgtatcagt gtcaattgagtccaatatatgacaagatcaatactggaccgacttgaactcaccagtggt tctggaacaccgagcattgacgaatatgcttttcgggccta
AA
MSGITKPVNLYSVAELRNATDDALPLALSKLGYEQSFSLIDTKLAIGYVATILAGSLYYLEKKYNNDFSNLTYYY SMVALVVGYFALNGLLWLHGKYREKDIKYVGINKNNQKKISIEARVENNTTPIYKVTVIENNRLLGKQDIPFTGI FDEDGFIHIDQLVEIFTRLLQEKEK
2 P. pastoris homologues of genes involved in the glycolysis pathway
SEQ0060
P. pastoris homolog of S. cerevisiae PGM1 (YKL127W)
Catalyzes the conversion of glucose-1-phosphate to glucose-6-phosphate, which is a key step in hexose metabolism; functions as the acceptor for a Glc-phosphotransferase chromosome 1-4, 0264 E.C.5.4.2.2
5' region catctatccaggtacatcgtagatgatcatttggatttttcaccacgtcactcacttggt attaatagttgaactcatattgaagacgttcacaagttctaatataatctatgagatcta aactagttgataaagtaactttggattcttgtggtgatcttatgaaaccattgcacagcg actactagggatttgcatcgaataacagtggatgttgacgacaaagagaaatagacggag cagatttattttattattccagttgaaaagcttatttaaatgattcaagccttgaccaca aatcaaatcattcaagtcatatcagaaagtcccttctgatgctggaaagcagcctttgaa ggacatttggcaactgatgtgatactctacctgcctaagagctctgtataaggattatag gctcctcctcatggggaaaaccaaggctgaacttgcttatctcctcacattcagaggtta ctaaaggcgtcaagaatccatctgaaaacaatcgggggcatcttgcaccagttggcgaac caggggaccgataactttcagattaaatgtcaggaaataccgcatgctgagatctgtgga aatatgcaagtcaagcgatatgcaggtgatgagatcgagaaagtgaataatcttaaatct gtttgccttgccaacacagccaatccttgaaccaaatgaaccaagtgagggcatgaaatt cctcagtaagcaagaattagcaggatttctgtttcttccctctcaaacgcctaatcttga gagcttggactaacaacgatcaagtaccgaaaaactactctaaacactgactcccactag tgacagaatcaatgaggagtatgggcattgttcttgatttataaatcattctttccaaat ctctgtgatgctactaaaaccaactatatggaaatagaaagtatattcagtcatttttgt ttgaaaacactgggcattttcattgcctaagctgacaatacacgattcttgttcattgcc gtgcaaaaaatgattacttcctaacgctaatctgtaatcagtctaagcccattccatgta gacgtttctctttatggttggagttcagaaaacaaccgattcaagccatacgttttgcga acaatcaggatttactccacagcaatccttcaaatctcatgcagcttttgcaaaccactt gaagtaacaacctccaaaattccaccaaacagacaatctagttttaatagttctaaaaca tcatgtggatctcgagatctgactattattcacgcaattacctaatgacggctcatgaag tcgccccttccattcaagaggggataggaagtagccgtaccaacaaacttagtctgtagc atgcggtctattcattctcaagttaaagacaacgtaaaagcatgcttagcacccttttaa tctctttgtcacctggaagagcagcagaaatggggggtctccatctgcaaagtacagaac aatcacctggttgcttcctttcgcgggagcactccacctttggggatgtcttattgtctc atctcttctaccccaaagttacctacctacccagcatacaaaactgttaagagtgctca
ORF
ATGTCATTTAGCCCTAGAATTGTTCCAACCACTGCCTTCTCGGATCAGAAGCCAGGTACTTCCGGT CTCAGAAAGAAGGTGACGGTCTTTCAGCAGCCAAATTACACCGAGAATTTCCTCCAAGCAATCTTT GATTCCATCCCCGAGGGTGCGCAGGATTCGGTCCTTGTAATTGGTGGCGATGGAAGATATTACAAC GATACAGTCGTTCAATTGATTGCCAAGATTGGCTTGGCCAACGGTGTCAAACGTATCATTGTAGGA CAGAATGGTATTCTATCCACTCCGGCTACTTCCCATATTATCAGGACGTACAAGGATGTCAAACCT ACAGGTGGAATCATCCTGACCGCTTCTCATAATCCTGGTGGACCAACTAATGATCTGGGAATCAAA TACAATCTCAGTAATGGTGGTCCGGCACCAGAAACTGTTACTGACAAAATGTTTGCCAAATCATTA GCCTTGACGGAATACAAGATTATTGAGAATCTACCCACCATTGACCTTTCTAAACTGGGACTCAGC AAATGTGGACCATTGGAAGTTGATATTATTCATTCAACAGAGGCTTATGTTGAAATGCTAAAGGAT ATTTTTGATTTCCCACTGATCAAGTCCTTCATCAAACGTAGATCCCCAGAGGGGTTTAAGGTTCTG TTTGATGCATTGAATGGTGTAACTGGTCCTTATGGAAAGTCGATCTTCATTGATGAATTAGGACTG CCGGAGTCTTCAATTCAAAATTGTGTTCCCAAGGCTGATTTTGGTGGGCTGCACCCGGATCCAAAC CTTACCTATGCAAAGACCTTAGTTGACAGAGTAGACAAGGAAAACATCGCGTTTGGTGCTGCCTCT GATGGTGATGGTGACAGAAATATGATTTATGGTGCCTCTACTTTTGTATCTCCTGGTGATTCCGTC GCAATCATTGCGGAGCACGCTGAATCTATTCCATATTTCAAGAAACTTGGAGTTCACGGCTTAGCA AGATCTATGCCAACAAGTGGAGCTTTGGATTTGGTTGCCAAAGCAAAGGGCCTCAATGTCTATGAA GTCCCTACAGGATGGAAGTTTTTCTGTGCTCTTTTTGATGCCAAAAAGCTGTCCATTTGTGGGGAA GAGAGTTTCGGAACTGGTTCTGACCATATCAGGGAAAAAGACGGTTTATGGGCCATTGTTGCATGG
CTGAATGTCCTTGCTGCATTTGATGCTCAGCATCCAGAAATTGAAGGAGGTGCCACTATTGCATTA GTCCAGAAAAACTTCTGGGAAACATATGGTAGAACTTTCTTTACCAGATACGATTATGAAGGATGC GAGTCGATCCCAGCCAACAAATTGATTGAATTCCTACAAGAGAAGGTTGATGATACCTCTTTTGTT GGTTCAGAGCTAGCTCCAGGCTACACAGTCAAGGAAGCTGCTAACTTCTCATACACCGACTTGGAT GGTTCTGTCTCATCAAAGCAAGGTTTGTTTGTTAAGTTTACTTCCGGCTTGAGATTTATTGTCAGA CTGTCTGGTACCGGATCTTCTGGTGCCACAATTCGTTTGTACTTGGAGAAACATACATCTGACAAA TCTAAGAACTCTTTATCTGCATCAGAGTTTTTAGCCGATGATGTCCGTTTTGTGTTGAACTTTTTG CAATTCCAGAAGTTTGTAGGTAGAGAGGAGCCTGATGTTCGTACATGA
Downstream (in bold stop codon previous gene) taattgaaagaaaattctattgttattatacattctatgtagtccaaagacgggcagtag catctgctccagatgtgaagacccatgcttctttgggatgccattccatattcaatactc ctagactctggattacgcgatgaccagttagcttcttcaaaggaactaacaatggattgg acatcagatcatcataaacggtaccgtggaaaatatgaaccgttccatcgtcggaagctg aaccgaataatggtagccctcctttgtggaaagcaacagctctgacagctttctggtgat acctcattgtcttgtatggctctgaggataaatctaaatcatgccatagcactctcttgt cgtaagaggcagtgataacattgtcacctcttggatgtatatcaatagaagccaaccatc ttgcctgagatactaatttcttcactaggacctgtttctgtagatcataaattctaatat atctttgtgatacaacaaatagctgaggcttgaaaggatggaacttagcgtccatgatga ctcctcttgatttgttgaatggggcctgcgaaatatgtcttgaaagctggtgcaccaaaa cagatgtttttccagcattaggcgaggaggtgacaaagtagtcacctttacggtgccatg aaatcttcttgatgacagagaccttcttcatagagacaaccaaggatattcctcgatcag cttgttctggggttggcttaaaccaggaagtgtgttgtttagtaacggcttcggtggtgt cttcgtcgtcagatgcgttggatggacccttggcagccttgcgaatatttccgtgtttag cataaccaaagccatgctcaatcttagacttcgcagcattttcaatatcaaatccaaaaa taggaggcacgattaaataaacgtgctctcctgcggcaattgctaaaattcctgagtttt cagetgggttccatgetatgcattcgataaaatcttctccc
AA
MSFSPRIVPTTAFSDQKPGTSGLRKKVTVFQQPNYTENFLQAIFDSIPEGAQDSVLVIGGDGRYYNDTVVQLIAK IGLANGVKRIIVGQNGILSTPATSHIIRTYKDVKPTGGIILTASHNPGGPTNDLGIKYNLSNGGPAPETVTDKMF AKSLALTEYKIIENLPTIDLSKLGLSKCGPLEVDIIHSTEAYVEMLKDIFDFPLIKSFIKRRSPEGFKVLFDALN GVTGPYGKSIFIDELGLPESSIQNCVPKADFGGLHPDPNLTYAKTLVDRVDKENIAFGAASDGDGDRNMIYGAST FVSPGDSVAIIAEHAESIPYFKKLGVHGLARSMPTSGALDLVAKAKGLNVYEVPTGWKFFCALFDAKKLSICGEE SFGTGSDHIREKDGLWAIVAWLNVLAAFDAQHPEIEGGATIALVQKNFWETYGRTFFTRYDYEGCESIPANKLIE FLQEKVDDTSFVGSELAPGYTVKEAANFSYTDLDGSVSSKQGLFVKFTSGLRFIVRLSGTGSSGATIRLYLEKHT SDKSKNSLSASEFLADDVRFVLNFLQFQKFVGREEPDVRT
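The correspondence between an ORF entry and its AA entry can be checked by conceptual translation. The following is a minimal sketch only, assuming the standard genetic code applies to these P. pastoris nuclear genes; the function name `translate` and the constants `ORF_START` and `ORF_END` are illustrative and not part of the listing. The two test strings are the first ten codons and the last four codons of the SEQ0060 ORF given above.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(orf: str) -> str:
    """Translate an ORF in frame 1, stopping at the first stop codon."""
    # Remove the block spacing used in the listing and normalise case.
    seq = "".join(orf.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    protein = []
    for pos in range(0, len(seq), 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":  # stop codon ends the protein
            break
        protein.append(residue)
    return "".join(protein)


# First ten codons and last four codons of the SEQ0060 ORF (illustrative excerpts).
ORF_START = "ATGTCATTTAGCCCTAGAATTGTTCCAACC"
ORF_END = "GTTCGTACATGA"

print(translate(ORF_START))  # MSFSPRIVPT
print(translate(ORF_END))    # VRT
```

Applied to a complete ORF entry, the same function gives a sequence that can be compared residue by residue with the corresponding AA entry. ORFs that contain an intron, such as SEQ0063, would need the intron removed first.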
SEQ0061
P. pastoris homolog of S. cerevisiae PGI1 (YBR196C)
Glycolytic enzyme phosphoglucose isomerase, catalyzes the interconversion of glucose-6-phosphate and fructose-6-phosphate; required for cell cycle progression and completion of the gluconeogenic events of sporulation chromosome 3, 0456 E.C.5.3.1.9
5' region (in bold start and stop codon of previous gene) agttgagataagaaggatgaatctgtgtagttatccgcataatcctgtttcaaccataat aacttcttccatggatgacgagccatttgagtggactcggcgcaaaagtgacaacgtttc tagagggctactcgaaagcactttctctggagattagtaatcgatggtcaggatgggaac cgaatgtctttggctggcctttccccaaatttgagaacaaaattcaattgctgaggtaat acttttttccccatacaagaaaggccatcgtgtaattatctcttaccttctactattcga gaaqttgtaaccaaatggctqataacatcgaaaccatccgtcctggaatcaactaccagg acaaactattgaccgaaatagacattcttgatgatgttaaacaactgtcaacgcagatgg aacaggaagggaaaggattcgtcttccctaaagatcattactccaagataaaatcactaa aggggttgcaagtgaagctgttgaaagaaatgcaagagttggctgaaattcageagteta ggcatcaagacgaaactgcatacaggcagaagcttcaagacttggaagatactacatcag aacttagaaacgttgctttgacgtaataggtcacctgtgttactagatgtcggatgtctg tgattgcagtaacacccggcattcggtcagctgtctacctttcaggatatcctcccccca ccgagctcggacgcctttctcgcccacccaccacgcaaagttccgcccaccttatacttt gtggaagtcttttcgccgcgcaccacacttgcccgttcgcagaaaaagaaattccaggta
gcccttattccgtcaggcaataaatatataagcgattgcagacaatctgacgtcccatcc ccttgtctatttaaacctcctaggttgcttaaatttaaaatctattcgagtcccaacctt tccattcttccggaataattcaactccaaccaattgataaa
ORF
ATGCCGTCTCTATTGCAAGAGGACAATGCTACTTTCAAGCTCGCATCCGAACTACCAGCTTTCGAA GAGCTAAAAGAGCTTTATAAGTCAAAGGGAAAGAACTTTTCTGCCAAACAGGCTTTCCAAAAGGAT CCAGCCAGATCTTCCAAGTTCAGCCACACTTTCAAGAACTTCGACGGGACTGAGGTGTTTTTCGAC TTTTCCAAGAACTTGATCGATGATGAGATTCTCGCCAAACTGTTCGACTTGGCCAGACAGGCAAAC GTCGAGAAACTCCGAAACGAGATGTTTGCCGGAGAACATATTAATGTCACAGAGGACAGGGCTGTT TTCCACGTCGCTCTGAGAAACAGAGCTAACCGCCCAATGTACGTTGACGGCAAGAACGTCGCCCCA GAAGTTGATAGTGTTTTGCAACATATGAAGGAGTTCTCTACGCAGGTTCGCGATGGTACCTGGAAG GGATACACTGGTAAGCAGATCACTGATGTGGTCAACATTGGTATCGGAGGCTCTGACTTGGGTCCA GTCATGGTGACAGAGGCATTGAAGCCTTACGCCCAGGAAGGACTGCATGTTCACTTCGTATCCAAC GTGGACGGTACCCATATTGCTGAGACTCTAAAATACTTGGATCCTGAGTCTACTCTTTTCTTGATT GCATCCAAGACTTTCACAACCGCTGAAACCATCCGTAACGCCAATACTGCTAAGGACTGGTTCCTT TCGAAAACTGGTAACAAAAGTGAGGCAATTGCCAAGCATTTTGCTGCTTTATCCACAAATGCCGAG GAGGTCGCAAAGTTCGGTATCGACACTAAGAATATGTTCGGTTTTGAAAACTGGGTTGGTGGACGT TACTCTGTGTGGTCTGCTATCGGTCTTTCAGTTGCCATCTACATTGGTTTTGACAACTTTGAGGAC TTCTTGAAGGGTGCCGAAGCCGTGGACAGACATTTCCTGGAAACTCCTCTGGAGCAAAACATCCCA GTTATTGGTGGACTACTCTCCGTTTGGTATACTAACTTCTTTGGAAGTCAGACACATTTGGTCACT CCATTTGACCAATATATGCACAGATTCCCTGCCTACTTACAACAATTGTCCATGGAATCCAACGGT AAATCTGTTACCAAGGGCAATGTTTTCGCCAACTACAGCACCGGCCCTGTCGTCTTTGGTGAGCCA ACAACAAATGCTCAACATTCATTCTTCCAATTGGTGCATCAAGGTACTCATTTGATCCCTGCCGAT TTCATTTTGGCTGCAAAATCCCACAACCCTGTTGCAAACAACGCTCACCAAATCTTGTTGGCATCT AACTTCTTGGCTCAAGCCGAGTCTTTATTGCTAGGAAAGACTGAAGAGGAAGTAGCTGCTGCTGGT GCTACTGGTGGTCTAATTCCACACAAAGTATTTTCAGGTAACAGACCAACTACATCTATTCTGACA CAGAAAATCACTCCCGCAACCTTAGGTTCTTTGATCGCTTATTATGAGCACGTCACATTCACCGAA GGAGCTATATGGAACATCAACTCATTTGATCAATGGGGTGTTGAGCTAGGAAAGGTTCTAGCCAAG GCTGTCCAGAAAGATCTGCAGGATGACAGTGCCAACGTTGAAGAAAGCCACGACTCATCCACTGCT CAATTGATCAAGAAGTTCAAAGCTTGGGCTTAA
Downstream (in bold start codon next gene) ggcgttctagttatagagatgtatattgtaatatggtatatgagtactaaagaagtgatt atctaataaaaatttgagtaatgcactgacattttcactatcagttttggaagagggtat qqqacqcqttcattcacacqtctttqqtaattatgqccqaaqaaqttgaccccqttaqtc tgggtgaccagacgattctccgctataaaatatggaagaagaactctccatacttgtatg attatttccaaagcaagtctctgctgtggccctctttatctgttgagtttttgccagaca ttgaacgaaatgacgaagatgagttcgattaccaaaggcttatttttggaacatttacgt cgggagccagcaatgagtttttgaactttgggatgtttagtagacacaacgaagtctctt tgagagagtcactgaggaactctctggacaattttgacagcgtcaaaggagaaatatcac cactagtattaccatcttccaaagactccaaaaactctaatcgcagctgcgaaaagttga gcatcatccaacgaatagcacataatggagaagttaataaatgcaaatatcttcctcaaa atcccgacatcatagcgacaattaataattatgggagtgtttcgatttttgatcgaacaa aacatccttctcaaccactaagcggcacaattaaaccagatatttactgtacatatcata aggatgaaggttcctgtttgagttggaatcctagcgttgaaggggaactgttgtcaggct caatggacggaacggttgtattatgggatatcaaaaagtacacgagggacaaagattctc ttgatccatacaagatattcattgctcatgacaatggctgcaatgaccttaaattcatcc ctagacacacatcaatttttggttctgttggagaagatggcttttttaaactttgggata ccagacagggactggatcccgttaagtcaacacgacttcat
AA
MPSLLQEDNATFKLASELPAFEELKELYKSKGKNFSAKQAFQKDPARSSKFSHTFKNFDGTEVFFDFSKNLIDDE ILAKLFDLARQANVEKLRNEMFAGEHINVTEDRAVFHVALRNRANRPMYVDGKNVAPEVDSVLQHMKEFSTQVRD GTWKGYTGKQITDVVNIGIGGSDLGPVMVTEALKPYAQEGLHVHFVSNVDGTHIAETLKYLDPESTLFLIASKTF TTAETIRNANTAKDWFLSKTGNKSEAIAKHFAALSTNAEEVAKFGIDTKNMFGFENWVGGRYSVWSAIGLSVAIY IGFDNFEDFLKGAEAVDRHFLETPLEQNIPVIGGLLSVWYTNFFGSQTHLVTPFDQYMHRFPAYLQQLSMESNGK SVTKGNVFANYSTGPVVFGEPTTNAQHSFFQLVHQGTHLIPADFILAAKSHNPVANNAHQILLASNFLAQAESLL LGKTEEEVAAAGATGGLIPHKVFSGNRPTTSILTQKITPATLGSLIAYYEHVTFTEGAIWNINSFDQWGVELGKV LAKAVQKDLQDDSANVEESHDSSTAQLIKKFKAWA
SEQ0062
P. pastoris homolog of S. cerevisiae PFK1 (YGR240C)
Alpha subunit of heterooctameric phosphofructokinase involved in glycolysis, indispensable for anaerobic growth, activated by fructose-2,6-bisphosphate and AMP, mutation inhibits glucose induction of cell cycle-related genes chromosome 2-1, 0402 E.C.2.7.1.11
5' Region (in bold start codon previous gene) gctatgttcattcttcacgttattatcggtttggtcaacctggactgtggtattttcagt agggtcgttatggccattttcattgacattttcattagtcaacttgtcgccgccatcatc gtcgtcctgatcctggtatgcctggtcaaagatagcatcctcgttataatttaccttttg ctttttacgcctgggaggtaaaatctcatcttcttcgttggaatgtttctcctcggtgag aatatcactcgtttgacgacgatgcctagctctaactctgggggacgattccacggaatt gttttgggacggcgtccttctgcgaaccatctaccttctgaatgtatcaaaagtactcaa ttgaagaccagtaattcagatatgatactttgaggacaaattaatccgactgtgctgtct tcagtggttcgaagatctagttactaggcaccaaagtgaaaggtgatgaaagagacgtac gaaaatggtaactgatagaccatacgcgaagtcagcgcgctttttcattagccttgaaat ggtcgtctgggaaccgctcgggacaccgtctcacgtactaagatcgaggctgacagagtg aactacaacttttagcttttagaacgttgaaaaatcatcgattttggaaaagattcagct gctcttgagcaattaataaaaactagataatcgcaagcatatattatgctttaaacgcac cagttaggcccgaatttgttcctgaatttctctgctccggcgatgagctattatcgtaag gtgattttcttagtctcgaaggtgtcaaatccataaaaaagagcgatgcagccagggaga agatctttcttctcttgctataaatcaaacattattcatacagagtttaatcgattcaac aaagcataacattgattgcaattggttccgtcttgaactgctaaggagaacaatcataaa ttagtattttgtttgcttgttagactcaaatcgaattacag
ORF
TACTGTGATGCTGTGAAACAATCTGCTTCTGCTAGTAGAAGAAGAACATTTGTTGTGGAAGTTCAAGGTGGATAC
\GGA(
Downstream (in bold start and stop codon next gene) aagacggatcaaatcggttgtttgggtactaaagacaatccattttttttctttctctcg agctggatgaaactagtgcatgtacgaatccgtgtgtaatctactgggatgcttatttta cgcatttttgttgataaaaatagagacctactactactcctgatttcaagcctttctacc tgtaagtttcttttttttttgctggtgacaatagcctttttttttaccttttttgccatc gttgcgtcctgtatagcttcttaaatgtctccgcaacaatttttatgggagcaatagatt acctctggttttactcgtaacctctagggttaataggcccattaaattggctatgggaat gcgagtgatatggcttcaactatagagagtgcggattattgttcattttctactgtatgc aagatgtggcatttttcttattgtggtgaaaggtaccgctgcccctcaattggttgcata ctgaaacgacaagttgagtcgagtccctggaaaatcctaccagttgcaggcaaaataagt caaccacgccgatcatcacagcgaaaagagttgcatttggttttgagagacggtgttttg gtaaacagcaagccccaatgcaaacgatcataccgatctgccatggcagcctgttggtca catgaaacacacaggagcacaggggaagttttattatcatgcgcggtgcaataaatgtgc atgagttacgcatttcggatatctgctccttatataaaagtgtttacagagaccatagta tgatgcgcgtatagtctctgtatctaccttgcttttgttctgcacaacaccacatttttc ttggaccacttctagcataatccgtctgtgaaaaatgcacaccgcacttatctttcacgt atttgaagaaaagcctttttcatcgatcgaatgatattgcacttggacgcacgaacacac ccattaccaataagaatgttcaaaaaaaaacaaccatcatg
AA
MPEPSISALSFTSFVTNDDKLFEETFNFYTKLGFHATRSYVKDNRSDFELTGISTDSIKE IWLESFPLΞEWETS AGRELRKPLQESVGYQSEALLGYSPYQSDGWIKLRLSNHDLQKNKDLPGEVTFFTASIDKLRAKLIE IGAEI IP SEIDLVEFSTKDPMGDVISFSSYPSLSSKKITSPDFFLHPKKEVRSQESIVEQVKSEEGKKKIAIITSGGDAPGM NAAVRAVTRAGIFYGCKVYACYEGYTGLVKGGDMLKELQWQDVRGLLS IGGTI IGTARSKEFRERWGRLQACYNM VSNGIDALWCGGDGSLTGADLFRNEWPELIKELLGEGKITKEQYETHRNLTIVGLVGSIDNDMCGTDSTIGAYS SLERI IELVDYIDATAASHSRAFWEVMGRHCGWLGLMSGIATGADYIFIPERPPSETNWKDDLKKVCLRHREKG RRKTTVIVAEGAIDDQLNPITSEEVKDVLVEIGLDTRITRLGHVQRGGAPCAFDRFLATVQGVDAVRAVLESTPA IPSPVISILENKIVRQPLVESVAQTKTVSDAIEAKDFDKALKLRDQEFATSYESFLSVSKYDDGSYLVPESSRLN IAI IHVGAPTSALNPATRVATLNSLAKGHRVFAIRNGFAGLIRHGAVRELNWIDVEDWHNTGGSEIGTNRSLPSD DMGTVAYYFQQYKFDGLI I IGGFEAFTALYQLDAARAQYPIFNIPMCCLPATVSNNVPGTEYSLGSDTCLNTLSG YCDAVKQSASASRRRTFVVEVQGGYSGYLASYAGLITGALAVYTPENPINLQTVQEDIELLTRTYEEDDGKNRSG KIFIHNEKASKVYTTDLIAAI IGEAGKGRFESRTAVPGHVQQGKSPSS IDRVNACRLAIKCCNFIEDANFQVKHN ANLSADERHLRFFYDDGVKTSAVSGKSSVIDDNTSVVIGIQGSEVTFTPVKQLWEKETHHKWRKGKNVHWEQLNI VSDLLSGRLSIRTT
SEQ0063
P. pastoris homolog of S. cerevisiae PFK2 (YMR205C)
Beta subunit of heterooctameric phosphofructokinase involved in glycolysis, indispensable for anaerobic growth, activated by fructose-2,6-bisphosphate and AMP, mutation inhibits glucose induction of cell cycle-related genes chromosome 1-4, 0047 E.C.2.7.1.11
5' region (in bold stop codon previous gene) attctagcaccactcaagttacaaacgattattatgatcaacatctaatgctaaaggatt tctttgaaagaaaagtgattcgaaattcacaattcatccaaagcgtatcgtatgccccta aagtgttccattctattcgaatgtttcaaagttcagatgaaatgtatatcaatcagtatc tccaacaattcaaaagatgtgatttacggatctccgatacactgtacttcatcaaagacc attacatgctgaaaggtattcgagagatcaatttagatcttttcttacaggcaaaattac atcctgaactgttaggtggccggcttattggcttttatgagataagggcaccctggggaa aatttgagtttatggctattgaatgtgattgtcagcctcaccaatctatggttaatatgt tcctgcagcgcgactaatactacgaatttgcgcgcctgcaactctcctctctcagtctgg ttcttgaagatgtgtataagtctcgcaatgaaacaaatgttctgggagatggatcgttgc
tcaatatgaatatcggattagtaaagtataaacttataagactttcagtggttttcacat tgacctaggcttcttataacaaggcgactccgagaacaaaagaaaaatagaatggccctc aaagtagcttatgtgagatctgggcttgattctagatgtggtaatacttgatattttctt ttttgattattcttccatgctatcaataagtgaaaaaacagtctgtctttctttggcggt gaatatgtgtccattatgcacaaattccgatgtccgtatcggatatcggacctccataat ccataatcgaaccataagtgtgcttatgtaatcaattttttaccctgctgcgaatatctg acagcttttaattacggctccctccaacttatcaacctttcctagttatattccatttca agatttacaaaggtaaataacatctcaagcttttgttaaag
ORF (in bold intron)
ACTGATTTTGAAGTCGGAGAATGCATCGAAGgtactcactacggaagtaatctccactatcatagatgatgaagc tagtggcagatttgactcaaaaacagCTATTCCGGGGCATGTTCAGCAGGGTGGTATTCCTTCTCCGATGGATCG
GAGTAACGTACGAGAAATCAGTGATATGTTAAGTGGAAGAACCTCGTTATAG
Downstream (in bold start codon next gene) gcattcttacataaggaacatactgaaagtatggcaatttttgttagcaatctcatgaac acttttttctatttaatttctatctactttgttacatgagcgctatccaacaataaatgc cgcaagctacaaaattagttcaggaatataacttgtccatcttgtaacttaaataaatct tgcttcaacagtaacaacatggttctgtggtctagtggtatgatacctgcttcacacgca gggggtctccagttcgaacctgggcagaatcatctttttgaagaataagcgatagatgtt tgtcgccactcaaaggctttattatgaaaggcaaactgaagtccttcagacttcaaggta aatttttgaatttagtgcgtaatcttagggcgtatgtttgatacttaagtataaacttta aagtacacttgaagtatctacatgaccaatgattacataaccacatcgctatgttggtca aagaaaatattaacaacaaaagctcccatttcaagtaaccactcgatgtctctttcatat cccgttctcctctccaagttttctgagctgtttcacttgggaggtaaatacccaccagcc cttgtttccggcctcgttatatttttgatatctttttatatcgttggagttttcacttct
ttgagatcaagtttggcacttgatccttgggctctctatcatttgaatcttaacaagata tcactataccctttggttcactcatccttcctccaccttttcttcaacatttttgctttg atctctccactgtcgttatatgaaagatcgaatggtacagttcatacaggagtggtcctc aacgttttggcagttgttactgcattgccttactgtgttttaggaatggtatttttcccc aaagttagtgtagtaggagcatctgcttggtgcttttcattctttggttattattcttat ttgcagagtctcagttatcccactttcaaagtacaagatta
AA
MPDASLFNGTSF ITLFAPNISLFQAS IDFYTDRLGFAIKETSNQKLVWLQLEEDSNNVSIQLLLDPEHAASVSQI DQNIRNLTRSLYRKDWRSIQSNIAFKSSSLSKLVFLLKDGGHPVQQSPNEISPFEVYTLDPLGSLIGFSGFKNPF AVNERSLLPKVSEEKAYRTEDDSEKLFTPIRKTIGVMTSGGDSPGMNPFVRAWRAGIYKGCKVFCIHEGfEGLV RGGEKYIKETQWHDVRGWLVEGGTNIGTARCKEFRERSGRLKACKNMIDMGIDALIVCGGDGSLTGADRFRSEWP SLIEELLQTERISQQQFETfQNLNICGAVGSIDNDMSSTDATIGAFSSLDRICRAIDYIDATANSHSRAF IVEVM GRHCGWLGLLAGLATSADYILIPEKPASSREWQDQMCDIVSKHRARGKRKTIVIVAEGAISNDLSPISCDQVKDV LVNRLGLDTRVTTLGHVQRGGTAVAFDRIYATLQGVEAVNAVLECNADTPSPMIAIKEDQITRVPLVDAVELTQQ VAKS IESRNFKRAISLRDSEFVEHMKNF ISTNSADHVPPSLPLEKRKKVAI INVGAPAGGMNSAVYSMATYCMSR GHVPYAIHNGFSGLARHESVRS INWLDIEGWGSLGGSE IGTNRTLPNDADIGMIAYFFEKYGFDGLILVGGFEAF ISLHQLERARINYPSLRIPLVLIPATISNNVPGTEYSLGSDTCLNSFMEYCDVIKQSAAATRNRVFWEVQGGNS GYIATHAQLACGAQISYVPEEGISLAQLEMDINSLKESFANDQGKTKFRQTDFEVGECIEAIPGHVQQGGIPSPM DRtfRASRFAIRAVSFIEP HSDKCQAFKNSISFRQTDEITSTAWLGIHP SQLRFTPIRQLYDFESDVPRRMRKNI FWSNVREISDMLSGRTSL
SEQ0064
P. pastoris homolog of S. cerevisiae TDH3 (YGR192C)
Glyceraldehyde-3-phosphate dehydrogenase, isozyme 3, involved in glycolysis and gluconeogenesis, tetramer that catalyzes the reaction of glyceraldehyde-3-phosphate to 1,3-bisphosphoglycerate, detected in the cytoplasm and cell wall chromosome 2-1, 0437 E.C.1.2.1.12
5' region (in bold start and stop codon next gene) tctgctactctggtcccaagtgaaccaccttttggaccctattgaccggaccttaacttg ccaaacctaaacgcttaatgcctcagacgttttaatgcctctcaacacctccaaggttgc tttcttgagcatgcctactaggaactttaacgaactgtggggttgcagacagtttcaggc gtgtcccgaccaatatggcctactagactctctgaaaaatcacagttttccagtagttcc gatcaaattaccatcgaaatggtcccataaacggacatttgacatccgttcctgaattat agtcttccaccgtggatcatggtgttcctttttttcccaaagaatatcagcatcccttaa ctacgttaggtcagtgatgacaatggaccaaattgttgcaaggtttttctttttctttca tcggcacatttcagcctcacatgcgactattatcgatcaatgaaatccatcaagattgaa atcttaaaattgcccctttcacttgacaggatccttttttgtagaaatgtcttggtgtcc tcgtccaatcaggtagccatctctgaaatatctggctccgttgcaactccgaacgacctg ctggcaacgtaaaattctccggggtaaaacttaaatgtggagtaatggaaccagaaacgt ctcttcccttctctctccttccaccgcccgttaccgtccctaggaaattttactctgctg gagagcttcttctacggcccccttgcagcaatgctcttcccagcattacgttgcgggtaa aacggaggtcgtgtacccgacctagcagcccagggatggaaaagtcccggccgtcgctgg caataatagcgggcggacgcatgtcatgagattattggaaaccaccagaatcgaatataa aaggcgaacacctttcccaattttggtttctcctgacccaaagactttaaatttaattta tttgtccctatttcaatcaattgaacaactatcaaaacaca
TTCACCACTTTGGAGGGTGCCCAAAAGCACATCGACGCCGGTGCCAAGAAGGTCGTCATCACTGCTCCATCCAAG
Downstream (in bold start codon previous gene) atcgatttgtatgtgaaatagctgaaattcgaaaatttcattatggctgtatctacttta gcgtattaggcatttgagcattggcttgaacaatgcgggctgtagtgtgtcaccaaagaa accattcgggttcggatctggaagtcctcatcacgtgatgccgatctcgtgtattttatt ttcagataacacctgaagacttttgggtcggaggactggctctttccgatcaaattggaa tggaaaattgctcctctaagaaagggtgccaacactctttgtaacacaggacaccgttta ttgctaactcgattgcattctttcctttcccacaccgggatctggtcttggtgaacaatc tctcctgtccttatctaaatatatcatcgcactgtaaccttccttattacttttcgagcg tccgtcctgtattatcttcaacctgaaaccaaactctaaccaggcttcactcgtggatct ataattgaacatgaaaaacttctcttataccagtcaaccgggaactccgttggactacaa ctatggtcggtccatgctgccgtcacatttgttgacaccctttctagctacaccagttct tccatcccagccaagcacgccatatattgatcaacacatgagcaatctggatcagaaata tttaacggagcactaccagttgctgcatgaggtgaatcactctagatccttgttatttga gaaattgcccaaggagttgtctctgaaagagtttttggacgcgttcaaatcgagtcagtt cttggagaatgtgaaaataaaagaggatagtggtgaagggacatccgtgttggtgcagtt cgtcaacgatgagtctgctttagtctttctgcaaaattttgactggagggactttgtttc ttcaacaggctttgacaatctcaagatctcatggtgtttcaataactcatccgatgacta tttggtagatgggacttatatggataggattctgaatctga
AA
MAITVGINGFGRIGRLVLRVALSRADIKVVAINDPFIAPEYAAYMFKYDSTHKAYKGEVSASGNKINIDGNEITV FQERDPVNIPWGKAGVDYVIESTGVFTTLEGAQKHIDAGAKKVVITAPSKDAPMFVVGVNEEKYTSDLNIVSNAS CTTNCLAPLAKVVNDTFGIESGLMTTVHSMTATQKTVDGPSHKDWRGGRTASGNIIPSSTGAAKAVGKVIPELNG
QLTPSFVKLISWYDNEYGYSTRWDLLQHVAKA
P. pastoris homolog of S. cerevisiae FBA1 (YKL060C)
Fructose 1,6-bisphosphate aldolase, required for glycolysis and gluconeogenesis; catalyzes conversion of fructose 1,6-bisphosphate to glyceraldehyde-3-P and dihydroxyacetone-P; locates to mitochondrial outer surface upon oxidative stress E.C.4.1.2.13
2 homologs
SEQ0065 chromosome 1-1, 0072
5' region tgcactttgaccttaaaggggctgatttaaggttatgccggggaagaagaaatagcgcga tgagcaaagtcgatgcctaaaggagtggttttgctacctcatttaagaagagaataggac gtgcatccagcgatgcgtgctaggacaaagaaccgcacttggcgggtacaaacctgacgt catttcctgatattattgacatttgagctgaccaattaaggtgcccatccacaatagcca cctggatagcggaatgcacccccattgagttgatcaaactaccattttgcttatacctca agttaatgttgaactaccattcttcacatgctcctcctagatcccctgtcccctttctcc ccctctttcatcctttaatttgcatttcttgacggtcttctatccctagaaagtttggaa cgcctgctatatggttaggacacgactgactagctataaaatttttcagaccagactctt tctcttcttaacgcaaatttaacaggcagacaacaacataggaaagaatcaccatatagg ttggactctttacagacgtccttggccgttgaccatggtggtacagttgtccaagttcta caagtttgtctgaagaatgaagttattggtcttgggtgcagctttccatctgttcgattt attcggctaagagtttaccattgtgtgctcgtatggggaagggtgcaaggatcagtaata cagtcgaacctggagtatctaccatagtggggatacaatgtagtttatctgttatctcga ttgttcctaattaaggttttctttgatcctcttctagtccacacctcctagatgacattc gagctgcctggattggatgcctaggtttattgcctagttcaatacaattcgtgcgggcta cagtagaaggcccttacataatccggaaagcatggtcccccaccaaattgagagcttttt cagccttcactggtggtatcattttcgggagataataaggtttcgattgggaattcccac cagagaacactatagagggaccaagctgatgctagcctgacatccccaaagcacacttcg
taattgaaaaccgttacctctagcacactgtccagactacccccgtcaaaaaaacgctct ttttctcgactaattgagtcttcaactcatcccgtccttgccgaattacttgaattcatt tcacacctccgttgcttacgtactctcaccggtctccggtgtacatggatccgctattgc cagatatttctcatacaacaatcaccagatcaaggtcgtgaacggaccaatggcatccag agcaatcctgaacagataggggtccgggctgtataaagtgaaataacgtgacttgaacca gcaactatgtcccagttgtgctacacttaacacgcgattaccccggagctcaccaggcct cttccccctctcattggaaccctcctagcgcttcgaaataatggctgcgtactatttaac tggtgccagttcccgctgacaatatcctttttcttctccccttagttccccacatatcaa ttgaacatattttttacaca
ORF
ATGCCAGATCAATTATCATTTTTAAAGAGAAAGACCGGTGTCATCGTTGGTGACGATGTCAGAAAC CTGTTCCTTCACGCACAGAAGAAGGGCTTCGCCATTCCAGCTATCAACGTCACCTCTTCCTCCACC GCCGTGGCTGCTTTGGAAGCTGCTAGGGACAACAACTCCCCAATTATTTTGCAAACTTCCCAAGGT GGTGCCGCCTACTTTGCTGGTAAGGGTGTCAAGAACTCTAACCAGGAGGCTTCCATTGGTGGTGCC GTTGCTGCTGCTCATTACATTCGTTCCGTAGCTCCATTGTATGGAATTCCAGTTATTTTGCACACT GACCACTGTGCCAAGAAGCTTTTGCCTTGGTTCGAAGGTTTCTTAGCTGCCGATGAGGCTTACTAC AAGGAGCACGGTGAACCTTTGTTCTCTTCCCACATGTTGGATCTGTCTGAGGAGTCTGATGACGAG AACATTGCCACTTGTGTCAAGTACTTCAAGAGAATGGCTAAGATGAACCAATGGTTAGAGATGGAG ATTGGTATCACCGGTGGTGAGGAAGATGGTGTCAACAACGAGAATGCCGACAAGGAATTGCTGTAC ACCAAGCCAGAGACCGTCTTTGCTGTTCACAAAGCTCTTGCTCCAATTTCTCCAAACTTCTCCATT GCTGCTGCCTTCGGTAACGTGCACGGTGTTTACAAGCCAGGTAACGTCGTTCTTCGTCCTTCCTTG TTGGGTGACCACCAGAAATACGCCAAGGAGCAATTGAAGTCTGAGAGTGACAAACCATTGTTCTTG GTCTTCCACGGTGGATCTGGATCTACCCAAGAAGAGTTTGACACCGGTATCCGTAACGGTGTCGTC AAGGTCAACTTGGACACTGACTGTCAGTACGCTTACCTCATTGGTATCCGTGACTACGTCTTGAAG AAGAAGGACTACATCATGTCCCAAGTCGGAAACCCAGAGGGTGACGACAAGCCAAACAAGAAGTAC TTCGACCCAAGAGTTTGGGTTCGTGAGGGTGAAAAGACCATGAGTGCTAGAGTCACCGAGGCTTTG AACATTTTCCACACTGCAAACCAGTTGTAA
Downstream (in bold start codon next gene) gtagtagtagtagtagcagagtatctacagtggtgtgtataatgtatatgagtgtactta ccaaccaaattccgtttagtatttcgtcgacgatgatgtagtacgagtccttcgcgaatc cgttactctcaagacggggaaaaaaaacgacgaaaatgaccaacttactcaactaagcaa acctcaagaaacataacactttgttgtgagacagtaataaaaagctcacagcgtacacat caccactcatataatgtcagagagaagcagcaagaaaggccctaaaggcggcgcgaagcg ttcgtcacagggctcctctcaggggctggaaagcactaaactggccactttgaccgaatt gttcccagattggacggcacaagacttggagcctgtgctggaggaatatccagatgaaga cctcaatgtgattatagaaaacattatcagcggaaaaataaacaaatggactgatccatc agctaagaaggagaagaaaaagagagaagaatcctttaatgcaagtgaagaattatcaac tccctcttatcaccaaacacctaacagcgcaaagaaagagtatcctaagaaagaagttaa ggctaaatccaagaagtctcaaccgcgttctacgacatccacgactactgcatctactaa agctcaactgacgccatcgtctaatccaagcacaaaaagttcatgggcggctgctctaca tcagaaacaagaggataaaccttcttcaaccgtaactcccactactgaaaccgaaactcc aaatggcgaaaacgcatctcagtctccagttgccgagacaaagtctgaacaagaagagtc ttttgcccccgccgcagttgttgaaacttccgctaaaccaaagtcttgggctgctatggt cgctcaatctgctaaacccaagaagaagattttgaaaagacctgaacaagctgcaaagcc ctctagcaacgaggaattgtcgcaacaaaatggggaaattc
AA
MPDQLSFLKRKTGVIVGDDVRNLFLHAQKKGFAIPAINVTSSSTAVAALEAARDNNSPIILQTSQGGAAYFAGKG VKNSNQEASIGGAVAAAHYIRSVAPLYGIPVILHTDHCAKKLLPWFEGFLAADEAYYKEHGEPLFSSHMLDLSEE SDDENIATCVKYFKRMAKMNQWLEMEIGITGGEEDGVNNENADKELLYTKPETVFAVHKALAPISPNFSIAAAFG NVHGVYKPGNVVLRPSLLGDHQKYAKEQLKSESDKPLFLVFHGGSGSTQEEFDTGIRNGVVKVNLDTDCQYAYLI GIRDYVLKKKDYIMSQVGNPEGDDKPNKKYFDPRVWVREGEKTMSARVTEALNIFHTANQL
SEQ0066 chromosome 1-1, 0319
5' region (in bold start codon next gene) aactctacccaggattatttttcttctgcgaatacaaaactgcttatatgtcacacggat
aactcctcttttaacgagatagttgacttctattaaaaagtccgcatagttagatttacc tccatcttgagttagaagatgaaccttttcattatagggtggatcataccaattccaacc atctggggccaaggttgaaaaactgggggttcttggaaagttatagaagaacaaagttcc tattctggcaaaattatgaaggtcattggtgacattggtaacacttgtcagatccataaa ttaatccataagataaggcaaatgtgcttaagtaattgaaaacagtgttgtgattatata agcatggtatttgaatagaactactggggttaacttatctagtaggatggaagttgaggg agatcaagatgcttaaagaaaaggattggccaatatgaaagccataattagcaatactta tttaatcagataattgtggggcattgtgacttgacttttaccaggacttcaaacctcaac catttaaacagttatagaagacgtaccgtcacttttgcttttaatgtgatctaaatgtga tcacatgaactcaaactaaaatgatatcttttactggacaaaaatgttatcctgcaaaca gaaagctttcttctattctaagaagaacatttacattggtgggaaacctgaaaacagaaa ataaatactccccagtgaccctatgagcaggatttttgcatccctattgtaggcctttca aactcacacctaatatttcccgccactcacactatcaatgatcacttcccagttctcttc ttcccctattcgtaccatgcaacccttacacgccttttccatttcggttcggatgcgact tccagtctgtggggtacgtagcctattctcttagccggtatttaaacatacaaattcacc caaattctaccttgataaggtaattgattaatttcataaat
ORF
GCTCTGGAGGTGTTCCATGCTGCTGGTACCTTCAAGTCTGAGTCAAAACTGTGA
Downstream (in bold stop codon next gene) ataaaatttctctttattctttaagtctataggctattatacttatccaggtagttact gtagtaagcagaaaggagacctcttggagttacaagaaaaactcgcagtggttcctttga actcgttcctgttgtcgagatacccatacccgtaagcttattaagctttgacgcaataga tgattgtttacgtcttgcaggctttaccagtagctttccggaccatgtttcgttggaagg gagatgctgaatacacttgatagagtctcggattattcttctcctggaaggtattttttc atttatgggcctccacttaatttcggtttctgttgagctcgctgtggttatagaagatcc tgtgacattagaagatgaaaatgttaattgagtggaatcggctatggaaggtgaagtact ggcggaaagatgaaccatgtctagatcgtgatttaaatcttccactctaagaaactgttc tataggttccagtctttcttcctcttcttcatcttcatcttcatcgtcgtcgtcttcatc ctgaggaggatactcatcaagatcatcctgatcgtcatgattagtattttgattttgggc ttcatcgactgtgtcatcacaggaacaaccctcaggaatatcattttccatcaacactgg agcaaggcagtcaatttcatcaagagtatcgtcattactttggcccatgagaaattttgg tttctgccccggacgcaatggagagtgttcaactgaactggaaatagttggggtgacagt aggtgtaagaggtaaatcgttggaactggatatagtgagattaagtcgttcgttcaagga agacggcaaagtgggagttgttgcaccagattttgatctcttcctaaagaaatcagacat agatctcgatctatatctagaggcagcgtctgatgatcttgtctccgtagagcttagtat ccttgggttggtgttccttcgggtccttcgaggaactgttct
AA
MSTFDFLSRKSGVIVGDDVRKLFEYARERKFAIPSINVTSSSTAVAVLEAARDNKSPVMLQVSQGGAAFFLGKGV NNKDLSASVTGSIAAALLIRTIAPSYGIPVILHTDHCQKKWLPWFDGMLDADEEYFKTHGEPLFSSHMLDLSEET DDENIAICVKYFKRMTKMNQWLEMEIGITGGEEDGVNNENADKDSLYTSPETVFAVHKALAPISPNFAIAAAFGN VHGVYKPGNVELRPSILGEHQAYAAQQLGLKNGSKPLFLVFHGGSGSSQQEFNTAINHGVVKVNLDTDCQYAYTI GSRDYILKNKDYLQSMVGNPQGADKPNKKYFDPRVWIRESEKTMSGRVKEALEVFHAAGTFKSESKL
SEQ0067
P. pastoris homolog of S. cerevisiae FBP1 (YLR377C)
Fructose-1,6-bisphosphatase, key regulatory enzyme in the gluconeogenesis pathway, required for glucose metabolism chromosome 3, 0868 E.C.3.1.3.11
5' region (in bold start codon next gene) ctttgatgatgatttgttgtgaacgttccatgtgtttgaatctttggacggccggtaaaa accgactgtttttaatgattgctggttagccttgtcgacaaatgatttgtatgaagctct gaccactgaaaggatttcagatcggttgaaatgatcgacttttgctgtttgagtgttggt attaggagttgtagtggcaccagtcggggtgctacttcgtgatgagctgggattctttgc gttcacagccttcgaagtggcatgcttgtcatttcttatcttatccttaccagcagccag gttcgcccactcaccttgaactggtacaaggatatttctctgtttttgctcggtggttcc cattgctcaaacgagtggagagggaaatcgattcagcagttaaatcaatgctggaaaata ttcgagattacctaatcggatctggaacttacttcgacctgacattttcttgcctgggga gccacgatcgattatgtaatcaagaatatggacagagggaaacagatttagctgtcaaaa gcccaagagaagctaccgatcaatggatgcggatagataaagaaaagtcctttttttttc attagccatccgagttgtccaatcaaatgtctgcctgctacgctggagaggaatcacgcg tgtttaacattcggattgtcgcctaaaataagcctattacctacacagtaaaacccgggg ggtgctttggtatcaatgaccccgggattttatccaccagtttttttctttctggcaaga gtgcattgcatccccgtacaaatagtagcaacctccacaagaggaatcccctatgagcga gaagtccatagtaatacccccgcggaaaagagatattttgtttccgtgttgcccttgaac ttcagtttcccccatcagtttatatagtagccgggttcccaatctctagcccttctttcc tcctatttcattcctctcttcttacgttatcttacattagc
ORF
GTGAATCTCTAG
Downstream (in bold stop codon previous gene) tgcgttcgattggcactgtttccgagatttgatactttgtacacgatgttatatgaggtt atatacttgataagagggtttttacgtttgcaattagcacaatttcggagtagcactggc gggagtgaaccttgagtagtctggatcaatgtaatcttcgtataggctagacaccccgga ttgggagtgctgacgaaatgatgttgggatgtgatgacatcatagggataatagaaaagt aaggttccgcgtgagccgttgaacgcgcactggaatggatggtctgtgacgtagccagac tgaacttgaaattccttccaagaaagtacatttttattteatteatteattcgaaaggga ggcttgtgggggaacccccaatcaaatacctaactactacttacaatatccaaacctaac aacgagctcctcattcccgatcactccttctttctaccaaataatcctctcttcttcttg cccttaccaccacttcccgttggtgaattaacatcctgttcaaaagtattctgatctcct gaaaacgagatcacggggttttgaccagcaatggggggacgaggtctcattctgtcgtga tattcgtagtttctgccggactcataacctggaaaatcaggacgtaccctccaacctaaa gtcggggtctccagtagttgtttataagtagtatctccagtgatggcatattcaaaggaa cggatcgtatctaaaggcctctcgtctctcgatctggcaggattggaaacgtcggggtta ttgattggttgtccaaaaatgtcacggatgtactctccatcgcccgacttattactaaac gattgttgtctggcatgagtgtagtccgttgattgttcgaacggctgagcctcgttgaca gccatcaagatggggtcgtgaatgttgtttgatctgtacttgagtctgtccgtgtcgtct ttggtgacgtgaacttttggtactttaccaacatcagtttt
AA
MSNNTTQNLAEQKGIQTDLVTLTRFILDEQKKSAPNATGELTLLLNSLQFAFKFIAHTIRRSELVNLIGLAGVTN ATGDDQKKLDVIGDEIFINAMKGSGNVKLLVSEEQEDLIVFESSKGNYAVVCDPIDGSSNLDAGVSVGTIFGVYK LLPGSAGSIKDVLRSGTEMVAAGYTMYGASSHLMLTTGNGVNGFTLDTDLGEFILTYPSLKIPHTRAIYSINEGN SHYWTDGVNEYIASLKKPQANGKPYSARYIGSMVADVHRTLLYGGIFGYPADSKSKSGKLRVLYECFPMALLLEQ AGGEAVNDKGERILNLEPKQVHERSGIWLGSKGEVERLLPYLTKKIKIQSVNL
P. pastoris homolog of S. cerevisiae HXK2 (YGL253W)
Hexokinase isoenzyme 2 that catalyzes phosphorylation of glucose in the cytosol; predominant hexokinase during growth on glucose; functions in the nucleus to repress expression of HXK1 and GLK1 and to induce expression of its own gene
E.C.2.7.1.1
E.C.2.7.1.2
2 homologues
SEQ0068 chromosome 1-4, 0561
5' region (in bold start codon next gene) cattttcaccagttccatgttcttttgctttcactggaccgctttcatctggggaaacct ctttcgggaaatcctcatccggggaaatcttagtgggggaagcgatttctgggggatatt tttcatcgctccttttgactcgcccatctttaaacctattctggtattcgatcccacggc tcgtatcaaatgagtctcttccgaaatccaacgtaggccttcctgattctgcatacttgg gtgaattatcagcagacgactttttccttccttttttctgacgttgtgcggctggtctct tctttttcactacagatcctttcctagatacagttttggaggacggataagaattccccg gattacgacgggaagcagagtagtggttagaagataccctctgtctacggtttttagacc taatctttttctctctgatcctctgggccttgaacttcctggagtcatttctcttactct gtgaccactgggaattatcctcatctaccaggttccacggaacatttaccacctgaagga aaatttgatcagaatcgagtggctccactccagaattttccattttctccgtggggcttg gcgtagaaacggacacagccacgagagagagaaatacttcagtaagaattatcaagttca tatctagcaactaatagtctctgtacttctacagtagtaactaaggaggagtcacgcgtg ttgtctcctctgtgtttttaacccgaaagggcagggggggcagcaaaatgtttcacaaaa gcagcgcagactcggatgcggataatactcaaatttgtcacgtcagagtttttccgaaag tgagatgcaagaagcgcgcgggggatgagagtgccttaaaccaacatgcaaaataccccc ctcaaaaatctatataaagtgtgcccattttctccaaagtatttttcacacgttttttct ttcactctacactagagcttctgctacttcatcacaccata
ORF
CTTGCCCATGCTAGACAGGCAAAGGGTCTGTCTCTGGGATTGGCTGAAGCTTACAAAAAAGCCTAG
Downstream atgttaaatgttttactctacttcctttctatgtataaagcggaaatgacaacgctttag gaacttccttagattccttcatgagtttgatttatatcttcgatgaacataattatttta tcatttggatcggggatttgaacaccacgtgaagctagaccgtgcatccaattcaacccc tgcagctctttaactgtgcgtaaaaatgggattttgtctctctgtgctaattgtactatt attttactatctcctctcaaatcatcttttaaatcgacttgttgctgggccagtttcaat tgtggaactgccgccgagtgagtcaagtcaattaggaacccaaatagaaggatatgcaca atcttgtaacgtaggaacattcttgatactgtccaagggagagtttacgcccttgcgatg cttagccttttgcacgatgcactgacttgcggctttagcagcggtgagatcctgaaattc tcggaaaaaacctagttcccttcggaaaaactttgaataaaaatccacttaccacgtctc ctggatgtatctccccttattctctagctctaaacataaactcttcgactcttcgttcga aattcccaaatgatcttctaatatggtttctatggtgattctaacctgaatgggcatttt tccggatgatccacataggtagaaagtggcatttttattataaataaggtcaaagatcag gttactctgtgcatacaatgtgtcttgaacatatgattttttctcatctctggaaaatga gggaaagagttggagttggctctttgatgcgagagattcccataaatcaccgtaaagaaa atccttatcatgaaatctgtttccagtaaagagatacatgtcttgcagtcctagttctag tctttgctcaatgattgatttgactggtgctattccagttcctggggcaaccattatcaa aggtccagaggaaaacttgacattgttcgggtgaattgaaa
AA
MPIAKPANQVASTITSQHLSDRVLEIQDAFEVSPTKLQQIVAHFVEELKKGLSAKGGNIPMIPVWV MDYPNGTETGDYLAIDLGGTNLRVVLVHLLGGQKFETEQEKYHLPKGMRTTRNRDELFEF IADCLE KFFYKLHPNGIEKGTLLPLGFTFSYPASQTRIDTGVLQRWTKGFDIPNVEGEDVVPLLMDKINEKK LPIKWALINDTAGALVASRYTDPTTEMGLIFGTGVNGAYYDRVGNIEKLKGKLLPDITDESPMLI NCEYGSFDNEHESLPRTKYDILIDEQSPRPGQQAFEKMTAGYYLGELIRLILVELYEEKQVFQKYS KDSEQIKLLYTPYLLDTSFLAEIEGDQDLENFPEWRLFQEFLKIEPTLDERRLSRALSEIIGNRS ARLSVCGIGAACTKMNIKKCHCAADGSVFHKYPKFPERAADSLADIFGWKEQNIQPKNYPIQIVPS QDGSGVGAAVIAALAHARQAKGLSLGLAEAYKKA
SEQ0069 chromosome 3, 1192
5' region (in bold start codon previous gene) aaagttggcaaggccccaatcaaaagatcactggcttctttgaattttctcattgaaaga tcatgaattcctgtgtataccttataacgatttttcctctcccaatcaccccccttctcg attaaagcatttgcttcatccaagtacttcttggtgaattgtttgtcatcataaaagaat ccaatacgagataaggtgagtttgatatcgatctttgcaccgatacttggagctaaatcg gttgcctttccaagtgtttctattgctttattcttgtctccaatttgagcgtagtagttt ccaagttcggcccaacattcagcgttctctacttcaccttcatcgtcttcatctacgact tttattttgttggtcaattcctctaccttcttttcattttgcaagactagtttgtcatac aacccttgatcccactccaagccgggcaatgctatttggtggtagagataatgatagtag ggagccatctcattcgtctcaacgccctgaatgactttgtcaaccaatgcagcgtctcca tcgccatcggctatttgtgcaagaaaactaagctttgatacgtcaaagtctggaattttg ggaattcctgatgcataggttgccatcttgaggtttcgtttagcgtaataggattggaag gcaaatgattaacatctgggattgatccactgttggagtgtgggtagtacacgatttgaa ttttgacggttgtcagatggaagttattctaatctataaattggaattcttagggttagt caaacagctgacaagatctgtcaaggagtgcacaatttaccattcagtgttttcttattc ctatctgagggctatttcggtattctcatcttaaaaaaaaagtcagagaattttggccga ttttagctgaaatctgaatttgattctcctaatccacaaaagcacttaaagcatatcgac accgacctttgcctgcaaagactgttgacaagctagtaaaa
ORF (in bold intron)
ATGGTTCACTTAGGGGCGAAGAAGCCTCAGCATAGAAAAgtgggtattgacctctgaatcaggatttgacatgtc attaacgataagggatatctatttaatcagCTTAGTCCAGAGTTACGAAAAGCTTATAAAGAAGTAGAGGCACAG
TATCACCCTCAAGATGGAAAGGATAGTAACTAA
Downstream (in bold start codon next gene) gtaactaatcccagatctgcttatgtaacgaaatatttacttctcgcttttactcccacc aacgaactcgaacgctcatacagtccaagcctaccatctacacacaacatcaccaaataa atggaacaaccaattggtaagttttaagaactttttttgggccgcgtaggctccagaatg ggccaatggaagtttgtattattcacagggcagaatcaatgctcttgtcgcagtattgca acccgttgtccacttctgtatctcaagaattcaaccactaaccgcattttaagttctgga tcaaggaacaggattcgttaagatcggccgtggaggtaccaacttcccggaccacacgtt cccttctattgttggaagacctatactgcgtgctgaagaaagaactgggactgtggaaat taaagatatcatgtgtggagacgaagccagtagtgttcgtagtgctctgcaaatatcgta tcccatggagaacggaatcatcaaaaactgggaagacatggaacacctctgggattacgc cttttatgagcgtatgaagatagataccacggatcgtaaagtcttactcacagaaccccc gatgaacccgctcaagaatagagaaaagatgtgtgatgtcatgtttgaaaaatatcactt cggacacgtttatgtggccattcaggccgttttagcgctttatgctcaaggattatcatc aggagttgtcgttgatagtggagacggtgtcactcatattgtccctgtttatgaatctgt ggttctgagccacctgacaaagcgccttgacgtcgccggtagagatgttacaagaaatct catcaacttgcttcttagaagaggctatgcgtttaacagaacagctgatttcgaaacagt tcgtcagattaaggagaaactatgttacgtttcatacgacttggattttgacaccaagct ggccaatgatactaccacactagtcgaaagctatgaacttc
AA
MVHLGAKKPQHRKLSPELRKAYKEVEAQFWSTPRLKQIVDQFVAELKEGLKSSSSNIPMLPTWVMDFPTGEETG DYLAIDLGGTNIRVILVRLLGNRKFDTIQSKYVLPKWIRTSTSNELWLFIAQCVKTFIDEEFDYRESPEDPIPLG FTFSYPAFQSRINSGVLQRWTKGFDIPDVEGHDVVPMLQDALESLGLSWWALINDTTGTLVASTYTDPETKMG LIFGTGVNGAYYDTISSVSKISNALPPDIQEDAΞMAINCEYGAFDNNISVLPRTKYDDTIDLESPRPGQQSYEKM ISGYYLGELLRLVLVDLHHQGHIFKGQTIGKLNEPF IMDTSFPARIEEDPFENLCETGELFNSLGIETTVPEREL IRRICELIGTRAARLSVCSIAAICKKRGYKKAHCAADGSVFTRYPYFPDRAARALRDIFQWGHSTPDLVTWPAE DGSGVGAAIIAALTKQRMANGESVGLDEYHPQDGKD
SEQ0070
P. pastoris homolog of S. cerevisiae TPI1 (YDR050C)
Triose phosphate isomerase, abundant glycolytic enzyme; mRNA half-life is regulated by iron availability; transcription is controlled by activators Reb1p, Gcr1p, and Rap1p through binding sites in the 5' non-coding region chromosome 3, 0951 E.C.5.3.1.1
5' region (in bold stop codon previous gene) tcaacgagacactcttccgtcagttccaaaaccataagtttgccgatgtgttggtcctt gtaacgcatggaatttgggccagggtatttttgatgaaatggttcagatggtctgtggag gagtttgaaggcttacgaaatataccacattgccagtttatacagatggttaagggtgaa aatcaacgttacaccttgacgaccccattattacgatggcgtgaaggagatgaagaccgg gtagaagaaataagaaaagcggtacagtttaggtccggagatctagggaaggaggcctta gcttatattgtagctgctgagagagaggcagctgctggaagatctgaaggccctatcacg tatgatgatggtgatgaccattagagaacgcccagagattgatagccagttcttggacaa caattcggaactttattcacggtgcaaacatgatttgtgtggatagcttcaagtcagaca tttcatctcatccccccttttactgctgctaatcaccgttagtccgacagttactctaat caatatttattagtgttttagttgcgcaaaactcgagcctcttttccttatctcttgaca cttcctggagtcgaagtttttcagcgcaaattcactctacaatgtctaccgatactagac
cgcctatcttccccctctaaatagcctattggaagggtgcaataaggtatataaatctgg cgcgattcccccggacttttatgatccacatcacctcatcttactgccctcactctcttt cctgatcctcccaggtccaccgatttcctcactatcgtcggatttctccttccagcgccc tagagaattccgtaaccaccgcaaaaatagcagcccccccctcacccatttttttattta aaagaacaccttactggcccgttttcgtttctcctttactacaattgatttttaattttc agttttttttcattgatatacaagatctatcacaaacaca
ORF (in bold intron) ATGgtacgtgcattgagtctcataagtgccatccagtcacacctggcaggtggaggcaatgggaggtcggagaaa cagaacccacatcagaaggaccatactaactcttcccagGCTAGAACATTTTTCGTAGGAGGAAACTTCAAAATG
GAATTCGTGGACATTATTAATTCCAGAAACTAG
Downstream (in bold stop codon next gene) tttacatatgaacatattactactctatattcgggacagcctcgattatttctctttct cttcgtctcttgtttaaagtcttctttcatatcgttcctttttcatcctctcgttccgct cgatccttcaacgttgaaagagccagaggtgtcatattgggaagagaaccaccaacggga tctaagtttggagattctgaccctatatgctgtatgacggattgttgctgtatctgtaac ccaaggggtggagtttgtagccacatctttgatggtgacccgacgactgagactggcatc aagccaggtgagtcaggggtgcggaccgtacttccagagccgtgttgcaccagtgttatt gcaggaccatgagtgggcaccgccgtataatgctgatgcctttggagcatgagatggttg ctggaattagctcttagatgctgtggcggtagccggatagcctttacaacttgttcctct tcatcgtccactactgcactagaagtgctatcattactattgttctgttggttgtgtata ttcaatgaaagtaaactaatagtagaggagctgggtactagacggccctccacatcttta ccaacagacgaggccagaagtggtggtgacggatgttcatcagtgttgttagtccatcta cttgaatttgtattgagcccgcttgctggaatttgaatgggcatgccctgcatgctttcc attagctcaggctctgaaagtactgagcctctaaaagatggaggcatgtctaagtcgaag tggattggttttgataagtttgctgtaaactaatttttcgtatagagttcgcgaaaattc tcaccaccaccaagggtcaagtcactgatcgcttcaacccacaaaacgacgatattaaag ttcaacatattgttctaggtacgccagtccgcgatccaagtttgaaatctccacatctac aggtgcgtactgtagtagttcatccaaagctgggatcttg
AA
MARTFFVGGNFKMNGSKB S IHE I IERLNNTKLPENVEWIAPPAPYLQQAVTENKQKTVYVSAQNSFDKASGAYT GEVSVEALKDLGVP YVILGHSERRTINKEDDAFIASKTKF ALDQGLKVILCIGETLEEKQANITLDWKRQLQAV VDWSDWTNIWAYEPVWAIGTGLAATPSDAQDVHKQIRDFLATVIGKDQAEKVRI LYGGSVNGKMAVEFRDKAD VDGFLVGGASLKPEFVDI INSRN
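The ORF rows of SEQ0070 mark the intron in lowercase and the exons in uppercase. As a minimal sketch, assuming the standard nuclear genetic code applies and that case alone separates exon from intron in this listing, the exon bases of the first printed ORF row can be joined and translated to compare against the start of the AA sequence. The helper names `splice` and `translate` are illustrative and do not come from the source.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
# Assumption: the standard nuclear code applies to these sequences.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def splice(annotated: str) -> str:
    """Keep exon bases (uppercase); drop intron bases (lowercase) and spaces."""
    return "".join(ch for ch in annotated if ch.isupper())


def translate(cds: str) -> str:
    """Translate complete codons; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


# First ORF row of SEQ0070 as printed in this listing (line break removed).
SEQ0070_ORF_START = (
    "ATGgtacgtgcattgagtctcataagtgccatccagtcacacctggcaggtggaggcaatgggaggtcggagaaa"
    "cagaacccacatcagaaggaccatactaactcttcccagGCTAGAACATTTTTCGTAGGAGGAAACTTCAAAATG"
)

exons = splice(SEQ0070_ORF_START)
peptide = translate(exons)
print(exons)    # exon-only coding sequence
print(peptide)  # compare with the first residues of the AA row
```

The translated peptide can then be compared by eye with the first residues of the AA row for the same entry.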
SEQ0071
P. pastoris homolog of S. cerevisiae ALD4 (YOR374W)
Mitochondrial aldehyde dehydrogenase, required for growth on ethanol and conversion of acetaldehyde to acetate, phosphorylated, activity is K+ dependent, utilizes NADP+ or NAD+ equally as coenzymes, expression is glucose repressed chromosome 4, 0043 E C 1 2 1 9
5' region (in bold stop codon next gene) atggactgttcaatttgaagtcgatgctgacgatgtcaagagagatgctcaattatattt gtcatttgctggttacactggaaacgctacttttgttggcggaaactctaccagtttggc cgtccatgtaaacgatgtcgttctgggccgtgaccgtttcaacacgaacataaccaatga caaatccacttacaggtctagttcatatggaggcaattggtaccttacttctttggatgt cccaagtggggctttaacgtctggtactaacaatgtctcgtttgtcactacaaactccga ggtaaataaaggattcttgtgggattctctcaagtttgtttggaagttgtaacaggttta
taagcatatcgtgcgcttgtccacaattgaatcatttattgttgcgagatacatgaacaa agtgtgaactgggacccattactacaattcccacgcaaccgttgtttcaaagcccatatt ttttgacaattgtttcgttacacccccagtttgatgtacatcgcttgcaatgatgtgtgt cccggagtattttccatattcagcttgaattcgtatactcaaccaatatctgggggtata cttttatgtaacctatacaaatcaactatactatttcacctttcgaccaatcatctccca tcttgttaagttttgcttcctatatccctgaccctgacatcacccatgattccgctcaac ggttctcctctacatcgtccctcttttggagagggtgttcagtttgacattcaaattacc ccccgccatcacgcgcaaccgagaccgcacccccgaattttcacaaattaccccacaccc tatactccaccactatgagggttattagaactgatcacgtataaataccaccgcaagttc ccaagggatcgtgttcttcttctccaattgcaatcatatttctgactctttctagttcag attaattcctttacacttgcttttttcccttacctttatcc
ORF
ATGACATTTGCTCCTCCCTTAGAATTCGAGATTGACCTTCCTAACGGATTGAAGTACACT CAACCATTGGGACTCTTCATCAACAATGAGTTTGTTGAAGGTGTAGAGGGAAAGCTCTTA CCAGTGATCAATCCTTGTGATGAGACTAAAATAACCCAAGTTTGGGAAGCTTCTGCAGCG GATGTTGACCGTGCTGTTGATGCCGCTGAAGATGCTTTCAACAACTCCGTATGGGCTACT CAGGACCCATTAGAGAGGGGAAAGCTGATGAACAAATTGGCAGACCTTATCGATCGTGAC TTCAACATCTTGGCTGGTATCGAATCCATCGACAATGGTAAGGCCTATACCTCTGCCCAG GGTGATGTTACTCTTGCTGTCAACTACATCAGATCCTGTGCTGGATGGGCCGACAAGATT TTGGGAAACGTTGTTGATTCCGGAAACACCCACCTTAACTTGGTTAAAAGAGAGCCATTG GGTGTTGTGGGACAAATTATCCCATGGAACTTTCCTCTCCTGATGTTGGCTTGGAAGTTG GGACCTGCGCTGGCCACAGGTAACACTGTTGTTTTGAAGACTGCCGAGTCTACCCCTCTG TCGGGTTTATACGTTGCCAAATTGATCAAGGAGGCCGGTTTCCCACCTGGTGTGGTTAAC ATTCTCAGTGGTTTCGGTAACCCAGCTGGAGCTGCCATCGCTGCTCATCCCAGAATCAAG AAGATTGCTTTCACCGGATCCACTGCAACAGGCCGTAAGATCATGGAAGCAGCCGCTAAA TCTAACCTGAAAAAAGTCACTTTGGAACTAGGTGGTAAATCTCCAAACATTGTGTTTGAA GATGCTGATATCCAGAAGACTATCCATAACATTATTTTGGGAATCTTCTTCAATTCTGGT GAAGTCTGTTGTGCAGGTTCCAGAGTCTACATTCAAGACACTGTGTATGAAGAAGTGCTT GAAGCCTTCAAGAAGGAGACTGATAACGTTAAGGTTGGTGGACCATTCGAAGAAGGTGTC TTCCAAGGGCCTCAGACCTCTGAGTTGCAACTTAACAGAATCCTTAGTTACATCAAACAC GGTAAGGATGAAGGTGCTCGTGTAATTACCGGTGGTTCAAGATACCGTAACCGAGGTTAC TACATTAAGCCCACAATTTTTGCTGACGTTACTGAAGACATGAAGATTGTCAAGGAGGAG ATTTTTGGTCCTGTGGTTACTATCACTAAGTTCTCTACCGTGGATGAGGTTGTTGGATAT GCCAACAACACCAACTATGGTCTAGCTGCTGGTATTCACACAAACAACTTGAACAAAGCC ATTGATGTTGCCAGTAGAATCAAGGCGGGTGTCGTTTGGATTAACACCTACAACGATTTC CACCACATGGTTCCTTTCGGAGGTTATGGAGAATCTGGTATTGGCAGAGAGCTTGGTGCT GAGGCTTTGGATAACTACACTCAAGCCAAGGCTATCAGAATTGCTTACACTCCTGAACAT AAGTAG
Downstream (in bold start codon previous gene) tagttatgtatgtttatgtgatgttaatgatctaattgtgaacttcattcaactggatga cgctgtaaaaggaatgccttggtggctgatgtcgatgggggcttgcaagaaacctttgct gaatggttggtggtgactcatataagcaagatcatgcagtcatggagccccatcagttat cagtaaattgtcctttcaaggcacaagtgtttgcgtagaaagcttaatcattcagttaca agcagttactataaaggggttcaatcggatccagggcgaggctgaatcatcaatgctgca ctactaagcacattcaagtacacccgacaacactgaagactccaataccagcatcggctt acccaaaaaaataataaaacggggcggtgcttcaatgagtcatatcaatgcggagtatgc aagtgtatagtcctagtttcccactcgtagatctgacacatctgaaaaattataaatagg catcgagttttccctacctcgctttatcatttttcaaccatttcgttccattcacaatga agttggcagaagccctcatcatcagaaaagacatatataagaacctgaagcaattggaga ataggatacggcggaacatcacaattcaagaaggtacttccgcaccagagaatagcaatg cattattggtagagtacaatgaacagcacaaggagtacgtcggccttgtcgttaaaatca accttaccaatgccaagatctttttaaaattctatcaccccattgaagagaaagaggtag aagcaacaatgaccgaggcgttagcattgagggactacctgaaagagagatcgtctgcat taagagagtttgctgcggaagcctccagttccagagctgttttgacaagaacagaagtca gatttaaaaccacaatgaaagctagtgatatacaaaaagaacaagatagaaccggtaaac ttttgcgacagttggatctcaagatacaagaaaaaaattgg
AA
MTFAPPLEFEIDLPNGLKYTQPLGLFINNEFVEGVEGKLLPVINPCDETKITQVWEASAADVDRAV DAAEDAFNNSVWATQDPLERGKLMNKLADLIDRDFNILAGIESIDNGKAYTSAQGDVTLAVNYIRS CAGWADKILGNVVDSGNTHLNLVKREPLGVVGQIIPWNFPLLMLAWKLGPALATGNTVVLKTAEST
PLSGLYVAKLIKEAGFPPGVVNILSGFGNPAGAAIAAHPRIKKIAFTGSTATGRKIMEAAAKSNLK KVTLELGGKSPNIVFEDADIQKTIHNIILGIFFNSGEVCCAGSRVYIQDTVYEEVLEAFKKETDNV KVGGPFEEGVFQGPQTSELQLNRILSYIKHGKDEGARVITGGSRYRNRGYYIKPTIFADVTEDMKI VKEEIFGPVVTITKFSTVDEVVGYANNTNYGLAAGIHTNNLNKAIDVASRIKAGVVWINTYNDFHH MVPFGGYGESGIGRELGAEALDNYTQAKAIRIAYTPEHK
SEQ0072
P. pastoris homolog of S. cerevisiae PGK1 (YCR012W)
3-phosphoglycerate kinase, catalyzes transfer of high-energy phosphoryl groups from the acyl phosphate of 1,3-bisphosphoglycerate to ADP to produce ATP; key enzyme in glycolysis and gluconeogenesis chromosome 1-4, 0292 E.C.2.7.2.3
5' upstream region (in bold start codon previous gene) aaagttggtacccagccgatcacgcctgctctgagtttggctggagcagcaaatctcatg ataaccgaggtttaaattaaggtacataacaaaaattcaatgttcaaagacgcacatacc aagacttactaatcgcagaatgttggtgcagtatttgtcgtaagccaaaaccatcgatgt tgacttcctaattcagtctttaaaccgcaaaaggattctgattcgcagatggcctgatct ccaaactcaggctggggctctaactcgagcaagtgtcctatgctgtaggccgcagccctt ttggttcgacgacgtgcgtggttatgagacgctcggctgttttgcgctaagctggccgta tcgagtaaattctacaggcacctgcgaggcaagcaatctactaatgtttatttttcgtcc aacctaattgtggtttcaaagcgctatcaggtggggggtaagaggaatgtgagtggaaag cgaaaataactggcagctggggtcagatcccgtgatgccacctcttgtggtattttgaaa cgcgtgttgcgattggccgcgagaacggaaaggaatatatttactgccgatcgcattttg gcctcaaataaatcttgagcttttggacatagattatatgttctttcttggaagctcttt cagctaatagtgaagtgtttcctactaaggatcgcctccaaacgttccaactacgggcgg aggttgcaaagaaaacgggtctctcagcgaattgttctcatccatgagtgagtcctctcc gtcctttcctcgcgcctggcaataaagcctccttcggaggagctccgtctagagaataat tgctgcctttctgactttcggactagcgccaaccgcgaaccacaccaccacaccatcact gtcacccgtcatagttcatccctctctccttataaagcatctaataggttccacaattgt ttgccaaaaaatctcttagcatagcccaattgattacgaaa
ORF
ATGTCTCTTTCAAATAAACTTTCAGTCAAAGATCTCGATGTTGCCGGAAAGCGTGTCTTC ATCCGTGTCGACTTCAACGTTCCTCTGGATGGTGACAAGATCACCAACAACCAGCGTATC GTTGCTGCTTTGCCAACTATCCAATATGTTTTGGATCACAAGCCAAAGGTCGTCGTTCTG GCTTCTCATTTAGGCCGTCCAAACGGAGAGGTCAACCCAAAATTCTCTTTAAAACCAGTT GCTGCTGAATTGTCCTCCCTACTAGGTAAGAAGGTGACTTTCTTGAACGATAGTGTTGGA CCAGAGGTTGAGAAGGCTGTCAACTCTGCCTCCAATGGAGAGGTTATTCTTTTGGAGAAC TTGCGTTTCCACATTGAAGAAGAAGGATCTCAAAAGAAAGATGGTCAAAAGATCAAGGCC GACAAGGAGGCTGTTGCCAGGTTCAGAAAGCAATTGACCGCATTGGCCGATGTCTACGTT AACGACGCCTTCGGTACCGCTCACAGAGCCCACTCCTCCATGGTTGGATTTGAATTGGAG CAAAGAGCTGCTGGTTTCTTGATGGCTAAGGAGTTGACATACTTCGCTAAGGCCCTGGAA AACCCTGTCAGACCATTCTTGGCCATCCTTGGTGGTGCTAAGGTTTCTGACAAGATTCAA TTGATTGACAATTTGCTGGACAAGGTCGATTCCATCATCATTGGTGGAGGAATGGCTTTC ACTTTTATCAAGGTTTTGGATAACGTTGCCATTGGTAACTCTTTGTTCGACGAGGCTGGT GCCAAGTTAGTTCCCGGCTTAGTTGAGAAAGCCAAGAAGAACAATGTCAAACTGGTTCTT CCAGTCGACTTCGTCACTGCCGACGCCTTCTCCAAGGATGCAAAGGTCGGTGAAGCCACG GTTGAGTCTGGTATTCCAGACGGATTGCAAGGATTGGACGCTGGTCCAAAATCCAGAGAA TTGTTCGCAGCTACCATCGCTGAGGCTAAGACAATCGTCTGGAACGGTCCTCCAGGTGTT TTCGAGTTTGACAAGTTTGCTGAAGGTACCAAGTCTATGTTGGCAGCTGCCATCAAGAAC GCTCAGAACGGTGGAACTGTCATCGTTGGTGGTGGTGACACGGCTACCGTTGCTAAGAAG TTCGGTGGTGCTGACAAGCTATCCCACGTTTCCACTGGAGGAGGAGCTTCTTTGGAACTG TTGGAGGGAAAGGAGCTTCCAGGTGTAGTTTACTTGTCCAACAAGGCTTAA
Downstream (in bold start codon next gene) ttagttcatatagtttgaattctgattttgatgacgctcgcataaaccgtagagccacta
cggccacatgttagttgtccgtgaattcacaatttacatgatattattgcaatgctgctg ttcgtcaattcgttgcagtcgtgataagagggattgtcatgtatgcaaaggtattcagca actgtacaaggctgacgtacgtttagtggtcagatttgagacctcccaagttggcgcacg gcggataacccacgttggggtcacggtgggggcatgaactttcgcgtcgatttgggtggg ggttttttatctgagaagttcccctgttgatgactgatggcctggaccggatggtgcatg gcatggacgtgaacataaacagaaatatggacctctctaaatttgtgcctttaacacatc aggctgcaggacacccggacttgctggagtctgaagactcaggcctattcgccaagttaa caaataggaaggaggtcgaattttacagcaggctgaattctaatgtctctgaagataaac cattgggaagcggtttgattgactgggttcctcagtttatgggagtcctaaccccaggaa tttcacctgacttgaaatctcaaggcgctcctgtagctgctgagttggagaagaaggcct ctgtgcaaccttcttcagataaacagtacatcttgttggagaacctattgtttggcttta gccagccctcagtattggatatcaaattgggagtcaaactatatgatgatgatgccacag atgataaaaaggagagactgggtaaagtcagtgattctactactagtggtagcctaggtt ttcgaatatgtggaatggacatcaaaaagacccgtaaagaagtccacgagaaatggtccg actacgtcacaacttaccaagacgcgcacaaggttgagtatctcaagttcgataaatggt ttggaagagcactagacgtagactcgatccttgaagggctg
AA
MSLSNKLSVKDLDVAGKRVFIRVDFNVPLDGDKITNNQRIVAALPTIQYVLDHKPKVVVLASHLGRPNGEVNPKF SLKPVAAELSSLLGKKVTFLNDSVGPEVEKAVNSASNGEVILLENLRFHIEEEGSQKKDGQKIKADKEAVARFRK QLTALADVYVNDAFGTAHRAHSSMVGFELEQRAAGFLMAKELTYFAKALENPVRPFLAILGGAKVSDKIQLIDNL LDKVDSIIIGGGMAFTFIKVLDNVAIGNSLFDEAGAKLVPGLVEKAKKNNVKLVLPVDFVTADAFSKDAKVGEAT VESGIPDGLQGLDAGPKSRELFAATIAEAKTIVWNGPPGVFEFDKFAEGTKSMLAAAIKNAQNGGTVIVGGGDTA TVAKKFGGADKLSHVSTGGGASLELLEGKELPGVVYLSNKA
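Where an entry prints the full ORF alongside its AA sequence, the two can be cross-checked codon by codon. The sketch below translates the first ORF row of SEQ0072 and reports the first position at which the translation and a printed protein prefix disagree. It assumes the standard nuclear genetic code, and the helper `first_mismatch` is a hypothetical name introduced only for this illustration.

```python
from typing import Optional

# Standard genetic code, with codons ordered T, C, A, G at each position.
# Assumption: the standard nuclear code applies to these sequences.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate complete codons; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


def first_mismatch(cds: str, protein: str) -> Optional[int]:
    """Return the 0-based residue index where translation and protein first
    differ, checked over their common length, or None if they agree."""
    for idx, (got, want) in enumerate(zip(translate(cds), protein)):
        if got != want:
            return idx
    return None


# First ORF row and first 20 residues of the AA row of SEQ0072, as printed.
ORF_ROW = "ATGTCTCTTTCAAATAAACTTTCAGTCAAAGATCTCGATGTTGCCGGAAAGCGTGTCTTC"
AA_PREFIX = "MSLSNKLSVKDLDVAGKRVF"

print(translate(ORF_ROW))
print(first_mismatch(ORF_ROW, AA_PREFIX))
```

Running the same comparison over a whole entry would require joining all ORF rows and removing the spaces that separate the printed AA rows first.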
SEQ0073
P. pastoris homolog of S. cerevisiae ENO1 (YGR254W)
Enolase I, a phosphopyruvate hydratase that catalyzes the conversion of 2-phosphoglycerate to phosphoenolpyruvate during glycolysis and the reverse reaction during gluconeogenesis; expression is repressed in response to glucose chromosome 3, 0082 E.C.4.2.1.11
Upstream (in bold start and stop codon previous gene) aatgaaagagtgagaggaaagtacctgggcaaaatcacacaattccaaaccatgctaaat gagatttaaagaacaaacgatggcaaaaggcaaccgttataaatgtgatctttcttggca gttatctgtcaatttttctaaggaacagtgaattcatcataggagagatgttatacgtta cataatcatacatactgcatgtatctcacctactttacctcatcaactctaaaacagttc tagtcccaaccccagattcctagtcatgacacaagtccgcaccggacaggactcacaacc agcaagagaagctaacaaatttacgccccggtaaaacattctttaggggccgttcaatgg taattttcctctcacccgtttaaacttacctccgggcggtatcttcaataacctctgttg tccccgggtatcattggaaacagtgagggacgttgaacagaagagaggatcaccgtaaat ttgccttgcaattggccctaaccacggatggttaacttcaagccatcacgacagcaattg agtcggcgcatagctaccctcctcttcttgaccccatgcataggaccaaccttaaccgat ggaacaggttcctccgctccgtcccctggtagtgtctctgcgcaagaaatagttaaggta tgaagactgatctctcgcacccccctcacagtactgttatggtgaattgacaaagccatt ggctagattgaaacatgtaattcatatgtaatcttgttcaattaacgagcttcgtacagt ctcaatctagacgtctgataatggcgtttgtgctcctaatcgatgagccatctcatgtga cgtctatacgcttcgatggcttccgtcgcgaatatagaaccacttgaaatatgctgcaaa ccacgatccaccctggtcctgaaaagatataaatacagcacatctagcaggcttttgtct tcttggttgaaacacacaattataacaatctacatctaaaa
ORF
ATGGCTATTTCAAAAGTATTTGCCCGTTACGTCTACGACTCCAGAGGAAACCCAACCGTTGAGGTT GACCTCTACACTGAAAAGGGACTTTTCAGAGCAATTGTTCCTTCCGGTGCTTCTACCGGTGTCCAC GAGGCTCTGGAATTGAGAGACGGTGACAAGTCAAAGTGGCTCGGTAAGGGTGTCTTGAAGGCTGTT GCCAATGTCAATGACATCATTGCTCCAGCTATCGTTAAGGCAAACATTGACGTTAAGGACCAAGAG GCCATTGACGCTTTCTTGAACAAATTGGACGGAACTCCAAACAAGGGTAAGTTGGGTGCCAATGCT GTCCTTGGTGTCTCTTTGGCTGCCGCTAAGGCCGGTGCCGCTGAGAAGAACGTTCCTCTTTACCAA
CACATTGCCGACCTGTCCGGTACTCCAAAGCCATACGTCTTGCCAGTTCCATTCCAGAATGTCTTG AACGGTGGTTCTCACGCTGGTGGTGCTCTAGCTTTCCAAGAGTTCATGATTGTTCCAACCGATGCT CCAACCTTTTCTGAGGCTCTGAGAATTGGTACCGAGGTCTACCACAACTTAAAGTCTTTGGCTAAG AAGACTTACGGACAATCTGCTGGTAACGTCGGTGACGAAGGTGGTGTTGCTCCTGACATTTCTACT CCAAAGGAGGCTTTGGACCTGATCTCTGCCGCCATTGAGAAGGCTGGCTACACCGGAAAGATTGGT ATCGCTTTGGACGTTGCCTCTTCCGAGTTCTACAAAGACGGCAAGTACGACTTGGACTTCAAGAAC CCTAACTCTGACCCATCTAAGTGGTTGAGTGGCCAGGAGTTGGCTGCCCTGTACAAGGAATTGATT TCCCAGTACCCAATTGTATCCATTGAAGATCCATTTGCTGAAGATGACTGGGCTGCCTGGTCTCAC TTTTTCTCTACTGTCGACATCCAAATCGTTGGTGATGACTTGACTGTTACCAACCCTATCAGAATC AAGAGAGCTATTGAGGAGAAGTCTGCCAATGCTTTGTTGTTGAAGGTAAACCAAATTGGTACTTTG ACCGAATCTATCAAGGCTGCTACTGACTCTTATGCTGCTGGCTGGGGTGTAATGGTCTCCCACAGA TCTGGTGAGACTGAAGATACCACCATCGCTGACATTGCTGTTGGTCTGAGAGCCGGCCAGATCAAG ACTGGTGCTCCAGCCAGATCTGAGAGATTGGCCAAGTTGAACCAAATTCTGAGAATTGAGGAAGAA CTTGGTGACAAGGCCATCTACGCTGGTAAGAACTTCCACAAATCTGTTGCTATCTAA
Downstream (in bold stop codon next gene) attggatattaatctatgtattaaggtaatgagtcaaacttacaattgagaactagtatt ttacttttcagaacctaagaataaggtcatctgattcattctgaggttctttaagcttat ttaggagatttagagagtctaggaacccaagacattcatcttctgagtaaaatctgagct caaaaataagtacccgcaatgggatttccgtgaatttctgttgaattactgtgatcgcct tttcacgctccctgtctagaaaagttgctagagaaaaccggatggactgaagacgaaatt gctttccagcagcatggtcctccgtatcgataatcgtcaccaattcgaaacaacttttaa aaaggtcgtggtagttccgatcaagaaaggcataaagaatagacaaaccaatcttgcaca gcaacattggcagtggctggttcgcttgactatcaagtttagaatccaatgcttcaatgc gatcaacgatacattgcgcaaatttatccagttccggttccatgcgatggtaaatatagt aacacgttaatagacagaggtgatattcataagaatcctgttcttgataaagatcaaatg tacttctattttcggcaataaattgtagatatggtagaacttcagttgaaaaatatgatc gtttaaggaacagtttaaccaaaacttgactgcaaactgttttgaagacaacccttttct ccaaatctttgatagcagcagttttttcaaatatactctctatctcttgttgcatggcta gatcatctgagcggcttgaaatattaacgaaatgccttcgtagcagctcaacttgcttct tcaaatttaatttgtctacttcattagtcactgtgttagaaagctgttgcgaattcggag cactttctcgtccgaataaggaattatcaggattcaagtttggacctgatgatggaagtg atgcagtcggtgaaccttttggtgatgttggagagccatct
AA
MAISKVFARYVYDSRGNPTVEVDLYTEKGLFRAIVPSGASTGVHEALELRDGDKSKWLGKGVLKAVANVNDIIAP AIVKANIDVKDQEAIDAFLNKLDGTPNKGKLGANAVLGVSLAAAKAGAAEKNVPLYQHIADLSGTPKPYVLPVPF QNVLNGGSHAGGALAFQEFMIVPTDAPTFSEALRIGTEVYHNLKSLAKKTYGQSAGNVGDEGGVAPDISTPKEAL DLISAAIEKAGYTGKIGIALDVASSEFYKDGKYDLDFKNPNSDPSKWLSGQELAALYKELISQYPIVSIEDPFAE DDWAAWSHFFSTVDIQIVGDDLTVTNPIRIKRAIEEKSANALLLKVNQIGTLTESIKAATDSYAAGWGVMVSHRS GETEDTTIADIAVGLRAGQIKTGAPARSERLAKLNQILRIEEELGDKAIYAGKNFHKSVAI
SEQ0074
P. pastoris homolog of Saccharomyces cerevisiae CDC19 (YAL038W)
Pyruvate kinase, functions as a homotetramer in glycolysis to convert phosphoenolpyruvate to pyruvate, the input for aerobic (TCA cycle) or anaerobic (glucose fermentation) respiration chromosome 2-1, 0769 E.C.2.7.1.40
5' region (in bold stop codon next gene) ctctcatgaaccggatagcacgtttgccgctaacaaatcagttaaaatgtataagtacca ttctttttgacagagaaacttctcgtacagagttcatcttttactttgatcgggttgcta acatgctgatccatctggcattggaacaggtagagttcggaccctcgcaagatgaggtat tgaccccgcaataccattgcctaactgatgcgatacgaccgttacaatcggttgtcgttg tgactatggtacggacaggtgatgtatttatgaattcaatcagaaaaactattccagatg taagagttggtaagttgctaattcaatcagacctaattacaggcgaacctcaattgcata caaagtcgctgcctccatgtgaacaaactaccaagctactattattcgatgcgcacatta tatcgggggccgcagcaattatgggcattcaagtacttctggaccatggtattgaagaag gtaatatcgtgattgtaagttatcttgcagaagaagctggcctacgtcgcatactgaacg ctttccaaaaggttactattatcgtaggcttatcctctgggaggatgacctcattattga
aagagccaatgtttcgtacacggttcatcgacgattactacttcggcagtacgtagttca gtgtgcgggatactgtattccgctcggggttctaaagaaattgtttaaactaaaccaaat cggatcagaggttccgtacgtttttcacattcaaggatgagggttttccacgagtgaact attactccggtctcccaccatcatttgcggaatgaaaccttttgtgctgagattgtatag ggcgtggggacggacgcttcttaaccgttcccctagaatgtcgtcccctgatcaaaattt aatggcatccaactttgctgtaataggtatatataacctagcaggcgaccgttcatgtac agtaaattgttttagacttttttttaactgaaatcaatcca
ORF (in bold stop codon previous gene)
ATGTCAACTTCTTCAAAACTTGGATGGCTATCCAAACTGGATGTATCATCCACTCCAGAG CGCAATCTTCGCCGATCTTCCATTATTGGTACGATTGGTCCTAAGACTAACAGTCCAGAA GTACTTGTCAGTCTGAGACAGGCTGGGCTTAATATTGTTAGAATGAACTTCTCTCATGGT TCTTACGAGTACCATCAATCAGTTGTGGACAATGCCAGAAAATCAGAGGAAATATATCCT GGTAGACCGCTTGCAATTGCTTTGGACACCAAGGGTCCTGAAATCAGAACAGGAACTACC AAAGGTGAAACAGATTATGCAATTCCTATGGGACATGAAATGATATTCACCACAGATTTG TCTTTCGCAAAGTCTAGCGATGACAAAGTGATGTTCATTGATTATAAGAACATTACAAAG GTTATTGAACCTGGAAAGATTATTTATGTGGATGATGGTGTTCTTTCATTTGAAGTTTTA GAAGTCGTTGATGAAAACACTCTCAAAGTCAGATCTATCAATGCTGGTGCAATTTCATCC CATAAAGGTGTCAATTTACCTAACACAGACGTGGACTTGCCAGCTCTTAGTGAAAAAGAC AAGCAAGATCTGAGGTTTGGTGTGAAAAACAAAGTCAACATGGTATTTGCATCCTTCATT CGTTGTGCTAATGATATCAAGGAGATTCGCCATGTTCTCGGAGAAGATGGCAAACAGATC CAGATTATTGCTAAGATTGAGAACCAGCAAGGAGTTAACAATTTTGATGAAATCCTGGAA GTCACTGATGGTGTCATGGTCGCTAGAGGAGATCTAGGTATTGAAATCCCTGCACCTCAG GTATTTGTTGTTCAGAAGCAATTGATTGCCAAGTGCAATCTGGCAGGTAAACCCGTAATC TGTGCTACCCAAATGTTAGAATCTATGACTTACAATCCTAGGCCAACAAGAGCCGAGGTT TCGGACGTTGGAAATGCTATTTTGGATGGCGCCGATTGTGTGATGTTATCCGGGGAAACT GCCAAGGGAAACTACCCTCATGAGGCTGTTGCTATGATGCACCACACTGCTCTAATTGCA GAGTCAGCTATTGCTTACCTTCCACACTACAACGAAATCAAGGATCTTGCTCGTGGTCTT ATTAACACAGTTGAAACTATTGCTATTGCCGCTGTTTCTGCTCACTTTGAACAAAATGCC AAGGCCATTGTTGTGCTTTCTACTTCAGGAACTTCAGCAAGAATGATTTCTAAGTATAGA CCGAATTGCCCAATCCTTATGGTAACCAGAAATGATGAGGCAGCAAGATATTCTCATCTC TATCGTGGAGTATATCCATTCATCTATAAACAGGAAGTTAATGATAACTGGCAACAAGAT GTTGAAGAACGTTTACAATATGCCATCACTGAAGCCATTGGCATGGGAATATTGAAAAAA GGTGATGCTATCGTTGCAGTACAGGGGTGGACCAAGGGACTGGGTCACACCAATACTATG CGCGTTGTTTTTGCTTAA
Downstream gtagtgtcaatacatgcgcccttttggactaggttggttttgagcatatttaatagctct cctctaatctttcgccaatcactctcattgttagtatcattgtcaactaactcgcttgag tttaaactacaggttatgatatctgctagtgacaacacttcataaatcttttcattcagg attaatttgtgaatgaaaccgtaaaagacatctaggtcattctcatgaataccgattggc ttctcgttttcattttcaaaactgaaccgaacccaatctttcaagagatttttccaatct tcgaactcatagaatgtctgaatgcctaaattgtagtgcttgttgaacgtcttcgttttc tcggaagcccagtaggagttcacaagggttttctgtccctccatagttaaaccacttagt tcagtcttctggtgaattacagaatcccctaaattttcaattggaatattcaacgttagt ttgtttaatgttctttggttactaattagatcaatcctcttgttctgggtttctggccgc actttaccaatctttttgagagatctattttgaagtagacaattgcgattgcgattcttt tttggacttttctcaaataactgctcaacactacgtctctctagttcaagtttaatttct tgctgaaactcgttaggcaaagctttgaaggcgtcaagatctacaagatgcttctgcaga agtttttgtaaagtttcttttgctggagagtagtgagggaaaacagaaccacctttgagt acattctgttcgatgccttgcacttttttttctttaggctttttggcttcgaattgtaat tgcttttggtccacagctgaagttgcgtcactatcacacttttgcagttcactaatggtc acagaaacacctctgagatcagcaattgcaaccccaattatgcggtataatgatttcaac tctgtggctattagaccaacctctcttgttggtgctcctaa
AA
MSTSSKLGWLSKLDVSSTPERNLRRSSIIGTIGPKTNSPEVLVSLRQAGLNIVRMNFSHGSYEYHQSVVDNARKS EEIYPGRPLAIALDTKGPEIRTGTTKGETDYAIPMGHEMIFTTDLSFAKSSDDKVMFIDYKNITKVIEPGKIIYV DDGVLSFEVLEVVDENTLKVRSINAGAISSHKGVNLPNTDVDLPALSEKDKQDLRFGVKNKVNMVFASFIRCAND IKEIRHVLGEDGKQIQIIAKIENQQGVNNFDEILEVTDGVMVARGDLGIEIPAPQVFVVQKQLIAKCNLAGKPVI
CATQMLESMTYNPRPTRAEVSDVGNAILDGADCVMLSGETAKGNYPHEAVAMMHHTALIAESAIAYLPHYNEIKD LARGLINTVETIAIAAVSAHFEQNAKAIVVLSTSGTSARMISKYRPNCPILMVTRNDEAARYSHLYRGVYPFIYK QEVNDNWQQDVEERLQYAITEAIGMGILKKGDAIVAVQGWTKGLGHTNTMRVVFA
SEQ0075
P. pastoris homolog of Saccharomyces cerevisiae PDB1 (YBR221C)
E1 beta subunit of the pyruvate dehydrogenase (PDH) complex, which is an evolutionarily-conserved multi-protein complex found in mitochondria chromosome 1-4, 0593 E.C.1.2.4.1
5' region (in bold stop codon of next gene) atgggaattcggaggatatgtcccatttggtggacttgttgctgttttctcctggagtta agaactatagtttctacagagctactccaagtcaacagcagttacagcagcagcagcaat ctcagcaggtgcaacaggcacagcaagttcaacaggctcaacaagctcagcaggcccagc aggctcagcaggttcaacaacaggctcagcaggcccaacagcagcaggcacagcaacagc aacaacagcaacaattgcagcagcagcacatccaacaattgcaacaaatacagcaactac aacttgcgcaacagcagcagcaacagcaacagcagcgtcaacaacaggaacaccttcttc agcaattgcagcagcagcagcagcaacaatatcaccaacctcaaacacaaggccaaaatt tcccacaacagtacttgatgcaacaggcccaagctcaggcccaagctcaagctcaagttt tagcccatgctcagcatgacaactcatccaatccaatgttcactaacatcagagaaggat ctggggtctcagctacacctccacccccaggattgtttagtggagtgaccaatcacacta cggaaactcataacactccatcatctgaactattgaaccagctgatgaacggggagggta gaaactcgatcaatgcttagtcggtgattatctgttatataagaagtacctattagttga ataaagtaataatattggatgtctgatgttccgaggcttccctagtccgagtcgattgcg cgcgtaaattggtgcttttcccccaccgaaacaataatgaggggatctccatatcacgtg atgcattcggtgtaacttttagtggtataaaccgcggtcggatgcactccgcctaacaaa cttctgtgaggtgcgaaacaaggaacccgtaaaggaaaacctcattgattacctgttagt tcctactttctcttttacccacaaggttcactctcaccaca
ORF
ATGTTGCCATTTCAAGCAGTAGCCAAAGCAGGCCTCAAGCCCCAATTGACCCGCATGAGT GTAAACTCATTGAGAGCAGCTTCTTCCTCGGCGGGTCCAACAAAGTTGTCTGTCAGAGAC GCTTTGAATTCAGCCATGGCCGAAGAATTGGACAGAGACCCTGAGGTGTTCTTGATCGGT GAGGAGGTCGCACAGTACAACGGTGCTTACAAGGTTTCCAGAGGACTGCTAGACAAATAC GGGCCCAAACGAATCGTTGATACCCCAATTACCGAAATGGGTTTCACTGGTCTTGCTGTG GGTGCTTCGTTGGCAGGCTTGAAGCCAATCTGCGAATTCATGACATTTAACTTTGCCATG CAGTCAATCGATCACATTATCAATTCCGCTGCCAAGACCCTCTACATGTCTGGTGGTAAG CAACCCTGTAACATCACTTTCCGTGGTCCTAACGGAGCTGCTGCTGGTGTTGCAGCCCAA CATTCCCAGGACTACTCTGCTTGGTACGGATCTATCCCAGGTCTGAAAGTTATCTCTCCC TACTCTGCCGAGGACTATAAGGGTCTGTTCAAGAGCGCCATCAGAGACCCAAACCCTACC ATCTTTTTGGAAAATGAACTGTTGTACAACGAAGAGTTCGAAGTTTCTCCTGAGGTTCTG TCCCCTGATTTCACTGTTCCAATTGGTAAAGCCAAGATCGAGCGTGAAGGTACCGATATC ACGATTGTATCCCACAGCAGAAATTTGCAGTTCTGTTTGGAGGCAGCCACCATTTTGAAG GAAAAGTATGGTGTCTCATCTGAGGTTCTCAACCTTCGTTCCATCAAGCCATTGGATGTT CCTGCCATTGTTGAATCTGTCAAGAAGACCAACCATCTGATAACTGTTGAAGCCGGTTTC CCAGCCTTTGGTGTTGGTTCCGAGATTTGCGCTCAGGTCATGGAATCCGAAGCTTTTGAC TACCTAGATGCCCCTGTTGAAAGAGTGACCGGATGTGAAGTTCCAACCCCCTACGCCAAG GAATTAGAAGACTTTGCCTTCCCAGACACCCCAACTATTATAAGAGCTGTCGAGAAGGTT CTTTCGTTGAAAGAGTGA
Downstream (in bold stop codon previous gene) ttatagagttcaataatgcccatgatatattataaactaattttgaaaagatatacagtt ggctaaatagacctatccaccctctgtttgactctttcaagactcatcttgacgtctccc ttagaaaagttttctagattgtaataaacgggagccaattctatcaaccactcaggtcta attgatgttactgttcggatgtagttcttggtcgtcaagacgaattcattatagattacc cactcgttctcctttgcaagaacggtacttgggtggattaaaacgtcttggttatctttc actgtaatgtagcctttggccccagatcttttctttgcgacttgcatgaaaaatccagac gtaagggctttccggatgttatcgtagtatcttggatcttcaaagtcagttgatacgagg tgcaagtcatacctctccatcaacctttctaattgccttctaacgttttcggcgctcttt agggatctataggagaggaaatgctctttacaccaagggttcaggcccttttcgtaggct
tcctcagactggaaggcatgataaacgttcaaaagagttagatgatcaccatctgagtgg gcaaacaaactcttcatttcatcggctttcttcttgttatttggtggacgcacaaatacc gagggaactgacaacatagccacaatggtaagaatctcgttcgaacaattgaactctgaa gaaccaatcaacataactgccaacattggatcaagagggaattgtgatgctaatcttccc agagctgtcaacgaaccatcgtcgtctaaacaagccagatagttcagctcttccaaagca cgcatcatggtttctggtgctggagggtccataaaatcgaagtgaaccaaatcatcaatt cctagcttttttaactctaacacagttgaagccaggttgcttctcaggatttctggataa ctctgttcaattagttctttctggaatgcctcctcggtgta
AA
MLPFQAVAKAGLKPQLTRMSVNSLRAASSSAGPTKLSVRDALNSAMAEELDRDPEVFLIGEEVAQYNGAYKVSRG LLDKYGPKRIVDTPITEMGFTGLAVGASLAGLKPICEFMTFNFAMQSIDHIINSAAKTLYMSGGKQPCNITFRGP NGAAAGVAAQHSQDYSAWYGSIPGLKVISPYSAEDYKGLFKSAIRDPNPTIFLENELLYNEEFEVSPEVLSPDFT VPIGKAKIEREGTDITIVSHSRNLQFCLEAATILKEKYGVSSEVLNLRSIKPLDVPAIVESVKKTNHLITVEAGF PAFGVGSEICAQVMESEAFDYLDAPVERVTGCEVPTPYAKELEDFAFPDTPTIIRAVEKVLSLKE
SEQ0076
P. pastoris homolog of Saccharomyces cerevisiae ARO10 (YDR380W)
Phenylpyruvate decarboxylase, catalyzes decarboxylation of phenylpyruvate to phenylacetaldehyde, which is the first specific step in the Ehrlich pathway chromosome 4, 0314 E.C.4.1.1.1
5' region (in bold start codon previous gene)
Agcatctgaagggaagtactctgtagcggagaaagacatatggacttttgtgttttattt Ttctgacgctattagtttctttgcagtaagaagtcagcactgttgtttcaaacttcagct Tttcaaatggggagaatgaagggggggatatggaaaagaaggcaggttgtgaaggataaa Gcagcagaactcacagtatactctcaattctagaaatcagataaaacttcaaactgaaac Gtgtagctctccgactttacgtgcaaggaagagatgagggtaataaatttaaaatcggag Ttgttgacgtctttaacccaatcgggctcgatacactatctggaaacgcggtttagcctg Atactcgtgtggtggaaggggttattttaattttgcaggactaatcccaatgcggaaatt Tcacctagactgcgtttgtttcaccgaaaacaaccttgatttgaaatttgaaaggttgtg Gagttcgaaggagttgccttgcatgaaataaattggaagcatagttgcagcccaataggg Atgtttgatgatacagacatcaatggtttgcaaacttatatgactagatgttggatggcc Tgtatatggtgcctctactttttgcttactagactaagttagtgggttggttagagtgta Catggtgtcagaacgatctatcagtttacagcagacactaagtaagggctaaggagcgtt Tacctaaggcgtcaaagaacggaaggttattctaggaaggatctgtgtcctgtctaagct Ttagattacacacggatatctattatgtgcaagaagtttccgccataaacctcgaataac Cgaataaccgctcatatccgaaagtaaccgcacatttccgctatggggggagaacattca Cctagcttatccgtaacaagatcatgaggcattagcaaaatcttcatgtgtactgtggta tatatagacctgtattatccttccgtaaagtcattcaaaccaatttcaag
ORF
ATGGCCCCAGTTAAACAAGACTTTAACATAGACGTTCAAACAATCGAGAACACTGACATC TCTTTGTCTGAATACATATATCTTAGGATAGCCCAATTGGGTGTCAAGTCGATCTTTGGT GTCCCAGGTGACTTCAACCTGAATCTTGTGGATGAACTAGACAAAGTTCCCCAATTGAAA TGGATAGGATGTTGTAATGAGCTAAATGCCACTTATGCTGCTGACGGCTATGCAAAAGCA TCAGGAACGATAGGCGTTGTGGTCACTACTTATGGTGTGGGTGAGCTAAGCGCCATCAAC GGTATAGCAGGAGCATTCGCCGAATATGCTCCTGTTCTTCACATTGTAGGCACCTCCGCT ATGGCAACAAAACGACTTGAACATGTGCACAACATTCATCATCTTGCAGGGTCTAAGAAT TTCTTGGATAGACCAGACCATTATATATACGAAAAAATGGTTGATGACATCTGCATTGTC AAAGAGTCTTTAAGCGAGATCGAAAATGCCTGTGGCCAGATAGATAATGCCATCGTACAG ACTTATCTGCTTTCCAGACCAGGATACCTATTTTTACCTAGAAATATGGCCACTATGAAA GTTCCAAGAGAAAGGCTATTCAACCAACCATTAGCCTTGGAAAGAGTTGATCTTCACCCC GGTGAAACGCTACAAGTTGTTGAGAAGATCTTGGAAAAGTTCTATCATGCAAAAGAACCG GCCTTGATTGTGGATTATTTAACCAGACCCTTTAGAATGATGGAAAACTGCTCCAAGTTG ATCGGTGCTTTGGAGAATAAAGTCAACATTTTCAGCAGACCAATGTCCAAAGGCTTTGTC GATGAGAGCCACCCAAGATATATTGGCTGTTATATTGGAAAACAATCCAAGCACCCAAGT ACTAGTGATATTCTAGAAAAAAAGAGCGATTTCATTCTGAGTGTGGGAACGTTTGATGTT GAAACGAATAACGGAGGCTTCACATCCAAGCTTCCCCAAGAACATTTGGTGGAATTGAAC CCTCATTTTACTCGTGTTGGAACTCAATGCTTTTCAAATGTTAACATGTGCCATGTCCTC
CCACTCCTGGCAAGTAAGCTGAGAGGTGATTTGATCAGCATGGCCACAGTTCATCCAAAT GATTTCAGCCTCAGAAAGAAAGAAAAGGCACAGGATAAAATGAAGGCTCTCAACCAGAGC CATCTTGTAAAATCTACAGAGCTACTCCTAAATGCAAACGACACTTTGATAGTTGAAACT TGCTCATTTATGTTTGCAGTTCCAGATATCGCGTTCCCTAACAATACACAATTTATCAGT CAGTCATTCTACAACTCCATAGGATACGCACTTCCCGCCACCTTGGGTGTCAGCATTGCA AAGCGTGATTTCAGAAAACCAGGAAAAGTCGTACTTATTCAAGGGGATGGCTCTGCTCAA ATGACCATTCAAGAACTGGCTACCATGGTCAGACAAAAGGTCAAACCCACCATTTTACTC CTCAACAACGAAGGATACACAGTTGAACGAATGATTCTAGGTCCAACCAAAGAATATAAC GATATAGCTCCTAATTGGGATTGGACTGGAATGCTGAGAGCATTCGGCGATATACGTGGC CATTCCAAGAGCATCTCAATTGACACATGTGGTAGATTAGATAAGCTCGTTCAAACTCGA GAGTTTCAAGAACCTACACACCTAAATTTTGTGGAACTTATCTTGGGCAGAATGGACGCC CCGGAAAGGTTTGCCAATATGGTAAAAGAAATTGCTAACTTAGAGCACGCAAGTAAAAGT ATTCATTAG
Downstream (in bold start codon next gene) taattaaaataagactacagatcacgtgccgtaaagggtaaaacctggccgagggccagc cttatcatgggtgttttttatcttgtcaaatgcgtattatatttgctcctcgctcgcccg tttggcttctcctctctaatctccacgtgacccaaaacctgacgtctaatagtaccccac atttgaccccctgatgttgagttgagtcatccatcacagaaccaaccagctaagagtttg gcttcccggaaaaatttcccgtccaccgtctccgtctcgtaaactcttctttaaagggtg gtgtttatcgtcgattgactcagtgattagcctttgccctttttctctcgatagggaaaa acctcgcgaccaaaaaaagctatcgatccctctttccttaatctctcctttttttttatt gatagccccagtgtttactagcaaactctgactatttataaacttgaaacagaaacacaa tgaagcaaacgtattacgatccaaaggtagctgagccaattggatccgagtcaatggatg atagtaagctggacgacatggttcaatcgttgcccccagttgaccctttgttagatactc ccttcccgttcaggagtgcagacgtggaacaaggtgacacgtttgccacagagaacaacg ctgggcctacgcagtccgcaaggactaacgtgaacttcttgcccccacatcaatcattgt tcgatgccttacagagctctaacctgaacatggttgaggagattgaccctttgagacgaa gcaaccaaccggaagaatgttccccttcttgttggaaccaggagttctccgtgactgcgt ggcaaatggtccaccgagacgatgtgaagttgcaaaatctttggacccacacaaaaatgt acggaacaggcagtagacgtgttccaggtacggggaacggtgaatcagtgctgtccagct cagagtacgaggagaccgaccttccagatacccctctcctc
AA
MAPVKQDFNIDVQTIENTDISLSEYIYLRIAQLGVKSIFGVPGDFNLNLVDELDKVPQLKWIGCCNELNATYAAD GYAKASGTIGVVVTTYGVGELSAINGIAGAFAEYAPVLHIVGTSAMATKRLEHVHNIHHLAGSKNFLDRPDHYIY EKMVDDICIVKESLSEIENACGQIDNAIVQTYLLSRPGYLFLPRNMATMKVPRERLFNQPLALERVDLHPGETLQ VVEKILEKFYHAKEPALIVDYLTRPFRMMENCSKLIGALENKVNIFSRPMSKGFVDESHPRYIGCYIGKQSKHPS TSDILEKKSDFILSVGTFDVETNNGGFTSKLPQEHLVELNPHFTRVGTQCFSNVNMCHVLPLLASKLRGDLISMA TVHPNDFSLRKKEKAQDKMKALNQSHLVKSTELLLNANDTLIVETCSFMFAVPDIAFPNNTQFISQSFYNSIGYA LPATLGVSIAKRDFRKPGKVVLIQGDGSAQMTIQELATMVRQKVKPTILLLNNEGYTVERMILGPTKEYNDIAPN WDWTGMLRAFGDIRGHSKSISIDTCGRLDKLVQTREFQEPTHLNFVELILGRMDAPERFANMVKEIANLEHASKS IH
SEQ0077
P. pastoris homolog of Saccharomyces cerevisiae PCK1 (YKR097W)
Phosphoenolpyruvate carboxykinase, key enzyme in gluconeogenesis, catalyzes early reaction in carbohydrate biosynthesis; glucose represses transcription and accelerates mRNA degradation; regulated by Mcm1p and Cat8p; located in the cytosol; chromosome fragB, 0061; E.C.4.1.1.49
5' region (in bold stop codon previous gene) gctacatcggaaccaattttgaattcgccgatacgaaaccggacatcaagttccatcatc atgcaagatgatcatcaggaagagccgtggtcagtcaacactgcaagtgtcaaagagaca tcaattattgacattagtaccccggatgaggcttacaatgctcgcgggaaagacgatgcg atacgggttgattcggagggcccttcaaaaagaacaactattctcagccgtctgatggag acacaagattcggacaatgacggaacggagtctgacgtaggggagagttctagtctgata attcttagttgaccagccttgcatcgggttgaacaaacttttgtgggttgggccactttg cgccaacaacacaaggttttcacaccacccggttgtcattaccgcccggaaaactcaatg attgacggagcgtaggttcacggaagtcaagttagagtagtcggagagtttcgggtgata
agaatttcccgggtgctttggagtgccaaagtgactgaatcaggaactaaaaacccgggc taacggctgaaggcctgtgttcagtgcactatattgtattctagtctggggtaggttgaa gtttggtggattcacgactccggagtgatggagggacaataccgaagatgaggtcttgcg agtggaaagataggttaagaataattaataacattgtatgaaacaagatggggagacaat tttagctatcgatccgttatcattggtcgaatgatgacttggaaccttatttgcactttt tttgtcaggtgttcgccgaaacttaaagatagttttattgatgtgtgccaaaaatgtggg gacaaaaggtcctccacccgccatcccccggtctcatacaaaaaaaaagaaaaatacctc tactagtccggggtcaaggcaagtcatgataggcatataaataggggcaggctggccgcc attacgaattagattaaccttctcatagattattatccaca
ORF
ATGGCTCCTACTGCTATAGATTTACAACAAGTGTCTGATGAAGACTACCAAACCAAAATA AAAAAGGCTACCTCGCCTTCATATTCTGTCATTGCCTCTCTGGGTTTAAACCCTGAGGCA GTGATCCGCCACAACGCTGCCGTTCCTACTCTTTACGAGGACGGTTTGTTGGAGAAAGGA ACTGCCATCAGTTCTACTGGTGCTTTGATGGCCTACTCCGGTAAGAAGACCGGAAGATCT CCAAAGGACAAGAGAATTGTCGACGAGGAGACCTCCACCGAACACATCTGGTGGGGACCT GTTAACAAGAAAGTTGACGAGCTGACCTGGAAGATTTCCAAGGCCAGAGCCATCGACTAT CTGAGAACAAGAGAAAAGATCTACGTCATTGACGCTTTCGCTGGTTGGGATCCTCGTTAC AGAATCAAGGTTCGTGTTGTTTGTGCCAGAGCTTACCACGCTCTGTTCATGAAGAACATG TTGATTCGCCCTACTGAGGAAGAGTTGCAGAACTTTGGTGAACCGGACTTCACCATCTGG AATGCAGGCCAGTTCCCAGCCAACGTCCACACCAAAGGTATGACCTCTTCCACATCCGTC GAGATCAACTTCAAATCCATGGAGATGGTCATTTTGGGTACTGAGTACGCCGGTGAGATG AAGAAGGGTATTTTCACTGTTATGTTCTACCTGATGCCAATTCGTCACAAGGTCTTGACT TTGCACTCCAGTGCCAACCAGGGTGTCCAGAACAAGGACGTCACTTTGTTCTTCGGTCTT TCTGGTACTGGTAAGACCACTCTGTCTGCTGACCCTAACAGAAAGCTGATCGGTGATGAT GAGCACTGTTGGTCAGACAACGGTGTTTTCAACATCGAGGGTGGTTGTTACGCCAAGTGT CTGGACCTGAGTGGTGAGAAGGAGCCTGAAATTTTCAATGCTATCAAGTTCGGTTCCGTT TTGGAGAACGTTGGCTACGACCCAGTTACCAAGATTGTTGATTACTATGACAAGTCTATT ACTGAGAACACCAGATGTGCTTACCCAATTGACTACATTCCATCTGCTCTTATCCCATGC ACTGTGGACTCCCATCCTAAGAACATTGTGTTGTTAACTTGTGATGCTAGAGGTGTTTTG CCTCCAATTTCCAAGCTAACCAACGCTCAAGTCATGTACCACTTCATTTCCGGTTACACC TCTAAGATGGCCGGTACTGAAGACGGTATCACAGAGCCAGAGGCTACCTTCTCTGCTTGT TTCGGTCAACCATTCTTGGTTCTGCACCCTATGAAGTATGCTCAACAACTGTCAGACAAG ATGATTCAGCATGAATCTACCGCTTGGTTGTTGAACACTGGATGGACTGGATCTTCTGCC ACCAAGGGTGGTAGCCGTTGCCCACTGAAGTACACCAGAAGAATTCTGGACAGCATCCAC AGTGGTGAATTAGCCAAGGCTGAGTACAAGAACTACCCAACCTTCGGTCTACAAATTCCA AAGGAAGTTCCAGGTGTTCCTACCGAGATTTTGGATCCTTCTAACAACTGGGTTGACGGT AAAGACAACTTCAACACCGAGGTTAGAAACCTGGCCCAACTTTTCATGGAGAACTTTGAG AAGTACGCAAGTGAGTGCACTACTGAGGTTATCCAGGCCGGCCCTCAGTTGTAA
Downstream (in bold stop codon next gene) ttgcaacaggataatattatttgaactttgaagtgtatatatacatatttaactttgctc attgagagattaacttattaactatgactggttgtagaggagagagtttctctgagttgt ttcgtatacctattattttcttttgatagtctatcgtcttccttcggacacttgagcgct caccacactttctgcgtcactgtctataacgaggtcgtcctcttcgtcctcactaatctc attccaggacccgtagatcaatagaccaaatccaagtacgacgaccatgtcaccaaagat ctgtgtccagctcagctgaataccgaagagtagccaatccgtgattcccacaaacaggat tgtcactagggacgaaaccgaagagagaacaggagaagttaaagaagtaagggctaaaaa tgaacaactgaataaaaggtttgataacactgacaagccaaccaggagccaaatggtccc actgtagtccagattgaacttatggagtccggtgaaatgtgcaaataccagaggagccca gaagagaccaaaattgaatattccaatcactgacatagcaaagtttgacatagatgcctg tcttcgagaggagattaagcttgatggacagcagaggtacttcttgtagagaacctcgta aaggccatagagtacggcacccactgatatgatgacatttcccacgaacctatatgactt gtcatttccagattgatcgtcatcggaaggcacagcagggttgtaagtgacaataaagac tccgaaaactgcaattattacgcttgaaagtttcaaaaaggaaaacttttcacctagaag cggaattgcaaaggcataggcagtaaatgcggaacaattgtaaatggccgtaacgtcgtt acttgtgctcaaggtcagggcaaaataccacgatgatccagccacattcaatatcagtgt aatagggatactggtcagcaagatatactttatggaattgg
AA
MAPTAIDLQQVSDEDYQTKIKKATSPSYSVIASLGLNPEAVIRHNAAVPTLYEDGLLEKGTAISSTGALMAYSGK KTGRSPKDKRIVDEETSTEHIWWGPVNKKVDELTWKISKARAIDYLRTREKIYVIDAFAGWDPRYRIKVRVVCAR AYHALFMKNMLIRPTEEELQNFGEPDFTIWNAGQFPANVHTKGMTSSTSVEINFKSMEMVILGTEYAGEMKKGIF TVMFYLMPIRHKVLTLHSSANQGVQNKDVTLFFGLSGTGKTTLSADPNRKLIGDDEHCWSDNGVFNIEGGCYAKC
LDLSGEKEPEIFNAIKFGSVLENVGYDPVTKIVDYYDKSITENTRCAYPIDYIPSALIPCTVDSHPKNIVLLTCD
ARGVLPPISKLTNAQVMYHFISGYTSKMAGTEDGITEPEATFSACFGQPFLVLHPMKYAQQLSDKMIQHESTAWL LNTGWTGSSATKGGSRCPLKYTRRILDSIHSGELAKAEYKNYPTFGLQIPKEVPGVPTEILDPSNNWVDGKDNFN TEVRNLAQLFMENFEKYASECTTEVIQAGPQL
SEQ0078
P. pastoris homolog of Saccharomyces cerevisiae PDX1 (YGR193C)
Dihydrolipoamide dehydrogenase (E3)-binding protein (E3BP) of the mitochondrial pyruvate dehydrogenase (PDH) complex, plays a structural role in the complex by binding and positioning E3 to the dihydrolipoamide acetyltransferase (E2) core; chromosome 1-4, 0254; E.C.2.3.1.12
5' region (in bold stop codon next gene) gattcttgtggaagcatatggctctggttttacggatgaagcattcacaaagctagagaa agaacttgagtacatttcagcccaagtggaaagtaataaaagttgtcccacgtttgaaaa acatgcattctatagaagtcgacttttgatgttggaactatcatgcctgagggctcaaac agttttacttctatatcgtccaaaattgactactggaaagtccctccgggctgtcaatgt ggcaaagtcaataatcctcgatatatggagtcaatatacgaaacagtatcccagcaacaa aaaagaaatgctggaacacctagattggaatttttgctatcccttgcgaactgcgacact gacgttgtggatctcgtgtgctatcctccttgaatataaacacatgattgaattcatgag aaccagtgatccatttgaatacattctggcattagaagtattggaaggactagttcaaat acttccaattgagcaacagcttttagatctgctaaaatctcccaacgattcggttatagg gacggacggctgtacaggcaatttttggacgactatgtctgatcaaaagctgcaaaataa gtaggattcttcacttgtccggctatgtgatggtcattattgagggttttataagacgta tcagtttagaaagtgaatctttaaactttaaagtcgagattggtaaatgatacttaatca caaacttcaaaaatgccccactttttctttgcaaatcaatttctaggctgagcttcgttc aacgttaacgagtagtctagtagccttccgagttgaagatcatagctcacaattcataac gtcaaactcgaacctctaacttaaagtaagcttccaccaaatcgtgcacccactacaaat cgcgacttgttccctccacttcggggtatcttatacaatcatcgaagccgctaaaacaat agttggcgaaattgaacattaaatcaatctctttaaggcaa
ORF
ATGTTGAAACAAATAAGCCGTCAGTTCAAACAATTTCCTGTGAGCCGCAGATGTCTTCAC GGATCTTCTTTCAGATTGGGAGCTACAGTGTTTGACATGCCAGCAATGTCCCCTACCATG GAGAAGGGAGGTGTTGTCTCCTGGAAAATCAAAGAGGGAGAAAAATTTAGTGGTGGCGAT GTGCTGTTGGAAGTTGAGACAGACAAAGCCCAGATTGAGGTAGAGGCACAAGACGATGGT GTGCTCGCAAAGATACTGGTACCAGCGGGTACCAATGACATCCCAGTGGGAAAACCAATT GCCTTTCTGGCCGAGCAGGATGACGACTTGAGTACTCTAGAGTATCCTAAACTGGAAGAA ACTGCCTCTAAGAAGATAGAATCAAAACCAGAGAAAGCAGAGGAGAAAATTGAACCACCG CAACCAAAGGAGGAGAAGAATACCAGCGGCAGTGACAGTGGTAAACTGGGTAATCCGAAA CAGACTTTATTACCCTCGGTGGAACTGCTCCTACACGAAAATGGATTGAGCAAAGAAGAT GCTCTTGCCAAAATTCAGGCAACAGGACCCTCCGGAAGAATTTTGAAGGGTGACGTATTG GCATATCTGGGCAAAATTTCACCCGAATTGAATGTCAAAATTTCCGAGTACATTAATAAC AAGTCTCACTTAGATCTGTCTAAGATACAAATCAGGGAAACCAAGGCAAGCGAACAATCT CCATCATCTTCAGGAGATAAACCAGCCAAACCTGAAAAACCTGCTAAGAAAGAGCCTCTA AAGATCGAGAAGGAATTGACACTATCTCAAGCTTACAGTAAGGAGGCCCTGCAAAGATTA TTAACTATTGCAGAACAACATGCATACAGTGCCAAGCTGTACCAAGAGGATTCAGAAGTG ATTGATCCACTGTTTGAGGAGATCGTTGCTCCACCTAGAAATGCCGAACGATTCAAAACA CAATTTATCGTCACGCCCTCAGACAATAGTGTTTCCACATTGAAGCTTACCTTATTAGTC AACGAGAGTATATTGGATGCCAAGCAAAGAGCTCAGTTATTCCTTGATGAAGTCAAGGAT CAATTGACGCAAGATAATGGTGGCGTTTCGTCGTCTCCCCAACTTGAAGAGTTATTCTGA
Downstream (in bold start codon previous gene) gttgcaacaagtctaagtagtaagtaattaaaccatcatgatcctatgatcgtgatcatt cattaaagcacggtgtggcaattattgctagggagatcgtcactgtatggtggcagaatt atctctacaagatgtctcaaagtccccacaaagcttggaccctctcatctgtaatgcatt
ttcctgtaactccccttagccacacgtcaagggctctgaatccgttgaaaagctgtggcg tctgccacctttaacgtcttcatgagggatgtgcacgtgatattgtctttcccttctcta aagcttcgaaaaaaacgcatctcaatgcgagaagcagatcgatatatataaagaactagt ccattgaaagatctctcaatttcactggaaaccaactcagaaagaaatgccttctcctca cggtggtgtgctacaagaccttattaagcgtgacgcttctatcaaggaagatttgttgaa ggaagtccctcagcttcaaagtattgtgctaactggtagacaactctgtgatttagagtt aatcctaaatggaggtttcagtcctttgacaggatttctgaccgagaaggattatcgctc cgttgttgacgatttgagactcgccagtggtgatgtttggtctattccaatcaccctgga cgtcagcaagaccgaggctagtaagttccgtgtcggcgaaagagtggtgttgagagatct tcgtaacgacaatgctctgagtattctgaccatcgaggatatatacgaacctgataagaa cgttgaggctaagaaagtcttccgcggtgatccagaacacccagctgtcaagtacctctt tgatgttgccggtgatgtgtatattggtggcgctttgcaagctctacaattgcctactca ttacgactacaccgccctgagaaaaacgccagcccaattgaggtctgagtttgagagccg taattgggaccgtgttgtcgctttccaaacccgtaacccaa
AA
MLKQISRQFKQFPVSRRCLHGSSFRLGATVFDMPAMSPTMEKGGVVSWKIKEGEKFSGGDVLLEVETDKAQIEVE AQDDGVLAKILVPAGTNDIPVGKPIAFLAEQDDDLSTLEYPKLEETASKKIESKPEKAEEKIEPPQPKEEKNTSG SDSGKLGNPKQTLLPSVELLLHENGLSKEDALAKIQATGPSGRILKGDVLAYLGKISPELNVKISEYINNKSHLD LSKIQIRETKASEQSPSSSGDKPAKPEKPAKKEPLKIEKELTLSQAYSKEALQRLLTIAEQHAYSAKLYQEDSEV IDPLFEEIVAPPRNAERFKTQFIVTPSDNSVSTLKLTLLVNESILDAKQRAQLFLDEVKDQLTQDNGGVSSSPQL
EELF
SEQ0079
P. pastoris homolog of Saccharomyces cerevisiae ACS2 (YLR153C)
Acetyl-coA synthetase isoform which, along with Acs1p, is the nuclear source of acetyl-coA for histone acetylation; mutants affect global transcription; required for growth on glucose; expressed under anaerobic conditions; chromosome 3, 0403; E.C.6.2.1.1
5' region aacttgaaggagcatgtaaaccttttctttattcgcagagtagtttttatgaaagggact caatgttccaagatacgattgatacaaatttgcgtaatgtaatatcttctttaactgtga ttaaatcttgcacatccagaatatctcttatcgtgtttctttgataaacgtcacttatat atatggttggcatttattcactgtctgtgaacgtatatctggcaactaggaattaccatt caacattcaaacattcagaggattttaggacgctgttaataagagtttaatcacctttcc gagctgaaggtttccactagaccctcatttgacatagattgcaactacaaagttaccctg ccaagcatgtcgattgactatcttatggagtctacacagttcttgttgaacgttgcttgc caagcttgatgtacaagactaattgacttagatttggctcctatcaatgcaataaacgtt agcaatgctattcgatgttcccgtataattaatactaggctagccgcggttagagatgcg ccatcgctcataatgtctggtgttaggcttttgaaaatcacatgctcattaaaaagcgcg caaatagtaccatactgtgcgcagcccgagaactgaacatgatcaacctcgtcgataagg aagtgctttacagaatcgatcgccgttaggccgtgagccggtatcacggtatcagcttgt atcgagatacaactgtacctatcaaagtgtcgtccgcccatgcgaccttgcagtacatta tatatataagctggggtagattccagatacaaaaactctgtgcgattgttgcaactacgt aaataaatacaagaaatctctcaaggtgcgttttatcccatactggcatacgaatatgca aattaggtccacgttgagccggtcatcatcttttccccatttccatgcatatgacaattt ggtgaggagggacgtgtaaaccgttttgaatattctgacatgctataatggcccgctttt aggaattttgttgcatgaggtcgttcgggaacgacacccccccgccaatattttcgaggc actggtcatttttttaacccaaatactaactatcatagta
ORF
ATGACTTTTCCAGAGCCAAGAGAACACAAAGTGGTGCACGAAGCCAACGGCGTAAGGGCT ATCAAAACCCCTCAATCATTTTATGACAAGCAACCTGTTAAGTCATTGGAGGCATTGGAA CATTATCAAGAGCTGTACCAGAAGTCCATCGAGGACCCAGAGGAATTCTTCGGCCAAATG GCAAAGCAGTTTCTAGATTGGGACAAAGACTTTGGTAAGGTCTCCTCTGGATCTTTGAAA GAAGGTGATGCTGCGTGGTTCCTTGGTGGAGAGCTGAATGCTTCGTACAACTGTGTTGAC CGACATGCTTTTTCGCACCCTGATCGTCCCGCCGTAATTTTCGAAGCGGACGAGGAATCT GAATCTCGAACAATAACTTATGCAGAACTTCTACGTGAGGTCTCTCGTGTTGCAGGAGTA
CTGCAGAGCTGGGGTGTACGCAAAGGTGACACTGTCGCAATCTACTTGCCCATGACTACC GAGGCCATTGTGGCCATGCTGGCAGTGGCACGTCTGGGTGCAGTGCACTCCGTTATCTTT TCTGGATTTTCGTCAGGATCTATCCGGGACAGAGTTAACGATGCTGGATCTAAGGCAATT ATTACCTGTGATGAGGGACGCCGTGGGGGTCGTATTGTGAACAATAAGAAAATTGTCGAT GCCGCTGTTGACAGCTGCCCCACAGTGGAAAAAATCCTGGTTTATAAGAGGACTGGTAAC CCAGAAATCAAGATGGTAGAAGGAAGAGACTTCTGGTGGCAGGAAGAGGTTGAGAAATTC CCTGGTTACATTGCCCCTGTCCCTGTAAACTCGGAGGACCCACTATTTCTTTTGTATACT TCGGGATCTACTGGTTCTCCCAAAGGTGTGGTACACTCCACAGGTGGTTATTTGCTGGGA GCAGCATTGACAACTCGTTATGTGTTTGATGTCCAGGATGAGGATATTATATTTACTGCT GGTGACGTCGGATGGATTACTGGTCACACATACTCGTTGTATGGACCACTTGTTCTGGGT GTTCCAACCATTGTTTTTGAGGGAACTCCTGTCTACCCTGACTACGGAAGATTGTGGAAG ATTTGCGCCAAACATAAAGCCACACACTTTTACATCGCTCCTACTGCTCTTCGTCTTTTG AAAAAGGCTGGTGAAGAAGAAATTAAAAAGTACGACTTGTCTAGACTTCGTACTTTAGGA TCTGTTGGTGAACCAATTGCCCCCGAATTGTGGGAGTGGTACAATGAGAAAATCGGAAAC GGAAACTGTCATATTGCTGATACTTACTGGCAGACTGAATCTGGTTCTCATTTGATTGCT CCATTAGCAGGTGCCGTTCCCCAAAAGCCGGGTGCAGCTACTGTTCCTTTCTTTGGTATT GATGCTTGTATCATTGACCCTGTTTCTGGTAAGGAACTTGAAGGCAACGATGTGGAAGGT GTTTTAGCTGTCAAGTCCACTTGGCCATCAATGGCTCGTACAGTCTGGAGAAACCACGCT AAATACCTCGACACATATATGCGTCCTTATCCAGGCTACTACTTTACTGGCGATGGTGCC GGTAGAGATCACGATGGTTATTACTGGATCCGTGGTCGTGTTGACGATGTTGTCAATGTA TCTGGCCACCGTTTATCCACTTCTGAAATTGAAAGTGCTTTACTGGAAAATGGCAAAGTT GCTGAAGCTGCTGTGATTGGTATTTCCGATGAGCTAACTGGTCAAGCTGTTATTGCTTTT GTCGCCTTGAAAGATGCCACTGACTCTGAGAATTTAGACGCTCTCAGACGTGCCTTAGTC TTGCATGTTCGTGGAGAAATTGGTCCATTTGCAGCTCCTAAGTCCGTGATTGTGGTTGAT GACTTGCCTAAGACCCGATCAGGTAAGATCATGCGTAGAGTTTTAAGAAAGATTTCTTGC CATGAAGCTGATCAATTGGGTGATATGTCTACTTTGGCCAATCCTGAATCGGTAGACTCT ATAATCGGAGCTGTTGATAACCAGTTCTTCAAGAAGTAG
Downstream (in bold stop codon previous gene) aatttatgcatttttggatttatacatctttatatttaatagtaccttttctctcccagc taactatttacatggcaccgtcagcgcctatctaatcgttcttattctttgacaaggttc tcattttctgagcagctttccccaccgccttcagtgtttctgtttgaagtagttgttgtg acgattctaaaacactgatatcttctaatttttcatcccgggcctgccgaattacttccc gtaggtcagaaatacgctttgtcaactcttgcctgtcagagtcgtccatatatccacccc attgagataaagcattatcacagtcaaacagtagtagttccaaattgttggcttttgtgt atatttccttcctaatacgatcagtttctgaattactctttgattcggctacaattcttt caacctcttcagaagtcaaacctgtggatccagatacagttattgaagcgtcttgattag tggtcgtctctttcgcatttaccttgatgattccatccgcatcaatttcgaaactgacct tgatctgaggaacacccttgggttgaggttccaagtctgacagtttgaactgtccaatca tcttattgtctctcaccaaagatctttcaccctgatatacgttaacttcgattgaattct ggccatctacagcagttgagtacatctgttgtactttgcaaggcactttggtgtttcttg ggatcaaaggagaaaaaatacctccaaaggtttcgataccgagagtcaaaggtgttacat ctaaaagcaagacatctttaatctccccagataaaactgcaccttgaattgcagcaccaa gagcgacggcttcatcgggatttacagaagtgtttggcttcttgccaaatgtttcttcca caactttgcgaatcttgggcatacgagtcattccaccaacaaggattacttcatcaatgt cggaatttgtaagatctgcatcacgaagacattttttaact
AA
MTFPEPREHKVVHEANGVRAIKTPQSFYDKQPVKSLEALEHYQELYQKSIEDPEEFFGQMAKQFLDWDKDFGKVS SGSLKEGDAAWFLGGELNASYNCVDRHAFSHPDRPAVIFEADEESESRTITYAELLREVSRVAGVLQSWGVRKGD TVAIYLPMTTEAIVAMLAVARLGAVHSVIFSGFSSGSIRDRVNDAGSKAIITCDEGRRGGRIVNNKKIVDAAVDS CPTVEKILVYKRTGNPEIKMVEGRDFWWQEEVEKFPGYIAPVPVNSEDPLFLLYTSGSTGSPKGVVHSTGGYLLG AALTTRYVFDVQDEDIIFTAGDVGWITGHTYSLYGPLVLGVPTIVFEGTPVYPDYGRLWKICAKHKATHFYIAPT ALRLLKKAGEEEIKKYDLSRLRTLGSVGEPIAPELWEWYNEKIGNGNCHIADTYWQTESGSHLIAPLAGAVPQKP GAATVPFFGIDACIIDPVSGKELEGNDVEGVLAVKSTWPSMARTVWRNHAKYLDTYMRPYPGYYFTGDGAGRDHD GYYWIRGRVDDVVNVSGHRLSTSEIESALLENGKVAEAAVIGISDELTGQAVIAFVALKDATDSENLDALRRALV LHVRGEIGPFAAPKSVIVVDDLPKTRSGKIMRRVLRKISCHEADQLGDMSTLANPESVDSIIGAVDNQFFKK
SEQ0080
P. pastoris homolog of Saccharomyces cerevisiae ACS1 (YAL054C)
Acetyl-coA synthetase isoform which, along with Acs2p, is the nuclear source of acetyl-coA for histone acetylation; expressed during growth on nonfermentable carbon sources and under aerobic conditions; chromosome 2-1, 0767; E.C.6.2.1.1
5' region ctggctcagcggttatgttggttagggtctaagttttgaccaatttgcttgtatgtgcgt taaaaataggacagttgggggacaatagcgcgactggatttggtagacagggcttggttg aaggttgacaattgaccaaccaagcaataccggtagtgggcaaatgaaaaaagaccacca tccgccatccgacagcacactaccgtcatcccctcttccgtaagaccaaaacatcattct ctcttcccctcctaggttggtgattcgcgacccattcctcaactccagagttgttcattt tacaatggggaagtgggatatgatttcgttcctcaattgacttatttgtcatgctttttc ggagcttgaagcatatcggtctctgtaccctgagcctttcgcattagtcggagtgcggat atgttatgggtcagggtctgtatcttgaattattgtaaaattgaaagaaaggatctaagg attgtttttccctttgaatgtgtatcttaagaaattgagtttcaccagcgtgttgtttac taccagcgtctctcttctgtgtgatggccatactatactacataccactttactccattt ccgataataagtttcacttcatcgtctttacctccttgactccgtgcggctccaactcta aatctattagatgatcatgaggggcagcatttctcgtaatatcccacagagatacgtcaa atcccattgttccgggtatatgccctcagtaaagacatggcgtccagcatcttgtactca catctcgtttgagcgggtgacgggaggggtatatgcatctatttttcttgagtcttgact ctgaagtaggtctcgtacaatatgtgacggagagcattcgtgttataaaaaggtaagaat ggtcccccattttcttctttctttctttctttctactatctttataaagtttttttttac aaaggatacgacttagttgttgatcaatt
ORF
ATGCCATTAGATAACGAACACTTACTTCATGAAAATTCCATTGACCCACCAAAGGGATTC TTTGAAAGACACCCTGGAACTCCTAATATACCAGGCGGTTGGGAAGAATACTTGAAGCTG TACAATCAGTCCATCGAGAACCCCTCAAAGTTTTTTGGAGAAAAAGCAAAGGAATTCTTG TCATGGGCTACTCCTTTCACTGACGCTCGTTACCCACCTGGTAATGGATTTCAGAATGGT GACTCCGCCGCTTGGTTTCTGAATGGTGAGTTGAACGCGTCGTACAACTGTGTTGATAGA CATGCTTTAAAGAATCCAGACAAACCTGCCATTATTTATGAGGCTGATGAACCTAATCAA GGCCGTACGGTTACCTATGGAGAGTTGCTGAAGGATGTTTGTCGAATTGCCCAAGTATTG ACTGACCTGGGTGTGAAAAAGGGTGACACTGTTGCTGTTTACCTGCCTATGGTTCCAGAA GCTATCACCACTTTATTGGCTATCGTTAGAATCGGTGCTATCCACTCTGTTGTCTTCGCA GGTTTTTCAGCTGGTTCTCTACGTGATCGTATATTGGATGCTGATTCTAGAATTGTTATC ACTTCTGATGAATCTCTGAGAGGTGGGAAGATCATCGAGACTAAGAAGATTGTTGACGAG GCTCTGAAGTCTTGCCCAGATGTTCGTAATGTGCTGGTCTTCAAAAGAACAGGTACACCA CATCTTCCATGGGTTGAGGGTCGTGATCTTTGGTGGCACGAGGAAATCATTAAGCATGTT CCGTACTCTCCCCCAGTGAATGTTAGATCTGAAGATACTTCATTTTTGCTTTACACTTCT GGCTCTACCGGAAAGCCTAAAGGTATCCAGCATTCAACTGCTGGCTACTTACTGGGAGCT CTTTTGACCACCAAGTATGTCTTTGATGTTCAGGGTGATGATATTTTATTCACTGCTGGT GATGTGGGCTGGATCACAGGGCATTCTTATGTAGTTTACGGTCCACTTTTAAACGGGGCT ACGACAGTTGTTTTTGAGGGCACCCCAGCTTACCCAGACTATTCACGTTATTGGGATATC GTTGACAAACACAAAGTTACTCAGTTTTATGTAGCACCAACTGCTCTTAGGTTGCTGAAG AGAGCTGGTAGCAAGTATGTCCAGAATCATGATTTGTCTTCAATCAGGGTTTTGGGTTCC GTTGGTGAACCTATAGCCGCTGAAGTTTGGGAATGGTACAACGAGTATGTTGGAAGAGGA AAAGCTCATATTTGTGATACGTATTGGCAAACAGAGACTGGTTCTCACATTATTGCTCCA ATAGCTGGTGTGTCAAAGACCAAACCAGGTTCAGCATCTTTCCCCTTCTTCGGTATTGAT CCGGTTATTCTAGATGCTACTACTGGAGAGGAACTCAAAGGTAATAATGTTGAAGGTGTT TTGGCTATCAGAAATCCATGGCCATCTATGGCTAGAACAGTCTGGAAGGACTACAACCGT TTCCTGGATACATATCTCAGGCCATATGAAGGTTATTACTTCACTGGTGATGGAGCTGCC AGAGATCAGGAAGGATTTTATTGGGTTCTGGGTAGAGTTGATGATGTTGTTAATGTGTCA GGTCACAGATTGTCTACTGCCGAGATTGAAAGCGCTCTAATCGAACACAATTTGGTAGGA GAGTCTGCTGTCGTCGGATTCCCTGACGAGCTGACTGGTTCTGCTGTGGCCGCGTTTGTG TCTTTGAAGAAGGACGTCGACAATCCAGCGGAAGTGAAAAAGGAGTTAATCCTTACTGTC AGAAAAGAGATTGGACCATTCGCTGCACCTAAACTCATCATCTTGGTAAGTGATCTTCCA AAGACCAGATCAGGTAAGATAATGAGACGTATTCTCAGAAAGGTTTTGGCTGGAGAGGAA 
GACTCTCTGGGCGACATTTCAACTCTTTCAAACCCTTCGATTGTGGAAGAGATAATCTCT ACCGTTAAAAGGGATGCCCGCAAATGA
Downstream (in bold start codon next gene) gcatctgattaggacttacacttctatttcacatatctattttattatcgtgaagcagta gtcaaatatgtagtatagttttattagtttaccttatcgctactattgctattttgggag gaagcctcgaagacaccatacctaccaccttgggagaactttttctcacacatagatagt tcagtatttggtaaatgtgaactaggtttaccttcgtttcaacaaatcagaaaacgttgc agtttcatagtgggactagtagcattttgtatgaggggtgatcatttatcaaagctattt cacagcctagaatgaataagaattcaaattcacaacccaagtcaaacgagcaagaaatcg gagattcaacatttgaagcctttaatggaacattttcctcaacacaattctctaatgata atcttgttgaagatcccagttttgatgacttggccaaagatgagtttatggaatgtgacg acatttcgccaccaagaaagacaccttttttaaacgatacaaaatcgcaaggcacattag acgcatccctcactgaattagtcgaccaacaacctgatcataaagaaacaaaggaactga aatttggagattatgaaacctatttctatcacaagaaattaaagcagcaggagaaagatg atcagtatgtacagtggcgcacaaagaattcaaaccagtttcctcccattttttctaatt gtgtgatctatgttaacggaaggacagatccagatataacccagctccatagaatgatta ttttgcatggaggagtattcttgaattatcttggtgctaagtctaatgccactcatgtga ttgcctccaacattactccaagaaagataatagagttcaaaaacttcaaagtaattaaac cagaatggattactgatagtataaaggagaagaagattctggattggaccaaatacactc tattcagtccacagtatgatcaaaaggttttggaatcagat
AA
MPLDNEHLLHENSIDPPKGFFERHPGTPNIPGGWEEYLKLYNQSIENPSKFFGEKAKEFLSWATPFTDARYPPGN GFQNGDSAAWFLNGELNASYNCVDRHALKNPDKPAIIYEADEPNQGRTVTYGELLKDVCRIAQVLTDLGVKKGDT VAVYLPMVPEAITTLLAIVRIGAIHSVVFAGFSAGSLRDRILDADSRIVITSDESLRGGKIIETKKIVDEALKSC PDVRNVLVFKRTGTPHLPWVEGRDLWWHEEIIKHVPYSPPVNVRSEDTSFLLYTSGSTGKPKGIQHSTAGYLLGA LLTTKYVFDVQGDDILFTAGDVGWITGHSYVVYGPLLNGATTVVFEGTPAYPDYSRYWDIVDKHKVTQFYVAPTA LRLLKRAGSKYVQNHDLSSIRVLGSVGEPIAAEVWEWYNEYVGRGKAHICDTYWQTETGSHIIAPIAGVSKTKPG SASFPFFGIDPVILDATTGEELKGNNVEGVLAIRNPWPSMARTVWKDYNRFLDTYLRPYEGYYFTGDGAARDQEG FYWVLGRVDDVVNVSGHRLSTAEIESALIEHNLVGESAVVGFPDELTGSAVAAFVSLKKDVDNPAEVKKELILTV RKEIGPFAAPKLIILVSDLPKTRSGKIMRRILRKVLAGEEDSLGDISTLSNPSIVEEIISTVKRDARK
SEQ0081
P. pastoris homolog of Saccharomyces cerevisiae LPD1 (YFL018C)
Dihydrolipoamide dehydrogenase, the lipoamide dehydrogenase component (E3) of the pyruvate dehydrogenase and 2-oxoglutarate dehydrogenase multi-enzyme complexes; chromosome 2-2, 0048; E.C.1.8.1.4
5' region (in bold start codon previous gene) cgatagtaatggtttcctgactagctgcagatgtggccagagagtgaacactctcatcaa ctgagttatcctttatgctcctattcaaggagatatcagtagatttttcagccatttagt tagagccttcaaaagaatggcagaacattcacctatttttataccctgtgataagagaag cttgaaaaaagcttgttcatagtcatctccacagttgacaggcgaccactgaaatggtgg taccttgcgtcatcggatcgaatttttgaggatttcgttttaaaagtttggatcttaccc gttattaccataggcatgggatttgaaaacgtttcaatgtggagtggcgggtgacgcaat gtgtggtaagctttttcggcaatggcgaaattgggctcctgctaacttaaaaatactgag tctaggcacaatatcttgtttgatgcctcaaaaatggcttcagcaaaccccgaatctcct cccttaacgcttgaactgtatctggccctataatgtctcaattccatctaggcataagtt cctgaaggcttttgtttcttgcacttaccgaagaaatagagttgatttcgggcaaaacga atatccactcaaacatgagatacacttgaattgggctttcgttcccaatttcgtggaagt tcaagtttagatggaataattatgacatctgtggatggcctaatataggcatgtccaaaa agcttaattgtcgagatagtgcatgaagtttggataaacgtcccaaaggtcaactctcct ctccctgcccccactaaagcatatacccccactaccccgggagatcccggtgtccgatgt ctacaacaatactgcgatcgaccaatgtcacaacatcattacaccatatcggagaacccg gactcgacctgccaacttccagctcgttctttctttgttagtccaaccagtcatagattt agtagaccttttggaatataaagtaccaccaaagagacata
ORF
ATGCTCAAATTCGGACAAAGACGTTTTCTTAGCACTTCAAGAATCCTCGCTCAGGCCATC AAACACGATGTCGTCGTGATTGGTGGTGGACCTGGTGGTTATGTCGCTGCCATTAAGGCT GCACAATTGGGTCTAGATACTGCTTGCATTGAGAAAAGAGGGGCCCTTGGAGGAACCTGT TTGAATGTTGGTTGTATCCCTTCTAAGTCTCTGTTGAACAACTCTCACCTTTATCACACC
ATCAAGCACGATACTAAAGAGAGAGGAATCAACGTTGCTGATGTTCAAATTGACATTGGC CAATTGCAAAATGCCAAGGAGAAATCTGTGAAGCAATTGACTGGAGGTATCGAAATGCTG TTTAAGAAAAACGGTGTCAAATATTACAAGGGTTCTGGTTCTTTCGTTGATGAACACACT ATCAACGTTGATCCTGTTGAGGGAGGCGACAAAGTTGAACTCAAAGCTGATAATGTGATT ATTGCTACTGGATCTGAACCTTCGCCATTCCCAGGAATTACTGTCGACGAGGAGCGTATC GTGACTTCAACTGGAGCTTTGGACTTGAAGGAGGTTCCAAAGCGTATGGCTATTATTGGT GGTGGTATCATTGGTTTGGAAATGGCTTCTGTTTGGTCTAGAGTAGGCTCTGAAGTCACT ATCATTGAATACCGTGACAGTATTGGTGCTGGTATGGATGCCGAAGTTGCAAAGAGTACG CAAAAGTTCTTGACCAAGCAAGGCTTGAAGTTCAAGTGTGGTGCCAAGGTCACTAAGGGT GAACGCGTAGGCGAAGTTGTGAACATTGAGATTGAAACAACAAAGGATGGTAAAACTGAG CAATTTGAGGCTGATGTCCTATTAGTTGCCGTTGGCCGTAGACCTTACACTGAAGGCTTG AACGCCGAGGCAATTGGATTGGACTTTGACAACAGAGGAACTTTGGTCATCGACTCTGAA TACCGTACCAAGCACCCTCATATTCGTGTAATTGGAGATGTTACTTTTGGTCCTATGTTG GCTCACAAGGCTGAGGAAGAAGGTATTGCCGCCGCAGAGTTTATAAAGAAGGGCCATGGT CACGTCAATTACGGAAACATTCCATCTGTTATGTACACTCACCCAGAAGTCGCTTGGGTT GGTCAGAACGAACAACAACTTAAGGAGGCTGGTATCAAATACAAAGTTGGTAAATTCCCA TTTATTGCTAACTCTAGAGCTAAGACCAACTTGGACACTGAAGGATTTGTTAAGTTCCTT GCTGATGCCGAAACTCAACGTGTTCTGGGTGTTCACATCATCGGTCCCAACGCTGGTGAG ATGATTGCTGAGGCTGGGCTTGCTTTAGAATACGGTGCTTCTACAGAAGATATCGCTCGT GTCTGCCATGCTCATCCTACTTTGTCGGAGGCTTTCAAGGAAGCAGCTCTGGGTACCTTT GACAAGACTATCAACTTTTAA
Downstream (in bold stop codon next gene) gttttctaaataatgtacaatgaaatgtcttcaaacctatatttactctttgtcattttc tactacttctgtgatatcccctattagtggctttaatgcctgtgggagcaactccgctgc atctcccaggaacttccacccatcacacccttcaagatcttcttcatcagtattaaaaat ggcaactttaccaccctgtagtttcactctgttcacatagcctgcggcgggccatacaga ttgactggtgccgataactagtatgaggtccaccttttgtgacataatgaattgatctgc ttgatcaagcaccttaaagggtagggattctccaaaccagacaaccccaggacgtaataa tccagtcttacattgagggcatgtgggaagcccacttctaggaatttccttaactggcgt gaattctggagatccagttgtcgaaattacagcgtctttaccatcctcattctcattctc gttctcattttcattctcagttcgtctctgtctttttctaggattgatccactcctcttc acaaccttgcaattgaggggtcaaggggtgcttaaaattattattctctgtgtatgtgca gtcaaaacttgtgcatttgagcgtgaacaagcttccatgcaactccaaaagctgttctgg atcatggtgggcccttctagaaagaccgtccacattctgagtcaatgttaagaatttctt accctttgaggctgccactttggaaagttgactcaaggcgtaatggccgttattcggttt tgcctggatcgctttgtagcgtctataagagtagaattgccacactagccctggatcaac gttgaaagcatctggggtagccagatccatcgaactgtaatttttccacaacccacctga cccccggaaagttggtagtcctgaactcgccgaaagtccagcaccacaaagggccaatat ggtacggcacttatctgtggttatgtattcatggaaactgg
AA
MLKFGQRRFLSTSRILAQAIKHDVVVIGGGPGGYVAAIKAAQLGLDTACIEKRGALGGTCLNVGCIPSKSLLNNS HLYHTIKHDTKERGINVADVQIDIGQLQNAKEKSVKQLTGGIEMLFKKNGVKYYKGSGSFVDEHTINVDPVEGGD KVELKADNVIIATGSEPSPFPGITVDEERIVTSTGALDLKEVPKRMAIIGGGIIGLEMASVWSRVGSEVTIIEYR DSIGAGMDAEVAKSTQKFLTKQGLKFKCGAKVTKGERVGEVVNIEIETTKDGKTEQFEADVLLVAVGRRPYTEGL
NAEAIGLDFDNRGTLVIDSEYRTKHPHIRVIGDVTFGPMLAHKAEEEGIAAAEFIKKGHGHVNYGNIPSVMYTHP
EVAWVGQNEQQLKEAGIKYKVGKFPFIANSRAKTNLDTEGFVKFLADAETQRVLGVHIIGPNAGEMIAEAGLALE YGASTEDIARVCHAHPTLSEAFKEAALGTFDKTINF
SEQ0082
P. pastoris homolog of Saccharomyces cerevisiae ADH3 (YMR083W)
Mitochondrial alcohol dehydrogenase isozyme III; involved in the shuttling of mitochondrial NADH to the cytosol under anaerobic conditions and ethanol production; chromosome 2-1, 0472; E.C.1.1.1.1
5' region cgcagcgttttctgacggtactagaggactcttaggggaaggtagaatcaataaagatca tattaggtaagcaaattttggatggaataggagactaggtgtggatgcgcgatctcgcca
aattgcacgaccagagtggatgccggatggtggtaaaccgtttcttcctttttaccaccc aagtgcgagtgaaacaccccatggctgctctccgattgcccctctacaggcataagggtg tgactttgtgggcttgaattttacaccccctccaacttttctcgcatcaattgatcctgt taccaatattgcatgcccggaggagacttgccccctaatttcgcggcgtcgtcccggatc gcagggtgagactgtagagaccccacatagtgacaatgattatgtaagaagaggggggtg attcggccggctatcgaactctaacaactaggggggtgaacaatgcccagcagtcctccc cactctttgacaaatcagtatcaccgattaacaccccaaatcttattctcaacggtccct catccttgcacccctctttggacaaatggcagttagcattggtgcactgactgactgccc aaccttaaacccaaatttcttagaaggggcccatctagttagcgaggggtgaaaaattcc tccatcggagatgtattgaccgtaagttgctgcttaaaaaaaatcagttcagatagcgag acttttttgatttcgcaacgggagtgcctgttccattcgattgcaattctcaccccttct gcccagtcctgccaattgcccatgaatctgctaatttcgttgattcccacccccctttcc aactccacaaattgtccaatctcgttttccatttgggagaatctgcatgtcgactacata aagcgaccggtgtccgaaaagatctgtgtagttttcaacattttgtgctccccccgctgt ttgaaaacgggggtgagcgctctccggggtgcgaattcgtgcccaattcctttcaccctg cctattgtagacgtcaacccgcatctggtgcgaatatagcgcacccccaatgatcacacc aacaattggtccacccctccccaatctctaatattcacaattcacctcactataaatacc cctgtcctgctcccaaattcttttttccttcttccatcagctactagcttttatcttatt tactttacgaaa
ORF
ATGTCTCCAACTATCCCAACTACACAAAAGGCTGTTATCTTCGAGACCAACGGCGGTCCC CTAGAGTACAAGGACATTCCAGTCCCAAAGCCAAAGTCAAACGAACTTTTGATCAACGTT AAGTACTCCGGTGTCTGTCACACTGATTTGCACGCCTGGAAGGGTGACTGGCCATTGGAC AACAAGCTTCCTTTGGTTGGTGGTCACGAAGGTGCTGGTGTCGTTGTCGCTTACGGTGAG AACGTCACTGGATGGGAGATCGGTGACTACGCTGGTATCAAATGGTTGAACGGTTCTTGT TTGAACTGTGAGTACTGTATCCAAGGTGCTGAATCCAGTTGTGCCAAGGCTGACCTGTCT GGTTTCACCCACGACGGATCTTTCCAGCAGTATGCTACTGCTGATGCCACCCAAGCCGCC AGAATTCCAAAGGAGGCTGACTTGGCTGAAGTTGCCCCAATTCTGTGTGCTGGTATCACC GTTTACAAGGCTCTTAAGACCGCTGACTTGCGTATTGGCCAATGGGTTGCCATTTCTGGT GCTGGTGGAGGACTGGGTTCTCTTGCCGTTCAATACGCCAAGGCTCTGGGTTTGAGAGTT TTGGGTATTGATGGTGGTGCCGACAAGGGTGAATTTGTCAAGTCCTTGGGTGCTGAGGTC TTCGTCGACTTCACTAAGACTAAGGACGTCGTTGCTGAAGTCCAAAAGCTCACCAACGGT GGTCCACACGGTGTTATTAACGTCTCCGTTTCCCCACATGCTATCAACCAATCTGTCCAA TACGTTAGAACTTTGGGTAAGGTTGTTTTGGTTGGTCTGCCATCTGGTGCCGTTGTCAAC TCTGACGTTTTCTGGCACGTTCTGAAGTCCATCGAGATCAAGGGATCTTACGTTGGAAAC AGAGAGGACAGTGCCGAGGCCATCGACTTGTTCACCAGAGGTTTGGTCAAGGCTCCTATC AAGATTATCGGTCTGTCTGAACTTGCTAAGGTCTACGAACAGATGGAGGCTGGTGCCATC ATCGGTAGATACGTTGTGGACACTTCCAAATAA
Downstream gccgaatagtttgtatacgtcttatgtaatgagtttcaatgaattacttatttttacctc tcctttttggctcaattcaactagcctctgtagcaatctgtttgcgaagaagaacttatc taatttttcatgggttttccccacgtttttgaaaagactttgtcttaactctctgtgatc aaaccgttgtggtggaccgctgattgattgtgattcttcttccagatctagcgactctgg cagataagagtccccggagtcaataattaccccaagaatataccttgaaacctccgagtt ggccagtatttgagataacagcagctccttgtatttgtaagtgccttcgttatcaattga tccgatgtccagcattattttgatatcaggtctcccaatcttgacaaacttttctacaat gtgtttttgaagcagttgcacaagaggagcaatatcttcaaggtatgagtcttgctggtc gccgttcaattttaaaataagaaatgtatctgagttccctgtcgtcccaataactgctac actccccaactttgtgacagagagaaaatgatgctcttgtgaagtggaatatatggcgtc aatgctactttgtaatttatggttgaacgcattctcagtatctccataaacttgaaagcc cacggggtaagagacgccagatgctaattctctatgcaattgcgactctgtgtaagtgtg actcaaaagtccaatacagtagagatcactcacatactgaggggtcagtgtgtcggagag ttcacctaccaatggacagtactgggctaattcggttageaagactcgacatatcgggat cccatactgaatctcataggacattattgcctgatcattgtctccgtttccgtactcgct caaattggttctcattgtcactagcaagtcttttcgatctgtagacgtcaaggggttgaa taatggagttatctctttttccaaaacaccgtccatttctc
AA
MSPTIPTTQKAVIFETNGGPLEYKDIPVPKPKSNELLINVKYSGVCHTDLHAWKGDWPLD
NKLPLVGGHEGAGVVVAYGENVTGWEIGDYAGIKWLNGSCLNCEYCIQGAESSCAKADLS GFTHDGSFQQYATADATQAARIPKEADLAEVAPILCAGITVYKALKTADLRIGQWVAISG AGGGLGSLAVQYAKALGLRVLGIDGGADKGEFVKSLGAEVFVDFTKTKDVVAEVQKLTNG GPHGVINVSVSPHAINQSVQYVRTLGKVVLVGLPSGAVVNSDVFWHVLKSIEIKGSYVGN REDSAEAIDLFTRGLVKAPIKIIGLSELAKVYEQMEAGAIIGRYVVDTSK
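Each AA entry in this listing corresponds to the translation of the ORF entry that precedes it. As a minimal sketch, assuming the standard nuclear genetic code (NCBI translation table 1) applies, the correspondence can be checked in Python. The names `CODON_TABLE`, `translate` and `orf_start` are hypothetical helpers introduced for illustration only and are not part of the listing; the input is the first 60 nucleotides of the SEQ0082 ORF.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
# Assumption: the standard nuclear code applies to these ORFs.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an ORF to one-letter amino acids, stopping at the first stop codon."""
    # Remove the blanks that separate the 60-nucleotide rows and normalise case.
    seq = "".join(orf.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)


# First 60 nucleotides of the SEQ0082 ORF as given in the listing.
orf_start = "ATGTCTCCAACTATCCCAACTACACAAAAGGCTGTTATCTTCGAGACCAACGGCGGTCCC"
print(translate(orf_start))  # MSPTIPTTQKAVIFETNGGP
```

The printed residues match the first 20 residues of the SEQ0082 AA entry. Running the same check over a full ORF is one way to spot character-recognition damage in an AA row, for example a W standing in for VV.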
SEQ0083
P. pastoris homolog of Saccharomyces cerevisiae ADH6 (YMR318C)
NADPH-dependent medium chain alcohol dehydrogenase with broad substrate specificity; member of the cinnamyl family of alcohol dehydrogenases; may be involved in fusel alcohol synthesis or in aldehyde tolerance; chromosome 4, 0576; E.C.1.1.1.2
5' region (in bold start of next gene) acttggggaacacagttgaaagaccgtcgatgactccctcaattcccacacttaaagctg ggactaaaacttgccctacccctgcaaatggtgagaaattattatctctcaattccgtgc ttttcccaactaaagtgtaaagagtaaaatcgtaatgggtcaacttacatccgacaatct tttcatgcctagataattccttatagctgtcaaaagtgagctgaatcccattttgcactc caggatagttatacagaataattggcaattctgctttatcggcaatggaattgaaccagc tgataatccccttttgggataccaaacttgggccgtagtaccctggtacaagaatgactg aatatgaagctccattctctttactaacggagatctgacgaataatatcctcaatacagt agccaatcactccagcaataatggtaaattcggggacagccttatggagagtacgtatca atgtcaaccgctcctcctctgttaaatgaatggcctctccaattgatcctccgactaaaa ttcccttaatgccactatcatacaaatgtcttgcatgggccacttgagcctctagatcca gagatttcttggcatcgttcttgaaaaatgtgggcagtggagaataaaccccgccagaaa ggactttatatgacaccatcgtcgatggtttcaatggagggagcaagtacctttcatgtt atatgatctggagctcctatgaaggtcagccttttgaccagtttcgagtgggataatata cccgcagcaaccccactataagccatcccacgcaaattttccgattctagcacgaaaaag cttggaccctactaataggatgggtagcaatcttaactcctgaagcaacgggtgtacttg tcgatgttgcacccaccttgtgaaaaaaatccccacgcttcattacctatattaagcttg aacatcaggtatatcaccagaaaactagtaaaaaccaaaca
ORF
ATGGCGTACCCAGACACCTTTGAAGGATTTGCCGTCACTGACACTGCAAAATGGTCCACA ACCAAGAAGATAGAATTCACCCCAAAAAGGTTCCAGGAACATGATATCGATGTCAAGATC CATGCCTGTGGTATCTGCGGGAGTGATGTTCACACTGTTTGCGGGGGATGGGCAAAACCA GACCTTCCCGTGATCCCAGGACATGAGATCGTTGGTGAGGTTGTTAGAGTGGGCCCAAAA GTGAAGGGATTTGAAATTGGGCAAAGAGTTGGTGTTGGAGCTCAAGTTTGGGCCTGTCTA GAGTGCGACACATGCAAGGATAACAACGAAACGTACTGTCCTCAATGGGTGGACACTTAC AATGCCACTTATCCTGATGGTGACAAGGCATGGGGTGGTTATTCCTCTCACATCAGAGTC CACGATCACTTTGTATTCCCTATTCCTGATGAACTTCCAACTAATGCTGTGGCCCCAATG TTGTGCGCTGGTATCACCACGTACTCTCCGTTGGTAAGAAATGGAGCTGGTCCAGGAAAG AAGGTGGGTATCATCGGAATTGGAGGGTTGGGACATTTTGCCATCATGTGGGCTAGGGCT CTTGGTTGCGAAGTGTACACGTTTTCTAGAACACATAGCAAGGAAGCTGATGCTAAGAAA TTGGGAACTGACCATTTTATTGCGACGTGGGAGGACAAAGACTGGGCCAAGAAGATTGGC AGAAAGCTGGACTTTATCATTTCGTGTGGAAATTCGGCCACGAACTTTGATATGGATGGT TACCTCAGTGTGCTGAAGGTTCATGGTAAACTCATTTCCGTCGGCCTTCCAGAGGAGCCA TTCACGCTGTCTGCTGGAAGCTTTATCAAGAACGGTTGCTACTTGGGATCGTCCCACTTG GGGAACAGACAGGAGATGCTTGATATGCTGAAACTTGCTGCTGATAAGGGCATTGGTTCT TGGTATGAGGAGCTCCCAATCTCTGAGGAAGGGCTGAAGGAAGGACTGGAGAGATGCCAC AACAATGACGTTAAGTATAGGTTCACCCTGACCGGTTACGATAAGGCATTCAAATAG
Downstream acctaccccatggactgaaacttcaaagtccagggtgaactatctttgactaaactgagc atcttagttaacaaaaaatcgggcagagacccaagttagtcagcagacctaccccatgga ctgaaacttcaaagtccagggtgaactatcttctactaacctgagcatctcagtcaataa tacaaacatccgagcaggaggtgttcaattggttctcaaaatcacactctctagtcagaa ggctaacatcgtgactgtcattctaagcgcatgcaacacttagccaagtcccaagatagt aacgtatctaccacaatccactataagttagtaaaagaaacacagaaaacaggaagcaga gtaaaatcagtgtcatgacttgatagtattccaggcacggtcatccagccaatcaattga
tccttccaggcttcacgacatctcatcctttagcaccgaggttcatacccttcctgaaac aaaatgtagcacccctcctacaagctcatgaaaggaagaacaagaccaaccatttgcgaa aacttctatcttgcgaatgaatggttctgtgttcagaaagttgtcacaaacgctaaagga ataatcgccttaccattggagcacggtttggagcagcacgagaagcaaacgtcttgcacc atcaaaatagatgctagactcgttttacgccttggtggccgatatatacggcctatagat atggccttcatagtggcaattccttgacagtgctgcccacatcgtgttaagaggaagaga gcacaaatagagaacttcgctcctatgaccactcaatctggggtaagattacatcagcaa ttagattccacatttccccctcccatctacacttatcgggaagcactaaaccgttgaacg acttacacactctacaaagggcatatgcaaccaacgagcagacactacaacagacaatta actccagacctcccattcactatctattatgcatcatctat
AA
MAYPDTFEGFAVTDTAKWSTTKKIEFTPKRFQEHDIDVKIHACGICGSDVHTVCGGWAKP DLPVIPGHEIVGEVVRVGPKVKGFEIGQRVGVGAQVWACLECDTCKDNNETYCPQWVDTY NATYPDGDKAWGGYSSHIRVHDHFVFPIPDELPTNAVAPMLCAGITTYSPLVRNGAGPGK KVGIIGIGGLGHFAIMWARALGCEVYTFSRTHSKEADAKKLGTDHFIATWEDKDWAKKIG RKLDFIISCGNSATNFDMDGYLSVLKVHGKLISVGLPEEPFTLSAGSFIKNGCYLGSSHL GNRQEMLDMLKLAADKGIGSWYEELPISEEGLKEGLERCHNNDVKYRFTLTGYDKAFK
SEQ0084
P. pastoris homolog of Saccharomyces cerevisiae ADH6 (YMR318C)
NADPH-dependent medium chain alcohol dehydrogenase with broad substrate specificity; member of the cinnamyl family of alcohol dehydrogenases; may be involved in fusel alcohol synthesis or in aldehyde tolerance; chromosome 1-1, 0357; E.C. 1.1.1.2
5' region (in bold start next gene)
PTCAAAAATTCACGGTAG^AACTGAAGAAAATCATGAAG"AAAACACAΛAGCAGAAGTCT TTCCATCCTTTGCCATAC\TGGGAGGGTTAGTTCCTTCGAT"TTATATG!VAAGATCAACG AACTTGTGCAATGGATTGGATGGAGTGTGCTTATTAGATAACAGATACGiVGCCATGTATG GCAACCAAAGAAACCAAAAATGGCAACCATGTGTGACGCGAGCTCAGTTCG^CCAAGGAA ?GTCCAATTTTTTTAACT\AACCCCAATCCGAACTTCCC"TCTTGTTGATT!VAATTAATA CGGGCGGAGCTAGACTTTCTCTGCTGAGCATTCGACTTTCGGGTAGTCAAAGAAGGAACT GCATTGTCACCTAGGTTT\TGTTTCCAATCGATGACGCTCT"CGACGTCTTGTCTTTTCC TCTTTAGACATTTGTTGTGCIAACCTGGTATACCTGTGAGAGCTGCTTTACSGAAGCAGG AAGCAGAATTCCAAAAACTTTGGGAAGACCCCCGCGAAT^GACTCGAGAGT^CTTCTACT GCTTATGTAATGTAGATGTCAAATGTCTGCCAGGTGTAGATCCGATCGAGGϊVGTAAATAT GTGAATGTTACTGTACAT^VAGGTCAAAATTTC-AGTTGGAGAGGAGAGTAAACAAAGGCAA GATTGCCAATGACCGCACTCAGAGTTGGTTATCTGCATAGTCAAAACTAGTCACCTTTGG TTAGTCATTACATTTGAATACTTCAGAAATAATCGCCTTCCCCAGAATTTAATGCATCCC CCATTCCCACCGAAATTCCACTAAACTCCCACCAGTTTTCAGATTCGGCTGTGAGTGCGC ^TCAGTTGCTCGGATAAACTCTTTCAAAACCATAACGTTCGCCCCTCCACCTCAACATCA GTGGGGGCACTATTGGCTTCAAATATAATATATATAGATGGCGAAGTGAAATTCGTTTCT AACCATCTTATTTTTAATTTTTTATTTCATCACTATAACAA
ORF
ATGGCTTACCCAGACACTTTTGAAGGATTTGCAGTTCACGACCCAAGTAAGTGGTCCGAAGTAAAG AAAATTCAATTTACTCCTAGACCTTTCCAGGAAGATGATGTTGACATCAAGATCGAAGCATGTGGT ATATGCTCCAGTGATATTCACACCATTTCCGGAGGGTGGGGTCAGCCAAAGCTTCCCTCCATTGTT GGTCATGAGATTGTTGGAACTGTAGTTAGAGTTGGTTCAAAGGTCGACAATATCAAGGTTGGAGAC ACTGTCGGTATGGGAGCAATGTGTTGGGCTGATTTGACGTGTGATGTCTGCAAGTCCAAAAACGAA AACTACTGTCCTAACTGGATTGATACATATGATGACGCTTATCCCGATGGGTCAAGAACCTATGGT GGTTACTCCAACTATGCTCGTTGCAACAAAGAGTTTGCATTCAAGATTCCAAAAGGATTATCCGTA GAAGGAGTTGCTCCAATGTTGTGCGCTGGTATCACCACTTATTCTCCTTTAAAGAGAAACAATATT GGACCAGGCAAGAAGGTGGGTGTTGTTGGTATCGGTGGTTTGGGGCATTTTGCTCTTCAATTCGCC AAGGCTCTTGGAGCTGAAGTCTACGCAATTTCCAGGAATGACAAGAAGAAAGCAGACGCTCTTAAA TTGGGAGCTGATTACTTCATTGAAACTGAGAAGGAAGGTTGGAACTTACCATACAAGTATAAGTTC GACTTAATCATTAGTACAGCCAATTCCAGCCAGAACTTTGACCTTGATGCCTATGTATCCACACTT AATATTGGGGCAAAGTTTGTATCTGTCGGACTTCCAGAGGACAAGATGGAAATGAATGCTGGATCT TTCATCAAGAACGGCTGCTATTTCGGTTCTTCTCATTTGGGTAACCGTGAAGAAATGAAAGAGATG
CTTGAGTTAGCTGCTGAAAAGGGTATTGAAGCTTGGTiMGAGCCAATTTCAATCTCTCAGGAGGGT ATTAAAAACGGACTGGAGAAGTTGCACCGCAACGATGTCAAATACCGTTTCACCCTTACTAATTAC GGAAAACAGTTTGAATAG
Downstream (in bold start codon previous gene) ggtaaaatagatatgaatatataagttaggtgacgagatatgtaggaatggaaggtcgaa ggattttattcaatgctaaaataccattttccgttgtatcccagcataattttcatgatg gagtgaaactattcggtatttgatatctattatcctctctcctcggcttcgaagtttgga agcatcgaaattgtttccacttgtgcggaggcgcccccacacttaatagtcatgcatgat gtatcaattatgtactagatgcacccctgtgccccaccaaaaccgtgatgtaacccccac tgtggagccatatggtcaggcgtcaaaataaacttcttctgttttcaactccactttcag ttaatgataaatcgaaggtaatatcattattcggtggcctccccaatcatataaagctta atgagtaagtattaagtttcatttcgctctccactttttacttgtaaacttttctctaac atgacagttgactcttttgttttgccccgtggaatatacacccccatcccaacctacttc aagaaggattactccttagacttagaaacccaagtagagcacgcaaagtttttacataat gctggtatcaatgggctggtcgtcgctggatccatgggcgagtctacacatctgactaga gatgaaagattgtcacttttccgtactttgagagaggctgtccctgatcccaatttcaag atcattgctggagctccgtcaggacctgtggaagagattagagaggagatgaggttggca aaggaagcaggtgctgatttttcaattatattggtcccgggatacttcggtccaaacttg atcagccaagaggggataatagaatactttgaacttgtttccaaggactctccgttgcca ataattatttataactacccaggtaactgtaatggtgtcgatatcaaaccggaaacattt gaaagtttgggtaagcttcccgcaattgttgccgtcaagtt
AA
MAYPDTFEGFAVHDPSKWSEVKKIQFTPRPFQEDDVDIKIEACGICSSDIHTISGGWGQPKLPSIV GHEIVGTVVRVGSKVDNIKVGDTVGMGAMCWADLTCDVCKSKNENYCPNWIDTYDDAYPDGSRTYG GYSNYARCNKEFAFKIPKGLSVEGVAPMLCAGITTYSPLKRNNIGPGKKVGVVGIGGLGHFALQFA KALGAEVYAISRNDKKKADALKLGADYFIETEKEGWNLPYKYKFDLIISTANSSQNFDLDAYVSTL NIGAKFVSVGLPEDKMEMNAGSFIKNGCYFGSSHLGNREEMKEMLELAAEKGIEAWYEPISISQEG IKNGLEKLHRNDVKYRFTLTNYGKQFE
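The ORF and AA entries of each record are related by translation. As a minimal sketch of how that correspondence can be checked, assuming the standard nuclear genetic code applies to these P. pastoris ORFs (the function name `translate` and the table layout are illustrative and not part of the listing), the first ten codons and the last four codons of the SEQ0084 ORF can be translated as follows:

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
# Assumption: the listed ORFs are nuclear genes read with the standard code.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(orf: str) -> str:
    """Translate a DNA ORF; whitespace is ignored, stops at the first stop codon."""
    seq = "".join(orf.split()).upper()
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        # "X" marks a codon that is not readable (for example OCR damage)
        residue = CODON_TABLE.get(seq[pos:pos + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


print(translate("ATGGCTTACCCAGACACTTTTGAAGGATTT"))  # start of the SEQ0084 ORF
print(translate("CAGTTTGAATAG"))                    # end of the SEQ0084 ORF
```

The two printed fragments can be compared against the beginning and end of the AA entry for SEQ0084. The same comparison over a full ORF is one way to locate characters damaged during text extraction.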
SEQ0085
P. pastoris homolog of Saccharomyces cerevisiae ALD2 (YMR170C)
Cytoplasmic aldehyde dehydrogenase, involved in ethanol oxidation and beta-alanine biosynthesis; uses NAD+ as the preferred coenzyme; expression is stress induced and glucose repressed; very similar to Ald3p; chromosome 2-1, 0453; E.C. 1.2.1.5
5' region (in bold stop codon next gene) atttccaaaatggtaacaacgttcaagtaactttggatgttctgaaatatatttcaaaaa aatatggaaccacggactactacgatgtggtcattggtattcaactcttaaacgaaccat tgggacctattttagacatggataatctaagacagttctatgcggatggttatgatctag ttagagatgttgggaacaactttgttgtaatccacgatgcattttaccaggcgccagagt actggggggacgatttcacctcagcggaaggttactggaacgtggtgcttgatcaccacc attatcaagtcttcgatgcagatgaattgcaaagaagtatcgatgaacatatagaagccg cctgtgattggggtagagatgcaaataaagagtaccactggaacctctgtggtgaatggt cggcagcacttactgattgtactccttggttaaatggtgtcggaaaaggcacgagatatg aaggtcaacttgataactccccttggatcggatcttgtgagaatagccaggatccttcga aattgagctctgaacgtatctgtgagtacagaaggtacgtagaagcccagctagatgctt tcttacacgggaaaagcgcaggttttattttctggtgtttcaagacagaggccagtttgg agtgggattttaaaaggttggttaatgccggtatcatgcctcagccattggacgacagac agtatccaaatcaatgtgggttctaagtttctccaatcaatgaaagcattaggatacttg aaagcaattgacgattagttttctgtcaatggcccttgcagacccttgtcatatgcaccc tttcctttcgttatgatatatagctgcattttatcgttcttttcttcattatatatgaaa attaaaacttcttcaggtagagcgaagggttgtaccccgtagtaaagtattcgctgatca ttttcatttgctcttatcttccccttcactttgaaaaaaaa
ORF
ATGGCAAGTCCCTTAAGCAAATCAATTGCCTTTCCTACCGGACAAAAATACGATCAGCCAATTGGC CTCTATATCAATGGTGAATGGCGTGAGTCAAAAGACACCATTGATGTTATTAACCCATCCAATGGA GAAGTGATTACCTCGGTCTACGCTGCCCAAGAGAGCGATGTTGACAATGCTGTTTCTAGTGCCCGA AAAGCATTCAAAACTTGGAAAAAGCTAGCTGGAGAAGAGCGTGCAACCTTGATGAACCGTTTAGCA GATCTTTTGGAGAAGAACGCGGAAACTGTAGCTGGAATTGAGGCCTTGGACGCAGGCAAACCCCAG TTCTCTAACGCTCTCCCTGACATCGAAGGATCCGTTAGTATTTTGAGATACTGCGCAGGTTGGGCT GACAAAATTTATGGCAATGTAATTCCGTCAGGCCCAGATAAACTGTTGACGTCGAAAAGGATTCCC TACGGTGTTGTCAGCCAGATTGTTCCTTGGAACTATCCGTTGAATATGGCCATGTGGAAAATTGCT CCCGCACTGTGTGCTGGAAACTGTATTGTCATCAAGTCCTCGGAATCTTCTCCACTTTCTTTGCTG TACTTTGCTGAACTAGTAAATGACGCCGGGTTCCCACCTGGTGTGCTAAACATTATCAGTGGACTA GGTTCAGTAGCTGGTGCCCGTATGGCTAGTCATCCAGATGTCGATAAAATAGCCTTTACTGGGTCC ACAAAAACAGGCAAAGAGATTCAAAAATTGGCCTCCTCAAATTTAAAGACTGTTACTCTAGAATGT GGAGGTAAGTCTCCGTTGATAGTCTTTGATGATGCCAAACTGGATCAGGCAATCTATTGGGCTGCG TTTGGAATAATGTACAACACAGGTCAGATTTGCACGGCAAACTCTCGTGTCTTGGTGCAAGACACA ATCTACGACGAGTTCATTCAAAAGTTCAAGGCTCATGTGCAGGAAAACTGGTTCATCGGGTCTCCA TTTGATAAGAAATCCACCATGGGGCCTGTGATCAACAAATCTCAGTTTGAGAAGGTTAAAGGTTAT ATCCAGAAGGGTAAGGATGAGGGTGCAAAGTTAGTTATTGGAGATGAACCAGTTACATTTGAGAGT GGTTACTGGATCCATCCAACTATTTTTGTTGACTGTACTCAGGATATGTCGATTGTGAAGGACGAG ATATTTGGGCCCGTAGTTGCCATCAGTAAATTCCACTCTCAGGAAGAGGCAATTGAGCTAGCCAAT GATACGGAATACGGTTTGGCAGCTATGGTTTTTTCTAAGGATATTGTCACAGCCAATACCGTTGCC AGCCAGTTGGAGGCTGGAACTGTGTACATCAATTCTTCAAATGATGATAATATCCGTGTTCCATTT GGAGGGTTCAAGATGAGTGGTACTGGTAGCGAGTTGGGAATGGAAGGAGTGTTAGCGTACACTAAA ATCCAGGCCATTCACACTAACCTGACCAGAGATTAA
Downstream (in bold stop codon previous gene) agggaatagctctcctgctatatgcctttcttcgttctgtattttttccacttctccatc gtttcaatagcagccttccaattatgtgttacactttccagctcttttcgaagagattca atctgtgtctccctttcatgtagttgttgtttgagaccgtttctggaggcaacttccgag tctgcagatagttcattcagttcaaaagcatttcctaaagagttgtatagttcattcatt tgggaagaaaactgtctgagaatattctctgattctgttttcgataattgtagagacttc accaactcttcttttgtctcatgaactgtagaggcagacccaggaggggaatttatctcc attgaagtcaattgggcatactcttctcttaggtcatacgatattttgccgtcgtagttt gcaatctttgaaatgaaccttactaaattcttgatgaaattgttgtaaatagtcaagtac ctggttttagagttattatcttgaatcagtgagtagattatatcatataaatcagacagc aaagtcatggtttctttctgtgctacgggaaatctctccactggggttaacttttgagtc tccgttgaagactcggctattttttgtctcttctttggttgatcatcaggcatgatagga agactgtatagggagaatactttccctttaatacccaactgttttgcgactagcttttct tggagttgatcaatctgaaggtctttcctcctgaattctgcattagatgattggagtaca gtttgtagttgtgatttagtcttgttgtgttgtttaatgtttgtttgcactgaacttgtc aattcatcgatctgtgctgttaactggttaatatgtctctcttgagaagttgacttggcc tccaaagtgctaatacgcttcttcaagagagattggtcagttttcagggagtctatcact tgtaaatgatctttttcccgtaagagaagatcgtctctgat
AA
MASPLSKSIAFPTGQKYDQPIGLYINGEWRESKDTIDVINPSNGEVITSVYAAQESDVDNAVSSAR KAFKTWKKLAGEERATLMNRLADLLEKNAETVAGIEALDAGKPQFSNALPDIEGSVSILRYCAGWA DKIYGNVIPSGPDKLLTSKRIPYGVVSQIVPWNYPLNMAMWKIAPALCAGNCIVIKSSESSPLSLL YFAELVNDAGFPPGVLNIISGLGSVAGARMASHPDVDKIAFTGSTKTGKEIQKLASSNLKTVTLEC GGKSPLIVFDDAKLDQAIYWAAFGIMYNTGQICTANSRVLVQDTIYDEFIQKFKAHVQENWFIGSP FDKKSTMGPVINKSQFEKVKGYIQKGKDEGAKLVIGDEPVTFESGYWIHPTIFVDCTQDMSIVKDE IFGPVVAISKFHSQEEAIELANDTEYGLAAMVFSKDIVTANTVASQLEAGTVYINSSNDDNIRVPF GGFKMSGTGSELGMEGVLAYTKIQAIHTNLTRD
3 P. pastoris homologues of genes involved in homologous recombination
SEQ0086
P. pastoris homolog of S. cerevisiae Rad50 (YNL250W)
Subunit of MRX complex, with Mre11p and Xrs2p, involved in processing double-strand DNA breaks in vegetative cells, initiation of meiotic DSBs, telomere maintenance, and nonhomologous end joining
S. cerevisiae null mutant: UV resistance: decreased; chromosome/plasmid maintenance: abnormal; colony sectoring: increased; gamma ray resistance: decreased; resistance to cisplatin: decreased; resistance to N-Methyl-N'-Nitro-N-Nitrosoguanidine (MNNG): decreased; resistance to daunorubicin: decreased; resistance to 5-fluorouracil: decreased; resistance to menadione: decreased; resistance to camptothecin: decreased; resistance to tetrabutyl hydrogen peroxide: increased; resistance to chromium(6+): normal; resistance to hydroxyurea: decreased; resistance to chromium(6+): decreased; resistance to methyl methanesulfonate: decreased; resistance to ethylmethane sulfonate: decreased; spore germination: decreased; transposable element transposition: increased; UV resistance: decreased; competitive fitness: decreased; fermentative growth rate: decreased; resistance to doxorubicin: decreased; resistance to hydroxyurea: decreased; resistance to 5-fluorouracil: decreased; resistance to methyl methanesulfonate: decreased; resistance to cisplatin: decreased; resistance to L-1,4-dithiothreitol: decreased; sporulation: decreased; telomere length: decreased; transposable element transposition: increased; viable
S. cerevisiae reduction of function mutant: UV resistance: decreased; X ray resistance: decreased; spore germination: absent
Chr 1-4, 0513
5' region (in bold start codon next gene) tccaagtctgctttgttccaaggctgaagcttggcataatcgtgtttctcattagtgtcg aattgagtggtgtcaaagtagacagatccatccttggtaacgtaggcataccctctatca ataattccctgaacaaattcaacaatttcaggtacatactcggatactctagtcgtaaca gtagctggtaaaatgttcaaacgttccatatcttcatcatatctctgttcccaataggaa gaagtctgtttgaagatctctggatctgtcacggtagatccatacaatttatccagctcc ggcacaacaacatccttgatattttctaaatattcattcaaatcagaacttgttttcagt gccttttttgcagcagtgatatgcataggaattttgggattgtccactgccttttcagca atgtttaaggattgagcccaactttcaaagtttgaagaatcacccttgaaatctgtcaga ttcttagcaacatagctgttccagctccctaacgcaaactgcttcagctcattggttatc tctgaatactttttcgaaaactctttgaataaatattcctgtctagctttcagtataatc ttgtcgtcaatatcagtgacattctgaacaaacttgatattataaccaaaaaagtcctgc aaaatacgacgattgatatcaatcgtgacgtagtttctagcgtgacccatgtgggaggaa tcgtagactgtaggaccacaactgtaccacgtcacctcattggcgtttgtgggaataaaa ggtacctttgtccttgtcaaagagttgaataatttaaggacgggccgttccaccttttgg gtaggctgattccatgctggttgagacatttgtgtgtggaaatttcttattatataaaaa gtgtttctcttccagggtctaatgttcatggaaacaaagtgagtgatcagacagtttgtt ttttttagccggctggtctggcctttgggcggaggctttagatggccaactggaaggacc tgcgcggaggaaaagttgttcatctctcccgttctcctaaacatatcgcgcctggatccc cgtgtagttaattattcaccctcagggaaccaagtccggttactgaactgacctagcc
ORF (in bold intron) ATgtaagttttaacctcaattcctctattcaatgctctttgactctaacagaaagGTCTTCAATTTATAAATTGG
AACAGTTTAAAGAGTTCACTGGTGAAGATCTTAACCCCGAAGACTGCTTACCTGTGTTTCTGGAGGTTCTGAAAA
TACGAAAACACGTAGTTGAAATTAGCAAACTTCAAGAGGATTTGAGTCATTATTATGTTGAAAACAAGTCCATTG
ACGAGAGGCAAAAATCTCAGATATCAATGGTAAAGAACTTTGATTAA
Downstream (in bold stop codon previous gene) agttgatcaagatttgctaaggaggatagagacggaatgtttcccattattcgacattca agattcaacatttcaaatctcaaacttcctaatgggaaaataatttttctcagagacgat catttaggtctaggaccagaaatggcaacgaacataacttggcaattaacattaattcgc cgtggcaaatgtctcttgtactttttgattaaagagaagaagaaatattttgcgggagga aaaatatatatatactactattcgcgattacaggagggtaatgccctataatctaatgtt actcctcatgcccttaccatgatagatcaattgaagcttgaaagattgtaaaccaaactc gacaacggcctgcgcgtcgaccttgagttcaccgttcacagctttccttaaaatatcagc tcctattccggaagttgcatatcggataggaattcgaggagttgtgcttactcccggtgg gatgataagactctgttctctcccaatatatcctaatacagtgccttcataagaggcttt ggcgttctgtatcaagacaaatgcttcagagttagatattggattgtagactgtaaattc tatctcggatgacatgagatgcatcgttgaatctaggataaaaccagattggtaatctgg gtttgatgataggattttgtgttcttcgtagacaatttgtggaattggtaatgtcactga taaatttttcaaaacttgagatagtaaaggtgaagtcggtatacttctatcgtggccttc aatggttactaatgtttcctcgccagataggtatccgctgaccaattgttcgagtttctc tattggggatacatctgatgatgggatattgtcaattgtgtatcgattcggatcaatgtc aaaaagcgcagaagctgagaaccaggagcttcctgaactttcaatgttctcaaataagga gaggtggcctagtgtcgattcattgtatgcaagttgtaaat
AA
MANWKDLRGGKVVHLSRSPKHIAPGSPSSIYKLAIQGVRSFDPHTPETIQFΞKPLTLIVGQNGSGKTTI IECLKY ATTGDLPPNSKGGAFVHDPKIAGDNQVKAQVKLAFQNISGVSMILTKTLQWQKPGKSLQFKTLENQLSVINNGE RTRVSQKAAEIESLIPNYLGVSKAVLNYVIFCHQEENLWPISEPMALKKKFDEIFDSVKF IKVLEGFKGITKDMS VDIKLLTNNVEHLKNDKKRADLKRAEMEALQNTVEDYNAEIIDLNTQVEEVTRKLDELFKSNQDFEKVLSKLDFL ATEKQSTQQQIDRLFSSLTVLPDSTEILENNLSNYGKLLTEKRVKVNEQIQLAKEASESLNALRDEYSDTIKKEG ELRGLESTYKNYIQERIQLIQENASLLGMTISETTISSDE IEEASSKASVLFSTCKKKLDLQTEHYDTRIHDLNL EIGQAESKLGKEEERSSYLKNDINSLKKRNQALQKSINDINSNESEFNETKEDIERLTKQLEDLRSENKLASINN DLKQNQDKILVLENELDQINKQI ITSNRQGEVLAKLHLLKENTKKGNSSISKLVESYGEQFKEFTGEDLNPEDCL PVFLEVLKKRQEDTDLKRKEVASFKQNEYESNHDRSLLEKKLEQSRSQLQECRSRIVSILEDEPIEEYES IVKDL ESDYEIALQNSKLNWATKNFNETALKIAKEHQYCILCKRELNHDELGPVMVTISENIEKANDDLYSKEKDRIKQD
LDDLKSIRDDISNFRNLEGSIASANDELNELKTFVSDDLARVSSELEQLESDLVSLESLRKHWEISKLQEDLSH YYVENKSIESELSAYGVPAKTLSELQEDLNGKNFQLKELRRQADDLKEQREFSNRELSMLEGNVKDKRLLISNFE KSLMIKLNLEKGIDENKARIDELQQTGETVLESMKLTKDRLNKLKDNVKTLEEERDSTLTQLQSNVEQFKTVQDS LVRLNSFVNKYELEDEPILRQCEKNSEHLKASIKDAEGQLNKLHEKVNVLEKQLSEADTEERNIKFNLDLRSLNK ELVQIEESIESLDRQNATSKRQEFQKETAILRESYSRLSSAHSSKMGEVSQLQKQIQSITEEIKRDYEHVEDEYY QEYLKLQTKMFISNDISVYSFGLDNAVMKYHSIKMEEINRIIDELWKRTYSGTDVDTIMIKSDMNTQVKGNRSYN YRWMMKEDVELDMRGRCSAGQKVLASI IIRLALAECFGVGCGMIALDEPTTNLDEENIESLAKALNSIIHLRMS QKNFQLIVITHDEHFLRHMNATEFCDHFFRISRNERQKSQISMVKNFD
SEQ0087
P. pastoris homolog of S. cerevisiae Mre11 (YMR224C)
Subunit of a complex with Rad50p and Xrs2p (MRX complex) that functions in repair of DNA double-strand breaks and in telomere stability, exhibits nuclease activity that appears to be required for MRX function; widely conserved
S. cerevisiae null mutant: chromosome/plasmid maintenance: abnormal; gamma ray resistance: decreased; Rec104p accumulation: decreased; resistance to hydroxyurea: decreased; resistance to chromium(6+): decreased; resistance to 2-phenyl-3-nitroso-imidazo[1,2-a]pyrimidine: decreased; resistance to daunorubicin: decreased; resistance to camptothecin: decreased; resistance to N-Methyl-N'-Nitro-N-Nitrosoguanidine (MNNG): decreased; resistance to cisplatin: decreased; chitin deposition: increased; competitive fitness: decreased; fermentative growth rate: decreased; resistance to doxorubicin: decreased; resistance to hydroxyurea: decreased; resistance to methyl methanesulfonate: decreased; resistance to Calcofluor White: decreased; sporulation: decreased; telomere length: decreased; transposable element transposition: increased; viable; resistance to methyl methanesulfonate: decreased; resistance to chromium(6+): normal; spore wall formation: abnormal; telomere length: decreased; transposable element transposition: increased
S. cerevisiae reduction of function mutant: gamma ray resistance: decreased; resistance to methyl methanesulfonate: decreased; resistance to hydroxyurea: decreased; telomere length: decreased
Chr 3, 0851
5' region (in bold start codon previous gene) tgtcaaacacatagaagtcttttgagtttagctcacagtgaggtggaactaaatatttct cttcaatttcataggtgttcaaccccagacttctcctctttttcttgttactaccgtcgc tactcgttttcgaattcttactgcggtctgtagttgaagacgatgaatcatcaaattggt tagtctcattgccatcattatcgtaattgtcagacgtattcgttcttcgtgtgagctccc tctggatgagatcattcagcatttcttcttcctttgcacggtcctcgttctcctttgcca gccttgccctttcaattgcatccagttcttcttgtcttctctgaagtctgtgcactctct cttcctccaatgattcatttctggcacttgattgccactcatcaagtttctcctggatct gagttatgatttcgtagcacatttcttgacctttcaaatctttgaaagtatgcttaatca tatcgttgagcttatctatctgagaagtaagcacattcttagatttgacgaactttatga taggcattgtcttgggataggtcaaggtcatctcaaactggattgataaagacaagacgg gttcttttgtatcgtctgaatgaagatgtacctcaaatttgggataaggatccttgtgcc
aggctgttgatgtctctgtcaggtcgatgagatcatccatatagatcgccttgagagcct ccagctcctgaacttggagatcatgattttcctttaaagtagatgtcatggtagtcaata taccttgcctgtagatccagaagccgttttagtgaaaaaactactctgctccaaggttag aaggtgtgttgatgtatgagcgtaagcaggaagccaggtatttttttaagagattctgat cgaggccaactacaaatcgagattgtcaggaggggacagaggacaaaggatcacaccagc ctacagaacagctagcccagcctctcaggaacttacagtaa
ORF
GCCACGCCCAAACCAAAAACAAAACCAAAACCAAAACCAAAACCGAAACCAAAAACTGCCAAGCCCAAAGAAACG GGAGAGACTCTGGGATCTTTGATTGGCAACCTAAGTAGACGTTGA
Downstream (in bold stop codon next gene) cgaagacaaacttccaattaatatacatatatgatgataaaaatctataaattcgttaga tactctcttttccaagtccaatattcttgcagtcgaagagcacaatactctgaatagtca aactctatcaatgatattttactctggataccagcccagattccccagtaaaaccctgga agcccaaaaaacaaagaaatttcatcgataagttgactaacgtcagcctcgtcaacttgc ttgtccacagaggcagagacgtaagatcttacccagcttcttagaattgggttatcctgg gatatatcgagaattctggattgatcgcattcaaaaccctgccattcaaccagatggttt gcgatatcaaatgcccttggtcctggcagaacgtattcgtaatcaataaaaagaataggg ttattttcaatagtgggcaacttgctactagttggggttcctctcagaatcacatttcca cttagtaaatcacaatgacaggtaacagttggagatttgtttgagatttggctttttatc cattccagctctttcctgaaaatggcttgtagatctaatttatcgtgggtgtcctgcaaa atgtctttattttcaaggcatgatttttcaagctcttttataggaggcagaatattgatc caagcctccaaaagctgccagatgtctgccgaaaacttctgtgaagcaccttgggcatta gtgaacgatttcaacttcacaagggcgttttctatttcgtctttgtcaataatgttatgc cactgtcctaggcgttgtgcaatcaaagggtataaggtggggtcagacagttcctctggg gtcaacgagcgaccttccagatatccgtagaccaacccgttaccaaatcgactaaaaatt tggggggccaaacccagattattcaggagcaactgggacacaaactctctatctctgtct atgatcatgtctgtgcccttaccatacgttctgatcaagac
AA
MPHVDRILPGKDTLRLLLTTDNHVGYNELDPIVGDDSWKTFEE IML LAKDRDVDMVLQSGDLFHVMKPTKKSMYH VMRI LRSNCYGEKP IEFELLSDPSLCLDNRGFNYPNYEDPNI WSVPFFAI SGNHDDATGDDNLSPLDVLSVSGL MNYFGRWDNDNINVKPLLFQKGRTKLALYGMSNIRDERMFKTFRDGRVTFSTPGIQTDSWFNLMCVHQNHVQHG
ARTAYLPENFLPTFLDLVVWGHEHDCIPYPVPNPETGFDTLQPGSSVATSLSNGETLEKNVFILNIKGKDFSLEK IPLKTVRPFVMKDISLTQLGLNPNSRNKKEVLDFMIDEINGLIEEAQKSWLDKQAENSSSVDDSEVDTPLPLVRL RVEYSGGFEVENPRRFSNRFVGKVANVNDIVIFHRKKEHTTGATRTKPNLKNGEEHLELDELNISKLVDTFVDDN QLNLLNKKDVGSWKAFVEKDDKAALKTFIDEELSKDLKLLMGLSHGEHIEDESISQKKSFNKILFDIKKEKSQA LMSKICTDSIKHIPESLPERPAFLRSITGPDRDSEDAQPKLTNRKISPSVTKKRVSKPSIQSIVSSESEILSEEL DDFIDDDIDEDMDINSNSESDIDDFIDVPPPKKTRQTRAKAKTAPIAKPRATPKPKTKPKPKPKPKPKTAKPKET GETLGSLIGNLSRR
SEQ0088
P. pastoris homolog of S. cerevisiae RAD51 (YER095W)
Strand exchange protein, forms a helical filament with DNA that searches for homology; involved in the recombinational repair of double-strand breaks in DNA during vegetative growth and meiosis; homolog of Dmc1p and bacterial RecA protein
S. cerevisiae null mutant: chromosome/plasmid maintenance: abnormal; gamma ray resistance: decreased; metal resistance: normal; resistance to 5-fluorouracil: increased; resistance to 2-phenyl-3-nitroso-imidazo[1,2-a]pyrimidine: decreased; resistance to daunorubicin: decreased; resistance to camptothecin: decreased; resistance to N-Methyl-N'-Nitro-N-Nitrosoguanidine (MNNG): decreased; resistance to cisplatin: decreased; UV resistance: decreased; competitive fitness: decreased; fermentative growth rate: decreased; Rad52-YFP distribution: increased; resistance to 5-fluorouracil: decreased; resistance to hydroxyurea: decreased; resistance to doxorubicin: decreased; resistance to methyl methanesulfonate: decreased; resistance to 2-phenyl-3-nitroso-imidazo[1,2-a]pyrimidine: decreased; resistance to cisplatin: decreased; resistance to lovastatin: decreased; sporulation: decreased; transposable element transposition: increased; viable; resistance to hydroxyurea: decreased; transposable element transposition: increased
S. cerevisiae reduction of function mutant: UV resistance: decreased; X ray resistance: decreased; sporulation: decreased
Chr 3, 0904
5' region (in bold start/stop codon of previous genes) tccacagtgaagagatccatctcctcaggattaccaccgtattgtatatcattttctctg gactcttctagacctttctcaacatcctccaagtcaatgttttttctccaagctctcttt cctttccttgaagaatgattcttcttgaaagtacctgcttcactcatagttattttcaag tctgaaatttttttaagataaaaaaattcaggtgcgaaaaagaaaaacagcgattagate atcaagcttgatgtgataacagctaatggggaaatatagtgtacaggagttcaaattggc cctgctcaggttcagccgacaaagttttgagacaagacagatcttctctacttgaggtgg ggttagactgatagtaaagtctgcttctttatgcttcattcactgttgatgacaatgggt tctgggtacctctaccatctttcaataacttgtccacctctaaaataacatccctaatac tttcgtctgtaaagtttccggagatggagacatatgtaaccacttggtatttcttggtga gcagataggataggcgtttcgtcacatctaatagctcttgatttccgttgagtaacgtct gatggacatttcctgttctctaaaggcgaagttagttatttgcaacactttgcatcaaaa aacaactacttacatctgggattcctaatacgtaacatcctaaattgtcatcggttccat tctcactgaggtgaagagtcatggggattcgacatctcgagctcgttttgatttcttcca atgatggcgtttctaaatacaataagaaaggaccgcccataaatgggcttaattcagctt taaagactcttcttgtcatgtcaaacctcaatcacttcgtattcagaggcaatacaaaaa gagaggagagaagatcgatgatcatgtaatatttcaaaagtcagtcgcgtaaacataaag tatttatttacccacaacacgtaagttactctgaacgaaag
ORF
GAAGATGGAATTGGGGACCCTAGAGAGGATGATGACTAA
Downstream (in bold stop codon next gene) atatgattgctatttacaccctaattaatattgtgagagaattcaaaactccggagccat atttgtatccagccatcctaggaaatcatctcctttatcggaccagttgttctgttgttg gaagattgaatcgataaaactttgatcattattcagatcattcatattgaactcttcgtt ggccgacaagaacgattgaggattgttattcgaacttgtagtagacgcagtgctgtagac atcattgacgtctaattgcgagttgaacaatgacgagttgaacggactctccgttactgg aagtttttcaattttaggtacaacaggctctttactgttttgagaaactcttctaggtgt gttcacatgggcttgagagttggaaacagctttggaaacatccttgctaggtttgtttct cttttccttttggttgctggaggccaatgaacctgtggcttcaagcatcgccatgggaat tccattaatcattgtaggttttgctaagcccgctgctgtagctgtagctgtggcatttaa taaggcctgatccgtaggaaccaatgttgtaattgtggtaccattcggcgtcgttgaggt gatcgtggtaaagtcctctttcgtaatttgattgtaaaatggtaatggaggaggtctact agcgagcttatccttagatttatctttgcttttagtctgagtttcgttcaaagttcttct gcgagcttcatgaatacaccaaacgaggtcgtagaataaagatcctgtcaaatgtgacct catccgggaaattattccttcagatttgacaaataattctggatgtgtgattatcactag gtttagcttctccaaaactgttgcagttcgactaatatcattctctacatctttccacgc tgaaagcatgtttctgaagagtctgtggactgtgacaatactttgcctcgcactgtcaac gtacctgggtaataaataaggtgttagatgcaatctgaata
AA
MSNHEVIEDSQRVMNLEEEHQVNSGNGHVEQEQEEDDEMFGPMPISKLEGNGISPGDIRKLMEAGYNTVEAIAYT PKRALLTVKGISEIKADKLLAEASKFVPMGFTTASEFHHRRSELICITTGSKKLDTLLGGGIETGS ITEVFGEFR TGKSQLCHTLAVTCQLPIDMGGGEGKCLYIDTEGTFRPIRLVSIAKRYGLNEDDTLDNVAYARAYNADHQLQLLN
QAAAMMSESRFSLLIVDSIMALYRTDFSGRGELSARQMHVAKYMRTLQRLADEFGIAVLITNQVVAQVDASAVFN PDPKKPIGGNIVAHSSTTRLSFKKGKAEQRICKIYDSPCLPESECVFAIYEDGIGDPREDDD
SEQ0089
P. pastoris homolog of S. cerevisiae Rad52 (YML032C)
Protein that stimulates strand exchange by facilitating Rad51p binding to single-stranded DNA; anneals complementary single-stranded DNA; involved in the repair of double-strand breaks in DNA during vegetative growth and meiosis
S. cerevisiae null mutant: UV resistance: decreased; colony sectoring: increased; gamma ray resistance: decreased; metal resistance: decreased; metal resistance: normal; resistance to cisplatin: decreased; resistance to hydroxyurea: decreased; resistance to N-Methyl-N'-Nitro-N-Nitrosoguanidine (MNNG): decreased; resistance to camptothecin: decreased; resistance to daunorubicin: decreased; resistance to bleomycin A2: decreased; resistance to tin(2+): decreased; resistance to methyl methanesulfonate: decreased; resistance to tetrabutyl hydrogen peroxide: increased; resistance to chromium(6+): normal; resistance to methyl methanesulfonate: absent; resistance to chromium(6+): decreased; resistance to menadione: decreased; resistance to ethylmethane sulfonate: decreased; transposable element transposition: increased; glycogen accumulation: increased; competitive fitness: decreased; fermentative growth rate: decreased; mitochondrial genome maintenance: abnormal; Rad52-YFP distribution: increased; resistance to sulfanilamide: decreased; resistance to doxorubicin: decreased; resistance to hydroxyurea: decreased; resistance to 5-fluorouracil: decreased; resistance to methyl methanesulfonate: decreased; resistance to cisplatin: decreased; transposable element transposition: increased; viable
S. cerevisiae reduction of function mutant: X ray resistance: decreased; mutation frequency: decreased; sporulation: decreased
Chr 2-1, 0153
5' region (in bold start codon next gene) aagtcaacaggagagaatggtaaaacaaggaacagaataacccaaatccgatgttcaaat gtggcctgttgtgattcageccaagaacccaatagaaaggatataacaatacgttgtaaa ctgaagatcttgacaaaacgtatagaattggtgttgaaagagagaccaaaaccgaagagg tcaatgatccactcaagacagtggtgaagttgatcaaattgaattctgaacgtggctcaa tgcggtacacaccaatttgaaagtttagtttgtttcgcaagaatactgcctgttgaattg attgccatagcaccaaagtgccaaaaaagaaccaaaagtacacaaactggtcgttgatat aaggtagcactgtccttgtgggagtgataatataatacgtaagatcgtcactcttctgtg agaataagataacccagtaagtgaaacctgatattccatatgccaaaagatgggagacaa aatctgcacaaaatatctggcccagaatttgtctgaaaaaggttttgaaaccaggatttt ccaacgtttgagttgactccttgcaatgcctgaccaggatcaaagaaacacccagcaata accatttcaaaggccaaaaccaaagtgatgaacctgcaaaaatgccaatgaaaagtgacg agaaaaaggtcaacaaaacaacaattctgttcacaaagctcagacgttttctttggactt gttcaaagatcgttttgtagtagattaatttgggtttggcatcttctatggtcattgttt gtagtgtaggagaggacaacttgagaagtaattaatttttgagaacacggcgaggctaaa gggcagagcaatgtccgcctaacggagaaaccgtgcaaggggtgactttagtacttgtcc cgtatgcgtggaggggcatcgaaaacgcgcaccgaacatcaaatatttactctatcgaac gcgttccacctttgttctctctttcctgcaaaccacactcc
ORF
GAAAACTCTCCAGCTTCGAATTAA
Downstream (in bold start codon previous gene) gtaagcagttccaacttttggagatttgtatgcatacaatcactcatccctcgcatccac cttattttattcatcgcgactcgagccaagcatgctttctttcctctttagcccccaacc tacagtgaaccaaatacatgtccattaactcaatcaaaccatttacggagaagaagggtc cattcacagacgatgtttacgagcctaatacggaagaaacattcaatgccatcagaagtt caaaaatcctagtaattggggcaggaggattaggatgtgaaatcctcaaaaatctgagtc tttcaggattccaggatattcatgttattgatatggatacaattgatttaaccaatttaa atcgtcaatttctattccgaaataaagatattggaaaaagcaaggccaaggtagcatccc agtttgtcatgaaccgtatacctaacgttcaaataactcctcatttttgtaggatccaag ataaagacgatctgttctacaggcaattccaactggtaatctgtggtttggacagcacag aggcaaggaggtggataaatcataaattagtcaccttgctagatccaaatgactttteat cgttaattccaatgatagatgggggaacggaggggtttcggggtcaatcaaggttgattt tacctacgttatcatcatgttttgaatgctcactagacatgatcccaacaaatgtaacat atccagtgtgtacgatagcgaacactccgagacttcccgaacactgtatcgaatgggcac atcagttggaatggcccaagaaatttggtgacaaaccttttgacgcagacgatccgtcac aagttgactggatgtacaaaacaagtctagaaagagcaaagcattttgatattgaagggg
tcactctgtccttgactttgggagtggtaaagaatatcattcctgctatttcatctacaa acgccatcattgctgcttcctgttgcaatgaagccctgaaa
AA
MSFDDAELKRISKELDKQLGPEFICTRPGQGGMKVSfLSGTTAISLANHIFGFNGWHSEVKSTTVDFVDTQHGKI SMGLSSVIRVTLKDGSFHEDIGYGSVENAKSKAIAFEKCRKEAITDGLKRVLRCFGNALGNCLYDKEYLRKISNV KTQAITFNEGDLMRHNQLEARTLKQEAKLLEINQFKAKNSSKSRINVPPKHFGDNEDDDSHLFSDEINIDSEDFM NEIDDYEMDLLMQKNSQREIENVNDDEIENKGDAVDAVTKSNIENQFIAPNSPQIMDNMVLTPDKIPAKVEFVSA KVAEKVQNNSPLNVEEKFNPΞFQSPSLRRTVDPTKSMPIRRTMVQTSVLSQΞKQGPKRMIGIPPDTAKRHKLGSQ NQNTANDKPEEKQLSRFNSTKAPGKENSPASN
SEQ0090
P. pastoris homolog of S. cerevisiae Rad57 (YDR004W)
Protein that stimulates strand exchange by stabilizing the binding of Rad51p to single-stranded DNA; involved in the recombinational repair of double-strand breaks in DNA during vegetative growth and meiosis; forms heterodimer with Rad55p
S. cerevisiae null mutant: X ray resistance: decreased; gamma ray resistance: decreased; resistance to 2-phenyl-3-nitroso-imidazo[1,2-a]pyrimidine: decreased; resistance to daunorubicin: decreased; resistance to N-Methyl-N'-Nitro-N-Nitrosoguanidine (MNNG): decreased; resistance to cisplatin: decreased; resistance to hydroxyurea: decreased; sporulation efficiency: decreased; transposable element transposition: increased; competitive fitness: decreased; Rad52-YFP distribution: increased; resistance to hydroxyurea: decreased; resistance to methyl methanesulfonate: decreased; resistance to doxorubicin: decreased; resistance to cisplatin: decreased; sporulation: decreased; transposable element transposition: increased; viable
S. cerevisiae reduction of function mutant: X ray resistance: decreased; spore germination: decreased
Chr 1-3, 0025
5' region (in bold stop codon previous gene) aaaaagtgtccttcactggttctactggagttggaagtgtcttgatgactcaatccgcta agactttgaagaagctttcgtttgagctcggtggtaacgctccattcatcgtttttgaag atgcagacattgacaaggctgtagaaggggctttgatatctaaattgagacaatcagggc aaacctgtgtgtgtgccaacagattgtacatccatgagtcaatttacgatgaatttacag aaaaattggttgcaaaggtaaagaaatgcaaactgggcaatggtcttacagaaggcacta cacacggacctctcgtccacagtagagctgttgataaagttcaacaacatgttaaggact gtttggacaaaggagccactcttttgttaggtggtaaagtaagaaccgatttggggccaa atttccatgatttgaccgtcattggcgatgtgacccaagagatggcagttgcacacgagg aaacatttggaccattggttcctctcataaagttcagctccgaggagcaagtactagaat gggcaaacgacacagaatttggtcttgctggctactttttcaccaacaactactcacgaa tctttagattgggtgagcaaatcgaatgtggtatggttggtatcaatactggtgccatct ctgaggccgctgtaccatttggtggtattaagcactccggttttggtagagagggttcca agtacggtgtcaacgattacatcaacatcaagaccatggtgattggaggactcbgaactc gccgttcattttgtatagtttgtttatttaaggaggttgactaaaattacgatatttaat atagaggataattattgctaaggaagatttcggaagttccattatctgtgctcgatttaa ttcaccttcggatgtctactcgatatccttccaccaggaacctcttggtataacacgctt tggtttcaccttccagtgtcttttcagatcagctgatagtt
ORF (in bold stop codon next gene)
AGCGGCTGGTCCAACGAAACTATCCTGTATAGACAAAACTATCAGGATGATGATGGAGACGCAGACGCTTCGTTC
AGATTTGAGATCTGGAAAGGTGGTATAAGGGGGGTGGTTAACTAA
Downstream (in bold start codon next gene) cctgcaggtcttccgcaacaatagtgttgttgtcatcctcatcatcatcctcttcaataa atacatcaggatcttcttgattaggatcgtcatcgtcgtaagtgagatcatattcttcat cgtttagattgaagttcccactaacttcgtctagagtgcttactttggaatttcttcgca tgggaggggagtatttgttcttaatagtggccttgagatgcaggaacgcgacccgtttcc gtttcttattaaagatgaaccacattaaacaccatagcgtattaggctctaagtcagcca aaactccatccttgggttccaacgcgtaaatggagcagtcttgaatgtccatatgggaat tgatagtctcccaaatccagttaatatccttatcaccatcgtctatctgatccttatagc ccaaagacaatacaattgagttgaaacggttcatcatggaatttgcagagtcaactgctg tgaagtagtccggcgtcaatgaactaaagtcatggtcgggatacgacgcgttcagtattc ctattaggtacgaaaagatttttctgctatgtgtggaagttatgggaccaaatggggtct cctcgtcatacagcgcgctttcgtttatgtcaatattctccaaagacttactcaatgtag ttgcttggtggtggttgaaagactttcttttcaaagttggtttcacaatactagatgagg aggccctccggctaaaccgcgtctgaaaatcaacattgtcggttatcctggaagctgtgg gcggagagtgagacatgttccaattcaagctttcctgcaatatgtcatctagatgcttat caatcgttttatatagtttcttgtccgaccccacaggcttcgtcgtgaacaaatcgcaac cgccacgaaccgaatagtcaggagtttcaaaacagagagcctggttaaccagctctatat caatttcatcaatgaacttcattgtagtgaacgtagcatgc
AA
MDLYEQCPSSELLLNERFEPLLTVLTKAWILTSDIVISSPQDIHKRVSNLTGGKLTISLNELIVFRSLLMKELNS QIESNLVNLNTSMLQEQTLKLNDLSFTTGDELLDYYLRGEYDDNWLPLGYLIEVTGGSSCGKSHFLMQFCLTVQ LPIQLGGLGADLEAPESERTNECKAVYVSTEGYMETRRLSEMCDYYGNIMTTNGIADSDLYPSTSNVLQIQCSDL ETQDHIIFVQLPALLEKESGFVKLICIDSISHHVRVEFENRSYQESMARNTYLfDLSRHLKRLAKQYEVCIVFSN QVSDKPITKLISNDRLALDYQLGWMSGWSNETILYRQNYQDDDGDADASFTSSSRKRDFSSMTEEDFADEEDDGT KEDTDTDERNFTLQDELSFTEQIKLDKHEYLFGNEVTKLKETKIPALGYLWNTLIDVRIVLFKNSRPIIDENLID SMAFDLGLSDFKNIDQLB ENRMMNEGNWATERWCRLCFSPFNGYSSDKVQRFEIWKGGIRGVVN
SEQ0091
P. pastoris homolog of S. cerevisiae RFA1 (YAR007C)
Subunit of heterotrimeric Replication Protein A (RPA), which is a highly conserved single-stranded DNA binding protein involved in DNA replication, repair, and recombination
S. cerevisiae null mutant: UV resistance: decreased; resistance to methyl methanesulfonate: absent; resistance to hydroxyurea: decreased; inviable
S. cerevisiae reduction of function mutant: budding: abnormal; heat sensitivity: increased; radiation resistance: decreased; resistance to methyl methanesulfonate: decreased
S. cerevisiae overexpression: cell cycle progression in M phase: abnormal; chromosome segregation: abnormal; nuclear morphology: abnormal; position of spindle pole body: abnormal; resistance to hydroxyurea: decreased; resistance to nocodazole: decreased; spindle morphology: abnormal; cell cycle progression: abnormal
Chr 1-1, 0113
5' region (in bold start codon next gene)
tcttattacttgtatctagatgttttatgagagcttggtcagcaatcctccagaacttca catcaccattctctaaaccgcagatggcaatttttccttcaacctcaaagaggttttcga tacgagtattcaccactgaattggcctgactcctgtcttcatgtattttgacagcggtaa ttccacacaaacaatccatcgaggcaacgctcatcccactagagcattcccacagtttea attgtgcgtctaaggatgatgtcacaaagtttggcccctcccaaccagactgaagtccgt aatcgatgcctgatgtcctgaaaaagttcttgcattgtctccagtctcacaatcccaaat ttttagcatcatgtccaagccagcggaaagcaggaccttatcagatggaaaataaaggac ctttacgatcccaccatcatgagccattacttctttcaactcatccctctttgatcccag ctttccatacaacagtcggccattgcggtctccgacaacaaaattgcattttgccgtaac attgacagacgtgagcttatgcgaaataaagtcatacttttctacggggaagacacattt caatgttgactcttcatctgagagctgatatttccagtgatctagcttacgtaacgtcaa acctggactgagtggttctagcttgtccttgtagactctaaacttgctaggtttggtttt gccatcaattatttcageccaaacatcttcatgattcacacggccagactccacgtcact aataacgtccaaaaagctgtcctgaattctcacaagcttcatcgttgggagttgacacag gaaagtcaagggcatctaacgaaggcgcgataaaacgaaatcacctagttctggcacccg tcatcggggtgttccctgtgctgatcgcgcatcgcgtaaattgccatcagaacgtatgta acgcgtctcacgctttctctactcactctacgactaagaca
ORF
AAACTGGATTTCGCTTTGGAAGCAGATGCATTGGCAGATTATTTTGAGGGTGTCAAGGTTAATTAA
Downstream (in bold stop codon next gene) gtttgagttctacgtacggtgtattttttcttttaatttcattaaaggcacttgttcctt taagaatattctattagttcgtttacctgtagccagaactttttcagtaagtcattcacc tttcttgaaccgcgagttaacatcagttgctgacctagcctaacggtatcattggtagga atctgttttttcaagagctggttattgttgatagtttcaaatgctagattaaaccatttc ttgaagttcaaaggatatttgacaagtatcaccttgaatagttccatgaagttttccaaa tcagatctagatgccatcaagaacgaagcaaccaacttgaatatcaatgacggccctagt tcggtatgctcaatcatgtctctgataataactgaatcttctttgtctccctttctgaga gtgacaagtttgatccaaaacttacagcctgatctaagcacaaatttttctttagaaccc aaagattccaaaacgaatggtaatatgttatgcaaaagatagccttttattagtatgcct ggtttcttctcaagtatcactaccaaaatgttgaggaaacataatgctagatcaggatcc tgttggatgatagccaaatactccacacatgattgctgaagcgtggtcgtcaacaaattt gaacctattgacgagctctttattaaggattctagtaaagaatactgtaatggaacaacg ttcaatgaatttttggaaaaacattgcaaaatcttcgcattcaacagatcgatgatagag taggtgtcaaaaacaaaaggacctggaatagcttctgtcaaacctgctttcaaaattcca gtacaaacttcgttcatcatataattatcagaagtaagaggattcaaaatcataaactgt
ttcactacttcagtgatttgctttttaatgtggagggggtcttcaatccagtaagcgtac acggcatctctttcttcaggtgtgtaccagtcgttcacatc
AA
MTEFSKGSLVEIFQKGYKGGLKPLTVQVLNLKAIPNNTGKRLRLALCDGLYNANAVIRPESVEKAEAQGIKKGS I VQLLEYKASMISPVKHVLIIDNLQVLGFQEEKINPSPTSVDQYFSNHSGESNEDLLGTSMNSPAPQEPAQKAQSH HQEDAKPKLSAQVTSKPQQTNSSTAKFPNIHAIDQLNPYQNNWTIKARVSYKSDMRKWSNQRGEGQLFNVNLLDE TNEIRATAFNDVADKYYDLLQEGKVYYISKARIQPAKPQFSNLTHTYELALDRDTQIIEAEDASDVPSLHFNFVK LNKVQDLDANAIIDVIGVIKWNPAFQIVAKSTGRPFDRRDIEWDNTGFAITVGLWNNTALEFDIPVGSWAFK GAKVQDFGGRSLSLTQSATI ITNPDSPEAYQLKAWYDQQGGSNQEFKSLKNEVSSNSGLNTKQDIQSRKTILQAQ SEELGKNDKPDYFS IKAYISYIRTENFSYPACASEGCNRKVIQQSDDTWRCEKCDVNHPKPNHRYILTLSWDHT GQLWVTLFDDQAQQLLGQSAGELIDLKENDMSENMHAFQQVFNRIQMKEFSFRVKASPDSYKGQTRIRYNAVSLA KLDFALEADALADYFEGVKVN
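The ORF fragment and the AA listing of this entry can be cross-checked by translation. The sketch below is an illustration under stated assumptions, not part of the disclosed method. It assumes the standard genetic code and that the printed ORF fragment starts on a codon boundary. The names `CODON_TABLE` and `translate` are hypothetical helpers.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 0 until the first stop codon.

    Whitespace is ignored, unknown codons become 'X', and a trailing
    partial codon is dropped.
    """
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# ORF fragment and C-terminal residues copied from the SEQ0091 entry above.
orf_tail = "AAACTGGATTTCGCTTTGGAAGCAGATGCATTGGCAGATTATTTTGAGGGTGTCAAGGTTAATTAA"
aa_tail = "KLDFALEADALADYFEGVKVN"

print(translate(orf_tail) == aa_tail)
```

The same comparison can be repeated for the other entries. A mismatch would point either to a frame offset in the printed fragment or to a transcription error in one of the two listings.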
SEQ0092
P. pastoris homolog of S. cerevisiae Rad59 (YDL059C)
Protein involved in the repair of double-strand breaks in DNA during vegetative growth via recombination and single-strand annealing; anneals complementary single-stranded DNA; homologous to Rad52p
S. cerevisiae null mutant: resistance to hydroxyurea: decreased; mitotic recombination: increased; Rad52-YFP distribution: increased; resistance to cisplatin: decreased; resistance to methyl methanesulfonate: decreased; viable
Chr 4, 0087
5' region (in bold start codon next gene) gtaattttggaatgcagttggtttcaagaaaagttttttggttggaggcattatattcta gaaggttgcaaatcaaagtcagacaatcttgaaccacaattgagcccctgattcctccct ccaattcaataatttcaaaaagtttatcaaagcaattttgaaataccaccagcttttgaa tattattattctgattgaccaagcccataagcagcaaaatgatctcgtttcgaattggct cattagggtcgtctagcaaagacaccaataaaggcgttcccaatggtgatgatacaatgc attcttttgatcttgatggtctacacgacaccagagcctctagtatctgtatggtgtaga tcttgatataaaaatcatggccaatctcactatctaatatctttaataaaaggtcgaaat tctcaggggcctgggttaattcgtcagcaactagcaaggagaactggtccattcttactt cacctgctagaaggctgggagaggggtatttaccattttgagctcgtgattgttccgata tcaaacttctagtttggtcttcatcagattcacctctgatgaatagtattaacaaagtct ctaaaactgcctttacgacgtcaatatcctgtatatccttgtgtagggtggaaactaaac ctttcaaaccattaccaataactgattccctgtaatctctactgaaattcttcaacgaaa gaacagcggctcggcgatcagtgaggagggttgcgttggaaagacggtcacaaatcgtct caatgaccccctctgggtcaacctctgtttttttgctcctaagaccagagaacatagccg tctttggaggggaaatcgaaggttttgacgagagacgtgtacttttctagttgatcgttc ctgcccaatattatgatttgatatctccaggatattttcctactcgcgcgcgtcgtatca gttccaacgattccagctgatcttcatctgggggtttccac
ORF
AACGATGCCATAAGAAGATGCATAGAGAGTCTACCTAACTTACTATTGGAATCTCAAGACCGAAAACAAACTTTT AAGGATTATTGA
Downstream (in bold stop codon previous gene) ttcaaactatttacagagctatatgaatttttccagtctatttactgaaacggtccccga agacggccatctgttgcttgatgcttatgtggaagtgccttagggtcaatttcaggttgt gatttggtgtctatcgtatcgtagaacttttgaaagaatgatgctaacggacgtacatac caacgttttgtcggtgattgaaaccatgccaggtgaccacctaaggatgtcgttaacatt aatgtgaatgggttggatttgcattcctggaacggaagagcttcagaaccaacaattggg tcatccaaagcgttgataattaatagaggtgttctaacagcagggagcctgttaaccgat
gaacccactctataatactcatatgcagtattaaacttgaacaaaggtgccgtgtacctg ttgtcaaagtcatacagtaatttgatgccttcgtggtatctctctctattctgactgtac aagtcactaccacgaagaatgtcatcgtgacgatctactaagttcattagcgattgagtc attgcaggggaataaatgtacctgccaatccatgatcgatgtagcagataagatgaggcc gctagatcccatggatttcctagaactgcagctgctgcaaagcgacactcatcaccctcc tgtccaatgtagttagatagaatagaagctcccaaggaggcacctacagcgaaaagtgtc ctttcaggaaaccatgtggctaagtgattcaccaagaatctcaaatcctcagtccaaaat ccattgaacagttgaggagttgacaattgcgattgcgcacatccacgggcattcagcaca accgcgtcgaaatcatgttcttcggtaacttccttcaacagacatctaacatatgcttca tgactaccaccggataatccatgcaatacaaccagcatggggcgtgtatcatcagaggcc tggttctttagctcaccctcgttcataaatctagtccgagg
AA
MDSSDSESLDACQSELFPKELKFDLFDDFSIIKPSEWAITRIGWLQTKIDRLHRΞRNSVQRRYLKRELENQKLIS TANQVFGFNGWSSVLDGSFEMSECAEEDGRFKMKVLTTVKLILKDGTTCSATGTGRTQNASGRAQALSKCKKEAI NDAIRRCIESLPNLLLESQDRKQTFKDY
SEQ0093
P. pastoris homolog of S. cerevisiae RDH54 (YBR073W)
DNA-dependent ATPase, stimulates strand exchange by modifying the topology of double-stranded DNA; involved in recombinational repair of DNA double-strand breaks during mitosis and meiosis; proposed to be involved in crossover interference
S. cerevisiae null mutant: colony sectoring: increased; sporulation: abnormal; toxin resistance: increased; viable
Frag B, 0039
5' region (in bold stop codon next gene) ctgtctggcgtagggatgatctccaaaccacttatgccaatctcccaaggctcaaactgc tcttagtttccccaatctctgctatcacaaaggttcgttaagcatcatctctatacttgt aagatccgttagagagctaggcacgtgggctttgaaactgcagcatcgatgttgtttcaa agctgctattgccgagtactaacatgttcaatgctggtatgcgttctcaggtggaaagct gcatgaatcaagcacagcaaattcgtctggctgatttttgccaaattagtcacgaatgcc aggtacgatctggagctcggattgcaacttatccaaggatgggctttcaatgaagtatcc ttaatgacataatatcatgtactatcaccgtgaattcgaggagacacacattgacattgc cggtcctctattcttccgaaataattcacccatgcccacaacaattttgagatgtccttt ttcgtcctgtagaagccgaataatcaaaacggatgacgagaatacgaacatcaagactgt cacaggaaatggcggacccagacttttgcagttttcagatggagaggtgtccctgaacag ttccaccgagaacttaaagttctaccaggtcaatgatatgtggacttttgataatattgg agtctctaagccaacttccatggatgatgagtttagtataaccatcgatggtggttctgt tccagtccatatagaacgttacctgacatgtgccgactgtgacaaagggccaattggaat cgccggttcagtcaacaaagaagcattggaagacccaagcggatccaagttgatgtatta tctttacactggctcagtgtttcaagaggtttctccttctccaattgacacatgacatgc gactttcttcactggagtcatcttacataatcaacgcgttatctgcattaaattaacaaa taaaatattacgtagtcaatacccgttgttgccagcccaca
ORF
GACGACACGGATGATGGTTGGATGTCAGCTCTGGATTTCAAACAGTTATCACAAAAAGAGGAGACAGGTGCTGTG
CCCAAGAAACCAGTGGTGAACTTTATCTTTGTGTCAGGCCAAGACTAA
Downstream (in bold start codon previous gene) gctggaagaacggaactttaatcgaaggaaaaattaaatgtcaaagtgggtcgatcagga gataatccatgcttcacgtgatttttcttaataaacgccggaaaaactttctttttgtga ccaaaattatccgatctgaaaaaaaattacgcatgcgtgaagtaggatgagagacttact gttgaactttgtgagacgaggggaaaaggaatatcctgatcgtaaacaaaaaagttttcc agcccaatcgggaacatctgcgaagtgttggaattcaacccctctttcgaaaatgttcca ttttacccaaaattattgttattaaataatacatgtgttactagcaaagtctgcgctttc catgtctcagattcggcagataacaaagttgacacgttcttgcgagatacgcatgaatct tttggctgctttttgtgaaagagaaatggtgccatatattgcagacgcccctgaaagatt agtgtgcggctgagtctttttttttctcaaccagctttttctttttattgggtaccatcg cgcacgcaggactcatgctccattagacttctgaaccacctgacttaatattcatggacg gacgcttttatccttaaattgttcatccattcctcaatttttccgtttgccctccctgta ctattaaattacaaaagctgatctttttcaagtgtttctctttgaatcgctcatgcatca taaagaaagactcatagatcatatttcttcggagtctaacttctccctttccacttcgtc gatgccgtccttcagccatgaatcaaatcaatcgccgaacccaatgctgatcgaacaagc ttgcgattcctgtcgtaagcgcaaattaaggtgttctaaggaatacccaaagtgttccaa atgtgtcacccacaaatggagttgcgtttattcaccaaggacagtgagatctccgttgac gagagctcatctgaccaaagtggaaaatcgtgtacgaatgt
AA
MYKKPMNRPFKPPRMQEATNKPVKPAQPAKIFKRKPEVLSPGTPEVAQTRTTNSVNVLWRKKTLKKNKTWDGDGI LVITGDQLIFKCDPLGGSNLREQGRRSKSSKSLEGVIAIGNYEIELDGEISDVKPASKKHRKIEHEVSSEKESIT SADKKSSSLPFKSPMTNSVTRQPQKKSAPPYKSDYEHSIVLSQESETSKMWVDPLLSQHLRPHQREGVKFLYNC LDGSKELPHKGCILADEMGLGKTLTTITLIWTLLKQNQVDHKRPAVKKVLWCPVTLIHNWKREFRKWLGMNRVS ILEMSSASNVEEDKRSVINFGRTRVYQILILGYEKLQKLTNEISQINVDLLVCDEGHRLKNSNNKVMKSLTSFQI PRKI ILTGTPIQNDLNEFYNIINFVQPGIVGDFASFNRDYMRPILQAREINCLNRKIIKAGNEKSNSLVELTQKF ILRRKAKDINTNFLPPKTELILMVPMTELQQELYKDIIETNQAKLGLINDRNFFLQKILILRKICNSPSLLKDEP DFARYNLGNRFNSGKIKLTVLLLRKLFETTNEKCVIVSNFTKTLDVLQLIIEHNNWKYHRLDGSSKGRDKIVRDF NESPQKDRFIMLLSSKAGGVGLNLIGASRLILFDNDWNPSVDIQAMARVHRDGQKRHTFIYRLYTKGTIDEKILQ RQLMKQNLSDKFLDDNDSSKDDVFNDYDLKDLFTVDLDTNCSTHDLMECLCNGRLRDPTPVLEAEECKTKPLEAV DDTDDGWMSALDFKQLSQKEETGAVSTMRQCLLGYQHIDPKILEPTEPVGDDLVLANILAESSGLAKSALSSEKK PKKPWNFIFVSGQD
SEQ0094
P. pastoris homolog of S. cerevisiae Polδ/Pol3 (YDL102W)
Catalytic subunit of DNA polymerase delta; required for chromosomal DNA replication during mitosis and meiosis, intragenic recombination, repair of double strand DNA breaks, and DNA replication during nucleotide excision repair (NER)
S. cerevisiae null mutant: inviable; resistance to sulfanilamide: decreased
S. cerevisiae conditional mutant: cell cycle progression: arrested; cell cycle progression in S phase: arrested; cellular morphology: abnormal; heat sensitivity: increased
S. cerevisiae reduction of function: mutation frequency: increased
S. cerevisiae unspecified: mutation frequency: increased; resistance to phosphonoacetic acid: decreased
Chr 2-1, 0163
5' region (in bold start codon next gene) tctcaaccccaccatacttcttccatattgtgatgaccgatagcaggtacaggaagaaca atgaatttggaagctgaaggtagaatgagctgggacgagagaactgctccctgatgagtt gctttgcctcatcattcaagttagctttttgaactatgttgatgaatgccattgcgaacg tttccatatcgttgtgtctgaagttgaaagtgaattttaatacttgcaaatccaactctt cattttctccaagtctgtcgaccaacttttccatcaaatcttttctttgaatccactcaa tggcaaactcatgtgataaggagttcactagtttcagcataggaagaggtgatcttccct gtttggaagtcttgaacaatacttccaccagtgcattcctgagaagaaaagaaatagagc catgcgaggacaacagaagcttactaacgtgatccagttggacctgtaaccatgcctcat cccgtttcaaaaccaggttcggacagaaaaatgcaaacatattcaaggctgtaacgtcgc tcccacaatgcagccatatcttttcaaattctaagctgatatgacttcgtaataagaaag tacggatataactctgaattatcctcacactcttatcttgcaatttggcctccttccgtc tattcctctccaattcggctctatgaagcagttgctctctgctttcaagattattccttc tcccactaaggtttacattccgcctcttggttgatccgtcgaagttaaacatgttgccac ggagttagtgggtcttgtgataaggatttcaggtggttaatgtgattacttaagtctaat gttcataagcgcccttcgcttatgtaatactggagccggtgcccggcaaagcacagcgat aaaataaaaccaaaacgcgtgcatctaaggtaacctgattcaccttaaagttgtcttctt agcgcgtccttctctcgtttattagaaatcttcctcacact
ORF
ATCACATTTGATAATATTCAGTACCTTTTGAGAATGATGGTGGACACAAAAGTCAGCGGTATGTGCTGGATCAAG
AAGGATATCATACATCAGGCAGAGGAGTTAGAAAGGTGGAATGATATGGTGTGGTAA
Downstream (in bold stop codon previous gene) acttatttattaaattcaatacattaatgaaaataaagtcataattgggacaaactcatc ctacgagtttgatctacttggtttctctattcgggactcgactcggcgattaacgagctt cctgtcagttcgtcaattgatgacgcatcatagatggctgagatcgaaggtggcttgaca acaggttgattcctggaaggcttcttggctttggcaactgtagcgcctaaggtggtagta gtactaacagtttctcggttttccagcggtatgtgggcaaaatttgagcggccgtctaaa aagttaccgtggatttcgttcaccctattgttaactgagtacagctgcccaccagtgctg ttaggagacataacagcgttgatatcaaagtcttcagagaatgtggtgttcctactgcta acgtttttcttggatagttccttgagttgattgatgttgtgaatacacttgcttattgtc tgtaacaatacagttggcttcaaaggcttggaaagatactcatccatttgggcctttagg cagcgttctctatcacccaacatagcgtgagctgtcaaagcaatgatcggcgtcctaaaa ccgagtacgtcaactggattactctttctctcccagtctctgatattcgcagtagcttcg aatcctcccatgataggcatctgcacatccataagaacaacatcatatctattctttttg actgcattaaacgcctctagaccattttctacaacttccactttgtgtccatgtttctct aatattctaacagcaagcttctgattgatcaaattatcctcggctaggaggatattgtat gattgtgatccgtcggaattggatggcgaggctcttgattccaaagcaggcatcaagata gatgctaaatcgataacggagcatggtgtgtttccgtatgatgtgataccaagatccaag cacactctcatgttcagtgacggtactgacggattcagcaa
AA
MPEQLSQESEAKRVKLEDYHETAHSIPVANRDINGYQLPGHQSNSFHTKMTSSQVDNEPSTFIKELEDMDNEATS KSEE IQIQNWSRPELPKFDPKKDDI IFQQLDAEE ISHVNGHSSARFFGVTDAGNTVLCNVTGFDHYFYVGMPKAL KESDLPGFTNYLKDEYVGVNTIEISFKESIWGYNNNTKLPFLKI IMSNSRLISRVRTGFENGSIQYNDAFPPNGS ITFDNIQYLLRMMVDTKVSGMCWIKLQASKYTLVPPSQQVSTCQLELSVDYRDLVSLPIEGEFSKIAPLRILSFD IECAGRKGIFPEPEHDPVIQIANVVSRTGESTPFVRNVFTVNSCAPIVGSQIFSFEKEEELLRKWRDF I IKVDPD VIIGYNIANFDFPYLINRAATLKVQDFPFFSRVKTSKQEIKDSTFSSKAYGTRENKWNIDGRMQLDLLQFIQRE YKLRSYTLNSVSSHFLGEQKEDVQHS IITDLQNGDAETRRRLAVYCLKDAYLPLRLAEKLMCLVNYTEMARVTGV PFSFLLSRGQQIKVISQLFRKCLELDIVIPNMKSDSVNEEYEGATVIDPVRGYYDVPVATLDFASLYPSIMMAHN LCYTTLLNKPTIDRLGLVKDKDYILTPNGDFFVKEDLRKGILPTILQELLAARKKAKYDLKTETDPFKKQVLNGR QLALKISANSVYGFTGATIGKLPCLAISSSVTAFGREMIEATKQAVQERYTIKNGYDHDAQVIYGDTDSVMVKFG
ENDLRKCMKLGEEAAAFVSTIFKNPIKLEFEKVYFPYLLINKKRYAGLYWTNPDKYDKMDTKGIETVRRDNCRLV QNVITKVLEYMLELRDVEGAENFVKQTIADLLQDRVDLSQLVITKALSRQDYAGKQAHVELAERMRKRDAGSAPT LGDRVAYVIIKTSSTKNYEKSEDPLYVLEHSLPIDTNYYLENQLAKPLTRIFEPILGERRAKALLSGSHTRTVFV AAPKTGGLMRFAKKTNTCKNCKSPLTKSNCKSPNGNAALCNNCMDKAGSLYGEALSLMNSLETKFSKLWTECQRC QGSLHQEVLCSNNDCPIFYMRTKAKKDIIHQAEELERWNDMVW
Other genes involved in recombination
SEQ0095
P. pastoris homolog of S. cerevisiae SRS2 (YJL092W)
DNA helicase and DNA-dependent ATPase involved in DNA repair, needed for proper timing of commitment to meiotic recombination and transition from Meiosis I to II; affects genome stability by suppressing unscheduled homologous recombination
S. cerevisiae null mutant: UV resistance: decreased; competitive fitness: decreased; resistance to cisplatin: decreased; resistance to hydroxyurea: decreased; resistance to 5-fluorouracil: decreased; resistance to methyl methanesulfonate: decreased; transposable element transposition: increased; viable
Chr 3, 0219
5' region (in bold stop codon previous gene) aagactaagaagattgaaattattcttccgaagtatcaccaagtaaccacttatttgcca agttctgttagaaagtacataccgacacacagagttctcatagaaaaggagatcaaaaca gggccaagatacatatttggttatcaccctcatggggtagtttccctggggatcactgga gcttttggcaccaatggttgtaacattggcgagttactaccaggaatcagaatatattta ttaaccctcatcactcaattcaaacttcctctattgagagattacttaatggcattgggt atttcttctgtttcgaaacgtaatgtgactgcactgataaaacgaaatcagtctgtctgt attgtcattggaggtgcttcggagtctctattatccaaaccacatactattgatattgtc ctgaaaaaaaggaaaggctttgtgaaagtcgcactagagctgggtgacactgagttagtt ccagtatttggttttggagaaaacactgcctataatgtttttgacccaagtgtatctggc aagtcttgctctgtcctaaattacgtgcggaagcaaatgtgtgggtttcaattatggtta aaacaacactttggctttacctttccatttttteatgetaggggtgttttcaatcacgac tttggccttctaccatatcggaaacctatcaacttggtcatcggtagacccatcccggtt ccttacattcattcaccaacccaagaacagattgaccattaccattecctatatgtcgaa gaactgaaacgagtttttgagcagaataaggagaggtttaatgctggatccttggagcta cgaattgtcgagtgaagttgagccagtatcttttttaattgttgtgtaagatattttcat gtcatgtttgtagaccctaagtaaatgcatattttcattatcgcgtcacctactcgtgaa ctttttataagtagggcattgtcaaccactttaggaaggag
ORF
GTCACGGTTGTGGGAGACTCTGATCAAAGTATCTACATGTTTCGAAACGCAGTGGCTTCAAATTTTGAAAACATG
CGAATATAA
Downstream (in bold stop codon next gene) actaaattattcaaacaccagattagcctgcttgtcgttgactttgaccaactttgctaa ttccgttgattcaaatttaaaatcggaactagttttacagaatgggtgacagatcaatga ctgggcagttggcctctcatcaggatcgactgtaaaacatttattgatgaaatcaattcc atctgaagaaatatatgccttaacttcatccgaaattggcggagcagattttgaccttcc taagttgtacattgcgcttataacttcgaagttagaccaaggacgtttgccagcaaacat ctctaacacgacacaccctagcgaccaaacatcaactttggcactataaccctgcttttt gtcgttgaccacattatcaattacttcaggagccatccaaaatatagacccttgcattga catttcagcatcattcgagtagatattgcgacttctcttggatataccaaagtctgttat tttacaggtcccatccgtttctagcaacaaattatcgcccttcaaatctctatgcaatat accacgactgtgaatataagcaagaccttctaatacctgcttagtcaagaatctgattac ttcctctggaaacctgccaaatactcgaagacaccaaccaactgacccccctgcaacata ttctaagaataatgtataggtttgagctctttgttcaaatcctagatactgaacaatgtt gtcatgatccaagtcacttagggtatcgacttcagaaagaatggcgtcgacaacttcctt gaagacggagctgttatgcccgctcgaaggtagatcaacttgtttcacagccatcatctc accagttgtcacgttaagagccaagaagacttttccatatgtccccttaccaatcatttc acctcttacccaggaaaattgtttataatctccattaccatccttcaatttggatagatt tcgggcatctccaggcttcatttctacaacctttctacccc
AA
MLASLNSQQYEAVTADIHKPLQIIAGPGTGKTKVLTSRVAYLLIEHKIPPSNIIVTTFTKKAANEMVERLQVLFA KENLDPDKYLKGLLIGTFHS ICVKLLRRYGTQIGLTPKFKIADERDSNTILSKILTTLSDDVVRLKFHPSSVSVD IYQKDPKETDKLHGYDLKKAKSQISKLKSLTTTPQTYSTTSARSFDLAVIYSLYQESLRKHNLIDFDDCLLYTYL LLSETRKKGRVLLKELKHVLVDEFQDTNKIQMELLYLLGQCTDDQNFEDNVTWGDSDQS IYMFRNAVASNFENM I IHYQNCQEITLKENYRSTSNILNFSEQIMTQQKDRKQKSLRSQISHLNIPIVYVRTQNTFKEADHIASEVAKLG LMSGLLTSYNDIAILLRSSFQSRVIEQSLVTNRIPYEVVHGRAFWELKEIRVAMDYLRAIVNNSDNLAILRTLNF PPRGVGNKTLQRLEELIDKSDCPVIETLEMAIHDKKNGFLNVKSESGVRQYIDILKECGRLLDSTSLDCEASLEA CFDQIVQLANLKEIKKTPETSKGTHDSENALKAVEENLNEFKRQLIEYKPI IDQIITAEAGDTPLTTRDVLSSFL DSIDLYETSNNENKKDSKKGRVTISTIHGAKGLEWPWFIPGCTDGVIPSKWTLEDDGPDSSTNLDEDRRCFYVA LTRAKSMLYLSSYKERSTPWΞGEESNEVSCFLKPTLKMFQNQLVTS ISNMSKNSWRDLYEAMGKEYKPVSNILTE SNRSRNHIVSTKDFLKNKQFKSPLLTKSTTAPSYVPDRSNNKKRRLGVRRRI
4 P. pastoris genes with high expression levels
Based on the microarray data set by Graf et al., 2008 (BMC Genomics 2008, 9:390)
55 features were selected with a 20x higher signal than GAP1. Of these, 35 could be linked to an S. cerevisiae gene, and for 32 of these a P. pastoris homolog could be identified.
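The selection rule described above (features whose signal is 20 times that of the GAP1 reference) can be sketched as a simple threshold filter. This is an illustration only. The feature names and signal values below are invented placeholders and are not taken from the Graf et al. data set. Reading "20x higher" as "at least 20-fold" is an assumption, and `select_high_expression` is a hypothetical helper.

```python
def select_high_expression(signals, reference, fold=20.0):
    """Return the names of features whose signal is at least `fold` times
    the signal of the reference feature, sorted alphabetically."""
    threshold = fold * signals[reference]
    return sorted(
        name
        for name, value in signals.items()
        if name != reference and value >= threshold
    )


# Invented example values (arbitrary units), for illustration only.
signals = {
    "GAP1": 100.0,
    "feature_A": 2500.0,  # 25-fold: selected
    "feature_B": 1999.0,  # just under 20-fold: not selected
    "feature_C": 2000.0,  # exactly 20-fold: selected under the >= reading
}

selected = select_high_expression(signals, reference="GAP1")
print(selected)
```

With real array data the input would be the normalized signal per feature. Mapping the selected features to S. cerevisiae genes and P. pastoris homologs is a separate annotation step that this sketch does not cover.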
SEQ0096
P. pastoris homolog of S. cerevisiae GPM1 (YKL152C)
Tetrameric phosphoglycerate mutase, mediates the conversion of 3-phosphoglycerate to 2-phosphoglycerate during glycolysis and the reverse reaction during gluconeogenesis
Chr 3, 0826
5' region (in bold start codon of previous gene) ccgaaagaaggtttatctgactgttgcgcaccaccgaaaccaaaagcgggtttagctgca gaactttcagtggccggcttgttgccaaagagactgttactgcctccatcggtcttggta gcgcctgaaccaaacgaaaaagcaccactgggattattgtttcctccaaaagacgcggtc gtgtttgatgctgttgaggccgcaggcgcttgaccaaaactaaacgcttttgatgtattg tttgtaggggctgttgcgttgttagaagctggctggttgttcccaaaggaaaaagcacct gttgatccagtgttagatccacctgcggcattcccttgagctgcggagccaaaaattgat ccttggtttgagctctgtgtaccaaataaagatgaattggccttagctccgtcagcagga gcgtttgcttgtccaaaggggcctccagacgacgtttgcgtgttgttggcattactagta ttggcaggctgatttccaaacaatgagggcttggaggcgttctgggagttaccaaatgag aaccctgtatttccttggccagagcctgagaaactgaaccctgatccagacattatcaat accttgggttattagtagtgtccgttatttttctgtttaggttacgattttgccagattt
tttgggaggagggaaacaaaagaaccagtgctacacgacctttaagtgccatcaggcatc ctgttttctcgacctcatctcatcacatccgtcagtctgagctttcagttctcagttttc gattgactcttgccctgctgcgcgcacaccataccctggctccctctcatgcttctggcg ttaccccgggaatcgtacatccatgccgcgaatcccggacaggactcagacggatttcac tattcctagtgggcacaacctccatatataaggatacttctgccccctagcttaagtgcc tctattttgtcagtaacaactttcaattacacaaacaaaca
ORF
Downstream (in bold stop codon of next gene) gtttcttagtaagtaataaggtgaatatattaactgttcatagtgttattgttggtgtga ataaggggtgattttaatagaagacaacacaacacatcttggggattttaggtcatattg gctgaattattgtgaggaaattcacctagttacttcaatgaataaccaccaaatgataaa aaaaaagtactttttgagttatttcagaataccatacattgaccatattgctttgtggaa caactgatattgaatttcgtttatataatagagttgttttgcaagtaataaacaactaag aagaagaaattcacgaaaagtccgtgatagtttagtggtagaaccaccgcttgtcgcgcg gtagaccggggttcaattccccgtcgcggagttttttttgtcattgatattccctcttac cagacgaaatctaccaacccaaatccttctgctcttataataaataattgatgagtacaa atcctataatctactatgcctccaattgcttactttgctccctagcaagtctctttttct ccttttcttccttccggttgtacttgacatcctcagggttgaatcctccctccgatgagc cccagtatgcccactccggagttggagctaggtaatccatatgtgtgaacttttcaccgg attcaagtaacaggaaaaactttttatagtcaatatgagtgtaccattggtcgttgatgt aatatctcttcgaggcaatcaaaatacagcatgaatgagcatgttcggcagcaattccgt actccaaacctctggaattaagctcatcattcaaagatcttacaaaattttgacattcct catagaaggggatgttttgcattgtcaaaggatttccatttccagaggatgatccgcaga aggttgcacctttgatttcaatcaatgagggcaatgcgctctccaccatgtcagcatatc ccttgaaatcacccatattaaatcctttcacaagagtcaac
AA
MVKLILVRHGQSDWNEKMLFTGWVDVRLSPTGEKEAARAGELLKEKNIHPDILYTSKLSRAIQTANIALDKADRL SIPVVRSWRLNERHYGALQGKDKAETLEKYGAEKFQTWRRSFDIPPPEIEDDSEFSQANDERYNDVDPNVLPKTE SLALVIDRLLPYWQDTIAKSLLEGKDVMIAAHGNSLRALVKHLDKISDADIAGLNIPTGIPLVYELDERLNKTKE AYYLDPKAAEEGAKKVAAQGQKK
SEQ0097
P. pastoris homolog of S. cerevisiae TEF2 (YBR118W)
Translational elongation factor EF-1 alpha; also encoded by TEF1; functions in the binding reaction of aminoacyl-tRNA (AA-tRNA) to ribosomes
Frag B, 0052
5' region tggtgggtatttgacaggttggggagcaaataagtgatgatgtcccatgaaagtagaaaatggc tagtagaaggcaaaaatttgaaattcttagagtcaaatagttagactccaagttctaatc cacatttggtcagtttcatagcatccagagcttttgccactggtgaacatatctacccat tgcgatgcaacaagtcactgaaagcctaaaacggagattcccctatcttacagcctcgtt caaaaaaactgctaccgtttatctgctatggccgatgtgaggatgcgctcatgcccaaga gtccaactttatcaaaaacttgacccgtcatacaggctctagatcaagaagcaaacttaa tctcagcatctggttacgtaactctggcaaccagtaacacgcttaaggtttggaacaaca ctaaactaccttgcggtactaccattgacactacacatccttaattccaatcctgtctgg cctccttcaccttttaaccatcttgcccattccaactcgtgtcagattgcgtatcaagtg
aaaaaaaaaaattttaaaatctttaacccaatcaggtaataactgtcgcctcttttatct gccgcactgcatgaggtgtccccttagtgggaaagagtactgagccaaccctggaggaca gcaagggaaaaatacctacaacttgcttcataatggtcgtaaaaacaatccttgtcggat ataagtgttgtagactgtcccttatcctctgcgatgttcttcctctcaaagtttgcgatt tctctctatcagaattgccatcaagagactcaggactaatttcgcagtcccacacgcact cgtacatgattggctgaaatttccctaaagaatttctttttcacgaaaatttttttttac acaagattttcagcagatataaaatggagagcaggacctccgctgtgactcttctttttt ttcttttattctcactacatacattttagttattcgccaac
ORF
AAGGCTGCTCAAAAGGCCGCTAAGAAATAG
Downstream (in bold stop codon of next gene) attgcttgaagctttaatttattttattaacataataataatacaagcatgatatatttg tattttgttcgttaacattgatgttttcttcatttactgttattgtttgtaactttgatc gatttatcttttctactttactgtaatatggctggcgggtgagccttgaactccctgtat tactttaccttgctattacttaatctattgactagcagcgacctcttcaaccgaagggca agtacacagcaagttcatgtctccgtaagtgtcatcaaccctggaaacagtgggccatgt cttttgctccttcaaaaatggcaatgggtaggctgcctcctctcttgtgtatcctctctg ggaccactcagcgtcacttgtgctaataatatcttttaggttgtgtggggagttgtgcaa gattgcaccatctgtttctccgttttctactttacggatttcttctctaatagagatcat agagtcaatgaatctgtctaattcttctttgtattcggattcagttggttctaccatcaa tgtgcccggaatagggaacgacatggtaggagcatggaatccataatcttgcagacgctt ggccacatcaatggcctcaattccgaatttcttgaatggtctaagatcaataatgaactc atggccacagtacttgtcatcgccacctgtgaataaaattteataatggttctttagctt ggctaccatatagttggcatttagaattgcaattattgacgaatactgcaaattgcttcc acccatcattttaatataggcataactgataggcaaaatggaagcactaccaaagggagc tgcaacaactggattaatactctgctcagtctggttggtggtagaaattatcggatgact gggtaggaagggagtgaggtgctccttaacacaaatagggccaacacccggtcctcctcc gccatgaggaatggcaaaagtcttatgtaaatttagatgac
AA
MGKEKLHVNVWIGHVDAGKSTTTGHLIYKCGGIDKRTIEKFEKEAEELGKGSFKYAWVLDKLKAERERGITIDI ALWKFETPKYHVTVIDAPGHRDFIKNMITGTSQADCAILVIASGIGEFEAGISKDGQTREHALLAFTLGVKQLIV AINKMDSVKWSQKRYEEIVKETSNFIKKVGYNPKTVPFVPISGWNGDNMIEPSSNCDWYKGWEKETKAGGATKGK TLLEAIDS IDPPSRPTDKPLRLPLQDVYKIGGIGTVPVGRVETGVIKAGMVVTFAPAGVTTEVKSVEMHHEQLEQ GVPGDNVGFNVKNVSVKE IRRGNVCGDSKNDPPKAAESFNAQVI ILNHPGQISAGYAPVLDCHTAHIACKFDELI EKIDRRTGKKTEENPKFIKΞGDAAIVKLVPSKPMCVEAFTDYPPLGRFAVRDMRQTVAVGVIKSWKTDKAGKVT KAAQKAAKK
SEQ0098
P. pastoris homolog of S. cerevisiae RPS2 (YGL123W)
Protein component of the small (40S) subunit, essential for control of translational accuracy; has similarity to E. coli S5 and rat S2 ribosomal proteins
Chr 1-4, 0589
5' region (in bold start codon of previous gene) cctcctgagaacggacagcagcgctggaggcggcctctttaacggtggcggcgaagtcaa caagggtagttttgattttgcctaacaactgttgcaggtccttctcagtctttgggtggc tcttcttgaactcgtccgatgcatccttgacgaccttctcagcgtcgtcggcaatctcgt cgacaacttgcactgatccatgagggtcttgggtttgttcgacttctggaactggttttg gcttgccagcggaaactccttctgggggtgtggcagcttcggcgtaggtactatcagtgt taatatcggggattggttgatgtgcgagacgaggcaggcgggagatccgtctgcctcgtc caactgcgatagctcttacttacctcatgatgcttgtaaaaaagttaactgaaagatgga aatgggagggggaaaagaattgtggtcaaatccacgtcttgcgataacctcatacatttg ctgcatgattgggggatcatcgcaatttcggtccttgtgacacacgcgagctttccgcct ggggttagagcgcggagcaagcatactatcaagcaagaaaaaatggtatggagaaaccta attggttaagatatatgaaaactacgacggcttcatacgtggtatctctggttgagctac tacgaaccattttcccccttcaaaccctttggcccatgccattcatgccttgcctctctc tcaagcaactaagcaatcaagcaatttcccgccttgctgcacatgactgttcggaaatcg gagacccaaacaccacttgttatctatgcacgtgatttttatccagggcaataaatactc acttttgcttcaaaacgcttggggcgcgcgagcggcaggctgggaaaaaaataatctcag acttttcaaaagactctcctctttaatcattgaataccgtcgatcgaaaaccaccaccat cggctttccacgtacattccgactttttcagtgtagttaat
ORF
ATCATCGACCACTTATTGCCTGACTTGAAGGATGAGGTCATGAAGATCAAGCCTGTTCAAAAGCAAACCAGTGCC
TACGCCGACGAGGCCTCTGCTTCTTACAGAAAGAGATATTAA
Downstream (in bold stop codon of next gene) atagtcaaatattaatctatttcacctgttcaaactttacttaatgtacaaatgtggtag ttattagttttgcaacggaacttgttccataatctggtcctctgggacagcaaactgtct ttcactagtagcgccagtttcgggagtccacacagcattagtcaccggtgcaccagcact aatctcacgaccttctgggtgtttaaatgggcagttagggttgcggcatccagctgcaaa cttacaatcctcatcaattggatgagtgaaaaaacagtttggtctggtacaactgttgcc ttcacgacacagtacaggagtagttgcgtgacgtcttgggcacttgtaattacggcatga tttaccaaatcgacattgttccaaagccctctgttgtttttgttgtttctcttcttcggt gatcttgtgttcaggtgatcgatgagcctttggacagtccggattagagcattgcttgtc ttctgggcaccattctagcaccgtgactcgagcagcattgttagctggagttgggtgacc aaatggacacatttccttggaacacatattgccaaacttacatataccgatttgaccttg ctccaaagttttcttcaactgctcctgaacttgacgttccacttgtttcatcaatttggc ctggctaaacgtctttctttcttggaattcttttctggtcttctccagctcggcaatcag tgcatcatcctcacctggatgaagataattacaagttccagcgggatttggacagttggg gaatgcaaaacaggtctttgtaggatggccgaatgggcactgcttgttcttgcaatgtgg gaagtattggcaacggtttcccttcttggaaacaaaactagatccctcgacctcctggtc aggattggatggttgggctatgttcattgtcgaatccattgcacttaaaaaattctcctc attcttaagggcaaaattggatctggcattattgtttctac
AA
MSQAPASAQQGSAPQGQRQFGGARGGRRGPKRGPRRGEEEKGWTPVTKLGRLVKAGKITS IEEIYLHSLPVKEFQ 11DHLLPDLKDEVMKIKPVQKQTSAGQRTRFKAVWIGDSNGHVGLGIKTAKEVATAIRAAI I IAKLS I IPVRRG YWGSNLGAPHSLASKTSGKCGSVLVRIIPAPRGSGIVASPAVKKMMQLAGVEDVYTSSGGSTRTLENTLKAAFIA IGNTYGFLTPTLWDDYELSESPLDVYADEASASYRKRY
P. pastoris homolog of S. cerevisiae ENO1 (YGR254W)
Enolase I, a phosphopyruvate hydratase that catalyzes the conversion of 2-phosphoglycerate to phosphoenolpyruvate during glycolysis and the reverse reaction during gluconeogenesis; expression is repressed in response to glucose
Chr 3, 0082
See SEQ0073
SEQ0099
P. pastoris homolog of S. cerevisiae PET9 (YBL030C)
Major ADP/ATP carrier of the mitochondrial inner membrane, exchanges cytosolic ADP for mitochondrially synthesized ATP; phosphorylated; required for viability in many common lab strains carrying a mutation in the polymorphic SAL1 gene
Chr 4, 0210
5' region (in bold stop codon of previous gene) gttgttgttcctaaggctctgagagttctgagattgaagccaggtagaaaattcaccact gtcggaaagttgtctacttccgtcggttggaaatacgagtctgttgttgagaagttggag gagaagagaaaggctgaggaagctgagtaccaggagaagaagagagcttacacccagaga ttagacgcagctagtgccgagtttgcccaaaccgaggagggaaagcagttggctgccttt ggttactaaatagtaaagtagggtatcttcaagtaatagtatactaaccatctgaaataa ccaccgtcctgtagttttttttcgatatcgaagagcctatgctagtactgtggatttgcg ctccatccaacatctgtgcgcaaactaaaacttccgagactgacatctaccatcgctaga ccctaagtaaaaccaatctcgcgtccgaacttttaaatttcagtccttaaaacttcagag cattggttgtagtttccggatctgaggggtcgtattggagtcaagagacggagctgcctc cacagcgcgaaacgtcaaccccaacaccaacctgaatttgcaatcaccatggggacaagt ttcagcagtcaatgggcaattcagacgttgatacggtacccatttgctaagctcaatgac gatccatccaacttcagagaaaggcctttctctggtatgctctggtattcattcgtcttt tatcactctcgttgcacaatgcccgggtactcccggaacaagggagtcttccagccaagc tgtacagagtgaaaaatagaaatacacctttgcaatcaagacgcgcgttggccaatcaca agacttaatcggtgcaaagaaggattaccaaatttttttttcccaaaatcgctatataga aataatggaggaaaaagggttaatataaaggagaattcccccgtttttctccccttttct tttcttcttcaggctttcttacaaatctataatattccaaa
ORF
TACAACGGTGCTTTCGACGCTTTCAGAAAGATCGTTGCTGCTGAGGGTATCAAATCTTTGTTCAAGGGTTGTGGT AAGAAGTTCAAATAA
Downstream (in bold start codon of next gene) gccaattagtttgaagtgagattttatttcattcctgttaatattatatactagagtata ttttaaattaattgttcatgaacttgccaaattatgttagttttgtgtaaacaatcttag gctatccaatttagttctacttttggtagatttcctgttttggtaaattacaaacaacaa tgatttgacttatattctattcggaattttacttatcaccttgtacagtttgtggggatt tccggacatgaagaagtttggactggaaaaggacggagatggaatcttcaggctcaaacg aaggtcctcctctcccaataagaagggcaaaacaacattgcctcctttctcgtcacctcc aaatctagcccatgacagtccaacagggcccccaatattacctactctcagaaccagagt ttctgaaactgacactttggtagatggtatcagtgatacggtcaacattccctctccaca agctatgaatgcgagggccgggtcccccattagacccatgctaggaagccctacgaaatc attcaatgatgaggccgatgcggaactggcccagattataatgctgctatcaaagattgt
aagactggggttcaactttgtcaaagatcttacgttggaaagcagctttatttcagtgga gaattataccactctgcatctccccagtttcatggtgggtattgctcttgcattgttttt cacctctggaggctcagatgaatcaagtagccgatcaagaagtcacgagtcatcaacatc atcggcctttgggattcttcgtaagtttgtgatgggatttctggttatttccctcgtggt gttctttgttagtttccagagtggggatcaggcatccaattccggatcggatgaactgcc ccttgtgaactacaagcaggaaacaacagctccttcccctagcacacatccctccggtcg gagttcaccaaagagatattcagtcttcaaggaagacctga
AA
MADNNKSNFFVDFMMGGVSAAVSKTAAAPIERVKLLIQNQDEMLKQGRLAKKYDGIAECFKRTAADEGIASFWRG NTANVIRYFPTQALNFAFKDKFKAMFGFKKDEGWWKWLAGNLASGGLAGATΞLFFVYSLDYARTRLANDAKASKG SGERQFNGLIDVYKKTLATDGIAGLYRGFLPSWGIVVYRGLYFGLYDSLKPIVLVGPLEGSFLASFLLGWTVTT GASTASYPLDTVRRRMMMTSGQAVKYNGAFDAFRKIVAAEGIKSLFKGCGANILRGVAGAGVISMYDQLQMIMFG KKFK
SEQ0100
P. pastoris homolog of S. cerevisiae RPS5 (YJR123W)
Protein component of the small (40S) ribosomal subunit, the least basic of the non-acidic ribosomal proteins; phosphorylated in vivo; essential for viability; has similarity to E. coli S7 and rat S5 ribosomal proteins
Chr 3, 0762
5' region (in bold start codon of previous gene) aattctggaggttccaaataaattatggtacgagttcatagtcaatgggttcccactctt gggagacacatctggtttcaacacctcactccgaagagcacaggcaaacctcaaggcgga acgaatcaaggcagaagcacgggaagcctgggtggggggaacaacggttttggacaacgg gtcgtcctcaagaacgaaaaaaggatttctaggaagggtcttaccacgaagttcaccgta agtaccaggagttgtggtgctgattccttcggcatccagatagcagtcattcttctcgta ggatctgaccaatcgagcctggatggcccggatggtcttgctgttgatgaagtcactgtt ctgtgccaccaacaaatttgtattcctctggtgacactataggtttcagactgtgcaata tttgctcggcagtagaaaccaaggctgggacgggcagcttcggcagggcctgetcgtggt caaaagttccattatccatctttgtgtgaattagcgggatgtgtcaaactgaagcggagt tggaggagaattgcagggaagggctgaggatgcaagtatataaagggttaggtcttcccc actatagatgatccattctcggctctagggttagtatgataagatcacgtgaggatggtc ggattgagtgaccgatttttcgaaggatgaaggggggttggatgttgaatggatgttgga tggaggacaacagaggagaggaagatgaggggaaaaagggagaaacaaaaataaaaaaaa acatagctgacctccgaacatacgttgagtttgcacgtggagtttctgacgagagcagtc accgtaaacggcaacaacagtcacgcgaaaacttgcccatccaagtccccagttttgtta ggcgtgcgcgcgagagctccgcgacttttgcaatttttcctcaataaaaattcattttct tcaaacatcaagtgttcatacaacaagtaggagatccacaa
ORF
TCATCTACTTCATACGCTATTAAGAAGAAGGATGAGTTGGAACGTGTTGCCAAGTCCAACCGTTAG
Downstream gtagtgttcatttgtataatggtaatcgatactatgaaagacgtgtaggttgtgaggtga aaggaggtgtaagatagagcggagctgtcaatgtggatgggagaacggtaaagtggccag gacgttgtggtggttttggccttgtttaagtttcactactcgtttaacaccggctcatcc tagcttatgttttgagtaactgttcaccggccatatttcaactctatgaattataacact attaattacttgtttggagcgcagcaagaggttcttaaaatcataatttcgctttatata ttaatcacaaggtgatgctaacaattaccaaaaagagaactaaactcttgcatctattat ccactgtgcctaaagacacataaaaaatagtctagatccttctctaatacaatagaaatc
ctctttactgtgttaatctaccacaacatctgcagccgatgctttcttacctgccaagct caaatttttcacttcaagcgccgatcaagttctcaagctggtaatcactagttgtttcca cctcagcagctggccaatctcatctcatgcggcagtttttcctccttttctcatgcatcc aagacaactggcccccctgetaaacaataacataaaaaaaaaaaaaaaaaaaaaaaaaca accgcagaaatttgcaacaggacagaattcaataatagacgcaggtgaataaaccaagag attcagtttattagcccaaatccttcccttgaccgccttagactttaaacaaacgctacg aaatcgtgatacaaccctaatcgttggtgtcaccggctggttaccaagggtttacccttt ctttttgcactgggccttttcttccaaatcgctggacacaaacttctgcaatcatgcgtc gccctagcataaaagacgcttctttctctacatgatctcaatcacataggcaatagctta ctaagatgcaaatggcccttattcatgctgcattggggcac
AA
MSETYDEQWAEPVEVQYVELSTKIPADIREGAEEIKLFNKWSFQDVTVKDISLTDYIQIRQPVFVSHTAGKYAA KRFRKAQCPI IERLTNSLMMNGRNNGKKLLAVRIVKHALE IIHVLTDQNPLQVLVDAIVNTGAREDSTRVGSAGA VRRQAVDVSPLRRVNQAIALLTIGARESAFRNIKSIAECLAEELINAAKGSSTSYAIKKKDELERVAKSNR
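The ORF fragment and the AA sequence of a record can be cross-checked by translating the nucleotides and comparing the result with the end of the protein. The following is a minimal sketch of such a check, assuming the standard nuclear genetic code applies; the function and variable names are illustrative and not part of the listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate an in-frame DNA string, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# 3' end of the ORF and C-terminal end of the AA sequence, copied from the record above.
orf_tail = "TCATCTACTTCATACGCTATTAAGAAGAAGGATGAGTTGGAACGTGTTGCCAAGTCCAACCGTTAG"
aa_tail = "SSTSYAIKKKDELERVAKSNR"

# True when the translated ORF fragment matches the listed C-terminus.
print(translate(orf_tail) == aa_tail)
```

The same comparison can be applied to the other records, which is also a convenient way to spot extraction errors in either sequence.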
SEQ0101
P. pastoris homolog of S. cerevisiae PDC1 (YLR044C)
Major of three pyruvate decarboxylase isozymes, key enzyme in alcoholic fermentation, decarboxylates pyruvate to acetaldehyde; subject to glucose-, ethanol-, and autoregulation; involved in amino acid catabolism Chr 3, 0188
5' region (in bold stop codon of next gene) acatattcaaataccaagcaaataaacgcaaagagcaacatattaggataaaccaacagg ttaagtatagagctaaagcattgctgagcaatatttcggatcaacatcaacagaatagat cgtcaccacgaagtatatccaactattccccaagaatccaggcttatccctcaagaatgg cctctccatctccttcaatgaagatggcattgacaaactccgtcagcttgcaggatacaa tggatgcggaactagaagcgttggcaaacgaacaatattacgtcatgctagatatcctta agggattttgcgatctttcatttgacatggtaaacatattttctatccagatccctgagg tattacatctcctcttcggctttggtgcagggtcactggccttaacgggagtcgtatcca aaacgagacaagaactaataaataaacagatataagggacaagcacacgattacccaatc acttgatatgcaccaatttgttccgttgtttatgccatatttaccgaattttcttcccag gtttttccgaatggacatctgtagtccactttttggttatcataatcgtcccacaagtcg tggatttaaccagaacctagtaattttaagttcgctattaatcactcagaatggtctcac cttgctattggccaagtctggagtcgccagctaccacctcagaggctacatagacctccc aatgtcatctcctcagtgcgctcttcaatctcgtgtcttttccgttaaaactccgttcgt ttcaccctatactgcccctggttgtgcagctcttaccacttcgcgccgctactatccgta gtggtcgagccgcatcaatatcacgttgaaatagaataactccctacaaaagccgcacgc aaccatcaaatctatataaggaacctcaaatatctagcaacatcttttcaatttactaca acatattcgttaatcatcaatcaattagctagtacacaaca
ORF
CTCAGAGCTGGTGACGTCGTCATCACAGAGACCGGAACCTCTGCTTTCGGTATCGTTCAATCAAGATTCCCAACC
GCTGAAATCACTGCCAAGACCAACGCAGCTTAA
Downstream (in bold start and stop codon of previous genes) atgtgttcacgatcataattgttttttagtctgtatatgttacaatattaacgtccattt gaaatcttatccttcgcgtaatcttctctggagatttcttcctacttatctacttgaaca ttqtacctacqcaccaaaatgqaacaccctqcttttaqtctttcaqqtctqtctttaqtt gqagqaqtcatgqqatagtaaqtqtaqacattqctttttgttgqagtqtactaactcqtc agtgcccgtaaaggttcccttccttccctaatagctggagtcactttttcggcaatttat gccggagctggatatttgctgaagaataacatggaatatggggttcaaactgctttggga gcctccactctgatgtttattgccggaatcaaccggggaattgtgtcaagatataagaaa cccgttccaattgttctgacaattctaggagccgcttccactctgtactatgctaagaag tatgaagaattttacctttaaagctattgaaacttatctaagtcgtatgaatataaactt ttgttcaccatccatggaaaggaacccaaatatctactactgtatcttcctcaattacaa aataccatgggaacgctgcagcagcgatgcaattatgttgggggaccactttgatcttgg tgcctagaggtataaactcgcaattagaatcttctttaaaagggaacaagatgccgtgtt cctggctaagctttccaacaatccaatctccataggtagaagggtcgactactattccat gacctggaatggaactaaactctcgagccaatgcaattactccagcatttatcaactgtt ctccgggattcctattgtttctattagggtaatgtgataccacttctgctagaacatgag ccgacacgtcatctagtgacacgcagtttgttgcaacctgttgtaagtcacaacatggat agtttccagcgtgtaactccagcactccctctagttggcct
AA
MAEITLGTYIFERLKQIDVKTIFGVPGDFNLALLDHIYEVEGMRWAGNANELNAAYAADGYSRVKSMAALITTFG VGELSAVNGIAGSFAEHVGLLHIVGVPAISSQEKFLLLHHTLGNGDFGVFKRVFKNVSKSANFISDINEAQDMID GAIREAFIYQRPIYLGLPTNLVEMKVDRTRLNTPIDLKPVPNPVEAEEEALQSILELISKASKPVILVDACASRH FCQFEVDQLIDVTNFPVFVTPMGKGGVDEQKPQFGGAYVGSLSNPDITEFVEKADLVISVGALLSDFNTGSFSYS
HSKNIVEFHSDYTQIRSAVFQGVQMKALLQKLLPLVGKASKHITAQVPPKIAPPIEKGASEDLTQDWLWSNISKF LRAGDVVITETGTSAFGIVQSRFPTHVSGISQVLWGSIGFSVGATLGAVCATEELDPNRRVILFVGDGSLQLTVQ EISTMARWNLKPYLFILNNDGYTIERLIHGEKASYNDIQPWDHLKLLDTFKAKNSESTRVSTVGELTKLFKDQGF NKPDKIRVIEIMLETMDAPISLVKQAEITAKTNAA
SEQ0102
P. pastoris homolog of S. cerevisiae HNM1/CTR1 (YGL077C)
Choline/ethanolamine transporter; involved in the uptake of nitrogen mustard and the uptake of glycine betaine during hypersaline stress; co-regulated with phospholipid biosynthetic genes and negatively regulated by choline and myo-inositol Chr 3, 0511
5' region (in bold start codon of previous gene) tagcatctccttccctaggtttatgaaagtactcctcaagaatcttggttgggatatgtt ttctcaataccagatccaaatcgatatcgtctgacatgagaattaaacccttagcatcac catagggaatgggtttgtgctccttaattgcctgatagttggccctatactgatccaacg gaagggaatcaaacatgatgaactgtttgagggaacaaccgtgtttgtaaacagacagag ctagggtgaatacgcgttgaaatgaagcgagggtaggggaacgctggaaggaatctttgg gaggagccctttgggaaaacgctggagggagctctgttggagatgcgaaatgcgaagcag aaaaaccaacctgcaaactattcccaaagtggcagcagggcgctatggtaaaagcaccat tatcaccattttaggaagtaatctttacaacataatttcaaccttacaacaagtctggtc cacctggattatcttctagataagctttacgacctcacaccatctccccaatatctgcct tttcgttcaaacaaaagcagtttactggaactgaccctacttcggagtacctatctaatg tcttggtcatttggaaaaagcttcaattttctgcacagaaaacacagtcacacaagattg gctttcccattcgggacataaccaatatattgcgacatcaccaattaggtcccctatagt tatcaggaaattggcagtggatacttgtacaatcatgttgcacctgtaagggtctttggt tttcccgtttccccttccaaatttcagatttttatcggaaaaggccgcctatcttggtaa tgccttgattcagcgtttttcttttttttcttgtccgaatctcgttctcgccgaaccgtt gcacactataaatactatctcgtatcctggtttggacacctcctctaaaactatcatacc
ttgaggtttttttttaggtaacactcgttctttctacaaaa
ORF
TTCGTGTTTTCCGATTTTGACAACCAAACTGGTTGGGAATCCGCTGGAATCGCCTTCATCATTGGTTTAATTAAC
CACAGAGAAAAGGACGCCACCATTCCAGTTGAAGTAACCAAAGAACTCTAG
Downstream (in bold stop codon of next gene) aaataatgtatataaatagaaaatgcagatttaactaatacttatttaccctcagctgtc atcgattctcattttgcttattaaatcactggctttcttgaaagattttgattccatttc atcgggtcctgaagggggtaactggttggcttcagctgtttggtccagttctttcttttg gtctggttctaaagtatgatctaattcttcatcgtagtcgtcttcttcgtcataataatc atctgaatactcttccccatcatcaacttcatctacgtcgtcattttcctcttcttcatc ttcatactcgtaatcgtagtcaaagggatcgtcatcatcatccaaggtttcatcttcgtg atcaacgtcaatgaaatcatcatgatctttcagcttttcttcatcggatgatacagaacc aactgattttgggggtttgattgtgctaggggcaacttcagttggctttgggtcactcaa agtcatccgttctggtgaacctaaaaggctactacttcctttgaaaaacgatgctcctgc tgccaatattggactacgaggagcagactgtaaacgttcctcctcctttcttcttcttag ctcatcttcactagaatcttcatcactgagctcctcaaaattgtacaaaggctctttcca ataattgacatcagagaactttgccgtgttccgataggtatcagaaggcatcgacttaaa tagttcagtgttcactggtggaaagaaatgatctaccagctcactaatatgtacataaga acttctatacccagccaattcgagtagctccatgtgaacatcaccgtctgggtttatggt aaaaattcttgatgatggtatccctactgacctataggacaaagcatcggtaattctatt accaaaaccggcgtaaaatggcgtttgtacatcctctgtaagacttgaagatctcaaagt gttcgtattgatatccgtgaaatctgcagaatctgtgttcg
AA
MSSSVEKSDVNDVSLTRRFTKDSTLDKDDLILAQLGYEPELNRNFSVWSVLGVGFGLTNSWFGISASLVTGISSG GPMMIVYGIMI IAAVSTCIGITLSELASAYPSSGGQYVWAKVLAPKYPVLAFLCGSFSYAGS IFSSTSTTVATVQ IAVAFYELTHADYEFHRWHVFVAFQILNFFIFFFNCYAKFLPSIAKSSLYIΞLGSF IVITITVLACSSGHFQNAK FVFSDFDNQTGWESAGIAFI IGLINPNWSFSCLDCATHMAEEVAQPERVIPIAIMGTVAIGFFTSFVYCISMFFS IRDLDALLNTATGAPI IS IYYQALQNQGGAIFLGFLLFLTACGCLISGHTWQMRLCWSFARDNGLPFSPFLSKVD KRMGIPFNAHLFSVCLCSLVSCLYLASDLAYNSLVTGCITFLLLSYAIPVΞCLIYNGRDS ITHGPFWIGKFGYFT
SSMTVAWALFALVFYSFPFVKPVTKSNMNYAAAAIVGWLVVSLVYWFSYGKKIFIMRNEDDEDAVLKKIFAKTES HREKDATIPVEVTKEL
SEQ0103
P. pastoris homolog of S. cerevisiae GLN1 (YPR035W)
Glutamine synthetase (GS), synthesizes glutamine from glutamate and ammonia; with Glt1p, forms the secondary pathway for glutamate biosynthesis from ammonia; expression regulated by nitrogen source and by amino acid limitation
Chr 4, 0785
5' region acggcactagtgagaatggggcaatagaaaaaaacccatacatcacattgactagtcatt tttacttagtgtatactagatataataataaagctacgatcatcacagtgaaatatcttt gatggtcggtcagtattaatcttgtcacatgacctacttccgtcttcccatgactccctc ctaaggttgtgtgcgctggtgcgtcaggttcggtccccgggaggagtctcgaccaatggc taggtagtgataacaatcaaagtttatgagaccagtagtttgacacatagtgcatttatt ggaaatatggggacactaccttccgccaataggagacgagaatacgactagacgtatctc gtaaagtccctccgcctcgatctagaactgccaccatatttgaggatcaaagaccagaag aagtgtcgtgccagtttttcagataaatcattcgccgagatcagtcgatcatcagcgttc agtccatgagaacaaggtgctggcttgacgtatcttcaggcagccagcagctttttgttt cagcctatcatctttgaggcttatcatcttactgataagactatgggaaaaaacccaaga aaatgagatgttttatgtgaacgaccatttacttttttagataggcaatcttctgccccg gtgagtctcctcattgggcaatgttgtaattacacgactaacaaatcagaggcggtaaac gctagtctcggcagcgtcatcgaggtcagtgcatactcaacaatagaaggcgcgcaatca ttagacaacttgatccatagcccgcagcgtttcttctccagaaagttcactataattacc cgtagaggattggttcaacatacaagaagatcagcaaatcaccttcagtattcggcgaca agatttatttttcactgaggtttcaaagcttttcaaatttcaatataatacttccctatt aatcatgcatgcaccctacaagatttcgcattcttttttttactacaggaattactgacc aggtgactctaaaatagctttgaggtatataaatcacaaaggcgctgttagcagacagtt tttttctttttcgtttcttaccttcggtcaacaatattaattcgcttcaagctacttagt tctctccattataagagaatcttaatataaacacctatacattacaattacaat
ORF
GAGACCATCTGTGGTGCCATCCCTGATGCTGACATGACCAAGGAGTACAAGAGAGAATCTGATTAA
Downstream (in bold start and stop codon of next gene) atagtatatcagcagctcatatctagtacttcctagcttgaagtataaaaagaaatgcct gttaagttttgtaagccaagaaatttacgttttttagttctaaaagttcccgctgtattc tttacataagagattcatgatccatcagccataataaattttaaatatgtaatactacaa acgtattaagctgccaaaccgatgacaccacaagctggtcttccaccagcgttaccggtc ttcaaagagtcggcgtgaccaccctttcctagatcgtcagttccgtcgtgaataacaaca gttctacctagaattgaggtctcaccgatcagcttaacctggttatcggttataacaccc ttggcaacaccttcagcgtcggtcttgacgttacccaagtcaccgacgtgtctagcttcg tcagttggagcaccgtgggtcttaccaaatgggttgaagtgaggaccagctgaagtacaa ccattggtattgtcaccgaactggtgaatgtggaaacctctctcagcgtttggggagtta cctttgatgtcgtaagtgatagtggtagggctgctttcggaggattgttcgaagacaacg gtaccgccgacggtagaatcgcctcttaagactgcaactgctttgaccatttttgtgtag atgttaaatagggaagaaacacagaaaataaagaattcgaaaggtatatgaggggcagct ttcttttatagtttgcggtcggttctagaaaataactcccccttttttcgccactgctag tcgagaacactaataatgactaagcgtcatatcaattactgtatatgaacaaagaagaag atgccaattgagtcaaattatgagtaaaaaacataattgaatgaggatttcaaggaacct ggggggaaagggccccttttatagtgtctcaaaagctctttgaccctgttggtggtgaaa ttgtagttatttactcgtttttttgctcgttttgcgtttgc
AA
MSSSE I IENTALLQKYQSLDQRGKI IAEYIWI DASGGIRSKGRTLSKKPSS I DDLPEWNFDGSSTGQAPGHDSDV YLRPVSYYADPFRRGDNIWLAECWNNDGTPNKFMHRHECAKLMKAHKDQEVWFGLEQEYTLLDKYDNVYAWPKG GFPAPQGPYYCGVGTGKVYARDMIEAHYRACIHAGINISGINAEVMPSQWEFQVGPCNGI DMGDQLWVARfLLQR VAEEFGIKISFHPKPLKGDWNGAGCHTNVSTSAMRVPGGMKYIEAAVEB LAKRHLEHIALYGSDNEQRLTGRHET
SEQ0104
P. pastoris homolog of S. cerevisiae RPP0 (YLR340W)
Conserved ribosomal protein P0 similar to rat P0, human P0, and E. coli L10e; shown to be phosphorylated on serine 302 Chr 1-3, 0068
5' region (in bold start codon of next gene) caaaatatctattcctgtcatcaacgttttgaatagccaattgatcgaaaccgtcatgcc caaccccattaactgcattgactggtaggagaacaatcaaaatgagactaccaaaaaaga aagtcaaagtcaatattcccatgtaacgtagaaataggtacccgtcaataccacagaact tgatcaggaatttcttatctttccgtattaaatcccagatcattcttggagcaaacacag aaactggtttgggtctgtcttctgggggcaaaagctttgtgaaactcttgggttgataga ttcgatggaactttgctctgagcaaaataaatgctacaacgaaacctccaaacacggccg aattcacagccaacgaagtcagcaccgtttcggtggagatgttgacctcagtcgtcattg ttggtcaaaatcaaacgggaagtgatggtgggaagagaagaaaagaagcaagggagataa agattcaaatccagtcttgatttcgatctgtagcatatgtatgcgatatgacgcgtttag taatcgtcatgtaagggaatgagaaatgtaacgataaccaactaaatttagaacagagca acaaagaaggcgcttggagtcacccttgctataacatggacatgccgatcaagtagtatt cttttaagttaaatcttggctaaactttttatgctgctgaccttagttcttacattaggc aacgaatcatcatcctccttgctctcgcaccaaagattttcagtgcggctgatgcagtga ttaccatagaaacattgatccagcatcagaggaaaacgtacgcctctcgaacatccagca tcccatcgttgcaactccaaacatctcataaattctatcttcacgtgatgagggctcacg cccttcctcatttttcatcactttttttgtcgtccaaatcgttttcttacttcttcatcc tttaaacagcctatcctcagtatagtaataacagagtcgaa
ORF
TCTGAGGACGACGACATGGGATTCGGTTTGTTTGATTAA
Downstream (in bold stop codon of previous gene) ataacgattaagaactaattgtagtcaatagattaagttttccgcaacattaaacgtaat taaccacaccaaaattgggttcgtagtttggtacaactttgatgggtagtctacatagaa actctgatccactggagtagaattccactcctctgagcaaatctcctttgttgtcggaat agatcaactctaaatcagaagattcctccactaagatgaactttattgtgaccaaatacc ttacgttcacaatgttagttttgaactgacctggtgagttgtaaggaatgggtaaattgg ctgatacttgctcactgttaatagtactaacaactttttcacaaaccttttcaacaagct ggctttgttcagtcagattctcatgtatttgccctttttcatccttgaaaatgtactggt cacttatggtctctactctttcaagtgcgactatcaaccctgtcgttttaatatgattta ataactgaatgttaatgcgaataatatcgttgaccctaaatactgacttgtcgaggtcaa cttgagcaaaattttttgtatttcgtttcaacaagaacctagtttggaattttgcgggta aaaggctatcatattcaaaaaggttcagttcttcaaaattcttttgtaggggtttgcgtt cagtttgtctaagggaaagaacttgtgatgggttcgaaaccaaattgataaggtttaacc tacaattttcgttcttcggagaagtgtaatctgtaagaaactctttgtgaaactcatttt
gtatttccacaacctgattaatgttggcttgacttagtttctgcataaaagccaagaact cgattggttttaccgctatcttgtcattagaaaaatcagactcggattcaatatcgctat gtttcaactgagacttaatggaagcaaagctggccttttgtccatcagttacctcgttaa cttgagcctcaatgggctgacccaagagtggtttatccaag
AA
MGGINE KKAEYFNKLRELLESYKS IF IVGVDNVSSQQMHEVRQTLRGKAVILMGKNTMVRKALRDFVEELPVFEK LLPFVRGNIGFVFTNEDLKT IRDVI IENRVAAPARPGAIAPLDVF IPAGNTGMEPGKTSFFQALGVPTKI SRGT I E I TSDVKWEKDSRVGPSEAQLLNMLNI SPFTYGLTWQVFDDGQVFPANILDI TDDELLSHFTSAISTIAQISL AAGYPTLPSVGHSVVNHYKNVLAVS IATDYSFEGSEAIKDRLANPEAYAAAAPAAGEASAGAEETAAAAEEEDEE SEDDDMGFGLFD
P. pastoris homolog of S. cerevisiae TDH3 (YGR192C)
Glyceraldehyde-3-phosphate dehydrogenase, isozyme 3, involved in glycolysis and gluconeogenesis; tetramer that catalyzes the reaction of glyceraldehyde-3-phosphate to 1,3 bis-phosphoglycerate; detected in the cytoplasm and cell-wall Chr 2-1, 0437 See SEQ0064
SEQ0105
P. pastoris homolog of S. cerevisiae RPL17B (YJL177W)
Protein component of the large (60S) ribosomal subunit, nearly identical to Rpl17Ap and has similarity to E. coli L22 and rat L17 ribosomal proteins Chr 2-2, 0109
5' region (in bold start codon of next gene) tcgtcttttctacctttgttatcagacctacatgcagctttggactttattcgtaactcc aattgttgacggttccagccaactaattctaaaaggtctttgtcataatctccgtcttta ctacactcaaaaacgatgtcagcagcgatagactgttttccccaataagcgcctgttaat tgtacagtaacattgttatcattggtttcaattgagggtttgaggtttctagagaatgca accacttcagagactactggctccttaccattgagtagcacctctgtgataccacatatc tgggaatcatcggggcaattctccacgtccatcttatcctttccatctttatcttgagaa tcatatatgttgagataccagactgacttggtgattgatggcggcgtttccaattcagtg gtcacttcgtgaaggccagcaattttgttcaaatcatagggtctcaagtgtttattagat atgtctatggcacaaaccgaagcaataaatagaatcaacctgaaatagttcattgccggt tgaattgattaccttgttatctttgactggtggaggtatgattaagccttgcgcgcattg gtgatggcctgttacttggtagagagcctcctagaaattccgcgaggaaaaccttggaga gcgttaaggggatgcggtagctaatagataaaagattatgtaatacaatctgcggaacat atattgaggaaattcactaaatatggagagtagtgaatatttagtagaagaaactcaaat tcccagaattagaaacctattaagagtcgccactaactcaagagtctataaaatacttcg taacagctctctctcctactagacatcatgaataacatttacatctttacatcaactatt ttagcacgtgatccattctagggttcttagacaacctatctatcgcgatctcatcaacaa cttttttcactcttctgttttcttcttctacatcatcctta
ORF (intron in bold)
AACGCTGAG (gtatgtattttttttcacaaaaagtcttctggaaaaaaaaccatgaagagatcaagacagatgtt tctaggttgagatattatggcaatatctttcttgggttcacttgtcctctgttccatgtttttttggtcttttaa agtcttaaaggattttttttcagtttttgattactaacatatatttag) AGCAAGGGTTTGGACATTGACAGTTT
AGAGCCAGCCCAGTTGACTGAAGCTTAG
Downstream (in bold stop codon of previous gene) attttgacgggtaacatatattaatacatgtaattgattgactgttctcttactttccta actatgaattcttttagttgaatgatcaggaagtggtgtgatggatcgtttgtgatctgt
ctccacatcttccgtctcaacttcttctacctcaaactccgtatctgatgccacctcgag tggccgtgattgctgaccaatatccgcgtccgtcgtttttccctgggtagttggctcatc tggttcaagctctgatteatactccagatagtctccctcttcaacttctgttacattgct cggagctgccagcagcccgacattttcaagccctttacgtttaccactgtctgccatgtc cttcaaatgctcatacacactaagcaaagttggaactaaagtaacctttatcaagatcat attgtaggctgttttgtcagtattttccttcagcctacttaaaagttcaataaaatctct tggagtcacatctttggtgtgaacatcatcctcaagaattttaatatcaagctcgatcat ttccaaacagtactgtttctctggtgaataatccaaactatttgagttgcaaaaactgcg taagtgactgaaaaatttattgaaaggttggttcaaaatagtaatatctgagaacattac agctaaatctgaatagtcctcatcctcctctagtgcaaatggtacaagtttaaacgtatc attggaggggagttggatgtataccggaataggcacttggtatgatgactcgtttgcagt tgaagagttcggattttcttcttttcctttttctatttctccttctacttcttctcctgc ctcatattccaattcctcctggtaccccggctgtgaagagtaatccgagcccaggtcaat taactcaatctctgcggttttgacatgtggctggacccttt
AA
MTRYSAVPANPDRSASARGSYLRVSFKNTRETAQAVNGWNLKKAQQYLDQVLDHKRAIPFRRFNSS IGRTGQGKE FGTDKARWPAKSVKFIQDLLKNAESNAESKGLDIDSLI ISHIQVNQAPKLRRRTFRAHGRINAYNSSPCHIELIL SEPDEAIEKAKEEPAQLTEA
SEQ0106
P. pastoris homolog of S. cerevisiae RPS4B (YHR203C)
Protein component of the small (40S) ribosomal subunit; identical to Rps4Ap and has similarity to rat S4 ribosomal protein Chr 4, 0246
5' region (in bold stop codon of previous gene) ccattcttccagagatcgactctttggccaggatattgttgatctcacaactttgatgtc cccttttgcacgttcaaatttgattgaaaagttgatcatgatgggcaagttccaatcgga tgttagaaacgccaaagagaaagagcaacagcccttcttacctcatcatgagcagcatgt tttggtcaagctgctagaattttttaattggaacaagcttacccacggatacgtaacaaa tgacgattttggagatgaaggagacgatactaaacataagcaaatcagtttttatgaacc tccatgcgttattcaaagctatcaaagacactacaatgatagcatatccaatataatatt tgacttggtgataccggatgatgaagatgaagtgattgatgtcgtgttttacgactttat agagccatacattttgtacgctttgcgacaaatcctcaatgtcgatttagacaagagatc atgggtgcaaccaagggaacccgtatacttacaaagtgctaccaaagttggcaacactac attatatcagagtttcttgcgatcaattcaagtttacaatgatgttgaaaatgaatggaa cgttggagagatgctgaagagatatgtcacagagaactgggcctaaataggcgtaatttt taataacttttgtatcatttttgacttttaatctattgcttctactttaccccagctact aaattctcatagaaatagtccaacatctcaacattttgtcacatgacaaacattagggga gggcccaaactttgttagcagacacgggtacacttgttcaccgaaacacgaccttccata tttgttcatcaaacggagagagagacagtgagcagcacacgacagccgcgattcatcgcg gtaggggaaaacctctcccatgctatggtagcgctaaaaaaattcacccagagttttctc ttttttcatcttagtagaaccattgaagtaattaattgaaa
ORF (intron in bold)
ATGGCTAGAGGACC (gtatgttattttgttcacggattagcagaaaatggaaggagcaatatttcaagagaacag agatggtccatgttcagatctgaacacaaggttgtggatcaagacgtcaactttgaaaagtttttcatcatgatc atctcttttgactgggagattgttgttttctgcaaagctattgaaggataatgaagtcgcaagtactaacaattc ttttag) AAAGAAGCATCTAAAGCGTCTCGCAGCTCCATCCCACTGGATGCTTGACAAGTTGTCGGGTACTTATG
Downstream (in bold start codon of next gene) atatactaggataaagtaacaccaaggtattcgtattatacagacagtagactgctaatt tattgtgcatcattgtagaggtaggtgacttgattaggttttgggtaaatggttttgatt acataataggctgaagtatttatccctctctgtttttggagggcatcgatggaactcctc cgcactatcactaaccatgtattcattgagtaaccgagttttgccgtcatatacttcaag aaatgtcagtaagaggattgttttgtcgtgttctagaggattgattaggagcagtaggtt taattctaccacaacacccttacaaactccccgtcgtggcggatactggaagtactattt gggggcatttgccttaggtggactcgcagcttctcaattgccgctttacagtatatatca attatcagcaacgagtttgccagaggacggtgactcttttgaatacgctgagtatatgac ttctttggaattacagctgcaaggattgccgatagtgaaagcgctaagcaacgaccccag atatgtcaagttcagggcatggcaaaatttggaaggtaatatcttggaaaacagtcttat tgggggtacattgaatgtaccaggaggtttcgcagtgatgcctgtagcgttcgtgaatga gtccactgaagagatgattactgttgttcatgtgggcaggaaactctgtggatatccatt actagttcatggaggaatactagctacaattctagatgaaaatctaaagagatgttcaat tctagaattcaataagaaaaagaatttgaatttgaaatataatgacgagatgagctataa tatgatccacacaaaaacacttacgatatcctacagatctccgacactggctgacaattt catagtgataaaatctcagtgccaacctggtggggataaggattctgtcaatgtcaaggc caacattgaaacagtaaatcaacgggttttggtgaatggtg
AA
MARGPKKHLKRLAAPSHWMLDKLSGTYAPRPSPGPHKLRESLPLIVFLRNRLKYALNGREVKAILMQRHVQVDGK VRTDATFPAGFMDVITLEATNENFRLVYDVKGRFAVHRITDEEATYKLGKVKKVQLGKKGIPYLVTHDGRTIRYP DPDIKVNDSIKIDLATGKILDFVKFDVGKLVMVTGGRNLGRVGVITHREKHEGGFDLVHIKDALENQFVTRLSNV FIIGEANAPIISLPKGKGIKLSIAEERDQRRQQQGLDVEAEEEVEEEADVDSEVDFE
SEQ0107
P. pastoris homolog of S. cerevisiae RPL33A (YPL143W)
N-terminally acetylated ribosomal protein L37 of the large (60S) ribosomal subunit, nearly identical to Rpl33Bp and has similarity to rat L35a; rpl33a null mutant exhibits slow growth while rpl33a rpl33b double null mutant is inviable Chr 1-4, 0275
5' region (in bold start and stop codon of next genes) gaggtttacaccgttgactggagtgtagatggtaagaaggttatcagtggaggtaaagac aaaaaggttagaatctggactcattaagggtctccttacttaattcatactatttagatt acgcgacatttcctatcagaaaaccaaaccgtgatgagtagaccgttcaaagtgaacatt attgtggcagcgttgatacctgggcatgggattggcttaaaggggaaactaccttggtct ctgaaaaatgaaatgaaatatttccgtcttctcacaacaaacacaatctcaccaacttcc aaaaatgtcgttatcatgggcagaaatacgtgggaatcaattccctccagattcaggccg ttacaaaacagattgaatatcgtgcttaccaggaatgccaatatttctgttgaaaatggt gctgcggttgacgactccgtgctggtggctgattccctgaagaatggattggacattata gaaaagacaggaaacatagaggaagtgttcttgataggaggagctgatctttataaccaa tgccttgcatctggtcttgttgaccgtatatttttaacagaagtgaagtccttgaacgga caaattgagactgatgtgtccatcaagattgacccttcccaatggatcaagcaggagtcc caagtcctggagaaatacctccaagaccatggagtgaacggcttccagttagacgggaac aaagagggagagctagtctacgactttactttgctggagcgaagataactaatggcatac tttacagtactatacttaataaacttcatgatagttcatcaaacattgcacgcttgcgcc ctatgtgtactcacgtgatatcttggatttttctcaaacgcagtccattgtgcgtcacat gatctagaagatttctgtcgtgcccgcgcgaaaccatctctactcttgtcaacgaaaaaa acatataatcgttcttactagcactccaacagatagataaa
ORF (intron in bold)
ATGGCTGAGGAAAACAGAT (gtatgtactattaaatcgcaaggtgactggtaaatgatgaagatgatcggcttaa tgagaagaataattgaaaacatagccagaaagttttgatttctgttaaaccaagctgaaatgatgggttgggaat tttttattaagccaaatgaacattcttacaactttaaccaacggatattatatatggctgagccagcgttattcc aaaagcacaagtgggaacaggtctctaagttttttctttactttcaagttgagttgaccgcttcatcttcatttt ctccgactatcaccaccttctcactatattttttagattcacgaggttactaactttgctatag) TATACGTTAA
Downstream (in bold start and stop codon of previous genes) attggttgttaatttatagtaatacagtcagttttttgtacaagtaatacatttgaggtt gaaggtgttttggcaaggtttctctccttccaatggtaaaatttactctctctttctttc tgattatgtaatcttgttgtttctttaacgtcttggtcgtgcgcccgcgacactttcagc ttacccaatagaacatcqtccaatcttgctttaaacaccatcaactabgattccatttga agctctgatcccatatgcaatcatgtttgcgggattctgtcttggtggtggtgtaatgaa cagtgccatcaccgctgatatcaccaagagagcacgtggtccaaaccaagaaattaagag aaacgttggcgaagatgaaggcccaaggaactacacaaagcctcgttacaacaccgacca atgggacaaatattttgcagtcagagacctgagattgaccggctctcttagaggccaatc tgacaacgccgttgctgaagaatccttcaaaacaaactccatccaaccttacagtaatac tagaagaccatgggtgttgagaagacatatcatcatgaagaactctcctaccaagtgggc tagaaaatacagcagtagaagagacgagagagagcaagagattaaagatgactacattag aggaatgggtgaacactgatttcatgtagttgtagttatctttcatttatttatttcgtg aacctatattcggacgcttcttcgcttaaagaagtccataaccttcttggctgtcgacgc tgagttgtctttatggatgtccttagatgtagcaagacgtgtgttagtattcttcaacac gtttacctcgttactgggtactaacggtctcctgacctctaatggagattcaatctccac aggatccactggtgaactagcacttccgctggtatagaaacttaacaggctaaacctttt tctttctggtttagtcttgttttccttgtcttgatgcttac
AA
MAEENRLYVKGKHVSFQRSKSVIHPKTSLIKIEGVENSKDAEFYIGKRIAYVYKGVKAINGTKVRVMWGTVTRTH GNSGWRAKFERNLPGQSFGSTVRIMLYPSNI
SEQ0108
P. pastoris homolog of S. cerevisiae RPS26B (YER131W)
Protein component of the small (40S) ribosomal subunit; nearly identical to Rps26Ap and has similarity to rat S26 ribosomal protein Chr 2-1, 0362
5' region (in bold start codon of previous gene) agacggaaccgcactcccaacatcttgcattcttctcagtggtgggcttcaaccattcga tgttatgagaatcggctgggacaccagtgcaaccaacatagcggtagtcatcataggagt cgatcaagattgggtccttcatggtaccctttctgacagcttccaaaggtgtttcgtcga aaacttcgataccttccagtttacctagcaattcgtatctttctaaaccagtagcttgtt ccaaatcggtgggaactgttcctgtcttggcaccaggaccaaccaggttctcaaggcctt ccacctcagcaatggatgtggcagtcttgaattgetttggggtttgttgtctcaagatgg aagcagtggagaacagacgagatgctggtctaacacgaggtaagcattgacgaagcacta tagatgttagtaacgtgatactttaggacaaaggcctatcagattaaaagttcatgtgtg ctgtcaacctgtggagcttggggtgggtgtggaacttactgataaataatttccttgtgg tatcaagtgaaaggaacagttaagttgagatagcttaaacagacgaaaatcatcctattt caaactagtgtgaatatcgcgcttcctcattggttgatgctccagcagctctgctgatgt aatatctgacgtaatcaatctgatgttcaggtgttttgactcagaggtgcaagggtgttt atttgagattctcgaactattgtgactgttgtgatgaaatataccgcgagctccacgtaa ttgcgagatctaaaggagtttttccttgcaaagtcttcgcgacaccttgtaaagaaaaag gttttgttcctaatagtcgcacgtccttatttttttctctctcacctccggcacctcacg acttagggcacacgcgcgttttttggcactccctccctcgattcctcacccaaaattcca tcagtttgaaaaatatctttctccatcaaccaacaaccaac
ORF
CAAAGATTCAGCAATGACAGAAGAGTCAACCCAGCTGAAGCTGCCAAGAAGGCTTTGTAG
Downstream (in bold start codon of next gene) gtgtagtgtaattcgaatatgaatcatcagttattggaaactgtagagcgtgtttctttt aaagggaacctcacattgactcgtggtattgtaagctagctcagttctggattgttagac agttctaatgctggtcctaacaaatgagtttttctcgcaccttcgtaggcggcctcggtg gcccccaagcactggccctcccattccgatcctcccactacacctttctctcgcacggct atcaccccatctctctaccgtgctccgcctttgcccttccacaaataatctccgcccccc gcgcccaccaaggcatccaacttagcgaacgaatatttctcagacgggggagagacgtta cctatcttctccaqcaacqtaaactcaacacqaaaqgqagaatgqaagaacacaactqqa agttttcacaatgctttggtgacaaaggtgatgtggagacaattacagaagccgacatca tttcgacggtggaatttgaccataccggcgattttctagccaccggtgatcgaggaggcc ggattgtgctattcgaacggaacgagagcaagaacaactgtgaatacagattctacactg agtttcaaagccacgatgctgaatttgactatcttaagtcattggagattgaagagaaga tcaacaagattaaatqqctcaaaaqaaqcaacacgtcqaacatqctqctaagcacaaacg acaagaccatcaaactgtggaagatctttgagaagcaaataaagatggtcagtgaaaaca atctgagtgagaactctaacttaatgaacacagggaatgagacattgtcggccgccaata taagttccaagttgttgtcgatcaacaacttgaagctgcctagattgaccttacatgata ccatcacttcagcacaacccaagaagatttatgccaatgctcatgcctatcatatcaatt ccatcagtgttaacagtgatggagagacattcatatccgca
AA
MPKKRASNGRNKKGRGHVKFVRCLNCARCVPKDKAIKRVTIRNMVEAAAVRDLSEASVYEEYALPKLYNKLHYCV SCAIHTRIVRVRSRSDRRIRAPPQRQRFSNDRRVNPAEAAKKAL
SEQ0109
P. pastoris homolog of S. cerevisiae RPL3 (YOR063W)
Protein component of the large (60S) ribosomal subunit, has similarity to E. coli L3 and rat L3 ribosomal proteins; involved in the replication and maintenance of killer double stranded RNA virus Chr 4, 0139
5' region (in bold start codon of previous gene) tcagcagattgaactcatctctggtttttcccgatgggacaccgtagtttccaatcaggg gttgagtgaagcagatcatttgtcccttataggacggatcagtcatcgactctgggtaac caaccaaggaggtggtgaagacagcttctccactgatattcttgttggcaccaaaagagt aaccgtcaaagacggggccgtttctaatggtaagagtggccctgtccagtctggcggatg atgacaatctggtctgttgaaagaccaacgacttgaggaaatgacgattatttcttaaga tattcatggtgaggttttccaaatgtgatctgaaatgtaatcttggcaagtttggtteat aaaacgtcaaaagaccacctgaagataaaaaataaagtggaaggagaagaaaaagaggtt tttttccagatgagtctatgactaagatttctcttcaaagaattttttcaaggtttgaga gggatgaaaqcgaqgctgctgcacqattatcgatgtcattcaatagqqaqgtagaagaqa actgagggagaaaaggctacaactgaacctaaaactaactgaatttggaatgaagcctat tacacgagaggtgtggcgcctttccaaagtcatcaggttcaagttattgctttctggagc aataaatgatgtcaatcccatacggggtctccacccttatagaattacttatctaccggc taggtgttgcatcgcggaccagcccaatcgggtaattttagttgaaaaggtttttagcgc gagtctagttattacccggatagtctaatgctgaaaatttctcggacatttttgatccgt acgtatatttcagtacggagaatcatcaacaagaatctttcaatgccaccaaaaaaaaag acatttttcaccatcgcgcattcacagtagcttgaaaaaaaaacttcaccttcacccatt ctcttctcctccacctttttaccctttactaaatttgtata
ORF
ATCAAGAAGTACGCTTCTGTTGTCAGAGTCCTCGTCCACACCCAGATCAGAAAGACCCCTTTGAACCAAAAGAGA
AAGGCCGCCTTCTTGGGTACCCTTAAGAAGGACTTGCAGTAA
Downstream (in bold stop codon of next gene) ataccttataccatacgtaataatatagtataaagagtatatactcttttttcttttagc tctttcaaacttagttagactctactaatgagtatatacatataatatcaccactgcaaa tacaccactgcaaacaaaaagatatcttctttattttccataactccttccactacccac taaaattaaaaaagaaacaccataatcaagcaccacaactgagaaacaaataatatacaa acaaaattaataatattatcagtgtagatccacttgctgtactttaacacaacgataaat tcacacagaaaaacgtatgctaatcaaacaatcaatagaaaaaccaagaaaaatcaaatg agaagatgctgaataaacatgatgactgtgatgagcggttggaaagtggaatccaaccgc aaaaactggattccagcagggtcatgctccctgttggtccaaaactatttgataccttta taaattgataccaacaaaaaacgaaaagcaatcaaaaaaatgcctagcgacagcatcctg tgtgtgactgactggtggtttaccaccaatcagaaccagacgattgttgagtagtgttgc tgcttggagcactaccgtagaagttagaaccaccgaagtaaccacttcctccattagccc taccataagtttgggaacttggctgaggcgggtagttagaagatcttctgaaatcttttc cgtaagcgtttcttccgcctcttccacctctggctccgcctctacccatagtagcttctc tggaaagggcgctcaggaattgaggaacttcttggttagcctcggttaacagttcaatta aacccttgatcaggttcttgttaccgcggttaacgaatgcagtggcaataccaacattac cagcacgaccggtacgaccaatacgatggacgtaatcgtcaatgtcagctggtagatcgt agttgacaacgtgtgtgacatttggaatgtctaaacctctg
AA
MSHRKF EAPRHGSLGFLPKKRASSPRARCKAFPKDDKSKPVALTAFMGYKAGMSTIVRDLDRPGSKMHKREWEA VSIVDTPPLVWGVVGYVETTSGLKSLTTVWAEHLSDDVRRRFYKNWYKSKKVAFSKYSAKYANDPAS IETELAR IKKYASWRVLVHTQIRKTPLNQKRAHLAE IQINGGS ISEKVDWARDHFEQTIAVDTVFEQNENIDVIAVTKGHG FEGVTHRWGTKKLPRKTHRGLRKVAC IGAWHPAHVMWTVARAGQNGFHHRTS INHKIYRVGKGDDESNGSTEYDR TKKTVTPLGGFVRYGE IRNDFVMIKGSIPGPVKRWTLRKSLYTQTSRKATEQVTLKWIDTASKFGKGRFQTPAE KAAFLGTLKKDLQ
P. pastoris homolog of S. cerevisiae RPL2B (YIL018W)
Protein component of the large (60S) ribosomal subunit, identical to Rpl2Ap and has similarity to E. coli L2 and rat L8 ribosomal proteins; expression is upregulated at low temperatures
2 homologs
SEQ0110
Chr 4, 0107
5' region (in bold stop codon of next gene) tcaaacatggctgcccttatatttgcacctatttttggtgttctcattgacaaattcaac aatcatgctcacttacttttgattgtgaacactctgagtctgttaagtgctgttggtgta accttcattaagacaccagattcttcagcactcatctttctgttcgcaacaatgttagga gtctcacaaattgggttcataatcatttcaatgaccatgttatctaacatttcctccaac caagaatactacaagaacaagggatcattagttggtgtctactcattatttggatgtgtt ggagtcttaattgttaataagtttggaggttggtttggtgactggttcatgcatggtcca tttattctgttagctctatttaacgttttattctttattgctttatttgcattgaagggt gttaagaagaacgaatccagagactttcagtcagttttacaaagctttactggcacccgt gtaggacgtatgttttccaacaacagacatgttccaaactctgacgagttttagtattta taggtaaaaatttttagtatataggtcaaaatccaaattagaagtcaattagatggatta tttctgaaagtggaccttttgagccccttggaggttataattaagaagtatttcagagtg tcatgacatactggagtataatcaattcaatatataaattagtgtaaagatgttggaagt ctgataatacgataatgtaactctccctggcacttcaaaatagaccgaagaagaattaaa
tcattcaccctacgccactctaatcatacagacaccttcctaacgaagtcttccataagg tacctttcaactaccaccttatgatacctcacacagcttgcccaataaggcacgtgacca gcactagggttaacccgcgaacacaagaaccctcatactccccaaaaaaaaaaccaataa aaaatttcaagtcctcttacagcaagaaacattaacacaac
ORF
Downstream (in bold start and stop codon of previous gene) actatgtaactaacgaaacagcatgtactaatagaaccgtatcgagaatatttatttagg tgagtagtaggagtgaaccagacagtcaatttagtgagctgtcccagcttttgtgcattc cagaattgccggtcaaattggttatgggttatggggcttttccgattgaggttcagtttc tgcggttatctctttcttgacctggtcttttacaggctgttctttctccccatgattatt ctttagctgaagataccgcttagcctgataatgtcgtcgttttgtaatcaaaatctttag ttgggcatcgtctgaggtttcctttggcttctggggttgttagtaggaacgtaggaacca tagtaacttttacacatacattcttatgattgcgaagtaagctgagtctgctgcttggct cccgaagtactttctctttctctaccggttgattctccttctggtgctcctaaacgattg tgttagaagggattgacaacagactgtggaatagtacatactcttttgaaataaaaatgc ttgaatctagagtttcctcttttcaatggattttgtttctgatcttcctctgctttgcta ctctctggggaagatgcctcctcctctattgtctatcaccagttagataacgtttatcta gacatctaatccacttacaattacctcaattccttctgcgatttcgtcaatagccaaagg tgccaggcttgagtcgtcctcgtcctgttcgatgactgtgctgattacactgtttgactt tagatccgtcttgttcatcgcttcaactgtcattctcagttgttcgttcaggtgataaat ttccggccgaccactacacaaagctaatacgcaaacatgctgcgggggctttgtgtctac aggacctgccctaccccagcttctcttccagtcaaatgcatgcaacacttctctgttgta ccgttacaatctcatctcccaaccttacgtaatcaattctt
AA
MGRVIRAQRKGAGSIFTAHTRLRKGAAKLRALDSSEREGQIRGYVKQIIHDPGRGAPLAKVIFRDPTKKGLREET F IANEGLYTGQF IYAGKDASLNVGNILPLGALPEGTI ISNVEAKPGDKGTIGRTSGNYVIVIGHNVEENKTRVKL PSGAKKIISSNSRGVIGVVAGGGRIDKPLLKAGRAFHKYRVKRNSWPRTRGVAMNPVDHPHGGGNHQHIGKASTI SRNAVAGQKVGLIAARRTGLLRGTQKTQD
SEQ0111
Chr 1-1, 0219
5' region (in bold stop codon of previous gene) agaagaagaaagccaacaaaaagaagacttccgctgctaagaacaatgatggaaccaccg aaggtgttactaatactgcagtggatgggactgaggagtcccaagctgatgtcagtagca atgccgactccgcttccgattctgttgctgattcgaaagacaaagagaacagcgctccca atggcggcaagaagaagaagccttacagtaagaagaaacaaaatccacataaagttcccc ttgaacagggtacgccttcagagacttcgatcttcatccgtaatttgcaccaggaggttg tcacttccgaactgacggagttcctttctgagtatgaccctcaatgggtttcgatcccca gaaggaatataccccgtcatgttgttcaaaaactgaggaaagagaacattccgatcaggt cgcgtggtatggcatttgtgagatttgccgaccacgaaacgcaacaacgcgctctcaagg aattgaacggtaaggaaattaagggcaaaccaattagcatcgatattgccattgattcct tcgtgaaggaggaacctaaagaatctagcgtcaaagatggtgctgacgtttccatcgtta ctgatggagctcagccgcaggatgagggagaagagggcaaggtagaggaggaagtaaagt ccgagccgctcaaaccattggagtcaaccaccattgctccagaagtgcaaactactacag tgtgattttccaaaccgtttcgagttgctgaactctgtcattataatctataggttttgc ttgcaactagttggcgtaataatgtatcatcattccatatttatttgttcaagacttctg
tacccccgtcgcgagccctagccctgtacacgtgatggatattatactgggctcgcgacc cacgaatacacctctacaccaacctcaaagagtggaaccttccatcttaggtgaaaaatt cgttagactaaaggaaattctatacatttcaacaggacaaa
ORF (intron in bold)
ATGG (gtatgtaaaccaccaaccactaaacatccagaggtcaagtactaacgtcattgtttatag) GTAGAGTTA
Downstream (in bold start codon of next gene) gtacttttatagttatcaaggttaaattttgggtaatactcttgtagcttttcgtagagc tcactgtcacctcctagttttcagcctaacctttctcagccctgacgggtgttactccct tcgggcttgcgctcccttctaaccccgtagagcttgcttgcctttttaagccgttttagt cttcacacctacttatgacatatctttgctgaaaagcttccctttagttgaggttatcca agctagagcttttaactccctccagttttcttgcccgccaacaggaccgtcattccttgc cacagcgttccttgaagctttttccctgtgaagactcaattaccaggtggactcaatttt ttcaacagttcgaagattagaagatattttctttccgttaagaagtcgtctttccttcgt atgaattcgtacaatgagtcagccccgacgggttgcagtttctgggataatgatgatatt tctccctgtatcagaaagtcactgctggattcgtatttgcctgctgcaattgtagtagga tcgttactctatttactgctcataggagcacaacagatcaagactcacagaaagctctac gccaaggatgaaacacaaccgttattagaacctgccaatggctcacctacggattattct aatacttatggcaccatagattacgaagaggaacaatccacagcggagttgacaacctca cagaaacattttgacatttccagactggaacccctgaaagatgatggcacaccgttagga ctggtgaaatatgttcaaagagatggctgggagaaagttaagctaattctagagtttgtg atattgatattccaattagttatagctgtcgtcgctcttttcgtaccttctcttaaccaa gagtgggaaggttacaagcttactccaatagtacgagtattcgtgtggatcttcttgttt gcacttggatccatcagggcgttgaacaagtcgggtccatt
AA
MGRVIRAQRKGAGSIFTAHTRLRKGAAKLRALDSSEREGQIRGYVKQIIHDPGRGAPLAKVIFRDPTKKGLREET FIANEGLYTGQF IYAGKDASLNVGNILPLGALPEGTI ISNVEAKPGDKGTIGRTSGNYVIVIGHNVEENKTRVKL PSGAKKIISSNSRGVIGVVAGGGRIDKPLLKAGRAFHKYRVKRNSWPRTRGVAMNPVDHPHGGGNHQHIGKASTI SRNAVAGQKVGLIAARRTGLLRGTQKTQD
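In the intron-containing ORFs of this listing, the intron appears as a lowercase stretch in parentheses between the uppercase exon segments. The sketch below shows one way to remove such an intron from the SEQ0111 ORF fragment and translate the joined exons. It assumes the standard nuclear genetic code and that parentheses delimit the intron; the helper names are illustrative and not part of the listing.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def splice(annotated_orf):
    """Drop parenthesized intron segments, then any non-nucleotide characters."""
    without_introns = re.sub(r"\([^)]*\)", "", annotated_orf)
    return re.sub(r"[^ACGTacgt]", "", without_introns).upper()


def translate(dna):
    """Translate complete codons of an in-frame DNA string up to the first stop."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# ORF fragment of SEQ0111 as printed in the record above.
seq0111_orf = (
    "ATGG (gtatgtaaaccaccaaccactaaacatccagaggtcaagtactaacgtcattgtttatag) GTAGAGTTA"
)

exons = splice(seq0111_orf)
print(exons)             # joined exon sequence
print(translate(exons))  # N-terminal residues encoded by the complete codons
```

The translated residues can then be compared with the start of the AA sequence listed for the same record.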
P. pastoris homolog of S. cerevisiae PGK1 (YCR012W)
3-phosphoglycerate kinase, catalyzes transfer of high-energy phosphoryl groups from the acyl phosphate of 1,3-bisphosphoglycerate to ADP to produce ATP; key enzyme in glycolysis and gluconeogenesis Chr 1-4, 0292 See SEQ0072
SEQ0112
P. pastoris homolog of S. cerevisiae RPS14B (YJL191W)
Ribosomal protein 59 of the small subunit, required for ribosome assembly and 20S pre-rRNA processing; mutations confer cryptopleurine resistance; nearly identical to Rps14Ap and similar to E. coli S11 and rat S14 ribosomal proteins Chr 2-1, 0481
5' region (in bold start and stop codon of next gene) gttgaaaaagaagtttttttctcacaagtgtaagagtagaggaaaagaggtagtgtgcgc
gacaatcgcagtgggaacactgaaaaaaaagtacggtaggtgccagcaaaaggagtcatg tgacatatgttacggctaacccctgctaacttcttctggtgtagagagttttttcagctc gtctgccaaaaacagtctgaaaaaaaatacacagctactcttaccatcacacctatcgaa tatgaccagaacctctgttcttgccgacgctctgaacgctatcaacaacgccgagaagac cggtaaacgtcaggtactcatcagacctgcttccaaggttatcgtcaagtttttgaccgt catgcaaaatcacggctacattggagagtttgaattcattgatgaccacagatccggaaa gattgttatccagctgaacggtagattgaacaagtgtggtgttatctccccaagattcaa cgtcaagattggtgacattgagagatggactgacaacctgttgccagctagacaatttgg tttcgttatcttgaccacctccgccggtattatggaccacgaggaggccagaaagaagca cgtttctggtaagatcttgggtttcgtctactagatgtgtgtacttatactaatatactg attattcatcattaatggattgtagttacaggctctttcggtatattgatagtaccagca ctccgcccagggagtcccattggggttcaatcctccaacaccttgtcccctttgaacctc tcccttctacctctcccccctgtaggcattttggcttttctcgcgaacgggatcctaatt gtgactgcgttcctaatgaggcaactctcaacctccgcccaaggctcctttccccgacat aattgggaggagtcccaattacacctccgtctcgcgaccaaaccttctcctccctctctt tttcagtcgaaaaaagttttacctctteatcatacaaaacc
ORF
ACTCCAGTTCCATCTGACTCTACCAGAAGAAAGGGTGGTAGAAGAGGTAGACGTCGTTGA
Downstream (in bold stop codon of previous gene) acggatctctttttactatattcatatactatatacactagtcgtgggattgaatttaat ttcggcgtgataaattgattgtagatatttaattgaaacagacgctgagctgttttattt gctggtattttcctactcctttttgtcggtggtttgaggttggtccttggactcgattac ggtgccatccttactgggttctttcacctcctcttcagattgcgctggagtttcgtcttt gacctcctcgcttgcatcaacctttggttcttctttgacatgttcgttcgtatcttccct cttctcctccacggtgtctttagtaacgtcatccttcacttcatcttctttcacctcctc gtattcgtctagtttgatatcgtctttgattggagtggaacggcgttgtgatttactctg ttgtctctcttcctttatcagctcatcttgttccttcaagacattcagcttcaacttatt gtcctggtcggccagctctaacttagcaacattgtaaacctcatcaggaatcgaagaaag tactgccaacagggcatcgtagtaattgtccaggtcatcagcgtgatctccataagtgaa cgtgctactcaaaatcaaaagtgtcgaaggaatcttctgacgcaacctcaagtccagcca aatcttcaaatcatccctcagtcttgcaggagaagcaccaacggttttgataccacgtga agcacaagcactctggagctcctgaattgacagagattcaactccttcgtaatcaatggc tcgatcatccttgatgatttgcaataacttatatctgatctgataacgtaaaatctcgtc agttccaaacggacgtaaattcatgtaacgggccattgccatcaattgtggtcgggaaag attgtctagcacttggtcattcttgaaacgacgggcgatacgaataatctgttctgtcga tggtctttcacctggttcctggatacttttgaaaaatgata
AA
MASKTENVTLGPQVRDGSQVFGVARIFASFNDTFVHVTDLSGKETIARVTGGMKVKADRDESSPYAAMLAAQDVA AKCKEVGITAVHVKLRATGGTSTKTPGPGGQSALRALARSGLRIGRIEDVTPVPSDSTRRKGGRRGRRR
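Where a record shows only the 3' end of an ORF, the reading frame can be anchored at the final (stop) codon instead of at the first base. The following is a minimal sketch under that assumption; `translate_tail` is a hypothetical helper, and the standard nuclear genetic code is assumed. It checks the ORF excerpt above against the C-terminus of the listed amino acid sequence.

```python
# Standard nuclear genetic code in TCAG order (assumed to apply here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_tail(fragment):
    """Translate a 3' ORF fragment in the frame fixed by its last codon."""
    seq = "".join(fragment.split()).upper()
    seq = seq[len(seq) % 3:]  # trim leading bases so the final codon stays in frame
    return "".join(CODONS.get(seq[i:i + 3], "X") for i in range(0, len(seq), 3))


# 3' ORF excerpt and the last residues of the AA entry, copied from the record above.
orf_tail = "ACTCCAGTTCCATCTGACTCTACCAGAAGAAAGGGTGGTAGAAGAGGTAGACGTCGTTGA"
aa_tail = "IGRIEDVTPVPSDSTRRKGGRRGRRR"

peptide = translate_tail(orf_tail)
print(peptide)                              # TPVPSDSTRRKGGRRGRRR*
print(aa_tail.endswith(peptide.rstrip("*")))  # True
```

Anchoring on the stop codon avoids guessing the frame of a fragment whose first base may not be the start of a codon.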
SEQ0113
P. pastoris homolog of S. cerevisiae ATP1 (YBL099W)
Alpha subunit of the F1 sector of mitochondrial F1F0 ATP synthase, which is a large, evolutionarily conserved enzyme complex required for ATP synthesis; phosphorylated Chr 3, 0576
5' region (in bold start codon of next gene) ctcagctttagcttcgatcttgtcgatcagctcttgaggtacctctgatccaatcaaacg agcattagcaacgcctcgttcagcagcacttaattggtctagttcctctgctctcttagc attgttgacacatttggaagcatttgtctcttttgtcactcttcgatcggcaggaatact cgatggctccgtgtaagaggcaaaattgggcaagatcacaccttcatcctcatcaatttc tgaatctgtatcactgatctgttcttcctcgctggagctctcatcattctcaatggattc
atctttgtcgtcatccttgctcacaagcctttcacttggatcatgaacagtaatcgacca tttgttttcttttggattggggttccttgaatgagatctactactccgacctcttgagac tcctctgctatgaggaccaccagaatttggagacacagaccttcccctggcaccagactc cctccgcttttgcctttctagaatatctataatttgttttctctcatcatctaacgcagc ctccagagacattttgcaggactgcttatcaagtagtataaaagaagggtgctttctttg tcgtggaacctagtcaagttgcaagggaaggagcaaagtgcagcatccagtttccatcca cgtgtgggtgctacactacttcttgtcgtgaatattttttccgagtcgttatataagtaa ggtcctatatatatataagtatgtttttgagttattatcttcttttataattaatagtgc cttacttcgaatctcatccttccataaatatgggtcgtgcactacacacccccttggacc gccccccagggtttctaaccgctcgaaattgtgaactgtggttggccaatccaagcgcgc cctcgacactcctcaacgctgtttttatattaactaaccccttcctttccttttctctcc tacacccagtgttaaatctttatctttaataatattcaaag
ORF (introns in bold)
CGCGTG (gtatgtatttcctgatgatttgaatcactttttgatttggctgtcaactggctcccccatgaaaagtt tgtctttactcaacgttgaagcattattgatgattattttggattgtagaggatcaattccatagaatgctttaa atatttcacacattaatgtgttttcgaacaggacaatcgcttcaaatggtgtggtgaaatctatcaaaagtgtac tttttctttccattgacaactactaacatatcattag) GTCCGCCCATTAGTTTCTAGAGGTTACGCTGCCGCCA
AGACTGGTAGAGTCTTGTCTGTTGG (gtaagtttttctgaatttttttttgaatatgtcatgggctgatcttagg gaggcatgatacaacagtgctttactccacaattgtgacgtattgattgatgtgttgtcctcagatatcttcttg tccagatcctatttttttttcaagttttctcgacattttattggacagtcggagatgctcgacttgactccctaa gaactacgcccttacaaacaaccttaaactgaagctcagtaaactaacattatag) TGATGGTATTGCCCGTGTC
GAATTGTTGGCCAAGTTGAAATCTGCTACCGAGTCCTTCGTTGCTACTTTCTAA
Downstream (in bold start codon of next gene) acttagcagttaacgattacgatttaagcatttttgataaaagactcaaaattcgacata caattatgatgttataaattgatatttagatttatgttttgcaatgtagagccttgtaaa tttgatctttgatatctggagaactatgatagtagactgaggctggcgatatgctctttt tagacacttcgtggtgttccttgctgtttcttgactgactttccctctaacatcaaattt gtcgctaagattgtttccaagtttccttccaagtttcctcctacttattgaaccatgaca tcgtctgattcaaccaaccaggaggaaactccgcaaccagagaaacattcaaagcaaatt actaacgatgatctctaccgaaggtcgacccagttcagattttggtcatttaccgagcaa cgattgaaggatttgaagcggtcaactaatgaaaagggagctgccaacttgaaagaaaga atatcacaattggatccttctataccagaagtctctgagatttgggaaaaggacctcgtg aaaccgctcactgtggaagaggaggttaaaattgtcgaatattatgcgaggaaagcccaa ggccttgccaattatttcaaattaccaacacaatcaagggctactgccatatcatttttt agaaagttctttctggtcaacagctgtatacaatttcacccacaatatatcatgtacaca tgcctgtttctagcagctaaatcagacaaccactttattggaataaaggaatttagcaag gccattccaaaaacaaccccagagtcaattttacagtacgaatttcagatactgcaaagc ttgaagtttgctcttttatgccatcacccttacaaaccattgtatggattctttttggat tttcaagtggaaatgaagaataatatccagtcagataaactcgctgagttgtatgaccac
gcaagagataaggtatctgaagcattattttctgatgtttc
AA
MLSARPVIRRAARSAVAINVARSVPRWRPLVSRGYAAAKAAPTEVSSILEEKIRGVSAEADLNETGRVLSVGDG IARVFGLNNCQAEELVEFSSGVKGMALNLEPGQVGIVLFGSDRGVKEGDTVKRTGKIVDVPVGPELLGRVIDALG NPIDGKGPINSKVTSRAQLKAPGILPRTSVFEPMQTGLKSVDALVPVGRGQRELIIGDRQTGKTAVALDTILNQK RWNNGSDESKKLYCVYVAVGQKRSTVAQLVQTLEQNDALKYSIIVAATASEAAPLQYLAPFTACAVGEWFRDNGK HALIIYDDLSKQAVAYRQLSLLLRRPPGREAYPGDVFYLHSRLLERAAKMSDKLGGGSLTALPVIETQGGDVSAY IPTNVISITDGQIFLEAELFYKGIRPAINVGLSVSRVGSAAQVKAMKQVAGSLKLFLAQYREVAAFAQFGSDLDA STKQTLNRGERLTQLLKQKQYAPLAAEEQVPVIYAGVNGFLDSVPLERIGQFEEEFLAYLKSNETEVLEAIRVKG QLSDELLAKLKSATESFVATF
SEQ0114
P. pastoris homolog of S. cerevisiae RPL7A (YGL076C)
Protein component of the large (60S) ribosomal subunit, nearly identical to Rpl7Bp and has similarity to E. coli L30 and rat L7 ribosomal proteins; contains a conserved C-terminal Nucleic acid Binding Domain (NDB2)
Chr 2-1, 0022
5' region (in bold start codon of previous gene) cctacaactggagcgacactcatcagccttggatgaaagcaagatgtcgattaatgaatg ttcaactgtaggaagaccaaaacactattgtcgcgcgaaatatccattccgtactaagat tcactagcagcagcgttggagtactaattccgttttgatgcttgcattataatgaaatag tttcaagaaggattttaacgacttcttacgaacgcaacaatctgcaaggcgctatgaggt attttaacacgtagattaggatttattttgctagttacgggggagatatcattagttgaa gctcgaaaggaaggatgcctagaagcgcaagtcgtaattacctgacggtgcggaagacct ggcgttaagtttgaagaagcatcctaatttgtcgtcccacattgtttgtgtgatcctgta agtgaatacctggctacgatcactttctcaggtcaaaatctcctctatggtgttgcccta gagcaaatcacatgatatattttgttctttttttttttcccttttatctcgttccacgcc cattttttttgtgttcgcgcgctctcctcctccatcctgaaaaatttgttcaatctttta cctgatattctttcccgtataccaacctattcacacaaaatgactccaaggtaagttagg caacaagaaccctcagttcaagaatggtgactcgatcacatctttgccatttcaagatca agaaatagctgaagtttgtcgaccagcttttgcagtaaagcttatgctctaacatgagaa atatgtggccatgcccaatttcttctgtcgatctcattctgcgaatgacggtgtcagttc gtagctcttcccatactaacagttttttccagagttcagaagccagagactttagtcaag aaggagaaagtcgatgagaagactgcccaagtacaagccgagtacgctaaggctaaggct gccgtaagtaccacaccaaactggtctaataattccttttc
ORF (intron in bold)
CGTTTTGTTAACCATCTTTCGGGACTGAAAgtaagctggtagtctgagaagtggaatcgaaatttgtagattgtc tgaaattttgtttctttactaacatcatgtttctagTTGGACGAGGAGAAGAAGGCCATCATTTCCGAGAGAACT
ATTCAAGGTGGTGCTACTGGTAACAGAGAAGAGCACATCAACAGCTTGGTCAAGAAGATGAACTAA
Downstream (in bold start and stop codon of next gene) acatagtcttaccaataaataggccttaatttgatatgtataacactattgttatcgcat ttatttttgggatatccaatcgttgagtccgtacttgttacgagctatctcgatggcgta gaactgtagacggactacgaaaatagttccttcttggtcgtactctttggaactgtcagt tctgagcaaggttccaaaattgaagtcgtcgatcttttcttcaaaaggtttcatgaactt tctccatctctctttaccgctcagagatttcatttcgtcctcattgatcttgctggcata ttctggattttcaaactctggaaagtccttcaataaagcctcaaaaatctcatcatcgtaa aggagtaagcttgagttttgtgccaggaattttggtgagaagattccagtaagtttcagc ttgtgtcactgccttgacagcaaattgcttctcaatatcgtccaaattttcagcttcttc
ggcgttaaaagtacttaaactcatagttcttctgatctggtatgaatggtgtacggaaaa agatggaacatgagagttttttgaaaatgtggagccagtcacgtgacccagtttcaatac ataacctctcaagttgtgtagctcaattagaatttacgacggttggggtctaagatctgt tgatcaaaaggtttccaagtagcaagtgctatctactgaacagtggaaccaaggcagtat gaatctgcttgctgaaatcttgaaagcctgataataagctgctgaacaacgtttggttca ctttgagttcagctgcttcggtgcaagggtgactacagttaatattacggacgctcgtat gctggaatgaaaggtagtcttgtaaacatcatcaaaagaattgaagaggagctgagtgtt ttggtgaagacaagtttaaccttcggtccaagatagtttccacctggtgtagagaccact ggatgcatgataccgttagtaagcggcaagagacggttcac
AA
MPQKTTSKYLLMIPIFNRLAFGQCCRFVNHLSGLKLDEEKKAIISERTAKYYEEYEEAARVLEKAKSDAAASGSY FVEAEPNLVLWRIKGINKIAPKPRKVLQLLRLLQINAAVFVRITKATAELLRLVEPYVAYGYPSLETVRELIYK RGYGKLNKQRVPLTDNDI IEKVLGQYGI ISIEDLIHEIYTVGPNFKQANNFLYPFRLSNPDGGWGVRRKFFHYIQ GGATGNREEHINSLVKKMN
SEQ0115
P. pastoris homolog of S. cerevisiae SSB2 (YNL209W)
Cytoplasmic ATPase that is a ribosome-associated molecular chaperone, functions with J-protein partner Zuo1p; may be involved in the folding of newly-synthesized polypeptide chains; member of the HSP70 family; homolog of SSB1 Chr 3, 0731
5' region (in bold stop codon of previous gene) ccttcataccacaactttgcgacgcctgaggattgtagaaaatggcgtagaaaggccgcc agcctttctgactgttcccgtccttgtcaaagaacaagaagttcacagatggttagattc tttgatggccacttatcagatccagatgcgaacaatatcagagctctctcaattccatac agaggatgcgaactcattcatttccatcatacaattgaccattgatggcatctggttgat ggctctggataaagaatggcttaacagcaggttgacagttgaaggagttttaaccgctgg tagtcacgatagacacagccaattacgctttcatctcccatacaatgtagtaatttcctc aattactcttctggacaagctcactctggcagattgccaagaggttctctatcatgtatt caaacgagcatcttctactgatactaagactcgaagctctttccaaaacttaacaagatc ctcagaattgtacaactgcaagaaggaactccctttacaaccattaagccaggtggatat attggtaacgcacaatttatggctcacaaaaaggcagacttcttatctaattggctgtaa aggttccagaattgaactcatacgaactaactccaaggctaaaatcaagatttttagtgc actagaagaggaattaatgcctaagcaacatatttctataactggaactaagtctcaagt tcagaaagctgtctttctgattcagttttccttggacaaactacttcagggaatcagtga ctggtgagactatttattggtgagcaaaactagaatgggctataaattccatgccacagc cgatgaaccgggtaaccttttagttttacccacaatttactgtttcgctaacctagaaaa aaaacctctagtggttggtgaaaaatttttttaggagagcctttttttttcttgtttcct cttctttctcaaagaaactttcaagttttcgcatatatata
ORF
TCCGCAGTGGAGATCATTGCCAACGAGCAAGGTAACCGTGTTACTCCTTCCTTCGTTGCTTTCACTCCTGAAGAG
Downstream (in bold stop codon of next gene) gtttattagtaaaatttagggtacgcaaagttaatcttacagtatatatgtgctgttact tcttaaatgtaaatcattatttttattaaccgatatagttatagccatccgcagcaacga ttgtcccttcctggatacatttgaacacccactctttctctactatccggggtactttat ctagaacacctttgacgattggtagatcatcacccagcttttttcttagagacccaactt catcctgtatggtcgcctcctcagtcaaccaggaatagttgaatgtggtggaagtatcca cgattataaatgaacattggaaaatatcagttgtaagaaatcctcctagaatcactatta cttctaacaagtatggactcataacatttgtcagaaatggagatctatagcatggtcttg cgatataaaacttcaggcccttgaacaagttgtacagtagtggaaattttagtgggtgtt ctaactttatcttaggtatacctttcataagcattcgcaactccttacggtcaatgggag aggtataactatccccaaatcggtcggagtttttctttaccttttctatcaaagactcat ccgccagtaatatgtcgttaatttgtagaggaattatagttttcagctgaagacacttca gaatgaagcctggtcgaaagacatttagacctttattaaagtactctgtgactttgaaag taatcttgtcactaagtaaaacaagcctctcgccatctgcaagagctacgcaatcattct tagttattctgccaccattctcgaccagcaaagcctccagctgagcaatagagtatcgtt tttcacccaacatgtcacttaaaacataaaagatgtaatccttgaatatgctgctcttcg aagatacactttgtttagcgagaattggttcagtgactattgtcctctttcgtttactca ttagactgtgtgttttctctgttttgtgtctactctttgaa
AA
MADGVFQGAIGIDLGTTYSCVATYDSAVEI IANEQGNRVTPSFVAFTPEERLIGDAAKNQAALNPKNTVFDAKRL IGRAFDDESVQKDIKSWPFKWNDNGNPLIEVEYLGETKQFSPQEISSMVLTKMKEVAEAKIGQKVEKAWTVPA YFNDAQRQATKDAGAISGLNVLRI INEPTAAAIAYGLGAGKSEEEKHVLIFDLGGGTFDVSLLHIAGGVFTVKAT AGDTHLGGQDFDTNLLEFFKKEFQKKTGKDISDDARALRRLRTACERAKRTLSSVAQTTVEVDSLFDGEDFTAE I SRAKFEAINADLFKSTLEPVEQVLKDSKIEKSKVDDWLVGGSTRIPKVQKLLSDFFDGKQLEKSINPDEAVAYG AAVQGAILTGQSTSEETKDLLLLDVIPLSLGVAMQGNVFAPVVPRNTTVPTIKRRTFTTVDDHQTTVQFPVYQGE RVNCSENTLLGEFDLKNIPPMSAGEPVLEAIFEIDANGILKVTAVEKSTGRSANITISNS IGRLSSSE IEKMIND ADKFKKADEDFANRHESKQKLEAYVSSIESTITDPILSSKLKRSAKDKIESALSDALAALELEDASGDDFRKAEL ALKRWTKAMATR
P. pastoris homolog of S. cerevisiae FBA1 (YKL060C)
Fructose 1,6-bisphosphate aldolase, required for glycolysis and gluconeogenesis; catalyzes conversion of fructose 1,6-bisphosphate to glyceraldehyde-3-P and dihydroxyacetone-P; locates to mitochondrial outer surface upon oxidative stress (2 P. pastoris homologs) Chr 1-1, 0072 see SEQ0065; Chr 1-1, 0319 see SEQ0066
SEQ0116
P. pastoris homolog of S. cerevisiae AGX1 (YFL030W)
Alanine:glyoxylate aminotransferase (AGT), catalyzes the synthesis of glycine from glyoxylate, which is one of three pathways for glycine biosynthesis in yeast; has similarity to mammalian and plant alanine:glyoxylate aminotransferases Chr 4, 0416
5' region (in bold stop codon of next gene) gcctagaaatctcctaaagcatggctgttccttttgaaattgaaccatacgaaattcttg gtgtgaacaatgaagctacacctgtggaaattaaaaagtcctactacaaattatgcctga tacaccatcctgataagaaatctggatcagactcatcaaatgacgaacatttccaaaaga
tccagtttgcttattcaattctctctgactcgaaaaggaggaaaagatacgacagtacag ggtcactagatgacactgccctagacgaagatgggttcgattggaaagagtactttgaga cgatgaaaagccagccagtcactgaggatttgattgaagaagatagagaaaagtacaagg gaagtgatgaagagaagcaagatattattgatgctctccaattttatgaaatggacgttc caaagctctttgaggccattcctcatctcgagtttgacgagagtgaagaagaacgtatct tccacttggtaacagaattggtagacagcaaacaggttgaaactacaaataaatgggaca agtacaaacacaataggaaatccttcataaagagacaattgcgcaaattggaaaaggagg ctgttgaagctgaaaagcttcagaagaaactggcgaagcaaaagaaagcagatcaggata tttcaaccttgcaaggactacagtcaattattcaggctaaacaacgatcatctaatcaaa agttggactcgttgataaataaactggaaaccgaagaggacgcaaaggtaaaatccaaac gaggtaagaaacggggtaccaaaagggcagcagtcccagatattgacgaagaagagttcc aaagaatccaggcaaagctgaaacgatagtgggggtcaccatatactaggttgttctact gaaaaaaaaagcttattggtcgtgtttcacacgatccatcatcaacggcatcatcgcatc atcgcttgcagtttagagtcagacccccaaacacccgcgag
ORF
TTCAGGCTGATCAAGGAGACCTTGGAGGAGTTAGAGAAATAG
Downstream (in bold stop codon of previous gene) ttgtagtaatttatttatattcttttgttcactaattatattctaaacagacaatgggaa ttctcctgcccaggcgctaatctcattcttgagttgagtgatctcgtcagagccttcagc gatcttggccttgaaatctttaagcttgttggcttcaattggtaactcgctttggacttt cttagcatattgaacagatttgtcgatgtagttagcgatcttgacgaagtcttcctcaga agcacctctggtggtcatagcaggagcaccgatacgaacaccacctgggaccaaagcgga tctgtcacctggaatggagttcttattcaaagcgatattgatgttctcacaaacggtttc aattctggcaccatcaatatccttatcctteaatgaaacaaggaccatatgggagtcagt tccgtcagaaaccaacttgtatccaagtctcttaaactcattctccagagccttagcatt cttcaaaacctgttcctgatactgcttgaactcaggtgtagcagcttgcttcaaagcggt agctaaagcggcgattgtgtggttgtgtggaccaccttggtggccagggaaaacggagaa attgatggggttttcaaggtcgtagtagatttccttaccagtctttgggttgacagaacg gacacccttacggaagaaaatcattgcaccacgtgggcctctcaaagacttgtgggtggt tgtggttacaatgtcggcgtattcgaaaggagatggaatgacacctgcggcaatcaaacc agaaatgtgagccatatcaacgactaagtatgcaccaactttgtcagcaatctccctcat tctcttgtaatcgatcaatctacagtatgcagaggtaccggcaaccaggacctttggacg gtatagaactgcagtcttctcaagcatgtcgtagtcgataatgccagtttccagatcgac tctgtaaggcattgtttcaaagtaagtcgagactgcagaga
AA
MIFQRHPLLISMFSKLQRFVQPIRSMSTITIPNSKQPAHRLTMIPGPIEFSDDVLDAMSTPSQAHTSPDF IPVFQ EVLQNTRKLFKSSNPTSQPIVISASGSLGWDIVGANFLNRGEKTLVFSTGFFSDRLADALSYYTETDVDVVKAPL GESVPLSEVEKALSAKSYDVVTITHVDTSSAVISDVEAIAKKIRSIQPEALVWDAVCSAGVEDIQFDNWGLDYV LTASQKAVGVPAGLSISIASERAVKKAFAKKRPTSYYANLQKWIPIMQAYEHGKGAYYATPPIQLIHAYRVSLRE ILEQGLDNRFAKHAETSNKFKDNLESLGLKLVAKREFGANGLTAVYFPKSINGPDLLKKISEKGVTLSTGIYAGI ATEYFRVGHMGVSAVGKDRHDVDIVFRLIKETLEELEK
SEQ0117
P. pastoris homolog of S. cerevisiae ILV5 (YLR355C)
Acetohydroxyacid reductoisomerase, mitochondrial protein involved in branched-chain amino acid biosynthesis, also required for maintenance of wild-type mitochondrial DNA and found in mitochondrial nucleoids Chr 1-1, 0432
5' region (in bold start and stop codon of previous gene) tatttccagataatcatgacgcatccacctcgttacaatgtacctaaactaaaagacaac gacagcccccttggttgtgcagatcatcatctcctatcaaacagcacacaaaaaactggg taataagtttagaacgagttacaaaatgtcttcctccttttgcaattctaatctacgcgg aatctgtcaccctcttaggtttattctcttacagtacttcccctagaatcccgacaagag ctaaacaaaaacttaggccagaaagcaaagttcccttagcatataatttacctagctttg ttaggctatttcgaacttgattccgttcaatcgcccactccacttcatcttcgacattat cttccatcaattctccttctacagaaacataggctgacccatctaaagaagatctttcag taatgtcttgtttcttttgttgcagtggtgagccattttgacttcgtgaaagtttcttta gaatagttgtttccagaggccaaacattccacccgtagtaaagtgcaagcgtaggaagac caagactggcataaatcaggtataagtgtcgagcactggcaggtgatcttctgaaagttt ctactagcagataagatccagtagtcatgcatatggcaacaatgtaccgtgtggatctaa gaacgcgtcctactaaccttcgcattcgttggtccagtttgttgttatcgatcaacgtga caaggttgtcgattccgcgtaagcatgcatacccaaggacgcctgttgcaattccaagtg agccagttccaacaatctttgtaatattagagcacttcattgtgttgcgcttgaaagtaa aatgcgaacaaattaagagataatctcgaaaccgcgacttcaaacgccaatatgatgtgc ggcacacaataagcgttcatatccgctgggtgactttctcgctttaaaaaattatccgaa aaaattttcttcccttctcttccaaatatcgtctccacaaa
ORF
Downstream (in bold start codon of next gene) gttaaccggcttgtttgattaatgccataagttaactgtccacagaagggacaagagaga tactgtgtatctatgtagaaagcataatgagacctatgtcttgtgaatagactgggtaac aaatttttgacctgtagagtgtgtgaggtggcaagagtatgaaactagtagattgggttc ttcttcgataaccatcttgcgctgctggcgcagggatttgtcacaatacaatctgatgta ctgtgtaacccggggactgaatttgtcgtcaaaaagatatctgtttccatcaggtcacgt gaaagtgtaccatctctatttgaccttttcaggtatataactgtctggttctcccaccat tgtccgtaagtttttaatcgtaacttgcaatgtcaattccaaaaccagaaggttctacac tcagattgggttctacagctcctgatttcaaggccgaaacctctcgtggaccaatctctt tccatgacttcattggagattcctgggtagtgttgttctcacatcctgatgattttaccc ctgtctgcacaacagaactgggagctttcgctaaactgcaaccagagtttgagaagcggg gcgtgaaactgattggcttgtctgccaatactactgattctcatcaagcgtggatcaagg atattgatgaagtaactggctcacatctgacttttcccattattgccgatccagagcgca agatcgctttggcgtatgacatgattgattttcaagatgcctctaatgtggatgataaag gagtacaattcactattcgttccgtcttcattattgatccaaaaaagaaggtcaggttga
tcctgtcttatcctgcctctactggtcgtaacactgctgaagttcttagagtcattgact cattacagactggagatagaaaccgtgtaaccacccctattaattgggttcctggtgacg acgttatcgttcatccatcagtaactaatgaggaagcaaaa
AA
MSVRNATFRLSRAAIAKRTLVRIASQASLRPTTLNSATGVARSQLTFSRGVKTINFGGVEEWHERSDWPREKLL EYFKNDTLALIGYGSQGYGQGLNLRDNGLNVIIGVRKNGASWKAAIEDGWVPGENLFDVTEAIKKGSIIMNLLSD AAQSETWSTVKPLLTEGKTLYFSHGFSPVFSNLTHVEPPSNIDVILAAPKGSGRTVRSLFKEGRGINSSYAVWND VTGKAEEKAIALAVAVGSGYVYQTTFEREVNSDLYGERGCLMGGIHGMFLAQYEVLRENGHTPSEAFNETVEEAT QSLYPLVGKYGMDYMYDACSTTARRGALDWYPIFKDALKPVFEDLYNSVKTGKETQRSLDFNSQPDYREKLEKEL EI IRNMEIWKVGKEVRKLRPENN
SEQ0118
P. pastoris homolog of S. cerevisiae RPS1B (YML063W)
Ribosomal protein 10 (rp10) of the small (40S) subunit; nearly identical to Rps1Ap and has similarity to rat S3a ribosomal protein Chr 4, 0524
5' region (in bold start codon of next gene) ctcctgtgtttactatttctggtatggttgcctttatgtcaaagactttactcagtatga ctttgaatagcttttgtatgtttatagaatggcttgtggatgagaaaataagggaagctg acatagctctggcatatttctttgcttgattggtaatctcgatttgatcctcatctggta gatctatgaatagatcatacttggtgcccaccaaaaacgggatggctgttttattgaaac ctcttgcttgtctataccactccttaatagagtttagagtggattttctggtcaggtcaa acataaatattatcaccacagcgtcagttgtcaccaatggcagcatgttaatgaactctt tctccccacccagatcccaaatggagagcgtgatctgggtatttctgatgttgatgattc tctccatgaaattagctcctagtgtctgtgtgtaattttcatcataacagccttctacgt actttaccattaatgatgttttccctatctgggcatctccaatcaggccaattttgacgg cgactgagttagtttccctttgactatgtggtacattctgttgctctgaatggcttcttc tgtgctggtttcttctcggagagtctggcatctcaatttgcgtcgattaaagactgtcta gagacgatcttagtgtcaactgcgtgtgtgaagtaaacaagataactaccctctccataa ctggagaccgtgcatatggaacctgatcaaatagggaaataatatctaaaccataaacac gtgataccttttagggtgtacatttgttcaagtagtatcgtcgtcactttggtgttcaga tgtgtttcagtttaatgggagctcgcgtactagggtatccaccaattgtggtaagaacac gactcctttttttttcgctccctcactagtggtgtctctctctaaggattgaaaattcgc aaagaaaaaaatcctttactaaacacttcaactacataaag
ORF
GTTGTTTTGGAGACCGTCTAA
Downstream (in bold stop codon of previous gene) tctaagtgacagtacttatgttttatagcgaagtagaataagaaattcgtccataatagg tgaagaaaattgttgtagtgaccttagtattctttgacttccccgagtaggtcaattcgg atgtaggttaaaatgatttttgaggtctcgagttgagggagtctgaaaatttcaaacatg caagtcgaattcaaattgactacacttccatgcgtctacatagccatcggcttgaggtct taccaaattttaatatattatatggcttatggccaattttttttctagttgtgtaaacgc ttctcgtaagagtgcaaattgttaacaccaccttcgagttgggctgatggagttctcagt tcaaatctgacttctccagcatcgacagctgctctgagctcatccacagacttaactcca atgtcttggcaagaatgttgaagaccagattgcaagtatggaatgaatttggtcacagat cctttgtcgacaacggcaccagtaacaccttgggcaaccaaaactttgtcagattcggaa
aagtatctggaggtggaagcgttggcgttggtagcagtctgctccatggcatcaatagaa cccatacctctgtaagttttcagtctcttaccatcacggtaaaagtattcaccgggtgat tcagtggtaccagcaagcattcctcccatcataacacaagaagctccaagggctagagcc ttgacgatgtgaccaatattctgaacaccaccatcggcaatacatggaacaccgaatttg ttagcaaactcagaaacggcgtaaacagcagtaccttgaggtctaccacaggccatgact tcctgggtgatacaaatagatccggaacccataccaattctaaggccatcagctccagca gcgatcaaagaggcagcctgctctctggtgacgacattaccagcaatgatttgcagatta gggtgattctccttgatccatttgatcatgttcaattggaa
AA
MAVGKNKRLSKGKKGLKKKVVDPFQRKEWYDIKAPTTFENRNVGKTLINKSTGLKNAVDGLKGRWEVSLADLQG SEDHSFKKIKLRVDEVQGKNLLTNFHGFDFTSDKVRSLVRKWQSLVEANVTVKTADDYVLRIFCIAFTRRHQNQI KKTTYAQSSQLSAIRKKMVEIMQREVSNVTLAQLTSKLIPEVIGREIEKAAEGIYPLQNVHIRKVKVLKQAKFDL GALLALHGESVTEENGKKVGSEFQDWLETV
SEQ0119
P. pastoris homolog of S. cerevisiae POR1 (YNL055C)
Mitochondrial porin (voltage-dependent anion channel), outer membrane protein required for the maintenance of mitochondrial osmotic stability and mitochondrial membrane permeability; phosphorylated Chr 2-2, 0392
5' region (in bold stop codon of previous gene) ttatgccatcgtgaacacaaatatcacaggtgcatttggtgctatttcctggtgtctatt agactggcgtttggagcgccgtttcagtactgttgctctatgctccggtgccatttcggg cctcgtggcagcaactccagcctcaggtatcattcctctttgggccagtgttattcttgg tattgtatcaggagtggtttgtaactacgcaaccaagattaaagtcatttgtcgagtcga tgattccatggatgttctagcagagcacggtatcgctggtgttattggtctcgtcttcaa cgcattatttgggtcggctactgtcattggttatgatggccttaccgagcacgaaggtgg ttggatagaccacaactggaaacagttgtacaaacagattgcattcatttttgcttgtat tggatactcgatggccatcaccgctcttatctgtttcatcctcaaccgtattccattttt gcaactgcgagcttcagaagaggctgaggagaaaggtatggatgaggatcagattggaga gttcgcttatgactacgtggaagtacgtcgtgattttttggcttggggatcaggcccaaa caatggcttcaaggagccggaagttctggatcaggtagttccggttaatgatttcagcag tgaccagaatgtgactaatgagaccaacgaatctgagaagcagtagagtaaatatagaga tgatatttagtgtattctaatgcttatgtaatgtattaagcaaaaagttgtgtttatgag ttagcatttgtcttagcaaacataaaattatgtcgacatttgcaacccgcatgtctagtg tttttagatcgatcttcgatgtgtagaataatagcctccacgtgatgccccgcgattttg ttgggtctcaatgcctccaacataaacccatcacgtataaaaagccctcttaaccctccc ccctgtttcgtttgcttcatcacttaacctgaactatcaaa
ORF
GGTGTATCTCTTTCTTTCTCTGCTTAA
Downstream (in bold start and stop codon of next gene) gcgttctaggtagcaagttttttaaagatgaaaaattagtaatatgatgagtactcgtat attgctgctatgtctagcgtacttctgattaccccactcggacgaactctggtttggtgt tcttgtcgatcagtaaatcggttttgaatgtctcgatgtactccacatcgccctcccctt ttcccccggcaaaacgtccatactcatcaaatatgccagaaatacaccacccctgaagca
gttttctcatgatacccactaatacccccaccctatgtttgcccttattcgaatgaatta acattgggtagttttctttgtgaacaattagcttcagagcctgctgaataaccgagtcat ccttgaacataaacggctctacacagctttgcatatttaggtaatgaaaatctatacctt gatcgcggagccaccggtaatagtcatagttgtccgtcttgtcacccaaatatatgattg tttttaggttcagtttttgtaaatacggaaaattgaatgcttgaggatgacctgacctgt agataccatcctctactatagcaaaattgattgggggaacaaattcaggacaccgatcta aaattttgagagaagtgaactcatcttgaggctcctcgtattgttctaattcaccagctg tggaatcatagattctcagetggcggattagetgatcaccttgagagtacatataacctg acggtgatttgaaggaaggaacttactaaattcgcgatgttctactatagcatcgatgga atttatggttcgctagttggttgaacaaatgagtaaacacattcccgaaataatgagtat tgtacttaaatacccactaccgtcctggcatatcccaaatgcctacgctaaccaccaatg atgcttttcccctttgccaggttcctactaacctgtctttcagcatgacatagctttcac taatagccctcaaatacgtagatcagatcgacccgtagttg
AA
MA\7PAFSDISKASNDVLGKDFYHLTPVSLDVKTVAANGVTFTAKGKSAGDKLSGNLETKYADKKNGLTLTQGWNT ANALDTKVELADTLTPGLKAEVVGSVVPDKKKDAKLNLTYAHQAFTARTFLDLLKGPTVNADFTAGKDGVTLGGT
ASYDINAASVTKYAFAVGYKAPDYSISLSALDNVFLFΞAGYYHKVSPLVEVGGKATYDSKSS IANPVALEVATKY
QVDSTAFVKAKIADSGIASFAYSQDLRKGVKLGLGAAIDVLKLNEATHBLGVSLSFSA
SEQ0120
P. pastoris homolog of S. cerevisiae RPS23B (YPR123W)
Ribosomal protein 28 (rp28) of the small (40S) ribosomal subunit, required for translational accuracy, nearly identical to Rps23Ap and similar to E. coli S12 and rat S23 ribosomal proteins; deletion of both RPS23A and RPS23B is lethal Chr 4, 0348
5' region (in bold start codon of previous gene) tttgaactgagatttgctgaactggatattcttaatgtctgtaaaatctattgtggtcaa ggttacattattgtccattccaacggagaccaagtatccatgactgtcatcaatggggag atatttcatgtcggagagaagtctaccatgcaactgaaacgactgtatttcttttgttgc tacgtttaaaatggatagttgtccatccatacccccacagagtagcagaggagaactgtc aatgattaataccactttggagatggacttgagaactttttgaaaaccagctagcttaac accattgagatcaaaaacctgcagttcattcgatgaggtggagatcacaatacacgaaga gggttgatctgaaagcttcaagttcccaacagtagttgaaactactaaatcagagcccaa cggaaccaattcttcacctatcactggatttggcaatttccatacaggcaaagctacact attattgagcgcctgaggatcgtcctcgtttgcctttttcaactctaagaacttgtatct gtcttccacgatgtcgtagagacgctcctccaacaaatccctattgtcagagaaagttcg attgagttcaagtttgaacgcagccaagctctccttaagccctctttcagaaagatactg tgcgatcaacgactcgacaatttcattgatagacatacttttcacaatcagacctaaatg ggatctcttgtgctgtatctcatggatagataggtcatgattaagtaatctaacatacag atctcaggcacgtgatactgtagtagctgctacgctgctacatctctacatccttacaaa taaattcattgaattactttaccacgtgatgcgcattagggcaatgtcactgctctccct aaagagttaacaccacctatcggttcgcgacatggatcctctcactgcccaaaaatttca aatttctctgttcaaagttctattatcacacaacagcaaaa
ORF
Downstream (in bold stop codon of next gene) tggttttgtagattatcagtatttacaatacaaatattagtactatacttattgcggaag ccctcttccttcaatggagattcctcctttcttggccgacgacttgcgtttggtccccat agtgttgagccttgagtttaatctcttgtattggtcctctccgctcagggatgatcccac tccggaaggcagagtccttctggctgaactcatcagcagctcggtattgacttgaggagc gatggtgtaagcagctgttcttgcttgggggatggccgcacgctttcttctttcagctgc
ctcttcgtaaagggttttggcctcctggggtctgacactgacccttccaggcaaggattc cgcaaactgggcaggatcaaattgctgggtcctaacacgcatgtctttgtaattatccga tgctcccattatgggtctggtccaatcactgatcaatttctcggcagttctcttaaggtt gggttccactctcttggatttctggtagaataccatcactttgcctaaaccactttctct caaatgagatgtcttgatgggaagtgacttgatagcgtcaaatagcactttctgtatttg gtaagctggcaaggatgcatctggtaatggttccagccatagtcttactgaggccaataa gttgttgtccagtatagaatcggccaagttggctctcaaaagtatgtcctttaccttggg aagtagtttcaatttgaaaatggcaggttttccctgttcaatgttgtttgcatcacggat ggcggattcttccatttgtaatttaagttgttgtataagctcatcctgcatctgctctag atccacgtcatcgttcttctttctcttgtttggcttccttttgattgcagcatccagttt ctcttcgaacaatcttgttcttctggcggcttcgtcttccacgacatcaggctcgtattg agtttctctggcggtttgttgctttggtcttcttctggtgg
AA
MGKGKPRGLNAARKLRVHRRNNRWAD LQ YKSKLLGTAFKSSPFGGSSHAKGIVLEKIGIASKQPNSAIRK CVRVQ LIKNGKKVTAFVPNDGCLNFVDENDEVL LAGFGKRGKA KGDIPGVRFKY^VKVSGVSLLSLWKEKRE KPRS
SEQ0121
P. pastoris homolog of S. cerevisiae RPL13B (YMR142C)
Protein component of the large (60S) ribosomal subunit, nearly identical to Rpl13Ap; not essential for viability; has similarity to rat L13 ribosomal protein Chr 4, 0413
5' region (in bold start codon of next gene) tatgactggaaaatcttcttcaactcgttcttggtagactcgtcaacgaacttttgagtg taagcgatcaagcccttagcaagagcttgtctgatggcgtaaacttgggaaacgtgacca ccaccggtaaccttaattctgatgtcaatgttggcaaatttgtccaaaccaactaataat agtggctcgtaaactttgtgtctcaagatctcaggttgcaccagagtgataggtgaaccg ttaaccttgataagaccggaaccctgtttaacgtgggcaacggcagtagcggacttcttc ttaccaaaggtctagaaatggatgttagttagattttcttcaatagtgctcccatctgat agaaatgagtcctaccaatggaaagagcgataatacttgggaagtgaacggcgatacatt ctgaacaccgaatccccttaatgatgaacttacctgaacactttgtacggacttagcttc ttgagacattgttactcttacaaagaacaagttttttgttagcaatttttttcacaat.ee gagagagagcgtgagcgatcaggtagtgtgtcgcacaataagagttaacgttttcagagg tttgactgggagggtgaaacagttagtgtgtagccattgactgttagtgcacgacttgaa ataactgtttgcaaagaaatctaaatctcgaatttgtgactgctgtaactgaactaggaa ggccagacctgtaaatacacgcgatttatctcccaaaactagagcgatcacgtgcattat aagactaaattgctttgtgaaaaatagtccacctggggaaccgagcatcgtccgcccatc tccaaagaagacccgcgcatactctcccgccagcacttttcccttcagattttggttacg aagcccctgctcgaaatctttctcgatctgtgccttcactcaccttttacaaattttcac caagttaaaaaaagcttccaggttaactacaacaatcaaag
ORF (intron in bold) ATGGgtatgttctttaaatcacccaagcactatgctctcgtagtgctagcaaataccgtttccgacaaatgaagt actaaccaagttatagCTATCTCTAAAAATTTACCACTTTTGAAGAACCACTTCAGAAAGCACTGGCAAGAACGT
AAATAG
Downstream (in bold start and stop codon of previous genes) acgctttgtagtgtatattgaatccgggagtaatgtaacacatagtaggctagtggtcta catcgtagaagttcgtttggtggattgtacaaacttttcaaccaaatggagtattggcca gatatgcttggacagtccagatatcttaagctccctacttctagatgtacatcgttacag cttagactttacaaggagtgatcgatcgatcctagacttgtcgtcaagataaaacgaaaa gattaaagaatcaaaaatttctgccacttaggcataggcaaggtacatcatttataattc
aaaccatgtgtattcgtatttagaatgtaactagtcgaatagaccgaagcccatgtcgtc gtcggactcttccttggcctcttcctccttctcctcagcagcctcctcagcatcagcaga agaagcggcagcaccaccagcagcagcaccggaggcaggggcagaggaaatgttaaagaa caaatccttcaagttcttaccgtccaaagccttggaaaaaaggtcagcgtaaatcttgtc aaccttaacgtcagccttggtggtcaaagccaacaagttgtcggaggagatctcaatctc agaatcagccaaaataagggcggcgtatgataaagcggtttcggtagccatcttcgttta atagagataatcggtttaagcagatattgcggactaatcccttgaagaagagagaaggac ccactggtttttttttccatgttcgaaaagttacgtggtgtaatgatgtattataatcac gtgcgtgactctggggccgatgcccttcgtcgttgcacttcgtctcgattctccataaga ctaagacgaagagcctcctcaatcatacgctcttctagttcatgtatctcgtcgttttgt ctagctgctcgtagtgctctcctttgactgttagatatatgctcctcctctgttatcagg gaagtagcatgaattgctgatgcggtagccgatcgccgagc
AA
MAISKNLPLLKNHFRKHWQERVRVHLDQAGKRTSRRNARAAKAVNLAPRPLDSLRPIVRCPTVKHMRKVRAGRGF TIEELKAVGLNANYARTVGIAVDTRRKNRSEEGFELNVQRLKEYQSKLY7VFFSRAEAKDAKQISLGASFPVEQPA IEVGPRAVWPEKTAFETLKQARLDQKYAGQVAKRAKEAAEAEANKK
SEQ0122
P. pastoris homolog of S. cerevisiae RPL12B (YDR418W)
Protein component of the large (60S) ribosomal subunit, nearly identical to Rpl12Ap; rpl12a rpl12b double mutant exhibits slow growth and slow translation; has similarity to E. coli L11 and rat L12 ribosomal proteins Chr 1-1, 0189
5' region (in bold stop codon of next gene) agcattttcagaactttgcggatgaaatagctgagggtagtctacggtttaggacggaaa gggagcgagtgcaagatatttgttttcgttgtagctctgctgttcgccaattaacaactg aagttaaaactgcagcatcccagctgaaaaagattcatccatctcggggtcgtttcaagg cagattacgaaaggggagtagaccaaatactccgatccagtattggcgtggctgtatttt ccactaaagttaacaagaaaattaaagatgctgagtttgtaactcaatgcctgaaggaga atccaagcaatcaagcatacgatgtttcattgagcagagcccaataccatcggtttttcc gcctgcggagattaaacatatgaggaagtgatagttatgtagtggtctttcttgtacagc tttggagtatgatagtgaatggccatgccagcccccaagtggagtctccgcccagtgtat ttctttgactacggatcaaatcgttaagcagagcgaggaatcggttgactttgagccttt gtacaatagttattatttaatgaaaggttggataagagtctaccaagatcccttttgagc acgacaattttgactaacgactttgaacacccaaccacctggaaaacaggtggttaaagc catcggtcagtcggtcttccatccatgttccttgaaccagtacgaactatacgagctcat cttatcaccaaaattctggtcacctggatttgtgactgtgaaattctaacacttagcgta atgatctgaccggactcgatacttcacatgtaaactttcaaagctccttcgcgaatcata tttccagcctcactctccttccaacaagtaagtctccctggtagttgtcttttcagtgcc gtgagaaaaacctccccccgatttcctcgcgaagaatattctgaaaatgaaaaaatcacc atccatttttttctttcctagtaactcctcacagaaacaag
ORF
GTCATCGAGGCTATCGATGCTGGTGAGATTGATATCCCAGAAAACTAA
Downstream (in bold start codon of previous gene) atgtgtaataattgagatatttaaacggttttgatgaaagatggtttgacttccagctgg tttttgatatttgatttgactttcattttttttccattttttctccacctgacccgtttg ccaacactctggtcccattttttactctgtactgggttccaattttacaagcaaactaga taaaaagcttatgattattacggagtttgtttagctggaagtttcgttaggggggtatca ctattgtggtactttctggagctgtcagtatgatgaggcctcgtcgtggttgttagctga
cattgtatttaagttttgttcaggcagaaaattggaaaaaaaatcaaacaaataatcaaa gtctccagctattaggtagtaagccagtaggttgtaagttgtagtgtcttctctcacctt tttgctaggaagtagtcttaggttaggacgaagtctagtgagcgttaccatcgctcacga ggttttctagcacgtatgaatggtgaccacggtccatgataccagtttcattcgttatag accttataatgagttcgtcactgcttctgacaggttaaccaggcttctaactaaagcacc ctggccatccgttaaacttttcagctcatcacccaaaagtcgtgcgcgggctacattata atattttttttttgtatttctaaggctcatcaggtatactagctacagagtcttttgaac atcaatcacgctacaacatagagtgattcagaccacagagcaagcacagcatttccacta agactgcattttattcttacacaattttttctcttgcatttgattgtctggttctagatt tcactatggataatcctataaggttagacctacaagactcattgaacaagttcaaagagt acttcaacaagctggatttaccacttccggctccattcttgttagaaaatgagagcgtgc cccaaacgttgaatacgctggccatgttgtcactaaatgac
AA
MPPKFDPSEVKF IYLRAVGGEIGASAALAPKIGPLGLSPKKIGEDIAKATTKFKGIKVTVQLRIQNRQAVASWP SASSLVITALKEPPRDRKKEKNVKHSGNIPLEEI IDIARQMRDKSFGKNLAΞVTKE ILGTAQSVGCRVDQKNPHD VIEAIDAGEIDIPEN
P. pastoris homolog of S. cerevisiae PGI1 (YBR196C)
Glycolytic enzyme phosphoglucose isomerase, catalyzes the interconversion of glucose-6-phosphate and fructose-6-phosphate; required for cell cycle progression and completion of the gluconeogenic events of sporulation Chr 3, 0456 See SEQ0061
5' region (in bold start and stop codon of previous genes) agttgagataagaaggatgaatctgtgtagttatccgcataatcctgtttcaaccataat aacttcttccatggatgacgagccatttgagtggactcggcgcaaaagtgacaacgtttc tagagggctactcgaaagcactttctctggagattagtaatcgatggtcaggatgggaac cgaatgtctttggctggcctttccccaaatttgagaacaaaattcaattgctgaggtaat acttttttccccatacaagaaaggccatcgtgtaattatctcttaccttctactattcga gaagttgtaaccaaatggctgataacatcgaaaccatccgtcctggaatcaactaccagg acaaactattgaccgaaatagacattcttgatgatgttaaacaactgtcaacgcagatgg aacaggaagggaaaggattcgtcttccctaaagatcattactccaagataaaatcactaa aggggttgcaagtgaagctgttgaaagaaatgcaagagttggctgaaattcagcagtcta ggcatcaagacgaaactgcatacaggcagaagcttcaagacttggaagatactacatcag aacttagaaacgttgctttgacgtaataggtcacctgtgttactagatgtcggatgtctg tgattgcagtaacacccggcattcggtcagctgtctacctttcaggatatcctcccccca ccgagctcggacgcctttctcgcccacccaccacgcaaagttccgcccaccttatacttt gtggaagtcttttcgccgcgcaccacacttgcccgttcgcagaaaaagaaattccaggta gcccttattccgtcaggcaataaatatataagcgattgcagacaatctgacgtcccatcc ccttgtctatttaaacctcctaggttgcttaaatttaaaatctattcgagtcccaacctt tccattcttccggaataattcaactccaaccaattgataaa
ORF
GACGGCAAGAACGTCGCCCCAGAAGTTGATAGTGTTTTGCAACATATGAAGGAGTTCTCTACGCAGGTTCGCGAT
GTCACTCCATTTGACCAATATATGCACAGATTCCCTGCCTACTTACAACAATTGTCCATGGAATCCAACGGTAAA
CACAACCCTGTTGCAAACAACGCTCACCAAATCTTGTTGGCATCTAACTTCTTGGCTCAAGCCGAGTCTTTATTG
CAATTGATCAAGAAGTTCAAAGCTTGGGCTTAA
Downstream (in bold start codon of next gene) ggcgttctagttatagagatgtatattgtaatatggtatatgagtactaaagaagtgatt atctaataaaaatttgagtaatgcactgacattttcactatcagttttggaagagggtat qqqacqcqttcattcacacqtctttqqtaattatgqccqaaqaaqttqaccccqttaqtc tgggtgaccagacgattctccgctataaaatatggaagaagaactctccatacttgtatg attatttccaaagcaagtctctgctgtggccctctttatctgttgagtttttgccagaca ttgaacgaaatgacgaagatgagttcgattaccaaaggcttatttttggaacatttacgt cgggagccagcaatgagtttttgaactttgggatgtttagtagacacaacgaagtctctt tgagagaqtcactqaqqaactctctgqacaattttqacagcqtcaaaqgaqaaatatcac cactagtattaccatcttccaaagactccaaaaactctaatcgcagctgcgaaaagttga gcatcatccaacgaatagcacataatggagaagttaataaatgcaaatatcttcctcaaa atcccgacatcatagcgacaattaataattatgggagtgtttcgatttttgatcgaacaa aacatccttctcaaccactaagcggcacaattaaaccagatatttactgtacatatcata aggatgaaggttcctgtttgagttggaatcctagcgttgaaggggaactgttgtcaggct caatggacggaacggttgtattatgggatatcaaaaagtacacgagggacaaagattctc ttgatccatacaagatattcattgctcatgacaatggctgcaatgaccttaaattcatcc ctagacacacatcaatttttggttctgttggagaagatggcttttttaaactttgggata ccagacagggactggatcccgttaagtcaacacgacttcat
AA
MPSLLQEDNATFKLASELPAFEELKELYKSKGKNFSAKQAFQKDPARSSKFSHTFKNFDGTEVFFDFSKNLIDDE ILAKLFDLARQANVEKLRNEMFAGEHINVTEDRAVFHVALRNRANRPMYVDGKNVAPEVDSVLQHMKEFSTQVRD GTWKGYTGKQITDVVNIGIGGSDLGPVMVTEALKPYAQEGLHVHFVSNVDGTHIAETLKYLDPESTLFLIASKTF TTAETIRNANTAKDWFLSKTGNKSEAIAKHFAALSTNAEEVAKFGIDTKNMFGFENIWGGRYSVWSAIGLSVAIY IGFDNFEDFLKGAEAVDRHFLETPLEQNIPVIGGLLSVWYTNFFGSQTHLVTPFDQYMHRFPAYLQQLSMESNGK SVTKGNVFANYSTGPVVFGEPTTNAQHSFFQLVHQGTHLIPADFILAAKSHNPVANNAHQILLASMFLAQAESLL LGKTEEEVAAAGATGGLIPHKVFSGNRPTTSILTQKITPATLGSLIAYYEHVTFTEGAIWNINSFDQWGVELGKV LAKAVQKDLQDDSANVEESHDSSTAQLIKKFKAWA
SEQ0123
P. pastoris homolog of S. cerevisiae RPS25A (YGR027C)
Protein component of the small (40S) ribosomal subunit; nearly identical to Rps25Bp and has similarity to rat S25 ribosomal protein Chr 2-2, 0326
5' region (in bold stop codon of previous gene) agagtttttcaatgatttcaagtaatccagatactttgtgagaatgggcgtgaaatcagc acttggatcgctagacctggcttgctcgatagcggcaacaaactgattgtttaacgctac aatcttgtgcgtgttatcatcagccaaagaaggtgcggaattggtgctgctcactgtagg tacaggtqgatgcqaaaactttattcctgtqgactqtttagaaqacaacatacctctqqg ttttaagattttccttttggataagacttccgctgaagccctatgcaatcccgatgactg ttcgtccgatgtgtcatcgtcgtgcccgagagtttccctttgaatttgggaaccagctct cctctttcctgacatttactcttaattgaggtttatttgatttgataaatgtttatagga ggtttatgttcgcgctcaggtgtcaacaaatgaaaagggttaagccctgcacaagcatca tcagaaaggcaaaacgccgtcctattttgtcacatgactttaaaaaactggcaggacttt gatattttgggacgttcacactaccttgcgttggctcccgcaccctagagaaaaagagaa aaaatcagaaccaacttttcttagtacacaggttaaagaacaggtaagttaacaagttta gtgtgctgaaaaaagattatctatttgaaatttgaaatagtatctgctacttgctgagcc atgtacaqgqagacaqaaatgqatagaqaqtcaaacgacctcatqgacgaatattagttg taatagaataaggtactcaagattcagcatggcaattacccaactctcaagtgaatttaa aacttgatctcaagatttatgaaatagtccgtttccctgttgtctgtgctttgttgctag
atattcattttctttcccgttttcgaatgagtgcttttttttctctttcactatcatcac tgctttccgaggattatttcagtcatactaacccttttagc
TACACTAGAGCTACTGCTTCCGAGTAA
Downstream (in bold stop codon of next gene) attagtgtacatctgataatatagtactaccacgtatgataatgtagagaatagtcttcc ttgtcgagtgtgtttgcagttttcttgagtttcaaggtttaaatgctggtatattagttc atcgaaggtttcagccaatagcaccttaaatcaatcaaactaattcgactcttacgaaag agcctactgtgtttagtatcgaagtcgtttacctttcatgttgaatagcttcctctctga ccctaacatttcaagatcctcctaaagttacccggattgtgaaattctaatgatccacct gcccaatgcattttttctttattcagtttaccttttttacctaatatacgagcttgttaa agtaagtggcactgcaatactaggcttattgttgatattatgatgaatcgttttcacaaa cttgatttcctgtgaactcaccatgtactaaggaaaaaaacatgcatcaccatctgaata tttgacgattagtccaaccaattagtataccccggttgtcgttactaacatcatagccag actcagtgatcacaatgttcgcaagtatgtaggcattgtgttccgactgaaagtttaatg atgaaaaaaaaatttcagttctgcttctgagactaatgaagataactaaattttaccaaa accactgaaaactttcattaggaaccgaacctattagtcgtatgtagagaagtatgctaa tttctaccaagtttcccattcagccaccagacccacgccttccaggaatctttgaacttg ccattgaaagaaccttctttggggctccgacccccttactgacaccgcctgagattcctc ctccaacttttgatactccagaaatcactgaattaaccagatgttccttactagacatat gagcggaatttattgttttttcttttgtgacgggggctgaagtttcagcagtcttgtttt cttcctcgacgaatgcagcgtcgtagtcctgatttcttttc
AA
MPPKIQQSKAAKAAAAMAGSKKSKKKWSKGKVKDKAQHAVILEQDKYDRIMKEVPTYRYVSVSVLVDRLKIGGSM ARVALRQLENDGVIKPVLLHSKQQIYTRATASE
SEQ0124
P. pastoris homolog of S. cerevisiae ASC1 (YMR116C)
G-protein beta subunit and guanine nucleotide dissociation inhibitor for Gpa2p; ortholog of RACK1 that inhibits translation; core component of the small (40S) ribosomal subunit; represses Gcn4p in the absence of amino acid starvation Chr 1-4, 0063
5' region (in bold stop codon of next gene) taaagctcgttatcctccttacacaagaaaaaatagagcgactctggagctcttagataa aggtgtggtttggtttgaccgttccggtaacgtcccttcaagaagatcgtatttaacgaa agttattggatactgcggattgatggttacaattgctgctgcattaattggtaaacttgt tttttaaacttttattgtttcgtctgtatcatctgttaagagatctgcatctttgataaa ctgtttgagcttatcattggcaagattggttgcctttttgttgatatttatgggaataat ggtaatatctgaatggtcataattggtcccagattgctcttttttttgaagagaatctcg atggtataccccgcttatctggttgtctagtaacttgatttcctttctcaaatgtatcgt agcatttgataagttcccataaaagtcgtgagtgtaatttttgaattcggacttgatctc tgctaatttcggcaggtcttcatttctagaggccaactcttcaccttctcggagtttttt ggtggcattggccaaattttttagaatttcagctatctttgaatctatgtcatgtaagga atcaagacgttgcttgatgtaagatagatgatcttctgacatgaacgcaagttagtaaaa cactgattaccgaaaagtggtaattgttagctggatcttgtcttttcggtaggtgataca atgacgtttacgttgatatgaacagtctcctctactatctgttgaaacacgtctacccca cggctcataagtaattcgctaaccttgttaaggggcatcgcgcacttctgagttcgaggt tcgaaaacaaagtcacataactgcctatttagtcgaactatgtcatagctcttctttctt aacatccacacaccagccactcgttctgctcgatcgttgaaaaaaaatagatagtacata cggatattttattctcttagtgcaatagaatagttcataaa
ORF (introns in bold)
ATGGCTGACAACAGAGAAGTTCTTGTTTTGAGAGGTACCTTGGAAGgtaggttattttaatggagtcgacaaata cactttctgctgcaattttcggcgtgaaccacaacgtatactaactttttttcagGACACAGTGGATGGGTCACT
GCCTCTGCTTCTAGAGATAAGACTATTAAACTTTGGAACATTGTTGGTGAAGAGGTTGCCACTTTGGAGGGTCAC ACCGACTGGGTTACTTCTGTCAGATTTGTTCCTGGATCAGCTTCTGACATTATTTCTGCTTCATCTGACAAGTTG GTTAAGgtatgtttttttcttttgcatggttttttcaagcttttcaatttttcaatcccgtcttcatttatcgcc aattccttctgttcacatgatgttaaaacaatattgctacttcagacagtgatctacactggatgagacttttta caccaagatctctgatataaacaaaggaaatggatggcaattcaaaatcgggtttgaacttgaaatattgattat ttagatatgctcttactaactcactatcttagTCCTGGGATTTGAACGAAATGAAATTGACCGCAGACTTTGTTG
AGCCAATGGATGAGCTGAAACCTGAATTTGTTTCCACTTCCACTAAGGCTAGAGACCCACACTGTATTTCCTTAG CAAGTTCGTAA
Downstream (in bold stop codon of previous gene) gttaacacacgttcatattaatatactctgtggtttttatatctattcttctgattcttg tagctgttcgattgctctaagggcttcttcagcatcttcaacaaaaccaagatcagcctc taaaccccgagagcttgtcaatgcttcgttaatcaatcttcgagccgtttgagcccgacc aaatcgtagattgactagccccaatccatacttactgatcgtagctgcctgtttgagatt atagtcactttgtctttgtttctcaaggattttctcctcggcagcctgcggcccatcttt gcttataactagcatgttgtttgatgatcgaaatctggaggagttattgttaactattgc agaaaatatcgtttgggcagcattcaaagaacttttagcagccgcttcacaaacttgcaa ctcttctttcaatttggagctttcctcagaggtgagctgaaaatcatcgccttgtcccaa cttttttttgatgttttcactattaaactcagcaatagcggattgaagataaagaagaga accaatagaactctgagccaaataaatatccccaataggttcccccatcgccattaagta ctgaacaacatcttgctttactttaattgctcctccaaaatctcccatacgcagcaaaag ggaagaaaatacttccattgcgtctaacatgttatctacaaacatgctagcagcttttcg atttctccaaatacgtgaaatcatttccacactttcaagttccgtagggttttggccctg tgaagctatcaactggttcatattctcttgaagagctcccccctggctttgaatggtcat ggcggcggcttctaagctaagattaaacaatgttttgaggtcctctaattcactcggact ggcatcctgcgacaactccgtagccttcaccaaagacatcaaacagtgttttattgcttt aggtttgtccattagttcttttttattagttgggttaacca
AA
MADNREVLVLRGTLEGHSGWVTSLATCPTNPDILLSGSRDKTLIWJQLNGDDQAYGVPKKALKGHSHIVSDCKLS LDSEFALSSSWDKTVRLWNLKTGEIIKYAGHTSDVLSVDLSQNLRVIASASRDKTIKLWNIVGEEVATLEGHTDW VTSVRFVPGSASDI ISASSDKLVKSWDLNEMKLTADFVGHTGYVTAITVSPDGSLCASAGKDGSI ILWDFSVKKV LYTLDAKEEVHAVAFSPNRFWLCAATSKSIKIFHLEKRKPMDELKPEFVSTSTKARDPHCISLAWSADGQNLFSG YTDNVIRVWQVMTSSS
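In the ASC1 entry above the ORF is annotated with its introns, which appear in lowercase in this text while the exons are in uppercase. The following Python sketch shows one way to remove the intron from the first printed segment and translate the result. It is a sketch under assumptions: the helper names `splice` and `translate` are hypothetical, and letter case is assumed to be a reliable exon/intron marker in this extraction.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def splice(annotated: str) -> str:
    """Keep exon bases (uppercase); drop intron bases (lowercase) and whitespace."""
    return "".join(ch for ch in annotated if ch.isupper())


def translate(cds: str) -> str:
    """Translate complete codons of an in-frame sequence up to the first stop."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First segment of the ASC1 ORF as printed above (exon, intron, start of next exon).
first_segment = (
    "ATGGCTGACAACAGAGAAGTTCTTGTTTTGAGAGGTACCTTGGAAGgtaggttattttaatggagtcgacaaata "
    "cactttctgctgcaattttcggcgtgaaccacaacgtatactaactttttttcagGACACAGTGGATGGGTCACT"
)

coding = splice(first_segment)
print(coding)
print(translate(coding))
```

The translated residues can then be compared with the start of the AA sequence given for the same entry.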
5 P. pastoris homologs of promoters used for expression of proteins in S. cerevisiae
SEQ0125
P. pastoris homolog of Saccharomyces cerevisiae GPD1 (YDL022W) and
GPD2 (YOL059W) (glycerol-3-phosphate dehydrogenase)
NAD-dependent glycerol-3-phosphate dehydrogenase, key enzyme of glycerol synthesis, essential for growth under osmotic stress; expression regulated by high-osmolarity glycerol response pathway; homolog of Gpd2p chr 2-2, 0111
5' region (in bold stop codon next gene)
caaagagttgggtgttcctttagacaaatgttttataagtgaaacttccacccaaagtgt tcccaatacgtcggcaactgcagcatctgcagcctccgatttgaatggtatggcagtgaa gaatgcttgtgacaaactcaatgaaaggctatctcctgtgaaagaaaagctaggagactc ggccacatgggaggatattattcgaacagcttatcttgaccgaatttctctctctgcaac tggattctacaaaactccaaagattggatatgtgttcggggatccaaaccccaagcctgc atttttctactatacacagggctcagcaatcagtgtagtcgaagtcgatactttgactgg agattggtcatgcctaagctcccatattaaaatggaccttgggaggccgataaaccatgc aatcgacacatatcaaataacgggcgcatatatgcagggagtcggactttgcacaatgga gcaaagtctctggctgcgtaacaatgggcgcctctttacaaccgggcctggcgcttacaa agtgcctggatttagagatttgcctcaaaaattccacgtttctattctcaaagaccgcga gttcaagcatctagacacaatatggagatccaaaggcataggtgaaccgcctctgttctt gggattcagcgttcatttcgcactacgtgatgctatagcaactgctagaaggagtcaagg catcgaggaagggtgcaatggcctacccttcagaagtccattgaccacagaacgtatcag aaccatgatggctgatcctattcttttggcggctcaagtgcccgccgaaggcaatgagtg gttcattgaggcctagaagtgttgtgattattttatttatttaatttatttttgcctttc aattttttatttttcggctgccaccctagcgcgcgctcgcactaaaggctacttctccgt ctccaacccagcgagacggactccaaagttccttctagtaa
ORF
GACTAA
Downstream (in bold start and stop codon previous genes) ttagaagcctgtgtgggttcagcttgcagtactagacctttgaggcaactatcccaaccc ccatgctaaataaacaattgtatactagcgcgaacatcgaactgttaaatgtgattatat ttcatgtaccgtaacctactcgcttaaccacctttgatcatgtcagctcctccagaagct ctacagaaagtatgttgcactttggttcctgcttattactatgtactaacagataggttc tggtggaaatgaaccagcaaatacaaaaatcccaatcggaattggcaatgactcgtattc aacttcaaagaaacgagaccaatcttcgcatactaaaattggcaaactctgaggtctcca attctaaacaagatcacgtttgggaatctgttggaaagatatttctcaagacatcggttc aagattattcatccagtttgaagaacgaaacaacttecatcaatgatgccgtgaaggctc taaagaccaaagaaacatacttacaaactacattggaaaatacagtacaggcaattaata agtatgtgggaaaaccctcctcctaattaggtcaatctaaactgcactatatccacctct atttgagcttccaaagattttacctgccaactctctcaagaatccagggatccttgatac gacttcaacaacggcatcgacaaactcactgaaagcctcattagaagatccccggttgct gacattcaaccatgcagtgccaattatgtacgcagcaaagccgatcactccgagtataaa aaaccatgtaaaccatccccaagacccgtcagaatcagagtcatgttttccgtccttatc cttgtccttgtttttgtcgtcttttctacctttgttatcagacctacatgcagctttgga ctttattcgtaactccaattgttgacggttccagccaactaattctaaaaggtctttgtc ataatctccgtctttactacactcaaaaacgatgtcagcag
AA
MYLTSTVRALPVHFFRSRHCIRTMSNIVEKKQSEFSRAEPFRVAVIGSGNWGTTVAKIIAENTNERPEEFVRDVN MWVYEEDIEGRKLTDIINEEHENVKYLPGVKLPKNLHATPDLLAVASPADILVFNVPHQFLSRILQQLKGKIKPT ARAISCLKGLNVEKNSCQLLSTQIENELNIHCGVLSGANLAPEIARECWSETTIAYVKPKDYRGKYDDVTPFTIK HLFHRPAYFHVQVIEDIAGASLGGALKNVIALAVGFVDGLNWGDNAKGAI IRLGLREMIHFGHTYFPGAKSYTLT
CESAGAADLITSCAGGRNFKVGREIAATGKPAEQVEAELLNGQSAQGI ITAAEVYEFLSSKGDLNSYPLLITVYL ILKGERSAECIPEYFNKTEDVKHWED
SEQ0126
P. pastoris homolog of Saccharomyces cerevisiae ADH1 (YOL086C) and
ADH2 (YMR303C) (alcohol dehydrogenase)
Alcohol dehydrogenase, fermentative isozyme active as homo- or heterotetramers; required for the reduction of acetaldehyde to ethanol, the last step in the glycolytic pathway chr 2-1, 0472
5' region (in italic TATAAA signal) cgcagcgttttctgacggtactagaggactcttaggggaaggtagaatcaataaagatca tattaggtaagcaaattttggatggaataggagactaggtgtggatgcgcgatctcgcca aattgcacgaccagagtggatgccggatggtggtaaaccgtttcttcctttttaccaccc aagtgcgagtgaaacaccccatggctgctctccgattgcccctctacaggcataagggtg tgactttgtgggcttgaattttacaccccctccaacttttctcgcatcaattgatcctgt taccaatattgcatgcccggaggagacttgccccctaatttcgcggcgtcgtcccggatc gcagggtgagactgtagagaccccacatagtgacaatgattatgtaagaagaggggggtg attcggccggctatcgaactctaacaactaggggggtgaacaatgcccagcagtcctccc cactctttgacaaatcagtatcaccgattaacaccccaaatcttattctcaacggtccct catccttgcacccctctttggacaaatggcagttagcattggtgcactgactgactgccc aaccttaaacccaaatttcttagaaggggcccatctagttagcgaggggtgaaaaattcc tccatcggagatgtattgaccgtaagttgctgcttaaaaaaaatcagttcagatagcgag acttttttgatttcgcaacgggagtgcctgttccattcgattgcaattctcaccccttct gcccagtcctgccaattgcccatgaatctgctaatttcgttgattcccacccccctttcc aactccacaaattgtccaatctcgttttccatttgggagaatctgcatgtcgactacata aagcgaccggtgtccgaaaagatctgtgtagttttcaacattttgtgctccccccgctgt ttgaaaacgggggtgagcgctctccggggtgcgaattcgtgcccaattcctttcaccctg cctattgtagacgtcaacccgcatctggtgcgaatatagcgcacccccaatgatcacacc aacaattggtccacccctccccaatctctaatattcacaattcacctcactataaatacc cctgtcctgctcccaaattcttttttccttcttccatcagctactagcttttatcttatt tactttacgaaa
ORF
TAA
Downstream (in bold stop codon next gene) gccgaatagtttgtatacgtcttatgtaatgagtttcaatgaattacttatttttacctc tcctttttggctcaattcaactagcctctgtagcaatctgtttgcgaagaagaacttatc taatttttcatgggttttccccacgtttttgaaaagactttgtcttaactctctgtgatc aaaccgttgtggtggaccgctgattgattgtgattcttcttccagatctagcgactctgg cagataagagtccccggagtcaataattaccccaagaatataccttgaaacctccgagtt ggccagtatttgagataacagcagctccttgtatttgtaagtgccttcgttatcaattga tccgatgtccagcattattttgatatcaggtctcccaatcttgacaaacttttctacaat
gtgtttttgaagcagttgcacaagaggagcaatatcttcaaggtatgagtcttgctggtc gccgttcaattttaaaataagaaatgtatctgagttccctgtcgtcccaataactgctac actccccaactttgtgacagagagaaaatgatgctcttgtgaagtggaatatatggcgtc aatgctactttgtaatttatggttgaacgcattctcagtatctccataaacttgaaagcc cacggggtaagagacgccagatgctaattctctatgcaattgcgactctgtgtaagtgtg actcaaaagtccaatacagtagagatcactcacatactgaggggtcagtgtgtcggagag ttcacctaccaatggacagtactgggctaattcggttagcaagactcgacatatcgggat cccatactgaatctcataggacattattgcctgatcattgtctccgtttccgtactcgct caaattggttctcattgtcactagcaagtcttttcgatctgtagacgtcaaggggttgaa taatggagttatctctttttccaaaacaccgtccatttctc
AA
MSPT IPTTQKAVIFETNGGPLEYKDIPVPKPKSNELLINVKYSGVCHTDLHAWKGDWPLDNKLPLVGGHEGAGVV VAYGENVTGWE IGD YAGIKWLNGSCLNCEYCIQGAESSCAKADLSGFTHDGΞFQQYATADATQAARIPKEADLAE VAPI LCAGITVYKALKTADLRIGQWVAI SGAGGGLGSLAVQYAKALGLRVLGIDGGADKGEFVKSLGAEVFVDFT KTKDWAEVQKLTNGGPHGVINVSVSPHAINQSVQYVRTLGKWLVGLPSGAWNSDVFWHVLKS IE IKGSYVGN REDSAEAI DLFTRGLVKAPIKI IGLSELAKVYEQMEAGAI IGRYWDTSK
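The 5' region of the entry above is annotated as containing a TATAAA signal, but the italic marking is not preserved in this text. The following Python sketch shows how such a signal can be located by a plain string search. The function name `find_motif` is an assumption of this sketch, and the fragment is a short stretch copied from the end of the 5' region above, not the full sequence.

```python
def find_motif(sequence: str, motif: str) -> list[int]:
    """Return 0-based start positions of every occurrence of motif in sequence."""
    seq = "".join(sequence.split()).upper()  # ignore layout whitespace and case
    motif = motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)  # allow overlapping matches
    return positions


# Short stretch from the end of the ADH 5' region printed above.
fragment = "caattcacctcactataaatacc cctgtcctgctcccaaattc"

print(find_motif(fragment, "TATAAA"))
```

Positions are reported relative to the start of the fragment passed in, so the offset of the fragment within the full 5' region has to be added to obtain coordinates relative to the ORF.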
SEQ0127
P. pastoris homolog of Saccharomyces cerevisiae PHO5 (YBR093C)
Repressible acid phosphatase (1 of 3) that also mediates extracellular nucleotide-derived phosphate hydrolysis; secretory pathway derived cell surface glycoprotein; induced by phosphate starvation and coordinately regulated by PHO4 and PHO2 chromosome 2-1, 0103
P. pastoris promoter sequence partially known, ORF sequence known (Pho1 PPU28658)
5' region (in bold start codon of previous gene) (in italic TATAAA signal) gaactagcagtttccactattgtgatgtccattttatcctccgtgtcctcttttctagca tgttgtgttcctacttggtcttttttatggagcccgtgatggggcaccttggagttgtta ctcacgtccgtcaggacgtttcttttcagttcctgtgtcttagaccgcgtatgacgtata gagtagacattttcgttaggctgaacattagacatagctcttctagcacttgttcttagt agaggatcttgccagctacaagtaaaaggggatagttaagggaagaaaacgcggatttgt atatagggctagctttgtttcaactgaattgatcggattgggaaaaacaacaacaaaaaa gggatggtgtttggatgctacttgaaatctgcttttaacgtgttgcaacatctctggtgt tgtgatgttggatcgagttagaggaaaggaaaaaagaaacaaaaaaaaaaggcgaaaagg gaaaagagaacaggcaaaagccgtgctatggatgtcaaggacgatggtcttccatcggtt cacaagcccatggatctttgttcccaaatgggaaaagtaaggcttctgagtgtttttcca tcccggatccctattgttacttttgcttaacattccaatattcttcaacggttaattgat taacactgtaacctctgcccatgtgcttcatccaaatctggtaatctgctttctatttct gccaaaatagttaatctatgagacatgtgccctccaattgcgcagtagatcgagtggaag tcttctttgcgtaacactcaaagtatacccctgttagtctttattcacctgttgctgctt ggtgcagttaccattattgtttccacttgaaaagcttgtttttttttgatagcacagaaa cgtgggctccgataagctaaacttcaacgagaatataaaagctgaaaagattcttgtcaa gaacttgtacaacgaccaataagtctttcaaggcatcagac
CCTAAAAATGCTTCCTACCCACTTGAACTATCATTCTTCTGGGATGACTTGTCATAA
Downstream (in bold stop codon next gene) aaatggtaaggaatgttttgcatcagatacgagttcaaaacgattaagaagagaatgctt cttttttttgtttctatccaattggactattttcgtttattttaaatagcgtacaacttt aactagatgatatcttcttcttcaaacgataccacttctctcatactaggtggaggttca atggatcctacaaactccaacgatacaaatccaacaccaacgataaccagtaaaaaccct atcaggccgcggaaaatcgaacctccactgatcaacactcccaaaaagatgtagaagcca cctcttcccaaaaatgacaacaaaaacgatagatacggcgtaactttaggtaccaattcg ggctttatttctagagctgcaatcacacctccaaatgcaattataaacagactcaagaga aactaaaacttgggtcagtttggcgggcttgaaaaagatgtcacataatacttacagcat tgagaccatttagcaattgtagtgctcccgcaaaaatctgtagaagaaccaaagttagta attttccctgaatgacagttctgcttcttatcttgcttaccgtggcccctgcaacaaata gattggcggctttgaatagagaattgtaattttgttccgtttccatctctctataagctt caacttttactgctaggttaggtgaccaaaaataaacaccccaaaaatggtagtgtcccg gtgaaacaccaagggcctagacgttgaaactcagttatcagatgaaattcaactatataa ctaactgagtataagtcttctatgttaactcttagtagctaaatcactataaagttctgg aattgaaaacggaagtcttctcagctcaaaacctaaacaaaaaaaataataaacattctt taggctcaaattcttgtcacctcaagccccgctccgttcacctttctcaaaaatatattc ccgatacatatcttgtatagtttcacaaacactaatcaatc
AA
MFSPILSLEI ILALATLQSVFAVELQHVLGVNDRPYPQRTDDQYNILRHLGGLGPYIGYNGWGIAAESEIESCTI DQAHLLMRHGERYPSTNVGKQLEALYQKLLDADVEVPTGPLSFFQDYDYFVSDAAWYEQETTKGFYSGLNTAFDF GTTLRERYDHLINTSEEGKKLSVWAGSQERWDTAKYFAQGFMKSNYTDMVEWALEEEKSQGLNSLTARISCPN YNSHIYKDGDFPNDIAEREADRLNTLSPGFNITADDIPTIALYCGFELNVRGESSFCDVLSREALLYTAYLRDLG WYYNVGNGNPLGKTIGYVYANATRQLLENTEADPRDYPLYFSFSHDTDLLQVFTSLGLFNVTDLPLDQIQFQTSF KSTE IVPMGARLLTERLLCTVEGEEKYYVRTILNDAVFPLSDCSSGPGFSCPLNDYVSRLEALNEDSDFAENCGV PKNASYPLELSFFWDDLS
SEQ0128
P. pastoris homolog of Saccharomyces cerevisiae ECM17 (YJR137C)
Sulfite reductase beta subunit, involved in amino acid biosynthesis, transcription repressed by methionine Chr 3, 1084
5' region (in bold start codon next gene) cagttgttctttaagctgtgtcaatgtattctgataatctctcagagtaacattatattt agtacgatcacttgtagaaacgttttgcagctctaaattgagtctctctaacaggtcata agagtcgtctaagttttcttctacaagccttatctgccttgttctgtcaacagaattgga cgccgatatgttagccagatttcttttagcttcagctaatgtcatctgaatatcagacga gtatgtttcaaataatctggacatagtcagcaggtattaaactggagtattgaataagag gctaaattgtttggttctttggttggctacgcaataatggcaatatgtttggcactcagt atatattagtggtgtaggggtgtatacaatcttgcagtaaagtctacagctatcgccaaa gaaaagagaagattacataatgactcaaccgatgaaaactactaggtccattcccattac tagcagaaaaaagctcaaaagttcattgtttacagtaattacaagggttcttcagattcg tatgtttgttctcgcctattttctcgttgctttattgacttaggacttccatgatctatc agaagttcactttgaaggtataatcacctatgattatcggctagcgagggcgagatcata gggtcctatcatcttgcctcgcgtagctgcgtccatcatatgcaagtctaggaaagcatt tacactttttaatttcaagggcattctgaatttggctcagaccaaacttaacaggtgatc agagaaaactttaatcccaaagcccgaatgtagcctatagggcgtcttcgctgacataag
cgaatggaaatctggagtttgtacatcgaagtgtggctcactatcacgtgtgatagagta actcaacagtatggagtatgaaaatgtttcctcatatattggaaatgttatcccaaagtt gaagccgaaatactatcctacacactttggactaatacgaa
ORF
TTCGCAGAGAAACTCTCAATTTATAACAACTTTGAGTATTCAGGAGCTAAGAATCCTTTGGATGTCGTTGTTTTG
AACAGAGCAAAGGCTCAACAAAGGTTGAAAGCCAAGGCAGCGTATGATGAGCTATTGGAAAGTTTGACCGGTGCA
GACTATTTTGGATGGGTTACAGATGAGATCGGTTTACATCACTATACTTGTTTCATTGAAAACGGACGAATTGAA
Downstream (in bold start codon previous gene) acattatactacttatttattatttattgtttccctgagtcatttatttgctgtgggatt cgtgtaattcaacaatcctctttccgcttcagtgacctgtaacaccttagctactgagat cctctgctctctttcaagtgccagcacctgtttaatctttacctcccaaagctttttcag aaccctatctagacgatgattgagtttaatatcgacttctcttaacattgactctttttg aagctgaagtttcctagttctctctgccactacgccctgttagtaatgttacagttgaca aaaccatttgatttctggatagtgacaataaacatttgtagtgcaaacttaccttgagct tccattttccgtataaccaactctctttgatgtttagtgtacctcatgcttccaagggtt gaatttgttgtgttgaaaggtacagttgagggcatcaaaaacgatacatccatacatcag gttattccctcgagaagtctggctatctacttgccacccttaagaccccaatatctctca aatcacatccgacgatggtattggaagctacaatgataattctggacaattcagagttta tgcgtaatggagactatcttccatcgcgtttttctgctcaattagactcggttgatttta tttttcatgcaaagaccaattctaacccagaaaatactgtcggcctcatgtcgatgggtg gtgatggacctcaagtactcacaactctcacagccgattttggtcgtatattggcgggta tacatgatacaaagatttccaaaggtatacatttcagtacaggcattcaagttgctctct tagccctgaaacacagacagaacaaggtccaccaccagcgcatcatagtttttgtgggct cacccatcgatgaggatgaaaaagaactagaaaagctagctaagcgactgaagaaaaata gtgttgccattgaccttattaactttggagaacatgaagta
AA
MTGSVGEIVNVLTHVPNRGHVYTTKLSKGLSNNFSKALDDASIDYQSLVSNNDPFALVIETVNKNSFTTVLTDAE TLLFSVPFLSEVRSGDKLIVHARLPKNDYSIVSALKDLDFWLLSSSVQETQDIALAGYKFLTQNNKPVLHFFNE DYSEPIVDQLSPEWSSYLKSEQEPFAEKLSIYNNFEYSGAKNPLDVWLLGDSGISPPTHTIDYGI INIRVFRP
FDIESFLNVLPFSTQRLAIVEQSPKKPFAQFQNLLLDFVGEISLLLERKIFTWASQLASPDLLSLQMLVSNLRL SSPVQNEVFGDSSEFQNSSLHLRKYITGAQALEDAYLKILKQLHSDGNLSILNQQDTETTGSSTPEFGFGRFLFE EEQRQSLVIQVHSI IKNSAVVNEELIQLLTKWLVSAKΞNSKFLDTDRLLQLLKENSALPEIESLLQLESSLHLKS SWLVGSDSWAYDLGNSGIHNVLSSHKNLNMLIIDSEPYGKVNANGLKKKDIGLYAMNFGDVYVASVSVYSSYTQV LQAFLEAEKHDGPSWLAYLPYHSETDSALDVLKESKIAVETGAWPLYRYNPAAQSENDIFKLDSSVIKRELQSF LDRENKLSILAKKSPSFARNLASQNNRAKAQQRLKAKAAYDELLESLTGASLTVAFASDGGNAESLAKRLARRGA GRGLKTTVLSMDDLSLEDFALEENIVFITSTSGQGEFPQNGKAFWDGIKNSADLDLASVKVSVFGLGDSLYWPRK QDAHYYNKPATDLWKRLLFLGAQQLAPLGLGDDQASDGFQTGYEEWEPALWISLGVDGAPTLDEPPPITNEDMKR ESDFLRGTIAAGLQDTSTGAISASDQQMTKFHGIYMQDDRDVRDERKAQGLEPAYAFMARVRLPGGNVTPRQWIK LDELADVRGNGTMKITTRATFQLHGLVKHDLKAAIRGMNSVLMDTLAACGDVNRNVMCSALPGNAKVHAQVAKAA SDISEHLLPQTTAYYEIWLEGEDEGDSSPKDRLVWETRKEGPKKKKVMVAGNSLVDVEPLYGNDSLYLPRKFKIV ITVPPYNDVDVYAHDVGLIAIVDESDWIGFNVLAGGGMGTTHNNKKTYPRTGSMMGYCDYEDVNTVCEKIMLVQ RDNGDRTNRKHARLKYTVDDMGVDVFRSAVEELWGKKFQAAKPFEIKSNIDYFGWVTDEIGLHHYTCFIENGRIE DTPDSPHKTGLLELAKYMMEKGVGEFRLTGNQHMLISNITDNHLQRVKQFLVDYKLDNTNYSALRLSSAACVAFP TCGLAMAESERYLPVLISKIEESLEEYGLRHDSWMRMTGCPNGCARPWLAEVALVGKAYGAYNLLLGGGYHGQR
6 P. pastoris genes involved in glycosylation
6.1 Nucleotide sugar synthesis and transport
SEQ0129
UDP-GlcNAc transporter
YEA4 (YEL004W) Uridine diphosphate-N-acetylglucosamine (UDP-GlcNAc) transporter required for cell wall chitin synthesis; localized to the ER
S. cerevisiae null mutant: viable; increased budding index; increased transposable element transposition
S. cerevisiae overexpression mutant: increased UDP-D-glucose excretion
Chr1-3_0163
5' region gtttcaagggaaatctgagttttgtggagactcccaactgccgttctaattgttcatatt gctgtaatttttcttttgttgtgatcaatttgttttccttgcaataccggtacaattttt ccgagtttcgaaggtaatggtcgtttcccagctgagactgagctttattgagcctcctta tcgtctcatctaaactgctctttgtgatggtgatactaagtattatgtcatccaaatctt gcatcaaggacttgtcattggacacgacaagcttatgatttttgtgtttatcacaggttt ttgacaatttgcgggtcattgcgatgatatcctgctgtctttctgtgctggcaacgatct cgtcattgatgtctgaagaaaagttctcaaaaacagtgcccaagttgttgtttagttgtg atagctcttcactgaatgaaggaaggatctgggtaagagaaaaatccatctttggtagca gtaaatcttgttttcaggtcactggtcaatggattgagctggcccagtggacaagagaga acatctcaagtaaattgacagcccgcgcgcagtgcccccgttaaacaaagggcagtttta ttacgtaagcagaattggccctttttcacgttaataaagcaaaatgccccccttgcttat cgtcctactagtcacacgacaacaataaaacgggtatattacgtaactacattgatccaa gttctagatgacgatgatcagaattcaagttttaagacaccggtgacagtattcaaaaat aaaacaaaatccttacatacctctcttgccagccttccaaacactaccctttaaactccc ctacccctttgataattttttcaaatttttccagcaatcaccgtgttccgaattaatacc agatagcgttatctattgtcggtaaagttgaagctcgaaataccgtcgaactatatttat ttttctcttcccttccctgttcgaacccaaaacacgtcaag
ORF
GATTGA
Downstream tatcccatcctcttttcataatatctcaaggagcctactttagcataatggtcatttttc atagatagtagacatgtgtttaatcatcgctcaaaaagtgcgtaggagcatttaattcat ctttctaagtaactaacccaatctttttatggatctaaaatgctcttttcttcaatgtgc cgtaaccgttgacgcagaccatacacttacataatcatcacgacgctttcgtaccgctcg catctcgatctcattacctgaaatagctaagcatcgatttccttctcttcccgcaccaac tccgtgcgctctgtatactccgtcaacactgctctaattggtctctgaatgctggtatca tggcctacttcacacttccggaagagtacctcccagacctgcaagacgccgtcaatttct tcatttcatttgccccatttctcagctatggcaccacaatttacggcatctcccaaagcc agtcaagtacaggcttctcgctagatatttgtgccaccatgttgetcagcgcaagtttgc gagtgatgtactacttcaacgagaagtatgaattcaccctgttgaggcagtgcatcgtca tgcttatcatccagatattgctcttgagaacggccttgaagtatagatcaaacacctatt catgtgacaatcttgtacaaatgcccaattacaacgtgatatttgtgcataaaatgcagg aattagtggeceaattatcagagagectgatcaacgactctgaggggataccagagcgcc tggatctgatctggaaggcatgggtcagtacttggcgcctcttttttgttctcatcattt atggcaactttgccaacttgttccttctttttgacgtcagctacgcccgccctttaaact tttggcagtgggaggaacaagagacctattggaagttcttaggttgtttttggactacaa tagtttccttagaattgttattatacaagaatgaacagtat
AA
MIDISPFLMVFTGCCGNVFSLETLVTKTTYSVGTMVTFCQFLYVALVSYLFVMLPGEGGKKTDDVLLAGPKKDWY SSWLPESKVPMKRYYVNIFLFFVTNVLNNYVFVFNIGIPVHWFRSSSVTVTMLIGYCFLGKTYNAKQIWGSWL SIGVI ITMIDNQIAQNEAKGVEYTGNFIDWSAIDLHYMFGIFVMTLANVLTCVMALYTETTYKKYGKSTWI-WNLF
YQHAFALPLFLFVS LNLKREFEHFTSNHGSLL INCLTQFLCVAGVNKTASLYNAVTLS I I LMIRKLVSLL ISCYF FDNSFNAMGYFGISMVWGTALYTMGGRS ISPPLAANSEKLDLESKGNNQD
UDP-Gal
Synthesis:
SEQ0130
GAL10 (YBR019C): UDP-glucose-4-epimerase, catalyzes the interconversion of UDP-galactose and UDP-D-glucose in galactose metabolism; also catalyzes the conversion of alpha-D-glucose or alpha-D-galactose to their beta-anomers
Note: the S. cerevisiae protein has two enzyme activities (in two domains); the P. pastoris protein has only one of the two domains and therefore only one of the two activities (the activity in light grey is missing)
S. cerevisiae null mutant: viable; decreased competitive fitness; decreased utilization of carbon source
Chr4_0839
5' region attgtcagtctgtgagctatatctgtagccctcgtgtaccatgggtaagctttagacatt ttaaaaaggtctgaatatcggttaggtaatttggaaatcagatggaacagattttagtct tcagataaaggttggtacactacaaagtgaagtacttagttcgttaatgggctattcttg gattgagattttgaaagtttgtgttcaattgctaacgactcgtcagcagatggaatgtaa aactcttcatatctgcttagaatcaagactacagtagttttgtttttagattggcttctc gtagaaacgagtttttctttaggtttcgaaggtgaaatcctagttacatctaaacggtct ttcaaagcccgtggttactcaaaaacaagcataagggaaggtattcaagaccaaaaatgc aggacttagcgttgctgaaccagtctttcaactgccccgtttactataagcaaagtgagg caagttaagacgtcaatcaacaaattcctttgaaacttcctatatcactttccttcactt cgtcgtggctccaagaggttaaagttcacaggccatagctccgaatcgatctagaacata tcgtaagatcattaccatcgctctattaggggtagagtactacaagtatgttcaatgcga ctccaatatatattaggcctctgctaacgagggttcaactgagtaaaacattcaacgaaa gccgttaacaaagccagtatttgctagtaggggtttgaaagttaaagcattgaatagatt aattagatagttgggtgccatagattccagctaattgcaccacctggcaggatctatccc gttaaacgaggtgttcctctctgctaaaattgcgaaaattgctaaagaaaacaccctgca tcaagggggttgaatgtcggttaattagacttggatgcacctccactttacataacttac taatctgaaattaatccttctttattctgctttataattcg
ORF
ATTTTGCCAGCCCGAATATTGGATGAATTTATGGAGATAGAAGTGAAAACAACGAATTAA
Downstream tgttattcttgaagatgtcaagatacaaaaatactttatttagattagaagaaaaagtgc ttgaagcatgctaaaataattcgaagctagtccatttctacattaactgaatgtaattgt tgtcttgagatgctacaagtaatatagttaaaagtgacttgatgttgcttttgctttaaa tcttctcatggcacctttatttgactgagtcaagttacaggcttgagggcggttgaccag atcatcttcgcaactagaagtgggatccgtcaattactagaaaattccagttggtgaaga ttcaccaaccagccaacgatcttgaaaactcttttaaaaaagagccttaccaggcaagaa
gttgctatgcttaatggttatactataaagtaactaactatagctagcatcaatgatcta cgcagtaaaacatctctacaccatactacaccaccatgacgaattgatcaattttacgcc atacgtgcaatacacgacgaatgcctaatccagcactctgaggcttgctctcgctagtat gatgacattacttcaattgctagttctggtatacataatatataaagtgtttccgatact cagaagatttttcagtactggaacaatggtgtccgaacaggttttaagcaagactaaagc attgatcaaagatcacaaggtatttgttgcatccaagagttattgcccctactgctccca gactaagaagttgttggagtctctaaacgccaatgcttttgtagtggaactagatactga accagatgggtctgatattcaagccgctcttttagagctaaccggccagagaactgtacc taacgtgttcatcaacggagagcatgttggtggaaacagcgatttgcaagcactgaacag tgaaggaaagttaaaaactcttctcaagaactaaagcctacttcaaaacgctattatggc ctcaaagattaatcttcggctctactgattagagaaggata
AA
MNGYVLVTGGAGYIGSHTWELLNNDYLVIVIDNLSNSSYHVIKRIETLTGKSVTFFNIDLREESKLRKVFQDYT IDSVLHFAALKAVGESAKSPLQYYDNNVSGTISLLNVMKQNNVKKLVYSSSATVYGDATRYEDMIPIPEICPWP TNPYGMSKVIVENIIRDIFASDNGWKSAILRYFNPIGAHPSGLIGEDPFGTPNNLLPYLAQVAIGRRDKLFVFGN DYTSKDGTPIRDYLHIVDLAQGHIAALKYLGTRQEGICREWNLGTGNGSTVLEVYEAFCSAVGRKLPYEIVGRRE GDVLNLTALPTRANEELGWRTKLTIEDACIDLWNWTTNNPNGYRIEEQKSDWLGKKDSFSKKLCRAGGVNTSEGP ILPARILDEFMEIEVKTTN
Transport:
SEQ0131
HUT1 (YPL244C): Protein with a role in UDP-galactose transport to the Golgi lumen, has similarity to human UDP-galactose transporter UGTrel1, exhibits a genetic interaction with S. cerevisiae ERO1
S. cerevisiae null mutant: viable
S. cerevisiae overexpression mutant: increased UDP-D-glucose excretion
Chr2-1_0692
5' region ccaagaagtactgaagagtctgcaatcgttgatcgagcgcttgggtcagctgacccctca tgaacttcaaattgtggataaactgtgcctagacgtcatagaaggaagagaaatatccaa aaaggaagttgaccagccgtggtacaaagaactgatagctgttgtggaatattccaacag atgatgatgaaaacgtagtgggaagttgaaccaagttaaggagattgtttttctacgtta ttggagttcccagacgatttgctgtggtgtgaataacagagcaaaatgcttctcatagcc aagttttctatcttcgttgaagacgagcttgtgatccaaagttaatgtggtaccggcctt taccactgcagtcttattatgtatgtacaagtatattgatttcatttccgaaacatgtaa cttttttctaatacctccttcgcctggttaattttcattgctaaatacggactccctcct ttatctggatgatttatcagcatacattgcctatgtcgtttctttagtttctcatgcgtt aaagtcacaatttctgactggggaatccccaatataagcagcgcctcactttcagttatg ggattaaaaaatcctccaggatactgggcaaactttagtttcattacgtcgttttgagtg tgtggttttgataactgtatcccgttcaattgggctatcatagtgggggttaatttgacg tacctataataagctctcaacccacttctaaatgtcaaagctgccacggtaagtcccaag ccagcgattattgggagaaccatagctggagtcggagatggctggtaataatattatgaa gaaagtaaggtagtatcacttggtctcaacctctactctaagaatgcaaacaatcgatat agatagttcgcgagaaccttattattttcaggtaacgcactggacaggtacacttcatca ggtatttcacgggtttctctccaaatttaagcttgcttctc
ORF
Downstream agagaaacttagatttttaatagattagagaattacataatttacatcaaatagcattaa ttctggtgggatgacagttgcaacaaagatttagcagctcccatggatgatggagtaaaa atgtccctggtggttgggctggagaaaagcttgttctctctctttagcttcacgggagta caatacctgttgaaagatgatgattcaggcaattgaatagtggtgtcattacaaatgatg gttgaattagagttggaattgcaggcgggcaactgtgacttctttgagatcggggtggaa gaatttgcccgttctagactatcgttatgcatatctaagtcattgtggatgctaatttta cagttggggttccttttgatctggtcgaatgttggtggtggtagatgagccgtaatcttt gatattggaggcaatgtcacattggctgagttctgggactgtatgcgaggtagagcagat ttcaaagagcttctcttggtgccagcaatgtccgctagttttcttctttttgaaggatta gacatctcatccttaagggaatcatctctcttaaagccaacttgactactgttgatcagg tcgtcgacattcagtgataattggttggtttgaattttgaaaactgccagctgtagtcgg gttttcaatctgtttatcttgtcattgacatcactagactcgtcttcaagttgtaacgct tttatgtgactcttgtctggaaaggtgagaccgttgcttttgagctgctgatcaggatgg tcagcacccaatctatgaatgtttagctgtgtctctgatatagttcccagtactgctcta cccattagttatgctatgttttgtcccttagtttactagaactgttgttattgtctactt cgtcagctatgtatgtattattttttgtcaacctcaatatttacgttttactcttttcga ttctacgcggtacatgcaatgcgatgtctggcgcgtccagc
AA
MKTLTLCVAGIYASFILWGfLQERLSEPYNGRPFKAPLIVNAIQSFFAMIVGSIYLSAKTRKLSTPMDPISQNPK I IYQLALIAVLSSGSSAVGMKSLENVDYLTYLLAFSCKLIPVMIVSVLVYRKRFPVHKYCIALCISAGVILFTLK PKSLSNSIDSSAGNWRGYLCLLISLFLDGLLNSSQDQLFKTFKISGAQLMAALNMLTFILISSYIVLFTDQLPYF VSFVQVSPQLLQNAILYGIAGAIGQIFIFLTLEKFDSIVLTTVTVTRKMFSMVLSVLLFGHSLSLQQQVGIGLVF GGICYESWLKSFGSQVSPKAKLS
SEQ0132
putative UDP-galactose transporter [Pichia stipitis CBS 6054]
chr4_0810
5' region ggatcttgatgtgtttataaaaacgctgtttgttattgtggctgcttgtgacacttattt tagcaacaaaggctatgggatttgaggggatccctgaagtccgaccattcaaaagggact cttgaaatgagccagtcatcgcaagagagcttcttcttctaaaatttatgtcattattgg tattgtgtgacaatgagtctagtttgcgctggcacttgtagcatagtcttttctgtgtgg atgaaacaacgggattgttcaatcccggttcagcaattgtctgagacaacgacgatcggc gtctcggttggaacatgagaaagctctgtggaacaggcatttgggcaatgacgctgaaag atgagtatgaggcaggatttcagtcaatcagctatgatttttcgattgtaggctgtattg agttaagaagaacgtatgataaggggcagatgatatcatattatagcgcgaaatgactaa catcataatcgaccccagaatcactcaagtgatcatttatggttatgtaattatgaaatt gtaatgggaaattgtcaggcaaaagtatatagcccaccgaataggtacgtacactgcata catgggtcaaagttatcttgggccgaacaactctctgaaaagtttaatcttcaagatgaa agtcagattgctcgattagaattgggcagacattgaaatagcctatcactctaatgaaac tggctaattttgccaaacgaaaactcaccataaacactgccgactgacgatcaagttcag tagcacagataatgctcccaactaacattctagaaatacaatctcattaatcagactcat gtttcttaaatgataataatctatttatatagtattgttcttccgataccagtccttaga gcgcgcgcccctctttaatcagctgcctctcctcctgctgaatctgaaacgttatctctc tttcccaccagatcattcgaccatagcaagcctgttattac
ORF (in bold intron)
ATGGTCAACATTAGAgtatgtttcgttctatcgaccccttcatcattcctttcattcaattgtctaactttagAT
AATTGACGTTGTGGACGAACAAATCGAAAGATTCTAG
Downstream tatagaattagtcattataagaagatcattgaattcatccttgtagcgagtttcccgtcc agaactcctccgcccatctcttccctcccatcaagcattcgcgccgatggacagcgtaac tcaacatcactcgaagactcaaagttcaaccagagacacaacaatggacgaacataggct cttcaagttcccagtatttagacttcccaaatccgctacagttagagcaatggggatgta cctttccggggctctctatgcgattggtttttgggttcttttagacgctgccctttactc caaacgggtcaacgcctcattagtccatgtcactttcgttgattgggttccagcaatatg ttcatctctaggtatgctgattattaactcgattgaaaagtctgcattattgaacaatga tgcatcattcatgcaaccatcatctacaacaacctggcaggcccgggtggtgctgtttat cgggttctcactacttgccacaggaattgccggttcatttattgtacttatactcaagtt tttgatcaaaggttataatacatttcccaccttgggtatgggggtcgccaatgtggtatc taatgctagcatcatgttaagttgtatagtgttatgggtggtacagaactttgaggatga cgactacaattactcattggccctgtgaatttagtttgtattaggtagaatggaaggcaa aaattgtgcactaatagacttctcatagtccgtaaagttaaatagagtgtcgctgatttc aagtgcctgtgatgcattggtgggatacactgtgaaaccaggagcgcgcagttgatcgaa gaagcggtgggccattatttgtgagcctgaaaacctggaggcaggctcgtaaaccaaaat atgcatcagtaggtctatcgcatcttgtggtgaattattgaacaccttgttgaaacgggt cgaaactttaggactgaacagtttggcggagtacgtagggc
AA
MVNIRI ΞLLSAAVIVTGCLNSVFTKYQDNQCVRNCQDPDFSKHVYFEQPALQTLQMFVGESGCWWVFLSWLSTV VKYYSKGRKKPLYAPLSEGDSDQTKLPELTGLRSLI LA IPAVCDVCGTTMLNVGLLYTPVSI YQMARGSL ILFVA LF SVFFLNRRISRLEWLSLFWTFGVFLVGLSGALGNKVH ISGDKNTSFAVIFGITLILIAE IF AASQFVSEEYI ISRΞSVDPLRLVGFEGIFGGΞ I TFLVMVIAYSWGRSNPGGPFDIVTSFKDVTANSQVIWSSVAIMIS IGSFNFF GI SVTDELSATARSTIDTCRTLLVWL VSLYLGWESFKWLQLVGFSLLVLGTLTFNGVLKAENWYF IPEWF KKDAP LPGERVIDWDEQIERF
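The relation between the ORF excerpt and the AA listing above can be checked with a short script. The following is a minimal sketch, not part of the original disclosure: it assumes the intron is the lowercase run in the ORF excerpt, that the standard genetic code applies, and that the leading "A" of the second ORF line completes the preceding (not shown) codon. The helper names `splice` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def splice(seq: str) -> str:
    """Drop lowercase (assumed intronic) characters and keep uppercase exon bases."""
    return "".join(ch for ch in seq if ch.isupper())


def translate(cds: str) -> str:
    """Translate complete codons from position 0 and stop at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# 5' end of the ORF as listed (exon 1, intron in lowercase, first bases of exon 2).
orf_start = ("ATGGTCAACATTAGAgtatgtttcgttctatcgaccccttcatcattcctttcattc"
             "aattgtctaactttagAT")
# 3' end of the ORF as listed, with the leading "A" removed (frame assumption).
orf_end = "ATTGACGTTGTGGACGAACAAATCGAAAGATTCTAG"

n_terminus = translate(splice(orf_start))  # "MVNIR"
c_terminus = translate(orf_end)            # "IDVVDEQIERF"
print(n_terminus, c_terminus)
```

The spliced 5' excerpt ends in the incomplete codon "AT", which `translate` ignores, so only the first five residues are produced.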
6.2 O-glycosylation
PMT1 (YDL095W): Protein O-mannosyltransferase, transfers mannose residues from dolichyl phosphate-D-mannose to protein serine/threonine residues; acts in a complex with Pmt2p, can instead interact with Pmt3p in some conditions; target for new antifungals
S. cerevisiae null mutant: viable; decreased protein O-mannosylation; decreased resistance to calcium dichloride and alkaline pH
PMT2 (YAL023C): Protein O-mannosyltransferase, transfers mannose residues from dolichyl phosphate-D-mannose to protein serine/threonine residues; acts in a complex with Pmt1p, can instead interact with Pmt5p in some conditions; target for new antifungals
S. cerevisiae null mutant: viable; decreased resistance to alkaline pH; abnormal budding pattern; abnormal cell shape, decreased cell size; decreased competitive fitness; decreased metal resistance
PMT3 (YOR321W): Protein O-mannosyltransferase, transfers mannose residues from dolichyl phosphate-D-mannose to protein serine/threonine residues; acts in a complex with Pmt5p, can instead interact with Pmt1p in some conditions; target for new antifungals
S. cerevisiae null mutant: viable
PMT4 (YJR143C): Protein O-mannosyltransferase, transfers mannose residues from dolichyl phosphate-D-mannose to protein serine/threonine residues; appears to form homodimers in vivo and does not complex with other Pmt proteins; target for new antifungals
S. cerevisiae overexpression mutant: decreased vegetative growth
PMT5 (YDL093W): Protein O-mannosyltransferase, transfers mannose residues from dolichyl phosphate-D-mannose to protein serine/threonine residues; acts in a complex with Pmt3p, can instead interact with Pmt2p in some conditions; target for new antifungals
S. cerevisiae null mutant: viable
S. cerevisiae overexpression mutant: abnormal cell cycle progression
PMT6 (YGR199W): Protein O-mannosyltransferase, transfers mannose from dolichyl phosphate-D-mannose to protein serine/threonine residues of secretory proteins; reaction is essential for cell wall rigidity; member of a family of mannosyltransferases
S. cerevisiae null mutant: viable
5 genes with homology
SEQ0133 chr2-1_0212 (most likely PMT1 or PMT5)
5' region acgaacaccggaaaatcaaccctaatacgatccatcgttggtaaggattactcaaagaag cagacagagaatggcccgggtgtctctcaccttccttcattcacaagaaaacccatgaag ttcaaaatggacaacaacagtcttgaactcgtagatctccctggatacactgctccaaat ggaggtgtttacaagtatcttaaggaagagaactaccgagacattttgaacgttaaacag ttaaagccattgacatccctcaaggcatacacagaaacgttgccttcgaagccaaaacta ttcaatggtgtgcgagtaatatgcattggtggtttagtgtacattcggcccccaaagggt gtagtgctgaaacagtttagtctcgtcaaccttccatccttcatgtactcgtcgctaaaa aaggccaccagtgtaatccaagcgcccccacaagccttggtgaattgcagcgtcgtcaag gaggacagtccagatgaactggtaagatatgtgatccctccattttatggtttaattgac ctggtcattcaaggtgttggatttatcaagcttctgcccactggagctcggaacaccaga gaactgatagaaatttttgccccaaaagacatccagctcatggtgcgtgattccatcctc aaatacgtctacaagacccatgccgaacacgactcaaccaataatctcctgcataaaaag aacataaaagccagaggccaaaccatactacgaagactacccaaaaagcctgtattcaca aagctttttcccgtaccagccaacgtaccgtctcatgaactgctcaccatggtgacggga aaggacgacctagccgaggaagacaaagaataccgctacgatatccagtatcccaacaga tactgggatgaaaccatctgtaaatagaatgcttatgtaatcaagcactttctgaaattc cttagagtttcgcgtgtctccccgtcaaaaatcgcgtctc
TTCAGATCGACCTTAGAAGGTAATGCTGTTCCAAAACAGTCGCTGGCCAACGTTGGTTTGGGCTCTTTAGTCACT ATCCGTCATTTGAACACCAGAGGTGGTTACTTACACTCTCACAATCATCTTTACGAGGGTGGTTCTGGTCAACAG
ACCGTCGGAGAACCTGACGTTGAGAAGCTGGGAGAGACAGTCTAA
Downstream gctgtgtttatatagccctgtacgtaaaatctatgacacaagtttatggttatttgtctt atgtaagcaatatttggattgatgtctcgagaccatcaactccatcactgataagttgat cggatttgtatttctgtcccctatttactaattccctttccagaaatagatcatgaatga ggcagaatataagtgccaaagatgccggctgccgttgaccatagacggatctctggaaga ccttagcatatcacaggccaatcttttgacgggacgaaatgggaactttacaaagaacac aatccccttggaggatgccgtggaagaagatttacccaaggtgcctcagagccgacttaa cctctttaaagaggtctaccagaagatggatcacgattttaccaatgccagagatgaatt tgttgtgttgaacaagcacaatgataacagcgacgtcaatgtggagtatgattacgaaga aaacaacactatcagtcgtagaatcaacacaatgacgaatatcttcaatatcctcagcaa caagtacgaaattgattttccggtttgctacgaatgcgccacattgctgatggaggaatt gaagaatgagtacgaaagggtcaatgctgataaagaagtttacgcaaagtttctatccaa gcttcgcaaacaggacgcaggtacaaatatgaaagaaagaactgctcaactactggagca attggagaaaactaagcaagaagagagagataaagaaaagaagctccaaggcctatatga tgaaagagatagtttggaaaaggtattagcttctttagagaatgaaatggaacagttgaa tattgaagagcagcaaatttttgaattagagaacaaatatgaatatgagttaatggagtt caagaatgagcaaagcagaatggaagcaatgtatgaggatggtttgacgcaattagataa tttaagaaaagtgaacgtctttaatgacgctttcaatatc
AA
MCQIFLPQNVTRCSVSLLTMSKTSPQEVPENTTELKISKGELRPFIVTSPSPQLSKSRSVTSTKEKLILASLFIF
AMVIRFHNVAHPDSWFDEVHFGGFARKYILGTFFMDVHPPLAKLLFAGVGSLGGYDGEFEFKKIGDEFPENVPY
VLMRYLPSGMGVGTCIMLYLTLRASGCQPIVCALTTALLI IENANVTISRFILLDSPMLFFIASTVYSFKKFQIQ
EPFTFQWYKTLIATGVSLGLAASSKWVGLFTVAWIGLITIWDLWFIIGDLTVSVKKIFGHFITRAVAFLVVPTLI
YLTFFAIHLQVLTKEGDGGAFMSSVFRSTLEGNAVPKQSLANVGLGSLVTIRHLNTRGGYLHSHNHLYEGGSGQQ
QVTLYPHIDSNNQWIVQDYNATEEPTEFVPLKDGVKIRLNHKLTSRRLHSHNLRPPVTEQDWQNEVSAYGHEGFG
GDANDDFVVEIAKDLSTTEEAKENVRAIQTVFRLRHAMTGCYLFSHEVKLPKWAYEQQEVTCATQGIKPLSYWYV
ETNENPFLDKEVDEIVSYPVPTFFQKVAELHARMWKINKGLTDHHVYESSPDSWPFLLRGISYWSKNHSQIYFIG NAVTWWTVTASIALFSVFLVFSILRWQRGFGFSVDPTVFNFNVQMLHYILGWVLHYLPSFLMARQLFLHHYLPSL YFGILALGHVFEIIHSYVFKNKQVVSYSIFVLFFAVALSFFQRYSPLIYAGRWTKDQCNESKILKWDFDCNTFPS HTSQYEIWASPVQTSTPKEGTHSESTVGEPDVEKLGETV
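As a consistency check between the ORF and AA listings of SEQ0133, the 3' end of the ORF can be translated and compared with the C-terminus of the listed protein. The sketch below is illustrative only: it assumes the standard genetic code and that the last ORF line shown above starts in frame. The function names `translate_to_stop` and `tail_agrees` are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_to_stop(dna: str):
    """Translate from position 0; return (peptide, True) if a stop codon is reached."""
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":
            return "".join(peptide), True
        peptide.append(aa)
    return "".join(peptide), False


def tail_agrees(orf_tail: str, protein_tail: str) -> bool:
    """True if the ORF tail ends in a stop codon and encodes the protein's C-terminus."""
    peptide, has_stop = translate_to_stop(orf_tail)
    return has_stop and protein_tail.endswith(peptide)


# Last ORF line of SEQ0133 and the final residues of its AA listing, as given above.
orf_tail = "ACCGTCGGAGAACCTGACGTTGAGAAGCTGGGAGAGACAGTCTAA"
protein_tail = "HTSQYEIWASPVQTSTPKEGTHSESTVGEPDVEKLGETV"

print(translate_to_stop(orf_tail))          # ('TVGEPDVEKLGETV', True)
print(tail_agrees(orf_tail, protein_tail))  # True
```

The same check can be applied to the other entries wherever both the 3' end of the ORF and the end of the AA listing are present.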
SEQ0134 chr2-1_0256 (PMT2 or 3)
5' region gaagcattcttgtcgtcctttcgaaccaatcttgaaccaaatattgccatgtttgggagt aaaactgggatcaaagaaggtaaagaatttaatgggggaaattctaatggtactttgtac tagctgcaaatatcgattgatgagttgttctgagtttccctggataaaatcgaagatgtc gaatatttatatgtatgctggtatgtgttagagtacggctatcccgcaattcgtggtttt tttttttcttttgatagctgccctggctcggcggcccgaatggttcttaggatctaggat gctggttgttctctccccgccaaggcgagcgattcggtgttccatggctttcaaatgata aggcgaacaatcggagaaagccacgctattgctagtgcacaagaggagaccgactgataa ttctctacccacttgaatggcatgggtggtggattaatgtgaggtgattatttgaatgag ttatattttttactatatagttgctttttatagatgtgccacatatacttatgaaagtcg agagtttgatccgcattttccaaagcttaacattttgagaacacgattgattggcagttt ggttctaatttcaagataataatatgcccgggtcgaaattaaatatctaattacaaacag gccctttaaacatcacccgtattttggttttgtaaaagaatgtcaggtgcaaaataaaga
gctataacttatttagaagtacaaaagtacctaaccttgcaaaccctggtctatatacta actactatatactaactagtcccactttaacctttcaatcgttcaaatattatccatccc cagtatcatcctgaatgaaatttcacccgtggcatgcatggcacaaaaacgtcgtgctcc acccaatactcatctcatgttcccccgtcaaatttccctttaacgtgttctttcttcaaa cacggattcaagtcctttcgattcttattttttatcttcg
ORF
TAG
Downstream gtgtatatatcgtcagtacctaaatttatgatagggtaaaaccgacatcttctatctaca attaatgcgcgactcacgctttctcttatctgcttggtcttgtttgacttcatcgttagc gttcgcttctctcttcaggtaaaaccttaccctccatgtcgacctatagagaactggagc gcgaggagggcgacttctctgataagaagcttctcacaggctggttctccttaaccgtgg gacaactacaatatgctgcattcatggccgtaggaatcgcgttgctttggccttggaatt gctttctatcagcttcagacttctttggagagcggttgcaagaacacaagtggctctctg ctaactattcatcgtccatgatgaccatttcaacgttgacctcaacgttatgcaacgtgt ttttgtcccaaaagcaaagtggggtagattattcaaagagactcgtcatgggccaaacaa tcaccatagtcgtatttgcctttatgggtctgctgtgcgtatggaatacggggctagate caataatattttttgtgttggtcatgattaatgttgcactgagttcagtagctgtgtcgt tatcacaggtcggtgctatggcgatcgtgaatgtgttgggaccaatatatgctaatgctg tggttgtggggaatgctgttgctggtgtgctaccgtccatcgctctgattattagcactg ccctgtcgggaactcatgtggctggaaagttgcaacctaaaagagattatgcagttatgg catactttttgactgcctgtgtcgttagtggtattgcattagttctttttgggttggcag agtcacatggcccaatcgacgttgttgctgcacctgttcatacttcttctactgctgatg aggctattgaagaactaggtatccctttagaagaggaggagtatgttcccttttcaactc tgtgggccaaacttcgttttgttgcactcacaatatttac
AA
MTGRVDQKSDQKVKELIEKIDSESTSRVFQEEPVTSILTRYEPYVAPIIFTLLSFFTRMYKIGINNHVVWDEAHF GKFGSYYLRHEFYHDVHPPLGKMLVGLSGYIAGYNGSWDFPSGQEYPDYIDYVKMRLFNATFSALCVPFAYFTMK EIGFDIKTTWLFTLMVLCETSYCTLGKF ILLDSMLLLFTVTTVFTFVRFHNENSKPGNSFSRKWWKWLLLTGIS I GLTCSVKMVGLFVTVLVGIYTVVDLWNKFGDQSISRKKYAAHWLARFIGLIAIP IGVFLLSFRIHFEILSNSGTG DANMSSLFQANLRGSSVGGGPRDVTTLNSKVTIKSQGLGSGLLHSHVQTYPQGSSQQQITTYSHKDANNDWVFQL
TREDSRNAFKEAHYWDGMSVRLVHSNTGRNLHTHQVAAPVSSSEWEVSCYGNETIGDPKDNWIVE IVDQYGDED KLRLHPLTSSFRLKSATLGCYLGTSGAS LPQWGFRQGEWCYKNPFRRDKRTWWNIEDHNNPDLPNPPENFVLPR THFLKDFVQLNLAMMATNNALVPDPDKEDNLASSAWEWPTLHVGIRLCGWGDDNVKYFLIGSPATTWTSSVGIW FLFLLL IYLIKWQRQYVIFPSVQTPLESADTKTVALFDKSDSFNVFLMGGLYPLLGWGLHFAPFVIMSRVTYVHH
SEQ0135 chr4_0777 (PMT6)
5' region cacttgattatctaggaaaggcattatcaaagacgcagcacccccacccgctctacattg ttctctttcgtaggaaccgacaggatcaaatgagtagactgctcctttaccttcttcgtc cagtcccgcaatgatagtatgcacatagtaaggaaagaatcgttttccatatagcaaatg ttgtatgaatcttgccgcagatttaatcgtcaagttctttttatgatcaaaatgatacca ttttagctgattgcggaatctcttgaccaatgcgtctccatctgctgcgaatccattcgc agtgatcagtaattcgtctcctaccgagaacacttttggctcatatctggaattgattga atagcctgtggtatgtctggtatctcctgctaaaattgcaaagtcttctccggcaattcc caaaactgtaccaccgttgtcttgaaattgatgttagtaactattgccattgaaatcatg ttagcagtcttaagaccgaattataatacctaccagagtaaggattaaactggtgttcta tggggacattggatacgcctgaattataatcagacgacactgcacttgtaacactcatta ggaatgtagtcggtgatggtagtggcaaacctctctcaattgatatagagcgagatatct tgaaatattgagtggtacacgggatatgtttaatttctaatgatcgttaagtgattcgaa aaggtgctattccagaacaattggagtgaaaaatggtacctataaatgtatttgcctcat ctattagttatagagattgtctgaagattaattttctgccattcttttggcttgtgtgga aaacatatctgtctggctttaaaatgtggaaaaaaattacataatcaggtgagcgagaag gggtctagcatcgagtatacctcgtcttgcataacgaacaagtggcctaccagactcata cctgtatcttgcattgacaaattgttctctacaatccgag
ORF
AACCAATGGATTGTTCATTACCCCGTCCTCAGCAAGAAGCAAGTTAAAGAAAATGATAACTCAACAGTTCCAGAG
ATGGTTTGA
Downstream ggaagaaggtgtagggcaaatattagacgatctatacagacaaaatataaatacgagttg
acgtatcactgctgccgactcccatcagtaattttttatcttgtaagtttcgactaatgg atattctgttgactctatcaccaggaacgatcttgtaatccagctcattgatcagtacaa attcattgttgttgttgaaagagaacacttttatactaccagactctgtgccaagtgcaa ttacagcttttccatcaatagcagctttgaacgaatccactgccgtaacttgttcatcca acttgagagaagcgagctgaatcgtatcttttttgatttcgtcgtaactccaagccttga tgtgtttatctctagaaacggtgaagaacgtcgaatctgacaattctgaaggtgcccaac aacaatcccaaataatcctcgagtgcgccttttcttgcagcttaagaagctcaaactgtt gcaagtcacttccagttcttttccataaggaaaatcgtctatccctggacacggcgagca agaattgattgtttggagaaaatttcaatctggtaatggtcaaatcgtgaccttcaagta cctggggcaattgctgccacgttctggtgtcaaaaactctgactacagcatgtgcaggtg tatttgacctacaagaagaagctaccaattgtttgtcgggagatgaatccactgccgtga tttcataaccatgtccgtaaagcttctcaatttcgggccacaaagtatgtctttgcaggt gatcttctaatggaggctgatcaagagcggaaacgacttcatatgaaatattcttctcgg attggtcttcctcctcttcgtcattgttgatattttcgtgtgtaacggcagactcgttct ctgctttgttagaaaggcctagaacaggcaacgaagcactctcaggcatttcatcaatat gagtttgtctcatgctacagaatttttctaatagctttgca
AA
MATEEERNELRSRMDANNSKVSTFTTNNSDDPSVDSQGKVKIKSWVWSLESLIGPLVITALAIFLRVYQIGKADR WWDEAHFGKFGSFYLKHQFYFDVHPPLGKLLTGLAGHIAGYNGSFEFKSGVTYPEYLDFKVMRIFNAVFSALCA PVAYWTAKSCGYSLLTVYLISLMVVFENSYWLGKF ILLDSMLLFFTTTTFLGLSKVHSLRQQGKELTYPWCFWL
TFTGLS IGCVCSVKLVGLFVTALVGLYT ILDLAVKRYDENLKWSKYLTHWAVRI LTLI ILPFAI YMLSFKIHFAV LYKNGDGASSMSTLFQSNLEGTKILI DAPRDVAYGSELTIRSQGLSQNLLHSHGS IYPEGSNQQQVTTYGHRDNN NQWIVH YPVLSKKQVKEMDNSTVPEMMKDGDT IRLRHQHTGANLHSHRIQAHVSKQYYEVSCYGNAKVSDGNDEW WEVAEQIHSDDPKYAAANESDLKFQELLHPI STSFRLRHKRIGCYLATTGMAYPSWGFKQGEWCRPSWTSRDK
STWWNIEDHKNKKLPNATSYKAPKSYFWRDFVMLNYAMLASNNALVPDPDKFDKLASQWVJQWPI INVGLRMCGWS ASQSRYFLMSSPFNTWLSTASLAVFCLIVLILVLQWQRQRLNLSSRQYWELVIKGFVPFFGWALHFAPFIVMQRV TYVHHYVPALYFAMFLLGFTVDYLTAKRNCYIKTLIYFVFYAGTIYSFYYFΞPLSFGMDGPLKNYAYLQWFKSWT MV
SEQ0136 chr1-1_0286 (PMT1 or 5)
5' region aattcatttatatccaaattgctgaggcgctgttcaaagaatttaacaacgtttattgca ccccaatcgaaatttggattttctccattgataagtttaaggcaatgttctcaactacgg atactggacctgagactagtttcagaaactatgaatcttgaagaattattcacctccctg cagagactccctgtgttggaacaactgtcgtttcctcggtcttcggtctcttttgaacct acgaagcacatttggcccaagaaactttggtaccttagattaagtggcggtatttccgat gagtttcttatgagatccaaccttccaaagtcagtcacttatctcgagttttcccattgt ccttcgatacacgcagatggaatttatcatgtactggacaaagttggagctaacttgacg catctgtatattcattacccaatgcctgggttgcgaacgaattcgatggatatgatcttc cagtattgccccaatttacagtttctctacattcacattgattacattacaagagaagtg tttgacgaagaattgcttcctgctttggaggcaccaaggcctttgaagactctctggtta gactcttctggcatgttgggtcaatcgtttaaaatctatcccgacgatttaaccattgct ctctctgagggaaggttgccgtgtctgaagaccctaagaatatcccgtaaattaggatgg aacttcaacaacgatgactgtcaggatttggtaaacgaactgcaagaccaggatggagat gtgtatgctgtctgagataaatgtgagggatgaaactccccgagattactatattacgtg tactggattttgagatcaccttttttaacgaattatgacattcttcttattagactgcct agttttgtataatcttacagaaattctagctcaagccctcttacttgttcttcttctatg tcaactgattcctcaatatatgtggttggtggcccgcgaa
ORF
GGTTCGAGACTATTACCAAACTGGGACATCCCTTGCGATCAATTTCAGTAG
Downstream tatatctggcctgtcgttatctccttttgagttttgactcggcggagtgtgtatcggaga acattacaggttttagtatctttgtttctttttgttgttgttaagataacggattaaacg agtactatctgtacggtcaagctttttcgcacttgcggcggactcatgcagtgcgcattt ccaccgccgactcattgggcgtttttttaatgcctactgaaccatgctagttcctgatga ctttaccgtagtgcattataaaggtaaacgagattatcgtggtgtacatatcattaaaaa tataattgctccaatcatggctgaacagaaccccataaggaatgcttaagtccgaaaaag tataatctgatgtggtactattagtagatggtgttatctgatatggctggcgaggctgca ttagatagaataagggtcagagtcatatctacagttaattagatcgtctactatcttgct gcagagccctgaatacattcgctcgatggataggagtatataaaaccttgaggtatccct cattgggctcaattcagtattcttacacatcacaatcaaatgactgaagaaaaaagagag gccttcgataccaccgtaaactcggtgtcttctaatgaccaccaaattggaactggaaaa attgttcttactgaaaaagatgcagacattgctctaggatatgccgccgagggtgttgtg attgatgaagccaccaacaaaagaattctaaggaagattgatctctgcttgggtcctctg atgtgtattttgtacgcctgtcagttcatggataaagtctccaactcctacgcttcagtg atgggtttgagagaagatttgagcatggtaggtgacatgtattcctggacgggaactgca ttctaccttggatacttggcttttgttttccctgcttcttatatgcttcaaagacttcct gttatcaagactgttggaactttcattattatttggggtg
AA
MTPEIFGQTYQRTPHHSTIAQQYMAAFEYKKGIQRPYFFTKPLVKPITLSGFEKIQLALFLAFTVAVRFFNIQYP
NQIVFDEVHFGKYARNYINSSYFMDVHPPLVKMLYAAIGYLGGYRGDFVFNKIGDNYIGKEGEKLVPYVLMRSFP
AICGVLIVILSYFILRYSGCRHFIALFGALLVCIENSLVAQSRFILLDSPLLLFIVLTVYSFVRFSNEPEPFGKG
WIRYLFFTGVSLGLSVSSKWVGIFTIGWLGVMTVNQLWWLIGDLSVPDRDWKHVLYRAYFLIILPVIIYLGVFA
IHFLVLHEASGGSGTVSPRFKASLDGTDFSNLYANVSFGSTVSIRHLGTGEFLHSHNHTYPKSHNQQVTLYGYKD
SNNLFTIEKKDKLSDKELFGEVSFLRHRDVIRLFHKKTQGYLHVSDSRPPISEQEYNNEVSIIGDKDYVPDVNEN
FEVKIIKEYSDEDAKHEVKSIGTVFQLFHKGTKCTLFGHRVKLPKDWGFGQLEVTCIESPVLKNSLWYIEENTHP
LFNQTYPAKVKVEPLGFFGKFLELHQKMWKTNAGLTAΞHKYSSRPEDWPVLDRGVNYFNRSGRTIYLLGNLPIYW
GIVFTIGVFVVFKLVQLWKWKPNHAPTVTDASAKYDSQFFIYFVGWLFHFAPSFLMERQLFLHHYIPSLWFGIIS
IAVLSEYVWAKLGKIVGFFYVMTILGLSGFFFYWYAPIVYGLEWNKDTCLGΞRLLPNWDIPCDQFQ
SEQ0137 chr1-4_0033 (PMT4)
5' region ccttctctgaaaaaaaatcttgctctccgcgtgcaatagaaactagtcggccctgtacaa ttaaagcatactccctggttaaagtacctcctccgaacttgctcttgttgatcaaagttt ctgaccttggggccagtccccagccaccagggccaaacgctttattgaggatacgacgat acttaatctctggaagataaagtagtccatctggtgtgatttcgacatcttcgttgctaa ttggttgacataatatgttactactttcattactgaaggagcaaatacctagtccatgga acgaatccgaccaattgattccatcgccacttgtattagagattggggtgtcgtttaact gtgaagttccaaacaaaattgataaactgctctcgttcttagcttggccactttttggag
tctcaatagtagcgttttggctctcgtgaattttctgcacagagtcggatgaagaaggtg caaatgcttctagcattgtagagtcgaccacatagaacctttttaaagagttatgaaaat aactcttggtagggccaaatacaacccgatatcgtcttagcataagagctgcttctttgg aatatcgtttcttgtaagtaattacgtgttggctaaacacttagaagtcagtcgcgcatg cggccaaaaacagactagggatagaagatgaactgacaaaaacatcaagaaggtgaagac attcattctatgaaaactagtttttatataaaattatggtctgcatttagagagcaatga tgtaatcaaacatcaataagtgcttgtcgcatcaatatttaataggtaatcatggagtat tctagtctaccgccttaaaaaaagctcactcgatctagtgcagcttgattgtgtacttca atagtattccaacgaccttaacatcttaacaccatgtaaatttaagatccacgtatacga tacaatttctttcaatatcaattctcgttcaagccaactg
ORF
ATGATAAAATCAAGAAAGAGATCGAGAAAAGTTTCTTTGAACACTGAAAAGGAGCTGAAA AATAGCCATATTTCTCTTGGAGATGAAAGATGGTACACTGTGGGTCTTCTCTTGGTGACA ATCACAGCTTTCTGTACTCGATTCTATGCTATCAACTATCCAGATGAGGTTGTTTTTGAC GAAGTTCATTTCGGAAAATTTGCTAGCTACTATCTAGAGCGTACTTATTTTTTTGATCTG CACCCTCCGTTTGCCAAGCTCCTGATTGCGTTTGTCGGCTTTTTAGCTGGGTACAATGGT GAGTTCAAGTTTACAACTATTGGTGAATCTTATATCAAAAACGAGGTTCCCTACGTAGTT TACAGATCATTGAGCGCTGTGCAAGGATCTTTAACGGTGCCAATTGTTTATTTGTGTCTC AAAGAATGCGGATATACAGTTTTGACTTGTGTTTTTGGTGCATGTATCATATTGTTTGAT GGGGCCCACGTTGCTGAGACTAGACTAATCTTGCTGGATGCCACGTTGATTTTTTTCGTT TCATTGTCCATCTATAGCTATATCAAATTCACAAAACAAAGATCAGAACCATTCGGCCAA AAGTGGTGGAAGTGGCTGTTCTTTACAGGGGTGTCTTTATCTTGCGTCATAAGTACCAAG TATGTGGGGGTGTTCACCTATCTTACAATAGGCTGTGGTGTCCTGTTTGACTTATGGAGT TTACTGGATTATAAAAAGGGACATTCCTTGGCATATGTTGGTAAACACTTTGCTGCACGA TTTTTCCTTCTAATACTGGTCCCTTTCTTGATATATCTCAATTGGTTTTATGTTCATTTC GCTATTCTAAGCAAGTCTGGCCCAGGAGACAGTTTTATGAGCTCTGAATTCCAGGAGACT CTCGGAGATTCTCCTCTTGCAGCTTTCGCAAAGGAAGTTCACTTTAACGACATAATCACA ATAAAGCATAAAGAGACTGATGCCATGTTGCACTCACACTTGGCAAACTACCCCCTCCGT TACGAGGACGGGAGGGTATCATCTCAAGGTCAACAAGTTACAGCATACTCTGGAGAGGAC CCAAACAATAATTGGCAGATTATTTCTCCCGAAGGACTTACTGGCGTTGTAACTCAGGGC GATGTCGTTAGACTGAGACACGTTGGGACAGATGGCTATCTACTGACGCATGATGTTGCG TCTCCTTTCTATCCAACTAACGAGGAGTTTACTGTAGTGGGACAGGAGAAAGCTACTCAA CGCTGGAACGAAACACTTTTTAGAATTGATCCCTATGACAAGAAGAAAACCCGTCCTTTG AAGTCGAAAGCTTCATTTTTCAAACTCATTCATGTTCCTACGGTTGTGGCCATGTGGACT CATAATGACCAGCTTCTTCCTGATTGGGGTTTCAACCAACAAGAAGTCAATGGTAATAAG AAGCTTGCTGATGAATCAAACTTATGGGTTGTAGACAATATCGTCGATATTGCAGAGGAC GATCCAAGGAAACACTACGTTCCAAAGGAAGTGAAAAATTTGCCATTTTTGACCAAGTGG TTGGAATTACAAAGACTTATGTTTATTCAGAATAACAAGTTGAGCTCAGATCATCCATTT GCGTCTGACCCTATATCTTGGCCTTTTTCACTTAGTGGGGTTTCATTTTGGACAAACAAC GAGTCACGCAAACAGATCTATTTTGTCGGAAATATTCCTGGATGGTGGATGGAGGTTGCA GCATTGGGATCCTTTCTAGGACTCGTGTTTGCAGATCAGTTCACGAGAAGAAGAAACAGT CTTGTTTTGACCAATAGCGCCAGGTCTCGGTTATACAATAATTTGGGGTTCTTCTTTGTA GGCTGGTGTTGTCATTACCTACCCTTTTTCCTAATGAGCCGTCAAAAATTTTTGCACCAT 
TACTTACCTGCACATTTAATAGCAGCCATGTTCACTGCTGGTTTCTTGGAATTTATTTTT ACTGACAACAGAACTGAAGAATTCAAGGATCAGAAAACTTCATGTGAACCTAACTCTAAT TCTTCAAAGCCGAAAGAGCAATTGATTCTGTGGTTAAGTTTCTCGTCCTTTGTCGCTTTG CTACTAAGCATCATTGTTTGGACTTTCTTCTTTTTTGCTCCTCTAACATATGGTAATACT GCGCTTTCGGCGGAGGAGGTTCAGCAGCGACAATGGTTAGATATGAAGCTCCAATTCGCC AAGTAA
Downstream gagtatacaatgtgtagttcaacgcaaaggaaattctaactttctgtgcaatctggtgac aatttctaaataactatcacaattggaagaagagattatcccaaatcttatcaaaaaatc gatgattgccagtgcacaattaggcttgaatttttcttgcagcaacgaagagattacttc agtgatgttcattagcctgaaatcttcactttcgtggtctatcggattaggaattagacc ttgtttcatcggcaggtcgtatatgtattccacttctggttgaataaaatcttcgggtgg tttgtttctgaacatatatgagatggctcccactggactgatatattgcgaaacatagtc ctcattcaaccctgcctcctcgtaacattctttcaggcaagtttgcaaagtgccattagg atattccaagcctcctgccacagtattatctaacataccgggaaatgttggtttgtgtct gcttctcctaggtatccaaagttgaatactgttaggatcggcagaattttgcaaatatcc
attgatatgaactccataagtaacaactcccaaaatattagaaaaagccctttccaccaa catgtacatcttatggttatcgcagtaaactgcaaaaagctcatttctccaaccgctaag ggtttcaaagagacgctgatctctccaacgctgagctatctttgcaaacatctgcgttct tttattttcggtatccagactaggaattatcttgacttcgtgtttttcattatttactat cacagcctgtgtttcgaactcaaattgttttgccaccttgggaattatataccctagtaa gatcccatcatgcgataagaatttatacacagatacttcaaattcatgaaaagatggctc atctttatgaggaacagaatcaacagatctgactagatcaatatatggcattggttgatt ttattcaatggttatctatctcaaacatgctataaaaata
AA
MIKSRKRSRKVSLNTEKELKNSHISLGDERWYTVGLLLVTITAFCTRFYAINYPDEWFD EVHFGKFASYYLERTYFFDLHPPFAKLLIAFVGFLAGYNGEFKFTTIGESYIKNEVPYVX/ YRSLSAVQGSLTVPIVYLCLKECGYTVLTCVFGACI ILFDGAHVAETRLILLDATLIFFV SLSIYSYIKFTKQRSEPFGQKWWKWLFFTGVSLSCVISTKYVGVFTYLTIGCGVLFDLWS LLDYKKGHSLAYVGKHFAARFFLLILVPFLIYLNWFYVHFAILSKSGPGDSFMSSEFQET LGDSPLAAFAKEVHFNDIITIKHKETDAMLHSHLANYPLRYEDGRVSSQGQQVTAYSGED PNNNWQIISPEGLTGVVTQGDVVRLRHVGTDGYLLTHDVASPFYPTNEEFTVVGQEKATQ RWNETLFRIDPYDKKKTRPLKSKASFFKLIHVPTWAMWTHNDQLLPDWGFNQQEVNGNK KLADESNLIWVDNIVDIAEDDPRKHYVPKEVKNLPFLTKWLELQRLMFIQNNKLSSDHPF ASDPISWPFSLSGVSFWTNNESRKQIYFVGNIPGWWMEVAALGSFLGLVFADQFTRRRNS LVLTNSARSRLYNNLGFFFVGWCCHYLPFFLMSRQKFLHHYLPAHLIAAMFTAGFLEFIF TDNRTEEFKDQKTSCEPNSNSSKPKEQLILWLSFSSFVALLLSI IVWTFFFFAPLTYGNT
ALSAEEVQQRQWLDMKLQFAK
6.3 Mannosyltransferases
6.3.1 A-MANNOSYLTRANSFERASES
6.3.1.2. α-1,6 mannosyltransferases
SEQ0138
P. pastoris homolog of S. cerevisiae MNN11 (YJL183W)
Subunit of a Golgi mannosyltransferase complex that also contains Anp1p, Mnn9p, Mnn10p, and Hoc1p, and mediates elongation of the polysaccharide mannan backbone; has homology to Mnn10p
S. cerevisiae null mutant: protein secretion: increased; resistance to Calcofluor White: decreased; chitin deposition: increased; competitive fitness: decreased; freeze-thaw resistance: decreased; metal resistance: decreased; Can1p-GFP distribution: abnormal; viable
Chr 2-2, 0125
5' region tctcgtggcatcaactctatttttcttaattcgatccaaaaacactttctgcttctctgt aaactcgtggataatcttcctgtgtctcaagttgaaaataccgtccccaaataaaggccc tagaagatatcctaaaactgaagcacccatcattccaatggcagatgccataaatgggtc aattccaaagattaattcagttggattcatttcgatttgcgaaatatatccccatgaagc tgaagatgcaaacagagcactgaatatggaggttaccaagttgactcgtcgttggctttt acgtaaagccaagaattgttcccaatttagcggattatctgagtgtgatttttgtggttg tgcctgttgttgcatttgaggaacagacaaagcaaatgatctggtcaacatctgattgac acctaaggatctgacaccgatatttctaattgaaagcattttcggccgtttgctagtttc aagtaggtggaaatagaagggttgacttgatcgtaatcgtgagaaatttgaccttctctg aggtagtgcatctactaaattgcagattcaagatgattggggaccatgaagacataaagg ggttggtggggttgacagaatgataatccaaggatccaaggagacagaggcctaaagcgc tacccaatatgtagtactacttagttttggccttgaacttgacttggctcatgactaatt
tctgggataaaacaagatcgaaaaattgagaatatagggtcttgaatactagcctagttc acaaccattaaaaaaccacctatattggatgatccaaattacaataggtgcaccctgatt gttagttcaagtcgttgaaccactcactatacacaatgtaataagttcgagcttacctga aacctaattcgttatgagcgcgcgcagatgtgttatctcgaaatctaaatatctttaacg gcgtcaagtaaatcaagaaacacgaacatctttttctaagc
ORF
ATATTGGAAAGCTACACTAAGAAGAAGAAAAGCCAATCTTAG
Downstream tttcataatagaacgaaatagatatatactcatacactttgtgcaacagagacctctgtt ttagtagtctcttcagttttgtggttgtcaactgcttggtcctcaggggattggggttcg tcgttaccaagacgctcttctaatacttttaaggctacctgtcgtcttctttctgcaata tctcttgattgagaagttttattgattcgtctaatactcctcaagttgccagattctctc tcatcgtcattgaaccctttgattaggccaaaaaggattagcaaatgatagaatatgcga cacagtggtttcaaatatttgtgaagaggagcgggaaagaactgcatgaaattgaacgta tcagacgcatcaccttgaattgaatttttggtgttgtctggtagaacatcgttgataacg ttgagtctttggtaaaatcgtaggtaagtccatgaaactaggaaactattccaaatcggt agaaatggagttaacttctgcaaaacgatcgaaataaacagcgacaaaaccagtacaata aaaggaatctgttttaccctaaggtttaggaatttcacttgatgctcaggagagtactgc ttgagtacgactagaaaaggcatcaaaaatacaaaacttccatgattgacaggcaccaaa ggatcgtaactactgctagcagtagccacaatgaccttgatcagacaagtgaataaattg gtaaaaaatgctgacactagcaaatacgtgatggtttcactaaagccagcactgaaaaat tgtataccgtcaacattctcctctaaattccactgtttctctgtgtaagcagttccaaaa aagatattagcagcacctatgaggaattgaatgatattaacatcagtaaatgcaccagtg ataaacgtccatgggaaaataaaactcgtctttggaactagttccaacatagggataact atatcccttacgctttcaccactgttccccaaacttttaga
AA
MYLGRSEKDAFKPKKNRILVPRSLAIYLVISVLLTNFLVFQLFNINLIPFLNGRSSKNNFDYLVKFEFLNLEEPY RGFYKEEIENPTNYFFPTIEHASRLNEMGLDNLFQYTVDKQTKVHQYMLDPELKYEDQPKGNELDFVRREFLSNG MKVHRGANSPELVIVTGIDFETFDSSYLGNITQNRIDYAQKYNFGVYVRWIQEFAPQFNNFQQSKDWTKALLMRA AMLAFPNSKYFWYIDSNCFIMNMETNIIPYILEPHVLGPI ILRDHPLILPDGGIKTYANIDPKDVHLILTQSATS IRTDSFVMVNSIYSKALLEYLSDPLFTNFSTFAHGF YAGLTHLLQWHPYFLSKAVLIPQRTIAAEHSSV ISEDEN DMRSYHEGDLLLLMPDCVKNNDCAKILESYTKKKKSQS
SEQ0139
P. pastoris homolog of S. cerevisiae MNN10 (YDR254W)
Subunit of a Golgi mannosyltransferase complex also containing Anp1p, Mnn9p, Mnn11p, and Hoc1p that mediates elongation of the polysaccharide mannan backbone; membrane protein of the mannosyltransferase family
S. cerevisiae null mutant: chitin deposition: increased; competitive fitness: decreased; freeze-thaw resistance: decreased; metal resistance: decreased; resistance to mycophenolic acid: decreased; resistance to nickel(2+): decreased; resistance to Calcofluor White: decreased; resistance to rapamycin: decreased; resistance to wortmannin: normal; resistance to caspofungin: decreased; resistance to cisplatin: decreased; viable; protein secretion: increased; septum formation: abnormal
S. cerevisiae overexpression mutant: abnormal cell cycle progression; nuclear morphology: abnormal; resistance to hydroxyurea: decreased; resistance to nocodazole: decreased
Chr 2-2, 0185
5' region agcacgttgatgcttggttaaaactgggtgaggtccagacccaaaatgaaaaggagtcag acggtattgcagctctagagaaatgcctggagttggaccccaccaatttagcagctctga tgactcttgcaatttcttacattaatgatggttatgacaatgctgcttatgctacattgg aaaggtggatcgagacgaaataccctgatattgcttccagggcacgctcgagtaatccag atttggatggtggtgatcgtattgagcagaacaagcgtgtcacagagcttttcatgaagg cagcacaattatcaccagatgttgctagcatggatgctgacgttcaaactggcttaggtg tattgttttactcaatggaagagtttgacaagactatcgactgtttcaaggccgccattg aggttgagcccgataaggccttgaactggaatcgactgggtgctgccttagctaattaca acaaaccggaggaggcagtagaggcatattccagagcattgcaattgaatcctaactttg ttagggctcgctataatcttggtgtttcattcataaacatgggcagatataaagaggctg ttgaacacctgttgactggaattagtttgcatgaggttgaaggtgttgatgcatcagaaa tgagtagtaatcagggccttcagaataatgcccttgtcgagaccctaaagagagcatttt tgggtatgaataggagagacttggttgataaagtatatccaggtatgggattggcccaat tcagaaaaatgtttgacttttagatagaattcgaaaccccacaataagtagaccacatat atccaacttgatactagctccttctttcctcccctgaattactgttagcgaagtccggca tccagatcacgtcgagcatctcacgaaacgtattttcaacattgatcaacatttctcata tctctatcgccatctattttggattatagtatccattgtaa
ORF
TTCAAGCCCTATAACGGTGAGAATTCCTACAACAACACCATCAGATGGCTATCAAGTAAAATACACAAATACAAG
TAA
Downstream tcattttccactttgtttattaaatagatagaccattttatcatactgtactatttcagc cttttttcgcatcctgcgcaaggcacacaggggataggctaggaccagccaccttccagc actccctactctagtttctatttacgcatcgcgcgcgcttctggatggtggaaagtctta tttatttcgcttcaccgtgataaatatggctgaccagcactgggtaaatgaacaagagac ttttaaccctagtaacggtaataatcaagatcaatatatcatgtcatatctgaatatgaa tgagaattttaccggggaggaaagtgtgaatccaatgtccattcttgaccagtccaagca acggccccagcaatacacaggttctaatgagtccaataaccagaacgaactaggtttgac ggcggatccttctgtatttttcaatcagccttctaatatgcaaagttctaactctcccat ggtatcgcaagtccagaatcagtcaccgtcatttcaacagtattctcaagcaccaactcc tggtcaactgcaacagcagtttcaggatatgtctaatggggattccaaacaggttgttgg agatcccaaacatcaatctcagatgcaaaaccagttaagaagacaatcgccgtctcttta ccaagcaaattctaattcgtcccagcagcaacaactgtatcaaaaattccaacaagaaca
gggattgaactatagaagaatgcaacaacaacaacaacaacaacagcaacagcaagcaca acttcaacaacaacagcagcagctacagcaacagcaaatgcaacaggctcaggcgcgtgc tcaagcacaggcacaagcacaggcgcaggctcaagcccaagctcaagcccaagcccaagc acgactacagacacagcagaaacageaaaattctcctccacagtttcageaaaactaccc ggctaattctcctcaaggtgtacaggctaaccaagctcaaa
AA
MSRQRFQLPLSKEEGLYSDLDPSFEFKPYNGENSYNNT IRWLSSKIHKYKKLLI LLVFLLLVSTFTSAPLVPHFH SNKDPRWI ILAANEGGGVQKWKGPQEWSVERSS IANKKKYASKHGYGLAIKDLTLKRRYSHEWREGWQKVDILK QTMRQYPNTEWFWWLDLHTL IMDMDVDLEEYLLNSVGSKS YRTVTSFNPIGIENHVPYTDTSQP VDLIVAQDCGG FNLGSFFVRQSEWTEALLDAWWDPALYEQMHMAWEHKEQDALETLYDHQGWIRSRTGF IPLRS INAFPPGACSDQ ADNPKFFYCESDRDFVVNMAGCEYGRDCWNEMEYYKKLAKRHEERWWKFW
SEQ0140
P. pastoris homolog of S. cerevisiae ANP1 (YEL036C)
Subunit of the alpha-1,6 mannosyltransferase complex; type II membrane protein; has a role in retention of glycosyltransferases in the Golgi; involved in osmotic sensitivity and resistance to aminonitrophenyl propanediol
S. cerevisiae null mutant: viable; acid pH resistance: decreased; alkaline pH resistance: decreased; glycogen accumulation: increased; chitin deposition: increased; competitive fitness: decreased; freeze-thaw resistance: decreased; lipid particle morphology: abnormal; metal resistance: decreased; resistance to hygromycin B: decreased; resistance to rapamycin: decreased; resistance to cadmium dichloride: decreased; resistance to hydroxyurea: decreased; resistance to mycophenolic acid: decreased; resistance to sulfanilamide: decreased; resistance to L-1,4-dithiothreitol: decreased; resistance to Calcofluor White: decreased
S. cerevisiae overexpression mutant: decreased vegetative growth
Chr 3, 515
5' region tcctccggcccccaaaccaagaaatcaaaagaaggagccagtcatcgaagaaaataatga tgattcagtggtggaaagtttgaaaaaactctcaccaaagccaaagaccccagccaaacc aagctctttacaagtgccctcaaagccaataactaccaagcctgcacctttgaaacctac aaaacccaaagatttggcgaaaggccttgactcaaatccaccagaagctgttgtcaaggc gaaggaaataaaatccaagcctttaatacttccgaagaaggctagtaagagtaactttaa tcaaccgcaggaagatatattgcagtcacaattgcagaaactaagagaatccaagtctag tggtaagaatcataatttcaacgcagctcaagaaacaatatttaacgctcaattgaatgg actgaggtccagaaaaccagaccagaaccaatctaagggattgttaccaaaaagatcaga aacattctcatctaagcctgttcctattccatttatggtagatccatcgtccagcaaggt gctagagaaatccttgacaaaatcccgaaccattgatggatccgacgtaggattatcaca agaacatcatcagagtctgtcccatccaaacatgactcgagccaaaggccccaaaaggag acttcctaagacgctcagcaacccagaagcaatcaaacaaccacaacctacatcctctac taccagcttgggcaagaaagtgeetcctcccaagccatccaagaagagaattgtctccaa cagtgaactattcctatagaagtttaatgcattttagattatataatcagggagaaaccc cgtagagaagttccacgcggaaatgtttcctgaaatccaccatcgattttcgtacaaaca cgacacattttgtgctttccttcaaaccaacaataaatatcaacggtatcatcgattcct tcgcactgtcgagcccaaaccattttattgtgttgtaggag
ORF
AAGGATCAGGATCGTGCCGCTAACTTTTATCCTCAGTAA
Downstream taaatagtaaagttatttgtcatttaggaagggttaatacatattaataaaaaagtttta tgcccatttaccaagcacactcatcagtatcataccagcaataagagatagaaatccaac ggttgtgttcagccagttggcgttgactaatgctccatatatccagtcaccggcccacat ctctaccaaaccggtccagatcaaaataccagcagagaaagagtcaagagtacccaacgc aatcaacgttgaagagctattaccattgaatctactcaacacgccaataccaattgccat accaataggagtaataagggcaaaagcaagagccataaggagtttataccagagtgagac catattcagttcggcaatcttggttcccaaggcaaaaccctcaaacatttgatggaaaat gataacgatgaacagtgtaataaagaaagaatcaccagcaactaccaatgtgacaccaag taggatggagtggaaaataataccggcttccaacataataacggaaatttttctgctgag tggtggaacaggcacggttccacctctttctagagaagacgaagaatcacttagctcgtt ctcttttggattttcgtcagtcacttcgatttcttcttgggcgttttcagcatctgcttt cttgagtctggctagagcaattcttttaatgaaatattctatagcaaaacagaggaaaat accagccatggtaattgaggtaccagtagcttcatagtcaagtggagccatacactcatt agaccacatcatttcagcatgggtaattaaatgcacaaatgcggtcgaaataatcacacc tgttccaaactgtctgaaacctgtgaagatcacaccagtcaagctgaaatttaggaactt atttgttaggataggaaggtagacacctattgcacttgtagctagaatggcaaacagaca tccgattctaaggggaatattgtaatcatggtctggggcac
AA
MSKDSIIAQPMGTLNFKVLRRYNSTLLRSSKHFATNPKTILVVLVALFLMLFNMIRPFNNSLYEPDFENTKNTHY YDLDNYKGSEDGWQRGERWFLVPLRDAAAHLPMFFRHMRNMTYPHHLVDMAFLVSDSSDNTMGVLKDNLLELQH DPDPKIHFGEIIIFEKDFGQAIGQGFSDRHGFAAQGPRRKLMARARNWLSSVAVKPYHSWVYWRDVDVETVPTTI IEDLMHHNKDVIVPNVWRPLPDWLGNQQPYDLNSWQESDGGLQLASTLDEDAVIVEGYVEYATWRPHLAYLRDPY GDPEAEMPLDGIGGVSILSKAKVFKSGSNFPAFSFEKHAETEGFGKLSRRMGYDWGLPHYWWHIYEPSSDDLK HMEWMAKEEERQRQEKEIQAVYKKYWDQGFEDVSDAWAKERHYMMKNTDFRKQRKIEVDWSGEDDYEPAVKANAV KDQDRAANFYPQ
SEQ0141
P. pastoris homolog of S. cerevisiae HOC1 (YJR075W)
Alpha-1,6-mannosyltransferase involved in cell wall mannan biosynthesis; subunit of a Golgi-localized complex that also contains Anp1p, Mnn9p, Mnn11p, and Mnn10p; identified as a suppressor of a cell lysis sensitive pkc1-371 allele
S. cerevisiae null mutant: glycogen accumulation: increased; competitive fitness: decreased; hyperosmotic stress resistance: increased; resistance to amiodarone: decreased; resistance to PM02734: decreased; resistance to ethanol: decreased; resistance to hygromycin B: decreased; viable
Chr 3, 0620
5' region aaaataaatatatggtgttgctatgtaggccgtaacacggctcttcacctgtatggagca tcggcatctaaagcgtactggatttcctgttcccatttgtcaccattagcaagcaatttt tccttgttatacaacagctcctctctagtttcggtggcatactcaacgttaggggtttca acaatggcatcaactggacaactttcctgacaataaccgcagtagatacacttggtcatg tcaatgtcgtacttgtaagttcttcgggaaccgtcaatacgttcttcggcttcgatggtg
atagcctgagcaggacaaactgcttcgcataacttacaggcaatgcaacgttcttctccg gatggatatcttctcaaagcgtgttctcctctgaaacggggagaaactggacctttttca aaagggtagtagattgtatatggggcacgaaagtacatttccaaagtgatataaaggcct ctaaagatctcagacagtaggtaccatttggtggccttggaaagagcgctttcactgctt tcttcccacgtctttggtcttgggagtctgaatccttctggactgtgtgccttggatcga accagaagttaaagccaatgtgggtttctgtgcaatgagtcccgtcgaaaacctgcaaac tggtattctgtttatggcacggcaggaaggcttcaatgctagtggtctaaacattgtgtt ataaatgtgtaggtagaaacctataagtttgtgttggggttcgataatcaaacttgcctt gaaggtaattggctaggtgcttgttgagtgtcgcacgtctcggtatttttttacttgtcg attctctcttactgtccaagatgtacgaaagatgtgaagtttctttgttggcgattttct tgcagcgcgatcgcggtcggtatgatttcccttcagaccaaaaacagatatagatattca attgaaactcgcccttccaacaccctgaacaccacccccta
ORF
Downstream gggtcagaaccatttagattgtctggatctatcattatggccttgtttatagacaaagaa ttgtatcctggactgaagggaagtttatagagtaatacccctctgacaaccaactcgaat gggtattttgagataatttcccatatattattgtcttcactgggagacttacttcttctt tgtagttccggtatcggcagcaaacttatggtggtcacatcttccaaacttctcggtata gtagaactatttgtggttgatacggtctgggtatcggtgataatgcttgtaataatggat tttgtgggtgtaactgaagctggtgtattagtggtagtggtagtggtggttgttactgtg ctatctgtaggttctgggagtggtttctgagccgggtttaaaattttgatctctgtggta tcttcgtctggatttcgactaaaaaatccgccttgaaatacgagtggaacaggtaaacga ttgatggagcccagaagcactgtgtccaaggaagattcatcaaaagtgaaaggagtgcca tcttcgtttccatcgttaagatgagaacttttggcaaaaaaatcaatcagaacgtcatct atttctatggcgacaaatccgggattgaatccttttaccgttaaactgaaaactatctct tccttagttagtaagatattgtctatccggtccacagtgaagtttgttaaaggcttggtg atggcgataaaaaacccaaataaaaaccccattatgagaaacgctgatatcaagctcgca gtgtaacaaaaatgttttacttgacgtaatcttgtcatcttggtagatgtgaaatcatgc ggcgagtaataatagtggtctggtttaatattctggctgtgtctttttaatttatgtcta ttgttattagatacggaggttgaaccgtattccaagggacattcatgatggtctaacaaa agcggttcttgataaccgtatgcaggatagtcatataatcc
AA
MQQQLFRKVIWLTVGLITVILVIIKISSSKSTATDLQKVLKNANILPQDVINYNΞRKVTDELASKLDEIQKKYLS KQDDRISKLEAERADLLEQVRFLRNPPAGSSLREKLAYLFPYNENGKFPAYIWQTWKYGLNDDRFGEKFKEGETQ WASKNPGFVHELFNDDTSGVFIHHLYINVPEVIKAYELLPNIILKMDFFRYLVLYAKGGVYADVDTMPLQPVPNW IPENVSPKSIGMIIGIQNDANNPDWKKDYVHRLQFSNWCIQAKPGHPILRELIAKITEDTLQRAESNSLELADIS EEGGLSDKNLSIMQWTGTGIFTDAIFTYFNDYIQSSIYTKVTWKEFSKLRKPKLVSDVLVLPIISFSAGAGSGKS TELNDPLAFVQHYFERLHNDNH
SEQ0142
P. pastoris homolog of S. cerevisiae MNN9 (YPL050C)
Subunit of Golgi mannosyltransferase complex also containing Anp1p, Mnn10p, Mnn11p, and Hoc1p that mediates elongation of the polysaccharide mannan backbone; forms a separate complex with Van1p that is also involved in backbone elongation
S. cerevisiae null mutant: septum formation: abnormal; glycogen accumulation: increased; chitin deposition: increased; competitive fitness: decreased; resistance to gentamycin C1: decreased; resistance to L-1,4-dithiothreitol: decreased; resistance to sulfanilamide: decreased; respiratory growth rate: decreased; viable
Chr 4, 0103
5' region tctggagactattttcgacaaggcatggatttccaagcaccaggacatcttcagatggtt cagcactgtgtcagcccacccagttctgtcttggagatacgttgacttcaagccaagaga caagcctgtcgaatacgtccctcctaagaaggaaaagaaagaaaagaagaaggaggagaa accaaagcaagagaagaagcctgctcctgctgctgctgagcaatctgatgagccacctgt tgagaagaaggcaaaacacccattggaagctcttggaaaaccaaagttggccattgacga atggaagagattctactccaatgaggagaccagagagtctgcaattccatggttctggga acactttgacgccgatgaatggtccttgtggaaggtcacttacaaatacaatgatgaatt gaccttgaccttcatgtcaaacaacttgattggaggtttctttgccagattgaccggttc gatcaagtacatgttcggatgtgctgtagtgtacggagagaacaacaacaacggtatcat cggtggtttcatggtcagaggacaagactacaagccagcttttgaggtcgctccagactg ggagtcttactcttacgagaagttggaccctaacagtgaggaagaccgtgagttcttcaa caacatgttggcttgggataagccaattgttgtcaacggtgaagagaaggagattgctga tggtaaggttttgaaataggataagttagcttagtaattttaaggtaaacgaggccccag tgatactggtattaattggtactttctttcctggtagtctggtagcccgtccaagttttt cttgtcggcggcggagaacggtttttttttatcttttcacccactctgatcggctcttct taagctttttttttctcctttcgcgactaccgaggaattagtttcaatcaaaatgcctga ggttcagcgtgttgtttcctcccactcactagaccattcca
ORF
GCCAAACGTCTGGGCTACCAAGTCTTTGGTCTTCCAAACTACCTGGTTTACCACTATAACGAGTAG
Downstream atggggcagtagtgttaagtatcgcgacacttgcagcatcaaagtcgtgcttcttatcaa tcattaatcctctttcatactgataaggagtagttatccctacccacttcttaggcaaca atagagcaaggccttatcaataagtatcgcaaaatagtagaataaatccaatctataaag atagttccttcacctttcatggcaaagtctcccgaaacaagaatgcggcctcccctggag gcttcatttcactcctcctccacaagacggtttagcgggcctaaaaatagtcttgccgag tccaccagcggattggcctccccattggctaccaaagagcttaaaatttcgaccagcaaa caaaatcagactccagtgaccatcagcaacgatatgctgaacataaatcctaatctcaag tcaattgtggagggtggaacgtctcctatagcagaatcgaatgtcggtgtggcagcgttt gcagctacagcaacggcagcatctttacaactgagtggctccaactcctctaaaagtccc acatcaaggaatttgtcgcatgttccctgcaaattctaccgtcaaggggcctgtcaagct ggttcttcatgtcccttttcccacactttaacccaaacaagccaggcagcaacttgcaaa tattttcagaaaggaacatgcaaattcggatcgaaatgtgcactggtgcatatttctccg
gagggaaaaaaggttaatctgaaggctctaaatcaagcttatcaggtgccaaaacatgaa cgatctctgtcctcttcccagggtttaagtcagcagatacacaaagatccagaaacatct gatgatcaagcttccagcatgtcctctcctactcgaacagtaacatcacagatggcatcg tctcccttcagcaatacaaactccatttggtcaggcacaggttcaaaggaaaggcgattg tctggaccatctttttcgacagggtctttctctagaggttc
AA
MLVPNKNHLFHQIRRKPSQIVAPILFLVLVVYLLFGIGQSSSKKPKYSYKDKTQGWLLSSKIPPGPDEMAKNHIM HYNMNLLETTVMPAFNKEQVLVLTPMSKFYPEYWDNLLALSYPRELMELGFIVPRGKAGDEVVKKLETAVKAVQR SSTESARFARVTILRQDSESVESQLEKDRHALKAQKERRSKMALARNSLLFSTLGPFTSWVLWLDADVVETPKTL
IQDLATHNKAVIAANCYQRYYNEETKKNDIRPYDYNNWVESEEGLRLASTLGPDEIIVEGYAELPTFRALMAHFY DPHGDLSTEMQLDGVGGTAVLVKAEVHRDGAMFPSFPFYHLIETEGFAKMAKRLGYQVFGLPNYLVYHYNE
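In each entry the ORF field appears to show only the 3' end of the coding sequence, which should translate to the C-terminal residues of the AA field. The following is a minimal sketch of that consistency check for SEQ0142, assuming the standard nuclear genetic code applies; the helper name `translate` and the variable names are illustrative and not part of the listing.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (assumption:
# the standard nuclear code is appropriate for these sequences).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 0; '*' marks a stop codon."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# ORF fragment and C-terminal residues copied from the SEQ0142 entry above.
orf_tail = "GCCAAACGTCTGGGCTACCAAGTCTTTGGTCTTCCAAACTACCTGGTTTACCACTATAACGAGTAG"
aa_tail = "FAKMAKRLGYQVFGLPNYLVYHYNE"

protein = translate(orf_tail)
# The translated fragment (minus the stop) should match the end of the AA field.
print(protein, aa_tail.endswith(protein.rstrip("*")))
```

The same check can be applied to the other entries in this section to flag character-recognition errors in either field.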
6.3.1.3. α-1,2 mannosyltransferases
SEQ0143
P. pastoris homolog of S. cerevisiae KTR1 (YOR099W)
Alpha-1,2-mannosyltransferase involved in O- and N-linked protein glycosylation; type II membrane protein; member of the KRE2/MNT1 mannosyltransferase family
S. cerevisiae null mutant: viable
S. cerevisiae overexpression mutant: increased filamentous growth
Chr1_3, 0138
5' region gtttttgtcttcgctgatatcatggtgaaacatcaaagtggtatggggcttagcatattc cttgtgaagtttgtactcggccaacgaagtgaacatacagtgggtggtatttggtcctgg gaaagcccattgaatgaaatgaatttccacaccttcgtcttcctttggtagtggtaaaat gaaagaagggttctccctagcgttggtgtatagtttttgataaacgtctttgggaataac agcaaccaacacattttccttctcctggaatctggctctccacaagaattcaacctgatc cggcgaaaggtctttgatcttctccaactttaggaaatcatccagtactttataaggctt gattggcgcatcaggcgatctaggagatcttaattttgtgaccttatgtccattttttgt agctttcctttcatactcttccagatctgtcaaggggtccatcctgttcaactctatctt ggttttctctattttatccttgtatttttcttgaagtttttcaattgactctgcaccttc ttccttcatctttttctcaagtttttctttatatcgattcagcacgtcctctttagcagc tttggtttggacatatctaatgttatccaaagctacggcatttcggtactgaatggggag gccaacacacctgttgaataggttgaagttttttgatatatacaaagagcggtatgacat gttctcgttacttccagaattcataggggtgaatccaaaccaaaaactcagtaggaagat atttcgaagatgttagttgatactttttccacgcacttctccagagaataaaactgaaaa tgtccgacggcacctatgtccagagggttatattgcgccaagttgattacgtaactaata aggtcataaaatacagaccatacaggtctataaataaattgggatagggtgcttcggaag ttggttgatctgattcagttcctgaatgtgccgctgtaggg
ORF
TCCTGCCCGCTTTCTGAAGAAGTGAGGTTAGAGAAAAAATGCTCATGTGATCCAAAGCAGGACTTTACCATGGAC GCTTATAGCTGTACAAGGTTCTATCAGGACATTATAAGAGATAAACAGAAATCTCAGGGAAGCAACCCCTGA
Downstream ttttgggcatttagtttcaggacagtttagatgaggatacaaatattacataatcaacct acttccaggtagcatatccgcactctctgtatttagatctctgctacctcagcttgggcc gcccacacattagtagtacacagacaatggccatcaagtatttgattatgctttcgaagc agggaaaagtgaggctttccaaatggtataccgtaatttctcagaaggaaaagcttcaat taattaaagaattaacagctatagttctgaatagaaaggcaaagatgtgcaatgtcttgg aatataaaggtgagttttgcgtttgaggcctgtgtctttagaaactttactaacaggatt gcatagatcacaaaattgtttatagaaggtatgcttctttatttttcattgctggcatag atgttgacgataatgagttattgacacttgaaattatccatagatacgtggaacaaatgg acaaagcatatggaaatgtctgcgaactggatattatcttcaatttccagaaagcatatc acatcttagacgaattactgttggacggtacattacaagagagctccaaaagggaggtcc tgagaagggtgggacaacaggatcaatacgagttagaggacgagcttgatactaagttct aatagatgtagctaaagtgatttgattgatcttaaaacgtagaggttggtactatgtcca acacggagtaacaaaaaatgaaaaaattaaagtttgacacggacaggaattgaacctgca acccttcgattgcactgtattccaactggagtcgaaagctctaccattgagccaccgcat caactataaacaagaaagcaaaaaagctgtttaagatgcaaatttccgattaaaaaatac tattataatgaaccgatggaaaagaagatctttctattaaattggtgtgggttattaaag gtaacgcaattaagtaattgactgttttgtaggaactcata
AA
MMRARLSLERVNLSFITSVFLASVAVLFISLEMPKVLARDRQILKLKLGFMGSGLQKGSL ETSGNIENTESNINSQTTQHIGTIGASNERANATFYTLCRNEELYQMLETVQNYEDRFNS KFKYDWVFLNDYPFTDEFKRVISHAISGEAKFGQVPASHWRFPDHIDQQKVYESMDKMDS DNTTGDYLGLPIPYAKSISYRHMCRYQSGFFYKHGLLQGYKYFWRVEPDVKLYCDIDYDV FKSMEQNGKRYGFVISMMEFEKTIESLFKEVKNYLQMKGVSRLLEDTDNLSDFVYDELSG DYTLCHFWSNFEIGDLDFFRGREYNEFFDYLDSKGGFYYERWGDAPIHSIAVSLFMQWND VKWFSDIGYRHPPYLSCPLSEEVRLEKKCSCDPKQDFTMDAYSCTRFYQDIIRDKQKSQG SN
P. pastoris homolog of S. cerevisiae MNN2 (YBR015C)
Alpha-1,2-mannosyltransferase, responsible for addition of the first alpha-1,2-linked mannose to form the branches on the mannan backbone of oligosaccharides; localizes to an early Golgi compartment
S. cerevisiae null mutant: budding pattern: abnormal; cell shape: abnormal; cell size: decreased; glycogen accumulation: increased; chitin deposition: increased; competitive fitness: decreased; resistance to hygromycin B: decreased; resistance to Calcofluor White: decreased; viable
3 homologs
SEQ0144
Chr 1-4, 0037
5' region aaggtcgtgcacgacaacggttgttaatgaggttagttacttccgtcctatgaggatgca actggtctaacaaaatctatcctcacagttcgtgtcctttctttcattaagtgatcaata catcatgtcctttaggttagccagaacattttttttttctctctttggatacgaactgaa ggttttttagtcttcaatagcactggctgacttcgaaacaccccatttttccactacaaa ggagctttttatttcgtttatagtctctggtagaatccttttggatatagcatgttttag attttcttgatttagtttcttgtgaatcgacttttgcttaacagaaagcaaatcttcgac tgagtaattcacaaaacttttaatgcatattcgaccgttcaataaatccagtatgaaggc tatatcttgatttttgattgctattaaattcttcctaaaggttgttctatagttttttag atctcgattgtgctttgagtataattctcgttcgtatgctgttgcgattccagtaatgtc gtagcttaccccaccaagcgaaccttcattttcttttttcaatattagcctcttctgagc atccgtaattctagtccgcaaggcctcttctatcttgtggctacaggcacttctcaccct
tgccacttcaatttccaattctgcatcagtcttaacggatgaattatttgagtttctttt cgccaattttcgcttgcgaattacccccttcaagcctttcccagtcgtgggacttttcgc cataattgaaatgatcaagctgactagtgtacactcaaacttgaagtttatggcttgacc aatacatcaacccaaaaatgttctgtgtggtcactacaccattatgttacatgttcgcgc gtctgctatcataaccagaatcgtagtttctgcactttggaaacctacgcgtttcataaa tcagtcctttcattaaattcaaagcgagatcaccgaagata
ORF
TACAAAAATTACGGACAGGGATACTCCCAGGAGCGTATTTGTAAGTTTTCTGACCGATTTTTGAAATTTTTGAGT GACAATCCAATTAGGATCGAGGGCTAA
Downstream tttctatcgtgcttcaaaacgatttccactggtatgccagtatgatcgcgacaattttga tgacttcaaacagggaacacaatttcaccattctgtcatactgtgggcttaggtatatag ttaacttaggagcaacgaatgcaggaagtagatagttcactaccacaagtgatcacgtat ccgtgtgaatacgttactcaacaaattcagggtaattacttctctttaatttttctgata tgcgcttaaataatggccaacaccaatagtatacccccgtcctatgttagcaatcgtact cgtttctgttctaatttgttattagtttagagaaattgttattttttagggagcgagtca caaaaaaaaaaattaattttctcattaagagttcgttacacgatcctcttactaccaaat ctcgtcttttttttagatcgttttggttctccaaaacatcttttccaatccatgggtagt cacagctctacaactggaccgtaagttttttgtgaacattacaaatcgccgtaaagtccc atcctattaacactattttcacatagtcaactcaatttgcatgacttttatgatggaaat gaactgtcctaccaacataaatacactaagctgacatcagttatgtccctagatgactgg tcatctaacacatcaagatggtctcgatctagtgtctactattatagcaatttagtgtta gataaatttgggaggaaccaagtgggcctttacaattctaaaggaaaatacggacctaaa aaccggaacgtcaagcaccttattcacccctttcagtatgaaggtacctggaatgtggcg attctgctgcgaaaaactgcagtcaggggagggatagatgtagacaactcgcatgcattc tccgagctgggaaaagagactaaagtatcagttggggaactggtactgatcaattgcacc cctaatgaccccaggtcggaattcccagttgatgtttgttt
AA
MLFGLIRHSRRQLLFLGALVTVIVLIFTLPNTSPIEANGVKSEEGSITPIIPVLESPANSLEKIVDTASEERIGG ATLEEGHENNKEEQALENAERAKEKEKTEAIAAEEEKLKAAELLRQQETTREKEAAKEDDSKKPNQELVEQDTYL DDIPDDVEDNIIISEQDRKKIILPSYTPKTDPAYSKRATALKIFYNDFFIKVADSGPNTAPITKKTRKKGKSKLK GDVSSGDKYEGPVLTEDFLRFMEIYSDEFIDAVSESHΞKIVNLMPESFPKGMYQGDGIVI IGGGVYSWYGLLAIR
NLRDGGNTLPVELMLPSDNEYEPQLCEQILPSLNAKCIMLSDIVDQDVLKKLDFKGYQFKALSLLASSFENVLSL DSDNIPVANVSHLFDHEPFSETGLVSWPDFWRRTTNPRYYEAAGIKIGEYQVRNCLDGFVPESDFVHIGLKDIPL HDRNGTIPDASTESGQLLVNFNKHAKTLMLMFYYNFfGPGYYYPLLSQGMAGEGDKETFLAAANFFGLPFfQVKA GPGILGHHDSTGAFTGVAIVQYDPIADYELTKENFVGEKRKGIEAPKAFYGNNNKSPLFHHCNFPKLDPVKLIKE KKLIDNKTHKFNRMYGPNTKLKfDFEERQWKYTKEYLCEKKYNLLYFTEQYKNYGQGYSQERICKFSDRFLKFLS DNPIRIEG
SEQ0145
Chr 3, 0370
5' region gtagtatctagaaaaatgaaacgaagaaccccaaccgtactcgtagaaatcagtaaccag gttgtagtaatgcttggtcagctccgaataatcactctttcttgagtgttcagcttcctg atcagtcttaccattccagtgcttgaagtagccatcgaccgctacttcttgggctgaggc gtcctttgtgacccaagctccgagaccggttttcttagcggcatccttgccgtgcaaagc ctttgcaaattctctatcttgctcatagttctttggggcaagtggctggctttcttgttc agttgtagaggtagtcattttcggtaataaacgagtgttgagtagagtaaagttgacctt ctaaaagctgaatattataggttttcactatctaaacgatgggaacgacaaatctaccag gcgcgcgtatcaatttttagagtgtaaacgagatgtacgaccgattcaattcgagtgcga caactgtcaaattgtcacattggacaaaaaacgcgcttgacacgatggtttcaatactga caggctgttcttatatatactttacgaactgaccgaagcaattgcgccagaaaggcgcga gatgaaatttcgaaatccatgttaaagccgcatccgatcaagggaccctcatttcagttg gtttggttccacatgtttactgttaacgagcgaaggcctcttgaacgctctcaacagaaa aacacgtgctgtcgtagaacacaattgttattcacctaggttgccgtaatcgtctttgta acgccctcgttggggtaaactttccatggactgtgcacgttgtttagcttgctctctgct ctactccacatcttctccaccccaattgtcgtctcttacccaatttgtatggaagtctta acaccatttccacatctcagaaccatccgatcccatataagttgacgtatacggctccac taatttccatccgcattctgcattcacaataacttcccaat
ORF
GACACTGCCATCGAAGAAGACGAGCAAGCTGCAATGATTAATTTCCCGAAGAGATCTCCCCAGAGAGAGAAGAGT
TTACAGAATAACCCAATACAATTGACGTGA
Downstream cttgctactggtttatcacactaatgacgaaaagaagagaaataaagaacaggtaaatga ctgagaatgtccttttttattcgaaagacaagaatcaaaacaaaatgaccgtaacacgaa
aaataaatacacataccgaattggaaaccatgattgactattttgaactgccaataaaaa ttgtacgatataatgaaacagataattacttgtaataagaacattggttatggtgaacca ttggatgaaacatgttctttaaacgtatggatactttcaaagatcgagaatatcatcttt aaatgtccccctgtagatatgctatactttaccacccaaaatatgctcgttacaaaaaga gagccctcagctgtcacttttatatcaagcagttattcgaaattcttgccactaccataa ctagttgagattagagtagccatcaaaaactaataaacacatttcaaaggaatggacaac aattgtcgtcccgtaatgtttataatgcaatgtgatgtaacataattgatacatataaca atacagatcatttgatacaagctagccaggatacaccgaaaagtttcgaaaataaacagg atacataatgctacagtgtcgtgtctatcaacataacggttttggttcctggaaataggt tagtagagaacagactcgttgtcaaaaatggcagatccttgtcaagtcaaaaaagatgtg tcatcctattctttatctttcttttcttcattgatacagagggaagaatacaaatagagt tgtgcggtgtttctttgtagtggtaggacacaaatgaagaaaaaagtttcatgtggggta aatgggttctgtactggtgggagcgcctgaaatatgtctcaccactcaacgaataacaat gttaaaatacagaagcagaccagataacgatgcaaggaatagttcatagcctatcagaac aacaatagcttttccatcagtgcaatccaaagtgttcaatg
AA
MFGKRRQVRKLLIWWLLLIVYFFGLQFRAKNSAHQSSIRSFYADNKEFFDRQYSRYDEYDIIDNMNSHNELLQE QFRNGKLAAGLRGVAEEPNSDEVTDDTAIEEDEQAAMINFPKRSPQREKSLVELRKFYKNVLSIIINNKPAMPIE
NPRDPTPNENALKRKFGKSGI INIALHDTDPSLPILSEAYLRDSLQLSPSFIASLSKSHSAVVKAFPPΞFPANAY NGTGIVFIGGQKFSWLSLLS IENLRKTGSKVPVELI IPFAHEYEPQLCEEILPKLNATCVLLQETVGIDLLKSGH LKGYQFKSLALLASSFEQVLLVDSDNIIVENPDPIFDSEVFQRTGLVLWPDFWRRVTHPDYYKIAGIKLGSERVR HVVDSYTDPSLYTSSSEDPFTDIPLHDREGAIPDGSTESGQILISKTKHCQTILLSLYYNFFGPDYYYPLFTQGA SGEGDKETFLAAANYYKLPFYNIKKGVDVIGYWKPDQΞAYQGCGMLQYDPIVDYQNLQTFLKTHKGSRVNKLEQS ELDKPGLLSRLIPKFFFRKTFDEHQLQSHFTKDRSKIMFIHSNFPKLDPFGLKLHNYLFVDQDTHKPRIRMYADQ TGLSFDFELRQWII IHEYFCEYPDFNLKYLENANVKPQDLCMFIKEELNFLQNNPIQLT
SEQ0146
Chr 3, 0787
5' region gactctggctccagaaggagaaagacgaagaaaaaaacgaaaaaatatatgactccccct ctcatctttccaacagactacaatgacaggagaacataaacgaagttccctcattaagag gatcacaaacgaaaccaaaatacaaatagctctcagtctcgacggtggacctgtgagtct ggctcaatcacttttcaaagataaggactattctgcagaacatgcagcccaggcaacatc atcccagttcatctctgtgaacacaggaataggattcctggaccatatgttacacgcact tgctaagcacggcggctggtctgtcattatcgaatgtgtaggtgatttgcacattgatga ccatcattcagcagaagatactggaatcgcattggggatggcattcaaagaagccttggg ccatgttcgtggtatcaaaagattcgggtccggatttgctccactagacgaagctctcag tcgggctgttattgatatgtctaacaggccctatgctgttgtcgatctgggtttgaaaag agagaagattggagacctatcgtgtgagatgattccccatgttttggaaagttttgccca aggagcccatgtaaccatgcacgtagattgtttgcgaggtttcaacgaccatcatcgtgc cgagagtgcattcaaagctttggctatagctatcaaagaggccatttcaagcaacggcac ggacgacattccaagtacgaagggtgttcttttctgagtctggaaggtgtctacatctgt gaaatccgtatttatttaagtaaaacaatcagtaatataagatcttagttggtttaccac atagtcggtaccggtcgtgtgaacaatagttcaatgcctccgattgtgccttattgttgt ggtctgcattttcgcggcgaaatttctacttcagatcggggctgagatgaccttagtact cacatcaaccagctcgttgaaagttcccacatgaccactca
ORF
TCCACGGGGGATTAG
Downstream catagagtatgtatataatccagtattattgaatattcataaaaaaagaggtcgtacccg gattcgaaccggggttgttcggatcagaaccgaaagtgataaccactacactatacaacc aatttttctttagaatttgtattttctgtttttactcaaattagatgactctcttgatgg atacttcaagttttattcagagttacacggtcttaccatgtagtcagcccttctttaatg tgcactgtaaactttacctgcactatttatgacaatatcattgcaacaaccagttatttg ttgattagaaagtataatatcggagtttacagtgtattggtctgtgttttatcatgtttt gttattctcgcaaatcgttacgctggtgatccatttgtcgtgttcctcaaattaccggag ttaaagctggaaaatcaaacagccatatcatcattgttgtagaaagtctcctggacgttg agtaatgacatttacacaccaaaaaagtatattcgtatttcatatcctagggtataggct aagaactacactcgtcttcgaactcgtccttattagctcagttggtagagcgttcggctt ttaagtcaacacaaggaaccgaaatgtccagggttcgagcccctgatgaggagttctttt ttgaatttttaaacttcaattacgggccgtctattatattattccctatttaccctttct cttgcccatcttagcctcaattagaagattacttcctttcttcaccgaccttgagatttg ttcttcgtcactcgagtcgctggcatcattctgatttcgttgtgagttctgtttctccaa gttttcttgacgaaacctccttctatcgtcgtttctcatcttgtttttcaaagattgtat caccttactgtcatctttggaaactctgtggcgattgttgattctggccaatttgtcatc cacttgacttttcatcttctgttttttggtcatggtagaaa
AA
MFNSLAPMRLKKLLKVFCASWLLAATSWLFFHFGGQI I IPIPERTVTLSTPPANDTWQFQQFFNGYLDALLEN NLSYPIPERWNHEVTNVRFFNRIGELLSESRLQELIHFSPEFIEDTSDKFDNIVEQIPAKWPYENMYRGDGYVIV GGGRHTFLALLNINALRRAGNKLPVEWLPTYDDYEEDFCENHFPLLNARCVILEERFGDQVYPRLQLGGYQFKI FAIAASSFKNCFLLDSDNIPLRKMDKIFSSELYKNKTMITWPDFWLRSTSPHYYHNITKTPIGDKRVRYFNDFYT NPNEYYYGDEDPRSEIPFHDREGTIPDWTTESGQLVINKEVHFPAILLGLFYNFNGPMGFYPLLSQGGAGEGDKD TFVAASHYYNLPYYQVYKNCEMLYGWVDHANSGRIEHSAIVQYNPIVDYENLQSVKAKAEIILKNHEPDSRKKSS KPKSYSKTRLSTHVKGSIYSYRRLFRDSFNKANSDEMFLHCHTPKIEPYRIMEDDLTLGRNKEAKQRWYGGRKNR VRFGYDVELYIWELIDQYICDKNIQYKIFEGKDRDALCGSFMREQLGFLRSTGD
SEQ0147
P. pastoris homolog of S. cerevisiae KTR4 (YBR199W)
Putative mannosyltransferase involved in protein glycosylation; member of the KRE2/MNT1 mannosyltransferase family
S. cerevisiae null mutant: viable, decreased budding index
Chr 1-4, 0050
5' region tttattgcgccgttgccaagaggcgagagtggaaactatatttgatatcatgtccatgga agacgatgagcgagacaatttgttgaagctatccaatgcgcagctccagaaagtagctga attcgtgaacaaattccccaatgttgatattgattacgagttggatatcactgaaggtac tataatagttgacgaggagagagaaataatagttacacttaccagagatgaacaacctga ggatctcactgttatttcttcggtatatccctatacaaaaacagagaactggtggctcgt tattggttgctcccattcaaaagagctttatggtatcaaacgtacaagaatttctaaaca
gcaagagcaagttaaagttactttctcggttccttccccaggatcccacgagataacact ctggtgcatgtgtgattcttatatggatgcagataaggaagtaagcttcgagcttagagt ggaagataatcaaatccgtatatgaataaggattgccaatacttttatttttcgctctat ttatttaattcatgaaaattcgtaaattgaacccagctatttcagtagatatatatgaat aatttgccggcaaagttttgcgttcgctgcgtgagttcaacttgagaatcttggaactag ctgcaaactaaaaatcctggttcagaaaaaaataatgcgtatgcccggagtcgaaccggg ggcccgacgatggcaacgtcgaattttaccgctaaactacatacgctttttgtatacaaa agttcgttcagtatatatcatatattcataacatattagccaatttccctgaaacaggaa aatgagtgaaaaggcaatactgcggtttattcggtatttgtgccgctgctaaagcttttg agtgtctcatcccgacccgaaggcgataacataccaagaaagataattcaaacacgcgaa cgtgctccttatttgacaaggtgcaacttgttattcttctg
ORF
GATCAAAATTGTCTGATCAATTGGATTGAAATGGTAGCGGACAATGAACTAAGCATGTATTAA
Downstream gacgaacacggaggtgctttctcaatcatttagtattgttcagcttgatatttacagaag ttactaataccacgtgacaatacgagaaattgccaggcaattcttggttttcttctcacc ttttagttcaagtaaactactattctccttatgtcccctgcaatcaacatcgccattatc ggagcaggtattgtcggttctgcgttcattaatcagttgaggagcgtatcgtctttgttc gactttaacgttatttatttggctaaaagcaaaagtgccttgatctcttatgatttcaaa ccgttgaatctagccaattggaaagcagacttggacaatagctcattaaatcccttgacg gtggaagaacttttgaaattcttgaaaaattcacctttaccagcaattttagttgataac acttctagtgctgacgttgcgtttgcgtacccccaatttgtctctaacggtatttcagtt gcaactcctaacaagaaagcattttccggggacctaaatttatggagtgaaatattcaaa gcttctggagaacccaatggtgggctagtatatcacgaggctactgtcggagccggtttg ccggttattagtactatcaatgatctggtcaacactggagataagattgaaaagattgaa ggaattttttccggatctctatcctatatcttcaatcgcttctctactacaaagccaaat gatattaagttttctgaaattgtgtccaaagccaaggagctgggttacactgagccagac cctagagacgacctctccgggttggatgttgctagaaaagtcaccattttagccaggtta tctggctttaatgtgaaatctcctgattcattccctgttgactctttagttccccaggag cttgaaactttggagacgagcgatgaatttctttcaagattgcatgagtttgactcccaa gtggggcacctaaaggaacaagctgcttccgagggcaaggt
AA
MMISLTKRFTKLAIFGSLSFILTTAGLWLYWDAIQYMMTSGKIPTLDFQFEDFMNRHDDIVDDMMFKYDKIMKAE VKEPNVGNLVYAPESLVDYGRENATLLMLVRNKELRTALQAIETVESQFNHKFQYPYVFLNDKEFTDKFKSTITE KVSGQVFFETIDKVTWDRPDWIDSAKESERIKVMRKYNVGYADKLSYHNMCRYFSRGFYNHPRLQQFKYYWRFEP GTHYHTSIDYDVFKFMSANDFTFGFVISLYDTERSIETLWPETLKFIEQNPQFVNKNAAWDWLTEFKQNPQKTRI ANGYSTCHFWSNFEIGDMDFFRSEAYTKWVNHLDATGGFYYERWGDAPVHSIGATLFQDKSKVHWFRDIGFYHAP YYQCPNSPQSDGKCEVGB FSFPNLSDQNCLINWIEMVADNELSMY
SEQ0148
P. pastoris homolog of S. cerevisiae KTR2 (YKR061W)
Mannosyltransferase involved in N-linked protein glycosylation; member of the KRE2/MNT1 mannosyltransferase family
S. cerevisiae null mutant: decreased competitive fitness, viable
Chr 2-2, 0105
5' region gaattgacaggcccatcgcgtctgctgatgtaatcataacagagaaaaaaaagtttagct caacttgatacaatatcaggcgtctcaagatcaaccaccgatgagctttgagttatcagc ttccgatattaaacagactcagaagtcaatagctgcactgatagcccacaccaagaaaac caaagattcaaaactcaatattaatctggttattggcctggaagaagaccagactttacc agaaaatgacaatactccaagaatagtccctgtcaggaatcgactagaaaagagtactga aaaatcaatcctgttgatcacaaaagactcttctgagccatatgtccatgcgcttaaaga aaagggagcacctacagaggatacgttcgcaagaataatttcctatcacaagctcaaatc cctatcaggcaaaccacaggaaatgaagaaattataccacgaatacgatcttattctagc cgactataggatttatccctttctcagaagaacactagggtccaaattttatagctcgaa ggagaagattcccttcatgattcaaatggccaaaccttccaagaaaatcgattattcaag aaccgagggggagccaccagaagacaaacgatgcgatcctatatatgtcaagaaacaggt catttcaatcgttaagaacactttttteatcccttcaaacaaatctcatttcatcagtgt agttgttggagacactgacaaggatccaaaaaaactcgctgaaaatgtatcagacatact gtctttcttgaccgatattaagcgcaaaccaatcggaggcttgctagctgctgacggaat caaatcacttcacgtgaggacgtctgaaagcattcccttaccagtgagagaaaatgatga aatatagaactcagaaaccatgtattatgtaatcatatattagattgtccaaggcttatc acgctaagttatttcgttgcacctatcgatggcgactactt
ORF
GATTTAGTTCCACATAGTTGTCTTAGCAAATGGTGGAAATATGGAGGTAAAACATTCCTCCAATAG
Downstream aaaaatgatcatacatagtatatagtacagggtctacttttatttatagggatcttctaa cctgttcagttcatctaaaatcctttgtaagtttacctcttttttcgcagccaaacttgg aattaatgttatagcttcgtcaacatcttcacaccccagagaacttaattgggcaatttc aaaaggatgtaacacagagcaatctgaagaattatgcaatagctgatcaacggcagtaca agtctcttcatctttgaatcttgcaaacgtgttcagatagtctaatgttttcttcactac tccgttggctacagcacctgaagtcacttttgcaagctcgtcatcgtcgatttccccatt gctggactcaatatcaactccgccattacgtgcccgtgatctagctttcagagcttccct aattaacaaccttgattctgataaactaagagctattaagtcatgttcattaccatcatg atcatactgtttcaaggcaaactctggccctagccttaataatgtagcgttctcctcatc gtctacctgctgctttgctctcctacgacgagctcccaccgtggatgttgaaacattcat agtgacgtacgagtgtttagcaagggctaggaagaaactgaaaaagtactcaaacaaact gttttttgtcgatttccgcagtaacagctagaaggaatagcgaactccctctggcccacg aactctctgggcaaagtttgattattaaccaaaagcttcactaatgtagcgaattgcttt cccaaaaccttccttctaccacagattgccgtttccttccagaaatccgcctagcatgag ccgcccgcggatagttacctttcgtgttccgtgccatatcacctccccacgcttatttac agcaaacacacgaatatttagctcacacatttccatataccatcatgagttttaatgaca
ggcaaggatggaagcaaagggcgcagagtagcaatcaagga
AA
MKVAZ7WLACFI ILAAIWYP DiTQSLRGFMDDRVSKTLPINFNALKLSTNSYIPVDEHLIKPNREPNPKFVKENATLL
MLCRNWELEEVLQSMRSLEDRFNGRYQYTWTFLNDVPFEKQFIQETTLMASGKTQYALISSTDWNRPSFINETRF
EQNLIQSEKDDIIYGGSPSYRNMCRFNSGFFYKQFILDQYDYYFRVEPGVEYFCDLEEDPFRYMRLHDKKfGFVI
SLYEYENTIPTLWQTVEFFIENHPEYIHPNNSYEFLTDKEWGPLGLVALTEQTYNLCHFWSNFEIGDLNFFRSE
KYEAFFQFLDQAGGFYYERWGDAPVHSIAVGILLDKRQIHHFENIGYYHLPFSTCPQSYWSYKCNRCICKRNESI
DLVPHSCLSKWWKYGGKTFLQ
SEQ0149
P. pastoris homolog of S. cerevisiae KTR5 (YNL029C)
Putative mannosyltransferase involved in protein glycosylation; member of the KRE2/MNT1 mannosyltransferase family
S. cerevisiae null mutant: viable
Chr 2-2, 0201
5' region aagaaggacaaaaaaagacttcaaatccagaacaaattgataaaaatggaggaaaagttt caagaagaaaaggatttcatttatagaagcaatcttcatgatttgcagtatgaaaggagt actctccacgccgacgtcaatgaggcatacttaacagagatccgtgatcttcaagaaatc agagatgaagagctggtaaggctacgtctttgggaagattatcaagtgagttgcgtcacg agacagtttaatgaaggctatgaaaaggccaacaaagagtatgacagattggtcactatt ttcaaaacaaaactgaaagagagaattgagcacaaaataaaacaacttaaagaagataag gtactaatggactttagtagccatttgccaaagacgaacactcgttctcataataacgtc aatggtggaatatttggccttacccccggtaacatttctgatagagaggtcaattctaac ggctattactcttccacggatagaagatcacgccgtaggaagtttgatcataccacagat aacggtgaagactcatacgattcaaatgataacacttctggatctgcctcgaacatttcc agaagatccaagatgaagaaaactggtggtaatacatctgcaagatcttcatctgacact gacgacactttcttgaccgacgatcatgctttgaaattgttatttggttcagattacgag aagccgaaagataaaccctctactagacattcttcaaaaagctttaatggagtttcaagt cttaaaccagaggaattaaacgaagacctggagatcctaaggagtgctatccacactata tcgcatacaaagaagatggctagtaagtaatgaagttacacctgattctctgtatgtctt ccttttttttttttttttttttgtactctgtttccaatacctcgctgttagggtgtttga cgaaagatgaatgttccacaaatacaaaacctcaggcaata
ORF
Downstream gaagagaacccagaaagagtcaaaatacaaactcgaaaacataccgattggcagaatacc agccatacattacatttgatactaatatacataaaatacactaacctccgaaaccataca aggtacgtccttgtcttttcaaagcgtagacgacatccagagaagtaacagtctttctct tggcatgctcagtgtaagtaacagcatctctaatgacgttctctaagaaagtcttcagca cagctctgacttcttcgtagatcaaagcggaaatacgcttaacaccacctcttctggcca atcttctgatagctggctttgtgataccttgaatgttgtctctaagaatctttctgtgac gcttagcaccggattttcctagaccttttccaccttttcctctaccagacatatttattg attatttgtttatgggtgagtctagaaaaggacgcactcgtcttgtatttatagatgaaa agagttaaggtggacaatgcagtgccaaaccgtaatgttgatacgacacggtacgcgatg aactgagccgtactcctagaggcagagcgtgcgtaaatttgaaactgtaagaacatgtcc catttctatgctaccagaactcataataaacttgcatcccatttcagtcgggagtgtcgc agtgcgaaatactgttggcactgatggcaaaaagtgccaaatcgttatctatgggagact cttataaatatccagatcctccccctcctgcttcttcttgtgtctatcgtagtaaaaatg gctagaactaaacaaacagcaagaaaatccactggtggtaaagccccaagaaagcaattg gcttccaaggctgccagaaaatccgctccatctgctgctggaggtgtcaagaagccccac agatataagccaggtaccgttgctctgagagaaatcagaagattccaaaagtctactgag ttgcttatcagaaagctgcctttccaaagattggtcagaga
AA
MSFRLGYIQAIVLGLVLLSVCWTIVIRPDPSSAIDLASPVTIDLENSLTNLKSFPISSRRISSNIDHVFQTGCRN
VFKNKKKANAALWLARNSELEGVQKSMFSMERHFNQWFNYPWIFLNDEEFTESFKDGVMNMTSSGVSFGVISKP
DWNFSEEKDRGSTEFLRFNEFIQNQGDRGIMYGALPSYHKMCRFYSGYFFKHPLVAKLSWYWRVEPDVEFFCDLT
YDPFLEMEASGKKYGFAVIIKELSNTVPNLFRHTQSFIEKYGISVDEKAWSIFTNRRSFGEKESMKLIDKIRINH
LLSNFSGGIGTRLLSSLSRMNLPTSFSSKKPFFYGEEYNLCHFWSNFEIASTDLFSSPEYESYFQFLEEKKGFYQ
ERWGDAPVHSLAVAMFLNISEIHYFRDIGYRHSNLVHCPKNAPDELQLPYVPASPEYASSAKPDKPPRVSVRDVF
RSGRQTEGVNNLNRGSGCRCNCPKKYKELEDSPSCCIGRWMVLTNDKYKGEKYLDKYSMAEEVKQTLSKGEKLNV
KEILKRHHKYPT
SEQ0150
P. pastoris homolog of S. cerevisiae KRE2 (YDR483W)
Alpha-1,2-mannosyltransferase of the Golgi involved in protein mannosylation
S. cerevisiae null mutant: viable; increased heat sensitivity; abnormal protein N-glycosylation; decreased resistance to sulfanilamide
Chr 3, 0215
5' region tcacgtattcaagcaacgatggatcttctttatctctcagaaaatgccaaaattgttcgt tatcccaaatacgaagattccaacctatgatattactgtacgtttctagttgattgcgct ctgtagctgcaactttcattttttctgccctcacatagatttccttcagaagatattcaa aaacttcgttttggtgttgatctagggcgatgattaatggggatatggtgtccagcacca tgtatgaacgattatcgaacgtcgtcattttgcaatgacgtgaagcttttgtcaagattt tagcatatttggcaagagtcactgtgttgatattatcctgcttattgcctagcagggcag taagtgcaaattctaagcttcgaactcctccaaatagactaatcaattttttcaattgtt caattatattggctttgttagacagttctaaatccaactggaaaagcaatggataatctt cgggattttcaatagaagagattctgaatctatctatgatctctatcaggagggatagac tcatgaatttgtacttgcagacctctgcgtctaataccaaatgaaattcctcactagaat ctgaatcacatttttggcttataaaactcttggagattactttgtagattaaaaaatccc tttctttttttggtgcagagtttaagttgagaacactgtagaaatgacgattcagggcgt agagcggagccaacttactgaaggaaaatacttcatagattagctctggaggcaaatctt caaaacctatctttcttgccttcttactgggtacctttctaagatatttcttgattcgtt tatgagttcttccagggatgaggaaccccattagggacgaagtgtattaaggtttttttc ttcctagtgttttttttttttgtctccaagaaccagcgaccgagggggggctaagctatt taacatttctccaggctcgtctttcttagctatctagtaaa
ORF
TGGAAAGATTATGACTAA
Downstream aattgaaagaaggatacaataaaaaaaagaaactcaaaaccgtttttcatttggtctact taaaaaacgcctgcagtatataatatttaaaaagaacacctggtacccatcactatgcgt ggaacatgtaccgtacccaaagacaatagaaacatctccgcgacatctacctattcctac atatgtgagtgtgaaattccatggaattagtgcgcctggccaatcttgtcaacgtcaacc accctttcgagcaaagcaatatatatcgcgttccacttttcttccttctctcaactacca gaccagacaggacaacggtacaaatggcaggtgcaactaggatcaattcacgagtagttc ggtttgctattttcgcatcaatcctggtactgttaggattcatcctatcaagagggtctg ccacatcgtactctcttccctcagggttgactagtgacacgtcacaatcaactggctcat ctcccaaatctgagtctaaaccttcttcccaaggcagcagtggtgcaactgaactgaaga agacttataccacggacggaaaggaaaaggctactttcgtttctcttgctagaaatagtg acgtatggagtttggctagctctatcagacacgtcgaagatagattcaaccacaagttcc actatgattgggtcttcttgaatgacgaggaattcagtgacgagttcaagcgtgtcacct ctgctttaacttctggaaaagctaaatacggtctgattccaaaggaacattggtcgttcc cggagtggatcgacaaggagcgtgctgctaagaccaggaaggaaatggctgctaagaagg tcatctacggtgactccatttcttacagacatatgtgtcgttttgaatcgggtttcttct ttagacacgaattgatgcaagagtatgaatggtactggcgtgtggagcctgatatcaaga tttactgtgatatcgactacgatgtgttcaagttcatgaag
AA
MVHIGFRSLKAVFILALSSLILYGIVTTFDGSRASRYQPPYVNHSQDPLYHSGNSYNRENATFVTLCRNEDLYSI
IQSIKKVEDRFNNKFAYDWVFLNEVPFTDEFKERTSVLISGQAKYGLIPKEHWSYPDYIDQERAAESRRQLEDQH
WYGGLESYRHMCRFNSGFFYKHPLMLDYRYYWRVEPEIEILCDVETDLFRYMRENNKTYGFTISIHEFEKTIPT
LWETTKEFMKQNPSYIAENNLMNFISDDNGKTYNLCHFWSNFEVADMDFWRSDVYEKYFKFLDDTGKFFYERWGD
APVHSLAVSLFLPKEKVHFFNEVGYKHSVYSMCPIDKDIWKNRKCYCDPNTDFTFRGYSCGRQYYKATGLTRPSN
WKDYD
SEQ0151
P. pastoris homolog of P. stipitis CBS 6054
α-1,2-mannosyltransferase
Chr 3, 1162
5' region ctggaaagctacagacacatgtgtcgattcaactctggattcttctacaaacacccgctg atgctggactatcgatattactggagagtggagcctgaaatcgaaattttatgtgacgtt gagacagacttgttcagatacatgcgagaaaacaataagacgtatggcttcactatatca atccatgagtttgagaagactatccccacactttgggagactactaaagagttcatgaaa caaaatcccagctacattgctgaaaacaacctgatgaattttatctccgatgataacggt aagacttataacttgtgccatttctggtcgaattttgaagtggcagacatggatttttgg agatctgacgtttacgagaagtatttcaaatttttggacgataccgggaaatttttctat gaaagatggggagatgctcctgttcattcccttgctgtctcgctatttctaccgaaagaa aaagtccattttttcaatgaagtgggctacaaacatagtgtttactccatgtgtccaatt
gacaaagacatatggaagaaccgtaaatgctactgtgatcccaacacggatttcactttt aggggatattcatgtggaagacaatactacaaagctactggtctgaccagaccttcaaac tggaaagattatgactaaaattgaaagaaggatacaataaaaaaaagaaactcaaaaccg tttttcatttggtctacttaaaaaacgcctgcagtatataatatttaaaaagaacacctg gtacccatcactatgcgtggaacatgtaccgtacccaaagacaatagaaacatctccgcg acatctacctattcctacatatgtgagtgtgaaattccatggaattagtgcgcctggcca atcttgtcaacgtcaaccaccctttcgagcaaagcaatatatatcgcgttccacttttct tccttctctcaactaccagaccagacaggacaacggtacaa
ORF
TTCAGTGGGTAA
Downstream ttttagttttaactatggtcttacataggcgcgatgtgtgctatctagaaaaatgtatag tcgaaacaatcttgtactcgatttcagttctataaactttcaaaagattcttgagggagt ttacgtttcatccttgctaacgttgcatcagtttgcttatgcagatcatcaaacttccat ggctctgtcatcaagaagagcagaggaagttcgaggtttacacaatttccgtttctgctg atggtaaaagactagcttctggaggtttagatggtaagatcaaaatatggtctatggact caatttatcaatacaagaggagcgataaagagaatgatcttaggtacttacaaggttcag ttcagtcggatgaatcaaaaaatgcattgaaagttagtgagaatatctgtcgtcctctct gttctatgtcaagacacactggagctgtcacttgtcttcgattttctccaaacaatagat tcctggcttctggctcggacgataaaatagtgttaatatgggaacaagatgaagagtatg agtacgattcttctgttatggagggcatgaacccagttttttccaacgggtcatctggag atcaaatggacatggagcgctggactgtccgtaaaagattagtggctcatgataatgaca ttcaggacatggcttgggcgcccgatggcagcatccttgttaccgtgggtctcgatagat caattatcatatggaacggacaaacatttgaaaaaatgaaaaggtatgacattcacaatt ctcacgtaaagggaattgtgtttgatcctgcaaacaaatatttcattacatcttctgacg atagaacttgccgtgtcttcagatatcacaaaacttcacccactgagatgatttttagtg tggaacatgtgattacagaaccgttcaccaaatctccaatgacgacttattttcgaagat tgtcgtggtcaccagatggtttaagtattgctattccaaac
AA
MAGATRINSRVVRFAIFASILVLLGFILSRGSATSYSLPSGLTSDTSQSTGSSPKSESKPSSQGSSGATELKKTY TTDGKEKATFVSLARNSDVWSLASSIRHVEDRFNHKFHYDWVFLNDEEFSDEFKRVTSALTSGKAKYGLIPKEHW SFPEWIDKERAAKTRKEMAAKKVIYGDSISYRHMCRFESGFFFRHELMQEYEWYWRVEPDIKIYCDIDYDVFKFM
KDNNKMYGFTVSLPEYVATIETLWDTTRAFIKENPQYLPEDNMMDFISDDDGLSYNGCHFWSNFEVGSLSLWRSE AYLKYFDHLDKAGGFFYERWGDAPVHSIAAALFLHRDQIHFFDDVGYFHNPFNNCPVDADLREERRCMCNPKDDF TWKGYSCVPEFFTVNNMKRPKGWEAFSG
6.3.2 x-MANNOSYLTRANSFERASES
SEQ0152
P. pastoris homolog of P. stipitis CBS 6054
Mannosyltransferase
Chr 4, 0536
5' region ggtgatcaatggattcaaagtattgcatgacatgacattgttgtttccagaaattgacca gctcctggaactccatttgagectteatatataggacatttttcagtggtagatctttga cgaatctctgtctttcaaattcttgtctgatctttgttctgacttgggaagctggaatat ccaagttgtaattgttgacaaatcctctagagtgtcttaagtatctacggtataaattga gcaccagggttctcatttccgtggaatccttgacataccttgtagtctgggcaaatgcag ttgggttggtcatcttatgttgaagactcctctatgtgtatgatggttcaggtgcagtaa tcattcgtcgctaattcgatgagaggatattgggtgttgaactacttatcgaatagttag gaattttcaggaggaggtaggttgattaaatacgttctattaattggaatgttgttcgtt tcggtcgtattacaatcgaaagctcctttagcaagtagagtctttgaaaaagggtttgac taataaggctccctttcattctatttgacggtagggcacacttcattctcagatctaacg tgactccaccaattttgaacgacgaccactgaacacagttgtaagaacgaatgacaatga caggccacttacctaaaagcgctactcgcaagagctgacgatcctatctggactacagat gtgatccttacaaacttaagatgtctcgcaaaaccatttgaaggcacactatacgatgaa aagatgctcttgccttgagactcaaagggttactacttttaaaggctaaaaaggtcagaa ctcttcaaacacggtcacaacatcaactggtgctcagtatctattgcattcctaacctct atcagccttttctgcctttcgagcttccgctacaacgtggtgttaggattcgttatttgg cttccatctccatctttaactccatttctgcttagctgtct
ORF
CTTCAATAA
Downstream agtaaataaagagaagagttatatatgtctaaatctatacggagattgtcttccaaacat tcagtccatcagatcctgtgctgaccaaagtacctggtatttggctgtgccatctcacat ccttcacatctctttgccaatgaacaaaaagtagttgtggtggaatatcattcagttctt tagtttcttgttgctgttgtttgatttcctcgtcatcggcttccactgccaaatcccaca atgtcactgtgttgtcctcggatgaaacagcaacaatagactcgtcaagcgggttgaaag aaatagaagttatggcagatttgtgaaaatcatagttaacaacaggtgctggagccgggg aagagccgaagcttctcagatcccatataccccagtttccatcgtcatgaccagaagcca gcaaatagtcaaccttggagcaccaggaaatcacattgacatctgtgtcggaggccttca cactgatagcaggcttgttgtttttagaacgagtatcccagattcgaacatatccatcgg tgcctgcagttgcaaacacagttttttctgaagtagaccattgaatgtcttcgatggagg catcagacgcaacaaatggggtgccttctgttgtccagctggatgtagttctcgaagtga agtaaacattaccattaacatcaccggataaaagggcaccggtactgatcaatggagacc aatccagtccatatccttcgaccttgccatgatttcttatagtgtaaataggagctctgg attgcttggggataacaaatccgggtgtggtgaacgctctatattgggaggcaacatcaa atagttggacctcaccgttttctagcattgatgctgtcagatactccccagttttagcag catgaggagagactctgagtctattcgttgtactctttagtggtatgttctcgttttcca agatggggtcagaagtgctgccctcattatcctcatcgtcg
AA
MECLVYKDASWGFVLATLMATGIFLSYVPQHSRIMKRRTSEGLSPFFLLLGTVSAISAFANLLLYSNDGRICCR
LGLSDFQCINSQIGMIQVGLQTLGYSLILVLCVYYTRDSLTENQREYAQLLKVYRFFLFYAFLNLFVWYLIKRK
QDTEIFVQFANLSGVFSSLVGGFQYLPQIYTTYKLKHPGSLSIKMMSIQTPGGFVWTASLFLQPNSKWSSWLPYF
TAAMAQGILLIMCIYYKHTYNIDELEREQLARITEENLAYTSLETTDDGLLQ
6.4 ER glycosylation pathway
SEQ0153
SEC59 (YMR013C): Dolichol kinase, catalyzes the terminal step in dolichyl monophosphate (Dol-P) biosynthesis; required for viability and for normal rates of lipid intermediate synthesis and protein N-glycosylation
S. cerevisiαe null mutant: inviable
S. cerevisiαe conditional mutant: decreased protein O-mannosylation; decreased protein N- mannosylation; decreased GPI anchor protein modification
chr2-l_0498 5' region ttttcctatgcttaagcactctgtaaaaagattgatcctccttcgaacgatgtctaacta ccctaaagcgactgaagaaagatctaaagagcttttagagaatgttaatcatatattgga ccagataaccagtgtaaagtcgcccaacgggaggcctgtcaacttggtttgtgtgtctaa gctgaaaccatcaagtgatattatggcactttatgatgctggctttcgccattttggaga aaactacgttcaggaactggtttcaaaagcacaagagcttcccaaagacatcaagtggca ctttattggtggtctccaaaccaataaatgtaaggatttagctaccaggattgataattt gtatgcagtggagacaatcgatagcatcaagaaggcgaataagttgaacagcagccgtga tgcatccaaacctaaaattaacgtcttcatccaggtgaatacttccggtgaagaacaaaa gagcggcatatcctcttatgacgatcttctcgcattggccaaggtgatcaagaatgattg ccccaatctcactttgaagggtttaatgaccattgggtcaattcaacaatctacagcagc cggtgaaaggaacaaagattttgaccagttgacggcacacaatgaaaaactagaatctga tttaggaaccacgctggagctctccatgggtatgagctcagacttcgagcaggctataag gcaagggtctacaagcgttcgtgttggtagcagcatatttggtgcaaggcctcctagaaa tgggcattagaacgctacatagagtgtacatatttggtaggactgttcaagatcaggtaa tatttcaagggctctagcccgtttttcccccggtggtggatgttccctaccagtcgattt tgttccagcctcgaaaggaggtacctaccatatttttttatcaaagccagccgccataat ctgatccacctgtcaaactattccttctgcttattccaaa
ORF
ATAAGTGATTTCAATGACAACATCATCATCCCACTAGTCGTGTTTACAATATTGCATGGTCAATA1G
Downstream aatggttatagtctcttaatatatagaaggatgtatatataaaaagttgactaactatat agagacgttttatactaaaactgatatctaaagtactactcattcgctaccccaaactaa accttgttggtggagcaagttgtgtgaatggatgactttgccaagctcaaattcatcgac ttgtctaccttatgcgtttccaggaataaacacctgaagcttcccgagcaaaactctggg caactgaaattttttgagaagaaccagcacttctttgatgtgcgagtgtatgatgacatc
gcgtggaacaattatctcctagaaacaattaaatcaaacaccgacggagtagttatacta tttgactgctttttagaatggcacaagattcagccgatgtttctgaggggctcccaatct aatcatttatttgtcgtttcggagtttactgcaattgaggaattgtcgaacaaactctcc aagtttgaagatatcttgetcaaactggatgtcgaggcaaggaagttaccacttaagggc attgtgctaaccaatatgtcagtttattactggcaatcaaaagataacggtaaaaagacc tggttttctgtggttgaccaattggaacgattgcaggccaaatacgggtgcacgattata cgtggaatgtgggactttacctctaaaagggaataaacgctactgcctgcaattaacaat gtcttttgacccttgttcattcaaaagctaaggctacacatccttcttgctacctttaaa ctagaagacactatttccaaatcaggaaataacatgtccacttttttcttcattcttctc aaatttttttttttccctttaaattcttctctctaagaaagatccctaaacccccgatca atgtctgaagaacaatcaaaaaaggagatcctggagcaagtggaattttacttttccgac tccaacctaccaaaggacagatttttatacactacagcgc
AA
MAKKKAAAKRPKKTLPPASSKPLNSGENPDESQAEYSGLPPPQLSFKSAIVAEMTSVISNGITINKVIQVLILAY IANLLYVKLDLNFDYDSKVEIGIGFLSILSSILIIYVSRKRRWNTGVELGSIDKDDPEKWVELPEFNLIYAIVLP IFVTYIVDKRYLGVNLVNILLVIDIPIAFKILLALSMEQQVLDESGYLNQGFVTTRNNLLIPVAHVGFRELLGYY
VYYGSVYQLAPVLGKHPWDYLVNDIMNEQRIHFLYSWLILLVVWPPVFYFVDAIPLDFRRKIWHFTILIILSYP LSIDPSFCVIALGGVFGLFLVVEILRSTEMPPFGQILSRNLEKFQDERDKRGNITISYLYLVLGIVLPVMFDGSS CAGLVSLGLGDSMASMVGKRYGLVKWPGSNKSVEGTFAFIWTFLGLLAARTFFGYQFSWEISFIAAALAGVLEG ISDFNDNIIIPLWFTILHGQ
SEQ0154
ALG7 (YBR243C): UDP-N-acetyl-glucosamine-1-P transferase, transfers Glc-Nac-P from
UDP-GlcNac to Dol-P in the ER in the first step of the dolichol pathway of protein asparagine-linked glycosylation; inhibited by tunicamycin
S. cerevisiae null mutant: inviable
S. cerevisiae repressible mutant: decreased lipid-linked oligosaccharide accumulation; abnormal mitochondrial genome maintenance; decreased carboxypeptidase Y modification; decreased resistance to tunicamycin
chr2-l_0727 5' region gttattaattgaaggcacaggtgtgcatgaacgccagttcctctgttgtacaggatttct agtgtctcgacccatttagcatcggtaaatgagtctgcaaacgctgtcgctttcactttg ttctttggagtaatagttactttactatctccctgcttctgactttgtggaccaacttgc cgaatataacccagtagcaaggatagaatctctaaatcatcattagcaaacttgcctgcc gtaataatcttttccaaaaaaacatgcagtgtcttgagagttatcccttcgtctctagtg acatggttcttgtctgctgataagagcatttgttgaattatgggtagacttaagaagttt gagttgaccttggagtttgcataggctaacttcacatatatcttcgtaaaatcccataat gtttgccctgataaattaaagttaacaagtccataaactctcaaacaattaagaacaacc aacttcaagattttcctgttggaaataatgtcagggttcacacccccaagactaaaaatc ttcgatgactcttcagctgtttccctcaaatagaccttgagaaaattttgaagctgctga aaagatttggagcttaatctttgcgtggattcatcttctcgttttgataagagaaccgtt actttctgtattagtgcaatgtaggtctgtaacagatggacccatatcacggggtcctta tcggttagcgacgaacgtaatttgaagggtgggtaagctacaatggggatgtttaataca tttgaattaacttcttcactggaaggctggcaattttctactattaccggtgccatttac tggttgactgaaatgactatatcgatagtctagataccctttgatagtgatgcaaggatg tagttttccacgagcgctcataaaatcctaatcgatcttacgtgtctctaacggttgatt gattctgaatggtaataaatagatacgttcactttctaat
ORF
TTTGGATACGATAATCTTAGTATTTTCAATCTCAGCATACTGAGTTCCAACTAA
Downstream tttacatttttatttatgaatctccaaaattcgactcgttggtgaagaatttcatctgcc aatcggcccacttcatcggtcttgatttgctgacgggtcccctcaaagaattcatccaga cgcttagatcctttgtatcgtttctcatattgcacagcttggaccaggaactcatatttg tcaatgtccttcacgattcgagattcaaagtttcgttgttcctcgtaatcaagccacaga ttgaccatctcctttgcaaaagcactattgtagggatctatcaagcttcctagatacttt atggcggccaattctctttcgtatttttgctgtttggtaacagtggtgtcttttggggtg atatctccaaccaaagcttcagctatgtcgtggaccaacgaaatcttgatacattgtgat aaatctataggttccttttgagctgtggataagtttgagttgtctttgaaattggcggtg ttgagagacatagatatgattgacattctatacatatgatctgaaatagactctgcattg tcaatccccatattcaaccacccagttctcttctgtgtcttcagtagttcgacaacttgt ataaatgcactaatgtacgtgtaagggcctacttcagcattctgaagaagatccttaacc tcgctgggtacgtgatcctccgggttccaatcattggtggacatcggacagtataatgta aaatgtaaacagtagaataggatatttactaagcgaaaatttgaccccaagaaaaaaatt aagccaaagcagatatagagtgaaatatggttaattggtgaaatatcactcattcgtttt cagatcatgagatgctcagaaagtacttcaaaaacagatatcttcaaagaaggtctacaa tgcagtttagctatggaaattttttgtttaaatatacaaaagagagagtctcaaagtcga catttaagcaactaattcaacaacaccaccaacagctctg
AA
MQQLPKIGLLAVSMALICHTYSPLQPIQSSIGFAVLGYLLSDYLIPATAPYFIKIGLFGKDLSKKDKPVIPETIG I IPAWYLFIMFSF IPFMFFKFLWDTSGGGSRDTGVDLETADSNYFPHNKLSSYLSGILSLESMVLLGLLDDLF DIRWRHKFFLPAIAAIPLLIVYYVDFGVTHILIPTF IKNIFGFEAVSI DLGALYYGYMAAVAIFCPNS INILAGI NGLEVGQSWLAVLLLLNDFCYLIPASLRSTPAYETHLMSTCILVPFLGVSLSLLKFNWYPAKVFVGDTFCYFSG MVFAFVGISGHFSKTLLLFFLPQIFNFVYSIPQLFGLVECPRHRLPRFNEEDNMMYPSHAVFKKRLPKLIEKGML ILEALGLLEVVKEEMKTDNKTE 11 IKECSNFTLINLVLVlAfFGPMREDRLCFVILLLQFSIGLISLVARHTIAALL FGYDNLΞIFNLSILSSN
SEQ0155
ALG13 (YGL047W): Catalytic component of UDP-GlcNAc transferase, required for the second step of dolichyl-linked oligosaccharide synthesis; anchored to the ER membrane via interaction with Alg14p; similar to bacterial and human glycosyltransferases
S. cerevisiae null mutant: inviable; increased cell size, decreased resistance to 2-phenyl-3-nitroso-imidazol(1,2-a)pyridine
chr1-4_0448
5' region tctgaaaacacttttcgaaaacagcctcaaatcgaacgggattgatgcttctgtggagag catcataaatgactcagttgcagttttcgttgctgggaattacttggaaagctgcgtaac gggattagttgctggcactggcgtcaattgtgctttggttgtcgataggaaaatgattgg attacaaaagtcacaggaacagattcggcctttacgaaagggaaacttcgacaaagtagt tctgaactccgaactaggtttctttggtggttttctgtgtcaacatagtaccatctggga taaaatggtgcatgagaactgggcactacatcactcagttcaaacgaaccaaccacatat gactcttgctcttggatgttttcagccgattgaaatgatgtgttctggccgatatattgg agagttgacaaggctagctattctgtcgctaattgactctaaacagctattcaaaggaag tcggcaaaatatccctagtgcatttaacgttccgtatagcctaggaggagcaattgttag tgaaatctacgaatcagaaggacatctacctgagctctgggctccccttgccccttcaca
agaagattaccagatattgtttaaggttgttgattcgataattgaaagagcgtctgtagt tttggggtgtgctatcattgctcttctgaagctcagcgagtcttcagctacaattgaccc gactggaagtggttcatgtcgaccttaccgaatcgggtatgttggttcagtgctaaagtg tttcaagccttatcgggaaaaagttgaaatggttttgaatttggcctcaaaacttcaaat catcaatcgaccagcctccctcatttacatagatgatagtaatctagttggagcggcagt ttcagtcttcacaccagaatgattagattagttagtatgcatcgcgcgatcagaatttac ctaccttactccataagcgcagagaaatatcatccattct
ORF
CCTAGCAAGGGAAGAATCATACAAGATGTTATATTGAATAATATTATTAAGCCATAA
Downstream aagcacaaggttgagtctaaagttacattgtgagtgtacttccgtcgcttatcacatcaa gggaatcactcagccactttttggaaccaaaaatgttattcgataagccttctggcaact cggcttttttcaagttgatgctatcttttatggatattaagccagtgaaacttagagtta gcagtatcttatcaagagtgaaaaagttgtgtttcttttcatttgaattgtgcttggtca ttgatgaaatcagagtcattctcaagatgtataaccatatcgatctataagtcgcagttg cttccaagtttgactcttgctcaatatccagatctatggaatcttgagcaggtcttttgg aataaaatgcgactaaaaacccagaaagtagcccaattatatgcagtctgaacatgagtg gtactttggtgagtgacctccatatccatgacatggatggatttcgcccttttcttgtgt aatatgacatcaacaacgacgtggatgacacagtaacaacagtcaaggagagtttgagac tttcttttacgctttttatgactatctgtttgtaatacttccatttgctagccgctttca gctgttccaattcttccgtgetaagtctcaagtteataaagaagaaaaatggaaagaggt attcaaggactaccgtgtattttcttggcaaatatcgcaacagaaagtttctcagatcaa atgcaaatcgatttttcatgctattcttaccaattatgctttccagttcatagaaagatt tgaccatatcaccagatgaaaccatgcgagaagttcctcttttgactaataggccttcac ccataaagtttaagatgttcctgaaatatactggacagttctcgtaatccatgataaacg acttgaaaatctgcgagtaacataatgggaatagataccatgaacgtaagagtttgtctc tctttggaacactttttagcgctttgagcctacgaatgaa
AA
MKSVLVTGGATVTFVALLQLTLNEKFISALKKNNFDKLWQYGTQPTGESLFLSLINKLTDEDYKKSQLGQLYNI KLKDGFLIQGLSFDTDFVKNYTSKVDLVISHGGTGSILDTLRAGKKLVWVNDTLADNHQLELTEKFAEREVLGY CRKNTVEELIDQVNQAESREFKRLAPSKGRIIQDVILNNIIKP
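A listed ORF fragment can be checked against the listed protein by translating it and comparing the result with the C-terminus of the AA entry. The following is a minimal sketch for the SEQ0155 entry above. It assumes the standard genetic code, and the helper name `translate` is illustrative and not part of the listing.

```python
# Minimal sketch: translate the 3' ORF fragment listed for SEQ0155 and
# compare it with the end of the listed AA sequence.
# Assumption: the standard genetic code (NCBI translation table 1) applies.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate complete codons of an upper-case DNA string ('*' = stop)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# ORF fragment and AA C-terminus copied from the SEQ0155 entry.
orf_fragment = "CCTAGCAAGGGAAGAATCATACAAGATGTTATATTGAATAATATTATTAAGCCATAA"
aa_c_terminus = "CRKNTVEELIDQVNQAESREFKRLAPSKGRIIQDVILNNIIKP"

peptide = translate(orf_fragment)
matches = peptide.endswith("*") and aa_c_terminus.endswith(peptide.rstrip("*"))
print(peptide, matches)
```

The same check can be applied to other entries once line-break spaces are removed from the sequence blocks.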
SEQ0156
ALG14 (YBR070C): Component of UDP-GlcNAc transferase required for the second step of dolichyl-linked oligosaccharide synthesis; anchors the catalytic subunit Alg13p to the ER membrane; similar to bacterial and human glycosyltransferases
S. cerevisiae null mutant: inviable
S. cerevisiae conditional mutant: increased heat sensitivity
chr3_0944 5' region ggatatctggttggctgttttgaataaatattcagtgacaggtgttggactgaaggctca catgtacatcgaagaggcagtcgatcctgatttctctgcctctacgccgactaaagaggg agatattgactccacgggactaaacattgcacagttgacgtcatccgttaaccctgaaat tgcagctcgtgttgctgagttgttaaaagaccctgaatttgctgttcatgtaggcaaaga tactgaaaccaagaagcctacagaaacaccagctactccaccagaaacttcataatcatg ccctcagaatattcaatgttttctttttgtacatagaaatatacaatactataactcagc catacatgtcgtcaaaatcatcttcctcctcactatcagagggtagtactatctccttcc ttttcgtcacgtcaacacgacacaacgggcaggtcttgttcaaatgcaaccatggtccta tgcagtctaagtcaaaatagtgattcttgttgcatggcaacctcacaaccaaagggtatg gatcgttgtgaaattcattagtgcaaatgggacatgtatcgtcaggagataaatgcttga cgttaactctatccaattgatctaaataggcatcatcaactcccttggcagtatctgtgt
ctatttgatcccccagagagttcagcaaatcaaccatagaatctaatctgtttgaatcta gactggaactactctggagctggttgagaagcatggatactacggaataaggatcatcat taaaactgcttcgagactgactggtgggtctattttccaggtacgtctctaccatttgtc tgagagtgggccttcgacggccattgttagcctgagatgacatccagctgtatctggttg aatgagaggcaattgattatcaagttacaatattttcgctaaagccattgcacgtttggg gtaggacaagtatactattgttggttcgtcctacaacgag
ORF
TTGGCCCAAAAGTATAGACGTACCGAGTACCATGGAATCCTTGTCTAA
Downstream acgacatacaaaaacacaaaattgaaatagattaaaggaatgaattataatattacaaac tgaagtataaatacgcgttccatttcactagtaggaggttagttgttgacaatttcttct tttatttaaaggatcatgagatgaaaaacttgggtgtcagggttaaaaagctctgtttta ctgaatccacataatatttcaggtgaagaaattcatgttgtaaaaatgttcatcatccaa ccaggttattcagcgatccttactcctaaaaacttacccatggctgaggccaaataaaat agaagacattgctactaacaaaaccgatagctagtaaatttttcttcccagcatcccaca aggttgagaactgtgtgatacacgaccacgcggttgaacggatttttattcagggttgtc ccgaaagtataaatgtgggctgtgtaattttacccttaatcacattctgcgcggaacata ctgcatgattagaaatatttatgctctcaattaattgatcggcattttttatgattcagc ctgtttttccggtgggggcgttttttattgttcgtttcaagttgaaaaagctaaatttca ggtagttacctttttactttaatccctaactattgagggcaaagcctccttgcaggccta ttataagagctattaatactgtttaattactaatttgctcctatatcatgaactctggaa ttctacgtcatatagcgtcgcatgcagctgtgcctattaccccccttttgagagtgccgt ccaatcattgtcggtcacacgtctcgccgaaaaagcttttttttgttgtttctcggacgt tgccaagattaacgtcgcaacccttgcagagtataggccctgaaaccagaattttatcag ccaggtttgccacatcaaaaatcactcaactctatttttactaccgtgttcgatttttgg ccagagatatatatgccctttggcccccagaatagctgtt
AA
MTTS IYCLLALL ISLTLVVIRVIFCVPFCRLQDVSPVLDKPLSVLI LLGSGGHTGEMLNI LSQLDHKFKYSF IVQ SNDESSVLRLEKSQVKGTVYTVPRARNVGDGLLRS IQGTLKCWLGTMKVLVFDKKWKEGNIPSVLLVNGPGSCVP LAYS IWLNILGLASARI I fMESLTRVNELSLSGKLLYLVADRFWQWPELAQKYRRTEYHGILV
SEQ0157
P. pastoris homolog of S. cerevisiae ALG1 (YBR110W)
Mannosyltransferase, involved in asparagine-linked glycosylation in the endoplasmic reticulum (ER); essential for viability, mutation is functionally complemented by human ortholog
S. cerevisiae null mutant: inviable
Chr 2-1, 0759
5' region tacaggcatctcatcacaagtccacgtagaagagtcatatggaactagaatggttgttgt tgatatcactactaatagattgcgatacagaacttgtcacaaataggatatcaatgagcg gcaaaatgactacctaaatatagcgtctcttgtacagatgattcataaacctgcgaatcg tcatactcagcagcagttgcaactgtaacagcctggctgttttccgtccattatcgctaa ttgggtggtcgactttatcactcaattagttaggaccagtcagcggatctattttctgag ctagtccatatttacggactttgtattcggatggtgtctctagagattatcaacaactaa attaactaccgaactatttattaactaggagattggaaaattccttgatgtcaataacac catcattgttagtatcagcttctctaagcatctcgtcaacctcctcctctgtcaacttct caccgatggaagtcagaacatgtttcagttcacccttgtcgatttttccatcgccatcct tgtcaaatactcggaaagcctcgaagatttcggcttgactatcagagtctctcattttcc
tagccatcattgttaaaaactcggggaaatctatgcttccatcggtattgctatcaatct ctcggattaaatcgttcaattcactctctgttggagtttgacccaaagagcgcatcacaa ttcctagctctttggaagtaatcttgccatcttgatcttgatcaaaaagggaaaatgctt ctttgaattcagaaatttgggcttcggacaatttatcggactaagagttgttagtagaca attggctagaaacaatttatcaatacttaccattatggcagtaattacacggtacaaagt gaggagcaggttattttgagattcaagagaaagtatctctttaggttaggacacgacgag catgttgtcatgaattttgtctctataagatctctatttgc
ORF
TAG
Downstream cggagatatggtacttctgaacgatgacaatgttgatcaccattgtccactattgggata cagaatgaactcttttcatcttgaacgcatgagtggaaatatatcataaggcgtcctagt aaggaaactctgtgattcgatctcgaccatcagaattattgggcatatcttatgtattat ttgtatatatagctgtatatctgctattgttctccgtttggctcttacggaacgtgccaa aaagtatctataccaatccattagcaatatgcaaccaattaccttgatacaccggtcgaa tgaacaggaactgtaggaatgcttagaactcgtttattattcaagggtaataccaagcaa ccaggcacgagtaaagaccaaagatagtcaagtcagaactcttgatcgttactgagttag tactcagtaaattgatatcacgaaccataaaggccgacattagactataccaagagagaa tcatcacctaaaagtccaaagcatctctagcccaagtcggcaaaatgaaagatgcttcgt gtatttttttgttgtagtaacgcaagtgctctttctcaaatgtgtcggaaaaggatctca aagggaacttcaaactagcgttaggatctttgctacacaccataaacccaatctgacccg atggatatgttggaatggtacagtaagcgtattcaacaacaggaaaaacctctaaacaag ctttcttaaggttcttgatgatctccatatgaagccaaatattctcaccttgagtagtga taactcccttctcagtcagtgcgtcatttaacaatttgaaatacggtttttcaaacaatg aagctgctggtccttctggatcagaggagtcagtaataatgacgtcaaaagtgtttttgt actcgtccaaaaatttgaacccgtcgccaatatgaagtgtgactttgggatgcttatatg atttagccatttctggaaggtagattttggatagacgggga
AA
MSQLKEGLFKLFDLPTWCWLLVFVYATIPLTFYYFLPMLGSCLRKRSCIADFAVILVMGDLGHSPRMNNHALSFS RIDYQVELCGYIDSKLAFDIMHSDNISINQIQALRNTVGLPYFLFAMWKIVYQLLQLLKLLMRVMQDSRYILVQN PPGIPS ILVIVILKKVLFPHCKLI IDWHNLNYTILNLKYQNLHHPFVRFLRFYEFQMSKYSDLNLTVSESMNKFL QTEFGISSAKLVTLYDRAPTQFKPILDPFQKESVMKAHPRLFQHSIFFSKILVTSTSFTQDEDLPSFLKALKFID CSLKCKILVIVTGKGPLKNNFEEQCNALTFSNILVKTCWLSPEDYPKILAIΞDLGVSLHVSSSGIDLPMKWDLF GCGVPVASLKFDAITELIKEDVNGVLCDDAFSLGTTIQRLFENAAELQALKQGALEESLKEWNVEWNKKLGSLLT
SEQ0158
P. pastoris homolog of Candida glabrata ALG2
Alpha-1,3-mannosyltransferase
S. cerevisiae null mutant: inviable
cm, 0002
5' region attgtcttgtgagttggttcactgggaagagaatacgctgtcagctgtgcgcttgcgcgg gttttccaatatctgtgcagaccctaaaagtttaaaatttggtctctagatagaactctt ccacttatcaatcgccttttgtgccattttttcattcccctctgatggactttgtggaga aggcagagtaagtgcatcgtctttttagcaatgacttaactatctaacaatccaggaaaa cactcgccacgtattcaacaactagagacctgtgttcggcagtagtatctaatgagtaca accctggattaccccatttcacaaggagtttagtgtggaagcttactttgattacagaat atttcaagtccaaaccacatttgccttcatcgatgaaagttgattattcggttttgactc agactcgaaacacttatcagaagtacagtgagaaatacccagtaccgtggagtttactgc agcaaaatgatgaattttaccaagaacaggatgcttctgttgatcccctttctaacacag acgatttagccaaactgcagactttggtgacagattgtaaacgcctgtttccaagttatc ccaatttatttatcaaggacaaggagaactctagaaatatcatcacatctttatttatat ggtggaagctcaacgacgatgtaaatgatacgaactctgggtatcgccaaggcatgcatg agatctttggactgatcatgatcagcttgatcaaagaacagctaccacaagggaaaccag agcaattctatacgaagcatccggtggcccaattgtacggtcccaaatttacacatattg actcgtttacgatattcaatcgcatgatgtataagctggctccatcattttttcaggaag acaatttcatccaagaatccattaaatttgacattctatttcacaactgtgaccgatacc accatcaatatctgacgaagacgttgaagattgactcttca
ORF (underlined intron)
ATGAACATTGCATTCGTTCATCCCGATCTGGGCATAGgtaagctttgtcatgaatgtagaacaatagaaagaggc
GCAGGGTATACATTCAAGCGAATGAGAGGCGATAATTAA
Downstream agttactaatggctacataaatagagtagacaaaaatataatacaattaattctccctgt aatttggttgatactgattaaattgaccttggaattgatttcccatgagcggatttgaca attgctcatttggtagctgggaggtttgattcggcccttgattggccaaatttgatggaa ttgcaagaggtgagctcatagaagttactgggctggttggagaattttgcatctgtcttt gagcagcctgaagcacttgtgcccttgactgggcttttagctgtgcctgttgaaattgag tctgtgcttgtgattgtacatgtgatggctgttgttgctgctgctgcatctgattttgaa ttgcttgattttttatctgctgttggagctgaactaattgcatcctttgaaattgagctc ttttctcgggtgtaaggtttgcaaactgttgttgctgttgcggagtcatacgcatgtgga gattcctcagatgctgttgctgctgcagtcgaaattgttctttctgcttttgttgtaact gcaattgttgctgaagctgcaattgttgcggcgtcatctgagaatactgttgcggcagtt gattcgaaggacggttgtcggggattggtgattggtgggtaatcggtggtgactgttgct gttgagcttgttgctgttgctgttgttgtaagagtagcatttgctgttgc
AA
MNIAFVHPDLGIGGAERLWDAACSLQKIENQVI IYTSHCDKTHCFEE IKNDEIESIWFG
DFLPTQIAGRFS ILCAI IRQAWLILRLALSGKIKEHDVFIVDQLSFCLPILHYLKRSDAR ILFYCHFPDLKLASRDTTLRSIYRKPFDLLEQYTTICADRWVNSNFTKGIFKETFPI IS KYYEPTVLYPCVDTEASS IDDTTEEEMKEFFHSNERFFLSVNRFERAKNISLAIRVFAKL KKENLQFFKENRLKLIVAGGYDSRVRENVEHLIELEDLARSLGLKVIS IRGRLFΞYPASD VIFLPSVSSDIKDYLISKAEALLYTPGFEHFGIVPLEAMKFGTPVIAVNHGGPTETWDD AQPEPTGYLRSNEVDPWYQACSQVVRLSDEERQKLSANSKKRVETYFSREAMGKAFSDNI DQMLTEPTSDRFSYEKLIDIFFVLGLVLLNVIPIWAGYTFKRMRGDN
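In the SEQ0158 ORF entry above, the intron is distinguishable in this text only by lower-case letters. The following is a minimal sketch of separating the exon and intron portions of the first ORF line by letter case. It assumes that lower case marks intron bases in this rendering, and the variable names are illustrative.

```python
# Minimal sketch: split the first ORF line of SEQ0158 into exon (upper case)
# and intron (lower case) portions.
# Assumption: lower-case letters mark the intron in this text rendering.

orf_line = (
    "ATGAACATTGCATTCGTTCATCCCGATCTGGGCATAG"
    "gtaagctttgtcatgaatgtagaacaatagaaagaggc"
)

exon = "".join(base for base in orf_line if base.isupper())
intron = "".join(base for base in orf_line if base.islower())

# A start codon at the exon start and a 'gt' dinucleotide at the intron
# start are what a spliced nuclear gene would typically show.
print(exon.startswith("ATG"), intron.startswith("gt"), len(exon), len(intron))
```

Only the portion of the intron printed in the listing is available here, so the sketch does not attempt to reconstruct the full spliced coding sequence.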
SEQ0159
P. pastoris homolog of S. cerevisiae ALG11 (YNL048W)
Alpha-1,2-mannosyltransferase, catalyzes sequential addition of the two terminal alpha-1,2-mannose residues to the Man5GlcNAc2-PP-dolichol intermediate during asparagine-linked glycosylation in the ER
Chr 1-4, 0417 5' region aggttgttggattagattaaacctagttcttgaagaaatgctatcagcgcgcggcgataa ccgcgatggccggaaacgctatcaagattcatattataatcacagataataccaggtgat gaaagaaaggagattattcattgtggtgaagagcatgaattatacataaaacactactta ctggtttgaacactttaatattggtcaaattcttcttgaaaaagattttaaaacaccggt gaatgaactaaaaatattatgttgccaaatgacaggtaaataattatgtacccaaagcag taatcgatcagtaggcctttcaagagtcatccactagaatgttgttagtggagacaggtc tagcatccttgtccgagtcaacctttaatccattcacttttggagttaaagtgtttaaag tatcttcagttgtgcttcgaatatggtctggatctgttaagtttggattggaggcagtct ccaaagatctacgatcatttcctgactcgtcgtcgacctgagtcgagattccaagagcgc tcacttcaggggcgtaattgaacccttgtggcacctgagcagcatcttggttgggtaatt gctgagcgtaggtggtgtagtcgttaccagaagcttgcgctccggatcgtggagattggg gaccattcctttgtgagttctcaaagtatggagcttgattgccctctgagccaggaacat tctgaggcaaacttagaactgtgttcagctggtcgtattgagtagaatgctggtgcccct tatctgatggtacagttgaagatgaaagaagagaaccagctttagagtggcgctttgatt gtttcacctgtactctctgtgactctctttgggcggcaatagaactacgggcttgaggag taggtagcgcaattaattgcttcatatcgttgtaccatttcatcatctcctggtaaccct cagcacggaacacccagttatgacccctgtttatgatacc
ORF (frameshift present, possible around region in bold)
AATTATTCTAA
Downstream ttttgaaataatgtagtgtcgtttatagggtatggtccttaagttgttccgaatgaattt tcacttgtattatctggatttattatccgcaagacttctctttcatcctcaaaatcgtca tcttgatcaaaccttctaacctgtccgctcccaaatggacctgttgaatttgtatctagt gaggaaatcacggactgggaagatgatgacaaaatgcttccattttgatggtagttttgg acatagttcctaacaaattcagaaacataatcgtctctttctgcaacagaagactcatct tctggtaattgcgcacttttaatgttcctccttcttctgacaaagtagcatattacaacg aagagtaccaagagcgaagaacagccaacggatattcctacgatagcaccagtgctcaaa acttgtccgttattctcaatgtatgaggccgttgccgaagatggagatggagtcggtata aactgggaagaagactggagttcgtcggatgaagtgtcaacagtatttacctgagtggct tcattggctgcagggactgccggagccactgtttcagctacttggctttcaatggaagtc gtctctgaccattctgaagagtcttctgacgatgatgttatgttggccaaagtgttccaa gagaaaaatgtctggggagatgatgatgaggaaagagaaaaggtaaaattctgaactccc gcagtagttaatagtgttgaacttccttccaaaaataggtaactcccatcacttattgtt gaagtatcattactgtcagatggcaaagtagtagaaggatactcactggtggaagcggat gaagagattgagtttgccagatcctcgtggagaaaaacctggaaactgaatgtcccacca caagtttgcgtagtgttaccagcacatctaacttcacactcgatgctcctttcgctttca acggcgaaggaatccgaacaataacactctgttccattga
SEQ0160
P. pastoris homolog of S. cerevisiae ALG9 (YNL219C)
Mannosyltransferase, involved in N-linked glycosylation; catalyzes the transfer of mannose from Dol-P-Man to lipid-linked oligosaccharides; mutation of the human ortholog causes type 1 congenital disorders of glycosylation
S. cerevisiae null mutant: carboxypeptidase Y modification: decreased; resistance to brefeldin A: decreased; resistance to tunicamycin: decreased; resistance to mercaptoethanol: decreased; resistance to L-1,4-dithiothreitol: decreased; viable; inositols excretion: increased; competitive fitness: decreased; sporulation: abnormal
Chr 2-2, 0036
PATENTED BUT PATENT CONTAINS ONLY PARTIAL SEQUENCE (DD175842.1)
5' region agcacaccgtcgcagtcaaagagaaaagtgtcataggccgctaatagtccaacagctgac tcctttgagtcaatcttggtagccataatcggtaaaatagagagaaaggtgcctattttg cgcgaacaatctctcaataaggtggaatctgataaggtacagtaccatcgtcttcaacat ccgccatttaacatcgctcaattaactaccactctcgaaaaacaaaaactccattggtca aatccagaaccacaactaacgatatggctacccaagacttaccacccatctccgggtatg cacccatccagtggaagagaaatttaccctcacgtggtttcaggcctgttatttggtttc ttggcctctgtaccgctaccggatacggtttctatcagatttccttagcaaacagagaga agatcgaattgaaaagagaaaagctgtgggccagagctcatttgatgcctctactgcaag ccgaacaggacaggaatgtagttagaagaacgtttgcctactacaagagagagggagaaa tcatgaaggacgtcccatggtgggaggtcaaatcaacttactccaacaaagatattttcc atcctcctcaaacggtgttattcggtaaacacattgacagggacagcggactgatgtacg caaactcacgtaacaggtttgggcctgatgaggaggagaccaaatgaagtatttatttat taagactagagcaatatattatttttattgtccattaatggtgttcatgcataaagtttc gatatggagctgaggagtgtcgatcacattgaccacaatcgattcactacaactactcgt tacgctgataccatttgtagactgttggactaattccccgttcattteattttattcatt ttattcattgttgtttcaatcacccgcccaacggttcccactaacctcctctccggtggc cttccgtcgctcgattcttccttttacagtaaccaacatac
ORF
GACTGTGATGAAGTGTTCAACTACTGGGAGCCACTCAACTTCATGCTTAGAGGGTTTGGAAAACAGACTTGGGAG
AGATACCAGCATTACTGTCTATTGAAAAAGAAAAGTAGTTAA
Downstream aatattaggaattgtacatatgtttcaatacgctcaaccaaactgtttcaactcaatgtg actttcttgttcgtccgtatcggaagaccagtcattgttgactcgtaacttgactaattt ttcagtcttttctttctcttcttcactcaacgtgttgaccaaatccttaaataaggccgg gttaaattgtattaaatccaatagcttagaatggcagtagtttcttactttcacatcgtc atctccaaccttatcgtccacccagagtatcaacggcaaagtcagacccaatgctgattt caaattgtcttcgtctgtcagttgcttggtgaagacaatgaaaatctcaagaatcagtct aggatcttcaattttgtctgagtcgttttttctcaagttattcttcagcaagcttatcaa tgcaggaagcaacttgcgcacaatgtttttagcaccatcaacgtttccagctgttgettg tttgacgacatttctaacaaactgtgtcgtaaaggttagttgttgcttggagatcaaact tttacacaaatcttgtgcaaacacctcacagtcctcgttggtgaaaaaaaaggtgccttc cgccgacagtaaaatcattcttatcagagttgagtgtttgttgctcaacaggccaggttt tatctcattgtagaacttgatcaaaatatccgatcttcctttcgaattcagtgatactac taccagctttagtaatggcaaagtgactgaaatgatatctgtattgttcaactcgaagca taagcttataacatataacattgaagatagcaaatcaatttggatcatttcgggaagatg gctgattaacttatgaagcaagctaaatgattttttcaacaaagtcagatccttatcaga aaatggtccaagtttctctctttctatagtcttctctgacacaaacggaagaatttgaat aacaggtaaatagctcaatctcaatagttcgaatagcttgt
AA
MFSNRLLLFGLFYLRLISTFYGIISDCDEVFNYWEPLNFMLRGFGKQTWEYSPEYAIRSWSYLVPLWIAGYPPLF LDIPSYYFFYFFRLLLVIFSLVAEVKLYHSLKKNVSSKISFWYLLFTTVAPGMSHSTIALLPSSFAMVCHTFAIR YVIDYLQLPTLMRTIRETAAISPAHKQQLANSLNNSSQYLIVAKAVFWYTVGGLLGWPFALALALPFAITIFVRK VYYKELHQLAVIGALSVFIVLLILAYWQIDSICYNKTELIPLNIVLYNVLNTDDSVGPDIFGTEPVSYYILNLL LNFNFLALLGYASLPLLLIFNFLPDFNLRSSNVKLFGHEGGSNKLITLFAPLYLWSVIFFTQPHKEERFLYPIYP LITLGAAFATHQLIRIPSLLSTVFIPNKRVLHKAINLTIIILFSSFIVVISVLRVFALISHYSAPLFVYQHLDQF ASDTPKNVCVGREWYHYPSSFFLPPSMRLRFVRSGFSGMLPGDFDESVSRLSSMSNIPRELYNNKNLFESDKVIP
FEECDFYVDISKAVDSEQKE IAILAPSYSDQDPAVLLDDWSLVHCDKF IDQDHSRGLGRILYLPKRFHRKMKTNL RYQHYCLLKKKSS
SEQ0161
P. pastoris homolog of S. cerevisiae ALG8 (YOR067C)
Glucosyl transferase, involved in N-linked glycosylation; adds glucose to the dolichol-linked oligosaccharide precursor prior to transfer to protein during lipid-linked oligosaccharide biosynthesis; similar to Alg6p
S. cerevisiae null mutant: Bud8p-GFP distribution: abnormal; carboxypeptidase Y (Prc1p) modification: decreased; resistance to mercaptoethanol: decreased; resistance to tunicamycin:
normal; resistance to L-1,4-dithiothreitol: decreased; budding pattern: abnormal; cell shape: abnormal; competitive fitness: decreased; resistance to hygromycin B: decreased; resistance to ethanol: decreased; viable
Chr 3, 0999
5' region gttgattctcttttcacctggtgccatttgcttaggtagattatgcctatcgattttgta ggtcaaaagaaatacaggctactttgctttctttcccctgaagtgaagggaaaagacgct ctatctatcccactatcaggatacagctttctacaacgtattggattctgatcctagctc tgattacttgaaagacagcatcagaattggggttggaacccagttacaatataagacagg tcgcaagttaggttcaattgacacttcacaatggagattctactggaagaagaaaggtga gggtgttttgtatccactaaggacgttaaacgactttgttagatttgtcgatagccttag tgagttctcattcaatttaataagactagttgtgagcactgatgatcctgttaatgaatg tagaggggagagacgaaaggaaaaacgaaaggcaaagaaggaaaagaaaaagaaagataa gagaaaagataagaaaagaaaagaccacaaacttgggacggaactgccacaatttgacat gggagaggaaatcaatgaccctaaatggaacgtgaatgaggacctctctttagcaggttt tgattgtctagcaattgcgacttggaataatgaatccagagagaaacccctggatggctt atcgcagggcgaatctgattctgagtgcgaacagaaaaacagttccatcgtaatgaagcc agatgaagcgttatctgaaactgtacgagagcggtataaagagttgttgctccttgacga ggaagtggcgaattgcatgtcagaggacacttctggaggggaggaggactacgcgccaag atctcctttatttatcgaatcaatgttgtcggattctgactttgagcaagtcttttgacg ctaaccatggtaatagcctgtctttgtttattcaaggggccacctgtcctttcccaacct tttttattttagttccattatttgtattacttaatgtttaat
ORF
ATCACAAACAAGTTACCGTTGAAGGAATGGTATCTGGAGAACACTAGCCAATGGACTCTGGATTACCCTCCATTC
GAGGTCGCATAG
Downstream gacggaatgcgcagcaattaatatgaacatctcgcgtatcataatagacgtatcattttc accatcgttttcctcgtctcttctttttgcgtaatcggacaacgctgcactctggacaaa accacttcccctcgggtgccctcgtcaagttcacacatttgtaatgaaaccattcatact tgcaaccgggattgtcacaggcaaccatctttcctaaacctggtcctctacataggcaat agacgtcctcggcgggtggactgggcgctggtattggtgaagggagggctggaggggtta taataattttcttagaaggtttgacgactttggacttttgttttccattttgcacttttc ttttagtttttgtagcatttattgatttgcgattttgtttagggagggatatgtgaattt taatggcatcccgattctctacagggagggtattaggatgcagatcccctcttttcaacc
catgttctttcaaaaactgtttcttgaactgttcaaatgacgctcgtgtgttattctgct cgttgatttttagctgcttcaagcgctccatacagagagaatcactattcaagtagtctt gatggttatctagtttatccttgagatgttgggactctgcaacaatctcctttgagtatt tgcttgccaaatttcgaatacttacaatctttgctaacacctggggatcattaaggtccg agtctttcagtatatgatctagctggttcatgtactgatccacccgtaaattgagaaatt ggattagccataatgatctaaccacatcgcaaggtaaatgatcaagcgctgaaatgttag tacaatgtcattatcccaataacgtgacaaatataagtggtactcactggacacaaaaga gttgctcaagtttacgatattcgaagttaacatgacattatttgatatttagtgttatga ggaggtcttgatgaatttgatatcaaccgcgagaattgtt
AA
MFDLISLHFFLHQQAIDTYYPILFFASSPFTKPSHMSGKRNFSLWNIWVASTLLKVLLYPAYHSTDFDVHRNWLA ITNKLPLKEWYLENTSQWTLDYPPFFAYFEWLLSQFVPASVADDGCLDIVDVGNYGWPTVVFQRSTVILSEIVLF LALQKYINISAGKEKARSFVVASSIALSPGLLIVDHIHFQYNGMMFGILIFSLLAAKQKKYLQCGALFSVLLCFK HIFLYIAPAYFVFLLRVYCLDIHETSFKTPRSLLKSVRWSNLFKLGFTVITVFVIAFAPFAYYGVIPNLISRLFP FSRGLTHAYWAPNIWALYSFLDRVLVQLYLHVPGISSIISRYIGPESLVSRLKNSTASTKGLVGDVEFFIVPTIT PKMSFILTLFYQILAVLPLLLFPTFKKFLGSITLCAFASFLFGWHVHEKAIMLVIIPFSF IAVSDRRLLYPFHTL VNAGYVSLFPLLFKSPEWLVKVLYTLVWCI IYFLSFNEVSKLSKSLSRRVFYMDRTNLIYILGLIPLTLTIGVLD VLSENFELLKRLEFLRLMMYSVYCALGIIGSWNGLSWIYFLDDTLNGDSSEVA*
SEQ0162
P. pastoris homolog of S. cerevisiae ALG10/DIE2 (YGR227W)
Dolichyl-phosphoglucose-dependent alpha-1,2 glucosyltransferase of the ER, functions in the pathway that synthesizes the dolichol-linked oligosaccharide precursor for N-linked protein glycosylation, has a role in regulation of ITR1 and INO1
S. cerevisiae null mutant: Bud8p-GFP distribution: abnormal; budding pattern: abnormal; cell shape: abnormal; resistance to ethanol: decreased; viable
Chr 1-4, 0475
5' region taacgtcgatagttgaattttgtatgaaatagagattagagtcccgctccaattcatagc tcgtatcttcatacatattattctcaagaataatcggtatcactctgatccctctactct gcaagagctctcctaaaattgggtagccattgggatcgaattctaaaccctgatatgtgc tattggatagctgagcaacccgtaacgaagcaggggtcaaaatttgtacttgatcgatgc tcctaactaatggtcttgataaggttcccggctgggaaactggagttgggttttgggtgc caattgaatgcaaactcatataaatagtaatcaaatttaaggatacggatctagctgagg ccctgatatgttcatcaattctcgtgaaaatggaagccaaatccagtcttctgtcacagg tgtactgcttattatcccattccagttgtaggatcgtactaccatcagcaattggagtaa cattagcactaaagggagcaggacataattgccaattaaatcctccctcatcccaatagg cgtcaatcaggatgtttcctacagacgcgtttaaaaggttgtccacccgattcaaatacg gttgagtgtatgtgtctctttcaaaaatgactgttgacagccggaccccaacatcaacta actggtcaacgctgacattgtaggagacatccctttgagacctcagagcgacttcattat caacagacaaaggcggccaaatgacacaatagacgcattgtatcaggagaaaaaaggcta tgatcataagtctaaagagacacaggcggcgagcaacatttgggcttttaacaatacagg cctcgaatgcttagttttctgatatctcagatctcaactggcaaggttcaccatcactga aataggctgggggattccgcaacggtgctgcaactcagaggaacctagagcttgaccaat taacagcgtaatttacatacaaaatcattctatcaaccaga
ORF
TACGAGCAGAAGGAAAGGTTTACAGAGGAGGATTTTGCTGATGAAGGGGAAGATCCCATCAAGCCCACCTTCATT ACCATTGCTGCACTTTCAATATGCACTGCTCTTACTTTAGTGCCGTCTCCCCTTTTTGAGCCACGATATTTCATT
ATCCAATACACTTTCAAATGGCCCAGTGAGCCCCTGGAATATCAGAGAATCATCTGGTGA
Downstream tgttgacaatgttttccttatgctgttccaacacccgtttcttcaaaacatcaaagtaac ttacctcttgcttagtaagtgggtcaatgtatgctaaaccagcaaaagtcccatacatat cattctgataatccaacttccagttcttcgactcgagatctccagctacgagtaataaaa aatctccccaattgcgagcaaccagatatttggtatcgaaatctctcccaaacagtatca cctgacccactgttcctttctcatctggagcaagatccagagccacatggttccccatat aatccgttagcaccggaatccagttcgagtgactgtaactaggaatgattgtcttttctg gtatagaacgctgtggtggtagagttagcggtttgatctgtgtcgagtcgtcgaagggag aacttgacaagtggccggatttgtgattctcattcttcaaataccctaattgtggctggg agatgttattaacgctgtgggcgctcttagctttcgccaggtccatttcgattttgtttg tgatttttctccaatttttggtcatagagataattgagtccaaggacatgagttctaacc cgaatatcagcccactggaatcctgactagattcggtatactcttgaccatcatggatgg aaatggatttgacgaagcaagggggaaactcaattcccaaatctttttgagcacaaatta tgtcggggactgtcacaggtttagcaaatctggaggatagttcgtgtttcgaacaccact cgtcaagcaattgccatgcaaaaacgacttcctcagtgccttctgctccttcagcagctt tgactggatccactgatagtttattgaatagtgtgtcgtcaggatatgatgtcaatgagg ctttgacaaatgtaccggagttattcttagccagaaaatcctccgatttaatcgttgttt ggttcgatttgtaatcagcatatctatcgttggtggtagtg
AA
MAGSNSISLLSHLTVIVIRSCFQKVTNVVRKPFIDEIFHIPQARQYCRGRFDVWDNKITTPPGLYWLGYVWVKIL
AVLNGGEFKCDTNTLRDINFVGFVVLQLLIFYLQKGTTGNSYSTSSISLNPLITLYYSLFYTDVWSTVFIVASYV
VIVKQPFGKYRSATISAFIGLASVTFRQTNIIWNALILATFIDQQIDPKDRTNSFSDIKLFIAETWRNILGVLPF
AINFGLFLAFVYTNGGITLGDKQNHVFSVHIAQLFYFTSFVAMLSIPLWISPSFFLGYLKLLRQNIISTIISWAV
IALLVHYFTVVHPFLLADNRHYTFYIWRRIINLTAYSRYMLAPAYHFSIYVTFKMLADNILSLPNEQEIQQQETE
YEQKERFTEEDFADEGEDPIKPTFITIAALSICTALTLVPSPLFEPRYFIIPFTFWRLLVRPSDSTLFENESLKK
TNNRTRLLFELAYFVAFTYLLYEVFIQYTFKWPSEPLEYQRIIW
SEQ0163
STT3 (YGL022W): Subunit of the oligosaccharyltransferase complex of the ER lumen, which catalyzes asparagine-linked glycosylation of newly synthesized proteins; forms a subcomplex with Ost3p and Ost4p and is directly involved in catalysis
S. cerevisiae null mutant: inviable
chr1-4_0685
5' region tttgagttgtgtttaccacaattaaaaaggcttccgctgatggaatccgacatatttccg tcccttcattttcttttaaaacttgtacaagagatagagagtgaagaggtgcaaaagaag ttggactcattggtaagaagattttcggatacaatttcagaagctttgcagtatgagttg agtttcactgttgataaatccaaccttcccctgccagcaagttgggtatttgagggcgag agcatttcgtatggttcaacgcagtctactcaaccattaagagcaacacctcctcgtatg aaattgagcaaagaaaaacaatacagcttctctcccaagacaccgctttggaaaaggtgg ttcaattcaccaattaagaaacctcaaagactaaagggaaagaaagtccgatggagcatg ccctacactaacgaaaatatacatgaagacggggagaaggattttaacgaagatgatgaa gagcaacagcatatggaaaccagtcttgaagattcagacgtagacatggataaatttgtt gacgctatggatatttcaccgttgccagatgccgcagattcgtcattttctacggttaaa gcttccagacagagttcactaaccacaagaaaactaattccgtccaaacaaagtaagagc
cttctaagctctttgaagaacgctgaagctcaaccagatgaaacagaaatagttccaccc ttaggtgcaccctcacgaatgaatttggtagaacctaatatggtgcttgaagataataat aaatagatcaatcaactcaccgaacaaatgattatataattgggctctcctttcctgcta gcccttgcacttcccttccctagtaaatacatccgagagcatccttcgcgaataccttcc aacacataaacagtacactactccgccgaaaaagacacgttggagcgactagcttaaaat actctccaccgccaaatcctccctcaacggatctccaaca
ORF
AAAGGTCGCAGATTGAGGGTCAACAAGAAACCCTCATTGGATCTACGGGTGTAG
Downstream tgtagttgtgtattttagatagacagagctctggggttagaatgataacatattgcgtgc ttatgtaatgatgttcacctttcgtaggtctcaaaagttagcctagtcttagcccgtaac ttgctaatgactgcaccaaaacctcttcctacgctaaatgaagaccgccaatctgaattg gtcaataatctgagccattgggcgctaggaaatggtttagcgatgtaccccaccaatttt gagcttcataacactggctatgcaccagttaccctttttccaacaccatttccaagagga caatttgagaaggcattggccgttcaggaagactttaatgagctctatgctcaggtagtc aaaaatcaggaatggctgggctcaattctggaggacttatctcaatttgatagggatttt actggtaagttatgggaaatttacaaggaggccaagaagattggaattgttcaaccagtc tcattgggcttgtttcgatccgattacatggtggatacgttatctgctccctcatttgat ggtataaatggaaaaatcaaacaaattgaattcaacactgttagtgtatcctttgggggg ttatcacccaaggttgcccagcttcacaggtacttaaatgagaatggtaattaccacaat gagggtaagcttcattttgaagatgatgagcttccaaattctgaatcgacagtttctttg gccgatggtttggcaaaagctgccgcgtattataatactagtgagtctgttgagtcctct gttgtgctagtggtagtgcaaaataatgaacgaaacgtttttgaccaaagggcccttgaa ttcgaactgttaaaacgtcataaagttagatctattcgattgcgcatggaagatatatct gcaagtattgatattgatgagcaaacgagaagaattcgactgaaaacgacgggggaagaa gtttcagttatctattatcgttcagcttatgcaccaagtg
AA
MVTINDQGYITVNDRVLKLIFSLLIVLIFISITIAAVSSRLFSVIRFESI IHEFDPWFNFRATKYLVHNGFYKFL NWFDDKTWYPLGRVTGGTLYPGLMVTSAVIHNLLAKIGLPIDIRNICVMLAPAFSSLTAIAMYFLTLELTNDSES IANGTAKATAALFSAIFMGITPGYISRSVAGSYDNEAIAITLLMVTFYFWIKAVKLGSIFYSSVTALFYFYMVSA WGGYVFITNLIPLHVFVLLLMGRFTHKIYVSYTTWYVLGTLMSMQIPFVGFLPIRSNDHMAPLGVFGLIQLVLIG DFFKSQLSRKVFIKLAIASGWIGILGVVGLVLATKIGLIAPWTGRFYSLWDTNYAKIHIPIIASVSEHQPTPWA
SFFFDLNFLIWLFPVGVWFCFQELTDGAVFVI IYSVLASYFAGVMVRLILTLAPIVCVCGAIAITKLFEVYSDFT DVVKGKSGNFFTLFSKLAVLGSFGFYLFFYVKHCTWVTENAYSSPSWLASHAADGSQILIDDYREAYYWLRMNT PEDAKVMAWWDYGYQIGGMADRTTFVDNNTWNNTHIATVGKAMAVSEEKSEVIMRQLGVDYILVIFGGVLGYSGD DINKFLWMVRISEGIWPEEVSERGYFTPRGEYKIDDNAAQAMKDSMLYKMSFYRFGELFPSGDAIDRVRGQRLSR SYAESIDLNIVEEVFTSENWLVRLYKLKEPDNLGRSLLTLKDNEKKLATKKGRRLRVNKKPSLDLRV
SEQ0164
GLS1 (YGL027C): Processing alpha glucosidase I, ER type II integral membrane N-glycoprotein involved in assembly of cell wall beta 1,6 glucan and asparagine-linked protein glycosylation; also involved in ER protein quality control and sensing of ER stress
S. cerevisiae null mutant: viable; decreased resistance to hygromycin B and PM02734
chr1-1_0215 5' region atgattcaagtcagccctcccagaaataccttatggaaaccgtgaccattaaagacttgg aaagtggtcaattggatatagacctacttttatcacatatctccctattgaaggaaaaac tgtctcatttacggtaccagatggtttcaagtcttaaactgctagcctcagttgacagta gtaccggcccaagtgggtttttccagacggtctcaactcaggtttcagacataaaacggg atatcgaagagtatcaccaagaactgcaaaaagttttaccggtggtacgattttgcaaga tcaaaatgggcctgagccccaatgattctgtgaaagtcacgaagcatgaggttaaaatcc agtaccccaatggttctggggctctagtcggctctgtaccagggtctaacgacaaacaag atagaagggttctacagcagaaaaagactgctgtgaggaagccaagaaaaaatattaaaa agactccaaatataaaccaatcaggtcatactccttccagtctaggcaatactaacggaa atgctcaaacgccaatgtaccagatgaatttacagtcacaacagttacaacaacctccac aaatccaacagcagctcccacaacagccgcctttacaacaatcacaacaacaacaccccc aagcacagcaacaaccacaacaaattcaatatgaacaatacgctcagcaggcccagcaag attacggctctgctaatcagcccattctactgtgattgttccgtgcgagtctataataag gaagtagtatggaaaaaccattttattgtgatcctctcaatgtaaacaacgagctattca tcttagcattagcaccatcagacactctctctgctgttgtctcgaacatctctttttgct ccatgggttggttctccttgggcctcaacccgaatctctgttcagtaccctcgactgttt taagatctaccgagctccaccgtttcatcttttttcactc
ORF
GAGGAACTAGCATAA
Downstream gagaagggagaaacacatagcgtagcgggaaaagaatatttctgcacacatcccaccaga tcaaagttagaagaatttctccatggacaattcatcatcatttagaaccaaaacaatgag ctgttttttattctttttagatcatgaaaagtggagaatagaaggctagaccttcgcata aaagttgatgactttctcaactcaaccccacattggtataaagaaaatttgagttttatc tcaaattacaaaaatttgaaccaaaacacatcatctaaattctgaacttttggtaatcat cttttctaaagaaataagtactaaaaatgaaatccaaaggttctcttgaccacttatctt ctattctccagtctatagttttatttatacgcatgatatagtttttgttgttttctatga atagaaaagtaccgtcattcaaatgtcttgactcagttatttcatacatgcttcctgttc accttcagctttaaaatatgataagtatccaagtctattctcttttcgtctgattggtta tataatctcactttctgttcaacctgatgatcatcaggtggatttcttgatgatgttgct tcttctggcgaattgttactaatttcaatgtcatgcatatattcttccatagactcatat tctcctagtgcatactctctagaatgccatattttttgctttcaaaactacatctttttc gttccttaatacaaaacaatatggattcatagctgtatctagattgcatagataaagttt tatatcttctttatttttgactttaacaaaacataatctgcaatagctattggtacctgt gtcaggacagtaaaaaaaattctacatctggtatttctttattcactaggaacattaaat acatcttaatctagtctacttgccttgaccttagcatatacctccaaattaaattatttg cacccttacaccataaacaaaccctatagacaatcacatg
AA
MNLFNRRILSLISVLWAIAFENVPSDELSLPEAFQKISDQSLLWGPYRSNLYVGIKPKIPH SFLSGLMWFNADDPEGIVKLRHSCEQDPEIQSFGWVKYDARVGGRHIIKDRGCKVLIKSDFV KTSDGNWALKITGVPKKGQENVKTSLIFYAGTEEDQDNMLMFAGNKDEFGNVNNDLTRLTGV SKVLGGAFELLVEDGPSNRYPSATVLAAPDLDPSLTHHLSLHVPDEHLWQAKEVFISLLQES VGKIREDPSLNVSEIPVDQLTTLRNINNFEGNLHFIQKTFQGKFELNIIFNLEDAPEKLSSS NIDSYVDRALTHFDEKFSSQLQFQAPFHTKEYLNFGKEFLSNLAGGIGYFYGEQLVDRNAFV DDDSFDNVKLVGKPEGPSELFTSVPSRPFFPRGFYWDEGFHLLSILDYDSDLALEILKSWFA LIDDDGWIAREQILGPEARSRVPAEFQVQNPNIANPPTLMLVFAKLLNMAHDSPKSEDDFVS IQDLSENMGHVHLDNPEILVDYAEDIYPRLKKHFEWFATTQKGETLGLSRESKYPNQLYRWI GRTKDLCLPSGLDDYPRASEPDSGELNIDLLSWMGLMSRSMKSIAKLLNKADDVHYFETIEQ GVLYNLDQIHWSEEEKSYCDITVDDDDNDVFECHKGYVTLLPFALKLIPEDSERLIHVLQDL RDPDICWSQFGIRSLSKSHPLFHSGEDYWRGNIWLNINYLILDALKHYAESSGALPDVKEMA KPIYKELREILVTNVYNEFKRTGYAWEQYNEATGKGQRTRHFLGWTSLVIPIMKMPEELA
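The ORF lines in this listing show only the 3' end of each coding sequence, so one simple consistency check is to translate that end and compare it with the C-terminus of the AA sequence. The following is a minimal sketch of such a check in Python, assuming the standard genetic code; the function names `translate_tail` and `tail_matches` are illustrative and are not part of the original listing.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_tail(orf_tail: str) -> str:
    """Translate the 3' end of an ORF, keeping the frame set by its last base."""
    dna = "".join(orf_tail.split()).upper()
    start = len(dna) % 3  # skip leading bases that do not complete a codon
    return "".join(CODONS[dna[i:i + 3]] for i in range(start, len(dna), 3))


def tail_matches(orf_tail: str, protein: str) -> bool:
    """True if the ORF tail ends in a stop codon and encodes the protein's C-terminus."""
    peptide = translate_tail(orf_tail)
    if not peptide.endswith("*"):
        return False
    return "".join(protein.split()).upper().endswith(peptide[:-1])


# Values copied from the GLS1 homolog entry above (hypothetical variable names).
GLS1_ORF_TAIL = "GAGGAACTAGCATAA"
GLS1_AA_TAIL = "TSLVIPIMKMPEELA"

print(translate_tail(GLS1_ORF_TAIL))
print(tail_matches(GLS1_ORF_TAIL, GLS1_AA_TAIL))
```

Anchoring the reading frame at the last base means the check also works when the displayed ORF fragment does not begin on a codon boundary.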
SEQ0165
P. pastoris homolog of S. cerevisiae ALG12 (YNR030W)
Alpha-1,6-mannosyltransferase localized to the ER; responsible for the addition of the alpha-1,6 mannose to dolichol-linked Man7GlcNAc2; acts in the dolichol pathway for N-glycosylation
Chr4_0544
S. cerevisiae null mutant: viable; increased lifespan; decreased resistance to brefeldin A
5' region tgtgctcaaatttttgcaagctggtgaaatcgaagaggggcaaatcgatgcaaattgtgg aaattctagaatcatgaaaacttggcttgatcgattgtatgctttctatgtctctgtttt catatgcgtgaaggtctaagatcttggataagtcaatgtccccaaattgcgtttcatata tcggagcaatgctgttgatcgttttaatccgtttcttgatgcgagcctcgtcttgttcgt tgattctatcacgtttattcaggagaatgacatcggcgagtgcaatctgtaaatgagcgg tcgtcgtcccgtctgtttctgttgatatatcctctccgtgccaatgtcctccaacatcgt caaggcaggattcaacgtgctcactatccaaaacacacacaaccccgtcaatatacacat tactcgctaatccgtcatctacccaaaacatattaacaataggacctggatcagcaacac
ctgttgtttcaagcagaatgtagtcgaaaccggccttcttggctatcaatctctctattg ctgcaactccattatccttgacagagcagcataaacatccatttcccagatccaaccatt cttcgtaactttctcccttatcttgaatggttagactcttttcaatttctacagaatctc caaattcgttcagtatgactgctagttttctatctcctttttctgctatacttttcagta gggtggatttacctgatcctaagtagccagtgattatcgatatgggcaccttaactgatt cgtctacagcatttcttataacggagacaggattcgaagggtcaacaatcaactctccct tgatttcagaaacatcttctaccaactgtgggatatcctcatccattcttgctgacgaaa aagattcctgtgcgatgctccatgatatcctggtatcgcgaacagaagtatgcattttcc agctcttccactgttagtgtcaagacgggcctatctaagta
ORF
TTCGACTCCTATTTCTGGGGATACAAGGTAATCCCTGAGATCAATAGTTTCGTCTTCAATGTTTTAGAAGGGCAA
CAAACCTGA
Downstream aaatatatacctcatttgttcaatttggtgtaaagagtgtggcggatagacttcttgtaa atcaggaaagctacaattccaattgctgcaaaaaataccaatgcccataaaccagtatga gcggtgccttcgacggattgcttactttccgaccctttgtcgtttgattcttctgccttt ggtgagtcagtttgtttcgactttatatctgactcatcaacttcctttacggttgcgttt ttaatcataattttagccgttggcttattatcccttgagttggtaggagttttgatgatg ctactaaacacttcaaacccatcaaccaattggccaaacacaacgtgatgtccatctaac caagaagtagctttggcagtaataaagaattgagatccattggtatcctttccagagttt gccatgcttactcttcctaatctgtcatgcttcaaaacaaaattctcatcaggaaaatct cctttgttctcaccatatatcgaacgaccacctttcccgtctccagatgtaaaatctcct ccttgaatcataaaatcctcaatgactctatggaataaggatccatcgtatccaaacccg ggtgtcatagcagcgagctggtagaagttcttcacggtcttaggaacgacagtaccgtac agtcccagcaccagatcacccaaaggctgatcgtcctgggtgatgtgaaacgtcacgaaa tgcgtataaggaggttcttctatcggtgcactccaaacaccaagcaaaaggctgatcagg gtgaagatcaaagttaaattcatttctcaagtgttcggtgggggaagcagtcaatacgtt gtgctcttcgaaagatgtaactttcgcgcgatgtattttttcaatcgaaaaattatctga agtgcaaagcttacgagcctaacgtctccctaaacgcttcccaacaatcactagtcgatt ctttggtggaatgaccaactttgtcaagcaaacatcgtttg
AA
MINLKLLDLALLFLIGLHLFISPFTKVEESFNIQACHDMIVYGFDDLSHYDHTQFPGAIQRSFWGAGLLSIAIRP FKGLLSSFFANWPTRLFYQYLVRGLLGLFNGLGLIRIRRVLSRNISKETAFWYMILQMCQFHIVYYSSRTLPNFI ALPLVSNAFALYLTHENVSFAVLAFSGWLRGEIGVFAVILAIVSVLQTHNFDNFLHIFKNGWGTLAGAICSYV FDSYFWGYKVIPEINSFVFNVLEGQSSIWGVEPWYAYLMKYLPNLFNKSPLLILVIPGLFLKNDKLKNSKSLTLS SLLYLAIIAFQPHKEWRFIVYIVPPLVITISTVLAQLPRRFTIVKVAVFLLSFGSLLISLSFLFISSYNYPGGEA LQHLNEKLLLLDQSSLPVDIKVHMDVPACMTGVTLFGYLDNSKLNNLRIVYDKTEDESLDTIWDSFNYVISEIDL
DSSTAPKWEGDWLKIDWQGYNGINKQSIKNTIFNYGILKRMIRDATKLDVGFIRTVFRSFIKFDDKLFIYERSSQT
6.5 Remaining genes of the glycosylation pathway
MNN4 (YKL201C): Putative positive regulator of mannosylphosphate transferase (Mnn6p), involved in mannosylphosphorylation of N-linked oligosaccharides; expression increases in late-logarithmic and stationary growth phases
S. cerevisiae null mutant: viable
Homology is not very high
SEQ0166 PATENTED elimination of mannosylphosphorylation chr1-4_0409
5' region cgcatagaacaaaaggcaaaggattaaacttcattgatgcaagattcataaatgttgaaa caggcctttatatcgatatcactggattaagtaccagtcagtcagctcgaccgccaaggt ttagtaacgcttcgaagaaagatcctatttacaattgcaggaataatcatttctactctc ataacaatatagcacctctcaaatacacgttgatggagggggttcccagtttcattcctc aacagtatgaagaaatattgagagaggagtatacaactggtttgacttcgaaacactaca acggcaacttttttatgactcaattgaatttgtggcttgaaagagatccaatgctagcac ttgtgccttcatccaaatacgaaattgaaggtggaggggtggaccataacaagattatca agtctattcttgaactttccaacatcaaaaaattggaattgttggatgataatcccgata tattagaggaggtgatcaggacatacgaactgacttccattcaccataaagagatgcagt atctttccagtgtcaaaccagatggggacaggtccatgcagtcaaatgacataaccagtt cttaccaggagtttctagcaagtctgaagaaattccagcctttacgcaaagatttgttcc aatttgagcggatagacctttctaagcatagaaaacagtgagcagccgttttgcctaaaa tgttccagaaactataggataaatatatacagtaatgaattaggtgatgttagcatttag tccccaaaaatacctcgaatctccagctccatagcgcaaaatctccaaatctacttcaag acgcactcatatgaaattcctgatgatctcgcgccaaaaggggaaaaatacattgtgtcg tgtgatcagactttgttttcctttggactctgtcagattttggttatcggcacacaaaca gtagttttttttgcctcaagctaaacttacagactcaagg
ORF
GAACAAAAGAGCCAGCAGGAGGCTAAAGAATAG
Downstream cggaggaatgcaaataataatctccttaattacccactgataagctcaagagacgcggtt tgaaaacgatataatgaatcatttggattttataataaaccctgacagtttttccactgt attgttttaacactcattggaagctgtattgattctaagaagctagaaatcaatacggcc atacaaaagatgacattgaataagcaccggcttttttgattagcatataccttaaagcat gcattcatggctacatagttgttaaagggcttcttccattatcagtataatgaattacat aatcatgcacttatatttgcccatctctgttctctcactcttgcctgggtatattctatg aaattgcgtatagcgtgtctccagttgaaccccaagcttggcgagtttgaagagaatgct aaccttgcgtattccttgcttcaggaaacattcaaggagaaacaggtcaagaagccaaac attttgatccttcccgagttagcattgactggctacaattttcaaagccagcagcggata gagccttttttggaggaaacaaccaagggagctagtacccaatgggctcaaaaagtatcc aagacgtgggattgctttactttaataggatacccagaaaaaagtttagagagccctccc cgtatttacaacagtgcggtacttgtatcgcctcagggaaaagtaatgaacaactacaga aagtccttcttgtatgaagctgatgaacattggggatgttcggaatcttctgatgggttt caaacagtagatttattaattgaaggaaagactgtaaagacatcatttggaatttgcatg gatttgaatccttataaatttgaagctccattcacagacttcgagttcagtggccattgc ttgaaaaccggtacaagactcattttgtgcccaatggcctggttgtcccctctatcgcct tccattaaaaaggatcttagtgatatagagaaaagcagac
AA
MKVSKRLIPRRSRLLIMMMLLVVYQLVVLVLGLESVSEGKLASLLDLGDWDLANSSLSISDFIKLKLKGQKTYHK FDEHVFAAMARIQSNENGKLADYESTSSKTDVTIQNVELWKRLSEEEYTYEPRITLAVYLSYIHQRTYDRYATSY APYNLRVPFSWADWIDLTALNQYLDKTKGCEAVFPRESEATMKLNNITVVDWLEGLCITDKSLQNSVNSTYAEEI NSRDILSPNFHVFGYSDAKDNPQQKIFQSKSYINSKLPLPKSLIFLTDGGSYALTVDRTQNKRILKSGLLSHFFS
KKKKEHNLPQDQKTFTFDPVYEFNRLKSQVKPRPISSEPSIDSALKENDYKLKLKESSFIFNYGRILSNYEERLE SLNDFEKSHYESLAYSSLLEARKLPKYFGEVILKNPQDGGIHYDYRFFSGLIDKTQINHFEDETERKKIIMHRLL RTWQYFTYHNNI INWISHGSLLSWYWDGLSFPWDNDIDVQMPIMELNNFCKQFNNSLVVEDVSQGFGRYYVDCTS FLAQRTRGNGNNNIDARFIDVSSGLFIDITGLALTGSTMPKRYSNKLIKQPKKSTDSTGSTPENGLTRNLRQNLN AQVYNCRNGHFYQYSELSPLKLSIVEGALTLIPNDFVTILETEYQRRGLEKNTYAKYLYVPELRLWMSYNDIYDI LQGTNSHGRPLSAKTMATIFPRLNSDINLKKFLRNDHTFKNIYSTFNVTRVHEEELKHLIVNYDQNKRKSAEYRQ FLENLRFMNPIRKDLVTYESRLKALDGYNEVEELEKKQENREKERKEKKEKEEKEKKEKEEKEKKEKEEKEKKEK EEKERKEKEEKEEYEEDDNEGEQPTEQKSQQEAKE
SEQ0167 chr1-4_0410 PATENTED
5' region gggagagttagctagcatacaagataatgaaggatcaatagcggtagttaaagtgcacaa gaaaagagcacctgttgaggctgatgataaagctccaattacattgccacagagaaacac agtaacagaaataggaggggatgcaccacgagaagagcattcagtgaacaactttgccaa attcataaccccaagcgctaataagccaatgtcaaagtcggctactaacattaatagtac aacaactatcgattttcaaccagatgtttgcaaggactacaaacagacaggttactgcgg atatggtgacacttgtaagtttttgcacctgagggatgatttcaaacagggatggaaatt agatagggagtgggaaaatgtccaaaagaagaagcataatactctcaaaggggttaagga gatccaaatgtttaatgaagatgagctcaaagatatcccgtttaaatgcattatatgcaa aggagattacaaatcacccgtgaaaacttcttgcaatcattatttttgcgaacaatgttt cctgcaacggtcaagaagaaaaccaaattgtattatatgtggcagagacactttaggagt tgctttaccagcaaagaagttgtcccaatttctggctaagatacataataatgaaagtaa taaagtttagtaattgcattgcgttgactattgattgcattgatgtcgtgtgatactttc accgaaaaaaaacacgaagcgcaataggagcggttgcatattagtccccaaagctattta attgtgcctgaaactgttttttaagctcatcaagcataattgtatgcattgcgacgtaac caacgtttaggcgcagtttaatcatagcccactgctaagccagaattctaatatgtaact
acgtacctttccttttaataaatgatctgtattttccacctagtagcagatcaaattgtt caactttaagtctttggtccctcaagcgagagaacttgcg
ORF
AAACAGTGA
Downstream gcagccgttttgcctaaaatgttccagaaactataggataaatatatacagtaatgaatt aggtgatgttagcatttagtccccaaaaatacctcgaatctccagctccatagcgcaaaa tctccaaatctacttcaagacgcactcatatgaaattcctgatgatctcgcgccaaaagg ggaaaaatacattgtgtcgtgtgatcagactttgttttcctttggactctgtcagatttt ggttatcggcacacaaacagtagttttttttgcctcaagctaaacttacagactcaagga tgaaagtatcaaagcggttgataccgaggagatctcgtctcctcattatgatgatgctac tggttgtttaccagctggtggttttggtcctaggattggagagcgtctctgaaggaaaat tagcaagcttgcttgacttgggcgattgggatctagctaactcctcgctatctatatccg atttcataaagctgaagctcaaaggccaaaagacttatcacaaatttgatgaacatgtct tcgccgcaatggcaagaattcaaagtaatgagaatggcaagttggcggattacgagtcta cttcatcgaagactgacgtaaccattcaaaatgttgaactttggaagagattgagcgaag aagaatacacttacgaaccgcggataactttggctgtgtatctgagctacattcatcaga ggacttatgacaggtacgcgactagttacgctccttataacttgcgggtgcctttttcgt gggctgactggatagatctgacggccctaaatcaatacttggataaaacgaaaggctgcg aggcagttttccctagagaaagtgaggcaactatgaagcttaacaatatcactgttgtgg actggcttgagggcctttgcataactgataaatcacttcaaaattccgtaaactccacat atgcggaagagattaatagtcgggacatcttgtctcctaa
AA
LIDTESNSRYEDPDDISIENELRYRIAQSTKEEENMWKLDTTLTEASLKIPNIQSFELQPFKERLDNSLYNSKNI GNFYFYDPRLTFSVYLKYIKDKLASGSTTNLTIPFNWAHFRDLSSLNPYLDIKQEDKVACDYFYESSNKDKRKPT GNCIEFKDVRDEHLIQYGISSKDHLPGPFILKSLGIPMQHTAKRLESNLYLLTGAPVPLSLSFMTKKGLYQVGVD QTGKLDPNIARTELWEFYKNGKENLQFNAQEELSHLIETVPSSSNSSSGEGYFTTELKENNFELPLSKNDFTFDD SEVESLIKGLSEQDLDLHTQRYKESLQYSFATRENDVKKYFYEARMIINTVNKEGGAHYDWRFFNGAMNHESSGF TEEERQLRKRSVLHRLLRNWLVFNYQQGSPTWLAHGTLLSWYWNSLMFPWDYDIDVQMPIKSLNNLCANFNQSLI
IEDLTEGYSSFFLDCGSSITHRTKGKGLNFIDARFINVETGLYIDITGLSTSQSARPPRFSNASKKDPIYNCRNN HFYSHNNIAPLKYTLMEGVPSFIPQQYEEILREEYTTGLTSKHYNGNFFMTQLNLWLERDPMLALVPSSKYEIEG GGVDHNKIIKSILELSNIKKLELLDDNPDILEEVIRTYELTSIHHKEMQYLSSVKPDGDRSMQSNDITSSYQEFL ASLKKFQPLRKDLFQFERIDLSKHRKQ
SEQ0168 chr2-1_0718 PATENTED
5' region tgcggtgttggtgatgctaattacgtaccagaacccaatctgtggactcaggaccagctc agcttgacaaaccaagacttgcactccaatgtgcacaacccagtgattgagcagatcgaa acctcatcaggagtcagattgtagtatggaaaactttgtattctctatgtacttaaacac tggtttatttttttattgatcgttatattgaacagtttacactggaacatcttcagggtc gatgtccttaatccagtgttgaccaaagattgggatcttctcgaagaaagtcttttggaa caaaggccagttttcagtgaaagtgaagacggcaaacagaccggcacctccccagaaagc caaaattggaaaagtagttttaatttgggttggagtcaatccagcaattttcttgacggt ggtatacttgggacctttaacgtactgaaatgttagaacatgtttgtaaaaatcaaatca tcactgcagaaacggtttgtgtgcctgcaccggagggttatcataatgccacttacgttg accattttggaggtgtttgactaagttcaaatatgaatctctaagaaaactaataatcaa tatggtgcgagcattgattggttggacagctagtttggagaagtacacgacttagatgaa tctgcaataaggaatagtccaatctgattatgtaagctctcctttttggttttcatttcc atcagctcaagcttatcatagctcaggtcccctccagcttatgatggaataggccattat tttttgccctaaaaagtggaagtccacaagaagaaatacaaatactcaaaattcaaaagt cttcctttgagtggatgcaattttacgtagtttactgtatgacgtaactaatgaaccctt ccgacacaaagattgaggtgcctcacttaacgtcattcttctatacccacgagtgcaact gactaggtcttattttgttaattgcctcagtttctccgaa
ORF
TACTATATTGACGTTGGATCCTCCTTCTTTGTTAGGGATAAACTAAATGGTAACAATGCTATAGATGCACGTTTC
TGGCTGTCTCGATACAGGAAAAAAATGACTAGGAGCCAATAA
Downstream ttatgcatatactgaaacaacagaaggaactacagtaaattcataaaaagcttaattctt actttcatctcggcactgtaaattaactcaagttggggcaacattgtgtgtatactctta
ctggcatcttttcatctgaagtcatcttctactactcttctcttctgtatgacgtaatca gctcggcagctgtggcatcgaacaaaaaaatgaacagccatccgtcatatctcatgactg actgagcaagaactaagtcaacaggaaacctaaaataagctttccatttcttttgcgctg aagccaaccactccccacacagttgatgagtggacgcaaaaccagetcctataccttgac agaagagtcgccggaatcaacctcaacaattcaggatatacgagaggaagaccaagttgc tccagggccccagcaagaagcacctaaacagtcacatatccaaaaatggttaagcgatca tcctaaagtatacgcagttttatcttggatatggaaattttggttgaaacagtggtttct catatgtttgggccctgcggttgctctagctcatgcatacccaaattttgccagacatga tggaaccattaggtcagagtatactatcaactacggagccgtagctatcatattcttcat ctctggtcttactatgaaaaccaaagactttttgaagaactttggacactggagagccca tttcacagtgttaagctgctcgtttctacttacttcttctatcatttacggtatagcgtg cggtataagagctgctcatgattccaatatcgatgactggatgttagcaggacttattgt taccgcatgttgtccaaccactgtgagtagtaacgttgtcatgacagaacaagctcatgg taatgttttcttgagtatttgcgaagttattattggcaatgttttagggggtttcatcac acctgctctggtgcaaatgtatctgacaggtagctgggat
AA
MSGNPFLFSPSNFDFSGLDHYRSTDKDHLALDVLDYDKNHFFSRNSPSLKSRIHFYRHKLTTRKQIGLFSGRLKL FVLALFVLITFSAIHIPIPFSLDILGSHVKYLPLREKVDPEEALHLHGLDLSVAELPFFNDDMMSEFNYDPRLPT ALILKLVLDHISVRNGTFDAKFKVPFNWKLWVDLHSRLVPSNSWYNRFRLPSGRFETCDEFKRFFGITKNHFGTD LDNCVDIEYDTPEGYPKFKVLHAEDKALPYEARI IYGASYLYHEAQNPKRLIFLGLGKSNESLILPVEANDSSNL MQFNHEYARSFNDQPFVSLEELVKKVSLTLNLNSDKVLPINELDVIKDTPRLMNHNNQGLSIDKSSFQWDLEREL QLLEHRTSQVNDVEGLDAGIYSTIQCEMRSMYDFSKYFHESKVSGKYLPSGEHYDWRFFNGFYLSQQENLAVLHR LGRAWLRFSRAAGLHTWIAHGTLLGWYWNGLILPWDQDLDVQMTVQSLYLLGRNFNSSLVTDVSIEDGYSSALGH YYIDVGSSFFVRDKLNGNNAIDARFVDTETGLYVDITALAFTDHLKLKLTTFEKVELQKVMDPNVKEKLQWIKNK YSTATLPGVIETDRNKVSDALEKQFHDFKFDNFVNKELFHCRNNHFYKYGEVGRLRSTMFEGVPALIPFEFESIL KREYPKGLTLKHFSNHFWDPVNRLWVPEKKKKIRHIEFSLTKEVTESHKKELAQIHGNETGITSDFAYSPFRIDP
WLSRYRKKMTRSQ
MNN6 (YPL053C): Probable mannosylphosphate transferase involved in the synthesis of core oligosaccharides in protein glycosylation pathway; member of the KRE2/MNT1 mannosyltransferase family
S. cerevisiae null mutant: viable
Chr3_1162
Chr3_0215
Chr2-2_0105
Chr1-3_0138
Chr1-4_0050
These have homology, but are assigned to other mannosyltransferase genes
7 Methanol metabolism
AOX: alcohol oxidase
AOX1: chr4_0152
AOX2: chr1-1_0226
genes known; promoters known
SEQ0169 CAT: catalase chr2-2_0131 gene known; promoter unknown
5' region ctatgttgacggagagtgttgggtctaacatcatggcacatggtagaggaggtgttaaga catctagagagcgtgatagagaggaagagatcagagaatacgaggagaccaatttcacta gattaccaacttcggttactgagaagtcaaagaaacaaaagaaagatcatagattgaaca cctttgcaggagaggattggtcattctttggcaaggacagagatgaagacatgaagaaaa
gtgcgaggaagaataaaaatactgcttcctccgcctgggaaagagcaaaaagacgcagag gaaactaaagtgtgtaatcatatatataataaatgaggaataataattgaatagagattt aacgagtcgaagtttctgaaatatacgcacagtttatatttatgattttgatatctaact acagtcttctccatatatttaactataaataataaagtatataactcttatgaaactgtt tcaccacatttttttctacgtaatcgaactccgaatgcggttctcctgtaaccttaattg tagcatagatcacttaaataaactcatggcctgacatctgtacacgttcttattggtctt ttagcaatcttgaagtctttctattgttccggtcggcattacctaataaattcgaatcga gattgctagtacctgatatcatatgaagtaatcatcacatgcaagttccatgataccctc tactaatggaattgaacaaagtttaagcttctcgcacgagaccgaatccatactatgcac ccctcaaagttgggattagtcaggaaagctgagcaattaacttccctcgattggcctgga cttttcgcttagcctgccgcaatcggtaagtttcattatcccagcggggtgatagcctct gttgctcatcaggccaaaatcatatataagctgtagacccagcacttcaattacttgaaa ttcaccataacacttgetctagtcaagacttacaattaaa
ORF
GACAGGGTTTTTGAATACTTCTCCAAGGTTTACCCTGAAATTGGTGACCAGATTCGTAAAGAAGTATTGCAGCTA TCTCCAAGAGGTGACTCTGCAGCAAGATTGTAG
Downstream gctaactatatttattattaattaaagattctttaacttcggtaatttgtagaacagaaa gagggattgtcattctcggttcattatcctttccagcacatcattttgtgggggactcta gccttcgcgctgcttctactactggtcaaataacaattctccataacggtattcgcttac ctcatcggttacgagcgttgtccgatgtaggccaatcaacgatgatcgaaacatcatcag aaaactaaactctcacatagagacaaaggtcacaatcatgctaagacaacaagtgggtag aaagcttgtatccaagcgtttcttcagtggcttgaaggatagtgaccgtatcttccaaaa tgtgtacagtaaatatggtgaagacctgaagagttcacagcaacgtggtgactggtacaa gaccaaggaaatcatcctcaagggacacgaatggcttatcaacgaattgaaggcttctgg tctacgtggaagaggtggtgccggtttcccatctggccttaagtatagtttcatgcctcc taacccaaacagagaaccacaatacctggtggtcaatgctgatgaaggtgaaccaggaac atgtaaagatagagaaatcattagaaaggagccccacaaattggtagagggctgtctgct ggccggaagagccatgaatgccactgcggcctatatctatattagaggggagttttacaa cgaagctgttgctttgcagacagctattaacgaggcttacaaagctggactactgggaaa gaacgcctgtggttccggatacgacttcgacgtttacattcaccgtggtatgggtgctta tatttgtggtgaagaaactgcacttattgagtccattgagggcaaagctggtaagccaag actgaagcctcctttccccgcaggtgttggtttgttcggtcgtccaacaacagtaacaaa tgttgaaaccgtcgctgtcactccaacaatctttagaaga
AA
MSQPPKWTTSNGAPVSDVFATERATFDNANHANNAPKVGPLLLQDFQLIDSLAHFDRERI PERVVHAKGAGAFGEFEVTDDISDVCAAKFLDTIGKKTRIFTRFSTVGGEKGSADSARDP RGFSTKFYTEEGNLDLVYNNTPIFFIRDPSKFPHFIHTQKRNPATNLKDANMFWDYLVNN QESIHQVMYLFSDRGTPASLRKMNGYSGHTYKWYNKKGEWVYVQVHFKSDLGWNFNNEE AGKLAGEDPDYHTGDLFNAIERGEYPSWTCYIQTMTQEQAAKQPFSVFDLTKVWPHKDFP LRRFGKFTLNENPKNYFAEVEQAAFSPSHTIPSMQPSADPVLQSRLFSYPDTHRHRLGVN YQQIPVNCPVAPVFTPQMRDGSMTVNGNLGSTPNYKSSFCPFSTEAQIQTNSHTPEEVLA
AHTEKFHWGGILDSKSYDFEQPRALWKVFGKTPGQQRNFCHNVAVHVAAANHEIQDRVFE YFSKVYPEIGDQIRKEVLQLSPRGDSAARL
FLD: formaldehyde dehydrogenase chr3_1028
Gene known, promoter known
SEQ0170
FGH: S-formylglutathione hydrolase chr3_0867 gene known; promoter unknown
5' region tggttccctctcggtccaataccaaaaatattatcaccatacaggtctcccttcgatacc agtgcaaagttgaaccgtgggattaccttggaatctacaaaaatagtgtcactcacaagt ttgtcatcaaccacgctgccgcttgcaaaggagaactgaacatgaaggttgttagggttt gttatattggaataagtggtggatttgttgaaggcgaacgcaccaaagctacatccgtcc tgagcacactgtgaatttgtcacggaattgaccaagaggtcagacgatcctgtatcccat tgagccgttatgetttgtgggggaaaccctatttctatcgtactaagaaaaccaatggtg aactcatattcggtatcaatggcgacgattccagcatagcctgtagacagtaacaacact agggcaacagcaactaacatatcttcattgatgaaacgttgtgatcggtgtgacttttat agtaaaagctacaactgtttgaaataccaagatatcattgtgaatggctcaaaagggtaa tacatctgaaaaacctgaagtgtggaaaattccgatggagccaactcatgataacgcaga agtcccattttgccatcttctcttggtatgaaacggtagaaaatgatccgagtatgccaa ttgatactcttgattcatgccctatagtttgcgtagggtttaattgatctcctggtctat cgatctgggacgcaatgtagaccccattagtggaaacactgaaagggatccaacactcta ggcggacccgctcacagtcatttcaggacaatcaccacaggaatcaactacttctcccag tcttccttgcgtgaagcttcaagcctacaacataacacttcttacttaatctttgattct cgaattgtttacccaatcttgacaacttagcctaagcaatactctggggttatatatagc aattgctcttcctcgctgtagcgttcattccatctttcta
ORF
Downstream aactatgtaagttatataggctttaagtcaaacgaggttaacatgacctatgagaaatga agagcaggcacatgataaccgtgaaagatgccaggaacggagctagccgtctagactttc gttgcttgttgtttggatgagtcgcaggtgctggaaggctagaatcctgggtcatggaag ccttatcagaaatgttttggattgtggtcacagtggttgtcagttcatcgtacgaaggga gatcctcatgttgttccaaggcgttcaaagtgacaactttgtagattttggggttgtatt ggtccatattgtccagatccactttagcgtcgaactctacgtgatcgaaagctggcaaaa tggtatagttcgaaagctctgtgtttctattttgttgtgattgtgggtgaaccttattaa tattgacgttagtgataactattttgctctgctttgcgccgttgatttggacgctgacat tctggttgtctatgatctctggcgaactcgacgaagtcgctgaaaccgtttcaatgtatc ttggtgggttcataaaagggtggtggttgcaagattagaattgtaagatctctactgcga caatagccagggagttaaccgccgcgaacaataggccttaggcacctctgcgtaaataaa tttgctgataagggaaagaatcgcgagatagaacaccttaccctcggtatgctctcaccg acccgagagtttaccgagagttccgatattgagttggggtgagcagaaccagaattcagc atgcttttgatttattgggtcaaaatgggttccgctccgccaaggttcgtatctttgagc cttcgccaagcgggcgggcgtcccgtttgacaccacgctcgtaattcctttaatgcaggc
cgaaaacgtcacaccccacaactcaccccttcaattgttgttttgattcacagctgaaac taagcctttattacgtaagccacaaatgggggactagaag
AA
MSSITTSIFKVTAE IQSFGGKLVKLQHKSDETKTDMDVNVYLPAQFFANGAKGKSLPVLL YLSGLTCTPNNASEKAFWQPYANKYGFAWFPDTSPRGLNIEGEHDSYDFGSGAGFYVDA TTEKWKDNYRMYSYVNSELLPKLQADFPILNFDNISITGHSMGGYGALQLFLRNPGKFKS VSAFSPISNPTKAPWGEKCFSGYLGQDKSTWTQYDPTELIGKYQGPSDSSILIHVGKSDS FYFKDHQLLPENFLKASENSVFKGKVDLNLVDGYDHSYYFISSFTDVHAAHHAKYLGLN
SEQ0171
FDH: formate dehydrogenase (=FMD promoter, used in H. polymorpha)
Chr3-0932 gene known; promoter unknown
5' region aaatggcagaaggatcagcctggacgaagcaaccagttccaactgetaagtaaagaagat gctagacgaaggagacttcagaggtgaaaagtttgcaagaagagagctgcgggaaataaa ttttcaatttaaggacttgagtgcgtccatattcgtgtacgtgtccaactgttttccatt acctaagaaaaacataaagattaaaaagataaacccaatcgggaaactttagcgtgccgt ttcggattccgaaaaacttttggagcgccagatgactatggaaagaggagtgtaccaaaa tggcaagtcgggggctactcaccggatagccaatacattctctaggaaccagggatgaat ccaggtttttgttgtcacggtaggtcaagcattcacttcttaggaatatctcgttgaaag ctacttgaaatcccattgggtgcggaaccagcttctaattaaatagttcgatgatgttct ctaagtgggactctacggctcaaacttctacacagcatcatcttagtagtcccttcccaa aacaccattctaggtttcggaacgtaacgaaacaatgttcctctcttcacattgggccgt tactctagccttccgaagaaccaataaaagggaccggctgaaacgggtgtggaaactcct gtccagtttatggcaaaggctacagaaatcccaatcttgtcgggatgttgctcctcccaa acgccatattgtactgcagttggtgcgcattttagggaaaatttaccccagatgtcctga ttttcgagggctacccccaactccctgtgcttatacttagtctaattctattcagtgtgc tgacctacacgtaatgatgtcgtaacccagttaaatggccgaaaaactatttaagtaagt ttatttctcctccagatgagactctccttcttttctccgctagttatcaaactataaacc tattttacctcaaatacctccaacatcacccacttaaaca
ORF
GGTAAGTACAAGACCAAGGCTTATGGTAATGACAAAAAGGTCGCATAA
Downstream ttgaaatgtatttaatttgatattaagtaaatgaatgattatgactttatgaattcgcaa tgttttctccttgattatttctgtattgtattggaatgattatagaatactcatatattg attatagtattagcacataaaacgtttgttgttaaactcacttccgtacgcaaccatttc tatttctagctatcttgataaggttcactgctcctgtactttgagtttctccttcaaact tctcacctcctccttcagaagaccgatctcaatgtctttctgtttagaccgtccgtaact cagcatttttgccatttcttcattttctttagtcaacgaacgtagtttagtcaacacctt gtgagcgatctcctgctcatcgtaccgtccatctttcgtaattgcttgatattgtccttg tactggattcatcgcaattgcggttaaaacgcactctaatcgttctttggtgtcttcaag ctcagattgtacatcgttcagtttagtagacatttctgtgtacaattgctcgtagttacc aatgtgcgacgatccggacattgcatcgttttttccttggctctccatttctagcttctc ttggttcaattgctcaacctctttctctaaagcttcaatctgggagtttttgtattgcag
ctctagagctagaagttgagcggaatcttcacgcttcttgtggccattgtcgttgctatt tgagttaaagggctcttgatacaccgacatgtaagtgtaattgttgccaaacttactaac tccaggattcagagactgtctgtatggggtgttgtctcccataaatgtcattccactgtt attaagaggcgaggtctgcggttgttgttgttggccgactccggtcataaaatcattgtg aaagctagtattggttgaaggaaacaatgaagccttatagttgccttttgtttgcccaac cgtagaagagctgccaattgttgaggctactgtatccaat
AA
MKIVLVLYSAGKHAADEPKLYGCIENELGIRQWLEKGGHELVTTSDKEGENSELEKHIPD ADVI ISTPFHPAYI TKERIQKAKKLKLLWAGVGSDHIDLDYIEQNGLDI SVLEVTGSNV VSVAEHWMTILNLVRNFVPAHEQIVNHGWDVAAIAKDAYDIEGKT IATIGAGRIGYRVL ERLVAFNPKELLYYDYQGLPKEAEEKVGARRVDTVEELVAQADWTVNAPLHAGTKGLVN KELLSKFKKGAWLVNTARGAICNAQDVADAVASGQLRGYGGDVWFPQP APKDHPWRDMRN KYGYGNAMTPHYSGTTLDAQVRYAEGTKNILNSFLTKKFDYRPQDVILLNGKYKTKAYGN
DKKVA
DAS: dihydroxyacetone synthase
2 genes; gene unknown; promoter unknown
SEQ0172 1. Chr3-0832 5' region attactgttttgggcaatcctgttgataagacgcattctagagttgtttcatgaaagggt tacgggtgttgattggtttgagatatgccagaggacagatcaatctgtggtttgctaaac tggaagtctggtaaggactctagcaagtccgttactcaaaaagtcataccaagtaagatt acgtaacacctgggcatgactttctaagttagcaagtcaccaagagggtcctatttaacg tttggcggtatctgaaacacaagacttgcctatcccatagtacatcatattacctgtcaa gctatgctaccccacagaaataccccaaaagttgaagtgaaaaaatgaaaattactggta acttcaccccataacaaacttaataatttctgtagccaatgaaagtaaaccccattcaat gttccgagatttagtatacttgcccctataagaaacgaaggatttcagcttccttacccc atgaacagaaatcttccatttaccccccactggagagatccgcccaaacgaacagataat agaaaaaagaaattcggacaaatagaacactttctcagccaattaaagtcattccatgca ctccctttagctgccgttccatccctttgttgagcaacaccatcgttagccagtacgaaa gaggaaacttaaccgataccttggagaaatctaaggcgcgaatgagtttagcctagatat ccttagtgaagggttgttccgatacttctccacattcagtcatagatgggcagctttgtt atcatgaagagacggaaacgggcattaagggttaaccgccaaattatataaagacaacat gtccccagtttaaagtttttctttcctattcttgtatcctgagtgaccgttgtgtttaat ataacaagttcgttttaacttaagaccaaaaccagttacaacaaattataacccctctaa acactaaagttcactcttatcaaactatcaaacatcaaaa
ORF
AAACCAAAGCATGACAAGTTGTAA
Downstream acgggaagtctttacagttttagttaggagcccttatatatgacagtaatgctagtacgt tttgttttgtttaattaataacttagtttatgttagcctagtatagactccatcaatttt ttttgttattacgtaagccgcgatgataatatctgatgaaaaattcctatcagaaaataa tttatcaaaagtttcatgcgatatgagactaagtagaatagggactcccaaagtgtcagt cacaagggtcattcccgttcgtaatgtggtgatagcgaggagaaaacctgtcagagcaag taacaccgacgcaaagacatggctaatgaaagaagagcagagaagaataagacagaagga gcaggagatgaaacaaaggctagaggaactagaaaggttcaaaacaaaagtacagaaatc atatataaggaaagaggataggcatttggcacaagagatagaaaaggatcttgacataat cactgatgattacaatttggacagcgatgtagacctggtctttggagaactcatgcaagc agaagagcagcccaaatcgttgaagagtttaccaggagccagtgatgcaagtgataacac agataaatcggtggaactgttttcctttccatctccgaatctgacgctacctgaaaaggt gatacaccatattgggccactggtgaagcacatcagtaatcctgaaaacattcaatgggg gagacttttactggatttggaaaaaaatcaggggtttaacggtctatctgccgtagatgt taccagactgattcaaaatatacccaaagaggaaaaatatcagcatatgtctttgattca tgaaatgatgtttaacagtgggatcagtcctgatcggtacttgacagatttgatgatgac tgccttctctgaaaggagctactatgaaccattggttgaggctcttttccaagactatga catcaatggctgggctccaacagattatacttttggagca
AA
MARIPKAVSTQDDIHELVIKTFRCYVLDLVEQYGGGHPGSAMGMVAIGIALWKYQMKYAP NDPDYFNRDRFVLSNGHVCLFQYLFQHLTGLKEMTVKQLQSYHSSDYHSLTPGHPEIENP AVEVTTGPLGQGISNAVGMAIGSKNLAATYNRPGFPWDNTIYAIVGDACLQEGPALESI SLAGHLALDNLIVIYDNNQVCCDGSVDVNNTEDISAKFRAQNWNVIDIVDGSRDVATIVK AIDWAKAETERPTLINVRTEIGQDSAFGNHHAAHGSALGEEGIRELKTKYGFNPAQKFWF PKEVYDFFAEKPAKGDELVKNWKKLVDSYVKEYPREGQEFLSRVRGELPKNWRTYIPQDK PTEPTATRTSAREIVRALGKNLPQVIAGSGDLSVSILLNWDGVKYFFNPKLQTFCGLGGD YSGRYIEFGIREHSMCAIANGLAAYNKGTFLPITSTFYMFYLYAAPALRMAALQELKAIH IATHDSIGAGEDGPTHQPIALSSLFRAMPNFYYMRPADATEVAALFEVAVELEHSTLLSL SRHEVDQYPGKTSAQGAKRGGYWEDCEGKPDVQLIGTGSELEFAIKTARLLRQQKGWKV RVLSFPCQRLFDEQSITYRRSVLRRGEVPTWVEAYVAYGWERYATAGYTMNTFGKSLPV EDVYKYFGYTPEKIGERVVQYVNSIKASPQILYEFHDLKGKPKHDKL
SEQ0173
2. chr3_0834 5' region aataaaaaaacgttatagaaagaaattggactacgatatgctccaatccaaattgtcaaa attgaccaccgaaaaagaacaattggaatttgacaagaggaacaactcactagattctca aacggagcgtcacctagagtcagtttccaagtcaattacagaaagtttggaaacagaaga ggagtatctacaattgaattccaaacttaaagtcgagctgtccgaattcatgtcgctaag gctttcttacttggaccccatttttgaaagtttcattaaagttcagtcaaaaatttteat ggacatttatgacacattaaagagcggactaccttatgttgattctctatccaaagagga ttatcagtccaagatcttggactctagaatagataacattctgtcgaaaatggaagcgct gaaccttcaagcttacattgatgattagagcaatgatataaacaacaattgagtgacagg tctactttgttctcaaaaggccataaccatctgtttgcatctcttatcaccacaccatcc tcctcatctggccttcaattgtggggaacaactagcatcccaacaccagactaactccac ccagatgaaaccagttgtcgcttaccagtcaatgaatgttgagctaacgttccttgaaac tcgaatgatcccagccttgctgcgtatcatccctccgctattccgccgcttgctccaacc atgtttccgcctttttcgaacaagttcaaatacctatctttggcaggacttttcctcctg ccttttttagcctcaggtctcggttagcctctaggcaaattctggtcttcatacctatat caacttttcatcagatagcctttgggttcaaaaaagaactaaagcaggatgcctgatata taaatcccagatgatctgcttttgaaactattttcagtatcttgattcgtttacttacaa acaactattgttgattttatctggagaataatcgaacaaa
ORF
AAACCAAAGCACGACAAACTATAA
Downstream gtagatttggccactaacgggttagtagttgtgtaagtctattaaatttgatttttgttt atggatgatcatcgtagtggctatctgtttacctgtaggacatcctagggtgggatggtg atgtacaccccctcaatcttcagatgcaacactatgtggtaggtcattgacataaggttt aggaaagacctgttttttgaccaataaatggaacaggaaggaaaggaggaaccagtttac gaaccccgtcggacatacaaaaccaaaaaatgagagcatgtagctcctcttgatggcttt gcaggaaacactgctgcactcagaaatggcaacgcccggatattgaagggacgaagattg tgggggagaaacctgtcactgcagatatgaggagtttgatttataggaaacatttttcaa tgtgggtgaatcctctgcgacctacatactatgtgcataatacgtgcataaaaagacaac aagataaatatagtaaaatatggggtaatagagagggctcaaccgaacgttagtggtagg attggggaaacagtattggagagagtggaatagatacacagctagtgttggattgtatag tagtttcgtgtttataggtcgctgattctagtactgcgtacagctgaagagagaaaaaaa aatctgaaaactggggtaaagaactccttaaaaatttttcatatgtatgatgaaagactg ctgcaggagacgttcacacttgaaaaatttgatcgaagcgagctacaagtccatatcgtg ttggaacagaatctggaggtgtaaacaccgtcattccctaatctaagtttctgtccagcc gtttagaaagttcctcttcgtcaataaagtttaactttgaacgtgagagttgagctaatc ggaaatactcccgagagcacacccgagcagtgaagcggggcatcatcaaaagaattaagg ggtgctgggtgcggattgcgcgcagtagctggcatcaatg
AA
MARIPKAVSYNDDIHDLVIKTFRCYVLDLVEQYGGGHPGSAMGMVAIGIALWKYQMKYAP NDPDYFNRDRFVLSNGHVCLFQYLFQHLTGLKEMTVKQLQSYHSSDYHSLTPGHPEIENP AVEVTTGPLGQGISNAVGMAIGSKNLAATYNRPGFPWDNTIYAIVGDACLQEGPALESI SLAGHLALDNLIVIYDNNQVCCDGSVDVNNTEDISAKFRAQNWNVIEVENGSRDVATLVK AIEWAKAENERPTLINVRTEIGQDSAFGNHHAAHGSALGEEGIRELKAKYGFDVARKFWF PQEVYDFFAEKPAEGDQLVANWKKLLDEYVKNYPQEGEELKARIRGELPKNWKSFIPQDK PTEPTATRTSAREIVRSLGQNLPQVIAGSGDLSVSILLNWGGVKYFFNPKLQTFCGLGGD YSGRYIEFGIREHSMCAIANGLAAYNKGTFLPITSTFYMFYLYAAPALRMAALQELKAIH IATHDSIGAGEDGPTHQPIALSSLFRAMPNFYYIRPADATEVAALFEVAVELEHSTLFSL SRHEVEQYPGKTSAEGAKRGGYWEDCEGKPDVQLIGAGSELEFAVKTARLLRQQKGWKV RVLSFPCQRLFDQQSLAYRRSVLRRGEVPTVWEAYVAYGWERYATAGYTMNTFGKSLPV
EDVYKYFGYTPEKIGEKVAAYVNSIKASPQILYEFTDLKGKPKHDKL
SEQ0174
TPI: triosephosphate isomerase (= also a gene of the glycolysis) gene unknown; promoter unknown chr3_0951 5' region ttcaacgagacactcttccgtcagttccaaaaccataagtttgccgatgtgttggtcctt gtaacgcatggaatttgggccagggtatttttgatgaaatggttcagatggtctgtggag gagtttgaaggcttacgaaatataccacattgccagtttatacagatggttaagggtgaa aatcaacgttacaccttgacgaccccattattacgatggcgtgaaggagatgaagaccgg gtagaagaaataagaaaagcggtacagtttaggtccggagatctagggaaggaggcctta gcttatattgtagctgctgagagagaggcagctgctggaagatctgaaggccctatcacg tatgatgatggtgatgaccattagagaacgcccagagattgatagccagttcttggacaa caattcggaactttattcacggtgcaaacatgatttgtgtggatagcttcaagtcagaca tttcatctcatccccccttttactgctgctaatcaccgttagtccgacagttactctaat caatatttattagtgttttagttgcgcaaaactcgagcctcttttccttatctcttgaca cttcctggagtcgaagtttttcagcgcaaattcactctacaatgtctaccgatactagac cgcctatcttccccctctaaatagcctattggaagggtgcaataaggtatataaatctgg cgcgattcccccggacttttatgatccacatcacctcatcttactgccctcactctcttt cctgatcctcccaggtccaccgatttcctcactatcgtcggatttctccttccagcgccc tagagaattccgtaaccaccgcaaaaatagcagcccccccctcacccatttttttattta aaagaacaccttactggcccgttttcgtttctcctttactacaattgatttttaattttc agttttttttcattgatatacaagatctatcacaaacaca
ORF (intron in bold and between brackets)
ATG(GTACGTGCATTGAGTCTCATAAGTGCCATCCAGTCACACCTGGCAGGTGGAGGCAATGGGAGGTCGGAGAA
ACAGAACCCACATCAGAAGGACCATACTAACTCTTCCCAG)GCTAGAACATTTTTCGTAGGAGGAAACTTCAAAA
CTGAATTCGTGGACATTATTAATTCCAGAAACTAG
Downstream tttacatatgaacatattactactctatattcgggacagcctcgattatttctctttctc ttcgtctcttgtttaaagtcttctttcatatcgttcctttttcatcctctcgttccgctc gatccttcaacgttgaaagagccagaggtgtcatattgggaagagaaccaccaacgggat ctaagtttggagattctgaccctatatgctgtatgacggattgttgctgtatctgtaacc caaggggtggagtttgtagccacatctttgatggtgacccgacgactgagactggcatca agccaggtgagtcaggggtgcggaccgtacttccagagccgtgttgcaccagtgttattg caggaccatgagtgggcaccgccgtataatgctgatgcctttggagcatgagatggttgc tggaattagctcttagatgctgtggcggtagccggatagcctttacaacttgttcctctt catcgtccactactgcactagaagtgctatcattactattgttctgttggttgtgtatat tcaatgaaagtaaactaatagtagaggagctgggtactagacggccctccacatctttac caacagacgaggccagaagtggtggtgacggatgttcatcagtgttgttagtccatctac ttgaatttgtattgagcccgcttgctggaatttgaatgggcatgccctgcatgctttcca ttagctcaggctctgaaagtactgagcctctaaaagatggaggcatgtctaagtcgaagt ggattggttttgataagtttgctgtaaactaatttttcgtatagagttcgcgaaaattct caccaccaccaagggtcaagtcactgatcgcttcaacccacaaaacgacgatattaaagt tcaacatattgttctaggtacgccagtccgcgatccaagtttgaaatctccacatctaca ggtgcgtactgtagtagttcatccaaagctgggatcttgc
AA
MARTFFVGGNFKMNGSKKSIHEIIERLNNTKLPENVEVVIAPPAPYLQQAVTENKQKTVY VSAQNSFDKASGAYTGEVSVEALKDLGVPYVILGHSERRTINKEDDAFIASKTKFALDQG LKVILCIGETLEEKQANITLDVVKRQLQAVVDWSDWTNIWAYEPVWAIGTGLAATPSD
AQDVHKQIRDFLATVIGKDQAEKVRI LYGGSVNGKNAVEFRDKADVDGFLVGGASLKPEF VDI INSRN
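The bracket notation used for the TPI ORF above can be checked mechanically. The following is a minimal sketch, assuming the standard nuclear genetic code and assuming that each parenthesised stretch is a single intron to be removed before translation. The helper names (`introns`, `splice`, `translate`) are illustrative and are not part of the listing. Only the ORF fragments printed above are used, so only the first and last residues of the protein are reproduced.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# the nuclear code applies; no alternative yeast code is considered here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# An intron is written in the listing as a parenthesised stretch of the ORF.
INTRON = re.compile(r"\(([^)]*)\)")


def introns(orf):
    """Return the bracketed intron sequences found in an annotated ORF."""
    return INTRON.findall(orf.upper())


def splice(orf):
    """Remove every bracketed intron and return the joined coding sequence."""
    return INTRON.sub("", orf.upper())


def translate(cds):
    """Translate complete codons only; a trailing partial codon is ignored."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


# 5' fragment of the TPI ORF exactly as printed in the listing.
TPI_5PRIME = (
    "ATG(GTACGTGCATTGAGTCTCATAAGTGCCATCCAGTCACACCTGGCAGGTGGAGGCAATGGGAGGTCGGAGAA"
    "ACAGAACCCACATCAGAAGGACCATACTAACTCTTCCCAG)GCTAGAACATTTTTCGTAGGAGGAAACTTCAAAA"
)

# 3' fragment of the same ORF, trimmed by two bases to start on a codon boundary.
TPI_3PRIME_IN_FRAME = "GAATTCGTGGACATTATTAATTCCAGAAACTAG"

if __name__ == "__main__":
    print(translate(splice(TPI_5PRIME)))   # MARTFFVGGNFK
    print(translate(TPI_3PRIME_IN_FRAME))  # EFVDIINSRN*
```

The spliced 5' fragment translates to MARTFFVGGNFK and the 3' fragment to EFVDIINSRN followed by a stop, which agrees with the first and last residues of the TPI amino acid sequence given above. The bracketed stretch also begins with GT and ends with AG, as expected for a canonical intron.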
DAK: dihydroxyacetone kinase
gene known; promoter unknown
SEQ0175 chr3_0841:
5' region tgaacatgatatccaaaccttttggagtaggttgtaaaagggtatttaggtgtttttcgt ccagggaaatcagtgttctttctcttggacaatgaattagcctagctctcaggaacatgt tcagcagtgctagacaggagtccaaatccatagaacatgaaatacagaggttcttataaa cggactttatgttcaaagacaatccattcaacttgactattgcatgagatgccttgaaac aatagaggttgttatggagttggagtttgaatatgccaaacaatcccgattcacttgggt ctccagcttccaattggagagacaagatcaacaggctaaagaaatacttgatctctttag aattgaacccgatagactgttcatttgccattgtcatctgctgatgctgtgagggagaaa gaagtaggggtgatacatggtttataggcaaagcatgtttgtttcagatcaaagattagc gtttcaaagttgtggaaaagtgaccatgcaacaatatgcaacacattcggattatctgat aagtttcaaagctactaagtaagcccgtttcaagtctccagaccgacatctgccatccag tgattttcttagtcctgaaaaatacgatgtgtaaacataaaccacaaagatcggcctccg aggttgaacccttacgaaagagacatctggtagcgccaatgccaaaaaaaaatcacacca gaaggacaattcccttcccccccagcccattaaagcttaccatttcctattccaatacgt tccatagagggcatcgctcggctcattttcgcgtgggtcatactagagcggctagctagt cggctgtttgagctctctaatcgaggggtaaggatgtctaatatgtcataatggctcact atataaagaacccgcttgctcaaccttcgactcctttcccgatcctttgcttgttgcttc ttcttttataacaggaaacaaaggaatttatacactttaa
ORF
TACTTCAAATCTGAAACCAAGTTGTAG
Downstream caatccaatcccaagagcaagttatgaatgagtaacgttgaggagtacctctaatgatta cttacggctaacggccatatctagtttttataaacatttaatgataagcacggttctgaa agttgatcaaattgtaagtgtgacttgtgatctatacagtcaaacttgtctagcaagtta actgacaattgagagcactaataattttcatgtggtctcttagcaagtagaccatagaaa tacctgtttcatctcttctggaggtgaccctcaccgaatcattggggtcaaatgcacttg gactacagtaaacctactccttttgtgactgtaccctaacaacgaataaccccataacaa tgcaatcaatatttcagaggttaggatgtattacagacgtaaagagtcagttcatggaag
actatcggtcggtaggtagtcatttccccattaggaagtctgtttagtcgtgactcagcg aatacgttttcttatttaagttgtaagatcccgaattgcgagaggctcatctcggacaac gctggtctctatccttacatgaccaaccaatcaacagtggtggatttacgcctttcatcc aagagagttgttggcaaaccagtcaagttgcccacagtcctagcgtgetcagggtcagat tcttccggtggtgcagggatcgaagcagatatcaaatccatcacggcttttgggtgctat gcgctaacagcaattacatctttaactgcccagaataccaaaggtgtcaccagtatagaa aacaccgacccaaagtttttcgaagagattttagaggcaaattttgaggacattgaaatc gatgtggtgaaaactggactgttaaaccctgagtcatctcgtttattgctgaaattttta gataaataccacaaaggaaagccatttgtcctggatccggtcttagtggctacgtctggt tcaatgcttgcagatcaacacgaattagggttcaccattg
AA
MSSKHWDYKKDLVLSHLAGLCQSNPHVRLIESERWISAENQEDKITLISGGGSGHEPLH AGFVTKDGLLDAAVAGFIFASPSTKQIFSAIKAKPSKKGTLIIVKNYTGDILHFGLAAEK AKAEGLNAELLIVQDDVSVGKAKNGLVGRRGLAGTSLVHKILGAKAYLQKDNLELHQLVT FGEKWANLVTIGASLDHVTIPARANKQEEDDSDDEHGYEVLKHDEFEIGMGIHNEPGIK KSSPIPTVDELVAELLEYLLΞTTDKDRNYVQFDKNDEWLLINNLGGTSVLELYAIQNIV VDQLASKYSIKPVRIFTGTFTTSLDGPGFSITLLNATKTGDKDILKFLDHKTSAPGWNSN ISDWSGRVDNFIVAAPEIDEGDSSSKVSVDAKLYADLLESGVKKVISKEPKITLYDTVAG DGDCGETLANGSNAILKALAEGKLDLKDGVKSLVQITDIVETAMGGTSGGLYSIFISALA KSLKEKELSEGAYTLTLETISGSLQAALQSLFKYTRARTGDRTLIDALEPFVKEFAKSKD LKLANKAAHDGAEATRKLEAKFGRASYVAEEEFKQFEΞEGGLPDPGAIGLAALISGITDA YFKSETKL
FBA: fructose 1,6-bisphosphate aldolase
2 genes; genes unknown; promoters unknown
SEQ0176 chr1-1_0072: See SEQ0065
SEQ0177 chr1-1_0319: See SEQ0066
SEQ0178
FBP: fructose 1,6-bisphosphatase gene unknown; promoter unknown
chr3_0868: 5' region tttgatgatgatttgttgtgaacgttccatgtgtttgaatctttggacggccggtaaaaa ccgactgtttttaatgattgctggttagccttgtcgacaaatgatttgtatgaagctctg accactgaaaggatttcagatcggttgaaatgatcgacttttgctgtttgagtgttggta ttaggagttgtagtggcaccagtcggggtgctacttcgtgatgagctgggattctttgcg ttcacagccttcgaagtggcatgcttgtcatttcttatcttatccttaccagcagccagg ttcgcccactcaccttgaactggtacaaggatatttctctgtttttgctcggtggttccc attgctcaaacgagtggagagggaaatcgattcagcagttaaatcaatgctggaaaatat tcgagattacctaatcggatctggaacttacttcgacctgacattttcttgcctggggag ccacgatcgattatgtaatcaagaatatggacagagggaaacagatttagctgtcaaaag cccaagagaagctaccgatcaatggatgcggatagataaagaaaagtcctttttttttca ttagccatccgagttgtccaatcaaatgtctgcctgctacgctggagaggaatcacgcgt gtttaacattcggattgtcgcctaaaataagcctattacctacacagtaaaacccggggg gtgctttggtatcaatgaccccgggattttatccaccagtttttttctttctggcaagag tgcattgcatccccgtacaaatagtagcaacctccacaagaggaatcccctatgagcgag aagtccatagtaatacccccgcggaaaagagatattttgtttccgtgttgcccttgaact tcagtttcccccatcagtttatatagtagccgggttcccaatctctagcccttctttcct cctatttcattcctctcttcttacgttatcttacattagc
ORF
GTGAATCTCTAG
Downstream tgcgttcgattggcactgtttccgagatttgatactttgtacacgatgttatatgaggtt atatacttgataagagggtttttacgtttgcaattagcacaatttcggagtagcactggc gggagtgaaccttgagtagtctggatcaatgtaatcttcgtataggctagacaccccgga ttgggagtgctgacgaaatgatgttgggatgtgatgacatcatagggataatagaaaagt aaggttccgcgtgagccgttgaacgcgcactggaatggatggtctgtgacgtagccagac tgaacttgaaattccttccaagaaagtacatttttattteatteatteattcgaaaggga ggcttgtgggggaacccccaatcaaatacctaactactacttacaatatccaaacctaac aacgagctcctcattcccgatcactccttctttctaccaaataatcctctcttcttcttg cccttaccaccacttcccgttggtgaattaacatcctgttcaaaagtattctgatctcct gaaaacgagatcacggggttttgaccagcaatggggggacgaggtctcattctgtcgtga tattcgtagtttctgccggactcataacctggaaaatcaggacgtaccctccaacctaaa gtcggggtctccagtagttgtttataagtagtatctccagtgatggcatattcaaaggaa cggatcgtatctaaaggcctctcgtctctcgatctggcaggattggaaacgtcggggtta ttgattggttgtccaaaaatgtcacggatgtactctccatcgcccgacttattactaaac gattgttgtctggcatgagtgtagtccgttgattgttcgaacggctgagcctcgttgaca gccatcaagatggggtcgtgaatgttgtttgatctgtacttgagtctgtccgtgtcgtct ttggtgacgtgaacttttggtactttaccaacatcagttttc
AA
MSNNTTQNLAEQKGIQTDLVTLTRF I LDEQKKSAPNATGELTLLLNSLQFAF KF IAHTIR RSELVNLIGLAGVTNATGDDQKKLDVIGDE IF INAMKGSGNVKLLVSEEQEDLIVFESSK GNYAWCDPIDGSSNLDAGVSVGT IFGVYKLLPGSAGS IKDVLRSGTEMVAAGFTMYGAS SHLMLTTGNGVNGFTLDTDLGEF ILTYPSLKIPHTRAIYS INEGNSHYWTDGVNEYIASL KKPQANGKPYSARYIGSMVADVHRTLLYGGIFGYPADSKSKSGKLRVL YECFPMALLLEQ AGGEAVNDKGERILNLEPKQVHERSGIWLGSKGEVERLLPYLTKKIKIQSVNL
8 Annotation of homologues of S. cerevisiae proteases
8.1 Serine-type peptidases
SEQ0179
NMA111 homologue: Protein of unknown function which may contribute to lipid homeostasis and/or apoptosis; sequence similarity to the mammalian Omi/HtrA2 family of serine proteases
S. cerevisiae null mutant: viable; increased heat sensitivity, increased lifespan; normal sporulation
S. cerevisiae overexpression mutant: increased apoptosis; decreased vegetative growth
chr2-2_0367 (69% homologous) 5' region ccacattgtagcttatgttctgatatttttcagggataggttgaaacttgagaccttgag ttgaatttctaggtgctcgagctgttcgatttacgtggtttcttatagaaccatttggtc tgtccgcatttggtctagggttattgtagatgtaacttgaaggtctggagccatagcgtc tagaatgtggatcttccttctcccaagagaaatcatcccaatcactgtactcattgtacg agggcctcatattagaggtacgttctcgtccataaccgtcatgataagaacggttatacg acatattgctaaagatgctatcaaacgttgccctcctaatagaagtctggtaaatagtaa
aactgaagaagaatgcgcgttgaacaattacataatcacgaaacttacaatcctccactt tgagaatcctcttaactaagtatactatttcttttggcaagcagtagaagagagtgcatg catcgccgatctaagttcagaatccgtcttgttgtatccgctgccatgtccaagaacttc ccctttgttatttagacatttagagataaattttgaaccattgaaggcagtagcgtatct catgggacccattaacttatgaagagaatccttagcggatctgtcaacgaggttttcaac ccgcaacttgtaaattttctctgaagcaagtttcaagatccactgagtaagtaaagtaag taggttatagtcggaccccgtctcgattttgagagccgcaacgtacgccatgaaacactc tgccattactttgggtcggaactgctttgctgggaacttgctttccagtagctctctgaa atttaaacgaagcgcccaaatatggacattgttaaagtgtaaaacttcttgcacggcaca atctaacagctcagcatctagcgaaggatgtagttcattcaatttgaaactgactaagca atgacaaaactttgaacccaaataaaccagagagtcattcgtttgaacactggaatcatt cgatacgagttttttatgaagaagagctttgttttcaaaggtctcattctttaacttagg aagacggtcaatatcttggaaagtaactctataatcgtcttgaatgctgatttcattttt atggttgtcagaatcaactgaacctgcagtgttctcatttctttcagtctcatgactatg cacaatttcttcctcagtggtttcgactaaagatctctgaaactggtttatagcatcaat caggccgttcactctatgttttttgtttgaatcaggttcactcacagtatcatccaaaag agagtcatggccccgtttggaagatctctcattttgccgctgaacctctatttgggcatt gtgtattatattctctaccagaccctctatatcttcggagttttcttgcgagttcaatag agaagtggaaaaaagttgtctagaatgagggcacggttcattcttagctttcatagatgt cgccccctagttaattgaaatcttaagttcttgaatgagattgtactcttaattgtgaca gatattagacgaagtaaaccactaatgacgaatatgagataacctctgtgattcttctgg aagaacgtagcatttgcagtgaaagagcagacctgttccattacgtaatccaaatatgga agatactcaagtctagaattacgacctgtacttactcattccaccaaaggtgttagggct gacaaccagtgaacctcccattcatccagtagtcaaacatatccaatgtttaccataatt agtaattctttaaacaatctacttagttcggatatgcttcaatggccctttagagtggta aaagttggcagagtaatccttctttacaaaccgtctaactttgggttcttgaatatagta gtcccagccttgtcctttagcgaaacggatagctccttctctagtatcaaacttcaagat agttccttgcatgtagtcagaggaggactggtaacccattaagtcattttcccatctgtt tcctttattcaggatatcccagtctattttccacgtttcagtattgtgctgaccactctg ggtggcagctttggcttcttggtaaatcctggccactctctcggtaataagttctttggg agccccactaaccaactcctttgctaagtctgtatggccttcatttattgcgggagtaat attggtattcttggcagctgatacaacacggccacacgagaaccctcttacttgtacttt 
cagcaaacttctcaacattttgtgatggacgttacaacttcgtagaagttggagattgga ttataaccattggctcgcgacttcaaaaaccatgatgacataagctaagattggtatgct cgaaatgatttttttttttttcactccaaatcaagatttgcaagttactcattacaggga ggtcttaactttggctccgcttgatatagtaactactcac
ORF
AACGGGAAGGATCTTCATTTTACTATTCCTGTCAATGACTTACATGCGATCACTCCAGCTAGGTATTTGGAAGTT
GCCCGGTTGAGAGGGGTTCCAGAAGCATGGATCGAAAGAGTAGAATCACGAAAAGATGTTGAGAGACCTCAATTT
GAATTAAACTGA
Downstream atacagtcataaatactaattgacttcattcattgtctctaatatatacacgtagcgact ccctatagattgggcaagagtttagctaatctagctttttgattacctgctcctctagac gttgcctgacacgtaccaaccagctttcccaacactcggttagaccacccagcattgcca tgttttcttccagagagacctctcttgtatccattgcctgtttttctccatctcttataa gttgcccggtgagtcttcagcaagtttcgagtaattgtaaggaccggagaagaatgtgaa attgtggaggcggaaagagccatccgacctgtcatcggggtcttgttaaacaatatgtga aacatgtccaaagaggtgaaaagttgaaatggaagatattcgatacagttcattctatta tattttctagctactgttcgcgacatataatgagactatgctactatcaacccttgctca gaacttgtaatttactaacttacggattcaatacttcaatatgtggtgacttatgagagt actaaccatagttaaataatacaatcactgaaaaaaaaagaaatatatgctacaaaaaac cctgtatttgaagtacgcttctggaaaatatcaacttcaattttgtaaaatttttgaaac gtgaactgtgacgcgtgatggtattcaaaagtggccactacatattgaatcaatcagtcg agttacaacaagctgagtaaattattctggtcatatatcttcatgggctttctcgtcctc gtcatcatcagaactactgtaccccactatatttcctcttgaagaattgtctaactttct tttcttaagcgtaggtttagttttggcagaagtgataaggctggagacctttaaatgttg cattgcatttaggtttggacccatatcttttagctttcgatactcttctactcgttgccc gagaaagcgggcctttatggattcttgctccttttgttttcgttgtagttcattgtaaaa cgtagaatctttttcattaagcctgtaggcagaattggtatgttctagctgttgttggta ttcttgatatttttccaacttgtgcttctccagttgctgaactaaagtctttccctctct ttgacgttgttcttcagcatgctttgcagactcttcattttccctttcaatttgtttttg aaatgtttctgaatctatctgccctgcctgcacaaatctatttgacatgtactaccatca actactcgttcgtaacttgtgaatagtatcatatttcatgcttagtatgtcaagtatgta atcgcgaactatgtatgtcaaactatgttaaccattattaacagtatacatatataactc tatttacataggaggaggttaagcagaagcggcagtagcttcattctggtgaccttgggc ttgagcaattatttcttcaactctcttactaagagctggatcttttttcacccttgcttg
AA
MTQHVSKKRRTDELSDSDNFNTSDHEQE IKLNSIPYVANGSSNQSWQTTIEKWQSWS IHFCQVASFDTEDAW SQATGFWDSVNGYILTNRHVVGPGPFVGYAVFDMHEECDVKPI YRDPVHDFGVLQFEPKNIKYMKVSELTLRPD LAKVGCE IRVVGNDAGEKLS ILSGF I SRLDRNAPEYGSLTYNDFNTEYIQAAASATGGSSGSPVVD IDGYAVALQ AGGS TESSTDFFFPVYRALRALRC IQNGEPISRGTIQVQWILRPFDECRRLGIRSDNEKTMRDKFPS IHGLLVAE WLPEGPADGSLREGDTL IS INGELVSSFVKVDEVLDΞSVGQTVELWDRNGKDLHFTIPVNDLHAITPARYLEV CGASFNDLSYQMARLYAIPVRGVFANRASGSFTLDTRDRCGWI IDSLDHKDTPDLDTF IEVLKS IPDCSRVQISF RHISDLHT IEESWYIDRHWYSEFRLATRNDETGLWDFTNLQEKPLPPKPLRPLHAKF ID IPTEKPGCSKLGHSM VLVTAHFPLVMDGYKDNTSRGYGVIASAENGYVI ISRKWPHDLIEVFVTVAES I IVPAKVKFLHPLHNYAIVKY DPSLVEADVLTPNFSSNPLKRGDKWFVGFNQNMRAVSDI TKVSDIPVLNVP INPLSPRYRACNFEGIQVDSSVA NPAPSGVLADEDGT IRALWLTYLGSVTDEGYDRVFGMGFDTSHINQIVSRCIANEGHLTDLRI I DSEFYALPVIQ ARLRGVPEAWIERVESRKDVERPQFFTVLRTSTPAIGMEASPLKVGDIVLSLNGHAVNKMADLDDMYTQTELQVE ILRKKQI IKVTVPTVSTEKFNTSHLVYWSGALLQPPHQSVRQVMKNLPSS IYIMSRNQGSPATQYGLNSTQFVTH VNEQETPDLEAF INWRGIPDNTYCKLRLVSFDNIPSALTLKTNYHYFPTSVIRKDEAEDKWIE IDFTKDGPVRL
ELN
8.2 Serine-type endopeptidase inhibitor
SEQ0180
Homologue of TFS1 (YLR178C): Carboxypeptidase Y inhibitor, function requires acetylation by the NatB N-terminal acetyltransferase; phosphatidylethanolamine-binding protein involved in protein kinase A signaling pathway
S. cerevisiae null mutant: viable; decreased resistance to amiodarone chr3_0640 5' region tggtacgtttccctttctcttcgtctcgaggcacaatatcaatcagaaaatcgaacatat cagatttctttaaggccgcggcaatatcagatcgttgtaaagttcgccttttgttctctt ctgcgtgtatccacgctctcattgttagctctgtgatgaaaatatcacatcccttggcaa acaaaataggggcttcggcactaatcattttcacctcttcatcagttttcattactttct tgattctggccaaaggtagctgatggttcttaaaatcatgatcgtcgtgttcaatagagt tgatggtttcctgccaatattgcatcatcatgttcttggatcggcctgtcagcccttgac ctacattctggaaggcactgccggctccgaactcatcgtcttcatcctcttcttcttctt gtccttctgattggtcatcctcggcagcgttttcctcgttctgctcgtactcttcgtagt actcctgctcttgcattgtattctattaatcaacgagtatagagtctccctgtggggatt gaaggatggaagagtttgatagggaggttcgcgacggctattagggaaggtgtgccaagg ctaaagcttcaggacacgcagaaaaggaaagacccgtccgttttaaatcctccgcctacc gcccacgaaactgcggactcaacgggcaaaaaaaaggtaatctgtcttcttgttcccctt tcatcaacaccttggactgcaaaggtcctaacaacggctgatagcttttttcgtacacac atcccattaaacatgcccagtccatttcagtcatttcaatcaaacctgtctccttctaag catacgcagtcctttcaagagctctacggcgaaccagagaactttcttgaaattgaagta ataaaccccataacacacggatcaggttctagcatgtacacggactatgaaatcgtttgc agagtaagtacaaaaactcaattgccataaagaacgaattactaacaatacagaccaaca taccgatgtttaaatttaaggaatcaagggtgcggagaaaatactcagattttgactcgt tcagaaaggtactagagtcccagacaaataacgtcgtgatccctaaacttcctgagaaat cgttcttcaactaccatcgcttcaatgatgacttcattgaggagcgcagacagggtttac aacagttcttaaaggtgattgccggccatccacttcttcaaacgggctctaaagctttaa cttcatttgtgcaggatgaacattggaataaatctaagttcttataatcaacggttaaag ggagaataaaaggggtgagagcgtcatttttctagaacatggtgaggcataagtacctcg atcttatggggaaacacataatctcgaactaataattaatttgcaaacttgcttttgctg tttcccttctattttctcttcccacttttttctttttgagtcttcaccagctcaatcgat tg ORF
TTGAGTCTTCTTGCTGTGAACTTTTTCTTTGCCCAGAACAAGGACAATTAG
Downstream tacttaagtagttttttattctctccttaaaatttcgttatgtaagcattttctttcaac ctcgaaatctgccttttcttcgttggtgaatcctctttggcataggaacggaacaacttg gattagattcgatacccgccttcctctctttctttcgcctccctaaaaactaaagcacta tggagactcagattagcagtctgatttctgctagagaggttgcattccagaaaccgcaac tgtatacagacatctttcacaaccttcttccgttcatctacaatcaaaatactactttac aaaccttggttattgactttctcaaagaatcttttgtggatgaaaaattggataaagagc ttcagcttcagctggctgatgaactgcttcaacctttgagttacctaatttcctcaaagt ctcatctagaattgccaagggccccttcacctgctaattttgtcaagattattcaattga ccaaccatatatatccgatcattttcaaaaagtgtctgtccagcccccagatgcaggact tggagtccaaatgggaccaattacagagtttgagggacaatttgattcaaaaatttggta ctgcatatcctttactcccactgaatcctgaatcggacgccagtcgctcaaccagatcta acataacgctagttatgttcatcacaaatgtgttgtcaatacttagtattgctccaccaa aggatccaagagcagctagggctagaaaattgaagaacacaccggaggatctatctatag
tgagtatcccaagtgatcacaaatttcttaacatgactcgtctggaggttgaaggaagac aaattttagacgtcctctttgatgctttaaatgattccaggataataaactccccgttat tttgtgtcattacaaaccgtctgttaacattgttcaagaaaagaccaggtttagtaagcg tcaaactgtttgatttcatgatagcctatgaatcaatgaataaggaagacccactctttg agtcgaaccagttgaaaatgcgccttattcgaaggttcaacgatcgtattctgaaagtaa tgataggatattgtctaaataagagttttgtgaaagatacgaagttactcaaaaggttcc agaacaaattcacttactaccagggcatgtttgaagatcaacgtaagaggggtctgttga aaccagatagtcatgaaattaagagagaagaaaagagggctgaatttaagaagaaaaaaa ttaccggaaatgaaagtttggaagagcaacttcagaacaacccttccgaatcaccccttt tttactacaattcaagcccaataacgcctgatttgtcttactcatcattatactcattaa tatcgccaaacaataaactggctagcttcaatatgagtgaactaccacagaacattttga ttgacatgattgcatattctttacaaaatacaactactagaaaactagtaaaaggactag
AA
MLLGFQRFPKFHTLTRKSPINSIRSLRYKFQRPLLTPLFLSPIIKQSMMLVTISDSIRENLIKHEVIPDVIKDKS FVPFGLLIISYGSPDKEVVLGNTLKVEDAQSIPKITFTVNLNEEQEIASFFDEKFTLVVTDPDAPSRTDNKWSEF CHYWSDLSLSTKSGTTDAEDQVNFTTDLKVDTIPESDSKTLVPYLGPGPPPKTGLHRYVFLLYKQKPGVSLEGP DPKNRPNWGTGIPGSGVSDWAAKNSLSLLAVNFFFAQNKDN
8.3 Aspartic-type endopeptidases
Homologue of YPS2 of P. pastoris: aspartic protease yapsin 2
S. cerevisiae null mutant: viable SEQ0181 chr3_1157 5' region agccggggatgcgatgcactttttttcaccttataaccatatcgtttagcttcaggttgt ttctattcccaccgtttccggccactatttaqctagtagaaaqatcagaaatgtccqtgt ttacgccaatagaggacgccctgaaggcttacagtatgttactcttcctgctacttgaaa acgaatcatatgattactaacgcatctagaaaatggagaatttctcatcgtgatggacga tgaagacagagaaaacgaaggcgacctgattgttgccgcagaattgataacgacacagaa aatggccttcctcgtgaggtattcctctggatatgtttgcgttccactttccaccaagag agctgcagagttagaactaccaccgatgatcaagaacagaacagacaggcagggcactgc gtacacacacactgtggacgccagagatggcactaccactgggatttccgcctctgacag agcgttgacctgccaggttttggccaaccccaggtcaaaacccgaagagttgatgagacc gggacatatttgccctctgattgcacgagaaggcctgttgaaagagagaagagggcacac ggaagctgctgttcaattgtgtgaattgaccggattacaacctgctggagtgataggaga gctggttcgtgacgaggacggctctatgatgcgattagacgactgtgttcagtttggtct ccgccacaacgtaaaaattatcaaccttgaccagatcattgaatacatggattccaagaa caqctagatacqatqgataqqaatacagagatatcatgattgaggaacqtaagagctttt tcgaaagtgtgagtttgtggtgagggccaggcggtggggaggtggtggggagcctccttg gtcgaatgtagatatagtaagcaagacacaagagcgcgcgaaqtcttcaacgaggcggcg ttgggtcttgtacgcaacgtaatgactacacagttgagcttgtcgcgaaccggtcgacat tttgatcatgcatactatgttgagacaccatctcgtactattgcggcaaccagctgtaaa tttgactaattaaagctqatgaaggatgcagggcgtcgtcaattttttgattgattgcat ttaattgtttgagccattcaaggctgaatgcccggcaccctagacccttcttgtgagtac tataaacccgcaggcagggtacccttggccttctgcgagactaccagtcataacgtatat ccacaatgtactagtaatagccccggaaaactctaatcccacagaacgtctaacgcctcc tatgtcatcgatacccattcgcactactgccatggccccccttacgtgatcatttcactt actcccgcctaagcttcgcccacatqcctgcgttttgccaagatttactgacgagtttgg tttactcatcctctatttataactactagactttcaccattcttcaccaccctcgtgcca
ORF atgatcatcaaccacttggtattgacagccctcagcattgcactagcaagtgcgcaactc caatcgcctttcaaggctaacaagttgccattcaaaaaagtttatcattccaacgacccaa aggaccgtttaattaagagagatgactacgagtccctcgacttgagacacatcggagtct tgtacactgcagagatccaaattggatctgacgaaactgaaattgaggtcattgtcgaca ctggttctgccqacttgtgggtcatcgattccgacgctgccgtctgtgagttatcctacg atgagattgaggccaatagcttttcctcggcttctgccaaattcatggacaagatagctc
ctccatcacaagagctcctggatgggctgagtgagtttggatttgctctcgatggtgaaa tttctcaatacctagccgataaatctggacgtgtttcgaaaagagaggaaaatcaacaag atttcaacattaaccgtgacgagcctgtgtgtgaacagtttggttccttcgattctagtt cttccgacactttccaaagcaacaattcagcttttggtattgcttaccttgatggaacca ctgctaacggaacttgggtcagggacacagtccgcatcggcgactttgccatcagccaac agagttttgccttagtcaacatcacagataactacatgggaatcttgggtctcggtcctg ctacccaacaaaccaccaatagtaacccaattgcagcaaacagatttacttatgatggtg ttgtggattcattgcggtcccaaggatttatcaattcagcatcgttttctgtttacttgt ctccagatgaagataacgagcacgacgaattcagcgacggagaaattttatttggtgcta ttgatagggccaagatagacgggccatttagacttttcccatatgtcaatccttacaaac cagtttaccccgatcaatatacttcctacgttacagtgtccacaattgcggtgtcttcgt cagatgaaactctcattattgaaagacgtcctcgtttggcattaatcgatacaggtgcca ccttctcctatttgccaacctacccattgattcgtttagcgttttccatccatggaggct ttgaatatgtttctcaattgggactatttgtcattcgtacaagttctctgtctgttgcta gaaataaggtgattgagttcaagtttggtgaagacgttgtgatccaatccccagtttctg atcatctattggacgtctcaggcctttttactgatggccaacaatactccgcattaactg tacgtgaaagtcttgacggactttccattctaggtgatacattcatcaaatcggcctact tattctttgacaatgaaaacagccagctgggtattggtcagatcaacgtcactgatgacg aggatattgaggtggtcggtgatttcactattgaacgagacccagcctactcctctactt ggtctagcgatttacctcatgaaacacccactagggctttgagtactgcttcagggggag gccttggtaccggaataaacacggccacaagtcgtgcaagttctcgttccacatctggct ctacttcacgaacttcttctacatctggctctgcttctggtacttcttcaggtgcatctt ctgctactcaaaatgacgaaacatccactgatcttggagctccagctgcatctttaagtg caacgccatgtctttttgccatcttgctgctcatgttgtag
Downstream tagactttttttttcactgagtttttatgtactactgattacattgtgtaggtgtaatga tgtgcactataatactaatatagtcaaaatgctacagaggaaagtgcaggttgcctgtgg tggtttttcttattagcaccctctgaacactctttacctctaacatcctcaqccatgcta atcgcgcataaaataaatcttcgaacttttttccattttatgctcataaagcttccttac tgtcaccttatcaaaagagcttttgccactaaagtagtcacacccagaattgctcccgaa tatcgtccaacaatgctaggatctgtggaaagtttgacaaataatttgaacaccttgagc ttgaagcttcctgaagttaatatccaaggctcctttccagaaagtaacccagtggacctt ttgagaaactacatcactcaagaacttagtaaaatttctggagttgacaaagaattgatt ttcccagccttggaatggggtaccacactggaaaaaggtgatcttttgatcccagttcct cgtctgagaataaagggtgctaatcctaaagatttagccgaacaatgggctgctgcattc ccaaagggtggatatcttaaagacgttattgcgcaaggacctttcttgcagttctttttt aacacatcggttctgtacaagttggtgatatctgatgctctggagagaggcgatgacttt ggtgcacttcctctaggaaagggacaaaaagttatagtggagttttcttctccaaatatt gccaaacctttccacgctggccatcttagaagtacaatcatcggtggttttatttccaat ctgtatgaaaagctgggtcatgaagttatgaggatgaattatttgggagactggggaaaa caatttggtgttcttgcagtaggatttgagcgttacggtgatgaggcaaaattaaagact gatccaatcaaccatttgtttgaggtctatgttaaaatcaaccaagatattaaggctcaa tcagagtctactgaggagattgcagaagggcaatcattagatgaccaggcaagagctttt ttcaagaaaatggaaaatggcgacgaatcggctgtaagcttgtggaaaagattccgtgag ttatccattgagaagtacattgatacttatgcccgcctcaacatcaaatatgatctttat tctggtgagtcacaagtcccccaggaaattatggacgaagccaccaagatctttgaagaa aaaggcctgcttagcgaagatcagggtgctaaactgatcgatctaacaaagttcaacaag aaattaggaaaagctatcgttcagaagaaggatggtacttcattatatttaactcgtgat gttggtgctgccattgatcgttacagaaagtataaattcgataaaatgatctacgtgatt gccagtcaacaagatctgcacactgctcagttctttgaaattctgaaacggatggatttt
AA
MI INHLVLTALS IALASAQLQSPFKANKLPFKKVYHSNDPKDRLIKRDDYESLDLRHIGV LYTAEIQIGSDETE IEVIVDTGSADLWVIDSDAAVCELSYDE IEANSFSSASAKFMDKIA PPSQELLDGLSEFGFALDGE ISQYLADKSGRVSKREENQQDFNINRDEPVCEQFGSFDSS SSDTFQSNNSAFGIAYLDGTTANGTWVRDTVRIGDFAISQQSFALVNITDNYMGILGLGP ATQQTTNSNPIAANRFTYDGWDSLRSQGFINSASFSVYLSPDEDNEHDEFΞDGEILFGA IDRAKIDGPFRLFPYVNPYKPVYPDQYTSYVTVSTIAVSSSDETLIIERRPRLALIDTGA TFSYLPTYPLIRLAFSIHGGFEYVSQLGLFVIRTSSLSVARNKVIEFKFGEDWIQSPVS DHLLDVSGLFTDGQQYSALTVRESLDGLSILGDTFIKSAYLFFDNENSQLGIGQINVTDD EDIEWGDFTIERDPAYSSTWSSDLPHETPTRALSTASGGGLGTGINTATSRASSRSTSG STSRTSΞTSGSASGTSSGAΞSATQNDETSTDLGAPAAΞLSATPCLFAILLLML*
Homologue of YPS3 (YLR121C): Aspartic protease, attached to the plasma membrane via a glycosylphosphatidylinositol (GPI) anchor
S. cerevisiae null mutant: viable
2 homologs
SEQ0182 chr3_0303 5' region accgcttgatcaaaagaaagaatggtattttcagggaacaatttcctatagcctctatga gctccaacagtgatagccattcttcggtgcagtaggaagcgatgagatgagatgaggttt tttgacggcattgcgcgaaaaaaagaaagccgggaggagagcgaatccatggagggacta ggcccaggagactttcttctgctgacgaacagcttgaagaatatgtgagagatagggccc ttcttgcacccaaggcggattaggcggacattggtcacatcctcagctgaaatcaacaag aataaagtcaaagctctatgttctgcgttacaatggtcaatcaagtgattgtagattcct tctagctcttgttattctctgttcccgcaggtagctatgactctggccagaaaactgata gacaaacttaacctgactcatatagtctagggtaaaacctcgaagatgacttcattcaac ctctatgattgtctagatgttatcagtccggagtatccaagcattcgtgaatttatcatt gatcgcccagtcaaaatcattataactacagctgaaacgctgcgtgtttaaatgatcgtc gatcgttggtgtgtgacgcgtgcaagctcgaagtgcttgcgcttgacaacttcaataagg gctgggacctttgaagcttgaccacggtgatcaatccacaactcaaaagcagatctgaat cctgtaccaagatataaactcatatactctccgtaaaactatgcgtttattccattttga gtcgtgtgatattgggccttccgaatttgctggctcaatttcttcgtatacgggacaagt aacctaaaaaagtgacaacgctttgaaatctcaaataagtcaaagcacctttttcggggc gaatgcacactttcggtccgatgagaagagaactagtgggctgcaaggacgtctgcattg cctctgatcgggctaaacaggtggttaggctttcctatgcagttcgccggcaaataaccg ctaatcgtgcaaagggcttcaaggacactgttgtttcatttaacccaaaagtgcattatg tctaactctgaacaataaatcctgctcactcaatagatgtatttggattcagtgcgggaa tggacgtttctctaacttcaacaataagctcactcactcttttatccttcaggcaacata ttgcagcattgtggcaattttccgtctgttgaggtcttggttacccaaatttttccttcg aattgaacccttaaggcctatttgctcaagaaaaattcactctcacttcgtttcagtgaa gaaaaaaaaagctgagaaagcttgcctcaattggtcttggatgcacaacagatgtgcata tagaagcttaagcaataaacgcatgaccccacccaaattaactccagcaattgggctagg tctgcaaattgtaaataaagccggcctcctcttcttctatctgttgacagggtcgtcaag
ORF
Downstream gaacatagcagaagctaaggtacttttaaatctacatacctactaatgtaattgagatat attgtagtgtaagcattattctagagaagttttcaaatctattatcatgagcacatacta atacggaattaaatagaagctatcggttgcggccacatctagcagaaagcaccgtttccc gtccgatcaaccgtagttaagctgctaagagcctgaccgagtagtgtagtgggagaccat
acgcgaaactcaggtgctgcaatctttttttgtctctcctagctgtaaagtaactcttgc ccatacgcttttgactggtgcagtgtagtaatcgaataggggcctgacattcaatggttc acctttcattcgttttgagttttctggtttccagatcatgtcaattgtagttaaacatat attttactgtgaaggctttggaaagcctcaatcaagaagtccccaccacttagttggtcc tgcggtttcttctaggtatttcattccacactacgacaatagagagtggcccaatactaa ttagcaacataccgagagagctcggagtgacttgccagttttagaaagattcatccacga tgttatcttcatccacgttctgaacccttatatgggtcatgtctagagtttatcactcct ttatcgtgttgagccaagcgcaccgctataagcttccgtcagttaccgaactgataacca tcaaggaccaatatggtggatcatcacacaacagttattaagcgacacccgcgatgttaa tgtgaagcgaggctttcagggtctatgacgtagtacaggggttaaactatgtggtcctta gctgacctgcatgcactttgactatcatcgaccttataatgcctttattatccgaaggat tgaggacggttagaacagctattacgtcactcaaaagacaagtgataattttcagttaaa gaactaattctgataagcctttttgctgcagtatgatgtagtaagtaagtacgttccgct gttatccgctcacacatcctgttagactgaaagagtcacacatttctccttggatctacc tgactgtcagtgtctgctcagccttgttgaggacaatatattaatcttatttgaaaccgg ttgaactttactcatcaatctatctacaatggctcattttaggataaacatqcttcaaaa acgatccgtccgtcgtcaaattgttatcgttggtgctggcagttatgggatcgctctagc taaagaattggcaccgctggtagaatctatggccgatgttgtgctggtttcagactctga acaagttatctttctaccatcattggttcgattggatgctcttaaaaacgttgctaatct attcgttggtctcgagggtctttttgctaaaaccaaggtaaagctggtcgtggaccatgt tgtcaacatcacgaaaaggggagtgcaattggagcatacaggattactccaatatgatta
AA
MLPIRLSKLLLLLSLKLKLGTAEEKYQKLDLKRIDKDYYAVDVKVGSDEQEIKEVLIDTGSSDFWILDKSFCNSP TSEEEENSNGRSNKESCGVYGSFDSNKSETFQATGQVFDAAYGDTTAESTGSSGVRGIDQLRVGDIHIEELYFGL VTNTTSLPPVLGIAQLSEEFSNNSYPNFPYQMKEEGLIDVVAYSLSLGQSKGELLFGAMDHSKYNGTLLKAPILQ AGTPGMQVLLTGVALTNGSSSVFNETDNKGFIYFDSGTTASTLPSEHFDDLFNHHGWAYDGDTLTYSIQCDSEGE KSLLDFTLEYTIAGNIVIKVPFEDI IMKNENDGECLSTVMVSNQTSFSYSDDTPFFVAGDEVLLNAYVVYNLETQ ELAIAPAVDNPEDTEEDIEIISADFDISEARDYSVGLEFRNTTIPATTDYLPSSMSSGSVSEETGSKSESSTSED FAAATLKPFTFWGFVLFFFHFLI
SEQ0183 chr3_0866 5' region taacctcqtttgacttaaagcctatataacttacataqttctaqtttaaccccaaatact ttgcatggtgagcagcatgaacgtctgtgaatgaagagataaagtagtaagaatggtcat agccatctaccaagttcaagtccacttttcccttgaacacagagttctctgaagccttca agaagttctcaggtagcagctggtggtccttgaagtagaacgaatcactctttccaacgt gaatcaaaatgctggaatctgaggggccttggtattttccaatcaattcggttgggtcgt actgagtccaagtggacttgtcctgtcccaggtatccagagaagcacttctcaccccatg gggctttagtggggttggagattggagaaaatgcggaaaccgacttgaattttcccgggt ttctcaagaataactgtaaagctccgtaacctcccatggagtggcccgtgattgaaatat tgtcaaagtttagaattgggaagtcagcctgcaatttgggtagcaattccgagttaacat aactgtacattctataattatccttccatttctcagtagtggcatccacgtagaacccgg caccggatccaaaatcataagagtcgtgctctccttcgatgttgagccctctgggtgaag tatccgggaaaaccacagcaaaaccgtacttatttgcatatggttgccaaaatgccttct ctgaggcattgttgggagtgcaagtcagaccactcaaataaagtagaactggtaatgatt ttcccttggctccattggcaaagaattgagctggaaggtagacgttcacatccatgtcag tcttcgtctcatcggacttgtgttgaagtttgactagctttcccccaaaactttggattt cagctgttaccttgaagattgaagtagtaattgatgacattagaaagatggaatgaacgc tacagcgaggaagagcaattgctatatataaccccagagtattgcttaggctaagttgtc aagattgggtaaacaattcgagaatcaaagattaagtaagaagtgttatgttgtaggctt gaagcttcacgcaaggaagactgggagaagtagttgattcctgtggtgattgtcctgaaa tgactgtgagcgggtccgcctagagtgttggatccctttcagtgtttccactaatggggt ctacattgcgtcccagatcgatagaccaggagatcaattaaaccctacgcaaactatagg gcatgaatcaagagtatcaattggcatactcggatcattttctaccgtttcataccaaga gaagatggcaaaatgggacttctgcgttatcatgagttggctccatcggaattttccaca cttcaggtttttcagatgtattacccttttgagccattcacaatgatatcttggtatttc aaacagttgtagcttttactataaaagtcacaccgatcacaacgtttcatcaatgaagat
ORF
AAGACATTATTAACATTTATCATATTGTATATTTTTTAG
Downstream ttcgtaaatcaataaatcaaaacgcattgttcattctatatatcaacaaagattaaatca ctctaattctaagcagagccggaatacgtccccgcaacaagggtctttaactcatggact ggtttccattctccactagagacattatcggaagaagaaaacggatcattggtaacctct tggtattgcaaagtaaatcctgaatcaaccaacctggttccaaaatcatcattaatcaca aacttcaccaaacatccagtgtctgcctcttgagccaatcctgaagtcataaatcttgct atcagtcgctcttcgttagtagctgtcagcgttatagggttggggaaagtccaagcaatt ctgttcttttccttattgaaagagccttgtggtttagtggctgctgaagttgcttgcgca cccttgattgaaacggataacgagaagttactaagggtgagccctttgaagtcttctggc aacgtggccaacaattctggggagagtctgactgagattatgacacttgcctgatgttcc tcaaacttccacgtaggaataacaacaattggagcagcaacgtcattcagcagatacttc acagcacctaaagtctggaattgaatctgttgcacatcggttagtgtaaacgagttctct gcatgttgtctcagaatgtcattgtttagtaagtatttgtcagcctgctttttgctattt tgtaacgtaaagttgatattcgaaggcaaggtattatttgcaggatccgttatatatgag aacgcaacttcaccaactatactggaacgagtcttaacaccatctttgaatgaagcgttg attacttcagcaatacttgcattcaatccaggtgcccttagcgaaggatgttggaaagga gaacctgcagaagtacctgtcaagttaggtttaatactggaagaacctgtgttttgtaca gcaattccagcgccaccaatacgtacactgctaacatcttcgttgtttgttgccacggta gacgcaaaacttggttgtctagcctgtggaaggtcatggaacagatgggattgcactctt tggttatttctaggcacaggaggtggtggaacagcaccatgacttgaacgaggaggattc aacgcagattcgctggtataagtgtgcctcgatgttggaggagctggtggttgagcaact ccatacgacagctgagcggcattttctttctgttgttgttgttgctcttgttgtggtggt tgtgcattctgttgttgttttttggttgtaagggttcgaaagaataaagagaagaaggtt tttcatactgtggcgatgattgttggagtttgataggggaaccaggcgaagaagtctcca cagaagacacataattatcttgagcagtctttggtatctgtggttgaatgcccagctgct taatttcggatttcggagcagcaacattattctccaaagaattttgaatactggaaggcc
AA
MLVAVALVLLLSTGYAGIVAIDTEYEFTIGFLSTIE IGFPPQSITAQWDTGSSDLLVNSVTNSQCAQDGCSFGAF AFNKSTTYSNITNPNNLHVQFSFASGSVVDDKLVSDTIFVDSKVIPRFNFALVSKGDLYGDNIFGIGPRGNQGTF DSNGTPAFYDSFPYHLKALGLIKRLAYSFYTGPTQGKVVFGGVDHGKYDGCLEKLE IVHDSAFYTLLEAIDADDT SVLDEQIHVLFDTGTALTLFPSFIAEQLADFLKATYSDEYNTFWPCDQDFDFEYLHFGFRNIKLSVRFKDLFLV IDDSVCAVGFDQGADANKITFGSSLLRNYYTLYDLDSKEILIADVKPDGPDDIE ILSGPVQRICDEKGVSSTSLW SSLS IESTIEPDTFTTKPSISQTRYSTSSIGPQNISNSLGEYPSVSVTLSEHHNTTSIASNSSLEGKPATPTVTD QSYQNNKTTSTVIAVNLITHSTTHSTTHSPTYSTTHSSNGSRSTLEYTSTKESSVKMPCALI ISDTIPYNASGGN SSYGSLISTSTVNNVEENNSNTVRPRKRQTFVSGTTSTILLYSSTTTQAYQMLSSTSIPRPS IKASSNAGSRKTS KTLLTFIILYIF
SEQ0184
Homologue of YPS7 (YDR349C): Putative GPI-anchored aspartic protease, located in the cytoplasm and endoplasmic reticulum
S. cerevisiae null mutant: viable; abnormal budding pattern; abnormal cell shape; decreased cell size; decreased resistance to hygromycin B
chr3_0394 5' region caaaaccttccaagctcaagtaattcgaaggatggtagtcaggatcacagttctttggcc gaatcgtccaaggagcttaacattcataaaaggaacagaagaaagcgtgtctctagtgta gtgacttcagttaccaatggagaaacctactaccattgctccgaatgtccttctaaattt aagtttcgtggatacttaacgcggcattccaagaagcattcacaacgtaaggcttatcat tgtcctttctatgaatcaggccatggcaaatgcagtttaggtgagggtttttccagacgt gatacctacaaggtacatttgaaagctatgcacttcaagtaccctaaaggtgtgaaatgc gctgatcgaactggtatgatggggtggtgtggatcatgtggtcagagctttcagaacaat gaaatctgggttgagaggcatatcgagaagggtgtctgtcgaggtcttccagaaggttac agtagtcgtatcatgacacgaaatcgaaagaaaacaggaaagcagtcatctctattgaat gtcgctaacaacggtgtcaacgaaaatccaatgctcaaactagttatggatggagaagta gtttcagcccccaagttggaggatgtgacacatctacttggtgagtctggctccacgaac acatggactgaaaactcagaaaatgcaagccgaacttcttcgatacacgacttaagtgat cactcctctgctagtgactctttaagcggacatacgagtggggaatcgcttgcaggctca actaacgattctagctctcagtctttagatgaagcgataaatcccattggacacccagtg gatcggtcgttattcgcccatccaaattccaatggggaaatgtttgggaggtcttggacg gcaagagagtcaagtttgcactcagttgttgatccttttggtggacaggtgcaagcacag gtgcaggcacagatttacatgaagtcccaggcagattcacaggtgtctcagagcgactcc cagaaacaggtaaactttcaaactaactttcagcagattccccaaaactttgagaattct gtaccactagctatgccaataggcgatgatgattttcctgcattggatggagaaaatgcg tatattgccctattacaacaacatcggcagctccctgattcaaggccagagcgcgccaag cctgctcacttaggccaagatccattatttcatccaaacatgcatgtttcgtggcttagc caacctgatcttacattttagttgtqtaaatqatqaattqaacgattgaacqattaqaaq agatctcatgtacgtttactctggagttttttcttgatccttgcatgacctgaatttttc agtgatcgtatttccctcattcttttcttaggctcgcataacccctcgaattgcaatatt tgtcatcaaaaatttatagtgacttgttcccttgattaatcccagaccaggccgaaaagc
ORF
CCCTTTTTAAGGAATGCAGTGGTTGCCGTTAATCATGATTCAAAAAAGGTCGCCGTTGCCAATCTTAATAGAGAT AGCATTCCTCCCGCTTCGAACGTTTCTGTTTCGGAATCAATGGGAGTTTATGTTCCTCCACCTGTTTCAACTTCA
ACTGTGAGCTGTCTACTACTGTAA
Downstream
attaattgaatagaaagtaataagtaaaatgtcatgattcaaaattacatatatatatat atgaagttaatcaattatagttcccactggggggccatacttctccagatacaatcactt cggtcatccattcccaattaagaagctgaatcctagtccttttagattcgggcaagtttc ttgcttgaagcttcgaaacgctttttgacctttttgattttatttctggttcatcgtcac tgtcttcattataattgattttcttaactgtccgtcgtcgtcgggtggtaatcaaatctt gcgagtcaatccccctcttctttgtggtcgccccggaagcaagctcttctgtttgatcgt taggggtttgattgtcaattttatcgtatgccaaattgagtgctccagttatctccttga tggtcttgtcgtcatacaaggcacattcacatttatcttcccttccattagaggggtggt atttttctatgatgctcaaagcttcctcgtagttggcaggtttgtccaacgatttaacac caaaacaataagctagaaattcaaccacaccggctttcacctggtcaccatcctggatga ctaagagtctcatgccgttcattttgtaattatttgcccgcaactgggagactaagcttt ttccagcctcaaagttgtttataaagcgggaaatctctatactcttcaaccctgatacca gtttggtctttcctgcaggcaaaagatatgctttccacctagaaaaagttttggaatcct ttaagcaatcaccaataaattcatgccctaatacaggccaacccaacgccaacgcttcca aatattttggtgttcttacaactccgtttgaaattatggcagcaaaagaaaaagagtcta aactgtcctttggtgtcaattgtaaattatgttcgctgtcaataaattgggtcctcgagt ctagaagaaagggtctaaacccattctccaaaatttttccgccatttgaatcgatttcag ccttcaaatccaaaatctctttaccttttttatctgtggtgcctgtgatcgcaaataaac aaccggaaaacacaccatggtcattatttaaatatttcgaagttaggccgagtgattcct ttacctggcataaataatctttgaagaagtctgtgtcattaaatagtttcgacctagtag ggcaacgaaaaagagtacctgacaagtaaatttcattcaagttaactttcttttctteat ccttttcaagatccactcccttcttatgttccttaagatatactgtattgaatccacgag cacattgaatatcagtgttatccgatgcaatgcttcccttaaacgctaatcccgtgacaa tgcatgttctacgcttcagacttttccacttaatgacgtctccgattttgatatccaaga tgccaagtacgttcatttccctaatccggaagacatccacaccatctgaaacatcaactg
AA
MYQALLVLSLICFSSANFVKLRSNAGMFYDTMAGVPRSDEEFWLRLDINQGLSWTLDSSYYSCNGSNVSSSLCFN SAQNVYDASNSPTADFVDVYANTTVNNTDEASAERVNLTNNLFADGVYMEDNFYVTLNNGARMTATDLKFLNAHN SSAAVGSLALGSYTSQDVPTFLQRLQSGGLIESNSFSLALNEIDSSYGELYLGTINSTKYVEPLVEFDFIPVSDP NGVFGFDWEDTFPTVPISGLSMSSNDKQRTVFFPNEWNNTVLTGTYPLPMMLDSRNIFIHLPFSSIIHIAVQLNA LYLDTLHKWAVNCSVGQLDATLNFHMGNLTVHAPIKELIYPAYQGDKRLSFANGEDVCILAMAPDVYIGYPLLGT PFLRNAWAVNHDSKKVAVANLNRDSIPPASNVSVSESMGVYVPPPVSTSRTSERPSTLDETSTANFDKREESAI SSSSVTNSSSRNSSTITSSGTQTEQTSGIATIETDSIPGALGNNLTDYSTLTLTIYTNSEVDELNPNIATAFISN GSIYSEPYPFSGTAVAESFSASPSQAEGSNSSSSGSSLVLCFFTSLASLLTVSCLLL
SEQ0185
Homologue of MKC7 (YDR144C): GPI-anchored aspartyl protease (yapsin) involved in protein processing; shares functions with Yap3p and Kex2p
S. cerevisiae null mutant: viable chrl-l_0379 5' region gccttggtgcaatcaaatggtccatttgaatacactttgataagaaagctcgttttatgc ctgaggattgtaagtctgctatgatatcctctgtgtaatagtacaccgcaactgggaaat cgtaaaagagatttttgggaactcatatctatcacattcaagacattctaagttttgtca attcaagatggctatgtatgtcaggctacaggttgtcgattgcattgttcccgtacactt gatgtgctcccatctaaaataatagaggatatagttaaccactgttgactcaagtttggt caaacaacttgctcagcagataagcttcattatttgcaatagaaaatcaactttagtcac ttacaataaggggaatattcgaaggttccgtttctgtattatagctgctacctgagttag aacggtctttaggctagttgcaatatatccaacagaatgggatctggacgacgttttcga aggttaagatcattcgtctttctctcgtagaaattttgtcaatacattactcccatatga gtgccggaattcttcataaactcaattaatcctttgacgagagatatgtcttagctttac ataatggggattactttttgatacataatacaaggagcattgaaagaaaaaaaaaaaaat taaaacattgacccttgagttgttactcggtccctttttacagccagcgaccaaaatatg ctatctgacatcccggtcaccggttaaatagcattaggctgtgagtctggctggcccggg cgaacatgaacatattggctgcaatcagtcactatttgattcaacattactcttgtttca gatacttaatggccaaaggactgcggttttgataaatctgacatgagcccaagctcatct ttcagaatggtattgaatgaaattttcctggtaagttaaaacatggatatatgcgtttct aggagttgagatagttctacgaaaccgttattcagtgtctgcaaatatgtccaggcaacc
tatcgtggacaaagacgatggatgtcggctctatgatgaaatcagcggtcgccctctttt tggatactgacgttcttgatgtcatgcatttttgagtgagacacatagcttgggtgtcac acaataataaaaacgcagggctgtgtacagctgcgtctataatccacacatttgactagt agcagcatggcagctaacactgttgataataaccatgatgtacgcaaatgcatttcattc agtcttcgaaggggcaaggcaaaggaggtagataatgagggcctcgagtgaagtgttttt aaatttgtaaaattggaaaacctttcttcctgagaaaccttactgttcagcttacgctag aaatattttgcctatatgtatttccgccacgccaagaattgaattacttcgtctgtatct gttttctctccaagactataaataaacaaactcgtccaacttgatcaagtgccttaccga
ORF (frameshift present in bold region) atgtttgtgatccagctggcattcctatgtctaggcgtcagcctaaccactgcacaacct agttcacctttcaaggcaaataagtttccttttaaaaaggttcactactcatcaaaccct agcgatcgccttattaagcgagacaactataagaagcttgacttgagacatcttggcgtc ttgtatactgcggaaattgaaattggttcaggcaaaactgaaatcgaagttattgttgac accggatctgcagatttgtgggtaattgactcaaatgcagccgtatgcgattgtcctatc ttgagatacaaggtacaagtgtttccacccttagtcaaactgccaacgtaacacccctat caggtaaacttttgaatggacttcaagaaattggcattgtaactgatggcaaaatttcca aaaagtttcaggaaaaccatcttttgaagagaaacgaggccttgaattttgatgtcgatc tgaataagcccatttgtgatcaatttggatccttcaatccacagtcatcaagaacttttc aaagcaacgacacagcatttagtatcagatatctggacaactcttttgccaatggatcgt gggtgagggatacggtttatgttggtgattttgaaattgaccagcaaagttttgcattgg ttgatatcacaaataactacatgggaattctgggccttggtccttctagtcagcagacaa ccaatagtgatcctacagataacagtttcacttatcttggtattctggattctttgcggg cccaaggattcattaattcagcctcgtactcggtttatctggccccagatggtaagactg atgatactgatcacgatgatggtgagatcctgtttggtgctatcgacgaggctaaaatta atggacagttgaagttgtttccatatgtcaatccttataaatcggtataccctgaccaat acgcttcatacatcaccgtttccagtattactgtagccagttattttagtagccgcttgg ttgaaagaatccctcaattagctcttttagacactggtgccacattttcttacttgccaa cttatacgctgatacgtctcgcctatgccatccatcctggttttgagtatgtccgacaac tgggtttatttattatagagtcaaacgtactctccagtgcgagacaaagtaccattgact tccggtttggcaaagacgtagtaattcgatccaatgtttcagaccatctactcgacgtat cacaatacttcacatctggacattatcttgcacttaccatccatgaaagtgtcgatgggc ttctcattttgggtgacacgtttatcaagtccacctacttatttttcgacaatgataaca gtgaattgggtattggtcagatcaaaattaccaatgacgaggatattcaagaagttggtg aattcaccttagaacgcgattcagactattcttctacatggtccatttactcttatgaaa cttctttggatcccttaagcactggcactggtacggggtcaacctattctcctactcgca gtactacagctagaagcgaaccgactacgtctcgacgctccaccacccttcaacccagaa caactgtgattccttctattgacaggctttcattgaacagcataactagtcatggttcct ctactaacggaacctccccaactaatgagacttcttttgctgaggatggaggaactttga cacccgaagaagcttctttgacaacttcactaaattctgctactatttctgagactactt ttgtcgatgttgaaacttctactaccaatggtgcttcagttgtatctttgagtgttggtc cctgcattattgccttcctactactcatctcttaa
Downstream aatacgcgtttatgactttcaattgacttgcaactattagatttcatgcgtaattcaagt tttcgaagcccacagagatctttattaccgtgttcaaatgatacaagtaattggttccaa gacggtaccagttagttggattcttactccgggtcactaagacatcggatactcaagaat agtcttgagctacttagagctcagttgtaccgttttctgatgttcatagaaaaacagatt cacacttatgttgcttttatgaccctttgcttttccctatacggctggacaaagtaactg cttttttgcgattgctttatgatctatttttatggatgattgccaatgattgaagaacct agaacttgtttttgaaaaaagattacaaattttcatgacaaggatggttttttgaagcac ataactgactacgcccttttgatcacgattatttctggaatacaaggagctgagctgaga gcataatagcaactgaattcattaaaattaaatagatttctcgttagacataaaaacctg cttgattataatcaggagacctagctgatgacggcattagcagccattttgtttttggaa caggaatataactgttacgttggtttttggtcattccgacatatccagcatagaaagcgc acatagcagcaagaaaacaaaatacaccactagcagtttcacatccagtatgatttccaa agatggcaatatcatacaaaaggaaaaacatagctgtgaagaacgtaagtaagaacaacg gcaaggtcgatttaaaagtagccgtccaaaggacaatggcaaagacagtccaggctgcaa cgtatagacccaacgcgttagtcaattcatccgcatcggtataggcatctactcccatcc cgttctgaattgcaccaaatgaggcaaagaaaccaccataacatagcagcagtactgaac caaagacgttctcaaggacaatactccagataccacaaatgatttggattaaacctgaag
aaaacaaaaaggaacctgttaaaactgctggacttgatacatttctggttccggcaagaa taagacccaaagtgaacagagaagtacccagtgaaaacacgcctgctgggactggatttg caaacttacgagaaggtctgacttggtagccgcttatcaagcccccagtgaaaggggcca agtaatcttgttgagttgtaacaaactcacccttatgtatttcgtcgtgggggttatggg ctgtttctgctgatgactcgatatcttgatgagtagacattaaaagttatattttcttaa gaatggagtgtaattcggatgaattgagctttcttatatacatttctgatcgttattaat gtgctcatgagtgtgaaggtcgatcaactaaaaaaccacgcagaaaactggtgattatgg tccagttgagaagcttcagtggatcggaaattagacggaaaaggattgaagacgtattct
SEQ0186
Homologue of PEP4 (YPL154C): vacuolar aspartyl protease (proteinase A), required for the posttranslational precursor maturation of vacuolar proteinases; important for protein turnover after oxidative damage; synthesized as a zymogen, self-activates
S. cerevisiae null mutant: viable; abnormal autophagy; decreased glycogen accumulation; absence of mitophagy; decreased aminopeptidase I modification; abnormal polyphosphate accumulation; decreased Pho8 modification; decreased Cps1 modification
S. cerevisiae reduction of function mutant: increased total cellular protein accumulation
chr3_1087 5' region gaagcgactaaaaagaatattataaaaaagtcgagagagtcatattcttcaaaagtcgat atgtggtctattggttgcttggtgtatgttatcttgacagcacatttaccgtttagcggt tcgacacaagattcgttgacaagaaacatactagaaggtaattatcactactcattgcta aaggagaacggaatctcaaacaaggccaaagatttccttgataggctcttgaatgttgat ccaacaattcggttaggtgtcaaacaagctttaactcattcatggatatctgaattgtcc gatgaaagccaagtcagcctttcacaaagtcaatcacatcaaagaagaattgaaagtcag gaatctcattcattcatcacgggcatgaatggtgatgcatacgagcgaattgaagaagca gatgaggtagatgagcagtgcgaagatcaagaaggagaacaccctcatgaggcccaaggt tcgaaactgcagctagcaaaagaagcgagtttctccgtatctaatctttctcgctccccg tacgttaagaatgaaatttctacttccattatagaaaatagtgtatcactgccagcatct tttactcacaagcaattaaacaaagtaacaatggtctctaagcaattggaatcaccacag gggacctttatcacgttgaatctagttgaaaattcagtgtccaagttcggtgcagtacac ataccacaaggaaaaaccccatttgttgttggtagagattcatcttgtgactggttgatc aaagaagaaagaatttccaaaatacactgcatgattgccaaaaaaaggcatcctactgct aatccttccatatttgagtcacctgctttagggctggaagatatttggttactagatttt agtacaaactcttgctttgtcaatgacattaaaataggcaagaatcgcaaaactcaaata tttcatggagatgagatatgcttgttcaaagatgcccagaaaaaagagcaactcgtttat agggttcatattgatgatggaacaggccttttccagggaggtgaaagaacccaagccaat tctgatgacattctggatattgatgaggttgatgaaaagttaagagaactattgacaaga gcctcaaggaaacggcatatcacccctgcattggaaactcctgataaacgtgtaaaaaga gcttatttgaacagtattactgataactcttgatggaccttaaagatgtataatagtaga cagaattcataatggtgagattaggtaatcgtccggaataggaatagtggtttggggcga ttaatcgcacctgccttatatggtaagtaccttgaccgataaggtggcaactatttagaa caaagcaagccacctttctttatctgtaactctgtcgaagcaagcatctttactagagaa catctaaaccattttacattctagagttccatttctcaattactgataatcaatttaaag
ORF
AAAGATGCAGTAGGTTTAGCCAAGTCTATTTAG
Downstream gcaagaataaaagttgctcagctgaacttatttggttacttatcaggtagtgaagatgta gagaatatatgtttaggtattttttttagtttttctcctataactcatcttcagtacgtg attgcttgtcagctaccttgacaggggcgcataagtgatatcgtgtactgctcaatcaag atttgcctgctccattgataagggtataagagacccacctgctcctctttaaaattctct cttaactgttgtgaaaatcatcttcgaagcaaattcgagtttaaatctatgcggttggta actaaaggtatgtcatggtggtatatagtttttcattttaccttttactaatcagtttta cagaagaggaacgtctttctcaagatcgaaataggactaaatactggagacgatggggtc cttatttgggtgaaaggcagtgggctacagtaagggaagactattccgatgatggagatg cttggtctgcttttccttttgagcaatctcatttgagaacttatcgctggggagaggatg gactagctggagtctcagacaatcatcaactaatttgtttctcaatggcactgtggaatg agaatgatgatattttgaaggagcgattatttggggtcactggagaggctgcaaatcatg gagaggatgttaaggagctttattattatcttgataatacaccttctcactcttatatga aatacctttacaaatatccacaatcgaaatttccttacgaagaattgatttcagagaacc gtaaacgttccagattagaaagagagtacgagattactgactctgaagtactgaaggata acagatattttgatgtgatctttgaaatggcaaaggacgatgaagatgagaatgaacttt actttagaattaccgcttacaaccgaggtcccacccctgcccctttacatgtcgctccac aggtaacctttagaaatacctggtcctggggtatagatgaggaaaaggatcacgacaaac ctatagcttgcaaggaataccaagacaacaactattctattcggttagatagttggaagt atggctcaaatagattggtatttgctccttcgcccgccttcagtgaagaatcatcagaca ttgaacctaagcttcttttcactaataatgaaagtaataaacaaaaactatggaatcaaa aaaatgcgtcaccctacaccaaggatgctttccatgagtacattgtcaatgaagatagct ctgcaattaacccacaacaaaagggtacgaaggcctgtgcctggttttcttttgatgaaa atggtggtgttccccctggtgattatgtaactatccgttacagattcacaagaacagata gtgaccagcctgtaattgatgaagaggctttcgataaagtatttgctcgtagacaactag aggctgacgagttttattggaggatttctcccttgccaatcagtgatgagctcagaagtg
AA
MIFDGTTMSIAIGLLSTLGIGAEAKVHSAKIHKHPVSETLKEANFGQYVSALEHKYVSLFNEQNALSKSNFMSQQ
DGFAVEASHDAPLTNYLNAQYFTEVSLGTPPQSFFVILDTGSSNLWVPSKDCGSLACFLHAKYDHDESSTFKKNG
SSFEIRYGSGSMEGYVSQDVLQIGDLTIPKVDFAEATSEPGLAFAFGKFDGILGLAYDSISVNKIVPPIYKALEL
DLLDEPKFAFYLGDTDKDESDGGLATFGGVDKSKYEGKITWLPVRRKAYWEVSFDGVGLGSEYAELQKTGAAIDT
GTSLIALPSGLAEILNAE IGATKGWSGQYAVDCDTRDΞLPDLTLTFAGYNFTITPYDYTLEVSGSCISAFTPMDF PEPIGPLAI IGDSFLRKYYSVYDLGKDAVGLAKS I
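The ORF fragments in this listing can be cross-checked against the C-terminus of the corresponding AA entry. The following is a minimal sketch, assuming the standard genetic code and assuming the fragment is in frame from its first base; the function name `translate` and the variable names are illustrative only and do not come from the source.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (an assumption:
# the listing does not state which translation table was used).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate DNA in frame 1, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop spaces and line breaks
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# ORF fragment listed for SEQ0186 and the last ten residues of its AA entry.
orf_tail = "AAAGATGCAGTAGGTTTAGCCAAGTCTATTTAG"
aa_tail = "KDAVGLAKSI"
print(translate(orf_tail) == aa_tail)  # prints True
```

The same check applies to any other entry whose ORF fragment ends in a stop codon.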
9 Chaperones
SEQ0187
Homologue of Saccharomyces cerevisiae ROT1 (YMR200W):
Essential ER membrane protein; may be involved in protein folding; mutation causes defects in cell wall synthesis and in lysis of autophagic bodies, suppresses tor2 mutations, and is synthetically lethal with kar2-1 and with rot2 mutations
S. cerevisiae null mutant: inviable; cell shape: abnormal; chitin deposition: increased; killer toxin resistance: increased; liquid culture appearance: abnormal; resistance to hygromycin B: decreased; resistance to sodium dodecyl sulfate: decreased; viable
S. cerevisiae reduction of function mutant: growth rate in exponential phase: decreased; resistance to tunicamycin: decreased; vegetative growth: decreased
S. cerevisiae conditional mutant: increased heat sensitivity
FragB_0048 5' region ggcccaggtggcagcttcaattaatgacatttacgacaaactacaatcacaaggacacag tcaaggacgcagattaatgtcacagcaaagcaaacaatcaaccctatatacaacaactcc atctcatcaaatagcatctctaaacccaaacctaccattgttgtctcctcctcatacatc
ttcacatcccaaactggaagagccagccatcaataagaacactactatcactagttttac aatqcatagtcgatgctacbgatactctagtatccacttactgtgcacactctccagtca aagagtttcaaaagatggcccagatgcaattggtgtcattgaaatccagacaatttagtg cacactctccattctctttaacaggacaagaagaagacatattctccgaagacacctact gggagtctggagacgagtcattcgattctatttttgaacctcccaccatgaaattcaaga accattcactcagctctgtgagctcatcacccacagaacaccattcaccaacaatttcga ttacaaagccttggatatcacaatctcctcgcaagagccctaccttatcacaagtagctt ctcgttaccgcaaggtttcataattgtaccagataccccgtatatccaaataaattgttc acttgcttctgccctcctgtgaggtgccaaacttttcttacataatctactgcccgactt taaatagccagacgcgaaacatcctggcaaaaattcagtttctgtctttccttgcataac attctttttcctctttccactatctgccaagtgttctggaacgtgttgctacccaagccc cttcgtttattactgtccttttaactggcgtatacaactttctggtaccactcattgttc agcttcaggtcccttttcattttgttttccgcctgttttttatcactatcccacagccac attcctacatctgtgatcagtcagccacaaataatacaat
ORF
CCCCTCTACTTAGCGTACAGACCTCCCAACATGTTGCCTACGATCGTACTAAACCCCACTGCAGCAAGTGATGAG GCCCAAGCAACCGCCACTGGTGCCTCTGCCAAGATAAAAAGATCACTAGAGAACAGATACAGGACCAACGCTGTC
TTCCTGTATTAA
Downstream gtaatacgtatcttagataatcacacggtcttacctacaatcgatgttcgcgcgttcgcc atcctcctactgtttaactgaaccttgtaatttggcttttccccgataacagtcccatac ttctaacgaatgtcctggttcagtggaaagaactcagttcccttggaaaacaaaattaat gaggcaacatccgagtttatacccgacggagaaatcgacctagaggtttctctggagatc accgatataatacgatccaaacaggttactcctagggatgcaatgagagctctgaaacgt cgattcatgggttccaacaatcccaatatacagaaatccagtatcaaacttattgacttt tgtatcaaaaatgggggaatacattttgtccaagaaatcagtaccaaggagtttctggat cccattgtattgaaactacatgataagagcctaaactctgaggtgaaggctctaatttta gactccatccaaaactggtcgattttattcagcactaaccccaaattggagtacgtgaca accatttacaataagctccaagatgagaagatattcgaatttccctcaatatatcacacc gagaccattggagcaagttttattgaatctgaggttgcaccagaatggatggactcagat gcttgtatgatatgctccgatttattcaccatgataaacaggaaacaccattgccgttca tgtgggggagtattctgtggacaacattctgcaaagagatgcaagctgccaaaactgggc atcaccctaccggtcagagtatgtgataactgctatgaccagcataagtcaagaaaacaa cgccacaagaattcaaactcagtgacgaccgctgccgcaccttctgacgcggatatggat gctgatttgaaactggcaatcgaactctcattaaaggatagcggtggttcacaatatcct gtgcctgttgctggtccaaaagtatcttccaccgttaaag
AA
MVLIQNFLPLFAYTLFFNQRAALADDSVPESELTIVGTWSSKSNTVFTGSGFYDPVDELLIEPDLPGISYSFTDD GYFEEALYQVAGNAKDHHCPTAVLIFQHGTYRELDNGTLVLEPYDVDGRQLLSQPCEDKGISTYSRYNQTEVFRN YEVSLNTYHGRIQLQLYASDGVKQPPLYLAYRPPNMLPTIVLNPTAASDEAQATATGASAKIKRSLENRYRTNAV KESSLNYTLWWWLGAILMGAGSTIYFLY
SEQ0188
Homologue of LHS1 (YKL073W):
Molecular chaperone of the endoplasmic reticulum lumen, involved in polypeptide translocation and folding; member of the Hsp70 family; localizes to the lumen of the ER; regulated by the unfolded protein response pathway
S. cerevisiae null mutant: viable; competitive fitness: decreased; fermentative growth rate: decreased; resistance to rapamycin: decreased; resistance to L-1,4-dithiothreitol: decreased;
resistance to wortmannin: normal; resistance to tunicamycin: decreased; resistance to amiodarone: decreased; resistance to ethanol: decreased Chrl-3_0063 5' region gatagtttggctctggaagcttgaactttttcatatcccctgggtttgacgtcccaaaaa ctaatatcccttttgacctcattcaagggtagcacattcgtaagatctacttttctagat tgattttgtctggagtgactccttggggcgtgttcctgggagcttgtatatctgtcatta gggtactctgatgtatttcttaatctttgttttgagggtccatttctttcaaactgctgt ggctgataatcagaagatctttctctttcccttgcactcaccattccattgtgtggtcta taactatgataactatcctgcacttcttctggtgcataatctctgtaggaccttctctcg tggcgctgattgcctcgtccacgagagtagtcattataccccctatctctggagaaattt ctggagaatccggtatcattataattctgtgaaaaatctttagctcctctctctccatct ctcctataccctctgtagtcattgttgcgttctcttggtgggcctctagaaattgcaccc ggagattcagtgcgataattccttcgagtatttgttacatcaatattttgatctgagtaa agagcttgattatctcgggcatacccctgcccttgctgttgtaccctaggatggtggttc ggttcaatttcttcttgcttctcactagcaggcgcatcgggcatcaacttgttattcctt cgagaagtttttaacagatcttggactttcagtgccccatagccattggaataaactggt tttccattctctcgttggccagttaatctttcatcatggaaagctccgttatctttattt tccatctgctagagcttatatcaaagaattcaaactaagaaaaaaagtctagggttagcg ttccgtcgcgatttttcaccatctctcaagaggcaaggctcgcttcgctttctagaaaaa accctgtagttggaaaagacacttgttcagatcaattgtt
ORF
AGCAGATTGCTTTCTTTAACTCGTCTGCAGGCCAAATGTGCTGACGAGGCTGAATATTTACCTCCTGTGGACACA
GAAGATATAGAGGAACCATCAGAATCGCCTAAGGTTCAAACATCCCATGATGAGTTGTAG
Downstream tatcccacagttatatatagaccacataattcattagagatatttacgcgcgattaccac
ccctcaaagataaaaacgacttgaacctcagatatcaccataccactcgaaacaactacc aaatgtttgqactagqqatattattttacacaqttctattccttqtqaatgccatagcqa tcttatcagaagacagatttctaaatcgaattggatggggctcatatgccccagcaagta ctccgcaagctagettccaaaccactacatttgacaactacaacgccaacccagaagttt ctatgaaaacgaaaattgtaacattaataagtgctacaagaactttgcttagactgccat taattgctatcaacagcgtagtcataatctacgaattggtgctaggatagatgaatacac attaaaaattatctatacactaaccttaatacacagttcgatgtcatgattcatacccta gtattccgcttcaaaggggggttgctcctttcttttagtctcaacgaagttctcaacgat ggatattcgttcgcctgagtaatcgcaatgatcatgaactacatttgagtccttgtcctt gagatacaatcgaaggaataagatgatggtatcataaaccagcagatattgtctcaaagt ttggaccatgaaaacacgctgctttctaaactcttctactattctccaaactgggtcata ggtggcgctacagtttttcgatgtcaaactgctatcctttaacagatgttccacaactgc gtcgattgaacaaaatgttccggttcttccacagcccgcagagcaatgaacgactattgg gacattatgttctttatctcgacaattcctggcaatataccttctgagataaattaaatt caagaggtcattcatcaatagtgtcccgaaatctggccaggaagaaacttgcaactgtaa tacttctctgctgatattagcagtactgacacgtagacgt
AA
MRTQKIVTVLCLLLNTVLGALLGIDYGQEFTKAVLVAPGVPFEVILTPDSKRKDNSMMAIKENSKGEIERFYGSS ASSVCIRNPETCLNHLKSLIGVSIDDVSTIDYKKYHSGAEMVPSKNNRNTVAFKLGSSVYPVEEILAMSLDDIKS RAEDHLKHAVPGSYSVISDAVITVPTFFTQSQRLALKDAAEISGLKWGLVDDGISVAVNYASSRQFNGDKQYHM IYDMGAGSLQATLVSISSSDDGGIVIDVEAIAYDFSLGGQLFTQSVYDILLQKFLSEHPSFSESDFNKNSKSMSK LWQAAEKAKTILSANTDTRVΞVESLYNDIDFRATIARDEFEDYNAEHVHRITAPIIEALSHPLNGNLTSPFPLTS LSSVILTGGSTRVPMVKFHLESLLGSELIAKNVNADESAVFGSTLRGVTLSQMFKAKQMTVNERSVYDYCLKVGS SEINVFPVGTPLATKKWELENVDSENQLTIGLYENGQLFASHEVTDLPKSIKSLTQEGKECSNINYEATVELSE SRLLSLTRLQAKCADEAEYLPPVDTESEDTKSENSTTSETIEKPNKKLFYPVTIPTQLKSVHVKPMGSSTKVSSS LKIKELNKKDAVKRSIEELKNQLESKLYRVRSYLEDEEWEKGPASQVEALSTLVAENLEWLDYDSDDASAKDIR EKLNSVSDSVAFIKSYIDLNDVTFDNNLFTTIYNTTLNSMQNVQELMLNMSEDALSLMQQYEKEGLDFAKESQKI KIKSPPLSDKELDNLFNTVTEKLEHVRMLTEKDTISDLPREELFKLYQELQNYSSRFEAIMASLEDVHSQRINRL TDKLRKHIERVSNEALKAALFEAKRQQEEEKSHEQNEGEEQSSASTSHTNEDIEEPSESPKVQTSHDEL
SEQ0189
Homologue of CNE1 (YAL058W)
Calnexin; integral membrane ER chaperone involved in folding and quality control of glycoproteins; chaperone activity is inhibited by Mpd1p, with which Cne1p interacts; 24% identical to mammalian calnexin; Ca+ binding not yet shown in yeast
S. cerevisiae null mutant: viable; decreased competitive fitness; decreased resistance to hygromycin B; increased resistance to mitoxantrone
chr2-l_0322 5' region aagagcgcgatggagtgtagcgtgattacatcatcagatgctacattgattctctgatat gaatggtgatggaactttctagaggttccttgaagaaataaatacatttacaagcagaac tccactttttcacggagaatcatctaagttaggcatacgaaggatctcgccttcgttgtt tgcactcatctcctgtagtttagcgagaatcttggagtccttccacttttcaggcaatgg ggtaacctcgtagtttttcacggcccagtaataaatatcccaatccaactcgtttaataa gtcatcatactcttccatctcttccacactcattgtcggtagatagcgttttgcgaaacg agacagaagaaggtctgtttccaagattcctctctttcttgactgataaaccagacggcg tctcttgacatcttccgactcgttatcacgtttcagaggttcaactttcagtatcagctc ctgcctcaagaaggggagagaatgaaaagatttcgaaaacacccttggacaagtcttgct accttgaaactgagttctttggaaaagccggagcataatgggtgaattaagcagaaagaa ggtaactgatttgctgagacccaaatcatctacagtttcgcgaagcataaagttcacact gattttctggggaagaactggtaaaccacatgttgtctccattccacgataaaccgttca agcaaggccgtcttagaatgcacaagacaatttaggtaaactacctttcctggaagcgaa agcagacgttacaatctgtttcatcccccaactgcactcctctctcctctgctagccaag acgatctttcatagaatttgatggaatttacgcgaaatcgccacgtaatcatatttcgaa cagatctaccaccctatcctactgcatctatatatactcacctagaaccccttttttgaa cggactacttctcttgttactcatcagttcttttgccacgtc
GTCATGTTTGGTCCTGATAAATGTGGAACCACGAATAAAGTGCACTTGATTATTAAGAGGAAGAACCCAGCCACC GGCGAATATGAGGAACATCAATTGGCTACTCCTCCAATGGGTAGAATCGTCAAGACTACTTCTCTATACACCCTG
AACTTGGAACCTATTGGAGGTCTTGGTTTTGAGTTGTGGACCATGAATGCAGATATTCTCTTTGACAACATCTAC
TCCTCGTCTGCTGCTAAGAACTCGACTAAAGCTACAAAGAGAACCTAG
Downstream atatgtttataatgtgtacacactaattaaaaattgcctgtttgacgtattgcttgctgt ttttggagttgggtatatctatctcgcctagattgctttgctcgttatcgattttcaacc actaagtaaacccagagatgaagaaatttttgttqaaattcatcactaaaaagttctttg gagagaccaagctgaaccggtttggctttgatgatccttattacgaggaggtgcaactcg atcctaccgaccccaattccaaagtgattcgcagttatatccagccacctgtggggcttt cagataacgacaccaaactgttccaaaagattcaaagaagatcctataggctagactact tattcaatatttgcggagtcaagttcggttggaatggagttatatctgtggttcctgttg ttggagatgtgttagctgcaatcaactcattgctaatctacaggctaatttttcaattgg acaatccagttccatgggatatccatttcttcatactgctgaatattatagccgattttc tgtttaagctgattccaattgtgggcagctttgtggacattggattcaaagccaacagtc gtaatgctgcgttaattgagaaacatttggcaaccgttgctaaagcgaatcaatttttca acgatgtggattctgaatcagcgtcaactaaatcactggtatcctggcagacctccagtg tcaccgggatacccgatattgttggttcgttgaggtccagaatctttgggaggggagcaa aagtagaagaaaaaccgattgagtccatcgacttgggggtcagctccaaaagtgttctca ataagcctaggccaagggtgaactccacgggaggaatttcaactgccagggcggtagatt ctcctcaaaagaccatgctgaaatgagtaggccgttcgataggtcgatgatgtaagtaaa ggaagtatacagcgcgatgtaataaaatttgttccaacat
AA
MKISTIASSTLFAVGALAESEPAEFRPLEAQLDKSSFFEQFDKEPKLGDTWK ISHAVKNEEFTYV/GEWAIEEPVV YPGFKKDRGLWKSEAAHHAISAQLPQVFDNTDNTLVLQYEVKLQQGLNCGGAfVKLLSAEGLNKNEFSNETPYQ VMFGPDKCGTTNKVHLI IKRKNPATGEYEEHQLATPPMGRIVKTTSLYTLI IKPNNDFE I RINGEVAKAGNLLNE
KLIKPPFGAPKEIDDPEDQKPEDWVDEDMIPDPDAVKPEDWDESEPLRIVDPEAVKPENWNEDAELYIPDPEATK
NLEPIGGLGFELWTMNADILFDNIYLGHSIKEAEFIGNETFVPKLELEEAESAKNAPKPDFEPETPPETGLDSTG NIFTDVFDTVFQTVLEYYVSANAFLNDVVQEPSVLLERPGEAVYYLVTFFSGFTFLVAWJSGLIFALTGAGKKPE TSAKTPTKSTQKIEEVTEDETEKTDSSSAAKNSTFATKRT
SEQ0190
Homologue of YDJ1 (YNL064C):
Protein chaperone involved in regulation of the HSP90 and HSP70 functions; involved in protein translocation across membranes; member of the DnaJ family
S. cerevisiae null mutant: colony shape: abnormal; gamma ray resistance: decreased; heat sensitivity: increased; resistance to daunorubicin: decreased; resistance to camptothecin: decreased; resistance to cisplatin: decreased; resistance to calcofluor white: decreased; vegetative growth: decreased; competitive fitness: decreased; resistance to rapamycin: decreased; resistance to L-1,4-dithiothreitol: decreased; resistance to hydroxyurea: decreased; resistance to doxorubicin: decreased; resistance to wortmannin: normal; resistance to cisplatin: decreased; resistance to 4-(N-(S-glutathionylacetyl)amino) phenylarsenoxide (GSAO): decreased; resistance to 1,2-bis(2-aminophenoxy)ethane-N,N,N',N'-tetraacetic acid: decreased; resistance to amiodarone: decreased; respiratory growth rate: decreased; respiratory growth rate: absent; viable
chr2-2_0066 5' region tctggatcacccacaaatgctcccttagctagtgcttctgctcctgttgctggaggtccc ccgccaccccctccacctcctcctgctgatgtttttcagactgatgattcgaaagggaag aaaccacaaggaatgggagccgtgtttgctgacttaaacaaaggagaagctattacatca ggtctgcgtaaagtagacaaatcagaaatggttcataagaatcctgaactgcgccaatct aatgttgttccagctaataaggaaaaaccaggaaagcctcaacctccaaagaaaccgtct ggtttgtctacaaagaaacctgcgcgtaaggaattgcaagacaacaaatggctggtcgaa aattacgatggtgaacaagagattgtcattgaaggagagatgcagcagagtgtattcatc agcaagtgtgacaactcgacaatcacaattcgtggcaaagttgctgctgtctcattgact caatgctccaaggttggtctagtagtagactccttggtgtctggaattgatgttatcaaa agtcgtggattcggcgttcaggtactagggaatgtccccaccatcaccattgaccaatct gataacggtcaaatatatttgggagcagaaagcctgggaaccgagatttacgcatcatct accacatcgttgaacataaatgtcccagatgaagacggagatttccgagagatggcagcc cctgaacagttcaaacatacggtggctggaggcaatctacaaaccacagttgtagagcac gtcggttagcaaaagacaccaaccctaccttgactctggaactgatctgtttataacatt taagtgtatttacgtatttactaagagatcagatcgcgaatgaaaataggttttccagaa acttctctcactggtacacgactgtttttcttgagaaaaaaaaaatttagaggcaaaact tccttactggacatacacctagagatatatcaatcgtgag
ORF
TCTCAGTAA
Downstream ttgtatttgatagtgaacataatgtaaatctgattatctgcgttcgtgtctatgaattat ttatggtagtttattttctatctacagtttgtattttgttcccctcctagccacatagag aacagggatcatggcaaccaatgagatacctgcaaagaaagtgtagcaaccaccgatggt cattgctgcaatcatcttggttagcactcctacgaatagggcgctcattaggcatcttgt gatgttcacgcaagaagtggcggtggatgactgtcctgggaacatgtctatcaaaagagt ggtcgacatagtcattgttcccagacatgcccatgtaaagaaaaatattgtgatgatgat gggggctggatgagtccttttatctaaagtccacccaaatgttaactgggaagccacaca gatgagggtgtagacccagtaaattgagagccttgccttgtaaatggaaaactctgactt caatttctggcaacgataagatctgtcaaggatttttccagttgcaaaagatcctaaaac acatcccactcctgatgatatgaatactaatccgatttttaaagtcgaataatggtaatg attctcaagctcagtagcaagagtagttaaacaacaagtccaacttgcaaaaagaattga gcacggtatcagcgttaagtatactggtggcgcaatcaagatcctaaaggtgtccagcag attgtaatgagggtgtgggtctgtctgggcagagtcatccagtcgtctctggtacgatgg taggagcaagataggacaataattccaaagtctgcgtggttccttggacatgtttccaac
gatagaacgattcgtctctggtaagaaagagaaatccagcgtgaataccaacatacccag gactgctaagaaccaaaagacagctctccatgattggaatccttcatcgacggcacctcc taagagcggaccaatggcctggcccattaaaagacagcca
AA
MVRETKLYDILGVSPDATDAQLKKAYRVGALKNHPDKNPSPEAAETFKGMSHAYEVLSDPQKREIYDQYGEEGLN GGGAGPGGMGEDIFSQFFGGMFPGGGQPTGPQRGFDIKHS ISCTLEELYKGRTAKLALNKTVLCKECDGKGGKNV KKCSACNGQGLRFVTRQIGPMIQRAQVRCDVCNGEGDIISGADRCKACSGKKITNERKILEVNIERGMRHGQKVV FSGESDQAPDVIPGDVIFWDEKPHKDFSRKGDDLYYEAKIDLLTALAGGELAIKHISGEYLKITI IPGEVISPG SVKVIVGKGMPVRKSSSYGNLYVKFEIDFPPKNFTTAENLQLLEQVLPARTPVSIPADAEVDEVVLADVDPTQQQ RQGGRGGQSYDSDDEEQGGQGVQCASQ
SEQ0191
Homologue of EPS1 (YIL005W):
ER protein with chaperone and co-chaperone activity, involved in retention of resident ER proteins; has a role in recognizing proteins targeted for ER-associated degradation (ERAD); member of the protein disulfide isomerase family
S. cerevisiae null mutant: resistance to rapamycin: decreased; resistance to wortmannin: decreased; resistance to amiodarone: decreased; viable
chr2-l_0421 5' region aggaagaagtcaggactcataaagactataaagagcatatatttgaaggaaggtattcgt gggttctattctggtttcgtcattaatttaaccagaaccttaccctccagtgcagtgacg ctagtctcttttgagtacattaagaactaccttgacaagattgcttctcccgccctatag tcatcacatttgtagcatataatattcggttttatagagtttggttcgattccaaagaat tgcaccaaggagacattaaaggtggttaagataagtgcctgttaatcaatacagcgtgaa tcttggtgccttgaatggactcaacaaagtatttaatttttgatttttcagtgaataccg catcgcatggcaccagcgaagcagactgcttattgtattagatcgaaaatcataaacgag ccattcacttgagcaaacaactgtatacagtcagacaaaagagtgttagcctgtagctat tttcactcaactcgttagctcaaaatcccacaaaactactgattgacttctcatgacata aacatgttcatatgcacaataccaggaccagacacgaaaaactagacacgagacatctat ctgttcgcgattggctttgccttccgtagattactaccgttctacagttacaaacccttg ataaggacagcatagcacaaaccgtttcattctgacaacagtcaccatgaaagttgcaaa atctatcgtggccagtcttttggcaatagcttttgctaaagcagccgatccagccgaggg ctccggtaatgggggtggagaatgggccgactgtcatcagtatcaactccatccagctgc cagaacttttgacacccgatactttcaatgagactcttgcccaaggaatccactttgtag agttttttctccctactgttatcattgcaaacaactggagcccatttggaaagaaactta tcaccgcttcttgaaagaaagtgcgatatcagggatcagt
ORF
GCTTACATCGAATTGAGCAAACACGATTAG
Downstream tagatagagtctcacaccaataggccatcagtaggcgacaatcgcttaaagtgtgtgggt atcggtgcgcgatatagccttacataatacattcagatcacacatgttgtgagccacagt ttaactcttcttcatacttgtatcgacatccctaagaacataatgcagctttactatcat gatggtttagacaccccagataatttctgtgaagatcacaattctggggaagaggtgtca gttgaagagctgtccaaaattggagtgatatactaccatttggacgaagagagtcaagtt gagaagattgccagggaaagagattacaagaacagagattttgtggttctatctgatgag actttccctggtgggccagatgctttgcaagagaaactgaaagtcttctacaaggagcat cttcacgaagatgaggagattagattcatcctggatggagaaggctactttgacgtgaga tcacagaacgacaggtggatccgaagcaagcttttcaaaggtgatttactgattcttcct gctgggatttaccacagattcacgttgtccactaagaagtatgtcaagacaatgaggttg ttcaaggaggagccaaaatggcaagcctataacagaccggctgatgacaaaggagctaga ttggagtacttgaaaagtattcaggtgtacjgtttttgtagttaacgaatatatgtgtatt tattactttcttgggttgaaagatgcagagatcatttggaggtgacattggtgaaaaaaa tgaagaactagagtgttcagaatataattttttcgacagctgacattgcaggtaccagcc tctagttttcctaatgatcaattttggtagctaaagatcatctttgcttgctaacgtatc gatggactcaaaatgtcttcttttaagtgccaacctggattgtcttaaccagttcgctaa cattgaaatattcgaattccctactcatcccaccattatg
AA
MRQVDCAANGDFCGKQSINSYPTLRLYGPSEDNTSYKLITTYPSGGKRTPQNFLKFLRSQYDDLKMDRFNLPVKG EVLTEEKMLKLMNGDIEEPVLVSFWPTTDKETTLKYFEDVHSNPISFKNCYSCFNLAVLWSRITNRLPDLETAVF NCGGTNSRVCQALNLPFNVNRAQVSPEIYMFLPNSHGGIRVKYNHDLVISDIVDWSERLLANAQAEEVTLDSLAE KMTLLNPAKGLDFFKGPDPNQKQVFVYYYEEGSDAPEDFE IWPHLLQPIMDLTTNTYIYRSKDSQLEDLLDKKYK KLIDYINQPDFEPERPLNRETYLARTITSVPTFLVFKDNNMIPTVYQNFSPNEIRDVKKVVNFIEQNQFPLVDRL TTDNYESYFPKFNPQIHDKDEKIVISFFSSDNKTQTTNDYNQLLFVQHSYDYIKHQNRFDLIEKARADKTDKSEQ LKGEGMDPKEIIKVLSKKIDHLNNVGNVLTVYVDLADDRKLLEKTGWISSRTKYKPGESIIVTRFGKHYWDRDVF GHKLMTTAQSLRETLSYLLFPSTYIPPEKKTTVRPSPRVAGSPYPDAFAFVDI IHQYGLFGYLLI ITSVIGVYML IKSHRRRSRKAKFLSHRSHKEGFQDAYIELSKHD
SEQ0192
Homologue of S. cerevisiae SHR3 (YDL212W)
Endoplasmic reticulum packaging chaperone, required for incorporation of amino acid permeases into COPII coated vesicles for transport to the cell surface
S. cerevisiae null mutant: inviable; normal vegetative growth
Chrl-3_0116 5' region cgtggacctcctgaaaaatattctctctcctggtgaaaaaaaacctctaacgttcactaa ttaqtttcttctccccattaaccaacaacatccacaaatatgccagaagttttgatctcc aagaaaagaaagctcgtagctgacggtgtcttctacgctgaattgaacgagtacttcact agagaactcgctgaggacggatactccggtgtcgaagtcagagtcaccccaaccactact gagatcatcatcagagctacccagaccaactctgtcttgggtgagaacggtagaagaatc agagaattgactgctttggtccaaaagagattcaagatcgatgagggtaaggttatcctc tacgctgagagagtccaagaccgtggtttgtccgctgttacccaatgtgagtctttgaga tacaagctgttgaacggtctggccatcagaagagccgcttacggtgtcgtcagattcgtc atggactctggtgccaagggtgttgaggttgtcgtttccggtaagctaagagctgccaga gccaagtctatgaagttcgctgatggtttcatgatccactccggtcaaccagccaaggat ttcattgagactgccaccagacacgttctgttgagacagggtgttttgggtatcaaggtc aagatcatgagagaccctgctgctgccagaaccggtcctaaggcccttcctgactttgtc aagatccacgaaccaaaggatgagagagaggttccagctcctttcgtcaagtcttatgtt gctgagccagtcgttgaggctgaggaggctgaggagcaagctactgcttaaactgtttaa ttctgtaggataaacgaaataactgtatctttataatcatccttaagatagccctggact
cgtcagacattagttcgcgcacattattttcctccactgtctgcaaacagccacacggca tttccctctctagtttcatttatcaaagtagcacaacatc
ORF
ATGATCAGCTACCAAGAATTGCTTCCTCTGGGAACTGGTCTTATTGTGACAT(GTAAGTGATTTCCCTGCTTTCA ACAAACTATTGATCTAGTACTAACAATTGCTCTTCAG)CCGCATCATTTTGCTTGGGTCTTGCCTATTCCAACTG
Downstream ctgttaaccttattcattattgtattatctatgtttatttagcttaagtcgtgtactctt cccaatatatttccacatagtttttttatccaccaatagcgatcgaatcccattccaagt cacatcaaactacacatctatcatattgtccgaaatggctattggtgactatccagtgta ttataagcagcccattagagtaagtcgttaattctgtttaccaagtacttcattttaaac tccaactaactcggattttcaaagtggctgagttatcatgctcacactcgtccccatgtc ttttacgctattgctgttgctgcgctaggccctgttttcatgatggccacccctcttaga agaaagtttttgtacgatgatcatgaacctttgcctgccaatggttacccacttccaaat agatcaagagacaagacactgacgggttacgacgattaagattagaacaggtattccctt ttatttatatttattctctatttacagcatatttgataaaattaccatttgtagaatcca gaaggttttgatctgatcttctgatcaatcacggaccctggaattttgttgttctttggc tgttgggatttttcccatgtaattctacgcaacgtttggtcagaacgtagcacgaatttg acgttgacatgacgatgggtaataataccagttctaccccttcctttacattccactctc ttcgacttatgtgaatctgagcctacccaggcctcagagatgtacagttcttctgggtta tatcctaaagtttcagcatcttttaaacctctttctagtaaaatataaagatctgtggca atcttcttacgcatgaaatgaagctgggtaatggcttcttttaatctaagtcctttaaga gcacggagcacaatgttagcctttttgggagtcgattttattctgtaagactctaaataa acactgggttctaacgccttaacctcttcgggtgtcaatttc
AA
MISYQELLPLGTGLIVTSASFCLGLAYSNWPFDYFTLYSDGGDEAFKQSLDHLLLWSKVPRFVHWTLHGVIGLGY IGCFIKIFKPDPDQSLFEYGTLVLFVVATVMYLTNIRLGILSAEKGEWGDVDEQTGINVIAASTGLLVFVLIGIL LLQFGLWYAVHADRVLKEKYLKEEAAKEAKLEKESEKTEDKAAEESEAKPDSYKSKSKKSSTSGSKSTATTAQKR
KA
SEQ0193
Homologue of S. cerevisiae ZUO1 (YGR285C):
Cytosolic ribosome-associated chaperone that acts, together with Ssz1p and the Ssb proteins, as a chaperone for nascent polypeptide chains; contains a DnaJ domain and functions as a J-protein partner for Ssb1p and Ssb2p
S. cerevisiae null mutant: cold sensitivity: increased; gamma ray resistance: decreased; ionic stress resistance: decreased; resistance to daunorubicin: decreased; resistance to camptothecin: decreased; resistance to N-Methyl-N'-Nitro-N-Nitrosoguanidine (MNNG): decreased; resistance to gentamycins: decreased; resistance to paromomycin: decreased; resistance to cisplatin: decreased; acid pH resistance: decreased; budding pattern: abnormal; cell shape: abnormal; competitive fitness: decreased; fermentative growth rate: decreased; metal resistance: decreased; resistance to gentamycin C1: decreased; resistance to doxorubicin: decreased; resistance to calcium dichloride: decreased; resistance to rapamycin: decreased; resistance to cisplatin: decreased; sporulation: decreased; viable
Chr2-1_0282 5' region tgcaagtatctatctgtttataatcaatatggttttaatttatctaaaaaatactttgac aaatccttctttgttgacgaggtataagtactgttgctcgtttgtttctaagtttttcgc ctcatcttcgaacgatattgatcccttcgaaactgttattgtgcctgttacgtcgtctgc
tcgtcccgtctccaagggtcgtatcgatattatcccctggcttctatgatacaatttggt gagaaactcagtttgtctatgtgtaatggttgaagagtctgtttcattcaggtctatcag atttgcatcagcaaaagtcgaaacaagtacgttcgccaatgagttgatataagatatttg gctcatgaattcgttgatagaacattgatttgtcaaaaagtatagttctggatgatccag aattaatgtgcagggtaaggctgagctcttcaaagcagcaattacaggcttgaataactt gtccacaggagcctctgccgttgaaagatcgaaaaaactaaggttttcgtcaccttgaaa tgaagattccaccccaagatattttacccacgtcttgttgtagaagcttatcccatgcaa gaagctgacaatgacaactctgcttctgggatttttgagtgaaaagttggaagttttgaa ggaatttgcatccttattgatattacatgtgccaaaaattagattctccacaaagcaagt ggtcagccatgatgacgacgttccttgagtgtccgtgtaggtaaaacatgatctaaaagc ttgtgagtcgatgagagcttgcggtatgacctcccctgacctgaaaagagagaaatcttg agcttggtgggacatgatggatttggcccttgattaggttggattctcgttattgtccta aatttgatatttcctcgaactgtggcacgtgattagttggaaaaaaaaatcatcaataat ccagaaaatctttcccttcactagaatacctatagttaac
ORF
GCTGCCAAAGAGTACATTGAGCAAGGTTTCATTACAGCTGACGACTTGAAGTTCTTTGCAAAATAA
Downstream atagataggcaacattatttttcttcaccagaatcgataccatgattttttgctgtcttt gcgttgtaaatccaaatgctcgtctataagttgttgtatatccctttcgagtttcgaaac cagtatttcattttcttggataattttgtctctttccaagtcccaccacaagtactgtag gccgaagtagactagtgatgccaataggaaagtattcaacagaggccttgaaaaatgtga tattctgactttggaatcttccttctgttgctccttttttttctccaaggatttcagttc ctgttgtttgatctggtcgtttatctggtcgattcttgattgaataatgtgtcgttggga aggatattgtagcggatcttgagggacatattgagtagtagattttggtaagtgtgtgct ctgttttcgtatctgtctccagaccagacttgatttacaaatttgcggcctcatggtgaa atttgcttgtagggaagctgaccactttcagagactaacgattgaaagaccttggatgga ctccacgtagatattccatgtgatttaggtttcgcggttattatatattggtgatacgtc aacaaccaagcctaagtatatatatacagtgtacactattatcaatcaccactgttcttt tgctcctcatttgacgttcaaaacctcctgtagattgaacctcttgcatttgtcatatat ctttccgtccaacactgacatgctgatattatttttgaaacactcaaccaatatcagtac atcgtaatcttttaggtcaaccttgtggcctccctcgttaccaacgcaactggccacggt tttgatgacatccagcttctctagcacgctgaagtttctcctattgacttgaatggcaaa tttgagtggtttctgaacctctttgtcttgatgaaaatggggttttaacaatcgttgtgc caacttggttaactctggtaaagtgcctgagcatgaacca
AA
MSLTLPALPSSVSTAASLGKYTAAVRKPIEPVGRHFLALYLWGHTWSEFEKIQAETDVKKAEDVNEDAEFDIEQS EELIEHDPRDWKSADLYAVLGLNNYRYKATEDQIRRAYRVQVLKHHPDKMGEKGGLDKDGFFKI IQKAHETLMDP VKRQQFDSVDEEANIEPPAPKSKYDFFEAWGPLFASEGRFSKKQPVPELGNADTPKEEVDAFYKFFSNLDSWRTF EFLDEDVPDDTSNRYHKRMIEKKNKAARQKRKTEDNKRLSELVRRAANEDPRLKAFRDKEKEEKNKKKQEREAEA AKAAEEVKAKREAVERARAEAAKARAKSKAKPSYQPIQTQSKGKGKGKGKGFGKKSTVSKEQRTESKINIRNSVK DANYFGTELQSEIDHDINVFLDNFEDEKSVEVSNAIAGATADAVKEIFDAAAKEYIEQGFITADDLKFFAK
SEQ0194
Homologue of CPR6 (YLR216C):
Peptidyl-prolyl cis-trans isomerase (cyclophilin), catalyzes the cis-trans isomerization of peptide bonds N-terminal to proline residues; binds to Hsp82p and contributes to chaperone activity
S. cerevisiae null mutant: viable
chr3_0567 5' region caagagctttcaaagtggcatcatcagcttgctcaagcaacgactttttgataatataga tcgtgattccagctaatccaatatttttttgagcaccggccattatcaatccatatttac tgacatccagcctctttgataagatgttggatgacatatctgcaactacttcaacaccag gaaaactttcaaatgggaactccttgaactcaacaccgttcacagtttcattatcacaat aatacacataagaagtgtttgatagatcagtggggatgctccatttgtcaactgggggta tggatccaaattttccatcaacttttttactatccaaaacaatgtcaacgtccaaaccca aacgctgggcttcttcaacagatttcttagaccaagttccagtgattgcataagcagcac ggcccttccttccagttttcttaacaaaggaggcagtcaagttggaagcaatagaagaaa atcctgttgttcctcccccttgtaagaaaaaaacctcatgtgtatcagggatgttcaaca aactaactagattggccttcgtgttatcaatgacttcactagcttgcttagaacgatgag aaatttcaccaatacctataccaacaccctggtagttgaccaagtcataggcagcctctt gcaatactgaagtaggcagtagcgctggcccagccccaaaataatgtggttcttctcttt ccaaagtttttgccattgtattatatggttagttcaagatgaagaagtaagatgagacga agagtcaaaactggcaattgcttagagtcagatgttggcgcgattcagttgatctaggag gatagccacgcagaggtcgcacccttcgaaactgccttacaatgtgccctgtgtctcaga aatttctagacaatcgcgccttacttttcaaacaaggggaccgaatccgtataaatcaga cacccgtccatagtgccccccatccaatcctttttcagta
ORF
AAAGAGAAAGCTGCTTTGTCTAAGTTCTTTGAGTAG
Downstream agtcatataaacacatagacaactaataattcatcccagcgatggcacgtaaacatcaag acaggttagatagtacccgcgcatccgcgtatcgtggttccacctaccgtcccgaagcgt cgtagcacctccccccaaccctggtgcgtaattatctggccatctccgtaatttcgcgcc cctctcttttgcctccgacgaaaaagcatgaaaaacaacgagaacaccctacacacaagc ctctccagctggcaaccatcttcctgagaatcaatctgattagacattgtattaacagtt atattaatccagttgtcttaacgtcttattgtgtgaaaatagaattggcactcatttggt tgttgttgttaatcagctcattagacgtgaatttgatttgaaagggattttgttaatttt tgctaaacccataatttccacttgcaagtgatacaagagtaactgacacatctagttaac ccccgtaaactataatagttcccaaacaaaaacagaaaaaaaaagcttagaaattagcaa gatatgagtcagtttccttattcaagtgccccactgaggtccgtcaaggaagtgcagttt ggcttgttgtctcccgaggaaattagagctatttcagttgttaagattgaatatcctgaa atcatggatgagagtcgccaaagaccgagggaaggtggattaaatgatcccaagttgggt tccatagaccgtaatttcaaatgtcaaacttgtggtgagggaatggcagaatgtcccggt cattttggacacatggagctagctaagccagtctttcatattggttteattccgaagatt aaaaaagtctgcgaatgtatctgtatgaactgtggaaagcttcttcttgacgaaacaaac cctaccatggctcaggctatccgtattagagatccaaaaaaacgatttaacgctgtgtgg cagctgtgtaaaactaagatggtatgtgaagctgatgccc
AA
MTPRSHIFFDIS INNQPAGRI IFELFNDIVPKTAENFRALSTGEKGIGKSGKPLHYKGSTFHRIIKDFMVQGGDF TNGNGTGGESIYGEKFEDENFQLTHDKPFLLSMANAGPGTNGSQFFITTVPTPHLDNKHVVFGKVIAGKATVRKI ERNSEGEAPIEPWIEDCGELPEDADLTISDETGDKYEEVLKDNENIDIDDFEQVYQAITEIKELGTKYFKNGDT KIAFEKYQKAANYLLEYIPSDLSEEQSSKLELLKTSVFSNVALAGLKVSKFKDTIKYATLVIEDESADAKAKSKG YYRRGSAYSSLKDEDSAISDFQKALELSPGDPAISQSLQRTTKARKDRLAKEKAALSKFFE
SEQ0195
Homologue of CPR7 (YJR032W):
Peptidyl-prolyl cis-trans isomerase (cyclophilin), catalyzes the cis-trans isomerization of peptide bonds N-terminal to proline residues; binds to Hsp82p and contributes to chaperone activity
S. cerevisiae null mutant: growth rate in exponential phase: decreased; viable; competitive fitness: decreased; mitochondrial genome maintenance: abnormal; resistance to amiodarone: decreased; resistance to hygromycin B: decreased; respiratory growth rate: decreased; transposable element transposition: decreased
chr3_0378 5' region ctgttcctgtgacaaatcatgcttggccatcaagtccaacggcaggggaatatgattttg tgttgtagcataatactttaatcctatcaacatagtggcaacccctgtcgcttgacctag atgggccgccacttcagttaaatttgtctctaggaacggggcctgctccaatagttcaac agcagatggggagatgttaggtgataatagtaatcgctgtgttaaataattgctctgtga ataaatgccttcgccataggaacaaacttcatccatgttcttgaaccctaacttagaagt gttcttcaagtaatgcaatctagatgtcaatatctgttgaaacatgccaatgtccaagtt taactctttgttgataccatcccgtaacaaagtagccaccggttcacccaaaggcttctg gtacccgcctagaaacactttttgaagtaattcctcccaaaatttgaatttgacgtcaag taggccaacccccacattagctcggaattgactgttggctcttgtattggaatctgatat cttgttcacttccaatagaaagcatcgtatcgccatccaggtatctcttgcaggttctgg aatgtaatgagccaataggtaacttgacttgtcaaacttgttcaacatctcggtgcaatc tgtccttgctttcatcacagattgtctatattgtagatctggtggggacgtgaactgttg gtggacagcatgaacattcagcctccttgcagcggcgtgaaaccttctccttcccaacat gctaggatatttctgggtaggaatgtctagcgttgataggtaatcaagtaaaatatactc aagacggagaaacgagtctctccgaagttgcttacttttcgcttccttactggctagagt ctggaaagaaatccgatacggggggctaccaggaaatcttgcgcgagctgaaggatcatc tctctctacaaggattgtagattgaagggtttaatcgcaa
ORF
Downstream gatttattggttgaaaaggatttcagaaagaaacgcactttttggcattggttaaactat aggtcctctgtttgctgaggttataaagaacgtgcagcgggctgatcgcttggattgttg aagcattaattagtctgtggactagaggattctagctgttaaagatcttttctgtgtctc gcattaatttctaagcagaagcggtggattacaaatgcaaaatccgaaaaattgtcgatt
gggctccattcgtatctaatcagtgctattcctgttctgatatgtaccgcttaccgaaaa agccactactagaggtggctattgtttgcacatgacaaagtcacagaaatgtccacagat aatctttgtctagatatagcgtaggagcatcaaataagctgggaatgggcatcgctaagc cactattcacgatgcacgaattgaccgggctttcagtgttgattactaagtttaagttta ggtttcaagtgcatgcaaatagacttcatgcagggaattaaagttacatatgcgccatct ttaaacggtttacgatttggccaagtaggaggaaatcgaacagtgtgcaaatttttattt acagccaacgtctttcatcattctttcaataagggagcttagtatatgtcatagttaaca ctcagtatccttggaaaatagatcaagctctgcatttttttcaattgattaataatacta aaggacgcacctaagataatactaacagcttttgattactgcacgcgcccgcgcacagtc tattaacaaaataacgcaactgaaaggcattattcagatcaccttattgttaaggtcgac cccctgaattattgtggaagtatcgcctgaaacgaagcctcataggttgttccgtttgaa cttggagtctctatctctagccatcatcgccctttgatctcccaaatcatggacctcggt aaaaagatttattacgtaatcattttattttgggaacgtg
AA
MFAFLDVTIGGSKIGRIVLQLDNVNCPLASTNFLNLCNQVGKTFHSKEYGDIKLTYQGIRFHRIMKNFIVQAGDL VFGRDDSFNQEQVGGGGVSTYINDLSNNEKGPIYGKFDDEALDNNPVSKFDRPFVLAMANSGPNSNSSQFFITTY PCPHLQGKHTAFGSWHGKSWRDIEKVDTTKENNSPLQPWIVACGEWKEGDPVPVFNCSNSQIGGDVFEEYPD DDDNFDKDSLEKALEAVEIIKESGSLLFKQNQIQNALFKYKKSYRYLNEFSPDQDQSPQLYNRFLALKKKLYLNL SLVYYRLGQFSRARDYAEYLLEMEDNQAITDKDRAKAFFRKGLAQIALNDFENGLHALKAAQDLTPTDSMISNEI LKASKSFELFKKREKNKLEAFFK
SEQ0196
Homologue of PIH1 (YHR034C): Protein of unresolved function; may function in protein folding and/or rRNA processing, interacts with a chaperone (Hsp82p), two chromatin remodeling factors (Rvb1p, Rvb2p) and two rRNA processing factors (Rrp43p, Nop58p)
S. cerevisiae null mutant: heat sensitivity: increased; resistance to 5-fluorouracil: decreased; competitive fitness: decreased; fermentative growth rate: decreased; resistance to rapamycin: decreased; resistance to 5-fluorouracil: decreased; resistance to wortmannin: increased; viable
Chr3_0296-l
5' region (in bold start codon next gene) cttccacattaagcatcaaatttgacttcacttcattcatgaaagtccataattgtgagc tcatagaattctgtgttttattcgattcgttagttgcgtgagagttgggtgcggataact cgtccatgagatttagtttctcaaattcatgttgtatcaatgtatcaaactgttggagat agctcaacggtttggataaccaggaatttggggtatccttttgcttgccatctaccgatt ttaaaccttcattaggattcgttgtccttaagggatctttgtgcacacctacttccacat ctagtttgtggggtctttggctagatgaaggcgatctcgttacgactggaggtggatttg aggatattttttccacactttggacctggtcatgtacttgcttgtataaggatctttcaa tttcagttttaagcttgttaacgcggtgttctttttgtctctttttctctagtaatgcct taatttgttccttgatctccaggatctccatttccgttgtggccaaaactctcaatttat ccgacaaatgattatcattattcgaactcaaagaaggagaatcaactgaggaattgccac tactttctttccttggtgtggatgatgcctttgtaactttggaggagtttggagcagggc ttcgacggcgggtctcaaggactgatattcccagattcaagccggatttgcgcggcgaat ttgtagatgatgtctgaaattcatccattactgattctgctatcaaataaaagacgcaat cttttggtaagaatagagaaggatgacctactctctacagggttgctggaactagtagtt tcgtttctgtaaatcctatgaaattgatgtctatcgcgcgacttacgatgcgcgcgtgtc aaactgtactagtcatctctgcgatcagtatcaattcaagcccatctttgtttccaccct gtaaatattccggtaatagagactgatacccttgcttttca
ORF
ATGGTCAGTCAGAAAATTACCCCAAAACCTCACTTGGTCCTTAAAACAAAACTTTTGGCT CCAATAAGAGGCCTTCCAGTGGGAACCAAACTCTTTATCAACGTTTGCACTAATGATCTA GTTCCGTTGCCACCGCCTTTAACACCTTTCACCCCAGCTGAGGATGATGTGTTTGATTTG TCGGTGATTTTACCAAAGGTTATGGATGATAAGTGGGAAATCCCCATACTTACATCTCCA GACCTAAGAATGGATAAAGACAAGTCAGGAAACCTTTCATTGGTAACAGATTGCGTAATC AATGACAAACCTATGAATTGGTGTCTAATAAGTAAGGACTTGATGGACATTCTTATTGAA TGGTGTTTTGAGTCCTTGGAATTTCAATATGATTCAGAGGTGTTGGTGGTTGATCATATC
AAGTACTCAATTCCTAAAATGTCTTATAAAGGCAAAGAAGTACAGGAAATGGAAATTGAT CTTGAATCAAACGAACTACAACGTAAGGAGTTTGAGCGAATAAAGACAACCCTAACCAAG AACGAACCACTTGGTATTTTAGAATCGAAGAGGGCTCAAGATGAAGACGATCTTGATACG GATAAACAGGAATTAACCTCATTAGATAGTCTATTGGCCCGTCCAGTAGCAAAACCAGTT GGAAATGGACCTTTGATTCAAGAGATCAATGACATGACAATTTCACAATCTCGCCCCAAA CCAAAACTTGAAACTTCACAGAACGCTACAAAATCCCCTCTCCAGAAATCTAAACCAAAA CTGTCGCTCGACATAAAAGTTGTGAAGCTATCTCCCGAGGATACTGAAACCAAATACAAG CTTCTGCTACAAATCTCATCCAATCTCACTGACAAATCAGCTTACAAACTTGATCTAAAT AAGGAGAACTCTACCCTGTTAATATCTACGACCGATTCAAAGTATGACTTTTCCAAGTCT CAGGTTCATTTTCCTCTACCTATCGATACAGAGCAAATAAGGGTGATTAATGTAAACTCG GAGGCTTTTCTACGTGTCTACATCAAGTAG downstream (in bold start codon previous gene) actctgatctaagaccatatccataagatagtgtagcctattacccctttcggacttaaa cgagactaaaatttttgaaaagaaaaaaagaatcttttcttacatatatacagccagaca aataaaatccaaaggattgatgtattatcttgatatagatttgactctttgaaatctttt cacacatagcaattttttattcatacattggcgatgtcggatacggatggtttatctttg actccgacctcaagcagaaaatacggcagcacgaatgatactgaggcaagtatagataat gagattctggtccaacttccagtgactacgttcaaatctgaggctaaaatctactacgaa tcggcttttcctttagtaatatcatttttcttacagtatctattggctgccactcctata tttattgtagggcatattggagctgttgagttgggatctgtaagtttagctatcatgact ttcaatattacgggtgcaggtttggttcaaggatgcgtcactgctttagacacctttgcc gcccaagcttatggagcacaaaaatatcatttagtgggagagttttgtcaaaaagcttgg atgttgataaatattctttccattccagtgctcatgacttggtggttcatggaacctatc ctcggatttttacaaccagatagacagatagccatgctagccacacagtatttgcgaatt ttaattttgggattaccagcagttattatatttgaagttgcaaaacgatttttgcaatgc cagaaaatattcgacgcttccaccaaagtcttattttttggggcgccattcggttttttt atttcatatctactggttttcaagtttgagcttggatatattggagctcctattgcagtc gttatttcgtattggacaattgccattttactcatattctacgtgatcctcatcgatgga aaacaatgctggggaggattttgtaagaatgctcttttcca
AA
MVSQKITPKPHLVLKTKLLAPIRGLPVGTKLFINVCTNDLVPLPPPLTPFTPAEDDVFDL SVILPKVMDDKWEIPILTSPDLRMDKDKSGNLSLVTDCVINDKPMNWCLISKDLMDILIE WCFESLEFQYDSEVLVVDHIKYSIPKMSYKGKEVQEMEIDLESNELQRKEFERIKTTLTK NEPLGILESKRAQDEDDLDTDKQELTSLDSLLARPVAKPVGNGPLIQEINDMTISQSRPK PKLETSQNATKSPLQKSKPKLSLDIKVVKLSPEDTETKYKLLLQISSNLTDKSAYKLDLN KENSTLLISTTDSKYDFSKSQVHFPLPIDTEQIRVINVNSEAFLRVYIK
Homologue of SSB1 (YDL229W) and SSB2 (YNL209W):
SSB1: Cytoplasmic ATPase that is a ribosome-associated molecular chaperone, functions with J-protein partner Zuo1p; may be involved in folding of newly-made polypeptide chains; member of the HSP70 family; interacts with phosphatase subunit Reg1p
S. cerevisiae null mutant: viable, petite-negative
S. cerevisiae overexpression mutant: increased growth rate in exponential phase; increased resistance to FK506.
SSB2: Cytoplasmic ATPase that is a ribosome-associated molecular chaperone, functions with J-protein partner Zuo1p; may be involved in the folding of newly-synthesized polypeptide chains; member of the HSP70 family; homolog of SSB1
SEQ0197 chr3_0731 (P. pastoris known gene)
5' region cttcataccacaactttgcgacgcctgaggattgtagaaaatggcgtagaaaggccgcca gcctttctgactgttcccgtccttgtcaaagaacaagaagttcacagatggttagattct ttgatggccacttatcagatccagatgcgaacaatatcagagctctctcaattccataca gaggatgcgaactcattcatttccatcatacaattgaccattgatggcatctggttgatg gctctggataaagaatggcttaacagcaggttgacagttgaaggagttttaaccgctggt
agtcacgatagacacagccaattacgctttcatctcccatacaatgtagtaatttcctca attactcttctggacaagctcactctggcagattgccaagaggttctctatcatgtattc aaacgagcatcttctactgatactaagactcgaagctctttccaaaacttaacaagatcc tcagaattgtacaactgcaagaaggaactccctttacaaccattaagccaggtggatata ttggtaacgcacaatttatggctcacaaaaaggcagacttcttatctaattggctgtaaa ggttccagaattgaactcatacgaactaactccaaggctaaaatcaagatttttagtgca ctagaagaggaattaatgcctaagcaacatatttctataactggaactaagtctcaagtt cagaaagctgtctttctgattcagttttccttggacaaactacttcagggaatcagtgac tggtgagactatttattggtgagcaaaactagaatgggctataaattccatgccacagcc gatgaaccgggtaaccttttagttttacccacaatttactgtttcgctaacctagaaaaa aaacctctagtggttggtgaaaaatttttttaggagagcctttttttttcttgtttcctc ttctttctcaaagaaactttcaagttttcgcatatatata
ORF
TCCGCAGTGGAGATCATTGCCAACGAGCAAGGTAACCGTGTTACTCCTTCCTTCGTTGCTTTCACTCCTGAAGAG
GCATTGAAACGTGTGGTTACCAAGGCTATGGCAACCCGTTAA
Downstream gtttattagtaaaatttagggtacgcaaagttaatcttacagtatatatgtgctgttact tcttaaatgtaaatcattatttttattaaccgatatagttatagccatccgcagcaacga ttgtcccttcctggatacatttgaacacccactctttctctactatccggggtactttat ctagaacacctttgacgattggtagatcatcacccagcttttttcttagagacccaactt catcctgtatggtcgcctcctcagtcaaccaggaatagttgaatgtggtggaagtatcca cgattataaatgaacattggaaaatatcagttgtaagaaatcctcctagaatcactatta cttctaacaagtatggactcataacatttgtcagaaatggagatctatagcatggtcttg cgatataaaacttcaggcccttgaacaagttgtacagtagtggaaattttagtgggtgtt ctaactttatcttaggtatacctttcataagcattcgcaactccttacggtcaatgggag aggtataactatccccaaatcggtcggagtttttctttaccttttctatcaaagactcat ccgccagtaatatgtcgttaatttgtagaggaattatagttttcagctgaagacacttca gaatgaagcctggtcgaaagacatttagacctttattaaagtactctgtgactttgaaag taatcttgtcactaagtaaaacaagcctctcgccatctgcaagagctacgcaatcattct tagttattctgccaccattctcgaccagcaaagcctccagctgagcaatagagtatcgtt tttcacccaacatgtcacttaaaacataaaagatgtaatccttgaatatgctgctcttcg aagatacactttgtttagcgagaattggttcagtgactattgtcctctttcgtttactca ttagactgtgtgttttctctgttttgtgtctactctttga
AA
MADGVF QGAIGI DLGTTYSCVATYDSAVE I IANE QGNRVTPSFVAF TPEERL IGDAAKNQAALNPKNTVF DAKRL IGRAFDDESVQKDIKSWPFKWNDNGNPLIEVEYLGETKQFSPQE I SSMVLTKMKEVAEAKIGQKVEKAVVTVPA YFNDAQRQATKDAGAISGLNVLRI INEPTAAAIAYGLGAGKSEEEKHVLIFDLGGGTFDVSLLHIAGGVFTVKAT AGDTHLGGQDFDTNLLEFFKKEFQKKTGKDISDDARALRRLRTACERAKRTLSSVAQTTVEVDSLF DGEDFTAE I
SRAKFEAINADLFKSTLEPVEQVLKDSKIEKSKVDDWLVGGSTRIPKVQKLLSDFFDGKQLEKS INPDEAVAYG AAVQGAILTGQSTSEETKDLLLLDVIPLSLGVAMQGNVFAPVVPRNTTVPTIKRRTFTTVDDHQTTVQFPVYQGE RVNCSENTLLGEFDLKNIPPMSAGEPVLEAIFEI DANGILKVTAVEKSTGRSANIT ISNS IGRLSSSE IEKMIND ADKFKKADEDFANRHESKQKLEAYVSSIESTI TDPI LSSKLKRSAKDKIESALSDALAALELEDASGDDFRKAEL ALKRWTKAMATR
SEQ0198 chr4_0552 5' region cgttcttctacaacgctgtatttgtgggcgctgtcgagtctttcggtgtccttttgttcc aaagctataccaacaagactttgtcgttgaaaccactggaccttttggctttggagaata ttcagtactttgccattgctgctttgttcttgactatcctcccccatattgcaattgctg ttgttccatttaccttgttcagtgtgtttcacgctctgtcatatacccgcagcaacattc taccattattgaatatcccagcaaccgccaccgttagctccaaaattgcggactttgtta agacttataacgacaagtctttatttttggtgtcaaatttcgaattttttaccctggcca ttattttagttaaggctctgctattcagaaaaggttactggattgctgccgctatttaca ccgtcttctttaagctacgattcgaaaaatctatctttttgaggtcagtcgtaaagtctt gggaagcaagaattgatggtttagtctcacatccttccctcaacgttgctgtcaaatcta actgggtgcaggtgaagaaactgattagaacctatggaggaaagccaatcctggccacta cggcttccggttcgtcatcttccaacccatccactccacagaagaatgaataaatttttt tttgaagcaaacctctaggtatgacttttgcttccttttgtttgtgttattattattcct agtatgcatttttataacacctagttttcaatagttacaaattataccctatccagctta tctaacggtgtcgtcatagcacagatcgctagctaatgttgtcgtgctctactcaatatc atggatcttgcatcgaagcctttagaacttcatggaatttacaggcatgaaaaattggcg agaccaagatcctataaattaccataatcccctaccatgcatttcttctctccttcttta caactgtcagttttttcacaaatagacataataacataat
ORF
TCCCAAGTCAACGAGATTGTCTTGGTTGGTGGTTCCACCAGAATTCCAAAGGTCCAAAAGTTGGTGTCTGACTTC TTCAACGGTAAGGAACCAAACAGATCTATCAACCCAGATGAGGCTGTTGCCTACGGTGCCGCTGTGCAAGCTGCT
AACTTATTGGGTAAGTTCGAGCTGTCTGGTATTCCTCCAGCTCCAAGAGGTGTTCCTCAGATCGAGGTCACTTTC GACATGGATGCCAATGGTATCTTGAACGTCTCTGCCGTTGAGAAGGGTACCGGTAAGGCTCAACAGATCACTATC
GCTGCCGGTGCTGGTGCCGAGGCCCCAGGTGCTGACGGACCAACTGTTGAGGAAGTCGATTAG
Downstream atgatatgactccctatcctgaatcttttagatgttgtacattacatagaagagcttaat tctaatttctccatatttacctttcattgtggtaaaaataaaacgtaaacaataaggcac gaaacgcgacacattctaatgtgacggatgcgattttctgtaattcccccatgtaacaat ccctttgtttgacgaggcggagtatataaaacaccaaaaattctcgtgtgtttcgtgttg tcttgaacccattgacctaattttacatgtcattttccattgtatcctctagggactccc gtgtgatgagtgacatggaatcattgaccttgaatcaacaaaagaataagcgattgtcat
ttgaagtggtaagctgcagaagctcacaacacgttgtcagccctgcttatgaacactttg tagaggggtatttgaaagacgttgatcctgctaaggaggctgatacaccagaagacgcca acaaattggaaggtgatacagtccttgaagacgcaaatgaaagcccaaaaagtcagaaaa gtagggctcatagcaatcatagccatagtagtgtgagttttcatgaggcacaatccaatg aagaagagttgtctcaagacttgcaaattaaccaggacatgactgaaagcaatccatctc tagatgataaccaagaagtggtctccttggagagtctaccaccaaaggctagtctaactg atctgccctccaaattactggatattatcgccagtttccttccccaacagtcagctataa atttgtcacaatgtgcaaagaaattgattcccagcacaaggaaacgtttgctgcgaaaag tggtgtttgtcaattcgtggaatgaagcatgtagtttgcagagagaactcaactcaaata tccctagcaccattgtttcaatggaaaaaatgattccctttttcagtaccattagactga gtagcggcaacttgggttcattggtcaaagaattgtggtg
AA
MPAVGIDLGTTYSCVAHFANDRVEIIANDQGNRTTPSFVAFTDTERLIGDAAKNQAAMNPANTVFDAKRLIGRKF SDAETQADIKHFPFKWDKGGKPNIQVEFKGETKVFTPEEISSMVLTKMKDTAEQFLGDKVNDAWTVPAYFNDS QRQATKDAGLIAGLNVMRIINEPTAAAIAYGLDKKAEGEKNVLIFDLGGGTFDVSLLSIEDGIFEVKATAGDTHL GGEDFDNRLVNHFIAEFKRKNKKDLSSNQRALRRLRTACERAKRTLSSSAQTSIEIDSLFEGVDFYTSLTRARFE ELCGDLFRSTIEPVEKVLKDAKLDKSQVNEIVLVGGSTRIPKVQKLVSDFFNGKEPNRSINPDEAVAYGAAVQAA ILSGDTSSKTQDLLLLDVAPLSLGIETAGGIMTKLIPRNSTIPTKKSETFSTYADNQPGVLIQVYEGERAKTADN NLLGKFELSGIPPAPRGVPQIEVTFDMDANGILNVSAVEKGTGKAQQITITNDKGRLSKEDIEAMISEAEKYKDE DEKEAARIQARNALESYSFSLKNTLNEKEVGEKLDAADKESLTKAIDETTSWIDENQTATTEEFEAKQKELEGVA NPIMTKFYQANGGAPGGAAPGGFPGAAGAGAEAPGADGPTVEEVD
SEQ0199 chr3_0230 5' region gggttgtatccattcactatttactctttgtttcatttcttgaattatttggatactact ctgctggcaactctaccagtctcaaacgcagaccaggttcgcaatttgattagaatgttc gtgagctcttacaatgaaaagtccatgtaccttgcggctagttgtgaattatttttagtt ccttctttgttgctatcctctttgaagtcgattatattgctggaatggtatagggctccc ttttcatttatcaggcaattaatcgtggtattctccgtgatctcgtttctgagattaaga tatcaacagaatgtttacatgaaacaattagttgatagttatgatttgaagatcagtcaa ctcttataccatccccaacttcctcaaggattcaggttgggatatttacgatttaagagt ctattaacaagcacgctaggatacttagaattggaaaaaaagaccagataatgagattga actcgaaatttaggatcacccatatgacgaagaattcatttagattattgaaggtgtttt catgtttacctccatgagaccatttctgtcacagcaaatacaggcaacgcttttcaccag agcttgttggtacaacttttcagatgacgccaaattctcacgcgcctcactttgtgcggc gctaacaataggccatttttttgtacctcccggatggttcagctcaatcactcgattgag aggtttttgttccgcgatttttgttcaccccacacttttctcgaaggttctagcaatcaa gataaacaccgcaaagagagccgcaggaaccatatgtggtaccacaagtggtcttaaaca actctggtagaattcgatggaattcgatggaagccgatcgactccgatcgaattgaagca attcgtatatataaggagaacctagttccaccccttactcgaccattagtttacaagact aacttcacagaagcatagaaattaaacaaagttaaacatt
ORF
CCAACCGTGGAAGAAGTCGATTAA
Downstream gcaattcaacggataaattctggttaatatatataacgtgaataggaaattaaggaaatt ttggatctaataatgtgctgtatgccgacatcgggcatcgtagattgtatagtatcgctg acactataataagccagccaaaacccctaaaccagttgccctccactaattagtgtacta cccaatcttgcctcttcgggtgtcttttataaggacagattcacaagctcttgttgccca atacacacatacacacagagataatagcagtcatgaacagcgagtacgactacttattca agttgctgcttattggggactctggagtgggaaaatcttgtctcttgctgagatttgccg aggatacttacacccaggactatatttcgacgatcggagttgatttcaagattagaacta tagagctagatggtaagacaatcaagttacaaatctgggacacggctggtcaagaaagat tccgcactattacatcctcctattatcgaggagctcatggtatcatcatcgtatatgatg ttaccgaccaagagtcatttaataacgtcaaccagtggttgcaagagattgaccgatacg ccactggcggagtgatgaagttgttggttgggaacaagtccgatctcaaggataagaagg ttgtggattacactgttgccaaggagtttgcagatgctctggagatcccattaatagaga catcagccttggactctacaaatgttgagcaagcattcttcaacatggctcgtcaaatca aatctcagatggtgaacaaccaagctggtagtggtaccgagaaggcatcaatcaacctga ggggtcaatccgttggaggcagtcaaaactctagttgctgttagttagtttaagataatt gttaattggtaaatcaaagagtaagagtatatacattctgaggcgttgccgttcttctga gtagcaagtatttgccatttggtgttgttgcctgctcgtt
AA
MGKSIGIDLGTTYSCVAHFANDRVEI IANDQGNRTTPSFVAFTDTERLIGDAAKNQAAMNPANTVFDAKRLIGRK FDDPETQADIKHFPFKVINKGGKPNIQVEFKGETKVFSPEEISSMVLTKMKDTAEQYLGEKINDAVVTVPAYFND SQRQATKDAGLIAGLNVQRI INEPTAAAIAYGLDKKDAGHGEHNILIFDLGGGTFDVSLLSIDEGIFEVKATAGD THLGGEDFDNRLVNHFIAEFKRKTKKDLSTNQRSLRRLRTACERAKRTLSSSAQTSIEIDSLFEGIDFYTSITRA RFEELCADLFRSTIEPVERVLKDSKLDKSQVHEIVLVGGSTRIPKVQKLVSDFFNGKEPNKSINPDEAVAYGAAV QAAILSGDTSSKTQDLLLLDVAPLSLGIETAGGIMTKLIPRNSTIPAKKSEIFSTYADNQPGVLIQVFEGERTRT KDNNLLGKFELSGIPPAPRGVPQIEVTFDMDANGILNVSAVEKGTGKTQKITITNDKGRLSKEDIERMVSEAEKF KDEDEKEAERVAAKNGLESYAYSLKNSAAESGFKDKVGEDDLAKLNKSVEETISWLDESQSASTDEYKDRQKELE EVANPIMSKFYGAAGGAPGGAPGGFPGGFPGGAGAAGGAPGGAAPGGDSGPTVEEVD
SEQ0200
Homologue of SIL1 (YOL031C): Nucleotide exchange factor for the endoplasmic reticulum (ER) lumenal Hsp70 chaperone Kar2p, required for protein translocation into the ER; homolog of Yarrowia lipolytica SLS1; GrpE-like protein
S. cerevisiae null mutant: resistance to tunicamycin: decreased; resistance to mercaptoethanol: normal; resistance to L-1,4-dithiothreitol: decreased; viable
chr1-1_0237 5' region gaaacttcctccgttgccgtctcaatccaatcttataatgcagctgaggaaagcagcttt acacccattactgtttagaggaaatttcactgacaaagttattaaggagatgtccgttaa aatcatgaaagaacctgtctacgctgatgccaacctggaatacatttacgaagatatggc cataatgaacgactatgagctaaatgaactgtgcttaaagtatcctcgtacattatcaaa gtacaagttagaagaaaaatcttttatgcagagcggaaaagttcaaaagcttaaggaatt actagacaagattatttttgagagagaggaaaaggtattaatcttttccttattcactca ggtattggacattattgaggtggttctctccgttcttaaaatcaagtttttgagacttga cggccaaacttctgttgatatcagacaagatatcatcgacaaattctatgaagatgaaac aattcccgtttttcttttatctaccaaagctggtggattcggaattaacttggtatgtgc caataatgtaattatcttcgatcaatcgtttaacccacacgatgacaagcaagctgaaga
tagagcacacagggtgggtcagaccaaagaagttaacgtttatcgaatggttaccaagaa tacaatagatgagagcattctgcagttggccttgaacaaactacaattggatagtaccgt cagtgatgacaacgataaaagcaccagggcatttgaggaacagacagccaaggtcattga acaaatgctttttgaaagggaaagtagcacagacgttccagaagagaagaatgcgtaaag ctgaagagtgcttctggtgtacaaatctcctagactgaatttccttgaataattttatag ttttagacacttagattggcattaccacgatagagggtacgggcgccattacataatcaa tactagatcatctctcgaaataccttagaaaccttcaagc
ORF
TACGGATCAGACATAATCTTGAGTGATCAGTATATTTTTGGAGTAGCCGGGCTAGTTCCTACTAAGACAAAGTTT
AAACATTTACGTTTTGAACTTTTTGGGAACCCATTGGCATCTAGGAAAGGTTTCTCCGATGAGTTATAG
Downstream ataattggggatccctatctcattacaacattctccatgttgtaggatttttttcaagga tgcatgcgaaagttaacgtaaattaccaaccggacccagatattcgacatcgctttcgtc tatgaaaaattttcgggagactagttcttgcttcctatatatcccctaatctttaatata qaatgaagtacqttttatccqaqcaagttcttactgtcccagaagatgtctccqtgtcta tcaaggccagagtcatcaaagttactggaccaaggggtgagctgaccaaagacttgaagc acatcaacgttgcctttgagaaatctggcgacaacgagatcaagatcattgtgcatcacg gtaacagaaagcacgttgctgctctgagaactgtcaagtcattaatctctaacatgatca ctggtgtcaccaagggttacaagtacaagatgagattggtttatgcgcatttcccaatta atgtcaacttcctcgagagagatggtcaccagtacgttgagatcagaaacttcttgggtg agaagagagtcagagaggtcaaggtttacgatggtgtcactgcatccaactcttctgctt tgaaggatgagctgatctttgagggtaactccattgagaacgtctctcaaacttgtgctg atgtccaacagatttgccgtgttagaaacaaggatatccgtaaatttttagacggtatct acgtttctgagaagggaactatcgttcaagaagaataaattctggtatatctcaaaagat ttatgtacattagttaggctttccacactgtgtgtttcctacccccattcccaaccccgt gaatcgttacggcccttaaagagaaggccccttacgattacacacccgtaaaataacccc ttattatatctggaattttgttgtgatagttactgattctttggaggattaacctccagt ggcctgggtttagttgtccgaatgaattattaacttcata
AA
MKVTLSVLAIASQLVRIVCSEGENIC IGDQCYPKMFEPDKEWKPVQEGQI IPPGSHVRMDFNTHQREAKLVEENE DI DPSSLGVAWDSTGSFADDQSLEKIEGLSMEQLDEKLEEL IELSHDYEYGSD I I LSDQYIFGVAGLVPTKTKF TSELKEKALRIVGSCLRNNADAVEKLLG TVPNTI TIQFMSNLVGKVNS TGENVDSVEQKRILS I IGAVIPFKIGK VLFEACSGTQKLLLSLDKLESSVQLRGYQMLDDF IHHPEEELLSSLTAKERLVKHIELIQSFFASGKHSLDIAIN RELFTRLIALRTNLESANPNLCKPSTDFLNWL IDE IEATKDTDPHFSKELKHLRFELFGNPLASRKGFSDEL
10 5S ribosomal RNA gene
21 copies spread over the 4 chromosomes
SEQ0201
Ggttgcggccatatctagcagaaagcaccgtttcccgtccgatcaaccgtagttaagctgctaagagcctgaccg agtagtgtagtgggagaccatacgcgaaactcaggtgctgcaatct
11 Xylose metabolism
SEQ0202
XYL1: chr3_0744 (80% homology with P. stipitis; 84% with C. albicans) Aldose reductase: converts xylose to xylitol
5' region atatgtattcccaaactgccgaacacttgcttagacacgtcgaaaaaccagatcagccat ggtctctggtggggatactcttggtatcttttaaaaagaagggaactcagtgacagaagc cctaggatcgcttgtacaaacaaggagaatggacctagtagcttacactgatcagacatt gctgagtaacgaagaaactttaacagaaaaggatgacagaaggggaacacatagggaaag tttcgcgtatgataaacccatgtgggaggagacaataattctttagacggagtaagtggt gtaatgctttttcgtccggggaaagctctttcaaaagccaaccgcgagcctcagttattt tctccttcccaaagatcacaccgacgatgtccgaaccagctcaacaaccaaaagcagaat gtaagtaggcttttaagcttcagaagaattagggattaagcgtactaacggcgttgttag ctaagcccaagccatgttgtgtctgcaagccagagaaagaggcccgtgatcaatgcttat taataaatggtcaagactcaggaaaatgtgatgacttgatttcccaatacaaggtctgca tgaaaggattcgggtttgacactaactagctctcagatgcgtgaatatctcttgtaaata ggtcaaaattgacaggcattcatccatacggagatccaactgtggatacgctgaacatct tgattttcggtgaaaaatcgtaacatctctggccgtctaggcagagtaacttgatagcta tcacagtgcgtcccaaccattcttgtgaagctattttattttgacttctctcagcatgtt tgtccgtttagcctcccaaacttcaaagatatagtttcaccatagtataaccgcatttcg taatcaatcgcggttaaagcaagtatcatgcgctaatatataagaatctcctaagttgcc tcccgaattcagattattctcccaacaattaacttatcggc
ORF
GACCTCGGCTTGAGATTCAACGATCCATGGGACTGGGACAAGATCCCTATATTTGCCTAG
AA:
MATLLKLNNGLKLPQVGLGVWKIPNELTAETVYNAIKQGYRLFDGAEDYGNEKEVGQGVRRAIDEGLVKREDLFI
LERLVDAGRIKSLGVSNFNGALLQDLLRGARIKPVALQIEHHPYLVQQKLIEYAQSEDIVWAYSSFGPQSFLEL KVNKALTAVSLFEHDVIKKIAQAHNRSAGEVLLRWATQRGLAIIPKSSKPERLSSNLHINSFDLTKEDLETISSL DLGLRFNDPWDWDKIPIFA
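The ORF and AA entries of a record can be cross-checked by translating the ORF and comparing the result with the AA entry. The following is a minimal Python sketch of that check, assuming the standard nuclear genetic code applies to these genes; `CODON_TABLE` and `translate` are illustrative names that do not appear in the listing. The input is the XYL1 ORF fragment given above, which should reproduce the C-terminal residues of the XYL1 AA entry.

```python
from itertools import product

# Standard genetic code, built in TCAG order (assumption: the standard
# nuclear code is the appropriate table for these sequences).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(orf: str) -> str:
    """Translate an ORF (whitespace allowed) up to the first stop codon."""
    seq = "".join(orf.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        # Codons with unexpected characters are reported as 'X'.
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# 3' fragment of the XYL1 ORF as printed in the listing.
xyl1_tail = "GACCTCGGCTTGAGATTCAACGATCCATGGGACTGGGACAAGATCCCTATATTTGCCTAG"
print(translate(xyl1_tail))  # prints DLGLRFNDPWDWDKIPIFA
```

The same check can be applied to any record whose ORF is printed in full, for example to spot residues in an AA entry that do not agree with the corresponding codons.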
SEQ0203
XYL2: chr1-1_0490 (69% homology with P. stipitis) Xylitol dehydrogenase: converts xylitol to xylulose
5' region: cctgttcaagcgtttttctttacactttagatccaaaccctggcattgttcttcaagcta ccgatttctctcatgtgaaaacgggtaatgacagaggtgacaatctgccctttgatgatg tattcttctatgaacagaacgcgttgagtaagctagtggttgctcatgttggagatacaa gagtagtgttgtgtgataaaaacggactggcccacccactaacacatagccaccaccctt ctaatccaatcgaaagcagacgtctttcaaggttttctgcagggttaatgatgacggatt cttttggagaagaaaggttcatcaactttgccaatactagatcgtttggagatcttacag ccaaaaacaagggggtgtccgcagagcctgaaatatcccaatacctcataggaaattcga ctaaactgcaagcatacaaagatcaaaattctgatttgattgcaagtaaagagatacgag actttggagggaatgagtgctttttagcacttgtaaccgatggagtcacagatcacttat
cagaccaggagattgtagatttggtcatgacaactaccaataataaaggacttttgagag gaacacctcatgaggctgccaaagaagtcattaaattcgtggagtgcgttggaggtgatg acaatgccacatgtaacataatcaaattgaacggttggggtaattggccgatgatggacc gaacgggtaagctaagagaggacaaaatgatgcaaggcggaagagcaaggttagataggt agtgttaaaaagttgtatattattaatggcggggcagaatgctctcagtctcaacttttg atgattgcgcgggattaaaaaatcatgcacaccggaaaaacttatcccaaatgtatgtac cccacgtaatccgttgtagttccaccaaagaatttaaaacgtacttcccgcccaataatc tgttccctttcccgcttcattacactcactattatccaac
ORF:
AACTTCCCATTGGGACTTATATGTGAGAAGGAATGTATCGTAAAGGGTGTTTTCAGATACTGTTACAACGATTAC
AA:
MS DNPSVI LKRINE IVIEDRPIPAIEDPHYVK IAIKKTGICGSDVHFYTDGCCGSFKLESPMVLGHESAG IWEV
VEPLSVAVHAARLAKITFGDSVWFGAGPVGLLVAATARAYGATNVLIVDIFDDKLTLAKDTLQVATHSFNSKNG MDNLLESFEGKHPNVS IDCTGVESCIAAGINALAPRGVHVQVGMGKSEYNNFPLGL ICEKEC IVKGVFRYCYNDY NLAVEL IASGKVEVKGLVTHRFKFTEAVDAYDTVRQGKAIKAI IDGPE *
SEQ0204
XYL3 (XKS1): chr1-1_0280 (66% homology with P. stipitis)
Xylulokinase: converts D-xylulose and ATP to xylulose 5-phosphate and ADP
5' region: ggtgcattagtagaacctttgagtgttgctgtccatgctgcaagacttgcgaagattacc tttggagacagtgtagtagtttttggagctggncctgttggtctgctggttgctgctacg gctagagcttacggcgcaaccaacgttctcatcgtggatattttcgatgataaactgaca ctcgcaaaggacaccttacaagtggccacgcatagtttcaactcaaagaatggtatggat aatcttttggaatcatttgaaggaaagcatccaaacgtttccattgattgtactggagtt gaatcgtgtattgcggcaggtatcaatgcactggctccaaggggagtgcacgttcaagtc ggaatgggaaaatccgaatataacaacttcccattgggacttatatgtgagaaggaatgt atcgtaaagggtgttttcagatactgttacaacgattacaacttagcagttgaactgata gcttcaggaaaagtcgaagtgaaaggattagtaacccacaggttcaaatttaccgaagca gtagatgcctatgatactgttaggcaaggtaaggctatcaaggctatcattgacggccca gagtaaacggagacatatataataataaaattgtaaggttctcttagcattgcaccccac cctaggtttctttttagactgttatgcactcgaaatttaggtgggtatgtgtcgatactc aattctgcacacaccaccaaaatctgtcgcctttctcttatatctccaacccgagtccaa tgtcattttcttaccttgaaggcaacatcgattttaaaggacaggaacttgcaaacagga tcactaaaaaactaatcacatttggtgcaattattagttttctggtaggatttttgagtg acaacatcttatacactgtatacactttcgcagcttttggtttattgactgcttctttgg ttattcccccttttagcttctacaaaaagaaccctgtaac
ORF:
TATGTCGATGGAGTTGGAATCTTGAGTGAAATAGAACAAACTCTAGAGAAGTAA
AA:
MVTKEIQNRDSALTESVPNDLYLGFDLSTQQLKITSFEGRSLTHFKTYRVDFDEELSVYGINNGVYVNEETGEIN APVAMWVEALDLIFSKMQKDKFPFGIVKGMSGSCQQHGSVYWSKDAPDLLSSLSPSKDLKSQLCPKAFTFEKSPN WQDHSTGEELEIFERKAGSPENLSKITGSRAHYRFTGSQIRKLAKRVNPELYKETYRISLISSFLSSLLCGRITK IEESDGCGMNIYDIQNSRYDEDLLAVTAAVDPEIDGATEHERQEGVARLKDFLQDLEPVGYRSIGTIAAYFVEKY GFSEDSKVFSFTGDNLATILSLPLHNDDILVSLGTSTTVLLVTETYWPNSNYHVFKHPTVPGSYMVMLCYVNGAL ARNQIKTSLDKKYNVSDPNDWTKFNE ILDKSKPLHGKEELGVYFPKGE I IPNCVAQTKRFSYDAKSKKLVTANWD IEDDWSIVESQALSCRLRSGPLYHGSDETDQEEESEVIQRLSNFPKISADGKDQRLPDLISHPKKAFYVGGASQ NVSIVRKFSEVLGAKEGMYQINLGDACAIGGAFKAVWSDLCETEKAIPYSDFLRKNFHWKENVKPVEADSSLWLQ YVDGVGILSEIEQTLEK*
SEQ0205
RPE1: chr3_0441 (81% homology with S. cerevisiae; 86% homology with D. hansenii) D-ribulose-5-phosphate 3-epimerase: converts xylulose-5-P to ribulose-5-P
5' region (bold and underlined stopcodon previous gene) gcgatggacagactgcctcacaacagttcgtcaaacagggaaaagacgccttgagagtct actggttgatgcttatctacatgattatcatgatggccgcattcaactttagttctcact cctcccaagatctttaccctactcttctaacagttcaatatgaatttggtaaagacagat ctacagttactaacgtcgtggcaaatcttggagccattgtcggaggtattttctggggtc atatgtccaacttcattggtcgacgtcttgcagttttgctctgttgtatagtaggtggag ctttcatttacccttgggcatttataaatggttcagggatcaatgctggtgctttcttct tacaatttgctgtccaaggtggatggggtgtatgtccagttcatcttgctgagatgtctc cgccccatttccgagcttttgttacaggaaccacctaccagctaggaaacctggcatcat ctggatcatctactatcctggccgatattggagaaagattccccatttacgacgagaatg gagtgcggaaagagggtgtttatgactatgccaaagtaatggctatatttatgggagctg tgtttgctttcctgtttattgtgatgttgctgggtccagaaagaagaggagcctccaatc ttgatgatctccaggattacattgacgattgggagcaaaaagatctggacaacaaaaaag gaattgaaactgagttcatcgaagatgttcagttaatggaggctgtctctaatgaggacc tttctgagaaaggcaatgagaaagtagaggcttttgaaggaaataagtaagattagggta cttcttagaaattactaatactacacctctacattcgttatcttatcaggcttgtcgcga tgggctaaactatagcctatgagtggggggtcgtagttaaagaaatatatactatatgcg aagatgggaaagtaaaactagttgaatactgtgtgagatc
ORF:
ATGGTTAAAACAATTATTGCTCCTTCAATCCTGTCGGCAGATTTTTCAAACTTGGGATGT GAATGTCACAAGATGTTTAAGCTACGGGCTGACTGGGTCCACATTGACGTCATGGACGGC CATTTCGTCCCTAATTTGACCATGGGTCCTCCAATTATCTCGTCCCTTAGAAAAGCTGTT CCGAGGGGAGAGAAAGATGGACAGACGACACACTTCTTTGACTGCCATATGATGGTGGCC AATCCCGAACAATGGGTCCCAGAGGTTGCAAAGGCAGGTGGTGATCAGTACACTTTTCAC
TACGAAGCAACTAAGGATCCAGTGAAGTTAGTGGAACTTATCAAGAGCCATGGACTGAAA GCTGCATGTGCTATCAAGCCCGGAACATCAGTCGATGTCTTGTACCCTCTAGCAGACAAG CTTGACATGGCTCTGGTAATGACAGTTGAGCCAGGGTTTGGTGGACAGAAGTTTATGGCT GATATGATGCCCAAAGTTGAAGCATTGCGGGCCAAATTTCCAAACTTAGATATTCAGGTT GACGGTGGTCTTGGAAAGGAAACCATAGGAGTAGCAGCCGATGCTGGGGCAAACGTAATT GTTGCTGGATCTTCTGTTTTTGGTGCTAAAGACCCCGGTGAGGTTATCCAATTTTTGCGT GACACTGTCCAAGATAGTCTCAAAAAGAAAGGTCTTTTAGATGAATAG
AA:
MVKTIIAPSILSADFSNLGCECHKMFKLRADWVHIDVMDGHFVPNLTMGPPIISSLRKAV PRGEKDGQTTHFFDCHMMVANPEQWVPEVAKAGGDQYTFHYEATKDPVKLVELIKSHGLK AACAIKPGTSVDVLYPLADKLDMALVMTVEPGFGGQKFMADMMPKVEALRAKFPNLDIQV DGGLGKETIGVAADAGANVIVAGSSVFGAKDPGEVIQFLRDTVQDSLKKKGLLDE
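The four xylose metabolism entries above describe consecutive conversions. As a minimal sketch, they can be written as an ordered table and checked for continuity. The tuple layout and the names `STEPS`, `is_linear_pathway` and `route` are illustrative assumptions; metabolite names are normalised here (for example "D-xylulose" is written "xylulose") and the ATP/ADP of the xylulokinase step is omitted.

```python
# Xylose metabolism steps as annotated in this section:
# (gene, locus, substrate, product). Names are normalised for comparison.
STEPS = [
    ("XYL1", "chr3_0744", "xylose", "xylitol"),          # aldose reductase
    ("XYL2", "chr1-1_0490", "xylitol", "xylulose"),      # xylitol dehydrogenase
    ("XYL3", "chr1-1_0280", "xylulose", "xylulose-5-P"), # xylulokinase
    ("RPE1", "chr3_0441", "xylulose-5-P", "ribulose-5-P"),  # 3-epimerase
]


def is_linear_pathway(steps):
    """True if each step's product is the substrate of the next step."""
    return all(a[3] == b[2] for a, b in zip(steps, steps[1:]))


def route(steps):
    """Return the ordered list of metabolites along the pathway."""
    return [steps[0][2]] + [s[3] for s in steps]


print(is_linear_pathway(STEPS))
print(" -> ".join(route(STEPS)))
```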
12 Arabinose metabolism
Not all enzymes could be detected
LAD1: L-arabinitol-4-dehydrogenase of Trichoderma reesei has homology with P. pastoris XYL2 (chr1-1_0490)
LXR1: L-xylulose reductase of Trichoderma reesei has homology with peroxisomal 2,4-dienoyl-CoA reductase (chr4_0754 & chr4_0988)
SEQ0206 chr2-2_0305 putative kinase (possibly ribulokinase, araB), 72% homology with possible ribitol kinase or glycerol kinase of Pichia stipitis and homology with hypothetical proteins of P. guilliermondii & D. hansenii
5' region (bold & underlined stopcodon previous gene) acattggtaattcgttagcgctggagccttacttcagcgagagatggccccagaaccacg ttctactgggggtcatggacaagatacacaagatttactcaacagactttgcaccattgg ttgcagcccgttcatttggtgttgatgttttgaattcactgggaccaatcaaggatttta tggtgggaaggatcagtggaaattccaaaagatgagatgtcttctgttctcccttcttgt ttagcagagtagaacagctttacctgaatggctgatgtgcattgcgattaggctaatcag taactcaagcttttccagtcagatcgatcgactactcgaaagccccagtcctcatccctc tactatcggcgcccgtgtatataaagagatagaagtaaactctgcgcgcgttttgcctct attcttctgctagaaaagaaaaaaaaaaacgtttctcatatactaagcactaagaatgag tgaagaagactcccaggaattagaagagacttttaacagaataaaaacacatttcccctc tgctaggataaaaaaattgatgcagagtgatgatgatatcggaaaagttgcccaggccac cccagtggtggtgggaagagcattggaactgtttttatgttcactggtagacaagtcatt ggaggtggctcgtgagagcggatcacgccgtatacaacctgcacatttgcgtaaggctgt ggcagagaatgagcaattcgacttttgccagtcgattctggatggtgagccaagggaaga gtgattgactacaaaacaataaatttcatgcttctgcgtcttctccccctcccatttcct ccgcgcaaaaagagaaactagcttacccaatagtaacacaagggaatttgttttttttgc cccttccacgtcctttagcatcgccttcttatcgcaaccatttcccattctagtctcttt cttctccactaaccctagagagcattttattttagataca
ORF:
TACGATTTGGCAGACTTTTTAACACACAAGGCTACAAACACGGAAACTAGGTCTTATTGCTCAGTTACTTGCAAG
AA:
MSKNNLKSSRSRSFTNPNFHLNISELHQNFATPVYYVGVDVGSGSARAAVVDQAGAILGLAEKPISKYTPKADYV NQSSTEIWEAVSYCVKTALTQSHIDPALVMGIGFDATCSLWLDEETDEPIAVGPDFTESEQNIIMWMDHRAHEE TKAINRTGDKCLKYVGGQMSIEMELPKMKWLKHHLPRDETGKSLFERCFFYDLADFLTHKATNTETRSYCSVTCK QGFVPQGVDGSVDGWSKEFLAQVELPELAANDFQKLGGIPGKNGKYLSAGDSVGPLSADAAEQLGLTTACWVASG VIDCYAGWVGTIAAKTEIPLPDLVEQDNNFSGIDFACGRLAAVAGTSTCHCVMSKDPIFVHGVWGPYRDVMAKDY WLAEGGQSCTGALLAHVLTTHPAYTELGKASESSGLSRFDFLNNRLENLKRSRKERSVLALGKDLFFYGDFHGNR SPLADPDMKASI IGQSMDTSLDSLAIEYLGACEFIAQQTRQIVEKMEKSGHNISCIFLSGGQCRNGLLMRLLADC TGLPIIIPRYIEASWFGSALLGAVAADDAILAQNKGRTGRTVRSSSNVSEPPHEMPPSPYTAPTATASMTSISA MNAPNLSGAPGGFPFPLMTPIEDESKQLGFEDGVEDNTSSDEETLSFGTKNFNQANVQTNFVQQRTSQKPLVSSK PSKLGEERSGDRLWKVMESLSGVGKVIMPSSPNEPDRKLLNTKYKVFLDQAESQMRYRKLVADTEEAIKAFMNI*
SEQ0207
Chr4_0280 Putative methylthioribulose-1-phosphate dehydratase (homology with E. coli araD), acts in the methionine salvage pathway, potential Smt3p sumoylation substrate, expression downregulated by caspofungin and deletion mutant is caspofungin resistant
5' region (bold & underlined stopcodon of previous gene) aagctcgtaatcagggcttggacagatcattgtttaagattttgagtgagaagcatcctc aaagtgtcgtctctttgagctatcaatatcgcatgtgcgcagacgtgatgcttctatcca acaaactaatttatgaaaaccggttaaaatgtggtagcgataacgttgcccaacggtctt taagaattccaatgaaagagagactggatcacttactgcttgatccaagtacaccgaaaa atcgattgtggcttcaccacatctttgtagaagagaacagggttatcttcttggaccatg atgacgttccagcaaaagaaagaatgtcgacagatagagtggacaacccaactgaagcag aactggttggccaaattgttgacagtcttatcaattgtgggtgtgacagcaaggagattg gtgtgatgtcgttttatcgagcccaggtaaaattgatgaaaacaagacttgtttctcatg tgcaaaatgtggaggtcatgacagcagatcagtttcaaggaagggataaagaatgcatta taatatcactagttcgatccaacgctaccaactcagtgggtaatctactaaaagagtgga gacgaatcaacgttgccattaccagagccaaatcaaagttactaattctcggcagcagga agacattggaacaacttgaaactctgaaagcctttttgtcactattgaaaagtagaggat ggatttacagactgccgcttggagctgattccgtttacaaaggcctcaagcaagataacc ataagacagagccagccagaaaagacaattccaaaatgtcattgaaaatgataaaatcca ggccaatattgaaggacattttcaatgacttggtaaactaatattattatcatacatcgt gtatatagacgatcgattagcattttatttagtaggcccacgagcccagtgatttctcga tttaataatttgacatttgatccattcggctgtcaactcca
ORF
GCTCTCAAGATGCATCAACTTGGTATTCCCACTGTAAAACAGAATTAG
AA
MSSCFCSKSQETEKRGAVSEDQLVISDDSTHPANLICELCKVFYNNGWVTGTGGGISIREGSKIYIAPSGVQKER MVPDNMFVMDLESENYLRTPLTLKPSACTPLFLSAYKMRDAGACIHTHSQAAVMVTLLYDKVFEISSIEQIKAIP KVTEKGNLMYSDRLVIPI IENTEREEDLTDSLQQAIEEYPGTTAVLVRRHGIYVWGETWJKAKVYNEAIDYLLEL ALKMHQLGIPTVKQN
araA (L-arabinose isomerase): no homology
13 Trehalose metabolism
SEQ0208
ATH1 (YPR026W)
Acid trehalase required for utilization of extracellular trehalose
Chr4_0342
Upstream (in bold start of next gene) cgtgagatcgacctttccgttgcttaaatgtttgacagtttctagatcttgtatacttct agccccaccagcatatgtaactggaagatcactccattcacttaatttccgtaccagctc ttggtcaattccttgacaaagtccctccacatctgcagcatgcactaagaactcatcgca atactgggacagcagttggaaaaactctctatctagcacggatgaggtcaatgtttgcca tttgttcatagcaaccacccattggggacttccatctatctcttgtctacgacaactcaa gtccactaccagtttctcttttccaataaggctggacaggtgttgaagtctttctataga aaactgaccctcagggaataaccatgaagtgactatgacatgcgatgcaccttgatctat ccattcttgagcattggaatccgttattccaccgccaatctgcaaaccccctctccacgc acccagagcttctcttgcagcgtcgtcattagcctttaaagagcccaatttgatgacatg tgttccatgaacctggttctgettatataactgtgcataataggacgagggctgttcact gacaaaatttgtttctacttcatcagattcagtgtcatccttcaccagcttgccgcctac aatctgtttgactttgccggaatggatatcgatgcaacctctgaagacagtcatacggac tatttttctatggtggagcacttactccaaaacctctcgatctggcgagcaagatcccaa agacggaaaaaaaaagtcatacattccccatgtttaatgcacctaaatttctaatttagg aactcgcaaactttcaggtgccagatagagcaatcaaatttctccgtggaccgatgcgga acatttggcttttacttcactagattagggcaatgccgtatggatccatatataatagca gaataccaaagaaacctcctcccacctctcagactagagaa
ORF
CCCCAAGATTATTCATATGGAGCCCAAGTAGCCGAAGTTGTTCTCTACTAA
Downstream (in bold start of previous gene) cgaccttacgtaacgaaatatcccccatctttagattaaatgaaaaagttttgtttgtta ttacataatctcagtaggttgtcgcatcatccggggtaatgtcgcaccgaaatctgggga ttccctctgtaacccaccagtataaataatgaacagttctaccagcttcccggttactcc aaccaatgtctccagtacctcctgctccaggaatttacactccagtaccaacattcttta agaacgacggttataccatcgatttcgacgctaatgtcaagcatgcgaagtttttgaaag acaacggtattgctggtttagttatcatgggttccaccggtgaaggtgtccatttgacta aagaggagagagctgcaactataaaagctgttcatgacgctcttccagatttccccatca ttggtggtgtggttcaaaacagtgtacaagatgccctggatgagatcgattccatcaaag ctgcaggagccacccatgctgtagtcctttcaagtaattactacggtgctggtatcaagc aacaaggtatcattgactggtttactgccgttgcagacaaagcttctctcccaattttgg tgtacgtttacccaggagtttccaataacttgttcaccgaccctgccacagtaatcaagc tatctgggcacccaaacattgttggcactaaaatttcacacggtgacgtatcccatcaca caataattgccactgataagaatgtccagcagaataacttcaacacttttactggccttg gacaattgctagttccaactcttaccatcggctgtaaaggaaccatcgatgctctttctg gtgcctttcctaaaatttacgttaaaatctttcaattggtgcaggacggcaagattgagg aggcagtaaaactgcagaacactgtatctagaggcgaggagatcgtggttaattttggcg tcattggaatcaagaaggctattcatttgggagccggaata
AA
MLNRVLLVALSCWFFHLVTTFPVGTSSDSLQIRNLLSHNFTRANISEGLSSGATYFVDEDTETYYDKELKVLRT TRFPRYNNYQLQPYVANGYIGSRIPRVGSGFTYDTSDNKTSENLKNGWPLFNKRYSGAFIAGFFNSQPTVPETNF EELEKDGYESI IAS IPQWTSLELTVNVNGTNQTLKADDVDITHISDYSQQLΞLLDGIVTTNYTWLGLVNVSISVL AHRDIVSLGFVSLELSSQKNITVSVTDILDFATSTRCSYLDSGVNEQS IFMKVQPSNVPTNATIYSSLMSSNSTS SLLKQNQTVSQTLRVNLSKNQAASFQKYVGWSDDYLDSIETNLTSYQFARETAKFAEIKGRSWILKSHKEAWNE LLNGKS IVFHDNDFLTLASDSS IYHLMANTRSEANGGTSALGVSGLSSDSYGGMVFWDTDFWMLPSVQAFSPRHA VSLSKFRDHTHDQAKKNAQTRDMNGAVYPWTSGRFGNCTSTGPCYDYEYHINIDIAFMFWKLYLGGAIDDDYMKE FGYPI IEDVASFFVDYVDYNΞTLDKYTTRNLTDPDE YAEFKNNAAFTNVGIΞQLMKWALILGKHLKVGNERSYDK WEDIMTKMYLPVNHAGDVTLEYTGMNNSIEVKQADVVLISYPLDDEDGALQEYFDYDEDRAISDVRYYSDKQTDE GPAMTFSVYSAVNAKFNKEGCSSQTYLLKSVEPYFRFPFGQMSEQSTDQYDTNGGTHPAFPFLTGHGAFLQSSIY GLTGLRFSYIYNDTDKSIKRRLAFDPLQLPCLPGGFSINGFVYMNQTLDITVNDTYATIAHRGNATTINVIfVDSR NEMGGKEHKIQPGKSLSIPLYQTEQNIPGSFIECTVKNVTALQPGVVGDPIQAVADGDNSTIWKIESREEPTHLI FDLGDELDIEGGLWWGTYPAESFSVSVLRDFNSTNYRVINNVENYDLIYESGNVTASSPFDESHIKKVQILPHN CTNFTFSELTASRYVLFEFTDVLGYPQDYSYGAQVAEVVLY
SEQ0209
NTH1 (YDR001C)
Neutral trehalase, degrades trehalose; required for thermotolerance and may mediate resistance to other cellular stresses; may be phosphorylated by Cdc28p
Chr4_0816
Upstream gaaaaaccaatctaatcagtaatgttctcaagaatatagcaaaacaaattcgcaggaaaa gggaaataggataatacacaattgttagcattcattatttggtcttgttaatctcttgag ctttctcaaattccttcttgaatccgtcagcattttccttgttaccaaaacggatggcaa aagtttgggcttctggcttaccctcagaaacatcggcggttactgagtaaacccatgaac ggtctgaaccgacgtttggcttcaaggtgaagtcaggggaaatgaagtgattggcgcata ctttgagaaccttgtctctacgcatcaaaagacgaacttttccagtttccttgttcttta agaacttagcatctccaacacctctttccttccattccttaccttcttggtcaaaacgga acaacttagcacgaactttgtacaatacctcctcattctcttcgagtgttttcacttcca
ctttttctagtttgacaataggatcgaaatgaatatctggtgattctggctcctcttctt tgtcagcctttggctcgctggaggcttcagattctttggagacctcagcttccttgtctt ccttagaagtttcagtttccttcttctttccaccgaacatagagaagacattggaagtag gaggagtccgagttccagaaaaggaattcaaagtggaagaagtctgagtttcggctggct tggactcttcagctggcttgttttcttgcttttgttcagacattgactctgttatgcgtg ttttcaagaaaaatttgttcgttttggttctctttgacattttgatcccttcgcgaagga gcttctgggggaaggaactttgctattttgcttggtggcttctcctgcccccccacatac gcgaactgcctgacctgacttcaaaaattaagaaacttgcttttgcctcttatctatttc cattttacagccttataacataccgcatacaacattcatca
ORF
GACACCCCCGATGGCCGGATCCCTCGTATCTACGTTCCACATGATGACCCGGAGCAATATGAGTTCTATCTAGAA
CACTTTACTACACTTTTGACTCCATATGCTAAGAAGCATGGAGTCTCTATAGAAGAGTTCTATGAGCTATACAAC
TGCAGTTCTCAAGAACAAGCTCAGTTAATGGTGGATAAGTCTCTATCTAAATTTGAGGAATTTGGTGGTCTTGTC
GAGATAGTCAAGTCAAACTAG
Downstream (in bold start codon previous gene) aagtatacggtttccccgcagaaatagcagaaataggcgacaaatacatacaacattttc attgtgatagggggcggcggttcctaggagggacaacccccagaaaccttgtagactacg ttttcacgacgatgggttattactgtaaaggaagaatatactacccaccagttgaatgtt tgaacggatcaaaggtcgaagggagtacacggcccaaccaacgtagctaccggagaaagc aagactttcccaaaccaaatagctccgggtttcttctccggcaacccgtcagtttttgtg tggccggacaaaaattcgcaccctcagtctaattgaaaggtcgggctccgagctctaggc gtttgcgcatgtaatattgcatcccctcccatagataatactgcgcgaacacagggtgca aattatgatgaccacacatgccagtgaccaaaacagttttttagtctttaaaaaccctcg gaacttctgagtatataaaggcttctcatttcctacaagcaaacaaagaagaaacttcca ctttctaactttttatctatagactttagagttacaaccaacgaacaataacaaaatggt taaagtcacagtttgcggagcagccggaggtattgggcagccattatctctgatgttcaa attgaacccatatgttaccactttggcactttacgatgtcgtaaatgttccaggagtcgg taaggatctatcccacattgacacagacactaaacttgagtcttacttgcccgagaacga cggactggagaaggccttgacgggttctgacttggtcatcatcccagctggtgttccaag aaagcctggtatgaccagagacgatttgtttgccattaacgctggtatcatcagagactt ggctaatggaatcgctcagtttgccccatcggctttcgttcttgtcatttctaaccctgt
caactccacggttcctattgtcgcagagattcttaagaaga
AA
MQDGNAIERPHSVAVDVPDGLQQPHHTRTFSTGSNSGFIDPFSEPSQYYGPATDIARASTRSKIGRTRTLSAVEN RFNRPMVHEDDAFSKVHLQRRGSNDTLVNRRFFISDIDETLRDLLKNEDSDΞNCQITIEDTGPKVLRVGTANSNG
YRQTDIRGTYMLSNLLQELT IAKRFGRNQMVLDEARLNENPVKRLRRL ISHΞFWKNLTRQVGSGNVAE IAQDTKI DTPDGRIPRIYVPHDDPEQYEFYLE IARKNEAVGGARLDVRYLNEVIDPEYVKS INGTPGLLALATHKVPDGQGG ITLEGYPYWPGGRFNELYGWDSYMMALGLLKDGMLDLARGMVENF IYE IRHYGMI LNANRSYYLGRSQPPFLTD
FGIMVYEAIWEHAHQNEEDTEKSLQDAEYFLRRTFCAAICEYKTWJCCEPRLDKTTGLSCYHPQGVGIPPETEST HFTTLLTPYAKKHGVSIEEFYELYNNHTIVEPELDE YFLHDRAVRESGHDTΞYRLEGICASLATVDLNALLYKYE
VDIAKI IKEHFNDELVTENSVEHSAEWTKRAELRKERMDKYLWNEEEGIYFDYNLKLKKQHRYESVTTFWPMWAG CSSQEQAQLMVDKSLSKFEEFGGLVAGTLASRGRVGLERPSRQWDYPYGWAPQQILAWVGLDKYGFRGHTKRLCY RWLYLMTKAFVDYNGIWEKYDVTTGTSPHKVDAEYGNQGADFKGVATEGFGWVNASF ILGLTYLDVQGIRAIGA VTSPDVFFRKLKPWERASYGLRPCNE IVKSN
14 β-mannosyltransferases
3 Pichia pastoris homologs of the AMR2 gene (chr4_0450)
The amino acid sequences of all 3 are patented (US 7465577) and published (Identification of a new family of genes involved in beta-1,2-mannosylation of glycans in Pichia pastoris and Candida albicans. Mille C, Bobrowicz P, Trinel PA, Li H, Maes E, Guerardel Y, Fradin C, Martinez-Esparza M, Davidson RC, Janbon G, Poulain D, Wildt S.)
SEQ0210
5' region taccatgaacattcgtttgggcttaccattgttgccgataactgttgcattctgtaagat ggtgatttccagcgttcgaaaagcctgtagtacccagaattttgatcaaagtgtcctctt agccgtggcatccgcggctgtcatcttgatccttcttcgcatgagcacttacgtgatgct gataagatgggcacgccacaacactagggccaagacctcgcctagtggtggcgactatct aaagggctcacctaacacctctttagcggggattgaagacattcatacacgaagtctgct gtatgggtcagacgagaaaatgccgcaatctaaagaagaacaaaggtcccacaatgcgaa tcgtgatggtgataaatcattggctcaggtcagtagatttgaaatgcatgataaacgcat atggtgagagccgttctgcacaactagatgttttcgagcttcgcattgtttcctgcagct cgactattgaattaagatttccggatatctccaatctcacaaaaacttatgttgaccacg tgctttcctgaggcgaggtgttttatatgcaagctgccaaaaatggaaaacgaatggcca tttttcgcccaggcaaattattcgattactgctgtcataaagacagtgttgcaaggctca catttttttttaggatccgagataaagtgaatacaggacagcttatctctatatcttgta ccattcgtgaatcttaagagttcggttagggggactctagttgagggttggcactcacgt atggctgggcgcagaaataaaattcaggcgcagcagcacttatcgatgcatgcaaggcga gaaaaataaagaacaaaaacacaccttgctaagaccacaacctttcaaattttgagattg ttattctttcatcctaaaacaccaccgtcctatctcttgggaacgtacatatcattgagc ttggttcattgatacatcactgtatctaactctccttttt
ORF
AGAAGTTAA
Downstream cgggcagaaagactgaactaccataaaacgtaaaccagtgtttttcaaccttttgtttta cagtcaaggctttatcttctaagtcttttatggcggagtaaacgtcaatgtccaaaatct gagctcttatgaactccaaatcgcaagctaagtcatgctttggggtgactttcaacaatc catgataagacaaatccgtacatcctagcactggagcattgtcgtagatgttggtgacat aggagtcgacctccatgtcgttgaagcttcccaatgatagcagatcatcataggcagaat ccttgcctccgtttagggagattccgccaagattattatcattccaagtaaggaggactc cgatccaaatcacagacaaaacactggcaatacagagcagcaggaagttgagtcgtgttc tcatccaacagtatcaggcacggtatagaagcggagatgaaatagaccaaccttgcttga ggctaagggaaaacaaccaagaaaaaaggcgcctgataaattgggccctcaaaaacaata caccaaggcgcacaggcacctgcattcaattgcgacacactttctcgtttaagcggaaag aggaaggtttcattccgttgtagggaagtagccacataagtgcacagcataccgtgttgt gcctcggcaatttgttggtacggtcctatacaatcttatgaagcctcctgctacgctcag aaatgccaaattgctctctgagacaataaatccgcgtagttttgaaatttgtagtgaaaa tctaggcccgctcgcaggcgagattgacattagtagctgcatagggctgaagcatactgt gcttgcatcacataatcttggaattacattcatttattagacaaaaatttaaattaagtt attttgtggagcccaaactttttgagtttttgtttttatttataactgtgaattctacga caatattaccgttttcaggccattacggtaactctagctttt
AA
MVDLFQWLKFYSMRRLGQVAITLVLLNLFVFLGYKFTPSTVIGSPSWEPAWPTVFNESYLDSLQFTDINVDSFL SDTNGRISVTCDSLAYKGLVKTSKKKELDCDMAYIRRKIFSSEEYGVLADLEAQDITEEQRIKKHWFTFYGSSVY LPEHEVHYLVRRVLFSKVGRADTPVISLLVAQLYDKDWNELTPHTLEIVNPATGNVTPQTFPQLIHVPIEWSVDD KWKGTEDPRVFLKPSKTGVSEPIVLFNLQSSLCDGKRGMFVTSPFRSDKVNLLDIEDKERPNSEKNWSPFFLDDV EVSKYSTGYVHFVYSFNPLKVIKCSLDTGACRMIYESPEEGRFGSELRGATPMVKLPVHLSLPKGKEVWVAFPRT RLRDCGCSRTTYRPVLTLFVKEGNKFYTELISSSIDFHIDVLSYDAKGESCSGSISVLIPNGIDSWDVSKKQGGK SDILTLTLSEADRNTVWHVKGLLDYLLVLNGEGPIHDSHSFKNVLSTNHFKSDTTLLNSVKAAECAIFSSRDYC KKYGETRGEPARYAKQMENERKEKEKKEKEAKEKLEAEKAEMEEAVRKAQEAIAQKEREKEEAEQEKKAQQEAKE KEAEEKAAKEKEAKENEAKKKI IVEKLAKEQEEAEKLEAKKKLYQLQEEERS*
SEQ0211 chr4_0471
5' region atgcagccccaggcgcccgttctgatggcttgatgaccgttgtattgcctgtcactatag ccaggggtagggtccataaaggaatcatagcagggaaattaaaagggcatattgatgcaa tcactcccaatggctctcttgccattgaagtctccatatcagcactaacttccaagaagg accccttcaagtctgacgtgatagagcacgcttgctctgccacctgtagtcctctcaaaa cgtcaccttgtgcatcagcaaagactttaccttgctccaatactatgacggaggcaattc tgtcaaaattctctctcagcaattcaaccaacttgaaagcaaattgctgtctcttgatga tggagacttttttccaagattgaaatgcaatgtgggacgactcaattgcttcttccagct cctcttcggttgattgaggaacttttgaaaccacaaaattggtcgttgggtcatgtacat caaaccattctgtagatttagattcgacgaaagcgttgttgatgaaggaaaaggttggat acggtttgtcggtctctttggtatggccggtggggtatgcaattgcagtagaagataatt ggacagccattgttgaaggtagagaaaaggtcagggaacttgggggttatttataccatt ttaccccacaaataacaactgaaaagtacccattccatagtgagaggtaaccgacggaaa aagacgggcccatgttctgggaccaatagaactgtgtaatccattgggactaatcaacag acgattggcaatataatgaaatagttcgttgaaaagccacgtcagctgtcttttcattaa ctttggtcggacacaacattttctactgttgtatctgtcctactttgcttatcatctgcc acagggcaagtggatttccttctcgcgcggctgggtgaaaacggttaacgtgaaatgaaa
ttagacacacaacaaatctctcatctcctctcacgtcaa
ORF
GGCTCCAAACAATACTGTCAAAGGTATGGTGAGCTTCACTAA
Downstream gccttgggggacttcaagtctttgctagaaactagatgaggtcaggccctcttatggttg tgtcccaattgggcaatttcactcacctaaaaagcatgacaattatttagcgaaataggt agtatattttccctcatctcccaagcagtttcgtttttgcatccatatctctcaaatgag cagctacgactcattagaaccagagtcaagtaggggtgagctcagtcatcagccttcgtt tctaaaacgattgagttcttttgttgctacaggaagcgccctagggaactttcgcacttt ggaaatagattttgatgaccaagagcgggagttgatattagagaggctgtccaaagtaca tgggatcaggccggccaaattgattggtgtgactaaaccattgtgtacttggacactcta ttacaaaagcgaagatgatttgaagtattacaagtcccgaagtgttagaggattctatcg agcccagaatgaaatcatcaaccgttatcagcagattgataaactcttggaaagcggtat cccattttcattattgaagaactacgataatgaagatgtgagagacggcgaccctctgaa cgtagacgaagaaacaaatctacttttggggtacaatagagaaagtgaatcaagggaggt atttgtggccataatactcaactctatcattaatgtggttcttttggtagcaaaaatctt tgttgttttgttcagttcctcactctcattgatggcttcgttagttgactccgtgatgga tttcttatctactttgatcatatatgtttctaactcttttgctgggaaaagagacaagaa tgagtatccagttggaaggtcaaggttggagcccttaggagttcttgtcttttccgtaat cataattgtctctttcatccaggtcggaaatgaatctttgaaaaagctaatcagtggcga ccgtgatgttgtttctttagataaaaccactatcagcgtc
AA
MYHLAPRKKLLIWGGSLGFVLLLLIVASSHQRIRSTILHRTPISTLPVISQEVITADYHPTLLTGFIPTDSDDSD
CADFSPSGVIYSTDKLVLHDSLKDIRDSLLKTQYKDLVTLEDEEKMNIDDILKRWYTLSGSSVWIPGMKAHLWS
RVMYLGTNGRSDPLVSFVRVQLFDPDFNELKDIALKFSDKPDGTVIFPYILPVDIPREGSRWLGPEDAKIAVNPE
TPDDPIVIFNMQNSVNRAMYGFYPFRPENKQVLFSIKDEEPRKKEKNWTPFFVPGSPTTVNFVYDLQKLTILKCS
I ITGICEKEFVSGDDGQNHGIGIFRGGSNLVPFPTSFTDKDVWVGFPKTHMESCGCSSHIYRPYLMVLVRKGDFY
YKAFVSTPLDFGIDVRSWESAESTSCQTAKNVLAVNSISNWDLLDDGLDKDYMTITLSEADWNSVLRVRGIAKF
VDNLTMDDGSTTLSTSNKIDECATTGSKQYCQRYGELH*
SEQ0212 chr1-4_0696
5' region cgctgctttctttatcatggtgcagggtcccacacagaccaaatccaagttattccctaa gggatggttttcagagagagaagaaactataatggtaaaccggttgttgcgggatgaccc ctctaagagtgatatgcacaatagacagtctgttacgccaaaagaactgtggaaagtgtt gggagactacgatctatggcctatatacgcattggccatggtattttccattccccagat accgataaagagataccttactcttactttgagggcattagagttcactaccactgagat caatctcttaacaatccctgcttcgtttctggcgggaatcatgtcaattgctatttcatt agtcagtgagttcttcaatgaaggtttgattatcggtatattgtgtcaattctggttgct tattatggttatcattgaatacacctctgtggaaaagatatctccctggggacaatatgt gttgcaactgttcgttgttggtgccccagtcccccaaccggtactaatcggtctatgttc
ccgtaactcatattcggttagaactagaacaataagtgcatcattgttcaacattgtggt tcaattgtcgaacattgctggtgcttatatctacagggaagacgataagcctttgtacaa gagaggtaacagacagttaattggtatttctttgggagtcgttgccctctacgttgtctc caagacatactacattctgagaaacagatggaagactcaaaaatgggagaagcttagtga agaagagaaagttgcctacttggacagagctgagaaggagaacctgggttctaagaggct ggactttttgttcgagagttaaactgcataattttttctaagtaaatttcatagttatga aatttctgcagcttagtgtttactgcatcgtttactgcatcaccctgtaaataatgtgag cttttttccttccattgcttggtatcttccttgctgctgttt
ORF
GAACTACAAGAAGAGCTTGAAAAGCAGAAAGATGAAGTGAAGGATACAAAGGCGAAGTAG
Downstream gccttaataacacttgagttttttggtatgaccccatcctgatcgccaaatctataatga ttttcgtggatgtgaaaagtcaaagagggagggaatttttcgtacttgtgaagtatctcc tcactcgtttcggcaaatttgtaatttttgagtttctgatgcagtactttctgttgcaat aatctttgttgttgtggagtcaagttctgagggtttatcctctggccattgctcagggtc acagacccattgattggtctttgttgagcctgaggctgtgcggtttgcatatttttcgcc cccacttggcttgactggatttgactcatttcggagtatttattagatgggggccagtgc ttcaactctgaggatttatgatacggagccttcgtgttgtgaaacttagtttaccttgag tatgtgacagacgaaaaaagatcgcgaagtgttggttccggagatgttcagcggtcacat gatcacaacgttttacctctttctatgtaaaacggtttttaccccggttcttgaaaagac tctatgtagaaaccgagctgctgtatgcgtaaggttgtatggcggaaggtaaacggagat ttcatgtttaagaggccttgcgacatctcatgtttttgatccttatattatggactatgc aaaaacggaaatgtcaatcaaatactggaaaccccatgetatctgtaagacctgtgtgtt gaaatacagtcgatgtcaataatcgtgacagcgatattagcagattaaattataaaagga ttatagacagcctagaaagatatagtagcaagaaaccgattacggaatcaatctgcccta tggtgtttaatttctgcattcttttttcggtctttcagtttcagtttcgttaaatctatt ctactattttcttaatctctccagaataccttgctcttccctttgcttctctctcccttc gttcttcgttctccgttctcctttctccgttttccgtctc
AA
MRIRSNVLLLSTAGALALVWFAVVFSWDDKSIFGIPTPGHAVASAYDSSVTLGTFNDMEVDSYVTNIYD
NAPVLGCYDLSYHGLLKVSPKHEILCDMKFIRARVLETEAYAALKDLEHKKLTEEEKIEKHWFTFYGSS
VFLPDHDVHYLVRRVVFSGEGKANRPITSILVAQIYDKNWNELNGHFLNVLNPNTGKLQHHAFPQVLPI
AVNWDRNSKYRGQEDPRVVLRRGRFGPDPLVMFNTLTQNNKLRRLFTISPFDQYKTVMYRTNAFKMQ
TTEKNWVPFFLKDDQESVHFVYSFNPLRVLNCSLDNGACDVLFELPHDFGMSSELRGATPMLNLPQAIP
MADDKEIWVSFPRTRISDCGCSETMYRPMLMLFVREGTNFF AELLSSSIDFGLEVIP YTGDGLPCSSGQS
VLIPNSIDNWEVTGSNGEDILSLTFSEADKSTSVVHIRGLYKYLSELDGYGGPEAEDEHNFQRILSDLHFD GKKTIENFKKVQSCALD AAKAYCKEYGVTRGEEDRLKNKEKERKIEEKRKKEEERKKKEEEKKKKEEE EKKKKEEEEEEEKRLKELKKKLKELQEELEKQKDEVKDTKAK*
Claims
1. An isolated nucleic acid which encodes a protein as set forth in any one of SEQ0001-0025, 0027-0126, 0128-0165, 0172-0174, 0176-0200, or 0202-0212, or a protein substantially homologous thereto.
2. The isolated nucleic acid of claim 1, comprising a nucleotide sequence that is identical or substantially homologous to the coding sequence of an ORF as set forth in any one of SEQ0001-0025, 0027-0126, 0128-0165, 0172-0174, 0176-0200, or 0202-0212.
3. An isolated protein comprising an amino acid sequence as set forth in any one of SEQ0001-0025, 0027-0126, 0128-0165, 0172-0174, 0176-0200, or 0202-0209, or an amino acid sequence substantially homologous thereto.
4. An isolated nucleic acid, which encodes a signal peptide as set forth in any one of SEQ0001-0025 or 0027-0054.
5. An isolated peptide comprising an amino acid sequence identical with a signal peptide as set forth in any one of SEQ0001-0025 or 0027-0054.
6. A library of expression vectors, wherein each vector of the library encodes one and only one of the signal peptides identified in SEQ0001-0025 and 0027-0054, and each of the signal peptides identified in SEQ0001-0025 and 0027-0054 is encoded by one vector in the library, each vector comprising from 5' to 3', a promoter, the coding sequence of a signal peptide, and an intron sequence comprising a cloning site for insertion of a coding sequence of a heterologous protein.
7. A library of expression vectors, wherein each vector of the library encodes one and only one of the signal peptides identified in SEQ0001-0025 and 0027-0054, and each of the signal peptides identified in SEQ0001-0025 and 0027-0054 is encoded by one vector in the library, each vector comprising from 5' to 3', a promoter, the coding sequence of a signal peptide, fused in frame to a coding sequence of a heterologous protein.
8. An isolated promoter, comprising a promoter sequence identified in any one of SEQ0060-0085, SEQ0096-0124, SEQ0125-0128, or SEQ0169-0178.
9. An expression vector comprising the isolated promoter of claim 8, operably linked to a heterologous coding sequence.
10. An expression vector for enhanced expression of a glycosylation precursor synthesis enzyme or transporter, comprising, from 5' to 3', a promoter, operably linked to a coding sequence as set forth in any one of SEQ0129-0132, and a transcription termination sequence.
11. The expression vector of claim 10, wherein the protein encoded by the coding sequence is modified to include an ER or Golgi localization signal.
12. A genetically engineered methylotrophic yeast strain, capable of overexpressing a glycosylation precursor synthesis enzyme or transporter as set forth in any one of SEQ0129-0132.
13. A Pichia pastoris strain, in which at least one of the genes encoding a mannosyl transferase as set forth in SEQ0133-0137 or 0210-0212 has been inactivated.
14. A Pichia pastoris strain, in which at least one of the genes as set forth in SEQ0153-0165 has been inactivated.
15. A genetically engineered methylotrophic yeast strain, capable of overexpressing any one of the genes set forth in SEQ0166-0168.
16. A Pichia pastoris strain, in which at least one of the protease-encoding genes as set forth in SEQ0179 and SEQ0181-0186 has been inactivated.
17. A genetically engineered methylotrophic yeast strain, capable of overexpressing the protease inhibitor as set forth in SEQ0180.
18. An expression vector for overexpression of chaperones, comprising, from 5' to 3', a promoter, operably linked to a coding sequence as set forth in any one of SEQ0187-SEQ0200, and a transcription termination sequence.
19. A collection of expression vectors, each vector in said collection comprising from 5' to 3', a promoter, operably linked to a different coding sequence as set forth in any one of SEQ0187-SEQ0200, and a transcription termination sequence.
20. A genetically engineered methylotrophic yeast strain, capable of overexpressing at least one chaperone set forth in any one of SEQ0187-SEQ0200.
21. The strain of claim 20, capable of overexpressing multiple chaperones.
22. An isolated nucleic acid molecule comprising the nucleotide sequence as set forth in SEQ0200.
23. A vector for mediating multi-copy integration of a heterologous coding sequence, comprising the nucleotide sequence as set forth in SEQ0200, operably linked to an expression cassette, wherein said expression cassette comprises a promoter placed in operable linkage to said heterologous sequence and a selectable marker gene.
24. A Pichia pastoris strain, comprising an expression cassette integrated in multiple copies at native 5S rRNA loci, wherein said expression cassette comprises a promoter in operable linkage to said heterologous sequence and a selectable marker gene.
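The claims recite sequences in range notation, for example "SEQ0001-0025, 0027-0126, ..." in claim 1. Purely as a reading aid, and not as part of the claims, the following minimal sketch expands such a recitation into individual sequence numbers and lists the numbers it leaves out. The helper name `expand_seq_ranges` is hypothetical, and the upper bound of 212 is an assumption taken from the highest identifier (SEQ0212) recited above.

```python
import re


def expand_seq_ranges(text: str) -> list[int]:
    """Expand 'SEQ0001-0025, 0027-0126, ...' into a list of sequence numbers."""
    ids: list[int] = []
    # Each match is a 4-digit start, optionally followed by '-' and a 4-digit end.
    for start, end in re.findall(r"(\d{4})(?:-(\d{4}))?", text):
        first = int(start)
        last = int(end) if end else first
        ids.extend(range(first, last + 1))
    return ids


# Range recitation copied from claim 1.
claim_1 = "SEQ0001-0025, 0027-0126, 0128-0165, 0172-0174, 0176-0200, or 0202-0212"

covered = expand_seq_ranges(claim_1)
# Assumption: sequences are numbered 1..212 (highest identifier recited is SEQ0212).
excluded = sorted(set(range(1, 213)) - set(covered))

print(len(covered))  # 202
print(excluded)      # [26, 127, 166, 167, 168, 169, 170, 171, 175, 201]
```

Applying the same helper to the recitation in claim 3 shows that it stops at 0209, so SEQ0210-0212 fall outside that claim.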
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US18050209P | 2009-05-22 | 2009-05-22 | |
US61/180,502 | 2009-05-22 | | |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2010135678A1 (en) | 2010-11-25 |
Family
ID=43126528
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2010/035825 WO2010135678A1 (en) | 2009-05-22 | 2010-05-21 | Nucleic acids of pichia pastoris and use thereof for recombinant production of proteins |
Country Status (2)
Country | Link |
---|---|
US (1) | US8440456B2 (en) |
WO (1) | WO2010135678A1 (en) |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2013107905A1 (en) | 2012-01-19 | 2013-07-25 | Vib Vzw | Tools and methods for expression of membrane proteins |
JP2014530016A (en) * | 2011-10-07 | 2014-11-17 | Lonza Limited | Regulatable promoter |
WO2014204258A1 (en) * | 2013-06-21 | 2014-12-24 | 한국생명공학연구원 | Method for screening yeast strain having improved recombinant protein secretion capacity using cell wall deficient yeast mutant |
US9534039B2 (en) | 2011-05-09 | 2017-01-03 | Ablynx N.V. | Method for the production of immunoglobulin single variable domains |
EP3130598A2 (en) | 2012-10-29 | 2017-02-15 | Lonza Ltd | Expression sequences |
JP2017511148A (en) * | 2014-04-17 | 2017-04-20 | ベーリンガー インゲルハイム エルツェーファウ ゲゼルシャフト ミット ベシュレンクテル ハフツング ウント コンパニー コマンディトゲゼルシャフト | Recombinant host cells modified to overexpress helper proteins |
WO2018110616A1 (en) * | 2016-12-15 | 2018-06-21 | 株式会社カネカ | Novel host cell and production method for target protein using same |
WO2018141872A1 (en) * | 2017-02-02 | 2018-08-09 | Lallemand Hungary Liquidity Management Llc | Heterologous protease expression for improving alcoholic fermentation |
CN108699151A (en) * | 2016-02-12 | 2018-10-23 | 埃博灵克斯股份有限公司 | Method for preparing immunoglobulin single variable domains |
WO2019070246A1 (en) * | 2017-10-03 | 2019-04-11 | Bolt Threads, Inc. | Modified strains for the production of recombinant silk |
US10647975B2 (en) | 2017-10-03 | 2020-05-12 | Bolt Threads, Inc. | Modified strains for the production of recombinant silk |
CN112391402A (en) * | 2020-11-17 | 2021-02-23 | 华中科技大学 | Method for improving expression level of target protein in yarrowia lipolytica |
KR20230011481A (en) * | 2017-03-10 | 2023-01-20 | 볼트 쓰레즈, 인크. | Compositions and methods for producing high secreted yields of recombinant proteins |
US11634729B2 (en) * | 2018-05-17 | 2023-04-25 | Bolt Threads, Inc. | SEC modified strains for improved secretion of recombinant proteins |
CN117467695A (en) * | 2023-12-27 | 2024-01-30 | 南京鸿瑞杰生物医疗科技有限公司 | Method for improving expression quantity of exogenous protein by over-expressing pichia pastoris molecular chaperone |
US12122810B2 (en) | 2022-05-23 | 2024-10-22 | Bolt Threads, Inc. | Compositions and methods for producing high secreted yields of recombinant proteins |
Families Citing this family (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103003438B (en) * | 2010-02-10 | 2016-01-20 | 拜康有限公司 | Method of reducing glycosylation of proteins, processes and proteins thereof |
US20120102054A1 (en) * | 2010-10-25 | 2012-04-26 | Life Technologies Corporation | Systems and Methods for Annotating Biomolecule Data |
WO2013055822A2 (en) | 2011-10-11 | 2013-04-18 | Life Technologies Corporation | Systems and methods for analysis and interpretation of nucleic acid sequence data |
EP2771477A4 (en) * | 2011-10-27 | 2015-04-22 | Merck Sharp & Dohme | Controlling o-glycosylation in lower eukaryotes |
US9862958B2 (en) * | 2012-10-10 | 2018-01-09 | Biocon Limited | Nucleotide sequence and a process thereof |
EP2931895A4 (en) * | 2012-12-17 | 2016-08-10 | Merck Sharp & Dohme | Pmt2, och1, pmt5 mutant cells |
US9150870B2 (en) | 2013-03-15 | 2015-10-06 | Lonza Ltd. | Constitutive promoter |
SG11201507557UA (en) * | 2013-03-15 | 2015-10-29 | Lonza Ag | Constitutive promoter |
EP3027752B1 (en) * | 2013-07-01 | 2018-08-08 | Biocon Limited | Signal sequence for protein expression in pichia pastoris |
SI3016970T1 (en) | 2013-07-04 | 2019-08-30 | Glykos Finland Oy | O-mannosyltransferase deficient filamentous fungal cells and methods of use thereof |
KR102144998B1 (en) | 2013-08-30 | 2020-08-14 | 삼성전자주식회사 | A polypeptide confering acid tolerant property to a yeast cell, polynucleotide encoding the same, a yeast cell having increased amount of the polypeptide, a method of producing a product using the yeast cell and a method of producing acid tolerant yeast cell |
US10113164B2 (en) | 2013-12-23 | 2018-10-30 | Research Corporation Technologies, Inc. | Pichia pastoris surface display system |
WO2015158800A1 (en) | 2014-04-17 | 2015-10-22 | Boehringer Ingelheim Rcv Gmbh & Co Kg | Recombinant host cell for expressing proteins of interest |
WO2016012468A1 (en) | 2014-07-21 | 2016-01-28 | Novartis Ag | Production of glycoproteins with mammalian-like n-glycans in filamentous fungi |
CN106075445B (en) * | 2016-05-07 | 2019-08-13 | 上海大学 | The new application of tRF-Leu-CAG |
WO2018013551A1 (en) * | 2016-07-11 | 2018-01-18 | Massachusetts Institute Of Technology | Tools for next generation komagataella (pichia) engineering |
CA3187918A1 (en) | 2021-07-30 | 2023-01-30 | Helaina, Inc. | Methods and compositions for protein synthesis and secretion |
WO2024126811A1 (en) * | 2022-12-16 | 2024-06-20 | Boehringer Ingelheim Rcv Gmbh & Co Kg | Means and methods for increased protein expression by use of a combination of transport proteins and either chaperones or transcription factors |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6638735B1 (en) * | 1998-05-07 | 2003-10-28 | Doosan Corporation | Plasmid for gene expression in pichia ciferrii and transformation method using the same |
US20090042264A1 (en) * | 2005-01-14 | 2009-02-12 | Archer-Daniels-Midland Company | Compositions and methods for manipulating carbon flux in cells |
2010
- 2010-05-21 US US12/785,286 patent/US8440456B2/en active Active
- 2010-05-21 WO PCT/US2010/035825 patent/WO2010135678A1/en active Application Filing
Cited By (36)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3590950A1 (en) * | 2011-05-09 | 2020-01-08 | Ablynx NV | Method for the production of immunoglobulin single varible domains |
US9534039B2 (en) | 2011-05-09 | 2017-01-03 | Ablynx N.V. | Method for the production of immunoglobulin single variable domains |
EP2707382B1 (en) * | 2011-05-09 | 2019-07-17 | Ablynx NV | Method for the production of immunoglobulin single variable domains |
US11401523B2 (en) | 2011-10-07 | 2022-08-02 | Lonza Ltd | Methods of producing promoter variants |
JP2014530016A (en) * | 2011-10-07 | 2014-11-17 | Lonza Limited | Regulatable promoter |
US9512432B2 (en) | 2011-10-07 | 2016-12-06 | Lonza Ltd. | Regulatable promoter |
US10301634B2 (en) | 2011-10-07 | 2019-05-28 | Lonza Ltd. | Regulatable promoter |
US11479791B2 (en) | 2012-01-19 | 2022-10-25 | Vib Vzw | Tools and methods for expression of membrane proteins |
WO2013107905A1 (en) | 2012-01-19 | 2013-07-25 | Vib Vzw | Tools and methods for expression of membrane proteins |
US9890217B2 (en) | 2012-01-19 | 2018-02-13 | Vib Vzw | Tools and methods for expression of membrane proteins |
US10160988B2 (en) | 2012-10-29 | 2018-12-25 | Lonza Ltd | Expression sequences |
US11359223B2 (en) | 2012-10-29 | 2022-06-14 | Lonza Ltd | Expression sequences |
EP3130598A2 (en) | 2012-10-29 | 2017-02-15 | Lonza Ltd | Expression sequences |
WO2014204258A1 (en) * | 2013-06-21 | 2014-12-24 | 한국생명공학연구원 | Method for screening yeast strain having improved recombinant protein secretion capacity using cell wall deficient yeast mutant |
JP2020072696A (en) * | 2014-04-17 | 2020-05-14 | ベーリンガー インゲルハイム エルツェーファウ ゲゼルシャフト ミット ベシュレンクテル ハフツング ウント コンパニー コマンディトゲゼルシャフト | Recombinant host cell engineered to overexpress helper proteins |
JP2017511148A (en) * | 2014-04-17 | 2017-04-20 | ベーリンガー インゲルハイム エルツェーファウ ゲゼルシャフト ミット ベシュレンクテル ハフツング ウント コンパニー コマンディトゲゼルシャフト | Recombinant host cells modified to overexpress helper proteins |
JP7062633B2 (en) | 2014-04-17 | 2022-05-06 | ベーリンガー インゲルハイム エルツェーファウ ゲゼルシャフト ミット ベシュレンクテル ハフツング ウント コンパニー コマンディトゲゼルシャフト | Recombinant host cells modified to overexpress helper proteins |
US10865416B2 (en) | 2014-04-17 | 2020-12-15 | Boehringer Ingelheim Rcv Gmbh & Co Kg | Recombinant host cell engineered to overexpress helper proteins |
CN108699151B (en) * | 2016-02-12 | 2022-08-12 | 埃博灵克斯股份有限公司 | Method for preparing immunoglobulin single variable domains |
CN108699151A (en) * | 2016-02-12 | 2018-10-23 | 埃博灵克斯股份有限公司 | Method for preparing immunoglobulin single variable domains |
US11851480B2 (en) | 2016-02-12 | 2023-12-26 | Ablynx N.V. | Method for the production of immunoglobulin single variable domains |
US10975141B2 (en) * | 2016-02-12 | 2021-04-13 | Ablynx N.V. | Method for the production of immunoglobulin single variable domains |
CN110088279A (en) * | 2016-12-15 | 2019-08-02 | 株式会社钟化 | Novel host cell and method for producing target protein using same |
WO2018110616A1 (en) * | 2016-12-15 | 2018-06-21 | 株式会社カネカ | Novel host cell and production method for target protein using same |
US11365419B2 (en) | 2016-12-15 | 2022-06-21 | Kaneka Corporation | Host cell and method for producing target protein using same |
WO2018141872A1 (en) * | 2017-02-02 | 2018-08-09 | Lallemand Hungary Liquidity Management Llc | Heterologous protease expression for improving alcoholic fermentation |
KR102618002B1 (en) * | 2017-03-10 | 2023-12-27 | 볼트 쓰레즈, 인크. | Compositions and methods for producing high secreted yields of recombinant proteins |
KR20230011481A (en) * | 2017-03-10 | 2023-01-20 | 볼트 쓰레즈, 인크. | Compositions and methods for producing high secreted yields of recombinant proteins |
US11214785B2 (en) | 2017-10-03 | 2022-01-04 | Bolt Threads, Inc. | Modified strains for the production of recombinant silk |
WO2019070246A1 (en) * | 2017-10-03 | 2019-04-11 | Bolt Threads, Inc. | Modified strains for the production of recombinant silk |
US10647975B2 (en) | 2017-10-03 | 2020-05-12 | Bolt Threads, Inc. | Modified strains for the production of recombinant silk |
US11634729B2 (en) * | 2018-05-17 | 2023-04-25 | Bolt Threads, Inc. | SEC modified strains for improved secretion of recombinant proteins |
CN112391402A (en) * | 2020-11-17 | 2021-02-23 | Huazhong University of Science and Technology | Method for improving expression level of target protein in Yarrowia lipolytica |
US12122810B2 (en) | 2022-05-23 | 2024-10-22 | Bolt Threads, Inc. | Compositions and methods for producing high secreted yields of recombinant proteins |
CN117467695A (en) * | 2023-12-27 | 2024-01-30 | Nanjing Hongruijie Biomedical Technology Co., Ltd. | Method for improving expression quantity of exogenous protein by over-expressing Pichia pastoris molecular chaperone |
CN117467695B (en) * | 2023-12-27 | 2024-05-03 | Nanjing Hongruijie Biomedical Technology Co., Ltd. | Method for improving secretion of reporter protein by over-expressing Pichia pastoris molecular chaperones |
Also Published As
Publication number | Publication date |
---|---|
US20110021378A1 (en) | 2011-01-27 |
US8440456B2 (en) | 2013-05-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2010135678A1 (en) | Nucleic acids of pichia pastoris and use thereof for recombinant production of proteins | |
AU2005238308B8 (en) | Methods for reducing or eliminating alpha-mannosidase resistant glycans in the production of glycoproteins | |
EP2912162B1 (en) | Pichia pastoris strains for producing predominantly homogeneous glycan structure | |
AU2015292421A1 (en) | Promoters derived from Yarrowia lipolytica and Arxula adeninivorans, and methods of use thereof | |
US11795487B2 (en) | Fungal cell with improved protein production capacity | |
US20220220161A1 (en) | Constitutive Yeast LLP Promotor-Based Expression Systems | |
US9023637B2 (en) | Enhanced citric acid production in Aspergillus with inactivated asparagine-linked glycosylation protein 3 (Alg3), and/or increased LaeA expression | |
JP2005515749A (en) | Methods and compositions for high efficiency production of heterologous proteins in yeast | |
JP2014532406A (en) | Engineered lower eukaryotic host strains for recombinant protein expression | |
Sibirny et al. | Genetic engineering of nonconventional yeasts for the production of valuable compounds | |
Aguiar et al. | Investigation of protein secretion and secretion stress in Ashbya gossypii | |
US9909152B2 (en) | Enhanced itaconic acid production in Aspergillus with increased LaeA expression | |
US9206450B2 (en) | Enhanced citric acid production in aspergillus with inactivated asparagine-linked glycosylation protein 3 (ALG3), and/or increased laeA expression | |
Cheon et al. | New selectable host–marker systems for multiple genetic manipulations based on TRP1, MET2 and ADE2 in the methylotrophic yeast Hansenula polymorpha | |
US20190093115A1 (en) | Novel gene targeting method | |
US11414653B2 (en) | Promoter useful for high expression of a heterologous gene of interest in Aspergillus niger | |
Feng | Improving protein production in the yeast Saccharomyces cerevisiae: the role of non-coding RNAs | |
US20140308702A1 (en) | Yeast recombinant cell capable of producing gdp-fucose | |
Wiesenberger et al. | RNA degradation in fission yeast mitochondria is stimulated by a member of a new family of proteins that are conserved in lower eukaryotes | |
CA3239731A1 (en) | Improved production of secreted proteins in yeast cells | |
Aguiar | Understanding the biotechnological potential of Ashbya gossypii | |
EP4291632A1 (en) | Genetically-modified filamentous fungi for production of exogenous proteins having reduced or no n-linked glycosylation | |
Böer et al. | The MAPk ASTE11 is involved in the maintenance of cell wall integrity and in filamentation in Arxula adeninivorans, but not in adaptation to hypertonic stress |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 10778487; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | EP: PCT application non-entry in European phase | Ref document number: 10778487; Country of ref document: EP; Kind code of ref document: A1 |