CA2456955A1 - Novel nucleic acids and secreted polypeptides

Publication number
CA2456955A1
Authority
CA
Canada
Prior art keywords
polypeptide
polynucleotide
protein
cells
cell
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002456955A
Other languages
French (fr)
Inventor
Y. Tom Tang
Yonghong Yang
Zhiwei Wang
Gezhi Weng
Yunqing Ma
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nuvelo Inc
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals

Abstract

The present invention provides novel nucleic acids, novel polypeptide sequences encoded by these nucleic acids and uses thereof.

Description

JUMBO APPLICATIONS/PATENTS
THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE VOLUME.

NOTE: For additional volumes, please contact the Canadian Patent Office.

NOVEL NUCLEIC ACIDS AND SECRETED POLYPEPTIDES
1. CROSS REFERENCE TO RELATED APPLICATIONS
This application is a continuation-in-part application of U.S. Application Serial No. 09/552,317 filed April 25, 2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 784CIP, which in turn is a continuation-in-part application of U.S. Application Serial No. 09/488,725 filed January 21, 2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 784;
U.S. Application Serial No. 09/491,404 filed January 25, 2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 785;
U.S. Application Serial No. 09/560,875 filed April 27, 2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 787CIP, which in turn is a continuation-in-part application of U.S. Application Serial No. 09/496,914 filed February 03, 2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 787;
U.S. Application Serial No. 09/577,409 filed May 18, 2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 788CIP, which in turn is a continuation-in-part application of U.S. Application Serial No. 09/515,126 filed February 28, 2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 788;
U.S. Application Serial No. 09/574,454 filed May 19, 2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 789CIP, which in turn is a continuation-in-part application of U.S. Application Serial No. 09/519,705 filed March 07, 2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 789;
U.S. Application Serial No. 09/649,167 filed August 23, 2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 790CIP, which in turn is a continuation-in-part application of U.S. Application Serial No. 09/540,217 filed March 31, 2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 790;
U.S. Application Serial No. 09/770,160 filed January 26, 2001 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 791CIP, which is in turn a continuation-in-part application of U.S. Application Serial No. 09/552,929 filed April 18, 2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 791;
and U.S. Application Serial No. 09/577,408 filed May 18, 2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket No. 792; all of which are incorporated herein by reference in their entirety.
2. BACKGROUND OF THE INVENTION
2.1 TECHNICAL FIELD
The present invention provides novel polynucleotides and proteins encoded by such polynucleotides, along with uses for these polynucleotides and proteins, for example in therapeutic, diagnostic and research methods.
2.2 BACKGROUND
Technology aimed at the discovery of protein factors (including e.g., cytokines, such as lymphokines, interferons, circulating soluble factors, chemokines, and interleukins) has matured rapidly over the past decade. The now routine hybridization cloning and expression cloning techniques clone novel polynucleotides "directly" in the sense that they rely on information directly related to the discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of hybridization cloning; activity of the protein in the case of expression cloning). More recent "indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences based on the presence of a now well-recognized secretory leader sequence motif, as well as various PCR-based or low stringency hybridization-based cloning techniques, have advanced the state of the art by making available large numbers of DNA/amino acid sequences for proteins that are known to have biological activity, for example, by virtue of their secreted nature in the case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based techniques, or by virtue of structural similarity to other genes of known biological activity.
Identified polynucleotide and polypeptide sequences have numerous applications in, for example, diagnostics, forensics, gene mapping, identification of mutations responsible for genetic disorders or other traits, assessment of biodiversity, and production of many other types of data and products dependent on DNA and amino acid sequences.
3. SUMMARY OF THE INVENTION
The compositions of the present invention include novel isolated polypeptides, novel isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules, cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more epitopes present on such polypeptides, as well as hybridomas producing such antibodies.
The compositions of the present invention additionally include vectors, including expression vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such polynucleotides and cells genetically engineered to express such polynucleotides.
The present invention relates to a collection or library of at least one novel nucleic acid sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by hybridization (SBH), and in some cases, sequences obtained from one or more public databases. The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic, diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid sequences are designated as SEQ ID NO: 1-1041, or 2083-2534 and are provided in the Sequence Listing. In the nucleic acids provided in the Sequence Listing, A is adenine; C is cytosine; G is guanine; T is thymine; and N is any of the four bases or unknown. In the amino acids provided in the Sequence Listing, * corresponds to the stop codon.
The nucleic acid sequences of the present invention also include nucleic acid sequences that hybridize to the complement of SEQ ID NO: 1-1041, or 2083-2534 under stringent hybridization conditions; nucleic acid sequences which are allelic variants or species homologues of any of the nucleic acid sequences recited above; or nucleic acid sequences that encode a peptide comprising a specific domain or truncation of the peptides encoded by SEQ ID NO: 1-1041, or 2083-2534. Also included is a polynucleotide comprising a nucleotide sequence having at least 90% identity to an identifying sequence of SEQ ID NO: 1-1041, or 2083-2534 or a degenerate variant or fragment thereof. The identifying sequence can be 100 base pairs in length.
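As a rough illustration of the 90% identity threshold over a 100-base identifying sequence, the following Python sketch computes an ungapped percent identity between two equal-length sequences. The function name `percent_identity`, the ungapped position-by-position comparison, the treatment of N as a mismatch, and the example sequences are all assumptions made for illustration; the text here does not specify an alignment method.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Ungapped percent identity between two equal-length nucleotide sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length for an ungapped comparison")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    a, b = seq_a.upper(), seq_b.upper()
    # Count positions where both bases agree; N (unknown) never counts as a match.
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "N")
    return 100.0 * matches / len(a)


# Hypothetical 100-base identifying sequence and a candidate with 10 substitutions.
identifying = "ACGT" * 25
candidate = "TCGT" * 10 + "ACGT" * 15

identity = percent_identity(identifying, candidate)
meets_threshold = identity >= 90.0
print(f"identity = {identity:.1f}%, meets 90% threshold: {meets_threshold}")
```

In practice, identity is usually computed over an alignment that allows gaps, so this sketch should be read only as the simplest possible case.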
The nucleic acid sequences of the present invention also include the sequence information from the nucleic acid sequences of SEQ ID NO: 1-1041, or 2083-2534. The sequence information can be a segment of any one of SEQ ID NO: 1-1041, or 2083-2534 that uniquely identifies or represents the sequence information of SEQ ID NO: 1-1041, or 2083-2534.
A collection as used in this application can be a collection of only one polynucleotide.
The collection of sequence information or identifying information of each sequence can be provided on a nucleic acid array. In one embodiment, segments of sequence information are provided on a nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed to detect full-match or mismatch to the polynucleotide that contains the segment. The collection can also be provided in a computer-readable format.
This invention also includes the reverse or direct complement of any of the nucleic acid sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their reverse or direct complements) according to the invention have numerous applications in a variety of techniques known to those skilled in the art of molecular biology, such as use as hybridization probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing full-length genes, use for chromosome and gene mapping, use in the recombinant production of protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.
In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1041, or 2083-2534 or novel segments or parts of the nucleic acids of the invention are used as primers in expression assays that are well known in the art. In a particularly preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1041, or 2083-2534 or novel segments or parts of the nucleic acids provided herein are used in diagnostics for identifying expressed genes or, as well known in the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical mapping of the human genome.
The isolated polynucleotides of the invention include, but are not limited to, a polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-1041, or 2083-2534; a polynucleotide comprising any of the full length protein coding sequences of SEQ ID NO: 1-1041, or 2083-2534; and a polynucleotide comprising any of the nucleotide sequences of the mature protein coding sequences of SEQ ID NO: 1-1041, or 2083-2534. The polynucleotides of the present invention also include, but are not limited to, a polynucleotide that hybridizes under stringent hybridization conditions to (a) the complement of any one of the nucleotide sequences set forth in SEQ ID NO: 1-1041, or 2083-2534; (b) a nucleotide sequence encoding any one of the amino acid sequences set forth in SEQ ID NO: 1-1041, or 2083-2534; (c) a polynucleotide which is an allelic variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog (e.g. orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an amino acid sequence set forth in SEQ ID NO: 1042-2082, or 2535-2986, or Tables 3, 5, 6, or 8.
The isolated polypeptides of the invention include, but are not limited to, a polypeptide comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding full length or mature protein. Polypeptides of the invention also include polypeptides with biological activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence set forth in SEQ ID NO: 1-1041, or 2083-2534; or (b) polynucleotides that hybridize to the complement of the polynucleotides of (a) under stringent hybridization conditions. Biologically active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% amino acid sequence identity) that preferably retain biological activity are also contemplated. The polypeptides of the invention may be wholly or partially chemically synthesized but are preferably produced by recombinant means using the genetically engineered cells (e.g. host cells) of the invention.
The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.
The invention also provides host cells transformed or transfected with a polynucleotide of the invention.
The invention also relates to methods for producing a polypeptide of the invention comprising growing a culture of the host cells of the invention in a suitable culture medium under conditions permitting expression of the desired polypeptide, and purifying the polypeptide from the culture or from the host cells. Preferred embodiments include those in which the protein produced by such processes is a mature form of the protein.
Polynucleotides according to the invention have numerous applications in a variety of techniques known to those skilled in the art of molecular biology. These techniques include use as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample using, e.g., in situ hybridization.
In other exemplary embodiments, the polynucleotides are used in diagnostics as expressed sequence tags for identifying expressed genes or, as well known in the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical mapping of the human genome.
The polypeptides according to the invention can be used in a variety of conventional procedures and methods that are currently applied to other proteins. For example, a polypeptide of the invention can be used to generate an antibody that specifically binds the polypeptide. Such antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight markers, and as a food supplement.
Methods are also provided for preventing, treating, or ameliorating a medical condition which comprises the step of administering to a mammalian subject a therapeutically effective amount of a composition comprising a polypeptide of the present invention and a pharmaceutically acceptable carrier.
In particular, the polypeptides and polynucleotides of the invention can be utilized, for example, in methods for the prevention and/or treatment of disorders involving aberrant protein expression or biological activity.
The present invention further relates to methods for detecting the presence of the polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the identification of subjects exhibiting a predisposition to such conditions.
The invention provides a method for detecting the polynucleotides of the invention in a sample, comprising contacting the sample with a compound that binds to and forms a complex with the polynucleotide of interest for a period sufficient to form the complex and under conditions sufficient to form a complex and detecting the complex such that if a complex is detected, the polynucleotide of interest is detected. The invention also provides a method for detecting the polypeptides of the invention in a sample comprising contacting the sample with a compound that binds to and forms a complex with the polypeptide under conditions and for a period sufficient to form the complex and detecting the formation of the complex such that if a complex is formed, the polypeptide is detected.
The invention also provides kits comprising polynucleotide probes and/or monoclonal antibodies, and optionally quantitative standards, for carrying out methods of the invention. Furthermore, the invention provides methods for evaluating the efficacy of drugs, and monitoring the progress of patients, involved in clinical trials for the treatment of disorders as recited above.
The invention also provides methods for the identification of compounds that modulate (i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides of the invention. Such methods can be utilized, for example, for the identification of compounds that can ameliorate symptoms of disorders as recited herein.
Such methods can include, but are not limited to, assays for identifying compounds and other substances that interact with (e.g., bind to) the polypeptides of the invention. The invention provides a method for identifying a compound that binds to the polypeptides of the invention comprising contacting the compound with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a reporter gene sequence in the cell; and detecting the complex by detecting the reporter gene sequence expression such that if expression of the reporter gene is detected the compound that binds to a polypeptide of the invention is identified.
The methods of the invention also provide methods for treatment which involve the administration of the polynucleotides or polypeptides of the invention to individuals exhibiting symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or disorders as recited herein comprising administering compounds and other substances that modulate the overall activity of the target gene products. Compounds and other substances can affect such modulation either on the level of target gene/protein expression or target protein activity.
The polypeptides of the present invention and the polynucleotides encoding them are also useful for the same functions known to one of skill in the art as the polypeptides and polynucleotides to which they have homology (set forth in Table 2); for which they have a signature region (as set forth in Table 3); or for which they have homology to a gene family (as set forth in Table 4). If no homology is set forth for a sequence, then the polypeptides and polynucleotides of the present invention are useful for a variety of applications, as described herein, including use in arrays for detection.
4. DETAILED DESCRIPTION OF THE INVENTION
4.1 DEFINITIONS
It must be noted that as used herein and in the appended claims, the singular forms "a", "an" and "the" include plural references unless the context clearly dictates otherwise.

The term "active" refers to those forms of the polypeptide which retain the biologic and/or immunologic activities of any naturally occurring polypeptide. According to the invention, the terms "biologically active" or "biological activity" refer to a protein or peptide having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of the natural, recombinant or synthetic polypeptide to induce a specific immune response in appropriate animals or cells and to bind with specific antibodies.
The term "activated cells" as used in this application refers to those cells which are engaged in extracellular or intracellular membrane trafficking, including the export of secretory or enzymatic molecules as part of a normal or disease process.
The terms "complementary" or "complementarity" refer to the natural binding of polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules may be "partial" such that only certain portion(s) of the nucleic acids bind or it may be "complete" such that total complementarity exists between the single-stranded molecules. The degree of complementarity between the nucleic acid strands has significant effects on the efficiency and strength of the hybridization between the nucleic acid strands.
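The base-pairing example above (5'-AGT-3' pairing with 3'-TCA-5') can be sketched in Python as follows. The names `complement`, `reverse_complement` and `is_fully_complementary` are illustrative and not taken from the application; the pairing table assumes standard Watson-Crick DNA pairing, with N passed through unchanged.

```python
# Standard Watson-Crick DNA pairing; N (any base or unknown) maps to N.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C", "N": "N"}


def complement(seq: str) -> str:
    """Base-by-base complement, read in the opposite orientation (3'->5')."""
    return "".join(PAIRING[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Complementary strand written in the conventional 5'->3' direction."""
    return complement(seq)[::-1]


def is_fully_complementary(strand_5to3: str, other_5to3: str) -> bool:
    """True when two strands, both written 5'->3', pair at every position."""
    return reverse_complement(strand_5to3) == other_5to3.upper()


print(complement("AGT"))          # partner of 5'-AGT-3' read 3'->5'
print(reverse_complement("AGT"))  # same partner strand written 5'->3'
```

Here `complement("AGT")` gives the 3'-TCA-5' reading used in the text, while `reverse_complement("AGT")` gives the same strand in 5'-to-3' notation.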
The term "embryonic stem cells (ES)" refers to a cell that can give rise to many differentiated cell types in an embryo or an adult, including the germ cells.
The term "germ line stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady and continuous source of germ cells for the production of gametes. The term "primordial germ cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells not only populate the germ line and give rise to a plurality of terminally differentiated cells that comprise the adult specialized organs, but are able to regenerate themselves.
The term "expression modulating fragment," EMF, means a series of nucleotides which modulates the expression of an operably linked ORF or another EMF.
As used herein, a sequence is said to "modulate the expression of an operably linked sequence" when the expression of the sequence is altered by the presence of the EMF.
EMFs include, but are not limited to, promoters, and promoter modulating sequences (inducible elements). One class of EMFs are nucleic acid fragments which induce the expression of an operably linked ORF in response to a specific regulatory factor or physiological event.
The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or "oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic origin which may be single-stranded or double-stranded and may represent the sense or the antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G, or T (U) or unknown. It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences provided herein is substituted with U (uracil).
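A minimal Python sketch of the T-to-U substitution described above is given here. The function name `to_rna` and the input validation are illustrative assumptions; the alphabet is the A, C, G, T, N notation used in the sequences herein.

```python
DNA_ALPHABET = set("ACGTN")  # notation used for the sequences in this document


def to_rna(dna: str) -> str:
    """Return the RNA form of a DNA sequence by substituting U (uracil) for T (thymine)."""
    dna = dna.upper()
    unexpected = set(dna) - DNA_ALPHABET
    if unexpected:
        raise ValueError(f"unexpected symbols in sequence: {sorted(unexpected)}")
    return dna.replace("T", "U")


print(to_rna("ATGCTTN"))
```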
Generally, nucleic acid segments provided by this invention may be assembled from fragments of the genome and short oligonucleotide linkers, or from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic acid which is capable of being expressed in a recombinant transcriptional unit comprising regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.
The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or "segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides, more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and most preferably at least about 17 nucleotides. The fragment is preferably less than about 500 nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100 nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30 nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides, preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30 nucleotides and most preferably from about 20 to 25 nucleotides.
Preferably the fragments can be used in polymerase chain reaction (PCR), various hybridization procedures or microarray procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A fragment or segment may uniquely identify each polynucleotide sequence of the present invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ ID NO: 1-1041, or 2083-2534.
Probes may, for example, be used to determine whether specific mRNA molecules are present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250).
They may be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the art. Probes of the present invention, their preparation and/or labeling are elaborated in Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John Wiley & Sons, New York NY, both of which are incorporated herein by reference in their entirety.
The nucleic acid sequences of the present invention also include the sequence information from the nucleic acid sequences of SEQ ID NO: 1-1041, or 2083-2534. The sequence information can be a segment of any one of SEQ ID NO: 1-1041, or 2083-2534 that uniquely identifies or represents the sequence information of that sequence of SEQ ID NO: 1-1041, or 2083-2534, or those segments identified in Tables 3, 5, 6, and 8. One such segment can be a twenty-mer nucleic acid sequence because the probability that a twenty-mer is fully matched in the human genome is 1 in 300. In the human genome, there are three billion base pairs in one set of chromosomes. Because 4^20 possible twenty-mers exist, there are 300 times more twenty-mers than there are base pairs in a set of human chromosomes.
Using the same analysis, the probability for a seventeen-mer to be fully matched in the human genome is approximately 1 in 5. When these segments are used in arrays for expression studies, fifteen-mer segments can be used. The probability that the fifteen-mer is fully matched in the expressed sequences is also approximately one in five because expressed sequences comprise less than approximately 5% of the entire genome sequence.
Similarly, when using sequence information for detecting a single mismatch, a segment can be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome with a single mismatch is calculated by multiplying the probability for a full match (1/4^25) times the increased probability for mismatch at each nucleotide position (3 x 25). The probability that an eighteen mer with a single mismatch can be detected in an array for expression studies is approximately one in five. The probability that a twenty-mer with a single mismatch can be detected in a human genome is approximately one in five.
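The order-of-magnitude arithmetic in the preceding paragraphs can be reproduced with a short Python sketch. The genome size of three billion base pairs and the per-position mismatch factor of 3 x k come from the text; the function names, the use of exactly 5% for the expressed fraction (the text says "less than approximately 5%"), and the "1 in N" framing as the reciprocal of the expected number of chance matches are assumptions made for illustration.

```python
GENOME_BP = 3_000_000_000           # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05           # assumed upper bound; the text says "less than approximately 5%"
EXPRESSED_BP = GENOME_BP * EXPRESSED_FRACTION


def expected_full_matches(k: int, target_bp: float = GENOME_BP) -> float:
    """Expected number of chance full matches of one k-mer in a target of target_bp bases."""
    return target_bp / 4 ** k


def expected_single_mismatch_matches(k: int, target_bp: float = GENOME_BP) -> float:
    """Same, allowing one mismatch: full-match probability (1/4^k) times 3 x k variants."""
    return target_bp * (3 * k) / 4 ** k


def one_in(expected: float) -> float:
    """Express an expected match count as '1 in N'."""
    return 1.0 / expected


cases = {
    "20-mer, full match, genome": one_in(expected_full_matches(20)),
    "17-mer, full match, genome": one_in(expected_full_matches(17)),
    "15-mer, full match, expressed": one_in(expected_full_matches(15, EXPRESSED_BP)),
    "18-mer, one mismatch, expressed": one_in(expected_single_mismatch_matches(18, EXPRESSED_BP)),
    "20-mer, one mismatch, genome": one_in(expected_single_mismatch_matches(20)),
}
for label, n in cases.items():
    print(f"{label}: about 1 in {n:.1f}")
```

Under these assumptions the computed values land in the same range as the rounded figures quoted in the text (a few hundred for the twenty-mer, single digits for the other cases), though they do not match the rounded figures exactly.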
The term "open reading frame," ORF, means a series of nucleotide triplets coding for amino acids without any termination codons and is a sequence translatable into protein.
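To make the definition concrete, the following Python sketch tests whether a sequence is a run of triplets free of termination codons and finds the longest such run in a given frame. The standard genetic code stop codons (TAA, TAG, TGA) are assumed, and the function names and example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumes the standard genetic code


def codons(seq: str) -> list[str]:
    """Split a sequence into complete triplets, dropping any trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def is_open_reading_frame(seq: str) -> bool:
    """True if seq is a non-empty whole number of triplets with no termination codon."""
    if not seq or len(seq) % 3 != 0:
        return False
    return not any(codon in STOP_CODONS for codon in codons(seq))


def longest_orf(seq: str, frame: int = 0) -> str:
    """Longest stop-free run of triplets in the given reading frame (0, 1 or 2)."""
    best: list[str] = []
    current: list[str] = []
    for codon in codons(seq[frame:]):
        if codon in STOP_CODONS:
            current = []            # a termination codon ends the current run
        else:
            current.append(codon)
            if len(current) > len(best):
                best = list(current)
    return "".join(best)


example = "ATGAAATAGGGGCCCTTTTGA"   # hypothetical sequence
print(is_open_reading_frame("ATGAAA"), longest_orf(example))
```

This follows the definition as worded in the text, which does not require a start codon; gene-finding tools typically add that requirement.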
The terms "operably linked" or "operably associated" refer to functionally related nucleic acid sequences. For example, a promoter is operably associated or operably linked with a coding sequence if the promoter controls the transcription of the coding sequence.

While operably linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic elements e.g. repressor genes are not contiguously linked to the coding sequence but still control transcription/translation of the coding sequence.
The term "pluripotent" refers to the capability of a cell to differentiate into a number of differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its differentiation capability in comparison to a totipotent cell.
The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide, peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more preferably at least about 9 amino acids and most preferably at least about 17 or more amino acids. The peptide preferably is not greater than about 200 amino acids, more preferably less than 150 amino acids and most preferably less than 100 amino acids.
Preferably the peptide is from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient length to display biological and/or immunological activity.
The term "naturally occurring polypeptide" refers to polypeptides produced by cells that have not been genetically engineered and specifically contemplates various polypeptides arising from post-translational modifications of the polypeptide including, but not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation and acylation.
The term "translated protein coding portion" means a sequence which encodes for the full-length protein which may include any leader sequence or any processing sequence.
The term "mature protein coding sequence" means a sequence which encodes a peptide or protein without a signal or leader sequence. The "mature protein portion" means that portion of the protein which does not include a signal or leader sequence. The peptide may have been produced by processing in the cell which removes any leader/signal sequence. The mature protein portion may or may not include the initial methionine residue.
The methionine residue may be removed from the protein during processing in the cell. The peptide may be produced synthetically or the protein may have been produced using a polynucleotide only encoding for the mature protein coding sequence.
The term "derivative" refers to polypeptides chemically modified by such techniques as ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer attachment such as pegylation (derivatization with polyethylene glycol) and insertion or substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur in human proteins.
The term "variant" (or "analog") refers to any polypeptide differing from naturally occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g., recombinant DNA techniques. Guidance in determining which amino acid residues may be replaced, added or deleted without abolishing activities of interest may be found by comparing the sequence of the particular polypeptide with that of homologous peptides and minimizing the number of amino acid sequence changes made in regions of high homology (conserved regions) or by replacing amino acids with consensus sequence.
Alternatively, recombinant variants encoding these same or similar polypeptides may be synthesized or selected by making use of the "redundancy" in the genetic code. Various codon substitutions, such as the silent changes which produce various restriction sites, may be introduced to optimize cloning into a plasmid or viral vector or expression in a particular prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in the polypeptide or domains of other peptides added to the polypeptide to modify the properties of any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover rate.
Preferably, amino acid "substitutions" are the result of replacing one amino acid with another amino acid having similar structural and/or chemical properties, i.e., conservative amino acid replacements. "Conservative" amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or "deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10 amino acids. The variation allowed may be experimentally determined by systematically making insertions, deletions, or substitutions of amino acids in a polypeptide molecule using recombinant DNA techniques and assaying the resulting recombinant variants for activity.
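The residue groupings listed above can be expressed as a simple lookup. The following is a minimal illustrative sketch, not part of the disclosed methods: it assumes one-letter amino acid codes, and the names `CLASSES`, `residue_class`, and `is_conservative` are hypothetical. It treats a substitution as conservative only when both residues fall in the same listed group, which is a simplification of the broader similarity criteria (polarity, charge, solubility, and so on) named in the text.

```python
# Residue groups exactly as enumerated in the text, in one-letter codes.
# The dictionary and function names are illustrative assumptions.
CLASSES = {
    "nonpolar": set("ALIVPFWM"),       # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),   # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),               # Arg, Lys, His
    "acidic": set("DE"),               # Asp, Glu
}


def residue_class(aa: str) -> str:
    """Return the name of the listed group that contains residue `aa`."""
    for name, members in CLASSES.items():
        if aa.upper() in members:
            return name
    raise ValueError(f"unknown residue: {aa!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues belong to the same listed group."""
    return residue_class(original) == residue_class(replacement)
```

For instance, under this simplification a leucine-to-isoleucine change would be classed as conservative, while leucine-to-aspartic acid would not. Whether a given variant actually retains activity still has to be determined by assay, as the text states.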
Alternatively, where alteration of function is desired, insertions, deletions or non-conservative alterations can be engineered to produce altered polypeptides. Such alterations can, for example, alter one or more of the biological functions or biochemical characteristics of the polypeptides of the invention. For example, such alterations may change polypeptide characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover rate. Further, such alterations can be selected so as to generate polypeptides that are better suited for expression, scale up and the like in the host cells chosen for expression. For example, cysteine residues can be deleted or substituted with another amino acid residue in order to eliminate disulfide bridges.
The terms "purified" or "substantially purified" as used herein denote that the indicated nucleic acid or polypeptide is present in the substantial absence of other biological macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more preferably at least 99% by weight, of the indicated biological macromolecules present (but water, buffers, and other small molecules, especially molecules having a molecular weight of less than 1000 daltons, can be present).
The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or polypeptides present in their natural source.
The term "recombinant," when used herein to refer to a polypeptide or protein, means that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian) expression systems. "Microbial" refers to recombinant polypeptides or proteins made in bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" defines a polypeptide or protein essentially free of native endogenous substances and unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or proteins expressed in yeast will have a glycosylation pattern in general different from those expressed in mammalian cells.
The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural or coding sequence which is transcribed into mRNA and translated into protein, and (3) appropriate transcription initiation and termination sequences.
Structural units intended for use in yeast or eukaryotic expression systems preferably include a leader sequence enabling extracellular secretion of translated protein by a host cell.
Alternatively, where recombinant protein is expressed without a leader or transport sequence, it may include an amino terminal methionine residue. This residue may or may not be subsequently cleaved from the expressed recombinant protein to provide a final product.
The term "recombinant expression system" means host cells which have stably integrated a recombinant transcriptional unit into chromosomal DNA or carry the recombinant transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will express heterologous polypeptides or proteins upon induction of the regulatory elements linked to the DNA segment or synthetic gene to be expressed. This term also means host cells which have stably integrated a recombinant genetic element or elements having a regulatory role in gene expression, for example, promoters or enhancers.
Recombinant expression systems as defined herein will express polypeptides or proteins endogenous to the cell upon induction of the regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells can be prokaryotic or eukaryotic.
The term "secreted" includes a protein that is transported across or through a membrane, including transport as a result of signal sequences in its amino acid sequence when it is expressed in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly (e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed. "Secreted" proteins also include without limitation proteins that are transported across the membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include proteins containing non-typical signal sequences (e.g. Interleukin-1 Beta, see Krasney, P.A. and Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g. Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1995) Annu. Rev. Immunol. 16:27-55).
Where desired, an expression vector may be designed to contain a "signal or leader sequence" which will direct the polypeptide through the membrane of a cell.
Such a sequence may be naturally present on the polypeptides of the present invention or provided from heterologous protein sources by recombinant DNA techniques.

The term "stringent" is used to refer to conditions that are commonly understood in the art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at 65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e., washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are described herein in the examples.
In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for 14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20-base oligonucleotides), and 60°C (for 23-base oligonucleotides).
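The exemplary oligonucleotide wash temperatures just listed amount to a small table keyed on oligonucleotide length. The sketch below is illustrative only. It assumes the four lengths given in the text (14, 17, 20 and 23 bases) and deliberately refuses to interpolate for other lengths, since the text gives no rule for them. The names `WASH_TEMP_C` and `exemplary_wash_temp` are hypothetical.

```python
# Exemplary wash temperatures (degrees C) in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases, as listed in the text.
WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def exemplary_wash_temp(oligo: str) -> int:
    """Look up the exemplary wash temperature for an oligo of a listed length."""
    n = len(oligo)
    if n not in WASH_TEMP_C:
        # No interpolation: the text only gives values for four lengths.
        raise ValueError(f"no exemplary wash temperature listed for {n}-base oligos")
    return WASH_TEMP_C[n]
```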
As used herein, "substantially equivalent" or "substantially similar" can refer both to nucleotide and amino acid sequences, for example a mutant sequence, that varies from a reference sequence by one or more substitutions, deletions, or additions, the net effect of which does not result in an adverse functional dissimilarity between the reference and subject sequences. Typically, such a substantially equivalent sequence varies from one of those listed herein by no more than about 35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a substantially equivalent sequence, as compared to the corresponding reference sequence, divided by the total number of residues in the substantially equivalent sequence is about 0.35 or less). Such a sequence is said to have 65% sequence identity to the listed sequence. In one embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment, by no more than 25% (75% sequence identity); in a further variation of this embodiment, by no more than 20% (80% sequence identity); in a further variation of this embodiment, by no more than 10% (90% sequence identity); and in a further variation of this embodiment, by no more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid sequences according to the invention preferably have at least 80% sequence identity with a listed amino acid sequence, more preferably at least 85% sequence identity, more preferably at least 90% sequence identity, more preferably at least 95% sequence identity, more preferably at least 98% sequence identity, and most preferably at least 99% sequence identity.
Substantially equivalent nucleotide sequences of the invention can have lower percent sequence identities, taking into account, for example, the redundancy or degeneracy of the genetic code.
Preferably, the nucleotide sequence has at least about 65% identity, more preferably at least about 75% identity, more preferably at least about 80% sequence identity, more preferably at least 85% sequence identity, more preferably at least 90% sequence identity, more preferably at least about 95% sequence identity, more preferably at least 98% sequence identity, and most preferably at least 99% sequence identity. For the purposes of the present invention, sequences having substantially equivalent biological activity and substantially equivalent expression characteristics are considered substantially equivalent. For the purposes of determining equivalence, truncation of the mature sequence (e.g., via a mutation which creates a new stop codon) should be disregarded. Sequence identity may be determined, e.g., using the Jotun Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645).
Identity between sequences can also be determined by other methods known in the art, e.g., by varying hybridization conditions.
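The variation arithmetic described above (substitutions, additions and deletions divided by the number of residues in the substantially equivalent sequence) can be sketched as follows. This is a minimal illustration under stated assumptions, not the Jotun Hein method: the two sequences are assumed to be already aligned to equal length with `-` marking gaps, each differing column counts as one change, and the function names are hypothetical. It addresses only the sequence arithmetic and says nothing about the functional requirement in the definition.

```python
def divergence(reference_aln: str, subject_aln: str) -> float:
    """Fraction of changes relative to the subject's residue count.

    Both inputs are assumed to be pre-aligned strings of equal length,
    with '-' marking gap positions (an assumption of this sketch).
    """
    if len(reference_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have equal length")
    # A differing column is a substitution, or an addition/deletion if one side is a gap.
    changes = sum(1 for r, s in zip(reference_aln, subject_aln) if r != s)
    # Denominator: total residues in the substantially equivalent (subject) sequence.
    subject_len = sum(1 for s in subject_aln if s != "-")
    if subject_len == 0:
        raise ValueError("subject sequence has no residues")
    return changes / subject_len


def meets_variation_threshold(reference_aln: str, subject_aln: str,
                              max_divergence: float = 0.35) -> bool:
    """True when the subject varies from the reference by no more than the threshold."""
    return divergence(reference_aln, subject_aln) <= max_divergence
```

With the default threshold of 0.35, a subject that passes corresponds to what the text calls 65% sequence identity. Tighter embodiments (30%, 25%, 20%, 10%, 5%) can be checked by passing a smaller `max_divergence`.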
The term "totipotent" refers to the capability of a cell to differentiate into all of the cell types of an adult organism.
The term "transformation" means introducing DNA into a suitable host cell so that the DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether or not any coding sequences are in fact expressed.
The term "infection" refers to the introduction of nucleic acids into a suitable host cell by use of a virus or viral vector.
As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides which mediate the uptake of a linked DNA fragment into a cell.
UMFs can be readily identified using known UMFs as a target sequence or target motif with the computer-based systems described below. The presence and activity of a UMF can be confirmed by attaching the suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated with an appropriate host under appropriate conditions and the uptake of the marker sequence is determined. As described above, a UMF will increase the frequency of uptake of a linked marker sequence.
Each of the above terms is meant to encompass all that is described for each, unless the context dictates otherwise.
4.2 NUCLEIC ACIDS OF THE INVENTION
Nucleotide sequences of the invention are set forth in the Sequence Listing.

The isolated polynucleotides of the invention include a polynucleotide comprising the nucleotide sequences of SEQ ID NO: 1-1041, or 2083-2534; a polynucleotide encoding any one of the peptide sequences of SEQ ID NO: 1-1041, or 2083-2534; and a polynucleotide comprising the nucleotide sequence encoding the mature protein coding sequence of the polynucleotides of any one of SEQ ID NO: 1-1041, or 2083-2534.
The polynucleotides of the present invention also include, but are not limited to, a polynucleotide that hybridizes under stringent conditions to (a) the complement of any of the nucleotide sequences of SEQ ID NO: 1-1041, or 2083-2534; (b) nucleotide sequences encoding any one of the amino acid sequences set forth in the Sequence Listing, or Table 8; (c) a polynucleotide which is an allelic variant of any polynucleotide recited above; (d) a polynucleotide which encodes a species homolog of any of the proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a specific domain or truncation of the polypeptides of SEQ ID NO: 1042-2082, or 2535-2986 (for example, as set forth in Tables 3, 5, 6, or 8). Domains of interest may depend on the nature of the encoded polypeptide; e.g., domains in receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic domains, or combinations thereof; domains in immunoglobulin-like proteins include the variable immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and substrate binding domains; and domains in ligand polypeptides include receptor-binding domains.
The polynucleotides of the invention include naturally occurring or wholly or partially synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides may include the entire coding region of the cDNA or may represent a portion of the coding region of the cDNA.
The present invention also provides genes corresponding to the cDNA sequences disclosed herein. The corresponding genes can be isolated in accordance with known methods using the sequence information disclosed herein. Such methods include the preparation of probes or primers from the disclosed sequence information for identification and/or amplification of genes in appropriate genomic libraries or other sources of genomic materials.
Further 5' and 3' sequence can be obtained using methods known in the art. For example, full length cDNA or genomic DNA that corresponds to any of the polynucleotides of SEQ ID NO: 1-1041, or 2083-2534 can be obtained by screening appropriate cDNA or genomic DNA libraries under suitable hybridization conditions using any of the polynucleotides of SEQ ID NO: 1-1041, or 2083-2534 or a portion thereof as a probe. Alternatively, the polynucleotides of SEQ ID NO: 1-1041, or 2083-2534 may be used as the basis for suitable primer(s) that allow identification and/or amplification of genes in appropriate genomic DNA or cDNA libraries.
The nucleic acid sequences of the invention can be assembled from ESTs and sequences (including cDNA and genomic sequences) obtained from one or more public databases, such as dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information, representative fragment or segment information, or novel segment information for the full-length gene.
The polynucleotides of the invention also provide polynucleotides including nucleotide sequences that are substantially equivalent to the polynucleotides recited above.
Polynucleotides according to the invention can have, e.g., at least about 65%, at least about 70%, at least about 75%, at least about 80%, 81%, 82%, 83%, 84%, more typically at least about 85%, 86%, 87%, 88%, 89%, more typically at least about 90%, 91%, 92%, 93%, 94%, and even more typically at least about 95%, 96%, 97%, 98%, 99% sequence identity to a polynucleotide recited above.
Included within the scope of the nucleic acid sequences of the invention are nucleic acid sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences of SEQ ID NO: 1-1041, or 2083-2534, or complements thereof, which fragment is greater than about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and most preferably greater than 17 nucleotides. Fragments of, e.g. 15, 17, or 20 nucleotides or more that are selective for (i.e. specifically hybridize to) any one of the polynucleotides of the invention are contemplated. Probes capable of specifically hybridizing to a polynucleotide can differentiate polynucleotide sequences of the invention from other polynucleotide sequences in the same family of genes or can differentiate human genes from genes of other species, and are preferably based on unique nucleotide sequences.
The sequences falling within the scope of the present invention are not limited to these specific sequences, but also include allelic and species variations thereof.
Allelic and species variations can be routinely determined by comparing the sequence provided in SEQ ID NO: 1-1041, or 2083-2534, a representative fragment thereof, or a nucleotide sequence at least 90% identical, preferably 95% identical, to SEQ ID NO: 1-1041, or 2083-2534 with a sequence from another isolate of the same species. Furthermore, to accommodate codon variability, the invention includes nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed herein. In other words, in the coding region of an ORF, substitution of one codon for another codon that encodes the same amino acid is expressly contemplated.
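The point about codon variability can be illustrated by translating two different coding sequences and comparing the resulting amino acid sequences. The sketch below is illustrative only: it assumes the standard genetic code, in-frame ORFs whose length is a multiple of three, and unambiguous bases. The names `CODON_TABLE`, `translate`, and `encodes_same_protein` are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(orf: str) -> str:
    """Translate an in-frame ORF (DNA or RNA) into one-letter amino acids."""
    orf = orf.upper().replace("U", "T")
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encodes_same_protein(orf_a: str, orf_b: str) -> bool:
    """True when two ORFs differ only by synonymous codon substitutions (or not at all)."""
    return translate(orf_a) == translate(orf_b)
```

Under these assumptions, `ATGCTGAAATAA` and `ATGTTAAAGTGA` differ at three codons yet both translate to Met-Leu-Lys followed by a stop, so they would count as coding for the same amino acid sequence.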

The nearest neighbor or homology results for the nucleic acids of the present invention, including SEQ ID NO: 1-1041, or 2083-2534 can be obtained by searching a database using an algorithm or a program. Preferably, a BLAST (Basic Local Alignment Search Tool) program is used to search for local sequence alignments (Altschul, S.F. J. Mol. Evol. 36:290-300 (1993) and Altschul S.F. et al. J. Mol. Biol. 215:403-410 (1990)). Alternatively a FASTA version 3 search against Genpept, using FASTXY algorithm may be performed.
Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also provided by the present invention. Species homologs may be isolated and identified by making suitable probes or primers from the sequences provided herein and screening a suitable nucleic acid source from the desired species.
The invention also encompasses allelic variants of the disclosed polynucleotides or proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also encode proteins which are identical, homologous or related to that encoded by the polynucleotides.
The nucleic acid sequences of the invention are further directed to sequences which encode variants of the described nucleic acids. These amino acid sequence variants may be prepared by methods known in the art by introducing appropriate nucleotide changes into a native or variant polynucleotide. There are two variables in the construction of amino acid sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids encoding the amino acid sequence variants are preferably constructed by mutating the polynucleotide to encode an amino acid sequence that does not occur in nature.
These nucleic acid alterations can be made at sites that differ in the nucleic acids from different species (variable positions) or in highly conserved regions (constant regions). Sites at such locations will typically be modified in series, e.g., by substituting first with conservative choices (e.g., hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions may be made at the target site. Amino acid sequence deletions generally range from about 1 to 30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one hundred or more residues, as well as intrasequence insertions of single or multiple amino acid residues. Intrasequence insertions may range generally from about 1 to 10 amino residues, preferably from 1 to 5 residues.
Examples of terminal insertions include the heterologous signal sequences necessary for secretion or for intracellular targeting in different host cells and sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.
In a preferred method, polynucleotides encoding the novel amino acid sequences are changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the site being changed. In general, the techniques of site-directed mutagenesis are well known to those of skill in the art and this technique is exemplified by publications such as Edelman et al., DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500 (1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids. When small amounts of template DNA are used as starting material, primer(s) that differ slightly in sequence from the corresponding region in the template DNA can generate the desired amino acid variant. PCR amplification results in a population of product DNA fragments that differ from the polynucleotide template encoding the polypeptide at the position specified by the primer. The product DNA fragments replace the corresponding region in the plasmid and this gives a polynucleotide encoding the desired amino acid variant.
A further technique for generating amino acid variants is the cassette mutagenesis technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic code, other DNA sequences which encode substantially the same or a functionally equivalent amino acid sequence may be used in the practice of the invention for the cloning and expression of these novel nucleic acids. Such DNA sequences include those which are capable of hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.
Polynucleotides encoding preferred polypeptide truncations of the invention could be used to generate polynucleotides encoding chimeric or fusion proteins comprising one or more domains of the invention and heterologous protein sequences.
The polynucleotides of the invention additionally include the complement of any of the polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known to those of skill in the art and can include, for example, methods for determining hybridization conditions that can routinely isolate polynucleotides of the desired sequence identities.
In accordance with the invention, polynucleotide sequences comprising the mature protein coding sequences corresponding to any one of SEQ ID NO: 1-1041, or 2083-2534, or functional equivalents thereof, may be used to generate recombinant DNA molecules that direct the expression of that nucleic acid, or a functional equivalent thereof, in appropriate host cells. Also included are the cDNA inserts of any of the clones identified herein.
A polynucleotide according to the invention can be joined to any of a variety of other nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY).
Useful nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g., plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the art. Accordingly, the invention also provides a vector including a polynucleotide of the invention and a host cell containing the polynucleotide.
In general, the vector contains an origin of replication functional in at least one organism, convenient restriction endonuclease sites, and a selectable marker for the host cell.
Vectors according to the invention include expression vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular organism or part of a multicellular organism.
The present invention further provides recombinant constructs comprising a nucleic acid having any of the nucleotide sequences of SEQ ID NO: 1-1041, or 2083-2534 or a fragment thereof or any other polynucleotides of the invention. In one embodiment, the recombinant constructs of the present invention comprise a vector, such as a plasmid or viral vector, into which a nucleic acid having any of the nucleotide sequences of SEQ ID NO: 1-1041, or 2083-2534 or a fragment thereof is inserted, in a forward or reverse orientation. In the case of a vector comprising one of the ORFs of the present invention, the vector may further comprise regulatory sequences, including for example, a promoter, operably linked to the ORF. Large numbers of suitable vectors and promoters are known to those of skill in the art and are commercially available for generating the recombinant constructs of the present invention. The following vectors are provided by way of example:
Bacterial: pBs, phagescript, PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia).
Eukaryotic: pWLneo, pSV2cat, pOG44, pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia).
The isolated polynucleotide of the invention may be operably linked to an expression control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al., Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly.
Many suitable expression control sequences are known in the art. General methods of expressing recombinant proteins are also known and are exemplified in R. Kaufman, Methods in Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated polynucleotide of the invention and an expression control sequence are situated within a vector or cell in such a way that the protein is expressed by a host cell which has been transformed (transfected) with the ligated polynucleotide/expression control sequence.
Promoter regions can be selected from any desired gene using CAT (chloramphenicol transferase) vectors or other vectors with selectable markers. Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the appropriate vector and promoter is well within the level of ordinary skill in the art. Generally, recombinant expression vectors will include origins of replication and selectable markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived from a highly expressed gene to direct transcription of a downstream structural sequence.
Such promoters can be derived from operons encoding glycolytic enzymes such as phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat shock proteins, among others. The heterologous structural sequence is assembled in appropriate phase with translation initiation and termination sequences, and preferably, a leader sequence capable of directing secretion of translated protein into the periplasmic space or extracellular medium.
Optionally, the heterologous sequence can encode a fusion protein including an amino terminal identification peptide imparting desired characteristics, e.g., stabilization or simplified purification of expressed recombinant product. Useful expression vectors for bacterial use are constructed by inserting a structural DNA sequence encoding a desired protein together with suitable translation initiation and termination signals in operable reading phase with a functional promoter. The vector will comprise one or more phenotypic selectable markers and an origin of replication to ensure maintenance of the vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be employed as a matter of choice.
As a representative but non-limiting example, useful expression vectors for bacterial use can comprise a selectable marker and bacterial origin of replication derived from commercially available plasmids comprising genetic elements of the well known cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech, Madison, WI, USA). These pBR322 "backbone" sections are combined with an appropriate promoter and the structural sequence to be expressed. Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the selected promoter is induced or derepressed by appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an additional period. Cells are typically harvested by centrifugation, disrupted by physical or chemical means, and the resulting crude extract retained for further purification.
Polynucleotides of the invention can also be used to induce immune responses.
For example, as described in Fan et al., Nat. Biotech 17, 870-872 (1999), incorporated herein by reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies against the encoded polypeptide following topical administration of naked plasmid DNA or following injection, and preferably intra-muscular injection of the DNA. The nucleic acid sequences are preferably inserted in a recombinant expression vector and may be in the form of naked DNA.
4.3 ANTISENSE
Another aspect of the invention pertains to isolated antisense nucleic acid molecules that are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 1-1041, or 2083-2534, or fragments, analogs or derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic acid molecules are provided that comprise a sequence complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an entire coding strand, or to only a portion thereof. Nucleic acid molecules encoding fragments, homologs, derivatives and analogs of a protein of any of SEQ ID NO: 1-1041, or 2083-2534 or antisense nucleic acids complementary to a nucleic acid sequence of SEQ ID NO: 1-1041, or 2083-2534 are additionally provided.
In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region" of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers to the region of the nucleotide sequence comprising codons which are translated into amino acid residues. In another embodiment, the antisense nucleic acid molecule is antisense to a "noncoding region" of the coding strand of a nucleotide sequence of the invention. The term "noncoding region" refers to 5' and 3' sequences that flank the coding region and are not translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).
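The distinction between coding and noncoding regions can be illustrated with a minimal Python sketch. It is not part of the disclosed methods: the input sequence is made up, the first ATG is assumed to be the start codon, and the stop codon is kept with the coding region by convention even though it is not translated.

```python
# Illustrative sketch only: splits a coding-strand sequence into a
# 5' noncoding region, a coding region, and a 3' noncoding region.
# Assumptions (not from the specification): the first ATG is the start
# codon, and the stop codon is reported as part of the coding region.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_regions(coding_strand: str) -> tuple[str, str, str]:
    """Return (5' noncoding, coding, 3' noncoding) for a DNA coding strand."""
    seq = coding_strand.upper()
    start = seq.find("ATG")
    if start < 0:
        raise ValueError("no ATG start codon found")
    # Walk codon by codon, in frame with the start codon.
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            end = i + 3
            return seq[:start], seq[start:end], seq[end:]
    raise ValueError("no in-frame stop codon found")


# Hypothetical sequence, used only to exercise the function.
five_prime, coding, three_prime = split_regions("GGCACCATGGCTAAATAAGCTTTT")
print(five_prime, coding, three_prime)
```

An antisense molecule directed to the "noncoding region" would be complementary to either of the two flanking segments returned here.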
Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID NO: 1-1041, or 2083-2534), antisense nucleic acids of the invention can be designed according to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule can be complementary to the entire coding region of an mRNA, but more preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding region of an mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of an mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention can be constructed using chemical synthesis or enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used.
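As a hedged illustration of Watson-Crick antisense design, the following Python sketch derives an oligonucleotide complementary to the region surrounding the translation start site. The example sequence, the five-nucleotide upstream margin and the function names are assumptions made for illustration; only the 20-nucleotide length is taken from the range given above.

```python
# Illustrative sketch only: derives an antisense oligonucleotide that is the
# Watson-Crick reverse complement of the region around the first ATG.
# The sequence and the upstream margin are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def antisense_around_start(coding_strand: str, upstream: int = 5,
                           length: int = 20) -> str:
    """Return an antisense oligo covering the first ATG of the coding strand."""
    seq = coding_strand.upper()
    start = seq.find("ATG")
    if start < 0:
        raise ValueError("no ATG start codon found")
    begin = max(0, start - upstream)
    target = seq[begin:begin + length]  # sense-strand window to be bound
    return reverse_complement(target)


example = "GGCACCATGGCTAGCAAGCTTGGATCCTAA"  # made-up coding-strand sequence
oligo = antisense_around_start(example)
print(oligo)
```

The oligonucleotide is written 5' to 3', so its reverse complement appears verbatim in the sense strand. Chemical modifications such as phosphorothioate linkages are not modeled.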
Examples of modified nucleotides that can be used to generate the antisense nucleic acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).
The antisense nucleic acid molecules of the invention are typically administered to a subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a protein according to the invention to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in the major groove of the double helix. An example of a route of administration of antisense nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For example, for systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.
In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett 215: 327-330).
4.4 RIBOZYMES AND PNA MOIETIES
In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a complementary region. Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach (1988) Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit translation of an mRNA. A ribozyme having specificity for a nucleic acid of the invention can be designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO: 1-1041, or 2083-2534). For example, a derivative of Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742.
Alternatively, mRNA of the invention can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel et al., (1993) Science 261:1411-1418.
Alternatively, gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical structures that prevent transcription of the gene in target cells. See generally, Helene (1991) Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher (1992) Bioessays 14: 807-15.
In various embodiments, the nucleic acids of the invention can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above; Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.
PNAs of the invention can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication. PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996), above).
In another embodiment, PNAs of the invention can be modified, e.g., to enhance their stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug delivery known in the art. For example, PNA-DNA chimeras can be generated that may combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA portion would provide high binding affinity and specificity.
PNA-DNA chimeras can be linked using linkers of appropriate lengths selected in terms of base stacking, number of bonds between the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24: 3357-63. For example, a DNA chain can be synthesized on a solid support using standard phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g., 5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3' DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem Lett 5: 1119-11124.
In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556; Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134).
In addition, oligonucleotides can be modified with hybridization triggered cleavage agents (see, e.g., Krol et al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res. 5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered cleavage agent, etc.
4.5 HOSTS
The present invention further provides host cells genetically engineered to contain the polynucleotides of the invention. For example, such host cells may contain nucleic acids of the invention introduced into the host cell using known transformation, transfection or infection methods. The present invention still further provides host cells genetically engineered to express the polynucleotides of the invention, wherein such polynucleotides are in operative association with a regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in the cell.
Knowledge of nucleic acid sequences allows for modification of cells to permit, or increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous recombination) to provide increased polypeptide expression by replacing, in whole or in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells express the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it is operatively linked to the encoding sequences. See, for example, PCT International Publication No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding sequence, amplification of the marker DNA by standard selection methods results in co-amplification of the desired protein coding sequences in the cells.
The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a bacterial cell. Introduction of the recombinant construct into the host cell can be effected by calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis, L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the polynucleotides of the invention can be used in conventional manners to produce the gene product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a heterologous protein under the control of the EMF.
Any host/vector system can be used to express one or more of the ORFs of the present invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis. The most preferred cells are those which do not normally express the particular polypeptide or protein or which express the polypeptide or protein at a low natural level.
Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate promoters. Cell-free translation systems can also be employed to produce such proteins using RNAs derived from the DNA constructs of the present invention. Appropriate cloning and expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New York (1989), the disclosure of which is hereby incorporated by reference.
Various mammalian cell culture systems can also be employed to express recombinant protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a compatible vector are, for example, the C127, monkey COS cells, Chinese Hamster Ovary (CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3 cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK, HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example, SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein refolding steps can be used, as necessary, in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC) can be employed for final purification steps.
Microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents.
Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it may be necessary to modify the protein produced therein, for example by phosphorylation or glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent attachments may be accomplished using known chemical or enzymatic methods.
In another embodiment of the present invention, cells and tissues may be engineered to express an endogenous gene comprising the polynucleotides of the invention under the control of inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may be replaced by homologous recombination. As described herein, gene targeting can be used to replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative regulatory elements, transcriptional initiation sites, and regulatory protein binding sites or combinations of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or protein produced may be replaced, removed, added, or otherwise modified by targeting. These sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences for enhancing or modifying transport or secretion properties of the protein, or other sequences which alter or improve the function or stability of protein or RNA molecules.
The targeting event may be a simple insertion of the regulatory sequence, placing the gene under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the targeting event may replace an existing element; for example, a tissue-specific enhancer can be replaced by an enhancer that has broader or different cell-type specificity than the naturally occurring elements. Here, the naturally occurring sequences are deleted and new sequences are added. In all cases, the identification of the targeting event may be facilitated by the use of one or more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection of cells in which the exogenous DNA has integrated into the host cell genome. The identification of the targeting event may also be facilitated by the use of one or more marker genes exhibiting the property of negative selection, such that the negatively selectable marker is linked to the exogenous DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and such that a correct homologous recombination event with sequences in the host cell genome does not result in the stable integration of the negatively selectable marker.
Markers useful for this purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine phosphoribosyl-transferase (gpt) gene.
The gene targeting or gene activation techniques which can be used in accordance with this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.
4.6 POLYPEPTIDES OF THE INVENTION
The isolated polypeptides of the invention include, but are not limited to, a polypeptide comprising: the amino acid sequences set forth as any one of SEQ ID NO: 1042-2082, or 2535-2986 or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID NO: 1-1041, or 2083-2534 or the corresponding full length or mature protein. Polypeptides of the invention also include polypeptides preferably with biological or immunological activity that are encoded by: (a) a polynucleotide having any one of the nucleotide sequences set forth in SEQ ID NO: 1-1041, or 2083-2534 or (b) polynucleotides encoding any one of the amino acid sequences set forth as SEQ ID NO: 1042-2082, or 2535-2986 or (c) polynucleotides that hybridize to the complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions. The invention also provides biologically active or immunologically active variants of any of the amino acid sequences set forth as SEQ ID NO: 1042-2082, or 2535-2986 or the corresponding full length or mature protein; and "substantial equivalents" thereof (e.g., with at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, 86%, 87%, 88%, 89%, at least about 90%, 91%, 92%, 93%, 94%, typically at least about 95%, 96%, 97%, more typically at least about 98%, or most typically at least about 99% amino acid identity) that retain biological activity. Polypeptides encoded by allelic variants may have a similar, increased, or decreased activity compared to polypeptides comprising SEQ ID NO: 1042-2082, or 2986.
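The amino acid identity thresholds recited for "substantial equivalents" can be illustrated with a small Python sketch. This passage does not fix an alignment algorithm, so the sketch assumes the two sequences have already been aligned to equal length (with "-" for gaps) and counts identical residues over all aligned columns. The sequences, function names and the reduced threshold list are hypothetical.

```python
# Illustrative sketch only: percent amino acid identity over a pre-aligned
# pair of sequences. The identity definition (matches / aligned columns,
# gaps counted as mismatches) is an assumption, not taken from the text.
THRESHOLDS = (65, 70, 75, 80, 85, 90, 95, 98, 99)  # subset of recited values


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity for two gapped sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float):
    """Return the largest listed threshold that the identity reaches, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


identity = percent_identity("MKTAYIAKQR", "MKTAYIAKQK")  # made-up peptides
print(identity, highest_threshold_met(identity))
```

Identity alone does not establish a substantial equivalent in the sense used above, since the variant must also retain biological activity, which has to be tested experimentally.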
Fragments of the proteins of the present invention which are capable of exhibiting biological activity are also encompassed by the present invention. Fragments of the protein may be in linear form or they may be cyclized using known methods, for example, as described in H. U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer. Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such fragments may be fused to carrier molecules such as immunoglobulins for many purposes, including increasing the valency of protein binding sites. Fragments are also identified in Tables 3, 5, 6, and 8.
The present invention also provides both full-length and mature forms (for example, without a signal sequence or precursor sequence) of the disclosed proteins.
The protein coding sequence is identified in the sequence listing by translation of the disclosed nucleotide sequences. The predicted signal sequence is set forth in Table 6.
The mature form of such protein may be obtained and confirmed by expression of a full-length polynucleotide in a suitable mammalian cell or other host cell and sequencing of the cleaved product. One of skill in the art will recognize that the actual cleavage site may be different than that predicted in Table 6. The sequence of the mature form of the protein is also determinable from the amino acid sequence of the full-length form.
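The relationship between the full-length form, the predicted signal sequence and the mature form can be sketched in Python as follows. The protein sequence, the predicted signal length and the observed N-terminal residues are all made up for illustration; they do not correspond to any entry of Table 6.

```python
# Illustrative sketch only: derives a mature sequence from a full-length
# sequence and a signal-peptide length, and compares a predicted cleavage
# site with one inferred from N-terminal sequencing of the cleaved product.
# All sequences and positions below are hypothetical.


def mature_form(full_length: str, signal_length: int) -> str:
    """Return the sequence remaining after removing the signal peptide."""
    if not 0 <= signal_length < len(full_length):
        raise ValueError("signal length must be shorter than the protein")
    return full_length[signal_length:]


def observed_cleavage_site(full_length: str, observed_n_terminus: str) -> int:
    """Return the number of residues removed, given the sequenced N-terminus."""
    position = full_length.find(observed_n_terminus)
    if position < 0:
        raise ValueError("observed N-terminus not found in full-length form")
    return position


full_length = "MKLLLLAVALLGAQADTPEKSWLR"  # made-up precursor
predicted = 15                            # hypothetical predicted signal length
observed = observed_cleavage_site(full_length, "QADTPE")  # hypothetical read

print(mature_form(full_length, predicted))  # predicted mature form
print(mature_form(full_length, observed))   # mature form actually observed
```

In this made-up case the observed site lies two residues upstream of the predicted one, which is the kind of discrepancy the paragraph above anticipates.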
Where proteins of the present invention are membrane bound, soluble forms of the proteins are also provided. In such forms, part or all of the regions causing the proteins to be membrane bound are deleted so that the proteins are fully secreted from the cell in which they are expressed.
Protein compositions of the present invention may further comprise an acceptable carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.
The present invention further provides isolated polypeptides encoded by the nucleic acid fragments of the present invention or by degenerate variants of the nucleic acid fragments of the present invention. By "degenerate variant" is intended nucleotide fragments which differ from a nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic acid fragments of the present invention are the ORFs that encode proteins.
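The notion of a degenerate variant can be made concrete with a short Python sketch that translates two different nucleotide sequences with the standard genetic code and checks that they encode the same polypeptide. The two example ORFs are invented for illustration, and the helper names are not taken from the specification.

```python
# Illustrative sketch only: two nucleotide sequences are treated as
# degenerate variants if they differ in sequence but translate to the same
# polypeptide under the standard genetic code. Example ORFs are made up.
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(orf: str) -> str:
    """Translate an ORF from its first base, stopping at the first stop codon."""
    protein = []
    seq = orf.upper()
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the ORFs differ in sequence yet encode identical polypeptides."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


print(translate("ATGGCTAAATAA"), translate("ATGGCCAAGTGA"))
print(is_degenerate_variant("ATGGCTAAATAA", "ATGGCCAAGTGA"))
```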
A variety of methodologies known in the art can be utilized to obtain any one of the isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid sequence can be synthesized using commercially available peptide synthesizers. The synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary structural and/or conformational characteristics with proteins may possess biological properties in common therewith, including protein activity. This technique is particularly useful in producing small peptides and fragments of larger polypeptides.
Fragments are useful, for example, in generating antibodies against the native polypeptide.
Thus, they may be employed as biologically active or immunological substitutes for natural, purified proteins in screening of therapeutic compounds and in immunological processes for the development of antibodies.
The polypeptides and proteins of the present invention can alternatively be purified from cells which have been altered to express the desired polypeptide or protein. As used herein, a cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic manipulation, is made to produce a polypeptide or protein which it normally does not produce or which the cell normally produces at a lower level. One skilled in the art can readily adapt procedures for introducing and expressing either recombinant or synthetic sequences into eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides or proteins of the present invention.
The invention also relates to methods for producing a polypeptide comprising growing a culture of host cells of the invention in a suitable culture medium, and purifying the protein from the cells or the culture in which the cells are grown. For example, the methods of the invention include a process for producing a polypeptide in which a host cell containing a suitable expression vector that includes a polynucleotide of the invention is cultured under conditions that allow expression of the encoded polypeptide.
The polypeptide can be recovered from the culture, conveniently from the culture medium, or from a lysate prepared from the host cells and further purified. Preferred embodiments include those in which the protein produced by such process is a full length or mature form of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial cells which naturally produce the polypeptide or protein. One skilled in the art can readily follow known methods for isolating polypeptides and proteins in order to obtain one of the isolated polypeptides or proteins of the present invention. These include, but are not limited to, immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that retain biological/immunological activity include fragments comprising greater than about 100 amino acids, or greater than about 200 amino acids, and fragments that encode specific protein domains.
The purified polypeptides can be used in in vitro binding assays which are well known in the art to identify molecules which bind to the polypeptides. These molecules include, but are not limited to, small molecules, molecules from combinatorial libraries, antibodies or other proteins. The molecules identified in the binding assay are then tested for antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested for either cell/animal death or prolonged survival of the animal/cells.
In addition, the peptides of the invention or molecules capable of binding to the peptides may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the specificity of the binding molecule for SEQ ID NO: 1042-2082, or 2535-2986.
The protein of the invention may also be expressed as a product of transgenic animals, e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized by somatic or germ cells containing a nucleotide sequence encoding the protein.
The proteins provided herein also include proteins characterized by amino acid sequences similar to those of purified proteins but into which modifications are naturally provided or deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be made by those skilled in the art using known techniques. Modifications of interest in the protein sequences may include the alteration, substitution, replacement, insertion or deletion of a selected amino acid residue in the coding sequence. For example, one or more of the cysteine residues may be deleted or replaced with another amino acid to alter the conformation of the molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such alteration, substitution, replacement, insertion or deletion retains the desired activity of the protein. Regions of the protein that are important for the protein function can be determined by various methods known in the art including the alanine-scanning method, which involves systematic substitution of single amino acids or strings of amino acids with alanine, followed by testing the resulting alanine-containing variant for biological activity. This type of analysis determines the importance of the substituted amino acid(s) in biological activity. Regions of the protein that are important for protein function may be determined by the eMATRIX program.
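The sequence-design half of the alanine-scanning method can be sketched in Python as below. The peptide is made up, and the function name and `window` parameter are hypothetical. The sketch only enumerates the variants to be made, since the biological activity of each variant must still be measured experimentally.

```python
# Illustrative sketch only: enumerates alanine-scanning variants in which
# single residues (window=1) or strings of residues (window>1) are replaced
# by alanine. Windows that are already all-alanine are skipped.


def alanine_scan(protein: str, window: int = 1) -> list[tuple[int, str, str]]:
    """Return (1-based position, replaced residues, variant sequence) tuples."""
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the protein length")
    variants = []
    for i in range(len(protein) - window + 1):
        segment = protein[i:i + window]
        if set(segment) == {"A"}:
            continue  # substituting alanine for alanine changes nothing
        variant = protein[:i] + "A" * window + protein[i + window:]
        variants.append((i + 1, segment, variant))
    return variants


for position, replaced, variant in alanine_scan("MKAC"):  # made-up peptide
    print(position, replaced, variant)
```

Each tuple records which residues were replaced, so a loss of activity in a given variant can be traced back to the substituted position.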
Other fragments and derivatives of the sequences of proteins which would be expected to retain protein activity in whole or in part and are useful for screening or other immunological methodologies may also be easily made by those skilled in the art given the disclosures herein. Such modifications are encompassed by the present invention.
The protein may also be produced by operably linking the isolated polynucleotide of the invention to suitable control sequences in one or more insect expression vectors, and employing an insect expression system. Materials and methods for baculovirus/insect cell expression systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A. (the MaxBac™ kit), and such methods are well known in the art, as described in Summers and Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by reference. As used herein, an insect cell capable of expressing a polynucleotide of the present invention is "transformed."
The protein of the invention may be prepared by culturing transformed host cells under culture conditions suitable to express the recombinant protein. The resulting expressed protein may then be purified from such culture (i.e., from culture medium or cell extracts) using known purification processes, such as gel filtration and ion exchange chromatography. The purification of the protein may also include an affinity column containing agents which will bind to the protein; one or more column steps over such affinity resins as concanavalin A-agarose, heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl ether; or immunoaffinity chromatography.
Alternatively, the protein of the invention may also be expressed in a form which will facilitate purification. For example, it may be expressed as a fusion protein, such as those of maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a His tag. Kits for expression and purification of such fusion proteins are commercially available from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen, respectively. The protein can also be tagged with an epitope and subsequently purified by using a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially available from Kodak (New Haven, Conn.).
Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC) steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing purification steps, in various combinations, can also be employed to provide a substantially homogeneous isolated recombinant protein. The protein thus purified is substantially free of other mammalian proteins and is defined in accordance with the present invention as an "isolated protein."
The polypeptides of the invention include analogs (variants). This embraces fragments, as well as peptides in which one or more amino acids has been deleted, inserted, or substituted. Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs may exhibit improved properties such as activity and/or stability.
Examples of moieties which may be fused to the polypeptide or an analog include, for example, targeting moieties which provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells, antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be fused to the polypeptide include therapeutic agents which are used for treatment, for example, immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as alpha or beta interferon.
4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY AND SIMILARITY
Preferred methods to determine identity and/or similarity are designed to give the largest match between the sequences tested. Methods to determine identity and similarity are codified in computer programs including, but not limited to, the GCG program package, including GAP (Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group, University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F. et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res. vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp. Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software (Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), Pfam software (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp. 105-31 (1982), incorporated herein by reference).
Polypeptide sequences were examined by a proprietary algorithm, SeqLoc, that separates the proteins into three sets of locales: intracellular, membrane, or secreted. This prediction is based upon three characteristics of each polypeptide: the percentage of cysteine residues, the Kyte-Doolittle scores for the first 20 amino acids of each protein, and the longest hydrophobic stretch of the protein as calculated from Kyte-Doolittle scores. Values of predicted proteins are compared against the values from a set of 592 proteins of known cellular localization from the Swissprot database (http://www.expasy.ch/sprot). Predictions are based upon maximum likelihood estimation.
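The three characteristics named above can be computed from a sequence along the following lines. This is a sketch under stated assumptions and is not the proprietary SeqLoc algorithm: the hydropathy values are the published Kyte-Doolittle scale, but the function name localization_features, the use of the mean score over the N-terminal residues, and the definition of a hydrophobic stretch as a run of residues with positive hydropathy are assumptions made for illustration. The maximum likelihood comparison against reference proteins is not shown.

```python
# Kyte-Doolittle hydropathy index (J. Mol. Biol. 157:105-132, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def localization_features(sequence, n_terminal_length=20):
    """Return (cysteine_percent, n_terminal_mean_score, longest_hydrophobic_run).

    Assumptions (not from the source): residues absent from the scale
    score 0.0, and a residue counts as hydrophobic when its score is > 0.
    """
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")

    cysteine_percent = 100.0 * sequence.count("C") / len(sequence)

    head = sequence[:n_terminal_length]
    n_terminal_mean = sum(KYTE_DOOLITTLE.get(r, 0.0) for r in head) / len(head)

    longest = current = 0
    for residue in sequence:
        if KYTE_DOOLITTLE.get(residue, 0.0) > 0:
            current += 1
            longest = max(longest, current)
        else:
            current = 0

    return cysteine_percent, n_terminal_mean, longest


if __name__ == "__main__":
    # Hypothetical sequence used only to demonstrate the call.
    print(localization_features("MKLLVC"))
```

A classifier in the spirit of the text would then compare these three values with the distributions observed for proteins of known localization.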
The BLAST programs are publicly available from the National Center for Biotechnology Information (NCBI) and other sources (BLAST Manual, Altschul, S., et al., NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol. Biol. 215:403-410 (1990)).
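As a simple illustration of what "largest match between the sequences tested" means, percent identity can be computed over an optimal global alignment. The following is a minimal sketch and not the GAP or BLAST implementation: it uses a plain Needleman-Wunsch recursion with a linear gap penalty, and the function name global_identity and the scoring values (match 1, mismatch -1, gap -1) are assumptions chosen for the example. Identity is reported as identical columns divided by total alignment columns, which is one of several conventions in use.

```python
def global_identity(a, b, match=1, mismatch=-1, gap=-1):
    """Percent identity of sequences a and b over one optimal global alignment."""
    n, m = len(a), len(b)

    # Fill the dynamic programming matrix of best alignment scores.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )

    # Trace back one optimal path, counting identical and total columns.
    i, j = n, m
    identical = columns = 0
    while i > 0 or j > 0:
        pair = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            if a[i - 1] == b[j - 1]:
                identical += 1
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        columns += 1

    return 100.0 * identical / columns if columns else 0.0


if __name__ == "__main__":
    # Hypothetical peptides; the second lacks one residue of the first.
    print(global_identity("ACDE", "ACE"))
```

Production tools use substitution matrices and affine gap penalties, so values from this sketch should not be expected to match their output.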
4.7 CHIMERIC AND FUSION PROTEINS
The invention also provides chimeric or fusion proteins. As used herein, a "chimeric protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to another polypeptide. Within a fusion protein the polypeptide according to the invention can correspond to all or a portion of a protein according to the invention. In one embodiment, a fusion protein comprises at least one biologically active portion of a protein according to the invention. In another embodiment, a fusion protein comprises at least two biologically active portions of a protein according to the invention. Within the fusion protein, the term "operatively linked" is intended to indicate that the polypeptide according to the invention and the other polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or C-terminus, or to the middle.

For example, in one embodiment a fusion protein comprises a polypeptide according to the invention operably linked to the extracellular domain of a second protein.
In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione S-transferase) sequences.
In another embodiment, the fusion protein is an immunoglobulin fusion protein in which the polypeptide sequences according to the invention comprise one or more domains fused to sequences derived from a member of the immunoglobulin protein family.
The immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject to inhibit an interaction between a ligand and a protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo. The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand. Inhibition of the ligand/protein interaction may be useful therapeutically both for the treatment of proliferative and differentiative disorders, e.g., cancer, and for modulating (e.g., promoting or inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand.
A chimeric or fusion protein of the invention can be produced by standard recombinant DNA techniques. For example, DNA fragments coding for the different polypeptide sequences are ligated together in-frame in accordance with conventional techniques, e.g., by employing blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers. Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that give rise to complementary overhangs between two consecutive gene fragments that can subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for example, Ausubel et al. (eds.) CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, 1992). Moreover, many expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the invention can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the protein of the invention.
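The in-frame requirement can be checked computationally before a construct is made. The following is a minimal sketch, assuming the standard genetic code and two coding sequences supplied as plain DNA strings; the function names translate and fuse_in_frame and the short example sequences are hypothetical and do not correspond to any sequence of the invention. The check confirms that the N-terminal partner ends on a codon boundary and contains no stop codon, so that translation reads through into the C-terminal partner.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA in frame 1; any trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def fuse_in_frame(n_terminal_cds, c_terminal_cds):
    """Join two coding sequences and return (fusion_dna, fusion_protein).

    Raises ValueError if the N-terminal partner would shift the reading
    frame or terminate translation before the junction.
    """
    if len(n_terminal_cds) % 3 != 0:
        raise ValueError("N-terminal partner length is not a multiple of 3")
    if "*" in translate(n_terminal_cds):
        raise ValueError("N-terminal partner contains a stop codon")
    fusion = n_terminal_cds.upper() + c_terminal_cds.upper()
    return fusion, translate(fusion)


if __name__ == "__main__":
    # Hypothetical tag (Met-Ala) fused upstream of a hypothetical insert (Lys-stop).
    print(fuse_in_frame("ATGGCC", "AAATAA"))
```

In practice the same check would be applied to the vector-encoded fusion moiety and the cloned insert, including any linker introduced by the restriction sites.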
4.8 GENE THERAPY
Mutations in the polynucleotides of the invention may result in loss of normal function of the encoded protein. The invention thus provides gene therapy to restore normal activity of the polypeptides of the invention, or to treat disease states involving polypeptides of the invention. Delivery of a functional gene encoding polypeptides of the invention to appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example, Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of gene therapy technology see Friedmann, Science, 244: (1989); Verma, Scientific American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992).
Introduction of any one of the nucleotides of the present invention or a gene encoding the polypeptides of the present invention can also be accomplished with extrachromosomal substrates (transient expression) or artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of proteins of the present invention in order to proliferate or to produce a desired effect on or activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. Alternatively, it is contemplated that in other human disease states, preventing the expression of or inhibiting the activity of polypeptides of the invention will be useful in treating the disease states. It is contemplated that antisense therapy or gene therapy could be applied to negatively regulate the expression of polypeptides of the invention.
Other methods inhibiting expression of a protein include the introduction of antisense molecules to the nucleic acids of the present invention, their complements, or their translated RNA sequences, by methods known in the art. Further, the polypeptides of the present invention can be inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such as a silencer, which is tissue specific.
The present invention still further provides cells genetically engineered in vivo to express the polynucleotides of the invention, wherein such polynucleotides are in operative association with a regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in the cell. These methods can be used to increase or decrease the expression of the polynucleotides of the present invention.
Knowledge of DNA sequences provided by the invention allows for modification of cells to permit, increase, or decrease, expression of endogenous polypeptide.
Cells can be modified (e.g., by homologous recombination) to provide increased polypeptide expression by replacing, in whole or in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is operatively linked to the desired protein encoding sequences. See, for example, PCT International Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired protein coding sequence, amplification of the marker DNA by standard selection methods results in co-amplification of the desired protein coding sequences in the cells.
In another embodiment of the present invention, cells and tissues may be engineered to express an endogenous gene comprising the polynucleotides of the invention under the control of inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may be replaced by homologous recombination. As described herein, gene targeting can be used to replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or protein produced may be replaced, removed, added, or otherwise modified by targeting. These sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences for enhancing or modifying transport or secretion properties of the protein, or other sequences which alter or improve the function or stability of protein or RNA molecules.
The targeting event may be a simple insertion of the regulatory sequence, placing the gene under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the targeting event may replace an existing element; for example, a tissue-specific enhancer can be replaced by an enhancer that has broader or different cell-type specificity than the naturally occurring elements. Here, the naturally occurring sequences are deleted and new sequences are added. In all cases, the identification of the targeting event may be facilitated by the use of one or more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection of cells in which the exogenous DNA has integrated into the cell genome. The identification of the targeting event may also be facilitated by the use of one or more marker genes exhibiting the property of negative selection, such that the negatively selectable marker is linked to the exogenous DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and such that a correct homologous recombination event with sequences in the host cell genome does not result in the stable integration of the negatively selectable marker. Markers useful for this purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine phosphoribosyl-transferase (gpt) gene.
The gene targeting or gene activation techniques which can be used in accordance with this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.
4.9 TRANSGENIC ANIMALS
In preferred methods to determine biological functions of the polypeptides of the invention in vivo, one or more genes provided by the invention are either overexpressed or inactivated in the germ line of animals using homologous recombination [Capecchi, Science 244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination are referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference.
Transgenic animals are useful to determine the roles polypeptides of the invention play in biological processes, and preferably in disease states. Transgenic animals are useful as model systems to identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT Publication No. WO94/28122, incorporated herein by reference.
Transgenic animals can be prepared wherein all or part of a promoter of the polynucleotides of the invention is either activated or inactivated to alter the level of expression of the polypeptides of the invention. Inactivation can be carried out using homologous recombination methods described above. Activation can be achieved by supplementing or even replacing the homologous promoter to provide for increased protein expression. The homologous promoter can be supplemented by insertion of one or more heterologous enhancer elements known to confer promoter activation in a particular tissue.
The polynucleotides of the present invention also make possible the development, through, e.g., homologous recombination or knockout strategies, of animals that fail to express polypeptides of the invention or that express a variant polypeptide. Such animals are useful as models for studying the in vivo activities of polypeptide as well as for studying modulators of the polypeptides of the invention.
4.10 USES AND BIOLOGICAL ACTIVITY
The polynucleotides and proteins of the present invention are expected to exhibit one or more of the uses or biological activities (including those associated with assays cited herein) identified herein. Uses or activities described for proteins of the present invention may be provided by administration or use of such proteins or of polynucleotides encoding such proteins (such as, for example, in gene therapies or vectors suitable for introduction of DNA). The mechanism underlying the particular condition or pathology will dictate whether the polypeptides of the invention, the polynucleotides of the invention or modulators (activators or inhibitors) thereof would be beneficial to the subject in need of treatment.
Thus, "therapeutic compositions of the invention" include compositions comprising isolated polynucleotides (including recombinant DNA molecules, cloned genes and degenerate variants thereof) or polypeptides of the invention (including full length protein, mature protein and truncations or domains thereof), or compounds and other substances that modulate the overall activity of the target gene products, either at the level of target gene/protein expression or target protein activity. Such modulators include polypeptides, analogs (variants), including fragments and fusion proteins, antibodies and other binding proteins; chemical compounds that directly or indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening assays as described herein); antisense polynucleotides and polynucleotides suitable for triple helix formation; and in particular antibodies or other binding partners that specifically recognize one or more epitopes of the polypeptides of the invention.
The polypeptides of the present invention may likewise be involved in cellular activation or in one of the other physiological pathways described herein.
4.10.1 RESEARCH USES AND UTILITIES
The polynucleotides provided by the present invention can be used by the research community for various purposes. The polynucleotides can be used to express recombinant protein for analysis, characterization or therapeutic use; as markers for tissues in which the corresponding protein is preferentially expressed (either constitutively or at a particular stage of tissue differentiation or development or in disease states); as molecular weight markers on gels; as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene positions; to compare with endogenous DNA sequences in patients to identify potential genetic disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known sequences in the process of discovering other novel polynucleotides; for selecting and making oligomers for attachment to a "gene chip" or other support, including for examination of expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as an antigen to raise anti-DNA antibodies or elicit another immune response. Where the polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of the binding interaction.
The polypeptides provided by the present invention can similarly be used in assays to determine biological activity, including in a panel of multiple proteins for high-throughput screening; to raise antibodies or to elicit another immune response; as a reagent (including the labeled reagent) in assays designed to quantitatively determine levels of the protein (or its receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is preferentially expressed (either constitutively or at a particular stage of tissue differentiation or development or in a disease state); and, of course, to isolate correlative receptors or ligands. Proteins involved in these binding interactions can also be used to screen for peptide or small molecule inhibitors or agonists of the binding interaction.
Any or all of these research utilities are capable of being developed into reagent grade or kit format for commercialization as research products.
Methods for performing the uses listed above are well known to those skilled in the art. References disclosing such methods include without limitation "Molecular Cloning: A Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.
4.10.2 NUTRITIONAL USES
Polynucleotides and polypeptides of the present invention can also be used as nutritional sources or supplements. Such uses include without limitation use as a protein or amino acid supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In such cases the polypeptide or polynucleotide of the invention can be added to the feed of a particular organism or can be administered as a separate solid or liquid preparation, such as in the form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the polypeptide or polynucleotide of the invention can be added to the medium in or on which the microorganism is cultured.
4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION ACTIVITY
A polypeptide of the present invention may exhibit activity relating to cytokine, cell proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting) activity or may induce production of other cytokines in certain cell populations.
A polynucleotide of the invention can encode a polypeptide exhibiting such attributes.
Many protein factors discovered to date, including all known cytokines, have exhibited activity in one or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient confirmation of cytokine activity. The activity of therapeutic compositions of the present invention is evidenced by any one of a number of routine factor-dependent cell proliferation assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3, MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK, HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following:
Assays for T-cell or thymocyte proliferation include without limitation those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol. 145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli, et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994.
Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or thymocytes include, without limitation, those described in: Polyclonal T cell stimulation, Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.
Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells include, without limitation, those described in: Measurement of Human and Murine Interleukin 2 and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991; deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988; Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse and human interleukin 6--Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci. U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11--Bennett, F., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin 9--Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.
Assays for T-cell clone responses to antigens (which will identify, among others, proteins that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and cytokine production) include, without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7, Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.
4.10.4 STEM CELL GROWTH FACTOR ACTIVITY
A polypeptide of the present invention may exhibit stem cell growth factor activity and be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential state which would be useful for re-engineering damaged or diseased tissues, transplantation, manufacture of bio-pharmaceuticals and the development of bio-sensors.
The ability to produce large quantities of human cells has important working applications for the production of human proteins which currently must be obtained from non-human sources or donors, implantation of cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases; tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung.
It is contemplated that multiple different exogenous growth factors and/or cytokines may be administered in combination with the polypeptide of the invention to achieve the desired effect, including any of the growth factors listed herein, other stem cell maintenance factors, and specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand (Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast growth factor (bFGF).
Since totipotent stem cells can give rise to virtually any mature cell type, expansion of these cells in culture will facilitate the production of large quantities of mature cells.
Techniques for culturing stem cells are known in the art and administration of polypeptides of the invention, optionally with other growth factors and/or cytokines, is expected to enhance the survival and proliferation of the stem cell populations. This can be accomplished by direct administration of the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).
Stem cells themselves can be transfected with a polynucleotide of the invention to induce autocrine expression of the polypeptide of the invention. This will allow for generation of undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be differentiated into the desired mature cell types. These stable cell lines can also serve as a source of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for polymerase chain reaction experiments. These studies would allow for the isolation and identification of differentially expressed genes in stem cell populations that regulate stem cell proliferation and/or maintenance.
Expansion and maintenance of totipotent stem cell populations will be useful in the treatment of many pathological conditions. For example, polypeptides of the present invention may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition, the expanded stem cell populations can also be genetically altered for gene therapy purposes and to decrease host rejection of replacement tissues after grafting or implantation.
Expression of the polypeptide of the invention and its effect on stem cells can also be manipulated to achieve controlled differentiation of the stem cells into more differentiated cell types. A broadly applicable method of obtaining pure populations of a specific differentiated cell type from undifferentiated stem cell populations involves the use of a cell-type specific promoter driving a selectable marker. The selectable marker allows only cells of the desired type to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998)) or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al., Academic Press (1997)). Alternatively, directed differentiation of stem cells can be accomplished by culturing the stem cells in the presence of a differentiation factor such as retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the effects of endogenous stem cell factor activity and allow differentiation to proceed.
In vitro cultures of stem cells can be used to determine if the polypeptide of the invention exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder layer, as described by Thompson et al. Proc. Natl. Acad. Sci, U.S.A., 92: 7844-7848 (1995), in the presence of the polypeptide of the invention alone or in combination with other growth factors or cytokines. The ability of the polypeptide of the invention to induce stem cell proliferation is determined by colony formation on semi-solid support, e.g. as described by Bernstein et al., Blood, 77: 2316-2321 (1991).
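As a minimal sketch of how colony counts from a colony formation assay of this kind might be compared, the Python snippet below computes the fold change in mean colony number between plates cultured with the polypeptide and untreated control plates. The function name, the example counts and the choice of a simple fold change are illustrative assumptions and are not taken from the cited protocols.

```python
from statistics import mean


def colony_fold_change(treated_counts, control_counts):
    """Return mean treated colony count divided by mean control colony count."""
    if not treated_counts or not control_counts:
        raise ValueError("both groups need at least one plate count")
    control_mean = mean(control_counts)
    if control_mean == 0:
        raise ValueError("control mean is zero; fold change is undefined")
    return mean(treated_counts) / control_mean


# Hypothetical colony counts per plate (illustrative numbers only).
treated = [48, 52, 50]
control = [24, 26, 25]
print(f"fold change: {colony_fold_change(treated, control):.2f}")
```

A fold change above 1 would be consistent with a proliferative effect, although in practice replicate variability would also need to be assessed with an appropriate statistical test.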
4.10.5 HEMATOPOIESIS REGULATING ACTIVITY
A polypeptide of the present invention may be involved in regulation of hematopoiesis and, consequently, in the treatment of myeloid or lymphoid cell disorders.
Even marginal biological activity in support of colony forming cells or of factor-dependent cell lines indicates involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility, for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e., traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or treat consequent myelo-suppression; in supporting the growth and proliferation of megakaryocytes and consequently of platelets thereby allowing prevention or treatment of various platelet disorders such as thrombocytopenia, and generally for use in place of or complementary to platelet transfusions; and/or in supporting the growth and proliferation of hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as those usually treated with transplantation, including, without limitation, aplastic anemia and paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment post irradiation/chemotherapy, either in-vivo or ex-vivo (i.e., in conjunction with bone marrow transplantation or with peripheral progenitor cell transplantation (homologous or heterologous)) as normal cells or genetically manipulated for gene therapy.
Therapeutic compositions of the invention can be used in the following:
Suitable assays for proliferation and differentiation of various hematopoietic lines are cited above.
Assays for embryonic stem cell differentiation (which will identify, among others, proteins that influence embryonic differentiation hematopoiesis) include, without limitation, those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others, proteins that regulate lympho-hematopoiesis) include, without limitation, those described in: Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al., Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay, Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21, Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.
4.10.6 TISSUE GROWTH ACTIVITY
A polypeptide of the present invention also may be involved in bone, cartilage, tendon, ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue repair and replacement, and in healing of burns, incisions and ulcers.
A polypeptide of the present invention which induces cartilage and/or bone growth in circumstances where bone is not normally formed, has application in the healing of bone fractures and cartilage damage or defects in humans and other animals.
Compositions of a polypeptide, antibody, binding partner, or other modulator of the invention may have prophylactic use in closed as well as open fracture reduction and also in the improved fixation of artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is useful in cosmetic plastic surgery.
A polypeptide of this invention may also be involved in attracting bone-forming cells, stimulating growth of bone-forming cells, or inducing differentiation of progenitors of bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.) mediated by inflammatory processes may also be possible using the composition of the invention.
Another category of tissue regeneration activity that may involve the polypeptide of the present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or other tissue formation in circumstances where such tissue is not normally formed, has application in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by a composition of the present invention contributes to the repair of congenital, trauma induced, or other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for attachment or repair of tendons or ligaments. The compositions of the present invention may provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis, carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include an appropriate matrix and/or sequestering agent as a carrier as is well known in the art.
The compositions of the present invention may also be useful for proliferation of neural cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a composition may be used in the treatment of diseases of the peripheral nervous system, such as peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in accordance with the present invention include mechanical and traumatic disorders, such as spinal cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies resulting from chemotherapy or other medical therapies may also be treatable using a composition of the invention.
Compositions of the invention may also be useful to promote better or faster closure of non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular insufficiency, surgical and traumatic wounds, and the like.
Compositions of the present invention may also be involved in the generation or regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine, kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.
A composition of the present invention may also be useful for gut protection or regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and conditions resulting from systemic cytokine damage.
A composition of the present invention may also be useful for promoting or inhibiting differentiation of tissues described above from precursor tissues or cells; or for inhibiting the growth of tissues described above.
Therapeutic compositions of the invention can be used in the following:
Assays for tissue generation activity include, without limitation, those described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No. WO91/07491 (skin, endothelium).
Assays for wound healing activity include, without limitation, those described in:
Winter, Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol 71:382-84 (1978).
4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY
A polypeptide of the present invention may also exhibit immune stimulating or immune suppressing activity, including without limitation the activities for which assays are described herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A protein may be useful in the treatment of various immune deficiencies and disorders (including severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and proliferation of T and/or B lymphocytes, as well as effecting the cytolytic activity of NK cells and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g., HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be treatable using a protein of the present invention, including infections by HIV, hepatitis viruses, herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such as candidiasis. Of course, in this regard, proteins of the present invention may also be useful where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.
Autoimmune disorders which may be treated using a protein of the present invention include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus, rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome, autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof, including antibodies) of the present invention may also be useful in the treatment of allergic reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria, angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme, Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma (particularly allergic asthma) or other respiratory problems. Other conditions, in which immune suppression is desired (including, for example, organ transplantation), may also be treatable using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66, 1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al., J. Toxicol. Environ. Health 53: 563-79).
Using the proteins of the invention it may also be possible to modulate immune responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an immune response already in progress or may involve preventing the induction of an immune response. The functions of activated T cells may be inhibited by suppressing T cell responses or by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is generally an active, non-antigen-specific, process which requires continuous exposure of the T cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence of the tolerizing agent.
Down regulating or preventing one or more antigen functions (including without limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell function should result in reduced tissue destruction in tissue transplantation.
Typically, in tissue transplants, rejection of the transplant is initiated through its recognition as foreign by T cells, followed by an immune reaction that destroys the transplant. The administration of a therapeutic composition of the invention may prevent cytokine synthesis by immune cells, such as T cells, and thus acts as an immunosuppressant.
Moreover, a lack of costimulation may also be sufficient to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it may also be necessary to block the function of a combination of B lymphocyte antigens.
The efficacy of particular therapeutic compositions in preventing organ transplant rejection or GVHD can be assessed using animal models that are predictive of efficacy in humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci USA, 89:11102-11105 (1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic compositions of the invention on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are reactive against self tissue and which promote the production of cytokines and autoantibodies involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may reduce or eliminate disease symptoms.
Administration of reagents which block stimulation of T cells can be used to inhibit T cell activation and prevent production of autoantibodies or T cell-derived cytokines which may be involved in the disease process. Additionally, blocking reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to long-term relief from the disease.
The efficacy of blocking reagents in preventing or alleviating autoimmune disorders can be determined using a number of well-characterized animal models of human autoimmune diseases.
Examples include murine experimental autoimmune encephalitis, systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp. 840-856).
Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means of up regulating immune responses, may also be useful in therapy.
Upregulation of immune responses may be in the form of enhancing an existing immune response or eliciting an initial immune response. For example, enhancing an immune response may be useful in cases of viral infection, including systemic viral diseases such as influenza, the common cold, and encephalitis.
Alternatively, anti-viral immune responses may be enhanced in an infected patient by removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed APCs either expressing a peptide of the present invention or together with a stimulatory form of a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the patient. Another method of enhancing anti-viral immune responses would be to isolate infected cells from a patient, transfect them with a nucleic acid encoding a protein of the present invention as described herein such that the cells express all or a portion of the protein on their surface, and reintroduce the transfected cells into the patient. The infected cells would now be capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal to T cells to induce a T cell mediated immune response against the transfected tumor cells.
In addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding an antisense construct which blocks expression of an MHC class II associated protein, such as the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human subject may be sufficient to overcome tumor-specific tolerance in the subject.
The activity of a protein of the invention may, among other means, be measured by the following methods:
Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA 78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J. Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.
Assays for T-cell-dependent immunoglobulin responses and isotype switching (which will identify, among others, proteins that modulate T-cell dependent antibody responses and that affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J. Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production, Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.
Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins that generate predominantly Th1 and CTL responses) include, without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.
Dendritic cell-dependent assays (which will identify, among others, proteins expressed by dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine 173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology 67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation 94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.
Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte homeostasis) include, without limitation, those described in:
Darzynkiewicz et al., Cytometry 13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993;
Gorczyca et al., Cancer Research 53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991;
Zacharchuk, Journal of Immunology 145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993;
Gorczyca et al., International Journal of Oncology 1:639-648, 1992.
Assays for proteins that influence early steps of T-cell commitment and development include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al., Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995;
Toki et al., Proc. Nat. Acad Sci. USA 88:7548-7551, 1991.
4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related activities. A polynucleotide of the invention may encode a polypeptide exhibiting such characteristics. Inhibins are characterized by their ability to inhibit the release of follicle stimulating hormone (FSH), while activins are characterized by their ability to stimulate the release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention, alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive based on the ability of inhibins to decrease fertility in female mammals and decrease spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH release from cells of the anterior pituitary.
See, for example, U.S. Pat. No. 4,798,885. A polypeptide of the invention may also be useful for advancement of the onset of fertility in sexually immature mammals, so as to increase the lifetime reproductive performance of domestic animals such as, but not limited to, cows, sheep and pigs.
The activity of a polypeptide of the invention may, among other means, be measured by the following methods.
Assays for activin/inhibin activity include, without limitation, those described in: Vale et al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature 321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci. USA 83:3091-3095, 1986.
4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY
A polypeptide of the present invention may be involved in chemotactic or chemokinetic activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils, T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the invention can encode a polypeptide exhibiting such attributes.
Chemotactic and chemokinetic receptor activation can be used to mobilize or attract a desired cell population to a desired site of action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or modulators of the invention) provide particular advantages in treatment of wounds and other trauma to tissues, as well as in treatment of localized infections. For example, attraction of lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved immune responses against the tumor or infecting agent.
A protein or peptide has chemotactic activity for a particular cell population if it can stimulate, directly or indirectly, the directed orientation or movement of such cell population. Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells. Whether a particular protein has chemotactic activity for a population of cells can be readily determined by employing such protein or peptide in any known assay for cell chemotaxis.
Therapeutic compositions of the invention can be used in the following:
Assays for chemotactic activity (which will identify proteins that induce or prevent chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells across a membrane as well as the ability of a protein to induce the adhesion of one cell population to another cell population. Suitable assays for movement and adhesion include, without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines 6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146, 1995; Muller et al Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867, 1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994.
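For a membrane migration assay of the kind described above, the result is often summarized as a chemotactic index, the ratio of cells that migrate toward the test protein to cells that migrate toward medium alone. The Python sketch below illustrates that calculation. The function name, the example counts and the use of replicate means are illustrative assumptions and are not drawn from the cited references.

```python
from statistics import mean


def chemotactic_index(test_counts, medium_counts):
    """Ratio of mean migrated cells with test protein to mean with medium alone."""
    if not test_counts or not medium_counts:
        raise ValueError("both conditions need at least one replicate count")
    background = mean(medium_counts)
    if background <= 0:
        raise ValueError("background migration must be positive")
    return mean(test_counts) / background


# Hypothetical migrated-cell counts per well (illustrative numbers only).
with_protein = [120, 130, 125]
medium_alone = [50, 50, 50]
print(f"chemotactic index: {chemotactic_index(with_protein, medium_alone):.2f}")
```

An index near 1 would suggest no effect on migration. Distinguishing directed movement (chemotaxis) from increased random movement (chemokinesis) additionally requires wells in which the protein is present at equal concentration on both sides of the membrane.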
4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY
A polypeptide of the invention may also be involved in hemostasis or thrombolysis or thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Compositions may be useful in treatment of various coagulation disorders (including hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events in treating wounds resulting from trauma, surgery or other causes. A composition of the invention may also be useful for dissolving or inhibiting formation of thromboses and for treatment and prevention of conditions resulting therefrom (such as, for example, infarction of cardiac and central nervous system vessels (e.g., stroke)).
Therapeutic compositions of the invention can be used in the following:
Assays for hemostatic and thrombolytic activity include, without limitation, those described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res. 45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins 35:467-474, 1988.
4.10.11 CANCER DIAGNOSIS AND THERAPY
Polypeptides of the invention may be involved in cancer cell generation, proliferation or metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the invention may be useful for the diagnosis and/or prognosis of one or more types of cancer.
For example, the presence or increased expression of a polynucleotide/polypeptide of the invention may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy. Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer condition. Identification of single nucleotide polymorphisms associated with cancer or a predisposition to cancer may also be useful for diagnosis or prognosis.
Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth) and/or prohibiting metastasis by reducing tumor cell motility or invasiveness.
Therapeutic compositions of the invention may be effective in adult and pediatric oncology including in solid phase tumors/malignancies; locally advanced tumors; human soft tissue sarcomas; metastatic cancer, including lymphatic metastases; blood cell malignancies including multiple myeloma, acute and chronic leukemias, and lymphomas; head and neck cancers including mouth cancer, larynx cancer and thyroid cancer; lung cancers including small cell carcinoma and non-small cell cancers; breast cancers including small cell carcinoma and ductal carcinoma; gastrointestinal cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps associated with colorectal neoplasia; pancreatic cancers; liver cancer; urologic cancers including bladder cancer and prostate cancer; malignancies of the female genital tract including ovarian carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle; kidney cancers including renal cell carcinoma; brain cancers including intrinsic brain tumors, neuroblastoma, astrocytic brain tumors, gliomas, and metastatic tumor cell invasion in the central nervous system; bone cancers including osteomas; and skin cancers including malignant melanoma, tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma, hemangiopericytoma and Kaposi's sarcoma.
Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be administered to treat cancer. Therapeutic compositions can be administered in therapeutically effective dosages alone or in combination with adjuvant cancer therapy such as surgery, chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial effect, e.g. reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise improving overall clinical condition, without necessarily eradicating the cancer.
The composition can also be administered in therapeutically effective amounts as a portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine. Anti-cancer drugs that are well known in the art and can be used as a treatment in combination with the polypeptide or modulator of the invention include:
Actinomycin D, Aminoglutethimide, Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP), Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin, Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (VP16-213), Floxuridine, 5-Fluorouracil (5-FU), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog), Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna, Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl, Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate, Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin, Semustine, Teniposide, and Vindesine sulfate.
In addition, therapeutic compositions of the invention may be used for prophylactic treatment of cancer. There are hereditary conditions and/or environmental situations (e.g. exposure to carcinogens) known in the art that predispose an individual to developing cancers. Under these circumstances, it may be beneficial to treat these individuals with therapeutically effective doses of the polypeptide of the invention to reduce the risk of developing cancers.
In vitro models can be used to determine the effective doses of the polypeptide of the invention as a potential cancer treatment. These in vitro models include proliferation assays of cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21), tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30 (1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial cell migration as described in Ribatti et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al., Clin. Exp. Metastasis, 17:423-9 (1999), respectively.
Suitable tumor cell lines are available, e.g. from American Type Tissue Culture Collection catalogs.
4.10.12 RECEPTOR/LIGAND ACTIVITY
A polypeptide of the present invention may also demonstrate activity as a receptor, receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the invention can encode a polypeptide exhibiting such characteristics.
Examples of such receptors and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions and their ligands (including without limitation, cellular adhesion molecules (such as selectins, integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen recognition and development of cellular and humoral immune responses). Receptors and ligands are also useful for screening of potential peptide or small molecule inhibitors of the relevant receptor/ligand interaction. A protein of the present invention (including, without limitation, fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand interactions.
The activity of a polypeptide of the invention may, among other means, be measured by the following methods:
Suitable assays for receptor-ligand activity include without limitation those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28, Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22); Takai et al., Proc. Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988; Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods 175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a ligand(s) thereby transmitting the biological activity of that ligand(s).
Ligands may be identified through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel overlay assays, or other methods known in the art.
Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or partial antagonists require the use of other proteins as competing ligands.
The polypeptides of the present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes, colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of toxins include, but are not limited to, ricin.
4.10.13 DRUG SCREENING
This invention is particularly useful for screening chemical compounds by using the novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques. The polypeptides or fragments employed in such a test may either be free in solution, affixed to a solid support, borne on a cell surface or located intracellularly. One method of drug screening utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such transformed cells in competitive binding assays.
Such cells, either in viable or fixed form, can be used for standard binding assays. One may measure, for example, the formation of complexes between polypeptides of the invention or fragments and the agent being tested or examine the diminution in complex formation between the novel polypeptides and an appropriate cell line, which are well known in the art.
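As a rough illustration of how the diminution in complex formation described above might be quantified, the following Python sketch computes percent inhibition from a signal measured with and without a test agent. The function name, the background-subtraction step and the example readings are assumptions made for illustration only; the specification does not prescribe a particular calculation.

```python
def percent_inhibition(signal_with_agent, signal_without_agent, background=0.0):
    """Percent reduction in specific binding signal caused by a test agent.

    All three arguments are raw assay readings in the same arbitrary units
    (hypothetical; e.g. counts from a labeled binding partner).
    """
    specific_with = signal_with_agent - background
    specific_without = signal_without_agent - background
    if specific_without <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (1.0 - specific_with / specific_without)


# Hypothetical readings: 1100 units without agent, 300 with agent, 100 background.
inhibition = percent_inhibition(300.0, 1100.0, background=100.0)
print(f"{inhibition:.1f}% inhibition of complex formation")
```

A value near zero would suggest the agent does not compete for binding under the assumed conditions, while a value near 100 would suggest nearly complete displacement.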
Sources for test compounds that may be screened for ability to bind to or modulate (i.e., increase or decrease) the activity of polypeptides of the invention include (1) inorganic and organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries comprised of either random or mimetic peptides, oligonucleotides or organic molecules.
Chemical libraries may be readily synthesized or purchased from a number of commercial sources, and may include structural analogs of known compounds or compounds that are identified as "hits" or "leads" via natural product screening.

The sources of natural product libraries are microorganisms (including bacteria and fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine microorganisms or (2) extraction of the organisms themselves. Natural product libraries include polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a review, see Science 282:63-68 (1998).
Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or organic compounds and can be readily prepared by traditional automated synthesis methods, PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein, peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries. For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin. Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see Al-Obeidi et al., Mol. Biotechnol, 9(3):205-23 (1998); Hruby et al., Curr Opin Chem Biol, 1(1):114-19 (1997); Dorner et al., Bioorg Med Chem, 4(5):709-15 (1996) (alkylated dipeptides).
Identification of modulators through use of the various libraries described herein permits modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a polypeptide of the invention. The molecules identified in the binding assay are then tested for antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested for either cell/animal death or prolonged survival of the animal/cells.
The binding molecules thus identified may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding molecule complex is then targeted to a tumor or other cell by the specificity of the binding molecule for a polypeptide of the invention. Alternatively, the binding molecules may be complexed with imaging agents for targeting and imaging purposes.
4.10.14 ASSAY FOR RECEPTOR ACTIVITY
The invention also provides methods to detect specific binding of a polypeptide e.g. a ligand or a receptor. The art provides numerous assays particularly useful for identifying previously unknown binding partners for receptor polypeptides of the invention. For example, expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used to identify polynucleotides encoding binding partners. As another example, affinity chromatography with the appropriate immobilized polypeptide of the invention can be used to isolate polypeptides that recognize and bind polypeptides of the invention. There are a number of different libraries used for the identification of compounds, and in particular small molecules, that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention. Ligands for receptor polypeptides of the invention can also be identified by adding exogenous ligands, or cocktails of ligands, to two cell populations that are genetically identical except for the expression of the receptor of the invention: one cell population expresses the receptor of the invention whereas the other does not. The responses of the two cell populations to the addition of ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the polypeptide of the invention in cells and assayed for an autocrine response to identify potential ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known in the art can be used to identify binding partner polypeptides, including, (1) organic and inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries comprised of random peptides, oligonucleotides or organic molecules.
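The comparison of two otherwise identical cell populations described above can be sketched as a simple fold-change screen. The following Python sketch is illustrative only: the ligand names, response values, the fold-change threshold of 2.0 and the function name are all hypothetical and are not taken from the specification.

```python
def candidate_ligands(receptor_cells, control_cells, min_fold=2.0):
    """Return ligands whose response in receptor-expressing cells is at least
    min_fold times the response in control cells lacking the receptor.

    Both arguments map a ligand name to a measured response (arbitrary units).
    """
    hits = []
    for ligand, response in receptor_cells.items():
        baseline = control_cells[ligand]
        # Skip ligands with no measurable baseline to avoid dividing by zero.
        if baseline > 0 and response / baseline >= min_fold:
            hits.append(ligand)
    return sorted(hits)


# Hypothetical responses for three exogenous ligands.
receptor_cells = {"ligand_A": 9.0, "ligand_B": 1.1, "ligand_C": 4.2}
control_cells = {"ligand_A": 1.0, "ligand_B": 1.0, "ligand_C": 4.0}

print(candidate_ligands(receptor_cells, control_cells))
```

Under these assumed values, ligand_C produces a strong response in both populations and is therefore not flagged, which reflects the purpose of the receptor-negative control.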
The role of downstream intracellular signaling molecules in the signaling cascade of the polypeptide of the invention can be determined. For example, a chimeric protein in which the cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated with the ligand specific for the extracellular portion of the chimeric protein, thereby activating the chimeric receptor. Known downstream proteins involved in intracellular signaling can then be assayed for expected modifications, i.e., phosphorylation. Other methods known to those in the art can also be used to identify signaling molecules involved in receptor activity.
4.10.15 ANTI-INFLAMMATORY ACTIVITY
Compositions of the present invention may also exhibit anti-inflammatory activity.
The anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example, cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production of other factors which more directly inhibit or promote an inflammatory response. Compositions with such activities can be used to treat inflammatory conditions (including chronic or acute conditions), including without limitation inflammation associated with infection (such as septic shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury, endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease, or inflammation resulting from overproduction of cytokines such as TNF or IL-1. Compositions of the invention may also be useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this invention may be utilized to prevent or treat conditions such as, but not limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1, graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary disease, and other autoimmune or inflammatory disease; as an antiproliferative agent, such as for acute or chronic myelogenous leukemia; or in the prevention of premature labor secondary to intrauterine infections.
4.10.16 LEUKEMIAS
Leukemias and related disorders may be treated or prevented by administration of a therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the invention. Such leukemias and related disorders include but are not limited to acute leukemia, acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic, myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic (granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia).
4.10.17 NERVOUS SYSTEM DISORDERS
Nervous system disorders, involving cell types which can be tested for efficacy of intervention with compounds that modulate the activity of the polynucleotides and/or polypeptides of the invention, and which can be treated upon thus observing an indication of therapeutic utility, include but are not limited to nervous system injuries, and diseases or disorders which result in either a disconnection of axons, a diminution or degeneration of neurons, or demyelination. Nervous system lesions which may be treated in a patient (including human and non-human mammalian patients) according to the invention include but are not limited to the following lesions of either the central (including spinal cord, brain) or peripheral nervous systems:
(i) traumatic lesions, including lesions caused by physical injury or associated with surgery, for example, lesions which sever a portion of the nervous system, or compression injuries;
(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord infarction or ischemia;
(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured as a result of infection, for example, by an abscess or associated with infection by human immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease, tuberculosis, syphilis;
(iv) degenerative lesions, in which a portion of the nervous system is destroyed or injured as a result of a degenerative process including but not limited to degeneration associated with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral sclerosis;
(v) lesions associated with nutritional diseases or disorders, in which a portion of the nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease, tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus callosum), and alcoholic cerebellar degeneration;
(vi) neurological lesions associated with systemic diseases including but not limited to diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or sarcoidosis;
(vii) lesions caused by toxic substances including alcohol, lead, or particular neurotoxins; and
(viii) demyelinated lesions in which a portion of the nervous system is destroyed or injured by a demyelinating disease including but not limited to multiple sclerosis, human immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies, progressive multifocal leukoencephalopathy, and central pontine myelinolysis.
Therapeutics which are useful according to the invention for treatment of a nervous system disorder may be selected by testing for biological activity in promoting the survival or differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit any of the following effects may be useful according to the invention:
(i) increased survival time of neurons in culture;
(ii) increased sprouting of neurons in culture or in vivo;
(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g., choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or
(iv) decreased symptoms of neuron dysfunction in vivo.
Such effects may be measured by any method known in the art. In preferred, non-limiting embodiments, increased survival of neurons may be measured by the method set forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al. (1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc., depending on the molecule to be measured; and motor neuron dysfunction may be measured by assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron conduction velocity, or functional disability.
In specific embodiments, motor neuron disorders that may be treated according to the invention include but are not limited to disorders such as infarction, infection, exposure to toxin, trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as well as other components of the nervous system, as well as disorders that selectively affect neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome), poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy (Charcot-Marie-Tooth Disease).
4.10.18 OTHER ACTIVITIES
A polypeptide of the invention may also exhibit one or more of the following additional activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents, including, without limitation, bacteria, viruses, fungi and other parasites;
affecting (suppressing or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape (such as, for example, breast augmentation or diminution, change in bone form or shape); affecting biorhythms or circadian cycles or rhythms;
affecting the fertility of male or female subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other nutritional factors or component(s);
affecting behavioral characteristics, including, without limitation, appetite, libido, stress, cognition (including cognitive disorders), depression (including depressive disorders) and violent behaviors; providing analgesic effects or other pain reducing effects;
promoting differentiation and growth of embryonic stem cells in lineages other than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting deficiencies of the enzyme and treating deficiency-related diseases; treatment of hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such as, for example, the ability to bind antigens or complement); and the ability to act as an antigen in a vaccine composition to raise an immune response against such protein or another material or entity which is cross-reactive with such protein.
4.10.19 IDENTIFICATION OF POLYMORPHISMS
The demonstration of polymorphisms makes possible the identification of such polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or susceptibility to various disease states (such as disorders involving inflammation or immune response) or a differential response to drug administration, and this genetic information can be used to tailor preventive or therapeutic treatment appropriately.
For example, the existence of a polymorphism associated with a predisposition to inflammation or autoimmune disease makes possible the diagnosis of this condition in humans by identifying the presence of the polymorphism.
Polymorphisms can be identified in a variety of ways known in the art which all generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally involving isolation or amplification of the DNA, and identifying the presence of the polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately adjacent to the position of the polymorphism is extended with one or more labeled nucleotides). In addition, traditional restriction fragment length polymorphism analysis (using restriction enzymes that provide differential digestion of the genomic DNA depending on the presence or absence of the polymorphism) may be performed.
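As a rough illustration of the restriction fragment length approach described above, the following Python sketch digests two hypothetical alleles in silico and compares the resulting fragment lengths. The two short sequences, the choice of the EcoRI recognition site (GAATTC, cut between G and AATTC) and the function name are assumptions made for illustration and are not sequences of the invention; only one strand is scanned.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting at every occurrence of `site`.

    cut_offset is the position within the recognition site where the strand
    is cut (1 for EcoRI, which cuts between G and AATTC).
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    # Remaining sequence after the last cut (or the whole sequence if uncut).
    fragments.append(len(sequence) - start)
    return fragments


# Hypothetical alleles differing by a single base inside the recognition site.
allele_a = "ATGCCGAATTCTTAGGCA"  # site present
allele_b = "ATGCCGAGTTCTTAGGCA"  # site destroyed by an A-to-G change

print("allele A fragments:", digest(allele_a))
print("allele B fragments:", digest(allele_b))
```

In this sketch the allele that retains the site yields two fragments and the variant allele yields one, which is the kind of differential digestion pattern the analysis relies on.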
Arrays with nucleotide sequences of the present invention can be used to detect polymorphisms. The array can comprise modified nucleotide sequences of the present invention in order to detect the nucleotide sequences of the present invention. In the alternative, any one of the nucleotide sequences of the present invention can be placed on the array to detect changes from those sequences.
Alternatively, a polymorphism resulting in a change in the amino acid sequence could also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g., by an antibody specific to the variant sequence.
4.10.20 ARTHRITIS AND INFLAMMATION
The immunosuppressive effects of the compositions of the invention against rheumatoid arthritis are determined in an experimental animal model system. The experimental model system is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983, Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129. Induction of the disease can be caused by a single injection, generally intradermally, of a suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The route of injection can vary, but rats may be injected at the base of the tail with an adjuvant mixture. The polypeptide is administered in phosphate buffered solution (PBS) at a dose of about 1-5 mg/kg. The control consists of administering PBS only.
The procedure for testing the effects of the test compound would consist of intradermally injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and 24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as described by J. Holoshitz above. An analysis of the data would reveal that the test compound would have a dramatic effect on the swelling of the joints as measured by a decrease of the arthritis score.
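The treatment and scoring schedule described above can be laid out programmatically. The following Python sketch is illustrative only: it assumes that days are counted from the adjuvant injection (day 0), and the per-animal arthritis scores are invented placeholder values, not experimental results.

```python
# Days are counted from the adjuvant injection (day 0); this numbering is an assumption.
TREATMENT_DAYS = list(range(0, 25, 2))   # day 0, then every other day through day 24
SCORING_DAYS = [14, 15, 18, 20, 22, 24]  # scoring days named in the text


def mean_scores(scores_by_animal):
    """Average arthritis score on each scoring day across all animals in a group."""
    n_animals = len(scores_by_animal)
    return [
        sum(animal[i] for animal in scores_by_animal) / n_animals
        for i in range(len(SCORING_DAYS))
    ]


def score_reduction(control, treated):
    """Difference in mean score (control minus treated) on each scoring day."""
    return [c - t for c, t in zip(mean_scores(control), mean_scores(treated))]


# Placeholder scores, one list per animal, one value per scoring day (hypothetical).
pbs_control = [[4, 6, 8, 10, 10, 12], [6, 6, 10, 10, 12, 12]]
test_compound = [[2, 2, 4, 4, 4, 6], [2, 4, 4, 6, 6, 6]]

for day, diff in zip(SCORING_DAYS, score_reduction(pbs_control, test_compound)):
    print(f"day {day}: mean score lower by {diff:.1f} in treated group")
```

A positive difference on a given day corresponds to a lower mean arthritis score in the treated group than in the PBS control group.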
4.11 THERAPEUTIC METHODS

The compositions (including polypeptide fragments, analogs, variants and antibodies or other binding partners or modulators including antisense polynucleotides) of the invention have numerous applications in a variety of therapeutic methods. Examples of therapeutic applications include, but are not limited to, those exemplified herein.
4.11.1 EXAMPLE
One embodiment of the invention is the administration of an effective amount of the polypeptides or other composition of the invention to individuals affected by a disease or disorder that can be modulated by regulating the peptides of the invention.
While the mode of administration is not particularly important, parenteral administration is preferred. An exemplary mode of administration is to deliver an intravenous bolus. The dosage of the polypeptides or other composition of the invention will normally be determined by the prescribing physician. It is to be expected that the dosage will vary according to the age, weight, condition and response of the individual patient. Typically, the amount of polypeptide administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral administration, polypeptides of the invention will be formulated in an injectable form combined with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting of small amounts of the human serum albumin.
The vehicle may contain minor amounts of additives that maintain the isotonicity and stability of the polypeptide or other active ingredient. The preparation of such solutions is within the skill of the art.
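To make the weight-based ranges above concrete, the following Python sketch converts the stated per-kilogram ranges into per-dose amounts for a given body weight. It is a unit-conversion illustration only; the 70 kg body weight and the function and constant names are hypothetical, and the text leaves actual dosage to the prescribing physician.

```python
UG_PER_MG = 1000.0

# Ranges stated in the text, expressed in micrograms per kilogram of body weight.
BROAD_RANGE_UG_PER_KG = (0.01, 100 * UG_PER_MG)      # 0.01 ug/kg to 100 mg/kg
PREFERRED_RANGE_UG_PER_KG = (0.1, 10 * UG_PER_MG)    # 0.1 ug/kg to 10 mg/kg


def dose_range_mg(weight_kg, range_ug_per_kg):
    """Return (low, high) amount of polypeptide per dose, in milligrams."""
    low, high = range_ug_per_kg
    return (low * weight_kg / UG_PER_MG, high * weight_kg / UG_PER_MG)


weight_kg = 70.0  # hypothetical body weight
for label, rng in [("broad", BROAD_RANGE_UG_PER_KG),
                   ("preferred", PREFERRED_RANGE_UG_PER_KG)]:
    low_mg, high_mg = dose_range_mg(weight_kg, rng)
    print(f"{label} range: {low_mg:.4g} mg to {high_mg:.4g} mg per dose")
```

Keeping both limits in a single unit before scaling by body weight avoids mixing microgram and milligram values, which the stated ranges span.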
4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF ADMINISTRATION
A protein or other composition of the present invention (from whatever source derived, including without limitation from recombinant and non-recombinant sources and including antibodies and other binding partners of the polypeptides of the invention) may be administered to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition may optionally contain (in addition to protein or other active ingredient and a carrier) diluents, fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term "pharmaceutically acceptable" means a non-toxic material that does not interfere with the effectiveness of the biological activity of the active ingredient(s). The characteristics of the carrier will depend on the route of administration.
The pharmaceutical composition of the invention may also contain cytokines, lymphokines, or other hematopoietic factors such as M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell factor, and erythropoietin. In further compositions, proteins of the invention may be combined with other agents beneficial to the treatment of the disease or disorder in question. These agents include various growth factors such as epidermal growth factor (EGF), platelet-derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor (IGF), as well as cytokines described herein.
The pharmaceutical composition may further contain other agents which either enhance the activity of the protein or other active ingredient or complement its activity or use in treatment. Such additional factors and/or agents may be included in the pharmaceutical composition to produce a synergistic effect with protein or other active ingredient of the invention, or to minimize side effects. Conversely, protein or other active ingredient of the present invention may be included in formulations of the particular clotting factor, cytokine, lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein of the present invention may be active in multimers (e.g., heterodimers or homodimers) or complexes with itself or other proteins.
As a result, pharmaceutical compositions of the invention may comprise a protein of the invention in such multimeric or complexed form.
As an alternative to being included in a pharmaceutical composition of the invention including a first protein, a second protein or a therapeutic agent may be concurrently administered with the first protein (e.g., at the same time, or at differing times provided that therapeutic concentrations of the combination of agents is achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest edition. A therapeutically effective dose further refers to that amount of the compound sufficient to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the relevant medical condition, or an increase in rate of treatment, healing, prevention or amelioration of such conditions. When applied to an individual active ingredient, administered alone, a therapeutically effective dose refers to that ingredient alone. When applied to a combination, a therapeutically effective dose refers to combined amounts of the active ingredients that result in the therapeutic effect, whether administered in combination, serially or simultaneously.
In practicing the method of treatment or use of the present invention, a therapeutically effective amount of protein or other active ingredient of the present invention is administered to a mammal having a condition to be treated. Protein or other active ingredient of the present invention may be administered in accordance with the method of the invention either alone or in combination with other therapies such as treatments employing cytokines, lymphokines or other hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other hematopoietic factors, protein or other active ingredient of the present invention may be administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially, the attending physician will decide on the appropriate sequence of administering protein or other active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other hematopoietic factor(s), thrombolytic or anti-thrombotic factors.
4.12.1 ROUTES OF ADMINISTRATION
Suitable routes of administration may, for example, include oral, rectal, transmucosal, or intestinal administration; parenteral delivery, including intramuscular, subcutaneous, intramedullary injections, as well as intrathecal, direct intraventricular, intravenous, intraperitoneal, intranasal, or intraocular injections.
Administration of protein or other active ingredient of the present invention used in the pharmaceutical composition or to practice the method of the present invention can be carried out in a variety of conventional ways, such as oral ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral or intravenous injection. Intravenous administration to the patient is preferred.
Alternately, one may administer the compound in a local rather than systemic manner, for example, via injection of the compound directly into arthritic joints or fibrotic tissue, often in a depot or sustained release formulation. In order to prevent the scarring process frequently occurring as a complication of glaucoma surgery, the compounds may be administered topically, for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery system, for example, in a liposome coated with a specific antibody, targeting, for example, arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the afflicted tissue.
The polypeptides of the invention are administered by any route that delivers an effective dosage to the desired site of action. The determination of a suitable route of administration and an effective dosage for a particular indication is within the level of skill in the art. Preferably for wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage ranges for the polypeptides of the invention can be extrapolated from these dosages or from similar studies in appropriate animal models.
Dosages can then be adjusted as necessary by the clinician to provide maximal therapeutic benefit.
4.12.2 COMPOSITIONS/FORMULATIONS
Pharmaceutical compositions for use in accordance with the present invention thus may be formulated in a conventional manner using one or more physiologically acceptable carriers comprising excipients and auxiliaries which facilitate processing of the active compounds into preparations which can be used pharmaceutically. These pharmaceutical compositions may be manufactured in a manner that is itself known, e.g., by means of conventional mixing, dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or lyophilizing processes. Proper formulation is dependent upon the route of administration chosen. When a therapeutically effective amount of protein or other active ingredient of the present invention is administered orally, protein or other active ingredient of the present invention will be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form, the pharmaceutical composition of the invention may additionally contain a solid carrier such as a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or other active ingredient of the present invention, and preferably from about 25 to 90% protein or other active ingredient of the present invention. When administered in liquid form, a liquid carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil, soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the pharmaceutical composition may further contain physiological saline solution, dextrose or other saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol. When administered in liquid form, the pharmaceutical composition contains from about 0.5 to 90% by weight of protein or other active ingredient of the present invention, and preferably from about 1 to 50% protein or other active ingredient of the present invention.
When a therapeutically effective amount of protein or other active ingredient of the present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally acceptable aqueous solution. The preparation of such parenterally acceptable protein or other active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or subcutaneous injection should contain, in addition to protein or other active ingredient of the present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection, Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or other vehicle as known in the art. The pharmaceutical composition of the present invention may also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions, preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution, or physiological saline buffer. For transmucosal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art.
For oral administration, the compounds can be formulated readily by combining the active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be treated. Pharmaceutical preparations for oral use can be obtained by combining the active compounds with a solid excipient, optionally grinding the resulting mixture, and processing the mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic, talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures.
Dyestuffs or pigments may be added to the tablets or dragee coatings for identification or to characterize different combinations of active compound doses.
Pharmaceutical preparations which can be used orally include push-fit capsules made of gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, stabilizers may be added. All formulations for oral administration should be in dosages suitable for such administration. For buccal administration, the compositions may take the form of tablets or lozenges formulated in conventional manner.
For administration by inhalation, the compounds for use according to the present invention are conveniently delivered in the form of an aerosol spray presentation from pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in an inhaler or insufflator may be formulated containing a powder mix of the compound and a suitable powder base such as lactose or starch. The compounds may be formulated for parenteral administration by injection, e.g., by bolus injection or continuous infusion. Formulations for injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with an added preservative. The compositions may take such forms as suspensions, solutions or emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, stabilizing and/or dispersing agents.
Pharmaceutical formulations for parenteral administration include aqueous solutions of the active compounds in water-soluble form. Additionally, suspensions of the active compounds may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or triglycerides, or liposomes. Aqueous injection suspensions may contain substances which increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or dextran. Optionally, the suspension may also contain suitable stabilizers or agents which increase the solubility of the compounds to allow for the preparation of highly concentrated solutions. Alternatively, the active ingredient may be in powder form for constitution with a suitable vehicle, e.g., sterile pyrogen-free water, before use.
The compounds may also be formulated in rectal compositions such as suppositories or retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other glycerides. In addition to the formulations described previously, the compounds may also be formulated as a depot preparation. Such long acting formulations may be administered by implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as sparingly soluble derivatives, for example, as a sparingly soluble salt.
A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and an aqueous phase. The co-solvent system may be the VPD co-solvent system.
VPD is a solution of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system (VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic administration. Naturally, the proportions of a co-solvent system may be varied considerably without destroying its solubility and toxicity characteristics.
Furthermore, the identity of the co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity.
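The dilution arithmetic behind the VPD:5W system can be sketched as follows. This is an illustrative calculation only: it assumes that "diluted 1:1" means equal volumes of VPD and 5% dextrose in water, and that volumes are additive on mixing. The function name `mix` and the dictionary names are hypothetical.

```python
# Nominal VPD composition from the text, in % w/v (g per 100 mL).
VPD = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}

# 5% dextrose in water, in % w/v.
D5W = {"dextrose": 5.0}


def mix(a, b, parts_a=1.0, parts_b=1.0):
    """Return the % w/v of each component after mixing two solutions.

    Assumes additive volumes, which is an idealization for
    ethanol/water mixtures.
    """
    total = parts_a + parts_b
    out = {}
    for name, pct in a.items():
        out[name] = out.get(name, 0.0) + pct * parts_a / total
    for name, pct in b.items():
        out[name] = out.get(name, 0.0) + pct * parts_b / total
    return out


# Equal-volume (1:1) dilution, as assumed above.
vpd_5w = mix(VPD, D5W)
for component, pct in vpd_5w.items():
    print(f"{component}: {pct:g}% w/v")
```

Under these assumptions each VPD component is halved in the final system, and dextrose ends up at half its starting concentration. Changing `parts_a` and `parts_b` shows how the proportions shift when the co-solvent ratio is varied.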
Additionally, the compounds may be delivered using a sustained-release system, such as semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent.
Various types of sustained-release materials have been established and are well known by those skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the compounds for a few weeks up to over 100 days. Depending on the chemical nature and the biological stability of the therapeutic reagent, additional strategies for protein or other active ingredient stabilization may be employed.
The pharmaceutical compositions also may comprise suitable solid or gel phase carriers or excipients. Examples of such carriers or excipients include but are not limited to calcium carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and polymers such as polyethylene glycols. Many of the active ingredients of the invention may be provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically acceptable base addition salts are those salts which retain the biological effectiveness and properties of the free acids and which are obtained by reaction with inorganic or organic bases such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine, monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and the like.
The pharmaceutical composition of the invention may be in the form of a complex of the protein(s) or other active ingredient(s) of the present invention along with protein or peptide antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following presentation of the antigen by MHC proteins. MHC and structurally related proteins including those encoded by class I and class II MHC genes on host cells will serve to present the peptide antigen(s) to T lymphocytes. The antigen components could also be supplied as purified MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells. Alternatively antibodies able to bind surface immunoglobulin and other molecules on B cells as well as antibodies able to bind the TCR and other molecules on T cells can be combined with the pharmaceutical composition of the invention.
The pharmaceutical composition of the invention may be in the form of a liposome in which protein of the present invention is combined, in addition to other pharmaceutically acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution.
Suitable lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides, sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like.
Preparation of such liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated herein by reference.
The amount of protein or other active ingredient of the present invention in the pharmaceutical composition of the present invention will depend upon the nature and severity of the condition being treated, and on the nature of prior treatments which the patient has undergone. Ultimately, the attending physician will decide the amount of protein or other active ingredient of the present invention with which to treat each individual patient.
Initially, the attending physician will administer low doses of protein or other active ingredient of the present invention and observe the patient's response. Larger doses of protein or other active ingredient of the present invention may be administered until the optimal therapeutic effect is obtained for the patient, and at that point the dosage is not increased further. It is contemplated that the various pharmaceutical compositions used to practice the method of the present invention should contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg body weight. For compositions of the present invention which are useful for bone, cartilage, tendon or ligament regeneration, the therapeutic method includes administering the composition topically, systemically, or locally as an implant or device. When administered, the therapeutic composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable form. Further, the composition may desirably be encapsulated or injected in a viscous form for delivery to the site of bone, cartilage or tissue damage.
Topical administration may be suitable for wound healing and tissue repair.
Therapeutically useful agents other than a protein or other active ingredient of the invention which may also optionally be included in the composition as described above, may alternatively or additionally, be administered simultaneously or sequentially with the composition in the methods of the invention. Preferably for bone and/or cartilage formation, the composition would include a matrix capable of delivering the protein-containing or other active ingredient-containing composition to the site of bone and/or cartilage damage, providing a structure for the developing bone and cartilage and optimally capable of being resorbed into the body. Such matrices may be formed of materials presently in use for other implanted medical applications.

The choice of matrix material is based on biocompatibility, biodegradability, mechanical properties, cosmetic appearance and interface properties. The particular application of the compositions will define the appropriate formulation.
Potential matrices for the compositions may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate, hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides.
Other potential materials are biodegradable and biologically well-defined, such as bone or dermal collagen. Further matrices are comprised of pure proteins or extracellular matrix components. Other potential matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass, aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above-mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and tricalcium phosphate. The bioceramics may be altered in composition, such as in calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns. In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl cellulose or autologous blood clot, to prevent the protein compositions from disassociating from the matrix.
A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses (including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose, hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose (CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate, poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol). The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on total formulation weight, which represents the amount necessary to prevent desorption of the protein from the polymer matrix and to provide appropriate handling of the composition, yet not so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the protein the opportunity to assist the osteogenic activity of the progenitor cells. In further compositions, proteins or other active ingredients of the invention may be combined with other agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in question.
These agents include various growth factors such as epidermal growth factor (EGF), platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and insulin-like growth factor (IGF).
The therapeutic compositions are also presently valuable for veterinary applications.
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired patients for such treatment with proteins or other active ingredients of the present invention.
The dosage regimen of a protein-containing pharmaceutical composition to be used in tissue regeneration will be determined by the attending physician considering various factors which modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g., bone), the patient's age, sex, and diet, the severity of any infection, time of administration and other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and with inclusion of other proteins in the pharmaceutical composition.
For example, the addition of other known growth factors, such as IGF I (insulin like growth factor I), to the final composition, may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline labeling.
Polynucleotides of the present invention can also be used for gene therapy.
Such polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a mammalian subject. Polynucleotides of the invention may also be administered by other known methods for introduction of nucleic acid into a cell or organism (including, without limitation, in the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of proteins of the present invention in order to proliferate or to produce a desired effect on or activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.
4.12.3 EFFECTIVE DOSAGE
Pharmaceutical compositions suitable for use in the present invention include compositions wherein the active ingredients are contained in an effective amount to achieve their intended purpose. More specifically, a therapeutically effective amount means an amount effective to prevent development of or to alleviate the existing symptoms of the subject being treated. Determination of the effective amount is well within the capability of those skilled in the art, especially in light of the detailed disclosure provided herein. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a circulating concentration range that can be used to more accurately determine useful doses in humans. For example, a dose can be formulated in animal models to achieve a circulating concentration range that includes the IC50 as determined in cell culture (i.e., the concentration of the test compound which achieves a half maximal inhibition of the protein's biological activity). Such information can be used to more accurately determine useful doses in humans.
A therapeutically effective dose refers to that amount of the compound that results in amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population).
The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred. The data obtained from these cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. The exact formulation, route of administration and dosage can be chosen by the individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The Pharmacological Basis of Therapeutics", Ch. 1 p.1.
Dosage amount and interval may be adjusted individually to provide plasma levels of the active moiety which are sufficient to maintain the desired effects, or minimal effective concentration (MEC). The MEC will vary for each compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will depend on individual characteristics and route of administration. However, HPLC assays or bioassays can be used to determine plasma concentrations.
Dosage intervals can also be determined using the MEC value. Compounds should be administered using a regimen which maintains plasma levels above the MEC for 10-90% of the time, preferably between 30-90% and most preferably between 50-90%. In cases of local administration or selective uptake, the effective local concentration of the drug may not be related to plasma concentration.
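The two quantities discussed above, the therapeutic index (LD50/ED50) and the fraction of time that plasma levels stay above the MEC, can be illustrated with a short sketch. All numeric values below are hypothetical and chosen only to show the arithmetic. The function names are not from the source, and linear interpolation between plasma samples is an assumption of this example.

```python
def therapeutic_index(ld50, ed50):
    """Ratio of the dose lethal to 50% of a population to the dose
    therapeutically effective in 50% (same units for both)."""
    if ed50 <= 0:
        raise ValueError("ED50 must be positive")
    return ld50 / ed50


def fraction_of_time_above(times, concs, mec):
    """Fraction of the sampled interval with concentration at or above mec.

    Concentrations are linearly interpolated between samples, which is
    an assumption of this sketch rather than a pharmacokinetic model.
    """
    if len(times) != len(concs) or len(times) < 2:
        raise ValueError("need at least two matching time/concentration samples")
    time_above = 0.0
    points = list(zip(times, concs))
    for (t0, c0), (t1, c1) in zip(points, points[1:]):
        if c0 >= mec and c1 >= mec:
            # Whole segment is at or above the MEC.
            time_above += t1 - t0
        elif c0 >= mec or c1 >= mec:
            # Segment crosses the MEC once; find the crossing time.
            t_cross = t0 + (mec - c0) / (c1 - c0) * (t1 - t0)
            time_above += (t_cross - t0) if c0 >= mec else (t1 - t_cross)
    return time_above / (times[-1] - times[0])


# Hypothetical animal-study values, in mg/kg.
ti = therapeutic_index(ld50=500.0, ed50=25.0)  # 20.0

# Hypothetical plasma samples over one 8-hour dosing interval.
hours = [0.0, 2.0, 4.0, 6.0, 8.0]
plasma = [0.0, 10.0, 6.0, 2.0, 0.0]
frac = fraction_of_time_above(hours, plasma, mec=4.0)  # about 0.525

# Compare against the 10-90% window stated in the text.
within_window = 0.10 <= frac <= 0.90
```

With these made-up samples the level is above the MEC for roughly half of the interval, which falls inside both the 10-90% window and the most preferred 50-90% window.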

An exemplary dosage regimen for polypeptides or other compositions of the invention will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter intervals.
The amount of composition administered will, of course, be dependent on the subject being treated, on the subject's age and weight, the severity of the affliction, the manner of administration and the judgment of the prescribing physician.
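As a worked illustration of the weight-based ranges stated above, the following sketch converts the exemplary and preferred daily ranges into absolute amounts for a given body weight. It shows the unit arithmetic only. The 70 kg body weight is a hypothetical input, and the constant and function names are not from the source.

```python
UG_PER_MG = 1000.0

# Daily ranges from the text, expressed in micrograms per kg body weight.
EXEMPLARY_RANGE_UG_PER_KG = (0.01, 100.0 * UG_PER_MG)  # 0.01 ug/kg to 100 mg/kg
PREFERRED_RANGE_UG_PER_KG = (0.1, 25.0 * UG_PER_MG)    # 0.1 ug/kg to 25 mg/kg


def daily_dose_range_mg(weight_kg, range_ug_per_kg):
    """Return (low, high) total daily amounts in mg for a body weight in kg."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_ug_per_kg
    return (low * weight_kg / UG_PER_MG, high * weight_kg / UG_PER_MG)


# Hypothetical 70 kg adult.
exemplary = daily_dose_range_mg(70.0, EXEMPLARY_RANGE_UG_PER_KG)
preferred = daily_dose_range_mg(70.0, PREFERRED_RANGE_UG_PER_KG)
print(f"exemplary: {exemplary[0]:g} mg to {exemplary[1]:g} mg per day")
print(f"preferred: {preferred[0]:g} mg to {preferred[1]:g} mg per day")
```

Keeping both ends of each range in micrograms per kilogram avoids mixing units before the conversion to milligrams. The span of several orders of magnitude reflects the breadth of the stated ranges, not a property of the calculation.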
4.12.4 PACKAGING
The compositions may, if desired, be presented in a pack or dispenser device which may contain one or more unit dosage forms containing the active ingredient.
The pack may, for example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may be accompanied by instructions for administration. Compositions comprising a compound of the invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an appropriate container, and labeled for treatment of an indicated condition.
4.13 ANTIBODIES
Also included in the invention are antibodies to proteins, or fragments of proteins of the invention. The term "antibody" as used herein refers to immunoglobulin molecules and immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain an antigen-binding site that specifically binds (immunoreacts with) an antigen. Such antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule obtained from humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another by the nature of the heavy chain present in the molecule.
Certain classes have subclasses as well, such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or a lambda chain. Reference herein to antibodies includes a reference to all such classes, subclasses and types of human antibody species.
An isolated related protein of the invention may be intended to serve as an antigen, or a portion or fragment thereof, and additionally can be used as an immunogen to generate antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the invention provides antigenic peptide fragments of the antigen for use as immunogens. An antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence of the full length protein, such as an amino acid sequence shown in SEQ ID NO: 1042-2082, or 2535-2986, or Tables 3, 5, 6, or 8, and encompasses an epitope thereof such that an antibody raised against the peptide forms a specific immune complex with the full length protein or with any fragment that contains the epitope.
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred epitopes encompassed by the antigenic peptide are regions of the protein that are located on its surface; commonly these are hydrophilic regions.
In certain embodiments of the invention, at least one epitope encompassed by the antigenic peptide is a surface region of the protein, e.g., a hydrophilic region. A hydrophobicity analysis of the human related protein sequence will indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely to encode surface residues useful for targeting antibody production. As a means for targeting antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity may be generated by any method well known in the art, including, for example, the Kyte Doolittle or the Hopp Woods methods, either with or without Fourier transformation. See, e.g., Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J. Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.
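A hydropathy profile of the kind described above can be sketched as a sliding-window average over the published Kyte-Doolittle hydropathy scale, on which negative values indicate hydrophilic residues. The function names, the window length of 7, and the test sequence below are illustrative assumptions; no Fourier transformation is applied, and the sequence is not one of the sequences of the invention.

```python
# Kyte-Doolittle hydropathy scale (Kyte and Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Return the mean hydropathy of every window along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]  # KeyError on unknown residues
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def most_hydrophilic_window(sequence, window=7):
    """Return (start index, peptide, mean score) of the lowest-scoring window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


# Hypothetical sequence: a charged stretch flanked by alanine runs.
example = "AAAAAAA" + "RKDEKRD" + "AAAAAAA"
print(most_hydrophilic_window(example))
```

The lowest-scoring windows are candidate surface regions, which would still need to be confirmed experimentally as useful epitopes.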
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives, fragments, analogs or homologs thereof, are also provided herein.
A protein of the invention, or a derivative, fragment, analog, homolog or ortholog thereof, may be utilized as an immunogen in the generation of antibodies that immunospecifically bind these protein components.
The term "specific for" indicates that the variable regions of the antibodies of the invention recognize and bind polypeptides of the invention exclusively (i.e., able to distinguish the polypeptide of the invention from other similar polypeptides despite sequence identity, homology, or similarity found in the family of polypeptides), but may also interact with other proteins (for example, S. aureus protein A or other antibodies in ELISA
techniques) through interactions with sequences outside the variable region of the antibodies, and in particular, in the constant region of the molecule. Screening assays to determine binding specificity of an antibody of the invention are well known and routinely practiced in the art. For a comprehensive discussion of such assays, see Harlow et al.
(Eds), Antibodies A Laboratory Manual; Cold Spring Harbor Laboratory; Cold Spring Harbor, NY
(1988), Chapter 6. Antibodies that recognize and bind fragments of the polypeptides of the invention are also contemplated, provided that the antibodies are first and foremost specific for, as defined above, full-length polypeptides of the invention. As with antibodies that are specific for full length polypeptides of the invention, antibodies of the invention that recognize fragments are those which can distinguish polypeptides from the same family of polypeptides despite inherent sequence identity, homology, or similarity found in the family of proteins.
Antibodies of the invention are useful for, for example, therapeutic purposes (by modulating activity of a polypeptide of the invention), diagnostic purposes to detect or quantitate a polypeptide of the invention, as well as purification of a polypeptide of the invention. Kits comprising an antibody of the invention for any of the purposes described herein are also comprehended. In general, a kit of the invention also includes a control antigen for which the antibody is immunospecific. The invention further provides a hybridoma that produces an antibody according to the invention. Antibodies of the invention are useful for detection and/or purification of the polypeptides of the invention.
Monoclonal antibodies binding to the protein of the invention may be useful diagnostic agents for the immunodetection of the protein. Neutralizing monoclonal antibodies binding to the protein may also be useful therapeutics for both conditions associated with the protein and also in the treatment of some forms of cancer where abnormal expression of the protein is involved. In the case of cancerous cells or leukemic cells, neutralizing monoclonal antibodies against the protein may be useful in detecting and preventing the metastatic spread of the cancerous cells, which may be mediated by the protein.
The labeled antibodies of the present invention can be used for in vitro, in vivo, and in situ assays to identify cells or tissues in which a fragment of the polypeptide of interest is expressed. The antibodies may also be used directly in therapies or other diagnostics. The present invention further provides the above-described antibodies immobilized on a solid support. Examples of such solid supports include plastics such as polycarbonate, complex carbohydrates such as agarose and Sepharose®, acrylic resins such as polyacrylamide, and latex beads. Techniques for coupling antibodies to such solid supports are well known in the art (Weir, D.M. et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell Scientific Publications, Oxford, England, Chapter 10 (1986); Jacoby, W.D. et al., Meth. Enzym. 34 Academic Press, N.Y. (1974)). The immobilized antibodies of the present invention can be used for in vitro, in vivo, and in situ assays as well as for immuno-affinity purification of the proteins of the present invention.
Various procedures known within the art may be used for the production of polyclonal or monoclonal antibodies directed against a protein of the invention, or against derivatives, fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies:
A Laboratory Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below.
4.13.1 POLYCLONAL ANTIBODIES
For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, goat, mouse or other mammal) may be immunized by one or more injections with the native protein, a synthetic variant thereof, or a derivative of the foregoing.
An appropriate immunogenic preparation can contain, for example, the naturally occurring immunogenic protein, a chemically synthesized polypeptide representing the immunogenic protein, or a recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to a second protein known to be immunogenic in the mammal being immunized.
Examples of such immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an adjuvant. Various adjuvants used to increase the immunological response include, but are not limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface-active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of adjuvants that can be employed include MPL-TDM
adjuvant (monophosphoryl Lipid A, synthetic trehalose dicorynomycolate).
The polyclonal antibody molecules directed against the immunogenic protein can be isolated from the mammal (e.g., from the blood) and further purified by well known techniques, such as affinity chromatography using protein A or protein G, which provide primarily the IgG fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to purify the immune specific antibody by immunoaffinity chromatography. Purification of immunoglobulins is discussed, for example, by D.
Wilkinson (The Scientist, published by The Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28).
4.13.2 MONOCLONAL ANTIBODIES
The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used herein, refers to a population of antibody molecules that contain only one molecular species of antibody molecule consisting of a unique light chain gene product and a unique heavy chain gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal antibody are identical in all the molecules of the population. MAbs thus contain an antigen-binding site capable of immunoreacting with a particular epitope of the antigen characterized by a unique binding affinity for it.
Monoclonal antibodies can be prepared using hybridoma methods, such as those described by Kohler and Milstein, Nature, 256, 495 (1975). In a hybridoma method, a mouse, hamster, or other appropriate host animal, is typically immunized with an immunizing agent to elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro.
The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies: Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin. Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in a suitable culture medium that preferably contains one or more substances that inhibit the growth or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which substances prevent the growth of HGPRT-deficient cells.
Preferred immortalized cell lines are those that fuse efficiently, support stable high level expression of antibody by the selected antibody-producing cells, and are sensitive to a medium such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego, California and the American Type Culture Collection, Manassas, Virginia.
Human myeloma and mouse-human heteromyeloma cell lines also have been described for the production of human monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984);
Brodeur et al., Monoclonal Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp. 51-63).
The culture medium in which the hybridoma cells are cultured can then be assayed for the presence of monoclonal antibodies directed against the antigen.
Preferably, the binding specificity of monoclonal antibodies produced by the hybridoma cells is determined by immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in the art. The binding affinity of the monoclonal antibody can, for example, be determined by the Scatchard analysis of Munson and Pollard, Anal. Biochem., 107, 220 (1980).
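For orientation, a basic Scatchard plot regresses the bound/free ratio on the bound concentration for a single class of binding sites, where bound/free = (Bmax - bound)/Kd; the slope is -1/Kd and the x-intercept is Bmax. The sketch below is a plain linear least-squares version of that idea and is simpler than the cited analysis. The function name, the concentration units, and the synthetic data are assumptions made for illustration.

```python
def scatchard_fit(bound, free):
    """Estimate (Kd, Bmax) from paired bound and free ligand concentrations.

    Fits bound/free = (Bmax - bound) / Kd by ordinary least squares,
    assuming a single class of non-interacting binding sites.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope            # slope of the Scatchard line is -1/Kd
    bmax = -intercept / slope    # x-intercept of the line is Bmax
    return kd, bmax


# Synthetic, noise-free data generated with Kd = 2.0 and Bmax = 10.0
# (arbitrary concentration units, hypothetical values).
free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_conc = [10.0 * f / (2.0 + f) for f in free_conc]
print(scatchard_fit(bound_conc, free_conc))
```

Real binding data are noisy, and the linear transformation distorts the error structure, so a nonlinear fit is generally preferable in practice.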
Preferably, antibodies having a high degree of specificity and a high binding affinity for the target antigen are isolated.
After the desired hybridoma cells are identified, the clones can be subcloned by limiting dilution procedures and grown by standard methods. Suitable culture media for this purpose include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium. Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.
The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture medium or ascites fluid by conventional immunoglobulin purification procedures such as, for example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or affinity chromatography.
The monoclonal antibodies can also be made by recombinant DNA methods, such as those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the invention can be readily isolated and sequenced using conventional procedures (e.g., by using oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of monoclonal antibodies in the recombinant host cells. The DNA
also can be modified, for example, by substituting the coding sequence for human heavy and light chain constant domains in place of the homologous murine sequences (U.S. Patent No.
4,816,567; Morrison, Nature 368, 812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin polypeptide can be substituted for the constant domains of an antibody of the invention, or can be substituted for the variable domains of one antigen-combining site of an antibody of the invention to create a chimeric bivalent antibody.
4.13.3 HUMANIZED ANTIBODIES
The antibodies directed against the protein antigens of the invention can further comprise humanized antibodies or human antibodies. These antibodies are suitable for administration to humans without engendering an immune response by the human against the administered immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a human immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin. Humanization can be performed following the method of Winter and co-workers (Jones et al., Nature, 321, 522-525 (1986); Riechmann et al., Nature, 332, 323-327 (1988); Verhoeyen et al., Science, 239, 1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539). In some instances, Fv framework residues of the human immunoglobulin are replaced by corresponding non-human residues. Humanized antibodies can also comprise residues that are found neither in the recipient antibody nor in the imported CDR or framework sequences. In general, the humanized antibody will comprise substantially all of at least one, and typically two, variable domains, in which all or substantially all of the CDR regions correspond to those of a non-human immunoglobulin and all or substantially all of the framework regions are those of a human immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at least a portion of an immunoglobulin constant region (Fc), typically that of a human immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct.
Biol., 2, 593-596 (1992)).
4.13.4 HUMAN ANTIBODIES
Fully human antibodies relate to antibody molecules in which essentially the entire sequences of both the light chain and the heavy chain, including the CDRs, arise from human genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein. Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).
Human monoclonal antibodies may be utilized in the practice of the present invention and may be produced by using human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80, 2026-2030) or by transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In: Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).
In addition, human antibodies can also be produced using additional techniques, including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227, 381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the endogenous immunoglobulin genes have been partially or completely inactivated.
Upon challenge, human antibody production is observed, which closely resembles that seen in humans in all respects, including gene rearrangement, assembly, and antibody repertoire.
This approach is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al. (Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)).
Human antibodies may additionally be produced using transgenic nonhuman animals that are modified so as to produce fully human antibodies rather than the animal's endogenous antibodies in response to challenge by an antigen. (See PCT publication WO 94/02602). The endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins are inserted into the host's genome. The human genes are incorporated, for example, using yeast artificial chromosomes containing the requisite human DNA segments. An animal which provides all the desired modifications is then obtained as progeny by crossbreeding intermediate transgenic animals containing fewer than the full complement of the modifications. The preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™ as disclosed in PCT
publications WO
96/33735 and WO 96/34096. This animal produces B cells that secrete fully human immunoglobulins. The antibodies can be obtained directly from the animal after immunization with an immunogen of interest, as, for example, a preparation of a polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as hybridomas producing monoclonal antibodies. Additionally, the genes encoding the immunoglobulins with human variable regions can be recovered and expressed to obtain the antibodies directly, or can be further modified to obtain analogs of antibodies such as, for example, single chain Fv molecules.
An example of a method of producing a nonhuman host, exemplified as a mouse, lacking expression of an endogenous immunoglobulin heavy chain is disclosed in U.S.
Patent No. 5,939,598. It can be obtained by a method including deleting the J
segment genes from at least one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus, the deletion being effected by a targeting vector containing a gene encoding a selectable marker; and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells contain the gene encoding the selectable marker.
A method for producing an antibody of interest, such as a human antibody, is disclosed in U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing an expression vector containing a nucleotide sequence encoding a light chain into another mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an antibody containing the heavy chain and the light chain.
In a further improvement on this procedure, a method for identifying a clinically
relevant epitope on an immunogen, and a correlative method for selecting an antibody that binds immunospecifically to the relevant epitope with high affinity, are disclosed in PCT
publication WO 99/53049.
4.13.5 FAB FRAGMENTS AND SINGLE CHAIN ANTIBODIES
According to the invention, techniques can be adapted for the production of single-chain antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778). In addition, methods can be adapted for the construction of Fab expression libraries (see e.g., Huse, et al., 1989 Science 246, 1275-1281) to allow rapid and effective identification of monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments, analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments.
4.13.6 BISPECIFIC ANTIBODIES
Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that have binding specificities for at least two different antigens. In the present case, one of the binding specificities is for an antigenic protein of the invention. The second binding target is any other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.
Methods for making bispecific antibodies are known in the art. Traditionally, the recombinant production of bispecific antibodies is based on the co-expression of two immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different specificities (Milstein and Cuello, Nature, 305, 537-539 (1983)). Because of the random assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a potential mixture of ten different antibody molecules, of which only one has the correct bispecific structure. The purification of the correct molecule is usually accomplished by affinity chromatography steps. Similar procedures are disclosed in WO
93/08829, published 13 May 1993, and in Traunecker et al., 1991 EMBO J., 10, 3655-3659.
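The count of ten possible molecules follows from simple combinatorics, as the sketch below illustrates. It assumes, for illustration, that each antibody is an unordered pair of heavy-light half-molecules drawn from two heavy chains and two light chains; the chain labels are hypothetical placeholders.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ["H1", "H2"]  # hypothetical labels for the two heavy chains
light_chains = ["L1", "L2"]  # hypothetical labels for the two light chains

# Each half-molecule pairs one heavy chain with one light chain: 4 options.
half_molecules = [h + l for h, l in product(heavy_chains, light_chains)]

# A full antibody is an unordered pair of half-molecules (repeats allowed).
molecules = list(combinations_with_replacement(half_molecules, 2))

# Only the pairing of the two cognate half-molecules is correctly bispecific.
bispecific = [m for m in molecules if set(m) == {"H1L1", "H2L2"}]

print(len(molecules), "possible molecules")
print(len(bispecific), "with the correct bispecific structure:", bispecific)
```

Four half-molecules taken two at a time with repetition give ten combinations, only one of which pairs each heavy chain with its own light chain in a heterodimer.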
Antibody variable domains with the desired binding specificities (antibody-antigen combining sites) can be fused to immunoglobulin constant domain sequences. The fusion preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region (CH1) containing the site necessary for light-chain binding present in at least one of the fusions. DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin light chain, are inserted into separate expression vectors, and are co-transfected into a suitable host organism. For further details of generating bispecific antibodies see, for example, Suresh et al., Methods in Enzymology, 121, 210 (1986).
According to another approach described in WO 96/27011, the interface between a pair of antibody molecules can be engineered to maximize the percentage of heterodimers that are recovered from recombinant cell culture. The preferred interface comprises at least a part of the CH3 region of an antibody constant domain. In this method, one or more small amino acid side chains from the interface of the first antibody molecule are replaced with larger side chains (e.g. tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side chain(s) are created on the interface of the second antibody molecule by replacing large amino acid side chains with smaller ones (e.g.
alanine or threonine). This provides a mechanism for increasing the yield of the heterodimer over other unwanted end-products such as homodimers.
Bispecific antibodies can be prepared as full-length antibodies or antibody fragments (e.g. F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody fragments have been described in the literature. For example, bispecific antibodies can be prepared using chemical linkage. Brennan et al., Science 229, 81 (1985) describe a procedure wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to stabilize vicinal dithiols and prevent intermolecular disulfide formation.
The Fab' fragments generated are then converted to thionitrobenzoate (TNB) derivatives.
One of the Fab'-TNB derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is mixed with an equimolar amount of the other Fab'-TNB
derivative to form the bispecific antibody. The bispecific antibodies produced can be used as agents for the selective immobilization of enzymes.
Additionally, Fab' fragments can be directly recovered from E. coli and chemically coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175, 217-225 (1992) describe the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form the bispecific antibody. The bispecific antibody thus formed was able to bind to cells overexpressing the ErbB2 receptor and normal human T
cells, as well as trigger the lytic activity of human cytotoxic lymphocytes against human breast tumor targets.
Various techniques for making and isolating bispecific antibody fragments directly from recombinant cell culture have also been described. For example, bispecific antibodies have been produced using leucine zippers. Kostelny et al., J. Immunol.
148(5), 1547-1553 (1992). The leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region to form monomers and then re-oxidized to form the antibody heterodimers. This method can also be utilized for the production of antibody homodimers.
The "diabody" technology described by Hollinger et al., Proc. Natl. Acad. Sci.
USA 90, 6444-6448 (1993) has provided an alternative mechanism for making bispecific antibody fragments. The fragments comprise a heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker which is too short to allow pairing between the two domains on the same chain. Accordingly, the VH and VL domains of one fragment are forced to pair with the complementary VL and VH domains of another fragment, thereby forming two antigen-binding sites. Another strategy for making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been reported.
See, Gruber et al., J. Immunol. 152, 5368 (1994).
Antibodies with more than two valencies are contemplated. For example, trispecific antibodies can be prepared. Tutt et al., J. Immunol. 147, 60 (1991).
Exemplary bispecific antibodies can bind to two different epitopes, at least one of which originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular defense mechanisms to the cell expressing the particular antigen.
Bispecific antibodies can also be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide chelator, such as EOTUBE, DTPA, DOTA, or TETA.
Another bispecific antibody of interest binds the protein antigen described herein and further binds tissue factor (TF).

4.13.7 HETEROCONJUGATE ANTIBODIES
Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies.
Such antibodies have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089). It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic protein chemistry, including those involving crosslinking agents. For example, immunotoxins can be constructed using a disulfide exchange reaction or by forming a thioether bond. Examples of suitable reagents for this purpose include iminothiolate and methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S.
Patent No. 4,676,980.
4.13.8 EFFECTOR FUNCTION ENGINEERING
It can be desirable to modify the antibody of the invention with respect to effector function, so as to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond formation in this region. The homodimeric antibody thus generated can have improved internalization capability and/or increased complement-mediated cell killing and antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp Med., 176, 1191-1195 (1992) and Shopes, J. Immunol., 148, 2918-2922 (1992).
Homodimeric antibodies with enhanced anti-tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff et al. Cancer Research, 53, 2560-2565 (1993). Alternatively, an antibody can be engineered that has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities. See Stevenson et al., Anti-Cancer Drug Design, 3, 219-230 (1989).
4.13.9 IMMUNOCONJUGATES
The invention also pertains to immunoconjugates comprising an antibody conjugated to a cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a radioconjugate).
Chemotherapeutic agents useful in the generation of such immunoconjugates have been described above. Enzymatically active toxins and fragments thereof that can be used include diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin, Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin, mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of radionuclides are available for the production of radioconjugated antibodies. Examples include ²¹²Bi, ¹³¹I, ¹³¹In, ⁹⁰Y, and ¹⁸⁶Re.
Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP), iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl), active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate), and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987). Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid (MX-DTPA) is an exemplary chelating agent for conjugation of radionucleotide to the antibody. See WO94/11026.
In another embodiment, the antibody can be conjugated to a "receptor" (such as streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is administered to the patient, followed by removal of unbound conjugate from the circulation using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn conjugated to a cytotoxic agent.
4.14 COMPUTER READABLE SEQUENCES
In one application of this embodiment, a nucleotide sequence of the present invention can be recorded on computer readable media. As used herein, "computer readable media" refers to any medium which can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled artisan can readily appreciate how any of the presently known computer readable media can be used to create a manufacture comprising computer readable medium having recorded thereon a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for storing information on computer readable medium. A skilled artisan can readily adopt any of the presently known methods for recording information on computer readable medium to generate manufactures comprising the nucleotide sequence information of the present invention.
A variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon a nucleotide sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring formats (e.g. text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.
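As an illustration of the ASCII text-file option mentioned above, the following minimal sketch records sequences in the widely used FASTA layout and reads them back. The function names, the record names, and the 60-character line width are assumptions made for this example; the specification does not prescribe any particular file format.

```python
def to_fasta(records, width=60):
    """Serialize (name, sequence) pairs as FASTA text with wrapped lines."""
    lines = []
    for name, seq in records:
        lines.append(f">{name}")
        # Break the sequence into fixed-width lines for readability.
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


def from_fasta(text):
    """Parse FASTA text back into a list of (name, sequence) pairs."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif line.strip():
            chunks.append(line.strip())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


# Hypothetical placeholder sequences, not sequences of the invention.
example_records = [("example_1", "ACGT" * 40), ("example_2", "TTGACA")]
fasta_text = to_fasta(example_records)
```

The same records could equally be stored as rows in a database table; the choice depends on how the stored information will be accessed.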
By providing any of the nucleotide sequences SEQ ID NO: 1-1041, or 2083-2534, or a representative fragment thereof, or a nucleotide sequence at least 95% identical to any of the nucleotide sequences of SEQ ID NO: 1-1041, or 2083-2534 in computer readable form, a skilled artisan can routinely access the sequence information for a variety of purposes.
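The 95% identity threshold can be illustrated with a simple calculation over two sequences that have already been aligned. This is a sketch under stated assumptions: the function name is hypothetical, the alignment is taken as given, gaps are written as "-", and identity is computed over all alignment columns. Other definitions of percent identity exist and give somewhat different values.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue.

    Columns where both sequences have a gap are ignored; a column with a gap
    in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Hypothetical 20-base example with a single mismatch at the last position.
reference = "ACGTACGTACGTACGTACGT"
variant = "ACGTACGTACGTACGTACGA"
meets_threshold = percent_identity(reference, variant) >= 95.0
```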
Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium. The examples which follow demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase system is used to identify open reading frames (ORFs) within a nucleic acid sequence. Such ORFs may be protein-encoding fragments and may be useful in producing commercially important proteins such as enzymes used in fermentation reactions and in the production of commercially useful metabolites.
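To make the notion of an ORF concrete, the following sketch scans all six reading frames of a DNA sequence for stretches that begin with ATG and end at the first in-frame stop codon. It is a naive illustration only and is not the BLAST or BLAZE procedure referred to above; the function names and the minimum-length parameter are assumptions of this example.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """Return (strand, frame, start, end) for each ATG...stop ORF.

    start and end are 0-based, end-exclusive coordinates on the scanned
    strand; min_codons counts the ATG but not the stop codon.
    """
    orfs = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START:
                    start = i  # first ATG opens a candidate ORF
                elif start is not None and codon in STOPS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None  # look for the next ATG in this frame
    return orfs


# Hypothetical toy sequence: two leading bases, ATG, five codons, a stop.
toy_sequence = "CC" + "ATG" + "GCT" * 5 + "TAA" + "G"
toy_orfs = find_orfs(toy_sequence, min_codons=3)
```

In practice, candidate ORFs found this way would still be compared against sequence databases to judge whether they are likely to encode protein.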
As used herein, "a computer-based system" refers to the hardware means, software means, and data storage means used to analyze the nucleotide sequence information of the present invention. The minimum hardware means of the computer-based systems of the present invention comprises a central processing unit (CPU), input means, output means, and data storage means. A skilled artisan can readily appreciate that any one of the currently available computer-based systems is suitable for use in the present invention. As stated above, the computer-based systems of the present invention comprise a data storage means having stored therein a nucleotide sequence of the present invention and the necessary hardware means and software means for supporting and implementing a search means. As used herein, "data storage means" refers to memory which can store nucleotide sequence information of the present invention, or a memory access means which can access manufactures having recorded thereon the nucleotide sequence information of the present invention.
As used herein, "search means" refers to one or more programs which are implemented on the computer-based system to compare a target sequence or target structural motif with the sequence information stored within the data storage means. Search means are used to identify fragments or regions of a known sequence which match a particular target sequence or target motif. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A skilled artisan can readily recognize that any one of the available algorithms or implementing software packages for conducting homology searches can be adapted for use in the present computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. The most preferred sequence length of a target sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide residues. However, it is well recognized that searches for commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
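The relationship between target length and chance occurrence can be estimated with a simple model. The sketch below assumes a database of independent, equally probable residues, which real sequence databases only approximate; the function name and the example database sizes are hypothetical.

```python
def expected_random_hits(target_len, db_len, alphabet_size=4):
    """Expected number of exact chance matches of one target in a random database.

    Assumes independent, equiprobable residues (4 letters for nucleic acids,
    20 for amino acids), so each position matches with probability
    alphabet_size ** -target_len.
    """
    positions = max(db_len - target_len + 1, 0)
    return positions * alphabet_size ** (-target_len)


# A 6-nucleotide target is expected about once per 4096 positions,
# while a 30-nucleotide target is very unlikely to occur by chance
# even in a database of a billion bases.
short_target_hits = expected_random_hits(6, 4096 + 5)
long_target_hits = expected_random_hits(30, 10**9)
```

Under this model each added nucleotide reduces the expected number of chance matches fourfold, which is why longer targets give more specific searches.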
As used herein, "a target structural motif," or "target motif," refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed upon the folding of the target motif.
There are a variety of target motifs known in the art. Protein target motifs include, but are not limited to, enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).
4.15 TRIPLE HELIX FORMATION
In addition, the fragments of the present invention, as broadly described, can be used to control gene expression through triple helix formation or antisense DNA or RNA, both of which methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are designed to be complementary to a region of the gene involved in transcription (triple helix - see Lee et al., Nucl. Acids Res. 6, 3073 (1979); Cooney et al., Science 241, 456 (1988); and Dervan et al., Science 251, 1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Both techniques have been demonstrated to be effective in model systems. Information contained in the sequences of the present invention is necessary for the design of an antisense or triple helix oligonucleotide.
4.16 DIAGNOSTIC ASSAYS AND KITS
The present invention further provides methods to identify the presence or expression of one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic acid probe or antibodies of the present invention, optionally conjugated or otherwise associated with a suitable label.
In general, methods for detecting a polynucleotide of the invention can comprise contacting a sample with a compound that binds to and forms a complex with the polynucleotide for a period sufficient to form the complex, and detecting the complex, so that if a complex is detected, a polynucleotide of the invention is detected in the sample.
Such methods can also comprise contacting a sample under stringent hybridization conditions with nucleic acid primers that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise contacting a sample with a compound that binds to and forms a complex with the polypeptide for a period sufficient to form the complex, and detecting the complex, so that if a complex is detected, a polypeptide of the invention is detected in the sample.
In detail, such methods comprise incubating a test sample with one or more of the antibodies or one or more of the nucleic acid probes of the present invention and assaying for binding of the nucleic acid probes or antibodies to components within the test sample.
Conditions for incubating a nucleic acid probe or antibody with a test sample vary. Incubation conditions depend on the format employed in the assay, the detection methods employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One skilled in the art will recognize that any one of the commonly available hybridization, amplification or immunological assay formats can readily be adapted to employ the nucleic acid probes or antibodies of the present invention. Examples of such assays can be found in Chard, T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry, Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the present invention include cells, protein or membrane extracts of cells, or biological fluids such as sputum, blood, serum, plasma, or urine. The test sample used in the above-described method will vary based on the assay format, nature of the detection method and the tissues, cells or extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane extracts of cells are well known in the art and can readily be adapted in order to obtain a sample which is compatible with the system utilized.
In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry out the assays of the present invention.
Specifically, the invention provides a compartment kit to receive, in close confinement, one or more containers which comprises: (a) a first container comprising one of the probes or antibodies of the present invention; and (b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting presence of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in separate containers. Such containers include small glass containers, plastic containers or strips of plastic or paper. Such containers allow one to efficiently transfer reagents from one compartment to another compartment such that the samples and reagents are not cross-contaminated, and the agents or solutions of each container can be added in a quantitative fashion from one compartment to another. Such containers will include a container which will accept the test sample, a container which contains the antibodies used in the assay, containers which contain wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which contain the reagents used to detect the bound antibody or probe. Types of detection reagents include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the primary antibody is labeled, the enzymatic or antibody binding reagents which are capable of reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed probes and antibodies of the present invention can be readily incorporated into one of the established kit formats which are well known in the art.
4.17 MEDICAL IMAGING
The novel polypeptides and binding partners of the invention are useful in medical imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the invention is involved in the immune response, for imaging sites of inflammation or infection). See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of a labeling or imaging agent, administration of the labeled polypeptide to a subject in a pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target site.
4.18 SCREENING ASSAYS
Using the isolated proteins and polynucleotides of the invention, the present invention further provides methods of obtaining and identifying agents which bind to a polypeptide encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO: 1-1041, or 2083-2534, or bind to a specific domain of the polypeptide encoded by the nucleic acid. In detail, said method comprises the steps of (a) contacting an agent with an isolated protein encoded by an ORF of the present invention, or nucleic acid of the invention; and (b) determining whether the agent binds to said protein or said nucleic acid.
In general, therefore, such methods for identifying compounds that bind to a polynucleotide of the invention can comprise contacting a compound with a polynucleotide of the invention for a time sufficient to form a polynucleotide/compound complex, and detecting the complex, so that if a polynucleotide/compound complex is detected, a compound that binds to a polynucleotide of the invention is identified.
Likewise, in general, such methods for identifying compounds that bind to a polypeptide of the invention can comprise contacting a compound with a polypeptide of the invention for a time sufficient to form a polypeptide/compound complex, and detecting the complex, so that if a polypeptide/compound complex is detected, a compound that binds to a polypeptide of the invention is identified.
Methods for identifying compounds that bind to a polypeptide of the invention can also comprise contacting a compound with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a reporter gene sequence in the cell, and detecting the complex by detecting reporter gene sequence expression, so that if a polypeptide/compound complex is detected, a compound that binds a polypeptide of the invention is identified.
Compounds identified via such methods can include compounds which modulate the activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to activity observed in the absence of the compound). Alternatively, compounds identified via such methods can include compounds which modulate the expression of a polynucleotide of the invention (that is, increase or decrease expression relative to expression levels observed in the absence of the compound). Compounds, such as compounds identified via the methods of the invention, can be tested using standard assays well known to those of skill in the art for their ability to modulate activity/expression.
The agents screened in the above assay can be, but are not limited to, peptides, carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected and screened at random or rationally selected or designed using protein modeling techniques.
For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and the like are selected at random and are assayed for their ability to bind to the protein encoded by the ORF of the present invention. Alternatively, agents may be rationally selected or designed. As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen based on the configuration of the particular protein.
For example, one skilled in the art can readily adapt currently available procedures to generate peptides, pharmaceutical agents and the like, capable of binding to a specific peptide sequence, in order to generate rationally designed antipeptide peptides, for example see Hurby et al., "Application of Synthetic Peptides: Antisense Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like.
In addition to the foregoing, one class of agents of the present invention, as broadly described, can be used to control gene expression through binding to one of the ORFs or EMFs of the present invention. As described above, such agents can be randomly screened or rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design sequence specific or element specific agents, modulating the expression of either a single ORF or multiple ORFs which rely on the same EMF for expression control. One class of DNA binding agents are agents which contain base residues which hybridize or form a triple helix formation by binding to DNA or RNA. Such agents can be based on the classic phosphodiester, ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have base attachment capacity.
Agents suitable for use in these methods preferably contain 20 to 40 bases and are designed to be complementary to a region of the gene involved in transcription (triple helix - see Lee et al., Nucl. Acids Res. 6, 3073 (1979); Cooney et al., Science 241, 456 (1988); and Dervan et al., Science 251, 1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56, 560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Both techniques have been demonstrated to be effective in model systems. Information contained in the sequences of the present invention is necessary for the design of an antisense or triple helix oligonucleotide and other DNA binding agents.
Agents which bind to a protein encoded by one of the ORFs of the present invention can be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the present invention can be formulated using known techniques to generate a pharmaceutical composition.

4.19 USE OF NUCLEIC ACIDS AS PROBES
Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The hybridization probes of the subject invention may be derived from any of the nucleotide sequences SEQ ID NO: 1-1041, or 2083-2534. Because the corresponding gene is only expressed in a limited number of tissues, a hybridization probe derived from any of the nucleotide sequences SEQ ID NO: 1-1041, or 2083-2534 can be used as an indicator of the presence of RNA of cell type of such a tissue in a sample.
Any suitable hybridization technique can be employed, such as, for example, in situ hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both.
The probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a degenerate pool of possible sequences for identification of closely related genomic sequences.
Other means for producing specific hybridization probes for nucleic acids include the cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors are known in the art and are commercially available and may be used to synthesize RNA probes in vitro by means of the addition of the appropriate RNA polymerase such as T7 or SP6 RNA polymerase and the appropriate radioactively labeled nucleotides. The nucleotide sequences may be used to construct hybridization probes for mapping their respective genomic sequences. The nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a chromosome using well-known genetic and/or chromosomal mapping techniques. These techniques include in situ hybridization, linkage analysis against known chromosomal markers, hybridization screening with libraries or flow-sorted chromosomal preparations specific to known chromosomes, and the like. The technique of fluorescent in situ hybridization of chromosome spreads has been described, among other places, in Verma et al (1988) Human Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.
Fluorescent in situ hybridization of chromosomal preparations and other physical chromosome mapping techniques may be correlated with additional genetic map data.
Examples of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation between the location of a nucleic acid on a physical chromosomal map and a specific disease (or predisposition to a specific disease) may help delimit the region of DNA associated with that genetic disease. The nucleotide sequences of the subject invention may be used to detect differences in gene sequences between normal, carrier or affected individuals.
4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES
Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced using an automated oligonucleotide synthesizer.
Support bound oligonucleotides may be prepared by any of the methods known to those of skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to precisely spot oligonucleotides synthesized by standard synthesizers.
Immobilization can be achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6), 1469-72); using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all references being specifically incorporated herein.
Another strategy that may be employed is the use of the strong biotin-streptavidin interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8), 3072-6, describe the use of biotinylated probes, although these are duplex probes, that are immobilized on streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo. Of course, this same linking chemistry is applicable to coating any surface with streptavidin. Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies (Alameda, CA).
Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used.
Nunc Laboratories have developed a method by which DNA can be covalently bound to the microwell surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino groups (>NH) that serve as bridgeheads for further covalent coupling.
CovaLink Modules may be purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the 5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA (Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).
The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has been described (Rasmussen et al., 1991). In this technology, a phosphoramidate bond is employed (Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the CovaLink NH secondary amino groups, which are positioned at the end of 2 nm long spacer arms covalently grafted onto the polystyrene surface. To link an oligonucleotide to CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and then streptavidin used to bind the probes.
More specifically, the linkage method includes dissolving DNA in water (7.5 ng/µl) and denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole, pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is then dispensed into CovaLink NH strips (75 µl/well) standing on ice. Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in 10 mM 1-MeIm7, is made fresh and 25 µl added per well. The strips are incubated for 5 hours at 50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed 3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
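The quantities in the coupling procedure above can be checked with a short dilution calculation. This sketch assumes that the 7.5 ng/µl figure refers to the DNA solution before the 1-methylimidazole stock is added, that volumes are additive, and that a 90 µl starting sample is used; the sample volume and all function names are hypothetical and are not taken from the cited method.

```python
def stock_volume_to_add(sample_ul, stock_mM, final_mM):
    """Volume of stock to add to a sample to reach final_mM in the mixture."""
    if not 0 < final_mM < stock_mM:
        raise ValueError("final concentration must lie between 0 and the stock")
    return sample_ul * final_mM / (stock_mM - final_mM)


def concentration_after_mixing(conc, volume_ul, added_ul):
    """Concentration of a solute after its solution is diluted by added_ul."""
    return conc * volume_ul / (volume_ul + added_ul)


def coupling_summary(dna_ng_per_ul=7.5, sample_ul=90.0, well_ul=75.0,
                     edc_mM=200.0, edc_ul=25.0,
                     meim_stock_mM=100.0, meim_final_mM=10.0):
    """Per-well amounts for the described coupling (illustrative assumptions)."""
    meim_ul = stock_volume_to_add(sample_ul, meim_stock_mM, meim_final_mM)
    dna_conc = concentration_after_mixing(dna_ng_per_ul, sample_ul, meim_ul)
    return {
        "meim_ul_to_add": meim_ul,
        "dna_ng_per_well": dna_conc * well_ul,
        # 25 µl of 0.2 M EDC diluted into a 100 µl reaction volume.
        "edc_final_mM": concentration_after_mixing(edc_mM, edc_ul, well_ul),
    }


summary = coupling_summary()
```

Under these assumptions each well receives roughly 0.5 µg of DNA and the EDC is present at about 50 mM during the incubation.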
It is contemplated that a further suitable method for use with the present invention is that described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by reference. This method of preparing an oligonucleotide bound to a support involves attaching a nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard conditions that do not cleave the oligonucleotide from the support.
Suitable reagents include nucleoside phosphoramidite and nucleoside hydrogen phosphorate.
An on-chip strategy for the preparation of DNA probe arrays may be employed. For example, addressable laser-activated photodeprotection may be employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by Fodor et al. (1991) Science 251(4995), 767-73, incorporated herein by reference. Probes may also be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res., 19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem. 169(1), 104-8; all references being specifically incorporated herein.
Linking an oligonucleotide to a nylon support, as described by Van Ness et al. (1991), requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the light-generated synthesis described by Pease et al., (1994) Proc. Natl. Acad. Sci., USA 91(11), 5022-6, incorporated herein by reference. These authors used current photolithographic techniques to generate arrays of immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile 5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be generated in this manner.
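A combinatorial probe set of this size can be illustrated by enumerating every sequence of four variable positions over the four bases, which gives 4^4 = 256 probes that fit a 16 x 16 grid. That the 256 probes correspond to four variable positions is an assumption of this sketch, as are the function name and the row-major layout; the cited work should be consulted for the actual array design.

```python
from itertools import product


def probe_grid(k=4, alphabet="ACGT"):
    """Lay out all k-mers over the alphabet as a square grid of probe sequences."""
    probes = ["".join(p) for p in product(alphabet, repeat=k)]
    side = int(round(len(probes) ** 0.5))
    if side * side != len(probes):
        raise ValueError("probe count is not a perfect square")
    # Row-major assignment: each grid cell is one spatially defined probe.
    return [probes[r * side:(r + 1) * side] for r in range(side)]


grid = probe_grid()
probe_count = sum(len(row) for row in grid)
```

Knowing the row and column of a hybridization signal then identifies the probe sequence at that address.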
4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS
The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA, including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes three protocols for the isolation of high molecular weight DNA from mammalian cells (p. 9.14-9.23).
DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or prepared directly from genomic DNA or cDNA by PCR or other amplification methods.
Samples may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be prepared in 2-500 ml of final volume.
The nucleic acids would then be fragmented by any of the methods known to those of skill in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et al. (1989), shearing by ultrasound and NaOH treatment.
Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic Acids Res. 18(24), 7455-6, incorporated herein by reference. In this method, DNA samples are passed through a small French pressure cell at a variety of low to intermediate pressures. A lever device allows controlled application of low to intermediate pressures to the cell. The results of these studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA fragmentation methods.
One particularly suitable way for fragmenting DNA is contemplated to be that using the two base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res. 20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate consistent with random fragmentation.
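The site specificities described above can be illustrated with a short Python sketch. It is not taken from Fitzgerald et al.; the function names are hypothetical, and it assumes an idealized complete digest of a linear sequence in which every matching site is cut between the G and the C.

```python
import re

# Shorthand from the text: Pu = purine (A or G), Py = pyrimidine (C or T).
# Lookaheads are used so that overlapping sites are all reported.
NORMAL_SITE = re.compile(r"(?=[AG]GC[CT])")  # PuGCPy only
RELAXED_SITE = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")  # plus PyGCPy, PuGCPu


def cut_positions(seq, relaxed=False):
    """Return the 0-based index of the C that follows each blunt cut (G^C)."""
    pattern = RELAXED_SITE if relaxed else NORMAL_SITE
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def fragments(seq, relaxed=False):
    """Split a linear sequence at every predicted cut (idealized complete digest)."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    example = "AAGCTTTGCTAAGCAA"
    print(fragments(example))                # normal specificity
    print(fragments(example, relaxed=True))  # relaxed (CviJI**-like) specificity
```

Because PuGCPy is its own reverse complement, and PyGCPy and PuGCPu are reverse complements of each other, scanning one strand is enough to find every blunt cut under these assumptions.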
As reported in the literature, advantages of this approach compared to sonication and agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead of 2-5 µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel electrophoresis and elution are needed).
Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is important to denature the DNA to give single stranded pieces available for hybridization.
This is achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the chip. Phosphate groups must also be removed from genomic DNA by methods known in the art.
4.22 PREPARATION OF DNA ARRAYS
Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane. Spotting may be performed by using arrays of metal pins (the positions of which correspond to an array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a nylon membrane. By offset printing, a density of dots higher than the density of the wells is achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays) may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same gene) from different individuals, or may be different, overlapped genomic clones. Each of the subarrays may represent replica spotting of the same samples. In one example, a selected gene segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in one 96-well plate (all 96 wells containing the same sample).
A plate for each of the 64 patients is prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane. Subarrays may contain 64 samples, one from each patient.

Where the 96 subarrays are identical, the dot span may be 1 mm² and there may be a 1 mm space between subarrays.
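The 64-patient example can be sketched as a coordinate calculation. The following Python fragment is only an illustration: the 8 x 8 arrangement of the 64 dots inside each subarray, the function name and the choice of origin are assumptions that do not appear in the text, while the 1 mm dot span, the 1 mm gap and the 96-pin (8 x 12) device are taken from it.

```python
DOT_PITCH_MM = 1.0   # one dot per 1 mm x 1 mm cell (from the text)
GAP_MM = 1.0         # 1 mm space between subarrays (from the text)
DOTS_PER_SIDE = 8    # assumption: 64 patient dots laid out as an 8 x 8 block
PIN_ROWS, PIN_COLS = 8, 12  # 96-pin device

# Distance between the origins of neighbouring subarrays: 8 dots + gap = 9 mm.
SUBARRAY_PITCH_MM = DOTS_PER_SIDE * DOT_PITCH_MM + GAP_MM


def dot_position(patient, pin_row, pin_col):
    """Return (x, y) in mm of the dot printed by one pin for one patient plate.

    patient: 0-63, index of the patient plate (hypothetical numbering).
    pin_row, pin_col: 0-based position of the pin on the 8 x 12 device.
    """
    if not (0 <= patient < DOTS_PER_SIDE ** 2):
        raise ValueError("patient index out of range")
    if not (0 <= pin_row < PIN_ROWS and 0 <= pin_col < PIN_COLS):
        raise ValueError("pin position out of range")
    # Each new plate is printed with a small offset inside every subarray.
    offset_row, offset_col = divmod(patient, DOTS_PER_SIDE)
    x = pin_col * SUBARRAY_PITCH_MM + offset_col * DOT_PITCH_MM
    y = pin_row * SUBARRAY_PITCH_MM + offset_row * DOT_PITCH_MM
    return (x, y)


if __name__ == "__main__":
    print(dot_position(0, 0, 0))    # first patient, first pin
    print(dot_position(63, 7, 11))  # last patient, last pin
```

Under these assumptions the farthest dot lies at (106, 70) mm, so the whole layout fits on the 8 x 12 cm membrane mentioned above.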
Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois) which may be partitioned by physical spacers, e.g. a plastic grid molded over the membrane, the grid being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage screens or x-ray films.
The present invention is illustrated in the following examples. Upon consideration of the present disclosure, one of skill in the art will appreciate that many other embodiments and variations may be made in the scope of the present invention. Accordingly, it is intended that the broader aspects of the present invention not be limited to the disclosure of the following examples. The present invention is not to be limited in scope by the exemplified embodiments which are intended as illustrations of single aspects of the invention, and compositions and methods which are functionally equivalent are within the scope of the invention. Indeed, numerous modifications and variations in the practice of the invention are expected to occur to those skilled in the art upon consideration of the present preferred embodiments. Consequently, the only limitations which should be placed upon the scope of the invention are those which appear in the appended claims.
All references cited within the body of the instant specification are hereby incorporated by reference in their entirety.
5.0 EXAMPLES
5.1 EXAMPLE 1 Novel Nucleic Acid Sequences Obtained From Various Libraries
A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various human tissues and in some cases isolated from a genomic library derived from human chromosome using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The inserts of the library were amplified with PCR using primers specific for the vector sequences which flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered into groups of similar or identical sequences. Representative clones were selected for sequencing.

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems (ABI) sequencer to obtain the novel nucleic acid sequences.
5.2 EXAMPLE 2 Assemblage of Novel Contigs
The contigs of the present invention, designated as SEQ ID NO: 2083-2534, were assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend the seed EST into an extended assemblage, by pulling additional sequences from different databases (i.e., Hyseq's database containing EST sequences, dbEST, gb pri, and UniGene, and exons from public domain genomic sequences predicted by GenScan) that belong to this assemblage. The algorithm terminated when there were no additional sequences from the above databases that would extend the assemblage. Further, inclusion of component sequences into the assemblage was based on a BLASTN hit to the extending assemblage with BLAST score greater than 300 and percent identity greater than 95%.
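The seed-extension loop and its inclusion thresholds can be sketched as follows. This is a minimal illustration, not the algorithm actually used: it assumes the BLASTN hits have already been computed and are supplied as simple records, it is written as an iterative loop rather than literal recursion, and the names `Hit` and `extend_assemblage` are hypothetical.

```python
from collections import namedtuple

# One precomputed pairwise hit; the field names are illustrative assumptions.
Hit = namedtuple("Hit", "query subject score pct_identity")

MIN_SCORE = 300        # BLAST score must be greater than 300 (from the text)
MIN_IDENTITY = 95.0    # percent identity must be greater than 95% (from the text)


def passes(hit):
    """Apply the inclusion criteria stated in the text."""
    return hit.score > MIN_SCORE and hit.pct_identity > MIN_IDENTITY


def extend_assemblage(seed, hits):
    """Grow the set of member sequences from a seed until nothing more can be added."""
    members = {seed}
    changed = True
    while changed:  # stop when a full pass adds no new sequence
        changed = False
        for hit in hits:
            if not passes(hit):
                continue
            if hit.query in members and hit.subject not in members:
                members.add(hit.subject)
                changed = True
            elif hit.subject in members and hit.query not in members:
                members.add(hit.query)
                changed = True
    return members


if __name__ == "__main__":
    hits = [
        Hit("EST1", "EST2", 450, 98.0),
        Hit("EST2", "exonA", 320, 96.5),
        Hit("EST1", "EST3", 500, 94.0),   # rejected: identity too low
        Hit("exonA", "EST4", 300, 99.0),  # rejected: score not greater than 300
    ]
    print(sorted(extend_assemblage("EST1", hits)))
```

In a real pipeline the hit list would be regenerated by searching the databases with the growing assemblage on each pass; the fixed list here only stands in for that search.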
Table 8 sets forth the novel predicted polypeptides (including proteins) encoded by the novel polynucleotides (SEQ ID NO: 2083-2534) of the present invention, and their corresponding translation start and stop nucleotide locations to each of SEQ ID NO: 2083-2534.
Table 8 also indicates the method by which the polypeptide was predicted.
Method A refers to a polypeptide obtained by using a software program called FASTY (available from http://fasta.bioch.virginia.edu) which selects a polypeptide based on a comparison of the translated novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98 (1990), herein incorporated by reference). Method B
refers to a polypeptide obtained by using a software program called GenScan for human/vertebrate sequences (available from Stanford University, Office of Technology Licensing) that predicts the polypeptide based on a probabilistic model of gene structure/compositional properties (C.
Burge and S. Karlin, J. Mol. Biol., 268:78-94 (1997), incorporated herein by reference).
Method C refers to a polypeptide obtained by using a Hyseq proprietary software program that translates the novel polynucleotide and its complementary strand into six possible amino acid sequences (forward and reverse frames) and chooses the polypeptide with the longest open reading frame.
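The idea behind Method C can be illustrated with a generic six-frame translation. The sketch below is not the Hyseq proprietary program, whose details are not given; it uses the standard genetic code and assumes, for illustration only, that an open reading frame is any stop-free stretch of a translated frame, whether or not it starts with methionine.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frames(seq):
    """Three forward frames followed by three reverse-strand frames."""
    seq = seq.upper()
    strands = (seq, reverse_complement(seq))
    return [translate(strand[offset:]) for strand in strands for offset in range(3)]


def longest_orf(seq):
    """Longest stop-free peptide over all six frames (first one wins on ties)."""
    best = ""
    for protein in six_frames(seq):
        for segment in protein.split("*"):
            if len(segment) > len(best):
                best = segment
    return best


if __name__ == "__main__":
    print(six_frames("ATGGCCAAATAA"))
    print(longest_orf("ATGGCCAAATAA"))
```

A stricter definition that requires an initiating ATG would only change `longest_orf`; which definition the original program used is not stated in the text.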

5.3 EXAMPLE 3 Novel Nucleic Acids
The novel nucleic acids of the present invention SEQ ID NO: 1-1041 were assembled from Hyseq's proprietary EST sequences as described in Example 1 and human genome sequences that are available from the public databases (http://www.ncbi.nlm.nih.gov/). Exons were predicted from human genome sequences using GenScan (http://genes.mit.edu/GENSCANinfo.html); HMMgene (http://www.cbs.dtu.dk/services/HMMgene/hmmgene1_1.html); and GenMark.hmm (http://genemark.biology.gatech.edu/GeneMark/whmm_info.html). The Hyseq proprietary EST sequences and the predicted exons were assembled based on a BLASTN hit to the extending assemblage with BLAST score greater than 300 and percent identity greater than 95%. Then, the predicted genes were analyzed using Neural Network SignalP V1.1 program (from Center for Biological Sequence Analysis, The Technical University of Denmark) for presence of a signal peptide. These sequences were further analyzed for absence of a transmembrane region using the TMpred program (http://www.ch.embnet.org/software/TMPRED_form.html).
Table 1 shows the various tissue sources of SEQ ID NO: 1-1041.
The homologs for polypeptides SEQ ID NO: 1042-2082, which correspond to nucleotide sequences SEQ ID NO: 1-1041, were obtained by BLASTP version 2.0a1 WashU searches against Genpept release 124 using the BLAST algorithm. The results showing homologues for SEQ ID NO: 1042-2082 from Genpept 124 are shown in Table 2.
Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. Biol., Vol. 6, 219-235 (1999), http://motif.stanford.edu/ematrix-search/, herein incorporated by reference), all the polypeptide sequences were examined to determine whether they had identifiable signature regions. Scoring matrices of the eMatrix software package are derived from the BLOCKS, PRINTS, PFAM, PRODOM, and DOMO
databases. Table 3 shows the accession number of the homologous eMatrix signature found in the indicated polypeptide sequence, its description, and the results obtained which include accession number subtype; raw score; p-value; and the position of signature in amino acid sequence.
Using the Pfam software program (Sonnhammer et al., Nucleic Acids Res., Vol.
26(1) pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were examined for domains with homology to certain peptide domains. Table 4 shows the name of the Pfam model found, the description, the e-value and the Pfam score for the identified model within the sequence. Further description of the Pfam models can be found at http://pfam.wustl.edu/.
The GeneAtlas™ software package (Molecular Simulations Inc. (MSI), San Diego, CA) was used to predict the three-dimensional structure models for the polypeptides encoded by SEQ ID NO: 1-1041 (i.e. SEQ ID NO: 1042-2082). Models were generated by (1) PSI-BLAST, which is a multiple alignment sequence profile-based searching developed by Altschul et al. (Nucl. Acids. Res. 25, 3389-3408 (1997)), (2) High Throughput Modeling (HTM) (Molecular Simulations Inc. (MSI), San Diego, CA), which is an automated sequence and structure searching procedure (http://www.msi.com/), and (3) SeqFold™, which is a fold recognition method described by Fischer and Eisenberg (J. Mol. Biol. 209, 779-791 (1998)).
This analysis was carried out, in part, by comparing the polypeptides of the invention with the known NMR (nuclear magnetic resonance) and x-ray crystal three-dimensional structures as templates. Table 5 shows: "PDB ID", the Protein DataBase (PDB) identifier given to template structure; "Chain ID", identifier of the subcomponent of the PDB template structure; "Compound Information", information of the PDB template structure and/or its subcomponents; "PDB Function Annotation" gives function of the PDB template as annotated by the PDB files (http://www.rcsb.org/PDB/); start and end amino acid position of the protein sequence aligned; PSI-BLAST score, the verify score, the SeqFold score, and the Potential(s) of Mean Force (PMF). The verify score is produced by GeneAtlas™ software (MSI), and is based on the Profile-3D threading program developed in Dr. David Eisenberg's laboratory (US patent no. 5,436,850 and Luthy, Bowie, and Eisenberg, Nature, 356:83-85 (1992)) and a publication by R. Sanchez and A. Sali, Proc. Natl. Acad. Sci. USA, 95:13597-12502. The verify score produced by GeneAtlas normalizes the verify score for proteins with different lengths so that a unified cutoff can be used to select good models as follows:
Verify score (normalized) = (raw score - 1/2 high score)/(1/2 high score)
The PMF score, produced by GeneAtlas™ software (MSI), is a composite scoring function that depends in part on the compactness of the model, sequence identity in the alignment used to build the model, pairwise and surface mean force potentials (MFP). As given in Table 5, a verify score between 0 and 1.0, with 1 being the best, represents a good model. Similarly, a PMF score between 0 and 1.0, with 1 being the best, represents a good model. A SeqFold™ score of more than 50 is considered significant. A good model may also be determined by one of skill in the art based on all the information in Table 5 taken in totality.
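The normalization and cutoffs just described can be written out as a small Python sketch. The function names are hypothetical, and the text does not say how the "high score" for a given protein length is obtained, so it is treated here as an input supplied by the caller.

```python
def normalized_verify_score(raw_score, high_score):
    """(raw score - 1/2 high score) / (1/2 high score), as given in the text.

    high_score is assumed to be a positive reference value for the protein's
    length; how it is derived is not specified in the text.
    """
    if high_score <= 0:
        raise ValueError("high_score must be positive")
    half = 0.5 * high_score
    return (raw_score - half) / half


def in_good_range(score):
    """True when a normalized verify or PMF score lies between 0 and 1.0."""
    return 0.0 <= score <= 1.0


def seqfold_significant(seqfold_score):
    """A SeqFold score of more than 50 is considered significant."""
    return seqfold_score > 50


if __name__ == "__main__":
    score = normalized_verify_score(raw_score=75.0, high_score=100.0)
    print(score, in_good_range(score))
```

A raw score equal to the high score maps to 1.0 and a raw score of half the high score maps to 0, which is why a single cutoff can be applied to proteins of different lengths.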
Table 6 shows the position of the signal peptide in each of the polypeptides and the maximum score and mean score associated with that signal peptide using Neural Network SignalP V1.1 program (from Center for Biological Sequence Analysis, The Technical University of Denmark). The process for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by Henrik Nielson, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum S score and a mean S score, as described in the Nielson et al. reference, were obtained for the polypeptide sequences.
Table 7 correlates each of SEQ ID NO: 1-1041 to a specific chromosomal location.
Table 9 is a correlation table of the novel polynucleotide sequences SEQ ID
NO: 1-1041, their corresponding polypeptide sequences SEQ ID NO: 1042-2082, their corresponding priority contig nucleotide sequences SEQ ID NO: 2083-2534, their corresponding priority contig polypeptide sequences SEQ ID NO: 2535-2986, and the US
serial number of the priority application in which the contig sequence was filed.
Table 10 is a correlation table of the novel polynucleotide sequences SEQ ID
NO: 1-1041, the novel polypeptide sequences SEQ ID NO: 1042-2082, and the corresponding SEQ
ID NO in which the sequence was filed in priority US application 60/311,261.

Table 1
Tissue Origin | RNA/Tissue Source | Library Name | SEQ ID NO:
adrenal gland | Clontech | ADR002 | 13 23 34 45 77 111
adult bladder | Invitrogen | BLD001 | 9 87 189 320-321
adult brain | Clontech | ABR001 | 184-186 277 282 352
adult brain | Clontech | ABR006 | 30 45 170 199 210
adult brain | Clontech | ABR008 | 15 45 54 61 67 81
adult brain | Clontech | ABR011 | 1012
adult brain | GIBCO | AB3001 | 23 57-58 67 85 296
adult brain | GIBCO | ABD003 | 45 59-62 67 72 82
adult brain | Invitrogen | ABR014 | 45 115 238 470 599
adult brain | Invitrogen | ABR015 | 45 600 885 1012
adult brain | Invitrogen | ABR016 | 599 1012
adult brain | Invitrogen | ABT004 | 34 45 54 74 84 118
adult cervix | BioChain | CVX001 | 23 26 48 54 57 67
adult colon | Invitrogen | CLN001 | 250 322-325 429 630
adult heart | GIBCO | AHR001 | 28-30 45 61 67 90-94
adult kidney | GIBCO | AKD001 | 24 31-34 44-46 48
adult kidney | Invitrogen | AKT002 | 32 53-54 67 85 177
adult liver | Clontech | ALV003 | 101 121 193 579 638-639
adult liver | Invitrogen | ALV002 | 75 157 173 183 212-214
adult lung | GIBCO | ALG001 | 67 77 152 369 386
adult ovary | Invitrogen | AOV001 | 5 26 34 43 45 48 55
adult placenta | Clontech | APL001 | 67 419 688 728 848
adult spleen | Clontech | SPLc01 | 82 101 187 255 260
adult spleen | GIBCO | ASP001 | 87 105 108 122 158
adult testis | GIBCO | ATS001 | 68-69 106 183 251
bone marrow | Clontech | BMD001 | 10-12 16-19 24-26
bone marrow | GF | BMD002 | 23 45 81-82 104-105
cultured preadipocytes | Stratagene | ADP001 | 121 255 400 490-494
endothelial cells | Stratagene | EDT001 | 34 45 54 58 67 120-122
fetal brain | Clontech | FBR001 | 139 168 356 599 702
fetal brain | Clontech | FBR004 | 138 168 250 363 873-875
fetal brain | Clontech | FBR006 | 14 29 45 51 81 87
fetal brain | GIBCO | HFB001 | 13-15 54-57 62 67
fetal brain | Invitrogen | FBT002 | 7 45 49 144-149 157
fetal heart | Invitrogen | FHR001 | 24 45 81-82 104 114-115 569 571 576 582 596 668 674-688 719-722
fetal kidney | Clontech | FKD001 | 82 107 208 458 483
fetal kidney | Clontech | FKD002 | 61 101 105 183 189
fetal kidney | Invitrogen | FKD007 | 116
fetal liver | Clontech | FLV002 | 410 429 454 692-695
fetal liver | Clontech | FLV004 | 67 107 115 118 151
fetal liver | Invitrogen | FLV001 | 45 101 130-137 157
fetal liver-spleen | Columbia University | FLS001 | 1-9 18 20-23 27 34 67 70 83 89 94 118
fetal liver-spleen | Columbia University | FLS002 | 3 8 17 22 36-37 46 72 85 89-90 94 106 973 980 992 999 1003
fetal liver-spleen | Columbia University | FLS003 | 23 67 106 150 158 376 411 443 478 493
fetal lung | Clontech | FLG001 | 728 824 1008
fetal lung | Clontech | FLG004 | 115 668
fetal lung | Invitrogen | FLG003 | 120 183 322 333-336
fetal muscle | Invitrogen | FMS001 | 45 338-339 365 369
fetal muscle | Invitrogen | FMS002 | 45 115 171 247 327
fetal skin | Invitrogen | FSK001 | 29 57 67 74 81 118
fetal skin | Invitrogen | FSK002 | 34 45 77 81 85 115
fibroblast | Stratagene | LFB001 | 55 72 143 255 490
induced neuron-cells | Stratagene | NTD001 | 30 82 111 124 181
infant brain | Columbia University | IB2002 | 18 21 45 66 73-75 152 168-171 177 180
infant brain | Columbia University | IB2003 | 81 101 113 118 177 293 340 345 367 371
infant brain | Columbia University | IBM002 | 168 358 413-414 913
infant brain | Columbia University | IBS001 | 415 417 533 581 886-888
leukocyte | Clontech | LUG003 | 77 619 889 949
leukocyte | GIBCO | LUC001 | 34 36 38-42 50-52
leg 55 72 143
lung tumor | Invitrogen | LGT002 | 55 61 65 77-79 82
lymph node | Clontech | ALN001 | 47 63 104-105 183
lymphocytes | ATCC | LPC001 | 45 53 77 158 193 251
macrophage | Invitrogen | HMP001 | 122 147 157 183 251
mammary gland | Invitrogen | MMG001 | 45 64 67 83-84 101
melanoma from-cell-line-ATCC-#CRL-1424 | Clontech | MEL004 | 62 158 181 298 362 515 536 896-897 958
*Mixture of 16 tissues - mRNA | Various Vendors | CGd010 | 353 358 823 942 982 1020
*Mixture of 16 tissues - mRNA | Various Vendors | CGd011 | 569 630 944 955 999
*Mixture of 16 tissues - mRNA | Various Vendors | CGd012 | 9 38 59 63 80 85 122-123 152 154 177 195 217 232
*Mixture of 16 tissues - mRNA | Various Vendors | CGd013 | 232 434 748 956-958 992
*Mixture of 16 tissues - mRNA | Various Vendors | CGd015 | 18 69 115 324 335 548 551 569 582 600 622 731 819
*Mixture of 16 tissues - mRNA | Various Vendors | CGd016 | 46 172 183 323 371 481 493 565 569 571 596 599 630
neuronal cells | Stratagene | NTU001 | 7 33 45 107 113 121
pituitary gland | Clontech | PIT004 | 158 222 255 345 356
placenta | Clontech | PLA003 | 7 36 61 279 419 478
placenta | Invitrogen | APL002 | 57 173 536 728 793
prostate | Clontech | PRT001 | 26 219-222 229 412
rectum | Invitrogen | REC001 | 9 292 343-346 431
retinoic acid-induced-neuronal-cells | Stratagene | NTR001 | 112 400 478 569 582 758 800 819 831 835-836
salivary gland | Clontech | SAL001 | 58 61 77 118 150 158
skeletal muscle | Clontech | SKM001 | 80 118 247 365 483
small intestine | Clontech | SIN001 | 34 37 45 52 60 93
spinal cord | Clontech | SPC001 | 51 164 182-183 190
stomach | Clontech | STO001 | 72 222 232 247 258
thalamus | Clontech | THA002 | 45 49 113 155 164
thymus | Clontech | THM001 | 45 141 160 183 258
thymus | Clontech | THMc02 | 47 108 115 121 144
thyroid gland | Clontech | THR001 | 46 58 67 80 82 144
trachea | Clontech | TRC001 | 45 154 236 238 281
umbilical cord | BioChain | FUC001 | 34 45 54 58 67 70
uterus | Clontech | UTR001 | 177 237-239 255 258
young liver | GIBCO | ALV001 | 45 419 440 443 490
*The 16 tissue/mRNAs and their vendor sources are as follows: 1) Normal adult brain mRNA (Invitrogen), 2) Normal adult kidney mRNA (Invitrogen), 3) Normal fetal brain mRNA (Invitrogen), 4) Normal adult liver mRNA (Invitrogen), 5) Normal fetal kidney mRNA (Invitrogen), 6) Normal fetal liver mRNA (Invitrogen), 7) normal fetal skin mRNA (Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) Human bone marrow mRNA (Clontech), 10) Human leukemia lymphoblastic mRNA (Clontech), 11) Human thymus mRNA (Clontech), 12) human lymph node mRNA (Clontech), 13) human spinal cord mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15) human esophagus mRNA (BioChain), 16) human conceptional umbilical cord mRNA (BioChain).

Table 2
SEQ ID NO: | Accession No. | Species | Description | Score | % Identity
1044 | AAB32400 | Homo sapiens | HUMA- Human secreted protein sequence encoded by gene 30 SEQ ID NO:86. | 339 | 100
1044 | AAM74711 | Homo sapiens | MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 35017. | 335 | 100
1044 | AAM61909 | Homo sapiens | MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 34014. | 335 | 100
1045 | gi3859599 | Arabidopsis thaliana | similar to class I chitinases (Pfam: PF00182, E=1.2e-142, N=1) | 74 | 27
1045 | gi15292107 | Drosophila melanogaster | LD38671p | 74 | 33
1045 | gi2258324 | Fusarium oxysporum f. sp. ciceris | yellowing-associated protein | 73 | 32
1046 | gi17428204 | Ralstonia solanacearum | CONSERVED HYPOTHETICAL PROTEIN | 74 | 32
1046 | gi4314432 | Homo sapiens | similar to phosphatidylinositol (4,5)bisphosphate 5-phosphatase; match to PID:g1399105 | 71 | 30
1046 | gi|17545909|ref|NP_519311.1 | Ralstonia solanacearum | CONSERVED HYPOTHETICAL PROTEIN | 74 | 32
1047 | gi9756017 | Actinoplanes sp. | alpha-amylase | 69 | 38
1047 | gi|6572499|gb|AAF17291.1 | Homo sapiens | LHX3 protein | 67 | 26
1047 | gi|18572988|ref|XP_029170.2 | Homo sapiens | LIM homeobox protein | 67 | 26
1048 | AAY28474 | Homo sapiens | UYJO Human Capon protein. | 721 | 99
1048 | gi2895555 | Homo sapiens | carboxyl-terminal PDZ ligand of neuronal nitric oxide synthase | 721 | 99
1048 | gi2895557 | Rattus norvegicus | carboxyl-terminal PDZ ligand of neuronal nitric oxide synthase | 654 | 92
1049 | gi19713721 | Fusobacterium nucleatum subsp. nucleatum | GTP-binding protein era | 66 | 28
1050 | gi31291 | Homo sapiens | fumarylacetoacetase (AA 1-349) | 175 | 70
1050 | gi182393 | Homo sapiens | fumarylacetoacetate hydrolase | 175 | 70
1050 | gi12803409 | Homo sapiens | fumarylacetoacetate | 175 | 70
1052 | gi4680089 | Human immunodeficiency virus a | envelope glycoprotein | 79 | 26
1052 | gi3868997 | Ephydatia fluviatilis | EFPDE2 | 74 | 20
1052 | gi4679590 | Human immunodeficiency virus type | envelope glycoprotein | 74 | 25
1054 | gi3844648 | Mycoplasma genitalium | glycerol kinase (glpK) | 71 | 28

1054 | gi18448155 | Ipomoea leaf curl virus | AC3 | 70 | 27
1054 | gi|12044888|ref|NP_072698.1 | Mycoplasma genitalium | glycerol kinase (glpK) | 71 | 28
1056 | AAM56747 | Homo sapiens | MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 28852. | 229 | 72
1056 | AAM67067 | Homo sapiens | MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27373. | 224 | 69
1056 | AAM54664 | Homo sapiens | MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26769. | 224 | 69
1058 | gi|13310191|gb|AAK18189.1|AF331500_1 | multiple sclerosis associated retrovirus element | recombinant envelope protein | 228 | 79
1058 | gi|21103962|gb|AAM33141.1 | Homo sapiens | enverin-2 | 209 | 77
1058 | gi|8272468|gb|AAF74215.1|AF15696 | Homo sapiens | envelope protein | 198 | 75
1059 | gi20380199 | Homo sapiens | Similar to LOC168246 | 251 | 100
1059 | gi|8388692|emb|CAB94042.1 | Leishmania major | probable DNA-binding protein | 67 | 46
1060 | gi|21292780|gb|EAA04925.1 | Anopheles gambiae str. PEST | agCP4203 | 70 | 39
1061 | gi330862 | Equine herpesvirus | membrane glycoprotein | 179 | 30
1061 | gi17221106 | Equine herpesvirus | glycoprotein gp2 | 178 | 34
1061 | AAE03643 | Homo sapiens | INCY- Human extracellular matrix and cell adhesion molecule-7 (XMAD-7). | 175 | 29
1062 | gi|11037117|gb|AAG27485.1|AF | Homo sapiens | NAG13 | 334 | 66
1062 | gi|1335205|emb|CAA36480.1 | Homo sapiens | ORFII | 332 | 66
1063 | gi21323402 | Corynebacterium glutamicum | ABC-type transporter, periplasmic component | 70 | 36
1063 | gi|19551869|ref|NP_599871.1 | Corynebacterium glutamicum | COG1464:ABC-type uncharacterized transport systems, periplasmic component | 70 | 36
1063 | gi|17551878|ref|NP 90.1 | Caenorhabditis elegans | TPR Domain | 67 | 37
1064 | gi2308977 | Aspergillus nidulans | chitin synthase | 66 | 29
1065 | gi18076958 | Yarrowia lipolytica | Opt1 protein | 74 | 30
1065 | gi786145 | Walleye dermal sarcoma virus | envelope polyprotein | 73 | 28
1065 | gi2801522 | Walleye dermal sarcoma virus | gPr env | 73 | 28
1066 | gi9294279 | Arabidopsis thaliana | Ta11-like non-LTR retroelement protein-like; CHP-rich zinc finger protein-like | 67 | 32
1066 | gi|20848817|ref|XP_138010.1 | Mus musculus | similar to HEAT SHOCK COGNATE PROTEIN 80 | 83 | 69
1069 | AAM77637 | Homo sapiens | MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 37943. | 96 | 65
1069 | AAM64901 | Homo sapiens | MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 37006. | 96 | 65
1069 | gi|17473741|ref||062380.1 | Homo sapiens | similar to Meningioma-expressed antigen 6/11 (MEA6) (MEA11) | 112 | 56
1070 | gi296288 | Homo sapiens | histone H1 | 77 | 44
1070 | gi5923857 | Artemisia annua | squalene synthase | 75 | 35
1070 | AAO08837 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 22729. | 73 | 39
1071 | gi21483554 | Drosophila melanogaster | SD02058p | 72 | 29
1071 | gi8515845 | Homo sapiens | hepatocellular carcinoma associated protein TD26 | 71 | 38
1071 | gi|21483554|gb|AAM52752.1 | Drosophila melanogaster | SD02058p | 72 | 29
1072 | gi5902896 | Streptomyces avermitilis | type I polyketide synthase | 74 | 50
1072 | gi|21301752|gb|EAA13897.1 | Anopheles gambiae str. PEST | agCP8235 | 70 | 34
1073 | AAV30916_aa1 | Homo sapiens | GEMY Human secreted protein AR415 4 cDNA. | 99 | 66
1073 | ABB89113 | Homo sapiens | HUMA- Human polypeptide SEQ ID NO 1489. | 99 | 66
1073 | AAB90679 | Homo sapiens | GEMY Human AR415 4 protein sequence SEQ ID 35. | 99 | 66
1074 | AAG99338 | Homo sapiens | TAKE Human atypical tachykinin protein fragment SEQ ID NO: 20. | 380 | 92
1074 | AAG99336 | Homo sapiens | TAKE Human atypical tachykinin protein fragment SEQ ID NO: 13. | 329 | 91
1074 | AAG99333 | Homo sapiens | TAKE Human atypical tachykinin protein fragment SEQ ID NO: 3. | 324 | 91
1075 | gi17945760 | Drosophila melanogaster | RE33302p | 305 | 29

1075 | gi1039447 | Saccharomyces cerevisiae | Lpblp | 91 | 25
1075 | AAB64777 | Homo sapiens | HUMA- Human secreted protein sequence encoded by gene NO:63. | 78 | 77
1076 | AAB50261 | Homo sapiens | CORI- Human breast cancer associated B726P-20 protein. | 308 | 39
1076 | AAB50244 | Homo sapiens | CORI- Human breast cancer associated B726P-79 protein. | 308 | 39
1076 | AAB84702 | Homo sapiens | CORI Amino acid sequence of a human cancer associated antigen. | 308 | 39
1077 | gi2529735 | Gorilla gorilla | glycophorin B/E precursor | 71 | 31
1077 | AAB74724 | Homo sapiens | INCY- Human membrane associated protein MEMAP-30. | 70 | 31
1077 | gi4164424 | Schizosaccharomyces pombe | similar to yeast cytoskeleton control protein Bni1p | 70 | 24
1078 | gi18145107 | Clostridium perfringens | probable transcriptional regulator | 71 | 28
1078 | gi|9581801|emb|CAC00546.1 | Plasmodium falciparum | guanylyl cyclase | 69 | 24
1078 | gi|16805032|ref|NP_473061.1 | Plasmodium falciparum | Ser/Thr protein kinase | 69 | 26
1079 | gi|20886321|ref|XP_140614.1 | Mus musculus | similar to olfactory receptor, family 5, subfamily V, member 1; olfactory receptor, family 5, subfamily V member 1 | 72 | 34
1081 | gi9650824 | Petroselinum crispum | common plant regulatory factor 5 | 76 | 28
1081 | gi559695 | Hydrolagus colliei | This CDS feature is included to show the translation of the corresponding C_region. Presently translation qualifiers on C_region features are illegal | 74 | 31
1081 | gi476622 | Hydrolagus colliei | immunoglobulin light chain | 74 | 31
1082 | AAM39205 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 2350. | 363 | 71
1082 | AAO07159 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 21051. | 357 | 76
1082 | AAM40991 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 5922. | 343 | 79
1083 | gi|17229222|ref|NP_485770.1 | Nostoc sp. PCC 7120 | similar to HetF protein | 72 | 30
1084 | gi17221628 | Felis catus | T-lymphocyte surface CD2 antigen | 76 | 38
1084 | gi18565073 | Crimean-Congo hemorrhagic fever virus | envelope glycoprotein precursor | 74 | 29
1084 | gi|17221628|dbj|BAB78475.1 | Felis catus | T-lymphocyte surface CD2 antigen | 76 | 38
1085 | gi17430213 | Ralstonia solanacearum | PUTATIVE HEMAGGLUTININ-RELATED PROTEIN | 74 | 26
1087 | gi2323287 | multiple sclerosis associated retrovirus | polyprotein | 618 | 79
1087 | gi|4996596|dbj|BAA78549.1 | Human endogenous retrovirus W | polyprotein | 317 | 74
1087 | gi|9630708|ref|NP_047255.1 | Feline leukemia virus | gag-pol precursor polyprotein gPr80 | 293 | 38
1088 | gi15075953 | Sinorhizobium meliloti | PUTATIVE MOLYBDENUM TRANSPORT SYSTEM PERMEASE ABC TRANSPORTER PROTEIN | 70 | 56
1088 | gi2288880 | Arthrobacter nicotinovorans | transmembrane protein | 67 | 56
1088 | gi17298547 | Bradyrhizobium japonicum | ModB | 67 | 56
1089 | AAY95660 | Homo sapiens | ZYMO Human Zntr2 protein. | 231 | 61
1089 | AAU83682 | Homo sapiens | GETH Human PRO protein, Seq ID No 182. | 210 | 59
1089 | AAY99386 | Homo sapiens | GETH Human PRO1305 (UNQ671) amino acid sequence SEQ ID NO:153. | 210 | 59
1090 | gi7688355 | Solanum tuberosum | Dof zinc finger protein | 70 | 31
1090 | gi4389445 | Drosophila melanogaster | transcription factor | 67 | 32
1090 | gi|7688355|emb|CAB89831.1 | Solanum tuberosum | Dof zinc finger protein | 70 | 31
1092 | AAG78884 | Homo sapiens | BIOW- Human ribosomal protein s5-17. | 90 | 44
1092 | AAM91239 | Homo sapiens | HUMA- Human immune/haematopoietic antigen SEQ ID NO:18832. | 72 | 53
1092 | AAM95026 | Homo sapiens | HUMA- Human reproductive system related antigen SEQ ID NO: 3684. | 72 | 48
1094 | gi18676450 | Homo sapiens | FLJ00122 protein | 69 | 38
1094 | gi18073428 | Homo sapiens | stabilin-2 | 69 | 38
1094 | gi|20806091|ref|NP_060034.8 | Homo sapiens | stabilin-2; CD44-like precursor FELL | 69 | 38
1095 | gi20906397 | Methanosarcina mazei Goe1 | conserved protein | 76 | 44
1095 | gi|21299784|gb|EAA11929.1 | Anopheles gambiae str. PEST | agCP6531 | 75 | 30
1095 | gi|17549046|ref|NP_522386.1 | Ralstonia solanacearum | CONSERVED HYPOTHETICAL PROTEIN | 73 | 32
1096 | AAB58317 | Homo sapiens | ROSE/ Lung cancer associated polypeptide sequence SEQ ID 655. | 678 | 100
1096 | gi862600 | Drosophila melanogaster | male-specific lethal-1 protein | 176 | 25

1096 | gi601930 | Oryctolagus cuniculus | neurofilament-H | 115 | 24
1097 | AAU83109 | Homo sapiens | ZYMO Novel secreted protein Z701935G4P. | 76 | 85
1097 | gi|20348496|ref|XP_111712.1 | Mus musculus | similar to RIKEN cDNA | 72 | 57
1098 | gi18031887 | Mus musculus | Fanconi anemia complementation group G | 77 | 29
1098 | gi12002137 | Mus musculus | Fanconi anemia group G protein | 77 | 29
1098 | AAB72381 | Homo sapiens | LEEM/ Human hairy and enhancer of Split homologue amino acid sequence. | 75 | 28
1099 | gi8217648 | Homo sapiens | dJ579F20.1 (high-mobility group (nonhistone chromosomal) protein 1-like 1) | 159 | 70
1099 | gi5815432 | Gallus gallus | high mobility group protein HMG1 | 154 | 70
1099 | gi4140289 | Gallus gallus | high mobility group 1 protein | 154 | 70
1100 | ABB11527 | Homo sapiens | HYSE- Human apolipoprotein B receptor homologue, SEQ ID NO:1897. | 84 | 26
1100 | gi487347 | Homo sapiens | breakpoint cluster region protein | 81 | 32
1100 | gi144050 | Bordetella pertussis | filamentous hemagglutinin | 78 | 30
1102 | AAM68946 | Homo sapiens | MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 29252. | 327 | 81
1102 | AAM79768 | Homo sapiens | HYSE- Human protein SEQ ID NO 3414. | 324 | 80
1102 | AAM78784 | Homo sapiens | HYSE- Human protein SEQ ID NO 1446. | 324 | 80
1103 | AAZ11186_aa1 | Homo sapiens | SAGA Gene encoding transmembrane domain containing protein clone HP02239. | 143 | 68
1103 | AAD31079_aa1 | Homo sapiens | INCY- Human cornichon protein (CORN) cDNA. | 143 | 68
1103 | AAA88439_aa1 | Homo sapiens | GETH Antitumour PRO181 cDNA clone DNA23330-1390. | 143 | 68
1104 | ABB07527 | Homo sapiens | INCY- Human drug metabolizing enzyme (DME) (ID: 5643401CD1). | 562 | 100
1104 | ABB07515 | Homo sapiens | INCY- Human drug metabolizing enzyme (DME) (ID: 8097779CD1). | 562 | 100
1104 | gi13161409 | Mus musculus | family 4 cytochrome | 431 | 76
1107 | gi13542874 | Mus musculus | Similar to CGI-67 protein | 677 | 64
1107 | AAU81978 | Homo sapiens | INCY- Human secreted protein SECP4. | 665 | 65
1107 | AAU77137 | Homo sapiens | MILL- Human alpha/beta hydrolase 38618 polypeptide. | 665 | 65
1108 | gi13620885 | Homo sapiens | mitochondrial ribosomal protein S6 | 323 | 100
1108 | gi13620887 | Mus musculus | mitochondrial ribosomal protein S6 | 284 | 82
1108 | gi19713140 | Fusobacterium nucleatum subsp. nucleatum | Fusobacterium outer membrane protein family | 79 | 28
1109 | gi18378673 | Homo sapiens | PATE | 607 | 89
1109 | gi5305193 | Rattus norvegicus | sperm protein 10 | 108 | 30

1109 gi969103 Mus musculusmSP-10 107 27 1110 12462979 Bos taurus Tenascin-X 119 34 1110 g13413958Homo SapiensLDL rece for related 110 27 rotein 105 1110 g113938519Homo Sapienslow density lipoprotein110 27 receptor-related protein 3 1111 g117981053Mus musculustranscri tion factor 82 32 NFATS

1111 g115425825Mus musculustonicity-responsive 82, 32 enhancer binding rotein 1111 g16911148Mus musculustranscription factor 82 32 NFATS isoform b 1112 g16634473Metarhizium adenylate cyclase, ACY 73 . 30 anisopliae var.

anisopliae 1113 AAU19759 Homo SapiensHUMA- Human novel extracellular900 70 matrix rotein, Seq ID
No 409.

1113 | g13171934 | Mus musculus | neuronal-STOP protein | 886 | 52
1113 | g12769587 | Mus musculus | STOP protein | 885 | 52
1114 | g118652188 | Oenococcus oeni | OppF | 72 | 41
1115 | g19119 | Drosophila sp. | fos-related antigen | 69 | 37
1115 | g17769652 | Drosophila melanogaster | Fos-related antigen | 69 | 37
1115 | g117862946 | Drosophila melanogaster | SD04477p | 69 | 37
1116 | 121212948 | Mus musculus | peroxisomal protein (PeP) | 243 | 83
1116 | 12347114 | Mus musculus | CC chemokine receptor-5 | 72 | 28
1116 | 12431976 | Mus musculus | CCR5 | 72 | 28
1117 | gi|20825251|ref|XP_131998.1| | Mus musculus | similar to RE1-silencing transcription factor; neuron restrictive silencer factor; repressor binding to the X2 box | 77 | 40
1117 | gi|15597871|ref|NP_251365.1| | Pseudomonas aeruginosa | probable type II secretion system protein | 69 | 41
1118 | gi|3860513|emb|CAA13574.1| | Mus famulus | reverse transcriptase | 303 | 82
1118 | gi|3860536|emb|CAA13577.1| | Mus saxicola | reverse transcriptase | 303 | 81
1118 | gi|3860510|emb|CAA13573.1| | Mus dunni | reverse transcriptase | 298 | 63
1119 | AAO04758 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 18650. | 234 | 59

1119 | AAM69569 | Homo sapiens | MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 29875. | 220 | 63
1119 | AAM67717 | Homo sapiens | MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 28023. | 219 | 49
1120 | g121107877 | Xanthomonas axonopodis pv. citri str. | cytochrome C | 78 | 27
1120 | g115292331 | Drosophila melanogaster | LD47230p | 77 | 42
1120 | 115072444 | Avian paramyxovirus | phosphoprotein | 72 | 38
1121 | AAB44126 | Homo sapiens | HUMA- Human cancer associated protein sequence SEQ ID NO:1571. | 150 | 83

1121 | gi550015 | Homo sapiens | ribosomal protein L21 | 150 | 83
1121 | gi619788 | Homo sapiens | L21 ribosomal protein | 150 | 83
1122 | AAU74448 | Homo sapiens | OULU- Human protein sequence of lysyl hydroxylase 1 (LH1). | 125 | 100
1122 | 1190074 | Homo sapiens | lysyl hydroxylase | 125 | 100
1122 | g15817297 | Homo sapiens | lysyl hydroxylase 1 | 125 | 100
1123 | g121281601 | Caenorhabditis elegans | C. elegans PQN-44 protein (corresponding sequence F55A12.9c) | 78 | 34
1123 | g114578225 | Caenorhabditis elegans | C. elegans PQN-44 protein (corresponding sequence F55A12.9b) | 76 | 38
1123 | g12088669 | Caenorhabditis elegans | C. elegans PQN-44 protein (corresponding sequence F55A12.9a) | 76 | 38
1125 | AAU17301 | Homo sapiens | HUMA- Novel signal transduction pathway protein, Seq ID 866. | 344 | 88
1125 | AAE11776 | Homo sapiens | INCY- Human kinase (PKIN)-10 protein. | 344 | 88
1125 | AAU17304 | Homo sapiens | HUMA- Novel signal transduction pathway protein, Seq ID 869. | 340 | 86

1126 | AAM41712 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 6643. | 152 | 96
1126 | AAM39926 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 3071. | 152 | 96
1126 | AAM79067 | Homo sapiens | HYSE- Human protein SEQ ID NO 1729. | 152 | 96
1127 | AAE02938 | Homo sapiens | MILL- Human adenylate cyclase 25678. | 252 | 98
1127 | AAB02006 | Homo sapiens | TEXA Adenylyl cyclase type II-C2 C2 alpha domain. | 252 | 98
1127 | g1202752 | Rattus norvegicus | adenylyl cyclase type II | 252 | 98
1128 | AAA94860_aa1 | Homo sapiens | TEXA Human caspase activator Smac coding sequence. | 96 | 100
1128 | AAU78447 | Homo sapiens | UYJE- Inhibitor of apoptosis (IAP) protein Smac. | 96 | 100
1128 | AAB26210 | Homo sapiens | TEXA Human caspase activator Smac. | 96 | 100

1129 | g13874765 | Caenorhabditis elegans | Similarity to Drosophila acetylcholine receptor protein (SW:ACH1_DROME), contains similarity to Pfam domain: (Neurotransmitter-gated ion-channel), Score=296.9, E-value=5e-86, N=3 | 97 | 30
1129 | g16681597 | Yaba monkey tumor virus | similar to vaccinia G8R | 72 | 28
1129 | gi|17548199|ref|NP 32.1| | Caenorhabditis elegans | acetylcholine receptor | 97 | 30
1130 | gi|17564116|ref|NP_506484.1| | Caenorhabditis elegans | tyrosine-protein kinase | 73 | 29
1131 | 113925613 | Homo sapiens | insulinoma-associated protein IA-6 | 88 | 27
1131 | g1158485 | Drosophila melanogaster | son of sevenless protein | 85 | 24
1131 | gi7287782 | 05-Feb-1998 | symbol=Sos; synonym=BG:DS00941.4; match=method:"sim4", score:"1000.0", desc:"GenBank::M83931:Drosophila melanogaster son of sevenless (Sos) mRNA, complete cds. CDS:346..5133; PID:g158485.", species:"Drosophila melanogaster"; match=method:"BLASTX", version:"2.0a19MP-WashU [Build sol2.5-ultra 01:47:30 | 85 | 24
1132 | gi9696 | Mytilus edulis | polyphenolic adhesive protein | 75 | 25
1134 | gi13562016 | Plectreurys tristis | fibroin 2 | 72 | 29
1134 | gi1129074 | Bacillus subtilis | beta-N-acetylglucosaminidase | 69 | 28
1134 | gi2636104 | Bacillus subtilis | N-acetylglucosaminidase (major autolysin (CWBP90) | 69 | 28
1135 | AAB58870 | Homo sapiens | HUMA- Breast and ovarian cancer associated antigen protein sequence SEQ ID 578. | 72 | 80

1135 | 111595476 | Homo sapiens | RPB11b1beta protein | 72 | 80
1135 | AAB44840 | Homo sapiens | HUMA- Human secreted protein encoded by gene 11. | 69 | 45
1137 | g1206985 | Rattus norvegicus | troponin I | 70 | 46
1137 | g116945895 | Takifugu rubripes | SUN-like 1 | 70 | 31
1137 | gi|8394466|ref|NP_ 81.1| | Rattus norvegicus | troponin I, skeletal, fast 2 | 70 | 46
1140 | AAO04998 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 18890. | 277 | 96
1140 | g119917538 | Methanosarcina acetivorans str. C2A] | mttA/Hcf106 protein [Methanosarcina acetivorans | 80 | 28
1140 | 14959705 | Mus musculus | fibulin-2 | 76 | 28
1141 | g110141010 | Vesicular exanthema of swine virus | non-structural polyprotein | 91 | 31
1141 | g16566147 | Drosophila melanogaster | large Forked protein | 85 | 30
1141 | g12317953 | murid herpesvirus | glycoprotein 150 | 79 | 28
1142 | AAB54067 | Homo sapiens | HUMA- Human pancreatic cancer antigen protein sequence SEQ ID NO:519. | 218 | 56
1142 | g11710365 | Mus musculus | noggin | 89 | 29
1142 | g121105761 | Equus caballus | noggin | 89 | 29
1143 | gi|21295753|gb|EAA07898.1| | Anopheles gambiae str. PEST | agCP1560 | 69 | 26

1144 | g1505094 | Homo sapiens | similar to an actin bundling protein, dematin. | 127 | 35
1144 | gi2337952 | Homo sapiens | actin-binding double-zinc-finger protein | 122 | 36
1144 | gi21304227 | Oryza sativa | ovule development aintegumenta-like protein BNM3 | 76 | 29
1145 | gi|21298336|gb|EAA10481.1| | Anopheles gambiae str. PEST | agCP2121 | 68 | 37
1146 | AAW22049 | Homo sapiens | INCY- Interferon gamma inducing factor-2 (IGIF-2) alternate transcript variant. | 221 | 100
1146 | AAV05368_aa1 | Homo sapiens | SCHE cDNA encoding human interleukin-1-gamma. | 167 | 84
1146 | AAH78060_aa1 | Homo sapiens | STRD Nucleotide sequence of human interleukin 18 (IL-18). | 167 | 84
1147 | AAY57937 | Homo sapiens | INCY- Human transmembrane protein HTMPN-61. | 123 | 100
1147 | gi|20345904|ref|XP_109823.1| | Mus musculus | similar to delta-like homolog (Drosophila) | 105 | 86
1148 | gi19069293 | Encephalitozoon cuniculi | similarity to ADP/ATP CARRIER PROTEIN | 75 | 32
1148 | gi8978336 | Arabidopsis thaliana | contains similarity to CHP-rich zinc finger protein~gene_id:K23F3.4 | 74 | 26
1148 | gi19716318 | Aspergillus flavus | antigenic cell wall protein MP1 | 74 | 32
1149 | gi5456699 | Emericella nidulans | ATP-binding cassette multidrug transport protein ATRC | 70 | 35
1149 | gi|20898840|ref|XP_139387.1| | Mus musculus | similar to HSPC038 protein | 69 | 31

1150 | gi3883128 | Arabidopsis thaliana | arabinogalactan-protein | 96 | 32
1150 | gi17429208 | Ralstonia solanacearum | CONSERVED HYPOTHETICAL PROTEIN | 92 | 26
1150 | gi4063766 | Emericella nidulans | chitinase | 91 | 27
1151 | gi13561058 | Homo sapiens | dJ1108D11.1 (novel protein similar to C. elegans T22C1.7) | 107 | 31
1151 | gi21105299 | Mytilus galloprovincialis | precollagen-NG | 105 | 26
1151 | gi14164347 | Oncorhynchus mykiss | collagen a1(I) | 96 | 28
1152 | gi18479434 | Mus musculus | olfactory receptor MOR188-1 | 76 | 33
1152 | gi2653915 | Oran virus | glycoprotein G1 and G2 precursor; envelope glycoprotein precursor | 72 | 46
1152 | gi18479436 | Mus musculus | olfactory receptor MOR188-2 | 72 | 33
1153 | gi3403167 | Homo sapiens | GBAS | 161 | 86
1153 | 112804791 | Homo sapiens | glioblastoma amplified sequence | 161 | 86
1153 | AAB57149 | Homo sapiens | ROSEI Human prostate cancer antigen protein sequence SEQ ID NO:1727. | 134 | 81

1154 | g117742234 | Agrobacterium tumefaciens str. C58 (U. Washington) | histidase | 87 | 35
1154 | gi15159496 | Agrobacterium tumefaciens str. C58 (Cereon) | AGR_L_1400GMp | 87 | 35
1154 | gi158521 | Drosophila melanogaster | seven-up protein type | 80 | 32
1155 | gi|10441551|gb|AAG17099.1|AF189 | Cryptotermes domesticus | cytochrome b | 65 | 28
1156 | AAO12089 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 25981. | 475 | 98
1156 | gi20147787 | Xenopus laevis | nuclear receptor corepressor | 74 | 25
1156 | gi19881705 | Oryza sativa | Putative transposable element | 72 | 32
1157 | 19963851 | Homo sapiens | HT019 | 80 | 34
1157 | AAB93530 | Homo sapiens | HELI- Human protein sequence SEQ ID NO:12884. | 77 | 34
1157 | 11040970 | Homo sapiens | fus-like protein | 77 | 42
1158 | 19795254 | Sepia officinalis | GABA-A receptor beta subunit | 71 | 27
1158 | g115026157 | Clostridium acetobutylicum | amidase, germination specific (cwlC/cwlD B.subtilis ortholog) | 68 | 34
1158 | gi|9795254|gb|AAF97816.1| | Sepia officinalis | GABA-A receptor beta subunit | 71 | 27
1159 | AAB93423 | Homo sapiens | HELI- Human protein sequence SEQ ID NO:12641. | 336 | 100

1159 | g113097768 | Homo sapiens | Similar to RIKEN cDNA gene | 336 | 100
1159 | g120071708 | Mus musculus | RIKEN cDNA 2900073H19 gene | 334 | 96
1160 | AAM72558 | Homo sapiens | MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 32864. | 274 | 100
1160 | AAM59959 | Homo sapiens | MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 32064. | 274 | 100
1161 | AAB07704 | Homo sapiens | INMR Protein encoded by the endogenetic fragment of HERV-W. | 139 | 36
1161 | g18272464 | Homo sapiens | gag | 139 | 36
1161 | gi|5726238|gb|AAD48375.1|AF123881_1 | multiple sclerosis associated retrovirus element | gag polyprotein | 131 | 35
1162 | AAU25448 | Homo sapiens | INCY- Human mddt protein from clone LG:1083264.1:2000MAY19. | 346 | 79
1162 | AAU11265 | Homo sapiens | BODE- Human zinc finger protein 51. | 319 | 65
1162 | AAB95637 | Homo sapiens | HELI- Human protein sequence SEQ ID NO:18371. | 314 | 67

1163 | g114189950 | Homo sapiens | connexin 58 | 536 | 84
1163 | g19957542 | Homo sapiens | connexin 59 | 536 | 84
1163 | 110946367 | Danio rerio | connexin 55.5 | 485 | 81
1164 | 1755700 | Bombyx mori | sericin 1B | 76 | 27
1164 | g119569861 | Dictyostelium discoideum | RTOA protein (Ratio-A). | 76 | 28
1164 | gi10580635 | Halobacterium sp. NRC-1 | Vng1087c | 76 | 25
1165 | gi19915386 | Methanosarcina acetivorans str. C2A] | WD-domain containing protein [Methanosarcina acetivorans | 89 | 28
1165 | 15639663 | Homo sapiens | WD repeat protein WDR3 | 83 | 28
1165 | g111544739 | Homo sapiens | dJ776P7.2 (WD repeat domain 3 | 83 | 28
1166 | AAM69338 | Homo sapiens | MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 29644. | 72 | 31
1166 | AAM56953 | Homo sapiens | MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 29058. | 72 | 31

1166 | g120197507 | Arabidopsis thaliana | expressed protein | 67 | 39
1167 | g15802812 | Homo sapiens | Gag protein | 83 | 30
1167 | g17160650 | Bordetella bronchiseptica | pertactin (P.68) | 79 | 31
1167 | g113173444 | Bordetella bronchiseptica | pertactin | 79 | 31
1168 | g11495029 | Danio rerio | protein kinase CK2 alpha' | 84 | 24
1168 | g1643443 | Penicillium chrysogenum | PHOG | 82 | 32
1168 | gi|18858419|ref|NP_571315.1| | Danio rerio | casein kinase 2 alpha | 84 | 24
1169 | g1206716 | Rattus norvegicus | salivary proline-rich protein | 90 | 31
1169 | g115029903 | Mus musculus | Similar to proline-rich protein BstNI subfamily 2 | 89 | 36
1169 | g153182 | Mus musculus | proline rich protein | 81 | 34
1170 | gi|17553370|ref|NP_498318.1| | Caenorhabditis elegans | F40H6.5.p | 78 | 33
1170 | gi|15215731|gb|AAK91411.1| | Arabidopsis thaliana | AT4g36780/C7A10_580 | 73 | 30
1171 | 1340446 | Homo sapiens | zinc finger protein 7 (ZFP7) | 218 | 61
1171 | AAB43928 | Homo sapiens | HUMA- Human cancer associated protein sequence SEQ ID NO:1373. | 216 | 58
1171 | AAB21040 | Homo sapiens | INCY- Human nucleic acid-binding protein, NuABP-44. | 213 | 48
1172 | AAE04368 | Homo sapiens | INCY- Human kinase (PKIN)-9. | 120 | 85
1172 | AAM79153 | Homo sapiens | HYSE- Human protein SEQ ID NO 1815. | 120 | 85
1172 | AAE10614 | Homo sapiens | CURA- Human novel STE20-like protein, NOV-3d. | 120 | 85

1173 | 1218572 | Pan troglodytes | prot GOR | 74 | 29
1173 | 1243898 | Pan | GOR | 74 | 29
1173 | 11666473 | Mus musculus | NOV protein | 71 | 50
1174 | g15901830 | Drosophila melanogaster | BcDNA.GH07910 | 74 | 31
1174 | AAM80237 | Homo sapiens | HYSE- Human protein SEQ ID NO 3883. | 71 | 38
1174 | ABB11528 | Homo sapiens | HYSE- Human secreted protein homologue, SEQ ID NO:1898. | 71 | 38
1175 | gi|12054759|emb|CAC20748.1| | Podospora anserina | catalase A | 65 | 33
1176 | AAM93289 | Homo sapiens | HELI- Human polypeptide, SEQ ID NO: 2777. | 145 | 100
1176 | gi17431512 | Ralstonia solanacearum | PUTATIVE OUTER MEMBRANE CHANNEL LIPOPROTEIN TRANSMEMBRANE | 71 | 26
1176 | gi15823991 | Streptomyces avermitilis | modular polyketide synthase | 70 | 51
1177 | AAM41939 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 6870. | 84 | 61

1177 | gi870751 | Homo sapiens | N-acetylgalactosamine 6-sulfate sulfatase (GALNS) | 84 | 61
1177 | 1618426 | Homo sapiens | N-acetylgalactosamine 6-sulphatase | 84 | 61
1178 | 1435855 | Mus sp. | CREB-binding protein; CBP | 89 | 22
1178 | AAW40058 | Homo sapiens | USSH Cellular transcriptional factor CBP. | 87 | 22
1178 | g117944308 | Drosophila melanogaster | RE12101p | 86 | 26
1179 | AAM25814 | Homo sapiens | HYSE- Human protein sequence SEQ ID NO:1329. | 73 | 93
1179 | AAM25290 | Homo sapiens | HYSE- Human protein sequence SEQ ID NO:805. | 73 | 93
1179 | AAM79441 | Homo sapiens | HYSE- Human protein SEQ ID NO 3087. | 73 | 93
1180 | AAB88388 | Homo sapiens | HELI- Human membrane or secretory protein clone PSEC0131. | 719 | 97
1180 | g120810493 | Homo sapiens | Similar to RIKEN cDNA gene | 716 | 96
1180 | AAD30543_aa1 | Homo sapiens | MILL- Human B7RP-2 DNA. | 83 | 38
1181 | ABB14686 | Homo sapiens | HUMA- Human nervous system related polypeptide SEQ ID NO 3343. | 190 | 97

1181 | g114329731 | Secale cereale | high molecular weight glutenin subunit x | 88 | 27
1181 | g114329761 | Triticum aestivum | high molecular weight glutenin subunit x | 84 | 26
1182 | 111692645 | Mus musculus | aspartyl beta-hydroxylase | 74 | 28
1182 | g111878112 | Mus musculus | aspartyl beta-hydroxylase 6.6 kb transcript | 74 | 28
1182 | g111878110 | Mus musculus | aspartyl beta-hydroxylase 4.5 kb transcript | 74 | 28
1183 | g115485622 | Homo sapiens | Q9H4T4 like | 80 | 25
1183 | g119714949 | Fusobacterium nucleatum subsp. nucleatum | TonB protein | 78 | 32
1183 | g17717375 | Homo sapiens | human CHD2-52 down syndrome cell adhesion molecule | 71 | 23
1184 | AAU83667 | Homo sapiens | GETH Human PRO protein, Seq ID No 152. | 388 | 100
1184 | AAG89161 | Homo sapiens | GEST Human secreted protein, SEQ ID NO: 281. | 388 | 100
1184 | AAY99348 | Homo sapiens | GETH Human PRO1194 (UNQ607) amino acid sequence SEQ ID NO:29. | 388 | 100
1185 | AAB93506 | Homo sapiens | HELI- Human protein sequence SEQ ID NO:12830. | 543 | 100
1185 | AAB87570 | Homo sapiens | GETH Human PRO1268. | 426 | 95
1185 | AAY78808 | Homo sapiens | PROT- Hydrophobic domain containing protein clone protein sequence. | 426 | 95

1187 | gi15823978 | Streptomyces avermitilis | modular polyketide synthase | 75 | 41
1187 | AAB66657 | Homo sapiens | HSCR- Human elastin protein without signal peptide. | 71 | 39
1187 | AAY69137 | Homo sapiens | UNSY Amino acid sequence of a human tropoelastin derivative. | 71 | 39
1188 | gi6907090 | Oryza sativa (japonica cultivar-group) | Similar to Oryza sativa root-specific RCc3 mRNA. (L27208) | 76 | 30
1188 | AAY36063 | Homo sapiens | GEST Extended human secreted protein sequence, SEQ ID NO. 448. | 74 | 26
1188 | AAY35971 | Homo sapiens | GEST Extended human secreted protein sequence, SEQ ID NO. 220. | 73 | 26
1189 | gi9827989 | Leishmania major | possible CG12797 protein | 72 | 36
1189 | gi|13625467|gb|AAK35068.1| | Leishmania donovani | LACK protective antigen | 68 | 27
1190 | gi17027071 | Xiphocentron sp. 2-Costa Rica | elongation factor-1 alpha | 107 | 27
1190 | gi310665 | Strongylocentrotus purpuratus | Nf Y-A subunit | 88 | 24
1190 | gi21743 | Triticum aestivum | high molecular weight glutenin subunit 1Ax1 | 86 | 23
1191 | gi16878287 | Homo sapiens | Similar to C-terminal modulator protein | 167 | 96
1191 | 115866714 | Homo sapiens | C-terminal modulator protein | 167 | 96
1191 | AAO06984 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 20876. | 132 | 83

1192 | AAD05496_aa1 | Homo sapiens | HUMA- Human secreted protein-encoding gene 5 cDNA clone HHBCS39, SEQ ID NO:15. | 859 | 100
1192 | AAE01707 | Homo sapiens | HUMA- Human gene 5 encoded secreted protein HHBCS39, SEQ ID NO:119. | 859 | 100
1192 | AAE01676 | Homo sapiens | HUMA- Human gene 5 encoded secreted protein HHBCS39, SEQ ID NO:88. | 859 | 100
1193 | g118650588 | Homo sapiens | retinoic acid early transcript 1 | 1312 | 99
1193 | AAB15540 | Homo sapiens | INCY- Human immune system molecule from Incyte clone 3402252. | 1283 | 97
1193 | ABB84887 | Homo sapiens | GETH Human PRO791 protein sequence SEQ ID NO:142. | 1234 | 94

1195 | 11196427 | Homo sapiens | a 2 protein | 248 | 50
1195 | g11780975 | Human endogenous retrovirus K | gag protein | 248 | 50
1195 | g11556397 | Human endogenous retrovirus K | gag | 248 | 50
1196 | g1556256 | Leishmania donovani | G protein alpha subunit | 72 | 22
1197 | AAY07237 | Homo sapiens | ISTF Wild type monocyte chemotactic protein 2. | 121 | 100
1197 | AAY05300 | Homo sapiens | ISTF C-C chemokine, MCP2. | 121 | 100
1197 | AAW42072 | Homo sapiens | INCY- Human MC proprotein. | 121 | 100
1198 | ABB57423 | Homo sapiens | HUMA- Human secreted protein encoding polypeptide SEQ ID NO 69. | 187 | 79
1198 | ABB57394 | Homo sapiens | HUMA- Human secreted protein encoding polypeptide SEQ ID NO 40. | 187 | 79
1198 | AAY59757 | Homo sapiens | META- Human normal ovarian tissue derived protein 34. | 187 | 79
1199 | AAY72603 | Homo sapiens | INCY- Human Electron Transfer Protein, ETRN-1. | 155 | 100
1199 | AAB88465 | Homo sapiens | HELI- Human membrane or secretory protein clone PSEC0259. | 155 | 100
1199 | AAE03926 | Homo sapiens | HUMA- Human gene 29 encoded secreted protein HTADC63, SEQ ID NO:89. | 155 | 100

1200 | g16458884 | Deinococcus radiodurans | chorismate mutase/prephenate dehydratase | 73 | 42
1201 | g120803920 | Mesorhizobium loti | HYPOTHETICAL PROTEIN | 68 | 32
1201 | gi|17545158|ref|NP_518560.1| | Ralstonia solanacearum | PUTATIVE LIPASE/ESTERASE PROTEIN | 66 | 31
1202 | AAM67586 | Homo sapiens | MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27892. | 69 | 30
1202 | AAM55191 | Homo sapiens | MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 27296. | 69 | 30
1202 | g1849219 | Saccharomyces cerevisiae | Pro1p: Glutamate 5-kinase (Swiss Prot. accession number P32264) | 69 | 33
1203 | g118676554 | Homo sapiens | FLJ00174 protein | 269 | 84
1203 | gi|20913341|ref|XP_126763.1| | Mus musculus | similar to FLJ00174 protein | 125 | 81
1203 | gi|20850247|ref|XP_136664.1| | Mus musculus | similar to proline-rich protein | 121 | 33
1204 | AAM68056 | Homo sapiens | MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 28362. | 140 | 84
1204 | AAM55676 | Homo sapiens | MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 27781. | 140 | 84

1205 | gi541624 | Drosophila virilis | pdm2 | 71 | 39
1205 | gi9955855 | Aspergillus oryzae | RNA polymerase II largest subunit | 69 | 38
1205 | gi662296 | Rattus norvegicus | MIBP1 | 68 | 32
1206 | ABB50703 | Homo sapiens | HUMA- Human secreted protein encoded by gene 52 SEQ ID NO:651. | 260 | 94
1206 | AAW88802 | Homo sapiens | HUMA- Polypeptide fragment encoded by gene 52. | 260 | 94
1206 | ABB50706 | Homo sapiens | HUMA- Human secreted protein encoded by gene 52 SEQ ID NO:654. | 143 | 96
1207 | AAM79588 | Homo sapiens | HYSE- Human protein SEQ ID NO 3234. | 72 | 41
1207 | AAM78604 | Homo sapiens | HYSE- Human protein SEQ ID NO 1266. | 72 | 41
1207 | AAB58944 | Homo sapiens | HUMA- Breast and ovarian cancer associated antigen protein sequence SEQ ID 652. | 72 | 41
1208 | AAE03429 | Homo sapiens | HUMA- Human gene 3 encoded secreted protein HETDB76, SEQ ID NO: 112. | 575 | 64
1208 | gi19110438 | Homo sapiens | polycystin-1L1 | 575 | 64
1208 | AAE03463 | Homo sapiens | HUMA- Human gene 3 encoded secreted protein HETDB76, SEQ ID NO: 146. | 185 | 97

1209 | 16760015 | Homo sapiens | brain protein | 1114 | 85
1209 | g11747306 | Mus musculus | SDR2 | 151 | 31
1209 | g120381292 | Mus musculus | stromal cell derived factor receptor 2 | 151 | 31
1211 | g114043211 | Homo sapiens | Similar to RIKEN cDNA gene | 460 | 89
1211 | g1190508 | Homo sapiens | salivary proline-rich protein precursor | 113 | 28
1211 | 112862320 | Homo sapiens | WDC146 | 102 | 28
1212 | AAO14407 | Homo sapiens | FARB Human 11 beta-hydroxysteroid dehydrogenase 1-like enzyme. | 291 | 63
1212 | AAM79592 | Homo sapiens | HYSE- Human protein SEQ ID NO 3238. | 217 | 45
1212 | g14581319 | Homo sapiens | dJ28O10.3 (HSD11B1 (hydroxysteroid (11-beta) dehydrogenase 1) | 217 | 45
1213 | AAR06514 | Homo sapiens | STRI Natural human Platelet Factor-4var1 encoded by EcoRI fragment. | 238 | 64
1213 | g1292390 | Homo sapiens | platelet factor 4 | 238 | 64
1213 | AAZ28361_aa1 | Homo sapiens | SMIK Platelet factor-4 (PF-4) nucleotide sequence. | 200 | 56
1214 | AAD12580_aa1 | Homo sapiens | SAGA Human protein having hydrophobic domain encoding cDNA clone HP10753. | 162 | 82
1214 | AAD08193_aa1 | Homo sapiens | HUMA- Human secreted protein-encoding gene 3 cDNA clone HNTAC64, SEQ ID NO:13. | 162 | 82
1214 | AAD05544_aa1 | Homo sapiens | HUMA- Human secreted protein-encoding gene 12 cDNA clone HNTAC64, SEQ ID NO:63. | 162 | 82


1215 | gi21429094 | Drosophila melanogaster | LD38004p | 354 | 49
1215 | gi15292155 | Drosophila melanogaster | LD40717p | 354 | 49
1215 | AAG75596 | Homo sapiens | HUMA- Human colon cancer antigen protein SEQ ID NO:6360. | 294 | 50
1216 | gi7248894 | Xenopus laevis | Arg protein-tyrosine kinase | 84 | 35
1216 | 1402191 | Mus musculus | HNF-3beta | 80 | 26
1216 | g1404764 | Mus musculus | fork head related protein | 80 | 26
1218 | AAM39205 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 2350. | 559 | 74
1218 | AAO03505 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 17397. | 502 | 81
1218 | AAM40991 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 5922. | 467 | 66
1220 | AAO01188 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 15080. | 248 | 86
1220 | AAY73334 | Homo sapiens | INCY- HTRM clone 1805061 protein sequence. | 79 | 35

1220 | 120249 | Oryza sativa | gt-2 | 77 | 32
1221 | g14519619 | Haliotis discus | collagen pro alpha-chain | 90 | 28
1221 | g17380690 | Neisseria meningitidis Z2491 | UDP-N-acetylglucosamine--N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol N-acetylglucosamine transferase | 90 | 37
1221 | g17225645 | Neisseria meningitidis MC58 | UDP-N-acetylglucosamine--N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol N-acetylglucosamine transferase | 90 | 37
1222 | ABA05334_aa1 | Homo sapiens | MILL- Human fucosyltransferase family member 32132 coding sequence. | 2154 | 99
1222 | AAM47905 | Homo sapiens | MILL- Human fucosyltransferase family member 32132. | 2154 | 99
1222 | ABA05333_aa1 | Homo sapiens | MILL- Human fucosyltransferase family member 32132 encoding cDNA. | 2154 | 99
1223 | AAY21852 | Homo sapiens | INCY- Human signal peptide-containing protein (SIGP) (clone ID 2652271). | 150 | 100
1223 | AAY48563 | Homo sapiens | META- Human breast tumour-associated protein 24. | 150 | 100
1223 | AAW75103 | Homo sapiens | HUMA- Human secreted protein encoded by gene 47 clone HMCBP63. | 150 | 100
1224 | AAM67078 | Homo sapiens | MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27384. | 517 | 99
1224 | AAM54676 | Homo sapiens | MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26781. | 517 | 99

1224 | 117467358 | Sus scrofa | MIF2 suppressor | 184 | 80
1225 | g19454237 | Cochliobolus sativus | DNA binding protein | 73 | 30
1225 | g121428792 | Drosophila melanogaster | GH03582p | 72 | 38
1225 | gi6633838 | Arabidopsis thaliana | F2K11.15 | 70 | 31
1226 | gi21430124 | Drosophila melanogaster | HL01222p | 76 | 28
1226 | AAM77437 | Homo sapiens | MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 37743. | 72 | 33
1226 | AAM64659 | Homo sapiens | MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 36764. | 72 | 33
1227 | AAM50715 | Homo sapiens | MILL- Human TRP-like calcium channel-5 (TLCC-5). | 243 | 83
1227 | gi|20874183|ref|XP_131003.1| | Mus musculus | similar to hornerin | 80 | 29
1227 | gi|17864717|gb|AAKl 91.1| | Mus musculus | hornerin | 80 | 29
1229 | gi4019247 | Ateline herpesvirus | thymidine kinase | 71 | 46
1229 | gi2760368 | Drosophila melanogaster | Shar pei/DRhoGEF2 | 70 | 26
1229 | gi17862944 | Drosophila melanogaster | SD04476p | 70 | 26
1230 | gi4559296 | Mus musculus | silencing mediator of retinoic acid and thyroid hormone receptor extended isoform | 80 | 30
1230 | 118181872 | Mus musculus | GATA-2 protein | 78 | 41
1230 | g118033511 | Rattus norvegicus | transcription factor | 78 | 41
1231 | g113365501 | Cyprinus carpio | integrin beta2-chain | 75 | 27
1231 | g13322933 | Treponema pallidum | DNA ligase (lig) | 73 | 32
1231 | gi|13365501|dbj|BAB39130.1| | Cyprinus carpio | integrin beta2-chain | 75 | 27
1232 | AAM79791 | Homo sapiens | HYSE- Human protein SEQ ID NO 3437. | 78 | 35
1232 | AAM78807 | Homo sapiens | HYSE- Human protein SEQ ID NO 1469. | 78 | 35
1232 | AAB19338 | Homo sapiens | INCY- Amino acid sequence of a human fibrous protein (FIBR). | 78 | 35

1233 | AAU21459 | Homo sapiens | HUMA- Human novel foetal antigen, SEQ ID NO 1703. | 87 | 26
1233 | g115081227 | Arabidopsis thaliana | glycine-rich protein | 75 | 37
1233 | 12645433 | Homo sapiens | CHD3 | 74 | 30
1234 | AAU83676 | Homo sapiens | GETH Human PRO protein, Seq ID No 170. | 178 | 97
1234 | ABB84911 | Homo sapiens | GETH Human PRO1244 protein sequence SEQ ID NO:190. | 178 | 97
1234 | AAB62403 | Homo sapiens | CURA- Human MBSP7 polypeptide (clone 3499605.0.64). | 178 | 97
1235 | ABB10348 | Homo sapiens | HUMA- Human cDNA SEQ ID NO: 656. | 409 | 61
1235 | AAU18012 | Homo sapiens | HUMA- Human immunoglobulin polypeptide SEQ ID No 157. | 178 | 83
1235 | ABB89226 | Homo sapiens | HUMA- Human polypeptide SEQ ID NO 1602. | 78 | 82

1236 | gi10566951 | Rattus norvegicus | s-gicerin/MUC18 | 85 | 45
1236 | gi10566949 | Rattus norvegicus | l-gicerin/MUC18 | 85 | 45
1236 | AAB90798 | Homo sapiens | NOJI/ Human shear stress-response protein SEQ ID NO: 96. | 84 | 42
1238 | gi21464300 | Drosophila melanogaster | GH20068p | 95 | 36
1238 | gi3868879 | Xenopus laevis | Zic-related-2 | 88 | 35
1238 | gi1841756 | Mus musculus | GATA-5 cardiac transcription factor | 87 | 52
1239 | gi17946266 | Drosophila melanogaster | RE61793p | 96 | 40
1239 | gi15636898 | Gallus gallus | formin binding protein 11-related protein | 91 | 27
1239 | gi780454 | African swine fever virus | pB407L | 88 | 30
1240 | AAE05302 | Homo sapiens | MILL- Human TANGO 457 protein. | 1331 | 100
1240 | AAE05303 | Homo sapiens | MILL- Human mature TANGO protein. | 1207 | 100
1240 | AAE05305 | Homo sapiens | MILL- Human TANGO 457 protein cytoplasmic domain. | 1201 | 100
1241 | gi5640111 | Lycopersicon esculentum | RAD23 protein | 84 | 25
1241 | gi17131739 | Nostoc sp. PCC | polyketide synthase type I | 76 | 33
1241 | gi|5640111|emb|CAB51544.1| | Lycopersicon esculentum | RAD23 protein | 84 | 25
1242 | AAG03496 | Homo sapiens | GEST Human secreted protein, SEQ ID NO: 7577. | 67 | 39
1242 | gi|13876270|gb|AAK26055.1| | Mus musculus | protocadherin alpha 8 | 66 | 35
1243 | AAE16665 | Homo sapiens | MILL- Human calcium channel family member, 21784 protein. | 196 | 87
1243 | AAB62248 | Homo sapiens | WARN Human calcium channel alpha2delta subunit. | 196 | 87
1243 | AAY92320 | Homo sapiens | WARN Human alpha-2-delta-C calcium channel subunit polypeptide. | 196 | 87

1244 | gi|4102990|gb|AAD01637.1| | Aspergillus nidulans | DNA polymerase epsilon homolog | 70 | 30
1245 | 15917666 | Zea mays | extensin-like protein | 94 | 26
1245 | g119481644 | shrimp white spot syndrome virus | WSSV052 | 89 | 36
1245 | g117016928 | shrimp white spot syndrome virus | wsv001 | 89 | 36
1246 | AAO12623 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 26515. | 169 | 69
1246 | AAO12822 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 26714. | 153 | 75
1246 | AAO02255 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 16147. | 123 | 65
1247 | gi1653353 | Synechocystis sp. PCC | nodulation protein | 75 | 28
1247 | 14468626 | Mus musculus | TEF-5 | 74 | 26
1247 | g117430764 | Ralstonia solanacearum | SKWP PROTEIN 5 | 74 | 23
1248 | g115139973 | Sinorhizobium meliloti | CONSERVED HYPOTHETICAL PROTEIN | 77 | 47

1249 | g17191078 | Leishmania major | L712.2 | 99 | 29
1249 | g117384256 | Homo sapiens | mucin 5 | 85 | 31
1249 | g15821153 | Homo sapiens | RNA binding protein | 83 | 33
1250 | AAY36495 | Homo sapiens | HUMA- Fragment of human secreted protein encoded by gene 27. | 124 | 86
1250 | AAO12122 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 26014. | 123 | 91
1250 | AAB95063 | Homo sapiens | HELI- Human protein sequence SEQ ID NO:16901. | 121 | 90
1252 | gi|15839838|ref|NP_334875.1| | Mycobacterium tuberculosis CDC1551 | membrane protein, MmpL family | 68 | 27
1254 | AAG00399 | Homo sapiens | GEST Human secreted protein, SEQ ID NO: 4480. | 328 | 100
1254 | g121428466 | Drosophila melanogaster | LD22609p | 85 | 24
1254 | g119914274 | Methanosarcina acetivorans str. | sensory transduction histidine kinase [Methanosarcina | 85 | 26
1256 | g114161094 | Choloepus didactylus | von Willebrand Factor | 80 | 24
1256 | g114161092 | Cyclopes didactylus | von Willebrand Factor | 78 | 23
1256 | g113872552 | Acomys cahirinus | von Willebrand Factor | 77 | 23
1258 | g17008025 | Callithrix jacchus | prochymosin | 715 | 64
1258 | g111990126 | Camelus dromedarius | chymosin | 634 | 57
1258 | g1491952 | synthetic construct | preprochymosin | 618 | 56
1259 | gi|21402709|ref|NP_658694.1| | Bacillus anthracis | AMP-binding, AMP-binding enzyme [Bacillus anthracis | 72 | 34
1260 | gi|4505431|ref|NP_002510.1| | Homo sapiens | nuclear protein, ataxia-telangiectasia locus; NPAT gene; E14 gene | 64 | 33
1260 | gi|15309894|ref|XP_040846.2| | Homo sapiens | similar to nuclear protein, ataxia-telangiectasia locus; NPAT gene; E14 gene | 64 | 33
1260 | gi|1304114|dbj|BAA 1.1| | Homo sapiens | NPAT | 64 | 33
1261 | gi4519535 | Homo sapiens | Leukotriene B4 omega-hydroxylase | 133 | 49
1261 | gi1857022 | Homo sapiens | leukotriene B4 omega-hydroxylase | 133 | 49
1261 | gi18266446 | Homo sapiens | cytochrome P450, subfamily IVF, polypeptide 2 | 133 | 49
1262 | gi13363530 | Escherichia coli O157:H7 | cell division protein HflB/FtsH protease | 79 | 26
1262 | gi746401 | Escherichia coli | ATP-binding protein | 79 | 26
1262 | 1146028 | Escherichia coli | ftsH | 79 | 26
1263 | AAW67859 | Homo sapiens | HUMA- Human secreted protein encoded by gene 53 clone HBMCL41. | 283 | 100

1264 | g111066248 | Helix lucorum | presenilin | 85 | 21
1264 | gi|19115422|ref|NP_594510.1| | Schizosaccharomyces pombe | ribonuclease II RNB family protein; dis3-like | 69 | 30
1264 | gi|14720912|ref|XP_03 04.1| | Homo sapiens | similar to Matrin 3 | 69 | 32
1265 | g15757703 | Mus musculus | syntrophin-associated serine-threonine protein kinase | 82 | 38
1265 | g14996035 | Human herpesvirus | 69.8% identical to U47 gene of strain U1102 of HHV-6 | 76 | 42
1265 | g1330951 | Gallid herpesvirus | ICP4 | 76 | 36
1266 | gi|17511177|ref|NP_493324.1| | Caenorhabditis elegans | ZK1053.3.p | 75 | 40
1266 | gi|17538077|ref|NP 59.1| | Caenorhabditis elegans | ZK1248.2.p | 69 | 34
1267 | g1915540 | Ovis aries | pregnancy-specific antigen | 85 | 25
1267 | 16179989 | Capra hircus | pregnancy-associated glycoprotein-2 | 84 | 25
1267 | g19798658 | Rhinolophus ferrumequinum | pepsinogen A | 80 | 23
1268 | gi|15789526|ref|NP_279350.1| | Halobacterium sp. NRC-1 | serine proteinase; HtrA | 69 | 30
1269 | g19988674 | Influenza A virus (A/Swine/Wisconsin/14094/99(H3N2)) | hemagglutinin protein | 70 | 24
1269 | g16552676 | Influenza A virus (A/Bangkok/1/97(H3N2)) | hemagglutinin | 70 | 25
1269 | g16552638 | Influenza A virus (A/Trinidad/51/96(H3N2)) | hemagglutinin | 70 | 24
1270 | 13378527 | Zea mays | anther specific protein | 87 | 41
1270 | AAW15787 | Homo sapiens | PENN- Human metastasis suppressor KISS-1. | 85 | 28
1270 | g121410770 | Homo sapiens | Similar to RIKEN cDNA gene | 84 | 46

1271 | gi1335527 | Human poliovirus | reading frame VP3 | 75 | 38
1271 | gi61253 | Human poliovirus | polyprotein | 75 | 38
1271 | gi|17453412|ref|XP_063132.1| | Homo sapiens | similar to 60S ribosomal protein L7A (Surfeit locus protein 3) | 76 | 40
1272 | AAU87081 | Homo sapiens | BRIM Sialic acid-binding Ig-related lectin, Siglec-11. | 69 | 43
1272 | AAU87077 | Homo sapiens | BRIM Sialic acid-binding Ig-related lectin, Siglec-BMS-L3d. | 69 | 43
1272 | AAU87076 | Homo sapiens | BRIM Sialic acid-binding Ig-related lectin, Siglec-BMS-L3c. | 69 | 43
1273 | AAA09121_aa1 | Homo sapiens | CURA- Clone 2355875 cDNA (update), encodes syncollin homologue. | 720 | 100
1273 | AAY92233 | Homo sapiens | CURA- Clone 2355875f - syncollin homologue. | 720 | 100
1273 | AAB54267 | Homo sapiens | HUMA- Human pancreatic cancer antigen protein sequence SEQ ID NO:719. | 715 | 100
1274 | gi15559064 | Mus musculus | SNAG1 | 198 | 59
1274 | AAU17435 | Homo sapiens | HUMA- Novel signal transduction pathway protein, Seq ID 1000. | 131 | 62

1274 | AAW99023 | Homo sapiens | MOUN 1762 peptide sequence. | 131 | 62
1275 | gi|6753732|ref|NP_034243.1| | Mus musculus | epidermal growth factor | 65 | 30
1275 | gi|50801|emb|CAA24115.1| | Mus musculus | polyprotein | 65 | 30
1275 | gi|20341089|ref|XP_109385.1| | Mus musculus | epidermal growth factor | 65 | 30
1276 | AAM39205 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 2350. | 447 | 78
1276 | AAM40991 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 5922. | 424 | 74
1276 | AAO07159 | Homo sapiens | HYSE- Human polypeptide SEQ ID NO 21051. | 401 | 75
1277 | gi13905120 | Mus musculus | RIKEN cDNA 0610013I17 gene | 134 | 35
1277 | 113936283 | Mus musculus | TRH3 | 134 | 35
1277 | AAB92625 | Homo sapiens | HELI- Human protein sequence SEQ ID NO:10921. | 127 | 35
1279 | AAM66940 | Homo sapiens | MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27246. | 362 | 85
1279 | AAM54534 | Homo sapiens | MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26639. | 362 | 85
1279 | gi|208153|gb|AAA73184.1| | synthetic construct | crystal toxin | 79 | 40

1280 AAE05187 Homo sapiens INCY- Human drug metabolising enzyme (DME-18) protein. 484 100

1280 AAU12266 Homo sapiens GETH Human PRO5780 polypeptide sequence. 484 100

1280 AAY91631 Homo sapiens HUMA- Human secreted protein sequence encoded by gene NO:304. 484 100

1281 AAH46856 Homo SapiensHUMA- Human serine/threonine238 100 _ phosphatase encoding aal cDNA (clone ID

HLD0020.

1281 AAG77801 Homo SapiensHUMA- Human HLD0020 238 100 serine/threonine phosphatase protein se uence. .

1281 AAB85476 Homo SapiensHUMA- Human serine/threonine238 100 phosphatase (clone ID
HLD0020).

1282 gi~14762786~Homo SapiensGS2 gene 70 30 ref~XP

71.1 1283 gi3860165Arabidopsisdisease resistance protein69 38 RPP1-WsB

thaliana 1283 AA009033 Homo SapiensHYSE- Human polypeptide 68 38 SEQ ID

NO 22925.

1283 gi6967115Arabidopsisdisease resistance protein68 38 homlog thaliana 1285 gi1055252Rattus pheromone receptor VN5 78 32 norve icus 1285 gi2746733Drosophila circadian clock protein 73 26 virilis 1285 gi2641617Drosophila TIM 73 26 virilis 1286 gi6013135Rattus coxsackie-adenovirus-receptor86 67 norvegicus homolog 1286 AAV50429 Homo SapiensUYNY Human coxsackievirus83 75 and Ad2 _ and Ad5 receptor (HCAR) aal cDNA.

1286 AAV28845 Homo SapiensDAND Human coxsackievirus83 75 and _ adenovirus receptor encoding aal DNA.

1287 AAU83224 Homo SapiensZYMO Novel secreted protein642 100 Z930757G12P.

1287 AAY70692 Homo sa DAND Human soluble aitractin-2.84 54 iens 1287 AAY70691 Homo sa DAND Human membrane attractin-2.84 54 iens 1288 AAW70326 Homo SapiensGEMY Secreted protein 1655 99 DU123 1.

1288 ABB 12473Homo SapiensHYSE- Human bone marrow 547 72 expressed protein SEQ ID NO: 312.

1288 15689736 Homo SapiensMyopodin rotein 475 100 1289 g14103543Tomato chlorosisheat shock protein 70 73 29 virus 1289 g112247413Cristatellacytochrome b 72 30 mucedo 1289 gi~4103543Tomato chlorosisheat shock protein 70 73 29 ~g b~AAD0179virus 0.1~

1291 AAB94128 Homo SapiensHELI- Human protein sequence520 98 SEQ

ID N0:14383.

1291 AAY85576 Homo sapiensJANC Hs-UNC-53/1 fragment/GFP520 98 fusion insert of plasmid pGI3150.

1291 AAY85564 Homo sapiens JANC Human homologue of UNC-53 (Hs-UNC-53/1) sequence. 520 98

1292 AAY01413 Homo SapiensHLTMA- Secreted protein 207 97 encoded by gene 31 clone HHBAG64.

1292 AAY05324 Homo SapiensGEMY Human secreted protein207 97 1j167 5.

1292 g115157864AgrobacteriumAGR_C_4816p 71 34 tumefaciens str.

058 (Cereon) 1294 AAB 12146Homo SapiensPROT- Hydrophobic domain219 100 protein from clone HP 10672 isolated from Thymus cells.

1295 gi~17228767~Nostoc Sp. probable glycogen phosphorylase78 34 PCC

ref~NP,48537120 15.1 1295 gi~10835203~Homo Sapiensadvanced glycosylation 65 58 end product-ref~NP_0011 specific receptor 27.1 ~

1295 gi~190846~gbHomo Sapiensreceptor for advanced 65 58 glycosylation ~AAA03574. end products 1~

1296 g117511816Homo SapiensSimilar to RIKEN cDNA 1268 99 ene 1296 AAB88440 Homo sapiensHELI- Human membrane 688 100 or secretory rotein clone PSEC0222.

1296 g17211438Homo sa golgin-67 94 30 iens 1298 g118314436Homo SapiensSimilar to RIKEN cDNA 481 79 gene 1298 11872546 Mus musculusNIK 86 25 1298 g15533305Homo Sapienssomatostatin receptor 85 29 interacting rotein s lice variant a 1299 11334643 Xeno us APEG recursor roteiii 105 27 laevis 1299 g117428053Ralstonia PROBABLE RIBONUCLEASE 100 32 E

solanacearum(RNASE E) PROTEIN

1299 g16690017HerpesvirusNTR 96 25 apio 1300 AAB87346 Homo SapiensHUMA- Human gene 5 encoded586 74 secreted protein HDPIE85, SEQ ID

N0:87.

1300 AAB44298 Homo SapiensGETH Human PR0706 (UNQ370)586 74 rotein sequence SEQ ID
N0:385.

1300 AAY41742 Homo SapiensGETH Human PR0706 protein586 74 sequence.

1301 g1218572 Pan troglodytesprot GOR 1344 62 1301 1243898 Pan GOR 1040 68 1301 g117862570Drosophila LD38414p 486 45 melano aster 1302 g113276598Homo sapiensdJ614O4.7 (Novel rotein)260 28 1302 g113397804Homo SapiensdJ616B8.3 (novel gene) 230 30 1302 AAB56641 Homo SapiensROSE/ Human prostate 226 30 cancer antigen protein sequence SEQ
ID N0:1219.

1303 gi603989 Drosophila melanogaster salivary gland glue protein 149 23

1303 gi13324584 Borrelia burgdorferi LMP1 129 17

1303 g1161956 Trypanosomasurface antigen 128 13 cruzi 1304 g113569248Human gag protein 81 34 immunodeficienc y virus a 1 1304 g14324832Human gag-pol polyprotein 80 29 immunodeficienc y virus a 1 1304 g111691875Mus musculusADP-ribosylation factor 79 22 1 GTPase activatin rotein 1305 AA006469 Homo SapiensHYSE- Human polypeptide 191 100 SEQ ID

NO 20361.

1305 g13608368Xenopus origin recognition complex69 30 laevis associated protein p81 1305 ABB 15196Homo SapiensHUMA- Human nervous system68 36 related polype tide SEQ ID NO
3853.

1306 AAE03657 Homo SapiensINCY- Human extracellular109 27 matrix and cell adhesion molecule-21 (XMAD-21).

1306 ABB 11890Homo SapiensHYSE- Human protocadherin109 27 Flamingo 1 homologue, SEQ ID

NO:2260.

1306 13449298 Homo SapiensMEGF2 109 27 1308 g19294050Arabidopsisprotein kinase-like protein84 32 thaliana 1308 g115983765ArabidopsisAT3g24550/MOB24 8 84 32 thaliana 1308 g113877617Arabidopsisprotein kinase-like protein84 32 thaliana 1309 AAU00375 Homo SapiensBERN/ Htunan stem cell 127 54 growth factor rece tor.

1309 AAE07145 Homo SapiensSALK Human Kit/stem cell127 54 factor receptor kinase insert region.

1309 13236223 E uus caballustyrosine kinase receptor127 50 homolog 1310 g121449343Actinosynnemapolyketide synthase 77 46 pretiosum subsp.

auranticum 1310 g121114513Xanthomonastranscriptional regulator75 36 campestris pv.

campestris str.

1310 gi13364364Escherichiaacetylglutamate kinase 73 36 - coli 0157:H7 1311 g120146220Oryza sativasimilar to splicing factor/activator110 33 (japonica protein cultivar-oup) 1311 g1206712 Rattus salivary proline-rich 104 27 protein norvegicus 1311 AAY84592 Homo SapiensUNIW Amino acid sequennce103 34 of a human artemin olypeptide.

1312 gi2065210 Mus musculus Pro-Pol-dUTPase polyprotein 530 69

1312 gi|10834720|gb|AAG23790.1|AF258587_1 Homo sapiens PP565 249 66

1312 gi|13194728|gb|AAK15526.1|AF329 Gallus gallus pol-like protein ENS-3 115 21

1313 AAW03515 Homo sapiens SHKJ Human DOCK180 protein. 147 58

1313 gi1339910 Homo sapiens DOCK180 protein 147 58

1313 gi1504002 Homo sapiens similar to a human major CRK-binding protein DOCK180. 111 43

1314 gi12007418Mus musculusB3 olfactory rece for 76 38 1314 118480290Mus musculusolfactory rece for MOR260-376 38 1314 112007432Mus musculusB3 olfacto rece for 76 38 1315 g1483581Mus musculusNotch 3 82 26 1315 g118159668Pyrobaculum paREP2b 81 29 aerophilum 1315 g14584086Spermatozopsisp210 protein 79 25 similis 1316 AAM71305Homo SapiensMOLE- Human bone marrow 422 98 expressed probe encoded protein SEQ

ID NO: 31611.

1316 AAM58790Homo SapiensMOLE- Human brain expressed422 98 single exon probe encoded protein SEQ ID

NO: 30895.

1316 g1149490Lactococcus sucrose-6-phosphate hydrolase72 31 lactis 1317 g11620040Paramecium Asp-rich 72 28 bursaria Chlorella virus 1 1317 13721615C rinus carpioMEF2C 71 25 1317 gi~9631936~rParamecium Asp-rich 72 28 ef~NP_0487bursaria 25.1 Chlorella virus 1 1318 gi~21291797~Anopheles agCP3974 74 35 gb~EAA039gambiae str.

42.1 PEST
~

1319 g121306283Chlamydomonasiron transporter Ftrl 74 30 reinhardtii 1319 AAB60461Homo sapiens1NCY- Human cell cycle 73 33 and proliferation protein CCYPR-9, SEQ

ID N0:9.

1319 g16013155Homo Sapiensp35s ' 73 33 1320 g19717245Mus musculuscytoplasmic dynein heavy430 94 chain 1320 g1402528Rattus cytoplasmic dynein heavy430 94 chain norvegicus 1320 g1294543Rattus dynein heavy chain 430 94 norvegicus 1323 gig 17221411Burkholderiakdo transferase 70 34 ~

emb~CADl2cepacia 639.1 ~

1324 gi1698601 Cricetulus griseus beta-1,6-N-acetylglucosaminyltransferase 440 38

1324 gi349091 Rattus norvegicus N-acetylglucosaminyltransferase V 438 43

1324 118997007 Mus musculus N-acetylglucosaminyltransferase V 438 43

1325 AAM70545 Homo SapiensMOLE- Human bone marrow 115 47 expressed probe encoded protein SEQ

ID NO: 30851.

1325 AAM58098 Homo SapiensMOLE- Human brain expressed115 47 single exon probe encoded protein SEQ ID

NO: 30203.

1325 AAM72994 Homo SapiensMOLE- Human bone marrow 111 28 expressed probe encoded protein SEQ

ID NO: 33300.

1326 gi12724969Lactococcusphenolic acid decarboxylase77 46 lactis subsp.

lactis 1327 AAB53097 Homo SapiensGETH Human angiogenesis-associated372 63 rotein PRO 1246, SEQ
ID N0:167.

1327 AAU12416 Homo SapiensGETH Human PR01246 polypeptide372 63 sequence.

1327 AAY99377 Homo SapiensGETH Human PR01246 (UNQ630)372 63 amino acid sequence SEQ
ID NO:132.

1328 gi6014505Hepatitis polyprotein 76 43 GB

virus B

1328 gi765145 Hepatitis polypeptide 68 41 GB

virus B

1328 gi~20544059~Homo Sapienssimilar to U4/U6-associated294 100 RNA

ref~XP_0862 splicing factor 20.4 1329 AAV42689_Homo sapiensSIBI- DNA encoding human158 91 calcium aal channel alpha-2 subunit.

1329 AAQ84667_Homo SapiensSALK Human neuronal calcium158 91 aal channel subunit alpha 2c.

1329 AAQ84664-Homo SapiensSALK Human neuronal calcium158 91 aal channel subunit alpha 2b.

1330 gi19923 Nicotiana pistil extensin like 71 38 protein, partial CDS

tabacum 1330 gi~144429~gbCellulomonasbeta-1,4-xylanase 67 30 ~AAA56792.fimi 1~

1331 12388676 Mytilus precolla en P 85 35 edulis 1331 g117862044Drosophila LD06016p 75 30 melano aster 1331 g113879780MycobacteriumPE_PGRS family protein 74 30 tuberculosis 1333 AA000015 Homo SapiensHYSE- Human polypeptide 442 61 SEQ ID

NO 13907.

1333 AAB82479 Homo SapiensZYMO Human RING finger 81 31 protein Za op2.

1333 120975274Homo sapiensskeletrophin 81 31 1334 ABB 11819Homo SapiensHYSE- Human secreted 367 82 protein homolo ue, SEQ ID N0:2189.

1334 AAW80398 Homo SapiensGEMY A secreted protein 130 67 encoded by clone cw1543 3.

1334 gi5081693 Samanea saman pulvinus inward-rectifying channel 70 34

1335 ABB89969 Homo sapiens HUMA- Human polypeptide SEQ ID NO 2345. 142 96

1335 AAB38385 Homo SapiensHUMA- Human secreted 142 96 protein encoded by gene 18 clone HTLEJ24.

1335 AAB38338 Homo SapiensHUMA- Human secreted 142 96 protein encoded by gene 18 clone HTLFE57.

1336 gi~14590195~Pyrococcus asparaginyl-tRNA synthetase70 37 re~NP_1422horikoshii 60.1 1337 gi3879419Caenorhabditiscontains similarity to 69 29 Pfam domain:

elegans PF00102 (Protein-tyrosine phosphatase), Score=51.6, E-value=1.8e-14, N=1 1337 gi~17563828~Caenorhabditisprotein tyrosine phosphatase69 29 ref~NP_5059elegans 65.1 1338 gi~2072960~gHomo Sapiensp40 138 33 b~AACS

8.1~

1338 gi~4185940~eHuman env protein 124 75 mb~CAA768endogenous 80.1 ~ retrovirus K

1338 gi~757872~eHuman env 124 75 mb~CAA577endogenous 23.1 ~ retrovirus 1340 gi1491979Molluscum MC036R 78 33 contagiosum virus subtype 1340 gi~9628968~rMolluscum MC036R 78 33 ef~NP_0439contagiosum 87.1 virus 1341 gi18676514Homo SapiensFLJ00154 protein 1560 100 1341 AAB84252 Homo SapiensHUMA- Amino acid sequence572 63 of a human cytokine receptor-like rotein.

1341 AAB84251 Homo SapiensHUMA- Human cytokine 572 63 receptor-like protein fragment.

1342 AAY27757 Homo SapiensHUMA- Human secreted 152 71 protein encoded by gene No. 47:

1342 AAB27551 Homo SapiensMYRI- Human tumour suppressor77 32 BRG1 encoded by cDNA
mutated at base 1705.

1342 AAB27550 Homo sapiensMYRI- Human tumour suppressor77 32 BRG1 protein from cell lines DU145 and NCI-H 1300.

1344 gi21464394Drosophila RE18651p 78 26 melanogaster 1344 AAM39065 Homo SapiensHYSE- Human polypeptide 77 21 SEQ ID

NO 2210.

1344 1338290 Homo Sapiensson3 protein 77 21 1345 12202 Canis s Clox 135 37 .

1345 gi3879551 Caenorhabditis elegans contains similarity to Pfam domain: PF01391 (Collagen triple helix repeat (20 copies)), Score=56.4, E-value=2e-13, N=2; PF01484 (Nematode cuticle collagen N-terminal domain), Score=87.2, E-value=1.1e-22, N=1 125 33

1345 gi158695 Drosophila melanogaster tropomyosin isoform 33 (9C) 118 30

1346 gi7862077 Giardia intestinalis 3-hydroxy-3-methylglutaryl-coenzyme A reductase 90 26

1346 gi1098615 Mycoplasma pneumoniae adhesin-related 30 kDa protein 87 23

1346 gi20380058 Homo sapiens Similar to PRAM-1 protein 84 28

1347 113905302 Mus musculus Similar to ATPase, class II, type 9A 736 85

1347 g117862322Drosophila LD22119p 633 72 melanogaster 1347 AAM25271 Homo SapiensHYSE- Human protein 572 100 sequence SEQ

ID N0:786.

1348 g1456319 Bacteriophage74kDa protein 75 33 1348 g11524115Lycopersiconsubtilisin-like endoprotease73 28 esculentum 1348 g14200334LycopersiconP69A protein 73 28 esculentum 1349 g121391988Drosophila HL08052p 78 31 melano aster 1349 g120148339Arabidopsis cyclin delta-3 77 25 thaliana 1349 gi~17647607~Drosophila maroon-like; bronzy; 78 31 section 5 ref~NP_5234melanogaster 23.1 1351 g118676524Homo sa iensFLJ00159 rotein 164 52 1351 g121392066Drosophila RE04357p 139 34 melanogaster 1351 AAB92637 Homo SapiensHELI- Human protein 81 43 sequence SEQ

ID N0:10953.

1352 g119071965Aspergillus chitin synthase 79 28 oryzae 1352 g117945592Drosophila RE26660p 78 41 melano aster 1352 g116184663Drosoplula LD28370p 74 22 melanogaster 1353 gi~11037117~Homo SapiensNAG13 307 65 gb~AAG274 85.1 CAF

537_1 1353 gi~1335205~eHomo SapiensORFII 305 65 mb~CAA364 80.1 1354 g11388166Drosophila Bowel 80 32 melano aster 1354 g115553187Scyliorhinushomeodomain protein 79 22 Otxl canicula 1354 AAY85573 Homo sapiensJANC Hs-UNC-53/3 fragment/GFP78 26 fusion insert of plasmid pGI3303.

1358 gi|21288288|gb|EAA00609.1| Anopheles gambiae str. PEST agCP9766 71 30

1358 gi|17465558|ref|XP_069888.1| Homo sapiens similar to mucin 68 36

1359 gi|21302892|gb|EAA15037.1| Anopheles gambiae str. PEST agCP5020 70 31

1361 gi15080686Lentinula CDCS 79 26 edodes 1361 gi495516 Plasmodium circumsporozoite protein77 31 vivax 1361 gi21070569DictyosteliumVSAE2 (FR.AGMENT). 3/10176 31 discoideum 1362 gi8953400Arabidopsis 1-D-deoxyxylulose 5-phosphate73 23 ~

thaliana s these-like rotein 1362 gi~15239030~Arabidopsis 1-D-deoxyxylulose 5-phosphate73 23 ref~NP-1966thaliana synthase - like protein 99.1 ~

1363 gi2444430Xenopus laevisdeacetylase 327 81 1363 gi602098 Xeno us laeviseast ltPD3 homologue 324 80 1363 AAB49954 Homo SapiensMETH- Human histone 323 80 deacetylase HDAC-1.

1364 AAM69686 Homo SapiensMOLE- Human bone marrow418 55 expressed probe encoded protein SEQ

ID NO: 29992.

1364 AAM57281 Homo SapiensMOLE- Human brain expressed418 55 single exon probe encoded protein SEQ ID

NO: 29386.

1364 gi~1780971~eHuman gag protein 172 37 mb~CAA714endogenous 16.1 ~ retrovirus K

1365 gi437084 Gallus gallusvitamin D3 hydroxylase 510 41 associated protein 1365 12149156 Homo Sapiensfatty acid amide hydrolase477 38 1365 AAW57783 Homo SapiensSCRI Human fatty acid 468 38 amide hydrolase.

1366 g13510695Homo SapiensDNA polymerase theta 77 21 1366 g1309132 Mus musculuscalnexin 72 22 1366 g115214567Mus musculusSimilar to calnexin 72 22 1367 gi~17508849~Caenorhabditishelicase 73 40 re~NP elegans 26.1 ~

1368 g15457567Pyrococcus Na+/H+ antiporter (napA-1)76 33 abyssi 1368 g18247211Candida albicansShe9 rotein 69 31 1368 gi~14590079)Pyrococcus Na(+)/H(+) antiporter 76 30 ref~NP_1421horikoshii 43.1 1369 g117644260Homo SapiensbB206I21.1 (ATPase, 305 98 Class VI, type 11C ) .

1369 AA014200 Homo SapiensINCY- Human transporter166 50 and ion channel TRICH-17.

1369 gi5080816 Arabidopsis thaliana Putative ATPase 166 49

1370 gi|18573281|ref|XP_095933.1| Homo sapiens similar to 40S ribosomal protein S3A 70 38

1372 gi6683562Mus musculushe aran sulfate 6-sulfotransferase886 91 1372 gi6683558Mus musculusheparan sulfate 6-sulfohansferase265 72 1372 ABL39900_Homo SapiensSEGK Human HS6ST2v encoding262 71 aal cDNA SEQ ID NO:1.

1373 gi~20882231Mus musculussimilar to LIM domain 76 24 ~ only 7 ref~XP_1392 03.1 1373 gi~20302988~Medicago nodule-specific glycine-rich72 26 sativa protein 3 gb~AAM189 48.1 ~AF498 1373 gi~9965267~ginfectious non-structural protein 72 24 b~AAG1000hypodermal and 8.1 ~ hematopoietic necrosis virus 1374 13355835 Rhizobium RBSK 78 32 etli 1374 g17453560Polyangium epoD 73 28 cellulosum 1374 g11749684Schizosaccharomsimilar to Saccharomyces72 28 cerevisiae yces pombe porphobilinogen deaminase, SWISS-PROT Accession Number 1375 116973455Danio reriobeta-3-galactosyltransferase1050 63 1375 AAB24035 Homo SapiensGETH Human PR04397 protein725 46 sequence SEQ ID NO:42.

1375 AAB88404 Homo SapiensHELI- Human membrane 709 43 or secretory protein clone PSEC0159.

1376 g17668 Drosophila bsg25D protein 73 33 melanogaster 1376 g120177037Drosophila LD21844p 73 33 melanogaster 1376 g11353669CaenorhabditisUNC-24 69 43 ele ans 1379 AAS16182_Homo SapiensGENA- Human apolipoprotein245 67 aal (APOC1 DNA.

1379 AAU10534 Homo SapiensGENA- Human apolipoprotein245 67 (APOC1) of eptide.

1379 AAS 16825-Homo SapiensGENA- Human apolipoprotein245 67 aal (APOC1) DNA coding se uence.

1380 AAY36290 Homo sapiensHUMA- Human secreted 177 74 protein encoded by gene 67.

1380 g116551305Tatianyx DNA-directed RNA polymerase71 38 beta' arnacites subunit 2 1380 13411013 Candida protein mannosyltransferase68 35 albicans 1 1381 AAM80132 Homo SapiensHYSE- Human protein SEQ 173 66 ID NO

3778.

1381 g14731867Dictyosteliumsterol glucosyltransferase107 30 discoideum 1381 AAB74726 Homo SapiensINCY- Human membrane 89 41 associated protein MEMAP-32.

1382 AAB62100 Homo SapiensWIST- Human bridging 78 27 integrator-2 (Bin2) rotein.

1382 gi6527168 Homo sapiens breast cancer associated protein 78 27

1382 gi5852834 Homo sapiens bridging integrator-2 78 27

1383 gi7670050Xeno us type I collagen al ha 92 27 laevis 1 1383 AA001606 Homo SapiensHYSE- Human polypeptide 85 29 SEQ ID

NO 15498.

1383 gi17738485Agrobacteriumbiopolymer transport 85 28 protein tumefaciens str.

C58 (U.

Washin ton) 1384 gi20451261CaenorhabditisC. elegans GCY-17 protein71 26 elegans (comes onding se uence W03F11.2) 1384 gi2665714AgrobacteriummoaC 71 29 tumefaciens 1384 gi~20864452~Mus musculusRIKEN cDNA 2410018E23 130 59 ref]XP-1500 76.1 ~

1385 AAY94938 Homo SapiensGEMY Human secreted protein103 25 clone ye78 1 protein sequence SEQ ID

N0:82.

1385 gi12831176Agelaius gamma filamin protein 96 29 phoeniceus 1385 AAU81998 Homo sapiensINCY- Human secreted 87 27 protein SECP24.

1386 gi10440468Homo SapiensFLJ00070 protein 102 41 1386 gi11136912Danio rerioRPTP-al ha protein 94 32 1386 120377083Homo Sapiensp78 92 36 1387 AAM40810 Homo SapiensHYSE- Human polypeptide 190 59 SEQ ID

NO 5741.

1387 AAM39024 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2169. 190 59

1387 g115080474Homo SapiensSimilar to RIKEN cDNA 190 59 ene 1388 g112802591Bovine tegument protein 82 30 herpesvirus 1388 g1950226 SaccharomycesTrf4p ' 73 26 cerevisiae 1388 gi~13095641~Bovine tegumentprotein 82 30 ref~NP_0765herpesvirus 56.1 1389 AAI67224_Homo SapiensCORI- BS11S cDNA sequence.363 100 aal 1389 AAF85500_Homo SapiensEOSB- Nucleotide sequence363 100 of a aal human breast cancer protein designated BCH1.

1389 AAA54120-Homo sapiensEOSB- Breast cancer protein363 100 aal codin se uence.

1390 g1184653 Homo SapiensIFN-alpha responsive 74 30 transcription factor 1390 gi~2580453~gXenopus Xbap 68 47 laevis b~AAB8233 6.1~

1391 AAB88456 Homo SapiensHELI- Human membrane 85 52 or secretory protein clone PSEC0246.

1391 AAB62392 Homo SapiensLEXI- Human LDL receptor85 52 family rotein (LDLP).

1392 ABB12009 Homo sapiens HYSE- Human RAMP1 homologue, SEQ ID NO:2379. 90 100

1392 gi3171910Homo sa RAMP1 90 100 iens 1392 gi12653551Homo Sapiensreceptor (calcitonin) 90 100 activity modifying rotein 1 1394 gi4467343Drosophila EG:140G11.1 70 27 melano aster 1394 gi6018879Drosophila BACN4L24.d 70 27 melanogaster 1394 gi157993 Drosophila developmental protein 70 27 melanogaster 1395 gi4928919Arabidopsiszinc forger protein 2 86 26 thaliana 1395 gi2702272Arabidopsisexpressed protein 86 26 thaliana 1396 AAM25276 Homo sapiensHYSE- Human protein sequence729 93 SEQ

ID N0:791.

1396 AAE14340 Homo sapiensINCY- Human protease 528 33 protein.

1396 AAB47561 Homo sa INCY- Protease PRTS-3. 528 33 iens 1397 gi18369843Infectious P6 89 40 salmon anemia virus 1397 gi4092530Infectious NS1 protein 87 39 salmon anemia virus 1397 gi14009648Infectious NS1 87 39 salmon anemia virus 1398 AAW63707 Homo sa UYOR- Human hSK2 protein.331 91 iens 1398 gi1575663Rattus ~ calcium-activated potassium331 91 channel norvegicus rSK2 1398 gi15082148Homo Sapienssmall-conductance calcium-activated331 91 otassium channel 1399 AAB01.381Homo sapiensINCY- Neuron-associated 1653 68 protein.

1399 gi18157547Mus musculuspecanex-like 3 1620 66 1399 16650377 Mus musculusecanex 1 1277 51 1400 gi~20887681Mus musculussimilar to melastatin 468 91 ~ 1 ref~XP,1405 75.1 1400 gi~3243075~gHomo Sapiensmelastatin 1 355 75 b~AAC8000 0.1~

1400 gi~20552333~Homo Sapienssimilar to melastatin 355 75 ref~XP-0076 62.9 1401 AAU15955 Homo SapiensHUMA- Human novel secreted931 92 protein, Seq ID 908.

1401 gi3978441 Homo sapiens PITSLRE protein kinase alpha SV9 isoform 95 24

1401 gi1517914 Homo sapiens monocytic leukaemia zinc finger protein 91 28

1402 gi1289326 Mus musculus ROR-alpha 1 84 25

1402 gi530878 Chlamydomonas eugametos amino acid feature: N-glycosylation sites, aa 41 .. 43, 46 .. 48, 51 .. 53, 72 .. 74, 107 .. 109, 128 .. 130, 132 .. 134, 158 .. 160, 163 .. 165; amino acid feature: Rod protein domain, aa 169 .. 340; amino acid feature: globular protein domain, aa 32 .. 168 79 32

1402 gi220763 Rattus norvegicus HES-3 factor 79 52

1403 gi|20479430|ref|XP_114955.1| Homo sapiens similar to olfactory receptor MOR231-1 71 32

1403 gi|20480897|ref|XP_115014.1| Homo sapiens similar to olfactory receptor MOR234-3 71 32

1404 AAA88548_Homo sapiensSMIK Human CASB616 cDNA.89 100 aal 1404 AAB 19591Homo SapiensSMIK Human CASB616. 89 100 1404 11100110 Homo sa protein-tyrosine kinase 89 100 iens 1405 g14206753Oryctolagushomeodomain-containing 74 24 protein cuniculus 1405 g113445253Mus musculusorphan Gpr37-like rotein72 33 1405 g13080552Mus musculusHoxa-9 71 50 1406 AAM50585 Homo SapiensNISB Benign prostatic 325 100 hyperplasia associated protein JT460914.

1406 g118031947Homo SapiensSOCS box protein ASB-5 325 100 1406 AAU20593 Homo sapiensHUMA- Human secreted 316 100 protein, Seq ID No 585.

1407 AAU83222 Homo SapiensZYMO Novel secreted protein895 97 Z930005G2P.

1407 AAY02712 Homo SapiensHUMA- Human secreted 91 56 protein encoded by gene 63 clone HBJFV28.

1407 AA000641 Homo SapiensHYSE- Human polypeptide 86 64 SEQ ID

NO 14533.

1408 ABB17944 Homo SapiensHUMA- Human nervous system81 53 related pol eptide SEQ ID NO
6601.

1408 AAM77906 Homo SapiensMOLE- Human bone marrow 72 40 expressed probe encoded protein SEQ

ID NO: 38212.

1408 AAM65199 Homo SapiensMOLE- Human brain expressed72 40 single exon probe encoded protein SEQ ID

NO: 37304.

1409 g15230847Vitreoscillaglutamine synthetase 68 33 Sp. homolog 1409 g18515736Drosophila highwire 67 35 melano aster 1409 g13138797Sulfolobus Ssh7b 65 48 shibatae 1410 AAW23309 Homo sapiensEIJI- Human Werner's 151 96 syndrome WS-2 protein.

1410 gi1913785 Homo sapiens Rep-8 151 96

1410 gi18089098 Homo sapiens reproduction 8 151 96

1411 gi|21297468|gb|EAA09613.1| Anopheles gambiae str. PEST agCP15537 166 56

1411 gi|20983200|ref|XP_135812.1| Mus musculus RIKEN cDNA 1810030007 73 24

1412 gi532572 Hordeum vulgare lipoxygenase 1 82 28

1412 gi945419 Mus musculus hepatoma derived growth factor (HDGF) 77 35

1412 gi17932895 stork hepatitis B virus preC/core antigen 77 26

1413 gi2370143 Homo sapiens immunoglobulin-like domain-containing 1 169 42

1413 gi2645890 Homo sapiens IGSF1 169 42

1413 AAB40232 Homo sapiens HUMA- Human secreted protein sequence encoded by gene NO:142. 162 40

1414 gi21204314Staphylococcusproline-tRNA ligase 78 32 aureussubsp.

aureus MW2 1414 gi14247033Staphylococcusproline-tRNA ligase 78 32 aureus subsp.

aureus Mu50 1414 gi13701063Staphylococcusproline-tRNA ligase 78 32 aureus subsp.

aureus N315 1415 gi9948469Pseudomonasprobable non-ribosomal 78 31 peptide aeruginosa synthetase 1415 AAE19251 Homo SapiensBIOI- SOSl protein sequence75 23 from PS462.

1415 AAU84311 Homo SapiensBAAI~/ Protein ABCB2 74 30 differentially ex ressed in breast cancer tissue.

1416 gi18676710Homo sa FLJ00254 rotein 623 75 iens 1416 gi2065210Mus musculusPro-Pol-dUTPase pol rotein583 69 1416 gi~18676710~Homo SapiensFLJ00254 protein 623 75 dbj~BAB850 07.1 ~

1417 AAR85785 Homo SapiensUYNY Human GRB-10. 77 32 1417 gi841210 Mus musculusgrowth factor receptor 77 32 binding protein Grb 10 1417 AAM90963 Homo SapiensHUMA- Human 74 32 immune/haematopoietic antigen SEQ

ID N0:18556.

1419 AAM79990 Homo SapiensHYSE- Human protein SEQ 82 100 ID NO

3636.

1419 AAM79006 Homo SapiensHYSE- Human protein SEQ 82 100 ID NO

1668.

1419 AAR28494 Homo SapiensXIAM/ Sequence encoded 82 100 by the CAMPATH-1 antigen cDNA.

1420 AAU01383 Homo SapiensMILL- Human TANGO 499 828 73 form 2, variant 1 amino acid sequence.

1420 AAU01382 Homo SapiensMILL- Human TANGO 499 828 73 form 2, variant 4 amino acid se uence.

1420 AAU01380 Homo SapiensMILL- Human TANGO 499 828 73 form 2, amino acid se uence.

1421 gi19069609 Encephalitozoon cuniculi PROTEASOME REGULATORY SUBUNIT YTA6 OF THE AAA FAMILY OF ATPASES 76 26

1422 AAM66177 Homo SapiensMOLE- Human bone marrow199 72 expressed probe encoded protein SEQ

ID NO: 26483.

1422 AAM53791 Homo SapiensMOLE- Human brain expressed199 72 single exon probe encoded protein SEQ ID

NO: 25896.

1422 AAM68472 Homo SapiensMOLE- Human bone marrow176 81 expressed probe encoded protein SEQ

ID NO: 28778.

1423 11800227 Oryza sativaBowman-Birk roteinase 74 34 inhibitor 1423 g110141005San Miguel non-structural polyprotein74 26 sea lion virus 1423 gi~17490177~Homo sapienssimilar to RING finger 76 28 protein 18 re~XP-0623 (Testis-specific ring-forger protein) 00.1 ~

1424 g1461336 Pyrenomonas hsp70 75 29 salina 1424 g113880037Mycobacteriummembrane protein, MmpL 75 24 family tuberculosis 1424 g11449306MycobacteriummmpL2 75 24 tuberculosis H37Rv 1425 g115600 Enterobacteriagene 7.3, host range 79 30 ha a T7 1425 g116198065Drosophila LD28477p 77 30 melanogaster 1425 g111870012Drosophila xnp/atr-x DNA helicase 77 30 melanogaster 1426 g116185397Drosophila LD39815p 204 44 melano aster 1426 g12244793Arabidopsis disease resistance N 86 30 like protein thaliana 1426 AAU84280 Homo SapiensBGHM Human endometrial 77 26 cancer related rotein, HERC1.

1427 AAY36302 Homo SapiensHUMA- Human secreted 183 79 protein encoded by gene 79.

1427 AAB88359 Homo SapiensHELI- Human membrane 178 80 or secretory protein clone PSEC0087.

1427 AAM41635 Homo SapiensHYSE- Human polypeptide178 80 SEQ ID

NO 6566.

1428 AAU82008 Homo sapiens INCY- Human secreted protein SECP34. 114 64

1428 AAB32391 Homo SapiensHUMA- Human secreted 114 64 protein sequence encoded by gene 21 SEQ ID

N0:77.

1428 AAY08306 Homo SapiensFIBR- Human collagen 74 45 IX alpha-3 chain rotein.

1429 gi2792523 Ralstonia solanacearum alternative RNA sigma factor RpoS 69 30

1429 gi17428221 Ralstonia solanacearum RNA POLYMERASE SIGMA S (SIGMA-38) FACTOR TRANSCRIPTION REGULATOR PROTEIN 69 33

1429 gi|5032313|ref|NP_004014.1| Homo sapiens dystrophin Dp140bc isoform; Dystrophin (muscular dystrophy, Duchenne and Becker types) 73 26

1433 gi9954445 Rattus norvegicus TEMO 171 62

1433 gi14030260 maize rayado fino virus polyprotein 79 32

1433 AAB95656 Homo sapiens HELI- Human protein sequence SEQ ID NO:18419. 77 36

1434 AAR04212 Homo SapiensCALB- Human 32K alveolar391 43 surfactant rotein.

1434 AAP60661 Homo SapiensKUSH/ Genomic sequence 386 43 of human alveolar surfactant protein (hASP)encoded by genomic DNA.

1434 AAB58135 Homo SapiensROSE/ Lung cancer associated366 42 pol a tide sequence SEQ ID 473.

1435 gi17224904Mus musculusimmuno lobulin superfamily180 48 member 9 1435 gi20988778Homo SapiensSimilar to immunoglobulin173 53 su erfamily, member 1435 gi14149050Drosophila turtle protein, isoform114 36 melanogaster 1436 gi1465855CaenorhabditisC. elegans PQN-57 protein85 23 elegans (correspondin sequence R09F10.7) 1436 gi1465856CaenorhabditisC. elegans PQN-56 protein85 23 elegans (correspondin sequence R09F10.2) 1436 117864717Mus musculushornerin 83 26 1437 gi~21292574~Anopheles agCP3449 66 33 gb~EAA047gambiae str.

19.1 PEST

1438 ABB 10160Homo SapiensHUMA- Human cDNA SEQ 166 62 ID NO:

468.

1438 g19657279Vibrio choleraeaspartokinase II/homoserine71 28 dehydrogenase, methionine-sensitive 1439 g14582571Gallus gallusH erion protein, 419 75 24 kD isoform 1439 g113165 Oenothera ATPase alpha-subunit 72 26 (aa 1-511) biennis 1439 g1903838 Oenothera F-1-ATPase alpha subunit72 26 berteriana 1440 g14558758Homo Sapienstestis-specific chromodomain233 62 Y-like protein 1440 g14558762Mus musculustestis-specific chromodomain231 36 Y-like rotein 1440 g13342716Homo Sapienstestis-specific ChromoDomain195 36 Y

isoform 1 1441 g1155627 Acanthamoebamyosin I heavy chain 118 42 castellanii 1441 g113093370Mycobacteriuminitiation factor IF-2 116 33 1e rae 1441 AAY20289 Homo SapiensUYRO- Human apolipoprotein114 39 E

mutant rotein fragment 5.

1442 gi2253707 Mus musculus Daxx 84 36

1442 gi1934970 Plasmodium falciparum AARP1 protein 79 65

1442 14050098 Mus musculusFas-bindin protein 78 34 1443 g12425111DictyosteliumZipA 90 26 discoideum 1443 AAY06119 Homo SapiensHARD Human CIITA interacting88 26 protein 104 CIP104).

1443 g15420387Leishmania proteophosphoglycan 86 21 maj or 1444 g1893355 AcinetobacterL-2,4-diaminobutyrate 77 26 decarboxylase baumannii 1445 ABB55744 Homo sapiensFECH/ Human polypeptide 135 47 SEQ ID

NO 94.

1445 AAU39035 Homo SapiensGEMY Human secreted protein135 47 nh328 5.

1445 AAY28679 Homo SapiensGEMY Human nh328 5 secreted135 47 rotein.

1446 g119744390Homo sapiensretinoic acid inducible 247 54 in neuroblastoma cells RAINB
1 d 1446 g119744388Homo Sapiensretinoic acid inducible 247 54 in neuroblastoma cells RAINB

1446 AAY85565 Homo SapiensJANC Human homologue 240 52 of UNC-53 (Hs-UNC-53/2) se uence.

1447 AAU19716 Homo SapiensHUMA- Human novel extracellular71 31 matrix protein, Seq ID
No 366.

1447 g118025476cercopithicineBPLF1 71 38 he esvirus 1447 AAS 14575_Homo SapiensMILL- Human cDNA encoding69 62 G

aal protein-coupled receptor, GPCR, 52872.

1448 g114027507Mesorhizobiumsalicylate hydroxylase 69 31 loti 1449 AAG64798 Homo sapiensSREH- Human peptide methionine192 . 71 sulphoxide reductase (hPMSR).

1449 AAB81893 Homo SapiensSEQU- Human genomic database192 71 related protein SEQ ID
NO: 38.

1449 AAM42046 Homo SapiensHYSE- Human polypeptide 192 71 SEQ ID

NO 6977.

1450 g118249657Mus musculusNC8 1063 80 1450 1406748 Mus musculuszinc finger protein 250 37 1450 AAB43498 Homo SapiensHUMA- Human cancer associated249 37 rotein sequence SEQ ID
N0:943.

1451 ABB89331 Homo SapiensHUMA- Human polypeptide 732 88 SEQ ID

NO 1707.

1451 gi13421927 Caulobacter crescentus MaoC family protein 273 42

1451 gi19338616 Methylobacterium extorquens R-specific enoyl-CoA hydratase 261 44

1452 gi|20908171|ref|XP_139715.1| Mus musculus similar to NADPH oxidase 3; NADPH oxidase catalytic subunit-like 68 30

1452 gi|17533619|ref|NP_495516.1| Caenorhabditis elegans F32A5.8.p 67 42

1453 gi|15614051|ref|NP 54.1| Bacillus halodurans sodium-dependent phosphate transporter 65 34

1454 gi~17551878~CaenorliabditisTPRDomain 76- 29 ref~NP_4990elegans 90.1 1455 AAM40727 Homo SapiensHYSE- Human polypeptide 191 56 SEQ ID

NO 5658.

1455 AAM38941 Homo SapiensHYSE- Human polypeptide 191 56 SEQ ID

NO 2086.

1455 gi19702127Homo sa P-Rexl rotein 191 56 iens 1456 ABB05666 Homo SapiensGEHU- Human nucleic acid496 91 management rotein clone amy2 l 1n4.

1456 AAE03372 Homo SapiensHUMA- Human gene 18 encoded496 91 secreted protein fragment, SEQ ID

N0:152.

1456 AAE03371 Homo SapiensHUMA- Human gene 18 encoded496 91 secreted protein fragment, SEQ ID

N0:150.

1457 AAM66940 Homo SapiensMOLE- Human bone marrow 290 77 expressed probe encoded protein SEQ

ID NO: 27246.

1457 AAM54534 Homo SapiensMOLE- Human brain expressed290 77 single exon probe encoded protein SEQ ID

NO: 26639.

1457 AAM64410 Homo SapiensMOLE- Human brain expressed287 77 single exon probe encoded protein SEQ ID

NO: 36515.

1458 AAB53445 Homo sapiens HUMA- Human colon cancer antigen protein sequence SEQ ID NO:985. 335 100
1458 AAY30055 Homo sapiens ARIA- Amino acid sequence of a FK506-binding protein (FKBP). 165 91
1458 AAQ52277_aa1 Homo sapiens VERT- FK506 binding protein (FKBP12A) cDNA. 159 100
1460 AAU20255 Homo sapiens HUMA- Human novel endocrine antigen, SEQ ID No 312. 104 76
1460 ABB17663 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 6320. 94 77
1460 AA002331 Homo sapiens HYSE- Human polypeptide SEQ ID NO 16223. 88 61
1461 AAM65951 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 26257. 97 57
1461 AAM53568 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 25673. 97 57
1461 AAU83199 Homo sapiens ZYMO Novel secreted protein Z891639G1P. 96 38

1463 15565687 Homo sapiens topoisomerase-related function protein 514 75
1463 15139669 Homo sapiens LAK-1 468 75
1463 gi21430468 Drosophila melanogaster LP06848p 332 51
1464 AAY91421 Homo sapiens HUMA- Human secreted protein sequence encoded by gene NO:142. 109 35
1464 AAY91396 Homo sapiens HUMA- Human secreted protein sequence encoded by gene NO:117. 109 35
1464 AAY91352 Homo sapiens HUMA- Human secreted protein sequence encoded by gene NO:73. 109 35
1465 AAU15978 Homo sapiens HUMA- Human novel secreted protein, Seq ID 931. 575 100
1465 AAU15958 Homo sapiens HUMA- Human novel secreted protein, Seq ID 911. 575 100
1465 116041675 Homo sapiens joined to JAZF1 575 100
1466 AA001502 Homo sapiens HYSE- Human polypeptide SEQ ID NO 15394. 173 66
1466 gi|10947038|ref|NP 09.1| Homo sapiens ankyrin 1, isoform 1; ankyrin-1, erythrocytic; ankyrin-R 74 28
1466 gi|10947036|ref|NP 08.1 Homo sapiens ankyrin 1, isoform 4; ankyrin-1, erythrocytic; ankyrin-R 74 28
1467 gi19354550 Mus musculus similar to src homology three (SH3) and cysteine rich domain 842 91
1467 AAU17352 Homo sapiens HUMA- Novel signal transduction pathway protein, Seq ID 917. 361 98
1467 gi1799566 Mus musculus stet 302 44
1468 gi13506771 Mus musculus structural protein FBF1 767 74
1468 gi7549210 Babesia bigemina 200 kDa antigen p200 213 29
1468 gi1747 Oryctolagus cuniculus trichohyalin 191 30
1469 111345048 Homo sapiens SCAN domain-containing protein 2 86 32
1469 111320940 Homo sapiens SCAND2 86 32
1469 gi14210722 Tupaia herpesvirus t41 86 30
1470 AAY88278 Homo sapiens MILL- Human TANGO 188 protein. 1442 100

1470 114336711 Homo sapiens similar to C. elegans protein F17C8.5 1442 100
1470 AAA39947_aa1 Homo sapiens MILL- Human TANGO 188 cDNA. 1438 99
1471 AAE10204 Homo sapiens HYSE- Human bone marrow derived contig protein, SEQ ID NO: 69. 71 44
1471 AAA23458_aa1 Homo sapiens ALPH- cDNA encoding human secreted protein vpl5_l, SEQ ID NO:71. 67 46
1471 AAB80228 Homo sapiens GETH Human PRO269 protein. 67 46
1472 AAB88433 Homo sapiens HELI- Human membrane or secretory protein clone PSEC0210. 136 86
1472 AAB95155 Homo sapiens HELI- Human protein sequence SEQ ID NO:17188. 136 86
1472 AAE01745 Homo sapiens HUMA- Human gene 2 encoded secreted protein HOGCS52 variant, SEQ ID NO:160. 136 86
1473 gi9294201 Arabidopsis thaliana disease resistance protein 70 24
1474 AAE1915 Homo sapiens THOR/ Human kinase polypeptide (PKIN-15). 631 98
1474 AAM79131 Homo sapiens HYSE- Human protein SEQ ID NO 1793. 494 72
1474 AAW19920 Homo sapiens REGC Human Ksr (kinase suppressor of Ras). 494 72
1475 AAD12609_aa1 Homo sapiens SAGA Human protein having hydrophobic domain encoding cDNA clone HP03974. 657 73
1475 AA014199 Homo sapiens INCY- Human transporter and ion channel TRICH-16. 657 73
1475 AAE06614 Homo sapiens SAGA Human protein having hydrophobic domain, HP03974. 657 73

1476 113905246 Mus musculus RIKEN cDNA 2410024K20 gene 71 34
1476 gi|17505208|ref|NP_081629.1| Mus musculus CD2 antigen (cytoplasmic tail) binding protein 2; 1500011B02Rik 71 34
1477 gi806491 Rattus norvegicus guanylyl cyclase 140 65
1477 gi2648066 Canis familiaris guanylate cyclase E 118 55
1477 gi2623074 Bos taurus rod outer segment guanylate cyclase precursor 116 55
1478 12065210 Mus musculus Pro-Pol-dUTPase polyprotein 585 73
1478 118676710 Homo sapiens FLJ00254 protein 408 69
1478 AA004042 Homo sapiens HYSE- Human polypeptide SEQ ID NO 17934. 392 75
1479 AAU05396 Homo sapiens GEHO Human titin (connectin) protein sequence. 208 29
1479 gi1212992 Homo sapiens Protein sequence and annotation available soon via Swiss-Prot; available at present via e-mail from LABEIT@EMBL-Heidelberg.DE 208 29
1479 gi17066105 Homo sapiens Titin 208 29
1480 AAV44685_aa1 Homo sapiens TEXA Osteoclast inhibitor protein, OIP-1, coding sequence. 94 41
1480 AAB35287 Homo sapiens UROG- Human stem cell antigen-2. 94 41
1480 AAY99709 Homo sapiens REGC Human stem cell antigen-2, hSCA-2. 94 41
1481 AAB57094 Homo sapiens ROSE/ Human prostate cancer antigen protein sequence SEQ ID NO:1672. 122 100
1481 gi32672 Homo sapiens interferon alpha/beta receptor 122 100
1481 AAQ49625_aa1 Homo sapiens EUBI- Human interferon receptor extracellular domain coding sequence. 118 96
1482 AAD17516_aa1 Homo sapiens SENO- Human taste receptor, hT1R1 cDNA coding sequence. 890 94
1482 ABB77319 Homo sapiens INCY- Human G-protein coupled receptor SEQ ID NO 3. 890 94
1482 AAE10372 Homo sapiens SENO- Human taste receptor, hT1R1 protein. 890 94

1483 gi18376312 Neurospora crassa related to SSD1 protein 109 39
1483 gi2645173 Schizosaccharomyces pombe sts5+ 99 42
1483 gi2459997 Candida albicans protein phosphatase Ssd1 homolog 99 40
1484 gi|18569064|ref|XP_095378.1| Homo sapiens similar to 40S RIBOSOMAL PROTEIN S3A (V-FOS TRANSFORMATION EFFECTOR PROTEIN) 319 96
1484 gi|20539276|ref|XP_095220.2 Homo sapiens similar to olfactory receptor MOR145-2 259 94
1484 gi|21295882|gb|EAA08027.1 Anopheles gambiae str. PEST agCP1347 68 32
1485 ABB11761 Homo sapiens HYSE- Human secreted protein homologue, SEQ ID NO:2131. 197 36
1485 gi930259 Woolly monkey sarcoma virus reverse transcriptase (476 AA) 148 33
1485 gi18076262 porcine endogenous retrovirus Pol protein 147 38
1486 AAM74887 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 35193. 172 100
1486 AAM62085 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 34190. 172 100

1486 1152661 Plasmid SB24.2 neomycin resistance protein 75 26
1487 112653493 Homo sapiens Similar to brain acid-soluble protein 1 75 34
1487 gi17428832 Ralstonia solanacearum PROBABLE AVRBS3-LIKE PROTEIN 75 33
1487 gi7329672 Arabidopsis thaliana phosphatidate cytidylyltransferase-like protein 72 46
1488 AAU74754 Homo sapiens INCY- Human protease protein sequence. 2042 83
1488 AAU74752 Homo sapiens INCY- Human protease PRTS-12 protein sequence. 476 39
1488 111935122 Mus musculus a ilin 431 40
1489 gi|17543712|ref|NP_499976.1 Caenorhabditis elegans YSSF3C.8.p 72 32
1489 gi|20344600|ref|XP_109579.1 Mus musculus RIKEN cDNA 4933431K05 70 30
1489 gi|11692798|gb|AAG40002.1|AF320 Xenopus laevis ataxia telangiectasia and Rad3-related protein 69 26
1490 AAB95817 Homo sapiens HELI- Human protein sequence SEQ ID NO:18817. 256 63
1490 ABB06369 Homo sapiens BODE- Human neurogenesis related protein 12 SEQ ID NO:2. 173 64
1490 AAB44394 Homo sapiens HUMA- Gene 10 encoded human secreted protein fragment as BLASTX query sequence. 83 66
1491 gi438795 Mus musculus serotonin 1A receptor 73 26
1491 gi1066326 Mus musculus serotonin 1A receptor 72 26
1491 gi|438795|gb|AAA16850.1| Mus musculus serotonin 1A receptor 73 26

1492 gi16198083 Drosophila melanogaster LD29875p 87 33
1492 gi2327063 Pneumocystis carinii f. sp. carinii protease 1 75 34
1492 120420 Prunus dulcis extensin 75 34
1493 AAG67087 Homo sapiens SHAN- Human ATP-dependent serine protein hydrolase 13. 106 67
1493 AAM76636 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 36942. 103 68
1493 AAM63822 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 35927. 103 68
1494 AAY31225 Homo sapiens AVET Human RNA helicase p135 protein. 73 38
1494 gi3123906 Homo sapiens pre-mRNA splicing factor 73 38
1494 gi13278975 Homo sapiens pre-mRNA splicing factor similar to S. cerevisiae Prp16 73 38
1495 gi|17568307|ref|NP_509837.1| Caenorhabditis elegans collagen 74 35

1496 12065210 Mus musculus Pro-Pol-dUTPase polyprotein 410 81
1496 gi|10834720|gb|AAG23790.1|AF258 Homo sapiens PP565 301 77
1496 gi|6753924|ref|NP_034374.1 Mus musculus Friend virus susceptibility 127 37
1497 gi20901968 Caenorhabditis elegans C. elegans RPL-36 protein (corresponding sequence F37C12.4) 71 34
1497 gi|17554754|ref|NP 73.1 Caenorhabditis elegans Ribosomal protein YL39 71 34
1498 gi5305335 Mycobacterium tuberculosis proline-rich mucin homolog 102 27
1498 gi330130 human herpesvirus latency associated transcript (LAT) ORF-2 97 37
1498 AAU83682 Homo sapiens GETH Human PRO protein, Seq ID No 182. 94 30
1499 AAY57937 Homo sapiens INCY- Human transmembrane protein HTMPN-61. 199 81
1499 AAY36295 Homo sapiens HUMA- Human secreted protein encoded by gene 72. 151 100
1499 AAG75708 Homo sapiens HUMA- Human colon cancer antigen protein SEQ ID NO:6472. 141 92
1500 gi21428712 Drosophila melanogaster SD05267p 165 54
1500 gi20975274 Homo sapiens skeletrophin 114 40
1500 gi19773434 Mus musculus skeletrophin 99 52
1501 ABB17830 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 6487. 82 37
1501 AA012929 Homo sapiens HYSE- Human polypeptide SEQ ID NO 26821. 73 43

1502 gi8778340 Arabidopsis thaliana F15O4.13 77 39
1503 AAW03515 Homo sapiens SHKJ Human DOCK180 protein. 144 33
1503 11339910 Homo sapiens DOCK180 protein 144 33
1503 113195147 Mus musculus HCH 129 25
1505 AAM70790 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 31096. 77 53
1505 AAM58316 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 30421. 77 53
1505 gi|21302711|gb|EAA14856.1 Anopheles gambiae str. PEST agCP4916 77 30
1506 AAU75102 Homo sapiens MYRI- Heat shock protein 8 (HspB). 592 79
1506 AAB82535 Homo sapiens UYCO- Human heat shock protein Hsc70. 592 79
1506 AAE12987 Homo sapiens SRIV/ Human Hsp70 family homologue, Hsc70. 592 79
1507 ABL53627_aa1 Homo sapiens GENO- Breast protein-eukaryotic conserved gene 1 (BSTP-ECG1) cDNA. 213 92
1507 ABB75677 Homo sapiens GENO- Breast protein-eukaryotic conserved gene 1 (BSTP-ECG1) protein. 213 92
1507 AAY99421 Homo sapiens GETH Human PRO1433 (UNQ738) amino acid sequence SEQ ID NO:292. 213 92
1508 AAW15565 Homo sapiens UYJO Human intracellular tyrosine kinase Tnk1-alpha. 79 29
1508 gi233062 Gallus gallus src downstream region 78 33
1508 gi18376366 Neurospora crassa related to ribosomal protein S15 precursor (mitochondrial) 72 30
1509 gi|21297482|gb|EAA09627.1 Anopheles gambiae str. PEST agCP15541 68 36

1510 AAM41631 Homo sapiens HYSE- Human polypeptide SEQ ID NO 6562. 127 37
1510 AAM39845 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2990. 127 37
1510 AAM79502 Homo sapiens HYSE- Human protein SEQ ID NO 3148. 127 37
1511 gi21217669 Mus musculus myosin IIIA 70 28
1511 gi|21302393|gb|EAA14538.1 Anopheles gambiae str. PEST agCP8799 71 36
1511 gi|20822589|ref|XP_140854.1| Mus musculus similar to myosin IIIA 70 28
1512 gi6911049 Babesia bovis p9.6.2-like variant erythrocyte surface antigen-1a 82 28
1512 gi6911045 Babesia bovis p9.6.2 variant erythrocyte surface antigen-1a 82 28
1512 gi6911047 Babesia bovis p8.4.1 variant erythrocyte surface antigen-1a 81 28

1513 gi10174843 Bacillus halodurans maltose transport system (permease) 77 25
1513 gi56312 Rattus norvegicus Gephyrin 76 31
1513 gi4325371 Arabidopsis thaliana contains similarity to Medicago truncatula N7 protein (GB:Y17613) 74 28
1514 AAY14196 Homo sapiens TAKEI T cell receptor zeta chain protein sequence. 95 100
1514 1623042 Homo sapiens T-cell receptor zeta chain 95 100
1514 14960202 Sus scrofa CD3 zeta chain 95 100
1515 ABB07508 Homo sapiens INCY- Human aminoacyl tRNA synthetase (ATRS) polypeptide (ID: 7474756CD1). 726 100
1515 AAB43670 Homo sapiens HUMA- Human cancer associated protein sequence SEQ ID NO:1115. 604 82
1515 gi1464742 Homo sapiens threonyl-tRNA synthetase 604 82
1516 gi21109348 Xanthomonas axonopodis pv. citri str. cytochrome B561 77 29
1516 gi21114046 Xanthomonas campestris pv. campestris str. cytochrome B561 76 28
1516 gi|21243760|ref|NP_643342.1 Xanthomonas axonopodis pv. citri str. cytochrome B561 77 29
1517 ABB11450 Homo sapiens HYSE- Human neurotoxin homologue, SEQ ID NO:1820. 119 33
1517 18809770 Mus musculus Ly-6I.1 94 30
1517 18809768 Mus musculus lymphocyte antigen LY6I precursor 94 30
1519 gi|59977|emb|CAA78662.1| Human endogenous retrovirus tripartite fusion transcript 171 67
1519 gi|17826947|dbj|BAB79287.1| Pseudomonas sp. ND137 beta-1,4-xylanase 73 34
1519 gi|21232680|ref|NP_638597.1| Xanthomonas campestris pv. campestris str. ribonuclease PH 72 30

1520 AAM78023 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 38329. 190 100
1520 AAM65326 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 37431. 190 100
1520 gi13447468 Emericella nidulans FH1/FH2 protein homolog 121 49
1522 AAG81417 Homo sapiens ZYMO Human AFP protein sequence SEQ ID NO:352. 287 100
1523 AAY90349 Homo sapiens SMIK Human fatty acid synthase (FAS) protein sequence. 158 85
1523 AAB43871 Homo sapiens HUMA- Human cancer associated protein sequence SEQ ID NO:1316. 158 85

1523 1915392 Homo sapiens fatty acid synthase 158 85
1525 AAG03819 Homo sapiens GEST Human secreted protein, SEQ ID NO: 7900. 93 100
1525 11311466 Homo sapiens 24-kDa subunit of Complex I 93 100
1525 gi188852 Homo sapiens NADH-ubiquinone reductase 93 100
1526 AAD02855_aa1 Homo sapiens SUKA Human platelet membrane glycoprotein VI (GPVI) cDNA. 73 31
1526 AAB49403 Homo sapiens MERE Human glycoprotein VI mature protein. 73 31
1526 AAB61257 Homo sapiens MILL- Mature human TANGO protein. 73 31
1527 gi17864896 Mus musculus protocadherin 18 precursor 81 31
1527 gi15980222 Yersinia pestis aconitate hydratase 1 79 30
1527 gi12248353 Fasciola hepatica NADH dehydrogenase subunit 5 75 56
1528 gi2440214 Trypanosoma brucei brucei invariant surface glycoprotein 83 28
1528 gi10567463 Rhizobium rhizogenes probable viral gene 78 22
1529 gi2231279 Porcine reproductive and respiratory syndrome virus envelope protein 66 31
1530 gi|199851|gb|AAA39757.1| Mus musculus pol protein 257 42
1530 gi|1498648|gb|AAB06450.1| Mus musculus Gag-Pol polyprotein 257 42
1530 gi|331995|gb|AAB03091.1| AKV murine leukemia virus gag-pol polyprotein (tag amber codon at 2250-2252 inserts Gln in Mo-MuLV) 257 42
1533 gi435698 Homo sapiens CD44SP 136 100
1533 AAV63461_aa1 Homo sapiens GEHO Human CD44 antigen cDNA. 130 100
1533 AAT14724_aa1 Homo sapiens GEHO Human haematopoietic cDNA clone CD44.5. 130 100
1534 gi2622165 Methanothermobacter thermautotrophicus str. Delta H acetyltransferase 71 29
1534 gi|15679078|ref|NP_276195.1| Methanothermobacter thermautotrophicus acetyltransferase 71 29
1535 gi7777 Drosophila melanogaster protein H 73 28
1535 gi457146 Plasmodium yoelii rhoptry protein 73 38
1535 gi13195258 Plasmodium yoelii yoelii 235 kDa rhoptry protein 73 38
1536 ABB09740 Homo sapiens BODE- Amino acid sequence of human protein phosphatase 11.66. 132 43

1536 gi|20830386|ref|XP 42.1 Mus musculus similar to importin alpha 1b 72 35
1537 gi14039907 Rattus norvegicus cytochrome P450 monooxygenase CYP2T1 353 39
1537 gi2920650 Mus musculus cytochrome P450 CYP2B19 275 44
1537 12353336 Capra hircus cytochrome P450 271 31
1538 AAU83175 Homo sapiens ZYMO Novel secreted protein Z874015G4P. 282 100
1538 gi6714803 Streptomyces coelicolor A3(2) integral membrane protein. 77 26
1539 gi12963397 Prunus x yedoensis ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit 74 32
1539 gi466436 Saccharomyces cerevisiae BOI1 69 31
1539 gi5833897 Besleria affinis ribulose 1,5-bisphosphate carboxylase large subunit 69 31
1542 AAY32193 Homo sapiens INCY- Human receptor molecule (REC) encoded by Incyte clone 044150. 73 26
1542 gi7576677 Helicobacter pylori IceA1 72 44
1542 gi|20841498|ref|XP_1 41.1 Mus musculus similar to MUF1 protein 73 26
1546 114581448 Homo sapiens FSHD Region Gene 2 protein 73 42
1546 gi15982852 Arabidopsis thaliana AT5g66850/MUD21_11 71 34
1546 gi|14581448|gb|AAK21977.1| Homo sapiens FSHD Region Gene 2 protein 73 42
1547 gi18676660 Homo sapiens FLJ00229 protein 192 92
1547 AAU21409 Homo sapiens HUMA- Human novel foetal antigen, SEQ ID NO 1653. 179 100
1547 AAM42128 Homo sapiens HYSE- Human polypeptide SEQ ID NO 7059. 114 53
1548 AAG64494 Homo sapiens SHAN- Human natriuretic peptide receptor 18. 539 100
1548 118676710 Homo sapiens FLJ00254 protein 268 77
1548 AAB28764 Homo sapiens HUMA- Sequence homologous to protein fragment encoded by gene 21. 249 72
1549 AAB67055 Homo sapiens INCY- Human immune response molecule (IMUN) protein SEQ ID NO: 9. 606 82
1549 AA001862 Homo sapiens HYSE- Human polypeptide SEQ ID NO 15754. 404 72
1549 gi|6753924|ref|NP_034374.1| Mus musculus Friend virus susceptibility 213 36

1550 1190129 Homo sapiens 70kDa peroxisomal membrane protein 92 100
1550 gi825711 Homo sapiens 70kD peroxisomal integral membrane protein 92 100
1550 gi220862 Rattus norvegicus PMP70 89 94
1551 AAM69543 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 29849. 228 100
1551 AAM57148 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 29253. 228 100
1551 AAB93944 Homo sapiens HELI- Human protein sequence SEQ ID NO:13960. 94 57
1552 gi4884924 Rangiferine herpesvirus glycoprotein C 75 34
1552 gi|18556240|ref| 28.2 Homo sapiens similar to Salivary glue protein SGS-3 precursor 78 30
1552 gi|4884924|gb|AAD31876.1| Rangiferine herpesvirus glycoprotein C 75 34
1553 gi|2193870|dbj|BAA20419.1 Mus musculus reverse transcriptase 176 35
1553 gi|2731767|gb|AAC53542.1 Mus musculus endonuclease/reverse transcriptase 176 35
1554 ABB08776 Homo sapiens BODE- Human neuregulin NO 2. 75 29
1554 AAM92816 Homo sapiens HUMA- Human digestive system antigen SEQ ID NO: 2165. 71 29
1554 gi|6322838|ref|NP_ 11.1 Saccharomyces cerevisiae Protein required for cell viability; Ykl014cp 70 27
1555 gi7528184 Drosophila melanogaster bicoid-interacting protein 78 28
1555 gi15292595 Drosophila melanogaster SD09926p 78 28
1555 gi4514620 Mus musculus Ror2 71 24
1557 ABA91504_aa1 Homo sapiens EYEE- Human epidermal growth factor receptor precursor cDNA. 144 93
1557 AAF85332_aa1 Homo sapiens NOVS Nucleotide sequence of wild type EGFR1. 144 93
1557 AAM50768 Homo sapiens EPEE- Human epidermal growth factor receptor precursor. 144 93
1558 AAB99950 Homo sapiens SHAN- Human alkylated-DNA-protein cysteine methyltransferase 14. 221 100
1558 AAU16267 Homo sapiens HUMA- Human novel secreted protein, Seq ID 1220. 221 100
1558 ABB11507 Homo sapiens HYSE- Human secreted protein homologue, SEQ ID NO:1877. 183 97

1559 gi14599730 Sachea correae maturase 71 28
1559 gi14599648 Blepharandra heteropetala maturase 71 30
1559 gi14599673 Galphimia gracilis maturase 70 28
1560 gi2323287 multiple sclerosis associated retrovirus polyprotein 340 83
1560 gi|13310191|gb|AAK18189.1|AF331500_1 multiple sclerosis associated retrovirus element recombinant envelope protein 260 70
1560 gi|21103962|gb|AAM33141.1 Homo sapiens enverin-2 248 84
1561 AAB94698 Homo sapiens HELI- Human protein sequence SEQ ID NO:15680. 107 95
1561 AAU18480 Homo sapiens HUMA- Human endocrine polypeptide SEQ ID No 435. 107 95
1561 ABB10288 Homo sapiens HUMA- Human cDNA SEQ ID NO: 596. 107 95
1562 gi969078 Drosophila melanogaster S-adenosylhomocysteine hydrolase 73 26
1562 gi21064553 Drosophila melanogaster RE58316p 73 26
1562 AAM41205 Homo sapiens HYSE- Human polypeptide SEQ ID NO 6136. 72 30
1563 gi1778844 Dictyostelium discoideum LimA 71 34
1563 gi|20985456|ref|XP_142111.1 Mus musculus similar to actin beta chain - human 75 36
1563 gi|1778844|gb|AAB40929.1| Dictyostelium discoideum LimA 71 34
1564 gi|9507757|ref|NP_061423.1 Plasmid F resolvase 507 91
1564 gi|148589|gb|AAA24900.1| Plasmid F Protein D 507 91
1564 gi|10955295|ref|NP_052636.1 Escherichia coli resolvase 501 90
1565 gi7649370 Arabidopsis thaliana guanine nucleotide-exchange-like protein 77 38
1565 gi1674160 Mycoplasma pneumoniae involved in cytadherence, see: MPN142 71 35
1565 gi|15229258|ref|NP_189916.1 Arabidopsis thaliana guanine nucleotide-exchange - like protein 77 38
1566 gi1799600 similar to SwissProt Accession Number P31458 1051 99
1566 gi13814506 Sulfolobus solfataricus Mandelate racemase/muconate lactonizing enzyme related protein (MR/MLE) 286 35
1566 gi10640034 Thermoplasma acidophilum starvation-sensing protein rspA related protein 270 35
1567 gi13359972 Escherichia coli O157:H7 acridine efflux pump 573 98
1567 gi1773144 Escherichia coli probable transmembrane protein AcrE 573 98

1567 gi532311 Escherichia coli 114 kDa protein 573 98
1569 gi8918871 Plasmid F 96 pct identical to gp:AB021078 30 [YccA of plasmid ColIb-P9] 288 98
1569 gi|17136976|ref|NP_477026.1| Drosophila melanogaster repo-P1; Antibody RK2 71 33
1569 gi|6502544|gb|AAF14351.1|AF11019 Glomus intraradices homeobox protein HB 1 70 31
1570 gi13363792 Escherichia coli O157:H7 zinc-transporting ATPase 410 87
1570 gi466605 Escherichia coli No definition line found 410 87
1570 gi12518128 Escherichia coli O157:H7 zinc-transporting ATPase 410 87
1571 AAU83186 Homo sapiens ZYMO Novel secreted protein Z887014G7P. 1006 100
1571 gi7248459 Zea mays arabinogalactan protein 85 29
1571 gi3513742 Arabidopsis thaliana contains similarity to Zea mays embryogenesis transmembrane protein (GB:X97570) 82 35
1572 gi12597465 Caenorhabditis elegans CED-1 72 44
1572 gi19571666 Caenorhabditis elegans similar to EGF-like domain 72 44
1572 gi4883938 Drosophila melanogaster laminin alpha1,2 67 31
1573 ABB12490 Homo sapiens HYSE- Human bone marrow expressed protein SEQ ID NO: 329. 106 38
1574 11478205 Mus musculus PNG protein 75 41
1574 AAM40148 Homo sapiens HYSE- Human polypeptide SEQ ID NO 3293. 69 56
1574 AAM79341 Homo sapiens HYSE- Human protein SEQ ID NO 2987. 69 35
1576 gi|20882651|ref|XP_123303.1 Mus musculus ATPase, class 2, member b 234 91
1576 gi|7656918|ref|NP_056620.1| Mus musculus ATPase, class 2, member b; ATPase 9B, class II; ATPase 9B, p type 234 91
1577 gi18143418 Alteromonas sp. chitinase A 77 39
1577 gi15426105 Leishmania major probable surface antigen protein 75 24
1578 119702241 Homo sapiens rabconnectin 439 93
1578 gi7452946 Homo sapiens X-like 1 protein 132 41
1578 gi1279384 Drosophila melanogaster X 109 29
1580 AAE20337 Homo sapiens HUMA- Human B7-H11 protein mature extracellular domain. 122 23
1580 AAE20336 Homo sapiens HUMA- Human B7-H11 protein extracellular domain. 122 23

1580 gi2062702 Homo sapiens butyrophilin 122 23
1581 AAE18640 Homo sapiens INCY- Human G-protein coupled receptor (GCREC-1). 70 35
1581 118369751 Oryza sativa ethylene responsive protein 70 50
1581 gi15217292 Oryza sativa (japonica cultivar-group) Putative AP2 domain containing protein 70 50
1583 gi6468047 Homo sapiens Kruppel-like factor 85 73
1583 gi5916096 Homo sapiens Kruppel-like factor LKLF 85 73
1583 gi4583418 Homo sapiens Kruppel-like zinc finger transcription factor 85 73
1585 gi2570021 Homo sapiens paired box containing transcription factor 77 37
1585 13115988 Homo sapiens dJ394P2-1.1 (PAX-7) 77 37
1585 12570015 Homo sapiens alternative 77 37
1586 gi7861533 Rattus norvegicus retina specific protein PAL 72 43
1586 gi20977028 Xenopus laevis mitotic phosphoprotein 39 72 34
1586 AAB58458 Homo sapiens ROSE/ Lung cancer associated polypeptide sequence SEQ ID 796. 68 39
1587 gi5901864 Drosophila melanogaster BcDNA.LD27873 81 24
1587 gi15458514 Streptococcus pneumoniae Pneumococcal histidine triad protein D precursor 78 27
1587 15042400 Homo sapiens NFI-X3=transcription factor AA 75 30
1592 gi4210501 Homo sapiens BC85722_1 253 61
1592 gi14794910 Homo sapiens capicua protein 253 61
1592 114794914 Mus musculus capicua protein 253 61
1593 gi|8131854|gb|AAF73108.1 CAF Trypanosoma cruzi antigen JL8 69 34
1595 gi18892729 Pyrococcus furiosus DSM 3638 3-hydroxyisobutyrate dehydrogenase 70 27
1595 gi|20847046|ref|XP_136621.1 Mus musculus similar to Transcription factor BTF3 (RNA polymerase B transcription factor 3) 70 28
1595 gi|18977088|ref|NP_578445.1 Pyrococcus furiosus DSM 3638 3-hydroxyisobutyrate dehydrogenase 70 27
1597 AAU83621 Homo sapiens GETH Human PRO protein, Seq ID No 60. 151 42
1597 AA005826 Homo sapiens HYSE- Human polypeptide SEQ ID NO 19718. 146 83
1597 AAM41346 Homo sapiens HYSE- Human polypeptide SEQ ID NO 6277. 102 46
1598 AAM79503 Homo sapiens HYSE- Human protein SEQ ID NO 3149. 80 35
1598 AAM78519 Homo sapiens HYSE- Human protein SEQ ID NO 1181. 80 35

1598 gi18676526 Homo sapiens FLJ00160 protein 80 35
1599 gi2149640 Arabidopsis thaliana Argonaute protein 72 33
1599 gi15027491 respiratory syncytial virus glycoprotein 71 32
1599 gi|15221177|ref|NP_175274.1 Arabidopsis thaliana leaf development protein Argonaute 72 33
1601 gi17130010 Nostoc sp. PCC WD-40 repeat protein 136 28
1601 gi1653631 Synechocystis sp. PCC beta transducin-like protein 131 26
1601 gi17135261 Nostoc sp. PCC WD-40 repeat protein 115 27
1602 gi1103853 Rattus norvegicus rHAP1-A 89 33
1602 gi1103851 Rattus norvegicus huntingtin associated protein 89 33
1602 gi14579673 Takifugu rubripes pericentriolar material 1 protein 87 30
1603 gi537446 Arabidopsis thaliana AtHSP101 75 31
1603 gi12324908 Arabidopsis thaliana heat shock protein 101; 75 31
1603 gi6715468 Arabidopsis thaliana heat shock protein 101 75 31
1604 12190531 Vibrio cholerae methyl accepting chemotaxis protein 71 26
1604 gi9657614 Vibrio cholerae hemolysin secretion protein HlyB 71 26
1604 gi9655306 Vibrio cholerae heat shock protein E 70 35
1605 gi3912936 Geobacillus stearothermophilus ornithine carbamoyltransferase 68 31
1606 gi8797 Drosophila melanogaster CYS3HIS finger protein 678 51
1606 gi15291975 Drosophila melanogaster LD33756p 617 65
1606 gi6967181 Homo sapiens c399E4.1 (similar to D.melanogaster unkempt protein.) 549 75
1607 gi|21301783|gb|EAA13928.1 Anopheles gambiae str. PEST agCP8730 72 35
1607 gi|21361276|ref|NP_006075.2| Homo sapiens interferon-stimulated transcription factor 3, gamma (48kD); interferon-stimulated gene factor 3, gamma subunit (48 kD) 68 29
1609 gi2661094 Spinacia oleracea cold acclimation protein 76 32
1612 gi|1780975|emb|CAA71418.1| Human endogenous retrovirus K gag protein 312 34
1612 gi|5802810|gb|AAD51791.1| Homo sapiens Gag-Pro-Pol protein 309 34
1612 gi|887448|emb|CAA51306.1| Human endogenous retrovirus gag 309 34
1613 AA013889 Homo sapiens HYSE- Human polypeptide SEQ ID NO 27781. 73 42

1614 111065727 Homo sapiens dJ493F7.1 (similar to murine BET3) 347 100
1614 gi2791806 Mus musculus beta 253 69
1614 113277654 Mus musculus Bet3 homolog (S. cerevisiae) 253 69
1615 gi1122901 Saccharomyces cerevisiae MSP8 77 20
1615 gi825546 Saccharomyces cerevisiae CatBp 77 20
1615 gi17978563 Xenopus laevis Sp1-like zinc-finger protein XSPR-1 75 40
1616 AAY02536 Homo sapiens ICOS- Human ICAM-6 protein sequence. 458 98
1616 gi12248907 Homo sapiens TCAM-1 458 98
1616 gi4579740 Rattus norvegicus testicular cell adhesion molecule 1 (TCAM1) 366 76
1617 AAM67067 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27373. 271 64
1617 AAM54664 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26769. 271 64
1617 AAM56747 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 28852. 229 69
1618 gi5802814 Homo sapiens Gag-Pro-Pol-Env protein 532 52
1618 gi1780973 Human endogenous retrovirus K pol protein 531 52
1618 15802821 Homo sapiens Gag-Pro-Pol protein 531 52
1619 gi2769587 Mus musculus STOP protein 662 86
1619 gi1370291 Rattus norvegicus STOP protein 662 92
1619 gi3287265 Rattus norvegicus E-STOP protein 662 92
1620 AAM65980 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 26286. 266 100
1620 AAM53601 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 25706. 266 100

1620 gi|20270271|ref|NP_620082.1 Mus musculus RIKEN cDNA 1190017012 198 80
1621 gi11862941 Mus musculus DDM36E 74 33
1621 111862939 Mus musculus DDM36 74 33
1621 gi7650186 Mus musculus neighbor of Punc e11 protein 73 33
1622 gi3157464 Thermus sp. integral membrane protein 74 38
1623 gi|59977|emb|CAA78662.1| Human endogenous retrovirus tripartite fusion transcript 129 82
1623 gi|20161147|dbj|BAB90075.1| Oryza sativa (japonica cultivar-group) VsaA -like protein 88 32
1623 gi|17864474|ref|NP_524833.1 Drosophila melanogaster domino 87 41
1626 AA000498 Homo sapiens HYSE- Human polypeptide SEQ ID NO 14390. 99 43
1627 gi14041733 Xenorhabdus nematophila XptA2 protein 70 23
1627 gi|15641593|ref|NP_231225.1 Vibrio cholerae catalase 69 23
1628 gi19888204 Methanopyrus kandleri Site-specific DNA methylase 80 27
1628 gi6358691 Simian immunodeficiency virus Pol protein 78 32
1628 gi|20094956|ref|NP_614803.1| Methanopyrus kandleri Site-specific DNA methylase 80 27
1629 AAB07704 Homo sapiens INMR Protein encoded by the endogenetic fragment of HERV-W. 594 67
1629 gi8272464 Homo sapiens gag 594 67
1629 AAB07703 Homo sapiens INMR Protein encoded by the endogenetic fragment of HERV-W. 590 66
1630 gi32498 Homo sapiens precursor (AA -23 to 476) 145 100
1630 1339595 Homo sapiens triglyceride lipase precursor 145 100
1630 1386859 Homo sapiens hepatic lipase 145 100
1631 gi8777465 Rattus norvegicus cytoplasmic dynein heavy chain 703 77
1631 gi17019507 Tripneustes gratilla dynein heavy chain isotype 505 53
1631 AAB93815 Homo sapiens HELI- Human protein sequence SEQ ID NO:13606. 457 71
1632 AAM68837 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 29143. 122 48
1632 AAM56460 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 28565. 122 48
1632 gi17861826 Drosophila melanogaster GM01964p 90 51
1633 gi|21300783|gb|EAA12928.1| Anopheles gambiae str. PEST ebiP1105 77 33

1633 gi|19880523|gb|AAM00372.1|AF3 Bactrocera dorsalis vitellogenin 1 precursor 68 27
1633 gi|21070999|ref|NP_065911.1| Homo sapiens stromal interaction molecule 2 precursor 68 39
1637 gi2323287 multiple sclerosis associated retrovirus polyprotein 289 91
1637 gi|21103962|gb|AAM33141.1| Homo sapiens enverin-2 261 82
1637 gi|13310191|gb|AAK18189.1|AF331500_1 multiple sclerosis associated retrovirus element recombinant envelope protein 259 82
1638 AAR58809 Homo sapiens UYNY Human RPTP-gamma. 86 26
1638 gi292411 Homo sapiens receptor-type protein tyrosine phosphatase gamma 86 26
1638 gi1263069 Homo sapiens receptor tyrosine phosphatase gamma 86 26
1639 gi9857054 Leishmania major possible CG7055 protein 74 27
1639 gi|20853034|ref|XP_125962.1| Mus musculus expressed sequence AI447519 73 35
1639 gi|7008003|dbj|BAA90874.1| Mus musculus transcription factor MAZR 73 35

1640 AAG03810 Homo sapiens GEST Human secreted protein, SEQ ID NO: 7891. 220 95
1640 gi186800 Homo sapiens ribosomal protein L12 220 95
1640 gi57680 Rattus rattus ribosomal protein L12 220 95
1641 AAB44286 Homo sapiens GETH Human PRO1072 (UNQ529) protein sequence SEQ ID NO:303. 1709 100
1641 AAY41730 Homo sapiens GETH Human PRO1072 protein sequence. 1709 100
1641 gi14602625 Homo sapiens PAN2 protein 1709 100
1642 gi20147241 Arabidopsis thaliana AT5g09850/MYH9_6 74 32
1642 gi14329782 Homo sapiens dJ1121G12.3 (Novel gene) 72 28
1642 gi|16648730|gb|AAL25557.1| Arabidopsis thaliana AT5g09850/MYH9_6 74 32
1643 gi2952340 Rattus norvegicus insulin receptor substrate 89 31
1643 gi2653351 Bovine herpesvirus type 1.1 product of latency-related gene 83 30
1643 gi4511969 Homo sapiens insulin receptor substrate-2 82 26
1644 gi9964099 Chlamydia trachomatis inclusion membrane protein 73 35
1644 gi19171028 Encephalitozoon cuniculi ATP DEPENDENT DNA BINDING HELICASE (RAD3/XPD SUBFAMILY OF HELICASES) 67 29
1644 gi|9964095|gb|AAG09821.1|AF2793 Chlamydia trachomatis inclusion membrane protein 73 35
1646 gi|10863995|ref|NP_067011.1| Homo sapiens clones 23667 and 23775 zinc finger protein 67 42
1647 gi1196425 Homo sapiens envelope protein 93 39
1647 gi200296 Mus musculus perlecan 85 26
1647 gi8131894 Homo sapiens mitofilin 84 27
1648 gi1573040 Haemophilus influenzae Rd aspartokinase I / homoserine dehydrogenase I (thrA) 73 36
1648 gi8778726 Arabidopsis thaliana T25N20.14 73 31
1648 gi|16272063|ref|NP_438262.1| Haemophilus influenzae Rd aspartokinase I / homoserine dehydrogenase I (thrA) 73 36
1649 gi295642 Saccharomyces cerevisiae phospholipase C 79 36
1649 gi7548846 Saccharomyces cerevisiae delta class phosphoinositide-specific phospholipase C homolog 77 36
1649 gi161104 Schistosoma mansoni engrailed-like homeodomain protein 74 35
1651 gi|13129464|gb|AAK13122.1|AC080019_14 Oryza sativa (japonica cultivar-group) Polyprotein 66 40
1652 AAG81446 Homo sapiens ZYMO Human AFP protein sequence SEQ ID NO:410. 249 100

1652 gi18032212 Homo sapiens histone acetyltransferase MOZ2 89 34
1652 AAR34936 Homo sapiens UYJO CENP-B. 77 35
1653 gi20145484 Bos taurus SCO-spondin 71 29
1655 AAM86382 Homo sapiens HUMA- Human immune/haematopoietic antigen SEQ ID NO:13975. 129 55
1655 ABB03887 Homo sapiens HUMA- Human musculoskeletal system related polypeptide SEQ ID NO 1834. 118 62
1655 AAM75964 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 36270. 85 56

1659 gi38035 Homo sapiens p25 protein 110 45
1659 gi330915 Equine herpesvirus IR4 protein 99 28
1659 gi156606 Chironomus tentans SpId 84 30
1660 gi9654641 Vibrio cholerae 3-deoxy-D-manno-octulosonic-acid transferase 84 23
1660 gi|20835446|ref|XP_144409.1| Mus musculus similar to STARP antigen 73 25
1660 gi|15596880|ref|NP_250374.1| Pseudomonas aeruginosa probable sugar aldolase 72 26
1661 gi4062318 Escherichia coli Heat-responsive regulatory protein 79 36
1661 gi976025 Escherichia coli HrsA 79 36
1661 gi1786951 Escherichia coli K12 protein modification enzyme, induction of ompC 79 36
1662 AAM68588 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 28894. 155 100
1662 AAM56212 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 28317. 155 100

1662 gi3845169 Plasmodium falciparum phosphatase (acid phosphatase family) 66 52
1663 AAG89215 Homo sapiens GEST Human secreted protein, SEQ ID NO: 335. 218 100
1663 gi20070921 Mus musculus RIKEN cDNA 2410008M22 gene 130 55
1663 AAR77602 Homo sapiens FORSI Human circulating cytokine CC-1 C-terminal fragment. 92 44
1664 AAE18212 Homo sapiens CURA- Human MOL4 protein. 75 47
1664 AAM00966 Homo sapiens HYSE- Human bone marrow protein, SEQ ID NO: 442. 72 35
1665 AAB92828 Homo sapiens HELI- Human protein sequence SEQ ID NO:11365. 74 93
1665 AAG63852 Homo sapiens INCY- Amino acid sequence of human GTPase activating protein GTPAP2. 74 93
1665 AAG63851 Homo sapiens INCY- Amino acid sequence of human GTPase activating protein GTPAP1. 74 93

1666 AAM72897 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 33203. 135 65
1666 AAM60268 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 32373. 135 65
1666 gi4007097 Homo sapiens dJ1118D24.2 (60S Ribosomal Protein L10 LIKE) 135 65
1667 gi212267 Gallus gallus cartilage link protein 917 49
1667 gi2010 Sus scrofa link protein precursor (AA -15 to 339) 913 51
1667 gi459439 Equus caballus link protein 910 51
1668 gi10443237 Mus musculus splicing factor 3a, subunit 276 36
1668 gi396743 Podocoryne carnea Pod-EPPT 276 30
1668 gi294131 Plasmodium falciparum circumsporozoite protein 266 22
1669 AAM49641 Homo sapiens BOEH Human tumour-associated antigen B345 protein SEQ ID NO 4. 132 65
1669 AAU12252 Homo sapiens GETH Human PRO5773 polypeptide sequence. 132 65
1669 AAY91592 Homo sapiens HUMA- Human secreted protein sequence encoded by gene NO:265. 132 65

1670 gi4835383 Homo sapiens alias DLC1 226 47
1670 gi4704343 Homo sapiens alias DLC1; candidate tumor suppressor gene 226 47
1670 gi155627 Acanthamoeba castellanii myosin I heavy chain 118 42
1671 ABB12490 Homo sapiens HYSE- Human bone marrow expressed protein SEQ ID NO: 329. 237 88
1671 gi6002932 Streptomyces fradiae glycosyltransferase 67 35
1671 gi|9634613|ref|NP_038150.1| Human papillomavirus type 69 L1 65 39
1672 gi13938013 Homo sapiens Similar to RIKEN cDNA gene 333 66
1672 gi2388970 Schizosaccharomyces pombe tat-binding homolog 7, AAA ATPase family protein 235 41
1672 gi6850321 Arabidopsis thaliana Contains similarity to YTA7 ATPase gene from Saccharomyces cerevisiae gb|X81072, and contains Bromodomain PF|00439, AAA PF|00004, and Sigma-54 PF|00158 transcription factor domains. 214 40

1673 gi11066113 Drosophila melanogaster Misexpression suppressor of ras 4 71 29
1673 gi|20829387|ref|XP_129540.1| Mus musculus RIKEN cDNA 4930455F23 77 27
1673 gi|17647635|ref|NP_523775.1| Drosophila melanogaster Misexpression suppressor of ras 4 71 29
1674 gi|20535935|ref|XP_115787.1| Homo sapiens similar to splicing coactivator subunit SRm300; RNA binding protein; AT-rich element binding factor 75 37
1674 gi|17544226|ref|NP_500151.1| Caenorhabditis elegans Y76B12C.4.p 72 34
1674 gi|17559826|ref|NP_505799.1| Caenorhabditis elegans sepB domain 70 26
1675 gi5708067 Oryctolagus cuniculus hyperpolarization activated cation channel 99 27
1675 gi402558 Canis familiaris mucin 98 27
1675 gi10636484 Homo sapiens polyglutamine-containing protein 96 26
1676 AAM95365 Homo sapiens HUMA- Human reproductive system related antigen SEQ ID NO: 4023. 73 26
1676 AAB56709 Homo sapiens ROSEI Human prostate cancer antigen protein sequence SEQ ID NO:1287. 72 34
1676 gi1881288 Bacillus subtilis FUNCTION UNKNOWN, SIMILAR PRODUCT IN E.COLI, H. INFLUENZAE AND NEISSERIA MENINGITIDIS. 71 30

1677 gi|15892512|ref|NP_360226.1| Rickettsia conorii phosphatidate cytidylyltransferase [EC:2.7.7.41] 65 34
1679 gi14231 Saccharomyces cerevisiae NADH dehydrogenase (ubiquinone) 75 31
1679 gi805022 Saccharomyces cerevisiae Ndi1p 73 31
1679 gi1353352 Chlamydomonas reinhardtii alanine aminotransferase 70 27
1680 gi1805421 Bacillus subtilis surfactin production 77 36
1680 gi396482 Bacillus subtilis srfA2 77 36
1680 gi516360 Bacillus subtilis surfactin synthetase 77 36
1681 AAG64494 Homo sapiens SHAN- Human natriuretic peptide receptor 18. 156 80
1681 AAE16275 Homo sapiens INCY- Human kinase PKIN-21 protein. 154 73
1681 AAM40599 Homo sapiens HYSE- Human polypeptide SEQ ID NO 5530. 154 73

1682 gi2323287 multiple sclerosis associated retrovirus polyprotein 1646 75
1682 gi|2351212|dbj|BAA22064.1| Friend murine leukemia virus gag-pol polyprotein (precursor protein) 807 40
1682 gi|9626961|ref|NP_057933.1| Murine leukemia virus Pr180 802 40
1683 AAM39205 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2350. 457 53
1683 gi3033415 Gibbon ape leukemia virus gag polyprotein 353 38
1683 gi|6524623|gb|AAF15097.1| Phascolarctos cinereus gag protein 343 38
1684 gi19110438 Homo sapiens polycystin-1L1 712 98
1684 gi6361629 Periplaneta americana vitellogenin 81 25
1684 gi3115393 Rana pipiens guanylate cyclase inhibitory protein 80 35
1686 AAY91542 Homo sapiens HUMA- Human secreted protein sequence encoded by gene NO:215. 212 84

1686 gi1279841 Bos taurus glycine transporter 72 36
1686 gi19879917 Oryza sativa acid phosphatase 70 35
1687 gi12056568 Homo sapiens MSTP063 212 88
1687 gi13539684 Homo sapiens zinc finger protein 291 212 88
1687 gi|12056568|gb|AAG47945.1|AF119 Homo sapiens MSTP063 212 88
1689 gi5689766 Homo sapiens zinc finger 2.2 222 91
1689 AAU16267 Homo sapiens HUMA- Human novel secreted protein, Seq ID 1220. 178 58
1689 AAB99950 Homo sapiens SHAN- Human alkylated-DNA-protein cysteine methyltransferase 14. 177 60
1690 gi3328880 Chlamydia trachomatis Protein Export 73 29
1690 gi2832232 Brucella melitensis biovar Abortus flagellin; FliC 67 29
1690 gi17984285 Brucella melitensis FLAGELLIN 67 29
1692 gi4927443 Haemophilus influenzae hemoglobin/hemoglobin-haptoglobin binding protein 93 80
1692 gi4204775 Haemophilus influenzae hemoglobin and hemoglobin-haptoglobin binding protein 93 80
1692 gi3647226 Haemophilus influenzae hemoglobin binding protein 93 80
1694 AAW95631 Homo sapiens GEMY Homo sapiens secreted protein gene clone hj968_2. 102 100
1694 gi13162186 Homo sapiens calsyntenin-3 protein 102 100

1695 AAO04205 Homo sapiens HYSE- Human polypeptide SEQ ID NO 18097. 81 37
1695 gi160180 Plasmodium cynomolgi circumsporozoite antigen 81 29
1695 gi495522 Plasmodium simiovale circumsporozoite protein 80 30
1696 AAM80223 Homo sapiens HYSE- Human protein SEQ ID NO 3869. 252 66
1696 AAM79239 Homo sapiens HYSE- Human protein SEQ ID NO 1901. 252 66
1696 gi3688394 Homo sapiens triple LIM domain protein 252 66
1697 gi19887715 Methanopyrus kandleri Predicted membrane protein 74 28
1698 AAM93184 Homo sapiens HELI- Human polypeptide, SEQ ID NO: 2552. 269 87
1698 gi18044066 Mus musculus RIKEN cDNA 5033406L14 gene 226 76
1698 AAB95302 Homo sapiens HELI- Human protein sequence SEQ ID NO:17538. 194 78

1699 ABB17279 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 5936. 110 56
1699 AAO13013 Homo sapiens HYSE- Human polypeptide SEQ ID NO 26905. 101 71
1699 gi|7650258|gb|AAF65960.1|AF20777 Hepatitis C virus polyprotein 74 28
1700 gi12697585 Arabidopsis thaliana 4-(cytidine 5'-phospho)-2-C-methyl-D-erithritol kinase 69 40
1701 gi16740569 Homo sapiens Similar to thymus expressed gene 3 84 27
1701 gi17940760 Mus musculus cask-interacting protein 79 26
1701 gi17940758 Homo sapiens cask-interacting protein 77 26
1702 gi17385401 Homo sapiens TPIP alpha lipid phosphatase 234 62
1702 AAU75783 Homo sapiens INCY- Human protein phosphatase (PP1) protein sequence. 208 57
1702 AAG67638 Homo sapiens HELI- Amino acid sequence of a human protein. 202 56

1703 AAO07887 Homo sapiens HYSE- Human polypeptide SEQ ID NO 21779. 246 85
1703 AAO08651 Homo sapiens HYSE- Human polypeptide SEQ ID NO 22543. 239 83
1703 AAO08732 Homo sapiens HYSE- Human polypeptide SEQ ID NO 22624. 221 80
1704 AAB94588 Homo sapiens HELI- Human protein sequence SEQ ID NO:15392. 82 52
1704 gi3288914 Mus musculus aortic carboxypeptidase-like protein ACLP 82 24
1704 AAM93437 Homo sapiens HELI- Human polypeptide, SEQ ID NO: 3074. 81 32

1706 AAM86104 Homo sapiens HUMA- Human immune/haematopoietic antigen SEQ ID NO:13697. 179 100
1706 gi10039425 Equus caballus ALR protein 120 40
1706 gi20502826 Eimeria maxima cGMP-dependent protein kinase 115 35
1707 AAM70251 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 30557. 115 78
1707 AAM57834 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 29939. 115 78

1707 gi15450860 Arabidopsis thaliana serine/threonine-protein kinase Mak (male germ cell-associated kinase)-like protein 71 56
1708 gi1620403 Homo sapiens SF1-Bo isoform 82 41
1708 gi19072991 Hypocrea virens class III chitinase precursor 82 40
1708 gi18765873 Hypocrea virens class III chitinase 82 40
1709 AAM52240 Homo sapiens INCY- Human MFAP4 SEQ ID NO 3. 1384 100
1709 gi790817 Homo sapiens microfibril-associated glycoprotein 4 1384 100
1709 AAM52239 Homo sapiens INCY- Human MAG4V SEQ ID NO 1. 1374 100
1710 gi16769882 Drosophila melanogaster SD07884p 67 27
1710 gi|17545505|ref|NP_518907.1| Ralstonia solanacearum CONSERVED HYPOTHETICAL PROTEIN 66 41
1711 AAU82954 Homo sapiens ANAD- Human homologue of MPT1 protein target for antifungal compound. 111 27
1711 gi2058326 Homo sapiens subunit of RNA polymerase II transcription factor TFIID 111 27
1711 gi13559031 Homo sapiens bA11M20.1 (TATA box binding protein (TBP)-associated factor, RNA polymerase II, C1, 130kD) 108 26
1712 AAB65626 Homo sapiens SUGE- Novel protein kinase, SEQ ID NO: 152. 209 82
1712 AAM25283 Homo sapiens HYSE- Human protein sequence SEQ ID NO:798. 209 82
1712 AAU17269 Homo sapiens HUMA- Novel signal transduction pathway protein, Seq ID 834. 176 67

1713 gi18256065 Mus musculus Similar to ATPase, class II, type 9A 127 67
1713 AAM76495 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 36801. 123 70
1713 AAM63681 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 35786. 123 70
1714 gi8096269 Nicotiana tabacum KED 149 28
1714 gi1752736 Saccharomyces cerevisiae gene required for phosphorylation of oligosaccharides/ has high homology with YJR061w 148 30
1714 gi2292986 Rattus norvegicus cyclic nucleotide-gated channel beta subunit 141 28
1715 AAM72995 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 33301. 158 47
1715 AAM60359 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 32464. 158 47
1715 gi|13539605|emb|CAC35733.1| Paramecium tetraurelia cyclophilin-RNA interacting protein 144 45

1716 AAM71015 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 31321. 251 64
1716 AAM58517 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 30622. 251 64
1716 AAU19766 Homo sapiens HUMA- Human novel extracellular matrix protein, Seq ID No 416. 161 44
1718 gi1420924 Zea mays IN1 75 27
1718 gi|14521970|ref|NP_127447.1| Pyrococcus abyssi O-sialoglycoprotein endopeptidase 73 35
1719 gi20513851 Hordeum vulgare BPM 74 35
1719 gi21039126 Cryptosporidium parvum 60 kDa glycoprotein 74 26
1719 gi207158 Rattus norvegicus big tau 73 36
1720 gi18181943 Caenorhabditis elegans heparan sulfate GlcNAc transferase-I/II 67 34
1720 gi2058699 Caenorhabditis elegans multiple exostoses homolog 67 34
1720 gi|17554740|ref|NP_499368.1| Caenorhabditis elegans MULTIPLE EXOSTOSES HOMOLOG 2 67 34

1721 AAM69150 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 29456. 200 38
1721 AAM56769 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 28874. 200 38
1721 gi4185947 Human endogenous retrovirus K pol protein 196 38
1722 gi2065210 Mus musculus Pro-Pol-dUTPase polyprotein 615 60
1722 gi18676710 Homo sapiens FLJ00254 protein 592 60
1722 gi|20469453|ref|XP_114040.1| Homo sapiens similar to FLJ00254 protein 283 50
1723 gi13881755 Mycobacterium tuberculosis cation efflux system protein 74 30
1724 AAG78866 Homo sapiens SHAN- Human zinc finger protein 15. 141 68
1724 ABB17928 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 6585. 99 53
1724 gi|21295712|gb|EAA07857.1| Anopheles gambiae str. PEST agCP1631 75 26
1725 gi21104340 Homo sapiens obscurin 1586 83
1725 gi7024535 Gallus gallus structural muscle protein titin 207 24
1725 gi1513030 Gallus gallus connectin/titin 207 24
1727 AAE19162 Homo sapiens THOR/ Human kinase polypeptide (PKIN-20). 1096 99

1727 gi2736151 Rattus norvegicus myotonic dystrophy kinase-related Cdc42-binding kinase 902 78
1727 gi1695873 Homo sapiens ser-thr protein kinase 896 77
1728 AAY99411 Homo sapiens GETH Human PRO1487 (UNQ756) amino acid sequence SEQ ID NO:260. 862 67
1728 gi15617453 Homo sapiens chondroitin synthase 862 67
1728 AAE15959 Homo sapiens EUMO- Human 4589624/92-303 protein, member of Fringe and Brainiac family. 761 79
1729 gi|15804980|ref|NP_290960.1| Escherichia coli O157:H7 EDL933 Uncharacterized conserved protein 71 33
1731 gi14268490 Musca domestica hunchback 82 33
1731 AAM93401 Homo sapiens HELI- Human polypeptide, SEQ ID NO: 3002. 76 27
1731 gi2076606 Musca domestica hunchback zinc finger protein 73 30
1732 AAY91949 Homo sapiens INCY- Human cytoskeleton associated protein 4 (CYSKP-4). 1047 57
1732 ABB90754 Homo sapiens UYJO Human Tumour Endothelial Marker polypeptide SEQ ID NO 240. 1043 57
1732 gi619577 Gallus gallus cardiac muscle tensin 1043 56
1733 gi3090889 Homo sapiens synapsin IIIa 70 38
1733 gi6572355 Homo sapiens cE86D10.1 (synapsin III) 70 38
1733 gi|19924105|ref|NP 81.2 Homo sapiens synapsin III, isoform IIIa 70 38
1734 AAB85144 Homo sapiens HUMA- Human NKCR polypeptide (clone ID HMSOM53). 1506 93

1734 gi4973126 Mus musculus castaneus high affinity immunoglobulin gamma Fc receptor I 490 39
1734 gi4973124 Mus musculus high affinity immunoglobulin gamma Fc receptor I 489 39
1735 gi|15597595|ref|NP_251089.1| Pseudomonas aeruginosa pyoverdine synthetase D 69 30
1736 gi14488302 Oryza sativa Putative transposon protein 81 24
1736 gi3851516 Phytophthora infestans cyst germination specific acidic repeat protein precursor 72 33
1736 gi|14488302|gb|AAK63883.1|AC074 Oryza sativa Putative transposon protein 81 24
1737 AAB85357 Homo sapiens INCY- Human phosphatase (PP) (clone ID 3402521CD1). 1591 100
1737 gi21205864 Homo sapiens T-cell activation protein phosphatase 2C; TA-PP2C 1591 100

1737 gi21464366 Drosophila melanogaster RE06653p 758 52
1738 gi7271811 Drosophila melanogaster GTPase activating protein 292 38
1738 AAM76430 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 36736. 246 100
1738 AAM63615 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 35720. 246 100
1739 ABB50365 Homo sapiens HUMA- Human secreted protein encoded by gene 65 SEQ ID NO:313. 272 87
1739 AAW88598 Homo sapiens HUMA- Secreted protein encoded by gene 65 clone HFVHY45. 272 87
1739 ABB50764 Homo sapiens HUMA- Human secreted protein encoded by gene 65 SEQ ID NO:716. 143 92

1740 gi2065210 Mus musculus Pro-Pol-dUTPase polyprotein 1210 58
1740 gi|10834720|gb|AAG23790.1|AF258 Homo sapiens PP565 274 80
1740 gi|385615|gb|AAB26708.1| Mus sp. fibulin gene homolog 248 75
1741 ABB90748 Homo sapiens UYJO Human Tumour Endothelial Marker polypeptide SEQ ID NO 228. 2116 97
1741 gi15987493 Homo sapiens tumor endothelial marker 2116 97
1741 ABB90754 Homo sapiens UYJO Human Tumour Endothelial Marker polypeptide SEQ ID NO 240. 530 37

1742 ABB11753 Homo sapiens HYSE- Human NOV/plexin-A1 homologue, SEQ ID NO:2123. 291 90
1742 gi1665757 Mus musculus plexin 1 291 90
1742 gi6010217 Homo sapiens NOV/plexin-A1 protein 291 90
1743 AAM79514 Homo sapiens HYSE- Human protein SEQ ID NO 3160. 149 90
1743 AAM78530 Homo sapiens HYSE- Human protein SEQ ID NO 1192. 149 90
1743 gi1244510 Homo sapiens p311 protein 149 90
1744 AAG93324 Homo sapiens NISC- Human protein HP10370. 83 41
1744 gi21064771 Drosophila melanogaster RH61467p 83 46
1744 gi18676554 Homo sapiens FLJ00174 protein 77 41
1745 gi4128039 Homo sapiens TL132 protein 81 29
1745 gi17983118 Brucella melitensis METAL DEPENDENT HYDROLASE 74 23
1745 AAU75578 Homo sapiens UYNA- Human ubiquitin specific protease 10 (USP 10). 71 31
1746 gi15074154 Sinorhizobium meliloti PUTATIVE FATTY ACID/PHOSPHOLIPID SYNTHESIS PROTEIN 76 25

1746 gi1869833 human herpesvirus myristylated tegument protein 75 27
1746 gi20516045 Thermoanaerobacter tengcongensis Chemotaxis response regulator CheB, consists of CheY-like receiver domain and a methylesterase (demethylase) domain 69 20
1747 gi18025496 cercopithecine herpesvirus EBNA-1 124 37
1747 gi5821153 Homo sapiens RNA binding protein 123 29
1747 gi6649242 Homo sapiens splicing coactivator subunit SRm300 123 29
1748 gi|4321764|gb|AAD 9.1| Mus musculus MAP kinase kinase 7 alpha 65 30
1748 gi|20859704|ref|XP_133986.1| Mus musculus mitogen activated protein kinase kinase 7 65 30
1748 gi|4321768|gb|AAD 1.1| Mus musculus MAP kinase kinase 7 beta 65 30

1749 AAB50964 Homo sapiens GETH Human PRO1313 protein. 439 89
1749 AAB47290 Homo sapiens GETH PRO1313 polypeptide. 439 89
1749 AAB24431 Homo sapiens GETH Human PRO1313 protein sequence SEQ ID NO:216. 439 89
1750 AAU00502 Homo sapiens MILL- Human TANGO 437 protein. 115 91
1750 gi20384654 Homo sapiens two-pore calcium channel protein 2 115 91
1750 AAM91059 Homo sapiens HUMA- Human immune/haematopoietic antigen SEQ ID NO:18652. 93 64
1751 gi10440494 Homo sapiens FLJ00092 protein 252 97
1751 AAM40956 Homo sapiens HYSE- Human polypeptide SEQ ID NO 5887. 80 30
1751 gi|10440494|dbj|BAB 80.1 Homo sapiens FLJ00092 protein 252 97
1752 gi15980036 Yersinia pestis 2-dehydro-3-deoxyphosphooctonate aldolase 77 46
1752 gi11322261 Diceros bicornis alpha adrenergic receptor 2B 74 26
1752 gi20516240 Thermoanaerobacter tengcongensis methylaspartate mutase 73 25
1753 gi19684014 Homo sapiens similar to brain-specific angiogenesis inhibitor 3 (H. sapiens) 1387 99
1753 AAB88367 Homo sapiens HELI- Human membrane or secretory protein clone PSEC0101. 1380 99

1753 gi1469936 Mus musculus FGF-binding protein 158 29
1754 AAB01397 Homo sapiens INCY- Neuron-associated protein. 435 92
1754 gi21218140 Homo sapiens rab effector MYRIP 435 92
1754 gi21320161 Mus musculus exophilin 8 378 77
1755 AAM74815 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 35121. 253 75
1755 AAM62013 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 34118. 253 75
1755 AAM70390 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 30696. 228 62
1756 gi6460201 Deinococcus radiodurans phenylacetic acid degradation protein PaaA 85 27
1756 gi3309543 Takifugu rubripes MLL 79 34
1756 AAT10059_aa1 Homo sapiens USSH erbB-3 cDNA clone E3-16. 74 31
1757 gi18676406 Homo sapiens FLJ00021 protein 70 36
1758 gi13423395 Caulobacter crescentus NADH dehydrogenase I, M subunit 78 37

1758 gi|17506337|ref|NP_491390.1| Caenorhabditis elegans D1007.15.p 82 24
1758 gi|16126181|ref|NP_420745.1| Caulobacter crescentus NADH dehydrogenase I, M subunit 78 37
1759 gi19881193 chimpanzee cytomegalovirus transcriptional transactivator 83 29
1759 gi19881161 chimpanzee cytomegalovirus transcriptional transactivator 83 29
1759 gi556297 Mus musculus alpha-1 type IV collagen 81 33
1760 gi18033185 Danio rerio UNC45-related protein 702 79
1760 AAG77802 Homo sapiens HUMA- Human HOGEN50 serine/threonine phosphatase protein sequence. 603 65
1760 AAM40290 Homo sapiens HYSE- Human polypeptide SEQ ID NO 3435. 603 65

1761 gi6634123 Drosophila melanogaster SoxNeuro 70 24
1762 gi|14245700|dbj|BAB56142.1| Giardia intestinalis kinesin-like protein 69 26
1762 gi|165011|gb|AAA31246.1| Oryctolagus cuniculus eucaryotic release factor (eRF) 69 24
1762 gi|15559188|emb|CAC03424.2| Homo sapiens dJ45P21.3 (butyrophilin, subfamily 3, member A1) 69 26
1763 AAM93661 Homo sapiens HELI- Human polypeptide, SEQ ID NO: 3536. 186 80
1763 AAM64398 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 36503. 154 76
1763 gi|20556958|ref|XP_061562.5| Homo sapiens similar to PAM COOH-terminal interactor protein 1 73 43
1764 AAU17223 Homo sapiens HUMA- Novel signal transduction pathway protein, Seq ID 788. 211 87

1765 gi1334546 Podospora anserina Dod COI 113 grp IB protein 71 37
1765 gi5679307 Mus musculus ROR gamma t 70 27
1765 gi4186077 Mus musculus ROR gamma T protein 70 27
1766 gi17864081 Mus musculus PPAR gamma coactivator-1beta protein 74 26
1766 gi44795 Methanococcus voltae polyferredoxin 71 28
1766 gi14279670 Lycopersicon esculentum verticillium wilt disease resistance protein 71 31
1768 AAE06588 Homo sapiens SAGA Human protein having hydrophobic domain, HP10778. 165 100
1768 AAM40979 Homo sapiens HYSE- Human polypeptide SEQ ID NO 5910. 165 100
1768 AAB24542 Homo sapiens HUMA- Human secreted protein sequence encoded by gene NO:168. 73 30

1769 gi6174840 Achromobacter xylosoxidans subsp. xylosoxidans low-specificity D-threonine aldolase 78 33
1769 gi16769806 Drosophila melanogaster SD02660p 75 23
1769 gi1098473 Rattus norvegicus insulin-like growth factor binding protein 73 31
1770 AAP94684 Homo sapiens CHIL Amino acid sequence encoded by part of human mannose binding protein (hMBP) genomic DNA. 79 56
1770 gi|15790548|ref|NP 72.1| Halobacterium sp. NRC-1 cobyric acid synthase; CbiP 69 36
1770 gi|11467609|ref|NP_050661.1| Guillardia theta Clp protease ATP binding subunit 69 27
1772 gi5532460 Shigella flexneri ShiF 66 32
1773 gi11544663 Arabidopsis thaliana PTPKIS1 75 42
1773 gi11595504 Arabidopsis thaliana PTPKIS1 protein 75 42
1773 gi18389331 Mus musculus 2',5'-oligoadenylate synthetase-like 10 73 42
1774 AAM06519 Homo sapiens HYSE- Human foetal protein, SEQ ID NO: 250. 414 90
1774 gi|18552248|ref|XP_092510.1| Homo sapiens similar to latent transforming growth factor beta binding protein 1; latent TGF beta binding protein 69 37
1775 gi4884924 Rangiferine herpesvirus glycoprotein C 67 60
1775 AAB94152 Homo sapiens HELI- Human protein sequence SEQ ID NO:14435. 65 34
1775 AAB93253 Homo sapiens HELI- Human protein sequence SEQ ID NO:12271. 65 34

1776 gi13424176 Caulobacter crescentus N-carbamyl-L-amino acid amidohydrolase 89 24
1776 gi514267 Homo sapiens proto-oncogene tyrosine-protein kinase 86 29
1776 gi28237 Homo sapiens 150 protein (AA 1-1130) 84 28
1777 gi63370 Gallus gallus dystrophin (AA 1 - 3660) 68 31
1777 gi|3046783|emb|CAA68033.1| Scyliorhinus canicula dystrophin 67 29
1777 gi|2342682|gb|AAB70406.1| Arabidopsis thaliana Contains similarity to Rattus AMP-activated protein kinase (gb|X95577). 67 31
1778 AAE16176 Homo sapiens INCY- Human G-protein coupled receptor 7 (GCREC-7) protein. 1419 100
1778 AAE18021 Homo sapiens CURA- Human G-protein coupled receptor-8a (GPCR-8a) protein. 1419 100
1778 AAG72411 Homo sapiens VEDA Human OR-like polypeptide query sequence, SEQ ID NO: 2092. 1419 100
1779 AAM76040 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 36346. 93 48

1779 AAM63227 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 35332. 93 48
1779 gi12620576 Bradyrhizobium japonicum ID342 87 24
1780 gi2459833 Rattus norvegicus Maxp1 81 31
1780 AAB65650 Homo sapiens SUGE- Novel protein kinase, SEQ ID NO: 177. 80 35
1780 AAM39805 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2950. 80 36
1781 gi4877963 Mus musculus NF-kappaB inducing kinase 69 39
1781 gi15077865 Mus musculus bullous pemphigoid antigen 1-b 67 35
1781 gi15077863 Mus musculus bullous pemphigoid antigen 1-a 67 35
1782 gi4138265 Nicotiana tabacum Avr9 elicitor response protein 76 27
1782 gi12725153 Lactococcus lactis subsp. lactis 50S ribosomal protein 75 32
1782 AAB21008 Homo sapiens INCY- Human nucleic acid-binding protein, NuABP-12. 73 32

1783 gi3947714 Streptococcus agalactiae initiation factor IF2 86 20
1783 gi9558387 Streptococcus agalactiae initiation factor 2 86 20
1783 gi9558369 Streptococcus agalactiae initiation Factor 2 86 20
1786 gi435855 Mus sp. CREB-binding protein; CBP 75 22
1786 gi2911464 Leishmania tarentolae sodium stibogluconate resistance protein 75 34
1786 gi19547887 Mus musculus CREB-binding protein 75 22
1787 gi3747099 Mus musculus C1q-related factor 616 61
1787 gi14278927 Mus musculus gliacolin 615 64
1787 gi10566471 Mus musculus Gliacolin 615 64
1788 gi|21291197|gb|EAA03342.1| Anopheles gambiae str. PEST agCP7579 71 20
1788 gi|20803964|emb|CAD31541.1| Mesorhizobium loti HYPOTHETICAL PROTEIN 69 43
1789 AAM41125 Homo sapiens HYSE- Human polypeptide SEQ ID NO 6056. 320 80
1789 AAM39339 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2484. 320 80
1789 AAM79857 Homo sapiens HYSE- Human protein SEQ ID NO 3503. 320 80

1790 gi1143585 Paracentrotus lividus 2 alpha fibrillar collagen 69 23
1791 gi9837427 Lytechinus variegatus embryonic blastocoelar extracellular matrix protein precursor 116 34
1791 gi14089698 Mycoplasma pulmonis OLIGOPEPTIDE ABC TRANSPORTER PERMEASE PROTEIN 71 23
1791 gi6572111 Bartonella quintana riboflavin synthase alpha chain 69 29
1792 gi|4506023|ref|NP_002710.1| Homo sapiens protein phosphatase 2, regulatory subunit B (B56), gamma isoform 68 39
1793 AAM71170 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 31476. 180 82
1793 AAM58664 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 30769. 180 82
1793 AAM65679 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 37784. 168 71
1794 AAG00072 Homo sapiens GEST Human secreted protein, SEQ ID NO: 4153. 125 80
1794 AAW34618 Homo sapiens IMUT- Human C3 protein mutant DV-7N. 125 80
1794 AAW34617 Homo sapiens IMUT- Human C3 protein mutant DV-6. 125 80
1795 AAY05069 Homo sapiens SMIK Human PIGR-2 protein sequence. 1055 85

1795 gi396170 Homo sapiens CMRF-35 antigen 406 45
1795 gi18490143 Homo sapiens CMRF35 leukocyte immunoglobulin-like receptor 406 45
1796 gi|6723273|dbj|BAA89659.1| Baboon endogenous virus strain gag-pol precursor polyprotein 421 41
1796 gi|13940448|gb|AAK50381.1|U43202 Murine leukemia virus pol precursor protein 421 41
1796 gi|331995|gb|AAB03091.1| AKV murine leukemia virus gag-pol polyprotein (tag amber codon at 2250-2252 inserts Gln in Mo-MuLV) 421 41
1797 gi21411325 Homo sapiens Similar to LOC205103 260 73
1797 gi|4835878|gb|AAD30280.1|AF1348 Homo sapiens endocytic receptor Endo180 77 31
1797 gi|16076075|emb|CAC94295.1| Leishmania donovani donovani trypanothione reductase 70 30
1798 gi927721 Saccharomyces cerevisiae Sip1p: SNF1 protein kinase substrate; YDR422C; CAI: 0.13 72 34
1798 gi172604 Saccharomyces cerevisiae protein kinase 72 34
1798 gi|6320630|ref|NP_010710.1| Saccharomyces cerevisiae SNF1 protein kinase substrate; Sip1p 72 34
1799 gi|20839768|ref|XP_130311.1| Mus musculus similar to GDP-fucose transporter 1 71 29
1801 gi|17461642|ref|XP 49.1| Homo sapiens similar to Ig kappa chain 78 23
1801 gi|6325342|ref|NP_015410.1| Saccharomyces cerevisiae Protein required for cell viability; Ypr085cp 76 22
1801 gi|9635081|ref|NP_057809.1| Gallid herpesvirus UL47 74 26

1802 AAB94148 Homo sapiens HELI- Human protein sequence SEQ ID NO:14427. 250 56
1802 AAG64564 Homo sapiens SHAN- Human zinc-finger protein 60. 250 56
1802 AAM79356 Homo sapiens HYSE- Human protein SEQ ID NO 3002. 250 56

1803 AAW81754 Homo sapiens BOEF Human Fanconi anaemia-associated gene II protein. 631 85
1803 gi2407911 Homo sapiens differentially expressed in Fanconi anemia 555 74
1803 gi6013073 Mus musculus HemT-3 protein 89 24
1805 gi14189735 Homo sapiens ATP-binding cassette transporter family A member 12 1508 90
1805 gi1943947 Bos taurus ABC transporter 404 31
1805 AAZ94734_aa1 Homo sapiens FARB Human ATP binding cassette ABCA1 (ABC1) cDNA. 395 33
1806 AAU12234 Homo sapiens GETH Human PRO4350 polypeptide sequence. 859 100
1806 AAA96344_aa1 Homo sapiens GETH cDNA encoding a novel polypeptide designated PRO4357. 498 48
1806 AAU12445 Homo sapiens GETH Human PRO4357 polypeptide sequence. 498 48
1807 gi190396 Homo sapiens profilaggrin 76 29
1808 AAB88367 Homo sapiens HELI- Human membrane or secretory protein clone PSEC0101. 74 30

1808 g119684014Homo Sapienssimilar to brain-specific74 30 angiogenesis inhibitor 3 (H. Sapiens) 1808 gi~18576362~Homo Sapienssimilar to fibroblast 74 30 growth factor re~XP_0844 binding protein 1 81.1 1809 g1530876 Chlamydomonasamino acid feature: Rod 126 35 protein reinhardtiidomain, as 266 .. 468;
amino acid feature: globular protein domain, as 32 .. 265 1809 g16578849Myxococcus FrgA 126 29 xanthus 1809 12429362 Santalum proline rich protein 122 27 album 1810 g117428288Ralstonia PROBABLE CATION- 75 28 solanacearumTRANSPORTING ATPASE

LIPOPROTEIN TRANSMEMBRANE

1810 g121483422Drosophila LD34142p 71 29 melano aster 1810 ABB90042 Homo SapiensHUMA- Human polypeptide 70 32 SEQ ID

NO 2418.

1811  gi|20915248|ref|XP_145160.1  Mus musculus  similar to Collagen alpha 1(VI) chain precursor  148  74
1812  gi2104558  Rattus norvegicus  CCA3  1150  90
1812  AAB64963  Homo sapiens  ROSE/ Human secreted protein sequence encoded by gene NO:141.  172  37
1812  gi12963869  Mus musculus  gene trap ankyrin repeat containing protein  172  37
1813  AAB65201  Homo sapiens  GETH Human PRO1009 (UNQ493) protein sequence SEQ ID NO:194.  208  100
1813  AAY66678  Homo sapiens  GETH Membrane-bound protein PRO1009.  208  100
1813  AAB24068  Homo sapiens  GETH Human PRO1009 protein sequence SEQ ID NO:36.  208  100
1815  AAG89314  Homo sapiens  GEST Human secreted protein, SEQ ID NO: 434.  191  100
1815  gi6460052  Deinococcus radiodurans  dipeptidyl peptidase IV-related protein  66  60
1816  gi1052594  Drosophila melanogaster  trithorax protein trxI  75  26
1816  gi1052593  Drosophila melanogaster  trithorax protein trxII  75  26
1816  gi158818  Drosophila melanogaster  zinc-binding protein  75  26
1817  AAB49765  Homo sapiens  HELI- Human proliferation differentiation factor amino acid sequence.  229  94
1817  AAB88393  Homo sapiens  HELI- Human membrane or secretory protein clone PSEC0137.  229  94
1817  gi18446895  Drosophila melanogaster  AT05866p  73  25
1818  gi6573212  Giardia intestinalis  variant-specific surface protein H7-1  73  32
1818  gi159143  Giardia intestinalis  variant-specific surface protein H7  73  32
1818  gi15144254  Micrurus corallinus  neurotoxin homologue  72  32
1819  gi161857  Tetrahymena thermophila  surface antigen  69  35
1821  gi913964  Carcinoscorpius rotundicauda  factor C  80  26
1821  gi217397  Tachypleus tridentatus  limulus factor C precursor  80  26
1821  gi18542425  Tachypleus tridentatus  factor C precursor  80  26
1822  gi9309473  Mus musculus  DNMT1 associated protein-1  74  37
1822  gi1666895  Homo sapiens  CHL1 protein  74  23
1822  gi16923930  Mus musculus  MAT1-mediated transcriptional repressor  74  37
1823  gi9058659  Canis familiaris  skeletal muscle chloride channel ClC-1  73  34
1823  gi433182  Drosophila melanogaster  receptor protein tyrosine phosphatase  72  26
1823  gi20429105  Paracoccus zeaxanthinifaciens  decaprenyl diphosphate synthase  72  27
1824  gi13374178  Mus musculus  TAFII140 protein  612  88
1824  gi17861888  Drosophila melanogaster  GM10839p  246  49
1824  gi6634096  Drosophila melanogaster  BIP2 protein  242  48
1825  gi16605480  Homo sapiens  G6b-C protein  1159  100
1825  gi16605484  Homo sapiens  G6b-E protein  1009  90
1825  gi5304877  Homo sapiens  immunoglobulin receptor  1003  83
1826  AAB94636  Homo sapiens  HELI- Human protein sequence SEQ ID NO:15515.  105  37

1826  AAU15903  Homo sapiens  HUMA- Human novel secreted protein, Seq ID 856.  105  37
1826  gi21430928  Drosophila melanogaster  SD27341p  93  39
1827  AAR33270  Homo sapiens  WIST- T cell receptor alpha chain clone alpha1.3.  329  92
1827  gi1806100  Homo sapiens  T cell receptor alpha chain  329  92
1827  gi2358032  Homo sapiens  TCRAV8S3  329  92
1828  gi20513851  Hordeum vulgare  BPM  73  45
1828  AA001897  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 15789.  70  35
1828  AAE16477  Homo sapiens  OSTE- Human collagen alpha1 (II) protein.  69  31
1829  AAG66837  Homo sapiens  SHAN- Human ATP-dependent serine proteinase 31.  356  100
1829  AAG66838  Homo sapiens  SHAN- Human ATP-dependent serine proteinase 31 N-terminal peptide.  89  100
1829  gi5881591  Gallus gallus  homeodomain protein  77  38
1830  AAB94294  Homo sapiens  HELI- Human protein sequence SEQ ID NO:14745.  951  99
1830  gi10504968  Drosophila melanogaster  rho guanine nucleotide exchange factor 4  180  22
1830  gi16197921  Drosophila melanogaster  LD03170p  180  22
1831  ABB12353  Homo sapiens  HYSE- Human bone marrow expressed protein SEQ ID NO: 107.  199  30
1831  gi20452161  Canis familiaris  retinitis pigmentosa GTPase regulator  143  24
1831  gi2062609  Xenopus laevis  middle molecular weight neurofilament protein NF-M(1)  140  24
1832  AAB29778  Homo sapiens  RHOD- Human MSF-derived tribonectin.  148  18
1832  gi142161  Anaplasma marginale  surface antigen Amf105  141  25
1832  gi4808177  Drosophila subobscura  largest subunit of the RNA polymerase II complex  141  20
1833  AAM66321  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 26627.  424  51
1833  AAM53933  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26038.  424  51
1833  gi|6723273|dbj|BAA89659.1  Baboon endogenous virus strain  gag-pol precursor polyprotein  357  47

1834  AAM88756  Homo sapiens  HUMA- Human immune/haematopoietic antigen SEQ ID NO:16349.  208  100
1834  gi20417  Persea americana  cellulase  77  34
1834  gi153337  Streptomyces tenebrarius  kanamycin-apramycin resistance methylase  69  26
1837  AAY02893  Homo sapiens  HUMA- Fragment of human secreted protein encoded by gene 92.  76  41
1837  AAY99429  Homo sapiens  GETH Human PRO1563 (UNQ769) amino acid sequence SEQ ID NO:317.  73  35
1837  gi6634084  Drosophila melanogaster  malate dehydrogenase (NADP-dependent oxaloacetate decarboxylating), malic enzyme  73  39
1838  gi2865602  Saccharopolyspora sp.  SapI M2 methyltransferase  77  37
1838  gi3089358  Rattus norvegicus  MARRLC2A  75  33
1838  gi|2865602|gb|AAC97182.1|  Saccharopolyspora sp.  SapI M2 methyltransferase  77  37
1839  AAM69149  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 29455.  154  96
1839  AAM56768  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 28873.  154  96
1839  AAW96209  Homo sapiens  SMIK Amyloid precursor protein (APP) C-terminal fragment.  102  78
1840  gi9946563  Pseudomonas aeruginosa  probable type II secretion system protein  81  36
1840  gi21108565  Xanthomonas axonopodis pv. citri str.  pseudouridylate synthase  75  35
1840  ABB04714  Homo sapiens  SHAN- Human PP1744 protein SEQ ID NO:23.  74  31
1841  gi1491949  Molluscum contagiosum virus subtype 1  MC006L  85  30
1841  AAM42085  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 7016.  81  27
1841  AAM40299  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 3444.  81  27
1842  gi20381413  Homo sapiens  Similar to LOC160680  216  44
1842  gi13592175  Leishmania major  ppg3  144  24
1842  gi5420387  Leishmania major  proteophosphoglycan  140  23
1843  AAB87181  Homo sapiens  MILL- Human secreted protein MANGO 349 E41D variant, SEQ ID NO:231.  278  42
1843  AAB87128  Homo sapiens  MILL- Human secreted protein MANGO 349, SEQ ID NO:130.  278  42
1843  AAB87179  Homo sapiens  MILL- Human secreted protein MANGO 349 I21K variant, SEQ ID NO:227.  276  41

1844  AAE14341  Homo sapiens  INCY- Human protease protein.  886  93
1844  gi16768276  Drosophila melanogaster  GH27809p  290  41
1844  gi2655204  Mus musculus  ubiquitin-specific protease  258  35
1846  AAY88300  Homo sapiens  MILL- Human TANGO 187-3 protein.  1334  90
1846  gi13097780  Homo sapiens  Similar to RIKEN cDNA gene  1326  90
1846  AAY88296  Homo sapiens  MILL- Human TANGO 187-2/3 protein.  1312  87
1847  AAG74984  Homo sapiens  HUMA- Human colon cancer antigen protein SEQ ID NO:5748.  75  32
1847  gi17352449  Rattus norvegicus  ErbB3/Her3 precursor  74  38
1847  gi|20860870|ref|XP_125664.1|  Mus musculus  similar to H4(D10S170) protein  75  32
1848  gi3123530  Fowlpox virus  I3L, orthologue of vaccinia I3L  75  27
1848  gi5902659  Drosophila melanogaster  ring canal protein  70  27
1848  gi|18110218|ref|NP_476589.2  Drosophila melanogaster  kel-P2  70  27
1849  gi2065210  Mus musculus  Pro-Pol-dUTPase polyprotein  614  78
1849  AAM65715  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 26021.  548  73
1849  AAM53338  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 25443.  548  73
1850  gi10999071  Lophognathus longirostris  NADH dehydrogenase subunit  74  23
1850  gi18537243  Human immunodeficiency virus type 1  envelope glycoprotein  74  29
1850  gi|10999071|gb|AAG00622.2|AF  Lophognathus longirostris  NADH dehydrogenase subunit  74  23
1851  gi|17448210|ref|XP_068503.1  Homo sapiens  similar to 60 kDa heat shock protein, mitochondrial precursor (Hsp60) (60 kDa chaperonin) (CPN60) (Heat shock protein 60) (HSP-60) (Mitochondrial matrix protein P1) (P60 lymphocyte protein) (HuCHA60)  72  28
1852  gi1164937  Saccharomyces cerevisiae  YOR3160w  74  31
1852  gi3176662  Arabidopsis thaliana  Similar to mannosyl-oligosaccharide glucosidase gb|X87237 from Homo sapiens.  73  31
1852  gi13398928  Arabidopsis thaliana  alpha-glucosidase 1  73  31
1853  gi|20889364|ref|XP_138429.1|  Mus musculus  similar to hepatitis A virus cellular receptor 1; T cell immunoglobin domain and mucin domain protein 1  76  36
1853  gi|21288202|gb|EAA00523.1|  Anopheles gambiae str. PEST  agCP9342  71  32

1854  AAB88481  Homo sapiens  HELI- Human membrane or secretory protein clone PSEC0251.  776  99
1854  AAE03835  Homo sapiens  HUMA- Human gene 18 encoded secreted protein HFKHW50, SEQ ID NO: 81.  776  99
1854  AAE03863  Homo sapiens  HUMA- Human gene 18 encoded secreted protein HFKHW50, SEQ ID NO:109.  716  97
1855  gi1663748  Chlamydomonas reinhardtii  dynein heavy chain 7  82  29
1855  gi1663744  Chlamydomonas reinhardtii  dynein heavy chain 5  80  28
1855  gi1663738  Chlamydomonas reinhardtii  dynein heavy chain 2  80  27
1856  gi18032120  Gallus gallus  shal-like voltage-gated potassium channel  75  23
1856  gi1408569  Haemophilus influenzae  adhesion and penetration protein  71  28
1856  gi|18032120|gb|AAL56633.1|AF075  Gallus gallus  shal-like voltage-gated potassium channel  75  23
1857  AAM67180  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27486.  129  44
1857  AAM54795  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26900.  129  44
1857  gi|21040255|ref|NP_631907.1|  Homo sapiens  splicing factor, arginine/serine-rich  109  29
1858  gi21392190  Drosophila melanogaster  RE74758p  71  39
1858  gi9954108  Trypanosoma cruzi  RNA binding protein RGGm  68  40
1858  gi20302994  Medicago truncatula  nodule-specific glycine-rich protein 1C  66  32
1859  gi|20536244|ref|XP_060505.4  Homo sapiens  similar to autoantigen La  72  30
1860  gi|17541362|ref|NP_502409.1  Caenorhabditis elegans  K08E7.S.p  103  29
1860  gi|17446900|ref|XP_065833.1|  Homo sapiens  similar to DNA-directed RNA polymerase (EC 2.7.7.6) II largest chain - Mastigamoeba invertens (fragment)  100  34
1860  gi|9628166|ref|NP 52.1  African swine fever virus  CD2 homolog  98  30
1861  AAY70691  Homo sapiens  DAND Human membrane attractin-2.  162  40
1861  AAY70690  Homo sapiens  DAND Human membrane attractin-1.  162  40
1861  gi12275390  Rattus norvegicus  membrane attractin  162  40
1862  gi10039425  Equus caballus  ALR protein  81  28
1862  gi13529521  Mus musculus  Similar to elastin microfibril interface located protein  80  32
1862  AAM40414  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 3559.  79  39

1863  gi|16588389|gb|AAL26787.1|AF304  Homo sapiens  B lymphocyte activation-related protein BC-1514  247  52
1863  gi|20479028|ref|XP_113729.1  Homo sapiens  similar to B lymphocyte activation-related protein BC-1514  117  68
1863  gi|21301715|gb|EAA13860.1|  Anopheles gambiae str. PEST  agCP8366  85  41
1864  AAU15851  Homo sapiens  HUMA- Human novel secreted protein, Seq ID 804.  1275  78
1864  AAU16312  Homo sapiens  HUMA- Human novel secreted protein, Seq ID 1265.  1123  76
1864  AAG02054  Homo sapiens  GEST Human secreted protein, SEQ ID NO: 6135.  308  91
1865  AAB94953  Homo sapiens  HELI- Human protein sequence SEQ ID NO:16485.  86  29
1865  gi3746787  Homo sapiens  SYT interacting protein SIP  86  29
1865  gi15022507  Homo sapiens  coactivator activator  86  29
1866  gi17133332  Nostoc sp. PCC  preprotein translocase Sect subunit  68  43
1866  gi|13489110|ref|NP_068773.1  Homo sapiens  gap junction protein, alpha 3, 46kD (connexin 46)  66  40
1867  gi706930  Rattus norvegicus  cyclic GMP stimulated phosphodiesterase  191  95
1867  AAV54762_aa1  Homo sapiens  UNIW Human cGS-PDE cDNA DNA sequence.  137  100
1867  AAV36157_aa1  Homo sapiens  UNIW Human cyclic-GMP-nucleotide phosphodiesterase cDNA.  137  100
1868  AAB95695  Homo sapiens  HELI- Human protein sequence SEQ ID NO:18516.  112  27
1868  AAY91447  Homo sapiens  HUMA- Human secreted protein sequence encoded by gene NO:168.  112  27
1868  AAY91393  Homo sapiens  HUMA- Human secreted protein sequence encoded by gene NO:114.  112  27
1870  AAU07886  Homo sapiens  WHED Polypeptide sequence for human hspGlS.  1454  94
1870  gi13603891  Homo sapiens  MOV10-like 1  1454  94
1870  gi13603857  Mus musculus  MOV10-like 1  954  77
1871  AAM96652  Homo sapiens  HUMA- Human reproductive system related antigen SEQ ID NO: 5310.  484  96

1871  gi18676652  Homo sapiens  FLJ00225 protein  433  95
1871  gi21386760  Berneuxia thibetica  maturase R  70  32
1872  AAQ90304_aa1  Homo sapiens  NISR Human thyroid peroxidase gene.  73  29
1872  AAW48781  Homo sapiens  RSRR- Thyroid peroxidase.  73  29
1872  AAR75689  Homo sapiens  NISR Human thyroid peroxidase.  73  29
1873  AAG03774  Homo sapiens  GEST Human secreted protein, SEQ ID NO: 7855.  228  90
1873  gi338288  Homo sapiens  preprosomatostatin I  228  90
1873  gi342299  Macaca fascicularis  preprosomatostatin  228  90
1875  AAR30418  Homo sapiens  DAND Nearly complete p107 protein.  76  30
1875  gi347378  Homo sapiens  p107  76  30
1875  gi157871  Drosophila melanogaster  P glycoprotein  76  24
1876  ABB17955  Homo sapiens  HUMA- Human nervous system related polypeptide SEQ ID NO 6612.  186  40
1876  AAS17764_aa1  Homo sapiens  GENA- Human Genomic DNA for CRYBB1.  167  39
1876  AA002331  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 16223.  165  42
1877  gi|59977|emb|CAA78662.1  Human endogenous retrovirus  tripartite fusion transcript  224  76
1878  ABB84943  Homo sapiens  GETH Human PRO1556 protein sequence SEQ ID NO:254.  1056  93
1878  AAB31670  Homo sapiens  PROT- Amino acid sequence of a human protein having a hydrophobic domain.  1056  93
1878  AAB47295  Homo sapiens  GETH PRO1556 polypeptide.  1056  93
1879  ABB15861  Homo sapiens  HUMA- Human nervous system related polypeptide SEQ ID NO 4518.  73  36
1880  AAU83117  Homo sapiens  ZYMO Novel secreted protein Z799543G2P.  66  54
1880  gi12723186  Lactococcus lactis subsp. lactis  outer membrane lipoprotein precursor  66  26
1881  gi609624  Vibrio cholerae  E SC  73  29
1882  gi12667456  Rattus norvegicus  synaptotagmin VIId  86  32
1882  gi12667454  Rattus norvegicus  synaptotagmin VIII  85  33
1882  gi334072  Pseudorabies virus  ORF-3 protein  83  35
1883  gi1747  Oryctolagus cuniculus  trichohyalin  119  29
1883  gi2072290  Xenopus laevis  XL-INCENP  100  27
1883  gi12584554  Human coxsackievirus  polyprotein  96  25
1884  gi|15601413|ref|NP 44.1|  Vibrio cholerae  sucrose-6-phosphate dehydrogenase  65  55

1885  gi16878287  Homo sapiens  Similar to C-terminal modulator protein  74  35
1885  gi15866714  Homo sapiens  C-terminal modulator protein  74  35
1885  AA006984  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 20876.  70  60
1887  AAW25939  Homo sapiens  CNRS T-cell receptor V-beta-5.1 peptide fragment.  601  99
1887  gi36973  Homo sapiens  T-cell receptor beta-chain  601  99
1887  gi1552498  Homo sapiens  V_segment translation product  600  100
1888  gi18874468  Homo sapiens  partitioning-defective 3-like protein splice variant c  198  73
1888  gi16903870  Homo sapiens  partitioning-defective 3-like protein splice variant b  198  73
1888  gi16903868  Homo sapiens  partitioning-defective 3-like protein splice variant a  198  73
1889  gi21489377  Homo sapiens  MAPA protein  1620  99
1889  gi21489330  Bos taurus  MAPA protein  833  56
1889  gi21489379  Mus musculus  MAPA protein  630  48
1890  AAY10874  Homo sapiens  HUMA- Amino acid sequence of a human secreted protein.  503  100
1890  gi17429674  Ralstonia solanacearum  PROBABLE LIPOPROTEIN  73  44
1891  gi15723141  Homo sapiens  c349E10.1.1 (novel protein, isoform 1)  180  46
1891  AAB59006  Homo sapiens  HUMA- Breast and ovarian cancer associated antigen protein sequence SEQ ID 714.  174  47
1891  gi19353342  Mus musculus  RIKEN cDNA 9530058802 gene  162  47
1892  AAM86086  Homo sapiens  HUMA- Human immune/haematopoietic antigen SEQ ID NO:13679.  95  53
1892  AA005973  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 19865.  94  82
1892  AA009418  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 23310.  91  70
1893  gi8778607  Arabidopsis thaliana  FSM15.23  71  25
1894  AAM65951  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 26257.  69  38
1894  AAM53568  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 25673.  69  38
1894  gi|20832567|ref|XP_133524.1|  Mus musculus  similar to Heterogeneous nuclear ribonucleoprotein A3 (hnRNP A3) (D10S102)  163  76
1895  AAM66299  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 26605.  440  83
1895  AAM53913  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26018.  440  83
1895  gi|6723273|dbj|BAA89659.1|  Baboon endogenous virus strain  gag-pol precursor polyprotein  270  45

1896  gi4883988  Bartonella clarridgeiae  cell division protein FtsZ  68  28
1897  AA013209  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 27101.  142  54
1897  AAM66708  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27014.  124  46
1897  AAM54310  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26415.  124  46
1898  gi2565268  Drosophila virilis  pore-forming protein MIP family  75  27
1898  gi7453547  Homo sapiens  glioma tumor suppressor candidate region protein 1  75  31
1898  gi3218331  Metarhizium anisopliae  nitrogen response regulator  74  26
1899  gi9656609  Vibrio cholerae  chemotaxis protein CheA  73  32
1899  gi|20908537|ref|XP_127414.1  Mus musculus  RIKEN cDNA 1700001L19  443  80
1899  gi|15642063|ref|NP_231695.1  Vibrio cholerae  chemotaxis protein CheA  73  32
1900  gi|18586105|ref|XP 00.1|  Homo sapiens  similar to sca1  203  84
1900  gi|20888279|ref|XP_ 08.1  Mus musculus  similar to spinocerebellar ataxia type 1  199  82
1901  gi338033  Homo sapiens  serum protein  90  32
1901  gi4808221  Homo sapiens  dJ1177I5.2 (serum constituent protein MSE55)  90  32
1901  gi4098993  Mus musculus  polyhomeotic 2  88  30
1902  AAB19933  Homo sapiens  INCY- Human oxidoreductase OXRD-8.  250  100
1902  gi19713043  Fusobacterium nucleatum subsp. nucleatum  Iron/zinc/copper-binding protein  73  22
1902  gi|20342079|ref|XP_110614.1  Mus musculus  RIKEN cDNA 1700003E16  77  25
1903  gi342279  Macaca nemestrina  opiomelanocortin  231  49
1903  gi28342  Homo sapiens  proopiomelanocortin  230  49
1903  gi190183  Homo sapiens  opiomelanocortin  230  49
1904  gi|11037117|gb|AAG27485.1|AF 537_1  Homo sapiens  NAG13  180  53
1905  gi5360984  Homo sapiens  dJ228HI3.1 (similar to Ribosomal protein L21e)  152  72
1905  AAB44126  Homo sapiens  HUMA- Human cancer associated protein sequence SEQ ID NO:1571.  150  83


1905  gi550015  Homo sapiens  ribosomal protein L21  150  83
1906  gi2654610  Pseudomonas aeruginosa  arginine/ornithine succinyltransferase AI subunit  79  25
1906  gi17226812  Botryotinia fuckeliana  histidine kinase  72  33
1906  gi16904238  Botryotinia fuckeliana  two-component osmosensing histidine kinase BOS1  72  33
1908  gi330359  Human herpesvirus  nuclear antigen precursor  91  37
1908  gi1632793  Human herpesvirus  EBNA3C (EBNA 4B) latent protein  91  37
1908  gi1184677  Candida albicans  hyphal wall protein 1  90  38
1909  gi13177635  Rattus norvegicus  phospholipase C beta-3  72  26
1909  gi1150880  Mus musculus  phospholipase C beta3  71  26
1909  gi17105044  Simian adenovirus  10.1 kDa  71  31
1910  gi9857054  Leishmania major  possible CG7055 protein  71  47
1910  gi1617560  Leishmania major  LCFACASS; L5701.2  67  33
1910  gi|9857054|emb|CAC04011.1  Leishmania major  possible CG7055 protein  71  47
1911  AAY87278  Homo sapiens  INCY- Human signal peptide containing protein HSPP-55 SEQ ID NO:55.  501  82
1911  AAB18912  Homo sapiens  GETH A novel polypeptide designated PRO1889.  501  82
1911  AAU27659  Homo sapiens  ZYMO Human protein AFP513481.  416  77
1912  gi2065210  Mus musculus  Pro-Pol-dUTPase polyprotein  434  80
1912  gi|18676710|dbj|BAB85007.1  Homo sapiens  FLJ00254 protein  270  64
1913  gi5713196  Caenorhabditis elegans  liprin-alpha homolog  479  38
1913  gi930343  Homo sapiens  LAR-interacting protein 1b  467  39
1913  gi930341  Homo sapiens  LAR-interacting protein 1a  467  39
1914  gi6651021  Mus musculus  semaphorin cytoplasmic domain-associated protein 3B  274  63
1914  gi6651019  Mus musculus  semaphorin cytoplasmic domain-associated protein 3A  274  63
1914  AAM25720  Homo sapiens  HYSE- Human protein sequence SEQ ID NO:1235.  266  61
1915  gi902214  Zea mays  RNA polymerase beta' subunit-2  72  24
1915  gi12482  Zea mays  RNA polymerase beta-2 subunit (AA 1-1527)  72  24
1915  gi|11467184|ref|NP_043017.1  Zea mays  RNA polymerase beta' subunit-2  72  24
1916  gi1655432  Mus musculus  plexin 2  1135  58
1916  AAM93435  Homo sapiens  HELI- Human polypeptide, SEQ ID NO: 3070.  1132  57
1916  gi961515  Xenopus laevis  plexin  1126  54

1917  gi15559064  Mus musculus  SNAG1  86  38
1917  gi|20863586|ref|XP_141581.1  Mus musculus  similar to dJ551D2.5 (novel protein)  88  30
1917  gi|18644890|ref|NP_570614.1  Mus musculus  sorting nexin associated golgi protein 1  86  38
1918  gi19528383  Drosophila melanogaster  RE04404p  67  32
1919  AAM77461  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 37767.  189  79
1919  AAM64684  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 36789.  189  79
1919  gi|17477135|ref|XP_063415.1  Homo sapiens  similar to embryonal stem cell specific gene 1  263  75
1920  gi2623757  Rattus norvegicus  neurabin  172  97
1920  gi2827450  Gallus gallus  KS5 protein  154  88
1920  gi13991829  Xenopus laevis  neurabin  145  83
1923  gi5532302  Heterocapsa triquetra  PSII CP47 apoprotein  75  29
1923  gi1881335  Bacillus subtilis  SIMILAR TO YQFU, YXKD, YITB OF B. SUBTILIS.  68  38
1923  gi|5532302|gb|AAD44701.1|  Heterocapsa triquetra  PSII CP47 apoprotein  75  29
1924  gi6855429  Leishmania major  possible mucin 1 precursor  77  33
1924  gi5832816  Caenorhabditis elegans  contains similarity to Pfam domain: PF01694 (Rhomboid family), Score=61.7, E-value=5.1e-15, N=1  74  34
1924  AAB51976  Homo sapiens  HUMA- Human secreted protein sequence encoded by gene NO:108.  72  38
1925  AAB51635  Homo sapiens  ROSE/ Human secreted protein sequence encoded by gene NO:75.  205  31
1925  AAB47128  Homo sapiens  INCY- CDIFF-6, Incyte ID No. 2009435CD1.  199  34
1925  ABB55766  Homo sapiens  FECH/ Human polypeptide SEQ ID NO 138.  197  38
1926  AAG89279  Homo sapiens  GEST Human secreted protein, SEQ ID NO: 399.  330  44
1926  AAB70690  Homo sapiens  SREN- Human hDPP protein sequence SEQ ID NO:7.  319  44
1926  gi13182757  Homo sapiens  HTPAP  319  44
1927  gi13177290  Ectocarpus siliculosus virus  EsV-1-8  69  36
1928  gi18700171  Arabidopsis thaliana  AT5g20480/F7C8 70  86  39
1928  gi915207  Sus scrofa  gastric mucin  83  29

1928  gi532113  Caenorhabditis elegans  homeotic region most like HMPB_DROME: homeotic proboscipedia protein  79  27
1929  ABB12295  Homo sapiens  HYSE- Human secreted protein homologue, SEQ ID NO:2665.  135  59
1929  AAG04080  Homo sapiens  GEST Human secreted protein, SEQ ID NO: 8161.  78  38
1929  gi9279807  Drosophila melanogaster  cortactin  77  27
1930  AAV81204_aa1  Homo sapiens  GEHO Human CD7 cDNA.  872  73
1930  AAB36657  Homo sapiens  IMMV Human CD7 protein sequence SEQ ID NO:2.  872  73
1930  AAU02438  Homo sapiens  GEHO Human lymphocyte cell surface antigen CD7 polypeptide.  872  73
1931  gi2636248  Bacillus subtilis  similar to transaldolase (pentose phosphate)  73  29
1931  gi|21398633|ref|NP_654618.1  Bacillus anthracis  Transaldolase, Transaldolase [Bacillus  74  29
1931  gi|16080764|ref|NP_391592.1  Bacillus subtilis  similar to transaldolase (pentose phosphate)  73  29
1932  AAB43545  Homo sapiens  HUMA- Human cancer associated protein sequence SEQ ID NO:990.  73  46
1932  AAM40234  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 3379.  71  26
1934  gi3129962  Gallus gallus  B locus Lectin like Natural Killer cell surface protein  82  30
1934  AAB93791  Homo sapiens  HELI- Human protein sequence SEQ ID NO:13545.  77  38
1934  gi2541864  Drosophila melanogaster  DAD polypeptide  77  32
1935  gi|4959869|gb|AAD34536.1|  Murine leukemia virus  polymerase  335  52
1935  gi|6524624|gb|AAF15098.1|  Phascolarctos cinereus  pol protein  331  52
1935  gi|9630313|ref|NP_056790.1  Gibbon ape leukemia virus  pol polyprotein  328  52
1936  gi6562332  Arabidopsis thaliana  diaminopimelate decarboxylase  86  30
1936  gi7573355  Arabidopsis thaliana  diaminopimelate decarboxylase-like protein  86  30
1936  gi15146250  Arabidopsis thaliana  AT5g11880/F14F18 50  86  30
1939  AAU07442  Homo sapiens  GETH Human Wnt1 Upregulated protein 2 (WUP2).  300  100
1939  AAU07441  Homo sapiens  GETH Human Wnt1 Upregulated protein 1 (WUP1).  300  100
1939  AAB56802  Homo sapiens  ROSE/ Human prostate cancer antigen protein sequence SEQ ID NO:1380.  300  100


1940  gi5802814  Homo sapiens  Gag-Pro-Pol-Env protein  587  57
1940  gi4185939  Human endogenous retrovirus K  pol protein  586  57
1940  gi5802821  Homo sapiens  Gag-Pro-Pol protein  586  57
1941  AAU83088  Homo sapiens  ZYMO Novel secreted protein Z2812G3P.  586  100
1941  AAB20275  Homo sapiens  SCHE Human interleukin DNAX 80.  535  76
1941  AAB20277  Homo sapiens  SCHE Human interleukin variant.  529  76
1942  AAM06866  Homo sapiens  HYSE- Human foetal protein, SEQ ID NO: 1074.  994  100
1942  gi17426446  Homo sapiens  bA351K23.5 (novel protein)  933  54
1942  gi15099951  Mus musculus  diacylglycerol acyltransferase  915  55
1943  AAM06596  Homo sapiens  HYSE- Human foetal protein, SEQ ID NO: 327.  406  98
1943  gi|15640499|ref|NP_230126.1|  Vibrio cholerae  S-adenosylmethionine synthase  67  51
1945  AAG75561  Homo sapiens  HUMA- Human colon cancer antigen protein SEQ ID NO:6325.  327  100
1945  gi16416764  Homo sapiens  FKSG16  327  100
1945  gi13905212  Mus musculus  RIKEN cDNA 1200006F02 gene  261  79
1946  gi288174  Mus musculus  Oct2b  97  85
1946  gi53490  Mus musculus  Oct2.5 transcription factor  97  85
1946  gi9937478  Drosophila melanogaster  thyroid hormone receptor-associated protein TRAP 170  72  39
1947  AAM66980  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27286.  170  69
1947  AAM54574  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26679.  170  69
1947  AAM75189  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 35495.  159  86
1948  AAY10874  Homo sapiens  HUMA- Amino acid sequence of a human secreted protein.  100  100
1949  AAA27155_aa1  Homo sapiens  GENE- Human P2 DNA.  100  100
1949  AAY94475  Homo sapiens  GENE- Predicted translation product of human P2 splice isoform, P2-B.  100  100
1949  AAY94474  Homo sapiens  GENE- Human P2 protein.  100  100
1950  gi9502082  Homo sapiens  tubby super-family protein  80  40
1950  gi9502080  Mus musculus  tubby super-family protein  77  41
1950  gi8118432  Oryza sativa  beta-expansin  73  35
1951  gi4808994  walleye epidermal hyperplasia virus type 1  envelope polyprotein  69  46
1951  gi|15642893|ref|NP_227934.1  Thermotoga maritima  ribonucleotide reductase, dependent  66  46
1952  AAB80264  Homo sapiens  GETH Human PRO332 protein.  577  61

1952  AAB33425  Homo sapiens  GETH Human PRO332 protein UNQ293 SEQ ID NO:57.  577  61
1952  AAY13396  Homo sapiens  GETH Amino acid sequence of protein PRO332.  577  61
1953  gi16648392  Drosophila melanogaster  LD39243p  449  61
1953  AAG73684  Homo sapiens  HUMA- Human colon cancer antigen protein SEQ ID NO:4448.  371  55
1953  AAY48312  Homo sapiens  META- Human prostate cancer-associated protein 9.  371  55
1954  AAU84348  Homo sapiens  BARK/ Protein MMP2 differentially expressed in breast cancer tissue.  2068  94
1954  ABB90738  Homo sapiens  UYJO Human Tumour Endothelial Marker polypeptide SEQ ID NO 208.  2068  94
1954  AAB84607  Homo sapiens  PFIZ Amino acid sequence of matrix metalloproteinase gelatinase A.  2068  94
1955  gi16769680  Drosophila melanogaster  LD46678p  245  35
1955  AAM66797  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27103.  148  80
1955  AAM54396  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26501.  148  80
1957  AAB80242  Homo sapiens  GETH Human PRO236 protein.  648  97
1957  AAM93378  Homo sapiens  HELI- Human polypeptide, SEQ ID NO: 2955.  648  97
1957  AAB12157  Homo sapiens  PROT- Hydrophobic domain protein from clone HP03165 isolated from KB cells.  648  97
1958  AAM41696  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 6627.  234  47
1958  AAU17119  Homo sapiens  HUMA- Novel signal transduction pathway protein, Seq ID 684.  229  46
1958  gi16741621  Homo sapiens  Similar to RAB37, member of RAS oncogene family  228  47
1959  gi18025526  cercopithicine herpesvirus  LF3  140  30
1959  gi3153821  Mus musculus  plenty-of prolines-101; POP101; SH3-philo-protein  137  25
1959  gi39255  Actinomyces viscosus  sialidase  129  28
1960  ABB12366  Homo sapiens  HYSE- Human bone marrow expressed protein SEQ ID NO: 120.  400  90
1960  AA012936  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 26828.  115  95
1960  AAM84898  Homo sapiens  HUMA- Human immune/haematopoietic antigen SEQ ID NO:12491.  113  82
1961  gi19110438  Homo sapiens  polycystin-1L1  190  94
1961  gi3115393  Rana pipiens  guanylate cyclase inhibitory protein  80  35
1961  gi3462887  Rattus norvegicus  alpha-fodrin  68  31
1962  AAU83130  Homo sapiens  ZYMO Novel secreted protein Z835892G6P.  1076  100

1962  gi1890354  Brassica napus  L-ascorbate peroxidase  80  33
1962  gi7529611  Leishmania major  hypothetical protein L787.06  79  31
1963  AAG78679  Homo sapiens  BODE- Human thrombotic protein 46.  467  86
1963  AAY87347  Homo sapiens  INCY- Human signal peptide containing protein HSPP-124 SEQ ID NO:124.  467  86
1963  AAB01431  Homo sapiens  MILL- Human TANGO 224 (form 2).  467  86
1964  gi3413504  Rattus norvegicus  Bassoon  81  26
1964  gi330452  human herpesvirus  DNA polymerase  79  28
1964  AAV69717_aa1  Homo sapiens  LUDW- Tumour rejection antigen precursor MAGE-C1 cDNA.  73  33
1965  gi|2323287|gb|AAB66528.1|  multiple sclerosis associated retrovirus  polyprotein  286  64
1965  gi|2351212|dbj|BAA22064.1|  Friend murine leukemia virus  gag-pol polyprotein (precursor protein)  179  47
1965  gi|9629516|ref|NP_044738.1  Rauscher murine leukemia virus  Pol  179  47
1966  gi|2323287|gb|AAB66528.1|  multiple sclerosis associated retrovirus  polyprotein  476  65
1966  gi|2281588|gb|AAB64160.1|  synthetic construct  Pol  323  51
1966  gi|9626961|ref|NP_057933.1  Murine leukemia virus  Pr180  323  51
1967  gi2065210  Mus musculus  Pro-Pol-dUTPase polyprotein  518  73
1967  AAM65715  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 26021.  464  69
1967  AAM53338  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 25443.  464  69
1968  AAG78149  Homo sapiens  BODE- Human polypeptide-cytochrome b5-13.  388  82
1968  gi3150438  Human endogenous retrovirus K  pol-env  345  55
1968  gi1469243  Human endogenous retrovirus K  pol/env  345  55
1969  gi21113108  Xanthomonas campestris pv. campestris str.  TonB-dependent receptor  78  31


1969 gi476274 Homo SapiensR kappa B 77 23 1969 gi4206769Acanthamoebamyosin I heavy chain 76 27 kinase castellanii 1970 gi~13310191~multiple recombinant envelope 244 77 protein gb~AAK181sclerosis 89.1~AF331associated 1 retrovirus _ element 1970 gi~8272468~gHomo Sapiensenvelope protein 219 81 b~AAF74215 .1 ~AF15696 1970 gi~21103962~Homo Sapiensenverin-2 219 77 gb~AAM331 41.1 1971 AAU83621 Homo SapiensGETH Human PRO protein, 320 100 Seq ID No 60.

1971 AA005826 Homo sapiens HYSE- Human polypeptide SEQ ID NO 19718. 295 93
1971 AAM39560 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2705. 194 56

1972 gi6456112 Mus musculus F-box protein FBX15 128 44
1972 gi21428946 Drosophila melanogaster GH22104p 74 31
1972 gi|6456112|gb|AAF09139.1| Mus musculus F-box protein FBX15 128 44
1973 gi148270 Escherichia coli lambda-integrase 550 94
1973 gi1790244 Escherichia coli K12 site-specific recombinase, acts on cer sequence of ColE1, effects chromosome segregation at cell division 550 94
1973 gi13364217 Escherichia coli O157:H7 site-specific recombinase XerC 544 92
1974 gi1805552 Escherichia coli FORMATE HYDROGENLYASE TRANSCRIPTIONAL ACTIVATOR. 887 88
1974 gi1616960 Escherichia coli HyfR 887 88
1974 gi7920396 Salmonella typhimurium formate hydrogenlyase activator protein 522 54
1975 gi409795 Escherichia coli No definition line found 1175 99
1975 gi15074592 Sinorhizobium meliloti HYPOTHETICAL TRANSMEMBRANE PROTEIN 378 33
1975 gi17740718 Agrobacterium tumefaciens str. C58 (U. Washington) Na+/Pi-cotransporter 372 34
1976 AAB82047 Homo sapiens IGAK- Human mast cell surface antigen. 163 23
1976 gi12654783 Homo sapiens Similar to loss of heterozygosity, 11, chromosomal region 2, gene A 163 23
1976 AAZ45690_aa1 Homo sapiens REGC cDNA sequence encoding the human minor vault protein 193. 108 25
1977 ABB56523 Homo sapiens MERI Human NMDA receptor subunit SEQ ID NO 44. 73 28

1977 AAW87504 Homo sapiens SIBI- Human N-methyl-D-aspartate receptor subunit encoded by clone NMDA24. 73 28
1978 AAG00471 Homo sapiens GEST Human secreted protein, SEQ ID NO: 4552. 285 93
1978 gi298489 Papio hamadryas SP-10 133 34
1978 gi452582 Vulpes vulpes fox sperm acrosomal protein FSA-Acr.1 132 34
1979 AAB87128 Homo sapiens MILL- Human secreted protein MANGO 349, SEQ ID NO:130. 490 86
1979 AAB87179 Homo sapiens MILL- Human secreted protein MANGO 349 I21K variant, SEQ ID NO:227. 488 85
1979 AAB87181 Homo sapiens MILL- Human secreted protein MANGO 349 E41D variant, SEQ ID NO:231. 487 85
1982 AAM75035 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 35341. 109 67
1982 AAM62231 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 34336. 109 67

1982 gi11967423 Mus musculus vomeronasal receptor 105 76
1983 AAG89276 Homo sapiens GEST Human secreted protein, SEQ ID NO: 396. 224 46
1983 AAB56565 Homo sapiens ROSE/ Human prostate cancer antigen protein sequence SEQ ID NO:1143. 99 40
1983 AAY44987 Homo sapiens INCY- Human epidermal protein-4. 78 28
1984 AAB95089 Homo sapiens HELI- Human protein sequence SEQ ID NO:17025. 498 97
1984 AAM06608 Homo sapiens HYSE- Human foetal protein, SEQ ID NO: 339. 495 96

1984 gi497890 unidentified nitrogen-fixing bacteria alpha subunit of dinitrogenase reductase (Fe protein) 73 24
1985 gi|17455728|ref|XP_063594.1| Homo sapiens similar to Zinc-finger protein ubi-d4 (Requiem) (Apoptosis response zinc finger protein) 71 37
1986 gi21428886 Drosophila melanogaster GH12469p 69 34
1987 gi7767529 Bos taurus cyclophilin I 364 75
1987 gi8699209 Canis familiaris cyclophilin A 361 88
1987 gi11641132 Sus scrofa cyclophilin 361 88
1988 gi15073168 Sinorhizobium meliloti PROBABLE TRANSLATION INITIATION FACTOR IF-2 PROTEIN 81 37
1988 gi1181352 Paramecium bursaria Chlorella virus 1 Pro-rich protein; PIPG (8X) 78 25
1988 gi493242 Feline herpesvirus Feline herpesvirus type 1 immediate early protein 77 20
1989 AAM65707 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 26013. 134 66

1989 AAM53330 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 25435. 134 66
1989 gi|20475216|ref|XP_114802.1| Homo sapiens similar to synapsin I 228 59
1990 AAM71181 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 31487. 110 64
1990 AAM58674 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 30779. 110 64

1990 gi21323636 Corynebacterium glutamicum Sulfate permease and related transporters (MFS superfamily) 75 26
1991 gi1932813 Xenopus laevis dsRNA adenosine deaminase 96 34
1991 AAE10203 Homo sapiens HYSE- Human bone marrow derived contig protein, SEQ ID NO: 68. 83 25
1991 gi3242649 Rana catesbeiana alpha 1 type I collagen 80 30
1992 gi1181423 Paramecium bursaria Chlorella virus 1 PBCV-1 chitinase 71 41
1992 gi|21300897|gb|EAA13042.1| Anopheles gambiae str. PEST agCP14405 72 37

1992 gi~9631828~rParamecium PBCV-1 chitinase 71 41 ef~NP_0486bursaria 13.1 Chlorella virus 1 1994 gi8248755Plasmodium protein phosphatase 72 25 falciparum 1994 gi4104348CampylobacterS-layer-RTX protein 70 38 rectus 1994 gi~8248755~ePlasmodium protein phosphatase 72 25 mb~CAB628falciparum 78.2 1995 gi21324402CorynebacteriumUncharacterized ATPase 73 38 related to the glutamicum helicase subunit of the Holliday ATCC 13032 junction resolvase 1995 gi~19552845~CorynebacteriumCOG2256:Uncharacterized73 38 ATPase ref~NP_6008glutamicum related to the helicase subunit of the 47.1 Holliday 'unction resolvase 1995 gi~17533213~CaenorhabditisF14ES.S.p 73 30 reflNP elegans 77.1 ~

1996 11871223 Rickettsia crystalline surface 92 30 hi layer rotein 1996 g16969926Rickettsia OmpB ~ 79 25 aeschlimannii 1996 g114670347Rickettsia OmpB 78 25 felis 1997 gi~20548733~Homo Sapienssimilar to gag protein 256 58 re~XP-0556 41.2 1997 gi~9739120~gBovine leukemiagag 186 34 b~AAF97916virus .l TahlP 7 SEQ AccessionSpecies Description Score ID No. Identity NO:

1997 gi|9626226|ref|NP_056897.1| Bovine leukemia virus Pr44 185 34
1998 AAM79834 Homo sapiens HYSE- Human protein SEQ ID NO 3480. 279 71
1998 AAM78850 Homo sapiens HYSE- Human protein SEQ ID NO 1512. 279 71
1998 AAM79204 Homo sapiens HYSE- Human protein SEQ ID NO 1866. 272 71
1999 AAM73176 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 33482. 168 48
1999 AAM60521 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 32626. 168 48
1999 gi|13929148|ref|NP_113997.1| Rattus norvegicus cyclic nucleotide-gated channel beta subunit 1 163 47

2000 gi1869859 human herpesvirus very large tegument protein 73 30
2000 gi7380253 Neisseria meningitidis 2-keto-4-hydroxyglutarate aldolase 70 37
2000 gi7226633 Neisseria meningitidis MC58 4-hydroxy-2-oxoglutarate aldolase/2-dehydro-3-deoxyphosphogluconate aldolase 70 37
2001 gi17016969 Mus musculus NUANCE 138 36
2001 gi6273778 Homo sapiens trabeculin-alpha 137 33
2001 gi1675222 Mus musculus ACF7 neural isoform 1 136 42
2002 AAM39256 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2401. 81 29
2002 gi840789 Homo sapiens binding regulatory factor 81 29
2002 gi17028337 Homo sapiens regulatory factor X, 5 (influences HLA class II expression) 81 29
2003 gi2252814 Mus musculus FOG 172 64
2003 AAR58815 Homo sapiens USSH Human c-myc far upstream element (FUSE) binding protein (FBP) variant from HL60 clone 3-1. 103 42
2003 gi3598974 Rattus norvegicus protein tyrosine phosphatase 103 26
2004 gi11994696 Arabidopsis thaliana contains similarity to DNA repair protein gene_id:K7M2.11 77 28
2004 gi7209527 Mus musculus testis-specific gene 73 24
2004 gi|17451912|ref|XP_071083.1| Homo sapiens similar to DNA-binding protein B 234 97
2005 AAE12023 Homo sapiens INCY- Human G-protein coupled receptor, GCREC-2. 173 100
2005 AAG65832 Homo sapiens FARB Human G protein-coupled receptor (GPCR). 173 100
2005 AAG68126 Homo sapiens FARB Human 7TM-GPCR protein sequence SEQ ID NO:6. 105 78

2006 gi20068811 Homo sapiens Rab-coupling protein 130 43
2006 gi15822596 Homo sapiens nRip11 104 45
2006 gi13377897 Homo sapiens Rab11 interacting protein Rip11a 83 40
2007 gi|17539708|ref|NP_501489.1| Caenorhabditis elegans F08B4.5.p 78 42
2008 AAE10350 Homo sapiens PFIZ Human ADAMTS-J1.4 variant protein. 504 97
2008 AAE10349 Homo sapiens PFIZ Human ADAMTS-J1.3 variant protein. 504 97
2008 AAE10347 Homo sapiens PFIZ Human ADAMTS-J1.1 variant protein. 504 97

2009 AAV31720_Homo SapiensMOUN Nucleotide sequence87 29 of the aal PUR-al ha ene.

2009 AAT99264_Homo SapiensMOUN Human PUR-alpha 87 29 gene.

aal 2009 AAQ44800_Homo SapiensMOUN Encodes single-stranded87 29 DNA

aal binding (PUR) protein.

2010 gi170444 Lycopersiconextensin (class II) 123 27 esculentum 2010 gi4662641Arabidopsisexpressed protein 116 30 thaliana 2010 gi188864 Homo sa mucin 115 28 iens 2011 AAY93650 Homo SapiensHUMA- Amino acid sequence1677 100 of a human prostacyclin-stimulating factor-2.

2011 AAS 15723_Homo SapiensCURA- DNA encoding insulin-like1673 99 aal growth factor family related protein, NOV3.

2011 AAE17599 Homo SapiensINCY- Human extracellular1673 99 messenger (XMES)-1 rotein.

2012 gi10440434Homo sa FLJ00052 protein 336 69 iens 2012 gi20502870Mus musculusSDS3 333 68 2012 gi21430678Drosophila RE74901p 170 36 melano aster 2013 AAH77293_Homo SapiensMILL- Human ion channel 214 93 protein aal IC32391 cDNA coding re ion.

2013 AAE13278 Homo Sapiens1NCY- Human transporters214 93 and ion channels (TRICH)-5.

2013 AAG77969 Homo SapiensMILL- Human ion channel 214 93 protein IC32391.

2014 gi4894768Xeno us ephrin-B2 recursor 78 30 laevis 2015 AAU77498 Homo sapiens1NCY- Human lipid metabolism1291 100 enzyme, LMM-6.

2015 ABB08205 Homo SapiensINCY- Human lipid metabolism1122 100 enzyme-5 (LME-5).

2015 ABB07493 Homo SapiensINCY- Human lipid metabolism864 75 molecule (LMM) polypeptide (ID:

2965233 CD 1 ).

2016 gi~14769015~Homo Sapiensfibrillin3 68 36 retlXP_0415 69.1 ~

2017 gi2313786Helicobacterchorismate synthase (aroC)78 33 ylori 26695 2017 gi4155160HelicobacterCHORISMATE SYNTHASE 72 32 pylori J99 Table 2 SEQ AccessionSpecies Description Score ID No. Identity NO:

2017 gi~15645287~Helicobacterchorismate synthase (aroC)78 33 reilNP-2074pylori 26695 57.1 2018 gi15485622Homo sa Q9H4T4 like 1068 100 iens 2018 ABB 14744Homo SapiensHUMA- Human nervous system694 98 related pol epode SEQ ID NO 3401.

2018 AAB95100 Homo SapiensHELI- Human protein sequence101 24 SEQ

ID N0:17064.

2019 18050556 Gorilla carboxyl-ester lipase 223 42 gorilla 2019 AAU09894 Homo SapiensMONS Bile Salt Stimulated217 39 Lipase (BSSL).

2019 ABB04676 Homo SapiensMONS Human milk bile 217 39 salt-stimulated lipase (BSSL) protein SEQ

ID N0:2.

2020 12065210 Mus musculusPro-Pol-dUTPase polyprotein515 74 2020 gi~385615~gbMus Sp. fibulin gene homolog 300 75 ~AAB26708.

1~

2020 gi~13194728~Gallus galluspol-like protein ENS-3 170 33 gb~AAK155 26.1 ~AF329 2021 AAM66980 Homo SapiensMOLE- Human bone marrow 170 75 expressed probe encoded protein SEQ

ID NO: 27286.

2021 AAM54574 Homo sapiensMOLE- Human brain expressed170 75 single exon probe encoded protein SEQ ID

NO: 26679.

2021 AAM75189 Homo SapiensMOLE- Human bone marrow 159 86 expressed probe encoded protein SEQ

ID NO: 35495.

2022 AAD29146_Homo sapiensZYMO Human Zcyto2l consensus649 83 aal cDNA.

2022 AAU83208 Homo SapiensZYMO Novel secreted protein649 83 Z908463G2P.

2022 AAE18311 Homo SapiensZYMO Human Zcyto2l consensus649 83 protein.

2024 g114336750Homo SapiensCe protein similar to 84 34 Dm Cys3His forger rotein 2024 AAB50363 Homo sa UYSL- Human SRCAP. 83 34 iens 2024 AAB95541 Homo SapiensHELI- Human protein sequence83 34 SEQ

ID N0:18149.

2025 g118676682Homo SapiensFLJ00240 protein 470 45 2025 g114701866Dictyosteliumcarmil 221 29 discoideum 2025 g11881738Acanthamoebamyosin-I binding protein219 29 Acan125 castellanii 2026 ABB12490 Homo SapiensHYSE- Human bone marrow 212 78 expressed protein SEQ ID NO: 329.

2027 AAU83147 Homo SapiensZYMO Novel secreted protein1153 100 Z846363G2P.

2027 gi~21287755~Anopheles ebiP4780 205 51 gb~EAA000gambiae str.

76.1 ~ PEST

2027 gi~17552028~CaenorhabditisCOSD11.8.p 91 38 ref~NP-4984elegans 07.1 ~

2028 gi1510143Homo Sapienssimilar to C.elegans 323 57 protein encoded in cosmid T20D3 (Z68220).

2028 gi3879942CaenorhabditisT20D3.11 124 27 elegans 2028 gi5869818Globodera NADH-ubiquinone oxidoreductase82 27 allida subunit 6 2029 AAE13288 Homo SapiensINCY- Human transporters75 31 and ion channels (TRICH)-15.

2029 gi3252893Thermotoga ABC transporter 74 37 neapolitana 2029 gi~18403965~Arabidopsisexpressed protein 70 29 re~NP_5658thaliana 26.1 2030 AAB97908 Homo SapiensSHAN- Hurnan GTP-binding79 27 protein 17 SEQ ID N0:2.

2030 AAM42129 Homo SapiensHYSE- Human polypeptide 79 27 SEQ ID

NO 7060.

2030 gi9971156Mus musculusGTP-binding like protein79 27 2031 gi~20864803~Mus musculusRIKEN cDNA 4930503K02 89 25 ref)XP'1308 00.1 ~

2031 gi~21262152~Oryza sativaSMC4 protein 77 28 emb~CAD32 690.1 2031 gi~1507705~gBorrelia outer surface protein 74 33 b~AAB0656burgdorferi 8.1~

2032 AAG65898 Homo SapiensSMIK Amino acid sequence481 100 of GSK

ene Id 18525.

2032 AAU83670 Homo sapiensGETH Human PRO protein, 471 97 Seq ID No 158.

2032 ABB84896 Homo SapiensGETH Human PR01309 protein471 97 se uence SEQ ID N0:160.

2034 gi6723273Baboon gag-pol precursor polyprotein687 43 endogenous virus sham 2034 gi18448744Moloney Pr180 gag-pro-pol polyprotein685 42 marine leukemia virus 2034 gi2801471Moloney Pr180 682 42 m'urine leukemia virus 2035 gi~17554696~CaenorhabditisR148.7.p 68 32 ref~NP elegans 70.1 2035 gi~16127996fEscherichiaaspartokinase I, homoserine68 43 coli re~NP K12 dehydrogenase I

~

43.1 2035 gi~19548975~Escherichiaaspartokinase I-homoserine.68 43 coli gb~AAL908 dehydrogenase I

85.1~AF487 2036 gi13424459Caulobactermethyl-accepting chemotaxis~ 72 ~ 32 protein TahlP 9 SEQ AccessionSpecies Description Score ID No. Identity NO:

crescentus Mc I

2036 gi~16877133~Homo sapienscarboxypeptidase, vitellogenic-like69 30 gb~AAH168 38.1 ~AAH16 2037 AAB67055 Homo SapiensINCY- Human immune response532 75 molecule (IMUN) protein SEQ ID NO:

9.

2037 AA001862 Homo SapiensHYSE- Human polypeptide403 67 SEQ ID

NO 15754.

2037 gi~6753924~rMus musculusFriend virus susceptibility240 39 eflNP

_ 74.1 2039 AAB38447 Homo SapiensHUMA- Fragment of human80 27 secreted protein encoded by gene 20 clone HLTFBY 15.

2039 111527799Mus musculusGTP-bindin rotein like 73 30 2039 g1695237 Equine tegument protein 73 33 a he esvirus 2040 gi~20544038~Homo Sapienssimilar to PER-HEXAMER 68 41 REPEAT

ref~XP PROTEIN 5 12.4 2042 AAM77922 Homo SapiensMOLE- Human bone marrow642 85 expressed probe encoded protein SEQ

ID NO: 38228.

2042 AAM65219 Homo SapiensMOLE- Human brain expressed642 85 single exon probe encoded protein SEQ ID

NO: 37324.

2042 gi~6723273~dBaboon gag-pol precursor polyprotein139 26 bj~BAA8965endogenous 9.1 virus strain 2043 g148507 Wolinella formate dehydrogenase 80 27 succinogenes 2043 112381857Danio rerio c-Maf 78 42 2043 gi~18594822~Homo Sapienszinc finger protein 306 100 21 (KOX 14) reflXP_0929 95.1 2044 13132272 Sus scrofa WT1 homologue 99 47 2044 AAG78446 Homo sapiensMASI Predicted WT1 Wilin's96 45 tumour pol eptide of humans.

2044 AAG62154 Homo SapiensCORI- Human WT1/PSA 96 45 fusion rotein SEQ ID NO: 357.

2046 g121483222Drosophila AT16994p 86 33 melanogaster 2046 g121111736Xanthomonas cell division protein 79 30 campestris pv.

campestris str.

2046 112653493Homo SapiensSimilar to brain acid-soluble79 36 protein 1 2047 ABB 12490Homo SapiensHYSE- Human bone marrow200 83 expressed rotein SEQ ID NO: 329.

2047 gi~20837783~Mus musculussimilar to 40S ribosomal73 35 protein S11 ret~XP_1459 21.1 Table 2 SEQ AccessionSpecies Description Score ID No. Identity NO:

2047 gi~6002932~gStreptomycesglycosyl transferase 71 35 b~AAF00209fradiae .1 CAF ' 2048 AAB59012 Homo SapiensHUMA- Breast and ovarian103 32 cancer associated antigen protein sequence SEQ ID 720.

2048 gi2429362Santalum proline rich rotein 99 31 album 2048 gi17945382Drosophila RE17165p 98 25 melanogaster 2051 gi15625542Hepatitis S antigen 71 31 B virus 2051 gi~4884886~gHepatitis surface antigen 68 30 B virus b~AAD3185 7.1 CAF

2052 AAB28764 Homo SapiensHUMA- Sequence homologous693 78 to protein fragment encoded by gene 21.

2052 gi2065210Mus musculusPro-Pol-dUTPase olyprotein693 78 2052 AAB73606 Homo SapiensSHAN- Human dUTP pyrophosphatase668 77 26.

2053 gi9945983Pseudomonastranscriptional regulator83 34 PcaQ

aeru inosa 2053 gi13874427Homo sa cerebral protein-5 76 35 iens 2053 gi12803205Homo sa CAAX box 1 76 35 iens 2054 gi21307831Aplysia CREB-binding protein 76 26 californica 2054 gi16755887Drosophila guanine nucleotide exchange76 26 factor melano aster 2054 gi~21307831~Aplysia CREB-binding protein 76 26 gb~AAL548californica 59.1) 2055 gi16588389Homo SapiensB lymphocyte activation-related437 71 protein 2055 AAB92981 Homo SapiensHELI- Human protein sequence407 68 SEQ

ID N0:11698.

2055 AAM48325 Homo SapiensSHAN- Human urine receptor398 74 21.23.

2056 gi~2072969~gHomo Sapiensp40 134 47 b~AACS

4.1~

2056 gi~7959889~gHomo SapiensPR02221 123 43 b~AAF71115 .1 CAF

2056 gi~2072974~gHomo Sapiensp40 122 44 b~AACS

7.1 2057 gi19171178Homo Sapiensmetalloprotease disintegrin518 98 16 with thrombospondin type I
motif 2057 gi19171150Homo sa ADAMTS18 rotein 168 35 iens 2057 AAM39212 Homo SapiensHYSE- Human polypeptide 128 76 SEQ ID

NO 2357.

2058 gi~4959869~gMurine leukemiapolymerase 336 50 b~AAD3453virus 6.1 Tahlc:
SEQ AccessionSpecies Description Score ID No. Identity NO:

2058 gi~9630313~rGibbon ape pol polyprotein 331 46 ef~NP_0567leukemia virus 90.1 2058 gi~6723273~dBaboon gag-pol precursor polyprotein329 49 bj~BAA8965endogenous 9.1 ~ virus strain 2059 gi~20546404~Homo Sapienssimilar to nuclear receptor179 91 coactivator ref~XP_1164 4; RET-activating gene 66.1 2060 gi~6731237~gHomo Sapiensmyoferlin 112 79 b~AAF27177 .1 CAF

2060 gi~798799~gbMus musculusimmunoglobulin heavy 72 55 chain ~AAC37713.

1~

2060 gi~20819487~Mus musculussimilar to LYRIC 72 27 ref~XP_1453 57.1 2061 gi415738 Euglena PSII D1- olype tide 75 27 gracilis 2061 gi11491 Euglena 32 kd rotein 75 27 gracilis 2061 gi11488 Euglena 32-Kda thylakoid membrane75 27 acilis protein 2062 gi21360549ArabidopsisAT3g01480/F4P13 3 79 29 thaliana 2062 gi3337366Arabidopsisnodulin-like protein 68 36 thaliana 2063 17959778 Homo sa PR01546 121 42 iens 2063 AAG02639 Homo SapiensGEST Human secreted protein,119 53 SEQ ID

NO: 6720.

2063 AAG02753 Homo SapiensGEST Human secreted protein,110 45 SEQ ID

NO: 6834.

2064 g115077406Antheraea fibroin 109 30 yamamai 2064 AAB82806 Homo SapiensBOST- Human low density 92 24 lipoprotein binding roteiii 2 (LBP-2).

2064 AA001059 Homo SapiensHYSE- Human polypeptide 90 30 SEQ ID

NO 14951.

2065 g1200964 Mus musculusserine 2 ultra hi h sulfur80 30 rotein 2065 1200962 Mus musculusserine 1 ultra high sulfur80 30 protein 2065 AAM99918 Homo SapiensHIJMA- Hurnan polypeptide75 28 SEQ ID

NO 34.

2066 g1544724 Cavia cholecystokinin A receptor;69 29 CCK-A

receptor 2066 g12541920Rattus cholecystokinintype-A 69 29 receptor norvegicus 2066 12114152 Mus musculuscholecystokinin type-A 69 29 receptor 2067 g12828586Pongo pygmaeusBRCA1 73 22 2068 AAM40813 Homo SapiensHYSE- Human polypeptide 75 29 SEQ ID

NO 5744.

2068 AAM39027 Homo SapiensHYSE- Human polypeptide 75 29 SEQ ID

NO 2172.

2068 AAY25768 Homo SapiensHUMA- Human secreted 75 29 protein encoded from gene 58.

2070 11334150 Mus musculusunidentified reading 169 28 frame (first ATG

at os. 210) 2070 gi557822 Saccharomycesmal5, stay len: 1367, 133 20 CAI: 0.3, cerevisiae AMYH_YEAST P08640 GLUCOAMYLASE S1 (EC 3.2.1.3) 2070 gi1304387Saccharomycesglucoamylase 133 20 cerevisiae var.

diastaticus 2071 gi17983056Brucella BETA-HEXOSAMINIDASE A 88 29 melitensis 2071 gi1573917Haemophilus multidrug resistance 81 33 ' protein A (emrA) influenzae Rd 2071 gi17982813Brucella NITROGEN REGULATION 80 26 melitensis PROTEIN NTRB

2073 gi~17532255~Caenorhabditisankyrin and proline rich67 29 domains ref~NP elegans 31.1 2074 gi19919730Homo SapiensBTEBS 704 97 2074 gi13195441Homo sapiensBTE-binding protein 4 478 64 2074 114549656Mus musculusdo amine receptor regulating452 76 factor 2076 AAE17482 Homo SapiensZYMO Human leucine-rich 1326 100 repeat-7 (ZLRR7) rotein.

2076 AAU83190 Homo SapiensZYMO Novel secreted protein1326 100 Z887300G2P.

2076 ABB 11242Homo SapiensHYSE- Human SLIT-2 homologue,568 99 SEQ ID N0:1612.

2077 g118893729Pyrococcus proteaseiv 74 34 furiosus DSM

2077 AAB94745 Homo SapiensHELI- Human protein sequence71 34 SEQ

ID N0:15792.

2077 g116413096Listeria 11n0656 68 35 innocua 2078 g160675 Beet ringspotpolyprotein 75 37 virus 2078 gi~14743288~Homo Sapienssimilar to Alu subfamily92 58 J sequence reflXP contamination warning 0471 entry 91.1 2078 gi~20260801~Beetringspotpolyprotein 75 37 ref~NP_6201virus 13.1 2079 g13834629Mus musculusdiaphanous-related formin;208 67 p134 mDia2 2079 AAG74400 Homo SapiensHUMA- Human colon cancer71 36 antigen rotein SEQ ID N0:5164.

2079 13171906 Homo SapiensDIA-156 roteiii 71 36 2080 g117298315Homo sa ienscandidate tumor suppressor125 100 rotein 2080 g17861733Homo Sapienslow density lipoprotein 125 100 receptor related protein-deleted in tumor 2080 g18926243Mus musculuslow density lipoprotein 90 63 receptor related protein LRP1B/LRP-DIT

2081 g14574224Fundulus multidrug resistance 343 55 transporter heteroclitushomolog 2081 g116304396Pseudopleuronecmultidrug resistance 340 52 transporter-like tes americanusprotein 2081 g13355757Gallus gallus~ ABC transporter protein~ 328 ~ 53 Table 2 SEQ AccessionSpecies Description Score ID No. Identity NO:

2082 gi7532975 bacteriophage phi-8 P10 67 27

Table 3
SEQ ID NO:  Database entry ID  Description  *Results

1059 BL00349 CTF/NF-I proteins. BL00349H 15.70 9.710e-09
1061 DM00215 PROLINE-RICH PROTEIN 3. DM00215 19.43 6.143e-10 29-61; DM00215 19.43 8.322e-09
1062 DM01354 kw TRANSCRIPTASE REVERSE ORF2. DM01354U 12.24 6.092e-12
1063 PR00944 COPPER ION BINDING PROTEIN SIGNATURE PR00944E 9.18 7.132e-09
1076 PD00078 REPEAT PROTEIN ANK NUCLEAR ANKYR. PD00078B 13.14 9.217e-09
1089 PR00308 TYPE I ANTIFREEZE PROTEIN SIGNATURE PR00308C 3.83 8.754e-10
1089 PR00456 RIBOSOMAL PROTEIN P2 SIGNATURE PR00456E 3.06 9.658e-09
1089 PR00341 PRION PROTEIN SIGNATURE PR00341E 3.32 9.898e-09
1099 PR00886 HIGH MOBILITY GROUP (HMG1/HMG2) PROTEIN SIGNATURE PR00886C 11.84 1.141e-12

1107 PR00833POLLEN ALLERGEN POA PR00833H 2.30 3.077e-09 SIGNATURE

1118 BL00472Small cytokines BL00472A 7.45 5.655e-09 (intercrine/chemokine) C-C

subfamily signatur.

1118 PR00655AUXIN BINDING PROTEIN PR00655E 8.06 9.000e-09 SIGNATURE

1119 BL00970Nuclear transition proteinBL00970C 14.80 8.183e-12 2 proteins. 99-136 1119 BL00826MARCKS family roteins. BL00826B 12.51 4.279e-09 1119 BL00348p53 tumor antigen proteins.BL00348F 23.19 5.881e-10 BL00348F 23.19 6.857e-09 1119 PD01457RIBOSOMAL PROTEIN 40S PD01457A 16.51 8.216e-09 FINGER METAL.

1119 BL00752XPA protein. BL00752B 19.17 7.866e-09 BL00752B 19.17 8.979e-09 1119 DM01269303 kw ACTIVATING RAN DM01269A 23.35 9.446e-09 GTPASE ISOZYME.

1124 DM01813EGG-LAYING HORMONE. DM01813A 15.31 5.215e-09 1127 BL00452Guanylate cyclases proteins.BL00452A 17.52 1.170e-09 1131 BL00113Adenylate kinase roteins.BL00113B 20.49 9.897e-09 1162 PD01066PROTEIN ZINC FINGER PD01066 19.43 7.000e-35 FINGER METAL-BINDING
NU.

1163 BL00407Connexins proteins. BL00407B 14.23 9.775e-30 BL00407C 14.61 2.500e-24 1163 PR00206CONNEXIN SIGNATURE PR00206B 13.75 1.957e-24 PR00206A 11.35 6.559e-23 PR00206C 15.16 7.469e-20 1171 PD01066PROTEIN ZINC FINGER PD01066 19.43 8.500e-28 FINGER METAL-BINDING
NU.

1177 DM018031 HERPESVIRUS DM01803C 7.00 7.240e-09 GLYCOPROTEIN H.

1190 PR00774GUANYLIN PRECURSOR PR00774A 6.49 8.579e-10 SIGNATURE

1195 PD02059CORE POLYPROTEIN PROTEINPD02059C 21.58 8.031 e-09 100-140 GAG CONTAINS: P.

1197 BL00472Small cytokines BL00472A 7.45 8.000e-14 (intercrine/chemokine) C-C

subfamily signatur.

1213 PR00437 SMALL CXC CYTOKINE FAMILY SIGNATURE PR00437C 14.85 1.310e-16
1213 BL00471 Small cytokines (intercrine/chemokine) C-x-C subfamily signat. BL00471 23.92 7.960e-10

1216 PR00308TYPE I ANTIFREEZE PROTEINPR00308C 3.83 5.208e-09 SIGNATURE

1222 PF00852Fucosyl transferase. PF00852F 15.97 1.409e-15 1224 BL00299Ubi uitin domain roteins.BL00299 28.84 6.301e-11 1230 PR00540MUSCARINIC M3 RECEPTOR PR00540A 10.24 7.174e-09 SIGNATURE

1240 BL00290Immunoglobulins and BL00290A 20.89 7.480e-10 major 160-182 histocompatibility complexBL00290B 13.17 2.875e-09 roteins. 226-243 1258 PR00792PEPSIN (Al) ASPARTIC PR00792A 11.54 5.500e-18 PROTEASE FAMILY SIGNATURE

1258 BL00141Eukaryotic and viral BL00141A 12.10 4.789e-15 aspartyl 87-102 proteases roteins. BL00141B 12.14 2.929e-10 1300 BL00616Histidine acid phosphatasesBL00616A 11.86 1.000e-09 phos hohistidine proteins.

1301 DM014176 kw INDUCING XPMC2 DM01417C 12.93 9.325e-12 MUSHROOM SPAC22G7.04. DM01417D 11.08 9.820e-12 1302 PR00049WILM'S TUMOUR PROTEIN PR00049D 0.00 6.067e-11 SIGNATURE

1311 BL00926Lysyl oxidase copper-bindingBL00926B 13.84 7.453e-09 region 84-121 roteins.

1320 PR00830ENDOPEPTIDASE LA (LON) PR00830A 8.41 3.712e-09 SER1NE PROTEASE (S16) SIGNATURE

1325 BL00048Protamine P1 proteins. BL00048 6.39 4.671e-10 BL00048 6.39 4.908e-10 BL00048 6.39 2.913e-09 BL00048 6.39 5.950e-09 1345 PF00424REV protein (anti-repressionPF00424A 14.34 2.436e-09 transactivator protein).

1345 BL00048Protamine P1 proteins. BL00048 6.39 4.553e-10 BL00048 6.39 6.513e-09 1353 DM01354kw TRANSCRIPTASE REVERSEDM01354U 12.24 2.857e-15 ORF2.

1363 PF00850Histone deacetylase PF00850B 10.13 5.154e-14 family. 95-109 PF00850C 14.55 9.063e-11 1389 PR00833POLLEN ALLERGEN POA PR00833H 2.30 6.423e-09 SIGNATURE

1389 PD00306PROTEIN GLYCOPROTE1N PD00306B 5.57 7.000e-09 PRECURSOR RE.

1396 BL00427Disinte ins roteins. BL00427 13.93 7.698e-17 1396 PR00289DISINTEGR1N SIGNATURE PR00289A 13.62 5.667e-14 1416 BL00419Photosystem I psaA and BL00419B 22.23 9.489e-09 psaB 18-51 roteins.

1434 PF00075RNase H. PF00075I 16.21 7.375e-11 1440 BL00598Chromo domain proteins.BL00598 14.45 1.500e-15 1440 PR00504CHROMODOMA1N SIGNATURE PR00504B 9.12 5.200e-13 PR00504C 11.19 6.510e-09 1450 PF00622Domain in SPla and the PF00622B 21.00 2.227e-09 RYanodine 93-114 Rece tor.

1451 PD02935FATTY ACID PD02935C 16.62 4.375e-16 OXIDOREDUCTASE BIOSYNT.

1467 BL00479 Phorbol esters / diacylglycerol binding domain proteins. BL00479A 19.86 3.000e-11; BL00479B 12.57 3.340e-10
1468 PF00992 Troponin. PF00992A 16.67 5.563e-10
1468 BL00795 Involucrin proteins. BL00795C 17.06 3.600e-09
1468 PR00042 FOS TRANSFORMING PROTEIN SIGNATURE PR00042D 8.97 7.554e-09

1474 BL00107Protein kinases ATP-bindingBL00107A 18.39 9.308e-12 region 62-92 proteins.

1474 PR00109TYROSINE KINASE CATALYTICPR00109B 12.27 1.563e-09 DOMAIN SIGNATURE

1474 BL00239Receptor tyrosine kinaseBL00239C 18.75 4.205e-09 class II 49-71 proteins.

1475 BL00456Sodiuxnaolute symporterBL00456C 24.55 4.886e-28 family 15-69 proteins.

1480 BL00983L -6 / u-PAR domain BL00983C 12.69 1.346e-09 roteins. 36-51 1482 BL00979G-protein coupled receptorsBL00979A 19.66 9.633e-12 family 3 74-121 roteins.

1502 PD02561DETHIOBIOTIN SYNTHETASEPD02561B 12.71 9.308e-09 SYNTHASE.

1506 BL00297Heat shock hsp70 proteinsBL00297H 15.46 9.625e-23 family 302-355 proteins. BL00297D 11.95 6.063e-21 BL00297E 18.56 6.077e-21 BL00297C 9.51 9.667e-15 1506 PR0030170 KD HEAT SHOCK PROTEINPR00301I 12.76 3.208e-11 SIGNATURE

1513 PR00130DNASE I SIGNATURE PR00130E 14.66 5.046e-09 1515 DM012423 THREONINE--TRNA LIGASE.DM01242A 20.32 5.286e-20 1517 BL00983Ly-6 l u-PAR domain BL00983B 8.19 5.935e-10 roteins. 40-49 1520 BL00415S a sins proteins. BL00415P 2.37 3.914e-10 1520 PR00049WILM'S TUMOUR PROTEIN PR00049D 0.00 3.746e-09 SIGNATURE PR00049D 0.00 1.000e-08 1530 PF00075RNase H. PF00075F 12.87 5.500e-10 1537 PR00463E-CLASS P450 GROUP I PR00463F 17.63 5.219e-13 SIGNATURE PR00463A 11.40 8.714e-12 PR00463B 17.50 5.041e-10 1537 PR00385P450 SUPERFAMILY PR00385C 16.94 6.318e-09 SIGNATURE

1538 PR00709AVIDIN SIGNATURE PR00709A 4.60 5.585e-09 1553 DM01354kw TRANSCRIPTASE REVERSEDM01354Y 10.69 6.423e-16 ORF2.

1558 PD01066PROTEIN ZINC FINGER PD01066 19.43 6.400e-25 FINGER METAL-BINDING
NU.

1564 PF00589Phage integrase family.PF00589B 16.17 1.621e-11 PF00589C 14.62 9.609e-10 1566 BL00908Mandelate racemase / BL00908B 37.71 6.455e-13 muconate 191-245 lactonizing enzyme family signa.

1567 PR00702ACRIFLAVIN RESISTANCE PR00702A 14.92 2.421e-25 PROTEIN FAMILY SIGNATUREPR00702B 12.77 9.690e-18 1570 BL01047Heavy-metal-associated BL01047A 13.50 5.125e-17 domain 75-97 proteins.

1575 DM01354kw TRANSCRIPTASE REVERSEDM01354U 12.24 9.429e-15 ORF2.

1606 PF00642Zinc finger C-x8-C-x5-C-x3-HPF00642 11.59 2.575e-11 type 197-207 (and similar).

1610 DM01354 kw TRANSCRIPTASE REVERSE ORF2. DM01354I 15.55 7.702e-34; DM01354G 11.57 3.625e-32; DM01354H 18.00 2.528e-23; DM01354F 14.56 4.088e-11
1616 PD02929 ADHESION GLYCOPROTEIN PRECURSORI. PD02929A 28.27 2.263e-25

1627 PR00121 SODIiJM/POTASSITJM- PR00121A 6.71 1.000e-08 TRANSPORTING ATPASE

SIGNATURE

1630 PR00824 HEPATIC LIPASE SIGNATUREPR00824A 7.81 7.214e-22 1640 BL00359 Ribosomal protein L11 BL00359C 22.18 1.155e-11 proteins. 93-126 1641 PR00080 ALCOHOL DEHYDROGENASE PR00080A 9.32 8.839e-10 SUPERFAMILY SIGNATURE

1641 PR00081 GLUCOSE/RIBITOL PR00081A 10.53 2.000e-12 DEHYDROGENASE FAMILY PR00081E 17.54 1.783e-10 SIGNATURE PR00081B 10.38 2.227e-09 1641 BL00061 Short-chain BL00061A 9.41 9.053e-10 dehydrogenases/reductasesBL00061B 25.79 6.860e-09 family 197-234 roteins.

1666 BL01257 Ribosomal protein LlOeBL01257D 18.80 2.973e-15 proteins. 59-98 1667 BL01241 Link domain proteins. BL01241 35.81 8.579e-37 BL01241 35.81 7.835e-14 1667 BL00086 Cytochrome P450 cysteineBL00086 20.87 3.377e-09 heme- 283-314 iron 1i and roteins.

1668 PR00671 INHIBIN BETA B CHAIN PR00671A 8.36 8.088e-09 SIGNATURE

1672 BL00674 AAA-protein family BL00674E 15.24 5.680e-15 proteins. 31-50 1682 PF00075 RNase H. PF00075A 14.44 4.400e-13 PF00075C 11.58 8.442e-09 1689 PD01066 PROTEIN ZINC FINGER PD01066 19.43 6.471 e-27 FINGER METAL-BINDING
NU.

1689 PR00788 NITROPHOR1N SIGNATURE PR00788A 9.79 6.108e-09 1692 BL00299 Ubiquitin domain proteins.BL00299 28.84 4.759e-10 1697 PR00423 CELL DIVISION PROTEIN PR00423E 7.36 4.038e-09 SIGNATURE

1706 BL00795 Involucrin proteins. BL00795C 17.06 5.395e-10 1709 BL00514 Fibrinogen beta and BL00514C 17.41 3.618e-25 gamma chains 68-104 C-terminal domain proteins.BL00514H 14.95 6.745e-16 BL00514G 15.98 6.566e-14 BL00514E 14.28 8.286e-14 BL00514D 15.35 2.915e-12 1714 PF00878 Cation-independent PF00878T 17.51 3.818e-09 mannose-6- 41-67 hos hate receptor re eat roteins.

1715 PF01140 Matrix rotein (MA), PF01140D 15.54 4.872e-09 15. 123-157 1715 PF00992 Troponin. PF00992A 16.67 6.451e-10 PF00992A 16.67 3.724e-09 PF00992A 16.67 6.684e-09 1718 PD02474 SYNTHASE SMALL SUBUNITPD02474B 21.08 7.940e-10 ACETOLACT.

1725 BL00412 Neuromodulin (GAP-43) BL00412B 10.60 1.000e-10 proteins. 46-82 1725 PR00215 NEUROMODULIN SIGNATUREPR00215C 13.98 6.116e-10 1725 DM01688 2 POLY-IG RECEPTOR. DM01688G 16.45 3.160e-09 DM01688I 14.97 6.885e-09 1725 PD02870 RECEPTOR INTERLEUKIN-1PD02870B 18.83 8.564e-09 PRECURSOR.

1727 BL00107 Protein kinases ATP-bindingBL00107A 18.39 7.750e-21 region 185-215 proteins.

1727 PR00109 TYROSINE KINASE CATALYTICPR00109B 12.27 7.176e-12 DOMAIN SIGNATURE

1727 BL00239 Receptor tyrosine kinaseBL00239B 25.15 4.387e-09 class II 119-166 roteins.

1728 BL00415 Synapsins proteins. BL00415Q 2.23 8.115e-09 1734 PD01270 RECEPTOR FC PD01270B 22.18 5.567e-18 IMMUNOGLOBULIN AFFIN. PD01270C 19.54 1.167e-17 PD01270A 17.22 4.960e-14 PD01270D 24.66 4.284e-09 1736 PD02346 PHOTOSYSTEM II PROTEINPD02346A 9.24 8.851e-09 PRECURSOR PHOTOSYNTHESIS.

1741 BL00415 Syna sins proteins. BL00415Q 2.23 6.777e-09 1744 BL00479 Phorbol esters / diacylglycerolBL00479B 12.57 1.000e-08 binding domain proteins.

1750 PR00763 COAGULIN SIGNATURE PR00763B 8.39 6.457e-09 1754 PR00276 INSULIN A CHAIN SIGNATUREPR00276A 11.84 7.840e-09 1755 PR00042 FOS TRANSFORMING PROTEINPR00042D 8.97 2.565e-09 SIGNATURE

1755 PF00922 Vesiculovirus hospho PF00922A 19.17 5.759e-09 rotein. 99-132 1778 PR00245 OLFACTORY RECEPTOR PR00245A 18.03 9.836e-14 SIGNATURE PR00245C 7.84 1.540e-13 PR00245B 10.38 2.125e-13 1778 BL00237 G-protein coupled receptorsBL00237A 27.68 1.474e-12 proteins. 90-129 1778 PR00534 MELANOCORTIN RECEPTOR PR00534A 11.49 4.729e-09 FAMILY SIGNATURE

1778 PR00237 RHODOPSIN-LIFE GPCR PR00237A 11.48 3.613e-09 SUPERFAMILY SIGNATURE PR00237C 15.69 7.525e-09 1787 PR00007 COMPLEMENT C1Q DOMAIN PR00007B 14.16 5.114e-15 SIGNATURE PR00007A 19.33 7.052e-10 1787 PR00524 CHOLECYSTOKININ TYPE PR00524F 5.36 4.351e-09 RECEPTOR SIGNATURE

1787 DM00250 kw ANNEXIN ANTIGEN DM00250B 13.84 6.595e-09 PROLINE TUMOR.

1787 BL00415 Syna sins roteins. BL00415N 4.29 7.372e-09 1787 BL01113 Clq domain proteins. BL01113B 18.26 3.786e-23 BL01113A 17.99 7.968e-15 BL01113A 17.99 5.091e-14 BL01113A 17.99 5.295e-11 BL01113A 17.99 8.568e-11 BL01113A 17.99 8.977e-11 BL01113A 17.99 4.635e-09 BL01113A 17.99 6.192e-09 BL01113A 17.99 7.750e-09 1787 BL00420 Speract receptor repeatBL00420A 20.42 8.691 proteins e-11 73-101 domain proteins. BL00420A 20.42 9.673e-11 BL00420A 20.42 2.180e-10 BL00420A 20.42 8.062e-09 1789 DM01930 2 kw FINGER SMCX SMCY DM01930E 15.41 2.964e-33 YDR096W.

1795 DM01688 2 POLY-IG RECEPTOR. DM01688I 14.97 7.480e-10 DM01688J 14.69 4.455e-09 1796 PFO0075 RNase H. PF00075J 15.78 4.115e-13 1802 PD00066 PROTEIN ZINC-FINGER PD00066 13.92 4.130e-11 BINDI.

1802 BL00028 Zinc finger, C2H2 type,BL00028 16.07 1.600e-10 domain 110-126 proteins. BL00028 16.07 6.100e-10 1802 PR00048 C2H2-TYPE ZINC FINGER PR00048B 6.02 9.438e-10 SIGNATURE

1812 | PD00078 | REPEAT PROTEIN ANK NUCLEAR ANKYR. | PD00078B 13.14 4.130e-09
1824 | PF00628 | PHD-finger. | PF00628 15.84 5.500e-13
1833 | PF00075 | RNase H. | PF00075B 12.56 4.732e-10
1833 | PR00939 | C2HC-TYPE ZINC-FINGER SIGNATURE | PR00939A 8.95 3.045e-09
1842 | PR00833 | POLLEN ALLERGEN POA SIGNATURE | PR00833H 2.30 3.192e-09
1844 | BL00972 | Ubiquitin carboxyl-terminal hydrolases family 2 proteins. | BL00972D 22.55 3.348e-11
1857 | PF00424 | REV protein (anti-repression transactivator protein). | PF00424A 14.34 8.085e-09
1860 | PR00221 | CAULIMOVIRUS COAT PROTEIN SIGNATURE | PR00221H 12.82 2.410e-09
1864 | BL01282 | BIR repeat proteins. | BL01282B 30.49 1.136e-10
1866 | BL00155 | Cutinase, serine proteins. | BL00155D 26.87 5.337e-09
1895 | PF00075 | RNase H. | PF00075F 12.87 7.353e-10
1911 | BL00983 | Ly-6 / u-PAR domain proteins. | BL00983C 12.69 6.365e-09 101-116
1911 | BL00272 | Snake toxins proteins. | BL00272C 8.27 1.000e-08
1925 | PR00308 | TYPE I ANTIFREEZE PROTEIN SIGNATURE | PR00308A 5.90 6.795e-11; PR00308C 3.83 2.385e-10
1925 | PR00456 | RIBOSOMAL PROTEIN P2 SIGNATURE | PR00456E 3.06 9.438e-10
1925 | PR00833 | POLLEN ALLERGEN POA SIGNATURE | PR00833H 2.30 6.654e-09
1930 | DM00179 | kw KINASE ALPHA ADHESION CELL. | DM00179 13.97 5.263e-10
1935 | PF00075 | RNase H. | PF00075J 15.78 2.309e-12
1940 | PF00075 | RNase H. | PF00075F 12.87 3.864e-09
1952 | PR00019 | LEUCINE-RICH REPEAT SIGNATURE | PR00019B 11.36 3.250e-10; PR00019A 11.19 5.667e-09
1954 | BL00546 | Matrixins cysteine switch. | BL00546A 19.62 8.105e-30
1954 | BL00023 | Type II fibronectin collagen-binding domain proteins. | BL00023 24.31 4.682e-35 340-376; BL00023 24.31 2.969e-28; BL00023 24.31 9.526e-24
1954 | PR00138 | MATRIXIN SIGNATURE | PR00138B 15.82 5.500e-18; PR00138A 15.14 8.773e-16
1954 | BL00024 | Hemopexin domain proteins. | BL00024B 21.53 9.591e-33; BL00024A 11.49 2.800e-13; BL00024C 22.98 7.796e-11
1954 | PR00013 | FIBRONECTIN TYPE II SIGNATURE | PR00013C 12.29 1.000e-20; PR00013C 12.29 3.571e-15; PR00013C 12.29 7.800e-14; PR00013A 12.26 5.500e-13; PR00013B 14.75 1.237e-11; PR00013B 14.75 4.000e-09; PR00013A 12.26 5.333e-09; PR00013A 12.26 7.833e-09
1957 | BL01182 | Glycosyl hydrolases family 35 proteins. | BL01182A 21.39 3.357e-34 77-119
1957 | PR00742 | GLYCOSYL HYDROLASE FAMILY 35 SIGNATURE | PR00742B 15.52 2.653e-14; PR00742A 13.75 6.914e-10
1958 | PR00449 | TRANSFORMING PROTEIN RAS SIGNATURE | PR00449A 13.20 8.200e-15
1964 | PR00727 | BACTERIAL LEADER PEPTIDASE 1 (S26) FAMILY SIGNATURE | PR00727A 12.93 7.000e-09

1965 | PF00075 | RNase H. | PF00075D 10.71 7.188e-09
1966 | PF00075 | RNase H. | PF00075C 11.58 9.786e-11; PF00075B 12.56 1.878e-10
1968 | DM00892 | 3 RETROVIRAL PROTEINASE. | DM00892C 23.55 4.082e-11
1970 | PF00075 | RNase H. | PF00075J 15.78 8.571e-10
1973 | PF00589 | Phage integrase family. | PF00589B 16.17 1.450e-14
1974 | BL00675 | Sigma-54 interaction domain proteins ATP-binding region A proteins. | BL00675B 24.07 1.000e-24 118-172; BL00675C 13.51 6.400e-24 183-210; BL00675D 12.03 1.750e-09
1987 | PR00153 | CYCLOPHILIN PEPTIDYL-PROLYL CIS-TRANS ISOMERASE SIGNATURE | PR00153B 11.57 1.500e-17; PR00153A 12.98 4.255e-10
1987 | BL00170 | Cyclophilin-type peptidyl-prolyl cis-trans isomerase signatur. | BL00170B 20.97 6.250e-33 47-86; BL00170A 17.08 2.309e-09
1998 | PD01066 | PROTEIN ZINC FINGER FINGER METAL-BINDING NU. | PD01066 19.43 7.750e-37; PD01066 19.43 8.863e-11 68-106
1999 | PF00992 | Troponin. | PF00992A 16.67 3.487e-09
1999 | BL00224 | Clathrin light chain proteins. | BL00224B 16.94 7.055e-09 96-148
1999 | BL00422 | Granins proteins. | BL00422C 16.18 8.059e-09
2001 | BL00019 | Actinin-type actin-binding domain proteins. | BL00019B 13.34 7.158e-14 261-283
2001 | DM01354 | kw TRANSCRIPTASE REVERSE ORF2. | DM01354U 12.24 3.500e-13
2008 | PD01719 | PRECURSOR GLYCOPROTEIN SIGNAL RE. | PD01719A 12.89 3.483e-16
2011 | BL00282 | Kazal serine protease inhibitors family proteins. | BL00282 16.88 6.577e-10 127-149
2011 | BL00222 | Insulin-like growth factor binding proteins. | BL00222B 11.09 6.940e-10 74-89
2011 | BL00621 | Tissue factor proteins. | BL00621A 8.69 6.473e-09
2012 | PD02563 | PROTEIN NONSTRUCTURAL VP18. | PD02563C 13.51 9.634e-10
2013 | PR00124 | ATP SYNTHASE C SUBUNIT SIGNATURE | PR00124A 8.81 5.655e-09
2013 | PR00783 | MAJOR INTRINSIC PROTEIN FAMILY SIGNATURE | PR00783C 13.54 8.981e-09
2034 | PF00075 | RNase H. | PF00075F 12.87 6.523e-09
2037 | BL00326 | Tropomyosins proteins. | BL00326D 8.76 9.327e-09
2048 | PR00671 | INHIBIN BETA B CHAIN SIGNATURE | PR00671B 4.29 8.767e-10
2052 | PD02455 | ELEMENT TRANSPOSABLE INSERTION PROTEIN TRANSPOSITION DNA. | PD02455C 29.23 5.230e-09 225-276
2058 | PF00075 | RNase H. | PF00075J 15.78 9.000e-10
_ | PD00066 | PROTEIN ZINC-FINGER BINDI. | PD00066 13.92 4.000e-13
2074 | PR00048 | C2H2-TYPE ZINC FINGER SIGNATURE | PR00048B 6.02 4.462e-11; PR00048B 6.02 1.000e-10; PR00048A 10.52 9.609e-10
2074 | BL00028 | Zinc finger, C2H2 type, domain proteins. | BL00028 16.07 9.100e-13 104-120; BL00028 16.07 1.000e-08
2076 | PR00019 | LEUCINE-RICH REPEAT SIGNATURE | PR00019A 11.19 1.900e-11

* Results include in order: Accession No., subtype, e-value, and amino acid position of the signature in the corresponding polypeptide

Table 4
SEQ ID NO: | Pfam Model | Description | E-value | Score | No. of Pfam Domains | Position of the Domain
1050 | FAA_hydrolase | Fumarylacetoacetate (FAA) hydrolase fam | 0.64 | -89.1 | 1 | 22-143
1066 | rubredoxin | Rubredoxin | 7.2 | -11.1 | 1 | 4-37
1076 | ank | Ankyrin repeat | 0.01 | 22.5 | 1 | 25-57
1076 | sodfe_C | Iron/manganese superoxide dismutases, C-term | 3.9 | -67.9 | 1 | 38-124
1076 | DUF232 | Putative transcriptional regulator | 8.1 | -29.1 | 1 | 134-254
1099 | HMG_box | HMG (high mobility group) box | 8 | -22.4 | 1 | 17-61
1109 | UPAR_LY6 | u-PAR/Ly-6 domain | 0.21 | -6.2 | 1 | 34-112
1110 | ldl_recept_a | Low-density lipoprotein receptor domain | 8.8e-07 | 36.0 | 1 | 196-240
1110 | CUB | CUB domain | 0.38 | -27.8 | 1 | 52-161
1118 | rvt | Reverse transcriptase | 0.95 | -46.1 | 1 | 38-207
1125 | adenylatekinase | Adenylate kinase | 0.00037 | -77.6 | 1 | 13-103
1162 | KRAB | KRAB box | 1.1e-23 | 92.1 | 1 | 22-62
1163 | connexin | Connexin | 3.1e-23 | 90.6 | 1 | 1-130
1171 | KRAB | KRAB box | 6.6e-22 | 86.2 | 1 | 33-73
1193 | MHC_I | Class I Histocompatibility antigen, domains | 2e-06 | 1.1 | 1 | 29-205
1209 | DOMON | DOMON domain | 1.9e-12 | 54.8 | 1 | 102-215
1213 | IL8 | Small cytokines (intecrine/chemokine), inter | 0.59 | -7.8 | 1 | 18-
1218 | cys_rich_FGFR | Cysteine rich repeat | 4.4 | -11.0 | 1 | 28-76
1222 | Glyco_transf | Glycosyltransferase family 10 | 6.6e-06 | -54.1 | 1 | 1-322
1240 | ig | Immunoglobulin domain | 1.6e-06 | 35.1 | 2 | 41-124:156-
1258 | asp | Eukaryotic aspartyl protease | 8e-06 | -110.8 | 1 | 19-241
1280 | DOMON | DOMON domain | 8.9 | -16.6 | 1 | 35-117
1288 | PDZ | PDZ domain (Also known as DHR or GLGF) | 1.1 | 0.4 | 1 | 7-73
1301 | Exonuclease | Exonuclease | 3.4e-33 | 123.7 | 1 | 322-479
1311 | Gemini_mov | Geminivirus putative movement protein | 5.7 | -40.5 | 1 | 15-79
1341 | fn3 | Fibronectin type III domain | 6.6e-36 | 132.7 | 2 | 109-200:212-
1345 | Collagen | Collagen triple helix repeat (20 copies) | 7.3 | -65.8 | 1 | 185-243
1365 | Amidase | Amidase | 0.017 | -178.9 | 1 | 68-276
1375 | Galactosyl_T | Galactosyltransferase | 7.1e-44 | 159.2 | 1 | 113-309

1375 | Glyco_transf_25 | Glycosyltransferase family 25 | 3 | -77.1 | 1 | 146-293
1381 | GRAM | GRAM domain | 6.6e-14 | 59.6 | 1 | 65-116
1396 | Pep_M12B_propep | Reprolysin family propeptide | 1.4e-27 | 105.1 | 1 | 75-191
1396 | disintegrin | Disintegrin | 2.6e-10 | 47.7 | 1 | 243-318
1398 | SK_channel | Calcium-activated SK potassium channel | 1.8e-06 | 34.9 | 1 | 1-57
1413 | ig | Immunoglobulin domain | 5.4 | 9.1 | 1 | 29-88
1416 | dUTPase | dUTPase | 0.00044 | 9.6 | 1 | 111-237
1420 | Folate_rec | Folate receptor family | 1.7 | -111.2 | 1 | 14-175
1434 | lectin_c | Lectin C-type domain | 1.5e-05 | 28.0 | 1 | 233-319
1440 | chromo | 'chromo' (CHRromatin Organization Modifier) | 4.6e-11 | 50.2 | 1 | 92-133
1449 | PMSR | Peptide methionine sulfoxide reductase | 0.0089 | -65.8 | 1 | 4-79
1450 | SPRY | SPRY domain | 9e-26 | 99.0 | 1 | 109-240
1451 | MaoC_dehydratas | MaoC like domain | 2.1e-15 | 64.6 | 1 | 31-152
1463 | NTP_transf_2 | Nucleotidyltransferase domain | 2.6e-12 | 54.3 | 1 | 121-234
1467 | DAG_PE-bind | Phorbol esters/diacylglycerol binding dom | 8.7e-05 | 27.4 | 1 | 130-180
1467 | DC1 | DC1 domain | 0.66 | 11.2 | 1 | 141-172
1470 | jmjC | jmjC domain | 0.46 | -18.2 | 1 | 166-262
1474 | pkinase | Protein kinase domain | 0.0019 | -85.7 | 1 | 2-187
1475 | SSF | Sodium:solute symporter family | 0.13 | -177.1 | 1 | 1-311
1478 | dUTPase | dUTPase | 7.6 | -37.5 | 1 | 2-98
1479 | fn3 | Fibronectin type III domain | 1.1e-19 | 78.9 | 1 | 14-100
1485 | rnaseH | RNase H | 0.36 | -28.0 | 1 | 59-175
1488 | NTR | NTR/C345C module | 0.044 | -6.1 | 1 | 293-398
1506 | HSP70 | Hsp70 protein | 1.6e-13 | 38.3 | 1 | 61-424
1517 | UPAR_LY6 | u-PAR/Ly-6 domain | 0.33 | -8.2 | 1 | 44-106
1530 | rnaseH | RNase H | 0.011 | -11.7 | 1 | 64-155
1537 | p450 | Cytochrome P450 | 2.1 | -176.6 | 1 | 31-316
1537 | DNA_ligase_OB | NAD-dependent DNA ligase OB-fold domain | 9.2 | -42.9 | 1 | 200-256
1558 | KRAB | KRAB box | 1.8e-18 | 74.8 | 1 | 68-108
1564 | Phage_integrase | Phage integrase family | 1.2e-09 | 45.5 | 1 | 39-204
1566 | MR_MLE | Mandelate racemase / muconate lactonizing en | 0.00079 | -24.5 | 1 | 153-352
1570 | HMA | Heavy-metal-associated domain | 6.6e-13 | 56.3 | 1 | 71-131
1580 | ig | Immunoglobulin domain | 0.99 | 15.2 | 1 | 23-131
1601 | WD40 | WD domain, G-beta repeat | 2e-08 | 41.5 | 3 | 39-75:83-118:126-
1606 | zf-CCCH | Zinc finger C-x8-C-x5-C-x3-H type | 0.094 | 19.3 | 3 | 105-129:141-173:183-
1612 | zf-CCHC | Zinc knuckle | 2.1e-05 | 31.4 | 2 | 167-184:202-
1618 | rnaseH | RNase H | 6.3e-14 | 59.7 | 1 | 24-144
1618 | Integrase_Zn | Integrase Zinc binding domain | 3.8e-07 | 37.2 | 1 | 146-185
1618 | DUF224 | Domain of unknown function (DUF224) | 9.3 | -7.0 | 1 | 104-186
1641 | adh_short | short chain dehydrogenase | 4.6e-32 | 119.9 | 1 | 42-309
1667 | Xlink | Extracellular link domain | 2.9e-83 | 290.0 | 2 | 162-267:273-
1667 | ig | Immunoglobulin domain | 0.0015 | 25.2 | 1 | 61-145
1682 | rvt | Reverse transcriptase | 3.1e-31 | 117.2 | 1 | 56-238
1683 | Gag_p30 | Gag P30 core shell protein | 2.9e-33 | 124.0 | 1 | 8-197
1689 | KRAB | KRAB box | 4.9e-22 | 86.6 | 1 | 266-306
1692 | ubiquitin | Ubiquitin family | 0.00061 | 26.5 | 1 | 17-91
1709 | fibrinogen_C | Fibrinogen beta and gamma chains, C-term | 7.9e-85 | 295.2 | 1 | 37-255
1713 | HOK_GEF | Hok/gef family | 2.4 | -7.8 | 1 | 7-54
1716 | Gag_p30 | Gag P30 core shell protein | 0.0036 | -49.7 | 1 | 64-229
1721 | rnaseH | RNase H | 0.011 | -11.7 | 1 | 207-350
1722 | dUTPase | dUTPase | 0.37 | -22.9 | 1 | 93-217
1725 | ig | Immunoglobulin domain | 4.2e-13 | 57.0 | 2 | 80-141:259-
1725 | IQ | IQ calmodulin-binding motif | 4.3e-05 | 30.4 | 1 | 49-69
1727 | pkinase | Protein kinase domain | 3e-21 | 84.0 | 1 | 71-267
1728 | Fringe | Fringe-like | 5.9 | -112.6 | 1 | 165-370
1734 | ig | Immunoglobulin domain | 0.014 | 22.0 | 1 | 117-170
1737 | PP2C | Protein phosphatase 2C | 0.0067 | -50.5 | 1 | 37-273
1738 | SH3 | SH3 domain | 1.7e-05 | 31.7 | 1 | 102-159
1740 | rnaseH | RNase H | 0.0042 | -7.3 | 1 | 126-270
1744 | DAG_PE-bind | Phorbol esters/diacylglycerol binding dom | 2.9 | -11.1 | 1 | 26-55
1744 | PHD | PHD-finger | 3.3 | -14.7 | 1 | 9-61
1760 | GARS_N | Phosphoribosylglycinamide synthetase, N | 8.2 | -62.0 | 1 | 35-95

1760 | Armadillo_seg | Armadillo/beta-catenin-like repeat | 9.1 | 8.7 | 2 | 44-84:131-
1778 | 7tm_1 | 7 transmembrane receptor (rhodopsin family) | 1e-12 | 55.7 | 1 | 41-276
1778 | YCF9 | YCF9 | 3.1 | -18.5 | 1 | 203-258
1787 | C1q | C1q domain | 1e-05 | 13.2 | 1 | 111-230
1787 | Collagen | Collagen triple helix repeat (20 copies) | 0.0043 | -3.0 | 1 | 50-107
1789 | jmjC | jmjC domain | 0.00078 | 12.0 | 1 | 52-241
1795 | ig | Immunoglobulin domain | 0.0037 | 23.9 | 1 | 64-141
1796 | rve | Integrase core domain | 2.6e-28 | 107.5 | 1 | 20-174
1802 | zf-C2H2 | Zinc finger, C2H2 type | 6e-15 | 63.1 | 2 | 68-90:108-
1806 | Filamin | Filamin/ABP280 repeat | 0.00054 | 18.6 | 1 | 26-131
1812 | ank | Ankyrin repeat | 3.6e-23 | 90.4 | 3 | 159-191:205-237:244-
1824 | PHD | PHD-finger | 1.1e-12 | 55.6 | 1 | 62-110
1826 | PAP_assoc | PAP/25A associated domain | 1.5e-06 | 35.2 | 1 | 101-155
1827 | ig | Immunoglobulin domain | 1.6 | 13.4 | 1 | 29-102
1830 | RhoGEF | RhoGEF domain | 3.3e-06 | 24.0 | 1 | 110-280
1830 | PH | PH domain | 2.8 | 6.7 | 1 | 356-451
1833 | zf-CCHC | Zinc knuckle | 2.1e-06 | 34.7 | 1 | 137-154
1833 | rvt | Reverse transcriptase | 7.7e-06 | 25.9 | 1 | 84-277
1844 | UCH-2 | Ubiquitin carboxyl-terminal hydrolase family | 0.15 | -8.5 | 1 | 165-238
1846 | Armadillo_seg | Armadillo/beta-catenin-like repeat | 0.28 | 17.7 | 2 | 50-91:92-
1 | zf-CCHC | Zinc knuckle | 3.2e-05 | 30.8 | 1 | 179-196
1864 | zf-C3HC4 | Zinc finger, C3HC4 type (RING finger) | 0.0022 | 23.3 | 1 | 218-256
1887 | ig | Immunoglobulin domain | 4e-08 | 40.4 | 1 | 35-112
1889 | LRR | Leucine Rich Repeat | 0.051 | 20.1 | 1 | 62-85
1 | rnaseH | RNase H | 3.4e-06 | 25.8 | 1 | 47-177
1899 | Brevenin | Brevenin/esculentin/gaegurin/rugosin family | 7.5 | -2.9 | 1 | 1-51
1911 | UPAR_LY6 | u-PAR/Ly-6 domain | 1.3e-06 | 35.4 | 1 | 44-117
1911 | toxin | Snake toxin | 3 | -19.5 | 1 | 66-117
1911 | Activin_recp | Activin types I and II receptor domain | 9.5 | -14.0 | 1 | 30-118
1912 | rvp | Retroviral aspartyl protease | 7 | -26.3 | 1 | 42-142
1913 | SAM | SAM domain (Sterile alpha motif) | 3.9e-13 | 57.1 | 2 | 105-170:183-
1916 | Sema | Sema domain | 1.4e-14 | 54.6 | 1 | 51-434
1926 | PAP2 | PAP2 superfamily | 2.9e-07 | 37.6 | 1 | 48-142
1930 | ig | Immunoglobulin domain | 2.7e-07 | 37.6 | 1 | 41-116
1935 | rve | Integrase core domain | 2.5e-13 | 57.7 | 1 | 1-138
1940 | rnaseH | RNase H | 1.1e-26 | 102.0 | 1 | 24-153
1940 | Integrase_Zn | Integrase Zinc binding domain | 4.7e-12 | 53.5 | 1 | 155-194
1952 | LRRNT | Leucine rich repeat N-terminal domain | 0.0027 | 24.4 | 1 | 67-95
1953 | UQ_con | Ubiquitin-conjugating enzyme | 2.8e-08 | 40.9 | 1 | 78-219
1954 | Peptidase | Matrixin | 6.7e-86 | 298.8 | 1 | 53-212
1954 | fn2 | Fibronectin type II domain | 1e-79 | 278.2 | 3 | 231-272:289-330:347-
1958 | ras | Ras family | 1.9 | -132.0 | 1 | 215-284
1963 | tsp_1 | Thrombospondin type 1 domain | 0.083 | 8.0 | 1 | 20-63
1966 | rvt | Reverse transcriptase | 1.5e-05 | 21.9 | 1 | 2-196
1968 | G-patch | G-patch domain | 0.3 | 6.0 | 1 | 307-352
1968 | rvp | Retroviral aspartyl protease | 1.4 | -19.9 | 1 | 274-385
1970 | rve | Integrase core domain | 0.78 | -16.8 | 1 | 265-395
1973 | Phage_integrase | Phage integrase family | 5.7e-08 | 39.9 | 1 | 1-153
1974 | Sigma54_activat | Sigma-54 interaction domain | 3.1e-37 | 137.2 | 1 | 63-253
1975 | Na_Pi_cotrans | Na+/Pi-cotransporter | 0.0085 | -99.2 | 1 | 1-146
1975 | signal | His Kinase A (phosphoacceptor) domain | 7 | -7.7 | 1 | 85-147
1978 | UPAR_LY6 | u-PAR/Ly-6 domain | 1.8 | -16.0 | 1 | 21-96
1978 | Zn_clus | Fungal Zn(2)-Cys(6) binuclear cluster domain | 5.1 | -5.7 | 1 | 21-60
1987 | pro_isomerase | Cyclophilin type peptidyl-prolyl cis-tr | 1.2e-18 | 75.4 | 1 | 4-171
1997 | zf-CCHC | Zinc knuckle | 1.9e-05 | 31.5 | 2 | 181-198:204-
1997 | TFIID-31 | Transcription initiation factor IID, 31kD su | 7.9 | -633 | 1 | 75-187
1997 | Gag_p12 | Gag polyprotein, inner coat protein p12 | 8.9 | -9.5 | 1 | 155-229
1998 | KRAB | KRAB box | 2e-23 | 91.2 | 1 | 27-65
2001 | CH | Calponin homology (CH) domain | 0.019 | 10.8 | 1 | 230-330
2001 | SAM | SAM domain (Sterile alpha motif) | 0.9 | 6.5 | 1 | 248-311
2008 | tsp_1 | Thrombospondin type 1 domain | 0.013 | 15.1 | 1 | 64-98
2011 | ig | Immunoglobulin domain | 1.7e-05 | 31.7 | 1 | 186-255
2011 | kazal | Kazal-type serine protease inhibitor domain | 0.00028 | 27.6 | 1 | 121-168
2011 | IGFBP | Insulin-like growth factor binding protein | 0.17 | 2.5 | 1 | 53-113
2011 | zf-UBR1 | Putative zinc finger in N-recognin | 8.3 | -24.0 | 1 | 54-112
2015 | PH | PH domain | 0.0002 | 28.1 | 1 | 174-281
2015 | efhand | EF hand | 0.00031 | 27.5 | 1 | 339-367
2018 | RPEL | RPEL repeat | 1.3 | 11.8 | 1 | 25-50
2034 | rnaseH | RNase H | 4e-27 | 103.6 | 1 | 122-267
2038 | granulin | Granulin | 7.7 | -17.8 | 1 | 62-91
2052 | rve | Integrase core domain | 2.6e-24 | 94.2 | 1 | 160-314
2057 | Pep_M12B_propep | Reprolysin family propeptide | 0.44 | -29.3 | 1 | 179-263
2058 | rve | Integrase core domain | 8.7e-14 | 59.2 | 1 | 1-140
2074 | zf-C2H2 | Zinc finger, C2H2 type | 5.5e-22 | 86.5 | 3 | 42-66:72-96:102-
2074 | zf-BED | BED zinc finger | 0.94 | 1.8 | 1 | 91-129
2074 | TP1 | Nuclear transition protein | 7.5 | 2.2 | 1 | 21-76
2076 | LRR | Leucine Rich Repeat | 3.2e-20 | 80.6 | 5 | 57-80:81-104:105-128:129-152:153-
2076 | LRRNT | Leucine rich repeat N-terminal domain | 0.00013 | 28.8 | 1 | 27-55
2076 | LRRCT | Leucine rich repeat C-terminal domain | 0.047 | 18.0 | 1 | 186-234
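
As a rough illustration of how the E-value and domain-position columns of Table 4 can be read, the following Python sketch stores a few rows and keeps only the stronger matches. The `PfamHit` class, the helper names, and the 1e-5 cutoff are assumptions made for this example and are not taken from the application; the multi-range string passed to `parse_positions` at the end is likewise hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PfamHit:
    seq_id: int      # SEQ ID NO of the polypeptide
    model: str       # Pfam model name
    e_value: float   # E-value reported for the match
    score: float     # Pfam score reported for the match
    positions: str   # colon-separated residue ranges, e.g. "22-62"


def parse_positions(positions: str) -> List[Tuple[int, int]]:
    """Split 'start-end:start-end' into integer (start, end) pairs."""
    ranges = []
    for part in positions.split(":"):
        start, end = part.split("-")
        ranges.append((int(start), int(end)))
    return ranges


def significant_hits(hits: List[PfamHit], max_e_value: float = 1e-5) -> List[PfamHit]:
    """Keep hits whose E-value is at or below the (assumed) cutoff."""
    return [hit for hit in hits if hit.e_value <= max_e_value]


# Three rows copied from Table 4 (SEQ ID NO: 1066, 1162 and 1301).
example_hits = [
    PfamHit(1066, "rubredoxin", 7.2, -11.1, "4-37"),
    PfamHit(1162, "KRAB", 1.1e-23, 92.1, "22-62"),
    PfamHit(1301, "Exonuclease", 3.4e-33, 123.7, "322-479"),
]

for hit in significant_hits(example_hits):
    print(hit.seq_id, hit.model, parse_positions(hit.positions))
```

A lower E-value indicates a match less likely to arise by chance, so a filter of this kind would retain the KRAB and Exonuclease rows and drop the rubredoxin row under the assumed cutoff.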
JUMBO APPLICATIONS/PATENTS
THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE VOLUME.

NOTE: For additional volumes, please contact the Canadian Patent Office.

Claims (26)

WHAT IS CLAIMED IS:
1. An isolated polynucleotide comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1-1041.
2. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent hybridization conditions.
3. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said polynucleotide has greater than about 99% sequence identity with the polynucleotide of claim 1.
4. The polynucleotide of claim 1 wherein said polynucleotide is DNA.
5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the complementary sequences.
6. A vector comprising the polynucleotide of claim 1.
7. An expression vector comprising the polynucleotide of claim 1.
8. A host cell genetically engineered to comprise the polynucleotide of claim 1.
9. A host cell genetically engineered to comprise the polynucleotide of claim operatively associated with a regulatory sequence that modulates expression of the polynucleotide in the host cell.
10. An isolated polypeptide, wherein the polypeptide is selected from the group consisting of:
(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and
(b) a polypeptide encoded by a polynucleotide hybridizing under stringent conditions with any one of SEQ ID NO: 1-1041.
11. A composition comprising the polypeptide of claim 10 and a carrier.
12. An antibody directed against the polypeptide of claim 10.
13. A method for detecting the polynucleotide of claim 1 in a sample, comprising:
a) contacting the sample with a compound that binds to and forms a complex with the polynucleotide of claim 1 for a period sufficient to form the complex; and
b) detecting the complex, so that if a complex is detected, the polynucleotide of claim 1 is detected.
14. A method for detecting the polynucleotide of claim 1 in a sample, comprising:
a) contacting the sample under stringent hybridization conditions with nucleic acid primers that anneal to the polynucleotide of claim 1 under such conditions;
b) amplifying a product comprising at least a portion of the polynucleotide of claim 1; and
c) detecting said product and thereby the polynucleotide of claim 1 in the sample.
15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the method further comprises reverse transcribing an annealed RNA molecule into a cDNA polynucleotide.
16. A method for detecting the polypeptide of claim 10 in a sample, comprising:
a) contacting the sample with a compound that binds to and forms a complex with the polypeptide under conditions and for a period sufficient to form the complex; and
b) detecting formation of the complex, so that if a complex formation is detected, the polypeptide of claim 10 is detected.
17. A method for identifying a compound that binds to the polypeptide of claim 10, comprising:
a) contacting the compound with the polypeptide of claim 10 under conditions sufficient to form a polypeptide/compound complex; and
b) detecting the complex, so that if the polypeptide/compound complex is detected, a compound that binds to the polypeptide of claim 10 is identified.
18. A method for identifying a compound that binds to the polypeptide of claim 10, comprising:
a) contacting the compound with the polypeptide of claim 10, in a cell, under conditions sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a reporter gene sequence in the cell; and
b) detecting the complex by detecting reporter gene sequence expression, so that if the polypeptide/compound complex is detected, a compound that binds to the polypeptide of claim 10 is identified.
19. A method of producing the polypeptide of claim 10, comprising:
a) culturing a host cell comprising a polynucleotide sequence selected from the group consisting of any of the polynucleotides from SEQ ID NO: 1-1041, under conditions sufficient to express the polypeptide in said cell; and
b) isolating the polypeptide from the cell culture or cells of step (a).
20. An isolated polypeptide comprising an amino acid sequence selected from the group consisting of any one of the polypeptides SEQ ID NO: 1042-2082.
21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide array.
22. A collection of polynucleotides, wherein the collection comprises at least one of SEQ ID NO: 1-1041.
23. The collection of claim 22, wherein the collection is provided on a nucleic acid array.
24. The collection of claim 23, wherein the array detects full-matches to any one of the polynucleotides in the collection.
25. The collection of claim 23, wherein the array detects mismatches to any one of the polynucleotides in the collection.
26. The collection of claim 22, wherein the collection is provided in a computer-readable format.
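
As an informal illustration of the sequence relationships recited in claims 3, 5, 24 and 25, the following Python sketch computes a reverse complement, an ungapped percent identity, and the mismatch positions between a probe and a target. It is not part of the claims or the specification: the function names and the example sequences are hypothetical, and identity is assumed here to be a simple position-by-position comparison of equal-length sequences, whereas in practice it is usually determined from an alignment.

```python
# Illustrative sketch only; names and sequences are hypothetical.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (cf. claim 5)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity of two equal-length sequences (cf. claim 3).

    Assumption: no alignment is performed; positions are compared one to one.
    """
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def mismatch_positions(probe: str, target: str) -> list[int]:
    """Zero-based positions where probe and target differ (cf. claims 24-25).

    An empty list corresponds to a full match.
    """
    if len(probe) != len(target):
        raise ValueError("probe and target must be of equal length")
    return [
        i
        for i, (x, y) in enumerate(zip(probe.upper(), target.upper()))
        if x != y
    ]


if __name__ == "__main__":
    reference = "ATGCATGCAT"  # made-up sequence, not from the sequence listing
    variant = "ATGCATGCAA"    # differs from the reference at the last base
    print(reverse_complement(reference))
    print(percent_identity(reference, variant))
    print(mismatch_positions(reference, variant))
```

Under these assumptions, a sequence satisfying the "greater than about 99%" threshold of claim 3 would be one for which `percent_identity` against the reference exceeds 99.0; the short example sequences above are too short to meet it with any mismatch.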
CA002456955A 2001-08-09 2002-08-09 Novel nucleic acids and secreted polypeptides Abandoned CA2456955A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US31126101P 2001-08-09 2001-08-09
US60/311,261 2001-08-09
PCT/US2002/025485 WO2003080795A2 (en) 2001-08-09 2002-08-09 Novel nucleic acids and secreted polypeptides

Publications (1)

Publication Number Publication Date
CA2456955A1 (en) 2003-10-02

Family

ID=28454497

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002456955A Abandoned CA2456955A1 (en) 2001-08-09 2002-08-09 Novel nucleic acids and secreted polypeptides

Country Status (4)

Country Link
EP (1) EP1483386A4 (en)
AU (1) AU2002367815A1 (en)
CA (1) CA2456955A1 (en)
WO (1) WO2003080795A2 (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7301016B2 (en) 2000-03-07 2007-11-27 Millennium Pharmaceuticals, Inc. Human transferase family members and uses thereof
US7223580B2 (en) * 2002-03-14 2007-05-29 National Institute Of Advanced Industrial Science And Technology N-acetylglucosamine transferase, nucleic acid encoding the same and use thereof in diagnosing cancer and/or tumor
WO2003091402A2 (en) * 2002-04-23 2003-11-06 University Of Georgia Research Foundation, Inc. N-ACETYLGLUCOSAMINYLTRANSFERASE Vb CODING SEQUENCE, RECOMBINANT CELLS AND METHODS
US20040081653A1 (en) 2002-08-16 2004-04-29 Raitano Arthur B. Nucleic acids and corresponding proteins entitled 251P5G2 useful in treatment and detection of cancer
US9233245B2 (en) 2004-02-20 2016-01-12 Brainsgate Ltd. SPG stimulation
US8055347B2 (en) 2005-08-19 2011-11-08 Brainsgate Ltd. Stimulation for treating brain events and other conditions
US8010189B2 (en) 2004-02-20 2011-08-30 Brainsgate Ltd. SPG stimulation for treating complications of subarachnoid hemorrhage
ATE497389T1 (en) * 2005-08-19 2011-02-15 Univ Duke PARACRINE FACTOR H12 OBTAINED FROM STEM CELLS FOR USE IN REDUCING CELL DEATH OR FOR REPAIR OF TISSUE
US20100196351A1 (en) * 2007-02-15 2010-08-05 Geneswitch Innovations Llc Secreted pate-like proteins
AU2010286940A1 (en) * 2009-08-26 2012-03-08 Immunotope, Inc. Cytotoxic T-lymphocyte-inducing immunogens for prevention, treatment, and diagnosis of cancer
EP2878335B1 (en) 2013-11-10 2018-01-03 Brainsgate Ltd. Implant and delivery system for neural stimulator
US10271907B2 (en) 2015-05-13 2019-04-30 Brainsgate Ltd. Implant and delivery system for neural stimulator
CN106811542B (en) * 2017-03-28 2020-08-21 大连海洋大学 Gene chip for detecting pathogenic vibrio flora in sea cucumber, shrimp and shellfish culture area and use method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2370195A1 (en) * 1999-04-23 2000-11-02 Incyte Genomics, Inc. Human membrane-associated proteins

Also Published As

Publication number Publication date
EP1483386A4 (en) 2005-08-10
WO2003080795A3 (en) 2004-10-07
AU2002367815A8 (en) 2003-10-08
WO2003080795A8 (en) 2004-04-08
AU2002367815A1 (en) 2003-10-08
WO2003080795A2 (en) 2003-10-02
EP1483386A2 (en) 2004-12-08

Similar Documents

Publication Publication Date Title
US20030224379A1 (en) Novel nucleic acids and polypeptides
EP1381621A2 (en) Novel nucleic acids and polypeptides
WO2002022660A2 (en) Novel nucleic acids and polypeptides
WO2003025148A2 (en) Novel nucleic acids and polypeptides
WO2002031111A2 (en) Novel nucleic acids and polypeptides
WO2004009834A2 (en) Novel nucleic acids and secreted polypeptides
EP1261743A2 (en) Novel nucleic acids and polypeptides
EP1341804A2 (en) Novel nucleic acids and polypeptides
WO2004080148A2 (en) Novel nucleic acids and polypeptides
EP1368475A1 (en) Novel nucleic acids and polypeptides
WO2001064834A2 (en) Novel nucleic acids and polypeptides
WO2001053454A9 (en) Methods and materials relating to g protein-coupled receptor-like polypeptides and polynucleotides
WO2002018424A9 (en) Nucleic acids and polypeptides
CA2456955A1 (en) Novel nucleic acids and secreted polypeptides
WO2002016439A2 (en) Novel nucleic acids and polypeptides
WO2002044340A2 (en) Novel nucleic acids and polypeptides
WO2001053453A2 (en) Novel bone marrow nucleic acids and polypeptides
WO2003025142A2 (en) Novel nucleic acids and secreted polypeptides
WO2002077180A2 (en) Novel nucleic acids and polypeptides
WO2001087917A1 (en) Novel nucleic acids and polypeptides
EP1268762A1 (en) Novel nucleic acids and polypeptides
CA2406121A1 (en) Novel nucleic acids and polypeptides
WO2001057187A2 (en) Novel bone marrow nucleic acids and polypeptides
WO2001064840A2 (en) Novel bone marrow nucleic acids and polypeptides
WO2001088091A2 (en) Novel bone marrow nucleic acids and polypeptides

Legal Events

Date Code Title Description
FZDE Discontinued