CA2348824A1 - 12 human secreted proteins - Google Patents

12 human secreted proteins

Info

Publication number
CA2348824A1
Authority
CA
Canada
Prior art keywords
seq
polypeptide
regions
amino acid
protein
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002348824A
Other languages
French (fr)
Inventor
Jian Ni
Steven M. Ruben
Henrik S. Olsen
Paul E. Young
Joseph J. Kenny
Paul A. Moore
Ying-Fei Wei
John M. Greene
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Human Genome Sciences Inc
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14Hydrolases (3)
    • C12N9/48Hydrolases (3) acting on peptide bonds (3.4)
    • C12N9/50Proteinases, e.g. Endopeptidases (3.4.21-3.4.25)
    • C12N9/64Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from animal tissue
    • C12N9/6421Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from animal tissue from mammals
    • C12N9/6424Serine endopeptidases (3.4.21)
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P11/00Drugs for disorders of the respiratory system
    • A61P11/06Antiasthmatics
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P17/00Drugs for dermatological disorders
    • A61P17/02Drugs for dermatological disorders for treating wounds, ulcers, burns, scars, keloids, or the like
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P17/00Drugs for dermatological disorders
    • A61P17/06Antipsoriatics
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P19/00Drugs for skeletal disorders
    • A61P19/02Drugs for skeletal disorders for joint disorders, e.g. arthritis, arthrosis
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P25/00Drugs for disorders of the nervous system
    • A61P25/14Drugs for disorders of the nervous system for treating abnormal movements, e.g. chorea, dyskinesia
    • A61P25/16Anti-Parkinson drugs
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P25/00Drugs for disorders of the nervous system
    • A61P25/18Antipsychotics, i.e. neuroleptics; Drugs for mania or schizophrenia
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P25/00Drugs for disorders of the nervous system
    • A61P25/22Anxiolytics
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P25/00Drugs for disorders of the nervous system
    • A61P25/24Antidepressants
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P25/00Drugs for disorders of the nervous system
    • A61P25/28Drugs for disorders of the nervous system for treating neurodegenerative disorders of the central nervous system, e.g. nootropic agents, cognition enhancers, drugs for treating Alzheimer's disease or other forms of dementia
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P27/00Drugs for disorders of the senses
    • A61P27/02Ophthalmic agents
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P27/00Drugs for disorders of the senses
    • A61P27/02Ophthalmic agents
    • A61P27/06Antiglaucoma agents or miotics
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P29/00Non-central analgesic, antipyretic or antiinflammatory agents, e.g. antirheumatic agents; Non-steroidal antiinflammatory drugs [NSAID]
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P3/00Drugs for disorders of the metabolism
    • A61P3/08Drugs for disorders of the metabolism for glucose homeostasis
    • A61P3/10Drugs for disorders of the metabolism for glucose homeostasis for hyperglycaemia, e.g. antidiabetics
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P31/00Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
    • A61P31/04Antibacterial agents
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P31/00Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
    • A61P31/12Antivirals
    • A61P31/14Antivirals for RNA viruses
    • A61P31/18Antivirals for RNA viruses for HIV
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P33/00Antiparasitic agents
    • A61P33/02Antiprotozoals, e.g. for leishmaniasis, trichomoniasis, toxoplasmosis
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P33/00Antiparasitic agents
    • A61P33/02Antiprotozoals, e.g. for leishmaniasis, trichomoniasis, toxoplasmosis
    • A61P33/04Amoebicides
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P33/00Antiparasitic agents
    • A61P33/02Antiprotozoals, e.g. for leishmaniasis, trichomoniasis, toxoplasmosis
    • A61P33/06Antimalarials
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P35/00Antineoplastic agents
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P35/00Antineoplastic agents
    • A61P35/02Antineoplastic agents specific for leukemia
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P37/00Drugs for immunological or allergic disorders
    • A61P37/02Immunomodulators
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P37/00Drugs for immunological or allergic disorders
    • A61P37/02Immunomodulators
    • A61P37/04Immunostimulants
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P37/00Drugs for immunological or allergic disorders
    • A61P37/02Immunomodulators
    • A61P37/06Immunosuppressants, e.g. drugs for graft rejection
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P43/00Drugs for specific purposes, not provided for in groups A61P1/00-A61P41/00
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P7/00Drugs for disorders of the blood or the extracellular fluid
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P7/00Drugs for disorders of the blood or the extracellular fluid
    • A61P7/06Antianaemics
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61PSPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P9/00Drugs for disorders of the cardiovascular system
    • A61P9/10Drugs for disorders of the cardiovascular system for treating ischaemic or atherosclerotic diseases, e.g. antianginal drugs, coronary vasodilators, drugs for myocardial infarction, retinopathy, cerebrovascula insufficiency, renal arteriosclerosis
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K14/00Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K14/00Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/705Receptors; Cell surface antigens; Cell surface determinants
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K14/00Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/705Receptors; Cell surface antigens; Cell surface determinants
    • C07K14/70503Immunoglobulin superfamily
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K14/00Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/705Receptors; Cell surface antigens; Cell surface determinants
    • C07K14/70546Integrin superfamily
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61KPREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00Medicinal preparations containing peptides

Abstract

The present invention relates to 12 novel human secreted proteins and isolated nucleic acids containing the coding regions of the genes encoding such proteins. Also provided are vectors, host cells, antibodies, and recombinant methods for producing human secreted proteins. The invention further relates to diagnostic and therapeutic methods useful for diagnosing and treating disorders related to these novel human secreted proteins.

Description

JUMBO APPLICATIONS/PATENTS
THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE VOLUME.
THIS IS VOLUME / OF
NOTE: For additional volumes, please contact the Canadian Patent Office.

12 Human Secreted Proteins

Field of the Invention

This invention relates to newly identified polynucleotides and the polypeptides encoded by these polynucleotides, uses of such polynucleotides and polypeptides, and their production.

Background of the Invention

Unlike bacteria, which exist as a single compartment surrounded by a membrane, human cells and other eucaryotes are subdivided by membranes into many functionally distinct compartments. Each membrane-bounded compartment, or organelle, contains different proteins essential for the function of the organelle. The cell uses "sorting signals," which are amino acid motifs located within the protein, to target proteins to particular cellular organelles.
One type of sorting signal, called a signal sequence, a signal peptide, or a leader sequence, directs a class of proteins to an organelle called the endoplasmic reticulum (ER). The ER separates the membrane-bounded proteins from all other types of proteins.
Once localized to the ER, both groups of proteins can be further directed to another organelle called the Golgi apparatus. Here, the Golgi distributes the proteins to vesicles, including secretory vesicles, the cell membrane, lysosomes, and the other organelles.
Proteins targeted to the ER by a signal sequence can be released into the extracellular space as a secreted protein. For example, vesicles containing secreted proteins can fuse with the cell membrane and release their contents into the extracellular space - a process called exocytosis. Exocytosis can occur constitutively or after receipt of a triggering signal. In the latter case, the proteins are stored in secretory vesicles (or secretory granules) until exocytosis is triggered. Similarly, proteins residing on the cell membrane can also be secreted into the extracellular space by proteolytic cleavage of a "linker" holding the protein to the membrane.
SUBSTITUTE SHEET (RULE 26)
Despite the great progress made in recent years, only a small number of genes encoding human secreted proteins have been identified. These secreted proteins include the commercially valuable human insulin, interferon, Factor VIII, human growth hormone, tissue plasminogen activator, and erythropoietin. Thus, in light of the pervasive role of secreted proteins in human physiology, a need exists for identifying and characterizing novel human secreted proteins and the genes that encode them.
This knowledge will allow one to detect, to treat, and to prevent medical disorders by using secreted proteins or the genes that encode them.

Summary of the Invention

The present invention relates to novel polynucleotides and the encoded polypeptides. Moreover, the present invention relates to vectors, host cells, antibodies, and recombinant and synthetic methods for producing the polypeptides and polynucleotides. Also provided are diagnostic methods for detecting disorders and conditions related to the polypeptides and polynucleotides, and therapeutic methods for treating such disorders and conditions. The invention further relates to screening methods for identifying binding partners of the polypeptides.

Detailed Description

The following definitions are provided to facilitate understanding of certain terms used throughout this specification.
In the present invention, "isolated" refers to material removed from its original environment (e.g., the natural environment if it is naturally occurring), and thus is altered "by the hand of man" from its natural state. For example, an isolated polynucleotide could be part of a vector or a composition of matter, or could be contained within a cell, and still be "isolated" because that vector, composition of matter, or particular cell is not the original environment of the polynucleotide. The term "isolated" does not refer to genomic or cDNA libraries, whole cell total or mRNA preparations, genomic DNA
preparations (including those separated by electrophoresis and transferred onto blots), sheared whole cell genomic DNA preparations or other compositions where the art demonstrates no distinguishing features of the polynucleotide/sequences of the present invention.
In the present invention, a "secreted" protein refers to those proteins capable of being directed to the ER, secretory vesicles, or the extracellular space as a result of a signal sequence, as well as those proteins released into the extracellular space without necessarily containing a signal sequence. If the secreted protein is released into the extracellular space, the secreted protein can undergo extracellular processing to produce a "mature" protein. Release into the extracellular space can occur by many mechanisms, including exocytosis and proteolytic cleavage.
In specific embodiments, the polynucleotides of the invention are at least 15, at least 30, at least 50, at least 100, at least 125, at least 500, or at least 1000 continuous nucleotides but are less than or equal to 300 kb, 200 kb, 100 kb, 50 kb, 15 kb, 10 kb, 7.5 kb, 5 kb, 2.5 kb, 2.0 kb, or 1 kb, in length. In a further embodiment, polynucleotides of the invention comprise a portion of the coding sequences, as disclosed herein, but do not comprise all or a portion of any intron. In another embodiment, the polynucleotides comprising coding sequences do not contain coding sequences of a genomic flanking gene (i.e., 5' or 3' to the gene of interest in the genome). In other embodiments, the polynucleotides of the invention do not contain the coding sequence of more than 1000, 500, 250, 100, 50, 25, 20, 15, 10, 5, 4, 3, 2, or 1 genomic flanking gene(s).
As used herein, a "polynucleotide" refers to a molecule having a nucleic acid sequence contained in SEQ ID NO:X or the cDNA contained within the clone deposited with the ATCC. For example, the polynucleotide can contain the nucleotide sequence of the full length cDNA sequence, including the 5' and 3' untranslated sequences, the coding region, with or without the signal sequence, the secreted protein coding region, as well as fragments, epitopes, domains, and variants of the nucleic acid sequence.
Moreover, as used herein, a "polypeptide" refers to a molecule having the translated amino acid sequence generated from the polynucleotide as broadly defined.
In the present invention, the full length sequence identified as SEQ ID NO:X
was often generated by overlapping sequences contained in multiple clones (contig analysis).
A representative clone containing all or most of the sequence for SEQ ID NO:X
was deposited with the American Type Culture Collection ("ATCC"). As shown in Table XIII, each clone is identified by a cDNA Clone ID (Identifier) and the ATCC
Deposit Number. The ATCC is located at 10801 University Boulevard, Manassas, Virginia 20110-2209, USA. The ATCC deposit was made pursuant to the terms of the Budapest Treaty on the international recognition of the deposit of microorganisms for purposes of patent procedure.
A "polynucleotide" of the present invention also includes those polynucleotides capable of hybridizing, under stringent hybridization conditions, to sequences contained in SEQ ID NO:X, the complement thereof, or the cDNA within the clone deposited with the ATCC. "Stringent hybridization conditions" refers to an overnight incubation at 42°C in a solution comprising 50% formamide, 5x SSC (750 mM NaCl, 75 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20 µg/ml denatured, sheared salmon sperm DNA, followed by washing the filters in 0.1x SSC at about 65°C.
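The salt concentrations implied by the stringent hybridization and wash steps can be worked out with a short calculation. The following is a minimal sketch, not part of the specification: it assumes the per-1x SSC values follow from the stated 5x SSC composition (750 mM NaCl, 75 mM trisodium citrate), and the function name `ssc_concentrations` is hypothetical.

```python
# Per-1x SSC concentrations in mM, inferred from the text's statement that
# 5x SSC = 750 mM NaCl and 75 mM trisodium citrate (an assumption of this sketch).
SSC_1X_MM = {"NaCl": 150.0, "trisodium citrate": 15.0}


def ssc_concentrations(fold):
    """Return component concentrations (mM) of SSC at the given fold strength."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {name: conc * fold for name, conc in SSC_1X_MM.items()}


if __name__ == "__main__":
    # Hybridization buffer (5x) and stringent wash (0.1x) from the paragraph above.
    for fold in (5, 0.1):
        print(f"{fold}x SSC:", ssc_concentrations(fold))
```

The 0.1x wash is fifty-fold lower in salt than the 5x hybridization buffer, which, together with the higher wash temperature, is what makes the wash step stringent.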
Also contemplated are nucleic acid molecules that hybridize to the polynucleotides of the present invention at lower stringency hybridization conditions.
Changes in the stringency of hybridization and signal detection are primarily accomplished through the manipulation of formamide concentration (lower percentages of formamide result in lowered stringency), salt conditions, or temperature.
For example, lower stringency conditions include an overnight incubation at 37°C in a solution comprising 6X SSPE (20X SSPE = 3M NaCl; 0.2M NaH2PO4; 0.02M EDTA, pH 7.4), 0.5% SDS, 30% formamide, 100 µg/ml salmon sperm blocking DNA; followed by washes at 50°C with 1X SSPE, 0.1% SDS. In addition, to achieve even lower stringency, washes performed following stringent hybridization can be done at higher salt concentrations (e.g. 5X SSC).
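The 6X and 1X SSPE working solutions above are dilutions of the 20X stock whose composition the text gives. As a hedged illustration, the sketch below computes the stock volume and resulting component molarities for a chosen working strength; the function name `dilute_sspe` and the 100 ml example volume are assumptions, and the other buffer components (SDS, formamide, blocking DNA) are ignored.

```python
# 20X SSPE stock composition in mol/L, as stated in the text.
SSPE_20X_M = {"NaCl": 3.0, "NaH2PO4": 0.2, "EDTA": 0.02}


def dilute_sspe(target_fold, final_volume_ml):
    """Return (stock volume in ml, component molarities) for a working SSPE solution.

    Uses the simple dilution relation C1 * V1 = C2 * V2 against the 20X stock.
    """
    if not 0 < target_fold <= 20:
        raise ValueError("target_fold must be in the range (0, 20]")
    if final_volume_ml <= 0:
        raise ValueError("final_volume_ml must be positive")
    fraction = target_fold / 20.0
    stock_ml = final_volume_ml * fraction
    molarities = {name: molar * fraction for name, molar in SSPE_20X_M.items()}
    return stock_ml, molarities


if __name__ == "__main__":
    # Hybridization strength (6X) and wash strength (1X), 100 ml each.
    for fold in (6, 1):
        stock_ml, molarities = dilute_sspe(fold, 100)
        print(f"{fold}X SSPE: {stock_ml:.1f} ml of 20X stock per 100 ml ->", molarities)
```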
Note that variations in the above conditions may be accomplished through the inclusion and/or substitution of alternate blocking reagents used to suppress background in hybridization experiments. Typical blocking reagents include Denhardt's reagent, BLOTTO, heparin, denatured salmon sperm DNA, and commercially available proprietary formulations. The inclusion of specific blocking reagents may require modification of the hybridization conditions described above, due to problems with compatibility.
Of course, a polynucleotide which hybridizes only to polyA+ sequences (such as any 3' terminal polyA+ tract of a cDNA shown in the sequence listing), or to a complementary stretch of T (or U) residues, would not be included in the definition of "polynucleotide," since such a polynucleotide would hybridize to any nucleic acid molecule containing a poly (A) stretch or the complement thereof (e.g., practically any double-stranded cDNA clone generated using oligo dT as a primer).
The polynucleotide of the present invention can be composed of any polyribonucleotide or polydeoxyribonucleotide, which may be unmodified RNA or DNA
or modified RNA or DNA. For example, polynucleotides can be composed of single- and double-stranded DNA, DNA that is a mixture of single- and double-stranded regions, single- and double-stranded RNA, and RNA that is a mixture of single- and double-stranded regions, hybrid molecules comprising DNA and RNA that may be single-stranded or, more typically, double-stranded or a mixture of single- and double-stranded regions. In addition, the polynucleotide can be composed of triple-stranded regions comprising RNA or DNA or both RNA and DNA. A polynucleotide may also contain one or more modified bases or DNA or RNA backbones modified for stability or for other reasons. "Modified" bases include, for example, tritylated bases and unusual bases such as inosine. A variety of modifications can be made to DNA and RNA; thus, "polynucleotide" embraces chemically, enzymatically, or metabolically modified forms.
The polypeptide of the present invention can be composed of amino acids joined to each other by peptide bonds or modified peptide bonds, i.e., peptide isosteres, and may contain amino acids other than the 20 gene-encoded amino acids. The polypeptides may be modified by either natural processes, such as posttranslational processing, or by chemical modification techniques which are well known in the art. Such modifications are well described in basic texts and in more detailed monographs, as well as in a voluminous research literature. Modifications can occur anywhere in a polypeptide, including the peptide backbone, the amino acid side-chains and the amino or carboxyl termini. It will be appreciated that the same type of modification may be present in the same or varying degrees at several sites in a given polypeptide. Also, a given polypeptide may contain many types of modifications. Polypeptides may be branched, for example, as a result of ubiquitination, and they may be cyclic, with or without branching. Cyclic, branched, and branched cyclic polypeptides may result from posttranslational natural processes or may be made by synthetic methods.
Modifications include acetylation, acylation, ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond formation, demethylation, formation of covalent cross-links, formation of cysteine, formation of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation, pegylation, proteolytic processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins such as arginylation, and ubiquitination. (See, for instance, PROTEINS - STRUCTURE AND MOLECULAR PROPERTIES, 2nd Ed., T. E. Creighton, W. H. Freeman and Company, New York (1993); POSTTRANSLATIONAL COVALENT MODIFICATION OF PROTEINS, B. C. Johnson, Ed., Academic Press, New York, pgs. 1-12 (1983); Seifter et al., Meth Enzymol 182:626-646 (1990); Rattan et al., Ann NY Acad Sci 663:48-62 (1992).)
"SEQ ID NO:X" refers to a polynucleotide sequence while "SEQ ID NO:Y" refers to a polypeptide sequence, both sequences identified by an integer specified in Table XIII.
"A polypeptide having biological activity" refers to polypeptides exhibiting activity similar, but not necessarily identical to, an activity of a polypeptide of the present invention, including mature forms, as measured in a particular biological assay, with or without dose dependency. In the case where dose dependency does exist, it need not be identical to that of the polypeptide, but rather substantially similar to the dose-dependence in a given activity as compared to the polypeptide of the present invention (i.e., the candidate polypeptide will exhibit greater activity or not more than about 25-fold less and, preferably, not more than about tenfold less activity, and most preferably, not more than about three-fold less activity relative to the polypeptide of the present invention).

Polynucleotides and Polypeptides of the Invention

FEATURES OF PROTEIN ENCODED BY GENE NO: 1

The translation product of this gene shares sequence homology with a protein from Xenopus laevis that is described as upregulated in response to thyroid hormone in tadpoles, and is thought to be important in the tail resorption process during Xenopus laevis metamorphosis (See Proc. Natl. Acad. Sci. USA (1996 Mar. 5):93(5):1924-9, which is herein incorporated by reference). In addition, the translation product of this gene shares sequence homology with a recently described group of proteins, called hedgehog interacting proteins (HIPs) (See International Publication No. WO98/12326, which is herein incorporated by reference). These proteins bind to hedgehog polypeptides such as Shh and Dhh with high affinity (Kd approx. 1 nM). HIPs exhibit spatially and temporally restricted expression domains indicative of important roles in hedgehog-mediated induction.
They regulate differentiation of neuronal cells, regulate survival of differentiated neuronal cells, proliferation of chondrocytes, proliferation of testicular germ line cells and/or expression of patched or hedgehog genes. The biological activity of this polypeptide is assayed by techniques known in the art, otherwise disclosed herein and as described in International Publication No. WO98/12326, which is herein incorporated by reference.
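The fold-activity tiers in the definition of "a polypeptide having biological activity" (greater activity, or not more than about 25-fold, tenfold, or three-fold less) can be expressed as a simple comparison. This is an illustrative sketch only: the function name `activity_tier` and the tier labels are hypothetical, and because the text qualifies each limit with "about", the hard cut-offs used here are an assumption.

```python
def activity_tier(candidate_activity, reference_activity):
    """Classify a candidate by how many fold less active it is than the reference.

    Both activities must be positive and measured in the same assay and units.
    """
    if candidate_activity <= 0 or reference_activity <= 0:
        raise ValueError("activities must be positive")
    fold_less = reference_activity / candidate_activity  # below 1 means more active
    if fold_less <= 3:
        return "most preferred (not more than about three-fold less)"
    if fold_less <= 10:
        return "preferred (not more than about tenfold less)"
    if fold_less <= 25:
        return "within definition (not more than about 25-fold less)"
    return "outside definition"


if __name__ == "__main__":
    reference = 100.0  # arbitrary assay units, hypothetical
    for candidate in (200.0, 20.0, 5.0, 1.0):
        print(candidate, "->", activity_tier(candidate, reference))
```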
Preferred polypeptides of the invention comprise the following amino acid sequence:
MLRTSTPNLCGGLHCRAPWLSSGILCLCLIFLLGQVGLLQGHPQCLDYGPPFQPP
LHLEFCSDYESFGCCDQHKDRRIAARYWDIMEYFDLKRHELCGDYIKDILCQEC
SPYAAHLYDAENTQTPLRNLPGLCSDYCSAFHSNCHSAISLLTNDRGLQESHGRD
GTRFCHLLDLPDKDYCFPNVLRNDYLNRHLGMVAQDPQGCLQLCLSEVANGLR
NPVSMVHAGDGTHRFFVAEQVGVVWVYLPDGSRLEQPFLDLKNIVLTTPWIGD
ERGFLGLAFHPKFRHNRKFYIYYSCLDKKKVEKIRISEMKVSRADPNKADLKSER
VILEIEEPASNHNGGQLLFGLDGYMYIFTGDGGQAGDPFGLFGNAQNKSSLLGK
VLRIDVNRAGSHGKRYRVPSDNPFVSEPGAHPAIYAYGIRNMWRCAVDRGDPIT
RQGRGRIFCGDVGQNRFEEVDLILKGGNYGWRAKEGFACYDKKLCHNASLDDV
LPIYAYGHAVGKSVTGGYVYRGCESPNLNGLYIFGDFMSGRLMALQEDRKNKK
WKKQDLCLGSTTSCAFPGLISTHSKFIISFAEDEAGELYFLATSYPSAYAPRGSIYK
FVDPSRRAPPGKCKYKPVPVRTKSKRIPFRPLAKTVLDLLKEQSEKAARKSSSAT
LASGPAQGLSEKGSSKKLASPTSSKNTLRGPGTKKKARVGPHVRQGKRRKSLKS
HSGRMRPSAEQKRAGRSLP (SEQ ID NO: 47). Also preferred are polypeptides comprising the mature polypeptide which is predicted to consist of residues 42-724 of the foregoing sequence, and biologically active fragments of the mature polypeptide.
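Residue ranges such as "residues 42-724" are 1-based and inclusive, which differs from the 0-based, end-exclusive slicing most programming languages use. A minimal sketch of the conversion follows; the function name `residue_range` and the toy sequence are hypothetical, and the full SEQ ID NO: 47 sequence is deliberately not reproduced here.

```python
def residue_range(sequence, start, end):
    """Return residues start..end of a sequence, using 1-based inclusive numbering."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("range must satisfy 1 <= start <= end <= len(sequence)")
    return sequence[start - 1:end]


if __name__ == "__main__":
    toy = "MKTLLVAGHPQCLDY"  # hypothetical 15-residue example
    print(residue_range(toy, 5, 9))
    # A range of residues 42-724 spans 724 - 42 + 1 = 683 residues.
    print(len(residue_range("A" * 724, 42, 724)))
```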
Figures 1A-C show the nucleotide (SEQ ID NO:11) and deduced amino acid sequence (SEQ ID NO:29) of this protein.
Figure 2 shows the regions of similarity between the amino acid sequences of SEQ ID NO:29, the Xenopus laevis tail resorption protein (gi|1234787) (SEQ ID NO:48), and the Hedgehog Interacting Protein ("HIP"; gi|AAD31172.1) (SEQ ID NO:49).
Figure 3 shows an analysis of the amino acid sequence of SEQ ID NO: 29.
Alpha, beta, turn and coil regions; hydrophilicity and hydrophobicity;
amphipathic regions; flexible regions; antigenic index and surface probability are shown.
Northern analysis indicates that a 2.5-3.0 kb transcript of this gene is expressed primarily in testes tissue and A549 lung carcinoma tissue, but interestingly is absent from normal lung tissue. This gene is also expressed in osteoarthritis tissue and human fetal tissues.
The present invention provides isolated nucleic acid molecules comprising a polynucleotide encoding the polypeptide having the amino acid sequence shown in Figures 1A-C (SEQ ID NO:29), which was determined by sequencing a cloned cDNA.
The nucleotide sequence shown in Figures 1A-C (SEQ ID NO:11) was obtained by sequencing a cloned cDNA, which was deposited on Nov. 17, 1998 at the American Type Culture Collection, and given Accession Number 203484. The deposited gene is inserted in the pSport plasmid (Life Technologies, Rockville, MD) using the SalI/NotI restriction endonuclease cleavage sites.
The present invention is further directed to fragments of the isolated nucleic acid molecules described herein. By a fragment of an isolated DNA molecule having the nucleotide sequence of the deposited cDNA or the nucleotide sequence shown in SEQ ID NO:11 is intended DNA fragments at least about 15 nt, and more preferably at least about 20 nt, still more preferably at least about 30 nt, and even more preferably, at least about 40 nt in length which are useful as diagnostic probes and primers as discussed herein. Of course, larger fragments 50-1500 nt in length are also useful according to the present invention, as are fragments corresponding to most, if not all, of the nucleotide sequence of the deposited cDNA or as shown in SEQ ID NO:11. By a fragment at least 20 nt in length, for example, is intended fragments which include 20 or more contiguous bases from the nucleotide sequence of the deposited cDNA or the nucleotide sequence as shown in SEQ ID NO:11. In this context "about" includes the particularly recited size, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini. Representative examples of polynucleotide fragments of the invention include, for example, fragments that comprise, or alternatively, consist of, a sequence from about nucleotide 1 to about 50, from about 51 to about 100, from about 101 to about 150, from about 151 to about 200, from about 201 to about 250, from about 251 to about 300, from about 301 to about 350, from about 351 to about 400, from about 401 to about 450, from about 451 to about 500, from about 501 to about 550, and from about 551 to about 570 of SEQ ID NO:11, or the complementary strand thereto, or the cDNA contained in the deposited gene. In this context "about" includes the particularly recited ranges, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini. In additional embodiments, the polynucleotides of the invention encode functional attributes of the corresponding protein.
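The representative fragment ranges recited above tile the first 570 nucleotides of SEQ ID NO:11 in steps of 50, and "about" allows each boundary to shift by up to 5 nucleotides. The following is a minimal illustrative sketch of that arithmetic only; the function names `fragment_windows` and `within_about` and their parameters are assumptions introduced here and are not part of the disclosure.

```python
def fragment_windows(total_length=570, width=50):
    """Return 1-based, inclusive (start, end) ranges that tile 1..total_length."""
    windows = []
    start = 1
    while start <= total_length:
        # The final window is truncated at total_length (551 to 570 here).
        end = min(start + width - 1, total_length)
        windows.append((start, end))
        start = end + 1
    return windows


def within_about(position, recited, tolerance=5):
    """True if a boundary lies within the "about" tolerance of a recited value."""
    return abs(position - recited) <= tolerance


if __name__ == "__main__":
    for start, end in fragment_windows():
        print(f"from about {start} to about {end}")
```

A boundary at nucleotide 48, for example, falls within "about 50" under this reading, while a boundary at nucleotide 44 does not.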
Preferred embodiments of the invention in this regard include fragments that comprise alpha-helix and alpha-helix forming regions ("alpha-regions"), beta-sheet and beta-sheet forming regions ("beta-regions"), turn and turn-forming regions ("turn-regions"), coil and coil-forming regions ("coil-regions"), hydrophilic regions, hydrophobic regions, alpha amphipathic regions, beta amphipathic regions, flexible regions, surface-forming regions and high antigenic index regions. The data representing the structural or functional attributes of the protein set forth in Figure 3 and/or Table I, as described above, was generated using the various modules and algorithms of the DNA*STAR set on default parameters. In a preferred embodiment, the data presented in columns VIII, IX, XIII, and XIV of Table I can be used to determine regions of the protein which exhibit a high degree of potential for antigenicity. Regions of high antigenicity are determined from the data presented in columns VIII, IX, XIII, and/or XIV by choosing values which represent regions of the polypeptide which are likely to be exposed on the surface of the polypeptide in an environment in which antigen recognition may occur in the process of initiation of an immune response.
Certain preferred regions in these regards are set out in Figure 3, but may, as shown in Table I, be represented or identified by using tabular representations of the data presented in Figure 3. The DNA*STAR computer algorithm used to generate Figure 3 (set on the original default parameters) was used to present the data in Figure 3 in a tabular format (See Table I). The tabular format of the data in Figure 3 is used to easily determine specific boundaries of a preferred region. The above-mentioned preferred regions set out in Figure 3 and in Table I include, but are not limited to, regions of the aforementioned types identified by analysis of the amino acid sequence set out in Figures 1A-C (SEQ ID NO:29). As set out in Figure 3 and in Table I, such preferred regions include Garnier-Robson alpha-regions, beta-regions, turn-regions, and coil-regions, Chou-Fasman alpha-regions, beta-regions, and turn-regions, Kyte-Doolittle hydrophilic regions and Hopp-Woods hydrophobic regions, Eisenberg alpha- and beta-amphipathic regions, Karplus-Schulz flexible regions, Jameson-Wolf regions of high antigenic index and Emini surface-forming regions.
Even if deletion of one or more amino acids from the N-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities, ability to multimerize, etc.) may still be retained. For example, the ability of shortened muteins to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptides generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the N-terminus. Whether a particular polypeptide lacking N-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art.
It is not unlikely that a mutein with a large number of deleted N-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the amino terminus of the amino acid sequence shown in Figures 1A-C, up to the alanine residue at position number 524 and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues n1-524 of Figures 1A-C, where n1 is an integer from 1 to 524 corresponding to the position of the amino acid residue in Figures 1A-C (which is identical to the sequence shown as SEQ ID NO:29). N-terminal deletions of the polypeptide of the invention shown as SEQ ID NO:29 include polypeptides comprising the amino acid sequence of residues: V-2 to P-529; A-3 to P-529; Q-4 to P-529; D-5 to P-529; P-6 to P-529; Q-7 to P-529; G-8 to P-529; C-9 to P-529; L-10 to P-529; Q-11 to P-529; L-12 to P-529; C-13 to P-529; L-14 to P-529; S-15 to P-529; E-16 to P-529; V-17 to P-529; A-18 to P-529; N-19 to P-529; G-20 to P-529; L-21 to P-529; R-22 to P-529; N-23 to P-529; P-24 to P-529; V-25 to P-529; S-26 to P-529; M-27 to P-529; V-28 to P-529; H-29 to P-529; A-30 to P-529; G-31 to P-529; D-32 to P-529; G-33 to P-529; T-34 to P-529; H-35 to P-529; R-36 to P-529; F-37 to P-529; F-38 to P-529; V-39 to P-529; A-40 to P-529; E-41 to P-529; Q-42 to P-529; V-43 to P-529; G-44 to P-529; V-45 to P-529; V-46 to P-529; W-47 to P-529; V-48 to P-529; Y-49 to P-529; L-50 to P-529; P-51 to P-529; D-52 to P-529; G-53 to P-529; S-54 to P-529; R-55 to P-529; L-56 to P-529; E-57 to P-529; Q-58 to P-529; P-59 to P-529; F-60 to P-529; L-61 to P-529;
D-62 to P-529; L-63 to P-529; K-64 to P-529; N-65 to P-529; I-66 to P-529; V-67 to P-529; L-68 to P-529; T-69 to P-529; T-70 to P-529; P-71 to P-529; W-72 to P-529; I-73 to P-529; G-74 to P-529; D-75 to P-529; E-76 to P-529; R-77 to P-529; G-78 to P-529; F-79 to P-529; L-80 to P-529; G-81 to P-529; L-82 to P-529; A-83 to P-529; F-84 to P-529;
H-85 to P-529; P-86 to P-529; K-87 to P-529; F-88 to P-529; R-89 to P-529; H-90 to P-529; N-91 to P-529; R-92 to P-529; K-93 to P-529; F-94 to P-529; Y-95 to P-529; I-96 to P-529; Y-97 to P-529; Y-98 to P-529; S-99 to P-529; C-100 to P-529; L-101 to P-529;
D-102 to P-529; K-103 to P-529; K-104 to P-529; K-105 to P-529; V-106 to P-529; E-107 to P-529; K-108 to P-529; I-109 to P-529; R-110 to P-529; I-111 to P-529;
S-112 to P-529; E-113 to P-529; M-114 to P-529; K-115 to P-529; V-116 to P-529; S-117 to P-529; R-118 to P-529; A-119 to P-529; D-120 to P-529; P-121 to P-529; N-122 to P-529;
K-123 to P-529; A-124 to P-529; D-125 to P-529; L-126 to P-529; K-127 to P-529; S-128 to P-529; E-129 to P-529; R-130 to P-529; V-131 to P-529; I-132 to P-529;
L-133 to P-529; E-134 to P-529; I-135 to P-529; E-136 to P-529; E-137 to P-529; P-138 to P-529;
A-139 to P-529; S-140 to P-529; N-141 to P-529; H-142 to P-529; N-143 to P-529; G-144 to P-529; G-145 to P-529; Q-146 to P-529; L-147 to P-529; L-148 to P-529; F-149 to P-529; G-150 to P-529; L-151 to P-529; D-152 to P-529; G-153 to P-529; Y-154 to P-529; M-155 to P-529; Y-156 to P-529; I-157 to P-529; F-158 to P-529; T-159 to P-529; G-160 to P-529; D-161 to P-529; G-162 to P-529; G-163 to P-529; Q-164 to P-529; A-165 to P-529; G-166 to P-529; D-167 to P-529; P-168 to P-529; F-169 to P-529; G-170 to P-529; L-171 to P-529; F-172 to P-529; G-173 to P-529; N-174 to P-529; A-175 to P-529; Q-176 to P-529; N-177 to P-529; K-178 to P-529; S-179 to P-529; S-180 to P-529;
L-181 to P-529; L-182 to P-529; G-183 to P-529; K-184 to P-529; V-185 to P-529; L-186 to P-529; R-187 to P-529; I-188 to P-529; D-189 to P-529; V-190 to P-529;
N-191 to P-529; R-192 to P-529; A-193 to P-529; G-194 to P-529; S-195 to P-529; H-196 to P-529; G-197 to P-529; K-198 to P-529; R-199 to P-529; Y-200 to P-529; R-201 to P-529; V-202 to P-529; P-203 to P-529; S-204 to P-529; D-205 to P-529; N-206 to P-529; P-207 to P-529; F-208 to P-529; V-209 to P-529; S-210 to P-529; E-211 to P-529; P-212 to P-529; G-213 to P-529; A-214 to P-529; H-215 to P-529; P-216 to P-529; A-217 to P-529;
I-218 to P-529; Y-219 to P-529; A-220 to P-529; Y-221 to P-529; G-222 to P-529; I-223 to P-529; R-224 to P-529; N-225 to P-529; M-226 to P-529; W-227 to P-529; R-228 to P-529; C-229 to P-529; A-230 to P-529; V-231 to P-529; D-232 to P-529; R-233 to P-529; G-234 to P-529; D-235 to P-529; P-236 to P-529; I-237 to P-529; T-238 to P-529;
R-239 to P-529; Q-240 to P-529; G-241 to P-529; R-242 to P-529; G-243 to P-529; R-244 to P-529; I-245 to P-529; F-246 to P-529; C-247 to P-529; G-248 to P-529;
D-249 to P-529; V-250 to P-529; G-251 to P-529; Q-252 to P-529; N-253 to P-529; R-254 to P-529; F-255 to P-529; E-256 to P-529; E-257 to P-529; V-258 to P-529; D-259 to P-529;
L-260 to P-529; I-261 to P-529; L-262 to P-529; K-263 to P-529; G-264 to P-529; G-265 to P-529; N-266 to P-529; Y-267 to P-529; G-268 to P-529; W-269 to P-529; R-270 to P-529; A-271 to P-529; K-272 to P-529; E-273 to P-529; G-274 to P-529; F-275 to P-529;
A-276 to P-529; C-277 to P-529; Y-278 to P-529; D-279 to P-529; K-280 to P-529; K-281 to P-529; L-282 to P-529; C-283 to P-529; H-284 to P-529; N-285 to P-529; A-286 to P-529; S-287 to P-529; L-288 to P-529; D-289 to P-529; D-290 to P-529; V-291 to P-529; L-292 to P-529; P-293 to P-529; I-294 to P-529; Y-295 to P-529; A-296 to P-529; Y-297 to P-529; G-298 to P-529; H-299 to P-529; A-300 to P-529; V-301 to P-529; G-302 to P-529; K-303 to P-529; S-304 to P-529; V-305 to P-529; T-306 to P-529; G-307 to P-529; G-308 to P-529; Y-309 to P-529; V-310 to P-529; Y-311 to P-529; R-312 to P-529; G-313 to P-529; C-314 to P-529; E-315 to P-529; S-316 to P-529; P-317 to P-529;
N-318 to P-529; L-319 to P-529; N-320 to P-529; G-321 to P-529; L-322 to P-529; Y-323 to P-529; I-324 to P-529; F-325 to P-529; G-326 to P-529; D-327 to P-529;
F-328 to P-529; M-329 to P-529; S-330 to P-529; G-331 to P-529; R-332 to P-529; L-333 to P-529; M-334 to P-529; A-335 to P-529; L-336 to P-529; Q-337 to P-529; E-338 to P-529; D-339 to P-529; R-340 to P-529; K-341 to P-529; N-342 to P-529; K-343 to P-529; K-344 to P-529; W-345 to P-529; K-346 to P-529; K-347 to P-529; Q-348 to P-529; D-349 to P-529; L-350 to P-529; C-351 to P-529; L-352 to P-529; G-353 to P-529; S-354 to P-529; T-355 to P-529; T-356 to P-529; S-357 to P-529; C-358 to P-529; A-359 to P-529; F-360 to P-529; P-361 to P-529; G-362 to P-529; L-363 to P-529; I-364 to P-529; S-365 to P-529; T-366 to P-529; H-367 to P-529; S-368 to P-529; K-369 to P-529; F-370 to P-529; I-371 to P-529; I-372 to P-529; S-373 to P-529; F-374 to P-529; A-375 to P-529; E-376 to P-529; D-377 to P-529; E-378 to P-529; A-379 to P-529; G-380 to P-529; E-381 to P-529; L-382 to P-529; Y-383 to P-529; F-384 to P-529; L-385 to P-529; A-386 to P-529; T-387 to P-529; S-388 to P-529; Y-389 to P-529; P-390 to P-529; S-391 to P-529; A-392 to P-529; Y-393 to P-529; A-394 to P-529; P-395 to P-529; R-396 to P-529; G-397 to P-529; S-398 to P-529; I-399 to P-529; Y-400 to P-529; K-401 to P-529;
F-402 to P-529; V-403 to P-529; D-404 to P-529; P-405 to P-529; S-406 to P-529; R-407 to P-529; R-408 to P-529; A-409 to P-529; P-410 to P-529; P-411 to P-529; G-412 to P-529;
K-413 to P-529; C-414 to P-529; K-415 to P-529; Y-416 to P-529; K-417 to P-529; P-418 to P-529; V-419 to P-529; P-420 to P-529; V-421 to P-529; R-422 to P-529; T-423 to P-529; K-424 to P-529; S-425 to P-529; K-426 to P-529; R-427 to P-529; I-428 to P-529; P-429 to P-529; F-430 to P-529; R-431 to P-529; P-432 to P-529; L-433 to P-529; A-434 to P-529; K-435 to P-529; T-436 to P-529; V-437 to P-529; L-438 to P-529; D-439 to P-529; L-440 to P-529; L-441 to P-529; K-442 to P-529; E-443 to P-529; Q-444 to P-529; S-445 to P-529; E-446 to P-529; K-447 to P-529; A-448 to P-529; A-449 to P-529; R-450 to P-529; K-451 to P-529; S-452 to P-529; S-453 to P-529; S-454 to P-529; A-455 to P-529; T-456 to P-529; L-457 to P-529; A-458 to P-529; S-459 to P-529; G-460 to P-529; P-461 to P-529; A-462 to P-529; Q-463 to P-529; G-464 to P-529; L-465 to P-529; S-466 to P-529; E-467 to P-529; K-468 to P-529; G-469 to P-529; S-470 to P-529; S-471 to P-529; K-472 to P-529; K-473 to P-529; L-474 to P-529; A-475 to P-529; S-476 to P-529; P-477 to P-529; T-478 to P-529; S-479 to P-529; S-480 to P-529; K-481 to P-529; N-482 to P-529; T-483 to P-529; L-484 to P-529; R-485 to P-529; G-486 to P-529; P-487 to P-529; G-488 to P-529; T-489 to P-529; K-490 to P-529; K-491 to P-529; K-492 to P-529; A-493 to P-529; R-494 to P-529; V-495 to P-529; G-496 to P-529; P-497 to P-529; H-498 to P-529; V-499 to P-529; R-500 to P-529; Q-501 to P-529; G-502 to P-529; K-503 to P-529; R-504 to P-529; R-505 to P-529; K-506 to P-529; S-507 to P-529; L-508 to P-529; K-509 to P-529; S-510 to P-529; H-511 to P-529; S-512 to P-529; G-513 to P-529; R-514 to P-529; M-515 to P-529; R-516 to P-529; P-517 to P-529; S-518 to P-529; A-519 to P-529; E-520 to P-529; Q-521 to P-529; K-522 to P-529; R-523 to P-529; A-524 to P-529; of SEQ ID NO:29. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
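The N-terminal deletion listing above follows a mechanical rule: each entry starts one residue later than the previous one and always ends at the final residue, stopping once six residues remain. The sketch below illustrates that rule only. The function name `n_terminal_deletions`, the `min_length` parameter, and the ten-residue stand-in string (the first ten positions named in the listings of this section) are assumptions chosen for illustration; the full sequence of SEQ ID NO:29 is not reproduced here.

```python
def n_terminal_deletions(sequence, min_length=6):
    """List labels for N-terminal deletion mutants of a one-letter sequence.

    Positions are 1-based. The shortest fragment kept has min_length residues,
    mirroring the listing, which ends with a six-residue fragment.
    """
    last = len(sequence)
    labels = []
    # Deleting residues 1..(n-1) leaves a fragment that starts at position n.
    for n in range(2, last - min_length + 2):
        labels.append(f"{sequence[n - 1]}-{n} to {sequence[-1]}-{last}")
    return labels


if __name__ == "__main__":
    # Hypothetical stand-in: ten residues only, not the full 529-residue protein.
    for label in n_terminal_deletions("MVAQDPQGCL"):
        print(label)
```

For a 529-residue sequence the same loop would run from position 2 through position 524, which matches the span of the listing.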
Also as mentioned above, even if deletion of one or more amino acids from the C-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities (e.g., ability to elicit mitogenic activity, induce differentiation of normal or malignant cells, bind to EGF receptors, etc.)) may still be retained. For example, the ability to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptide generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the C-terminus. Whether a particular polypeptide lacking C-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art. It is not unlikely that a mutein with a large number of deleted C-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the carboxy terminus of the amino acid sequence of the polypeptide shown in Figures 1A-C, up to the glutamine residue at position number 7, and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues 1-m1 of Figures 1A-C, where m1 is an integer from 7 to 528 corresponding to the position of the amino acid residue in Figures 1A-C. Moreover, the invention provides polynucleotides encoding polypeptides comprising, or alternatively consisting of, the amino acid sequences of C-terminal deletions of the polypeptide of the invention shown as SEQ ID NO:29, which include polypeptides comprising the amino acid sequence of residues: M-1 to L-528; M-1 to S-527; M-1 to R-526; M-1 to G-525; M-1 to A-524; M-1 to R-523; M-1 to K-522; M-1 to Q-521; M-1 to E-520; M-1 to A-519; M-1 to S-518; M-1 to P-517; M-1 to R-516; M-1 to M-515; M-1 to R-514; M-1 to G-513; M-1 to S-512; M-1 to H-511; M-1 to S-510; M-1 to K-509; M-1 to L-508; M-1 to S-507; M-1 to K-506; M-1 to R-505; M-1 to R-504; M-1 to K-503; M-1 to G-502; M-1 to Q-501; M-1 to R-500; M-1 to V-499; M-1 to H-498; M-1 to P-497; M-1 to G-496; M-1 to V-495; M-1 to R-494; M-1 to A-493; M-1 to K-492; M-1 to K-491; M-1 to K-490; M-1 to T-489; M-1 to G-488; M-1 to P-487; M-1 to G-486; M-1 to R-485; M-1 to L-484; M-1 to T-483; M-1 to N-482; M-1 to K-481; M-1 to S-480; M-1 to S-479; M-1 to T-478; M-1 to P-477; M-1 to S-476; M-1 to A-475; M-1 to L-474; M-1 to K-473; M-1 to K-472; M-1 to S-471; M-1 to S-470; M-1 to G-469; M-1 to K-468; M-1 to E-467; M-1 to S-466; M-1 to L-465; M-1 to G-464; M-1 to Q-463; M-1 to A-462; M-1 to P-461; M-1 to G-460; M-1 to S-459; M-1 to A-458; M-1 to L-457; M-1 to T-456; M-1 to A-455; M-1 to S-454; M-1 to S-453; M-1 to S-452; M-1 to K-451; M-1 to R-450; M-1 to A-449; M-1 to A-448; M-1 to K-447; M-1 to E-446; M-1 to S-445; M-1 to Q-444; M-1 to E-443; M-1 to K-442; M-1 to L-441; M-1 to L-440; M-1 to D-439; M-1 to L-438; M-1 to V-437; M-1 to T-436; M-1 to K-435; M-1 to A-434; M-1 to L-433; M-1 to P-432; M-1 to R-431; M-1 to F-430; M-1 to P-429; M-1 to I-428; M-1 to R-427; M-1 to K-426; M-1 to S-425; M-1 to K-424; M-1 to T-423; M-1 to R-422; M-1 to V-421; M-1 to P-420; M-1 to V-419; M-1 to P-418; M-1 to K-417; M-1 to Y-416; M-1 to K-415; M-1 to C-414; M-1 to K-413; M-1 to G-412; M-1 to P-411; M-1 to P-410; M-1 to A-409; M-1 to R-408; M-1 to R-407; M-1 to S-406; M-1 to P-405; M-1 to D-404; M-1 to V-403; M-1 to F-402; M-1 to K-401; M-1 to Y-400; M-1 to I-399; M-1 to S-398; M-1 to G-397; M-1 to R-396; M-1 to P-395; M-1 to A-394; M-1 to Y-393; M-1 to A-392; M-1 to S-391; M-1 to P-390; M-1 to Y-389; M-1 to S-388; M-1 to T-387; M-1 to A-386; M-1 to L-385; M-1 to F-384; M-1 to Y-383; M-1 to L-382; M-1 to E-381; M-1 to G-380; M-1 to A-379; M-1 to E-378; M-1 to D-377; M-1 to E-376; M-1 to A-375; M-1 to F-374; M-1 to S-373; M-1 to I-372; M-1 to I-371; M-1 to F-370; M-1 to K-369; M-1 to S-368; M-1 to H-367; M-1 to T-366; M-1 to S-365; M-1 to I-364; M-1 to L-363; M-1 to G-362; M-1 to P-361; M-1 to F-360; M-1 to A-359; M-1 to C-358; M-1 to S-357; M-1 to T-356; M-1 to T-355; M-1 to S-354; M-1 to G-353; M-1 to L-352; M-1 to C-351; M-1 to L-350; M-1 to D-349; M-1 to Q-348; M-1 to K-347; M-1 to K-346; M-1 to W-345; M-1 to K-344; M-1 to K-343; M-1 to N-342; M-1 to K-341; M-1 to R-340; M-1 to D-339; M-1 to E-338; M-1 to Q-337; M-1 to L-336; M-1 to A-335; M-1 to M-334; M-1 to L-333; M-1 to R-332; M-1 to G-331; M-1 to S-330; M-1 to M-329; M-1 to F-328; M-1 to D-327; M-1 to G-326; M-1 to F-325; M-1 to I-324; M-1 to Y-323; M-1 to L-322; M-1 to G-321; M-1 to N-320; M-1 to L-319; M-1 to N-318; M-1 to P-317; M-1 to S-316; M-1 to E-315; M-1 to C-314; M-1 to G-313; M-1 to R-312; M-1 to Y-311; M-1 to V-310; M-1 to Y-309; M-1 to G-308; M-1 to G-307; M-1 to T-306; M-1 to V-305; M-1 to S-304; M-1 to K-303; M-1 to G-302; M-1 to V-301; M-1 to A-300; M-1 to H-299; M-1 to G-298; M-1 to Y-297; M-1 to A-296; M-1 to Y-295; M-1 to I-294; M-1 to P-293; M-1 to L-292; M-1 to V-291; M-1 to D-290; M-1 to D-289; M-1 to L-288; M-1 to S-287; M-1 to A-286; M-1 to N-285; M-1 to H-284; M-1 to C-283; M-1 to L-282; M-1 to K-281; M-1 to K-280; M-1 to D-279; M-1 to Y-278; M-1 to C-277; M-1 to A-276; M-1 to F-275; M-1 to G-274; M-1 to E-273; M-1 to K-272; M-1 to A-271; M-1 to R-270; M-1 to W-269; M-1 to G-268; M-1 to Y-267; M-1 to N-266; M-1 to G-265; M-1 to G-264; M-1 to K-263; M-1 to L-262; M-1 to I-261; M-1 to L-260; M-1 to D-259; M-1 to V-258; M-1 to E-257; M-1 to E-256; M-1 to F-255; M-1 to R-254; M-1 to N-253; M-1 to Q-252; M-1 to G-251; M-1 to V-250; M-1 to D-249; M-1 to G-248; M-1 to C-247; M-1 to F-246; M-1 to I-245; M-1 to R-244; M-1 to G-243; M-1 to R-242; M-1 to G-241; M-1 to Q-240; M-1 to R-239; M-1 to T-238; M-1 to I-237; M-1 to P-236; M-1 to D-235; M-1 to G-234; M-1 to R-233; M-1 to D-232; M-1 to V-231; M-1 to A-230; M-1 to C-229; M-1 to R-228; M-1 to W-227; M-1 to M-226; M-1 to N-225; M-1 to R-224; M-1 to I-223; M-1 to G-222; M-1 to Y-221; M-1 to A-220; M-1 to Y-219; M-1 to I-218; M-1 to A-217; M-1 to P-216; M-1 to H-215; M-1 to A-214; M-1 to G-213; M-1 to P-212; M-1 to E-211; M-1 to S-210; M-1 to V-209; M-1 to F-208; M-1 to P-207; M-1 to N-206; M-1 to D-205; M-1 to S-204; M-1 to P-203; M-1 to V-202; M-1 to R-201; M-1 to Y-200; M-1 to R-199; M-1 to K-198; M-1 to G-197; M-1 to H-196; M-1 to S-195; M-1 to G-194; M-1 to A-193; M-1 to R-192; M-1 to N-191; M-1 to V-190; M-1 to D-189; M-1 to I-188; M-1 to R-187; M-1 to L-186; M-1 to V-185; M-1 to K-184; M-1 to G-183; M-1 to L-182; M-1 to L-181; M-1 to S-180; M-1 to S-179; M-1 to K-178; M-1 to N-177; M-1 to Q-176; M-1 to A-175; M-1 to N-174; M-1 to G-173; M-1 to F-172; M-1 to L-171; M-1 to G-170; M-1 to F-169; M-1 to P-168; M-1 to D-167; M-1 to G-166; M-1 to A-165; M-1 to Q-164; M-1 to G-163; M-1 to G-162; M-1 to D-161; M-1 to G-160; M-1 to T-159; M-1 to F-158; M-1 to I-157; M-1 to Y-156; M-1 to M-155; M-1 to Y-154; M-1 to G-153; M-1 to D-152; M-1 to L-151; M-1 to G-150; M-1 to F-149; M-1 to L-148; M-1 to L-147; M-1 to Q-146; M-1 to G-145; M-1 to G-144; M-1 to N-143; M-1 to H-142; M-1 to N-141; M-1 to S-140; M-1 to A-139; M-1 to P-138; M-1 to E-137; M-1 to E-136; M-1 to I-135; M-1 to E-134; M-1 to L-133; M-1 to I-132; M-1 to V-131; M-1 to R-130; M-1 to E-129; M-1 to S-128; M-1 to K-127; M-1 to L-126; M-1 to D-125; M-1 to A-124; M-1 to K-123; M-1 to N-122; M-1 to P-121; M-1 to D-120; M-1 to A-119; M-1 to R-118; M-1 to S-117; M-1 to V-116; M-1 to K-115; M-1 to M-114; M-1 to E-113; M-1 to S-112; M-1 to I-111; M-1 to R-110; M-1 to I-109; M-1 to K-108; M-1 to E-107; M-1 to V-106; M-1 to K-105; M-1 to K-104; M-1 to K-103; M-1 to D-102; M-1 to L-101; M-1 to C-100; M-1 to S-99; M-1 to Y-98; M-1 to Y-97; M-1 to I-96; M-1 to Y-95; M-1 to F-94; M-1 to K-93; M-1 to R-92; M-1 to N-91; M-1 to H-90; M-1 to R-89; M-1 to F-88; M-1 to K-87; M-1 to P-86; M-1 to H-85; M-1 to F-84; M-1 to A-83; M-1 to L-82; M-1 to G-81; M-1 to L-80; M-1 to F-79; M-1 to G-78; M-1 to R-77; M-1 to E-76; M-1 to D-75; M-1 to G-74; M-1 to I-73; M-1 to W-72; M-1 to P-71; M-1 to T-70; M-1 to T-69; M-1 to L-68; M-1 to V-67; M-1 to I-66; M-1 to N-65; M-1 to K-64; M-1 to L-63; M-1 to D-62; M-1 to L-61; M-1 to F-60; M-1 to P-59; M-1 to Q-58; M-1 to E-57; M-1 to L-56; M-1 to R-55; M-1 to S-54; M-1 to G-53; M-1 to D-52; M-1 to P-51; M-1 to L-50; M-1 to Y-49; M-1 to V-48; M-1 to W-47; M-1 to V-46; M-1 to V-45; M-1 to G-44; M-1 to V-43; M-1 to Q-42; M-1 to E-41; M-1 to A-40; M-1 to V-39; M-1 to F-38; M-1 to F-37; M-1 to R-36; M-1 to H-35; M-1 to T-34; M-1 to G-33; M-1 to D-32; M-1 to G-31; M-1 to A-30; M-1 to H-29; M-1 to V-28; M-1 to M-27; M-1 to S-26; M-1 to V-25; M-1 to P-24; M-1 to N-23; M-1 to R-22; M-1 to L-21; M-1 to G-20; M-1 to N-19; M-1 to A-18; M-1 to V-17; M-1 to E-16; M-1 to S-15; M-1 to L-14; M-1 to C-13; M-1 to L-12; M-1 to Q-11; M-1 to L-10; M-1 to C-9; M-1 to G-8; M-1 to Q-7; of SEQ ID NO:29. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
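The C-terminal deletion listing above always starts at residue 1 and shortens the far end one residue at a time, down to the residue at position 7. As a hedged illustration of that rule, the generator below reproduces the labelling scheme. The name `c_terminal_deletions`, the `min_end` parameter, and the ten-residue stand-in string (the first ten positions named in the listing) are assumptions made for this sketch.

```python
def c_terminal_deletions(sequence, min_end=7):
    """Yield labels for C-terminal deletion mutants, longest fragment first.

    Positions are 1-based. Each fragment starts at residue 1 and ends at
    position m, where m runs from len(sequence) - 1 down to min_end.
    """
    first = sequence[0]
    for m in range(len(sequence) - 1, min_end - 1, -1):
        yield f"{first}-1 to {sequence[m - 1]}-{m}"


if __name__ == "__main__":
    # Hypothetical stand-in: ten residues only, not the full 529-residue protein.
    for label in c_terminal_deletions("MVAQDPQGCL"):
        print(label)
```

Applied to a 529-residue sequence, the generator would produce 522 labels, from the fragment ending at position 528 to the fragment ending at position 7.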
Therefore, polynucleotides and polypeptides of the invention are useful as reagents for differential identification of the tissue(s) or cell type(s) present in a biological sample and for diagnosis of diseases and conditions which include, but are not limited to, developmental disorders and degenerative disorders, osteoarthritis, and lung cancer. Similarly, polypeptides and antibodies directed to these polypeptides are useful in providing immunological probes for differential identification of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, particularly of developing tissues, cartilage, and bone, expression of this gene at significantly higher or lower levels is routinely detected in certain tissues or cell types (e.g., bone, lung, cancerous and wounded tissues) or bodily fluids (e.g., lymph, serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample taken from an individual having such a disorder, relative to the standard gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an individual not having the disorder.
Preferred polypeptides of the present invention comprise immunogenic epitopes shown in SEQ ID NO: 29 as residues: Asp-52 to Glu-57, Arg-89 to Tyr-95, Asp-102 to Glu-107, Ser-117 to Ser-128, Glu-137 to Gly-145, Arg-192 to Arg-199, Val-231 to Gly-243, Val-250 to Glu-256, Arg-312 to Asn-318, Glu-338 to Asp-349, Pro-405 to Lys-417, Thr-423 to Ile-428, Lys-442 to Ser-453, Glu-467 to Ala-475, Thr-478 to Arg-494, Pro-497 to Arg-526. Polynucleotides encoding said polypeptides are also provided.
Many polynucleotide sequences, such as EST sequences, are publicly available and accessible through sequence databases. Some of these sequences are related to SEQ
ID NO:11 and may have been publicly available prior to conception of the present invention. Preferably, such related polynucleotides are specifically excluded from the scope of the present invention. To list every related sequence is cumbersome.
Accordingly, preferably excluded from the present invention are one or more polynucleotides comprising a nucleotide sequence described by the general formula of a-b, where a is any integer between 1 to 2595 of SEQ ID NO:11, b is an integer of 15 to 2609, where both a and b correspond to the positions of nucleotide residues shown in SEQ ID NO:11, and where b is greater than or equal to a + 14.
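The general formula a-b above defines a family of nucleotide ranges through three constraints: a lies between 1 and 2595, b lies between 15 and 2609, and b is at least a + 14. The sketch below simply encodes those constraints so that a candidate range can be checked and the size of the family counted. The function names and keyword parameters are assumptions introduced for illustration.

```python
def is_excluded_range(a, b, last_start=2595, last_end=2609, min_span=14):
    """True if (a, b) satisfies the a-b formula recited for SEQ ID NO:11."""
    return 1 <= a <= last_start and 15 <= b <= last_end and b >= a + min_span


def count_excluded_ranges(last_start=2595, last_end=2609, min_span=14):
    """Count every (a, b) pair allowed by the formula without listing them."""
    total = 0
    for a in range(1, last_start + 1):
        # Smallest permitted end position for this start position.
        lowest_b = max(15, a + min_span)
        if lowest_b <= last_end:
            total += last_end - lowest_b + 1
    return total


if __name__ == "__main__":
    print(is_excluded_range(1, 15))
    print(is_excluded_range(1, 14))
    print(count_excluded_ranges())
```

The count makes concrete why the text describes listing every related sequence individually as cumbersome.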
FEATURES OF PROTEIN ENCODED BY GENE NO: 2
The translation product of this gene, sometimes referred to herein as TIDE (for Ten Integrin Domains with EGF homology), shares sequence homology with integrins, which are a superfamily of dimeric αβ cell-surface glycoproteins that mediate the adhesive functions of many cell types, enabling cells to interact with one another and with the extracellular matrix (See Genomics 56, 169-178 (1999); all information and references contained within this publication are hereby incorporated herein by reference).
Eight human integrin β subunits have been described to date, and in combination with the 12 known α subunits form a large family of heterodimeric cell surface receptors that mediate cell adhesion to counter-receptors on neighboring cells, and to ECM proteins (reviewed by Hynes, 1992). Integrin-ligand interactions are crucial for fundamental biological processes such as cell migration and motility, and lymphocyte extravasation.
In another embodiment, polypeptides comprising the amino acid sequence of the open reading frame upstream of the predicted signal peptide are contemplated by the present invention. Specifically, polypeptides of the invention comprise the following amino acid sequence:
TSTPPRAVPLPKSSQAAHQRNCNSGWSPGPASLGVRGSVCPAICWWHLS
LLPPPSVNPTLQKCSSPGAAQELSMRPPGFRNFLLLASSLLFAGLSAVPQSFSPSLR
SWPGAACRLSRAESERRCRAPGQPPGAALCHGRGRCDCGVCICHVTEPGMFFGP
LCECHEWVCETYDGSTCAGHGKCDCGKCKCDQGWYGDACQYPTNCDLTKKK
SNQMCKNSQDIICSNAGTCHCGRCKCDNSDGSGLVYGKFCECDDRECIDDETEEI
CGGHGKCYCGNCYCKAGWHGDKCEFQCDITPWESKRRCTSPDGKICSNRGTCV
CGECTCHDVDPTGDWGDIHGDTCECDERDCRAVYDRYSDDFCSGHGQCNCGR
CDCKAGWYGKKCEHPQSCTLSAEESIRKCQGSSDLPCSGRGKCECGKCTCYPPG
DRRVYGKTCECDDRRCEDLDGVVCGGHGTCSCGRCVCERGWFGKLCQHPRKC
NMTEEQSKNLCESADGILCSGKGSCHCGKCICSAEEWYISGEFCDCDDRDCDKH
DGLICTGNGICSCGNCECWDGWNGNACEIWLGSEYP (SEQ ID NO:50).
Polynucleotides encoding these polypeptides are also provided.
Included in this invention as preferred domains are EGF-like domain signature 1 and 2 domains, which were identified using the ProSite analysis tool (Swiss Institute of Bioinformatics). A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown [1 to 6] to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulfide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet.
Subdomains between the conserved cysteines strongly vary in length as shown in the following schematic representation of the EGF-like domain:
               +-------------------+        +-------------------------+
               |                   |        |                         |
x(4)-C-x(0,48)-C-x(3,12)-C-x(1,70)-C-x(1,6)-C-x(2)-G-a-x(0,21)-G-x(2)-C-x
     |                   |         ************************************
     +-------------------+
'C': conserved cysteine involved in a disulfide bond.
'G': often conserved glycine.
'a': often conserved aromatic amino acid.
'*': position of both patterns.
'x': any residue.
The region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains. The consensus pattern is as follows: C-x-C-x(5)-G-x(2)-C [The 3 C's are involved in disulfide bonds].
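ProSite-style consensus patterns such as C-x-C-x(5)-G-x(2)-C can be translated mechanically into regular expressions, which makes it straightforward to check whether a given peptide carries the signature. The following is a minimal sketch under stated assumptions: it handles only the pattern elements used in this section (single residues, `x`, bracketed alternatives, braced exclusions, and repeat counts), the function names `prosite_to_regex` and `find_pattern` are hypothetical, and it is not the ProSite analysis tool referred to in the text.

```python
import re

# One pattern element: x, a residue letter, [ABC] or {ABC}, with optional (n) or (m,n).
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a dash-separated ProSite-style pattern into a regex string."""
    parts = []
    for element in pattern.split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError("unsupported pattern element: " + element)
        core, low, high = match.groups()
        if core == "x":
            core = "."                       # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"   # any residue except those listed
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


def find_pattern(pattern, sequence):
    """Return (1-based start, matched text) for every hit, overlapping hits included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    egf_pattern = "C-x-C-x(5)-G-x(2)-C"
    peptide = "GKCDCGKCKCDQGWYGDACQYPTNCDLTK"  # first preferred EGF-like peptide in this section
    print(prosite_to_regex(egf_pattern))
    print(find_pattern(egf_pattern, peptide))
```

The lookahead wrapper is a design choice that reports overlapping hits, which matters for cysteine-rich sequences where candidate start positions sit close together.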
Preferred polypeptides of the invention comprise the following amino acid sequence: GKCDCGKCKCDQGWYGDACQYPTNCDLTK (SEQ ID NO:51), GGHGKCYCGNCYCKAGWHGDKCEFQCDIT (SEQ ID NO:52), HGQCNCGRCDCKAGWYGKKCEHPQSCTLS (SEQ ID NO:53), HGTCSCGRCVCERGWFGKLCQHPRKCNMT (SEQ ID NO:54), GNGICSCGNCECWDGWNGNACEIWLGSEY (SEQ ID NO:55), and ICGGHGKCYCGNCYCKAGWHGDKCEFQCDITPWESK (SEQ ID NO:73).
Polynucleotides encoding these polypeptides are also provided.
Further preferred are polypeptides comprising the EGF-like domain signature 1 and 2 domains of the sequence referenced in Table I for this gene, and at least 5, 10, 15, 20, 25, 30, 50, or 75 additional contiguous amino acid residues of this referenced sequence. The additional contiguous amino acid residues are N-terminal or C-terminal to the EGF-like domain signature 1 and 2 domains.
Alternatively, the additional contiguous amino acid residues are both N-terminal and C-terminal to the EGF-like domain signature 1 and 2 domains, wherein the total N- and C-terminal contiguous amino acid residues equal the specified number. The above preferred polypeptide domain is characteristic of a signature specific to EGF-like domain 1 and 2 containing proteins. Based on the sequence similarity, the translation product of this gene is expected to share at least some biological activities with EGF-like containing proteins. Such activities are known in the art, some of which are described elsewhere herein.
Included in this invention as preferred domains are integrins beta chain cysteine-rich domains, which were identified using the ProSite analysis tool (Swiss Institute of Bioinformatics). Integrins [7,8] are a large family of cell surface receptors that mediate cell to cell as well as cell to matrix adhesion. Some integrins recognize the R-G-D
sequence in their extracellular matrix protein ligand. Structurally, integrins consist of a dimer of an alpha and a beta chain. Each subunit has a large N-terminal extracellular domain followed by a transmembrane domain and a short C-terminal cytoplasmic region.
Some receptors share a common beta chain while having different alpha chains. All the integrin beta chains contain four repeats of a forty amino acid region in the C-terminal extremity of their extracellular domain. Each of the repeats contains eight cysteines. The consensus pattern is as follows: C-x-[GNQ]-x(1,3)-G-x-C-x-C-x(2)-C-x-C [The five C's are probably involved in disulfide bonds].
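As a quick illustration of how the integrin beta chain consensus pattern can be applied, the sketch below writes the pattern directly as a Python regular expression and locates it in two of the cysteine-rich peptides listed in this section. The constant and function names are assumptions made for this example, and the hand translation of the pattern (x as `.`, x(1,3) as `.{1,3}`) is the only logic involved.

```python
import re

# C-x-[GNQ]-x(1,3)-G-x-C-x-C-x(2)-C-x-C written as a regular expression.
INTEGRIN_BETA_SIGNATURE = re.compile(r"C.[GNQ].{1,3}G.C.C.{2}C.C")


def signature_hit(peptide):
    """Return (start, end, matched text) with 1-based inclusive positions, or None."""
    match = INTEGRIN_BETA_SIGNATURE.search(peptide)
    if match is None:
        return None
    # match.end() is exclusive and 0-based, so it equals the 1-based inclusive end.
    return (match.start() + 1, match.end(), match.group())


if __name__ == "__main__":
    peptides = [
        "GQPPGAALCHGRGRCDCGVCICHVTEPGMFFGPLC",
        "ETYDGSTCAGHGKCDCGKCKCDQGWYGDACQYP",
    ]
    for peptide in peptides:
        print(signature_hit(peptide))
```

Positions are reported relative to each peptide, not relative to the full-length protein.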
Preferred polypeptides of the invention comprise the following amino acid sequence: GQPPGAALCHGRGRCDCGVCICHVTEPGMFFGPLC (SEQ ID NO:74), ETYDGSTCAGHGKCDCGKCKCDQGWYGDACQYP (SEQ ID NO:58), MCKNSQDIICSNAGTCHCGRCKCDNSDGSGLVYG (SEQ ID NO:59), IDDETEEICGGHGKCYCGNCYCKAGWHGDKC (SEQ ID NO:60), KRRCTSPDGKICSNRGTCVCGECTCHDVDPTGDW (SEQ ID NO:61), DRYSDDFCSGHGQCNCGRCDCKAGWYGKKCEHPQ (SEQ ID NO:62), CQGSSDLPCSGRGKCECGKCTCYPPGDRRVYGK (SEQ ID NO:63), CEDLDGVVCGGHGTCSCGRCVCERGWFGKLC (SEQ ID NO:64), SADGILCSGKGSCHCGKCICSAEEWYISGEFC (SEQ ID NO:65), and CDKHDGLICTGNGICSCGNCECWDGWNGNACEI (SEQ ID NO:66).
Polynucleotides encoding these polypeptides are also provided.
Further preferred are polypeptides comprising the integrins beta chain cysteine-rich domain of the sequence referenced in Table XIII for this gene, and at least 5, 10, 15, 20, 25, 30, 50, or 75 additional contiguous amino acid residues of this referenced 20 sequence. The additional contiguous amino acid residues is N-terminal or C-terminal to the integrins beta chain cysteine-rich domain.
Alternatively, the additional contiguous amino acid residues is both N-terminal and C-terminal to the integrins beta chain cysteine-rich domain, wherein the total N- and C-terminal contiguous amino acid residues equal the specified number. The above 25 preferred polypeptide domain is characteristic of a signature specific to integrin proteins.
Based on the sequence similarity, the translation product of this gene is expected to share at least some biological activities with integrin proteins, and specifically those containing an integrins beta chain cysteine-rich domain. Such activities are known in the art, some of which are described elsewhere herein. The following publications were referenced above and are hereby incorporated herein by reference: [1] Davis C.G., New Biol. 2:410-419(1990); [2] Blomquist M.C., Hunt L.T., Barker W.C., Proc. Natl. Acad. Sci. U.S.A. 81:7363-7367(1984); [3] Barker W.C., Johnson G.C., Hunt L.T., George D.G., Protein Nucl. Acid Enz. 29:54-68(1986); [4] Doolittle R.F., Feng D.F., Johnson M.S., Nature 307:558-560(1984); [5] Appella E., Weber I.T., Blasi F., FEBS Lett. 231:1-4(1988); [6] Campbell I.D., Bork P., Curr. Opin. Struct. Biol. 3:385-392(1993); [7] Hynes R.O., Cell 48:549-554(1987); and [8] Albelda S.M., Buck C.A., FASEB J. 4:2868-2880(1990).
The polypeptide of the present invention has been putatively identified as a member of the integrin family and has been termed Ten Integrin Domains with EGF homology ("TIDE"). This identification has been made as a result of amino acid sequence homology to the human integrin beta-8 subunit (See Genbank Accession No. gi|184521).
Figures 4A-C show the nucleotide (SEQ ID NO:12) and deduced amino acid sequence (SEQ ID NO:30) of TIDE. Predicted amino acids from about 1 to about 23 constitute the predicted signal peptide (amino acid residues from about 1 to about 23 in SEQ ID NO:30) and are represented by the underlined amino acid regions; amino acids from about 108 to about 136, from about 195 to about 223, from about 291 to about 319, from about 379 to about 407, and/or from about 465 to about 493 constitute the predicted EGF-like domain signature 1 and 2 domains (amino acids from about 108 to about 136, from about 195 to about 223, from about 291 to about 319, from about 379 to about 407, and/or from about 465 to about 493 in SEQ ID NO:30) and are represented by the double underlined amino acids; and amino acids from about 55 to about 89, from about 97 to about 129, from about 142 to about 175, from about 186 to about 216, from about 228 to about 261, from about 281 to about 314, from about 327 to about 359, from about 368 to about 398, from about 417 to about 448, and/or from about 455 to about 487 constitute the predicted integrins beta chain cysteine-rich domains (amino acids from about 55 to about 89, from about 97 to about 129, from about 142 to about 175, from about 186 to about 216, from about 228 to about 261, from about 281 to about 314, from about 327 to about 359, from about 368 to about 398, from about 417 to about 448, and/or from about 455 to about 487 in SEQ ID NO:30) and are represented by the shaded amino acids.
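For orientation, the predicted feature coordinates recited above can be held in a small table and checked for length and overlap. This is a minimal sketch, not part of the disclosure: the variable and function names are hypothetical, and every boundary is approximate ("about") in the source.

```python
# Predicted feature coordinates quoted in the text for SEQ ID NO:30
# (1-based, inclusive; every boundary is "about" in the source).
SIGNAL_PEPTIDE = (1, 23)
EGF_LIKE_SIGNATURES = [(108, 136), (195, 223), (291, 319), (379, 407), (465, 493)]
INTEGRIN_BETA_CYS_RICH = [
    (55, 89), (97, 129), (142, 175), (186, 216), (228, 261),
    (281, 314), (327, 359), (368, 398), (417, 448), (455, 487),
]


def region_length(region):
    """Number of residues in a 1-based inclusive (start, end) region."""
    start, end = region
    return end - start + 1


def overlaps(a, b):
    """True when two inclusive regions share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


def overlapping_pairs(first, second):
    """All (a, b) pairs, a from `first` and b from `second`, that overlap."""
    return [(a, b) for a in first for b in second if overlaps(a, b)]


if __name__ == "__main__":
    for region in INTEGRIN_BETA_CYS_RICH:
        print(region, region_length(region))
    for egf, integrin in overlapping_pairs(EGF_LIKE_SIGNATURES, INTEGRIN_BETA_CYS_RICH):
        print("EGF-like", egf, "overlaps cysteine-rich domain", integrin)
```

With the coordinates as recited, each EGF-like signature region overlaps exactly one cysteine-rich domain, which is consistent with the "Ten Integrin Domains with EGF homology" name.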
Figure 5 shows the regions of similarity between the amino acid sequences of the Ten Integrin Domains with EGF homology (TIDE) protein (SEQ ID NO:30) and the human integrin beta-8 subunit (SEQ ID NO:67).
Figure 6 shows an analysis of the Ten Integrin Domains with EGF homology (TIDE) amino acid sequence. Alpha, beta, turn and coil regions; hydrophilicity and hydrophobicity; amphipathic regions; flexible regions; antigenic index and surface probability are shown.
A polynucleotide encoding a polypeptide of the present invention is obtained from human osteoblasts, synovial hypoxia tissue, osteoblast and osteoclast, bone marrow stromal cells, umbilical vein, smooth muscle, placenta, and fetal lung. The polynucleotide of this invention was discovered in a human osteoblast II cDNA
library.
Its translation product has homology to the characteristic integrins beta chain cysteine-rich domains of integrin family members. The polynucleotide contains an open reading frame encoding the TIDE polypeptide of 494 amino acids. TIDE exhibits a high degree of homology at the amino acid level to the human integrin beta-8 subunit (as shown in Figure 5).
The present invention provides isolated nucleic acid molecules comprising a polynucleotide encoding the TIDE polypeptide having the amino acid sequence shown in Figures 4A-C (SEQ ID NO:30). The nucleotide sequence shown in Figures 4A-C (SEQ ID NO:12) was obtained by sequencing a cloned cDNA (HOHCH55), which was deposited on November 17 at the American Type Culture Collection, and given Accession Number 203484. The present invention is further directed to fragments of the isolated nucleic acid molecules described herein. By a fragment of an isolated DNA molecule having the nucleotide sequence of the deposited cDNA or the nucleotide sequence shown in SEQ ID NO:12 is intended DNA fragments at least about 15 nt, and more preferably at least about 20 nt, still more preferably at least about 30 nt, and even more preferably, at least about 40 nt in length which are useful as diagnostic probes and primers as discussed herein. Of course, larger fragments 50-1500 nt in length are also useful according to the present invention, as are fragments corresponding to most, if not all, of the nucleotide sequence of the deposited cDNA or as shown in SEQ ID NO:12. By a fragment at least 20 nt in length, for example, is intended fragments which include 20 or more contiguous bases from the nucleotide sequence of the deposited cDNA or the nucleotide sequence as shown in SEQ ID NO:12. In this context "about" includes the particularly recited size, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini. Representative examples of TIDE polynucleotide fragments of the invention include, for example, fragments that comprise, or alternatively, consist of, a sequence from about nucleotide 1 to about 50, from about 51 to about 100, from about 101 to about 150, from about 151 to about 200, from about 201 to about 250, from about 251 to about 300, from about 301 to about 350, from about 351 to about 400, from about 401 to about 450, from about 451 to about 500, from about 501 to about 550, from about 551 to about 600, from about 601 to about 650, from about 651 to about 700, from about 701 to about 750, from about 751 to about 800, from about 801 to about 850, from about 851 to about 900, from about 901 to about 950, from about 951 to about 1000, from about 1001 to about 1050, from about 1051 to about 1100, from about 1101 to about 1150, from about 1151 to about 1200, from about 1201 to about 1250, from about 1251 to about 1300, from about 1301 to about 1350, from about 1351 to about 1400, from about 1401 to about 1450, from about 1451 to about 1500, from about 1501 to about 1550, from about 1551 to about 1600, from about 1601 to about 1650, from about 1651 to about 1700, from about 1701 to about 1750, from about 1751 to about 1800, from about 1801 to about 1850, from about 1851 to about 1900, from about 1901 to about 1950, from about 1951 to about 2000, from about 2001 to about 2050, from about 2051 to about 2100, from about 2101 to about 2150, from about 2151 to about 2200, from about 2201 to about 2250, from about 2251 to about 2300, from about 2301 to about 2350, from about 2351 to about 2400, from about 2401 to about 2450, from about 2451 to about 2499, from about 289 to about 1705, and/or from about 221 to about 1705 of SEQ ID NO:12, or the complementary strand thereto, or the cDNA contained in the deposited gene. In this context "about" includes the particularly recited ranges, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini.
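The tiled fragment ranges recited above follow a simple rule: consecutive 50-nucleotide windows, with the last window truncated at the end of the sequence. A minimal sketch of that rule follows, together with a check for the stated "about" tolerance of up to 5 nucleotides at either terminus. The function names are hypothetical, and the total length of 2499 nt is an assumption inferred from the final recited window.

```python
def fragment_windows(total_length, window=50):
    """Yield 1-based inclusive (start, end) ranges that tile a sequence."""
    for start in range(1, total_length + 1, window):
        yield start, min(start + window - 1, total_length)


def is_about(candidate, recited, slack=5):
    """True when each terminus of `candidate` lies within `slack`
    nucleotides of the corresponding terminus of `recited`."""
    return (abs(candidate[0] - recited[0]) <= slack
            and abs(candidate[1] - recited[1]) <= slack)


if __name__ == "__main__":
    # 2499 is assumed from the last recited range ("2451 to about 2499").
    windows = list(fragment_windows(2499))
    print(len(windows), windows[0], windows[-1])
    print(is_about((3, 52), (1, 50)))   # shifted by 2 nt at both termini
    print(is_about((1, 56), (1, 50)))   # 6 nt longer at the 3' terminus
```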
Preferred nucleic acid fragments of the present invention include nucleic acid molecules encoding a member selected from the group: a polypeptide comprising or alternatively, consisting of, the mature TIDE protein (amino acid residues from about 221 to about 1705 in Figures 4A-C (amino acids from about 221 to about 1705 in SEQ ID NO:30)). Since the location of these domains has been predicted by computer analysis, one of ordinary skill would appreciate that the amino acid residues constituting these domains may vary slightly (e.g., by about 1 to 15 amino acid residues) depending on the criteria used to define each domain. In additional embodiments, the polynucleotides of the invention encode functional attributes of TIDE.
Preferred embodiments of the invention in this regard include fragments that comprise alpha-helix and alpha-helix forming regions ("alpha-regions"), beta-sheet and beta-sheet forming regions ("beta-regions"), turn and turn-forming regions ("turn-regions"), coil and coil-forming regions ("coil-regions"), hydrophilic regions, hydrophobic regions, alpha amphipathic regions, beta amphipathic regions, flexible regions, surface-forming regions and high antigenic index regions of TIDE. The data representing the structural or functional attributes of TIDE set forth in Figure 6 and/or Table II, as described above, were generated using the various modules and algorithms of the DNA*STAR set on default parameters. In a preferred embodiment, the data presented in columns VIII, IX, XIII, and XIV of Table II can be used to determine regions of TIDE which exhibit a high degree of potential for antigenicity.
Regions of high antigenicity are determined from the data presented in columns VIII, IX, XIII, and/or XIV by choosing values which represent regions of the polypeptide which are likely to be exposed on the surface of the polypeptide in an environment in which antigen recognition may occur in the process of initiation of an immune response.
Certain preferred regions in these regards are set out in Figure 6, but may, as shown in Table II, be represented or identified by using tabular representations of the data presented in Figure 6. The DNA*STAR computer algorithm used to generate Figure 6 (set on the original default parameters) was used to present the data in Figure 6 in a tabular format (See Table II). The tabular format of the data in Figure 6 is used to easily determine specific boundaries of a preferred region. The above-mentioned preferred regions set out in Figure 6 and in Table II include, but are not limited to, regions of the aforementioned types identified by analysis of the amino acid sequence set out in Figures 4A-C. As set out in Figure 6 and in Table II, such preferred regions include Garnier-Robson alpha-regions, beta-regions, turn-regions, and coil-regions, Chou-Fasman alpha-regions, beta-regions, and turn-regions, Kyte-Doolittle hydrophilic regions and Hopp-Woods hydrophobic regions, Eisenberg alpha- and beta-amphipathic regions, Karplus-Schulz flexible regions, Jameson-Wolf regions of high antigenic index and Emini surface-forming regions.
Even if deletion of one or more amino acids from the N-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities, ability to multimerize, etc.) may still be retained. For example, the ability of shortened TIDE muteins to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptides generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the N-terminus.
Whether a particular polypeptide lacking N-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art. It is not unlikely that a TIDE mutein with a large number of deleted N-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six TIDE amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the amino terminus of the TIDE amino acid sequence shown in Figures 4A-C, up to the leucine residue at position number 489, and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues n1-494 of Figures 4A-C, where n1 is an integer from 2 to 489 corresponding to the position of the amino acid residue in Figures 4A-C (which is identical to the sequence shown as SEQ ID NO:30). In another embodiment, N-terminal deletions of the TIDE polypeptide can be described by the general formula n2-494, where n2 is a number from 2 to 489, corresponding to the position of amino acid identified in Figures 4A-C. N-terminal deletions of the TIDE polypeptide of the invention shown as SEQ ID NO:30 include polypeptides comprising the amino acid sequence of residues:
R-2 to P-494; P-3 to P-494; P-4 to P-494; G-5 to P-494; F-6 to P-494; R-7 to P-494; N-8 to P-494; F-9 to P-494; L-10 to P-494;
L-11 to P-494; L-12 to P-494; A-13 to P-494; S-14 to P-494; S-15 to P-494; L-16 to P-494; L-17 to P-494; F-18 to P-494; A-19 to P-494; G-20 to P-494;
L-21 to P-494; S-22 to P-494; A-23 to P-494; V-24 to P-494; P-25 to P-494; Q-26 to P-494; S-27 to P-494; F-28 to P-494; S-29 to P-494; P-30 to P-494;
S-31 to P-494; L-32 to P-494; R-33 to P-494; S-34 to P-494; W-35 to P-494; P-36 to P-494; G-37 to P-494; A-38 to P-494; A-39 to P-494; C-40 to P-494;
R-41 to P-494; L-42 to P-494; S-43 to P-494; R-44 to P-494; A-45 to P-494; E-46 to P-494; S-47 to P-494; E-48 to P-494; R-49 to P-494; R-50 to P-494;
C-51 to P-494; R-52 to P-494; A-53 to P-494; P-54 to P-494; G-55 to P-494; Q-56 to P-494; P-57 to P-494; P-58 to P-494; G-59 to P-494; A-60 to P-494;
A-61 to P-494; L-62 to P-494; C-63 to P-494; H-64 to P-494; G-65 to P-494; R-66 to P-494; G-67 to P-494; R-68 to P-494; C-69 to P-494; D-70 to P-494;
C-71 to P-494; G-72 to P-494; V-73 to P-494; C-74 to P-494; I-75 to P-494; C-76 to P-494; H-77 to P-494; V-78 to P-494; T-79 to P-494; E-80 to P-494;
P-81 to P-494; G-82 to P-494; M-83 to P-494; F-84 to P-494; F-85 to P-494; G-86 to P-494; P-87 to P-494; L-88 to P-494; C-89 to P-494; E-90 to P-494;
C-91 to P-494; H-92 to P-494; E-93 to P-494; W-94 to P-494; V-95 to P-494; C-96 to P-494; E-97 to P-494; T-98 to P-494; Y-99 to P-494; D-100 to P-494;
G-101 to P-494; S-102 to P-494; T-103 to P-494; C-104 to P-494; A-105 to P-494; G-106 to P-494; H-107 to P-494; G-108 to P-494; K-109 to P-494; C-110 to P-494;
D-111 to P-494; C-112 to P-494; G-113 to P-494; K-114 to P-494; C-115 to P-494; K-116 to P-494; C-117 to P-494; D-118 to P-494; Q-119 to P-494; G-120 to P-494;
W-121 to P-494; Y-122 to P-494; G-123 to P-494; D-124 to P-494; A-125 to P-494; C-126 to P-494; Q-127 to P-494; Y-128 to P-494; P-129 to P-494; T-130 to P-494;
N-131 to P-494; C-132 to P-494; D-133 to P-494; L-134 to P-494; T-135 to P-494; K-136 to P-494; K-137 to P-494; K-138 to P-494; S-139 to P-494; N-140 to P-494;
Q-141 to P-494; M-142 to P-494; C-143 to P-494; K-144 to P-494; N-145 to P-494; S-146 to P-494; Q-147 to P-494; D-148 to P-494; I-149 to P-494; I-150 to P-494;
C-151 to P-494; S-152 to P-494; N-153 to P-494; A-154 to P-494; G-155 to P-494; T-156 to P-494; C-157 to P-494; H-158 to P-494; C-159 to P-494; G-160 to P-494;
R-161 to P-494; C-162 to P-494; K-163 to P-494; C-164 to P-494; D-165 to P-494; N-166 to P-494; S-167 to P-494; D-168 to P-494; G-169 to P-494; S-170 to P-494;
G-171 to P-494; L-172 to P-494; V-173 to P-494; Y-174 to P-494; G-175 to P-494; K-176 to P-494; F-177 to P-494; C-178 to P-494; E-179 to P-494; C-180 to P-494;
D-181 to P-494; D-182 to P-494; R-183 to P-494; E-184 to P-494; C-185 to P-494; I-186 to P-494; D-187 to P-494; D-188 to P-494; E-189 to P-494; T-190 to P-494;
E-191 to P-494; E-192 to P-494; I-193 to P-494; C-194 to P-494; G-195 to P-494; G-196 to P-494; H-197 to P-494; G-198 to P-494; K-199 to P-494; C-200 to P-494;
Y-201 to P-494; C-202 to P-494; G-203 to P-494; N-204 to P-494; C-205 to P-494; Y-206 to P-494; C-207 to P-494; K-208 to P-494; A-209 to P-494; G-210 to P-494;
W-211 to P-494; H-212 to P-494; G-213 to P-494; D-214 to P-494; K-215 to P-494; C-216 to P-494; E-217 to P-494; F-218 to P-494; Q-219 to P-494; C-220 to P-494;
D-221 to P-494; I-222 to P-494; T-223 to P-494; P-224 to P-494; W-225 to P-494; E-226 to P-494; S-227 to P-494; K-228 to P-494; R-229 to P-494; R-230 to P-494;
C-231 to P-494; T-232 to P-494; S-233 to P-494; P-234 to P-494; D-235 to P-494; G-236 to P-494; K-237 to P-494; I-238 to P-494; C-239 to P-494; S-240 to P-494;
N-241 to P-494; R-242 to P-494; G-243 to P-494; T-244 to P-494; C-245 to P-494; V-246 to P-494; C-247 to P-494; G-248 to P-494; E-249 to P-494; C-250 to P-494;
T-251 to P-494; C-252 to P-494; H-253 to P-494; D-254 to P-494; V-255 to P-494; D-256 to P-494; P-257 to P-494; T-258 to P-494; G-259 to P-494; D-260 to P-494;
W-261 to P-494; G-262 to P-494; D-263 to P-494; I-264 to P-494; H-265 to P-494; G-266 to P-494; D-267 to P-494; T-268 to P-494; C-269 to P-494; E-270 to P-494;
C-271 to P-494; D-272 to P-494; E-273 to P-494; R-274 to P-494; D-275 to P-494; C-276 to P-494; R-277 to P-494; A-278 to P-494; V-279 to P-494; Y-280 to P-494;
D-281 to P-494; R-282 to P-494; Y-283 to P-494; S-284 to P-494; D-285 to P-494; D-286 to P-494; F-287 to P-494; C-288 to P-494; S-289 to P-494; G-290 to P-494;
H-291 to P-494; G-292 to P-494; Q-293 to P-494; C-294 to P-494; N-295 to P-494; C-296 to P-494; G-297 to P-494; R-298 to P-494; C-299 to P-494; D-300 to P-494;
C-301 to P-494; K-302 to P-494; A-303 to P-494; G-304 to P-494; W-305 to P-494; Y-306 to P-494; G-307 to P-494; K-308 to P-494; K-309 to P-494; C-310 to P-494;
E-311 to P-494; H-312 to P-494; P-313 to P-494; Q-314 to P-494; S-315 to P-494; C-316 to P-494; T-317 to P-494; L-318 to P-494; S-319 to P-494; A-320 to P-494;
E-321 to P-494; E-322 to P-494; S-323 to P-494; I-324 to P-494; R-325 to P-494; K-326 to P-494; C-327 to P-494; Q-328 to P-494; G-329 to P-494; S-330 to P-494;
S-331 to P-494; D-332 to P-494; L-333 to P-494; P-334 to P-494; C-335 to P-494; S-336 to P-494; G-337 to P-494; R-338 to P-494; G-339 to P-494; K-340 to P-494;
C-341 to P-494; E-342 to P-494; C-343 to P-494; G-344 to P-494; K-345 to P-494; C-346 to P-494; T-347 to P-494; C-348 to P-494; Y-349 to P-494; P-350 to P-494;
P-351 to P-494; G-352 to P-494; D-353 to P-494; R-354 to P-494; R-355 to P-494; V-356 to P-494; Y-357 to P-494; G-358 to P-494; K-359 to P-494; T-360 to P-494;
C-361 to P-494; E-362 to P-494; C-363 to P-494; D-364 to P-494; D-365 to P-494; R-366 to P-494; R-367 to P-494; C-368 to P-494; E-369 to P-494; D-370 to P-494;
L-371 to P-494; D-372 to P-494; G-373 to P-494; V-374 to P-494; V-375 to P-494; C-376 to P-494; G-377 to P-494; G-378 to P-494; H-379 to P-494; G-380 to P-494;
T-381 to P-494; C-382 to P-494; S-383 to P-494; C-384 to P-494; G-385 to P-494; R-386 to P-494; C-387 to P-494; V-388 to P-494; C-389 to P-494; E-390 to P-494;
R-391 to P-494; G-392 to P-494; W-393 to P-494; F-394 to P-494; G-395 to P-494; K-396 to P-494; L-397 to P-494; C-398 to P-494; Q-399 to P-494; H-400 to P-494;
P-401 to P-494; R-402 to P-494; K-403 to P-494; C-404 to P-494; N-405 to P-494; M-406 to P-494; T-407 to P-494; E-408 to P-494; E-409 to P-494; Q-410 to P-494;
S-411 to P-494; K-412 to P-494; N-413 to P-494; L-414 to P-494; C-415 to P-494; E-416 to P-494; S-417 to P-494; A-418 to P-494; D-419 to P-494; G-420 to P-494;
I-421 to P-494; L-422 to P-494; C-423 to P-494; S-424 to P-494; G-425 to P-494; K-426 to P-494; G-427 to P-494; S-428 to P-494; C-429 to P-494; H-430 to P-494;
C-431 to P-494; G-432 to P-494; K-433 to P-494; C-434 to P-494; I-435 to P-494; C-436 to P-494; S-437 to P-494; A-438 to P-494; E-439 to P-494; E-440 to P-494;
W-441 to P-494; Y-442 to P-494; I-443 to P-494; S-444 to P-494; G-445 to P-494; E-446 to P-494; F-447 to P-494; C-448 to P-494; D-449 to P-494; C-450 to P-494;
D-451 to P-494; D-452 to P-494; R-453 to P-494; D-454 to P-494; C-455 to P-494; D-456 to P-494; K-457 to P-494; H-458 to P-494; D-459 to P-494; G-460 to P-494;
L-461 to P-494; I-462 to P-494; C-463 to P-494; T-464 to P-494; G-465 to P-494; N-466 to P-494; G-467 to P-494; I-468 to P-494; C-469 to P-494; S-470 to P-494;
C-471 to P-494; G-472 to P-494; N-473 to P-494; C-474 to P-494; E-475 to P-494; C-476 to P-494; W-477 to P-494; D-478 to P-494; G-479 to P-494; W-480 to P-494;
N-481 to P-494; G-482 to P-494; N-483 to P-494; A-484 to P-494; C-485 to P-494; E-486 to P-494; I-487 to P-494; W-488 to P-494; L-489 to P-494 of SEQ ID NO:30. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
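The N-terminal deletion series above follows the general formula n-494 with n from 2 to 489, which leaves at least six residues in every member. As a hedged illustration, the same labels can be generated for any sequence; the function name and the ten-residue toy peptide below are hypothetical and are not SEQ ID NO:30.

```python
def n_terminal_deletion_labels(sequence, min_length=6):
    """Return labels 'X-n to Z-L' for every N-terminal deletion that keeps
    at least `min_length` residues, where L is the full sequence length."""
    length = len(sequence)
    last_start = length - min_length + 1      # 489 for a 494-residue protein
    c_terminus = f"{sequence[-1]}-{length}"
    return [
        f"{sequence[n - 1]}-{n} to {c_terminus}"
        for n in range(2, last_start + 1)
    ]


if __name__ == "__main__":
    toy_peptide = "MKTAYIAKQR"   # hypothetical 10-residue example
    for label in n_terminal_deletion_labels(toy_peptide):
        print(label)
```

For a 494-residue sequence the function returns 488 labels, matching the range of n recited in the text.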
Also as mentioned above, even if deletion of one or more amino acids from the C-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities may still be retained.
For example, the ability of the shortened TIDE mutein to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptide generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the C-terminus. Whether a particular polypeptide lacking C-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art. It is not unlikely that a TIDE mutein with a large number of deleted C-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six TIDE amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the carboxy terminus of the amino acid sequence of the TIDE polypeptide shown in Figures 4A-C, up to the phenylalanine residue at position number 6, and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues 1-m1 of Figures 4A-C, where m1 is an integer from 6 to 494 corresponding to the position of the amino acid residue in Figures 4A-C. Moreover, the invention provides polynucleotides encoding polypeptides comprising, or alternatively consisting of, the amino acid sequence of C-terminal deletions of the TIDE polypeptide of the invention shown as SEQ ID NO:30, which include polypeptides comprising the amino acid sequence of residues:
M-1 to Y-493; M-1 to E-492; M-1 to S-491;
M-1 to G-490; M-1 to L-489; M-1 to W-488; M-1 to I-487; M-1 to E-486; M-1 to C-485; M-1 to A-484; M-1 to N-483; M-1 to G-482; M-1 to N-481;
M-1 to W-480; M-1 to G-479; M-1 to D-478; M-1 to W-477; M-1 to C-476; M-1 to E-475; M-1 to C-474; M-1 to N-473; M-1 to G-472; M-1 to C-471;
M-1 to S-470; M-1 to C-469; M-1 to I-468; M-1 to G-467; M-1 to N-466; M-1 to G-465; M-1 to T-464; M-1 to C-463; M-1 to I-462; M-1 to L-461;
M-1 to G-460; M-1 to D-459; M-1 to H-458; M-1 to K-457; M-1 to D-456; M-1 to C-455; M-1 to D-454; M-1 to R-453; M-1 to D-452; M-1 to D-451;
M-1 to C-450; M-1 to D-449; M-1 to C-448; M-1 to F-447; M-1 to E-446; M-1 to G-445; M-1 to S-444; M-1 to I-443; M-1 to Y-442; M-1 to W-441;
M-1 to E-440; M-1 to E-439; M-1 to A-438; M-1 to S-437; M-1 to C-436; M-1 to I-435; M-1 to C-434; M-1 to K-433; M-1 to G-432; M-1 to C-431;
M-1 to H-430; M-1 to C-429; M-1 to S-428; M-1 to G-427; M-1 to K-426; M-1 to G-425; M-1 to S-424; M-1 to C-423; M-1 to L-422; M-1 to I-421;
M-1 to G-420; M-1 to D-419; M-1 to A-418; M-1 to S-417; M-1 to E-416; M-1 to C-415; M-1 to L-414; M-1 to N-413; M-1 to K-412; M-1 to S-411;
M-1 to Q-410; M-1 to E-409; M-1 to E-408; M-1 to T-407; M-1 to M-406; M-1 to N-405; M-1 to C-404; M-1 to K-403; M-1 to R-402; M-1 to P-401;
M-1 to H-400; M-1 to Q-399; M-1 to C-398; M-1 to L-397; M-1 to K-396; M-1 to G-395; M-1 to F-394; M-1 to W-393; M-1 to G-392; M-1 to R-391;
M-1 to E-390; M-1 to C-389; M-1 to V-388; M-1 to C-387; M-1 to R-386; M-1 to G-385; M-1 to C-384; M-1 to S-383; M-1 to C-382; M-1 to T-381;
M-1 to G-380; M-1 to H-379; M-1 to G-378; M-1 to G-377; M-1 to C-376; M-1 to V-375; M-1 to V-374; M-1 to G-373; M-1 to D-372; M-1 to L-371;
M-1 to D-370; M-1 to E-369; M-1 to C-368; M-1 to R-367; M-1 to R-366; M-1 to D-365; M-1 to D-364; M-1 to C-363; M-1 to E-362; M-1 to C-361;
M-1 to T-360; M-1 to K-359; M-1 to G-358; M-1 to Y-357; M-1 to V-356; M-1 to R-355; M-1 to R-354; M-1 to D-353; M-1 to G-352; M-1 to P-351;
M-1 to P-350; M-1 to Y-349; M-1 to C-348; M-1 to T-347; M-1 to C-346; M-1 to K-345; M-1 to G-344; M-1 to C-343; M-1 to E-342; M-1 to C-341;
M-1 to K-340; M-1 to G-339; M-1 to R-338; M-1 to G-337; M-1 to S-336; M-1 to C-335; M-1 to P-334; M-1 to L-333; M-1 to D-332; M-1 to S-331;
M-1 to S-330; M-1 to G-329; M-1 to Q-328; M-1 to C-327; M-1 to K-326; M-1 to R-325; M-1 to I-324; M-1 to S-323; M-1 to E-322; M-1 to E-321;
M-1 to A-320; M-1 to S-319; M-1 to L-318; M-1 to T-317; M-1 to C-316; M-1 to S-315; M-1 to Q-314; M-1 to P-313; M-1 to H-312; M-1 to E-311;
M-1 to C-310; M-1 to K-309; M-1 to K-308; M-1 to G-307; M-1 to Y-306; M-1 to W-305; M-1 to G-304; M-1 to A-303; M-1 to K-302; M-1 to C-301;
M-1 to D-300; M-1 to C-299; M-1 to R-298; M-1 to G-297; M-1 to C-296; M-1 to N-295; M-1 to C-294; M-1 to Q-293; M-1 to G-292; M-1 to H-291;
M-1 to G-290; M-1 to S-289; M-1 to C-288; M-1 to F-287; M-1 to D-286; M-1 to D-285; M-1 to S-284; M-1 to Y-283; M-1 to R-282; M-1 to D-281;
M-1 to Y-280; M-1 to V-279; M-1 to A-278; M-1 to R-277; M-1 to C-276; M-1 to D-275; M-1 to R-274; M-1 to E-273; M-1 to D-272; M-1 to C-271;
M-1 to E-270; M-1 to C-269; M-1 to T-268; M-1 to D-267; M-1 to G-266; M-1 to H-265; M-1 to I-264; M-1 to D-263; M-1 to G-262; M-1 to W-261;
M-1 to D-260; M-1 to G-259; M-1 to T-258; M-1 to P-257; M-1 to D-256; M-1 to V-255; M-1 to D-254; M-1 to H-253; M-1 to C-252; M-1 to T-251;
M-1 to C-250; M-1 to E-249; M-1 to G-248; M-1 to C-247; M-1 to V-246; M-1 to C-245; M-1 to T-244; M-1 to G-243; M-1 to R-242; M-1 to N-241;
M-1 to S-240; M-1 to C-239; M-1 to I-238; M-1 to K-237; M-1 to G-236; M-1 to D-235; M-1 to P-234; M-1 to S-233; M-1 to T-232; M-1 to C-231;
M-1 to R-230; M-1 to R-229; M-1 to K-228; M-1 to S-227; M-1 to E-226; M-1 to W-225; M-1 to P-224; M-1 to T-223; M-1 to I-222; M-1 to D-221;
M-1 to C-220; M-1 to Q-219; M-1 to F-218; M-1 to E-217; M-1 to C-216; M-1 to K-215; M-1 to D-214; M-1 to G-213; M-1 to H-212; M-1 to W-211;
M-1 to G-210; M-1 to A-209; M-1 to K-208; M-1 to C-207; M-1 to Y-206; M-1 to C-205; M-1 to N-204; M-1 to G-203; M-1 to C-202; M-1 to Y-201;
M-1 to C-200; M-1 to K-199; M-1 to G-198; M-1 to H-197; M-1 to G-196; M-1 to G-195; M-1 to C-194; M-1 to I-193; M-1 to E-192; M-1 to E-191;
M-1 to T-190; M-1 to E-189; M-1 to D-188; M-1 to D-187; M-1 to I-186; M-1 to C-185; M-1 to E-184; M-1 to R-183; M-1 to D-182; M-1 to D-181;
M-1 to C-180; M-1 to E-179; M-1 to C-178; M-1 to F-177; M-1 to K-176; M-1 to G-175; M-1 to Y-174; M-1 to V-173; M-1 to L-172; M-1 to G-171;
M-1 to S-170; M-1 to G-169; M-1 to D-168; M-1 to S-167; M-1 to N-166; M-1 to D-165; M-1 to C-164; M-1 to K-163; M-1 to C-162; M-1 to R-161;
M-1 to G-160; M-1 to C-159; M-1 to H-158; M-1 to C-157; M-1 to T-156; M-1 to G-155; M-1 to A-154; M-1 to N-153; M-1 to S-152; M-1 to C-151;
M-1 to I-150; M-1 to I-149; M-1 to D-148; M-1 to Q-147; M-1 to S-146; M-1 to N-145; M-1 to K-144; M-1 to C-143; M-1 to M-142; M-1 to Q-141;
M-1 to N-140; M-1 to S-139; M-1 to K-138; M-1 to K-137; M-1 to K-136; M-1 to T-135; M-1 to L-134; M-1 to D-133; M-1 to C-132; M-1 to N-131;
M-1 to T-130; M-1 to P-129; M-1 to Y-128; M-1 to Q-127; M-1 to C-126; M-1 to A-125; M-1 to D-124; M-1 to G-123; M-1 to Y-122; M-1 to W-121;
M-1 to G-120; M-1 to Q-119; M-1 to D-118; M-1 to C-117; M-1 to K-116; M-1 to C-115; M-1 to K-114; M-1 to G-113; M-1 to C-112; M-1 to D-111;
M-1 to C-110; M-1 to K-109; M-1 to G-108; M-1 to H-107; M-1 to G-106; M-1 to A-105; M-1 to C-104; M-1 to T-103; M-1 to S-102; M-1 to G-101;
M-1 to D-100; M-1 to Y-99; M-1 to T-98; M-1 to E-97; M-1 to C-96; M-1 to V-95; M-1 to W-94; M-1 to E-93; M-1 to H-92; M-1 to C-91;
M-1 to E-90; M-1 to C-89; M-1 to L-88; M-1 to P-87; M-1 to G-86; M-1 to F-85; M-1 to F-84; M-1 to M-83; M-1 to G-82; M-1 to P-81;
M-1 to E-80; M-1 to T-79; M-1 to V-78; M-1 to H-77; M-1 to C-76; M-1 to I-75; M-1 to C-74; M-1 to V-73; M-1 to G-72; M-1 to C-71;
M-1 to D-70; M-1 to C-69; M-1 to R-68; M-1 to G-67; M-1 to R-66; M-1 to G-65; M-1 to H-64; M-1 to C-63; M-1 to L-62; M-1 to A-61;
M-1 to A-60; M-1 to G-59; M-1 to P-58; M-1 to P-57; M-1 to Q-56; M-1 to G-55; M-1 to P-54; M-1 to A-53; M-1 to R-52; M-1 to C-51;
M-1 to R-50; M-1 to R-49; M-1 to E-48; M-1 to S-47; M-1 to E-46; M-1 to A-45; M-1 to R-44; M-1 to S-43; M-1 to L-42; M-1 to R-41;
M-1 to C-40; M-1 to A-39; M-1 to A-38; M-1 to G-37; M-1 to P-36; M-1 to W-35; M-1 to S-34; M-1 to R-33; M-1 to L-32; M-1 to S-31;
M-1 to P-30; M-1 to S-29; M-1 to F-28; M-1 to S-27; M-1 to Q-26; M-1 to P-25; M-1 to V-24; M-1 to A-23; M-1 to S-22; M-1 to L-21;
M-1 to G-20; M-1 to A-19; M-1 to F-18; M-1 to L-17; M-1 to L-16; M-1 to S-15; M-1 to S-14; M-1 to A-13; M-1 to L-12; M-1 to L-11;
M-1 to L-10; M-1 to F-9; M-1 to N-8; M-1 to R-7; M-1 to F-6 of SEQ ID NO:30. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
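The C-terminal series mirrors the N-terminal one: each member runs from residue 1 to residue m, with m stepping down from one short of the full length to position 6. A minimal sketch that returns both the label and the truncated peptide follows; the function name and the ten-residue toy peptide are hypothetical and are not SEQ ID NO:30.

```python
def c_terminal_deletions(sequence, min_length=6):
    """Return (label, truncated_sequence) pairs for residues 1..m, with m
    running from len(sequence) - 1 down to `min_length`."""
    n_terminus = f"{sequence[0]}-1"
    results = []
    for m in range(len(sequence) - 1, min_length - 1, -1):
        label = f"{n_terminus} to {sequence[m - 1]}-{m}"
        results.append((label, sequence[:m]))
    return results


if __name__ == "__main__":
    toy_peptide = "MKTAYIAKQR"   # hypothetical 10-residue example
    for label, truncated in c_terminal_deletions(toy_peptide):
        print(label, truncated)
```

Applied to a 494-residue sequence, the function yields 488 truncations, from residues 1-493 down to residues 1-6.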
In addition, the invention provides nucleic acid molecules having nucleotide sequences related to extensive portions of SEQ ID NO:12 which have been determined from the following related cDNA genes: HLHFV34R (SEQ ID NO:68), HSRDA85R (SEQ ID NO:69), HSRAZ62R (SEQ ID NO:70), HSRDA17R (SEQ ID NO:71), and HSLEC45R (SEQ ID NO:72).
Based on the sequence similarity to the human integrin beta-8 subunit, translation product of this gene is expected to share at least some biological activities with integrin proteins, and specifically the human inteQrin beta-8 subunit. Such activities are known in the art, some of which are described elsewhere herein.
Specifically, polynucleotides and polypeptides of the invention are also useful for modulating the differentiation of normal and malignant cells, modulating the proliferation and/or differentiation of cancer and neoplastic cells, and modulating the immune response. Polynucleotides and polypeptides of the invention may represent a diagnostic marker for hematopoietic and immune diseases and/or disorders. The full length protein should be a secreted protein, based upon homology to the integrin family.
Therefore, it is secreted into serum, urine, or feces and thus the levels is assayable from patient samples. Assuming specific expression levels are reflective of the presence of immune disorders, this protein would provide a convenient diagnostic for early detection.
In addition, expression of this gene product may also be linked to the progression of immune diseases, and therefore may itself actually represent a therapeutic or therapeutic target for the treatment of cancer. Polynucleotides and polypeptides of the invention may play an important role in the pathogenesis of human cancers and cellular transformation, particularly those of the immune and hematopoietic systems. Polynucleotides and polypeptides of the invention may also be involved in the pathogenesis of developmental abnormalities based upon its potential effects on proliferation and differentiation of cells SUBSTITUTE SHEET (RULE 26) and tissue cell types. Due to the potential proliferating and differentiating activity of said polynucleotides and polypeptides, the invention is useful as a therapeutic agent in inducing tissue regeneration, for treating inflammatory conditions (e.g., inflammatory bowel syndrome, diverticulitis, etc.). Moreover, the invention is useful in modulating the 5 immune response to aberrant polypeptides, as may exist in rapidly proliferating cells and tissue cell types, particularly in adenocarcinoma cells, and other cancers.
Alternatively, the expression within cellular sources marked by proliferating cells indicates this protein may play a role in the regulation of cellular division, and may show utility in the diagnosis, treatment, and/or prevention of developmental diseases and disorders, including cancer, and other proliferative conditions.
Representative uses are described in the "Hyperproliferative Disorders" and "Regeneration" sections below and elsewhere herein. Briefly, developmental tissues rely on decisions involving cell differentiation and/or apoptosis in pattern formation.
Dysregulation of apoptosis can result in inappropriate suppression of cell death, as occurs in the development of some cancers, or in failure to control the extent of cell death, as is believed to occur in acquired immunodeficiency and certain neurodegenerative disorders, such as spinal muscular atrophy (SMA).
Alternatively, this gene product is involved in the pattern of cellular proliferation that accompanies early embryogenesis. Thus, aberrant expression of this gene product in tissues, particularly adult tissues, may correlate with patterns of abnormal cellular proliferation, such as found in various cancers. Because of potential roles in proliferation and differentiation, this gene product may have applications in the adult for tissue regeneration and the treatment of cancers. It may also act as a morphogen to control cell and tissue type specification. Therefore, the polynucleotides and polypeptides of the present invention are useful in treating, detecting, and/or preventing said disorders and conditions, in addition to other types of degenerative conditions. Thus this protein may modulate apoptosis or tissue differentiation and is useful in the detection, treatment, and/or prevention of degenerative or proliferative conditions and diseases.
The protein is useful in modulating the immune response to aberrant polypeptides, as may exist in proliferating and cancerous cells and tissues. The protein can also be used to gain new insight into the regulation of cellular growth and proliferation. Furthermore, the protein may also be used to determine biological activity, to raise antibodies, as a tissue marker, to isolate cognate ligands or receptors, and to identify agents that modulate their interactions, in addition to its use as a nutritional supplement. The protein, as well as antibodies directed against the protein, may show utility as a tumor marker and/or immunotherapy target for the above-listed tissues.
The translation product of this gene, sometimes referred to herein as TIDE (for Ten Integrin Domains with EGF homology), shares sequence homology with integrins, which are a superfamily of dimeric αβ cell-surface glycoproteins that mediate the adhesive functions of many cell types, enabling cells to interact with one another and with the extracellular matrix (See Genomics 56, 169-178 (1999); all information and references contained within this publication are hereby incorporated herein by reference).
The gene encoding the disclosed cDNA is believed to reside on chromosome 13, at locus 13q33. Accordingly, polynucleotides related to this invention are useful as a marker in linkage analysis for chromosome 13, generally, and particularly at locus 13q33.
This gene is expressed primarily in synovial hypoxia tissue, osteoblast and osteoclast, bone marrow stromal cells, and to a lesser extent in umbilical vein, smooth muscle, placenta, and fetal lung cDNA libraries. Therefore, polynucleotides and polypeptides of the invention are useful as reagents for differential identification of the tissue(s) or cell type(s) present in a biological sample and for diagnosis of diseases and conditions which include, but are not limited to, disorders of bone and connective tissues, immune and hematopoietic diseases and/or disorders, vascular disorders, and other disorders involving aberrations in cell-surface interactions. Similarly, polypeptides and antibodies directed to these polypeptides are useful in providing immunological probes for differential identification of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, particularly of the connective tissue and skeletal system, expression of this gene at significantly higher or lower levels is routinely detected in certain tissues or cell types (e.g., cartilage, bone, vascular, hypoxic tissue, and cancerous and wounded tissues) or bodily fluids (e.g., lymph, serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample taken from an individual having such a disorder, relative to the standard gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an individual not having the disorder.
Preferred polypeptides of the present invention comprise immunogenic epitopes shown in SEQ ID NO: 30 as residues: Met-1 to Phe-6, Arg-44 to Arg-52, His-64 to Cys-69, Tyr-99 to Gln-147, His-158 to Gly-169, Phe-177 to Asp-182, Cys-194 to Cys-202, Gly-213 to Phe-218, Pro-224 to Gly-236, Asp-254 to Trp-261, Asp-263 to Ala-303, Trp-305 to Cys-316, Lys-326 to Asp-332, Pro-334 to Cys-343, Pro-350 to Asp-370, Thr-407 to Asn-413, Gly-425 to Cys-431, Asp-449 to Asp-459, Gly-472 to Asn-483.
Polynucleotides encoding said polypeptides are also provided.
The tissue distribution and homology to the human integrin beta-8 subunit indicate that polynucleotides and polypeptides corresponding to this gene are useful for the diagnosis and treatment of a variety of immune system disorders.
Representative uses are described in the "Immune Activity" and "Infectious Disease" sections below, in Examples 11, 13, 14, 16, 18, 19, 20, and 27, and elsewhere herein.
Briefly, the expression indicates a role in regulating the proliferation; survival; differentiation; and/or activation of hematopoietic cell lineages, including blood stem cells. Involvement in the regulation of cytokine production, antigen presentation, or other processes indicates a usefulness for treatment of cancer (e.g., by boosting immune responses). Expression in cells of lymphoid origin indicates the natural gene product is involved in immune functions. Therefore it would also be useful as an agent for immunological disorders including arthritis, asthma, immunodeficiency diseases such as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities, such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and tissues, such as host-versus-graft and graft-versus-host diseases, or autoimmunity disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic lupus erythematosus, drug-induced hemolytic anemia, rheumatoid arthritis, Sjogren's disease, and scleroderma. Moreover, the protein may represent a secreted factor that influences the differentiation or behavior of other blood cells, or that recruits hematopoietic cells to sites of injury. Thus, this gene product is thought to be useful in the expansion of stem cells and committed progenitors of various blood lineages, and in the differentiation and/or proliferation of various cell types. Based upon the tissue distribution of this protein, antagonists directed against this protein are useful in blocking the activity of this protein. Accordingly, preferred are antibodies which specifically bind a portion of the translation product of this gene.
Also provided is a kit for detecting tumors in which expression of this protein occurs. Such a kit comprises in one embodiment an antibody specific for the translation product of this gene bound to a solid support. Also provided is a method of detecting these tumors in an individual which comprises a step of contacting an antibody specific for the translation product of this gene to a bodily fluid from the individual, preferably serum, and ascertaining whether antibody binds to an antigen found in the bodily fluid.
Preferably the antibody is bound to a solid support and the bodily fluid is serum. The above embodiments, as well as other treatments and diagnostic tests (kits and methods), are more particularly described elsewhere herein.
Furthermore, the protein may also be used to determine biological activity, to raise antibodies, as a tissue marker, to isolate cognate ligands or receptors, and to identify agents that modulate their interactions, in addition to its use as a nutritional supplement. The protein, as well as antibodies directed against the protein, may show utility as a tumor marker and/or immunotherapy target for the above-listed tissues.
Many polynucleotide sequences, such as EST sequences, are publicly available and accessible through sequence databases. Some of these sequences are related to SEQ ID NO:12 and may have been publicly available prior to conception of the present invention. Preferably, such related polynucleotides are specifically excluded from the scope of the present invention. To list every related sequence is cumbersome.
Accordingly, preferably excluded from the present invention are one or more polynucleotides comprising a nucleotide sequence described by the general formula of a-b, where a is any integer from 1 to 2485 of SEQ ID NO:12, b is an integer of 15 to 2499, where both a and b correspond to the positions of nucleotide residues shown in SEQ ID NO:12, and where b is greater than or equal to a + 14.
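The a-b formula above can be read as a set of three integer constraints. The following is a minimal Python sketch of that reading, not part of the specification; the function names `in_ab_family` and `count_ab_ranges` are hypothetical, and the defaults (length 2499, minimum span 15) are taken from the figures quoted for SEQ ID NO:12.

```python
def in_ab_family(a: int, b: int, seq_len: int = 2499, min_len: int = 15) -> bool:
    """Return True if positions a..b satisfy the a-b constraints quoted above.

    Positions are 1-based and inclusive, as in the text.
    """
    max_a = seq_len - min_len + 1  # 2485 when seq_len is 2499
    return (
        1 <= a <= max_a
        and min_len <= b <= seq_len
        and b >= a + min_len - 1   # "b is greater than or equal to a + 14"
    )


def count_ab_ranges(seq_len: int = 2499, min_len: int = 15) -> int:
    """Count every (a, b) pair allowed by the formula, using a closed form.

    For each a there are (seq_len - min_len + 2 - a) valid values of b,
    which sums to n * (n + 1) / 2 with n = seq_len - min_len + 1.
    """
    n = seq_len - min_len + 1
    return n * (n + 1) // 2
```

Under these assumptions the formula covers every contiguous stretch of at least 15 nucleotides, which is why the text describes listing each related sequence individually as cumbersome.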
FEATURES OF PROTEIN ENCODED BY GENE NO: 3
The translation product of this gene shares sequence homology with RAMP3 (receptor-activity-modifying protein 3), recently published by another group, which is thought to be important in the transport of the calcitonin-receptor-like receptor (CRLR) to the plasma membrane. RAMPs regulate the transport and ligand specificity of the calcitonin-receptor-like receptor. There are two other related receptor-activity-modifying proteins, known as RAMP1 and RAMP2 (Nature 1998 May 28;393(6683):333-9). RAMP1 is thought to present the receptor at the cell surface as a mature glycoprotein and a calcitonin-gene-related peptide (CGRP) receptor.
Alternatively, RAMP2-transported receptors are core-glycosylated and are adrenomedullin receptors. CGRP (a 37-amino-acid neuropeptide) and its receptors are widely distributed in the body, and it is the most potent endogenous vasodilatory peptide discovered so far (Crit Rev Neurobiol 1997;11(2-3):167-239). Specific binding sites for adrenomedullin were present in every region of human brain (cerebral cortex, cerebellum, thalamus, hypothalamus, pons and medulla oblongata), suggesting that a novel neurotransmitter/neuromodulator role may exist for adrenomedullin in human brain (Peptides 1997;18(8):1125-9).
Figures 7A-B show the nucleotide (SEQ ID NO:13) and deduced amino acid sequence (SEQ ID NO:31) of the intestine derived extracellular protein.
Predicted amino acids from about 1 to about 27 constitute the predicted signal peptide (amino acid residues from about 1 to about 27 in SEQ ID NO:31) and are represented by the underlined amino acid regions; and amino acids from about 122 to about 138 constitute the predicted transmembrane domain (amino acid residues from about 122 to about 138 in SEQ ID NO:31) and are represented by the double-underlined amino acids.
Figure 8 shows the regions of similarity between the amino acid sequences of the intestine derived extracellular protein (SEQ ID NO:31) and the RAMP3 protein (gi|4587099) (SEQ ID NO: 75).
Figure 9 shows an analysis of the amino acid sequence of SEQ ID NO: 31. Alpha, beta, turn and coil regions; hydrophilicity and hydrophobicity; amphipathic regions; flexible regions; antigenic index and surface probability are shown.
Northern analysis indicates that a 1.4 kb transcript of this gene is primarily expressed in small intestine tissue, and to a lesser extent in colon and prostate tissue.
The present invention provides isolated nucleic acid molecules comprising a polynucleotide encoding the polypeptide having the amino acid sequence shown in Figures 7A-B (SEQ ID NO:31), which was determined by sequencing a cloned cDNA (HTLEW81). The nucleotide sequence shown in Figures 7A-B (SEQ ID NO:13) was obtained by sequencing a cloned cDNA (HTLEW81), which was deposited on Nov. 17, 1998 at the American Type Culture Collection, and given Accession Number 203484.
The deposited gene is inserted in the pSport plasmid (Life Technologies, Rockville, MD) using the SalI/NotI restriction endonuclease cleavage sites.
The present invention is further directed to fragments of the isolated nucleic acid molecules described herein. By a fragment of an isolated DNA molecule having the nucleotide sequence of the deposited cDNA or the nucleotide sequence shown in SEQ ID NO:13 is intended DNA fragments at least about 15 nt, and more preferably at least about 20 nt, still more preferably at least about 30 nt, and even more preferably, at least about 40 nt in length which are useful as diagnostic probes and primers as discussed herein. Of course, larger fragments nt in length are also useful according to the present invention, as are fragments corresponding to most, if not all, of the nucleotide sequence of the deposited cDNA or as shown in SEQ ID NO:13. By a fragment at least 20 nt in length, for example, is intended fragments which include 20 or more contiguous bases from the nucleotide sequence of the deposited cDNA or the nucleotide sequence as shown in SEQ ID NO:13. In this context "about" includes the particularly recited size, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini.
Representative examples of polynucleotide fragments of the invention include, for example, fragments that comprise, or alternatively, consist of, a sequence from about nucleotide 1 to about 50, from about 51 to about 100, from about 101 to about 150, from about 151 to about 200, from about 201 to about 250, from about 251 to about 300, from about 301 to about 350, from about 351 to about 400, from about 401 to about 450, from about 451 to about 500, from about 501 to about 550, from about 551 to about 600, from about 601 to about 650, from about 651 to about 700, from about 701 to about 750, from about 751 to about 800, from about 801 to about 850, from about 851 to about 900, from about 901 to about 950, from about 951 to about 1000, from about 1001 to about 1050, from about 1051 to about 1100, from about 1101 to about 1150, from about 1151 to about 1200, from about 1201 to about 1250, from about 1251 to about 1300, and from about 1301 to about 1339 of SEQ ID NO:13, or the complementary strand thereto, or the cDNA contained in the deposited gene. In this context "about" includes the particularly recited ranges, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini. In additional embodiments, the polynucleotides of the invention encode functional attributes of the corresponding protein.
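The representative fragments listed above follow a regular pattern: consecutive 50-nucleotide windows, with a shorter final window ending at nucleotide 1339. A minimal Python sketch of that tiling is given below for illustration only; the function name `fragment_windows` and its parameters are hypothetical, and the "about" tolerance of up to 5 nucleotides at either terminus is not modeled.

```python
def fragment_windows(seq_len: int, width: int = 50) -> list[tuple[int, int]]:
    """Tile positions 1..seq_len into consecutive windows of `width` nucleotides.

    Coordinates are 1-based and inclusive, matching the ranges in the text.
    The final window is truncated at seq_len if the length is not a multiple
    of the width.
    """
    return [
        (start, min(start + width - 1, seq_len))
        for start in range(1, seq_len + 1, width)
    ]


# Windows for a 1339-nucleotide sequence such as SEQ ID NO:13.
windows = fragment_windows(1339)
```

Applied to a length of 1339, the sketch yields the same start and end positions as the list in the text, from (1, 50) through (1301, 1339).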
Preferred embodiments of the invention in this regard include fragments that comprise alpha-helix and alpha-helix forming regions ("alpha-regions"), beta-sheet and beta-sheet forming regions ("beta-regions"), turn and turn-forming regions ("turn-regions"), coil and coil-forming regions ("coil-regions"), hydrophilic regions, hydrophobic regions, alpha amphipathic regions, beta amphipathic regions, flexible regions, surface-forming regions and high antigenic index regions. The data representing the structural or functional attributes of the protein set forth in Figure 9 and/or Table III, as described above, was generated using the various modules and algorithms of the DNA*STAR set on default parameters. In a preferred embodiment, the data presented in columns VIII, IX, XIII, and XIV of Table III can be used to determine regions of the protein which exhibit a high degree of potential for antigenicity. Regions of high antigenicity are determined from the data presented in columns VIII, IX, XIII, and/or XIV by choosing values which represent regions of the polypeptide which are likely to be exposed on the surface of the polypeptide in an environment in which antigen recognition may occur in the process of initiation of an immune response.
Certain preferred regions in these regards are set out in Figure 9, but may, as shown in Table III, be represented or identified by using tabular representations of the data presented in Figure 9. The DNA*STAR computer algorithm used to generate Figure 9 (set on the original default parameters) was used to present the data in Figure 9 in a tabular format (See Table III). The tabular format of the data in Figure 9 is used to easily determine specific boundaries of a preferred region. The above-mentioned preferred regions set out in Figure 9 and in Table III include, but are not limited to, regions of the aforementioned types identified by analysis of the amino acid sequence set out in Figures 7A-B. As set out in Figure 9 and in Table III, such preferred regions include Garnier-Robson alpha-regions, beta-regions, turn-regions, and coil-regions, Chou-Fasman alpha-regions, beta-regions, and turn-regions, Kyte-Doolittle hydrophilic regions and Hopp-Woods hydrophobic regions, Eisenberg alpha- and beta-amphipathic regions, Karplus-Schulz flexible regions, Jameson-Wolf regions of high antigenic index and Emini surface-forming regions.
Even if deletion of one or more amino acids from the N-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities, ability to multimerize, etc.) may still be retained. For example, the ability of shortened muteins to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptides generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the N-terminus. Whether a particular polypeptide lacking N-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art.
It is not unlikely that a mutein with a large number of deleted N-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the amino terminus of the amino acid sequence shown in Figures 7A-B, up to the arginine residue at position number 143, and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues n1-148 of Figures 7A-B, where n1 is an integer from 2 to 143 corresponding to the position of the amino acid residue in Figures 7A-B (which is identical to the sequence shown as SEQ ID NO:31). N-terminal deletions of the polypeptide of the invention shown as SEQ ID NO:31 include polypeptides comprising the amino acid sequence of residues: E-2 to L-148; T-3 to L-148; G-4 to L-148; A-5 to L-148; L-6 to L-148; R-7 to L-148; R-8 to L-148; P-9 to L-148; Q-10 to L-148; L-11 to L-148; L-12 to L-148; P-13 to L-148; L-14 to L-148; L-15 to L-148; L-16 to L-148; L-17 to L-148; L-18 to L-148; C-19 to L-148; G-20 to L-148; G-21 to L-148; C-22 to L-148; P-23 to L-148; R-24 to L-148; A-25 to L-148; G-26 to L-148; G-27 to L-148; C-28 to L-148; N-29 to L-148; E-30 to L-148; T-31 to L-148; G-32 to L-148; M-33 to L-148; L-34 to L-148; E-35 to L-148; R-36 to L-148; L-37 to L-148; P-38 to L-148; L-39 to L-148; C-40 to L-148; G-41 to L-148; K-42 to L-148; A-43 to L-148; F-44 to L-148; A-45 to L-148; D-46 to L-148; M-47 to L-148; M-48 to L-148; G-49 to L-148; K-50 to L-148; V-51 to L-148; D-52 to L-148; V-53 to L-148; W-54 to L-148; K-55 to L-148; W-56 to L-148; C-57 to L-148; N-58 to L-148; L-59 to L-148; S-60 to L-148; E-61 to L-148; F-62 to L-148; I-63 to L-148; V-64 to L-148; Y-65 to L-148; Y-66 to L-148; E-67 to L-148; S-68 to L-148; F-69 to L-148; T-70 to L-148; N-71 to L-148; C-72 to L-148; T-73 to L-148; E-74 to L-148; M-75 to L-148; E-76 to L-148; A-77 to L-148; N-78 to L-148; V-79 to L-148; V-80 to L-148; G-81 to L-148; C-82 to L-148; Y-83 to L-148; W-84 to L-148; P-85 to L-148; N-86 to L-148; P-87 to L-148; L-88 to L-148; A-89 to L-148; Q-90 to L-148; G-91 to L-148; F-92 to L-148; I-93 to L-148; T-94 to L-148; G-95 to L-148; I-96 to L-148; H-97 to L-148; R-98 to L-148; Q-99 to L-148; F-100 to L-148; F-101 to L-148; S-102 to L-148; N-103 to L-148; C-104 to L-148; T-105 to L-148; V-106 to L-148; D-107 to L-148; R-108 to L-148; V-109 to L-148; H-110 to L-148; L-111 to L-148; E-112 to L-148; D-113 to L-148; P-114 to L-148; P-115 to L-148; D-116 to L-148; E-117 to L-148; V-118 to L-148; L-119 to L-148; I-120 to L-148; P-121 to L-148; L-122 to L-148; I-123 to L-148; V-124 to L-148; I-125 to L-148; P-126 to L-148; V-127 to L-148; V-128 to L-148; L-129 to L-148; T-130 to L-148; V-131 to L-148; A-132 to L-148; M-133 to L-148; A-134 to L-148; G-135 to L-148; L-136 to L-148; V-137 to L-148; V-138 to L-148; W-139 to L-148; R-140 to L-148; S-141 to L-148; K-142 to L-148; R-143 to L-148; of SEQ ID NO:31. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
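The N-terminal deletion series above is generated mechanically: each entry names the first retained residue and the unchanged C-terminal residue. As a hedged illustration, the Python sketch below reproduces the labeling scheme for an arbitrary sequence; the function name `n_terminal_deletions` is hypothetical, and the eight-residue string used in the example is only the first eight residues implied by the residue labels in the text (M-1 through R-8), not the full 148-residue polypeptide.

```python
def n_terminal_deletions(seq: str, last_start: int) -> list[str]:
    """Label every N-terminal deletion from residue 2 up to `last_start`.

    `seq` is a one-letter amino acid string; positions are 1-based. Each
    label has the form "<first residue>-<n> to <last residue>-<length>".
    """
    end = len(seq)
    return [
        f"{seq[n - 1]}-{n} to {seq[-1]}-{end}"
        for n in range(2, last_start + 1)
    ]


# Illustrative short peptide only; not the complete sequence of SEQ ID NO:31.
example = "METGALRR"
labels = n_terminal_deletions(example, 4)
```

For the full polypeptide the same call would use the 148-residue sequence and `last_start=143`, giving the 142 entries enumerated in the text.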
Also as mentioned above, even if deletion of one or more amino acids from the C-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities such as the ability to elicit mitogenic activity, induce differentiation of normal or malignant cells, bind to EGF receptors, etc.) may still be retained. For example, the ability to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptide generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the C-terminus. Whether a particular polypeptide lacking C-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art. It is not unlikely that a mutein with a large number of deleted C-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the carboxy terminus of the amino acid sequence of the polypeptide shown in Figures 7A-B, up to the arginine residue at position number 7, and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues 1-m1 of Figures 7A-B, where m1 is an integer from 7 to 147 corresponding to the position of the amino acid residue in Figures 7A-B. Moreover, the invention provides polynucleotides encoding polypeptides comprising, or alternatively consisting of, the amino acid sequence of such C-terminal deletions. C-terminal deletions of the polypeptide of the invention shown as SEQ ID NO:31 include polypeptides comprising the amino acid sequence of residues: M-1 to L-147; M-1 to T-146; M-1 to D-145; M-1 to T-144; M-1 to R-143; M-1 to K-142; M-1 to S-141; M-1 to R-140; M-1 to W-139; M-1 to V-138; M-1 to V-137; M-1 to L-136; M-1 to G-135; M-1 to A-134; M-1 to M-133; M-1 to A-132; M-1 to V-131; M-1 to T-130; M-1 to L-129; M-1 to V-128; M-1 to V-127; M-1 to P-126; M-1 to I-125; M-1 to V-124; M-1 to I-123; M-1 to L-122; M-1 to P-121; M-1 to I-120; M-1 to L-119; M-1 to V-118; M-1 to E-117; M-1 to D-116; M-1 to P-115; M-1 to P-114; M-1 to D-113; M-1 to E-112; M-1 to L-111; M-1 to H-110; M-1 to V-109; M-1 to R-108; M-1 to D-107; M-1 to V-106; M-1 to T-105; M-1 to C-104; M-1 to N-103; M-1 to S-102; M-1 to F-101; M-1 to F-100; M-1 to Q-99; M-1 to R-98; M-1 to H-97; M-1 to I-96; M-1 to G-95; M-1 to T-94; M-1 to I-93; M-1 to F-92; M-1 to G-91; M-1 to Q-90; M-1 to A-89; M-1 to L-88; M-1 to P-87; M-1 to N-86; M-1 to P-85; M-1 to W-84; M-1 to Y-83; M-1 to C-82; M-1 to G-81; M-1 to V-80; M-1 to V-79; M-1 to N-78; M-1 to A-77; M-1 to E-76; M-1 to M-75; M-1 to E-74; M-1 to T-73; M-1 to C-72; M-1 to N-71; M-1 to T-70; M-1 to F-69; M-1 to S-68; M-1 to E-67; M-1 to Y-66; M-1 to Y-65; M-1 to V-64; M-1 to I-63; M-1 to F-62; M-1 to E-61; M-1 to S-60; M-1 to L-59; M-1 to N-58; M-1 to C-57; M-1 to W-56; M-1 to K-55; M-1 to W-54; M-1 to V-53; M-1 to D-52; M-1 to V-51; M-1 to K-50; M-1 to G-49; M-1 to M-48; M-1 to M-47; M-1 to D-46; M-1 to A-45; M-1 to F-44; M-1 to A-43; M-1 to K-42; M-1 to G-41; M-1 to C-40; M-1 to L-39; M-1 to P-38; M-1 to L-37; M-1 to R-36; M-1 to E-35; M-1 to L-34; M-1 to M-33; M-1 to G-32; M-1 to T-31; M-1 to E-30; M-1 to N-29; M-1 to C-28; M-1 to G-27; M-1 to G-26; M-1 to A-25; M-1 to R-24; M-1 to P-23; M-1 to C-22; M-1 to G-21; M-1 to G-20; M-1 to C-19; M-1 to L-18; M-1 to L-17; M-1 to L-16; M-1 to L-15; M-1 to L-14; M-1 to P-13; M-1 to L-12; M-1 to L-11; M-1 to Q-10; M-1 to P-9; M-1 to R-8; M-1 to R-7; of SEQ ID NO:31. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
In addition, the invention provides nucleic acid molecules having nucleotide sequences related to extensive portions of SEQ ID NO:31 which have been determined from the following related cDNA genes: HLHCH17RA (SEQ ID NO:76), HTOATS1R (SEQ ID NO:77), and/or HBNB041R (SEQ ID NO:78).
The polypeptide of this gene has been determined to have a transmembrane domain at about amino acid position 122 - 138 of the amino acid sequence referenced in Table XIII for this gene. Moreover, a cytoplasmic tail encompassing amino acids 139 to 149 of this protein has also been determined. Based upon these characteristics, it is believed that the protein product of this gene shares structural features with type Ia membrane proteins.
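The predicted topology described for this polypeptide can be summarized as a small lookup of residue ranges. The sketch below is illustrative only: the dictionary `REGIONS` and the function `region_of` are hypothetical names, the coordinates are the approximate ("about") boundaries given in the text for the signal peptide, transmembrane domain, and cytoplasmic tail, and positions between the signal peptide and the transmembrane domain are left unannotated because the text does not label them.

```python
# Approximate 1-based, inclusive residue ranges as stated in the text.
REGIONS = {
    "signal peptide": (1, 27),
    "transmembrane domain": (122, 138),
    "cytoplasmic tail": (139, 149),
}


def region_of(position: int) -> str:
    """Return the name of the annotated region containing `position`.

    Positions outside every listed range are reported as "unannotated".
    """
    for name, (start, end) in REGIONS.items():
        if start <= position <= end:
            return name
    return "unannotated"
```

A call such as `region_of(130)` returns "transmembrane domain"; because the boundaries are predictions, a real analysis would treat residues near each edge with some tolerance.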
Therefore, polynucleotides and polypeptides of the invention are useful as reagents for differential identification of the tissue(s) or cell type(s) present in a biological sample and for diagnosis of diseases and conditions which include, but are not limited to, gastrointestinal and neurodegenerative diseases and disorders.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in providing immunological probes for differential identification of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, particularly of the central nervous and gastrointestinal systems, expression of this gene at significantly higher or lower levels is routinely detected in certain tissues or cell types (e.g., brain, CNS, gastrointestinal, cancerous and wounded tissues) or bodily fluids (e.g., lymph, serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample taken from an individual having such a disorder, relative to the standard gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an individual not having the disorder.
Preferred polypeptides of the present invention comprise immunogenic epitopes shown in SEQ ID NO: 31 as residues: Ala-5 to Gln-10, Pro-23 to Cys-28, Arg-140 to Asp-145. Polynucleotides encoding said polypeptides are also provided.
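Epitope ranges written in the "Ala-5 to Gln-10" style can be parsed into numeric coordinates for downstream checks. The following Python sketch is an illustration under stated assumptions: the names `EPITOPE`, `parse_epitopes`, and `epitope_lengths` are hypothetical, and the pattern assumes three-letter residue codes joined to positions by a hyphen, as in the list above.

```python
import re

# Matches ranges such as "Ala-5 to Gln-10" (three-letter code, hyphen, position).
EPITOPE = re.compile(r"([A-Z][a-z]{2})-(\d+) to ([A-Z][a-z]{2})-(\d+)")


def parse_epitopes(text: str) -> list[tuple[str, int, str, int]]:
    """Extract (start residue, start position, end residue, end position) tuples."""
    return [
        (first, int(start), last, int(end))
        for first, start, last, end in EPITOPE.findall(text)
    ]


def epitope_lengths(epitopes: list[tuple[str, int, str, int]]) -> list[int]:
    """Return the inclusive length, in residues, of each parsed epitope."""
    return [end - start + 1 for _, start, _, end in epitopes]


listed = "Ala-5 to Gln-10, Pro-23 to Cys-28, Arg-140 to Asp-145"
parsed = parse_epitopes(listed)
```

Each of the three ranges listed for SEQ ID NO: 31 spans six residues, which matches the statement elsewhere in the text that peptides of as few as six amino acid residues may evoke an immune response.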
The tissue distribution and homology to RAMP3 suggest that the translation product of this gene is useful for the detection/treatment of neurodegenerative disease states and behavioural disorders such as Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, Tourette Syndrome, schizophrenia, mania, dementia, paranoia, obsessive compulsive disorder, panic disorder, learning disabilities, ALS, psychoses, autism, and altered behaviors, including disorders in feeding, sleep patterns, balance, and perception. In addition, the gene or gene product may also play a role in the treatment and/or detection of developmental disorders associated with the developing embryo.
Alternatively, the tissue distribution in small intestine and colon tissues indicates that polynucleotides and polypeptides corresponding to this gene are useful for the diagnosis and/or treatment of disorders involving the small intestine. This may include diseases associated with digestion and food absorption, as well as hematopoietic disorders involving the Peyer's patches of the small intestine, or other hematopoietic cells and tissues within the body. Similarly, expression of this gene product in colon tissue again indicates involvement in digestion, processing, and elimination of food, as well as a potential role for this gene as a diagnostic marker or causative agent in the development of colon cancer, and cancer in general. The protein, as well as antibodies directed against the protein, may show utility as a tumor marker and/or immunotherapy target for the above-listed tissues.
Many polynucleotide sequences, such as EST sequences, are publicly available and accessible through sequence databases. Some of these sequences are related to SEQ ID NO:13 and may have been publicly available prior to conception of the present invention. Preferably, such related polynucleotides are specifically excluded from the scope of the present invention. To list every related sequence is cumbersome.
Accordingly, preferably excluded from the present invention are one or more polynucleotides comprising a nucleotide sequence described by the general formula of a-b, where a is any integer from 1 to 1325 of SEQ ID NO:13, b is an integer of 15 to 1339, where both a and b correspond to the positions of nucleotide residues shown in SEQ ID NO:13, and where b is greater than or equal to a + 14.
FEATURES OF PROTEIN ENCODED BY GENE NO: 4
The translation product of this gene shares sequence homology with a proteoglycan from Gallus gallus, and this proteoglycan is believed to participate in the osteogenic processes of cartilage ossification (See Genbank Accession No. gi|222847).
Based on the sequence similarity, the translation product of this gene is expected to share biological activities with the Gallus gallus proteoglycan polypeptide.
Figures 10A-B show the nucleotide (SEQ ID NO:14) and deduced amino acid sequence (SEQ ID NO:32) of the retinal specific protein. Predicted amino acids from about 1 to about 21 constitute the predicted signal peptide (amino acid residues from about 1 to about 21 in SEQ ID NO:32) and are represented by the underlined amino acid regions.
Figure 11 shows the regions of similarity between the amino acid sequences of the retinal specific protein (SEQ ID NO:32) and the Gallus gallus proteoglycan (SEQ ID NO:79).
Figure 12 shows an analysis of the amino acid sequence of SEQ ID NO: 32. Alpha, beta, turn and coil regions; hydrophilicity and hydrophobicity; amphipathic regions; flexible regions; antigenic index and surface probability are shown.
Northern analysis indicates that this gene is expressed in adrenal cortex and adrenal medulla tissues. This gene is also expressed in retinal tissue.
The present invention provides isolated nucleic acid molecules comprising a polynucleotide encoding the polypeptide having the amino acid sequence shown in Figures 10A-B (SEQ ID NO:32), which was determined by sequencing a cloned cDNA (HARA044). The nucleotide sequence shown in Figures 10A-B (SEQ ID NO:14) was obtained by sequencing a cloned cDNA (HARA044), which was deposited on Nov. 17, 1998 at the American Type Culture Collection, and given Accession Number 203484.
The deposited gene is inserted in the pSport plasmid (Life Technologies, Rockville, MD) using the SalI/NotI restriction endonuclease cleavage sites.
The present invention is further directed to fragments of the isolated nucleic acid molecules described herein. By a fragment of an isolated DNA molecule having the nucleotide sequence of the deposited cDNA or the nucleotide sequence shown in SEQ ID NO:14 is intended DNA fragments at least about 15 nt, and more preferably at least about 20 nt, still more preferably at least about 30 nt, and even more preferably, at least about 40 nt in length which are useful as diagnostic probes and primers as discussed herein. Of course, larger fragments 50-1500 nt in length are also useful according to the present invention, as are fragments corresponding to most, if not all, of the nucleotide sequence of the deposited cDNA or as shown in SEQ ID NO:14. By a fragment at least 20 nt in length, for example, is intended fragments which include 20 or more contiguous bases from the nucleotide sequence of the deposited cDNA or the nucleotide sequence as shown in SEQ ID NO:14. In this context "about" includes the particularly recited size, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini.
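As an illustration of the "20 or more contiguous bases" criterion, the following is a minimal sketch, assuming sequences are handled as plain strings. The function name `shares_contiguous_run` and the toy reference sequence are hypothetical and are not taken from the specification or from SEQ ID NO:14.

```python
def shares_contiguous_run(candidate: str, reference: str, min_len: int = 20) -> bool:
    """Return True if `candidate` contains at least `min_len` contiguous bases
    that also occur contiguously in `reference` (same strand only)."""
    if min_len <= 0:
        raise ValueError("min_len must be positive")
    candidate = candidate.upper()
    reference = reference.upper()
    if len(candidate) < min_len or len(reference) < min_len:
        return False
    # Index every window of length min_len found in the reference.
    reference_windows = {
        reference[i:i + min_len] for i in range(len(reference) - min_len + 1)
    }
    # A candidate qualifies if any of its own windows is in that index.
    return any(
        candidate[i:i + min_len] in reference_windows
        for i in range(len(candidate) - min_len + 1)
    )


# Toy data only (hypothetical), standing in for a real reference sequence.
toy_reference = "ACGT" * 10
toy_candidate = "TTTT" + toy_reference[5:27] + "GGGG"
print(shares_contiguous_run(toy_candidate, toy_reference, min_len=20))
```

The sketch checks only the given strand; a comparison against the complementary strand would need the reverse complement of the reference as a second input.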
Representative examples of polynucleotide fragments of the invention include, for example, fragments that comprise, or alternatively, consist of, a sequence from about nucleotide 1 to about 50, from about 51 to about 100, from about 101 to about 150, from about 151 to about 200, from about 201 to about 250, from about 251 to about 300, from about 301 to about 350, from about 351 to about 400, from about 401 to about 450, from about 451 to about 500, from about 501 to about 550, from about 551 to about 600, from about 601 to about 650, from about 651 to about 700, from about 701 to about 750, from about 751 to about 800, from about 801 to about 850, from about 851 to about 900, from about 901 to about 950, from about 951 to about 1000, from about 1001 to about 1050, from about 1051 to about 1100, from about 1101 to about 1150, from about 1151 to about 1200, from about 1201 to about 1250, from about 1251 to about 1300, from about 1301 to about 1350, from about 1351 to about 1389, and from about 187 to about 1119 of SEQ ID NO:14, or the complementary strand thereto, or the cDNA contained in the deposited gene. In this context "about" includes the particularly recited ranges, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini. In additional embodiments, the polynucleotides of the invention encode functional attributes of the corresponding protein.
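The consecutive 50-nucleotide ranges recited above, and the "about" tolerance of up to 5 nucleotides at either terminus, can be sketched as follows. This is one illustrative reading, not a definitive construction: the helper names `fixed_windows` and `about_variants` are hypothetical, positions are 1-based and inclusive, and the sketch assumes that shifted end points are clipped to the sequence length.

```python
def fixed_windows(length: int, size: int = 50) -> list[tuple[int, int]]:
    """Consecutive 1-based inclusive windows of `size` covering 1..length."""
    return [
        (start, min(start + size - 1, length))
        for start in range(1, length + 1, size)
    ]


def about_variants(start: int, end: int, length: int, slack: int = 5) -> list[tuple[int, int]]:
    """All (s, e) pairs in which each terminus is moved by at most `slack`
    positions, kept inside 1..length and with s <= e."""
    starts = range(max(1, start - slack), min(length, start + slack) + 1)
    ends = range(max(1, end - slack), min(length, end + slack) + 1)
    return [(s, e) for s in starts for e in ends if s <= e]


windows = fixed_windows(1389)       # 1389 is the upper bound recited for SEQ ID NO:14
print(windows[0], windows[-1])      # first and last 50-nt windows
print(len(about_variants(*windows[0], length=1389)))
```

The zero-shift case (the recited range itself) is included among the variants, since the text says "about" includes the particularly recited ranges.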
Preferred embodiments of the invention in this regard include fragments that comprise alpha-helix and alpha-helix forming regions ("alpha-regions"), beta-sheet and beta-sheet forming regions ("beta-regions"), turn and turn-forming regions ("turn-regions"), coil and coil-forming regions ("coil-regions"), hydrophilic regions, hydrophobic regions, alpha amphipathic regions, beta amphipathic regions, flexible regions, surface-forming regions and high antigenic index regions. The data representing the structural or functional attributes of the protein set forth in Figure 12 and/or Table IV, as described above, was generated using the various modules and algorithms of the DNA*STAR set on default parameters. In a preferred embodiment, the data presented in columns VIII, IX, XIII, and XIV of Table IV can be used to determine regions of the protein which exhibit a high degree of potential for antigenicity. Regions of high antigenicity are determined from the data presented in columns VIII, IX, XIII, and/or XIV by choosing values which represent regions of the polypeptide which are likely to be exposed on the surface of the polypeptide in an environment in which antigen recognition may occur in the process of initiation of an immune response.
Certain preferred regions in these regards are set out in Figure 12, but may, as shown in Table IV, be represented or identified by using tabular representations of the data presented in Figure 12. The DNA*STAR computer algorithm used to generate Figure 12 (set on the original default parameters) was used to present the data in Figure 12 in a tabular format (See Table IV). The tabular format of the data in Figure 12 is used to easily determine specific boundaries of a preferred region. The above-mentioned preferred regions set out in Figure 12 and in Table IV include, but are not limited to, regions of the aforementioned types identified by analysis of the amino acid sequence set out in Figures 10A-B. As set out in Figure 12 and in Table IV, such preferred regions include Garnier-Robson alpha-regions, beta-regions, turn-regions, and coil-regions, Chou-Fasman alpha-regions, beta-regions, and turn-regions, Kyte-Doolittle hydrophilic regions and Hopp-Woods hydrophobic regions, Eisenberg alpha- and beta-amphipathic regions, Karplus-Schulz flexible regions, Jameson-Wolf regions of high antigenic index and Emini surface-forming regions.
Even if deletion of one or more amino acids from the N-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities, ability to multimerize, etc.) may still be retained. For example, the ability of shortened muteins to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptides generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the N-terminus. Whether a particular polypeptide lacking N-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art.
It is not unlikely that a mutein with a large number of deleted N-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the amino terminus of the amino acid sequence shown in Figures 10A-B, up to the proline residue at position number 327, and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues n1-332 of Figures 10A-B, where n1 is an integer from 2 to 327 corresponding to the position of the amino acid residue in Figures 10A-B (which is identical to the sequence shown as SEQ ID NO:32). N-terminal deletions of the polypeptide of the invention shown as SEQ ID NO:32 include polypeptides comprising the amino acid sequence of residues: R-2 to T-332; L-3 to T-332; L-4 to T-332; A-5 to T-332; F-6 to T-332; L-7 to T-332; S-8 to T-332; L-9 to T-332; L-10 to T-332; A-11 to T-332; L-12 to T-332; V-13 to T-332; L-14 to T-332; Q-15 to T-332; E-16 to T-332; T-17 to T-332; G-18 to T-332; T-19 to T-332; A-20 to T-332; S-21 to T-332; L-22 to T-332; P-23 to T-332; R-24 to T-332; K-25 to T-332; E-26 to T-332; R-27 to T-332; K-28 to T-332; R-29 to T-332; R-30 to T-332; E-31 to T-332; E-32 to T-332; Q-33 to T-332; M-34 to T-332; P-35 to T-332; R-36 to T-332; E-37 to T-332; G-38 to T-332; D-39 to T-332; S-40 to T-332; F-41 to T-332; E-42 to T-332; V-43 to T-332; L-44 to T-332; P-45 to T-332; L-46 to T-332; R-47 to T-332; N-48 to T-332; D-49 to T-332; V-50 to T-332; L-51 to T-332; N-52 to T-332; P-53 to T-332; D-54 to T-332; N-55 to T-332; Y-56 to T-332; G-57 to T-332; E-58 to T-332; V-59 to T-332; I-60 to T-332; D-61 to T-332; L-62 to T-332; S-63 to T-332; N-64 to T-332; Y-65 to T-332; E-66 to T-332; E-67 to T-332; L-68 to T-332; T-69 to T-332; D-70 to T-332; Y-71 to T-332; G-72 to T-332; D-73 to T-332; Q-74 to T-332; L-75 to T-332; P-76 to T-332; E-77 to T-332; V-78 to T-332; K-79 to T-332; V-80 to T-332; T-81 to T-332; S-82 to T-332; L-83 to T-332; A-84 to T-332; P-85 to T-332; A-86 to T-332; T-87 to T-332; S-88 to T-332; I-89 to T-332; S-90 to T-332; P-91 to T-332; A-92 to T-332; K-93 to T-332; S-94 to T-332; T-95 to T-332; T-96 to T-332; A-97 to T-332; P-98 to T-332; G-99 to T-332; T-100 to T-332; P-101 to T-332; S-102 to T-332; S-103 to T-332; N-104 to T-332; P-105 to T-332; T-106 to T-332; M-107 to T-332; T-108 to T-332; R-109 to T-332; P-110 to T-332; T-111 to T-332; T-112 to T-332; A-113 to T-332; G-114 to T-332; L-115 to T-332; L-116 to T-332; L-117 to T-332; S-118 to T-332; S-119 to T-332; Q-120 to T-332; P-121 to T-332; N-122 to T-332; H-123 to T-332; G-124 to T-332; L-125 to T-332; P-126 to T-332; T-127 to T-332; C-128 to T-332; L-129 to T-332; V-130 to T-332; C-131 to T-332; V-132 to T-332; C-133 to T-332; L-134 to T-332; G-135 to T-332; S-136 to T-332; S-137 to T-332; V-138 to T-332; Y-139 to T-332; C-140 to T-332; D-141 to T-332; D-142 to T-332; I-143 to T-332; D-144 to T-332; L-145 to T-332; E-146 to T-332; D-147 to T-332; I-148 to T-332; P-149 to T-332; P-150 to T-332; L-151 to T-332; P-152 to T-332; R-153 to T-332; R-154 to T-332; T-155 to T-332; A-156 to T-332; Y-157 to T-332; L-158 to T-332; Y-159 to T-332; A-160 to T-332; R-161 to T-332; F-162 to T-332; N-163 to T-332; R-164 to T-332; I-165 to T-332; S-166 to T-332; R-167 to T-332; I-168 to T-332; R-169 to T-332; A-170 to T-332; E-171 to T-332; D-172 to T-332; F-173 to T-332; K-174 to T-332; G-175 to T-332; L-176 to T-332; T-177 to T-332; K-178 to T-332; L-179 to T-332; K-180 to T-332; R-181 to T-332; I-182 to T-332; D-183 to T-332; L-184 to T-332; S-185 to T-332; N-186 to T-332; N-187 to T-332; L-188 to T-332; I-189 to T-332; S-190 to T-332; S-191 to T-332; I-192 to T-332; D-193 to T-332; N-194 to T-332; D-195 to T-332; A-196 to T-332; F-197 to T-332; R-198 to T-332; L-199 to T-332; L-200 to T-332; H-201 to T-332; A-202 to T-332; L-203 to T-332; Q-204 to T-332; D-205 to T-332; L-206 to T-332; I-207 to T-332; L-208 to T-332; P-209 to T-332; E-210 to T-332; N-211 to T-332; Q-212 to T-332; L-213 to T-332; E-214 to T-332; A-215 to T-332; L-216 to T-332; P-217 to T-332; V-218 to T-332; L-219 to T-332; P-220 to T-332; S-221 to T-332; G-222 to T-332; I-223 to T-332; E-224 to T-332; F-225 to T-332; L-226 to T-332; D-227 to T-332; V-228 to T-332; R-229 to T-332; L-230 to T-332; N-231 to T-332; R-232 to T-332; L-233 to T-332; Q-234 to T-332; S-235 to T-332; S-236 to T-332; G-237 to T-332; I-238 to T-332; Q-239 to T-332; P-240 to T-332; A-241 to T-332; A-242 to T-332; F-243 to T-332; R-244 to T-332; A-245 to T-332; M-246 to T-332; E-247 to T-332; K-248 to T-332; L-249 to T-332; Q-250 to T-332; F-251 to T-332; L-252 to T-332; Y-253 to T-332; L-254 to T-332; S-255 to T-332; D-256 to T-332; N-257 to T-332; L-258 to T-332; L-259 to T-332; D-260 to T-332; S-261 to T-332; I-262 to T-332; P-263 to T-332; G-264 to T-332; P-265 to T-332; L-266 to T-332; P-267 to T-332; P-268 to T-332; S-269 to T-332; L-270 to T-332; R-271 to T-332; S-272 to T-332; V-273 to T-332; H-274 to T-332; L-275 to T-332; Q-276 to T-332; N-277 to T-332; N-278 to T-332; L-279 to T-332; I-280 to T-332; E-281 to T-332; T-282 to T-332; M-283 to T-332; Q-284 to T-332; R-285 to T-332; D-286 to T-332; V-287 to T-332; F-288 to T-332; C-289 to T-332; D-290 to T-332; P-291 to T-332; E-292 to T-332; E-293 to T-332; H-294 to T-332; K-295 to T-332; H-296 to T-332; T-297 to T-332; R-298 to T-332; R-299 to T-332; Q-300 to T-332; L-301 to T-332; E-302 to T-332; D-303 to T-332; I-304 to T-332; R-305 to T-332; L-306 to T-332; D-307 to T-332; G-308 to T-332; N-309 to T-332; P-310 to T-332; I-311 to T-332; N-312 to T-332; L-313 to T-332; S-314 to T-332; L-315 to T-332; F-316 to T-332; P-317 to T-332; S-318 to T-332; A-319 to T-332; Y-320 to T-332; F-321 to T-332; C-322 to T-332; L-323 to T-332; P-324 to T-332; R-325 to T-332; L-326 to T-332; P-327 to T-332; of SEQ ID NO:32. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
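The enumeration above follows a regular pattern: for each starting position n1, the label pairs the residue at n1 with the fixed C-terminal residue. A minimal sketch of that pattern is given below, assuming the protein is held as a one-letter-code string. The function name `n_terminal_deletions` and the eight-residue toy sequence are hypothetical; the full sequence of SEQ ID NO:32 is not reproduced here.

```python
def n_terminal_deletions(protein: str, last_start: int) -> list[str]:
    """Labels of the form 'X-n1 to Z-len' for n1 = 2 .. last_start,
    using 1-based residue numbering."""
    if not 2 <= last_start <= len(protein):
        raise ValueError("last_start must lie between 2 and the protein length")
    c_terminal = f"{protein[-1]}-{len(protein)}"
    return [
        f"{protein[n1 - 1]}-{n1} to {c_terminal}"
        for n1 in range(2, last_start + 1)
    ]


# Toy sequence only (hypothetical), used to show the label format.
for label in n_terminal_deletions("MRLLAFLS", last_start=4):
    print(label)
```

For a 332-residue protein with `last_start=327`, the same call would yield 326 labels, matching the range n1 = 2 to 327 recited above.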
Also as mentioned above, even if deletion of one or more amino acids from the C-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities such as the ability to elicit mitogenic activity, induce differentiation of normal or malignant cells, bind to EGF receptors, etc.) may still be retained. For example, the ability to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptide generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the C-terminus. Whether a particular polypeptide lacking C-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art. It is not unlikely that a mutein with a large number of deleted C-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the carboxy terminus of the amino acid sequence of the polypeptide shown in Figures 10A-B, up to the glutamine residue at position number 7, and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues 1-m1 of Figures 10A-B, where m1 is an integer from 7 to 331 corresponding to the position of the amino acid residue in Figures 10A-B. Moreover, the invention provides polynucleotides encoding polypeptides comprising, or alternatively consisting of, the amino acid sequence of C-terminal deletions of the polypeptide of the invention shown as SEQ ID NO:32. Such C-terminal deletions include polypeptides comprising the amino acid sequence of residues: M-1 to F-331; M-1 to R-330; M-1 to G-329; M-1 to I-328; M-1 to P-327; M-1 to L-326; M-1 to R-325; M-1 to P-324; M-1 to L-323; M-1 to C-322; M-1 to F-321; M-1 to Y-320; M-1 to A-319; M-1 to S-318; M-1 to P-317; M-1 to F-316; M-1 to L-315; M-1 to S-314; M-1 to L-313; M-1 to N-312; M-1 to I-311; M-1 to P-310; M-1 to N-309; M-1 to G-308; M-1 to D-307; M-1 to L-306; M-1 to R-305; M-1 to I-304; M-1 to D-303; M-1 to E-302; M-1 to L-301; M-1 to Q-300; M-1 to R-299; M-1 to R-298; M-1 to T-297; M-1 to H-296; M-1 to K-295; M-1 to H-294; M-1 to E-293; M-1 to E-292; M-1 to P-291; M-1 to D-290; M-1 to C-289; M-1 to F-288; M-1 to V-287; M-1 to D-286; M-1 to R-285; M-1 to Q-284; M-1 to M-283; M-1 to T-282; M-1 to E-281; M-1 to I-280; M-1 to L-279; M-1 to N-278; M-1 to N-277; M-1 to Q-276; M-1 to L-275; M-1 to H-274; M-1 to V-273; M-1 to S-272; M-1 to R-271; M-1 to L-270; M-1 to S-269; M-1 to P-268; M-1 to P-267; M-1 to L-266; M-1 to P-265; M-1 to G-264; M-1 to P-263; M-1 to I-262; M-1 to S-261; M-1 to D-260; M-1 to L-259; M-1 to L-258; M-1 to N-257; M-1 to D-256; M-1 to S-255; M-1 to L-254; M-1 to Y-253; M-1 to L-252; M-1 to F-251; M-1 to Q-250; M-1 to L-249; M-1 to K-248; M-1 to E-247; M-1 to M-246; M-1 to A-245; M-1 to R-244; M-1 to F-243; M-1 to A-242; M-1 to A-241; M-1 to P-240; M-1 to Q-239; M-1 to I-238; M-1 to G-237; M-1 to S-236; M-1 to S-235; M-1 to Q-234; M-1 to L-233; M-1 to R-232; M-1 to N-231; M-1 to L-230; M-1 to R-229; M-1 to V-228; M-1 to D-227; M-1 to L-226; M-1 to F-225; M-1 to E-224; M-1 to I-223; M-1 to G-222; M-1 to S-221; M-1 to P-220; M-1 to L-219; M-1 to V-218; M-1 to P-217; M-1 to L-216; M-1 to A-215; M-1 to E-214; M-1 to L-213; M-1 to Q-212; M-1 to N-211; M-1 to E-210; M-1 to P-209; M-1 to L-208; M-1 to I-207; M-1 to L-206; M-1 to D-205; M-1 to Q-204; M-1 to L-203; M-1 to A-202; M-1 to H-201; M-1 to L-200; M-1 to L-199; M-1 to R-198; M-1 to F-197; M-1 to A-196; M-1 to D-195; M-1 to N-194; M-1 to D-193; M-1 to I-192; M-1 to S-191; M-1 to S-190; M-1 to I-189; M-1 to L-188; M-1 to N-187; M-1 to N-186; M-1 to S-185; M-1 to L-184; M-1 to D-183; M-1 to I-182; M-1 to R-181; M-1 to K-180; M-1 to L-179; M-1 to K-178; M-1 to T-177; M-1 to L-176; M-1 to G-175; M-1 to K-174; M-1 to F-173; M-1 to D-172; M-1 to E-171; M-1 to A-170; M-1 to R-169; M-1 to I-168; M-1 to R-167; M-1 to S-166; M-1 to I-165; M-1 to R-164; M-1 to N-163; M-1 to F-162; M-1 to R-161; M-1 to A-160; M-1 to Y-159; M-1 to L-158; M-1 to Y-157; M-1 to A-156; M-1 to T-155; M-1 to R-154; M-1 to R-153; M-1 to P-152; M-1 to L-151; M-1 to P-150; M-1 to P-149; M-1 to I-148; M-1 to D-147; M-1 to E-146; M-1 to L-145; M-1 to D-144; M-1 to I-143; M-1 to D-142; M-1 to D-141; M-1 to C-140; M-1 to Y-139; M-1 to V-138; M-1 to S-137; M-1 to S-136; M-1 to G-135; M-1 to L-134; M-1 to C-133; M-1 to V-132; M-1 to C-131; M-1 to V-130; M-1 to L-129; M-1 to C-128; M-1 to T-127; M-1 to P-126; M-1 to L-125; M-1 to G-124; M-1 to H-123; M-1 to N-122; M-1 to P-121; M-1 to Q-120; M-1 to S-119; M-1 to S-118; M-1 to L-117; M-1 to L-116; M-1 to L-115; M-1 to G-114; M-1 to A-113; M-1 to T-112; M-1 to T-111; M-1 to P-110; M-1 to R-109; M-1 to T-108; M-1 to M-107; M-1 to T-106; M-1 to P-105; M-1 to N-104; M-1 to S-103; M-1 to S-102; M-1 to P-101; M-1 to T-100; M-1 to G-99; M-1 to P-98; M-1 to A-97; M-1 to T-96; M-1 to T-95; M-1 to S-94; M-1 to K-93; M-1 to A-92; M-1 to P-91; M-1 to S-90; M-1 to I-89; M-1 to S-88; M-1 to T-87; M-1 to A-86; M-1 to P-85; M-1 to A-84; M-1 to L-83; M-1 to S-82; M-1 to T-81; M-1 to V-80; M-1 to K-79; M-1 to V-78; M-1 to E-77; M-1 to P-76; M-1 to L-75; M-1 to Q-74; M-1 to D-73; M-1 to G-72; M-1 to Y-71; M-1 to D-70; M-1 to T-69; M-1 to L-68; M-1 to E-67; M-1 to E-66; M-1 to Y-65; M-1 to N-64; M-1 to S-63; M-1 to L-62; M-1 to D-61; M-1 to I-60; M-1 to V-59; M-1 to E-58; M-1 to G-57; M-1 to Y-56; M-1 to N-55; M-1 to D-54; M-1 to P-53; M-1 to N-52; M-1 to L-51; M-1 to V-50; M-1 to D-49; M-1 to N-48; M-1 to R-47; M-1 to L-46; M-1 to P-45; M-1 to L-44; M-1 to V-43; M-1 to E-42; M-1 to F-41; M-1 to S-40; M-1 to D-39; M-1 to G-38; M-1 to E-37; M-1 to R-36; M-1 to P-35; M-1 to M-34; M-1 to Q-33; M-1 to E-32; M-1 to E-31; M-1 to R-30; M-1 to R-29; M-1 to K-28; M-1 to R-27; M-1 to E-26; M-1 to K-25; M-1 to R-24; M-1 to P-23; M-1 to L-22; M-1 to S-21; M-1 to A-20; M-1 to T-19; M-1 to G-18; M-1 to T-17; M-1 to E-16; M-1 to Q-15; M-1 to L-14; M-1 to V-13; M-1 to L-12; M-1 to A-11; M-1 to L-10; M-1 to L-9; M-1 to S-8; M-1 to L-7; of SEQ ID NO:32. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
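The C-terminal deletions above keep residue 1 fixed and shorten the chain one residue at a time. As a minimal sketch, assuming a one-letter-code string, the truncated sequences themselves (rather than their labels) could be generated as follows. The function name `c_terminal_truncations` and the toy sequence are hypothetical.

```python
def c_terminal_truncations(protein: str, shortest: int) -> dict[int, str]:
    """Map each end position m1 (len-1 down to `shortest`, 1-based)
    to the polypeptide made of residues 1..m1."""
    if not 1 <= shortest < len(protein):
        raise ValueError("shortest must be at least 1 and less than the protein length")
    return {
        m1: protein[:m1]                      # residues 1..m1 inclusive
        for m1 in range(len(protein) - 1, shortest - 1, -1)
    }


# Toy sequence only (hypothetical), not SEQ ID NO:32.
truncations = c_terminal_truncations("MRLLAFLS", shortest=6)
for m1, sequence in truncations.items():
    print(f"1-{m1}: {sequence}")
```

For a 332-residue protein with `shortest=7`, this gives 325 truncated polypeptides, corresponding to m1 = 331 down to 7.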
In addition, the invention provides nucleic acid molecules having nucleotide sequences related to extensive portions of SEQ ID NO:14 which have been determined from the following related cDNA genes: HARAY79R (SEQ ID NO:80), HARA044R (SEQ ID NO:81), HARAJ74R (SEQ ID NO:82), HARA066R (SEQ ID NO:83), HARAN19R (SEQ ID NO:84), and HARAT78R (SEQ ID NO:85).
Therefore, polynucleotides and polypeptides of the invention are useful as reagents for differential identification of the tissue(s) or cell type(s) present in a biological sample and for diagnosis of diseases and conditions which include, but are not limited to, retinal disorders. Similarly, polypeptides and antibodies directed to these polypeptides are useful in providing immunological probes for differential identification of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, particularly of the retina, expression of this gene at significantly higher or lower levels is routinely detected in certain tissues or cell types (e.g., retinal, cancerous and wounded tissues) or bodily fluids (e.g., lymph, serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample taken from an individual having such a disorder, relative to the standard gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an individual not having the disorder.
Preferred polypeptides of the present invention comprise immunogenic epitopes shown in SEQ ID NO:32 as residues: Leu-22 to Asp-39, Asn-64 to Pro-76, Pro-98 to Thr-111, and Pro-291 to Glu-302. Polynucleotides encoding said polypeptides are also provided.
The tissue distribution in retinal tissue, and the homology to a Gallus gallus proteoglycan involved in the ossification process, indicate that polynucleotides and polypeptides corresponding to this gene are useful for the treatment of disorders of the retina which involve the adhesion of tissues, or the binding of certain proteins to the cell surface. The translation products of this gene are useful for the treatment of retinal disorders such as retinal detachment in individuals suffering from myopia, or in the treatment of macular degeneration. Furthermore, this gene may serve as a tumor marker for retinoblastomas, or related tumors. More generally, the tissue distribution in retinal tissue indicates that the translation product of this gene is useful for the diagnosis, detection and/or treatment of eye disorders including blindness, color blindness, impaired vision, short and long sightedness, retinitis pigmentosa, retinitis proliferans, retinoblastoma, retinochoroiditis, retinopathy and retinoschisis. Based upon the tissue distribution of this protein, antagonists directed against this protein are useful in blocking the activity of this protein. Accordingly, preferred are antibodies which specifically bind a portion of the translation product of this gene.
Also provided is a kit for detecting tumors in which expression of this protein occurs. Such a kit comprises in one embodiment an antibody specific for the translation product of this gene bound to a solid support. Also provided is a method of detecting these tumors in an individual which comprises a step of contacting an antibody specific for the translation product of this gene to a bodily fluid from the individual, preferably serum, and ascertaining whether antibody binds to an antigen found in the bodily fluid.
Preferably the antibody is bound to a solid support and the bodily fluid is serum. The above embodiments, as well as other treatments and diagnostic tests (kits and methods), are more particularly described elsewhere herein. Furthermore, the protein may also be used to determine biological activity, to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify agents that modulate their interactions, in addition to its use as a nutritional supplement. Protein, as well as, antibodies directed against the protein may show utility as a tumor marker and/or immunotherapy targets for the above listed tissues.
Many polynucleotide sequences, such as EST sequences, are publicly available and accessible through sequence databases. Some of these sequences are related to SEQ ID NO:14 and may have been publicly available prior to conception of the present invention. Preferably, such related polynucleotides are specifically excluded from the scope of the present invention. To list every related sequence is cumbersome. Accordingly, preferably excluded from the present invention are one or more polynucleotides comprising a nucleotide sequence described by the general formula of a-b, where a is any integer from 1 to 1375 of SEQ ID NO:14, b is an integer of 15 to 1389, where both a and b correspond to the positions of nucleotide residues shown in SEQ ID NO:14, and where b is greater than or equal to a + 14.
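The general formula a-b can be read as a simple predicate on a pair of 1-based positions. The following is a minimal sketch under that reading; the function name `in_excluded_family` is hypothetical, and the default bounds are the values recited above for SEQ ID NO:14.

```python
def in_excluded_family(a: int, b: int,
                       a_max: int = 1375,
                       b_min: int = 15,
                       b_max: int = 1389) -> bool:
    """True if (a, b) satisfies the recited constraints:
    1 <= a <= a_max, b_min <= b <= b_max, and b >= a + 14."""
    return 1 <= a <= a_max and b_min <= b <= b_max and b >= a + 14


print(in_excluded_family(1, 15))       # shortest span starting at position 1
print(in_excluded_family(100, 113))    # only 14 nucleotides long, so not covered
print(in_excluded_family(1375, 1389))  # last permitted start position
```

The condition b >= a + 14 means every covered span from a to b is at least 15 nucleotides long, which matches the minimum fragment length of about 15 nt used elsewhere in the description.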
FEATURES OF PROTEIN ENCODED BY GENE NO: 5
The translation product of this gene shares sequence homology with the CD33 protein (See Genbank Accession No. gi|2913995). The expression pattern of CD33 within the hematopoietic system indicates a potential role in the regulation of myeloid cell differentiation. However, this expression is absent from hematopoietic stem cells. CD33 is expressed in clonogenic leukemia cells in about 90% of patients suffering from acute myeloid leukemia (AML). While about 60-70% of adults suffering from AML experience complete remission due to chemotherapy application, most of these patients will ultimately die of relapsed leukemia. It is believed that, like CD33, the CD33-like protein of the present invention is also expressed by clonogenic leukemia cells from the vast majority of patients with AML. Thus, there is a clear need to identify and isolate nucleic acid molecules encoding additional polypeptides having CD33-like protein activity. It is believed that cancerous tissue contains significantly greater amounts of CD33-like protein gene copy number and expresses significantly enhanced levels of CD33-like protein and mRNA encoding the CD33-like protein when compared to a "standard" mammal, i.e., a mammal of the same species not having the cancer or inflammatory disease. Thus, enhanced levels of the CD33-like protein will be detected in certain bodily fluids (e.g., serum, plasma, urine, synovial fluid and spinal fluid) from mammals when compared to sera from mammals of the same species not having the cancer or inflammatory disease.
Two related cDNA genes, HDPIB36 and HEOMHIO, have been isolated. These cDNA genes appear to encode splice variants of this gene. Preferred polynucleotides comprise the following sequences:
CGACCCACGCGTCCGCCGCCTTCGGCTTCCCCTTCTGCCAA
GAGCCCTGAGCCACTCACAGCACGACCAGAGA (SEQ ID NO: 86), GTATGGAATGGGGTGGGAACCCCTGCCTCTCACACTGGGGAGGGACCCTGGG
GACAGCCTATGGGCTGAGCAGAGAGGGCTCTCAGGGACCCCTGCAGCACAA
GAATCTCCCACCACGGTCTCTGTCCCAGCCCTGACTCAGAAGCCTGATGTCTA
CATCCCCGAGACCCTGGAGCCCGGGCAGCCGGTGACGGTCATCTGTGTGTTT
AACTGGGCCTTTGAGGAATGTCCACCCCCTTCTTTCTCCTGGACGGGGGCTGC
CCTCTCCTCCCAAGGAACCAAACCAACGACCTCCCACTTCTCAG (SEQ ID NO: 87), ATCCTCCAGAGAACCTGAGAGTGATGGTTTCCCAAGCAAACAGGACAGGTA
GGAAAGGGGACAGAGGAGCCAAGGCCTCTCAGTGCCGAATTGGGGGCCCAG
GAGTCTGGAGGGTCCCCACGCAGGAGGGTCCCTGAGCCCTGAGCTGCTCATC
GATTCTGCCTCTTCCTTCCCT (SEQ ID NO: 88), GTGAGTGGGGGAAAGGGGACACCTGGGTCCCAGGAAGGGGACCCTGCTGAG
TCCTGTCCTCCCTCCCCTCAG (SEQ ID NO: 89), CTGGCCCCCTGGCTCAGAAGCGGAATCAGAAAGCCACACCAAACAGTCCTCG
GACCCCTCTTCCACCAGGTGCTCCCTCCCCAGAATCAAAGAAGAACCAGAAA
AAGCAGTATCAGTTGCCCAGTTTCCCAGAACCCAAATCATCCACTCAAGCCC
CAGAATCCCAGGAGAGCCAAGAGGAGCTCCATTATGCCACGCTCAACTTCCC
AGGCGTCAGACCCAGGCCTGAGGCCCGGATGCCCAAGGGCACCCAGGCGGA
TTATGCAGAAGTCAAGTTCCAATGAGGGTCTCTTAGGCTTTAGGACTGGGAC

GTTTCCTTCTCTCCCTCTCTCTCTCTCTTTCTCTCTCTCTCTCTCTTTCTCTCTCT
TTT (SEQ ID NO: 90), and/or AAAAAAACATCTGGCCAGGGCACAGTGGCTCACGCCTGTAATCCCAGCACTT
TGGGAGGTTGAGGTGGGCAGATCGCCTGAGGTCGGGAGTTCGAGACCAGCC
TGGCCAACTTGGTGAAACCCCGTCTCTACTAAAAATACAAAAATTAGCTGGG
CATGGTGGCAGGCGCCTGTAATCCTACTACTTGGGAAGCTGAGGCAGGAGAA
TCACTTGAACCTGGGAGACGGAGGTTGCAGTGAGCCAAGATCACACCATTGC
ACGCCAGCTTGGGCAACAAAGCGAGACTCCATCTCAAAAAAAAAATCCTCC
AAATGGGTTGGGTGTCTGTAATCCCAGCACTTTGGGAGGCTAAGGTGGGTGG
ATTGCTTGAGCCCAGGAGTTCGAGACCAGCCTGGGCAACATGGTGAAACCCC
ATCTCTACAAAAAATACAAAACATAGCTGGGCTTGGTGGTGTGTGCCTGTAG
TCCCAGCTGTCAGACATTTAAACCAGAGCAACTCCCATCTGGAATGGGAGCT
GAATAAAATGAGGCTGAGACCTACTGGGCTGCCATTCTCAGACAGTGGAGGC
CATTCTAAGTCACAGGATGAGACAGGAGGTCCGTACAAGATACAGGTCATA
AAGACTTTGCTGATAAAACAGATTGCAGTAAAGAAGCCAACCAAATCCCACC
AAAACCAAGTTGGCCACGAGAGTGACCTCTGGTCGTCCTCACTGCTACACTC
CTGACAGCACCATGACAGTTTACAAATGCCATGGCAACATCAGGAAGTTACC
CGATATGTCCCAAAAGGGGGAGGAATGAATAATCCACCCCTTGTTTAGCAAA
TAAGCAAGAAATAACCATAAAAGTGGGCAACCAGCAGCTCTAGGCGCTGCT
CTTGTCTATGGAGTAGCCATTCTTTTGTTCCTTTACTTTCTTAATAAACTTGCT
TTCACCTTAAAAAAAAAAAAAAAAAAAAAA (SEQ ID NO:91). Also preferred are the polypeptides encoded by these polynucleotides.
Figures 13A-C show the nucleotide (SEQ ID NO:15) and deduced amino acid sequence (SEQ ID NO:33) of the CD33-like protein. Predicted amino acids from about 1 to about 16 constitute the predicted signal peptide (amino acid residues from about 1 to about 16 in SEQ ID NO:33) and are represented by the underlined amino acid regions; and amino acids from about 496 to about 512 constitute the predicted transmembrane domain (amino acid residues from about 496 to about 512 in SEQ ID NO:33) and are represented by the double-underlined amino acid regions.
Figure 14 shows the regions of similarity between the amino acid sequences of the CD33-like protein (SEQ ID NO:33) and the CD33L1 protein (gi|88178) (SEQ ID NO:92).
Figure 15 shows an analysis of the amino acid sequence of SEQ ID NO:33. Alpha, beta, turn and coil regions; hydrophilicity and hydrophobicity; amphipathic regions; flexible regions; antigenic index and surface probability are shown.
Northern analysis indicates that this gene is most highly expressed in spleen tissue and peripheral blood leukocytes, and to a lesser extent in ovary and lung tissue.
The present invention provides isolated nucleic acid molecules comprising a polynucleotide encoding the polypeptide having the amino acid sequence shown in Figures 13A-C (SEQ ID NO:33), which was determined by sequencing a cloned cDNA (HDPCLOS). The nucleotide sequence shown in Figures 13A-C (SEQ ID NO:15) was obtained by sequencing a cloned cDNA (HDPCLOS), which was deposited on Nov. 17, 1998 at the American Type Culture Collection, and given Accession Number 203484.
The deposited gene is inserted in the pSport plasmid (Life Technologies, Rockville, MD) using the SalI/NotI restriction endonuclease cleavage sites.
The present invention is further directed to fragments of the isolated nucleic acid molecules described herein. By a fragment of an isolated DNA molecule having the nucleotide sequence of the deposited cDNA or the nucleotide sequence shown in SEQ ID
NO;15 is intended DNA fragments at least about l5nt, and more preferably at least about SUBSTITUTE SHEET (RULE 26) 20 nt, still more preferably at least about 30 nt, and even more preferably, at least about 40 nt in length which are useful as diagnostic probes and primers as discussed herein. Of course, larger fragments 50-1500 nt in length are also useful according to the present invention, as are fragments corresponding to most, if not all, of the nucleotide sequence of the deposited cDNA or as shown in SEQ ID N0:15. By a fragment at least 20 nt in length, for example, is intended fragments which include 20 or more contiguous bases from the nucleotide sequence of the deposited cDNA or the nucleotide sequence as shown in SEQ ID N0:15. In this context "about" includes the particularly recited size, larger or smaller by several (S, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini. Representative examples of polynucleotide fragments of the invention include.
for example, fragments that comprise, or alternatively, consist of, a sequence from about nucleotide 1 to about 50, from about 51 to about 100, from about 101 to about 150, from about 151 to about 200, from about 201 to about 250, from about 251 to about 300, from about 301 to about 350, from about 351 to about 400, from about 401 to about 450, from about 451 to about 500, from about 501 to about 550, from about 551 to about 600, from about 601 to about 650, from about 651 to about 700, from about 701 to about 750, from about 751 to about 800, from about 801 to about 850, from about 851 to about 900, from about 901 to about 950, from about 951 to about 1000, from about 1001 to about 1050, from about 1051 to about 1100, from about 1101 to about 1150, from about 1151 to about 1200, from about 1201 to about 1250, from about 1251 to about 1300, from about 1301 to about 1350, from about 1351 to about 1400, from about 1401 to about 1450, from about 1451 to about 1500, from about 1501 to about 1550, from about 1551 to about 1600, from about 1601 to about 1650, from about 1651 to about 1700, from about 1701 to about 1750, from about 1751 to about 1800, from about 1801 to about 1850, from about 1851 to about 1900, from about 1901 to about 1950, from about 1951 to about 2000, from about 2001 to about 2050, from about 2051 to about 2100, from about 2101 to about 2150, from about 2151 to about 2200, from about 2201 to about 2250, from about 2251 to about 2295, from about 307 to about 1977, and from about 106 to about 1977, of SEQ ID NO:15, or the complementary strand thereto, or the cDNA contained in the deposited gene. In this context "about" includes the particularly recited ranges, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini. In additional embodiments, the polynucleotides of the invention encode functional attributes of the corresponding protein.
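By way of illustration only, the consecutive 50-nucleotide fragment windows recited above can be enumerated programmatically. The following Python sketch assumes a total length of 2295 nucleotides (the last position recited for SEQ ID NO:15) and a window size of 50; the function name `fragment_windows` is hypothetical and is not part of the disclosure.

```python
def fragment_windows(seq_len=2295, size=50):
    """Return 1-based, inclusive (start, end) windows that tile seq_len.

    seq_len and size are assumptions taken from the ranges recited in the
    text; the final window is shortened if seq_len is not a multiple of size.
    """
    windows = []
    start = 1
    while start <= seq_len:
        end = min(start + size - 1, seq_len)
        windows.append((start, end))
        start = end + 1
    return windows


windows = fragment_windows()
for start, end in windows[:3]:
    print(f"from about {start} to about {end}")
```

The "about" tolerance of several nucleotides at either terminus is not modeled here; it could be represented as an offset of up to 5 applied to each boundary.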
Preferred embodiments of the invention in this regard include fragments that comprise alpha-helix and alpha-helix forming regions ("alpha-regions"), beta-sheet and beta-sheet forming regions ("beta-regions"), turn and turn-forming regions ("turn-regions"), coil and coil-forming regions ("coil-regions"), hydrophilic regions, hydrophobic regions, alpha amphipathic regions, beta amphipathic regions, flexible regions, surface-forming regions and high antigenic index regions. The data representing the structural or functional attributes of the protein set forth in Figure 15 and/or Table V, as described above, was generated using the various modules and algorithms of the DNA*STAR set on default parameters. In a preferred embodiment, the data presented in columns VIII, IX, XIII, and XIV of Table V can be used to determine regions of the protein which exhibit a high degree of potential for antigenicity. Regions of high antigenicity are determined from the data presented in columns VIII, IX, XIII, and/or XIV by choosing values which represent regions of the polypeptide which are likely to be exposed on the surface of the polypeptide in an environment in which antigen recognition may occur in the process of initiation of an immune response.
Certain preferred regions in these regards are set out in Figure 15, but may, as shown in Table V, be represented or identified by using tabular representations of the data presented in Figure 15. The DNA*STAR computer algorithm used to generate Figure 15 (set on the original default parameters) was used to present the data in Figure 15 in a tabular format (See Table V). The tabular format of the data in Figure 15 is used to easily determine specific boundaries of a preferred region. The above-mentioned preferred regions set out in Figure 15 and in Table V include, but are not limited to, regions of the aforementioned types identified by analysis of the amino acid sequence set out in Figures 13A-C. As set out in Figure 15 and in Table V, such preferred regions include Garnier-Robson alpha-regions, beta-regions, turn-regions, and coil-regions, Chou-Fasman alpha-regions, beta-regions, and turn-regions, Kyte-Doolittle hydrophilic regions and Hopp-Woods hydrophobic regions, Eisenberg alpha- and beta-amphipathic regions, Karplus-Schulz flexible regions, Jameson-Wolf regions of high antigenic index and Emini surface-forming regions.
Even if deletion of one or more amino acids from the N-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities, ability to multimerize, etc.) may still be retained. For example, the ability of shortened muteins to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptides generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the N-terminus. Whether a particular polypeptide lacking N-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art.
It is not unlikely that a mutein with a large number of deleted N-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the amino terminus of the amino acid sequence shown in Figures 13A-C, up to the alanine residue at position number 634 and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues n1-639 of Figures 13A-C, where n1 is an integer from 2 to 634 corresponding to the position of the amino acid residue in Figures 13A-C (which is identical to the sequence shown as SEQ ID NO:33). N-terminal deletions of the polypeptide of the invention shown as SEQ ID NO:33 include polypeptides comprising the amino acid sequence of residues: L-2 to Q-639; L-3 to Q-639; P-4 to Q-639; L-5 to Q-639; L-6 to Q-639; L-7 to Q-639; S-8 to Q-639; S-9 to Q-639; L-10 to Q-639; L-11 to Q-639; G-12 to Q-639; G-13 to Q-639; S-14 to Q-639; Q-15 to Q-639; A-16 to Q-639; M-17 to Q-639; D-18 to Q-639; G-19 to Q-639; R-20 to Q-639; F-21 to Q-639; W-22 to Q-639; I-23 to Q-639; R-24 to Q-639; V-25 to Q-639; Q-26 to Q-639; E-27 to Q-639; S-28 to Q-639; V-29 to Q-639; M-30 to Q-639; V-31 to Q-639; P-32 to Q-639; E-33 to Q-639; A-34 to Q-639; C-35 to Q-639; D-36 to Q-639; I-37 to Q-639; S-38 to Q-639; V-39 to Q-639; P-40 to Q-639; C-41 to Q-639; S-42 to Q-639; F-43 to Q-639; S-44 to Q-639; Y-45 to Q-639; P-46 to Q-639; R-47 to Q-639; Q-48 to Q-639; D-49 to Q-639; W-50 to Q-639; T-51 to Q-639; G-52 to Q-639; S-53 to Q-639; T-54 to Q-639; P-55 to Q-639; A-56 to Q-639; Y-57 to Q-639; G-58 to Q-639; Y-59 to Q-639; W-60 to Q-639; F-61 to Q-639; K-62 to Q-639; A-63 to Q-639; V-64 to Q-639; T-65 to Q-639; E-66 to Q-639; T-67 to Q-639; T-68 to Q-639; K-69 to Q-639; G-70 to Q-639; A-71 to Q-639; P-72 to Q-639; V-73 to Q-639; A-74 to Q-639; T-75 to Q-639; N-76 to Q-639; H-77 to Q-639; Q-78 to Q-639; S-79 to Q-639; R-80 to Q-639; E-81 to Q-639; V-82 to Q-639; E-83 to Q-639; M-84 to Q-639; S-85 to Q-639; T-86 to Q-639; R-87 to Q-639; G-88 to Q-639; R-89 to Q-639; F-90 to Q-639; Q-91 to Q-639; L-92 to Q-639; T-93 to Q-639; G-94 to Q-639; D-95 to Q-639; P-96 to Q-639; A-97 to Q-639; K-98 to Q-639; G-99 to Q-639; N-100 to Q-639; C-101 to Q-639; S-102 to Q-639; L-103 to Q-639; V-104 to Q-639; I-105 to Q-639; R-106 to Q-639; D-107 to Q-639; A-108 to Q-639; Q-109 to Q-639;
M-110 to Q-639; Q-111 to Q-639; D-112 to Q-639; E-113 to Q-639; S-114 to Q-639; Q-115 to Q-639; Y-116 to Q-639; F-117 to Q-639; F-118 to Q-639; R-119 to Q-639; V-120 to Q-639; E-121 to Q-639; R-122 to Q-639; G-123 to Q-639; S-124 to Q-639; Y-125 to Q-639; V-126 to Q-639; R-127 to Q-639; Y-128 to Q-639; N-129 to Q-639; F-130 to Q-639; M-131 to Q-639; N-132 to Q-639; D-133 to Q-639; G-134 to Q-639; F-135 to Q-639; F-136 to Q-639; L-137 to Q-639; K-138 to Q-639; V-139 to Q-639; T-140 to Q-639; V-141 to Q-639; L-142 to Q-639; S-143 to Q-639; F-144 to Q-639; T-145 to Q-639; P-146 to Q-639; R-147 to Q-639; P-148 to Q-639; Q-149 to Q-639; D-150 to Q-639; H-151 to Q-639; N-152 to Q-639; T-153 to Q-639; D-154 to Q-639; L-155 to Q-639; T-156 to Q-639; C-157 to Q-639; H-158 to Q-639; V-159 to Q-639; D-160 to Q-639; F-161 to Q-639; S-162 to Q-639; R-163 to Q-639; K-164 to Q-639; G-165 to Q-639; V-166 to Q-639; S-167 to Q-639; A-168 to Q-639; Q-169 to Q-639; R-170 to Q-639; T-171 to Q-639; V-172 to Q-639; R-173 to Q-639; L-174 to Q-639; R-175 to Q-639; V-176 to Q-639; A-177 to Q-639; Y-178 to Q-639; A-179 to Q-639; P-180 to Q-639; R-181 to Q-639; D-182 to Q-639; L-183 to Q-639; V-184 to Q-639; I-185 to Q-639; S-186 to Q-639; I-187 to Q-639; S-188 to Q-639; R-189 to Q-639; D-190 to Q-639; N-191 to Q-639; T-192 to Q-639; P-193 to Q-639; A-194 to Q-639; L-195 to Q-639; E-196 to Q-639; P-197 to Q-639; Q-198 to Q-639; P-199 to Q-639; Q-200 to Q-639; G-201 to Q-639; N-202 to Q-639; V-203 to Q-639; P-204 to Q-639; Y-205 to Q-639; L-206 to Q-639; E-207 to Q-639; A-208 to Q-639; Q-209 to Q-639; K-210 to Q-639; G-211 to Q-639; Q-212 to Q-639; F-213 to Q-639; L-214 to Q-639; R-215 to Q-639; L-216 to Q-639; L-217 to Q-639; C-218 to Q-639; A-219 to Q-639; A-220 to Q-639; D-221 to Q-639; S-222 to Q-639; Q-223 to Q-639; P-224 to Q-639; P-225 to Q-639; A-226 to Q-639; T-227 to Q-639; L-228 to Q-639; S-229 to Q-639; W-230 to Q-639; V-231 to Q-639; L-232 to Q-639; Q-233 to Q-639; N-234 to Q-639; R-235 to Q-639; V-236 to Q-639; L-237 to Q-639; S-238 to Q-639; S-239 to Q-639; S-240 to Q-639; H-241 to Q-639; P-242 to Q-639; W-243 to Q-639; G-244 to Q-639; P-245 to Q-639; R-246 to Q-639; P-247 to Q-639; L-248 to Q-639; G-249 to Q-639; L-250 to Q-639; E-251 to Q-639; L-252 to Q-639; P-253 to Q-639; G-254 to Q-639; V-255 to Q-639; K-256 to Q-639; A-257 to Q-639; G-258 to Q-639; D-259 to Q-639; S-260 to Q-639; G-261 to Q-639; R-262 to Q-639; Y-263 to Q-639; T-264 to Q-639; C-265 to Q-639; R-266 to Q-639; A-267 to Q-639; E-268 to Q-639; N-269 to Q-639; R-270 to Q-639; L-271 to Q-639; G-272 to Q-639; S-273 to Q-639; Q-274 to Q-639; Q-275 to Q-639; R-276 to Q-639; A-277 to Q-639; L-278 to Q-639; D-279 to Q-639; L-280 to Q-639;
S-281 to Q-639; V-282 to Q-639; Q-283 to Q-639; Y-284 to Q-639; P-285 to Q-639; P-286 to Q-639; E-287 to Q-639; N-288 to Q-639; L-289 to Q-639; R-290 to Q-639; V-291 to Q-639; M-292 to Q-639; V-293 to Q-639; S-294 to Q-639; Q-295 to Q-639; A-296 to Q-639; N-297 to Q-639; R-298 to Q-639; T-299 to Q-639; V-300 to Q-639; L-301 to Q-639; E-302 to Q-639; N-303 to Q-639; L-304 to Q-639; G-305 to Q-639; N-306 to Q-639; G-307 to Q-639; T-308 to Q-639; S-309 to Q-639; L-310 to Q-639; P-311 to Q-639; V-312 to Q-639; L-313 to Q-639; E-314 to Q-639; G-315 to Q-639; Q-316 to Q-639; S-317 to Q-639; L-318 to Q-639; C-319 to Q-639; L-320 to Q-639; V-321 to Q-639; C-322 to Q-639; V-323 to Q-639; T-324 to Q-639; H-325 to Q-639; S-326 to Q-639; S-327 to Q-639; P-328 to Q-639; P-329 to Q-639; A-330 to Q-639; R-331 to Q-639; L-332 to Q-639; S-333 to Q-639; W-334 to Q-639; T-335 to Q-639; Q-336 to Q-639; R-337 to Q-639; G-338 to Q-639; Q-339 to Q-639; V-340 to Q-639; L-341 to Q-639; S-342 to Q-639; P-343 to Q-639; S-344 to Q-639; Q-345 to Q-639; P-346 to Q-639; S-347 to Q-639; D-348 to Q-639; P-349 to Q-639; G-350 to Q-639; V-351 to Q-639; L-352 to Q-639; E-353 to Q-639; L-354 to Q-639; P-355 to Q-639; R-356 to Q-639; V-357 to Q-639; Q-358 to Q-639; V-359 to Q-639; E-360 to Q-639; H-361 to Q-639; E-362 to Q-639; G-363 to Q-639; E-364 to Q-639; F-365 to Q-639; T-366 to Q-639; C-367 to Q-639; H-368 to Q-639; A-369 to Q-639; R-370 to Q-639; H-371 to Q-639; P-372 to Q-639; L-373 to Q-639; G-374 to Q-639; S-375 to Q-639; Q-376 to Q-639; H-377 to Q-639; V-378 to Q-639; S-379 to Q-639; L-380 to Q-639; S-381 to Q-639; L-382 to Q-639; S-383 to Q-639; V-384 to Q-639; H-385 to Q-639; Y-386 to Q-639; S-387 to Q-639; P-388 to Q-639; K-389 to Q-639; L-390 to Q-639; L-391 to Q-639; G-392 to Q-639; P-393 to Q-639; S-394 to Q-639; C-395 to Q-639;
S-396 to Q-639; W-397 to Q-639; E-398 to Q-639; A-399 to Q-639; E-400 to Q-639; G-401 to Q-639; L-402 to Q-639; H-403 to Q-639; C-404 to Q-639; S-405 to Q-639; C-406 to Q-639; S-407 to Q-639; S-408 to Q-639; Q-409 to Q-639; A-410 to Q-639; S-411 to Q-639; P-412 to Q-639; A-413 to Q-639; P-414 to Q-639; S-415 to Q-639; L-416 to Q-639; R-417 to Q-639; W-418 to Q-639; W-419 to Q-639; L-420 to Q-639; G-421 to Q-639; E-422 to Q-639; E-423 to Q-639; L-424 to Q-639; L-425 to Q-639; E-426 to Q-639; G-427 to Q-639; N-428 to Q-639; S-429 to Q-639; S-430 to Q-639; Q-431 to Q-639; D-432 to Q-639; S-433 to Q-639; F-434 to Q-639; E-435 to Q-639; V-436 to Q-639; T-437 to Q-639; P-438 to Q-639; S-439 to Q-639; S-440 to Q-639; A-441 to Q-639; G-442 to Q-639; P-443 to Q-639; W-444 to Q-639; A-445 to Q-639; N-446 to Q-639; S-447 to Q-639; S-448 to Q-639; L-449 to Q-639; S-450 to Q-639; L-451 to Q-639; H-452 to Q-639; G-453 to Q-639; G-454 to Q-639; L-455 to Q-639; S-456 to Q-639; S-457 to Q-639; G-458 to Q-639; L-459 to Q-639; R-460 to Q-639; L-461 to Q-639; R-462 to Q-639; C-463 to Q-639; E-464 to Q-639; A-465 to Q-639; W-466 to Q-639; N-467 to Q-639; V-468 to Q-639; H-469 to Q-639; G-470 to Q-639; A-471 to Q-639; Q-472 to Q-639; S-473 to Q-639; G-474 to Q-639; S-475 to Q-639; I-476 to Q-639; L-477 to Q-639; Q-478 to Q-639; L-479 to Q-639; P-480 to Q-639; D-481 to Q-639; K-482 to Q-639; K-483 to Q-639; G-484 to Q-639; L-485 to Q-639; I-486 to Q-639; S-487 to Q-639; T-488 to Q-639; A-489 to Q-639; F-490 to Q-639; S-491 to Q-639; N-492 to Q-639; G-493 to Q-639; A-494 to Q-639; F-495 to Q-639; L-496 to Q-639; G-497 to Q-639; I-498 to Q-639; G-499 to Q-639;
I-500 to Q-639; T-501 to Q-639; A-502 to Q-639; L-503 to Q-639; L-504 to Q-639; F-505 to Q-639; L-506 to Q-639; C-507 to Q-639; L-508 to Q-639; A-509 to Q-639; L-510 to Q-639; I-511 to Q-639; I-512 to Q-639; M-513 to Q-639; K-514 to Q-639; I-515 to Q-639; L-516 to Q-639; P-517 to Q-639; K-518 to Q-639; R-519 to Q-639; R-520 to Q-639; T-521 to Q-639; Q-522 to Q-639; T-523 to Q-639; E-524 to Q-639; T-525 to Q-639; P-526 to Q-639; R-527 to Q-639; P-528 to Q-639; R-529 to Q-639; F-530 to Q-639; S-531 to Q-639; R-532 to Q-639; H-533 to Q-639; S-534 to Q-639; T-535 to Q-639; I-536 to Q-639; L-537 to Q-639; D-538 to Q-639; Y-539 to Q-639; I-540 to Q-639; N-541 to Q-639; V-542 to Q-639; V-543 to Q-639; P-544 to Q-639; T-545 to Q-639; A-546 to Q-639; G-547 to Q-639; P-548 to Q-639; L-549 to Q-639; A-550 to Q-639; Q-551 to Q-639; K-552 to Q-639; R-553 to Q-639; N-554 to Q-639; Q-555 to Q-639; K-556 to Q-639; A-557 to Q-639; T-558 to Q-639; P-559 to Q-639; N-560 to Q-639; S-561 to Q-639; P-562 to Q-639; R-563 to Q-639; T-564 to Q-639; P-565 to Q-639; L-566 to Q-639; P-567 to Q-639; P-568 to Q-639; G-569 to Q-639; A-570 to Q-639; P-571 to Q-639; S-572 to Q-639; P-573 to Q-639; E-574 to Q-639; S-575 to Q-639; K-576 to Q-639; K-577 to Q-639; N-578 to Q-639; Q-579 to Q-639; K-580 to Q-639; K-581 to Q-639; Q-582 to Q-639; Y-583 to Q-639; Q-584 to Q-639; L-585 to Q-639; P-586 to Q-639; S-587 to Q-639; F-588 to Q-639; P-589 to Q-639; E-590 to Q-639; P-591 to Q-639; K-592 to Q-639; S-593 to Q-639; S-594 to Q-639; T-595 to Q-639; Q-596 to Q-639; A-597 to Q-639; P-598 to Q-639; E-599 to Q-639; S-600 to Q-639; Q-601 to Q-639; E-602 to Q-639; S-603 to Q-639; Q-604 to Q-639; E-605 to Q-639; E-606 to Q-639; L-607 to Q-639; H-608 to Q-639; Y-609 to Q-639; A-610 to Q-639; T-611 to Q-639; L-612 to Q-639; N-613 to Q-639; F-614 to Q-639; P-615 to Q-639; G-616 to Q-639; V-617 to Q-639; R-618 to Q-639; P-619 to Q-639; R-620 to Q-639; P-621 to Q-639; E-622 to Q-639; A-623 to Q-639; R-624 to Q-639; M-625 to Q-639; P-626 to Q-639; K-627 to Q-639; G-628 to Q-639; T-629 to Q-639; Q-630 to Q-639; A-631 to Q-639; D-632 to Q-639; Y-633 to Q-639; A-634 to Q-639; of SEQ ID NO:33. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
Also as mentioned above, even if deletion of one or more amino acids from the C-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities (e.g., ability to elicit mitogenic activity, induce differentiation of normal or malignant cells, bind to EGF receptors, etc.)) may still be retained. For example, the ability to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptide generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the C-terminus. Whether a particular polypeptide lacking C-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art. It is not unlikely that a mutein with a large number of deleted C-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the carboxy terminus of the amino acid sequence of the polypeptide shown in Figures 13A-C, up to the leucine residue at position number 7, and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues 1-m1 of Figures 13A-C, where m1 is an integer from 7 to 638 corresponding to the position of the amino acid residue in Figures 13A-C. Moreover, the invention provides polynucleotides encoding polypeptides comprising, or alternatively consisting of, the amino acid sequence of C-terminal deletions of the polypeptide of the invention shown as SEQ ID NO:33. Such C-terminal deletions include polypeptides comprising the amino acid sequence of residues: M-1 to F-638; M-1 to K-637; M-1 to V-636; M-1 to E-635; M-1 to A-634; M-1 to Y-633; M-1 to D-632; M-1 to A-631; M-1 to Q-630; M-1 to T-629; M-1 to G-628; M-1 to K-627; M-1 to P-626; M-1 to M-625; M-1 to R-624; M-1 to A-623; M-1 to E-622; M-1 to P-621; M-1 to R-620; M-1 to P-619; M-1 to R-618; M-1 to V-617; M-1 to G-616; M-1 to P-615; M-1 to F-614; M-1 to N-613; M-1 to L-612; M-1 to T-611; M-1 to A-610; M-1 to Y-609; M-1 to H-608; M-1 to L-607; M-1 to E-606; M-1 to E-605; M-1 to Q-604; M-1 to S-603; M-1 to E-602; M-1 to Q-601; M-1 to S-600; M-1 to E-599; M-1 to P-598; M-1 to A-597; M-1 to Q-596; M-1 to T-595; M-1 to S-594; M-1 to S-593; M-1 to K-592; M-1 to P-591; M-1 to E-590; M-1 to P-589; M-1 to F-588; M-1 to S-587; M-1 to P-586; M-1 to L-585; M-1 to Q-584; M-1 to Y-583; M-1 to Q-582; M-1 to K-581; M-1 to K-580; M-1 to Q-579;
M-1 to N-578; M-1 to K-577; M-1 to K-576; M-1 to S-575; M-1 to E-574; M-1 to P-573; M-1 to S-572; M-1 to P-571; M-1 to A-570; M-1 to G-569; M-1 to P-568; M-1 to P-567; M-1 to L-566; M-1 to P-565; M-1 to T-564; M-1 to R-563; M-1 to P-562; M-1 to S-561; M-1 to N-560; M-1 to P-559; M-1 to T-558; M-1 to A-557; M-1 to K-556; M-1 to Q-555; M-1 to N-554; M-1 to R-553; M-1 to K-552; M-1 to Q-551; M-1 to A-550; M-1 to L-549; M-1 to P-548; M-1 to G-547; M-1 to A-546; M-1 to T-545; M-1 to P-544; M-1 to V-543; M-1 to V-542; M-1 to N-541; M-1 to I-540; M-1 to Y-539; M-1 to D-538; M-1 to L-537; M-1 to I-536; M-1 to T-535; M-1 to S-534; M-1 to H-533; M-1 to R-532; M-1 to S-531; M-1 to F-530; M-1 to R-529; M-1 to P-528; M-1 to R-527; M-1 to P-526; M-1 to T-525; M-1 to E-524; M-1 to T-523; M-1 to Q-522; M-1 to T-521; M-1 to R-520; M-1 to R-519; M-1 to K-518; M-1 to P-517; M-1 to L-516; M-1 to I-515; M-1 to K-514; M-1 to M-513; M-1 to I-512; M-1 to I-511; M-1 to L-510; M-1 to A-509; M-1 to L-508; M-1 to C-507; M-1 to L-506; M-1 to F-505; M-1 to L-504; M-1 to L-503; M-1 to A-502; M-1 to T-501; M-1 to I-500; M-1 to G-499; M-1 to I-498; M-1 to G-497; M-1 to L-496; M-1 to F-495; M-1 to A-494; M-1 to G-493; M-1 to N-492; M-1 to S-491; M-1 to F-490; M-1 to A-489; M-1 to T-488; M-1 to S-487; M-1 to I-486; M-1 to L-485; M-1 to G-484; M-1 to K-483; M-1 to K-482; M-1 to D-481; M-1 to P-480; M-1 to L-479; M-1 to Q-478; M-1 to L-477; M-1 to I-476; M-1 to S-475; M-1 to G-474; M-1 to S-473; M-1 to Q-472; M-1 to A-471;
M-1 to G-470; M-1 to H-469; M-1 to V-468; M-1 to N-467; M-1 to W-466; M-1 to A-465; M-1 to E-464; M-1 to C-463; M-1 to R-462; M-1 to L-461; M-1 to R-460; M-1 to L-459; M-1 to G-458; M-1 to S-457; M-1 to S-456; M-1 to L-455; M-1 to G-454; M-1 to G-453; M-1 to H-452; M-1 to L-451; M-1 to S-450; M-1 to L-449; M-1 to S-448; M-1 to S-447; M-1 to N-446; M-1 to A-445; M-1 to W-444; M-1 to P-443; M-1 to G-442; M-1 to A-441; M-1 to S-440; M-1 to S-439; M-1 to P-438; M-1 to T-437; M-1 to V-436; M-1 to E-435; M-1 to F-434; M-1 to S-433; M-1 to D-432; M-1 to Q-431; M-1 to S-430; M-1 to S-429; M-1 to N-428; M-1 to G-427; M-1 to E-426; M-1 to L-425; M-1 to L-424; M-1 to E-423; M-1 to E-422; M-1 to G-421; M-1 to L-420; M-1 to W-419; M-1 to W-418; M-1 to R-417; M-1 to L-416; M-1 to S-415; M-1 to P-414; M-1 to A-413; M-1 to P-412; M-1 to S-411; M-1 to A-410; M-1 to Q-409; M-1 to S-408; M-1 to S-407; M-1 to C-406; M-1 to S-405; M-1 to C-404; M-1 to H-403; M-1 to L-402; M-1 to G-401; M-1 to E-400; M-1 to A-399; M-1 to E-398; M-1 to W-397; M-1 to S-396; M-1 to C-395; M-1 to S-394; M-1 to P-393; M-1 to G-392; M-1 to L-391; M-1 to L-390; M-1 to K-389; M-1 to P-388; M-1 to S-387; M-1 to Y-386; M-1 to H-385; M-1 to V-384; M-1 to S-383; M-1 to L-382; M-1 to S-381; M-1 to L-380; M-1 to S-379; M-1 to V-378; M-1 to H-377; M-1 to Q-376; M-1 to S-375; M-1 to G-374; M-1 to L-373; M-1 to P-372; M-1 to H-371; M-1 to R-370; M-1 to A-369; M-1 to H-368; M-1 to C-367; M-1 to T-366; M-1 to F-365; M-1 to E-364; M-1 to G-363; M-1 to E-362; M-1 to H-361; M-1 to E-360; M-1 to V-359; M-1 to Q-358; M-1 to V-357; M-1 to R-356; M-1 to P-355; M-1 to L-354; M-1 to E-353; M-1 to L-352; M-1 to V-351; M-1 to G-350; M-1 to P-349; M-1 to D-348; M-1 to S-347; M-1 to P-346; M-1 to Q-345; M-1 to S-344; M-1 to P-343; M-1 to S-342; M-1 to L-341; M-1 to V-340; M-1 to Q-339; M-1 to G-338; M-1 to R-337; M-1 to Q-336; M-1 to T-335; M-1 to W-334;
M-1 to S-333; M-1 to L-332; M-1 to R-331; M-1 to A-330; M-1 to P-329; M-1 to P-328; M-1 to S-327; M-1 to S-326; M-1 to H-325; M-1 to T-324; M-1 to V-323; M-1 to C-322; M-1 to V-321; M-1 to L-320; M-1 to C-319; M-1 to L-318; M-1 to S-317; M-1 to Q-316; M-1 to G-315; M-1 to E-314; M-1 to L-313; M-1 to V-312; M-1 to P-311; M-1 to L-310; M-1 to S-309; M-1 to T-308; M-1 to G-307; M-1 to N-306; M-1 to G-305; M-1 to L-304; M-1 to N-303; M-1 to E-302; M-1 to L-301; M-1 to V-300; M-1 to T-299; M-1 to R-298; M-1 to N-297; M-1 to A-296; M-1 to Q-295; M-1 to S-294; M-1 to V-293; M-1 to M-292; M-1 to V-291; M-1 to R-290; M-1 to L-289; M-1 to N-288; M-1 to E-287; M-1 to P-286; M-1 to P-285; M-1 to Y-284; M-1 to Q-283; M-1 to V-282; M-1 to S-281; M-1 to L-280; M-1 to D-279; M-1 to L-278; M-1 to A-277; M-1 to R-276; M-1 to Q-275; M-1 to Q-274; M-1 to S-273; M-1 to G-272; M-1 to L-271; M-1 to R-270; M-1 to N-269; M-1 to E-268; M-1 to A-267; M-1 to R-266; M-1 to C-265; M-1 to T-264; M-1 to Y-263; M-1 to R-262; M-1 to G-261; M-1 to S-260; M-1 to D-259; M-1 to G-258; M-1 to A-257; M-1 to K-256; M-1 to V-255; M-1 to G-254; M-1 to P-253; M-1 to L-252; M-1 to E-251; M-1 to L-250; M-1 to G-249; M-1 to L-248; M-1 to P-247; M-1 to R-246; M-1 to P-245; M-1 to G-244; M-1 to W-243; M-1 to P-242; M-1 to H-241; M-1 to S-240; M-1 to S-239; M-1 to S-238; M-1 to L-237; M-1 to V-236; M-1 to R-235; M-1 to N-234; M-1 to Q-233; M-1 to L-232; M-1 to V-231; M-1 to W-230; M-1 to S-229; M-1 to L-228; M-1 to T-227; M-1 to A-226; M-1 to P-225; M-1 to P-224; M-1 to Q-223; M-1 to S-222; M-1 to D-221; M-1 to A-220; M-1 to A-219; M-1 to C-218; M-1 to L-217; M-1 to L-216; M-1 to R-215; M-1 to L-214; M-1 to F-213; M-1 to Q-212; M-1 to G-211; M-1 to K-210; M-1 to Q-209; M-1 to A-208; M-1 to E-207; M-1 to L-206; M-1 to Y-205; M-1 to P-204; M-1 to V-203; M-1 to N-202; M-1 to G-201; M-1 to Q-200; M-1 to P-199; M-1 to Q-198; M-1 to P-197; M-1 to E-196; M-1 to L-195; M-1 to A-194; M-1 to P-193; M-1 to T-192; M-1 to N-191; M-1 to D-190; M-1 to R-189; M-1 to S-188; M-1 to I-187; M-1 to S-186; M-1 to I-185; M-1 to V-184; M-1 to L-183; M-1 to D-182; M-1 to R-181; M-1 to P-180; M-1 to A-179; M-1 to Y-178; M-1 to A-177; M-1 to V-176; M-1 to R-175; M-1 to L-174; M-1 to R-173; M-1 to V-172; M-1 to T-171; M-1 to R-170; M-1 to Q-169; M-1 to A-168; M-1 to S-167;
M-1 to V-166; M-1 to G-165; M-1 to K-164; M-1 to R-163; M-1 to S-162; M-1 to F-161; M-1 to D-160; M-1 to V-159; M-1 to H-158; M-1 to C-157; M-1 to T-156; M-1 to L-155; M-1 to D-154; M-1 to T-153; M-1 to N-152; M-1 to H-151; M-1 to D-150; M-1 to Q-149; M-1 to P-148; M-1 to R-147; M-1 to P-146; M-1 to T-145; M-1 to F-144; M-1 to S-143; M-1 to L-142; M-1 to V-141; M-1 to T-140; M-1 to V-139; M-1 to K-138; M-1 to L-137; M-1 to F-136; M-1 to F-135; M-1 to G-134; M-1 to D-133; M-1 to N-132; M-1 to M-131; M-1 to F-130; M-1 to N-129; M-1 to Y-128; M-1 to R-127; M-1 to V-126; M-1 to Y-125; M-1 to S-124; M-1 to G-123; M-1 to R-122; M-1 to E-121; M-1 to V-120; M-1 to R-119; M-1 to F-118; M-1 to F-117; M-1 to Y-116; M-1 to Q-115; M-1 to S-114; M-1 to E-113; M-1 to D-112; M-1 to Q-111; M-1 to M-110; M-1 to Q-109; M-1 to A-108; M-1 to D-107; M-1 to R-106; M-1 to I-105; M-1 to V-104; M-1 to L-103; M-1 to S-102; M-1 to C-101; M-1 to N-100; M-1 to G-99; M-1 to K-98; M-1 to A-97; M-1 to P-96; M-1 to D-95; M-1 to G-94; M-1 to T-93; M-1 to L-92; M-1 to Q-91; M-1 to F-90; M-1 to R-89; M-1 to G-88; M-1 to R-87; M-1 to T-86; M-1 to S-85; M-1 to M-84; M-1 to E-83; M-1 to V-82; M-1 to E-81; M-1 to R-80; M-1 to S-79; M-1 to Q-78; M-1 to H-77; M-1 to N-76; M-1 to T-75; M-1 to A-74; M-1 to V-73; M-1 to P-72; M-1 to A-71; M-1 to G-70; M-1 to K-69; M-1 to T-68; M-1 to T-67; M-1 to E-66; M-1 to T-65; M-1 to V-64; M-1 to A-63; M-1 to K-62; M-1 to F-61; M-1 to W-60; M-1 to Y-59; M-1 to G-58; M-1 to Y-57; M-1 to A-56; M-1 to P-55; M-1 to T-54; M-1 to S-53; M-1 to G-52; M-1 to T-51; M-1 to W-50; M-1 to D-49; M-1 to Q-48; M-1 to R-47; M-1 to P-46; M-1 to Y-45; M-1 to S-44; M-1 to F-43; M-1 to S-42; M-1 to C-41; M-1 to P-40; M-1 to V-39; M-1 to S-38; M-1 to I-37; M-1 to D-36; M-1 to C-35; M-1 to A-34; M-1 to E-33; M-1 to P-32; M-1 to V-31; M-1 to M-30; M-1 to V-29; M-1 to S-28; M-1 to E-27; M-1 to Q-26; M-1 to V-25; M-1 to R-24; M-1 to I-23; M-1 to W-22; M-1 to F-21; M-1 to R-20; M-1 to G-19; M-1 to D-18; M-1 to M-17; M-1 to A-16; M-1 to Q-15; M-1 to S-14; M-1 to G-13; M-1 to G-12; M-1 to L-11; M-1 to L-10; M-1 to S-9; M-1 to S-8; M-1 to L-7; of SEQ ID NO:33. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
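The N-terminal and C-terminal deletion series recited above follow a regular pattern that can be generated from the sequence itself. The following Python sketch is illustrative only: the function names are hypothetical, and the toy sequence is merely the first 14 residues implied by the lists above (M-1 through S-14), not the full 639-residue sequence of SEQ ID NO:33.

```python
def n_terminal_deletions(seq, max_start):
    """Labels for fragments n1..end, with n1 running from 2 to max_start."""
    last = len(seq)
    return [f"{seq[n - 1]}-{n} to {seq[-1]}-{last}"
            for n in range(2, max_start + 1)]


def c_terminal_deletions(seq, min_end):
    """Labels for fragments 1..m1, with m1 running from len(seq)-1 down to min_end."""
    return [f"{seq[0]}-1 to {seq[m - 1]}-{m}"
            for m in range(len(seq) - 1, min_end - 1, -1)]


# Toy input (assumption): residues 1-14 as implied by the deletion lists.
toy = "MLLPLLLSSLLGGS"

# Keep at least six residues at the N-terminal end of the series, mirroring
# n1 = 2..634 for a 639-residue protein; end the C-terminal series at residue 7.
n_labels = n_terminal_deletions(toy, max_start=len(toy) - 5)
c_labels = c_terminal_deletions(toy, min_end=7)

print("; ".join(n_labels))
print("; ".join(c_labels))
```

Applied to the full sequence of SEQ ID NO:33 with `max_start=634` and `min_end=7`, the same functions would be expected to reproduce the two series in the form recited above.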
In addition, the invention provides nucleic acid molecules having nucleotide sequences related to extensive portions of SEQ ID NO:15 which have been determined from the following related cDNA genes: HTOFA26R (SEQ ID NO:93), HWAEM43R (SEQ ID NO:94), HDPMQ69R (SEQ ID NO:95), HDPGA09RA (SEQ ID NO:96), HEOMH10R (SEQ ID NO:97), and HFKCT73F (SEQ ID NO:98).
The polypeptide of this gene has been determined to have a transmembrane domain at about amino acid position 496 - 512 of the amino acid sequence referenced in Table XIII for this gene. Moreover, a cytoplasmic tail encompassing amino acids 513 to 639 of this protein has also been determined. Based upon these characteristics, it is believed that the protein product of this gene shares structural features with type Ia membrane proteins.
Therefore, polynucleotides and polypeptides of the invention are useful as reagents for differential identification of the tissue(s) or cell type(s) present in a biological sample and for diagnosis of the following diseases and conditions which include, but are not limited to, disorders of the immune system, in particular the immunodiagnosis of acute leukemias. Similarly, polypeptides and antibodies directed to these polypeptides are useful to provide immunological probes for differential identification of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, particularly of the immune system, expression of this gene at significantly higher or lower levels is detected in certain tissues or cell types (e.g., immune, cancerous and wounded tissues) or bodily fluids (e.g., lymph, serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual having such a disorder, relative to the standard gene expression level, i.e., the expression level in healthy tissue from an individual not having the disorder.
Preferred epitopes include those comprising a sequence shown in SEQ ID NO:33 as residues: Pro-46 to Gly-52, Asn-76 to Val-82, Ser-85 to Phe-90, Gly-94 to Asn-100, Gln-111 to Tyr-116, Pro-146 to Leu-155, Ser-188 to Asn-202, Ser-240 to Arg-246, Gly-258 to Tyr-263, Ala-267 to Arg-276, Ser-326 to Arg-331, Ser-333 to Gln-339, Pro-343 to Asp-348, Glu-426 to Asp-432, Pro-517 to His-533, Ala-550 to Pro-565, Gly-569 to Gln-582, Pro-589 to Glu-606, Gly-616 to Ala-623, Met-625 to Ala-631.
CD33 monoclonal antibodies (MoAB) are important in the immunodiagnosis of AML. CD33 MoABs have been used in preliminary therapeutic trials for purging bone marrow of AML patients, either before transplantation or for diseases resistant to chemotherapy. To prevent AML patients in remission from suffering relapse, or due to the lack of an appropriate allogeneic bone marrow donor, a method is necessary for purging leukemia cells from the autografts of patients with advanced AML. By the invention, such a method is provided, in which bone marrow from an AML patient is obtained by, for example, percutaneous aspiration from the posterior iliac crest, bone marrow mononuclear cells are isolated by Ficoll-Hypaque density gradient centrifugation, and the cells are incubated with an anti-CD33-like protein MoAB, for example, 3-5 times for min. at 4-6 degrees C, followed by incubation with rabbit complement at about degrees C for 30 minutes. The patient is then subjected to myeloablative chemotherapy, followed by reinfusion of the treated autologous bone marrow according to standard techniques. By the invention, clonogenic tumor cells are depleted from the bone marrow while sparing hematopoietic cells necessary for engraftment. In a further embodiment, the invention provides an in vivo method for selectively killing or inhibiting growth of tumor cells expressing CD33-like protein antigen of the present invention. The method involves administering to the patient an effective amount of an antagonist to inhibit the CD33-like protein receptor signaling pathway. By the invention, administering such antagonist of the CD33-like protein to a patient may also be useful for treating inflammatory diseases including arthritis and colitis.
Antagonists for use in the present invention include polyclonal and monoclonal antibodies raised against the CD33-like protein or a fragment thereof, antisense molecules which control gene expression through antisense DNA or RNA or through triple-helix formation, proteins or other compounds which bind the CD33-like protein domains, or soluble forms of the CD33-like protein, such as protein fragments including the extracellular region from the full length receptor, which antagonize CD33-like protein mediated signaling by competing with the cell surface CD33-like protein for binding to CD33 receptor ligands.

Many polynucleotide sequences, such as EST sequences, are publicly available and accessible through sequence databases. Some of these sequences are related to SEQ ID NO:15 and may have been publicly available prior to conception of the present invention. Preferably, such related polynucleotides are specifically excluded from the scope of the present invention. To list every related sequence is cumbersome. Accordingly, preferably excluded from the present invention are one or more polynucleotides comprising a nucleotide sequence described by the general formula of a-b, where a is any integer between 1 and 2281 of SEQ ID NO:15, b is an integer of 15 to 2295, where both a and b correspond to the positions of nucleotide residues shown in SEQ ID NO:15, and where b is greater than or equal to a + 14.
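The general formula a-b above defines every contiguous stretch of at least 15 nucleotides within the 2295-nucleotide sequence. As a minimal sketch, assuming the stated bounds are inclusive, the constraint can be checked and the number of admissible (a, b) pairs counted as follows; the function names and the `min_len` parameter are hypothetical.

```python
SEQ_LEN = 2295  # last nucleotide position recited for SEQ ID NO:15


def is_valid_a_b(a, b, seq_len=SEQ_LEN, min_len=15):
    """True if (a, b) satisfies 1 <= a <= seq_len - 14, 15 <= b <= seq_len, b >= a + 14."""
    return (1 <= a <= seq_len - min_len + 1
            and min_len <= b <= seq_len
            and b >= a + min_len - 1)


def count_a_b_pairs(seq_len=SEQ_LEN, min_len=15):
    """Count all (a, b) pairs allowed by the formula, assuming inclusive bounds."""
    return sum(seq_len - (a + min_len - 1) + 1
               for a in range(1, seq_len - min_len + 2))


print(is_valid_a_b(1, 15))      # shortest fragment at the 5' end
print(is_valid_a_b(100, 110))   # only 11 nucleotides long, so not covered
print(count_a_b_pairs())
```

The count illustrates why listing every related sequence individually would be cumbersome, as the text notes.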
FEATURES OF PROTEIN ENCODED BY GENE NO: 6

This invention relates to newly identified polynucleotides, polypeptides encoded by such polynucleotides, the use of such polynucleotides and polypeptides, as well as the production of such polynucleotides and polypeptides. The polypeptide of the present invention has been putatively identified as a CD33 homolog derived from a human primary dendritic cells cDNA library. More particularly, the polypeptide of the present invention has been putatively identified as a human siglec homolog, sometimes hereafter referred to as "CD33-like 3" and/or "siglec 7". The invention also relates to inhibiting the action of such polypeptides.
The siglecs (sialic acid binding Ig-like lectins) are type 1 membrane proteins that constitute a distinct subset of the Ig superfamily, characterised by their sequence similarities and abilities to bind sialic acids in glycoproteins and glycolipids (Crocker, P.R., et al., Glycobiology, 8 (1998)). Members of the Ig superfamily of proteins are defined as molecules that share domains of sequence similarity with the variable or constant domains of antibodies.
Many Ig superfamily proteins consist of multiple tandem Ig-like domains connected to other domains, such as Fn-III repeat domains (Vaughn, D.E., and P.J.
Bjorkman, Neuron, 16:261-73 (1996)). At the primary structural level, traditional Ig-like domains can be identified by the presence of two cysteine residues separated by approximately 55-75 amino acid residues, and an "invariant" tryptophan residue located 10-15 residues C-terminal to the first of the two conserved cysteine residues.
The two conserved cysteine residues are thought to be involved in disulfide bonding to form the folded Ig structures (Vaughn, D.E., (1996)).
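The primary-structure rule described above (two cysteines roughly 55-75 residues apart, with a tryptophan 10-15 residues C-terminal to the first cysteine) can be sketched as a simple scan. This is only an illustrative Python sketch: it assumes "separated by" means the difference in residue positions, the function name is hypothetical, and the toy sequence is synthetic rather than taken from any figure.

```python
# Illustrative scan for the Ig-like domain signature; names are hypothetical.
def find_ig_like_candidates(seq, cys_gap=(55, 75), trp_offset=(10, 15)):
    """Return (first_cys, trp, second_cys) triples as 1-based positions."""
    hits = []
    cys_positions = [i for i, aa in enumerate(seq) if aa == "C"]
    for i in cys_positions:
        # Look for a tryptophan 10-15 residues C-terminal to this cysteine.
        trps = [
            i + d
            for d in range(trp_offset[0], trp_offset[1] + 1)
            if i + d < len(seq) and seq[i + d] == "W"
        ]
        if not trps:
            continue
        # Pair it with any cysteine 55-75 residues further along.
        for j in cys_positions:
            if cys_gap[0] <= j - i <= cys_gap[1]:
                hits.append((i + 1, trps[0] + 1, j + 1))
    return hits


if __name__ == "__main__":
    # Synthetic example: Cys at 6, Trp at 18, second Cys at 69.
    toy = "A" * 5 + "C" + "A" * 11 + "W" + "A" * 50 + "C" + "A" * 5
    print(find_ig_like_candidates(toy))
```

A scan like this only flags candidates; the text notes that the fold itself is established by disulfide bonding and structure, which a sequence pattern cannot confirm.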
Ig-like domains further share a common folding pattern, that of a sandwich or fold structure of two β-sheets consisting of antiparallel β-strands containing 5-10 amino acids (Huang, Z., et al., Biopolymers, 43:367-82 (1997)). Ig-like domains are divided, based upon sequence and structural similarities, into four classifications known as C1, C2, I and V-like domains.
The functional determinants of the Ig-like domains are presented on the faces of β-sheets or the loop regions of the Ig-fold. Accordingly, protein-protein interactions can occur either between the faces of the β-sheets, or the loop regions of the Ig-fold (Huang, Z., (1998)). These Ig-like domains are involved in mediating a diversity of biological functions such as intermolecular binding and protein-protein homophilic or heterophilic interactions.
Thus, Ig-like domains play an integral role in facilitating the activities of proteins of the Ig superfamily. In mammals, the group currently comprises sialoadhesin/siglec-1, CD22/siglec-2, CD33/siglec-3, myelin associated glycoprotein (MAG/siglec-4), siglecs-5, -6 and -7 (Crocker, P.R., et al., EMBO J., 13:4490-503 (1994); Sgroi, D., et al., J
Biol. Chem., 268:7011-18 (1993); Freeman, D.S., et al., Blood, 85:2005-12 (1995); Kelm, S., et al., Curr Biol., 4:965-72 (1994); Cornish, A.L., et al., Blood, 92:2123-132 (1998);
Patel, N., et al., J Biol. Chem, 274:22729-738 (1999); Nicol, G., et al., J Biol. Chem.
In Press (1999)). Siglec-7 has also recently been characterised independently as the NK
receptor p75/AIRM1 (Falco, M., et al., J. Exp. Med., 190:793-802 (1999)). In addition, the gene encoding another siglec-like sequence, OBBP-like protein has been reported but there is no information on its binding activity (Yousef, G.M., et al., Biochem.
Biophys. Res.
Commun., In Press (1999)).
Each of these proteins has an extracellular region made up of a membrane distal V-set domain followed by varying numbers of C2 set domains which range from 16 in sialoadhesin to 1 in CD33. In the cases of sialoadhesin, CD22, MAG and CD33, the sialic acid binding site has been mapped to the V-set domain and for sialoadhesin it has been further characterised at the molecular level by X-ray crystallography (Nath, D., et al., J Biol. Chem., 270:26184-91 (1995); van der Merwe, P.A., et al., J.
Biol. Chem., 271:9273-80 (1996); Tang, S., et al., J. Cell Biol., 138:1355-66 (1997);
Taylor, V.C., et al., J. Biol. Chem., 274:11505-12 (1999); May, A.P., et al., Molecular Cell, 1:719-28 (1998)).
Apart from MAG and SMP that are found exclusively in the nervous system, all siglecs characterised to date are expressed on discrete subsets of hemopoietic cells and can provide useful lineage-restricted markers. Thus, CD22 is present only on mature B
cells, sialoadhesin is on macrophage subsets, CD33 is a marker of early committed myeloid progenitor cells, siglec-5 is expressed by monocytes and mature neutrophils, siglec-6 is on B cells and siglec-7 is expressed by NK cells and monocytes (Dorken, B., et al., J. Immunology, 136:4470-79 (1986); Crocker, P.R., et al., J. Exp.
Med., 164:1862-75 (1986); Peiper, S.C., et al., In Leukocyte Typing IV. Oxford University Press, Oxford.
814-16 (1989); Cornish, A.L., et al., Blood, 92:2123-132 (1998); Patel, N., et al., J Biol.
Chem, 274:22729-738 (1999); Nicol, G., et al., J Biol. Chem, In Press (1999)).
These expression patterns indicate discrete functions amongst hemopoietic cell subsets, but apart from CD22, a well-characterised negative regulator of B cell activation (reviewed in Cyster, J.G. and C.C. Goodnow, Immunity, 6:509-17 (1997)), the biological functions of siglecs expressed in the hemopoietic system are unknown. Proposed functions include cell-cell interactions through recognition of sialylated glycoconjugates on other cells.
However, a number of studies have also shown that cell-cell adhesion mediated by siglecs can be modulated by cis-interactions with sialic acids present in the host plasma membrane. This is particularly striking for CD22, CD33 and siglec-5, whose binding activities can be greatly increased if host cells are pretreated with sialidase to remove the cis-competing sialic acids (Freeman, D.S., et al., Blood, 85:2005-12 (1995);
Cornish, A.L., et al., Blood, 92:2123-132 (1998); Sgroi, D., et al., P.N.A.S., 92:4026-30 (1995)).
Besides potential roles in cellular interactions, there is growing evidence that, similar to CD22, the more recently characterised siglecs are involved in signalling functions. The cytoplasmic tails of CD33 and siglecs-5, -6 and -7 have two well-conserved tyrosine-based motifs that are similar to well-characterised signaling motifs in other leukocyte receptors (Gergely, J., et al., Immun. Lett., 68:3-15 (1999)).
Where studied, both tyrosine residues can be phosphorylated by src-like kinase(s) and, in the case of the membrane proximal tyrosine, this leads to subsequent recruitment of the tyrosine phosphatases, SHP-1 and SHP-2 (Falco, M., et al., J. Exp. Med., 190:793-802 (1999); Taylor, V.C., et al., J. Biol. Chem., 274:11505-12 (1999)).
Thus there exists a clear need for identifying and exploiting novel members of the siglec family of immunoglobulin proteins. Although structurally related, such proteins may possess diverse and multifaceted functions in a variety of cell and tissue types. The inventive purified siglec proteins are research tools useful for the identification, characterization and purification of cell signaling molecules. Furthermore, the identification of new siglecs permits the development of a range of derivatives, agonists and antagonists at the nucleic acid and protein levels which in turn have applications in the treatment and diagnosis of a range of conditions such as cancer, inflammation, neurological disorders and immunological disorders, amongst many other conditions.
The polypeptide of the present invention has been putatively identified as a member of the siglec family and has been termed CD33-like 3. This identification has been made as a result of amino acid sequence homology to the human CD33L1 (See Genbank Accession No. gi|2913995).
Figures 16A-B show the nucleotide (SEQ ID NO:16) and deduced amino acid sequence (SEQ ID NO:34) of CD33-like 3. Predicted amino acids from about 1 to about 18 constitute the predicted signal peptide (amino acid residues from about 1 to about 18 in SEQ ID NO:34) and are represented by the underlined amino acid regions; and amino acids from about 360 to about 376 constitute the predicted transmembrane domain (amino acids from about 360 to about 376 in SEQ ID NO:34) and are represented by the double underlined amino acids.
Figure 17 shows the regions of similarity between the amino acid sequences of the CD33-like 3 protein (SEQ ID NO:34) and the human CD33L1 protein (SEQ ID NO:99).
Figure 18 shows an analysis of the CD33-like 3 amino acid sequence. Alpha, beta, turn and coil regions; hydrophilicity and hydrophobicity; amphipathic regions; flexible regions; antigenic index and surface probability are shown.
A polynucleotide encoding a polypeptide of the present invention is obtained from human NK cells, T-cells, primary dendritic cells, placenta, spleen, primary breast cancer, gall bladder, apoptotic T-cells, macrophages, and chronic lymphocytic leukemia spleen. The polynucleotide of this invention was discovered in a human primary dendritic cell cDNA library.
As shown in Figures 16A-B, CD33-like 3 has a transmembrane domain (the transmembrane domain comprises amino acids from about 360 to about 376 of SEQ ID NO:34, which correspond to amino acids from about 360 to about 376 of Figures 16A-B). The polynucleotide contains an open reading frame encoding the CD33-like 3 polypeptide of 467 amino acids. CD33-like 3 exhibits a high degree of homology at the amino acid level to the human CD33L1 (as shown in Figure 17).
The present invention provides isolated nucleic acid molecules comprising a polynucleotide encoding the CD33-like 3 polypeptide having the amino acid sequence shown in Figures 16A-B (SEQ ID NO:34). The nucleotide sequence shown in Figures 16A-B (SEQ ID NO:16) was obtained by sequencing a cloned cDNA (HDPtJW68), which was deposited on November 17 at the American Type Culture Collection, and given Accession Number 203484.
The present invention is further directed to fragments of the isolated nucleic acid molecules described herein. By a fragment of an isolated DNA molecule having the nucleotide sequence of the deposited cDNA or the nucleotide sequence shown in SEQ ID NO:16 is intended DNA fragments at least about 15 nt, and more preferably at least about 20 nt, still more preferably at least about 30 nt, and even more preferably, at least about 40 nt in length which are useful as diagnostic probes and primers as discussed herein. Of course, larger fragments 50-1500 nt in length are also useful according to the present invention, as are fragments corresponding to most, if not all, of the nucleotide sequence of the deposited cDNA or as shown in SEQ ID NO:16. By a fragment at least 20 nt in length, for example, is intended fragments which include 20 or more contiguous bases from the nucleotide sequence of the deposited cDNA or the nucleotide sequence as shown in SEQ ID NO:16. In this context "about" includes the particularly recited size, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini.

Representative examples of CD33-like 3 polynucleotide fragments of the invention include, for example, fragments that comprise, or alternatively, consist of, a sequence from about nucleotide 1 to about 50, from about 51 to about 100, from about 101 to about 150, from about 151 to about 200, from about 201 to about 250, from about 251 to about 300, from about 301 to about 350, from about 351 to about 400, from about 401 to about 450, from about 451 to about 500, from about 501 to about 550, from about 551 to about 600, from about 601 to about 650, from about 651 to about 700, from about 701 to about 750, from about 751 to about 800, from about 801 to about 850, from about 851 to about 900, from about 901 to about 950, from about 951 to about 1000, from about 1001 to about 1050, from about 1051 to about 1100, from about 1101 to about 1150, from about 1151 to about 1200, from about 1201 to about 1250, from about 1251 to about 1300, from about 1301 to about 1350, from about 1351 to about 1400, from about 1401 to about 1450, from about 1451 to about 1500, from about 1501 to about 1550, from about 1551 to about 1600, from about 1601 to about 1650, from about 1651 to about 1700, from about 1701 to about 1748 of SEQ ID NO:16, or the complementary strand thereto, or the cDNA contained in the deposited gene. In this context "about" includes the particularly recited ranges, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini.
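The representative fragment ranges above follow a regular 50-nucleotide tiling of the 1748-nucleotide sequence, with "about" allowing up to 5 nucleotides of slack at either end. A minimal Python sketch of that tiling follows; the function names and the tolerance check are illustrative assumptions, not part of the specification.

```python
# Illustrative tiling of SEQ ID NO:16 into the recited windows; names are hypothetical.
def fragment_windows(total_len=1748, step=50):
    """Return 1-based inclusive (start, end) windows covering the sequence."""
    return [
        (start, min(start + step - 1, total_len))
        for start in range(1, total_len + 1, step)
    ]


def matches_about(window, start, end, tolerance=5):
    """True if (start, end) lies within `tolerance` nt of a recited window."""
    return (abs(start - window[0]) <= tolerance
            and abs(end - window[1]) <= tolerance)


if __name__ == "__main__":
    windows = fragment_windows()
    print(windows[0], windows[-1], len(windows))
    # A fragment 3 nt longer at each end still counts as "about 51 to about 100".
    print(matches_about((51, 100), 48, 103))
```

The final window is shorter than the others because 1748 is not a multiple of 50.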
Preferred nucleic acid fragments of the present invention include nucleic acid molecules encoding a member selected from the group: a polypeptide comprising or alternatively, consisting of, the transmembrane domain (amino acid residues from about 360 to about 376 in Figures 16A-B (amino acids from about 360 to about 376 in SEQ ID NO:34)). Since the location of these domains has been predicted by computer analysis, one of ordinary skill would appreciate that the amino acid residues constituting these domains may vary slightly (e.g., by about 1 to 15 amino acid residues) depending on the criteria used to define each domain.
In additional embodiments, the polynucleotides of the invention encode functional attributes of CD33-like 3.
Preferred embodiments of the invention in this regard include fragments that comprise alpha-helix and alpha-helix forming regions ("alpha-regions"), beta-sheet and beta-sheet forming regions ("beta-regions"), turn and turn-forming regions ("turn-regions"), coil and coil-forming regions ("coil-regions"), hydrophilic regions, hydrophobic regions, alpha amphipathic regions, beta amphipathic regions, flexible regions, surface-forming regions and high antigenic index regions of CD33-like 3. The data representing the structural or functional attributes of CD33-like 3 set forth in Figure 18 and/or Table VI, as described above, was generated using the various modules and algorithms of the DNA*STAR set on default parameters. In a preferred embodiment, the data presented in columns VIII, IX, XIII, and XIV of Table VI can be used to determine regions of CD33-like 3 which exhibit a high degree of potential for antigenicity. Regions of high antigenicity are determined from the data presented in columns VIII, IX, XIII, and/or XIV by choosing values which represent regions of the polypeptide which are likely to be exposed on the surface of the polypeptide in an environment in which antigen recognition may occur in the process of initiation of an immune response.
Certain preferred regions in these regards are set out in Figure 18, but may, as shown in Table VI, be represented or identified by using tabular representations of the data presented in Figure 18. The DNA*STAR computer algorithm used to generate Figure 18 (set on the original default parameters) was used to present the data in Figure 18 in a tabular format (See Table VI). The tabular format of the data in Figure 18 is used to easily determine specific boundaries of a preferred region. The above-mentioned preferred regions set out in Figure 18 and in Table VI include, but are not limited to, regions of the aforementioned types identified by analysis of the amino acid sequence set out in Figures 16A-B. As set out in Figure 18 and in Table VI, such preferred regions include Garnier-Robson alpha-regions, beta-regions, turn-regions, and coil-regions, Chou-Fasman alpha-regions, beta-regions, and turn-regions, Kyte-Doolittle hydrophilic regions and Hopp-Woods hydrophobic regions, Eisenberg alpha- and beta-amphipathic regions, Karplus-Schulz flexible regions, Jameson-Wolf regions of high antigenic index and Emini surface-forming regions.

Even if deletion of one or more amino acids from the N-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities, ability to multimerize, modulate cellular interaction, or signalling pathways, etc.) may still be retained. For example, the ability of shortened CD33-like 3 muteins to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptides generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the N-terminus.
Whether a particular polypeptide lacking N-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art. It is not unlikely that a CD33-like 3 mutein with a large number of deleted N-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six CD33-like 3 amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the amino terminus of the CD33-like 3 amino acid sequence shown in Figures 16A-B, up to the glutamic acid residue at position number 462, and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues n1-467 of Figures 16A-B, where n1 is an integer from 2 to 462 corresponding to the position of the amino acid residue in Figures 16A-B (which is identical to the sequence shown as SEQ ID NO:34). In another embodiment, N-terminal deletions of the CD33-like 3 polypeptide can be described by the general formula n2-467, where n2 is a number from 2 to 462, corresponding to the position of amino acid identified in Figures 16A-B. N-terminal deletions of the CD33-like 3 polypeptide of the invention shown as SEQ ID
NO:34 include polypeptides comprising the amino acid sequence of residues:
L-2 to K-467; L-3 to K-467; L-4 to K-467; L-5 to K-467; L-6 to K-467; L-7 to K-467; P-8 to K-467; L-9 to K-467; L-10 to K-467;
W-11 to K-467; G-12 to K-467; R-13 to K-467; E-14 to K-467; R-15 to K-467; V-16 to K-467; E-17 to K-467; G-18 to K-467; Q-19 to K-467; K-20 to K-467;
S-21 to K-467; N-22 to K-467; R-23 to K-467; K-24 to K-467; D-25 to K-467; Y-26 to K-467; S-27 to K-467; L-28 to K-467; T-29 to K-467; M-30 to K-467;
Q-31 to K-467; S-32 to K-467; S-33 to K-467; V-34 to K-467; T-35 to K-467; V-36 to K-467; Q-37 to K-467; E-38 to K-467; G-39 to K-467; M-40 to K-467;
C-41 to K-467; V-42 to K-467; H-43 to K-467; V-44 to K-467; R-45 to K-467; C-46 to K-467; S-47 to K-467; F-48 to K-467; S-49 to K-467; Y-50 to K-467;
P-51 to K-467; V-52 to K-467; D-53 to K-467; S-54 to K-467; Q-55 to K-467; T-56 to K-467; D-57 to K-467; S-58 to K-467; D-59 to K-467; P-60 to K-467;
V-61 to K-467; H-62 to K-467; G-63 to K-467; Y-64 to K-467; W-65 to K-467; F-66 to K-467; R-67 to K-467; A-68 to K-467; G-69 to K-467; N-70 to K-467;
D-71 to K-467; I-72 to K-467; S-73 to K-467; W-74 to K-467; K-75 to K-467; A-76 to K-467; P-77 to K-467; V-78 to K-467; A-79 to K-467; T-80 to K-467;
N-81 to K-467; N-82 to K-467; P-83 to K-467; A-84 to K-467; W-85 to K-467; A-86 to K-467; V-87 to K-467; Q-88 to K-467; E-89 to K-467; E-90 to K-467;
T-91 to K-467; R-92 to K-467; D-93 to K-467; R-94 to K-467; F-95 to K-467; H-96 to K-467; L-97 to K-467; L-98 to K-467; G-99 to K-467; D-100 to K-467;
P-101 to K-467; Q-102 to K-467; T-103 to K-467; K-104 to K-467; N-105 to K-467; C-106 to K-467; T-107 to K-467; L-108 to K-467; S-109 to K-467; I-110 to K-467;
R-111 to K-467; D-112 to K-467; A-113 to K-467; R-114 to K-467; M-115 to K-467; S-116 to K-467; D-117 to K-467; A-118 to K-467; G-119 to K-467; R-120 to K-467;
Y-121 to K-467; F-122 to K-467; F-123 to K-467; R-124 to K-467; M-125 to K-467; E-126 to K-467; K-127 to K-467; G-128 to K-467; N-129 to K-467; I-130 to K-467;
K-131 to K-467; W-132 to K-467; N-133 to K-467; Y-134 to K-467; K-135 to K-467; Y-136 to K-467; D-137 to K-467; Q-138 to K-467; L-139 to K-467; S-140 to K-467;
V-141 to K-467; N-142 to K-467; V-143 to K-467; T-144 to K-467; A-145 to K-467; L-146 to K-467; T-147 to K-467; H-148 to K-467; R-149 to K-467; P-150 to K-467;
N-151 to K-467; I-152 to K-467; L-153 to K-467; I-154 to K-467; P-155 to K-467; G-156 to K-467; T-157 to K-467; L-158 to K-467; E-159 to K-467; S-160 to K-467;
G-161 to K-467; C-162 to K-467; F-163 to K-467; Q-164 to K-467; N-165 to K-467; L-166 to K-467; T-167 to K-467; C-168 to K-467; S-169 to K-467; V-170 to K-467;
P-171 to K-467; W-172 to K-467; A-173 to K-467; C-174 to K-467; E-175 to K-467; Q-176 to K-467; G-177 to K-467; T-178 to K-467; P-179 to K-467; P-180 to K-467;
M-181 to K-467; I-182 to K-467; S-183 to K-467; W-184 to K-467; M-185 to K-467; G-186 to K-467; T-187 to K-467; S-188 to K-467; V-189 to K-467; S-190 to K-467;
P-191 to K-467; L-192 to K-467; H-193 to K-467; P-194 to K-467; S-195 to K-467; T-196 to K-467; T-197 to K-467; R-198 to K-467; S-199 to K-467; S-200 to K-467;
V-201 to K-467; L-202 to K-467; T-203 to K-467; L-204 to K-467; I-205 to K-467; P-206 to K-467; Q-207 to K-467; P-208 to K-467; Q-209 to K-467; H-210 to K-467;
H-211 to K-467; G-212 to K-467; T-213 to K-467; S-214 to K-467; L-215 to K-467; T-216 to K-467; C-217 to K-467; Q-218 to K-467; V-219 to K-467; T-220 to K-467;
L-221 to K-467; P-222 to K-467; G-223 to K-467; A-224 to K-467; G-225 to K-467; V-226 to K-467; T-227 to K-467; T-228 to K-467; N-229 to K-467; R-230 to K-467;
T-231 to K-467; I-232 to K-467; Q-233 to K-467; L-234 to K-467; N-235 to K-467; V-236 to K-467; S-237 to K-467; Y-238 to K-467; P-239 to K-467; P-240 to K-467;
Q-241 to K-467; N-242 to K-467; L-243 to K-467; T-244 to K-467; V-245 to K-467; T-246 to K-467; V-247 to K-467; F-248 to K-467; Q-249 to K-467; G-250 to K-467;
E-251 to K-467; G-252 to K-467; T-253 to K-467; A-254 to K-467; S-255 to K-467; T-256 to K-467; A-257 to K-467; L-258 to K-467; G-259 to K-467; N-260 to K-467;
S-261 to K-467; S-262 to K-467; S-263 to K-467; L-264 to K-467; S-265 to K-467; V-266 to K-467; L-267 to K-467; E-268 to K-467; G-269 to K-467; Q-270 to K-467;
S-271 to K-467; L-272 to K-467; R-273 to K-467; L-274 to K-467; V-275 to K-467; C-276 to K-467; A-277 to K-467; V-278 to K-467; D-279 to K-467; S-280 to K-467;
N-281 to K-467; P-282 to K-467; P-283 to K-467; A-284 to K-467; R-285 to K-467; L-286 to K-467; S-287 to K-467; W-288 to K-467; T-289 to K-467; W-290 to K-467;
R-291 to K-467; S-292 to K-467; L-293 to K-467; T-294 to K-467; L-295 to K-467; Y-296 to K-467; P-297 to K-467; S-298 to K-467; Q-299 to K-467; P-300 to K-467;
S-301 to K-467; N-302 to K-467; P-303 to K-467; L-304 to K-467; V-305 to K-467; L-306 to K-467; E-307 to K-467; L-308 to K-467; Q-309 to K-467; V-310 to K-467;
H-311 to K-467; L-312 to K-467; G-313 to K-467; D-314 to K-467; E-315 to K-467; G-316 to K-467; E-317 to K-467; F-318 to K-467; T-319 to K-467; C-320 to K-467;
R-321 to K-467; A-322 to K-467; Q-323 to K-467; N-324 to K-467; S-325 to K-467; L-326 to K-467; G-327 to K-467; S-328 to K-467; Q-329 to K-467; H-330 to K-467;
V-331 to K-467; S-332 to K-467; L-333 to K-467; N-334 to K-467; L-335 to K-467; S-336 to K-467; L-337 to K-467; Q-338 to K-467; Q-339 to K-467; E-340 to K-467;
Y-341 to K-467; T-342 to K-467; G-343 to K-467; K-344 to K-467; M-345 to K-467; R-346 to K-467; P-347 to K-467; V-348 to K-467; S-349 to K-467; G-350 to K-467;
V-351 to K-467; L-352 to K-467; L-353 to K-467; G-354 to K-467; A-355 to K-467; V-356 to K-467; G-357 to K-467; G-358 to K-467; A-359 to K-467; G-360 to K-467;
A-361 to K-467; T-362 to K-467; A-363 to K-467; L-364 to K-467; V-365 to K-467; F-366 to K-467; L-367 to K-467; S-368 to K-467; F-369 to K-467; C-370 to K-467;
V-371 to K-467; I-372 to K-467; F-373 to K-467; I-374 to K-467; V-375 to K-467; V-376 to K-467; R-377 to K-467; S-378 to K-467; C-379 to K-467; R-380 to K-467;
K-381 to K-467; K-382 to K-467; S-383 to K-467; A-384 to K-467; R-385 to K-467; P-386 to K-467; A-387 to K-467; A-388 to K-467; D-389 to K-467; V-390 to K-467;
G-391 to K-467; D-392 to K-467; I-393 to K-467; G-394 to K-467; M-395 to K-467; K-396 to K-467; D-397 to K-467; A-398 to K-467; N-399 to K-467; T-400 to K-467;
I-401 to K-467; R-402 to K-467; G-403 to K-467; S-404 to K-467; A-405 to K-467; S-406 to K-467; Q-407 to K-467; G-408 to K-467; N-409 to K-467; L-410 to K-467;
T-411 to K-467; E-412 to K-467; S-413 to K-467; W-414 to K-467; A-415 to K-467; D-416 to K-467; D-417 to K-467; N-418 to K-467; P-419 to K-467; R-420 to K-467;
H-421 to K-467; H-422 to K-467; G-423 to K-467; L-424 to K-467; A-425 to K-467; A-426 to K-467; H-427 to K-467; S-428 to K-467; S-429 to K-467; G-430 to K-467;
E-431 to K-467; E-432 to K-467; R-433 to K-467; E-434 to K-467; I-435 to K-467; Q-436 to K-467; Y-437 to K-467; A-438 to K-467; P-439 to K-467; L-440 to K-467;
S-441 to K-467; F-442 to K-467; H-443 to K-467; K-444 to K-467; G-445 to K-467; E-446 to K-467; P-447 to K-467; Q-448 to K-467; D-449 to K-467; L-450 to K-467;
S-451 to K-467; G-452 to K-467; Q-453 to K-467; E-454 to K-467; A-455 to K-467; T-456 to K-467; N-457 to K-467; N-458 to K-467; E-459 to K-467; Y-460 to K-467;
S-461 to K-467; and E-462 to K-467 of SEQ ID NO:34. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
Also as mentioned above, even if deletion of one or more amino acids from the C-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities may still be retained. For example, the ability of the shortened CD33-like 3 mutein to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptide generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the C-terminus. Whether a particular polypeptide lacking C-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art. It is not unlikely that a CD33-like 3 mutein with a large number of deleted C-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six CD33-like 3 amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the carboxy terminus of the amino acid sequence of the CD33-like 3 polypeptide shown in Figures 16A-B, up to the leucine residue at position number 6, and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues 1-m1 of Figures 16A-B, where m1 is an integer from 6 to 467 corresponding to the position of the amino acid residue in Figures 16A-B.
Moreover, the invention provides polynucleotides encoding polypeptides comprising, or alternatively consisting of, the amino acid sequence of C-terminal deletions of the CD33-like 3 polypeptide of the invention shown as SEQ ID N0:34 include polypeptides comprising the amino acid sequence of residues: M-1 to P-466; M-1 to I-465; M-1 to K-464; M-1 to I-463; M-1 to E-462; M-1 to S-461; M-1 to Y-460; M-1 toE-459; M-1 to N-458; M-1 to N-457; M-1 to T-456; M-1 to A-455; M-1 to E-454; M-1 to Q-453; M-1 to G-452; M-1 to S-451; M-1 toL-450; M-1 to D-449; M-1 to Q-448; M-1 to P-447; M-1 to E-446; M-1 to G-445; M-1 to K-444; M-1 to H-443; M-1 to F-442; M-1 toS-441; M-1 to L-440; M-1 to P-439; M-1 to A-438; M-1 to Y-437; M-1 to Q-436; M-1 to I-435; M-1 to E-434; M-1 to R-433 ; M-1 toE-432; M-1 to E-431; M-1 to G-430; M-1 to S-429; M-1 to S-428; M-1 to H-427; M-1 to A-426; M-1 to A-425; M-1 to L-424; M-1 toG-423; M-1 to H-422; M-1 to H-421; M-1 to R-420; M-1 to P-419; M-1 to N-418; M-1 to D-417; M-to D-416: M-1 to A-415; M-1 toW-414; M-1 to S-413; M-1 to E-412; M-1 to T-41 l; M-1 SUBSTITUTE S)EiEET (RULE Z6) to L-410; M-1 to N-409; M-1 to G-4.08; M-1 to Q-407; M-1 to S-406; M-1 toA-405; M-1 to S-404; M-1 to G-403; M-1 to R-402; M-1 to I-401; M-1 to T-400; M-1 to N-399; M-1 to A-398; M-1 to D-397; M-1 toK-396; M-1 to M-395; M-1 to G-394; M-1 to I-393;

to D-392; M-1 to G-391; M-1 to V-390; M-1 to D-389; M-1 to A-388; M-1 toA-387;
M-1 to P-386; M-I to R-385; M-1 to A-384; M-1 to S-383; M-1 to K-382; M-1 to K-381;
M-1 to R-380; M-I to C-379; M-1 toS-378; M-1 to R-377; M-1 to V-376; M-1 to V-375;
M-1 to I-374; M-1 to F-373; M-l to I-372; M-I to V-371; M-1 to C-370; M-1 toF-369;
M-1 to S-368; M-1 to L-367; M-1 to F-366; M-1 to V-365; M-I to L-364; M-I to A-363;
M-1 to T-362; M-1 to A-361; M-I toG-360; M-1 to A-359; M-1 to G-358; M-1 to G-357; M-1 to V-356; M-I to A-35_5; M-1 to G-354; M-i to L-353; M-1 to L-352; M-toV-35 i; M-1 to G-350; M-i to S-349; M-1 to V-348; M-1 to P-347; M-1 to R-346; M-1 to M-34_5; M-1 to K-344; M-1 to G-343; M-1 toT-342; M-1 to Y-341; M-1 to E-340; M-1 to Q-339; M-I to Q-338; M-1 to L-337; M-1 to S-336; M-1 to L-335; M-1 to N-334;
M-1 toL-333; M-1 to S-332; M-1 to V-331; M-1 to H-330; M-1 to Q-329; M-1 to S-328;
M-1 to G-327; M-1 to L-326; M-1 to S-325; M-I toN-324; M-1 to Q-323; M-1 to A-322;
M-1 to R-321; M-1 to C-320; M-1 to T-319; M-1 to F-3 I 8; M-1 to E-317; M-1 to G-316;
M-1 toE-315; M-1 to D-314; M-I to G-313; M-1 to L-312; M-1 to H-3i 1; M-1 to V-310;
M-1 to Q-309; M-1 to L-308; M-1 to E-307; M-1 toL-306; M-1 to V-305; M-1 to L-304;
M-1 to P-303; M-1 to N-302; M-1 to S-301; M-1 to P-300; M-1 to Q-299; M-1 to S-298;
M-1 toP-297; M-1 to Y-296; M-1 to L-295; M-1 to T-294; M-1 to L-293; M-1 to S-292;
M-1 to R-291; M-1 to W-290; M-1 to T-289; M-1 toW-288; M-1 to S-287; M-1 to L-286; M-1 to R-285; M-1 to A-284; M-1 to P-283; M-1 to P-282; M-1 to N-281; M-1 to S-280; M-1 toD-279; M-1 to V-278; M-1 to A-277; M-1 to C-276; M-1 to V-275; M-1 to L-274; M-1 to R-273; M-1 to L-272; M-1 to S-271; M-1 toQ-270; M-1 to G-269; M-1 to E-268; M-1 to L-267; M-1 to V-266; M-1 to S-265; M-1 to L-264; M-1 to S-263; M-1 to S-262; M-1 toS-261; M-1 to N-260; M-1 to G-259; M-1 to L-258; M-1 to A-257; M-1 to T-256; M-1 to S-255; M-i to A-254; M-1 to T-253; M-1 toG-252; M-1 to E-251; M-1 to SUBSTITUTE SHEET (RULE 26) G-250; M-1 to Q-249; M-1 to F-248; M-1 to V-247; M-1 to T-246; M-1 to V-245; M-to T-244; M-1 toL-243; M-1 to N-242; M-1 to Q-241; M-1 to P-240; M-1 to P-239;

to Y-238; M-1 to S-237; M-1 to V-236; M-1 to N-235; M-1 toL-234; M-1 to Q-233;

to I-232; M-1 to T-231; M-1 to R-230; M-1 to N-229; M-I to T-228; M-1 to T-227; M-1 to V-226; M-I toG-225; M-1 to A-224; M-1 to G-223; M-1 to P-222; M-1 to L-22I;

to T-220; M-1 to V-219; M-1 to Q-218; M-J to C-217; M-I toT-216; M-1 to L-2i5;

to S-214; M-1 to T-213; M-1 to G-212; M-1 to H-211; M-1 to H-210; M-1 to Q-209; M-1 to P-208; M-1 to Q-207; M-1 to P-206; M-1 to I-205; M-1 to L-204; M-1 to T-203; M-1 to L-202; M-1 to V-201; M-1 to S-200; M-1 to S-199; M-1 to R-198; M-1 to T-197; M-1 to T-196; M-1 to S-195; M-1 to P-194; M-1 to H-193; M-1 to L-192; M-1 to P-191; M-1 to S-190; M-1 to V-189; M-1 to S-188; M-1 to T-187; M-1 to G-186; M-1 to M-185; M-1 to W-184; M-1 to S-183; M-1 to I-182; M-1 to M-181; M-1 to P-180; M-1 to P-179; M-1 to T-178; M-1 to G-177; M-1 to Q-176; M-1 to E-175; M-1 to C-174; M-1 to A-173; M-1 to W-172; M-1 to P-171; M-1 to V-170; M-1 to S-169; M-1 to C-168; M-1 to T-167; M-1 to L-166; M-1 to N-165; M-1 to Q-164; M-1 to F-163; M-1 to C-162; M-1 to G-161; M-1 to S-160; M-1 to E-159; M-1 to L-158; M-1 to T-157; M-1 to G-156; M-1 to P-155; M-1 to I-154; M-1 to L-153; M-1 to I-152; M-1 to N-151; M-1 to P-150; M-1 to R-149; M-1 to H-148; M-1 to T-147; M-1 to L-146; M-1 to A-145; M-1 to T-144; M-1 to V-143; M-1 to N-142; M-1 to V-141; M-1 to S-140; M-1 to L-139; M-1 to Q-138; M-1 to D-137; M-1 to Y-136; M-1 to K-135; M-1 to Y-134; M-1 to N-133; M-1 to W-132; M-1 to K-131; M-1 to I-130; M-1 to N-129; M-1 to G-128; M-1 to K-127; M-1 to E-126; M-1 to M-125; M-1 to R-124; M-1 to F-123; M-1 to F-122; M-1 to Y-121; M-1 to R-120; M-1 to G-119; M-1 to A-118; M-1 to D-117; M-1 to S-116; M-1 to M-115; M-1 to R-114; M-1 to A-113; M-1 to D-112; M-1 to R-111; M-1 to I-110; M-1 to S-109; M-1 to L-108; M-1 to T-107; M-1 to C-106; M-1 to N-105; M-1 to K-104; M-1 to T-103; M-1 to Q-102; M-1 to P-101; M-1 to D-100; M-1 to G-99; M-1 to L-98; M-1 to L-97; M-1 to H-96; M-1 to F-95; M-1 to R-94; M-1 to D-93; M-1 to R-92; M-1 to T-91; M-1 to E-90; M-1 to E-89; M-1 to Q-88; M-1 to V-87; M-1 to A-86; M-1 to W-85; M-1 to A-84; M-1 to P-83; M-1 to N-82; M-1 to N-81; M-1 to T-80; M-1 to A-79; M-1 to V-78; M-1 to P-77; M-1 to A-76; M-1 to K-75; M-1 to W-74; M-1 to S-73; M-1 to I-72; M-1 to D-71; M-1 to N-70; M-1 to G-69; M-1 to A-68; M-1 to R-67; M-1 to F-66; M-1 to W-65; M-1 to Y-64; M-1 to G-63; M-1 to H-62; M-1 to V-61; M-1 to P-60; M-1 to D-59; M-1 to S-58; M-1 to D-57; M-1 to T-56; M-1 to Q-55; M-1 to S-54; M-1 to D-53; M-1 to V-52; M-1 to P-51; M-1 to Y-50; M-1 to S-49; M-1 to F-48; M-1 to S-47; M-1 to C-46; M-1 to R-45; M-1 to V-44; M-1 to H-43; M-1 to V-42; M-1 to C-41; M-1 to M-40; M-1 to G-39; M-1 to E-38; M-1 to Q-37; M-1 to V-36; M-1 to T-35; M-1 to V-34; M-1 to S-33; M-1 to S-32; M-1 to Q-31; M-1 to M-30; M-1 to T-29; M-1 to L-28; M-1 to S-27; M-1 to Y-26; M-1 to D-25; M-1 to K-24; M-1 to R-23; M-1 to N-22; M-1 to S-21; M-1 to K-20; M-1 to Q-19; M-1 to G-18; M-1 to E-17; M-1 to V-16; M-1 to R-15; M-1 to E-14; M-1 to R-13; M-1 to G-12; M-1 to W-11; M-1 to L-10; M-1 to L-9; M-1 to P-8; M-1 to L-7; M-1 to L-6 of SEQ ID NO:34. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
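The C-terminal deletion series listed above follows a regular pattern: every fragment begins at M-1 and ends one residue earlier than the previous one, down to a six-residue fragment. A minimal Python sketch of that enumeration is given here for illustration only. The function name, the toy sequence, and the assumption that the series begins one residue short of full length are hypothetical and are not taken from the specification.

```python
def c_terminal_deletion_labels(sequence, min_end=6):
    """Return labels such as 'M-1 to L-6' for each C-terminal truncation.

    Residues are numbered from 1. Each label names the first residue and the
    last residue retained, ordered from the longest fragment to the shortest.
    """
    if len(sequence) < min_end:
        raise ValueError("sequence shorter than the minimum fragment length")
    first = f"{sequence[0]}-1"
    labels = []
    # Assumption: the series starts one residue short of full length.
    for end in range(len(sequence) - 1, min_end - 1, -1):
        labels.append(f"{first} to {sequence[end - 1]}-{end}")
    return labels


# Hypothetical 10-residue toy sequence, NOT taken from SEQ ID NO:34.
toy = "MLLPLLWGRE"
for label in c_terminal_deletion_labels(toy):
    print(label)
```

Applied to a full-length sequence, the same loop would reproduce a list of the form shown above, ending at the six-residue fragment.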
In addition, the invention provides nucleic acid molecules having nucleotide sequences related to extensive portions of SEQ ID NO:16 which have been determined from the following related cDNA genes: HGBAY02R (SEQ ID NO:100) and HLYBY62R (SEQ ID NO:101).
Based on the sequence similarity to human CD33L1, the translation product of this gene is expected to share at least some biological activities with CD33 proteins, and specifically with myeloid modulatory proteins and/or siglec proteins. Such activities are known in the art, and some are described elsewhere herein.
Specifically, polynucleotides and polypeptides of the invention are also useful for modulating the differentiation of normal and malignant cells, modulating the proliferation and/or differentiation of cancer and neoplastic cells, and modulating the immune response. Polynucleotides and polypeptides of the invention may represent a diagnostic marker for hematopoietic and immune diseases and/or disorders. The full-length protein should be a secreted protein, based upon homology to the CD33 family.
Therefore, it is secreted into serum, urine, or feces, and thus its levels are assayable from patient samples. Assuming specific expression levels are reflective of the presence of immune disorders, this protein would provide a convenient diagnostic for early detection.
In addition, expression of this gene product may also be linked to the progression of immune diseases, and therefore may itself actually represent a therapeutic or therapeutic target for the treatment of cancer.
Polynucleotides and polypeptides of the invention may play an important role in the pathogenesis of human cancers and cellular transformation, particularly those of the immune and hematopoietic systems. Polynucleotides and polypeptides of the invention may also be involved in the pathogenesis of developmental abnormalities based upon its potential effects on proliferation and differentiation of cells and tissue cell types. Due to the potential proliferating and differentiating activity of said polynucleotides and polypeptides, the invention is useful as a therapeutic agent in inducing tissue regeneration, for treating inflammatory conditions (e.g., inflammatory bowel syndrome, diverticulitis, etc.). Moreover, the invention is useful in modulating the immune response to aberrant polypeptides, as may exist in rapidly proliferating cells and tissue cell types, particularly in adenocarcinoma cells, and other cancers.
This gene is expressed predominantly on NK cells, and to a lesser extent on T-cells. Therefore, polynucleotides and polypeptides of the invention are useful as reagents for differential identification of the tissue(s) or cell type(s) present in a biological sample and for diagnosis of diseases and conditions which include, but are not limited to, immune disorders and cancer, as well as the immunodiagnosis of acute leukemias. Similarly, polypeptides and antibodies directed to these polypeptides are useful to provide immunological probes for differential identification of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, particularly of the immune system and breast tissue, expression of this gene at significantly higher or lower levels is detected in certain tissues or cell types (e.g., immune, cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual having such a disorder, relative to the standard gene expression level, i.e., the expression level in healthy tissue from an individual not having the disorder.
Preferred polypeptides of the present invention comprise immunogenic epitopes shown in SEQ ID NO:34 as residues: Gly-12 to Tyr-26, Val-52 to Asp-59, Gln-88 to Asp-93, Arg-124 to Asn-129, His-193 to Arg-198, Gln-207 to Thr-213, Gln-338 to Arg-346, Ser-378 to Ala-384, Ser-413 to Arg-420, Ser-428 to Glu-434, His-443 to Ser-451, Glu-454 to Ser-461. Polynucleotides encoding said polypeptides are also provided.
The tissue distribution in NK cells, in combination with the homology to the siglec family of proteins, indicates that the protein product of this gene is useful for the diagnosis and treatment of a variety of immune system disorders. NK cells are bone-marrow-derived granular lymphocytes that play an important role in natural immunity to infectious diseases and have the capacity to kill certain virally-infected cells and tumor cells that have down-regulated MHC Class I antigen expression. The killing and proinflammatory activities of NK cells are regulated through a variety of cell surface receptors that can mediate either activating or inhibitory signals. The best understood receptors are those that recognize MHC Class I molecules at the cell surface and deliver a negative signal, thereby protecting normal host cells from cytotoxicity.
These receptors can belong either to the C-type lectin superfamily or the Ig superfamily, although in humans the majority are members of the Ig superfamily known as killer cell Ig-like receptors (KIRs). Representative uses are described in the "Immune Activity" and "infectious disease" sections below, in Examples 11, 13, 14, 16, 18, 19, 20, and 27, and elsewhere herein. Briefly, the expression indicates a role in regulating the proliferation, survival, differentiation, and/or activation of hematopoietic cell lineages, including blood stem cells. Involvement in the regulation of cytokine production, antigen presentation, or other processes indicates a usefulness for treatment of cancer (e.g., by boosting immune responses). Expression in cells of lymphoid origin indicates that the natural gene product is involved in immune functions. Therefore it would also be useful as an agent for immunological disorders including arthritis, asthma, immunodeficiency diseases such as AIDS, leukemia, rheumatoid arthritis, granulomatous disease, inflammatory bowel disease, sepsis, acne, neutropenia, neutrophilia, psoriasis, hypersensitivities, such as T-cell mediated cytotoxicity; immune reactions to transplanted organs and tissues, such as host-versus-graft and graft-versus-host diseases; or autoimmunity disorders, such as autoimmune infertility, lens tissue injury, demyelination, systemic lupus erythematosus, drug induced hemolytic anemia, rheumatoid arthritis, Sjogren's disease, and scleroderma. Moreover, the protein may represent a secreted factor that influences the differentiation or behavior of other blood cells, or that recruits hematopoietic cells to sites of injury. Thus, this gene product is thought to be useful in the expansion of stem cells and committed progenitors of various blood lineages, and in the differentiation and/or proliferation of various cell types. Based upon the tissue distribution of this protein, antagonists directed against this protein are useful in blocking the activity of this protein. Accordingly, preferred are antibodies which specifically bind a portion of the translation product of this gene.
Also provided is a kit for detecting tumors in which expression of this protein occurs. Such a kit comprises in one embodiment an antibody specific for the translation product of this gene bound to a solid support. Also provided is a method of detecting these tumors in an individual which comprises a step of contacting an antibody specific for the translation product of this gene to a bodily fluid from the individual, preferably serum, and ascertaining whether antibody binds to an antigen found in the bodily fluid.
Preferably the antibody is bound to a solid support and the bodily fluid is serum. The above embodiments, as well as other treatments and diagnostic tests (kits and methods), are more particularly described elsewhere herein. Furthermore, the protein may also be used to determine biological activity, to raise antibodies, as a tissue marker, to isolate cognate ligands or receptors, and to identify agents that modulate their interactions, in addition to its use as a nutritional supplement. The protein, as well as antibodies directed against the protein, may show utility as a tumor marker and/or immunotherapy target for the above listed tissues.
Many polynucleotide sequences, such as EST sequences, are publicly available and accessible through sequence databases. Some of these sequences are related to SEQ ID NO:16 and may have been publicly available prior to conception of the present invention. Preferably, such related polynucleotides are specifically excluded from the scope of the present invention. To list every related sequence is cumbersome. Accordingly, preferably excluded from the present invention are one or more polynucleotides comprising a nucleotide sequence described by the general formula of a-b, where a is any integer from 1 to 1734 of SEQ ID NO:16, b is an integer of 15 to 1748, where both a and b correspond to the positions of nucleotide residues shown in SEQ ID NO:16, and where b is greater than or equal to a + 14.
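The a-b formula above defines a family of nucleotide ranges rather than a single sequence. The following Python sketch, offered only as an illustration, checks whether a pair (a, b) satisfies the stated constraints and counts how many such pairs exist. It reads the upper bound of b as 1748 (that is, 1734 + 14); the constant and function names are hypothetical.

```python
SEQ_LENGTH = 1748   # upper bound for b, read as 1734 + 14
MIN_SPAN = 14       # b must be >= a + 14, i.e. at least 15 nucleotides


def is_excluded_range(a, b):
    """True if (a, b) satisfies the constraints of the a-b formula."""
    return (
        1 <= a <= SEQ_LENGTH - MIN_SPAN   # a runs from 1 to 1734
        and 15 <= b <= SEQ_LENGTH         # b runs from 15 to 1748
        and b >= a + MIN_SPAN
    )


def count_excluded_ranges():
    """Count all (a, b) pairs allowed by the formula."""
    # For a given a, b runs from a + 14 up to SEQ_LENGTH inclusive.
    return sum(
        SEQ_LENGTH - (a + MIN_SPAN) + 1
        for a in range(1, SEQ_LENGTH - MIN_SPAN + 1)
    )


print(is_excluded_range(1, 15))       # shortest range at the 5' end
print(is_excluded_range(1735, 1748))  # a is out of bounds
print(count_excluded_ranges())
```

The count makes the point of the passage concrete: the formula covers every contiguous stretch of 15 or more nucleotides, which is why listing each related sequence individually would be impractical.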
FEATURES OF PROTEIN ENCODED BY GENE NO: 7
This invention relates to newly identified polynucleotides, polypeptides encoded by such polynucleotides, the use of such polynucleotides and polypeptides, as well as the production of such polynucleotides and polypeptides. The polypeptide of the present invention has been putatively identified as a human integrin alpha 11 homolog derived from a human osteoblast II cDNA library. More particularly, the polypeptide of the present invention has been putatively identified as a human integrin alpha 11-subunit homolog, sometimes hereafter referred to as "integrin alpha 11", "integrin alpha 11-subunit", "α11", "A11-subunit", and/or "Integrin α11-subunit". The invention also relates to inhibiting the action of such polypeptides.
The integrins are a large family of cell adhesion molecules consisting of noncovalently associated αβ heterodimers.
We have cloned and sequenced a novel human integrin α-subunit cDNA, designated α11. The α11 cDNA encodes a protein with a 22 amino acid signal peptide, a large 1120 residue extracellular domain that contains an I-domain of 207 residues and is linked by a transmembrane domain to a short cytoplasmic domain of 24 amino acids. The deduced α11 protein shows the typical structural features of integrin α-subunits and is similar to a distinct group of α-subunits from collagen-binding integrins. However, it differs from most integrin α-chains by an incompletely preserved cytoplasmic GFFKR motif.
The human ITGA11 gene was located to bands q22.3-23 on chromosome 15, and its transcripts were found predominantly in bone and cartilage as well as in cardiac and skeletal muscle. Expression of the 5.5 kilobase α11 mRNA was also detectable in ovary and small intestine.
All vertebrate cells express members of the integrin family of cell adhesion molecules, which mediate cellular adhesion to other cells and to the extracellular substratum, mediate cell migration, and participate in important physiologic processes from signal transduction to cell proliferation and differentiation {Hynes, 92; Springer, 92}.
Integrins are structurally homologous heterodimeric type-I membrane glycoproteins formed by the noncovalent association of one of eight β-subunits with one of the 17 different α-subunits described to date, resulting in at least 22 different αβ complexes. Their binding specificities for cellular and extracellular ligands are determined by both subunits and are dynamically regulated in a cell-type-specific mode by the cellular environment as well as by the developmental and activation state of the cell {Diamond and Springer, 94}. In integrin α-subunits, the aminoterminal region of the large extracellular domain consists of a seven-fold repeated structure which is predicted to fold into a β-propeller domain {Corbi et al., 1987; Springer, 1997}. The three or four C-terminal repeats contain putative divalent cation binding motifs that are thought to be important for ligand binding and subunit association {Diamond and Springer, 94}. The α1, α2, α10, αD, αE, αL, αM and αX-subunits contain an approximately 200 amino acid I-domain inserted between the second and third repeat that is not present in other α-chains {Larson et al., 1989}. Several isolated I-domains have been shown to independently bind the ligands of the parent integrin heterodimer {Kamata and Takada, 1994; Randi and Hogg, 1994}. The α3, α5-8, αIIb and αV-subunits are proteolytically processed at a conserved site into disulphide-linked heavy and light chains, while the α4-subunit is cleaved at a more aminoterminal site into two fragments that remain noncovalently associated {Hemler et al., 90}. Additional α-subunit variants are generated by alternative splicing of primary transcripts {Ziober et al., 93; Delwel et al., 95; Leung et al., 98}.
The extracellular domains of α-integrin subunits are connected by a single spanning transmembrane domain to short, diverse cytoplasmic domains whose only conserved feature is a membrane-proximal KXGFF(K/R)R motif {Sastry and Horwitz, 1993}. The cytoplasmic domains have been implicated in the cell-type-specific modulation of integrin affinity states {Williams et al., 1994}.
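The membrane-proximal KXGFF(K/R)R motif described above can be written as a simple pattern: K, any residue, G, F, F, then K or R, then R. The short Python sketch below shows one way to scan a cytoplasmic tail for it. The function name and both test strings are hypothetical toy examples and are not real integrin sequences.

```python
import re

# KXGFF(K/R)R: K, any residue, G, F, F, then K or R, then R.
MOTIF = re.compile(r"K.GFF[KR]R")


def find_motif(sequence):
    """Return (1-based start, matched text) for each motif occurrence."""
    return [(m.start() + 1, m.group()) for m in MOTIF.finditer(sequence)]


# Hypothetical toy tails, NOT real integrin sequences.
print(find_motif("LLWKLGFFKRSS"))   # motif fully preserved
print(find_motif("LLWKLGFFRSMS"))   # motif only partly preserved: no match
```

A strict pattern of this kind reports nothing for a tail in which the motif is incompletely preserved, which is the situation the text describes for the subunit at hand.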
The polypeptide of the present invention has been putatively identified as a member of the integrin family and has been termed integrin alpha 11 subunit ("α11"). This identification has been made as a result of amino acid sequence homology to the human integrin alpha 1 subunit (See Genbank Accession No. gi|346210).
Figures 19A-F show the nucleotide (SEQ ID NO:17) and deduced amino acid sequence (SEQ ID NO:35) of α11. Predicted amino acids from about 1 to about 22 constitute the predicted signal peptide (amino acid residues from about 1 to about 22 in SEQ ID NO:35) and are represented by the underlined amino acid regions; amino acids from about 666 to about 682, and/or amino acids from about 1145 to about 1161 constitute the predicted transmembrane domains (amino acids from about 666 to about 682, and/or amino acids from about 1145 to about 1161 in SEQ ID NO:35) and are represented by the double underlined amino acids; and amino acids from about 64 to about 96 constitute the predicted immunoglobulin and major histocompatibility complex protein domain (amino acids from about 64 to about 96 in SEQ ID NO:35) and are represented by the bold amino acids.
Figure 20 shows the regions of similarity between the amino acid sequences of the integrin alpha 11 subunit (α11) protein (SEQ ID NO:35) and the human integrin alpha 1 subunit (SEQ ID NO:103).
Figure 21 shows an analysis of the integrin alpha 11 subunit (α11) amino acid sequence. Alpha, beta, turn and coil regions; hydrophilicity and hydrophobicity; amphipathic regions; flexible regions; antigenic index and surface probability are shown.
A polynucleotide encoding a polypeptide of the present invention is obtained from human ovary, small intestine, fetal heart, fetal brain, large intestine, osteoblasts, human trabecular bone cells, mesangial cells, adipocytes, osteosarcoma, chondrosarcoma, breast cancer cells, and bone marrow tissues and cells. The polynucleotide of this invention was discovered in a human osteoblast II cDNA library.
Its translation product has homology to the characteristic immunoglobulin and major histocompatibility complex protein domain of integrin family members. As shown in Figures 19A-F, α11 has transmembrane domains (the transmembrane domains comprise amino acids 666 - 682 and/or 1145 - 1161 of SEQ ID NO:35; which correspond to amino acids 666 - 682 and/or 1145 - 1161 of Figures 19A-F) with strong conservation between other members of the integrin family. The polynucleotide contains an open reading frame encoding the α11 polypeptide of 1189 amino acids. The present invention exhibits a high degree of homology at the amino acid level to the human integrin alpha 1 subunit (as shown in Figure 20).
Preferred polypeptides of the invention comprise the following amino acid sequence: TNGYQKTGDVYKCPVIHGNCTKLNLGRVTLSNV (SEQ ID NO:102).
Polynucleotides encoding these polypeptides are also provided.
The present invention provides isolated nucleic acid molecules comprising a polynucleotide encoding the α11 polypeptide having the amino acid sequence shown in Figures 19A-F (SEQ ID NO:35). The nucleotide sequence shown in Figures 19A-F (SEQ ID NO:35) was obtained by sequencing a cloned cDNA (HOHBY69), which was deposited on November 17 at the American Type Culture Collection, and given Accession Number 203484.
The present invention is further directed to fragments of the isolated nucleic acid molecules described herein. By a fragment of an isolated DNA molecule having the nucleotide sequence of the deposited cDNA or the nucleotide sequence shown in SEQ ID NO:17 is intended DNA fragments at least about 15 nt, and more preferably at least about 20 nt, still more preferably at least about 30 nt, and even more preferably, at least about 40 nt in length which are useful as diagnostic probes and primers as discussed herein. Of course, larger fragments 50-1500 nt in length are also useful according to the present invention, as are fragments corresponding to most, if not all, of the nucleotide sequence of the deposited cDNA or as shown in SEQ ID NO:17. By a fragment at least 20 nt in length, for example, is intended fragments which include 20 or more contiguous bases from the nucleotide sequence of the deposited cDNA or the nucleotide sequence as shown in SEQ ID NO:17. In this context "about" includes the particularly recited size, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini. Representative examples of α11 polynucleotide fragments of the invention include, for example, fragments that comprise, or alternatively, consist of, a sequence from about nucleotide 1 to about 50, from about 51 to about 100, from about 101 to about 150, from about 151 to about 200, from about 201 to about 250, from about 251 to about 300, from about 301 to about 350, from about 351 to about 400, from about 401 to about 450, from about 451 to about 500, from about 501 to about 550, from about 551 to about 600, from about 601 to about 650, from about 651 to about 700, from about 701 to about 750, from about 751 to about 800, from about 801 to about 850, from about 851 to about 900, from about 901 to about 950, from about 951 to about 1000, from about 1001 to about 1050, from about 1051 to about 1100, from about 1101 to about 1150, from about 1151 to about 1200, from about 1201 to about 1250, from about 1251 to about 1300, from about 1301 to about 1350, from about 1351 to about 1400, from about 1401 to about 1450, from about 1451 to about 1500, from about 1501 to about 1550, from about 1551 to about 1600, from about 1601 to about 1650, from about 1651 to about 1700, from about 1701 to about 1750, from about 1751 to about 1800, from about 1801 to about 1850, from about 1851 to about 1900, from about 1901 to about 1950, from about 1951 to about 2000, from about 2001 to about 2050, from about 2051 to about 2100, from about 2101 to about 2150, from about 2151 to about 2200, from about 2201 to about 2250, from about 2251 to about 2300, from about 2301 to about 2350, from about 2351 to about 2400, from about 2401 to about 2450, from about 2451 to about 2500, from about 2501 to about 2550, from about 2551 to about 2600, from about 2601 to about 2650, from about 2651 to about 2700, from about 2701 to about 2750, from about 2751 to about 2800, from about 2801 to about 2850, from about 2851 to about 2900, from about 2901 to about 2950, from about 2951 to about 3000, from about 3001 to about 3050, from about 3051 to about 3100, from about 3101 to about 3150, from about 3151 to about 3200, from about 3201 to about 3250, from about 3251 to about 3300, from about 3301 to about 3350, from about 3351 to about 3400, from about 3401 to about 3450, from about 3451 to about 3500, from about 3501 to about 3550, from about 3551 to about 3600, from about 3601 to about 3650, from about 3651 to about 3700, from about 3701 to about 3750, from about 3751 to about 3800, from about 3801 to about 3850, from about 3851 to about 3900, from about 3901 to about 3950, from about 3951 to about 4000, from about 4001 to about 4050, from about 4051 to about 4100, from about 4101 to about 4150, from about 4151 to about 4200, from about 4201 to about 4250, from about 4251 to about 4300, from about 4301 to about 4350, from about 4351 to about 4400, from about 4401 to about 4450, from about 4451 to about 4500, from about 4501 to about 4550, from about 4551 to about 4600, from about 4601 to about 4650, from about 4651 to about 4700, from about 4701 to about 4750, from about 4751 to about 4800, from about 4801 to about 4850, from about 4851 to about 4900, from about 4901 to about 4950, from about 4951 to about 4995, from about 1 to about 236, from about 144 to about 188, from about 231 to about 276 of SEQ ID NO:17, or the complementary strand thereto, or the cDNA contained in the deposited gene. In this context "about" includes the particularly recited ranges, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini.
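The representative fragments listed above are consecutive 50-nucleotide windows, with a shorter final window ending at position 4995, and "about" allows each terminus to shift by up to 5 nucleotides. The Python sketch below reproduces that tiling for illustration. The function names are hypothetical, and the total length of 4995 is taken from the last recited range rather than from the sequence listing itself.

```python
def fragment_windows(total_length, step=50):
    """Yield (start, end) 1-based inclusive windows tiling the sequence."""
    for start in range(1, total_length + 1, step):
        yield start, min(start + step - 1, total_length)


def about(position, tolerance=5):
    """Positions covered by 'about': the recited one, plus or minus 1 to 5 nt."""
    return range(position - tolerance, position + tolerance + 1)


# 4995 is assumed from the final recited range ("4951 to about 4995").
windows = list(fragment_windows(4995))
print(len(windows), windows[0], windows[-1])
print(list(about(50)))
```

The additional ranges recited at the end of the list (1 to 236, 144 to 188, and 231 to 276) do not follow the 50-nucleotide tiling and would need to be added separately.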
Preferred nucleic acid fragments of the present invention include nucleic acid molecules encoding a member selected from the group: a polypeptide comprising, or alternatively consisting of, any one of the transmembrane domains (amino acid residues from about 666 to about 682 and/or 1145 to about 1161 in Figures 19A-F (amino acids from about 666 to about 682 and/or 1145 to about 1161 in SEQ ID NO:35)), in addition to the immunoglobulin and major histocompatibility complex protein domain (amino acid residues from about 64 to about 96 in Figures 19A-F (amino acids from about 64 to about 96 in SEQ ID NO:35)). Since the locations of these domains have been predicted by computer analysis, one of ordinary skill would appreciate that the amino acid residues constituting these domains may vary slightly (e.g., by about 1 to 15 amino acid residues) depending on the criteria used to define each domain. In additional embodiments, the polynucleotides of the invention encode functional attributes of α11.
Preferred embodiments of the invention in this regard include fragments that comprise alpha-helix and alpha-helix forming regions ("alpha-regions"), beta-sheet and beta-sheet forming regions ("beta-regions"), turn and turn-forming regions ("turn-regions"), coil and coil-forming regions ("coil-regions"), hydrophilic regions, hydrophobic regions, alpha amphipathic regions, beta amphipathic regions, flexible regions, surface-forming regions and high antigenic index regions of the present invention.
The data representing the structural or functional attributes of α11 set forth in Figure 21 and/or Table VII, as described above, were generated using the various modules and algorithms of the DNA*STAR set on default parameters. In a preferred embodiment, the data presented in columns VIII, IX, XIII, and XIV of Table VII can be used to determine regions of α11 which exhibit a high degree of potential for antigenicity. Regions of high antigenicity are determined from the data presented in columns VIII, IX, XIII, and/or XIV by choosing values which represent regions of the polypeptide which are likely to be exposed on the surface of the polypeptide in an environment in which antigen recognition may occur in the process of initiation of an immune response.
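Per-residue profiles of the kind summarized in Figure 21 are typically computed with a sliding window over the sequence. As a rough illustration only, the Python sketch below computes a Kyte-Doolittle hydropathy profile and flags windows with a negative mean as hydrophilic. This is a stand-in and not the DNA*STAR implementation; the window size of 7, the threshold of 0, and the function names are assumptions, and the output does not correspond to any column of Table VII. The input is the SEQ ID NO:102 fragment, so positions are relative to that fragment.

```python
# Kyte-Doolittle hydropathy scale (positive values are hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Return (1-based window centre, mean hydropathy) for each full window."""
    half = window // 2
    profile = []
    for i in range(len(sequence) - window + 1):
        segment = sequence[i:i + window]
        mean = sum(KYTE_DOOLITTLE[aa] for aa in segment) / window
        profile.append((i + half + 1, round(mean, 2)))
    return profile


def hydrophilic_positions(profile, threshold=0.0):
    """Window centres whose mean hydropathy falls below the threshold."""
    return [pos for pos, score in profile if score < threshold]


fragment = "TNGYQKTGDVYKCPVIHGNCTKLNLGRVTLSNV"  # SEQ ID NO:102
profile = hydropathy_profile(fragment)
print(hydrophilic_positions(profile))
```

Hydrophilic stretches found this way are only candidates for surface exposure; the text combines several such indices before calling a region highly antigenic.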
Certain preferred regions in these regards are set out in Figure 21, but may, as shown in Table VII, be represented or identified by using tabular representations of the data presented in Figure 21. The DNA*STAR computer algorithm used to generate Figure 21 (set on the original default parameters) was used to present the data in Figure 21 in a tabular format (See Table VII). The tabular format of the data in Figure 21 is used to easily determine specific boundaries of a preferred region. The above-mentioned preferred regions set out in Figure 21 and in Table VII include, but are not limited to, regions of the aforementioned types identified by analysis of the amino acid sequence set out in Figures 19A-F. As set out in Figure 21 and in Table VII, such preferred regions include Garnier-Robson alpha-regions, beta-regions, turn-regions, and coil-regions, Chou-Fasman alpha-regions, beta-regions, and turn-regions, Kyte-Doolittle hydrophilic regions and Hopp-Woods hydrophobic regions, Eisenberg alpha- and beta-amphipathic regions, Karplus-Schulz flexible regions, Jameson-Wolf regions of high antigenic index and Emini surface-forming regions.
Even if deletion of one or more amino acids from the N-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities, ability to multimerize, etc.) may still be retained. For example, the ability of shortened α11 muteins to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptides generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the N-terminus. Whether a particular polypeptide lacking N-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art. It is not unlikely that an α11 mutein with a large number of deleted N-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six α11 amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the amino terminus of the α11 amino acid sequence shown in Figures 19A-F, up to the threonine residue at position number 1184, and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues n1-1189 of Figures 19A-F, where n1 is an integer from 2 to 1184 corresponding to the position of the amino acid residue in Figures 19A-F (which is identical to the sequence shown as SEQ ID NO:35). In another embodiment, N-terminal deletions of the α11 polypeptide can be described by the general formula n2-1189, where n2 is a number from 2 to 1184, corresponding to the position of amino acid identified in Figure 19. N-terminal deletions of the α11 polypeptide of the invention shown as SEQ ID NO:35 include polypeptides comprising the amino acid sequence of residues:
D-2 to E-1189; L-3 to E-1189; P-4 to E-1189; R-5 toE-1189; G-6 to E-1189; L-7 to E-1189; V-8 to E-1189; V-9 to E-1189;A-10 to E-1189; W-11 to E-1189; A-12 to E-I
189;
L-13 to E-1189;S-14 to E-1189: L-15 to E-1189; W-16 to E-1189; P-17 to E-1189;6-18 SUBSTITUTE SHEET (RULE 26) to E-1189; F-19 to E-i 189; T-20 to E-1189; D-21 to E-1189;T-22 to E-1189; F-23 to E-1189; N-24 to E-1189; M-25 to E-1189;D-26 to E-1189; T-27 to E-1189; R-28 to E-1189; K-29 to E-1189;P-30 to E-1189; R-31 to E-1189; V-32 to E-1189; I-33 to E-1189;P-34 to E-1189; G-35 to E-1189; S-36 to E-1189; R-37 to E-1189;T-38 to E-1189;
A-39 to E-1189; F-40 to E-1189; F-41 to E-1189;6-42 to E-1189; Y-43 to E-1189;

to E-1189; V-45 to E-1189;Q-46 to E-1189; Q-47 to E-1189; H-48 to E-1189; D-49 to E-1189;I-50 to E-1189; S-51 to E-1189; G-52 to E-1189; N-53 to E-1189;x-54 to E-1189;
W-55 to E-1189; L-56 to E-1189; V-57 to E-1189;V-58 to E-1189; G-59 to E-1189;
A-60 to E-1189; P-61 to E-1189;L-62 to E-1189; E-63 to E-1189; T-64 to E-1 I89;
N-65 to E-1189:6-66 to E-1189; Y-67 to E-1189; Q-68 to E-1189; K-69 to E-1189;T-70 to E-1189; G-71 to E-1 189; D-72 to E-1189; V-73 to E-1189;Y-74 to E-1189; K-75 to E-1189; C-76 to E-1189; P-77 to E-1189;V-78 to E-1189; I-79 to E-1189; H-80 to E-1189;
G-81 to E-1189;N-82 to E-1189; C-83 to E-1189; T-84 to E-1189; K-8S to E-1189;L-86 to E-1189; N-87 to E-1189; L-88 to E-1189; G-89 to E-1189;8-90 to E-1189; V-91 to E-1189; T-92 to E-1189; L-93 to E-1189;5-94 to E-1189; N-95 to E-1189; V-96 to E-1189;
S-97 to E-1189;E-98 to E-1189; R-99 to E-1189; K-100 to E-1189; D-101 to E-1189;N-102 to E-1189; M-103 to E-I 189; R-104 to E-1189; L-105 toE-1189; G-106 to E-1189;
L-107 to E-1189; S-108 to E-1189; L-109to E-1189; A-110 to E-1189; T-111 to E-1189;
N-112 to E-1189;P-113 to E-1189; K-114 to E-1189; D-115 to E-1189; N-116 toE-1189;
S-117 to E-1189; F-118 to E-1189; L-119 to E-1189; A-120to E-1189; C-121 to E-1189;
S-122 to E-1189; P-123 to E-1189;L-124 to E-1189; W-125 to E-1189; S-126 to E-1189;
H-127 toE-1189; E-128 to E-1189; C-129 to E-1189; G-130 to E-1189; S-131to E-1189;
S-132 to E-1189; Y-133 to E-1189; Y-134 to E-1189;T-135 to E-1189; T-136 to E-1189;
G-137 to E-1189; M-138 toE-1189; C-139 to E-1189; S-140 to E-1189; R-141 to E-1189; V-142to E-1189; N-143 to E-1189; S-144 to E-1189; N-145 to E-1189;F-146 to E-1189; R-147 to E-1189; F-148 to E-1189; S-149 toE-1189; K-150 to E-1189; T-151 to E-1189; V-152 to E-1189; A-153to E-1189; P-154 to E-1189; A-155 to E-1189; L-156 to SUBSTITUTE SHEET (RULE 26) E-1189;Q-157 to E-1189; R-158 to E-1189; C-159 to E-1189; Q-160 toE-1189; T-161 to E-1189; Y-162 to E-1189; M-163 to E-1189; D-164to E-1189; I-165 to E-1189; V-166 to E-1189; I-167 to E-1189; V-168to E-1189; L-169 to E-I 189; D-170 to E-1189; G-171 to E-1189;S-172 to E-1189; N-173 to E-1189; S-174 to E-1189; I-175 toE-1189; Y-176 to E-1189; P-177 to E-1189; W-178 to E-1189; V-179to E-1189; E-180 to E-1189; V-to E-1189; Q-182 to E-I 189;H-183 to E-1189; F-184 to E-1189; L-18_5 to E-1189; I-186 toE-1189; N-187 to E-1189; I-188 to E-1189; L-189 to E-1189; K-190to E-1189; K-to E-1189; F-192 to E-1189; Y-193 to E-1189;1-194 to E-1189; G-195 to E-1189;

to E-1189; G-197 toE-1189; Q-198 to E-1189; I-199 to E-1189; Q-200 to E-1189;
V-201 to E-1189; G-202 to E-1189; V-203 to E-1189; V-204 to E-1189; Q-205 to E-1189; Y-206 to E-1189; G-207 to E-1189; E-208 to E-1189; D-209 to E-1189; V-210 to E-1189; V-211 to E-1189; H-212 to E-1189; E-213 to E-1189; F-214 to E-1189; H-215 to E-1189; L-216 to E-1189; N-217 to E-1189; D-218 to E-1189; Y-219 to E-1189; R-220 to E-1189; S-221 to E-1189; V-222 to E-1189; K-223 to E-1189; D-224 to E-1189; V-225 to E-1189; V-226 to E-1189; E-227 to E-1189; A-228 to E-1189; A-229 to E-1189; S-230 to E-1189; H-231 to E-1189; I-232 to E-1189; E-233 to E-1189; Q-234 to E-1189; R-235 to E-1189; G-236 to E-1189; G-237 to E-1189; T-238 to E-1189; E-239 to E-1189; T-240 to E-1189; R-241 to E-1189; T-242 to E-1189; A-243 to E-1189; F-244 to E-1189;
G-245 to E-1189; I-246 to E-1189; E-247 to E-1189; F-248 to E-1189; A-249 to E-1189; R-250 to E-1189; S-251 to E-1189; E-252 to E-1189; A-253 to E-1189; F-254 to E-1189; Q-255 to E-1189; K-256 to E-1189; G-257 to E-1189; G-258 to E-1189; R-259 to E-1189; K-260 to E-1189; G-261 to E-1189; A-262 to E-1189; K-263 to E-1189; K-264 to E-1189; V-265 to E-1189; M-266 to E-1189; I-267 to E-1189; V-268 to E-1189; I-269 to E-1189; T-270 to E-1189; D-271 to E-1189; G-272 to E-1189; E-273 to E-1189; S-274 to E-1189; H-275 to E-1189; D-276 to E-1189; S-277 to E-1189; P-278 to E-1189; D-279 to E-1189; L-280 to E-1189; E-281 to E-1189; K-282 to E-1189; V-283 to E-1189; I-284 to E-1189; Q-285 to E-1189; Q-286 to E-1189; S-287 to E-1189; E-288 to E-1189;
R-289 to E-1189; D-290 to E-1189; N-291 to E-1189; V-292 to E-1189; T-293 to E-1189; R-294 to E-1189; Y-295 to E-1189; A-296 to E-1189; V-297 to E-1189; A-298 to E-1189; V-299 to E-1189; L-300 to E-1189; G-301 to E-1189; Y-302 to E-1189; Y-303 to E-1189; N-304 to E-1189; R-305 to E-1189; R-306 to E-1189; G-307 to E-1189; I-308 to E-1189; N-309 to E-1189; P-310 to E-1189; E-311 to E-1189; T-312 to E-1189; F-313 to E-1189; L-314 to E-1189; N-315 to E-1189; E-316 to E-1189; I-317 to E-1189; K-318 to E-1189; Y-319 to E-1189; I-320 to E-1189; A-321 to E-1189; S-322 to E-1189; D-323 to E-1189; P-324 to E-1189; D-325 to E-1189; D-326 to E-1189; K-327 to E-1189;
H-328 to E-1189; F-329 to E-1189; F-330 to E-1189; N-331 to E-1189; V-332 to E-1189; T-333 to E-1189; D-334 to E-1189; E-335 to E-1189; A-336 to E-1189; A-337 to E-1189; L-338 to E-1189; K-339 to E-1189; D-340 to E-1189; I-341 to E-1189; V-342 to E-1189; D-343 to E-1189; A-344 to E-1189; L-345 to E-1189; G-346 to E-1189; D-347 to E-1189; R-348 to E-1189; I-349 to E-1189; F-350 to E-1189; S-351 to E-1189; L-352 to E-1189; E-353 to E-1189; G-354 to E-1189; T-355 to E-1189; N-356 to E-1189; K-357 to E-1189; N-358 to E-1189; E-359 to E-1189; T-360 to E-1189; S-361 to E-1189; F-362 to E-1189; G-363 to E-1189; L-364 to E-1189; E-365 to E-1189; M-366 to E-1189; S-367 to E-1189; Q-368 to E-1189; T-369 to E-1189; G-370 to E-1189; F-371 to E-1189; S-372 to E-1189; S-373 to E-1189; H-374 to E-1189; V-375 to E-1189; V-376 to E-1189;
E-377 to E-1189; D-378 to E-1189; G-379 to E-1189; V-380 to E-1189; L-381 to E-1189; L-382 to E-1189; G-383 to E-1189; A-384 to E-1189; V-385 to E-1189; G-386 to E-1189; A-387 to E-1189; Y-388 to E-1189; D-389 to E-1189; W-390 to E-1189; N-391 to E-1189; G-392 to E-1189; A-393 to E-1189; V-394 to E-1189; L-395 to E-1189; K-396 to E-1189; E-397 to E-1189; T-398 to E-1189; S-399 to E-1189; A-400 to E-1189; G-401 to E-1189; K-402 to E-1189; V-403 to E-1189; I-404 to E-1189; P-405 to E-1189; L-406 to E-1189; R-407 to E-1189; E-408 to E-1189; S-409 to E-1189; Y-410 to E-1189; L-411 to E-1189; K-412 to E-1189; E-413 to E-1189; F-414 to E-1189; P-415 to E-1189; E-416 to E-1189; E-417 to E-1189; L-418 to E-1189; K-419 to E-1189; N-420 to E-1189; H-421 to E-1189; G-422 to E-1189; A-423 to E-1189; Y-424 to E-1189; L-425 to E-1189; G-426 to E-1189; Y-427 to E-1189; T-428 to E-1189; V-429 to E-1189; T-430 to E-1189;
S-431 to E-1189; V-432to E-1189; V-433 to E-1189; S-434 to E-1189; S-435 to E-1189;8-436 to E-1189; Q-437 to E-1189; G-438 to E-I 189; R-439 toE-1189; V-440 to E-1189; Y-441 to E-1189; V-442 to E-1189; A-443to E-1189; G-444 to E-1189; A-to E-1189; P-446 to E-1189;8-447 to E-1189; F-448 to E-1189; N-449 to E-1189;

toE-1189; T-451 to E-1189; G-4_52 to E-1189; K-453 to E-1189; V-454to E-1189;

to E-1189; L-456 to E-1189; F-457 to E-1189; T-458to E-1189; M-459 to E-1189;

to E-1189; N-461 to E-1189;N-462 to E-1189; R-463 to E-1189; S-464 to E-1189;

toE-1189; T-466 to E-1189; I-467 to E-1189; H-468 to E-1189; Q-469to E-1189; A-to E-1189; M-471 to E-1189; R-472 to E-1189;G-473 to E-1189; Q-474 to E-1189;
Q-475 to E-1189; I-476 toE-1189; G-477 to E-1189; S-478 to E-1189; Y-479 to E-1189; F-480to E-1189; G-481 to E-1189; S-482 to E-1189; E-483 to E-1189;1-484 to E-1189; T-485 to E-1189; S-486 to E-1189; V-487 toE-1189; D-488 to E-1189; I-489 to E-1189; D-490 to E-1189; G-491to E-1189; D-492 to E-1189; G-493 to E-1189; V-494 to E-I 189;T-495 to E-1189; D-496 to E-1189; V-497 to E-1189; L-498 toE-1189; L-499 to E-1189; V-500 to E-1189; G-501 to E-1189; A-502to E-1189; P-503 to E-1189; M-504 to E-1189; Y-505 to E-1189;F-506 to E-I 189; N-507 to E-.1189; E-508 to E-1189; G-_509 toE-1189; R-510 to E-1189; E-511 to E-1189; R-512 to E-1189; G-513to E-1189; K-to E-1189; V-515 to E-1189; Y-516 to E-1189;V-517 to E-1189; Y-518 to E-1189;
E-519 to E-I 189; L-520 toE-1189; R-521 to E-1189; Q-522 to E-1189; N-523 to E-1189;
R-524to E-1189; F-525 to E-1189; V-526 to E-1189; Y-527 to E-1189;N-528 to E-1189;
G-529 to E-1189; T-530 to E-1189; L-531 toE-1189; K-532 to E-1189; D-533 to E-1189;
S-534 to E-1189; H-535to E-1189; S-536 to E-1189; Y-537 to E-1189; Q-538 to E-1189;N-539 to E-1189; A-540 to E-1189; R-541 to E-1189; F-542 toE-1189; G-543 to E-1189; S-544 to E-1189; S-545 to E-1189; I-546to E-1189; A-547 to E-1189; S-548 to E-1189; V-549 to E-1189;8-550 to E-1189; D-551 to E-1189; L-552 to E-1189; N-553 SUBSTITUTE SHEET (RULE 26}

toE-1189; Q-554 to E-1189; D-555 to E-1189; S-556 to E-1189; Y-557to E-1189; N-to E-1189; D-559 to E-1189; V-560 to E-1189;V-561 to E-1 I89; V-562 to E-I
189; G-563 to E-1189; A-564 toE-1189; P-565 to E-1 I89; L-566 to E-1189; E-567 to E-1189;
D-568to E-1189; N-569 to E-1189; H-570 to E-1189; A-571 to E-1 I89;G-572 to E-1189;
A-573 to E-1189; I-574 to E-1189; Y-_57.5 toE-1189; I-576 to E-1189; F-577 to E-1189;
H-578 to E-I 189; G-579to E-1189; F-580 to E-1189; R-581 to E-1189; G-582 to E
1189;5-583 to E-I 189; I-584 to E-1189; L-.585 to E-1189; K-586 toE-1189; T-_587 to E-1189; P-588 to E-1189: K-589 to E-1189; Q-590to E-1189; R-591 to E-1189; I-592 to E-1189; T-593 to E-1189;A-594 to E-1189; S-595 to E-1189; E-596 to E-1189; L-597 toE-1189; A-598 to E-1189; T-599 to E-1189; G-600 to E-1189; L-601to E-1189: Q-602 to E-1189; Y-603 to E-i 189; F-604 to E-1189;6-605 to E-1189: C-606 to E-1189; S-607 to E-1189; I-608 toE-1189; H-609 to E-1189; G-6I0 to E-1189; Q-611 to E-1189; L-612to E-1189; D-613 to E-1189; L-614 to E-1189; N-615 to E-1189;E-616 to E-1189; D-6I7 to E-1189; G-618 to E-1189; L-619 toE-1189; I-620 to E-1189; D-621 to E-1189; L-622 to E-1189; A-623to E-1189; V-624 to E-1189; G-625 to E-1189; A-626 to E-1189;L-627 to E-1189; G-628 to E-1189; N-629 to E-1189; A-630 toE-1189; V-631 to E-1189; I-632 to E-1189; L-633 to E-I 189; W-634to E-1189; S-635 to E-1189; R-636 to E-1189; P-637 to E-1189;V-638 to E-1189; V-639 to E-1189; Q-640 to E-1189; I-641 toE-1189; N-642 to E-1189; A-643 to E-1 I89; S-644 to E-1189; L-645to E-1189; H-646 to E-1189; F-647 to E-1189; E-648 to E-1189;P-649 to E-1189; S-650 to E-1189; K-651 to E-1189; I-toE-1189; N-653 to E-1189;1-654 to E-1189; F-655 to E-1189; H-656to E-1189; R-to E-1189; D-658 to E-1189; C-659 to E-1189;x-660 to E-1189; R-661 to E-1189;

to E-1189; G-663 toE-1189; R-664 to E-1189; D-665 to E-1189; A-666 to E-1189;
T-667to E-1189; C-668 to E-1189; L-669 to E-1189; A-670 to E-1189;A-671 to E-1189; F-672 to E-1189; L-673 to E-1189; C-674 toE-1189; F-675 to E-I 189; T-676 to E-1189; P-677 to E-1189; I-678 toE-1189; F-679 to E-1189; L-680 to E-1189; A-681 to E-1189; P-682to E-1189; H-683 to E-1189; F-684 to E-1189;~Q-685 to E-1189;T-686 to E-1189; T-SUBSTITUTE SHEET (RULE 26) 687 to E-1189; T-688 to E-1189; V-689 toE-1189; G-690 to E-1189; I-691 to E-1189; 8-692 to E-1189; Y-693to E-1189; N-694 to E-1189; A-695 to E-1189; T-696 to E-1189;M-697 to E-1189; D-698 to E-1189; E-699 to E-1 i89; R-700 toE-1189; R-701 to E-1189; Y-702 to E-1189; T-703 to E-1189; P-704to E-1189; R-705 to E-1189; A-706 to E-1189; H-707 to E-1189;L-708 to E-1189; D-709 to E-1189; E-710 to E-1189; G-toE-1189; G-712 to E-1189; D-713 to E-1189; R-714 to E-1189; F-715to E-1189; T-to E-1189; N-717 to E-1189; R-718 to E-1189;A-719 to E-1189; V-720 to E-1189;
L-721 to E- I 189; L-722 toE- I 189; S-723 to E-1 I 89; S-724 to E-1 I 89; G-725 to E-1189; Q-726to E-1189; E-727 to E-1189; L-728 to E-1189; C-729 to E-1189;E-730 to E-1189; 8-731 to E-1189; I-732 to E-i 189; N-733 toE-1189; F-734 to E-I 189; H-735 to E-1189; V-736 to E-1189; L-737to E-1189; D-738 to E-1189; T-739 to E-1189; A-740 to E-1189:D-741 to E-1189; Y-742 to E-1189; V-743 to E-1189; K-744 toE-1189; P-745 to E-1189;
V-746 to E-1189; T-747 to E-1189; F-748to E-1189; S-749 to E-1189; V-750 to E-1189;
E-751 to E-1189;Y-752 to E-1189; S-753 to E-I 189; L-7.54 to E-I 189; E-755 toE-1189;
D-756 to E-1 I89; P-757 to E-1189; D-758 to E-1189; H-759to E-1189; G-760 to E-I 189; P-761 to E-1 I89; M-762 to E-1189;L-763 to E-1189; D-764 to E-1189; D-765 to E-1189; G-766 toE-1189; W-767 to E-1189; P-768 to E-1189; T-769 to E-1189; T-770to E-1189; L-771 to E-1189; R-772 to E-1189; V-773 to E-1189;5-774 to E-1189; V-775 to E-1189; P-776 to E-1189; F-777 toE-1189; W-778 to E-1189; N-779 to E-1189; G-to E-1189; C-781to E-1189; N-782 to E-1189; E-783 to E-1189; D-784 to E-1 i89;E-785 to E-1189; H-786 to E-1189; C-787 to E-1189; V-788 toE-1189; P-789 to E-1189;

D-790 to E-1189; L-791 to E-1189; V-792 to E-1189; L-793 to E-1189; D-794 to E-1189; A-795 to E-1189; R-796 to E-1189; S-797 to E-1189; D-798 to E-1189; L-799 to E-1189; P-800 to E-1189; T-801 to E-1189; A-802 to E-1189; M-803 to E-1189; E-804 to E-1189;
Y-805 to E-1189; C-806 to E-1189; Q-807 to E-1189; R-808 to E-1189; V-809 to E-1189;
L-810 to E-1189; R-811 to E-1189; K-812 to E-1189; P-813 to E-1189; A-814 to E-1189;
Q-815 to E-1189; D-816 to E-1189; C-817 to E-1189; S-818 to E-1189; A-819 to E-1189;
Y-820 to E-1189; T-821 to E-1189; L-822 to E-1189; S-823 to E-1189; F-824 to E-1189;
D-825 to E-1189; T-826 to E-1189; T-827 to E-1189; V-828 to E-1189; F-829 to E-1189;
I-830 to E-1189; I-831 to E-1189; E-832 to E-1189; S-833 to E-1189; T-834 to E-1189;
R-835 to E-I 189; Q-836 toE-1189; R-837 to E-1189; V-838 to E-1189; A-839 to E-1189; V-840to E-1189; E-841 to E-1189; A-842 to E-I 189; T-843 to E-1189;L-844 to E-1189; E-845 to E-1189; N-846 to E-1189; R-847 toE-I 189; G-848 to E-1189; E-849 to E-1189; N-850 to E-I 189; A-851to E-I I$9; Y-852 to E-1189; S-853 to E-I 189;
T-854 to E-1189;V-855 to E-1189; L-856 to E-1189; N-$57 to E-1189; I-858 toE-I 189; S-859 to E-1189; Q-860 to E-1189; S-861 to E-I 189; A-862to E-I 189; N-863 to E-1189; L-$64 to E-1189; Q-865 to E-1189;F-866 to E-1189; A-867 to E-1189; S-868 to E-1189; L-toE-1189; I-870 to E-1189; Q-871 to E-1189; K-872 to E-1189; E-873to E-1189; D-to E-1189; S-87_5 to E-1189; D-876 to E-1189;6-877 to E-I 189; S-878 to E-1189; I-879 to E-1189; E-880 toE-1189; C-881 to E-1189; V-882 to E-I 189; N-883 to E-1189;
E-884to E-1189; E-885 to E-1189; R-886 to E-11$9: R-887 to E-1189;L-888 to E-1189; Q-889 to E-11$9; K-890 to E-1189; Q-$91 toE-1189; V-892 to E-1189; C-893 to E-1189;
N-894 to E-1189; V-895to E-1189; S-$96 to E-1189; Y-897 to E-1189; P-898 to E-1189;F-899 to E-1189; F-900 to E-1189; R-901 to E-1189; A-902 toE-1189; K-903 to E-1189; A-904 to E-1189; K-905 to E-1189; V-906to E-I 189; A-907 to E-1189; F-908 to E-1189; R-909 to E-1189;L-910 to E-1189; D-911 to E-1189; F-912 to E-1189; E-toE-1189; F-914 to E-1189; S-915 to E-1189; K-916 to E-I 189; S-917to E-1189;
I-918 to E-1189; F-919 to E-1189; L-920 to E-1189; H-921 to E-1189; H-922 to E-1189; L-923 to E-1189; E-924 to E-1189; I-925 to E-1189; E-926 to E-1189; L-927 to E-1189;
A-928to E-1189; A-929 to E-1189; G-930 to E-1189; S-931 to E-1189;D-932 to E-1189; S-933 to E-1189; N-934 to E-1189; E-935 toE-1189; R-936 to E-1189; D-937 to E-1189;
S-938 to E-I I$9; T-939to E-1189; K-940 to E-1189; E-941 to E-1189; D-942 to E-1189;N-943 to E-1189; V-944 to E-1189; A-945 to E-1189; P-946 toE-1189; L-947 to E-1189; R-948 to E-1189; F-949 to E-1189; H-950to E-1189; L-951 to E-1189; K-952 to SUBSTITUTE SHEET (RULE 26) E-1189; Y-953 to E-1189;E-954 to E-1189; A-955 to E-1189; D-956 to E-1189; V-toE-1189; L-958 to E-1189; F-959 to E-1189; T-960 to E-1189; R-961to E-1189; S-to E-1189; S-963 to E-1189; S-964 to E-1189;L-965 to E-1189; S-966 to E-1189;

to E-1189; Y-968 toE-1189; E-969 to E-1189; V-970 to E-1189; K-971 to E-1189;
L-972to E-1189; N-973 to E-1189; S-974 to E-1189; S-97.5 to E-1189;L-976 to E-1189; E-977 to E-1189; R-978 to E-1189; Y-979 toE-1189; D-980 to E-1189; G-981 to E-1189; 1-982 to E-1189; G-983to E-1189; P-984 to E-1189; P-985 to E-1189; F-986 to E-1189;5-987 to E-1189; C-988 to E-1189; I-989 to E-1189; F-990 toE-I 189; R-991 to E-1189; I-992 to E-1189; Q-993 to E-1189; N-994to E-1189; L-99_5 to E-1189; G-996 to E-1189;
L-997 to E-1189;F-998 to E-1189; P-999 to E-1189; I-1000 to E-1189; H-1001 toE-1189; G-1002 to E-1189; 1-1003 to E-1189; M-1004 to E-1189;M-1005 to E-1189; K-1006 to E-1189; I-1007 to E-1189; T-1008 toE-1189; I-1009 to E-1189; P-1 O 10 to E-1189; I-1011 to E-1189;A-1012 to E-1189; T-1013 to E-1189; R-1014 to E-1189; S-toE-1189; G-1016 to E-1189; N-1017 to E-1189; R-1018 to E-1189;L-1019 to E-1189;
L-1020 to E-1189; K-1021 to E-1189; L-1022 toE-1189; R-1023 to E-1 i89; D-1024 to E-1189; F-1025 to E-1189;L-1026 to E-1189; T-1027 to E-1189; D-1028 to E-1189;
E-l029 toE-i 189; V-1030 to E-1189; A-1031 to E-1189; N-1032 to E-1189;T-1033 to E-1189; S-1034 to E-1189; C-1035 to E-1189; N-1036 toE-1189; I-1037 to E-1189; W
1038 to E-1189; G-1039 to E-1189;N-1040 to E-1189; S-1041 to E-1189; T-1042 to E
1189; E-1043 toE-1189; Y-1044 to E-1189; R-1045 to E-1189; P-1046 to E-1189;T-1047 to E-1189; P-1048 to E-1189; V-1049 to E-1189; E-1050 toE-1189; E-1051 to E-1189; D-1052 to E-1189; L-1053 to E-1189;8-1054 to E-1189; R-1055 to E-1189; A-1056 to E-1189; P-1057 toE-1189; Q-1058 to E-1189; L-1059 to E-1189; N-1060 to E-1189;H-1061 to E-1189; S-1062 to E-1189; N-1063 to E-1189; S-1064 toE-1189; D-1065 to E-1189; V-1066 to E-1189; V-1067 to E-1189;5-1068 to E-1189; I-1069 to E-i 189; N-1070 to E-1189; C-1071 toE-1189; N-1072 to E-1189; I-1073 to E-1189;

1074 to E-1189;L-107_5 to E-1189; V-1076 to E-1189; P-1077 to E-1189; N-1078 toE-SUBSTITUTE SHEET (R.ULE 26) 1189; Q-1079 to E-1189; E-1080 to E-1189; I-1081 to E-I189;N-1082 to E-1189; F-1083 to E-1189; H-1084 to E-1189; L-1085 toE-1 i89; L-1086 to E-1189; G-1087 to E-1189; N-1088 to E-1189;L-1089 to E-1189; W-1090 to E-1189; L-1091 to E-1189; 8-1092 toE-1189; S-1093 to E-1189; L-1094 to E-1189; K-1095 to E-1189;A-1096 to E-S 1189; L-1097 to E-1189; K-1098 to E-1189; Y-1099 toE-1189; K-l 100 to E-1189; 5-1101 to E-1189; M-1102 to E-1189;x-1103 to E-1189; I--1104 to E-1189; M-1 lOS
to E-1189; V-1106 toE-1189; N-1107 to E-1189; A-1108 to E-1189; A-1109 to E-1189;L-1110 to E-1 I 89; Q- I 1 I 1 to E-1189; R-1112 to E-1189; Q-11 I 3 toE-1189; F-1114 to E-1189; H-I I1S to E-1189; S-1116 to E-1189;P-1117 to E-1189; F-1118 to E-1189;

to E-1189; F-1120 toE-1189; K-1121 to E-1189; E-1122 to E-1189; E-1123 to E-1189;D-1124 to E-1189; P-1125 to E-1189; S-1126 to E-1189; R-1127 toE-1189; Q-1128 to E-1189; I-1129 to E-1189; V-1130 to E-1189;F-1131 to E-1189; E-1132 to E-1189; I-to E-1189; S-1134 toE-1189; K-1135 to E-1189; Q-1136 to E-I 189; E-1137 to E-1189;D-1138 to E-1189; W-1139 to E-1189; Q-1140 to E-1189; V-1141 toE-1189; P-1142 to E-1189; I-1143 to E-1189; W-I 144 to E-1189;I-1145 to E-1189; I-1146 to E-1189; V-1147 to E-1189; G-1148 toE-1189; S-1149 to E-1189; T-1150 to E-1189; L-1151 to E-1189;6-1152 to E-1189; G-1153 to E-1189; L-I IS4 to E-1189; L-1155 toE-1189; L-1156 to E-1189; L-1157 to E-1189; A-1158 to E-1189;L-1159 to E-1189; L-1160 to E-1189; V-1161 to E-1189; L-1162 toE-1189; A-i 163 to E-1189; L-1164 to E-1189; W-1165 to E-1189;x-1166 to E-1189; L-1167 to E-1189; G-1168 to E-1189; F-1169 toE-1189; F-1170 to E-1189; R-1171 to E-1189; S-1172 to E-1189;A-1173 to E-1189; R-1174 to E-1189; R-1175 to E-1189; R-1176 toE-1189; R-1177 to E-I 189;
E-1178 to E-1189; P-1179 to E-1189; G-1180 to E-1189; L-1181 to E-1189; D-1182 to E-1189; P-1183 to E-1189; T-1184 to E-1189 of SEQ ID NO:35. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
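The N-terminal deletion fragments listed above follow a regular pattern: each fragment begins at a successive residue and always ends at the C-terminal residue, and the list stops once the fragment is six residues long. As an illustration only, the following minimal Python sketch generates labels of the same form for a short made-up sequence. The function name, the `min_length` parameter, and the ten-residue toy sequence are assumptions introduced for this example; they are not part of the specification and the toy sequence is not SEQ ID NO:35.

```python
def n_terminal_deletions(sequence, min_length=6):
    """Return (label, fragment) pairs for N-terminal deletion fragments.

    Each fragment starts at residue n (n >= 2, 1-based numbering) and runs to
    the C-terminal residue. Fragments shorter than min_length are not listed.
    This is a hypothetical helper written for illustration.
    """
    total = len(sequence)
    c_term_label = f"{sequence[-1]}-{total}"
    fragments = []
    # The last permitted start leaves exactly min_length residues.
    for start in range(2, total - min_length + 2):
        label = f"{sequence[start - 1]}-{start} to {c_term_label}"
        fragments.append((label, sequence[start - 1:]))
    return fragments


# Toy ten-residue sequence (an assumption, not taken from the patent).
toy_sequence = "MKTAYIAKQR"
for label, fragment in n_terminal_deletions(toy_sequence):
    print(label, fragment)
```

Applied to a 1189-residue sequence ending in glutamate, the same loop would yield labels running from the residue at position 2 through "T-1184 to E-1189", which matches the form of the list above.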
Also as mentioned above, even if deletion of one or more amino acids from the C-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities such as the ability to elicit mitogenic activity or induce differentiation of normal or malignant cells, the ability to multimerize, etc.) may still be retained. For example, the ability of the shortened a11 mutein to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptide generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the C-terminus. Whether a particular polypeptide lacking C-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art. It is not unlikely that an a11 mutein with a large number of deleted C-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six a11 amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the carboxy terminus of the amino acid sequence of the a11 polypeptide shown in Figures 19A-F, up to the glycine residue at position number 6, and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues 1-m1 of Figures 19A-F, where m1 is an integer from 6 to 1189 corresponding to the position of the amino acid residue in Figures 19A-F. Moreover, the invention provides polynucleotides encoding polypeptides comprising, or alternatively consisting of, the amino acid sequence of C-terminal deletions of the a11 polypeptide of the invention shown as SEQ ID NO:35, which include polypeptides comprising the amino acid sequence of residues:
M-1 to L-1188; M-1 to V-1187; M-1 to K-1186; M-1 to P-1185; M-1 to T-1184; M-1 to P-1183;
M-1 to D-1182; M-1 to L-1181; M-1 to G-1180; M-1 to P-1179; M-1 to E-1178; M-1 to R-1177; M-1 to R-1176; M-1 to R-1175; M-1 to R-1174; M-1 to A-1173; M-1 to S-1172;
M-1 to R-1171; M-1 to F-1170; M-1 to F-1169; M-1 to G-1168; M-1 to L-1167; M-1 to K-1166; M-1 to W-1165; M-1 to L-1164; M-1 to A-1163; M-1 to L-1162; M-1 to V-1161; M-1 to L-1160; M-1 to L-1159; M-1 to A-1158; M-1 to L-1157; M-1 to L-1156;
M-1 to L-1155; M-1 to L-1154; M-1 to G-1153; M-1 to G-1152; M-1 to L-1151; M-1 to T-1150; M-1 to S-1149; M-1 to G-1148; M-1 to V-1147; M-1 to I-1146; M-1 to I-1145;
M-1 to W-1144; M-1 to I-1143; M-1 to P-1142; M-1 to V-1141; M-1 to Q-1140; M-1 toW-1139; M-1 to D-1138; M-1 to E-1137; M-1 to Q-1136; M-1 to K-1135; M-1 to S-1134; M-I to I-1133; M-lto E-1132; M-I to F-1131; M-1 to V-1130; M-1 to I-1129; M-1 to Q-1128; M-1 to R-1127; M-1 to S-1126;M-I to P-I 125; M-1 to D-1124; M-1 to E-1123; M-1 to E-1122; M-1 to R-1121; M-1 to F-1120; M-1 toI-1119; M-1 to F-1118; M-1 to P-1117; M-1 to S-1116; M-I to H-1115; M-1 to F-1114; M-1 to Q-11 i3; M-lto R-1112; M-1 to Q-1111; M-1 to L-11 I0; M-1 to A-1109; M-I to A-1108; M-I to N-1107;
M-1 to V-1106;M-1 to M-1105; M-1 to I-1104; M-1 to K-1103; M-1 to M-1102; M-1 to S-1101; M-1 to K-1100; M-1 toY-1099; M-1 to K-1098; M-1 to L-1097; M-1 to A-1096;
M-1 to K-1095; M-1 to L-1094; M-1 to S-1093; M-lto R-1092; M-1 to L-1091; M-1 to W-1090; M-1 to L-1089; M-1 to N-1088; M-1 to G-1087; M-1 to L-1086;M-1 to L-1085; M-1 to H-1084; M-1 to F-1083; M-1 to N-1082; M-1 to I-1081; M-1 to E-1080;
M-1 toQ-1079; M-1 to N-1078; M-1 to P-1077; M-I to V-1076; M-1 to L-1075; M-1 to R-1074; M-1 to I-1073; M-lto N-1072; M-1 to C-1071; M-1 to N-1070; M-1 to I-1069;
M-1 to S-1068; M-I to V-1067; M-1 to V-1066;M-1 to D-1065; M-1 to S-1064; M-1 to N-1063; M-1 to S-1062; M-I to H-1061; M-1 to N-1060; M-1 toL-1059; M-1 to Q-1058;
M-1 to P-1057; M-1 to A-1056; M-1 to R-1055; M-1 to R-1054; M-1 to L-1053; M-lto D-1052; M-1 to E-1051; M-1 to E-1050; M-1 to V-1049; M-1 to P-1048; M-1 to T-1047;
M-1 to P-1046;M-1 to R-1045; M-1 to Y-1044; M-1 to E-1043; M-1 to T-1042; M-1 to S-1041; M-1 to N-1040; M-1 toG-1039; M-1 to W-1038; M-1 to I-1037; M-1 to N-1036;
M-1 to C-1035; M-1 to S-1034; M-1 to T-1033; M-lto N-1032; M-1 to A-1031; M-1 to V-1030; M-1 to E-1029; M-1 to D-1028; M-1 to T-1027; M-1 to L-1026;M-1 to F-1025;
M-1 to D-1024; M-1 to R-1023; M-1 to L-1022; M-1 to K-1021; M-1 to L-1020; M-I
toL-1019; M-1 to R-1018; M-1 to N-1017; M-1 to G-1016; M-1 to S-1015; M-i to R-SUBSTITUTE SHEET (RULE 26) 1014; M-1 to T-1013; M-Ito A-1012; M-1 to I-1011; M-1 to P-1010; M-1 to I-1009; M-1 to T-1008; M-1 to I-1007; M-I to K-1006; M-lto M-1005; M-1 to M-1004; M-1 to I-1003; M-1 to G-1002; M-1 to H-1001; M-1 to I-1000; M-1 to P-999;M-1 to F-998;

M-1 to L-997; M-1 to G-996; M-1 to L-995; M-1 to N-994; M-1 to Q-993; M-1 to I-992; M-1 to R-991; M-1 to F-990; M-1 to I-989; M-1 to C-988; M-1 to S-987; M-1 to F-986;
M-1 to P-985; M-1 to P-984; M-1 to G-983; M-1 to I-982; M-1 to G-981; M-1 to D-980;
M-1 to Y-979; M-1 to R-978; M-1 to E-977; M-1 to L-976; M-1 to S-975; M-1 to S-974;
M-1 to N-973; M-1 to L-972; M-1 to K-971; M-1 to V-970; M-1 to E-969; M-1 to Y-968;
M-1 to H-967; M-1 to S-966; M-1 to L-965; M-1 to S-964; M-1 to S-963; M-1 to S-962; M-1 to R-961; M-1 to T-960; M-1 to F-959; M-1 to L-958; M-1 to V-957; M-1 to D-956; M-1 to A-955; M-1 to E-954; M-1 to Y-953; M-1 to K-952; M-1 to L-951; M-1 to H-950;
M-1 to F-949; M-1 to R-948; M-1 to L-947; M-1 to P-946; M-1 to A-945; M-1 to V-944;
M-1 to N-943; M-1 to D-942; M-1 to E-941; M-1 to K-940; M-1 to T-939; M-1 to S-938;
M-1 to D-937; M-1 to R-936; M-1 to E-935; M-1 to N-934; M-1 to S-933; M-1 to D-932;
M-1 to S-931; M-1 to G-930; M-1 to A-929; M-1 to A-928; M-1 to L-927; M-1 to E-926;
M-1 to I-925; M-1 to E-924; M-1 to L-923; M-1 to H-922; M-1 to H-921; M-1 to L-920; M-1 to F-919; M-1 to I-918; M-1 to S-917; M-1 to K-916; M-1 to S-915; M-1 to F-914;
M-1 to E-913; M-1 to F-912; M-1 to D-911; M-1 to L-910; M-1 to R-909; M-1 to F-908;
M-1 to A-907; M-1 to V-906; M-1 to K-905; M-1 to A-904; M-1 to K-903; M-1 to A-902;
M-1 to R-901; M-1 to F-900; M-1 to F-899; M-1 to P-898; M-1 to Y-897; M-1 to S-896; M-1 to V-895; M-1 to N-894; M-1 to C-893; M-1 to V-892; M-1 to Q-891; M-1 to K-890;
M-1 to Q-889; M-1 to L-888; M-1 to R-887; M-1 to R-886; M-1 to E-885; M-1 to E-884;
M-1 to N-883; M-1 to V-882; M-1 to C-881; M-1 to E-880; M-1 to I-879; M-1 to S-878;
M-1 to G-877; M-1 to D-876; M-1 to S-875; M-1 to D-874; M-1 to E-873; M-1 to K-872;
M-1 to Q-871; M-1 to I-870; M-1 to L-869; M-1 to S-868; M-1 to A-867; M-1 to F-866;
M-1 to Q-865; M-1 to L-864; M-1 to N-863; M-1 to A-862; M-1 to S-861; M-1 to Q-860;
M-1 to S-859; M-1 to I-858; M-1 to N-857; M-1 to L-856; M-1 to V-855; M-1 to T-SUBSTITUTE SHEET (RULE 26) 854;M-1 to S-853; M-1 to Y-852; M-1 to A-851; M-1 to N-850; M-1 to E-849; M-1 to G-848; M-1 to R-847; M-lto N-846; M-1 to E-845; M-I to L-844; M-1 to T-843; M-1 to A-842; M-1 to E-841; M-1 to V-840; M-1 toA-839; M-1 to V-838; M-1 to R-837; M-1 to Q-836; M-1 to R-835; M-1 to T-834; M-1 to S-833; M-1 toE-832; M-! to I-831; M-1 to I-830; M-1 to F-829; M-1 to V-828; M-1 to T-827; M-1 to T-826; M-1 to D-825;M-1 to F-824; M-1 to S-823; M-1 to L-822; M-1 to T-821; M-1 to Y-820; M-I to A-819; M-1 to S-818; 1VI-1 toC-817; M-l to D-816; M-1 to Q-815; M-1 to A-814; M-1 to P-813;
M-1 to K-812; M-1 to R-811; M-i toL-810; M-1 to V-809; M-1 to R-808; M-I to Q-807; M-I to C-806; M-I to Y-805; M-1 to E-804; M-1 toM-803; M-1 to A-802; M-1 to T-801; M-to P-800; M-1 to L-799; M-1 to D-798; M-1 to S-797; M-1 toR-796; M-1 to A-795;

to D-794; M-1 to L-793; M-1 to V-792; M-1 to L-791; M-1 to D-790; M-1 toP-789;

to V-788; M-1 to C-787; M-1 to H-786; M-i to E-785; M-1 to D-784; M-1 to E-783; M-1 toN-782; M-1 to C-781; M-I to G-780; M-1 to N-779; M-1 to W-778; M-1 to F-777;
M-1 to P-776; M-1 toV-775; M-1 to S-774; M-1 to V-773; M-1 to R-772; M-1 to L-771;
M-1 to T-770; M-1 to T-769; M-1 to P-768;M-1 to W-767; M-1 to G-766; M-1 to D-765;
M-1 to D-764; M-1 to L-763; M-I to M-762; M-1 to P-761; M-Ito G-760; M-1 to H-759; M-1 to D-758; M-1 to P-?57; M-1 to D-756; M-1 to E-755; M-1 to L-754; M-1 toS-753; M-1 to Y-752; M-1 to E-751; M-1 to V-750; M-1 to S-749; M-1 to F-748; M-1 to T-747; M-1 to V-746;M-1 to P-745; M-1 to K-744; M-1 to V-743; M-1 to Y-742; M-1 to D-741; M-1 to A-740; M-1 to T-739; M-lto D-738; M-1 to L-737; M-1 to V-736; M-1 to H-735; M-1 to F-734; M-1 to N-733; M-1 to I-732; M-1 toR-73I; M-1 to E-730; M-1 to C-729; M-1 to L-728; M-1 to E-727; M-1 to Q-726; M-1 to G-725; M-1 toS-724; M-1 to S-723; M-1 to L-722; M-1 to L-721; M-1 to V-720; M-1 to A-719; M-1 to R-718; M-i toN-717; M-1 to T-716; M- i to F-715; M-1 to R-714; M-1 to D-713 ; M-1 to G-712; M-1 to G-711; M-1 toE-710; M-1 to D-709; M-1 to L-708; M-I to H-707; M-1 to A-706;

to R-705; M-1 to P-704; M-1 toT-703; M-1 to Y-702; M-1 to R-701; M-1 to R-700;

to E-699; M-1 to D-698; M-1 to M-697; M-1 toT-696; M-1 to A-695; M-1 to N-694;
M-SUBSTITUTE SHEET (RULE 26) 1 to Y-693; M-1 to R-692; M-1 to I-691; M-1 to G-690; M-1 toV-689; M-1 to T-688; M-1 to T-687; M-1 to T-686; M-1 to Q-685; M-1 to F-684; M-1 to H-683; M-1 to P-682;M-1 to A-681; M-1 to L-680; M-1 to F-679; M-1 to I-678; M-1 to P-677; M-1 to T-676; M-1 to F-675; M-1 toC-674; M-1 to L-673; M-1 to F-672; M-1 to A-671; M-1 to A-670; M-1 to L-669; M-1 to C-668; M-1 to T-667;M-1 to A-666; M-1 to D-665; M-1 to R-664;
M-1 to G-663; M-1 to S-662; M-1 to R-661; M-1 to K-660; M-1 to C-659; M-1 to D-658;
M-1 to R-657; M-1 to H-656; M-1 to F-655; M-1 to I-654; M-1 to N-653; M-1 to I-652;
M-1 to K-651; M-1 to S-650; M-1 to P-649; M-1 to E-648; M-1 to F-647; M-1 to H-646;
M-1 to L-645; M-1 to S-644; M-1 to A-643; M-1 to N-642; M-1 to I-641; M-1 to Q-640;
M-1 to V-639; M-1 to V-638; M-1 to P-637; M-1 to R-636; M-1 to S-635; M-1 to W-634;
M-1 to L-633; M-1 to I-632; M-1 to V-631; M-1 to A-630; M-1 to N-629; M-1 to G-628;
M-1 to L-627; M-1 to A-626; M-1 to G-625; M-1 to V-624; M-1 to A-623; M-1 to L-622;
M-1 to D-621; M-1 to I-620; M-1 to L-619; M-1 to G-618; M-1 to D-617; M-1 to E-616;
M-1 to N-615; M-1 to L-614; M-1 to D-613; M-1 to L-612; M-1 to Q-611; M-1 to G-610;
M-1 to H-609; M-1 to I-608; M-1 to S-607; M-1 to C-606; M-1 to G-605; M-1 to F-604;
M-1 to Y-603; M-1 to Q-602; M-1 to L-601; M-1 to G-600; M-1 to T-599; M-1 to A-598;
M-1 to L-597; M-1 to E-596; M-1 to S-595; M-1 to A-594; M-1 to T-593; M-1 to I-592;
M-1 to R-591; M-1 to Q-590; M-1 to K-589; M-1 to P-588; M-1 to T-587; M-1 to K-586;M-1 to L-585; M-1 to I-584; M-1 to S-583; M-1 to G-582; M-1 to R-581; M-1 to F-580; M-1 to G-579; M-I toH-578; M-1 to F-577; M-1 to I-576; M-1 to Y-575; M-1 to I-574; M-1 to A-573; M-1 to G-572; M-1 to A-571;M-1 to H-570; M-1 to N-569; M-1 to D-568; M-1 to E-567; M-1 to L-566; M-1 to P-565; M-1 to A-564; M-lto G-563; M-1 to V-562; M-1 to V-561; M-1 to V-560; M-I to D-559; M-1 to N-558; M-1 to Y-557; M-toS-5_56; M-1 to D-555; M-1 to Q-554; M-1 to N-553; M-1 to L-552; M-1 to D-551; M-1 to R-550; M-1 toV-_549; M-I to S-548; M-1 to A-547; M-1 to I-546; M-1 to S-545; M-1 to S-544; M-1 to G-543; M-1 to F-542;M-1 to R-541; M-1 to A-540; M-1 to N-539;

M-1 to Q-538; M-1 to Y-537; M-1 to S-536; M-1 to H-535; M-1 to S-534; M-1 to D-533;
M-1 to K-532; M-1 to L-531; M-1 to T-530; M-1 to G-529; M-1 to N-528; M-1 to Y-527;
M-1 to V-526; M-1 to F-525; M-1 to R-524; M-1 to N-523; M-1 to Q-522; M-1 to R-521; M-1 to L-520; M-1 to E-519; M-1 to Y-518; M-1 to V-517; M-1 to Y-516; M-1 to V-515;
M-1 to K-514; M-1 to G-513; M-1 to R-512; M-1 to E-511; M-1 to R-510; M-1 to G-509;
M-1 to E-508; M-1 to N-507; M-1 to F-506; M-1 to Y-505; M-1 to M-504; M-1 to P-503;
M-1 to A-502; M-1 to G-501; M-1 to V-500; M-1 to L-499; M-1 to L-498; M-1 to V-497;
M-1 to D-496; M-1 to T-495; M-I to V-494; M-1 to G-493; M-1 toD-492; M-1 to 6-491; M-1 to D-490; M-1 to I-489; M- I to D-488; M-1 to V-487; M-1 to S-486; M-1 to T-485;M-1 to I-484; M-1 to E-483; M-I to S-482; M-1 to G-481; M-1 to F-480; M-1 to Y-lU 479; M-1 to S-478; M-1 toG-477; M-1 to I-476; M-1 to Q-475; M-I to Q-474; M-I to 6-473; M-1 to R-472; M-1 to M-471; M-1 toA-470; M- I to Q-469; M-1 to H-468; M-1 to I-467; M-1 to T-466; M-1 to L-465; M-1 to S-464; M-I to R-463;M-1 to N-462; M-1 to N-461; M-1 to H-460; M-1 to M-4_59; M-1 to T-458; M-1 to F-4_57; M-1 to L-456;
M-lto I-455; M-1 to V-454; M-1 to K-453; M-1 to G-452; M-1 to T-451; M-1 to H-450; M-1 to N-449; M-1 toF-448; M-1 to R-447; M-1 to P-446; M-1 to A-445; M-1 to G-444; M-1 to A-443; M-1 to V-442; M-1 toY-441; M-1 to V-440; M-1 to R-439; M-1 to G-438; M-to Q-437; M-1 to R-436; M-1 to S-435; M-1 toS-434; M-I to V-433; M-1 to V-432;

to S-431; M-1 to T-430; M-1 to V-429; M-1 to T-428; M-1 to Y-427;M-1 to G-426;

to L-425; M-1 to Y-424; M-1 to A-423; M-I to G-422; M-1 to H-421; M-1 to N-420; M-lto K-419; M-1 to L-418; M-1 to E-417; M-1 to E-416; M-1 to P-415; M-1 to F-414; M-1 to E-413; M-1 toK-412; M-i to L-411; M-1 to Y-410; M-1 to S-409; M-1 to E-408; M-1 to R-407; M-1 to L-406; M-1 to P-405;M-1 to I-404; M-1 to V-403; M-1 to K-402; M-1 to G-401; M-1 to A-400; M-1 to S-399; M-1 to T-398; M-1 toE-397; M-1 to K-396;
M-1 to L-395; M-1 to V-394; M-1 to A-393; M-1 to G-392; M-1 to N-391; M-1 to W-390; M-1 to D-389; M-1 to Y-388; M-1 to A-387; M-1 to G-386; M-1 to V-385; M-1 to A-384; M-1 to G-383; M-1 to L-382; M-1 to L-381; M-1 to V-380; M-1 to G-379; M-1 to D-378; M-1 to E-377; M-1 to V-376; M-1 to V-375; M-1 to H-374; M-1 to S-373; M-1 to S-372; M-1 to F-371; M-1 to G-370; M-1 to T-369; M-1 to Q-368; M-1 to S-367; M-1 to M-366; M-1 to E-365; M-1 to L-364; M-1 to G-363; M-1 to F-362; M-1 to S-361; M-1 to T-360; M-1 to E-359; M-1 to N-358; M-1 to K-357; M-1 to N-356; M-1 to T-355; M-1 to G-354; M-1 to E-353; M-1 to L-352; M-1 to S-351; M-1 to F-350; M-1 to I-349; M-1 to R-348; M-1 to D-347; M-1 to G-346; M-1 to L-345; M-1 to A-344; M-1 to D-343; M-1 to V-342; M-1 to I-341; M-1 to D-340; M-1 to K-339; M-1 to L-338; M-1 to A-337; M-1 to A-336; M-1 to E-335; M-1 to D-334; M-1 to T-333; M-1 to V-332; M-1 to N-331; M-1 to F-330; M-1 to F-329; M-1 to H-328; M-1 to K-327; M-1 to D-326; M-1 to D-325; M-1 to P-324; M-1 to D-323; M-1 to S-322; M-1 to A-321; M-1 to I-320; M-1 to Y-319; M-1 to K-318; M-1 to I-317; M-1 to E-316; M-1 to N-315; M-1 to L-314; M-1 to F-313; M-1 to T-312; M-1 to E-311; M-1 to P-310; M-1 to N-309; M-1 to I-308; M-1 to G-307; M-1 to R-306; M-1 to R-305; M-1 to N-304; M-1 to Y-303; M-1 to Y-302; M-1 to G-301; M-1 to L-300; M-1 to V-299; M-1 to A-298; M-1 to V-297; M-1 to A-296; M-1 to Y-295; M-1 to R-294; M-1 to T-293; M-1 to V-292; M-1 to N-291; M-1 to D-290; M-1 to R-289; M-1 to E-288; M-1 to S-287; M-1 to Q-286; M-1 to Q-285; M-1 to I-284; M-1 to V-283; M-1 to K-282; M-1 to E-281; M-1 to L-280; M-1 to D-279; M-1 to P-278; M-1 to S-277; M-1 to D-276; M-1 to H-275; M-1 to S-274; M-1 to E-273; M-1 to G-272; M-1 to D-271; M-1 to T-270; M-1 to I-269; M-1 to V-268; M-1 to I-267; M-1 to M-266; M-1 to V-265; M-1 to K-264; M-1 to K-263; M-1 to A-262; M-1 to G-261; M-1 to K-260; M-1 to R-259; M-1 to G-258; M-1 to G-257; M-1 to K-256; M-1 to Q-255; M-1 to F-254; M-1 to A-253; M-1 to E-252; M-1 to S-251; M-1 to R-250; M-1 to A-249; M-1 to F-248; M-1 to E-247; M-1 to I-246; M-1 to G-245; M-1 to F-244; M-1 to A-243; M-1 to T-242; M-1 to R-241; M-1 to T-240; M-1 to E-239; M-1 to T-238; M-1 to G-237; M-1 to G-236; M-1 to R-235; M-1 to Q-234; M-1 to E-233; M-1 to I-232; M-1 to H-231; M-1 to S-230; M-1 to A-229; M-1 to A-228; M-1 to E-227; M-1 to V-226; M-1 to V-225; M-1 to D-224; M-1 to K-223; M-1 to V-222; M-1 to S-221; M-1 to R-220; M-1 to Y-219; M-1 to D-218; M-1 to N-217; M-1 to L-216; M-1 to H-215; M-1 to F-214; M-1 to E-213; M-1 to H-212; M-1 to V-211; M-1 to V-210; M-1 to D-209; M-1 to E-208; M-1 to G-207; M-1 to Y-206; M-1 to Q-205; M-1 to V-204; M-1 to V-203; M-1 to G-202; M-1 to V-201; M-1 to Q-200; M-1 to I-199; M-1 to Q-198; M-1 to G-197; M-1 to P-196; M-1 to G-195; M-1 to I-194; M-1 to Y-193; M-1 to F-192; M-1 to K-191; M-1 to K-190; M-1 to L-189; M-1 to I-188; M-1 to N-187; M-1 to I-186; M-1 to L-185; M-1 to F-184; M-1 to H-183; M-1 to Q-182; M-1 to V-181; M-1 to E-180; M-1 to V-179; M-1 to W-178; M-1 to P-177; M-1 to Y-176; M-1 to I-175; M-1 to S-174; M-1 to N-173; M-1 to S-172; M-1 to G-171; M-1 to D-170; M-1 to L-169; M-1 to V-168; M-1 to I-167; M-1 to V-166; M-1 to I-165; M-1 to D-164; M-1 to M-163; M-1 to Y-162; M-1 to T-161; M-1 to Q-160; M-1 to C-159; M-1 to R-158; M-1 to Q-157; M-1 to L-156; M-1 to A-155; M-1 to P-154; M-1 to A-153; M-1 to V-152; M-1 to T-151; M-1 to K-150; M-1 to S-149; M-1 to F-148; M-1 to R-147; M-1 to F-146; M-1 to N-145; M-1 to S-144; M-1 to N-143; M-1 to V-142; M-1 to R-141; M-1 to S-140; M-1 to C-139; M-1 to M-138; M-1 to G-137; M-1 to T-136; M-1 to T-135; M-1 to Y-134; M-1 to Y-133; M-1 to S-132; M-1 to S-131; M-1 to G-130; M-1 to C-129; M-1 to E-128; M-1 to H-127; M-1 to S-126; M-1 to W-125; M-1 to L-124; M-1 to P-123; M-1 to S-122; M-1 to C-121; M-1 to A-120; M-1 to L-119; M-1 to F-118; M-1 to S-117; M-1 to N-116; M-1 to D-115; M-1 to K-114; M-1 to P-113; M-1 to N-112; M-1 to T-111; M-1 to A-110; M-1 to L-109; M-1 to S-108; M-1 to L-107; M-1 to G-106; M-1 to L-105; M-1 to R-104; M-1 to M-103; M-1 to N-102; M-1 to D-101; M-1 to K-100; M-1 to R-99; M-1 to E-98; M-1 to S-97; M-1 to V-96; M-1 to N-95; M-1 to S-94; M-1 to L-93; M-1 to T-92; M-1 to V-91; M-1 to R-90; M-1 to G-89; M-1 to L-88; M-1 to N-87; M-1 to L-86; M-1 to K-85; M-1 to T-84; M-1 to C-83; M-1 to N-82; M-1 to G-81; M-1 to H-80; M-1 to I-79; M-1 to V-78; M-1 to P-77; M-1 to C-76; M-1 to K-75; M-1 to Y-74; M-1 to V-73; M-1 to D-72; M-1 to G-71; M-1 to T-70; M-1 to K-69; M-1 to Q-68; M-1 to Y-67; M-1 to G-66; M-1 to N-65; M-1 to T-64; M-1 to E-63; M-1 to L-62; M-1 to P-61; M-1 to A-60; M-1 to G-59; M-1 to V-58; M-1 to V-57; M-1 to L-56; M-1 to W-55; M-1 to K-54; M-1 to N-53; M-1 to G-52; M-1 to S-51; M-1 to I-50; M-1 to D-49; M-1 to H-48; M-1 to Q-47; M-1 to Q-46; M-1 to V-45; M-1 to T-44; M-1 to Y-43; M-1 to G-42; M-1 to F-41; M-1 to F-40; M-1 to A-39; M-1 to T-38; M-1 to R-37; M-1 to S-36; M-1 to G-35; M-1 to P-34; M-1 to I-33; M-1 to V-32; M-1 to R-31; M-1 to P-30; M-1 to K-29; M-1 to R-28; M-1 to T-27; M-1 to D-26; M-1 to M-25; M-1 to N-24; M-1 to F-23; M-1 to T-22; M-1 to D-21; M-1 to T-20; M-1 to F-19; M-1 to G-18; M-1 to P-17; M-1 to W-16; M-1 to L-15; M-1 to S-14; M-1 to L-13; M-1 to A-12; M-1 to W-11; M-1 to A-10; M-1 to V-9; M-1 to V-8; M-1 to L-7; M-1 to G-6 of SEQ ID NO:35. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
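The truncation series recited above follows a simple pattern: every member starts at residue 1 and ends one residue earlier than the member before it. The following Python sketch illustrates that enumeration only. The function name, the `min_end` parameter, and the ten-residue toy peptide are assumptions introduced for the example; the toy peptide is not SEQ ID NO:35.

```python
def c_terminal_deletions(seq: str, min_end: int = 6) -> list[str]:
    """Label each C-terminal truncation that keeps residues 1..n.

    n runs from the full length of `seq` down to `min_end` (1-based),
    mirroring the 'M-1 to X-n' notation used in the text.
    """
    if not 1 <= min_end <= len(seq):
        raise ValueError("min_end must lie between 1 and len(seq)")
    first = f"{seq[0]}-1"
    return [f"{first} to {seq[n - 1]}-{n}" for n in range(len(seq), min_end - 1, -1)]


# Hypothetical toy peptide used only to show the labelling scheme.
toy_peptide = "MKTAYIAKQR"
for label in c_terminal_deletions(toy_peptide):
    print(label)
```

The lower bound of six residues in the sketch matches the shortest member recited above (M-1 to G-6).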
In addition, the invention provides nucleic acid molecules having nucleotide sequences related to extensive portions of SEQ ID NO:17 which have been determined from the following related cDNA genes: HEEAB54R (SEQ ID NO:104), HRDAF83R (SEQ ID NO:105), HOUBC62R (SEQ ID NO:106), HCDBI19R (SEQ ID NO:107), HOHCU94R (SEQ ID NO:108), HOACC13R (SEQ ID NO:109), HCDAP21R (SEQ ID NO:110), HNHHA34R (SEQ ID NO:111), HOHEA75R (SEQ ID NO:112) and HNGEL59R (SEQ ID NO:113).
Based on the sequence similarity to the human integrin alpha 1 subunit, the translation product of this gene is expected to share at least some biological activities with integrin proteins, and specifically the integrin alpha 1 protein. Such activities are known in the art, some of which are described elsewhere herein.
Specifically, polynucleotides and polypeptides of the invention are also useful for modulating the differentiation of normal and malignant cells, modulating the proliferation and/or differentiation of cancer and neoplastic cells, and modulating the immune response. Polynucleotides and polypeptides of the invention may represent a diagnostic marker for hematopoietic and immune diseases and/or disorders. The full-length protein should be a secreted protein, based upon homology to the integrin family.
Therefore, it is secreted into serum, urine, or feces, and its levels are thus assayable from patient samples. Assuming specific expression levels are reflective of the presence of immune disorders, this protein would provide a convenient diagnostic for early detection.
In addition, expression of this gene product may also be linked to the progression of immune diseases, and therefore may itself actually represent a therapeutic or therapeutic target for the treatment of cancer.
Polynucleotides and polypeptides of the invention may play an important role in the pathogenesis of human cancers and cellular transformation, particularly those of the immune and hematopoietic systems. Polynucleotides and polypeptides of the invention may also be involved in the pathogenesis of developmental abnormalities based upon its potential effects on proliferation and differentiation of cells and tissue cell types. Due to the potential proliferating and differentiating activity of said polynucleotides and polypeptides, the invention is useful as a therapeutic agent in inducing tissue regeneration, for treating inflammatory conditions (e.g., inflammatory bowel syndrome, diverticulitis, etc.). Moreover, the invention is useful in modulating the immune response to aberrant polypeptides, as may exist in rapidly proliferating cells and tissue cell types, particularly in adenocarcinoma cells, and other cancers.
Alternatively, the expression within cellular sources marked by proliferating cells indicates this protein may play a role in the regulation of cellular division, and may show utility in the diagnosis, treatment, and/or prevention of developmental diseases and disorders, including cancer, and other proliferative conditions.
Representative uses are described in the "Hyperproliferative Disorders" and "Regeneration" sections below and elsewhere herein. Briefly, developmental tissues rely on decisions involving cell differentiation and/or apoptosis in pattern formation.
Dysregulation of apoptosis can result in inappropriate suppression of cell death, as occurs in the development of some cancers, or in failure to control the extent of cell death, as is believed to occur in acquired immunodeficiency and certain neurodegenerative disorders, such as spinal muscular atrophy (SMA).
Alternatively, this gene product is involved in the pattern of cellular proliferation that accompanies early embryogenesis. Thus, aberrant expression of this gene product in tissues - particularly adult tissues - may correlate with patterns of abnormal cellular proliferation, such as found in various cancers. Because of potential roles in proliferation and differentiation, this gene product may have applications in the adult for tissue regeneration and the treatment of cancers. It may also act as a morphogen to control cell and tissue type specification. Therefore, the polynucleotides and polypeptides of the present invention are useful in treating, detecting, and/or preventing said disorders and conditions, in addition to other types of degenerative conditions. Thus this protein may modulate apoptosis or tissue differentiation and is useful in the detection, treatment, and/or prevention of degenerative or proliferative conditions and diseases. The protein is useful in modulating the immune response to aberrant polypeptides, as may exist in proliferating and cancerous cells and tissues.
The protein can also be used to gain new insight into the regulation of cellular growth and proliferation. Furthermore, the protein may also be used to determine biological activity, to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify agents that modulate their interactions, in addition to its use as a nutritional supplement.
The protein, as well as antibodies directed against the protein, may show utility as a tumor marker and/or immunotherapy target for the above listed tissues.
This gene is expressed almost exclusively in osteoblasts, human trabecular bone cells, mesangial cells, adipocytes, and to a lesser extent in osteosarcoma, chondrosarcoma, breast cancer cells, and bone marrow.
Therefore, polynucleotides and polypeptides of the invention are useful as reagents for differential identification of the tissue(s) or cell type(s) present in a biological sample and for diagnosis of the following diseases and conditions which include, but are not limited to, disorders of the skeletal system, connective tissues, and immune and hematopoietic diseases and/or disorders. Similarly, polypeptides and antibodies directed to these polypeptides are useful to provide immunological probes for differential identification of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, particularly of the connective tissue and skeletal system, expression of this gene at significantly higher or lower levels is detected in certain tissues or cell types (e.g., immune, hematopoietic, skeletal, bone, cartilage, developmental, reproductive, secretory, and cancerous and wounded tissues) or bodily fluids or cell types (e.g., lymph, serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual having such a disorder, relative to the standard gene expression level, i.e., the expression level in healthy tissue from an individual not having the disorder.
Preferred polypeptides of the present invention comprise immunogenic epitopes shown in SEQ ID NO:35 as residues: Phe-23 to Arg-31, Leu-62 to Asp-72, Val-96 to Asp-101, Thr-111 to Asn-116, Glu-128 to Thr-135, Val-142 to Ser-149, Asn-217 to Val-222, Glu-233 to Arg-241, Gly-272 to Leu-280, Gln-286 to Thr-293, Tyr-303 to Ile-308, Gly-354 to Thr-360, Glu-408 to Lys-419, Glu-508 to Lys-514, Arg-521 to Val-526, Gly-529 to Phe-542, Asp-551 to Tyr-557, Thr-587 to Thr-593, His-656 to Asp-665, Met-697 to Arg-705, Asp-709 to Thr-716, Glu-755 to Gly-760, Asn-779 to His-786, Leu-810 to Asp-816, Leu-844 to Ala-851, Gln-871 to Gly-877, Glu-884 to Gln-889, Ser-931 to Asn-943, Ser-974 to Ile-982, Gly-1039 to Gln-1058, Arg-1121 to Arg-1127, Ser-1134 to Trp-1139, Ser-1172 to Pro-1183. Polynucleotides encoding said polypeptides are also provided.
The tissue distribution in osteoblasts and homology to integrin alpha subunit indicate that the protein products of this gene are useful for the treatment of disorders and conditions affecting the skeletal system, in particular osteoporosis as well as disorders afflicting connective tissues (e.g., arthritis, trauma, tendonitis, chondromalacia and inflammation), such as in the diagnosis and treatment of various autoimmune disorders such as rheumatoid arthritis, lupus, scleroderma, and dermatomyositis, as well as dwarfism, spinal deformation, and specific joint abnormalities as well as chondrodysplasias (i.e., spondyloepiphyseal dysplasia congenita, familial osteoarthritis, Atelosteogenesis type II, metaphyseal chondrodysplasia type Schmid).
Polynucleotides and polypeptides corresponding to this gene are useful for the treatment and diagnosis of hematopoietic related disorders such as anemia, pancytopenia, leukopenia, thrombocytopenia or leukemia since stromal cells are important in the production of cells of hematopoietic lineages. Such a use is consistent with the observed homology to integrin family members, in conjunction with the tissue distribution in bone marrow cells. Integrins play pivotal roles in cell migration, inflammation, proliferation, and cellular infiltration. Thus, the present invention is expected to share at least some of these activities. Representative uses are described in the "Immune Activity" and "infectious disease" sections below, in Examples 11, 13, 14, 16, 18, 19, 20, and 27, and elsewhere herein. Briefly, the uses include bone marrow cell ex-vivo culture, bone marrow transplantation, bone marrow reconstitution, radiotherapy or chemotherapy of neoplasia. The gene product may also be involved in lymphopoiesis; therefore, it can be used in immune disorders such as infection, inflammation, allergy, immunodeficiency, etc. In addition, this gene product may have commercial utility in the expansion of stem cells and committed progenitors of various blood lineages, and in the differentiation and/or proliferation of various cell types. Based upon the tissue distribution of this protein, antagonists directed against this protein are useful in blocking the activity of this protein. Accordingly, preferred are antibodies which specifically bind a portion of the translation product of this gene.
Also provided is a kit for detecting tumors in which expression of this protein occurs. Such a kit comprises in one embodiment an antibody specific for the translation product of this gene bound to a solid support. Also provided is a method of detecting these tumors in an individual which comprises a step of contacting an antibody specific for the translation product of this gene to a bodily fluid from the individual, preferably serum, and ascertaining whether antibody binds to an antigen found in the bodily fluid. Preferably the antibody is bound to a solid support and the bodily fluid is serum. The above embodiments, as well as other treatments and diagnostic tests (kits and methods), are more particularly described elsewhere herein. Furthermore, the protein may also be used to determine biological activity, to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify agents that modulate their interactions, in addition to its use as a nutritional supplement. The protein, as well as antibodies directed against the protein, may show utility as a tumor marker and/or immunotherapy target for the above listed tissues.
Many polynucleotide sequences, such as EST sequences, are publicly available and accessible through sequence databases. Some of these sequences are related to SEQ ID NO:17 and may have been publicly available prior to conception of the present invention. Preferably, such related polynucleotides are specifically excluded from the scope of the present invention. To list every related sequence is cumbersome. Accordingly, preferably excluded from the present invention are one or more polynucleotides comprising a nucleotide sequence described by the general formula of a-b, where a is any integer between 1 and 4981 of SEQ ID NO:17, b is an integer of 15 to 4995, where both a and b correspond to the positions of nucleotide residues shown in SEQ ID NO:17, and where b is greater than or equal to a + 14.
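The general formula a-b recited above amounts to three numeric constraints on a pair of positions. The Python sketch below restates those constraints for illustration; the function names are assumptions introduced here, while the bounds (1 to 4981 for a, 15 to 4995 for b, and b at least a + 14) are taken from the text. Under these bounds every span is at least 15 nucleotides long.

```python
SEQ_LEN = 4995   # last nucleotide position recited for SEQ ID NO:17
MIN_SPAN = 15    # shortest span allowed by b >= a + 14


def satisfies_formula(a: int, b: int, seq_len: int = SEQ_LEN, min_span: int = MIN_SPAN) -> bool:
    """Check whether positions (a, b) satisfy the recited a-b formula."""
    return (
        1 <= a <= seq_len - min_span + 1      # a between 1 and 4981
        and min_span <= b <= seq_len          # b between 15 and 4995
        and b >= a + min_span - 1             # b >= a + 14
    )


def count_spans(seq_len: int = SEQ_LEN, min_span: int = MIN_SPAN) -> int:
    """Count all (a, b) pairs allowed by the formula (closed form)."""
    m = seq_len - min_span + 1
    return m * (m + 1) // 2


print(satisfies_formula(1, 15))      # shortest span at the 5' end
print(satisfies_formula(4981, 4995)) # shortest span at the 3' end
print(satisfies_formula(10, 20))     # too short: only 11 nucleotides
```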
FEATURES OF PROTEIN ENCODED BY GENE NO: 8
The present invention relates to three novel peptidoglycan recognition binding proteins expressed by keratinocytes, wound-healing tissues and chondrosarcoma tissue.
More specifically, isolated nucleic acid molecules are provided encoding a human peptidoglycan recognition protein-related protein, sometimes referred to herein as "human tag7" or "tag7" or "htag7". Further provided are vectors, host cells and recombinant methods for producing the same. The invention also relates to both the inhibition and enhancement of activities of the tag7 protein, polypeptides and diagnostic methods for detecting tag7 gene expression.
Peptidoglycan, as well as lipopolysaccharide (LPS), is a surface component of many bacteria which elicits a wide range of physiological and immune responses in humans. Specifically, peptidoglycan has been shown to manifest itself clinically by reproducing most of the symptoms of bacterial infection, including fever, acute-phase response, inflammation, septic shock, leukocytosis, sleepiness, malaise, abscess formation, and arthritis (see Dziarski et al., JBC, 273(15): 8680 (1998)).
Furthermore, the type of peptidoglycan (i.e., the specific stereoisomers or analogs of muramyl dipeptide, N-acetylglucosaminyl-beta(1-4)-N-acetylmuramyl tetrapeptides, etc.) was shown to elicit a broad range of activities, including exhibiting greater pyrogenicity, inducing acute joint inflammation, stimulating macrophages, and causing hemorrhagic necrosis at a primed site (See Kotani et al., Fed Proc, 45(11): 2534 (1986)).
It has been demonstrated in humans that a lipopolysaccharide binding protein exists that was discovered as a trace plasma protein (See Schumann et al., Science, 249(4975):1429 (1990)). It is thought that one of the modes of action by which this lipopolysaccharide binding protein functions is by forming high-affinity complexes with lipopolysaccharide, which then bind to macrophages and monocytes, inducing the secretion of tumor necrosis factor. Dziarski and Gupta (See Dziarski et al., JBC, 269(3): 2100 (1994)) demonstrated that a 70 kDa receptor protein present on the surface of mouse lymphocytes served to bind heparin, heparinoids, bacterial lipoteichoic acids, peptidoglycan, and lipopolysaccharides. Recently, Dziarski et al. demonstrated that CD14, a glycosylphosphatidylinositol-linked protein present on the surface of macrophages and polymorphonuclear leukocytes, bound peptidoglycan and lipopolysaccharide.
Furthermore, the binding affinity of CD14 for lipopolysaccharide was significantly increased in the presence of a LPS-binding protein present in plasma. It is thought that the LPS-binding protein functions as a transfer molecule, whereby it binds LPS and presents it to the CD14 receptor (See Dziarski et al., JBC, 273(15): (1998)). Yoshida et al. isolated a peptidoglycan binding protein from the hemolymph of the silkworm, Bombyx mori, using column chromatography. This protein was found to have a very specific affinity for peptidoglycan (See Yoshida et al., JBC, 271(23): 13854 (1996)).
Additionally, Kang et al. recently cloned a peptidoglycan binding protein from the moth Trichoplusia ni. The peptidoglycan binding protein was shown to bind strongly to insoluble peptidoglycan (See Kang et al., PNAS, 95(17): 10078 (1998)). In this study the peptidoglycan binding protein was upregulated by a bacterial infection in T. ni. The insect immune system is regarded as a model for innate immunity. Thus, Kang et al. were able to clone both mouse and human homologs of the T. ni peptidoglycan binding protein.
All of these peptidoglycan binding proteins shared regions of homology, as well as four conserved cysteine residues which may function in the tertiary structure of the protein, possibly in helping to form binding domains. Given that peptidoglycan is an integral component of bacterial cell walls, and that it induces many physiological responses from cytokine secretion to inflammation and macrophage activation, it appears as if this family of proteins is a ubiquitous group involved in the binding and recognition of peptidoglycan, the presentation of antigens (e.g., cell wall components, etc.), and the activation of the immune system, such as the secretion of cytokines, such as TNF. TNF is noted for its pro-inflammatory actions which result in tissue injury, such as induction of procoagulant activity on vascular endothelial cells (Pober, J.S. et al., J. Immunol. 136:1680 (1986)), increased adherence of neutrophils and lymphocytes (Pober, J.S. et al., J. Immunol. 138:3319 (1987)), and stimulation of the release of platelet activating factor from macrophages, neutrophils and vascular endothelial cells (Camussi, G. et al., J. Exp. Med. 166:1390 (1987)).
Recent evidence implicates TNF in the pathogenesis of many infections (Cerami, A. et al., Immunol. Today 9:28 (1988)), immune disorders, neoplastic pathology, e.g., in cachexia accompanying some malignancies (Oliff, A. et al., Cell 50:555 (1987)), and in autoimmune pathologies and graft-versus-host pathology (Piguet, P.-F. et al., J. Exp. Med. 166:1280 (1987)). The association of TNF with cancer and infectious pathologies is often related to the host's catabolic state. A major problem in cancer patients is weight loss, usually associated with anorexia. The extensive wasting which results is known as "cachexia" (Kern, K. A. et al., J. Parent. Enter. Nutr. 12:286-298 (1988)).
Cachexia includes progressive weight loss, anorexia, and persistent erosion of body mass in response to a malignant growth. The cachectic state is thus associated with significant morbidity and is responsible for the majority of cancer mortality.
A number of studies have suggested that TNF is an important mediator of the cachexia in cancer, infectious pathology, and in other catabolic states. TNF is thought to play a central role in the pathophysiological consequences of Gram-negative sepsis and endotoxic shock (Michie, H.R. et al., Br. J. Surg. 76:670-671 (1989); Debets, J. M. H. et al., Second Vienna Shock Forum, p. 463-466 (1989); Simpson, S. Q. et al., Crit. Care Clin. 5:27-47 (1989)), including fever, malaise, anorexia, and cachexia.
Endotoxin is a potent monocyte/macrophage activator which stimulates production and secretion of TNF (Kornbluth, S.K. et al., J. Immunol. 137:2585-2591 (1986)) and other cytokines.
Because TNF could mimic many biological effects of endotoxin, it was concluded to be a central mediator responsible for the clinical manifestations of endotoxin-related illness.
TNF and other monocyte-derived cytokines mediate the metabolic and neurohormonal responses to endotoxin (Michie, H.R. et al., N. Eng. J. Med. 318:1481-1486 (1988)).
Endotoxin administration to human volunteers produces acute illness with flu-like symptoms including fever, tachycardia, increased metabolic rate and stress hormone release (Revhaug, A. et al., Arch. Surg. 123:162-170 (1988)). Elevated levels of circulating TNF have also been found in patients suffering from Gram-negative sepsis (Waage, A. et al., Lancet 1:355-357 (1987); Hammerle, A.F. et al., Second Vienna Shock Forum p. 715-718 (1989); Debets, J. M. H. et al., Crit. Care Med. 17:489-497 (1989); Calandra, T. et al., J. Infec. Dis. 161:982-987 (1990)). Passive immunotherapy directed at neutralizing TNF may have a beneficial effect in Gram-negative sepsis and endotoxemia, based on the increased TNF production and elevated TNF levels in these pathology states, as discussed above.
Antibodies to a "modulator" material which was characterized as cachectin (later found to be identical to TNF) were disclosed by Cerami et al. (EPO Patent Publication 0,212,489, March 4, 1987). Such antibodies were said to be useful in diagnostic immunoassays and in therapy of shock in bacterial infections. Rubin et al. (EPO Patent Publication 0,218,868, April 22, 1987) disclosed monoclonal antibodies to human TNF, the hybridomas secreting such antibodies, methods of producing such antibodies, and the use of such antibodies in immunoassay of TNF. Yone et al. (EPO Patent Publication 0,288,088, October 26, 1988) disclosed anti-TNF antibodies, including mAbs, and their utility in immunoassay diagnosis of pathologies, in particular Kawasaki's pathology and bacterial infection. The body fluids of patients with Kawasaki's pathology (infantile acute febrile mucocutaneous lymph node syndrome; Kawasaki, T., Allergy 16:178 (1967); Kawasaki, T., Shonica (Pediatrics) 26:935 (1985)) were said to contain elevated TNF levels which were related to progress of the pathology (Yone et al., supra).
Accordingly, there is a need to provide molecules that are involved in pathological conditions. Such novel proteins could be useful in augmenting the immune system in such areas as immune recognition, antigen presentation, and immune system activation. Antibodies or antagonists directed against these proteins are useful in reducing or eliminating disorders associated with TNF and TNF-like cytokines, such as endotoxic shock and auto-immune disorders, for example.
The polypeptide of the present invention has been putatively identified as a member of the novel peptidoglycan recognition binding protein family and has been termed human tag7. This identification has been made as a result of amino acid sequence homology to the mouse tag7 (See Genbank Accession No. emb|CAA60133).
Figure 34 shows the nucleotide (SEQ ID NO:18) and deduced amino acid sequence (SEQ ID NO:36) of htag7. Predicted amino acids from about 1 to about 21 constitute the predicted signal peptide (amino acid residues from about 1 to about 21 in SEQ ID NO:36) and are represented by the underlined amino acid regions; and amino acids from about 34 to about 117 constitute the predicted PGRP-like domain (amino acids from about 34 to about 117 in SEQ ID NO:36) and are represented by the double underlined amino acids.
Figure 35 shows the regions of similarity between the amino acid sequences of the htag7 protein (SEQ ID NO:36) and the mouse tag7 protein (SEQ ID NO:114).
Figure 36 shows an analysis of the htag7 amino acid sequence. Alpha, beta, turn and coil regions; hydrophilicity and hydrophobicity; amphipathic regions; flexible regions; antigenic index and surface probability are shown.
A polynucleotide encoding a polypeptide of the present invention is obtained from human chondrosarcoma cells, bone marrow, and neutrophils. The polynucleotide of this invention was discovered in a human chondrosarcoma cDNA library.
As shown in Figure 34, htag7 has a PGRP domain (the PGRP domain comprises amino acids from about 34 to about 117 of SEQ ID NO:36, which correspond to amino acids from about 34 to about 117 of Figure 34). The polynucleotide contains an open reading frame encoding the htag7 polypeptide of 198 amino acids. htag7 exhibits a high degree of homology at the amino acid level to the mouse tag7 (as shown in Figure 35).
The present invention provides isolated nucleic acid molecules comprising a polynucleotide encoding the htag7 polypeptide having the amino acid sequence shown in Figure 34 (SEQ ID NO:36). The nucleotide sequence shown in Figure 34 (SEQ ID NO:18) was obtained by sequencing a cloned cDNA (HCDDP40), which was deposited on November 17 at the American Type Culture Collection, and given Accession Number 203484.
The present invention is further directed to fragments of the isolated nucleic acid molecules described herein. By a fragment of an isolated DNA molecule having the nucleotide sequence of the deposited cDNA or the nucleotide sequence shown in SEQ ID NO:18 is intended DNA fragments at least about 15 nt, and more preferably at least about 20 nt, still more preferably at least about 30 nt, and even more preferably, at least about 40 nt in length which are useful as diagnostic probes and primers as discussed herein. Of course, larger fragments 50-1500 nt in length are also useful according to the present invention, as are fragments corresponding to most, if not all, of the nucleotide sequence of the deposited cDNA or as shown in SEQ ID NO:18. By a fragment at least 20 nt in length, for example, is intended fragments which include 20 or more contiguous bases from the nucleotide sequence of the deposited cDNA or the nucleotide sequence as shown in SEQ ID NO:18. In this context "about" includes the particularly recited size, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini. Representative examples of htag7 polynucleotide fragments of the invention include, for example, fragments that comprise, or alternatively, consist of, a sequence from about nucleotide 1 to about 50, from about 51 to about 100, from about 101 to about 150, from about 151 to about 200, from about 201 to about 250, from about 251 to about 300, from about 301 to about 350, from about 351 to about 400, from about 401 to about 450, from about 451 to about 500, from about 501 to about 550, from about 551 to about 600, from about 601 to about 650, from about 651 to about 700, from about 701 to about 726, and from about 130 to about 379 of SEQ ID NO:18, or the complementary strand thereto, or the cDNA contained in the deposited gene. In this context "about" includes the particularly recited ranges, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini.
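The consecutive fragment ranges recited above (1 to 50, 51 to 100, and so on up to 701 to 726) are 50-nucleotide windows tiled across the sequence, with a shorter final window. The Python sketch below reproduces that tiling and the stated meaning of "about" (up to 5 nucleotides at either terminus). The function names are assumptions introduced for illustration, and the total length of 726 is inferred only from the last range recited, not from the sequence listing itself.

```python
def fragment_windows(total_len: int, step: int = 50) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) windows of `step` nt covering 1..total_len."""
    if total_len < 1 or step < 1:
        raise ValueError("total_len and step must be positive")
    return [(s, min(s + step - 1, total_len)) for s in range(1, total_len + 1, step)]


def within_about(start: int, end: int, ref_start: int, ref_end: int, tol: int = 5) -> bool:
    """True if each terminus lies within `tol` nucleotides of the recited range."""
    return abs(start - ref_start) <= tol and abs(end - ref_end) <= tol


windows = fragment_windows(726)
print(windows[0], windows[-1], len(windows))
```

The additional range from about 130 to about 379 recited above does not follow the tiling and would need to be listed separately.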
Preferred nucleic acid fragments of the present invention include nucleic acid molecules encoding a member selected from the group: a polypeptide comprising or alternatively, consisting of, the PGRP-like domain (amino acid residues from about 34 to about 117 in Figure 34 (amino acids from about 34 to about 117 in SEQ ID NO:36)). Since the locations of these domains have been predicted by computer analysis, one of ordinary skill would appreciate that the amino acid residues constituting these domains may vary slightly (e.g., by about 1 to 15 amino acid residues) depending on the criteria used to define each domain. As indicated, nucleic acid molecules of the present invention which encode a htag7 polypeptide may include, but are not limited to, those encoding the amino acid sequence of the PGRP-like domain of the polypeptide, by itself; and the coding sequence for the PGRP-like domain of the polypeptide and additional sequences, such as a pre-, or pro- or prepro- protein sequence. In additional embodiments, the polynucleotides of the invention encode functional attributes of htag7.
Preferred embodiments of the invention in this regard include fragments that comprise alpha-helix and alpha-helix forming regions ("alpha-regions"), beta-sheet and beta-sheet forming regions ("beta-regions"), turn and turn-forming regions ("turn-regions"), coil and coil-forming regions ("coil-regions"), hydrophilic regions, hydrophobic regions, alpha amphipathic regions, beta amphipathic regions, flexible regions, surface-forming regions and high antigenic index regions of htag7.
The data representing the structural or functional attributes of htag7 set forth in Figure 36 and/or Table XII, as described above, was generated using the various modules and algorithms of the DNA*STAR set on default parameters. In a preferred embodiment, the data presented in columns VIII, IX, XIII, and XIV of Table XII can be used to determine regions of htag7 which exhibit a high degree of potential for antigenicity. Regions of high antigenicity are determined from the data presented in columns VIII, IX, XIII, and/or XIV by choosing values which represent regions of the polypeptide which are likely to be exposed on the surface of the polypeptide in an environment in which antigen recognition may occur in the process of initiation of an immune response.
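The per-residue attribute data described above were produced with DNA*STAR, whose implementation and default parameters are not reproduced here. Purely as an illustration of how one such per-residue profile can be computed, the Python sketch below averages the published Kyte-Doolittle hydropathy values over a sliding window and reports windows with a negative mean as hydrophilic. The window size of 7, the cutoff of 0, the function names, and the toy peptide are assumptions made for the example and are not taken from Table XII.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 7) -> list[float]:
    """Mean Kyte-Doolittle score for every full window along `seq`."""
    if not 1 <= window <= len(seq):
        raise ValueError("window must lie between 1 and len(seq)")
    scores = [KD[res] for res in seq.upper()]
    return [sum(scores[i:i + window]) / window for i in range(len(seq) - window + 1)]


def hydrophilic_windows(seq: str, window: int = 7, cutoff: float = 0.0) -> list[int]:
    """1-based start positions of windows whose mean score is below `cutoff`."""
    return [i + 1 for i, m in enumerate(hydropathy_profile(seq, window)) if m < cutoff]


# Hypothetical toy peptide: a charged stretch followed by a hydrophobic stretch.
toy_peptide = "RRRRRRRIIIIIII"
print(hydrophilic_windows(toy_peptide))
```

A hydropathy profile alone is only one of the inputs mentioned in the text; antigenic index and surface probability are separate calculations that this sketch does not attempt.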
Certain preferred regions in these regards are set out in Figure 36, but may, as shown in Table XII, be represented or identified by using tabular representations of the data presented in Figure 36. The DNA*STAR computer algorithm used to generate Figure 36 (set on the original default parameters) was used to present the data in Figure 36 in a tabular format (See Table XII). The tabular format of the data in Figure 36 is used to easily determine specific boundaries of a preferred region. The above-mentioned preferred regions set out in Figure 36 and in Table XII include, but are not limited to, regions of the aforementioned types identified by analysis of the amino acid sequence set out in Figure 34. As set out in Figure 36 and in Table XII, such preferred regions include Garnier-Robson alpha-regions, beta-regions, turn-regions, and coil-regions, Chou-Fasman alpha-regions, beta-regions, and turn-regions, Kyte-Doolittle hydrophilic regions and Hopp-Woods hydrophobic regions, Eisenberg alpha- and beta-amphipathic regions, Karplus-Schulz flexible regions, Jameson-Wolf regions of high antigenic index and Emini surface-forming regions.
Even if deletion of one or more amino acids from the N-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities, ability to multimerize, modulate cellular interaction, or signalling pathways, etc.) may still be retained. For example, the ability of shortened htag7 muteins to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptides generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the N-terminus. Whether a particular polypeptide lacking N-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art.
It is not unlikely that an htag7 mutein with a large number of deleted N-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six htag7 amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the amino terminus of the htag7 amino acid sequence shown in Figure 34, up to the proline residue at position number 191, and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues n1-196 of Figure 34, where n1 is an integer from 2 to 191 corresponding to the position of the amino acid residue in Figure 34 (which is identical to the sequence shown as SEQ ID NO:36). In another embodiment, N-terminal deletions of the htag7 polypeptide can be described by the general formula n2-196, where n2 is a number from 2 to 191, corresponding to the position of amino acid identified in Figure 34. N-terminal deletions of the htag7 polypeptide of the invention shown as SEQ ID NO:36 include polypeptides comprising the amino acid sequence of residues:
S-2 to P-196; R-3 to P-196; R-4 to P-196; S-5 to P-196; M-6 to P-196; L-7 to P-196; L-8 to P-196; A-9 to P-196; W-10 to P-196; A-11 to P-196; L-12 to P-196; P-13 to P-196;
S-14 to P-196; L-15 to P-196; L-16 to P-196; R-17 to P-196; L-18 to P-196; G-19 to P-196; A-20 to P-196; A-21 to P-196; Q-22 to P-196; E-23 to P-196; T-24 to P-196; E-25 to P-196;
D-26 to P-196; P-27 to P-196; A-28 to P-196; C-29 to P-196; C-30 to P-196; S-31 to P-196; P-32 to P-196; I-33 to P-196; V-34 to P-196; P-35 to P-196; R-36 to P-196; N-37 to P-196; E-38 to P-196; W-39 to P-196; K-40 to P-196; A-41 to P-196; L-42 to P-196; A-43 to P-196; S-44 to P-196; E-45 to P-196; C-46 to P-196; A-47 to P-196; Q-48 to P-196; H-49 to P-196; L-50 to P-196; S-51 to P-196; L-52 to P-196; P-53 to P-196; L-54 to P-196; R-55 to P-196; Y-56 to P-196; V-57 to P-196; V-58 to P-196; V-59 to P-196; S-60 to P-196; H-61 to P-196; T-62 to P-196; A-63 to P-196; G-64 to P-196; S-65 to P-196; S-66 to P-196; C-67 to P-196; N-68 to P-196; T-69 to P-196; P-70 to P-196; A-71 to P-196; S-72 to P-196; C-73 to P-196; Q-74 to P-196; Q-75 to P-196; Q-76 to P-196;
A-77 to P-196; R-78 to P-196; N-79 to P-196; V-80 to P-196; Q-81 to P-196; H-82 to P-196; Y-83 to P-196; H-84 to P-196; M-85 to P-196; K-86 to P-196; T-87 to P-196; L-88 to P-196; G-89 to P-196; W-90 to P-196; C-91 to P-196; D-92 to P-196; V-93 to P-196;
G-94 to P-196; Y-95 to P-196; N-96 to P-196; F-97 to P-196; L-98 to P-196; I-99 to P-196; G-100 to P-196; E-101 to P-196; D-102 to P-196; G-103 to P-196; L-104 to P-196;
V-105 to P-196; Y-106 to P-196; E-107 to P-196; G-108 to P-196; R-109 to P-196; G-110 to P-196; W-111 to P-196; N-112 to P-196; F-113 to P-196; T-114 to P-196;
G-115 to P-196; A-116 to P-196; H-117 to P-196; S-118 to P-196; G-119 to P-196; H-120 to P-196; L-121 to P-196; W-122 to P-196; N-123 to P-196; P-124 to P-196; M-125 to P-196;
S-126 to P-196; I-127 to P-196; G-128 to P-196; I-129 to P-196; S-130 to P-196; F-131 to P-196; M-132 to P-196; G-133 to P-196; N-134 to P-196; Y-135 to P-196; M-136 to P-196; D-137 to P-196; R-138 to P-196; V-139 to P-196; P-140 to P-196; T-141 to P-196; P-142 to P-196; Q-143 to P-196; A-144 to P-196; I-145 to P-196; R-146 to P-196;
A-147 to P-196; A-148 to P-196; Q-149 to P-196; G-150 to P-196; L-151 to P-196; L-152 to P-196; A-153 to P-196; C-154 to P-196; G-155 to P-196; V-156 to P-196;
A-157 to P-196; Q-158 to P-196; G-159 to P-196; A-160 to P-196; L-161 to P-196; R-162 to P-196; S-163 to P-196; N-164 to P-196; Y-165 to P-196; V-166 to P-196; L-167 to P-196;
K-168 to P-196; G-169 to P-196; H-170 to P-196; R-171 to P-196; D-172 to P-196; V-173 to P-196; Q-174 to P-196; R-175 to P-196; T-176 to P-196; L-177 to P-196;
S-178 to P-196; P-179 to P-196; G-180 to P-196; N-181 to P-196; Q-182 to P-196; L-183 to P-196; Y-184 to P-196; H-185 to P-196; L-186 to P-196; I-187 to P-196; Q-188 to P-196;
N-189 to P-196; W-190 to P-196; P-191 to P-196; of SEQ ID NO:36. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
Also as mentioned above, even if deletion of one or more amino acids from the C-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities) may still be retained. For example, the ability of the shortened htag7 mutein to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptide generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the C-terminus. Whether a particular polypeptide lacking C-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art. It is not unlikely that an htag7 mutein with a large number of deleted C-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six htag7 amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the carboxy terminus of the amino acid sequence of the htag7 polypeptide shown in Figure 34, up to the methionine residue at position number 6, and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues 1-m1 of Figure 1, where m1 is an integer from 6 to 196 corresponding to the position of the amino acid residue in Figure 34. Moreover, the invention provides polynucleotides encoding polypeptides comprising, or alternatively consisting of, the amino acid sequence of C-terminal deletions of the htag7 polypeptide of the invention shown as SEQ ID NO:36, which include polypeptides comprising the amino acid sequence of residues: M-1 to S-195; M-1 to R-194; M-1 to Y-193; M-1 to H-192; M-1 to P-191; M-1 to W-190; M-1 to N-189; M-1 to Q-188; M-1 to I-187; M-1 to L-186; M-1 to H-185; M-1 to Y-184; M-1 to L-183; M-1 to Q-182; M-1 to N-181; M-1 to G-180; M-1 to P-179; M-1 to S-178; M-1 to L-177; M-1 to T-176; M-1 to R-175; M-1 to Q-174; M-1 to V-173; M-1 to D-172; M-1 to R-171; M-1 to H-170; M-1 to G-169; M-1 to K-168; M-1 to L-167; M-1 to V-166; M-1 to Y-165; M-1 to N-164; M-1 to S-163; M-1 to R-162; M-1 to L-161; M-1 to A-160; M-1 to G-159;
M-1 to Q-158; M-1 to A-157; M-1 to V-156; M-1 to G-155; M-1 to C-154; M-1 to A-153; M-1 to L-152; M-1 to L-151; M-1 to G-150; M-1 to Q-149; M-1 to A-148; M-1 to A-147; M-1 to R-146; M-1 to I-145; M-1 to A-144; M-1 to Q-143; M-1 to P-142; M-1 to T-141; M-1 to P-140; M-1 to V-139; M-1 to R-138; M-1 to D-137; M-1 to M-136; M-1 to Y-135; M-1 to N-134; M-1 to G-133; M-1 to M-132; M-1 to F-131; M-1 to S-130; M-1 to I-129; M-1 to G-128; M-1 to I-127; M-1 to S-126; M-1 to M-125; M-1 to P-124; M-1 to N-123; M-1 to W-122; M-1 to L-121; M-1 to H-120; M-1 to G-119; M-1 to S-118;
M-1 to H-117; M-1 to A-116; M-1 to G-115; M-1 to T-114; M-1 to F-113; M-1 to N-112; M-1 to W-111; M-1 to G-110; M-1 to R-109; M-1 to G-108; M-1 to E-107; M-1 to Y-106; M-1 to V-105; M-1 to L-104; M-1 to G-103; M-1 to D-102; M-1 to E-101; M-1 to G-100; M-1 to I-99; M-1 to L-98; M-1 to F-97; M-1 to N-96; M-1 to Y-95; M-1 to G-94; M-1 to V-93; M-1 to D-92; M-1 to C-91; M-1 to W-90; M-1 to G-89; M-1 to L-88;
M-1 to T-87; M-1 to K-86; M-1 to M-85; M-1 to H-84; M-1 to Y-83; M-1 to H-82;
M-1 to Q-81; M-1 to V-80; M-1 to N-79; M-1 to R-78; M-1 to A-77; M-1 to Q-76; M-1 to Q-75; M-1 to Q-74; M-1 to C-73; M-1 to S-72; M-1 to A-71; M-1 to P-70; M-1 to T-69; M-1 to N-68; M-1 to C-67; M-1 to S-66; M-1 to S-65; M-1 to G-64; M-1 to A-63; M-1 to T-62; M-1 to H-61; M-1 to S-60; M-1 to V-59; M-1 to V-58; M-1 to V-57; M-1 to Y-56;
M-1 to R-55; M-1 to L-54; M-1 to P-53; M-1 to L-52; M-1 to S-51; M-1 to L-50;
M-1 to H-49; M-1 to Q-48; M-1 to A-47; M-1 to C-46; M-1 to E-45; M-1 to S-44; M-1 to A-43;
M-1 to L-42; M-1 to A-41; M-1 to K-40; M-1 to W-39; M-1 to E-38; M-1 to N-37;
M-1 to R-36; M-1 to P-35; M-1 to V-34; M-1 to I-33; M-1 to P-32; M-1 to S-31; M-1 to C-30; M-1 to C-29; M-1 to A-28; M-1 to P-27; M-1 to D-26; M-1 to E-25; M-1 to T-24; M-1 to E-23; M-1 to Q-22; M-1 to A-21; M-1 to A-20; M-1 to G-19; M-1 to L-18; M-1 to R-17; M-1 to L-16; M-1 to L-15; M-1 to S-14; M-1 to P-13; M-1 to L-12; M-1 to A-11;
M-1 to W-10; M-1 to A-9; M-1 to L-8; M-1 to L-7; M-1 to M-6; of SEQ ID NO:36.
Polypeptides encoded by these polynucleotides are also encompassed by the invention.
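The N-terminal and C-terminal deletion series recited above follow a simple rule: every truncation is listed that leaves at least six residues. The following is a minimal illustrative sketch of that rule, not part of the original disclosure. The function names and the `min_len` parameter are hypothetical, and the ten-residue input is only the first ten htag7 residues as they appear in the lists above (M-1 through W-10), used as a stand-in for the full 196-residue sequence of Figure 34.

```python
def n_terminal_deletions(seq: str, min_len: int = 6) -> list[str]:
    """Label each N-terminal deletion that leaves at least min_len residues."""
    last = len(seq)
    c_label = f"{seq[-1]}-{last}"
    # n is the 1-based position of the first residue that is kept.
    return [f"{seq[n - 1]}-{n} to {c_label}" for n in range(2, last - min_len + 2)]


def c_terminal_deletions(seq: str, min_len: int = 6) -> list[str]:
    """Label each C-terminal deletion that leaves at least min_len residues."""
    n_label = f"{seq[0]}-1"
    # m is the 1-based position of the last residue that is kept.
    return [f"{n_label} to {seq[m - 1]}-{m}" for m in range(len(seq) - 1, min_len - 1, -1)]


if __name__ == "__main__":
    toy = "MSRRSMLLAW"  # first ten htag7 residues, taken from the lists above
    print("; ".join(n_terminal_deletions(toy)))
    print("; ".join(c_terminal_deletions(toy)))
```

For a 196-residue input the first function yields 190 labels (n from 2 to 191) and the second yields 190 labels (m from 195 down to 6), which matches the ranges stated in the text.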
In addition, the invention provides nucleic acid molecules having nucleotide sequences related to extensive portions of SEQ ID NO:36 which have been determined from the following related cDNA genes: HBMTB79R (SEQ ID NO:115) and HCDDP40R (SEQ ID NO:116).
Based on the sequence similarity to the mouse tag7 and the PGRP-like domain, the translation product of this gene is expected to share at least some biological activities with tag7 proteins, and specifically cytokine modulatory proteins. Such activities are known in the art, some of which are described elsewhere herein. Specifically, polynucleotides and polypeptides of the invention are also useful for modulating the differentiation of normal and malignant cells, modulating the proliferation and/or differentiation of cancer and neoplastic cells, and modulating the immune response.
Polynucleotides and polypeptides of the invention may represent a diagnostic marker for hematopoietic and immune diseases and/or disorders. The full-length protein should be a secreted protein, based upon homology to the tag7 protein. Therefore, it is secreted into serum, urine, or feces, and thus its levels are assayable from patient samples.
Assuming specific expression levels are reflective of the presence of immune disorders, this protein would provide a convenient diagnostic for early detection. In addition, expression of this gene product may also be linked to the progression of immune diseases, and therefore may itself actually represent a therapeutic or therapeutic target for the treatment of cancer.
Polynucleotides and polypeptides of the invention may play an important role in the pathogenesis of human cancers and cellular transformation, particularly those of the immune and hematopoietic systems. Polynucleotides and polypeptides of the invention may also be involved in the pathogenesis of developmental abnormalities based upon its potential effects on proliferation and differentiation of cells and tissue cell types. Due to the potential proliferating and differentiating activity of said polynucleotides and polypeptides, the invention is useful as a therapeutic agent in inducing tissue regeneration, for treating inflammatory conditions (e.g., inflammatory bowel syndrome, diverticulitis, etc.).
Moreover, the invention is useful in modulating the immune response to aberrant polypeptides, as may exist in rapidly proliferating cells and tissue cell types, particularly in adenocarcinoma cells, and other cancers. The translation product of this gene shares sequence homology with Tag7, which is a mouse cytokine that, in soluble form, triggers apoptosis in mouse L929 cells in vitro.
The translation product of this gene also shares sequence homology with antimicrobial BGP-A, a bovine antimicrobial peptide from bovine neutrophils.
Preferred polypeptides of this invention comprise residues 184 to 196 shown in SEQ ID
NO: 36.
This polypeptide is believed to be the active mature form of the translation product of this gene.
This gene is expressed primarily in bone marrow and to a lesser extent in human chondrosarcoma and neutrophils.
Therefore, polynucleotides and polypeptides of the invention are useful as reagents for differential identification of the tissue(s) or cell type(s) present in a biological sample and for diagnosis of diseases and conditions which include, but are not limited to, infections, cancer, and disorders of the immune system. Similarly, polypeptides and antibodies directed to these polypeptides are useful in providing immunological probes for differential identification of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, particularly of infected tissues and the immune system, expression of this gene at significantly higher or lower levels is routinely detected in certain tissues or cell types (e.g., immune, hematopoietic, and cancerous and wounded tissues) or bodily fluids (e.g., lymph, serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample taken from an individual having such a disorder, relative to the standard gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an individual not having the disorder.
Preferred polypeptides of the present invention comprise immunogenic epitopes shown in SEQ ID NO: 36 as residues: Ala-63 to Asn-68, Ala-71 to Gln-81, Tyr-135 to Thr-141, Leu-167 to Gln-174, Pro-191 to Pro-196. Polynucleotides encoding said polypeptides are also provided.
FEATURES OF PROTEIN ENCODED BY GENE NO: 9
This invention relates to newly identified polynucleotides, polypeptides encoded by such polynucleotides, the use of such polynucleotides and polypeptides, as well as the production of such polynucleotides and polypeptides. The polypeptide of the present invention has been putatively identified as a human butyrophilin homolog derived from a human testes tumor cDNA library. The polypeptide of the present invention is sometimes hereafter referred to as "Butyrophilin and B7-like IgG superfamily receptor", and/or "BBIR II". The invention also relates to inhibiting the action of such polypeptides.
Butyrophilin is a glycoprotein of the immunoglobulin superfamily that is secreted in association with the milk-fat-globule membrane from mammary epithelial cells. The butyrophilin gene appears to have evolved from a subset of genes in the immunoglobulin superfamily and genes encoding the B30.2 domain, which is conserved in a family of zinc-finger proteins. Furthermore, expression analysis of butyrophilin genes has shown that butyrophilin expression increases during lactation in conjunction with an increase in milk fat content. These results suggest that the stage-specific expression of milk fat globule membrane glycoproteins in mammary epithelial cells is regulated in a similar but not necessarily identical mechanism to that of a major milk protein, beta-casein.
The polypeptide of the present invention has been putatively identified as a member of the milk fat globule membrane glycoprotein family, and more particularly the butyrophilin family, and has been termed Butyrophilin and B7-like IgG superfamily receptor ("BBIR II"). This identification has been made as a result of amino acid sequence homology to the bovine butyrophilin precursor (See Genbank Accession No. gi|162773).
Preferred polynucleotides of the invention comprise the following nucleic acid sequence:
ACATCCATGGCTCTAATGCTCAGTTTGGTTCTGAGTCTCCTCAAGCTGGGATC
AGGGCAGTGGCAGGTGTTTGGGCCAGACAAGCCTGTCCAGGCCTTGGTGGGG
GAGGACGCAGCATTCTCCTGTTTCCTGTCTCCTAAGACCAATGCAGAGGCCA
TGGAAGTGCGGTTCTTCAGGGGCCAGTTCTCTAGCGTGGTCCACCTCTACAG
GGACGGGAAGGACCAGCCATTTATGCAGATGCCACAGTATCAAGGCAGGAC
AAAACTGGTGAAGGATTCTATTGCGGAGGGGCGCATCTCTCTGAGGCTGGAA
AACATTACTGTGTTGGATGCTGGCCTCTATGGGTGCAGGATTAGTTCCCAGTC
TTACTACCAGAAGGCCATCTGGGAGCTACAGGTGTCAGCACTGGGCTCAGTT
CCTCTCATTTCCATCACGGGATATGTTGATAGAGACATCCAGCTACTCTGTCA
GTCCTCGGGCTGGTTCCCCCGGCCCACAGCGAAGTGGAAAGGTCCACAAGGA
CAGGATTTGTCCACAGACTCCAGGACAAACAGAGACATGCATGGCCTGTTTG
ATGTGGAGATCTCTCTGACCGTCCAAGAGAACGCCGGGAGCATATCCTGTTC
CATGCGGCATGCTCATCTGAGCCGAGAGGTGGAATCCAGGGTACAGATAGG
AGATACCTTTTTCGAGCCTATATCGTGGCACCTGGCTACCAAAGTACTGGGA
ATACTCTGCTGTGGCCTATTTTTTGGCATTGTTGGACTGAAGATTTTCTTCTCC
AAATTCCAGTGGAAAATCCAGGCGGAACTGGACTGGAGAAGAAAGCACGGA
CAGGCAGAATTGAGAGACGCCCGGAAACACGCAGTGGAGGTGACTCTGGAT
CCAGAGACGGCTCACCCGAAGCTCTGCGTTTCTGATCTGAAAACTGTAACCC
ATAGAAAAGCTCCCCAGGAGGTGCCTCACTCTGAGAAGAGATTTACAAGGA
AGAGTGTGGTGGCTTCTCAGAGTTTCCAAGCAGGGAAACATTACTGGGAGGT
GGACGGAGGACACAATAAAAGGTGGCGCGTGGGAGTGTGCCGGGATGATGT
GGACAGGAGGAAGGAGTACGTGACTTTGTCTCCCGATCATGGGTACTGGGTC
CTCAGACTGAATGGAGAACATTTGTATTTCACATTAAATCCCCGTTTTATCAG
CGTCTTCCCCAGGACCCCACCTACAAAAATAGGGGTCTTCCTGGACTATGAG
TGTGGGACCATCTCCTTCTTCAACATAAATGACCAGTCCCTTATTTATACCCT
GACATGTCGGTTTGAAGGCTTATTGAGGCCCTACATTGAGTATCCGTCCTATA
ATGAGCAAAATGGAACTCCCAGAGACAAGCAACAGTGAGTCCTCCTCACAG
GCAACCACGCCCTTCCTCCCCAGGGGTGAAATGTAGGATGAATCACATCCCA
CATTCTTCTTTAGGGATATTAAGGTCTCTCTCCCAGATCCAAAGTCCCGCAGC
AGCCGGCCAAGGTGGCTTCCAGATGAAGGGGGACTGGCCTGTCCACATGGG
AGTCAGGTGTCATGGCTGCCCTGAGCTGGGAGGGAAGAAGGCTGACATTAC
ATTTAGTTTGCTCTCACTCCATCTGGCTAAGTGATCTTGAAATACCACCTCTC
AGGTGAAGAACCGTCAGGAATTCCCATCTCACAGGCTGTGGTGTAGATTAAG
TAGACAAGGAATGTGAATAATGCTTAGATCTTATTGATGACAGAGTGTATCC
TAATGGTTTGTTCATTATATTACACTTTCAGTAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAA (SEQ ID NO:117), and/or
ATGGCTCTAATGCTCAGTTTGGTTCTGAGTCTCCTCAAGCTGGGATCAGGGCA
GTGGCAGGTGTTTGGGCCAGACAAGCCTGTCCAGGCCTTGGTGGGGGAGGAC
GCAGCATTCTCCTGTTTCCTGTCTCCTAAGACCAATGCAGAGGCCATGGAAG
TGCGGTTCTTCAGGGGCCAGTTCTCTAGCGTGGTCCACCTCTACAGGGACGG
GAAGGACCAGCCATTTATGCAGATGCCACAGTATCAAGGCAGGACAAAACT
GGTGAAGGATTCTATTGCGGAGGGGCGCATCTCTCTGAGGCTGGAAAACATT
ACTGTGTTGGATGCTGGCCTCTATGGGTGCAGGATTAGTTCCCAGTCTTACTA
CCAGAAGGCCATCTGGGAGCTACAGGTGTCAGCACTGGGCTCAGTTCCTCTC
ATTTCCATCACGGGATATGTTGATAGAGACATCCAGCTACTCTGTCAGTCCTC
GGGCTGGTTCCCCCGGCCCACAGCGAAGTGGAAAGGTCCACAAGGACAGGA
TTTGTCCACAGACTCCAGGACAAACAGAGACATGCATGGCCTGTTTGATGTG
GAGATCTCTCTGACCGTCCAAGAGAACGCCGGGAGCATATCCTGTTCCATGC
GGCATGCTCATCTGAGCCGAGAGGTGGAATCCAGGGTACAGATAGGAGATA
CCTTTTTCGAGCCTATATCGTGGCACCTGGCTACCAAAGTACTGGGAATACTC
TGCTGTGGCCTATTTTTTGGCATTGTTGGACTGAAGATTTTCTTCTCCAAATTC
CAGTGGAAAATCCAGGCGGAACTGGACTGGAGAAGAAAGCACGGACAGGCA
GAATTGAGAGACGCCCGGAAACACGCAGTGGAGGTGACTCTGGATCCAGAG
ACGGCTCACCCGAAGCTCTGCGTTTCTGATCTGAAAACTGTAACCCATAGAA
AAGCTCCCCAGGAGGTGCCTCACTCTGAGAAGAGATTTACAAGGAAGAGTGT
GGTGGCTTCTCAGAGTTTCCAAGCAGGGAAACATTACTGGGAGGTGGACGGA
GGACACAATAAAAGGTGGCGCGTGGGAGTGTGCCGGGATGATGTGGACAGG
AGGAAGGAGTACGTGACTTTGTCTCCCGATCATGGGTACTGGGTCCTCAGAC
TGAATGGAGAACATTTGTATTTCACATTAAATCCCCGTTTTATCAGCGTCTTC
CCCAGGACCCCACCTACAAAAATAGGGGTCTTCCTGGACTATGAGTGTGGGA
CCATCTCCTTCTTCAACATAAATGACCAGTCCCTTATTTATACCCTGACATGT
CGGTTTGAAGGCTTATTGAGGCCCTACATTGAGTATCCGTCCTATAATGAGC
AAAATGGAACTCCCAGAGACAAGCAACAGTGA (SEQ ID NO:118). Polypeptides encoded by these polynucleotides are also provided.
Preferred polypeptides of the invention comprise the following amino acid sequence:
MALMLSLVLSLLKLGSGQWQVFGPDKPVQALVGEDAAFSCFLSPKTNAEAMEV
RFFRGQFSSVVHLYRDGKDQPFMQMPQYQGRTKLVKDSIAEGRISLRLENITVL
DAGLYGCRISSQSYYQKAIWELQVSALGSVPLISITGYVDRDIQLLCQSSGWFPRP
TAKWKGPQGQDLSTDSRTNRDMHGLFDVEISLTVQENAGSISCSMRHAHLSREV
ESRVQIGDTFFEPISWHLATKVLGILCCGLFFGIVGLKIFFSKFQWKIQAELDWRR
KHGQAELRDARKHAVEVTLDPETAHPKLCVSDLKTVTHRKAPQEVPHSEKRFT
RKSVVASQSFQAGKHYWEVDGGHNKRWRVGVCRDDVDRRKEYVTLSPDHGY
WVLRLNGEHLYFTLNPRFISVFPRTPPTKIGVFLDYECGTISFFNINDQSLIYTLTC
RFEGLLRPYIEYPSYNEQNGTPRDKQQ (SEQ ID NO:119). Polynucleotides encoding these polypeptides are also provided.
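The relationship between the open reading frame of SEQ ID NO:118 and the amino acid sequence of SEQ ID NO:119 can be checked by translating the nucleotides with the standard genetic code. The following is a minimal illustrative sketch, not part of the original disclosure: the `translate` function and the `orf_start` variable are hypothetical names, and only the first 54 nucleotides of SEQ ID NO:118 are used.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding strand in frame 1, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    # First 54 nucleotides of SEQ ID NO:118 as given above.
    orf_start = "ATGGCTCTAATGCTCAGTTTGGTTCTGAGTCTCCTCAAGCTGGGATCAGGGCAG"
    print(translate(orf_start))  # prints MALMLSLVLSLLKLGSGQ
```

The 18 residues printed agree with the first 18 residues of SEQ ID NO:119. The sketch handles only the four unambiguous bases and would raise a `KeyError` on any other character.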
A preferred polynucleotide splice variant of the invention comprises the following nucleic acid sequence:
ACCTTTTTCGAGCCTATATCGTGGCACCTGGCTACCAAAGTACTGGGAATACT
CTGCTGTGGCCTATTTTTTGGCATTGTTGGACTGAAGATTTTCTTCTCCAAATT
CCAGTGGAAAATCCAGGCGGAACTGGACTGGAGAAGAAAGCACGGACAGGC
AGAATTGAGAGACGCCCGGAAACACGCAGTGGAGGTGACTCTGGATCCAGA
GACGGCTCACCCGAAGCTCTGCGTTTCTGATCTGAAAACTGTAACCCATAGA
AAAGCTCCCCAGGAGGTGCCTCACTCTGAGAAGAGATTTACAAGGAAGAGT
GTGGTGGCTTCTCAGAGTTTCCAAGCAGGGAAACATTACTGGGAGGTGGACG
GAGGACACAATAAAAGGTGGCGCGTGGGAGTGTGCCGGGATGATGTGGACA
GGAGGAAGGAGTACGTGACTTTGTCTCCCGATCATGGGTACTGGGTCCTCAG
ACTGAATGGAGAACATTTGTATTTCACATTAAATCCCCGTTTTATCAGCGTCT
TCCCCAGGACCCCACCTACAAAAATAGGGGTCTTCCTGGACTATGAGTGTGG
GACCATCTCCTTCTTCAACATAAATGACCAGTCCCTTATTTATACCCTGACAT
GTCGGTTTGAAGGCTTATTGAGGCCCTACATTGAGTATCCGTCCTATAATGAG
CAAAATGGAACTCCCAGAGACAAGCAACAGTGAGTCCTCCTCACAGGCAAC
CACGCCCTTCCTCCCCAGGGGTGAAATGTAGGATGAATCACATCCCACATTC
TTCTTTAGGGATATTAAGGTCTCTCTCCCAGATCCAAAGTCCCGCAGCAGCCG
GCCAAGGTGGCTTCCAGATGAAGGGGGACTGGCCTGTCCACATGGGAGTCA
GGTGTCATGGCTGCCCTGAGCTGGGAGGGAAGAAGGCTGACATTACATTTAG
TTTGCTCTCACTCCATCTGGCTAAGTGATCTTGAAATACCACCTCTCAGGTGA
AGAACCGTCAGGAATTCCCATCTCACAGGCTGTGGTGTAGATTAAGTAGACA
AGGAATGTGAATAATGCTTAGATCTTATTGATGACAGAGTGTATCCTAATGG
TTTGTTCATTATATTACACTTTCAGTAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAA (SEQ ID NO:120). Polypeptides encoded by these polynucleotides are also provided.
Figures 22A-D show the nucleotide (SEQ ID NO:19) and deduced amino acid sequence (SEQ ID NO:37) of BBIR II. Predicted amino acids from about 1 to about 17 constitute the predicted signal peptide (amino acid residues from about 1 to about 17 in SEQ ID NO:37) and are represented by the underlined amino acid regions.
Figure 23 shows the regions of similarity between the amino acid sequences of the Butyrophilin and B7-like IgG superfamily receptor (BBIR II) protein (SEQ ID NO:37) and the bovine butyrophilin precursor (SEQ ID NO:121).
Figure 24 shows an analysis of the integrin alpha 11 subunit (BBIR II) amino acid sequence. Alpha, beta, turn and coil regions; hydrophilicity and hydrophobicity; amphipathic regions; flexible regions; antigenic index and surface probability are shown.
A polynucleotide encoding a polypeptide of the present invention is obtained from human small intestine, colon tumor, and human testes tumor cells and tissues. The polynucleotide of this invention was discovered in a human testes tumor cDNA
library.
Its translation product has homology to the B30.2-like domain which is characteristic of proteins containing zinc-binding B-box motifs, and particularly for butyrophilin family members. The polynucleotide contains an open reading frame encoding the BBIR II polypeptide of 318 amino acids. BBIR II exhibits a high degree of homology at the amino acid level to the bovine butyrophilin precursor (as shown in Figure 23). The present invention provides isolated nucleic acid molecules comprising a polynucleotide encoding the BBIR II polypeptide having the amino acid sequence shown in Figures 22A-D (SEQ ID NO:37). The nucleotide sequence shown in Figures 22A-D (SEQ ID NO:19) was obtained by sequencing a cloned cDNA (HTTDB46), which was deposited on November 17 at the American Type Culture Collection, and given Accession Number 203484.
The present invention is further directed to fragments of the isolated nucleic acid molecules described herein. By a fragment of an isolated DNA molecule having the nucleotide sequence of the deposited cDNA or the nucleotide sequence shown in SEQ ID NO:19 is intended DNA fragments at least about 15 nt, and more preferably at least about 20 nt, still more preferably at least about 30 nt, and even more preferably, at least about 40 nt in length which are useful as diagnostic probes and primers as discussed herein. Of course, larger fragments 50-1500 nt in length are also useful according to the present invention, as are fragments corresponding to most, if not all, of the nucleotide sequence of the deposited cDNA or as shown in SEQ ID NO:19. By a fragment at least 20 nt in length, for example, is intended fragments which include 20 or more contiguous bases from the nucleotide sequence of the deposited cDNA or the nucleotide sequence as shown in SEQ ID NO:19. In this context "about" includes the particularly recited size, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini. Representative examples of BBIR II polynucleotide fragments of the invention include, for example, fragments that comprise, or alternatively, consist of, a sequence from about nucleotide 1 to about 50, from about 51 to about 100, from about 101 to about 150, from about 151 to about 200, from about 201 to about 250, from about 251 to about 300, from about 301 to about 350, from about 351 to about 400, from about 401 to about 450, from about 451 to about 500, from about 501 to about 550, from about 551 to about 600, from about 601 to about 650, from about 651 to about 700, from about 701 to about 750, from about 751 to about 800, from about 801 to about 850, from about 851 to about 900, from about 901 to about 950, from about 951 to about 1000, from about 1001 to about 1050, from about 1051 to about 1100, from about 1101 to about 1150, from about 1151 to about 1200, from about 1201 to about 1250, from about 1251 to about 1300, from about 1301 to about 1350, from about 1351 to about 1400, from about 1401 to about 1450, from about 1451 to about 1500, from about 1501 to about 1550, from about 1551 to about 1600, from about 1601 to about 1650, from about 1651 to about 1700, from about 1701 to about 1750, from about 1751 to about 1800, from about 1801 to about 1850, from about 1851 to about 1900, from about 1901 to about 1950, from about 1951 to about 2000, from about 2001 to about 2050, from about 2051 to about 2100, from about 2101 to about 2150, from about 2151 to about 2200, from about 2201 to about 2250, from about 2251 to about 2300, from about 2301 to about 2350, from about 2351 to about 2400, from about 2401 to about 2450, from about 2451 to about 2500, from about 2501 to about 2550, from about 2551 to about 2600, from about 2601 to about 2650, from about 2651 to about 2700, from about 2701 to about 2750, from about 2751 to about 2800, from about 2801 to about 2850, from about 2851 to about 2900, from about 2901 to about 2950, from about 2951 to about 3000, from about 3001 to about 3050, from about 3051 to about 3059 of SEQ ID NO:19, or the complementary strand thereto, or the cDNA contained in the deposited gene. In this context "about" includes the particularly recited ranges, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini.
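The representative fragment ranges recited above are consecutive 50-nucleotide windows over the 3059 nucleotides of SEQ ID NO:19, with a shorter final window. The following is a minimal illustrative sketch of how such ranges can be enumerated, not part of the original disclosure. The function name `fragment_windows` and its parameters are hypothetical, and the sketch does not model the tolerance of several nucleotides that "about" allows at either terminus.

```python
def fragment_windows(total_length: int, size: int = 50) -> list[tuple[int, int]]:
    """Return consecutive 1-based, inclusive (start, end) windows of at most `size`."""
    return [
        (start, min(start + size - 1, total_length))
        for start in range(1, total_length + 1, size)
    ]


if __name__ == "__main__":
    windows = fragment_windows(3059)
    print(windows[0], windows[-1], len(windows))  # prints (1, 50) (3051, 3059) 62
```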
Preferred nucleic acid fragments of the present invention include nucleic acid molecules encoding a member selected from the group: a polypeptide comprising or alternatively, consisting of, the mature BBIR II protein (amino acid residues from about 18 to about 318 in Figures 22A-D (amino acids from about 18 to about 318 in SEQ ID NO:37)). Since the location of this form of the protein has been predicted by computer analysis, one of ordinary skill would appreciate that the amino acid residues constituting these domains may vary slightly (e.g., by about 1 to 15 amino acid residues) depending on the criteria used to define this location. In additional embodiments, the polynucleotides of the invention encode functional attributes of BBIR II.
Preferred embodiments of the invention in this regard include fragments that comprise alpha-helix and alpha-helix forming regions ("alpha-regions"), beta-sheet and beta-sheet forming regions ("beta-regions"), turn and turn-forming regions ("turn-regions"), coil and coil-forming regions ("coil-regions"), hydrophilic regions, hydrophobic regions, alpha amphipathic regions, beta amphipathic regions, flexible regions, surface-forming regions and high antigenic index regions of BBIR II.
The data representing the structural or functional attributes of BBIR II set forth in Figure 24 and/or Table VIII, as described above, was generated using the various modules and algorithms of the DNA*STAR set on default parameters. In a preferred embodiment, the data presented in columns VIII, IX, XIII, and XIV of Table VIII can be used to determine regions of BBIR II which exhibit a high degree of potential for antigenicity.
Regions of high antigenicity are determined from the data presented in columns VIII, IX, XIII, and/or XIV by choosing values which represent regions of the polypeptide which are likely to be exposed on the surface of the polypeptide in an environment in which antigen recognition may occur in the process of initiation of an immune response.
Certain preferred regions in these regards are set out in Figure 24, but may, as shown in Table VIII, be represented or identified by using tabular representations of the data presented in Figure 24. The DNA*STAR computer algorithm used to generate Figure 24 (set on the original default parameters) was used to present the data in Figure 24 in a tabular format (See Table VIII). The tabular format of the data in Figure 24 is used to easily determine specific boundaries of a preferred region. The above-mentioned preferred regions set out in Figure 24 and in Table VIII include, but are not limited to, regions of the aforementioned types identified by analysis of the amino acid sequence set out in Figures 22A-D. As set out in Figure 24 and in Table VIII, such preferred regions include Garnier-Robson alpha-regions, beta-regions, turn-regions, and coil-regions, Chou-Fasman alpha-regions, beta-regions, and turn-regions, Kyte-Doolittle hydrophilic regions and Hopp-Woods hydrophobic regions, Eisenberg alpha- and beta-amphipathic regions, Karplus-Schulz flexible regions, Jameson-Wolf regions of high antigenic index and Emini surface-forming regions.
Even if deletion of one or more amino acids from the N-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities, ability to multimerize, etc.) may still be retained. For example, the ability of shortened BBIR II muteins to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptides generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the N-terminus.
Whether a particular polypeptide lacking N-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art. It is not unlikely that a BBIR II mutein with a large number of deleted N-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six BBIR II amino acid residues may often evoke an immune response.
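Several of the region types named above, such as Kyte-Doolittle hydrophilic regions, are derived from per-residue scores averaged over a sliding window. The following is a minimal illustrative sketch of such a profile, not part of the original disclosure and not the DNA*STAR implementation. The scale values are the published Kyte-Doolittle hydropathy values; the function names and the window length of 7 are assumptions chosen for illustration; and the input is the first 40 residues of SEQ ID NO:119 as given above.

```python
# Kyte-Doolittle hydropathy scale (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 7) -> list[float]:
    """Mean hydropathy of each window; entry i covers residues i+1 .. i+window."""
    scores = [KYTE_DOOLITTLE[residue] for residue in seq]
    return [sum(scores[i:i + window]) / window for i in range(len(scores) - window + 1)]


def most_hydrophilic_window(seq: str, window: int = 7) -> tuple[int, int, float]:
    """Return (start, end, mean score) of the lowest-scoring window, 1-based inclusive."""
    profile = hydropathy_profile(seq, window)
    i = min(range(len(profile)), key=profile.__getitem__)
    return i + 1, i + window, profile[i]


if __name__ == "__main__":
    # First 40 residues of SEQ ID NO:119 as given above.
    n_term = "MALMLSLVLSLLKLGSGQWQVFGPDKPVQALVGEDAAFSC"
    start, end, score = most_hydrophilic_window(n_term)
    print(f"most hydrophilic 7-residue window: {start}-{end} (mean {score:.2f})")
```

A low mean hydropathy is only one indicator of likely surface exposure. The analysis described in the text combines several such columns, and a sketch like this covers a single one.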
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the amino terminus of the BBIR II amino acid sequence shown in Figures 22A-D, up to the cysteine residue at position number 313, and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues n1-318 of Figures 22A-D, where n1 is an integer from 2 to 313 corresponding to the position of the amino acid residue in Figures 22A-D (which is identical to the sequence shown as SEQ ID NO:37).
In another embodiment, N-terminal deletions of the BBIR II polypeptide can be described by the general formula n2-318, where n2 is a number from 2 to 313, corresponding to the position of the amino acid identified in Figures 22A-D. N-terminal deletions of the BBIR II polypeptide of the invention shown as SEQ ID NO:37 include polypeptides comprising the amino acid sequence of residues:
A-2 to T-318; L-3 to T-318; M-4 to T-318; L-5 to T-318; S-6 to T-318; L-7 to T-318;
V-8 to T-318; L-9 to T-318; S-10 to T-318; L-11 to T-318; L-12 to T-318; K-13 to T-318;
L-14 to T-318; G-15 to T-318; S-16 to T-318; G-17 to T-318; Q-18 to T-318; W-19 to T-318;
Q-20 to T-318; V-21 to T-318; F-22 to T-318; G-23 to T-318; P-24 to T-318; D-25 to T-318;
K-26 to T-318; P-27 to T-318; V-28 to T-318; Q-29 to T-318; A-30 to T-318; L-31 to T-318;
V-32 to T-318; G-33 to T-318; E-34 to T-318; D-35 to T-318; A-36 to T-318; A-37 to T-318;
F-38 to T-318; S-39 to T-318; C-40 to T-318; F-41 to T-318; L-42 to T-318; S-43 to T-318;
P-44 to T-318; K-45 to T-318; T-46 to T-318; N-47 to T-318; A-48 to T-318; E-49 to T-318;
A-50 to T-318; M-51 to T-318; E-52 to T-318; V-53 to T-318; R-54 to T-318; F-55 to T-318;
F-56 to T-318; R-57 to T-318; G-58 to T-318; Q-59 to T-318; F-60 to T-318; S-61 to T-318;
S-62 to T-318; V-63 to T-318; V-64 to T-318; H-65 to T-318; L-66 to T-318; Y-67 to T-318;
R-68 to T-318; D-69 to T-318; G-70 to T-318; K-71 to T-318; D-72 to T-318; Q-73 to T-318;
P-74 to T-318; F-75 to T-318; M-76 to T-318; Q-77 to T-318; M-78 to T-318; P-79 to T-318;
Q-80 to T-318; Y-81 to T-318; Q-82 to T-318; G-83 to T-318; R-84 to T-318; T-85 to T-318;
K-86 to T-318; L-87 to T-318; V-88 to T-318; K-89 to T-318; D-90 to T-318; S-91 to T-318;
I-92 to T-318; A-93 to T-318; E-94 to T-318; G-95 to T-318; R-96 to T-318; I-97 to T-318;
S-98 to T-318; L-99 to T-318; R-100 to T-318; L-101 to T-318; E-102 to T-318; N-103 to T-318;
I-104 to T-318; T-105 to T-318; V-106 to T-318; L-107 to T-318; D-108 to T-318; A-109 to T-318;
G-110 to T-318; L-111 to T-318; Y-112 to T-318; G-113 to T-318; C-114 to T-318; R-115 to T-318;
I-116 to T-318; S-117 to T-318; S-118 to T-318; Q-119 to T-318; S-120 to T-318; Y-121 to T-318;
Y-122 to T-318; Q-123 to T-318; K-124 to T-318; A-125 to T-318; I-126 to T-318; W-127 to T-318;
E-128 to T-318; L-129 to T-318; Q-130 to T-318; V-131 to T-318; S-132 to T-318; A-133 to T-318;
L-134 to T-318; G-135 to T-318; S-136 to T-318; V-137 to T-318; P-138 to T-318; L-139 to T-318;
I-140 to T-318; S-141 to T-318; I-142 to T-318; A-143 to T-318; G-144 to T-318; Y-145 to T-318;
V-146 to T-318; D-147 to T-318; R-148 to T-318; D-149 to T-318; I-150 to T-318; Q-151 to T-318;
L-152 to T-318; L-153 to T-318; C-154 to T-318; Q-155 to T-318; S-156 to T-318; S-157 to T-318;
G-158 to T-318; W-159 to T-318; F-160 to T-318; P-161 to T-318; R-162 to T-318; P-163 to T-318;
T-164 to T-318; A-165 to T-318; K-166 to T-318; W-167 to T-318; K-168 to T-318; G-169 to T-318;
P-170 to T-318; Q-171 to T-318; G-172 to T-318; Q-173 to T-318; D-174 to T-318; L-175 to T-318;
S-176 to T-318; T-177 to T-318; D-178 to T-318; S-179 to T-318; R-180 to T-318; T-181 to T-318;
N-182 to T-318; R-183 to T-318; D-184 to T-318; M-185 to T-318; H-186 to T-318; G-187 to T-318;
L-188 to T-318; F-189 to T-318; D-190 to T-318; V-191 to T-318; E-192 to T-318; I-193 to T-318;
S-194 to T-318; L-195 to T-318; T-196 to T-318; V-197 to T-318; Q-198 to T-318; E-199 to T-318;
N-200 to T-318; A-201 to T-318; G-202 to T-318; S-203 to T-318; I-204 to T-318; S-205 to T-318;
C-206 to T-318; S-207 to T-318; M-208 to T-318; R-209 to T-318; H-210 to T-318; A-211 to T-318;
H-212 to T-318; L-213 to T-318; S-214 to T-318; R-215 to T-318; E-216 to T-318; V-217 to T-318;
E-218 to T-318; S-219 to T-318; R-220 to T-318; V-221 to T-318; Q-222 to T-318; I-223 to T-318;
G-224 to T-318; D-225 to T-318; W-226 to T-318; R-227 to T-318; R-228 to T-318; K-229 to T-318;
H-230 to T-318; G-231 to T-318; Q-232 to T-318; A-233 to T-318; G-234 to T-318; K-235 to T-318;
R-236 to T-318; K-237 to T-318; Y-238 to T-318; S-239 to T-318; S-240 to T-318; S-241 to T-318;
H-242 to T-318; I-243 to T-318; Y-244 to T-318; D-245 to T-318; S-246 to T-318; F-247 to T-318;
P-248 to T-318; S-249 to T-318; L-250 to T-318; S-251 to T-318; F-252 to T-318; M-253 to T-318;
D-254 to T-318; F-255 to T-318; Y-256 to T-318; I-257 to T-318; L-258 to T-318; R-259 to T-318;
P-260 to T-318; V-261 to T-318; G-262 to T-318; P-263 to T-318; C-264 to T-318; R-265 to T-318;
A-266 to T-318; K-267 to T-318; L-268 to T-318; V-269 to T-318; M-270 to T-318; G-271 to T-318;
T-272 to T-318; L-273 to T-318; K-274 to T-318; L-275 to T-318; Q-276 to T-318; I-277 to T-318;
L-278 to T-318; G-279 to T-318; E-280 to T-318; V-281 to T-318; H-282 to T-318; F-283 to T-318;
V-284 to T-318; E-285 to T-318; K-286 to T-318; P-287 to T-318; H-288 to T-318; S-289 to T-318;
L-290 to T-318; L-291 to T-318; Q-292 to T-318; I-293 to T-318; S-294 to T-318; G-295 to T-318;
G-296 to T-318; S-297 to T-318; T-298 to T-318; T-299 to T-318; L-300 to T-318; K-301 to T-318;
K-302 to T-318; G-303 to T-318; P-304 to T-318; N-305 to T-318; P-306 to T-318; W-307 to T-318;
S-308 to T-318; F-309 to T-318; P-310 to T-318; S-311 to T-318; P-312 to T-318; C-313 to T-318;
of SEQ ID NO:37. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
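The general formula n2-318 can be enumerated mechanically. The following is a minimal sketch, assuming a ten-residue toy peptide (the first ten residues recited in the listing above) in place of the full sequence of SEQ ID NO:37; the function name `n_terminal_deletions` and the `min_len` parameter are hypothetical and are introduced only for illustration.

```python
def n_terminal_deletions(sequence: str, min_len: int = 6) -> list[str]:
    """Label every N-terminal deletion that leaves at least min_len residues.

    Positions are 1-based, as in the specification. The first fragment
    starts at residue 2 and the last starts at len(sequence) - min_len + 1.
    """
    length = len(sequence)
    c_terminal_label = f"{sequence[-1]}-{length}"
    last_start = length - min_len + 1
    return [
        f"{sequence[n - 1]}-{n} to {c_terminal_label}"
        for n in range(2, last_start + 1)
    ]


# Toy peptide only: residues 1-10 as recited in the listing (M-1, A-2, ... S-10).
toy_peptide = "MALMLSLVLS"
labels = n_terminal_deletions(toy_peptide)
for label in labels:
    print(label)
```

For a 318-residue sequence with `min_len=6`, the last start position computed this way is 313, which agrees with the range recited for n2.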
Also as mentioned above, even if deletion of one or more amino acids from the C-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities such as the ability to elicit mitogenic activity or induce differentiation of normal or malignant cells, the ability to multimerize, etc.) may still be retained. For example, the ability of the shortened BBIR II mutein to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptide generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the C-terminus. Whether a particular polypeptide lacking C-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art. It is not unlikely that a BBIR II mutein with a large number of deleted C-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six BBIR II amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the carboxy terminus of the amino acid sequence of the BBIR II polypeptide shown in Figures 22A-D, up to the serine residue at position number 6, and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues 1-m1 of Figures 22A-D, where m1 is an integer from 6 to 318 corresponding to the position of the amino acid residue in Figures 22A-D. Moreover, the invention provides polynucleotides encoding polypeptides comprising, or alternatively consisting of, the amino acid sequence of C-terminal deletions of the BBIR II polypeptide of the invention shown as SEQ ID NO:37. Such C-terminal deletions include polypeptides comprising the amino acid sequence of residues:
M-1 to P-317; M-1 to F-316; M-1 to L-315; M-1 to A-314; M-1 to C-313; M-1 to P-312;
M-1 to S-311; M-1 to P-310; M-1 to F-309; M-1 to S-308; M-1 to W-307; M-1 to P-306;
M-1 to N-305; M-1 to P-304; M-1 to G-303; M-1 to K-302; M-1 to K-301; M-1 to L-300;
M-1 to T-299; M-1 to T-298; M-1 to S-297; M-1 to G-296; M-1 to G-295; M-1 to S-294;
M-1 to I-293; M-1 to Q-292; M-1 to L-291; M-1 to L-290; M-1 to S-289; M-1 to H-288;
M-1 to P-287; M-1 to K-286; M-1 to E-285; M-1 to V-284; M-1 to F-283; M-1 to H-282;
M-1 to V-281; M-1 to E-280; M-1 to G-279; M-1 to L-278; M-1 to I-277; M-1 to Q-276;
M-1 to L-275; M-1 to K-274; M-1 to L-273; M-1 to T-272; M-1 to G-271; M-1 to M-270;
M-1 to V-269; M-1 to L-268; M-1 to K-267; M-1 to A-266; M-1 to R-265; M-1 to C-264;
M-1 to P-263; M-1 to G-262; M-1 to V-261; M-1 to P-260; M-1 to R-259; M-1 to L-258;
M-1 to I-257; M-1 to Y-256; M-1 to F-255; M-1 to D-254; M-1 to M-253; M-1 to F-252;
M-1 to S-251; M-1 to L-250; M-1 to S-249; M-1 to P-248; M-1 to F-247; M-1 to S-246;
M-1 to D-245; M-1 to Y-244; M-1 to I-243; M-1 to H-242; M-1 to S-241; M-1 to S-240;
M-1 to S-239; M-1 to Y-238; M-1 to K-237; M-1 to R-236; M-1 to K-235; M-1 to G-234;
M-1 to A-233; M-1 to Q-232; M-1 to G-231; M-1 to H-230; M-1 to K-229; M-1 to R-228;
M-1 to R-227; M-1 to W-226; M-1 to D-225; M-1 to G-224; M-1 to I-223; M-1 to Q-222;
M-1 to V-221; M-1 to R-220; M-1 to S-219; M-1 to E-218; M-1 to V-217; M-1 to E-216;
M-1 to R-215; M-1 to S-214; M-1 to L-213; M-1 to H-212; M-1 to A-211; M-1 to H-210;
M-1 to R-209; M-1 to M-208; M-1 to S-207; M-1 to C-206; M-1 to S-205; M-1 to I-204;
M-1 to S-203; M-1 to G-202; M-1 to A-201; M-1 to N-200; M-1 to E-199; M-1 to Q-198;
M-1 to V-197; M-1 to T-196; M-1 to L-195; M-1 to S-194; M-1 to I-193; M-1 to E-192;
M-1 to V-191; M-1 to D-190; M-1 to F-189; M-1 to L-188; M-1 to G-187; M-1 to H-186;
M-1 to M-185; M-1 to D-184; M-1 to R-183; M-1 to N-182; M-1 to T-181; M-1 to R-180;
M-1 to S-179; M-1 to D-178; M-1 to T-177; M-1 to S-176; M-1 to L-175; M-1 to D-174;
M-1 to Q-173; M-1 to G-172; M-1 to Q-171; M-1 to P-170; M-1 to G-169; M-1 to K-168;
M-1 to W-167; M-1 to K-166; M-1 to A-165; M-1 to T-164; M-1 to P-163; M-1 to R-162;
M-1 to P-161; M-1 to F-160; M-1 to W-159; M-1 to G-158; M-1 to S-157; M-1 to S-156;
M-1 to Q-155; M-1 to C-154; M-1 to L-153; M-1 to L-152; M-1 to Q-151; M-1 to I-150;
M-1 to D-149; M-1 to R-148; M-1 to D-147; M-1 to V-146; M-1 to Y-145; M-1 to G-144;
M-1 to A-143; M-1 to I-142; M-1 to S-141; M-1 to I-140; M-1 to L-139; M-1 to P-138;
M-1 to V-137; M-1 to S-136; M-1 to G-135; M-1 to L-134; M-1 to A-133; M-1 to S-132;
M-1 to V-131; M-1 to Q-130; M-1 to L-129; M-1 to E-128; M-1 to W-127; M-1 to I-126;
M-1 to A-125; M-1 to K-124; M-1 to Q-123; M-1 to Y-122; M-1 to Y-121; M-1 to S-120;
M-1 to Q-119; M-1 to S-118; M-1 to S-117; M-1 to I-116; M-1 to R-115; M-1 to C-114;
M-1 to G-113; M-1 to Y-112; M-1 to L-111; M-1 to G-110; M-1 to A-109; M-1 to D-108;
M-1 to L-107; M-1 to V-106; M-1 to T-105; M-1 to I-104; M-1 to N-103; M-1 to E-102;
M-1 to L-101; M-1 to R-100; M-1 to L-99; M-1 to S-98; M-1 to I-97; M-1 to R-96;
M-1 to G-95; M-1 to E-94; M-1 to A-93; M-1 to I-92; M-1 to S-91; M-1 to D-90;
M-1 to K-89; M-1 to V-88; M-1 to L-87; M-1 to K-86; M-1 to T-85; M-1 to R-84;
M-1 to G-83; M-1 to Q-82; M-1 to Y-81; M-1 to Q-80; M-1 to P-79; M-1 to M-78;
M-1 to Q-77; M-1 to M-76; M-1 to F-75; M-1 to P-74; M-1 to Q-73; M-1 to D-72;
M-1 to K-71; M-1 to G-70; M-1 to D-69; M-1 to R-68; M-1 to Y-67; M-1 to L-66;
M-1 to H-65; M-1 to V-64; M-1 to V-63; M-1 to S-62; M-1 to S-61; M-1 to F-60;
M-1 to Q-59; M-1 to G-58; M-1 to R-57; M-1 to F-56; M-1 to F-55; M-1 to R-54;
M-1 to V-53; M-1 to E-52; M-1 to M-51; M-1 to A-50; M-1 to E-49; M-1 to A-48;
M-1 to N-47; M-1 to T-46; M-1 to K-45; M-1 to P-44; M-1 to S-43; M-1 to L-42;
M-1 to F-41; M-1 to C-40; M-1 to S-39; M-1 to F-38; M-1 to A-37; M-1 to A-36;
M-1 to D-35; M-1 to E-34; M-1 to G-33; M-1 to V-32; M-1 to L-31; M-1 to A-30;
M-1 to Q-29; M-1 to V-28; M-1 to P-27; M-1 to K-26; M-1 to D-25; M-1 to P-24;
M-1 to G-23; M-1 to F-22; M-1 to V-21; M-1 to Q-20; M-1 to W-19; M-1 to Q-18;
M-1 to G-17; M-1 to S-16; M-1 to G-15; M-1 to L-14; M-1 to K-13; M-1 to L-12;
M-1 to L-11; M-1 to S-10; M-1 to L-9; M-1 to V-8; M-1 to L-7; M-1 to S-6;
of SEQ ID NO:37. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
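The C-terminal deletions recited above follow the pattern 1-m1, with m1 running from one residue short of full length down to position 6. A minimal sketch of that enumeration is given below, assuming a ten-residue toy peptide rather than the full sequence of SEQ ID NO:37; the function name `c_terminal_deletions` and the `min_len` parameter are hypothetical.

```python
def c_terminal_deletions(sequence: str, min_len: int = 6) -> list[tuple[str, str]]:
    """Return (label, fragment) pairs for every C-terminal deletion.

    Each fragment starts at residue 1 and ends at residue m, where m runs
    from len(sequence) - 1 down to min_len (1-based positions).
    """
    n_terminal_label = f"{sequence[0]}-1"
    results = []
    for m in range(len(sequence) - 1, min_len - 1, -1):
        label = f"{n_terminal_label} to {sequence[m - 1]}-{m}"
        results.append((label, sequence[:m]))
    return results


# Toy peptide only: residues 1-10 as recited in the listing (M-1, A-2, ... S-10).
toy_peptide = "MALMLSLVLS"
deletions = c_terminal_deletions(toy_peptide)
for label, fragment in deletions:
    print(label, fragment)
```

For a 318-residue sequence this yields 312 fragments, from 1-317 down to 1-6, the same count as the listing above.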
In addition, the invention provides nucleic acid molecules having nucleotide sequences related to extensive portions of SEQ ID NO:19 which have been determined from the following related cDNA genes: HTTDB46R (SEQ ID NO:122), and HSIEA44R (SEQ ID NO:123).

Based on the sequence similarity to the bovine butyrophilin precursor, the translation product of this gene is expected to share at least some biological activities with B30.2-like domain containing proteins, and specifically butyrophilin proteins. Such activities are known in the art, some of which are described elsewhere herein.
Specifically, polynucleotides and polypeptides of the invention are also useful for modulating the differentiation of normal and malignant cells, modulating the proliferation and/or differentiation of cancer and neoplastic cells, and regulation of cell growth and differentiation. Polynucleotides and polypeptides of the invention may represent a diagnostic marker for breast diseases and/or disorders, in addition to disorders of secretory organs and tissues (which include, testicular and gastrointestinal disorders, particularly those cells which serve secretory functions for seminal fluid or gastrointestinal hormones, and disorders of the mucosal membranes of such cells and tissues, etc.).
The full-length protein should be a secreted protein, based upon homology to the butyrophilin family of proteins. Therefore, it is secreted into milk, serum, urine, seminal fluid, or feces and thus its levels are assayable from patient samples.
Assuming specific expression levels are reflective of the presence of breast disorders (i.e., breast cancer, breast dysfunction, etc.), this protein would provide a convenient diagnostic for early detection of such disorders. In addition, expression of this gene product may also be linked to the progression of breast diseases, and therefore may itself actually represent a therapeutic or therapeutic target for the treatment of breast cancer. Polynucleotides and polypeptides of the invention may play an important role in the pathogenesis of human cancers and cellular transformation, particularly those of secretory cells and tissues.
Polynucleotides and polypeptides of the invention may also be involved in the pathogenesis of developmental abnormalities based upon its potential effects on proliferation and differentiation of cells and tissue cell types.
Due to the potential proliferating and differentiating activity of said polynucleotides and polypeptides, the invention is useful as a therapeutic agent in inducing tissue regeneration and for treating inflammatory conditions. Moreover, the invention is useful in modulating the immune response to aberrant polypeptides, as may exist in rapidly proliferating cells and tissue cell types, particularly in cancers. The invention, including agonists and/or antagonists thereof, is useful in modulating the nutritional value of milk, its caloric content, and its fat content, and may conceivably be useful in mediating the adaptation of breast secretory function as a delivery vehicle for therapeutics (i.e., transgenic breast secretory tissue for transferring therapeutically active proteins to infants).
Alternatively, the expression within cellular sources marked by proliferating cells indicates this protein may play a role in the regulation of cellular division, and may show utility in the diagnosis, treatment, and/or prevention of developmental diseases and disorders, including cancer, and other proliferative conditions.
Representative uses are described in the "Hyperproliferative Disorders" and "Regeneration" sections below and elsewhere herein. Briefly, developmental tissues rely on decisions involving cell differentiation and/or apoptosis in pattern formation.
Dysregulation of apoptosis can result in inappropriate suppression of cell death, as occurs in the development of some cancers, or in failure to control the extent of cell death, as is believed to occur in acquired immunodeficiency and certain neurodegenerative disorders, such as spinal muscular atrophy (SMA).
Alternatively, this gene product is involved in the pattern of cellular proliferation that accompanies early embryogenesis. Thus, aberrant expression of this gene product in tissues - particularly adult tissues - may correlate with patterns of abnormal cellular proliferation, such as found in various cancers. Because of potential roles in proliferation and differentiation, this gene product may have applications in the adult for tissue regeneration and the treatment of cancers. It may also act as a morphogen to control cell and tissue type specification. Therefore, the polynucleotides and polypeptides of the present invention are useful in treating, detecting, and/or preventing said disorders and conditions, in addition to other types of degenerative conditions. Thus this protein may modulate apoptosis or tissue differentiation and is useful in the detection, treatment, and/or prevention of degenerative or proliferative conditions and diseases.
The protein is useful in modulating the immune response to aberrant polypeptides, as may exist in proliferating and cancerous cells and tissues. The protein can also be used to gain new insight into the regulation of cellular growth and proliferation. Furthermore, the protein may also be used to determine biological activity, to raise antibodies, as a tissue marker, to isolate cognate ligands or receptors, and to identify agents that modulate their interactions, in addition to its use as a nutritional supplement. The protein, as well as antibodies directed against the protein, may show utility as a tumor marker and/or immunotherapy target for the above listed tissues.
This gene is expressed primarily in small intestine, colon tumor, and to a lesser extent in human testes tumor cells.
Therefore, polynucleotides and polypeptides of the invention are useful as reagents for differential identification of the tissue(s) or cell type(s) present in a biological sample and for diagnosis of diseases and conditions which include, but are not limited to, gastrointestinal diseases and/or disorders, in addition to lactation disorders, and tumors of the testes. Similarly, polypeptides and antibodies directed to these polypeptides are useful in providing immunological probes for differential identification of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, particularly of the immune and reproductive systems, expression of this gene at significantly higher or lower levels is routinely detected in certain tissues or cell types (e.g. immune, testicular, gastrointestinal, and cancerous and wounded tissues) or bodily fluids (e.g. lymph, serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample taken from an individual having such a disorder, relative to the standard gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an individual not having the disorder.
Preferred polypeptides of the present invention comprise immunogenic epitopes shown in SEQ ID NO: 37 as residues: Tyr-67 to Pro-74, Ser-117 to Gln-123, Pro-161 to Met-185, Gly-224 to His-242, Thr-299 to Trp-307. Polynucleotides encoding said polypeptides are also provided.
FEATURES OF PROTEIN ENCODED BY GENE NO: 10
The translation product of this gene contains a serine protease motif and accordingly is believed to possess serine protease activity. Assays for determining such activity are well known in the art. Preferred polypeptides of this invention possess such activity.
Included in this invention as preferred domains are serine protease histidine active site domains, which were identified using the ProSite analysis tool (Swiss Institute of Bioinformatics). The catalytic activity of the serine proteases from the trypsin family is provided by a charge relay system involving an aspartic acid residue hydrogen-bonded to a histidine, which itself is hydrogen-bonded to a serine. The sequences in the vicinity of the active site serine and histidine residues are well conserved in this family of proteases [1]. Consensus pattern: [LIVM]-[ST]-A-[STAG]-H-C, where H is the active site residue.
Preferred polypeptides of the invention comprise the following amino acid sequence: GTLVAEKHVLTAAHCIHDGKTYVKGTQ (SEQ ID NO: 124).
Polynucleotides encoding these polypeptides are also provided.
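The consensus pattern can be checked against the preferred domain sequence directly. The sketch below is illustrative only: it assumes the pattern contains nothing but literal residues and bracketed residue classes (so it converts to a regular expression by dropping the hyphens), and it takes the domain start position of 162 from the description of SEQ ID NO:38 elsewhere herein. The names `prosite_to_regex` and `find_active_site_histidine` are hypothetical helpers, not part of any ProSite tool.

```python
import re

PROSITE_PATTERN = "[LIVM]-[ST]-A-[STAG]-H-C"
DOMAIN = "GTLVAEKHVLTAAHCIHDGKTYVKGTQ"  # SEQ ID NO: 124
DOMAIN_START = 162  # assumed 1-based position of the domain's first residue in SEQ ID NO:38


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE pattern (literals and [..] classes only) to a regex."""
    return "".join(pattern.split("-"))


def find_active_site_histidine(domain: str, domain_start: int, pattern: str):
    """Return (matched motif, 1-based position of the motif's histidine), or None."""
    match = re.search(prosite_to_regex(pattern), domain)
    if match is None:
        return None
    # In this pattern the histidine is the second-to-last motif residue.
    his_offset = match.end() - 2
    return match.group(0), domain_start + his_offset


result = find_active_site_histidine(DOMAIN, DOMAIN_START, PROSITE_PATTERN)
print(result)
```

Under these assumptions the motif found is LTAAHC and the histidine falls at position 175, which is consistent with the predicted active site residue described for SEQ ID NO:38.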
Further preferred are polypeptides comprising the serine protease histidine active site domain of the sequence referenced in Table XIII for this gene, and at least 5, 10, 15, 20, 25, 30, 50, or 75 additional contiguous amino acid residues of this referenced sequence. The additional contiguous amino acid residues are N-terminal or C-terminal to the serine protease histidine active site domain.
Alternatively, the additional contiguous amino acid residues are both N-terminal and C-terminal to the serine protease histidine active site domain, wherein the total N- and C-terminal contiguous amino acid residues equal the specified number. The above preferred polypeptide domain is characteristic of a signature specific to serine protease proteins. Based on the sequence similarity, the translation product of this gene is expected to share at least some biological activities with serine proteases.
Such activities are known in the art, some of which are described elsewhere herein.
In another embodiment, polypeptides comprising the amino acid sequence of the open reading frame upstream of the predicted signal peptide are contemplated by the present invention. Specifically, polypeptides of the invention comprise the following amino acid sequence:
GTRGQAWEPRALSRRPHLSERRSEPRPGRAARRGTVLGMAGIPGLLFLLFF
LLCAVGQVSPYSAPWKPTWPAYRLPVVLPQSTLNLAKPDFGAEAKLEVSSSCGP
QCHKGTPLPTYEEAKQYLSYETLYANGSRTETQVGIYILSSSGDGAQHRDSGSSG
KSRRKRQIYGYDSRFSIFGKDFLLNYPFSTSVKLSTGCTGTLVAEKHVLTAAHCI
HDGKTYVKGTQKLRVGFLKPKFKDGGRGANDSTSAMPEQMKFQWIRVKRTHV
PKGWIKGNANDIGMDYDYALLELKKPHKRKFMKIGVSPPAKQLPGGRIHFSGYD
NDRPGNLVYRFCDVKDETYDLLYQQCDSQPGASGSGVYVRMWKRQHQKWER
KIIGMISGHQWVDMDGSPQEFTRGCSEITPLQYIPDISIGV (SEQ ID NO: 125).
Polynucleotides encoding these polypeptides are also provided.
A preferred polypeptide variant of the invention comprises the following amino acid sequence:
MAGIPGLLFLLFFLLCAVGQVSPYSAPWKPTWPAYRLPVVLPQSTLNLAKPD
FGAEAKLEVSSSCGPQCHKGTPLPTYEEAKQYLSYETLYANGSRTETQVGIYILS
SSGDGAQHRDSGSSGKSRRKRQIYGYDSRFSIFGKDFLLNYPFSTSVKLSTGCTG
TLVAEKHVLT
AAHCIHDGKTYVKGTQKLRVGFLKPKFKDGGRGANDSTSAMPEQMKFQWIRV
KRTHVPKGWIKGNANDIGMDYDYALLELKKPHKRKFMKIGVSPPAKQLPGGRI
HFSGYDNDRPGNLVYRFCDVKDETYDLLYQQCDAQPGASGSGVYVRMWKRQ
QQKWERKIIGIFSGHQWVDMNGSPQDFNVAVRITPLKYAQICYWIKGNYLDCRE
G (SEQ ID NO: 126). Polynucleotides encoding these polypeptides are also provided.
Figures 25 A-B show the nucleotide (SEQ ID NO:20) and deduced amino acid sequence (SEQ ID NO:38) of the present invention. Predicted amino acids from about 1 to about 19 constitute the predicted signal peptide (amino acid residues from about 1 to about 19 in SEQ ID NO:38) and are represented by the underlined amino acid regions; amino acids from about 162 to about 188 constitute the predicted serine protease histidine active site domain (amino acid residues from about 162 to about 188 in SEQ ID NO:38) and are represented by the double underlined amino acid regions; and amino acid residue 175 (amino acid residue 175 in SEQ ID NO:38) constitutes the predicted histidine active site residue and is represented by the bold amino acid.
Figure 26 shows the regions of similarity between the amino acid sequences of the present invention SEQ ID NO:38, and the Human Pancreatic Elastase 2 protein (gi|219620) (SEQ ID NO: 127).
Figure 27 shows an analysis of the amino acid sequence of SEQ ID NO:38. Alpha, beta, turn and coil regions; hydrophilicity and hydrophobicity; amphipathic regions; flexible regions; antigenic index and surface probability are shown.
Northern analysis indicates that this gene is expressed most highly in HUVEC, HUVEC + LPS, smooth muscle, and fibroblasts; is present in heart, brain, placenta, lung, liver, muscle, kidney, pancreas, spleen, thymus, prostate, testes, ovary, small intestine, and colon; and is weakly expressed in PBLs.
The present invention provides isolated nucleic acid molecules comprising a polynucleotide encoding the polypeptide having the amino acid sequence shown in Figures 25 A-B (SEQ ID NO:38), which was determined by sequencing a cloned cDNA.
The nucleotide sequence shown in Figures 25 A-B (SEQ ID NO:20) was obtained by sequencing a cloned cDNA (HUSAQOS), which was deposited on Nov. 17, 1998 at the American Type Culture Collection, and given Accession Number 203484. The deposited gene is inserted in the pSport plasmid (Life Technologies, Rockville, MD) using the SalI/NotI restriction endonuclease cleavage sites.
The present invention is further directed to fragments of the isolated nucleic acid molecules described herein. By a fragment of an isolated DNA molecule having the nucleotide sequence of the deposited cDNA or the nucleotide sequence shown in SEQ ID NO:20 is intended DNA fragments at least about 15 nt, and more preferably at least about 20 nt, still more preferably at least about 30 nt, and even more preferably, at least about 40 nt in length which are useful as diagnostic probes and primers as discussed herein. Of course, larger fragments 50-1500 nt in length are also useful according to the present invention, as are fragments corresponding to most, if not all, of the nucleotide sequence of the deposited cDNA or as shown in SEQ ID NO:20. By a fragment at least 20 nt in length, for example, is intended fragments which include 20 or more contiguous bases from the nucleotide sequence of the deposited cDNA or the nucleotide sequence as shown in SEQ ID NO:20. In this context "about" includes the particularly recited size, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini. Representative examples of polynucleotide fragments of the invention include, for example, fragments that comprise, or alternatively, consist of, a sequence from about nucleotide 1 to about 50, from about 51 to about 100, from about 101 to about 150, from about 151 to about 200, from about 201 to about 250, from about 251 to about 300, from about 301 to about 350, from about 351 to about 400, from about 401 to about 450, from about 451 to about 500, from about 501 to about 550, from about 551 to about 600, from about 601 to about 650, from about 651 to about 700, from about 701 to about 750, from about 751 to about 800, from about 801 to about 850, from about 851 to about 900, from about 901 to about 950, from about 951 to about 1000, from about 1001 to about 1050, from about 1051 to about 1100, from about 1101 to about 1150, from about 1151 to about 1200, from about 1201 to about 1250, from about 1251 to about 1300, from about 1301 to about 1350, from about 1351 to about 1400, from about 1401 to about 1450, from about 1451 to about 1500, from about 1501 to about 1550, from about 1551 to about 1600, from about 1601 to about 1650, and from about 1651 to about 1699 of SEQ ID NO:20, or the complementary strand thereto, or the cDNA contained in the deposited gene. In this context "about" includes the particularly recited ranges, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini. In additional embodiments, the polynucleotides of the invention encode functional attributes of the corresponding protein.
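The representative fragment ranges above are consecutive 50-nucleotide windows over a 1699-nucleotide sequence, with the final window truncated. A minimal sketch of that tiling, together with one possible reading of the "about" tolerance (each terminus may shift by up to 5 nucleotides), is given below. The function names and the `slack` parameter are hypothetical and serve only as illustration.

```python
def fragment_windows(total_length: int, size: int = 50) -> list[tuple[int, int]]:
    """Tile 1..total_length into consecutive 1-based, inclusive windows."""
    return [
        (start, min(start + size - 1, total_length))
        for start in range(1, total_length + 1, size)
    ]


def matches_about(candidate: tuple[int, int], window: tuple[int, int], slack: int = 5) -> bool:
    """True if both termini of candidate lie within slack nucleotides of window."""
    return (
        abs(candidate[0] - window[0]) <= slack
        and abs(candidate[1] - window[1]) <= slack
    )


windows = fragment_windows(1699)
print(len(windows), windows[0], windows[-1])
print(matches_about((3, 52), windows[0]))   # both termini shifted by 2
print(matches_about((10, 50), windows[0]))  # 5' terminus shifted by 9
```

The tiling gives 34 windows, ending at 1651 to 1699, in agreement with the ranges recited above.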
Preferred embodiments of the invention in this regard include fragments that comprise alpha-helix and alpha-helix forming regions ("alpha-regions"), beta-sheet and beta-sheet forming regions ("beta-regions"), turn and turn-forming regions ("turn-regions"), coil and coil-forming regions ("coil-regions"), hydrophilic regions, hydrophobic regions, alpha amphipathic regions, beta amphipathic regions, flexible regions, surface-forming regions and high antigenic index regions. The data representing the structural or functional attributes of the protein set forth in Figure 27 and/or Table IX, as described above, was generated using the various modules and algorithms of the DNA*STAR set on default parameters. In a preferred embodiment, the data presented in columns VIII, IX, XIII, and XIV of Table IX can be used to determine regions of the protein which exhibit a high degree of potential for antigenicity. Regions of high antigenicity are determined from the data presented in columns VIII, IX, XIII, and/or XIV by choosing values which represent regions of the polypeptide which are likely to be exposed on the surface of the polypeptide in an environment in which antigen recognition may occur in the process of initiation of an immune response.
Certain preferred regions in these regards are set out in Figure 27, but may, as shown in Table IX, be represented or identified by using tabular representations of the data presented in Figure 27. The DNA*STAR computer algorithm used to generate Figure 27 (set on the original default parameters) was used to present the data in Figure 27 in a tabular format (See Table IX). The tabular format of the data in Figure 27 is used to easily determine specific boundaries of a preferred region. The above-mentioned preferred regions set out in Figure 27 and in Table IX include, but are not limited to, regions of the aforementioned types identified by analysis of the amino acid sequence set out in Figure 1. As set out in Figure 27 and in Table IX, such preferred regions include Garnier-Robson alpha-regions, beta-regions, turn-regions, and coil-regions, Chou-Fasman alpha-regions, beta-regions, and turn-regions, Kyte-Doolittle hydrophilic regions and Hopp-Woods hydrophobic regions, Eisenberg alpha- and beta-amphipathic regions, Karplus-Schulz flexible regions, Jameson-Wolf regions of high antigenic index and Emini surface-forming regions.
Even if deletion of one or more amino acids from the N-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities, ability to multimerize, etc.) may still be retained. For example, the ability of shortened muteins to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptides generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the N-terminus. Whether a particular polypeptide lacking N-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art.
It is not unlikely that a mutein with a large number of deleted N-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the amino terminus of the amino acid sequence shown in Figures 25A-B, up to the aspartic acid residue at position number 370, and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues n1-375 of Figures 25A-B, where n1 is an integer from 2 to 370 corresponding to the position of the amino acid residue in Figures 25A-B (which is identical to the sequence shown as SEQ ID NO:38).
N-terminal deletions of the polypeptide of the invention shown as SEQ ID NO:38 include polypeptides comprising the amino acid sequence of residues: A-2 to V-375; G-3 to V-375; I-4 to V-375; P-5 to V-375; G-6 to V-375; L-7 to V-375; L-8 to V-375; F-9 to V-375; L-10 to V-375; L-11 to V-375; F-12 to V-375; F-13 to V-375; L-14 to V-375; L-15 to V-375; C-16 to V-375; A-17 to V-375; V-18 to V-375; G-19 to V-375; Q-20 to V-375; V-21 to V-375; S-22 to V-375; P-23 to V-375; Y-24 to V-375; S-25 to V-375; A-26 to V-375; P-27 to V-375; W-28 to V-375; K-29 to V-375; P-30 to V-375; T-31 to V-375; W-32 to V-375; P-33 to V-375; A-34 to V-375; Y-35 to V-375; R-36 to V-375; L-37 to V-375; P-38 to V-375; V-39 to V-375; V-40 to V-375; L-41 to V-375; P-42 to V-375; Q-43 to V-375; S-44 to V-375; T-45 to V-375; L-46 to V-375; N-47 to V-375; L-48 to V-375; A-49 to V-375; K-50 to V-375; P-51 to V-375; D-52 to V-375; F-53 to V-375; G-54 to V-375; A-55 to V-375; E-56 to V-375; A-57 to V-375; K-58 to V-375; L-59 to V-375; E-60 to V-375; V-61 to V-375; S-62 to V-375; S-63 to V-375; S-64 to V-375; C-65 to V-375; G-66 to V-375; P-67 to V-375; Q-68 to V-375; C-69 to V-375; H-70 to V-375; K-71 to V-375; G-72 to V-375; T-73 to V-375; P-74 to V-375; L-75 to V-375; P-76 to V-375; T-77 to V-375; Y-78 to V-375; E-79 to V-375; E-80 to V-375; A-81 to V-375; K-82 to V-375; Q-83 to V-375; Y-84 to V-375; L-85 to V-375; S-86 to V-375; Y-87 to V-375; E-88 to V-375; T-89 to V-375; L-90 to V-375; Y-91 to V-375; A-92 to V-375; N-93 to V-375; G-94 to V-375; S-95 to V-375; R-96 to V-375; T-97 to V-375; E-98 to V-375; T-99 to V-375; Q-100 to V-375; V-101 to V-375; G-102 to V-375; I-103 to V-375; Y-104 to V-375; I-105 to V-375; L-106 to V-375; S-107 to V-375; S-108 to V-375; S-109 to V-375; G-110 to V-375; D-111 to V-375; G-112 to V-375; A-113 to V-375; Q-114 to V-375; H-115 to V-375; R-116 to V-375; D-117 to V-375; S-118 to V-375; G-119 to V-375; S-120 to V-375; S-121 to V-375; G-122 to V-375; K-123 to V-375; S-124 to V-375; R-125 to V-375; R-126 to V-375; K-127 to V-375; R-128 to V-375; Q-129 to V-375; I-130 to V-375; Y-131 to V-375; G-132 to V-375; Y-133 to V-375; D-134 to V-375; S-135 to V-375; R-136 to V-375; F-137 to V-375; S-138 to V-375; I-139 to V-375; F-140 to V-375; G-141 to V-375; K-142 to V-375; D-143 to V-375; F-144 to V-375; L-145 to V-375; L-146 to V-375; N-147 to V-375; Y-148 to V-375; P-149 to V-375; F-150 to V-375; S-151 to V-375; T-152 to V-375; S-153 to V-375; V-154 to V-375; K-155 to V-375; L-156 to V-375; S-157 to V-375; T-158 to V-375; G-159 to V-375; C-160 to V-375; T-161 to V-375; G-162 to V-375; T-163 to V-375; L-164 to V-375; V-165 to V-375; A-166 to V-375; E-167 to V-375; K-168 to V-375; H-169 to V-375; V-170 to V-375; L-171 to V-375; T-172 to V-375; A-173 to V-375; A-174 to V-375; H-175 to V-375; C-176 to V-375; I-177 to V-375; H-178 to V-375; D-179 to V-375; G-180 to V-375; K-181 to V-375; T-182 to V-375; Y-183 to V-375; V-184 to V-375; K-185 to V-375; G-186 to V-375; T-187 to V-375; Q-188 to V-375; K-189 to V-375; L-190 to V-375; R-191 to V-375; V-192 to V-375; G-193 to V-375; F-194 to V-375; L-195 to V-375; K-196 to V-375; P-197 to V-375; K-198 to V-375; F-199 to V-375; K-200 to V-375; D-201 to V-375; G-202 to V-375; G-203 to V-375; R-204 to V-375; G-205 to V-375; A-206 to V-375; N-207 to V-375; D-208 to V-375; S-209 to V-375; T-210 to V-375; S-211 to V-375; A-212 to V-375; M-213 to V-375; P-214 to V-375; E-215 to V-375; Q-216 to V-375; M-217 to V-375; K-218 to V-375; F-219 to V-375; Q-220 to V-375; W-221 to V-375; I-222 to V-375; R-223 to V-375; V-224 to V-375; K-225 to V-375; R-226 to V-375; T-227 to V-375; H-228 to V-375; V-229 to V-375; P-230 to V-375; K-231 to V-375; G-232 to V-375; W-233 to V-375; I-234 to V-375; K-235 to V-375; G-236 to V-375; N-237 to V-375; A-238 to V-375; N-239 to V-375; D-240 to V-375; I-241 to V-375; G-242 to V-375; M-243 to V-375; D-244 to V-375; Y-245 to V-375; D-246 to V-375; Y-247 to V-375; A-248 to V-375; L-249 to V-375; L-250 to V-375; E-251 to V-375; L-252 to V-375; K-253 to V-375; K-254 to V-375; P-255 to V-375; H-256 to V-375; K-257 to V-375; R-258 to V-375; K-259 to V-375; F-260 to V-375; M-261 to V-375; K-262 to V-375; I-263 to V-375; G-264 to V-375; V-265 to V-375; S-266 to V-375; P-267 to V-375; P-268 to V-375; A-269 to V-375; K-270 to V-375; Q-271 to V-375; L-272 to V-375; P-273 to V-375; G-274 to V-375; G-275 to V-375; R-276 to V-375; I-277 to V-375; H-278 to V-375; F-279 to V-375; S-280 to V-375; G-281 to V-375; Y-282 to V-375; D-283 to V-375; N-284 to V-375; D-285 to V-375; R-286 to V-375; P-287 to V-375; G-288 to V-375; N-289 to V-375; L-290 to V-375; V-291 to V-375; Y-292 to V-375; R-293 to V-375; F-294 to V-375; C-295 to V-375; D-296 to V-375; V-297 to V-375; K-298 to V-375; D-299 to V-375; E-300 to V-375; T-301 to V-375; Y-302 to V-375; D-303 to V-375; L-304 to V-375; L-305 to V-375; Y-306 to V-375; Q-307 to V-375; Q-308 to V-375; C-309 to V-375; D-310 to V-375; S-311 to V-375; Q-312 to V-375; P-313 to V-375; G-314 to V-375; A-315 to V-375; S-316 to V-375; G-317 to V-375; S-318 to V-375; G-319 to V-375; V-320 to V-375; Y-321 to V-375; V-322 to V-375; R-323 to V-375; M-324 to V-375; W-325 to V-375; K-326 to V-375; R-327 to V-375; Q-328 to V-375; H-329 to V-375; Q-330 to V-375; K-331 to V-375; W-332 to V-375; E-333 to V-375; R-334 to V-375; K-335 to V-375; I-336 to V-375; I-337 to V-375; G-338 to V-375; M-339 to V-375; I-340 to V-375; S-341 to V-375; G-342 to V-375; H-343 to V-375; Q-344 to V-375; W-345 to V-375; V-346 to V-375; D-347 to V-375; M-348 to V-375; D-349 to V-375; G-350 to V-375; S-351 to V-375; P-352 to V-375; Q-353 to V-375; E-354 to V-375; F-355 to V-375; T-356 to V-375; R-357 to V-375; G-358 to V-375; C-359 to V-375; S-360 to V-375; E-361 to V-375; I-362 to V-375; T-363 to V-375; P-364 to V-375; L-365 to V-375; Q-366 to V-375; Y-367 to V-375; I-368 to V-375; P-369 to V-375; D-370 to V-375; of SEQ ID NO:38.
Polypeptides encoded by these polynucleotides are also encompassed by the invention.
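The enumeration above follows a simple rule: for each start position n1, the fragment runs from residue n1 to the final residue. A minimal sketch of that rule is given below, assuming one-letter amino acid codes and 1-based positions. The function name `n_terminal_deletions` and the ten-residue test peptide are hypothetical illustrations and are not taken from SEQ ID NO:38.

```python
def n_terminal_deletions(sequence: str, last_start: int) -> list[str]:
    """Label each N-terminal deletion as 'X-n to Z-len' for n = 2 .. last_start.

    Positions are 1-based, matching the numbering used in the text.
    """
    length = len(sequence)
    if not 2 <= last_start <= length:
        raise ValueError("last_start must lie between 2 and the sequence length")
    c_terminal_label = f"{sequence[-1]}-{length}"
    return [
        f"{sequence[n - 1]}-{n} to {c_terminal_label}"
        for n in range(2, last_start + 1)
    ]


# Hypothetical 10-residue peptide used only to illustrate the labeling scheme.
toy_peptide = "MAGIPGLLFV"
labels = n_terminal_deletions(toy_peptide, last_start=5)
print("; ".join(labels))
```

For the full-length polypeptide, the same call with `last_start=370` would produce the 369 labels listed above, provided the sequence string matches Figures 25A-B.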
Also as mentioned above, even if deletion of one or more amino acids from the C-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities such as ability to modulate the extracellular matrix, etc.) may still be retained. For example the ability to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptide generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the C-terminus. Whether a particular polypeptide lacking C-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art. It is not unlikely that a mutein with a large number of deleted C-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the carboxy terminus of the amino acid sequence of the polypeptide shown in Figures 25A-B, up to the glycine residue at position number 6, and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues 1-m1 of Figures 25A-B, where m1 is an integer from 6 to 375 corresponding to the position of the amino acid residue in Figures 25A-B. Moreover, the invention provides polynucleotides encoding polypeptides comprising, or alternatively consisting of, the amino acid sequence of C-terminal deletions of the polypeptide of the invention shown as SEQ ID NO:38, which include polypeptides comprising the amino acid sequence of residues: M-1 to G-374; M-1 to I-373; M-1 to S-372; M-1 to I-371; M-1 to D-370; M-1 to P-369; M-1 to I-368; M-1 to Y-367; M-1 to Q-366; M-1 to L-365; M-1 to P-364; M-1 to T-363; M-1 to I-362; M-1 to E-361; M-1 to S-360; M-1 to C-359; M-1 to G-358; M-1 to R-357; M-1 to T-356; M-1 to F-355; M-1 to E-354; M-1 to Q-353; M-1 to P-352; M-1 to S-351; M-1 to G-350; M-1 to D-349; M-1 to M-348; M-1 to D-347; M-1 to V-346; M-1 to W-345; M-1 to Q-344; M-1 to H-343; M-1 to G-342; M-1 to S-341; M-1 to I-340; M-1 to M-339; M-1 to G-338; M-1 to I-337; M-1 to I-336; M-1 to K-335; M-1 to R-334; M-1 to E-333; M-1 to W-332; M-1 to K-331; M-1 to Q-330; M-1 to H-329; M-1 to Q-328; M-1 to R-327; M-1 to K-326; M-1 to W-325; M-1 to M-324; M-1 to R-323; M-1 to V-322; M-1 to Y-321; M-1 to V-320; M-1 to G-319; M-1 to S-318; M-1 to G-317; M-1 to S-316; M-1 to A-315; M-1 to G-314; M-1 to P-313; M-1 to Q-312; M-1 to S-311; M-1 to D-310; M-1 to C-309; M-1 to Q-308; M-1 to Q-307; M-1 to Y-306; M-1 to L-305; M-1 to L-304; M-1 to D-303; M-1 to Y-302; M-1 to T-301; M-1 to E-300; M-1 to D-299; M-1 to K-298; M-1 to V-297; M-1 to D-296; M-1 to C-295; M-1 to F-294; M-1 to R-293; M-1 to Y-292; M-1 to V-291; M-1 to L-290; M-1 to N-289; M-1 to G-288; M-1 to P-287; M-1 to R-286; M-1 to D-285; M-1 to N-284; M-1 to D-283; M-1 to Y-282; M-1 to G-281; M-1 to S-280; M-1 to F-279; M-1 to H-278; M-1 to I-277; M-1 to R-276; M-1 to G-275; M-1 to G-274; M-1 to P-273; M-1 to L-272; M-1 to Q-271; M-1 to K-270; M-1 to A-269; M-1 to P-268; M-1 to P-267; M-1 to S-266; M-1 to V-265; M-1 to G-264; M-1 to I-263; M-1 to K-262; M-1 to M-261; M-1 to F-260; M-1 to K-259; M-1 to R-258; M-1 to K-257; M-1 to H-256; M-1 to P-255; M-1 to K-254; M-1 to K-253; M-1 to L-252; M-1 to E-251; M-1 to L-250; M-1 to L-249; M-1 to A-248; M-1 to Y-247; M-1 to D-246; M-1 to Y-245; M-1 to D-244; M-1 to M-243; M-1 to G-242; M-1 to I-241; M-1 to D-240; M-1 to N-239; M-1 to A-238; M-1 to N-237; M-1 to G-236; M-1 to K-235; M-1 to I-234; M-1 to W-233; M-1 to G-232; M-1 to K-231; M-1 to P-230; M-1 to V-229; M-1 to H-228; M-1 to T-227; M-1 to R-226; M-1 to K-225; M-1 to V-224; M-1 to R-223; M-1 to I-222; M-1 to W-221; M-1 to Q-220; M-1 to F-219; M-1 to K-218; M-1 to M-217; M-1 to Q-216; M-1 to E-215; M-1 to P-214; M-1 to M-213; M-1 to A-212; M-1 to S-211; M-1 to T-210; M-1 to S-209; M-1 to D-208; M-1 to N-207; M-1 to A-206; M-1 to G-205; M-1 to R-204; M-1 to G-203; M-1 to G-202; M-1 to D-201; M-1 to K-200; M-1 to F-199; M-1 to K-198; M-1 to P-197; M-1 to K-196; M-1 to L-195; M-1 to F-194; M-1 to G-193; M-1 to V-192; M-1 to R-191; M-1 to L-190; M-1 to K-189; M-1 to Q-188; M-1 to T-187; M-1 to G-186; M-1 to K-185; M-1 to V-184; M-1 to Y-183; M-1 to T-182; M-1 to K-181; M-1 to G-180; M-1 to D-179; M-1 to H-178; M-1 to I-177; M-1 to C-176; M-1 to H-175; M-1 to A-174; M-1 to A-173; M-1 to T-172; M-1 to L-171; M-1 to V-170; M-1 to H-169; M-1 to K-168; M-1 to E-167; M-1 to A-166; M-1 to V-165; M-1 to L-164; M-1 to T-163; M-1 to G-162; M-1 to T-161; M-1 to C-160; M-1 to G-159; M-1 to T-158; M-1 to S-157; M-1 to L-156; M-1 to K-155; M-1 to V-154; M-1 to S-153; M-1 to T-152; M-1 to S-151; M-1 to F-150; M-1 to P-149; M-1 to Y-148; M-1 to N-147; M-1 to L-146; M-1 to L-145; M-1 to F-144; M-1 to D-143; M-1 to K-142; M-1 to G-141; M-1 to F-140; M-1 to I-139; M-1 to S-138; M-1 to F-137; M-1 to R-136; M-1 to S-135; M-1 to D-134; M-1 to Y-133; M-1 to G-132; M-1 to Y-131; M-1 to I-130; M-1 to Q-129; M-1 to R-128; M-1 to K-127; M-1 to R-126; M-1 to R-125; M-1 to S-124; M-1 to K-123; M-1 to G-122; M-1 to S-121; M-1 to S-120; M-1 to G-119; M-1 to S-118; M-1 to D-117; M-1 to R-116; M-1 to H-115; M-1 to Q-114; M-1 to A-113; M-1 to G-112; M-1 to D-111; M-1 to G-110; M-1 to S-109; M-1 to S-108; M-1 to S-107; M-1 to L-106; M-1 to I-105; M-1 to Y-104; M-1 to I-103; M-1 to G-102; M-1 to V-101; M-1 to Q-100; M-1 to T-99; M-1 to E-98; M-1 to T-97; M-1 to R-96; M-1 to S-95; M-1 to G-94; M-1 to N-93; M-1 to A-92; M-1 to Y-91; M-1 to L-90; M-1 to T-89; M-1 to E-88; M-1 to Y-87; M-1 to S-86; M-1 to L-85; M-1 to Y-84; M-1 to Q-83; M-1 to K-82; M-1 to A-81; M-1 to E-80; M-1 to E-79; M-1 to Y-78; M-1 to T-77; M-1 to P-76; M-1 to L-75; M-1 to P-74; M-1 to T-73; M-1 to G-72; M-1 to K-71; M-1 to H-70; M-1 to C-69; M-1 to Q-68; M-1 to P-67; M-1 to G-66; M-1 to C-65; M-1 to S-64; M-1 to S-63; M-1 to S-62; M-1 to V-61; M-1 to E-60; M-1 to L-59; M-1 to K-58; M-1 to A-57; M-1 to E-56; M-1 to A-55; M-1 to G-54; M-1 to F-53; M-1 to D-52; M-1 to P-51; M-1 to K-50; M-1 to A-49; M-1 to L-48; M-1 to N-47; M-1 to L-46; M-1 to T-45; M-1 to S-44; M-1 to Q-43; M-1 to P-42; M-1 to L-41; M-1 to V-40; M-1 to V-39; M-1 to P-38; M-1 to L-37; M-1 to R-36; M-1 to Y-35; M-1 to A-34; M-1 to P-33; M-1 to W-32; M-1 to T-31; M-1 to P-30; M-1 to K-29; M-1 to W-28; M-1 to P-27; M-1 to A-26; M-1 to S-25; M-1 to Y-24; M-1 to P-23; M-1 to S-22; M-1 to V-21; M-1 to Q-20; M-1 to G-19; M-1 to V-18; M-1 to A-17; M-1 to C-16; M-1 to L-15; M-1 to L-14; M-1 to F-13; M-1 to F-12; M-1 to L-11; M-1 to L-10; M-1 to F-9; M-1 to L-8; M-1 to L-7; M-1 to G-6; of SEQ ID NO:38. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
In addition, the invention provides nucleic acid molecules having nucleotide sequences related to extensive portions of SEQ ID NO:20 which have been determined from the following related cDNA genes: HFKCF40F (SEQ ID NO:128), HSRDF26R (SEQ ID NO:129), HTEBE07R (SEQ ID NO:130), HFTBP82R (SEQ ID NO:131), HAQBJ11R (SEQ ID NO:132), HAFBB11R (SEQ ID NO:133), HOEF08.SR (SEQ ID NO:134), and HUVGY95R (SEQ ID NO:135).
The gene encoding the disclosed cDNA is believed to reside on chromosome 12.
Accordingly, polynucleotides related to this invention are useful as a marker in linkage analysis for chromosome 12.
This gene is expressed primarily in endothelial cells, fibroblasts, smooth muscle, and osteoblasts, and to a lesser extent in brain, heart, placental tissues, lung, and many other tissues. Moreover, the transcript is present in HUVEC, HUVEC +LPS, smooth muscle, fibroblasts; present in heart, brain, placenta, lung, liver, muscle, kidney, pancreas, spleen, thymus, prostate, testes, ovary, small intestine, colon and weakly in PBLs.
Therefore, polynucleotides and polypeptides of the invention are useful as reagents for differential identification of the tissue(s) or cell type(s) present in a biological sample and for diagnosis of diseases and conditions which include, but are not limited to, disorders of vascularized tissues. Similarly, polypeptides and antibodies directed to these polypeptides are useful in providing immunological probes for differential identification of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, particularly of the vascular tissues, expression of this gene at significantly higher or lower levels is routinely detected in certain tissues or cell types (e.g., vascular, skeletal, developmental, neural, cardiovascular, pulmonary, renal, immune, hematopoietic, reproductive, gastrointestinal, and cancerous and wounded tissues) or bodily fluids (e.g., lymph, seminal fluid, amniotic fluid, serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample taken from an individual having such a disorder, relative to the standard gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an individual not having the disorder.
Preferred polypeptides of the present invention comprise immunogenic epitopes shown in SEQ ID NO: 38 as residues: Pro-67 to Thr-73, Pro-76 to Gln-83, Asn-93 to Thr-99, His-115 to Arg-128, His-178 to Lys-189, Pro-197 to Ala-212, Val-224 to Trp-233, Lys-253 to Lys-259, Ser-280 to Asn-289, Asp-296 to Tyr-302, Gln-308 to Ala-315, Arg-327 to Lys-335, Asp-349 to Gly-358. Polynucleotides encoding said polypeptides are also provided.
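Because the text ties retained immunologic activity to the residues that survive a deletion, it can be useful to check which of the listed epitopes lie wholly inside a given fragment. The sketch below is an illustration only: the epitope boundaries are copied from the paragraph above, while the function name `retained_epitopes` and the example fragment (residues 100 to 300) are assumptions introduced here.

```python
# Epitope boundaries (1-based, inclusive) copied from the paragraph above.
EPITOPES = [
    (67, 73), (76, 83), (93, 99), (115, 128), (178, 189), (197, 212),
    (224, 233), (253, 259), (280, 289), (296, 302), (308, 315),
    (327, 335), (349, 358),
]


def retained_epitopes(n: int, m: int, epitopes=EPITOPES) -> list[tuple[int, int]]:
    """Return the epitopes that lie entirely within residues n..m (inclusive)."""
    if n > m:
        raise ValueError("n must not exceed m")
    return [(start, end) for start, end in epitopes if n <= start and end <= m]


# Hypothetical fragment spanning residues 100 to 300 of the 375-residue sequence.
kept = retained_epitopes(100, 300)
print(kept)
```

An epitope that straddles a fragment boundary, such as Asp-296 to Tyr-302 in this example, is treated as lost. Whether a partial epitope still binds antibody would have to be determined experimentally.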
The tissue distribution in the vascularized endothelial cells indicates that polynucleotides and polypeptides corresponding to this gene are useful for the diagnosis and treatment of diseases of vascularized tissues, such as atherosclerosis, ataxia malabsorption, and hyperlipidemia. These and other factors often result in other cardiovascular disease. Furthermore, the translation product of this gene is useful for the treatment of wounds, and may facilitate the wound healing process. Moreover, the protein is useful in the detection, treatment, and/or prevention of a variety of vascular disorders and conditions, which include, but are not limited to, microvascular disease, vascular leak syndrome, aneurysm, stroke, embolism, thrombosis, coronary artery disease, arteriosclerosis, and/or atherosclerosis. Based upon the tissue distribution of this protein, antagonists directed against this protein are useful in blocking the activity of this protein. Accordingly, preferred are antibodies which specifically bind a portion of the translation product of this gene.
Also provided is a kit for detecting tumors in which expression of this protein occurs. Such a kit comprises in one embodiment an antibody specific for the translation product of this gene bound to a solid support. Also provided is a method of detecting these tumors in an individual which comprises a step of contacting an antibody specific for the translation product of this gene to a bodily fluid from the individual, preferably serum, and ascertaining whether antibody binds to an antigen found in the bodily fluid.
Preferably the antibody is bound to a solid support and the bodily fluid is serum. The above embodiments, as well as other treatments and diagnostic tests (kits and methods), are more particularly described elsewhere herein. Furthermore, the protein may also be used to determine biological activity, to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify agents that modulate their interactions, in addition to its use as a nutritional supplement. The protein, as well as antibodies directed against the protein, may show utility as a tumor marker and/or immunotherapy target for the above listed tissues.
Many polynucleotide sequences, such as EST sequences, are publicly available and accessible through sequence databases. Some of these sequences are related to SEQ ID NO:20 and may have been publicly available prior to conception of the present invention. Preferably, such related polynucleotides are specifically excluded from the scope of the present invention. To list every related sequence is cumbersome. Accordingly, preferably excluded from the present invention are one or more polynucleotides comprising a nucleotide sequence described by the general formula of a-b, where a is any integer between 1 and 1685 of SEQ ID NO:20, b is an integer of 15 to 1699, where both a and b correspond to the positions of nucleotide residues shown in SEQ ID NO:20, and where b is greater than or equal to a + 14.
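The a-b formula can be restated as three inequalities, which makes it easy to test whether a given pair of positions falls inside the described family. The following is a minimal sketch under the bounds stated in the text. The names `SEQ_LENGTH`, `MIN_OFFSET`, `in_general_formula` and `count_pairs` are hypothetical helpers introduced for illustration.

```python
SEQ_LENGTH = 1699  # largest value of b stated in the text
MIN_OFFSET = 14    # b must be at least a + 14, i.e. a span of 15 nucleotides


def in_general_formula(a: int, b: int) -> bool:
    """True when (a, b) satisfies 1 <= a <= 1685, 15 <= b <= 1699 and b >= a + 14."""
    return (
        1 <= a <= SEQ_LENGTH - MIN_OFFSET
        and 15 <= b <= SEQ_LENGTH
        and b >= a + MIN_OFFSET
    )


def count_pairs() -> int:
    """Count every (a, b) pair admitted by the formula."""
    return sum(
        SEQ_LENGTH - (a + MIN_OFFSET) + 1
        for a in range(1, SEQ_LENGTH - MIN_OFFSET + 1)
    )


print(in_general_formula(1, 15))       # shortest fragment at the 5' end
print(in_general_formula(1685, 1699))  # shortest fragment at the 3' end
print(in_general_formula(1686, 1699))  # a is out of range
print(count_pairs())
```

Note that the upper bound on a (1685) is simply 1699 minus 14, so the three conditions are mutually consistent.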
FEATURES OF PROTEIN ENCODED BY GENE NO: 11
The translation product of this gene shares sequence homology with Cytotoxic-Regulatory T-Cell Associated Molecule (CRTAM) protein, which is thought to be important in the regulation of cellular physiology, development, differentiation or function of various cell types, including haematopoietic cells and various T-cell progenitors. See for example, PCT publication WO 96/34102 incorporated herein by reference in its entirety. Moreover, the protein product of this gene also shares homology with the thymocyte activation and developmental protein and the class-I MHC-restricted T cell associated molecule (See Genbank Accession Nos. gi|2665790, gb|AAB88491.1, gb|AAC80267.1, and gi|3930163; all information and references contained within these accessions are hereby incorporated herein by reference). Based on the sequence similarity, the translation product of this gene is expected to share at least some biological activities with T-cell modulatory proteins. Such activities are known in the art, some of which are described elsewhere herein.
Preferred polypeptides of the invention comprise the following amino acid sequence:
MASVVLPSGSQCAAAAAAAAPPGLRLRLLLLLFSAAALIPTGDGQNLFTKDVTVI
EGEVATISCQVNKSDDSVIQLLNPNRQTIYFRDFRPLKDSRFQLLNFSSSELKVSL
TNVSISDEGRYFCQLYTDPPQESYTTITVLVPPRNLMIDIQKDTAVEGEEIEVNCT
AMASKPATTIRWFKGNTELKGKSEVEEWSDMYTVTSQLMLKVHKEDDGVPVIC
QVEHPAVTGNLQTQRYLEVQYKPQVHIQMTYPLQGLTREGDALELTCEAIGKPQ
PVMVTWVRVDDEMPQHAVLSGPNLFINNLNKTDNGTYRCEASNIVGKAHSDY
MLYVYDPPTTIPPPTTTTTTTTTTTTTILTIITDSRAGEEGSIRAVDHAVIGGVVAV
VVFAMLCLLIILGRYFARHKGTYFTHEAKGADDAADADTAIINAEGGQNNSEEK
KEYFI (SEQ ID NO: 136). Polynucleotides encoding these polypeptides are also provided.
The polypeptide of this latter embodiment has been determined to have a transmembrane domain at about amino acid position 379 - 395 of the amino acid sequence referenced in Table XIII for this gene. Moreover, a cytoplasmic tail encompassing amino acids 396 to 442 of this protein has also been determined.
Based upon these characteristics, it is believed that the protein product of this gene shares structural features with type Ia membrane proteins.
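The stated domain boundaries can be mapped onto the 442-residue sequence given above as SEQ ID NO: 136 with a 1-based, inclusive slice. The sketch below is illustrative only. The sequence string is transcribed from the listing above, and the run of thirteen threonines is an assumption based on restoring characters damaged in the printed copy (the corresponding nucleotide sequence encodes thirteen consecutive threonine codons). The helper name `region` is hypothetical.

```python
# SEQ ID NO: 136 as printed above; the 13-threonine run is restored from
# damaged print and should be treated as an assumption.
SEQ_136 = (
    "MASVVLPSGSQCAAAAAAAAPPGLRLRLLLLLFSAAALIPTGDGQNLFTKDVTVI"
    "EGEVATISCQVNKSDDSVIQLLNPNRQTIYFRDFRPLKDSRFQLLNFSSSELKVSL"
    "TNVSISDEGRYFCQLYTDPPQESYTTITVLVPPRNLMIDIQKDTAVEGEEIEVNCT"
    "AMASKPATTIRWFKGNTELKGKSEVEEWSDMYTVTSQLMLKVHKEDDGVPVIC"
    "QVEHPAVTGNLQTQRYLEVQYKPQVHIQMTYPLQGLTREGDALELTCEAIGKPQ"
    "PVMVTWVRVDDEMPQHAVLSGPNLFINNLNKTDNGTYRCEASNIVGKAHSDY"
    "MLYVYDPPTTIPPP" + "T" * 13 + "ILTIITDSRAGEEGSIRAVDHAVIGGVVAV"
    "VVFAMLCLLIILGRYFARHKGTYFTHEAKGADDAADADTAIINAEGGQNNSEEK"
    "KEYFI"
)


def region(sequence: str, first: int, last: int) -> str:
    """Return residues first..last using 1-based, inclusive coordinates."""
    if not 1 <= first <= last <= len(sequence):
        raise ValueError("coordinates fall outside the sequence")
    return sequence[first - 1:last]


transmembrane = region(SEQ_136, 379, 395)
cytoplasmic_tail = region(SEQ_136, 396, 442)
print(len(SEQ_136), transmembrane, cytoplasmic_tail)
```

Under this transcription the 379 - 395 segment is a 17-residue hydrophobic stretch, which is consistent with the transmembrane assignment in the text.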
Preferred polynucleotides comprise the following sequence:
ATGGCGAGTGTAGTGC
TGCCGAGCGGATCCCAGTGTGCGGCGGCAGCGGCGGCGGCGGCGCCTCCCG
GGCTCCGGCTCCGGCTTCTGCTGTTGCTCTTCTCCGCCGCGGCACTGATCCCC
ACAGGTGATGGGCAGAATCTGTTTACGAAAGACGTGACAGTGATCGAGGGA
GAGGTTGCGACCATCAGTTGCCAAGTCAATAAGAGTGACGACTCTGTGATTC
AGCTACTGAATCCCAACAGGCAGACCATTTATTTCAGGGACTTCAGGCCTTT
GAAGGACAGCAGGTTTCAGTTGCTGAATTTTTCTAGCAGTGAACTCAAAGTA
TCATTGACAAACGTCTCAATTTCTGATGAAGGAAGATACTTTTGCCAGCTCTA
TACCGATCCCCCACAGGAAAGTTACACCACCATCACAGTCCTGGTCCCACCA
CGTAATCTGATGATCGATATCCAGAAAGACACTGCGGTGGAAGGTGAGGAG
ATTGAAGTCAACTGCACTGCTATGGCCAGCAAGCCAGCCACGACTATCAGGT
GGTTCAAAGGGAACACAGAGCTAAAAGGCAAATCGGAGGTGGAAGAGTGGT
CAGACATGTACACTGTGACCAGTCAGCTGATGCTGAAGGTGCACAAGGAGG
ACGATGGGGTCCCAGTGATCTGCCAGGTGGAGCACCCTGCGGTCACTGGAAA
CCTGCAGACCCAGCGGTATCTAGAAGTACAGTATAAGCCTCAAGTGCACATT
CAGATGACTTATCCTCTACAAGGCTTAACCCGGGAAGGGGACGCGCTTGAGT
TAACATGTGAAGCCATCGGGAAGCCCCAGCCTGTGATGGTAACTTGGGTGAG
AGTCGATGATGAAATGCCTCAACACGCCGTACTGTCTGGGCCCAACCTGTTC
ATCAATAACCTAAACAAAACAGATAATGGTACATACCGCTGTGAAGCTTCAA
ACATAGTGGGGAAAGCTCACTCGGATTATATGCTGTATGTATACGATCCCCC
CACAACTATCCCTCCTCCCACAACAACCACCACCACCACCACCACCACCACC
ACCACCATCCTTACCATCATCACAGATTCCCGAGCAGGTGAAGAAGGCTCGA
TCAGGGCAGTGGATCATGCCGTGATCGGTGGCGTCGTGGCGGTGGTGGTGTT
CGCCATGCTGTGCTTGCTCATCATTCTGGGGCG('TATTTTGCCAGACATAAAG
GTACATACTTCACTCATGAAGCCAAAGGAGCCGATGACGCAGCAGACGCAG
ACACAGCTATAATCAATGCAGAAGGAGGACAGAACAACTCCGAAGAAAAGA
AAGAGTACTTCATCTAG (SEQ ID NO:137). Also preferred are the polypeptides encoded by these polynucleotides.
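The relationship between the polynucleotide above and the preferred amino acid sequence can be spot-checked by translating short stretches with the standard genetic code. The sketch below is a minimal illustration, not a substitute for sequence analysis software. The function name `translate` is hypothetical, the two test strings are the first 36 and last 27 nucleotides of the sequence above, and the final ATC codon assumes that a character damaged in the printed copy is a T (the only reading that encodes the terminal isoleucine).

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


five_prime = "ATGGCGAGTGTAGTGCTGCCGAGCGGATCCCAGTGT"  # first 36 nucleotides
three_prime = "GAAGAAAAGAAAGAGTACTTCATCTAG"          # last 27 nucleotides
print(translate(five_prime))
print(translate(three_prime))
```

Both stretches translate to the corresponding ends of the amino acid sequence shown as SEQ ID NO: 136, which supports the reading frame starting at the first ATG.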
Figures 28A-B show the nucleotide (SEQ ID NO:21) and deduced amino acid sequence (SEQ ID NO:39) of the present invention. Predicted amino acids from about 1 to about 44 constitute the predicted signal peptide (amino acid residues from about 1 to about 44 in SEQ ID NO:39) and are represented by the underlined amino acid regions.
Figure 29 shows the regions of similarity between the amino acid sequences of the present invention SEQ ID NO:39, the human poliovirus receptor protein (gi|1524088) (SEQ ID NO: 138), the human class-I MHC-restricted T cell associated molecule (WO9634102) (SEQ ID NO:144), and the Gallus gallus thymocyte activation and developmental protein (gb|AAB88491.1) (SEQ ID NO:145).
Figure 30 shows an analysis of the amino acid sequence of SEQ ID NO:39. Alpha, beta, turn and coil regions; hydrophilicity and hydrophobicity; amphipathic regions; flexible regions; antigenic index and surface probability are shown.
The present invention provides isolated nucleic acid molecules comprising a polynucleotide encoding the polypeptide having the amino acid sequence shown in Figures 28A-B (SEQ ID NO:39), which was determined by sequencing a cloned cDNA. The nucleotide sequence shown in Figures 28A-B (SEQ ID NO:21) was obtained by sequencing a cloned cDNA (HOUDJ81), which was deposited on Nov. 17, 1998 at the American Type Culture Collection, and given Accession Number 203484. The deposited gene is inserted in the pSport plasmid (Life Technologies, Rockville, MD) using the SalI/NotI restriction endonuclease cleavage sites.
The present invention is further directed to fragments of the isolated nucleic acid molecules described herein. By a fragment of an isolated DNA molecule having the nucleotide sequence of the deposited cDNA or the nucleotide sequence shown in SEQ ID NO:21 is intended DNA fragments at least about 15 nt, and more preferably at least about 20 nt, still more preferably at least about 30 nt, and even more preferably, at least about 40 nt in length which are useful as diagnostic probes and primers as discussed herein. Of course, larger fragments 50-1500 nt in length are also useful according to the present invention, as are fragments corresponding to most, if not all, of the nucleotide sequence of the deposited cDNA or as shown in SEQ ID NO:21. By a fragment at least 20 nt in length, for example, is intended fragments which include 20 or more contiguous bases from the nucleotide sequence of the deposited cDNA or the nucleotide sequence as shown in SEQ ID NO:21. In this context "about" includes the particularly recited size, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini. Representative examples of polynucleotide fragments of the invention include, for example, fragments that comprise, or alternatively, consist of, a sequence from about nucleotide 1 to about 50, from about 51 to about 100, from about 101 to about 150, from about 151 to about 200, from about 201 to about 250, from about 251 to about 300, from about 301 to about 350, from about 351 to about 400, from about 401 to about 450, from about 451 to about 500, and from about 501 to about 550, and from about 551 to about 600, from about 601 to about 650, from about 651 to about 700, from about 701 to about 750, from about 751 to about 800, from about 801 to about 850, from about 851 to about 900, from about 901 to about 950, from about 951 to about 1000, from about 1001 to about 1050, from about 1051 to about 1100, from about 1101 to about 1150, from about 1151 to about 1200, from about 1201 to about 1250, from about 1251 to about 1300, from about 1301 to about 1350, from about 1351 to about 1400, from about 1401 to about 1450, from about 1451 to about 1500, from about 1501 to about 1520 of SEQ ID NO:21, or the complementary strand thereto, or the cDNA contained in the deposited gene. In this context "about" includes the particularly recited ranges, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini. In additional embodiments, the polynucleotides of the invention encode functional attributes of the corresponding protein.
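The representative fragments follow a regular pattern: consecutive 50-nucleotide windows over a 1520-nucleotide sequence, with "about" allowing each endpoint to shift by up to five nucleotides. A minimal sketch of that pattern follows. The helper names `fragment_windows` and `within_about` are hypothetical, and the tolerance of 5 reflects the "(5, 4, 3, 2, or 1)" wording in the text.

```python
def fragment_windows(total_length: int, window: int = 50) -> list[tuple[int, int]]:
    """Return consecutive 1-based (start, end) windows covering the sequence."""
    return [
        (start, min(start + window - 1, total_length))
        for start in range(1, total_length + 1, window)
    ]


def within_about(position: int, nominal: int, tolerance: int = 5) -> bool:
    """True when a position lies within the stated tolerance of a nominal endpoint."""
    return abs(position - nominal) <= tolerance


windows = fragment_windows(1520)
print(windows[0], windows[-1], len(windows))
print(within_about(54, 51), within_about(57, 51))
```

The final window is shorter than the others because 1520 is not a multiple of 50, which matches the last range recited in the text (about 1501 to about 1520).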
Preferred embodiments of the invention in this regard include fragments that comprise alpha-helix and alpha-helix forming regions ("alpha-regions"), beta-sheet and beta-sheet forming regions ("beta-regions"), turn and turn-forming regions ("turn-regions"), coil and coil-forming regions ("coil-regions"), hydrophilic regions, hydrophobic regions, alpha amphipathic regions, beta amphipathic regions, flexible regions, surface-forming regions and high antigenic index regions. The data representing the structural or functional attributes of the protein set forth in Figure 30 and/or Table X, as described above, was generated using the various modules and algorithms of the DNA*STAR set on default parameters. In a preferred embodiment, the data presented in columns VIII, IX, XIII, and XIV of Table X can be used to determine regions of the protein which exhibit a high degree of potential for antigenicity. Regions of high antigenicity are determined from the data presented in columns VIII, IX, XIII, and/or XIV by choosing values which represent regions of the polypeptide which are likely to be exposed on the surface of the polypeptide in an environment in which antigen recognition may occur in the process of initiation of an immune response.
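As a rough illustration of how hydrophobic and hydrophilic regions of the kind listed above can be located, the sketch below computes a sliding-window average over the published Kyte-Doolittle hydropathy scale. It is not the DNA*STAR implementation and makes no claim to reproduce Table X. The window size of 7 and the function name `hydropathy_profile` are assumptions chosen for illustration.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7) -> list[float]:
    """Average hydropathy over each window of consecutive residues."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical test inputs: a hydrophobic stretch and a charged stretch.
hydrophobic = hydropathy_profile("GVVAVVVFAMLCLLIIL")
charged = hydropathy_profile("KEDDKRKEEDK")
print(min(hydrophobic), max(charged))
```

Windows with strongly positive averages suggest buried or membrane-spanning segments, while strongly negative averages suggest surface-exposed segments. Any threshold applied to these values would be a further assumption.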
Certain preferred regions in these regards are set out in Figure 30, but may, as shown in Table X, be represented or identified by using tabular representations of the data presented in Figure 30. The DNA*STAR computer algorithm used to generate Figure 30 (set on the original default parameters) was used to present the data in Figure 30 in a tabular format (See Table X). The tabular format of the data in Figure 30 is used to easily determine specific boundaries of a preferred region. The above-mentioned preferred regions set out in Figure 30 and in Table X include, but are not limited to, regions of the aforementioned types identified by analysis of the amino acid sequence set out in Figures 28A-B. As set out in Figure 30 and in Table X, such preferred regions include Garnier-Robson alpha-regions, beta-regions, turn-regions, and coil-regions, Chou-Fasman alpha-regions, beta-regions, and turn-regions, Kyte-Doolittle hydrophilic regions and Hopp-Woods hydrophobic regions, Eisenberg alpha- and beta-amphipathic regions, Karplus-Schulz flexible regions, Jameson-Wolf regions of high antigenic index and Emini surface-forming regions. Even if deletion of one or more amino acids from the N-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities, ability to multimerize, etc.) may still be retained. For example, the ability of shortened muteins to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptides generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the N-terminus. Whether a particular polypeptide lacking N-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art.
It is not unlikely that a mutein with a large number of deleted N-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the amino terminus of the amino acid sequence shown in Figures 28A-B, up to the threonine residue at position number 359 and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues nl-364 of Figures 28A-B, where nl is an integer from 2 to 359 corresponding to the position of the amino acid residue in Figures SUBSTITUTE SHEET (RULE 26) 28A-B (which is identical to the sequence shown as SEQ ID N0:39). N-terminal deletions of the polypeptide of the invention shown as SEQ ID N0:39 include polypeptides comprising the amino acid sequence of residues: A-2 to R-364; S-3 to 8-364; V-4 to R-364; V-5 to R-364; L-6 to R-364; P-7 to R-364; S-8 to R-364; G-9 to R-364; S-10 to R-364; Q-11 to R-364; C-12 to R-364; A-13 to R-364; A-14 to R-364; A-15 to R-364; A-16 to R-364; A-17 to R-364; A-18 to R-364; A-19 to R-364; A-20 to R-364;
P-2l to R-364; P-22 to R-364; G-23 to R-364; L-24 to R-364; R-25 to R-364; L-26 to 8-364; R-27 to R-364; L-28 to R-364; L-29 to R-364; L-30 to R-364; L-31 to R-364; L-32 to R-364; F-33 to R-364; S-34 to R-364; A-3_5 to R-364; A-36 to R-364; A-37 to R-364;
l0 L-38 to R-364; I-39 to R-364; P-40 to R-364; T-41 to R-364; G-42 to R-364;
D-43 to R-364; G-44 to R-364; Q-45 to R-364; N-46 to R-364; L-47 to R-364; F-48 to R-364; T-49 to R-364; K-.50 to R-364; D-51 to R-364; V-52 to R-364; T-53 to R-364; V-54 to R-364;
I-55 to R-364; E-56 to R-364; G-_57 to R-364; E-58 to R-364; V-59 to R-364; A-60 to 8-364; T-61 to R-364; I-62 to R-364; S-63 to R-364; C-64 to R-364; Q-65 to R-364; V-66 to R-364; N-67 to R-364; K-68 to R-364; S-69 to R-364; D-70 to R-364; D-71 to R-364;
S-72 to R-364; V-73 to R-364; I-74 to R-364; Q-75 to R-364; L-76 to R-364; L-77 to 8-364; N-78 to R-364; P-79 to R-364; N-80 to R-364; R-81 to R-364; Q-82 to R-364; T-83 to R-364; I-84 to R-364; Y-85 to R-364; F-86 to R-364; R-87 to R-364; D-88 to R-364;
F-89 to R-364; R-90 to R-364; P-91 to R-364; L-92 to R-364; K-93 to R-364; D-94 to R-364; S-95 to R-364; R-96 to R-364; F-97 to R-364; Q-98 to R-364; L-99 to R-364; L-100 to R-364; N-101 to R-364; F-102 to R-364; S-103 to R-364; S-104 to R-364; S-105 to 8-364; E-106 to R-364; L-107 to R-364; K-108 to R-364; V-109 to R-364; S-110 to R-364;
L-111 to R-364; T-112 to R-364; N-113 to R-364; V-114 to R-364; S-115 to R-364; I-116 to R-364; S-117 to R-364; D-118 to R-364; E-119 to R-364; G-120 to R-364;
R-121 to R-364; Y-122 to R-364; F-123 to R-364; C-124 to R-364; Q-125 to R-364; L-126 to R-364; Y-127 to R-364; T-128 to R-364; D-129 to R-364; P-130 to R-364; P-131 to R-364; Q-132 to R-364; E-133 to R-364; S-134 to R-364; Y-135 to R-364; T-136 to R-364;
T-137 to R-364; I-138 to R-364; T-139 to R-364; V-140 to R-364; L-141 to R-364; V-142 to R-364; P-143 to R-364; P-144 to R-364; R-145 to R-364; N-146 to R-364;
L-147 to R-364; M-148 to R-364; I-149 to R-364; D-150 to R-364; I-151 to R-364; Q-152 to R-364; K-153 to R-364; D-154 to R-364; T-155 to R-364; A-156 to R-364; V-157 to R-364; E-158 to R-364; G-159 to R-364; E-160 to R-364; E-161 to R-364; I-162 to R-364;
E-163 to R-364; V-164 to R-364; N-165 to R-364; C-166 to R-364; T-167 to R-364; A-168 to R-364; M-169 to R-364; A-170 to R-364; S-171 to R-364; K-172 to R-364;
P-173 to R-364; A-174 to R-364; T-175 to R-364; T-176 to R-364; I-177 to R-364; R-178 to R-364; W-179 to R-364; F-180 to R-364; K-181 to R-364; G-182 to R-364; N-183 to R-364; T-184 to R-364; E-185 to R-364; L-186 to R-364; K-187 to R-364; G-188 to R-364;
K-189 to R-364; S-190 to R-364; E-191 to R-364; V-192 to R-364; E-193 to R-364; E-194 to R-364; W-195 to R-364; S-196 to R-364; D-197 to R-364; M-198 to R-364;
Y-199 to R-364; T-200 to R-364; V-201 to R-364; T-202 to R-364; S-203 to R-364;
Q-204 to R-364; L-205 to R-364; M-206 to R-364; L-207 to R-364; K-208 to R-364; V-209 to R-364; H-210 to R-364; K-211 to R-364; E-212 to R-364; D-213 to R-364; D-214 to R-364; G-215 to R-364; V-216 to R-364; P-217 to R-364; V-218 to R-364; I-219 to R-364;
C-220 to R-364; Q-221 to R-364; V-222 to R-364; E-223 to R-364; H-224 to R-364; P-225 to R-364; A-226 to R-364; V-227 to R-364; T-228 to R-364; G-229 to R-364;
N-230 to R-364; L-231 to R-364; Q-232 to R-364; T-233 to R-364; Q-234 to R-364; R-235 to R-364; Y-236 to R-364; L-237 to R-364; E-238 to R-364; V-239 to R-364; Q-240 to R-364; Y-241 to R-364; K-242 to R-364; P-243 to R-364; Q-244 to R-364; V-245 to R-364; H-246 to R-364; I-247 to R-364; Q-248 to R-364; M-249 to R-364; T-250 to R-364;
Y-251 to R-364; P-252 to R-364; L-253 to R-364; Q-254 to R-364; G-255 to R-364; L-256 to R-364; T-257 to R-364; R-258 to R-364; E-259 to R-364; G-260 to R-364;
D-261 to R-364; A-262 to R-364; L-263 to R-364; E-264 to R-364; L-265 to R-364;
T-266 to R-364; C-267 to R-364; E-268 to R-364; A-269 to R-364; I-270 to R-364; G-271 to R-364; K-272 to R-364; P-273 to R-364; Q-274 to R-364; P-275 to R-364; V-276 to R-364;
M-277 to R-364; V-278 to R-364; T-279 to R-364; W-280 to R-364; V-281 to R-364; R-282 to R-364; V-283 to R-364; D-284 to R-364; D-285 to R-364; E-286 to R-364;
M-287 to R-364; P-288 to R-364; Q-289 to R-364; H-290 to R-364; A-291 to R-364;
V-292 to R-364; L-293 to R-364; S-294 to R-364; G-295 to R-364; P-296 to R-364; N-297 to R-364; L-298 to R-364; F-299 to R-364; I-300 to R-364; N-301 to R-364; N-302 to R-364; L-303 to R-364; N-304 to R-364; K-305 to R-364; T-306 to R-364; D-307 to R-364; N-308 to R-364; G-309 to R-364; T-310 to R-364; Y-311 to R-364; R-312 to R-364; C-313 to R-364; E-314 to R-364; A-315 to R-364; S-316 to R-364; N-317 to R-364;
I-318 to R-364; V-319 to R-364; G-320 to R-364; K-321 to R-364; A-322 to R-364; H-323 to R-364; S-324 to R-364; D-325 to R-364; Y-326 to R-364; M-327 to R-364;
L-328 to R-364; Y-329 to R-364; V-330 to R-364; Y-331 to R-364; D-332 to R-364; P-333 to R-364; P-334 to R-364; T-335 to R-364; T-336 to R-364; I-337 to R-364; P-338 to R-364; P-339 to R-364; P-340 to R-364; T-341 to R-364; T-342 to R-364; T-343 to R-364;
T-344 to R-364; T-345 to R-364; T-346 to R-364; T-347 to R-364; T-348 to R-364; T-349 to R-364; T-350 to R-364; T-351 to R-364; T-352 to R-364; T-353 to R-364;
I-354 to R-364; L-355 to R-364; T-356 to R-364; I-357 to R-364; I-358 to R-364; T-359 to R-364; of SEQ ID NO:39. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
Also as mentioned above, even if deletion of one or more amino acids from the C-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities such as ability to modulate the extracellular matrix, etc.) may still be retained. For example, the ability to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptide generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the C-terminus. Whether a particular polypeptide lacking C-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art. It is not unlikely that a mutein with a large number of deleted C-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the carboxy terminus of the amino acid sequence of the polypeptide shown in Figures 28A-B, up to the leucine residue at position number 6, and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues 1-m1 of Figures 28A-B, where m1 is an integer from 6 to 364 corresponding to the position of the amino acid residue in Figures 28A-B. Moreover, the invention provides polynucleotides encoding polypeptides comprising, or alternatively consisting of, the amino acid sequence of C-terminal deletions of the polypeptide of the invention shown as SEQ ID NO:39. Such C-terminal deletions include polypeptides comprising the amino acid sequence of residues: M-1 to A-363; M-1 to R-362; M-1 to S-361; M-1 to D-360; M-1 to T-359; M-1 to I-358; M-1 to I-357; M-1 to T-356; M-1 to L-355; M-1 to I-354; M-1 to T-353; M-1 to T-352; M-1 to T-351; M-1 to T-350; M-1 to T-349; M-1 to T-348; M-1 to T-347; M-1 to T-346; M-1 to T-345; M-1 to T-344; M-1 to T-343; M-1 to T-342; M-1 to T-341; M-1 to P-340; M-1 to P-339; M-1 to P-338; M-1 to I-337; M-1 to T-336; M-1 to T-335; M-1 to P-334; M-1 to P-333; M-1 to D-332; M-1 to Y-331; M-1 to V-330; M-1 to Y-329; M-1 to L-328; M-1 to M-327; M-1 to Y-326; M-1 to D-325; M-1 to S-324; M-1 to H-323; M-1 to A-322; M-1 to K-321; M-1 to G-320; M-1 to V-319; M-1 to I-318; M-1 to N-317; M-1 to S-316; M-1 to A-315; M-1 to E-314; M-1 to C-313; M-1 to R-312; M-1 to Y-311; M-1 to T-310; M-1 to G-309; M-1 to N-308; M-1 to D-307; M-1 to T-306; M-1 to K-305; M-1 to N-304; M-1 to L-303;
M-1 to N-302; M-1 to N-301; M-1 to I-300; M-1 to F-299; M-1 to L-298; M-1 to N-297;
M-1 to P-296; M-1 to G-295; M-1 to S-294; M-1 to L-293; M-1 to V-292; M-1 to A-291;
M-1 to H-290; M-1 to Q-289; M-1 to P-288; M-1 to M-287; M-1 to E-286; M-1 to D-285; M-1 to D-284; M-1 to V-283; M-1 to R-282; M-1 to V-281; M-1 to W-280; M-1 to T-279; M-1 to V-278; M-1 to M-277; M-1 to V-276; M-1 to P-275; M-1 to Q-274; M-1 to P-273; M-1 to K-272; M-1 to G-271; M-1 to I-270; M-1 to A-269; M-1 to E-268; M-1 to C-267; M-1 to T-266; M-1 to L-265; M-1 to E-264; M-1 to L-263; M-1 to A-262; M-1 to D-261; M-1 to G-260; M-1 to E-259; M-1 to R-258; M-1 to T-257; M-1 to L-256; M-1 to G-255; M-1 to Q-254; M-1 to L-253; M-1 to P-252; M-1 to Y-251; M-1 to T-250;
M-1 to M-249; M-1 to Q-248; M-1 to I-247; M-1 to H-246; M-1 to V-245; M-1 to Q-244; M-1 to P-243; M-1 to K-242; M-1 to Y-241; M-1 to Q-240; M-1 to V-239; M-1 to E-238; M-1 to L-237; M-1 to Y-236; M-1 to R-235; M-1 to Q-234; M-1 to T-233; M-1 to Q-232; M-1 to L-231; M-1 to N-230; M-1 to G-229; M-1 to T-228; M-1 to V-227; M-1 to A-226; M-1 to P-225; M-1 to H-224; M-1 to E-223; M-1 to V-222; M-1 to Q-221;
M-1 to C-220; M-1 to I-219; M-1 to V-218; M-1 to P-217; M-1 to V-216; M-1 to G-215;
M-1 to D-214; M-1 to D-213; M-1 to E-212; M-1 to K-211; M-1 to H-210; M-1 to V-209; M-1 to K-208; M-1 to L-207; M-1 to M-206; M-1 to L-205; M-1 to Q-204;
M-1 to S-203; M-1 to T-202; M-1 to V-201; M-1 to T-200; M-1 to Y-199; M-1 to M-198; M-1 to D-197; M-1 to S-196; M-1 to W-195; M-1 to E-194; M-1 to E-193; M-1 to V-192; M-1 to E-191; M-1 to S-190; M-1 to K-189; M-1 to G-188; M-1 to K-187; M-1 to L-186;
M-1 to E-185; M-1 to T-184; M-1 to N-183; M-1 to G-182; M-1 to K-181; M-1 to F-180; M-1 to W-179; M-1 to R-178; M-1 to I-177; M-1 to T-176; M-1 to T-175; M-1 to A-174; M-1 to P-173; M-1 to K-172; M-1 to S-171; M-1 to A-170; M-1 to M-169; M-1 to A-168; M-1 to T-167; M-1 to C-166; M-1 to N-165; M-1 to V-164; M-1 to E-163; M-1 to I-162; M-1 to E-161; M-1 to E-160; M-1 to G-159; M-1 to E-158; M-1 to V-157; M-1 to A-156; M-1 to T-155; M-1 to D-154; M-1 to K-153; M-1 to Q-152; M-1 to I-151;
M-1 to D-150; M-1 to I-149; M-1 to M-148; M-1 to L-147; M-1 to N-146; M-1 to R-145; M-1 to P-144; M-1 to P-143; M-1 to V-142; M-1 to L-141; M-1 to V-140; M-1 to T-139; M-1 to I-138; M-1 to T-137; M-1 to T-136; M-1 to Y-135; M-1 to S-134; M-1 to E-133; M-1 to Q-132; M-1 to P-131; M-1 to P-130; M-1 to D-129; M-1 to T-128; M-1 to Y-127; M-1 to L-126; M-1 to Q-125; M-1 to C-124; M-1 to F-123; M-1 to Y-122; M-1 to R-121; M-1 to G-120; M-1 to E-119; M-1 to D-118; M-1 to S-117; M-1 to I-116; M-1 to S-115; M-1 to V-114; M-1 to N-113; M-1 to T-112; M-1 to L-111; M-1 to S-110; M-1 to V-109; M-1 to K-108; M-1 to L-107; M-1 to E-106; M-1 to S-105; M-1 to S-104; M-1 to S-103; M-1 to F-102; M-1 to N-101; M-1 to L-100; M-1 to L-99; M-1 to Q-98;
M-1 to F-97; M-1 to R-96; M-1 to S-95; M-1 to D-94; M-1 to K-93; M-1 to L-92; M-1 to P-91;
M-1 to R-90; M-1 to F-89; M-1 to D-88; M-1 to R-87; M-1 to F-86; M-1 to Y-85;
M-1 to I-84; M-1 to T-83; M-1 to Q-82; M-1 to R-81; M-1 to N-80; M-1 to P-79; M-1 to N-78;
M-1 to L-77; M-1 to L-76; M-1 to Q-75; M-1 to I-74; M-1 to V-73; M-1 to S-72;
M-1 to D-71; M-1 to D-70; M-1 to S-69; M-1 to K-68; M-1 to N-67; M-1 to V-66; M-1 to Q-65;
M-1 to C-64; M-1 to S-63; M-1 to I-62; M-1 to T-61; M-1 to A-60; M-1 to V-59;
M-1 to E-58; M-1 to G-57; M-1 to E-56; M-1 to I-55; M-1 to V-54; M-1 to T-53; M-1 to V-52;
M-1 to D-51; M-1 to K-50; M-1 to T-49; M-1 to F-48; M-1 to L-47; M-1 to N-46;
M-1 to Q-45; M-1 to G-44; M-1 to D-43; M-1 to G-42; M-1 to T-41; M-1 to P-40; M-1 to I-39; M-1 to L-38; M-1 to A-37; M-1 to A-36; M-1 to A-35; M-1 to S-34; M-1 to F-33; M-1 to L-32; M-1 to L-31; M-1 to L-30; M-1 to L-29; M-1 to L-28; M-1 to R-27; M-1 to L-26; M-1 to R-25; M-1 to L-24; M-1 to G-23; M-1 to P-22; M-1 to P-21; M-1 to A-20; M-1 to A-19; M-1 to A-18; M-1 to A-17; M-1 to A-16; M-1 to A-15; M-1 to A-14; M-1 to A-13; M-1 to C-12; M-1 to Q-11; M-1 to S-10; M-1 to G-9; M-1 to S-8; M-1 to P-7; M-1 to L-6; of SEQ ID NO:39. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
In addition, the invention provides nucleic acid molecules having nucleotide sequences related to extensive portions of SEQ ID NO:21 which have been determined from the following related cDNA genes: HSQFJ92R (SEQ ID NO:139), HFLAB18F (SEQ ID NO:140), HAQBH82R (SEQ ID NO:141), HLHTM10R (SEQ ID NO:142), and HLHAL65R (SEQ ID NO:143).
This gene is expressed primarily in immune system related tissues such as ulcerative colitis, rejected kidney tissues, and to a lesser extent in thymus and bone marrow. Therefore, polynucleotides and polypeptides of the invention are useful as reagents for differential identification of the tissue(s) or cell type(s) present in a biological sample and for diagnosis of diseases and conditions which include, but are not limited to, immune and hematopoietic diseases and/or disorders, particularly ulcerative colitis and rejected organs. Similarly, polypeptides and antibodies directed to these polypeptides are useful in providing immunological probes for differential identification of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, particularly of the immune system, expression of this gene at significantly higher or lower levels is routinely detected in certain tissues or cell types (e.g., transplanted kidney, immune, hematopoietic, renal, and cancerous and wounded tissues) or bodily fluids (e.g., lymph, serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample taken from an individual having such a disorder, relative to the standard gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an individual not having the disorder.
Preferred polypeptides of the present invention comprise immunogenic epitopes shown in SEQ ID NO: 39 as residues: Gly-42 to Phe-48, Val-66 to Asp-71, Asn-78 to Thr-83, Asp-88 to Arg-96, Tyr-127 to Tyr-135, Lys-181 to Trp-195, His-210 to Gly-215, Leu-303 to Thr-310, Thr-341 to Thr-350. Polynucleotides encoding said polypeptides are also provided.
The tissue distribution primarily in immune cells and tissues, combined with the homology to the CRTAM, thymocyte activation and developmental protein, the class-I MHC-restricted T cell associated molecule protein, and the poliovirus receptor, indicates that the protein products of this gene are useful for the regulation of cellular physiology, development, differentiation or function of various cell types, including haematopoietic cells and particularly T-cell progenitors. Representative uses are described in the "Immune Activity" and "infectious disease" sections below, in Examples 11, 13, 14, 16, 18, 19, 20, and 27, and elsewhere herein. The proteins can be used to develop products for the diagnosis and treatment of conditions associated with abnormal physiology or development, including abnormal proliferation, e.g. cancers, or degenerative conditions.
The physiology or development of a cell can be modulated by contacting the cell with an agonist or antagonist (i.e. an anti-CRTAM-like peptide antibody). Further, uses of the CRTAM-like polypeptides of the present invention include treatment of ulcerative colitis, organ rejection and other immune system related disorders. Agonists or antagonists may treat or prevent such disorders as ulcerative colitis and rejected organs, such as kidney. Based upon the tissue distribution of this protein, antagonists directed against this protein are useful in blocking the activity of this protein. Accordingly, preferred are antibodies which specifically bind a portion of the translation product of this gene.
Also provided is a kit for detecting tumors in which expression of this protein occurs. Such a kit comprises in one embodiment an antibody specific for the translation product of this gene bound to a solid support. Also provided is a method of detecting these tumors in an individual which comprises a step of contacting an antibody specific for the translation product of this gene to a bodily fluid from the individual, preferably serum, and ascertaining whether antibody binds to an antigen found in the bodily fluid.
Preferably the antibody is bound to a solid support and the bodily fluid is serum. The above embodiments, as well as other treatments and diagnostic tests (kits and methods), are more particularly described elsewhere herein. Furthermore, the protein may also be used to determine biological activity, to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify agents that modulate their interactions, in addition to its use as a nutritional supplement. The protein, as well as antibodies directed against the protein, may show utility as a tumor marker and/or immunotherapy target for the above listed tissues.
Many polynucleotide sequences, such as EST sequences, are publicly available and accessible through sequence databases. Some of these sequences are related to SEQ ID NO:21 and may have been publicly available prior to conception of the present invention. Preferably, such related polynucleotides are specifically excluded from the scope of the present invention. To list every related sequence is cumbersome.
Accordingly, preferably excluded from the present invention are one or more polynucleotides comprising a nucleotide sequence described by the general formula of a-b, where a is any integer between 1 to 1506 of SEQ ID NO:21, b is an integer of 15 to 1520, where both a and b correspond to the positions of nucleotide residues shown in SEQ ID NO:21, and where b is greater than or equal to a + 14.
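The constraints on the general formula a-b can be illustrated with a short sketch. It assumes that "between 1 to 1506" is inclusive and that the bounds follow from a 1520-nucleotide sequence with a minimum span of 15 nucleotides; the function names `in_formula` and `count_ranges` are hypothetical and are not part of the specification.

```python
def in_formula(a: int, b: int, seq_len: int = 1520, min_len: int = 15) -> bool:
    """Return True if positions (a, b) satisfy the stated constraints.

    Assumed reading: 1 <= a <= seq_len - min_len + 1, min_len <= b <= seq_len,
    and b >= a + (min_len - 1), i.e. b >= a + 14 for the default values.
    """
    return (
        1 <= a <= seq_len - min_len + 1
        and min_len <= b <= seq_len
        and b >= a + min_len - 1
    )


def count_ranges(seq_len: int = 1520, min_len: int = 15) -> int:
    """Count the (a, b) pairs that satisfy the constraints.

    For each start a there are (n + 1 - a) admissible ends, where
    n = seq_len - min_len + 1, so the total is n * (n + 1) / 2.
    """
    n = seq_len - min_len + 1
    return n * (n + 1) // 2


if __name__ == "__main__":
    print(in_formula(1, 15))       # shortest span at the 5' end
    print(in_formula(1506, 1520))  # shortest span at the 3' end
    print(in_formula(10, 20))      # span of 11 nt, too short
    print(count_ranges())          # number of distinct a-b ranges
```

Under these assumptions the formula covers every contiguous span of at least 15 nucleotides within the sequence, which is why the text describes listing each related sequence individually as cumbersome.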
FEATURES OF PROTEIN ENCODED BY GENE NO: 12
Figure 31 shows the nucleotide (SEQ ID NO:22) and deduced amino acid sequence (SEQ ID NO:40) of the present invention. Predicted amino acids from about 1 to about 23 constitute the predicted signal peptide (amino acid residues from about 1 to about 23 in SEQ ID NO:40) and are represented by the underlined amino acid regions.
Figure 32 shows the regions of similarity between the amino acid sequences of the present invention SEQ ID NO:40 and the human FAP protein (gi|1890647) (SEQ ID NO:146).
Figure 33 shows an analysis of the amino acid sequence of SEQ ID NO:40. Alpha, beta, turn and coil regions; hydrophilicity and hydrophobicity; amphipathic regions; flexible regions; antigenic index and surface probability are shown.
The present invention provides isolated nucleic acid molecules comprising a polynucleotide encoding the polypeptide having the amino acid sequence shown in Figure 31 (SEQ ID NO:40), which was determined by sequencing a cloned cDNA. The nucleotide sequence shown in Figure 31 (SEQ ID NO:22) was obtained by sequencing a cloned cDNA (HPWCM76), which was deposited on Nov. 17, 1998 at the American Type Culture Collection, and given Accession Number 203484. The deposited gene is inserted in the pSport plasmid (Life Technologies, Rockville, MD) using the SalI/NotI restriction endonuclease cleavage sites.
The present invention is further directed to fragments of the isolated nucleic acid molecules described herein. By a fragment of an isolated DNA molecule having the nucleotide sequence of the deposited cDNA or the nucleotide sequence shown in SEQ ID NO:22 is intended DNA fragments at least about 15 nt, and more preferably at least about 20 nt, still more preferably at least about 30 nt, and even more preferably, at least about 40 nt in length which are useful as diagnostic probes and primers as discussed herein. Of course, larger fragments 50-1500 nt in length are also useful according to the present invention, as are fragments corresponding to most, if not all, of the nucleotide sequence of the deposited cDNA or as shown in SEQ ID NO:22. By a fragment at least 20 nt in length, for example, is intended fragments which include 20 or more contiguous bases from the nucleotide sequence of the deposited cDNA or the nucleotide sequence as shown in SEQ ID NO:22. In this context "about" includes the particularly recited size, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini. Representative examples of polynucleotide fragments of the invention include, for example, fragments that comprise, or alternatively, consist of, a sequence from about nucleotide 1 to about 50, from about 51 to about 100, from about 101 to about 150, from about 151 to about 200, from about 201 to about 250, from about 251 to about 300, from about 301 to about 350, from about 351 to about 400, from about 401 to about 450, from about 451 to about 500, from about 501 to about 550, from about 551 to about 600, from about 601 to about 650, from about 651 to about 700, from about 701 to about 750, from about 751 to about 800, and from about 801 to about 807 of SEQ ID NO:22, or the complementary strand thereto, or the cDNA contained in the deposited gene. In this context "about" includes the particularly recited ranges, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini. In additional embodiments, the polynucleotides of the invention encode functional attributes of the corresponding protein.
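The representative fragment ranges recited above follow a regular pattern: consecutive 50-nucleotide windows across the 807-nucleotide sequence, with a shorter final window. The sketch below reproduces that pattern under the assumption of 1-based, inclusive coordinates; the function name `fragment_windows` is hypothetical, and the "about" tolerance of up to 5 nucleotides at either terminus is deliberately not modeled.

```python
from typing import List, Tuple


def fragment_windows(seq_len: int, width: int = 50) -> List[Tuple[int, int]]:
    """Split positions 1..seq_len into consecutive windows of `width` nt.

    Coordinates are 1-based and inclusive; the last window is truncated
    at seq_len when seq_len is not a multiple of width.
    """
    return [
        (start, min(start + width - 1, seq_len))
        for start in range(1, seq_len + 1, width)
    ]


if __name__ == "__main__":
    # 807 is the length implied by the final recited range (801 to 807).
    for start, end in fragment_windows(807):
        print(f"from about {start} to about {end}")
```

Each tuple corresponds to one recited range; applying the stated tolerance would widen or narrow either boundary by up to 5 positions.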
Preferred embodiments of the invention in this regard include fragments that comprise alpha-helix and alpha-helix forming regions ("alpha-regions"), beta-sheet and beta-sheet forming regions ("beta-regions"), turn and turn-forming regions ("turn-regions"), coil and coil-forming regions ("coil-regions"), hydrophilic regions, hydrophobic regions, alpha amphipathic regions, beta amphipathic regions, flexible regions, surface-forming regions and high antigenic index regions. The data representing the structural or functional attributes of the protein set forth in Figure 33 and/or Table XI, as described above, was generated using the various modules and algorithms of the DNA*STAR set on default parameters. In a preferred embodiment, the data presented in columns VIII, IX, XIII, and XIV of Table XI can be used to determine regions of the protein which exhibit a high degree of potential for antigenicity. Regions of high antigenicity are determined from the data presented in columns VIII, IX, XIII, and/or XIV by choosing values which represent regions of the polypeptide which are likely to be exposed on the surface of the polypeptide in an environment in which antigen recognition may occur in the process of initiation of an immune response.
Certain preferred regions in these regards are set out in Figure 33, but may, as shown in Table XI, be represented or identified by using tabular representations of the data presented in Figure 33. The DNA*STAR computer algorithm used to generate Figure 33 (set on the original default parameters) was used to present the data in Figure 33 in a tabular format (See Table XI). The tabular format of the data in Figure 33 is used to easily determine specific boundaries of a preferred region. The above-mentioned preferred regions set out in Figure 33 and in Table XI include, but are not limited to, regions of the aforementioned types identified by analysis of the amino acid sequence set out in Figure 31. As set out in Figure 33 and in Table XI, such preferred regions include Garnier-Robson alpha-regions, beta-regions, turn-regions, and coil-regions, Chou-Fasman alpha-regions, beta-regions, and turn-regions, Kyte-Doolittle hydrophilic regions and Hopp-Woods hydrophobic regions, Eisenberg alpha- and beta-amphipathic regions, Karplus-Schulz flexible regions, Jameson-Wolf regions of high antigenic index and Emini surface-forming regions.
Even if deletion of one or more amino acids from the N-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities, ability to multimerize, etc.) may still be retained. For example, the ability of shortened muteins to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptides generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the N-terminus. Whether a particular polypeptide lacking N-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art.
It is not unlikely that a mutein with a large number of deleted N-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the amino terminus of the amino acid sequence shown in Figure 31, up to the arginine residue at position number 61, and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues n1-66 of Figure 31, where n1 is an integer from 2 to 61 corresponding to the position of the amino acid residue in Figure 31 (which is identical to the sequence shown as SEQ ID NO:40). N-terminal deletions of the polypeptide of the invention shown as SEQ ID NO:40 include polypeptides comprising the amino acid sequence of residues: S-2 to N-66; S-3 to N-66; S-4 to N-66; S-5 to N-66;
L-6 to N-66; K-7 to N-66; H-8 to N-66; L-9 to N-66; L-10 to N-66; C-11 to N-66; M-12 to N-66; A-13 to N-66; L-14 to N-66; S-15 to N-66; W-16 to N-66; F-17 to N-66;
S-18 to N-66; S-19 to N-66; F-20 to N-66; I-21 to N-66; S-22 to N-66; G-23 to N-66;
E-24 to N-66; T-25 to N-66; S-26 to N-66; F-27 to N-66; S-28 to N-66; L-29 to N-66; L-30 to N-66; N-31 to N-66; S-32 to N-66; F-33 to N-66; F-34 to N-66; L-35 to N-66; P-36 to N-66; Y-37 to N-66; P-38 to N-66; S-39 to N-66; S-40 to N-66; R-41 to N-66; C-42 to N-66; C-43 to N-66; C-44 to N-66; F-45 to N-66; S-46 to N-66; V-47 to N-66; Q-48 to N-66; C-49 to N-66; S-50 to N-66; I-51 to N-66; L-52 to N-66; D-53 to N-66; P-54 to N-66; F-55 to N-66; S-56 to N-66; C-57 to N-66; N-58 to N-66; S-59 to N-66; M-60 to N-66; R-61 to N-66; of SEQ ID NO:40. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
Also as mentioned above, even if deletion of one or more amino acids from the C-terminus of a protein results in modification or loss of one or more biological functions of the protein, other functional activities (e.g., biological activities such as ability to modulate the extracellular matrix, etc.) may still be retained. For example, the ability to induce and/or bind to antibodies which recognize the complete or mature forms of the polypeptide generally will be retained when less than the majority of the residues of the complete or mature polypeptide are removed from the C-terminus. Whether a particular polypeptide lacking C-terminal residues of a complete polypeptide retains such immunologic activities can readily be determined by routine methods described herein and otherwise known in the art. It is not unlikely that a mutein with a large number of deleted C-terminal amino acid residues may retain some biological or immunogenic activities. In fact, peptides composed of as few as six amino acid residues may often evoke an immune response.
Accordingly, the present invention further provides polypeptides having one or more residues deleted from the carboxy terminus of the amino acid sequence of the polypeptide shown in Figure 31, up to the leucine residue at position number 6, and polynucleotides encoding such polypeptides. In particular, the present invention provides polypeptides comprising the amino acid sequence of residues 1-m1 of Figure 31, where m1 is an integer from 6 to 66 corresponding to the position of the amino acid residue in Figure 31. Moreover, the invention provides polynucleotides encoding polypeptides comprising, or alternatively consisting of, the amino acid sequence of C-terminal deletions of the polypeptide of the invention shown as SEQ ID NO:40. Such C-terminal deletions include polypeptides comprising the amino acid sequence of residues: M-1 to E-65; M-1 to W-64; M-1 to P-63; M-1 to F-62; M-1 to R-61; M-1 to M-60; M-1 to S-59; M-1 to N-58;
M-1 to C-57; M-1 to S-56; M-1 to F-55; M-1 to P-54; M-1 to D-53; M-1 to L-52;
M-1 to I-51; M-1 to S-50; M-1 to C-49; M-1 to Q-48; M-1 to V-47; M-1 to S-46; M-1 to F-45;
M-1 to C-44; M-1 to C-43; M-1 to C-42; M-1 to R-41; M-1 to S-40; M-1 to S-39;
M-1 to P-38; M-1 to Y-37; M-1 to P-36; M-1 to L-35; M-1 to F-34; M-1 to F-33; M-1 to S-32;
M-1 to N-31; M-1 to L-30; M-1 to L-29; M-1 to S-28; M-1 to F-27; M-1 to S-26;
M-1 to T-25; M-1 to E-24; M-1 to G-23; M-1 to S-22; M-1 to I-21; M-1 to F-20; M-1 to S-19;
M-1 to S-18; M-1 to F-17; M-1 to W-16; M-1 to S-15; M-1 to L-14; M-1 to A-13;
M-1 to M-12; M-1 to C-11; M-1 to L-10; M-1 to L-9; M-1 to H-8; M-1 to K-7; M-1 to L-6; of SEQ ID NO:40. Polypeptides encoded by these polynucleotides are also encompassed by the invention.
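The N-terminal and C-terminal deletion series for SEQ ID NO:40 can be generated mechanically from the sequence, which may help when checking the lists for completeness. The sketch below is illustrative only: the 66-residue string `SEQ_40` was reassembled from the residue labels in the deletion lists themselves (Figure 31 remains the authoritative source), and the function names are hypothetical.

```python
# Assumption: sequence reassembled from the residue labels recited in the
# deletion lists for SEQ ID NO:40; Figure 31 is the authoritative source.
SEQ_40 = (
    "MSSSSLKHLL"  # residues 1-10
    "CMALSWFSSF"  # residues 11-20
    "ISGETSFSLL"  # residues 21-30
    "NSFFLPYPSS"  # residues 31-40
    "RCCCFSVQCS"  # residues 41-50
    "ILDPFSCNSM"  # residues 51-60
    "RFPWEN"      # residues 61-66
)


def n_terminal_deletions(seq: str, last_start: int) -> list:
    """Labels for residues n1..len(seq), with n1 running from 2 to last_start."""
    end = f"{seq[-1]}-{len(seq)}"
    return [f"{seq[n - 1]}-{n} to {end}" for n in range(2, last_start + 1)]


def c_terminal_deletions(seq: str, first_end: int) -> list:
    """Labels for residues 1..m1, with m1 running from len(seq)-1 down to first_end."""
    start = f"{seq[0]}-1"
    return [
        f"{start} to {seq[m - 1]}-{m}"
        for m in range(len(seq) - 1, first_end - 1, -1)
    ]


if __name__ == "__main__":
    print("; ".join(n_terminal_deletions(SEQ_40, last_start=61)))
    print("; ".join(c_terminal_deletions(SEQ_40, first_end=6)))
```

Both series stop short of removing the final six residues at the opposite terminus, consistent with the statement that peptides of as few as six amino acid residues may evoke an immune response.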
In addition, the invention provides nucleic acid molecules having nucleotide sequences related to extensive portions of SEQ ID NO:22 which have been determined from the following related cDNA genes: HPWCM76R (SEQ ID NO:147).
This gene is expressed primarily in prostate BPH (benign prostatic hyperplasia) tissue. Therefore, polynucleotides and polypeptides of the invention are useful as reagents for differential identification of the tissue(s) or cell type(s) present in a biological sample and for diagnosis of diseases and conditions which include, but are not limited to, inflammation of the prostate, or related tissues. Similarly, polypeptides and antibodies directed to these polypeptides are useful in providing immunological probes for differential identification of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, particularly of the prostate, expression of this gene at significantly higher or lower levels is routinely detected in certain tissues or cell types (e.g. prostate, cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid and spinal fluid) or another tissue or cell sample taken from an individual having such a disorder, relative to the standard gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an individual not having the disorder.
The tissue distribution in prostate BPH tissue indicates that polynucleotides and polypeptides corresponding to this gene are useful for the treatment of inflammatory conditions which result in an enlargement of the prostate, or related tissues.
Polynucleotides and polypeptides corresponding to this gene are useful for the treatment and diagnosis of conditions concerning proper testicular function (e.g. endocrine function, sperm maturation), as well as cancer. Therefore, this gene product is useful in the treatment of male infertility and/or impotence. This gene product is also useful in assays designed to identify binding agents, as such agents (antagonists) are useful as male contraceptive agents. Similarly, the protein is believed to be useful in the treatment and/or diagnosis of testicular cancer. The testes are also a site of active gene expression of transcripts that are expressed, particularly at low levels, in other tissues of the body.
Therefore, this gene product is expressed in other specific tissues or organs where it may play related functional roles in other processes, such as hematopoiesis, inflammation, bone formation, and kidney function, to name a few possible target indications. Based upon the tissue distribution of this protein, antagonists directed against this protein are useful in blocking the activity of this protein. Accordingly, preferred are antibodies which specifically bind a portion of the translation product of this gene.
Also provided is a kit for detecting tumors in which expression of this protein occurs. Such a kit comprises in one embodiment an antibody specific for the translation product of this gene bound to a solid support. Also provided is a method of detecting these tumors in an individual which comprises a step of contacting an antibody specific for the translation product of this gene to a bodily fluid from the individual, preferably serum, and ascertaining whether antibody binds to an antigen found in the bodily fluid.
Preferably the antibody is bound to a solid support and the bodily fluid is serum. The above embodiments, as well as other treatments and diagnostic tests (kits and methods), are more particularly described elsewhere herein. The homology to the FAP protein indicates that the protein product of this gene is useful in treating, detecting, and/or preventing iron metabolism disorders, particularly those resulting in high oxidative states, tissue damage, atherosclerosis, free radical damage, vascular disorders, iron binding protein dysfunction, nitric oxide synthase dysfunction or aberration, vasodilation disorders, and tissue edema. Based on the sequence similarity, the translation product of this gene is expected to share at least some biological activities with iron metabolism modulatory proteins. Such activities are known in the art, some of which are described elsewhere herein. Furthermore, the protein may also be used to determine biological activity, to raise antibodies, as tissue markers, to isolate cognate ligands or receptors, to identify agents that modulate their interactions, in addition to its use as a nutritional supplement. The protein, as well as antibodies directed against the protein, may show utility as a tumor marker and/or immunotherapy target for the above listed tissues.
Many polynucleotide sequences, such as EST sequences, are publicly available and accessible through sequence databases. Some of these sequences are related to SEQ ID NO:22 and may have been publicly available prior to conception of the present invention. Preferably, such related polynucleotides are specifically excluded from the scope of the present invention. To list every related sequence is cumbersome.
Accordingly, preferably excluded from the present invention are one or more polynucleotides comprising a nucleotide sequence described by the general formula of a-b, where a is any integer between 1 and 793 of SEQ ID NO:22, b is an integer of 15 to 807, where both a and b correspond to the positions of nucleotide residues shown in SEQ ID NO:22, and where b is greater than or equal to a + 14.
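The a-b formula above can be read as a simple range test on nucleotide positions. The following is only an illustrative sketch, not part of the specification: the function and constant names are hypothetical, and it assumes that the stated bounds (1 to 793 for a, 15 to 807 for b) are inclusive.

```python
# Illustrative sketch only; names are hypothetical and bounds are assumed inclusive.
MAX_A = 793     # largest start position a stated for SEQ ID NO:22
MIN_B = 15      # smallest end position b
MAX_B = 807     # largest end position b
MIN_OFFSET = 14  # b must be >= a + 14, i.e. a fragment of at least 15 residues


def is_excluded_fragment(a: int, b: int) -> bool:
    """Return True if positions a..b satisfy the a-b formula for SEQ ID NO:22."""
    return 1 <= a <= MAX_A and MIN_B <= b <= MAX_B and b >= a + MIN_OFFSET


def count_excluded_fragments() -> int:
    """Count every (a, b) pair that satisfies the formula."""
    return sum(
        1
        for a in range(1, MAX_A + 1)
        for b in range(MIN_B, MAX_B + 1)
        if is_excluded_fragment(a, b)
    )


if __name__ == "__main__":
    print(is_excluded_fragment(1, 15))     # shortest fragment starting at position 1
    print(is_excluded_fragment(1, 14))     # too short: b < a + 14
    print(count_excluded_fragments())
```

Under these assumptions each start position a admits 794 - a end positions, which illustrates why listing every related sequence individually would be cumbersome.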
Table I
Res Position I II III IV V VI VII VIII IX X XI XII XIII XIV
Met 1 . . B . . . . -0.10 0.44 . . . -0.40 0.48
Val 2 . . B . . . . 0.08 0.01 . . . 0.15 0.63
Ala 3 . . B . . . . 0.47 0.01 . . . 0.40 0.76
Gln 4 . . B . . . . 0.51 -0.01 . . . 1.40 1.33
Asp 5 . . . . . T C 0.23 -0.20 . . F 2.20 1.77
Pro 6 . . . . T T . 0.02 -0.27 * * F 2.50 0.94
Gln 7 . . . . T T . 0.88 -0.09 * * F 2.25 0.45
Gly 8 . . . . T T . 0.66 -0.09 . * F 2.00 0.4?
Cys 9 . A B . . . . -0.01 0.60 . * . -0.10 0.25
Leu 10 . A B . . . . -0.82 0.74 . * . -0.35 0.08
Gln 11 . A B . . . . -0.91 1.03 . * . -0.60 0.06
Leu 12 . A B . . . . -0.91 0.99 . * . -0.60 0.16
Cys 13 . A B . . . . -1.42 0.41 . * . -0.60 0.34
Leu 14 . A B . . . . -1.34 0.37 * . . -0.30 0.14
Ser 15 . A B . . . . -0.53 0.47 * * . -0.60 0.18
Glu 16 . A B . . . . -0.88 0.19 * . . -0.30 0.53
Val 17 . . B . . T . -0.88 0.04 * * . 0.10 0.63
Ala 18 A . . . . T . -0.10 0.04 * . . 0.10 0.39
Asn 19 . . . . T T . 0.71 -0.34 * . . 1.31 0.44
Gly 20 . . . . T T . 0.80 0.06 * * F 1.07 0.96
Leu 21 . . . . . . C -0.06 -0.16 * * F 1.63 1.47
Arg 22 . . . . . . C 0.50 -0.01 * * F 1.69 0.68
Asn 23 . . . . . T C 0.49 -0.03 * . F 2.10 0.92
Pro 24 . . B . . T . -0.37 0.16 * . F 1.24 1.10
Val 25 . . B . . T . -0.06 0.11 * * . 0.73 0.42
Ser 26 . . B . . T . 0.17 0.61 * * . 0.22 0.35
Met 27 . . B B . . . -0.29 0.71 * . . -0.39 0.23
Val 28 . . B B . . . -0.29 0.71 . . . -0.60 0.31
His 29 . . B B . . . -0.42 0.07 . . . 0.00 0.38
Ala 30 A . B . . . . 0.12 0.11 . . . 0.50 0.38
Gly 31 . . . . T T . 0.39 -0.01 * . F 2.15 0.74
Asp 32 . . . . T T . 1.10 -0.16 * * F 2.45 0.74
Gly 33 . . . . . T C 1.26 -0.66 ? * F 3.00 1.44
Thr 34 . . . . . T C 0.59 -0.37 * * F 2.40 1.26
His 35 . . B B . . . 0.32 -0.01 * * . 1.?0 0.65
Arg 36 . . B B . . . 0.08 0.63 . * . 0.00 0.49
Phe 37 . . B B . . . 0.08 0.70 . * . -0.30 0.34
Phe 38 . . B B . . . 0.42 0.21 . * . -0.30 0.?4
Val 39 A . . B . . . -0.12 0.11 . * . -0.30 0.39
Ala 40 A . . B . . . -0.43 0.76 ? ? ? -0.60 0.33
Glu 41 A . . B . . . -1.40 0.40 . * . -0.60 0.38
Gln 42 A . . B . . . -1.56 0.26 . . . -0.30 0.38
Val 43 A . . B . . . -1.14 0.26 . . . -0.30 0.28
Gly 44 . . B B . . . -1.14 0.67 . . . -0.60 0.17
Val 45 . . B B . . . -0.80 1.31 . . . -0.60 0.07
Val 46 . . B B . . . -1.61 1.67 . . . -0.60 0.15
Trp 47 . . B B . . . -1.82 1.71 . . . -0.60 0.13
Val 48 . . B B . . . -0.97 1.71 . . . -0.60 0.27
Tyr 49 . . B . . . . -0.97 1.07 . . . -0.16 0.60
Leu 50 . . B . . T . -0.41 0.86 * . . 0.28 0.56
Pro 51 . . . . T T . 0.56 0.33 * * F 1.52 1.01
Asp 52 . . . . T T . 0.03 -0.31 . * F 2.36 1.27
Gly 53 . . . . . T C 0.89 -0.39 * * F 2.40 1.27
Ser 54 . A . . . . C 1.13 -1.07 * . F 2.06 1.42
Arg 55 . A B . . . . 1.73 -1.10 * * F 1.62 1.47
Leu 56 . A B . . . . 1.24 -0.67 * . F 1.38 2.30
Glu 57 . A B . . . . 0.43 -0.31 * . F 0.84 1.49
Gln 58 . A B . . . . 0.78 -0.01 * * F 0.45 0.63
Pro 59 A A . . . . . 0.27 -0.01 * * F 0.60 1.?7
Phe 60 A A . . . . . 0.20 -0.01 * * . 0.30 0.60
Leu 61 A A . . . . . 1.01 -0.01 * . . 0.30 0.70
Asp 62 A A . . . . . 0.12 -0.01 * . . 0.30 0.73
Leu 63 A . . B . . . -0.73 0.24 * . . -0.30 0.59
Lys 64 A . . B . . . -1.33 0.10 . . . -0.30 0.53
Asn 65 . . B B . . . -0.94 0.10 . . . -0.30 0.26
Ile 66 . . B B . . . -0.44 0.59 . * . -0.60 0.46
Val 67 . . B B . . . -0.66 0.39 . . . -0.30 0.33
Leu 68 . . B B . . . -0.13 0.81 * . . -0.60 0.32
Thr 69 . . B B . . . -1.07 1.33 . . F -0.45 0.48
Thr 70 . . B B . . . -1.41 1.33 . . F -0.45 0.45
Pro 71 . . B B . . . -0.52 1.11 . . F -0.45 0.54
Trp 72 . . . B T . . 0.33 0.43 . . . -0.20 0.62
Ile 73 . . B B . . . 1.26 -0.06 . * . 0.61 0.75
Gly 74 . . B B . . . 1.22 -0.54 . . F 1.37 0.95
Asp 75 . . . . . T C 0.83 -0.54 . . F 2.28 0.89
Glu 76 . . B . . T . 0.23 -0.67 . . F 2.54 1.10
Arg 77 . . . . T T . 0.18 -0.67 . * F 3.10 0.92
Gly 78 . . . . T T . 0.26 -0.67 . . F 2.79 0.54
Phe 79 A . . . . . . 0.01 0.01 . * . 0.83 0.36
Leu 80 A . . . . . . -0.69 0.51 . * . 0.22 0.13
Gly 81 A . . . . . . -0.72 1.30 . * . -0.09 0.12
Leu 82 A . . . . . . -1.04 1.37 * * . -0.40 0.18
Ala 83 A . . . . . . -0.66 1.01 . * . -0.40 0.34
Phe 84 A . . . . . . -0.66 0.33 . * . -0.10 0.70
His 85 A . . . . T . 0.27 0.69 * * . -0.20 0.73
Pro 86 A . . . . T . 0.58 0.00 . * . 0.25 1.42
Lys 87 A . . . . T . 1.39 0.00 . * . 0.56 2.23
Phe 88 A . . . . T . 2.09 -0.39 * * . 1.47 2.63
Arg 89 A . . . . . . 2.83 -0.89 * * . 1.88 3.33
His 90 A . . . T T . 2.17 -1.31 * * . 2.79 3.33
Asn 91 . . . . T T . 2.13 -0.53 . * . 3.10 3.33
Arg 92 . . . . T T . 1.20 -0.56 . * . 2.79 2.67
Lys 93 . . . . T T . 1.66 0.13 * * . 1.58 1.37
Phe 94 . . . B T . . 1.30 0.39 * * . 0.87 1.34
Tyr 95 . . B B . . . 1.03 0.74 . . . -0.14 1.07
Ile 96 . . B B . . . 0.37 1.13 * . . -0.60 0.72
Tyr 97 . . B . . T . -0.56 1.70 * . . -0.20 0.44
Tyr 98 . . B . . T . -0.60 1.60 . * . -0.20 0.23
Ser 99 . . B . . T . 0.14 0.84 . . . -0.20 0.56
Cys 100 A . . . . T . 0.43 0.16 . . . 0.10 0.71
Leu 101 A A . . . . . 1.37 -0.60 . . . 0.60 0.91
Asp 102 A A . . . . . 0.76 -1.36 . . F 0.90 1.35
Lys 103 A A . . . . . 1.00 -1.10 . . F 0.90 1.87
Lys 104 A A . . . . . 1.34 -1.67 . . F 0.90 3.94
Lys 105 A A . . . . . 1.12 -2.36 * * F 0.90 4.71
Val 106 A A . . . . . 2.04 -1.67 . * F 0.90 1.65
Glu 107 A A . . . . . 1.16 -1.67 . * F 0.90 1.62
Lys 108 A A . . . . . 0.81 -0.99 . * F 0.75 0.57
Ile 109 A A . . . . . 0.77 -0.60 . * F 0.90 1.02
Arg 110 A A . . . . . 0.12 -1.24 . . . 0.75 1.02
Ile 111 A A . . . . . 1.02 -0.63 . * . 0.60 0.51
Ser 112 A A . . . . . 0.?7 -0.63 . * F 0.90 1.45
Glu 113 A A . . . . . -0.18 -0.67 ? * F 0.75 0.55
Met 114 A A . . . . . 0.82 -0.29 * * F 0.60 1.05
Lys 115 A A . . . . . 0.12 -0.97 * . F 0.90 1.53
Val 116 . A B . . . . 1.01 -0.86 . . F 0.75 0.89
Ser 117 A A . . . . . 1.10 -0.86 . * F 0.90 1.51
Arg 118 A A . . . . . 1.10 -1.04 . . F 0.90 1.17
Ala 119 A A . . . . . 1.?4 -0.64 * . F 0.90 2.52
Asp 120 A . . . . T . 1.11 -1.29 * . F 1.30 3.77
Pro 121 A . . . . T . 1.97 -1.17 * . F 1.30 1.94
Asn 122 A . . . . T . ?.46 -1.17 . * F 1.30 3.21
Lys 123 A . . . . T . 1.39 -0.99 . * F 1.30 1.59
Ala 124 A A . . . . . 1.68 -0.99 . * F 0.90 2.05
Asp 125 A A . . . . . 1.68 -1.03 * . F 0.90 1.71
Leu 126 A A . . . . . 2.00 -1.43 * * F 0.90 1.48
Lys 127 A A . . . . . 1.14 -1.43 * * F 0.90 2.87
Ser 128 A A . . . . . 0.21 -1.29 * * F 0.90 1.28
Glu 129 A A . . . . . -0.01 -0.60 * * F 0.90 1.08
Arg 130 A A . . . . . -0.01 -0.60 * * F 0.75 0.45
Val 131 A A . . . . . -0.09 -0.60 * * . 0.60 0.58
Ile 132 A A . . . . . -0.13 -0.30 * * . 0.30 0.23
Leu 133 A A . . . . . 0.17 -0.30 * * . 0.30 0.21
Glu 134 A A . . . . . -0.04 -0.30 * * . 0.30 0.48
Ile 135 A A . . . . . -0.74 -0.51 * * . 0.75 1.06
Glu 136 A A . . . . . -0.19 -0.70 * * F 0.90 1.30
Glu 137 A A . . . . . 0.70 -1.00 * . F 0.90 1.01
Pro 138 A . . . . T . 1.48 -0.60 . * F 1.30 2.31
Ala 139 A . . . . T . 1.48 -0.79 . . F 1.58 1.82
Ser 140 A . . . . T . 2.02 -0.39 . . F 1.56 1.69
Asn 141 . . . . . T C 1.68 0.04 . . F 1.44 1.08
His 142 . . . . . T C 1.68 0.04 . . F 1.72 1.06
Asn 143 . . . . T T . 1.08 -0.06 . . F 2.80 1.37
Gly 144 . . . . T T . 0.86 0.24 . . F 1.77 0.?0
Gly 145 . . . . T T . 0.46 0.53 . . F 1.19 0.43
Gln 146 . A B B . . . 0.11 0.81 . . F 0.11 0.23
Leu 147 . A B B . . . -0.67 0.84 . * . -0.32 0.23
Leu 148 . A B B . . . -0.67 1.10 . * . -0.60 0.19
Phe 149 . A B B . . . -0.67 0.67 . * . -0.60 0.18
Gly 150 . . B B . . . -0.57 0.70 . * . -0.60 0.22
Leu 151 . . B . . T . -1.1? 0.77 . * . -0.20 0.42
Asp 152 . . . . T T . -0.60 0.70 . * . 0.20 0.48
Gly 153 . . . . T T . -0.68 0.67 . . . 0.20 0.76
Tyr 154 . . B . . T . -0.68 0.93 . * . -0.20 0.65
Met 155 . . B B . . . -0.64 1.03 . . . -0.60 0.33
Tyr 156 . . B B . . . -0.18 1.51 . * . -0.60 0.49
Ile 157 . . B B . . . -0.18 1.51 . . . -0.60 0.31
Phe 158 . . B B . . . -0.18 0.76 . . . -0.60 0.52
Thr 159 . . B B . . . -0.28 0.57 . . F -0.45 0.33
Gly 160 . . . . T T . 0.32 0.24 . . F 0.88 0.46
Asp 161 . . . . T T . -0.02 -0.04 . . F 1.71 0.93
Gly 162 . . . . . T C 0.52 -0.33 . . F 1.74 0.65
Gly 163 . . . . . T C 1.22 -0.39 . . F 1.97 0.65
Gln 164 . . . . . . C 1.32 -0.81 * . F 2.30 0.65
Ala 165 . . . . . . C 0.97 -0.39 * . F 1.92 1.01
Gly 166 . . B . . . . 0.62 -0.03 . . F 1.34 0.89
Asp 167 . . B . . T . 0.16 -0.03 . . F 1.31 0.51
Pro 168 . . B . . T . -0.20 0.26 . . F 0.48 0.41
Phe 169 . . B . . T . -0.54 0.54 * . . -0.20 0.36
Gly 170 . . B . . T . 0.04 0.54 * . . -0.20 0.21
Leu 171 . . B . . . . -0.20 0.94 . . . -0.40 0.22
Phe 172 . . B . . . . -0.20 1.01 . . . -0.40 0.26
Gly 173 . . B . . . . 0.01 0.63 . * . -0.10 0.46
Asn 174 . . . . . . C 0.76 0.60 . * F 0.55 0.89
Ala 175 . . . . . . C 0.80 -0.09 . . F 1.90 2.05
Gln 176 . . . . . . C 1.31 -0.49 . * F 2.20 2.78
Asn 177 . . . . . T C 1.20 -0.53 . . F 3.00 2.32
Lys 178 . . . . T T . 0.73 -0.24 . * F 2.60 1.89
Ser 179 . . B . . T . 0.39 -0.06 . * F 1.75 0.90
Ser 180 . . B . . T . 1.02 -0.03 * * F 1.45 0.55
Leu 181 . . B . . . . 0.17 -0.43 * * F 0.95 0.55
Leu 182 . . B B . . . -0.64 0.21 * . F -0.15 0.31
Gly 183 . . B B . . . -0.58 0.51 * * F -0.45 0.19
Lys 184 . . B B . . . -1.17 0.13 * * . -0.30 0.45
Val 185 . . B B . . . -0.87 0.13 * * . -0.30 0.38
Leu 186 . . B B . . . -0.91 -0.56 * * . 0.60 0.64
Arg 187 . . B B . . . -0.10 -0.34 * * . 0.30 0.24
Ile 188 . . B B . . . 0.36 0.06 * . . -0.30 0.52
Asp 189 . . B . . T . -0.28 -0.59 * * . 1.15 1.23
Val 190 . . B . . T . 0.23 -0.77 * * . 1.00 0.63
Asn 191 . . B . . T . 0.74 -0.34 * * F 0.85 0.89
Arg 192 . . B . . T . 0.60 -0.64 . * F 1.49 0.72
Ala 193 . . . . T . . 1.14 -0.14 * * F 1.88 1.32
Gly 194 . . . . T T . 1.19 -0.36 * . F 2.27 0.81
Ser 195 . . . . . T C 2.16 -0.76 * . F 2.71 0.83
His 196 . . . . T T . 1.91 -0.76 * * F 3.40 1.60
Gly 197 . . . . T T . 1.91 -0.50 * * F 3.06 2.54
Lys 198 . . B . . . . 1.64 -0.93 . * F 2.12 3.71
Arg 199 . . B B . . . 1.78 -0.67 . * F 1.58 2.02
Tyr 200 . . B B . . . 1.78 -0.74 . * . 1.43 3.16
Arg 201 . . B B . . . 1.81 -0.79 . * . 1.43 2.12
Val 202 . . B B . . . 2.16 -0.79 . * F 1.93 1.81
Pro 203 . . B . . T . 1.90 -0.39 . * F 2.36 1.85
Ser 204 . . . . T T . 1.09 -0.71 . * F 3.40 1.46
Asp 205 . . . . . T C 0.48 0.07 . * F 1.96 1.71
Asn 206 . . . . . T C 0.07 0.07 . * F 1.47 0.82
Pro 207 . . . . . . C 0.92 0.03 . . F 0.93 0.82
Phe 208 . . B . . . . 0.92 -0.36 . . F 0.99 0.85
Val 209 . . B . . . . 0.88 0.07 . . F 0.33 0.82
Ser 210 . . B . . . . 0.29 0.10 . . F 0.61 0.52
Glu 211 . . B . . T . 0.26 0.17 . . F 1.09 0.61
Pro 212 . . . . T T . 0.26 -0.11 . . F 2.52 1.12
Gly 213 . . . . T T . 0.37 -0.33 . . F 2.80 1.29
Ala 214 . . . . . T C 0.33 -0.21 . . . 2.02 0.75
His 215 . . . . . . C 0.39 0.47 . . . 0.64 0.34
Pro 216 . . B B . . . -0.20 0.80 . . . -0.04 0.54
Ala 217 . . B B . . . -0.23 0.87 . . . -0.32 0.54
Ile 218 . . B B . . . -0.23 1.13 . . . -0.60 0.62
Tyr 219 . . B . . T . -0.53 1.06 . * . -0.20 0.40
Ala 220 . . B . . T . -0.39 1.31 . . . -0.20 0.28
Tyr 221 . . B . . T . -0.18 0.81 * . . -0.20 0.77
Gly 222 . . B . . T . -0.19 0.53 * . . -0.20 0.79
Ile 223 . . B B . . . 0.41 0.39 * * . -0.30 0.78
Arg 224 . . B B . . . 0.77 0.80 * * . -0.60 0.52
Asn 225 . . . B T . . 0.69 0.04 * * . 0.25 1.03
Met 226 . . . B T . . 0.34 0.19 * * . 0.10 0.79
Trp 227 . . B B . . . -0.17 0.00 * * . 0.30 0.41
Arg 228 . . B . . . . 0.72 0.64 * * . -0.40 0.19
Cys 229 . . B . . . . 0.72 0.24 * * . 0.24 0.32
Ala 230 . . B . . . . 0.38 -0.37 * . . 1.18 0.59
Val 231 . . B . . . . 0.98 -0.86 * . . 1.82 0.30
Asp 232 . . . . T T . 1.06 -0.86 * . F 2.91 0.93
Arg 233 . . . . T T . 0.06 -1.00 * . F 3.40 1.42
Gly 234 . . . . T T . 0.41 -0.81 * * F 3.06 1.34
Asp 235 . . . . . T C 1.11 -0.97 * . F 2.52 1.16
Pro 236 . . B . . . . 1.97 -0.97 * . F 1.78 1.16
Ile 237 . . B . . . . 1.62 -0.57 * * F 1.78 2.03
Thr 238 . . B . . . . 1.62 -0.57 * * F 1.78 1.20
Arg 239 . . B . . . . 1.62 -0.57 * * F 2.12 1.52
Gln 240 . . B . . . . 1.73 -0.57 * * F 2.46 2.15
Gly 241 . . . . T T . 1.06 -1.26 . * F 3.40 2.92
Arg 242 . . . . T T . 1.?4 -1.06 . * F 3.06 1.05
Gly 243 . . . . T T . 0.89 -0.27 . * F 3.17 0.52
Arg 244 . . B . . T . 0.43 -0.10 . * F 1.53 0.28
Ile 245 . . B . . . . 0.43 -0.10 . * . 0.84 0.14
Phe 246 . . B . . . . -0.08 -0.10 . * . 0.50 0.24
Cys 247 . . B . . T . -0.53 0.11 . * . 0.40 0.09
Gly 248 . . B . . T . -0.19 0.54 . . . 0.40 0.13
Asp 249 . . . . T T . -0.30 0.26 . . F 1.55 0.26
Val 250 . . . . . T C 0.70 -0.13 . . F 2.25 0.77
Gly 251 . . . . . T C 0.70 -0.70 . * F 3.00 1.53
Gln 252 . . . . . T C 1.37 -0.34 * . F 2.25 0.80
Asn 253 . . . . . T C 1.71 -0.34 * . F 2.10 1.86
Arg 254 . . B . . T . 0.86 -0.99 * . F 1.90 3.25
Phe 255 A A . . . . . 1.71 -0.77 . . F 1.20 1.39
Glu 256 A A . . . . . 1.24 -1.17 . * F 0.90 1.45
Glu 257 A A . . . . . 0.36 -0.89 . * . 0.60 0.61
Val 258 A A . B . . . -0.46 -0.20 * * . 0.30 0.49
Asp 259 A A . B . . . -0.52 -0.30 . * . 0.30 0.23
Leu 260 A A . B . . . -0.17 -0.30 * * . 0.30 0.27
Ile 261 A A . B . . . -0.51 0.13 * * . -0.30 0.36
Leu 262 A . . . . T . -0.51 -0.09 * * . 0.70 0.21
Lys 263 . . . . T T . 0.10 0.31 * * F 0.65 0.42
Gly 264 . . . . T T . -0.24 0.39 * . F 0.65 0.93
Gly 265 . . . . T T . 0.28 0.13 * * F 0.80 1.12
Asn 266 . . . . . T C 1.28 0.36 * ? F 0.45 0.59
Tyr 267 . . . . . T C 1.50 0.36 * * . 0.45 1.17
Gly 268 . . . . . T C 1.50 0.43 * * . 0.15 1.19
Trp 269 . . B . . T . 1.84 0.00 * * . 0.85 1.48
Arg 270 . A B . . . . 1.84 -0.40 * * . 0.45 1.63
Ala 271 . A B . . . . 1.14 -0.73 * * F 0.90 1.63
Lys 272 A A . . . . . 0.80 -0.37 * * F 0.60 1.35
Glu 273 A A . . . . . 0.48 -0.79 * * F 0.75 0.69
Gly 274 A A . . . . . 0.52 -0.21 . * F 0.45 0.3?
Phe 275 A A . . . . . 0.41 0.04 * * . -0.30 0.29
Ala 276 A A . . . . . 1.04 0.04 * . . -0.30 0.28
Cys 277 A . . . . T . 1.04 0.04 * . . 0.10 0.56
Tyr 278 A . . . . T . 0.23 -0.39 * . . 0.85 1.30
Asp 279 A . . . . T . -0.09 -0.49 * . F 1.00 1.06
Lys 280 A . . . . T . 0.58 -0.41 * . F 1.00 1.06
Lys 281 A A . . . . . 1.17 -0.49 . . F 0.45 0.92
Leu 282 A A . . . . . 1.24 -0.84 . . . 0.60 0.89
Cys 283 A A . . . . . 1.19 -0.34 . * . 0.30 0.45
His 284 . A B . . . . 0.38 0.04 . * . -0.30 0.30
Asn 285 . . B . . T . 0.33 0.73 . * . -0.20 0.30
Ala 286 A . . . . T . 0.29 0.04 . * . 0.10 0.94
Ser 287 A . . . . T . 0.24 -0.53 . . . 1.15 1.15
Leu 288 A . . . . T . 0.10 -0.39 . . F 0.85 0.53
Asp 289 . . B B . . . -0.08 -0.10 . . F 0.45 0.43
Asp 290 . . B B . . . -0.97 -0.17 . * F 0.45 0.50
Val 291 . . B B . . . -0.62 0.13 . . . -0.30 0.42
Leu 292 . . B B . . . -0.91 0.20 . . . -0.30 0.40
Pro 293 . . B B . . . -0.34 0.70 * . . -0.60 0.24
Ile 294 . . B B . . . -0.69 1.46 . ? . -0.60 0.51
Tyr 295 . . B . . T . -0.72 1.24 . . . 0.20 0.61
Ala 296 . . B . . T . -0.46 1.06 . . . -0.20 0.54
Tyr 297 . . B . . T . -0.50 1.13 . * . -0.20 0.77
Gly 298 . . B . . T . -0.63 1.09 * . . -0.20 0.37
His 299 . . B B . . . 0.30 0.76 * . . -0.60 0.36
Ala 300 . . B B . . . 0.24 0.26 * . . -0.30 0.46
Val 301 . . B B . . . -0.02 -0.11 * . . 0.30 0.62
Gly 302 . . B . . T . -0.09 0.10 * . F 0.25 0.34
Lys 303 . . B . . T . -0.09 0.09 * . F 0.25 0.48
Ser 304 . . B . . T . -0.40 0.01 * . F 0.25 0.64
Val 305 . . B . . T . -0.06 -0.20 . . F 0.85 0.64
Thr 306 . . B . . T . -0.06 0.13 . . F 0.25 0.50
Gly 307 . . B . . T . 0.04 0.77 . . F -0.05 0.28
Gly 308 . . B . . T . 0.11 1.14 . . F -0.05 0.59
Tyr 309 . . B . . T . 0.07 0.50 . . . -0.20 0.80
Val 310 . . B . . . . 0.26 0.44 * . . -0.40 0.80
Tyr 311 . . B . . T . 0.57 0.59 * . . -0.20 0.43
Arg 312 . . B . . T . 0.61 0.16 * . . 0.10 0.48
Gly 313 . . B . . T . 0.74 -0.21 * . F 1.13 0.87
Cys 314 . . B . . T . 0.99 -0.43 * * F 1.41 0.85
Glu 315 . . B . . . . 1.03 -0.79 * * F 1.79 0.70
Ser 316 . . . . . T C 1.28 -0.?0 . * F 2.17 0.58
Pro 317 . . . . T T . 0.82 -0.13 . * F 2.80 1.75
Asn 318 . . . . T T . 0.36 -0.27 . . F 2.52 1.00
Leu 319 . . . . T T . 0.78 0.41 . . F 1.19 0.62
Asn 320 . . . B T . . -0.11 0.79 . . . 0.36 0.63
Gly 321 . . B B . . . -0.51 1.04 . . . -0.32 0.27
Leu 322 . . B B . . . -0.64 1.43 . . . -0.60 0.29
Tyr 323 . . B B . . . -0.64 1.17 . * . -0.60 0.18
Ile 324 . . B B . . . -0.53 0.77 * . . -0.60 0.30
Phe 325 . . B B . . . -1.13 1.13 * . . -0.60 0.31
Gly 326 . . B B . . . -1.09 1.06 . * . -0.60 0.?0
Asp 327 . . B . . . . -0.62 0.69 . * . -0.40 0.38
Phe 328 . . B . . . . -0.27 0.43 . * . -0.40 0.43
Met 329 A . . . . T . -0.19 -0.36 . * . 0.70 0.85
Ser 330 . . . . . T C -0.09 -0.10 . * F 1.05 0.42
Gly 331 A . . . . T . -0.33 0.51 . . F -0.05 0.48
Arg 332 A . . . . T . -1.14 0.23 . . . 0.10 0.49
Leu 333 A A . . . . . -0.44 0.30 . . . -0.30 0.30
Met 334 A A . . . . . 0.16 0.31 . * . -0.30 0.53
Ala 335 A A . . . . . 0.46 -0.11 . * . 0.30 0.47
Leu 336 A A . . . . . 0.91 -0.11 . * . 0.30 0.95
Gln 337 A A . . . . . 0.84 -0.80 . . . 0.75 1.87
Glu 338 A A . . . . . 1.66 -1.41 . . F 0.90 3.71
Asp 339 A A . . . . . 2.30 -1.51 * . F 0.90 7.23
Arg 340 A A . . . . . 2.93 -2.20 ? * F 0.90 8.35
Lys 341 A A . . . . . 3.46 -2.60 . * F 0.90 9.64
Asn 342 A . . . . T . 3.50 -1.69 * . F 1.30 6.07
Lys 343 A . . . . T . 3.54 -1.69 * . F 1.30 6.20
Lys 344 A . . . . T . 3.54 -1.69 * . F 1.30 6.20
Trp 345 A . . . . T . 3.43 -1.29 * . F 1.30 6.68
Lys 346 A A . . . . . 2.58 -1.69 . . F 0.90 5.58
Lys 347 A A . . . . . 1.91 -1.00 . . F 0.90 2.30
Gln 348 . A B . . . . 1.06 -0.43 * . F 0.60 1.17
Asp 349 . A B . . . . 0.67 -0.66 . . F 0.75 0.48
Leu 350 . A B . . . . 0.66 -0.23 . . F 0.45 0.24
Cys 351 . A B . . . . 0.30 0.16 . . F -0.15 0.19
Leu 352 . A B . . . . -0.06 0.24 . . F -0.15 0.16
Gly 353 . A . . T . . -0.36 0.73 . . F -0.05 0.?8
Ser 354 . . . . T T . -1.02 0.43 . . F 0.35 0.70
Thr 355 . . . . T T . -0.80 0.43 . . F 0.35 0.45
Thr 356 . . B . . T . -0.83 0.24 . . F 0.25 0.46
Ser 357 . . B . . T . -0.23 0.60 . . F -0.05 0.30
Cys 358 . . B . . . . -0.23 0.64 . . . -0.40 0.32
Ala 359 . . B . . . . -0.74 0.59 . . . -0.40 0.22
Phe 360 . . B . . T . -1.32 0.79 . . . -0.20 0.14
Pro 361 . . B . . T . -1.31 1.09 . . . -0.20 0.18
Gly 362 . . . . T T . -1.32 0.90 . . . 0.20 0.24
Leu 363 . . B . . T . -0.69 0.89 . . . -0.20 0.39
Ile 364 . . B . . . . -0.40 0.60 * . . -0.40 0.35
Ser 365 A . . . . T . 0.34 0.56 * * . -0.20 0.47
Thr 366 A . . . . T . -0.14 0.13 * . F 0.40 1.13
His 367 A . . . . T . -0.69 0.23 * * F 0.40 1.40
Ser 368 . . B . . T . -0.77 0.23 * * F 0.25 0.73
Lys 369 . . B B . . . -0.18 0.53 * * . -0.60 0.36
Phe 370 . . B B . . . -0.58 0.43 * * . -0.60 0.35
Ile 371 . . B B . . . -0.86 0.71 * * . -0.60 0.23
Ile 372 . . B B . . . -0.82 0.83 * * . -0.60 0.11
Ser 373 . A B . . . . -0.52 0.83 * . . -0.60 0.23
Phe 374 A A . . . . . -0.57 0.04 * * . -0.30 0.54
Ala 375 A A . . . . . -0.46 -0.64 . . . 0.75 1.35
Glu 376 A A . . . . . 0.09 -0.83 * . . 0.75 1.01
Asp 377 A A . . . . . 0.98 -0.79 * . F 0.90 1.16
Glu 378 A A . . . . . 0.47 -1.57 . . F 0.90 1.99
Ala 379 A A . . . . . 0.92 -1.39 . . F 0.75 0.95
Gly 380 A A . . . . . 0.81 -0.63 . . F 0.75 0.89
Glu 381 A A . B . . . 0.00 0.16 . . . -0.30 0.44
Leu 382 A A . B . . . -0.59 0.84 . . . -0.60 0.36
Tyr 383 A A . B . . . -0.90 0.84 . . . -0.60 0.37
Phe 384 A A . B . . . -0.61 0.90 . . . -0.60 0.31
Leu 385 . A B B . . . -0.51 1.29 . . . -0.60 0.50
Ala 386 . A B B . . . -0.72 1.36 . . . -0.60 0.50
Thr 387 . A B B . . . -0.21 1.03 . . . -0.60 0.89
Ser 388 . A . . . . C -0.56 0.63 . . F -0.10 1.45
Tyr 389 . . . . . T C -0.10 0.44 . . F 0.30 1.45
Pro 390 . . . . T T . 0.12 0.70 . . F 0.50 1.58
Ser 391 . . . . T T . 0.50 0.71 . . . 0.35 1.19
Ala 392 . . B . . T . 0.92 0.76 . . . 0.08 1.17
Tyr 393 . . B . . . . 0.88 0.00 . . . 0.91 1.49
Ala 394 . . B . . T . 0.82 0.00 . * . 1.24 1.10
Pro 395 . . B . . T . 0.14 0.00 . * F 1.52 1.46
Arg 396 . . . . T T . 0.20 0.19 . . F 1.30 0.65
Gly 397 . . B . . T . 0.83 0.19 . . F 0.92 1.01
Ser 398 . . B B . . . 0.38 -0.31 . . F 0.99 1.31
Ile 399 . . B B . . . 0.11 0.04 * . . -0.04 0.58
Tyr 400 . . B B . . . 0.32 0.69 * * . -0.47 0.43
Lys 401 . . B B . . . 0.00 0.26 * * . -0.30 0.54
Phe 402 . . B B . . . 0.04 0.30 * . . 0.19 1.19
Val 403 . . B B . . . 0.46 0.00 * . F 1.28 1.02
Asp 404 . . B . . T . 1.46 -0.76 * . F 2.17 1.00
Pro 405 . . B . . T . 1.11 -0.76 * . F 2.66 2.26
Ser 406 . . . . T T . 0.86 -1.04 * . F 3.40 3.07
Arg 407 . . . . T T . 1.34 -1.26 * . F 3.06 2.85
Arg 408 . . . . T . . 1.86 -0.83 . . F 2.86 2.85
Ala 409 . . . . . . C 1.90 -0.83 . . F 2.66 2.10
Pro 410 . . . . . T C 1.44 -1.21 * . F 2.86 2.15
Pro 411 . . . . T T . 1.79 -0.64 * * F 2.91 0.59
Gly 412 . . . . T T . 1.43 -0.64 . * F 3.40 1.16
Lys 413 . . . . T T . 1.37 -0.39 . * F 2.76 1.18
Cys 414 . . . . T T . 1.74 -0.81 . * F 2.72 1.52
Lys 415 . . B . . T . 1.10 -0.81 . * F 1.98 2.38
Tyr 416 . . B . . T . 1.10 -0.60 . * F 1.49 0.88
Lys 417 . . B . . T . 0.59 -0.17 . * F 1.00 2.55
Pro 418 . . B B . . . 0.66 -0.10 . * F 0.45 0.95
Val 419 . . B B . . . 1.01 -0.10 . * F 0.60 1.18
Pro 420 . . B B . . . 1.01 -0.37 . * F 0.79 0.85
Val 421 . . B B . . . 0.96 -0.37 . * F 1.28 1.10
Arg 422 . . B B . . . 0.96 -0.41 . * F 1.62 1.99
Thr 423 . . B . . T . 1.28 -1.06 * * F 2.66 2.58
Lys 424 . . . . T T . 1.?4 -1.49 * * F 3.40 6.80
Ser 425 . . . . T T . 1.24 -1.44 * * F 3.06 2.43
Lys 426 . . . . T T . 1.40 -1.01 * * F 2.72 2.61
Arg 427 . . B . . . . 1.40 -0.71 . * F 1.78 1.13
Ile 428 . . B . . . . 1.50 -0.71 * * F 1.44 1.65
Pro 429 . . B . . . . 0.64 -0.67 * * . 0.95 1.28
Phe 430 . . B . . . . 0.36 0.01 * * . -0.10 0.54
Arg 431 . A B . . . . 0.36 0.51 * * . -0.60 0.?7
Pro 432 . A B . . . . -0.07 -0.17 * * F 0.60 1.00
Leu 433 A A . . . . . -0.03 -0.11 * * F 0.60 1.67
Ala 434 A A . . . . . -0.63 -0.26 * * F 0.45 0.63
Lys 435 A A . . . . . 0.07 0.43 * * F -0.45 0.34
Thr 436 A A . . . . . -0.86 0.00 * * . 0.30 0.68
Val 437 A A . . . . . -1.46 0.00 * . . 0.30 0.56
Leu 438 A A . . . . . -0.60 0.19 * . . -0.30 0.23
Asp 439 A A . . . . . -0.01 0.19 * . . -0.30 0.32
Leu 440 A A . . . . . -0.06 -0.30 * . . 0.30 0.74
Leu 441 A A . . . . . -0.04 -0.54 * . F 0.90 1.56
Lys 442 A A . . . . . 0.81 -0.84 * . F 0.90 1.25
Glu 443 A A . . . . . 1.67 -0.84 * . F 0.90 2.63
Gln 444 A A . . . . . 1.08 -1.53 * . F 0.90 6.38
Ser 445 A A . . . . . 1.30 -1.71 * . F 0.90 3.?2
Glu 446 A A . . . . . 2.22 -1.11 * . F 0.90 1.88
Lys 447 A A . . . . . ?.22 -1.?1 * . F 0.90 2.13
Ala 448 A A . . . . . 1.92 -1.61 * . F 0.90 3.17
Ala 449 A A . . . . . 1.62 -1.61 * . F 0.90 2.46
Arg 450 A A . . . . . 1.62 -1.23 * . F 0.90 1.65
Lys 451 A A . . . . . 1.03 -0.84 * . F 0.90 2.18
Ser 452 A . . . . T . 0.68 -0.84 * . F 1.30 2.18
Ser 453 A . . . . T . 0.46 -0.86 . . F 1.30 1.61
Ser 454 . . B . . T . 0.46 -0.17 . . F 0.85 0.66
Ala 455 . . B . . T . 0.04 0.33 . . F 0.25 0.50
Thr 456 . . B . . . . -0.34 0.33 . . . -0.10 0.50
Leu 457 . . B . . . . -0.26 0.37 . . . -0.10 0.37
Ala 458 . . B . . T . -0.54 0.41 . . F -0.05 0.57
Ser 459 . . B . . T . -0.24 0.41 . . F -0.05 0.40
Gly 460 . . . . . T C 0.00 0.33 . . F 0.45 0.83
Pro 461 . . . . . T C -0.50 0.07 * . F 0.45 0.81
Ala 462 . . . . . . C 0.01 0.26 * . F 0.25 0.50
Gln 463 A . . . . . . 0.60 0.26 . . F 0.05 0.68
Gly 464 . . B . . . . 0.94 -0.17 . . F 0.65 0.76
Leu 465 . . B . . . . 0.94 -0.60 . . F 1.10 1.50
Ser 466 A . . . . . . 0.86 -0.67 * . F 0.95 0.86
Glu 467 A . . . . . . 1.14 -0.69 . * F 1.10 1.16
Lys 468 A . . . . . . ? -0.73 . . F 1.10 1.89
Gly 469 A . . . T T . 1.58 -1.41 . . F 1.70 2.82
Ser 470 A . . . . T . 1.58 -1.80 . . F 1.30 3.26
Ser 471 A . . . . T . 1.29 -1.11 . . F 1.30 1.34
Lys 472 A . . . . T . 0.99 -0.61 . . F 1.30 1.3?
Lys 473 . . B . . . . 0.73 -0.66 . . F 1.10 1.37
Leu 474 . . B . . . . 0.7? -0.61 . . F 1.10 1.58
Ala 475 . . B . . . . 0.77 -0.51 * . F 1.40 1.14
Ser 476 . . B . . T . 0.77 -0.13 . . F 1.45 0.77
Pro 477 . . B . . T . 0.77 0.26 . . F 1.30 1.24
Thr 478 . . . . T T . 0.72 -0.43 . . F 2.60 2.46
Ser 479 . . . . . T C 1.22 -0.53 . . F 3.00 2.95
Ser 480 . . . . T T . 1.00 -0.43 * * F 2.60 2.76
Lys 481 . . B . . T . 1.41 -0.17 * * F 1.90 1.58
Asn 482 . . B . . T . 1.28 -0.66 * * F 1.90 2.30
Thr 483 . . B . . T . 1.38 -0.61 * * F 1.94 1.70
Leu 484 . . B . . . . 1.33 -0.57 * * F 1.78 1.32
Arg 485 . . B . . . . 1.32 -0.14 * * F 1.67 0.81
Gly 486 . . B . . T . 1.32 -0.06 . * F 2.21 0.81
Pro 487 . . . . T T . 1.37 -0.54 . * F 3.40 1.96
Gly 488 . . . . T T . 1.72 -1.23 . * F 3.06 2.00
Thr 489 . . . . . T C 1.94 -1.?3 . * F 2.52 4.05
Lys 490 . A B . . . . 1.94 -1.16 . * F 1.58 2.65
Lys 491 . A B . . . . 1.43 -1.59 * * F 1.24 5.24
Lys 492 . A B . . . . ?.30 -1.37 * * F 0.90 2.69
Ala 493 . A B . . . . 1.43 -1.43 * * F 0.90 1.33
Arg 494 . A B . . . . 1.71 -1.00 * * F 0.90 1.03
Val 495 . A B . . . . 0.81 -0.50 * * F 0.75 0.70
Gly 496 . . B . . T . 0.88 0.14 * * F 0.25 0.52
Pro 497 . . B . . T . 0.83 -0.36 * * F 0.85 0.52
His 498 . . B . . T . 1.08 0.04 * * F 0.74 1.20
Val 499 . . B . . T . 1.01 -0.17 * * F 1.68 1.20
Arg 500 . . B . . T . 1.98 -0.60 * * F 2.32 1.55
Gln 501 . . B . . T . 2.43 -1.03 * . F 2.66 2.24
Gly 502 . . . . T T . 2.69 -1.53 * . F 3.40 5.90
Lys 503 A . . . . T . 2.42 -2.17 . . F 2.66 6.03
Arg 504 A . . . . . . 2.47 -1.79 . * F 2.12 4.66
Arg 505 . . B . . . . 2.40 -1.50 . . F 1.78 3.89
Lys 506 . . B . . . . 2.10 -1.93 . * F 1.44 3.89
Ser 507 . . B . . . . 2.41 -1.54 . . F 1.44 2.66
Leu 508 . . B . . . . 2.07 -1.04 . . F 1.78 1.85
Lys 509 . . B . . . . 1.61 -0.66 . . F 2.12 1.24
Ser 510 . . . . . . C 1.61 -0.23 * * F 2.21 0.91
His 511 . . . . T T . 0.97 -0.61 * * F 3.40 2.17
Ser 512 . . . . . T C 1.38 -0.69 . * F 2.86 1.07
Gly 513 . . . . T T . 1.98 -0.69 . * F 3.02 1.57
Arg 514 . . . . T T . 1.63 -0.64 . * F 2.98 1.78
Met 515 . . . . . . C 1.34 -0.76 . * F 2.54 1.78
Arg 516 . . . . . T C 1.38 -0.64 . * F 2.70 1.82
Pro 517 . . . . . T C 1.68 -1.07 . * F 3.00 1.61
Ser 518 A . . . . T . 2.07 -0.67 . * F 2.50 2.82
Ala 519 A . . . . T . 2.07 -1.29 . * F 2.20 2.88
Glu 520 A A . . . . . 2.08 -1.29 * * F 1.50 3.65
Gln 521 A A . . . . . 1.62 -1.21 * * F 1.51 2.75
Lys 522 A A . . . . . 1.94 -1.17 * . F 1.52 2.69
Arg 523 A A . . . . . 1.94 -1.67 * . F 1.83 3.04
Ala 524 . A . . T . . 1.72 -1.29 * . F 2.54 2.36
Gly 525 . . . . T T . 1.51 -1.00 * . F 3.10 0.97
Arg 526 . . . . T T . 1.12 -0.57 * . F 2.79 0.77
Ser 527 . . . . . T C 0.69 -0.14 * . F 1.98 0.97
Leu 528 . . . . . T C 0.19 -0.21 * . . 1.67 1.25
Pro 529 . . B . . . . 0.39 -0.21 * . . 0.81 0.8?
Ter 530 . . . . T . . 0.34 0.21 * . . 0.30 0.78
Table II
Res Position I II III IV V VI VII VIII IX X XI XII XIII XIV
Met 1 . . . . . . C 0.69 -0.24 . * . 1.19 1.74
Arg 2 . . . . . . C 0.38 -0.24 . * . 1.53 1.34
? 3 . . . . . T C 0.88 0.11 . . . 1.32 0.91
Pro 4 . . . . T T . 1.27 -0.31 . . . 2.61 1.80
Gly 5 . . . . T T . 0.96 -0.53 . . F 3.40 1.48
Phe 6 . . . . T T . 0.74 0.26 . . F 2.01 0.83
Arg 7 . A B . . . . -0.1? 0.51 . . . 0.42 0.44
Asn 8 . A B . . . . -0.78 0.77 * . . 0.08 0.37
Phe 9 . A B . . . . -1.16 1.03 * . . -0.26 0.35
Leu 10 . A B . . . . -1.11 0.74 * . . -0.60 0.18
Leu 11 . A . . . . C -0.71 1.13 * . . -0.40 0.15
Leu 12 . A . . . . C -1.63 1.11 * . . -0.40 0.23
Ala 13 . A . . . . C -2.44 1.01 . . . -0.40 0.23
Ser 14 . A . . . . C -2.44 1.01 . . . -0.40 0.23
Ser 15 . A . . . . C -2.22 1.11 . . . -0.40 0.24
Leu 16 . A B . . . . -1.76 0.93 . . . -0.60 0.24
Leu 17 . A B . . . . -1.76 0.86 . . . -0.60 0.18
Phe 18 . A B . . . . -1.47 1.16 . . . -0.60 0.11
Ala 19 . A . . . . C -1.76 1.16 . . . -0.40 0.18
Gly 20 . A . . . . C -2.31 0.97 . . . -0.40 0.22
Leu 21 . A . . . . C -1.71 0.93 * . . -0.40 0.19
Ser 22 . A . . . . C -0.90 0.57 * . . -0.40 0.29
Ala 23 . A . . . . C -0.50 0.47 * . . -0.40 0.51
Val 24 . A . . . . C -0.61 0.43 * . . -0.40 0.83
Pro 25 . . . . . T C -0.57 0.53 * . F 0.15 0.53
Gln 26 . . . . T T . 0.03 0.53 * . F 0.35 0.71
Ser 27 . . . . T T . 0.03 0.46 * . F 0.50 1.4?
Phe 28 . . . . . T C -0.19 0.20 * * F 0.60 1.28
Ser 29 . . . . . T C 0.78 0.46 * * F 0.15 0.61
Pro 30 . . . . . T C 0.69 0.06 * * F 0.45 0.89
Ser 31 . . . . T T . 0.40 0.06 * * F 0.80 1.38
Leu 32 . . . . T T . 0.49 0.19 * * F 0.80 1.08
Arg 33 . . . . T . . 0.84 0.23 * * F 0.60 1.08
Ser 34 . . . . T . . 0.56 0.23 * * F 0.45 0.80
Trp 35 . . . . . T C 0.18 0.34 * * F 0.45 0.98
Pro 36 . . . . . T C -0.19 0.16 * * F 0.45 0.50
Gly 37 . . . . T T . 0.73 0.73 * * . 0.20 0.34
Ala 38 . . . . T T . -0.19 0.34 * * . 0.50 0.38
Ala 39 . . . . . . C -0.19 0.11 * . . 0.10 0.20
Cys 40 . . . B T . . 0.21 0.07 * . . 0.30 0.27
Arg 41 . A B . . . . -0.17 -0.36 * . . 0.30 0.53
Leu 42 . A . . . . C 0.18 -0.36 * . . 0.50 0.53
Ser 43 . A . . . . C 0.47 -0.86 * * . 0.95 1.70
Arg 44 . A . . . . C 1.06 -1.04 * . F 1.10 1.16
Ala 45 . A . . . . C 1.83 -1.04 . * F 1.10 2.44
Glu 46 . A . . T . . 1.83 -1.73 . * F 1.30 3.57
Ser 47 . A . . T . . 1.98 -2.11 . * F 1.30 3.57
Glu 48 . A . . T . . 2.39 -1.54 . * F 1.30 1.89
Arg 49 . A . . T . . 1.69 -2.04 . * F 1.30 2.14
Arg 50 . A . . T . . 2.0? -1.54 * * F 1.58 1.62
Cys 51 . A . . T . . 1.72 -1.50 * * . 1.71 1.44
Arg 52 . A . . T . . 2.02 -1.07 * * F 1.99 0.73
Ala 53 . . . . . T C 1.81 -0.67 * * F 2.47 0.64
Pro 54 . . . . T T . 1.49 -0.24 * * F 2.80 1.86
Gly 55 . . . . T T . 1.03 -0.39 . * F 2.52 1.47
Gln 56 . . . . . T C 1.11 0.04 * * F 1.44 1.44
Pro 57 . . . . . T C 0.41 0.04 * * F 1.01 0.94
Pro 58 . . . . T T . 0.19 0.11 . . F 0.93 0.96
Gly 59 . . . . T T . -0.27 0.37 . . F 0.65 0.46
Ala 60 . . B . . T . 0.04 0.54 . . . -0.20 0.16
Ala 61 . . B . . . . -0.30 0.61 . * . -0.40 0.14
Leu 62 . . B . . . . 0.02 0.61 . * . -0.40 0.14
Cys 63 . . . . T . . -0.11 0.19 . * . 0.61 0.27
His 64 . . . . T T . 0.34 0.11 . * . 1.12 0.26
Gly 65 . . . . T T . 0.2? -0.39 . * F 2.18 0.63
Arg 66 . . . . T T . 0.86 -0.50 . * F 2.49 0.63
Gly 67 . . . . T T . 1.00 -1.07 . * F 3.10 0.77
Arg 68 . . . . T . . 1.32 -1.00 . * F 2.59 0.42
Cys 69 . . . . T T . 0.50 -1.00 . * . 2.33 0.21
Asp 70 . . . . T T . 0.18 -0.36 . * . 1.72 0.16
Cys 71 . . . . T T . -0.82 -0.21 . * . 1.41 0.04
Gly 72 . . . . T T . -1.14 0.47 . * . 0.20 0.06
Val 73 . . . B T . . -1.29 0.47 . * . -0.20 0.02
Cys 74 . . B B . . . -1.48 0.97 . . . -0.60 0.05
Ile 75 . . B B . . . -1.79 1.04 * . . -0.60 0.03
Cys 76 . . B B . . . -1.12 1.10 . . . -0.60 0.07
His 77 . . B B . . . -0.99 0.46 . . . -0.60 0.22
Val 78 . . . B T . . -0.48 0.31 . . . 0.10 0.48
Thr 79 . . . B . . C -0.41 0.06 . . F 0.05 0.89
Glu 80 . . . . . T C -0.22 0.10 * . F 0.45 0.64
Pro 81 . . . . T T . -0.26 0.39 . . F 0.65 0.75
Gly 82 . . . . T T . -0.57 0.53 . . . 0.20 0.45
Met 83 . . . . T T . 0.08 0.47 * . . 0.20 0.26
Phe 84 . . . . T . . -0.42 0.90 . . . 0.00 0.26
Phe 85 . . . . T . . -1.09 1.16 . . . 0.00 0.21
Gly 86 . . . . . T C -0.88 1.30 . . . 0.00 0.12
Pro 87 . . . . . T C -1.20 0.69 . . . 0.00 0.23
Leu 88 . . . . T T . -0.63 0.47 * . . 0.20 0.14
Cys 89 . . . . T T . 0.07 0.19 * . . 0.50 0.20
Glu 90 . A . . T . . 0.48 -0.24 * . . 0.70 0.22
Cys 91 . A . . T . . -0.03 0.24 . . . 0.10 0.28
His 92 . A . . T . . -0.49 0.20 * . . 0.10 0.39
Glu 93 . A . . T . . 0.32 0.20 * . . 0.10 0.12
Trp 94 . A . . T . . 0.68 0.20 * . . 0.10 0.39
Val 95 . A . . T . . 0.43 0.11 * . . 0.10 0.42
Cys 96 . A . . T . . 1.10 0.37 * . . 0.38 0.38
Glu 97 . A . . T . . 0.79 0.37 * . . 0.66 0.60
Thr 98 . A . . T . . 0.49 -0.11 * . F 1.69 0.80
Tyr 99 . . . . T T . 0.47 -0.37 . . F 2.52 1.99
Asp 100 . . . . T T . 0.66 -0.46 . . F 2.80 1.66
Gly 101 . . . . T T . 0.73 0.11 . . F 1.77 0.62
Ser 102 . . . . T T . 0.39 0.13 . . F 1.49 0.40
Thr 103 . . . . T . . 0.67 -0.20 . . F 1.61 0.24
Cys 104 . . . . T T . 0.57 0.30 . * . 0.78 0.32
Ala 105 . . . . T T . 0.61 0.30 . * . 0.50 0.24
Gly 106 . . . . T T . 0.29 -0.09 . * . 1.10 0.33
His 107 . . . . T T . 0.59 0.00 . * . 0.50 0.33
Gly 108 . . . . T . . 0.23 -0.57 . * F 1.35 0.55
Lys 109 . . . . T . . 0.56 -0.50 . * F 1.36 0.30
Cys 110 . . . . T T . 1.19 -0.50 . . . 1.72 0.22
Asp 111 . . . . T T . 0.87 -1.00 . * . 2.33 0.44
Cys 112 . . . . T T . 0.94 -0.86 . * . 2.64 0.12
Gly 113 . . . . T T . 0.62 -0.86 . * F 3.10 0.44
Lys 114 . . . . T . . 0.58 -0.86 . * F 2.59 0.14
Cys 115 . . . . T . . 1.24 -0.86 . * F 2.28 0.44
Lys 116 . . . . T . . 0.90 -1.03 . * F 1.97 0.76
Cys 117 . . . . T . . 1.28 -1.03 . * F 1.94 0.38
Asp 118 . . . . T T . 1.38 -0.11 . * F 1.81 0.74
Gln 119 . . . . T T . 0.99 0.07 . * F 1.49 0.58
Gly 120 . . . . T T . 1.66 0.50 . * F 1.62 1.07
Trp 121 . . . . T T . 1.02 -0.07 . * F 2.80 1.07
Tyr 122 . . . . T . . 1.02 0.43 . . . 1.12 0.62
Gly 123 . . . . T . . 1.02 0.60 . . . 0.84 0.34
Asp 124 . . . . T . . 0.78 0.57 . . . 0.56 0.56
Ala 125 . . . . T . . 0.91 0.41 . . . 0.28 0.56
Cys 126 . . . . T . . 0.89 0.09 * . . 0.30 0.87
Gln 127 . . . . T . . 1.13 0.14 * . . 0.30 0.75
Tyr 128 . . . . . . C 0.81 0.54 . . . -0.05 1.20
Pro 129 . . . . T T . 0.81 0.61 . * F 0.50 1.20
Thr 130 . . . . T T . 0.59 0.04 . * F 0.80 1.15
Asn 131 . . . . T T . 0.94 0.33 . * F 0.65 0.61
Cys 132 . . . . T T . 0.99 0.06 . * . 0.50 0.57
Asp 133 . . . . T . . 1.28 -0.37 . . . 0.90 0.79
Leu 134 . . . . T . . 1.53 -0.86 . . F 1.69 0.98
Thr 135 . . . . T . . 1.54 -1.26 . . F 2.18 3.64
Lys 136 . . . . T . . 1.54 -1.44 . . F 2.52 2.92
Lys 137 . . . . T . . 2.21 -1.04 * . F 2.86 5.70
Lys 138 . . . . T T . 1.61 -1.33 * . F 3.40 6.84
Ser 139 . . . . T T . 1.76 -1.20 * . F 3.06 3.39
Asn 140 . . . . T T . 2.11 -0.63 * . F 2.57 0.91
Gln 141 . . . . T T . 2.07 -0.63 * . F 2.57 0.91
Met 142 . . . . T . . 1.72 -0.23 * . . 2.07 1.09
Cys 143 . . . . T T . 1.68 -0.23 * . . 2.12 0.91
Lys 144 . . . . T T . 1.98 -0.23 * . F 2.61 0.91
Asn 145 . . . . T T . 1.09 -0.63 * . F 3.40 1.53
Ser 146 . . . . T T . 0.20 -0.56 * . F 3.06 2.00
Gln 147 . . . B T . . 0.13 -0.44 * . F 1.87 0.70
Asp 148 . . . B T . . 0.50 0.13 * . F 0.93 0.23
Ile 149 . . B B . . . 0.46 0.11 * . . 0.04 0.23
Ile 150 . . B B . . . -0.13 0.13 . . . -0.30 0.22
Cys 151 . . B . . T . -0.18 0.23 . . . 0.10 0.13
Ser 152 . . . . T T . -0.49 0.66 . . . 0.20 0.19
Asn 153 . . . . T T . -1.16 0.46 . . F 0.35 0.38
Ala 154 . . . . T T . -0.30 0.34 . . F 0.65 0.38
Gly 155 . . . . T . . -0.08 0.27 . . F 0.45 0.39
Thr 156 . . . . T . . 0.24 0.46 . . . 0.00 0.13
Cys 157 . . . . T T . 0.66 0.49 . . . 0.20 0.13
His 158 . . . . T T . -0.01 -0.01 . * . 1.10 0.25
Cys 159 . . . . T T . 0.62 0.13 . * . 0.50 0.09
Gly 160 . . . . T T . 0.30 -0.36 . * . I .
t0 0.35 SUBSTITUTE SHEET (RULE 26) Arg 161 . . . . T . . 0.61 -0.36. * . I .24 0.14 Cys 162 . . . . T T . 1.28 -0.86. * F 2.23 0.43 Lys 163 . ~ . . T T . 1.01 -1.03, * F 2.57 . 0.69 Cys 164 . . . . T T . 1.68 -1.07. * F 2.910.47 S Asp 165 . . . . T T . 1.68 -1.07. * F 3.401.48 Asn 166 . . . . T T . 1.27 -1.21. * F 2.910.73 Ser 167 . . . . T T . 1.59 -0.83. . F 2.721.83 Asp 168 . . . . T T . 0.73 -0.97. . F 2.381.08 Gly 169 . . : . T T . 0.54 -0.29. . F 1.59 0.56 1~ Ser 170 . . . B T . . 0.30 -0.04. . F 0.85 0.31 Gly 171 . . . B T . . -0.040.33 . * F 0.25 0.29 Leu 172 . . B B . . . 0.30 0.76 . * F -0.45 0.29 Val 173 . . B B . . . -0.400.33 . * . -0.30 0.43 Tyt 174 ~ . . B T . . -0.720.73 . * . -0.20 . 0.38 1$ Gly 175 . . . . T T . -0.420.87 . . . 0.20 0.24 Lys 176 . . . . T T . -0.740.19 . . . 0.50 0.57 Phe 177 . . . . T T . 0.07 0.11 . . . 0.84 0.20 Cys 178 . . . . T T . 0.92 -0.64* * . 2.08 0.33 Glu 179 . . . . T . . 1.28 -1.07* . . 2.22 0.28 20 Cys 180 . . . . T T . 1.62 -1.07* . . 2.76 0.62 Asp 181 . . . . T T . 0.91 -1.86' * F 3.40 2.01 Asp 182 . . . . T T . 0.72 -1.86* * F 2.910.62 Arg 183 . . . . T T . 1.39 -1.17. * F 2.88 0.81 Glu 184 . . . . T . . 1.39 -1.74. * . 2.50 0.81 2$ Cys 185 . . . . T . . 2.06 -1.74. * . 2.47 0.81 Ile 186 . . . . T . . 1.74 -1.74* * . 2.44 0.72 Asp 187 . . . . T T . 1.74 -1.26. * F 3.10 0.60 Asp 188 . . . . . T C 1.63 -1.26* * F 2.741.94 G1u 189 A . . . . T . 0.74 -1.83* . F 2.23 , 4.79 30 Thr 190 A . . . . T . 0.74 -1.83* . F 1.92 2.01 GIu 191 A . . . . . . 1.29 -1.26* . F 1.26 0.65 Glu 192 A . . . . . . 0.94 -0.83. . F 0.95 0.37 11e 193 . . . . T . . 0.91 -0.40. . . 1.15 0.25 Cys 194 . . . . T T . 0.57 -0.39. . . 1.60 0.20 3$ Giy 195 . . . . T T . 0.92 0.04 . . . 1.25 0.11 Gly 196 . . . . T T . 0.26 0.04 * . F 1.65 0.32 His 197 . . . . T T . 0:01 -0.07. . F 2.50 0.32 Gly 198 . . . . T T . 0.23 0.11 . . F 1.65 0.51 Lys 199 . . . . T T . 0.56 0.26 . . . 1.25 0.28 40 Cys 200 . . . . T T . 0.90 0.26 . . . 1.00 0.20 Tyr 201 . . . . 
T T . 0.58 0.16 . . . 0.75 0.33 SUBSTITUTE SHEET (RULE 26) Cys 202 . . . T T . 0.370.30. . . 0.50 . 0.09 Gly 203 . . . T T . 0.041.06. * . O:LO
. . 0.26 Asn 204 . . . T T . 0.041.06. * . 0.20 . 0.09 Cys 205 . . . T T . 0.120.30* * . 0.50 . 0.33 Tyr 206 . . . T . . 0.020.23. * . 0.30 . 0.34 Cys 207 . . . T T . 0.400.23* * . 0.50 . 0.21 Lys 208 . . . T T . 0.710.74* * . 0.20 . 0.40 Ala 209 . . . T T . 0.370.67* * . 0.20 . 0.35 Gly 210 . . . T T . 1.030.34* * . 0.810.65 .

1 Trp 211 . . . T . . 1.32-0.23* * . 1.52 ~ . 0.54 His 212 . . . . T C 1.32-0.23* * . 1.981.07 .

Gly 213 . . . T ~ T . 1.28-0.16. * F 2.49 . 0.58 Asp 214 . . . T T . 1.17-0.59. * F 3.10 . 0.96 Lys 215 . . . T T . .51 -0.71. * F 2.79 . 1 0.61 Cys 216 A . . T . . 1.13-0.81. * . 2.081.07 . ~

Glu 217 A . ~ . T . . 1.17-0.67. * . 1.62 . 0.34 Phe 218 A . . T . . 0.62-0.67. * . 1.310.29 .

Gln 219 A . . T . . 0.310.01. * . 0.10 . . 0.37 Cys 220 A . . T . . 0.06-0.07. * . 0.70 . 0.31 2U Asp 221 A . . T . . 0.430.36. * . 0.10 . 0.56 lle 222 A . . . . C 0.430.49. * . -0.06 . 0.34 Thr 223 . . . . T C 0.830.09. * F 1.281.09 .

Pro 224 , . . . T T . 0.88-0.10. . F 2.27 . . 0.88 Trp 225 . . . T T . 1.66-0.10. * F 2.76 . 2.50 25 Glu 226 . . . T T . 1.77-0.79. * F 3.40 . 3.39 Ser 227 . . . T T . 1.99-1.27. . F 3.06 . 4.29 Lys 228 . . . T T . 1.99-1.13* * F 2.72 . 2.19 Arg 229 . . . T T . 1.90-1.56* * F 2.381.82 .

Arg 230 . . . T ' T . 1.98-1.17. * F 2.38 . 1.$2 30 Cys 231 . . . T . . 1.98-1.13. . F 2.181.41 .

Thr 232 . . . T . . 1.93-1.13. * F 2.521.20 .

Ser 233 . . . . T C 1.93-0.70* * F 2.710.61 .

Pro 234 , . . T T . 0.93-0.70* * F 3.40 . 2.27 Asp 235 . . . T T . 0.16-0.59* * F 3.061.10 .

35 Gly 236 . . . T T . 0.52-0.50. * F 2.58 . 0.44 Lys 237 . . . T . . 0.83-0.50. * F 2.35 . 0.38 Ile 238 . , . T . . 1 -0.53. * . 2.47 . ~4 0.37 Cys 239 . . . T T . 1.11-0.53. * . 2.64 . ~ 0.73 Ser 240 . . . T T . 0.80-0.53. * F 3.10 . 0.36 Asn 241 . . . T T . 0.48-0.04. . F 2.49 . 0.74 Arg 242 . . . T T . -0.42-O.I6. . F 2.18 . 0.74 SUBSTITUTE SHEET (RULE 2~

Gly 243 . . . B T . . -0.20 -0.09. . F 1.47 0.41 Thr 244 . . . B T . . 0.12 0.10* . F 0.56 . 0.14 Cys 245 . . . B T . . 0.42 0.13* . . 0.10 0.07 Val 246 . . . B T . . -0.24 0.13* . . 0.10 0.12 Cys 247 . . . . T T . -0.67 0.27* * . 0.50 0.04 Gly 248 . . . . T T . -0.99 0.27. . . 0.50 0.12 Glu 249 . . . . T T . -0.71 0.27. . . 0.50 0.09 Cys 250 . . . . T T . -0.04 0.13. . . 0.50 0.22 Thr 251 . . . . T . . -0.04 -0.44. * . 0.90 ' 0.37 1 Cys 252 . . . . T . . 0.62 -0.23. . . 0.90 ~ 0.16 His 253 . . . . T . 0.76 -0.23. . . 1.24 0.50 Asp 254 . . . . T . . 0.44 -0.37. * . 1.58 0.54 Val 255 . . . . T. . . 0,77 -0.37. * . 2.071.44 Asp 256 . . . . . T C 1,08 -0.51. * F 2.861.05 15 Pro 257 . . . . T T . 1.46 -1.01* * F 3.401.05 Thr 258 . . . . T T . 1.14 -0.10* * F 2.76148 Gly 259 . . . . T T . 1.14 -0.31. * F 2.27 0.88 Asp 260 . . . . T . . 1. -0.31. * F 1.73 I 1 0.95 Trp 261 . . . . T . . 1.08 -0.06. * F 1.39 0.46 2~ Gly 262 . . . . . . C 0.94 -0.04. * F 0.85 0.63 Asp 263 . . . . T . . 1.26 -0.04. * F 1.05 0.38 Ile 264 . . . . T . . 1.29 -0.04. * . 0.90 0.60 His 265 . . . . T T . O.b2 -0.47* . . L 10 0.87 Gly 266 . . . . T T . 0.91 -0.33* * F 1.25 0.28 25 Asp 267 . . . . T T . 0.59 -0.33. * F 1.56 0.69 Thr 268 . . . . T T . 0.59 -0.44. * F 1.87 0.27 Cys 269 . . . . T T . 1.48 -0.94* * . 2.33 0.46 Glu 270 . . . . T T . 1.62 -1.37. . . 2.64 0.48 Cys 271 . . . . T T . 1.97 -1.37* . F 3.10 ~ 0.65 Asp 272 . . . . T T . 1.30 -1.8b* . F 2.94 2.01 Glu 273 . . . . T . . 1.72 -1.86* * F 2.28 0.62 Arg 274 . . . . T T . 1.80 -1.86* . F 2.32 2:?8 Asp 275 . . . . T T . 0.94 -1.93* . F 2.011.38 Cys 276 . . . . T T . 1.37 -1.29* . , 1.40 0.59 35 Arg 277 . . . . T T . 1.37 -0.53* . . I.40 0.47 Ala 278 . , B B . . . 1.48 -0.53* . . 0.60 0.47 Val 279 . . B B . . . 1.12 -0.53* . . 1.091.73 ~

Tyr 280 . . . B T . . 0.82 -0.34* * . 1.531.38 Asp 281 . . . . T T . 1.49 0.04* . . 1.671.83 40 Arg 282 . . . . T T . 1.38 -0.46* * F 2.76 4.12 Tyr 283 . . . . T T . 1:27 -1.i0* * F 3.404.39 SUBSTITUTE SHEET (RULE 26) Ser 284 . . . T T . 1.46-1.07* * F 3.06 . 2.28 Asp 285 . . . T . . 1.40-0.50* . F 2.07 ~ . 0.62 Asp 286 . . . T . . 1.06-0.11* . F 1.73 . 0.53 Phe 287 . . . T . . 0.91-0.44* * . 1.24 . 0.39 $ Cys 288 . . . T T . 0.81-0.33. . . 1.10 . 0.32 Ser 289 . . . T T . 1.110.10 . . . 0.50 . 0.19 Gly 290 . . . T T . 0.440.50 . * F 0.35 . 0.38 His 291 . . . T T . 0.440.29 . * F 0.87 . 0.38 Gly 292 . . . T . . 0.480.11 . * F 0.89 . 0.46 Gln 293 . . . T . . 0.800.30 * * . 0.96 . 0.25 Cys 294 . . . T T . 1.210.30 * * . 1.38 . 0.18 Asn 295 . . . T T . 0.89-0.20. * . 2.20 . . 0.36 Cys 296 . . . T T . 0,92-0.06. * . 1.98 . 0.11 Gly 297 . . . T T . 0.60-0.46* * _ 2.04 . 0.34 1$ Arg 298 . . . T . . 0.64-0.46* * . 1.90 . 0.11 Cys 299 . . . . T T . 0,72-0.86* * . 2.46 0.43 Asp 300 . . . . T T . 0.38-0.93* * . 2.52 0.44 Cys 301 . . . . T T . 0.76-0.93* * . 2.80 0.22 Lys 302 . . . . T T . 0.86-0.01* * . 2.22 0.43 Ala 303 . . . . T . . 0.400.17 * * . 1.14 0.40 Gly 304 . . . . T . . I. 0.60 . * . 0.56 t 0.75 Trp 305 . . . . T . . 1.160.03 . * . 0.92 0.75 Tyr 306 . . . . T . . 1.160.03 . * ..1.131.48 Gly 307 . . . . T T . 1.110.10 * . F 1.67 0.80 2$ Lys 308 . . . . T T . 1.67-0.33* . F 2.761.32 Lys 309 . . . . T T . 1.80-0.74. . F 3.401.15 Cys 310 . . . . T T . 2.09-1.07. . F 3.061.79 Glu 31 t . . . : T . . 2.03-1.10. . F 2.52 1.55 His 312 . . . . . T C 1.71-0.71. . F 2.181.04 -Pro 313 . . . , T T . 1.36-0.14. . F 1.741.04 .

Gln 314 . . _ . T 1' . 0.50-0.23. . F 1.25 0.87 Ser 315 . . . . T T . 0.870.46 . . F 0.35 0.52 Cys 316 . . . B T . . 0.280.34 . . F 0.25 0.45 Thr 317 . . . B . . C 0.310.41 . . . -0.40 0.27 3$ Leu 318 . . . B . . C 0.520.01 . . . -0.10 0.34 Ser 319 . . . B . . C 0.22-0.37. * . 0.651.11 Ala 320 A A . . . . . =0.37-0.56* * F 0.901.03 Glu 321 A A . . . . . 0.41-0.36* * F 0.45 0.87 Glu 322 A A
0.77-1.04* * F 0.90 I ~8 Ser 323 . A . . T . . 0.91-1.43* * F 1.64 2.53 lle 324 . A . . T . . 1.21-1.36* * F 1.83 0.78 SUBSTITUTE SHEET (RULE 26) Arg 325 A . . T . . 1.46 -0.96* . F 2.17 . 0.78 Lys 326 A . . T . . 1.16 -O.S3* . F 2.S10.S8 .

Cys 327 . . . T T . 0.86 -O.S3* . F 3.401.11 .

Gln 328 . . . T T . 1.16 -0.83* * F 2.910.76 .

$ Gly 329 . . . T T . 1.23 -0.83* * F 2.57 . 0.63 Ser 330 . . . T T . 0.91 -0.14* * F 1.93 . 0.97 Ser 33I . . . T . . 0.20 -0.29. * F 1.39 . 0.87 Asp 332 . , . T . . O.S7 -0.1 * * F LOS
. I 0.47 Leu 333 . . . . . C 0.22 -0.16* * F 1.16 . 0.47 Pro 334 . . . T . . 0.68 -0.11* * F 1.67 . 0.35 Cys 33S . . . T T . 0.63 -O.SO. * F 2.18 . 0.41 Ser 336 . . . T T . 0.98 -0.07. * F 2.49 . 0.49 Gly 337 . . . T T . 0.31 -0.76. * F 3.10 . 0.63 Arg 338 . . . T T . 1.12 -0.61. * F 2.79 . 0.63 Gly 339 . . . T . . 0.67 -1.19. * F 2.56 . 0.82 Lys 340 . . . T . . 0.99 -L00 * * F 2.53 . 0.44 Cys 341 . . , T T . 1.33 -L00 * * F 2.70 . 0.22 Glu 342 . . . T T . 1.01 -1.00* * . 2.52 . 0.45 Cys 343 . . . T T . 0.59 -0.86* * . 2.80 . 0.12 Gly 344 . . . T T . 0.27 -0.37. . . 2.22 . 0.33 Lys 34S . . . T . . -0.02-0.37. . . 1.74 . 0.10 Cys 346 . . . T . . 0.43 0.39 . . . 0.86 . 0.29 Thr 347 . . . T . . 0.22 0.24 . . . O.S8 . 0.46 Cys 348 . . . T . . 0.54 0.24 . * . 0.64 . 0.36 Tyr 349 . B . . . . 0.89 0.67 . * . 0.28 . 0.66 Pro 3S0 . . . . T C 0.96 0.10 . . F 1.47 . 0.76 Pro 351 . . . T T . 1.73 -0.39* . F 2.76 . 2.78 Gly 3S2 . , . T T . 1.19 -0.96* . F 3.40 . 3.47 Asp 3S3 . . . , T~ T . 1.61 -1.07* . F 3.06 ~ . 1.67 3~ Arg 3S4 . . . T . . 1.51 -0.74* * F 2.521.69 .

Arg 3S5 . . . T . . 1.77 -0.74* * F 2.181.69 .

Val 356 . . . T . . 1.67 -1.17* * . 1.69 . 2.02 Tyr 357 . , . T . . 1.34 -0.69* * . 1.351.49 .

Gly 358 , . . T T . 1.34 -0.11* * F 1.25 . 0.41 Lys 3S9 . . . T T . O.S7 -0.11* . F 1.25 . 0.95 Thr 360 . . . T T . 0.46 -0.19* * F 1.59 . 0.33 Cys 361 . , . T T . 1.31 -0.94* * F 2.23 . O.SS

Glu 362 . . . T . . 1.67 -1.37* . . 2.22 . 0.46 Cys 363 . . . T T . 2.13 -1.37* . . 2.76 . 0.62 Asp 364 . . . T T . 1.41 -1.86* . F 3.40 . 2.28 Asp 36S . . . T T . 1.72 -1.86* . F 2.910.70 .

SUBSTITUTE SHEET (RULE 26) Arg 366 . . . . T T . 2.39 -1.86* F 3.03 . 2.28 Arg 367 . _ . . . T . . 1..58-2.43* F 2.80 . 2 ~8 Cys 368 . . . . T . . 2.24 -1.74. F 2.771.12 .

Glu 369 . . . . T . . 1.90 -1.74. F 2.59 . 0.96 Asp 370 . . . . T T . 1.04 -1.31* F 3.10 . 0.48 Leu 37i . . . . T T . 0.08 -0.67* F 2.79 . 0.67 Asp 372 . . . . T T . -0.70-0.60. F 2.48 . 0.29 Gly 373 . . . . T T . -0.38-0.03* . 1.72 . 0.09 Val 374 . . B ' , . . . -0.720.40 * . 0.210.11 .

1~ Val 375 . . B . . . . -0.76O.I4 . . -0.10 * 0.07 Cys 376 . . . . T T . -0.290.64 . . 0.20 ~ . 0.09 Gly 377 . . . . T T . -0.600.64 . . 0.20 . 0.12 Gly 378 . . . . T T . -0.920.49 . F 0.35 . 0.23 His 379 . . . T T . -0.370.41 . F 0.35 . . 0.23 1$ Gly 380 . . . T . . -0.180.23 . F 0.45 . . 0.32 Thr 381 . . . T . . 0.14 0.37 * F 0.45 . . 0.17 Cys 382 . . . T T . 0.60 0.37 * . 0.50 . . 0.12 Ser 383 . . . T T . 0.28 -0.13* . 1.28 . . 0.25 Cys 384 . . . T T . -0.540.01 . . 0.86 . 0.09 20 Gly 385 . . . T T . -0.870.17 * . i .04 . . 0.13 Arg 386 . . . T . . -O.S60.17 * . 1.02 . . 0.05 Cys 387 . B . T . . 0.22 -0.21* . 1.80 . . 0.16 Val 388 . B . . . . 0.18 -0.79* . 1.52 . . 0.32 Cys 389 . B . . . . 0.56 -0.79* . 1.34 . * 0.16 25 Giu 390 . . . T T . 0.20 0.13 * . 0.86 . . 0.32 Arg 391 . . . T T . -0.260.34 . . 0.68 . . 0.38 Gly 392 . . . T T . 0.46 0.13 . . 0.50 . . 0.69 Trp 393 . . . T T . 0.50 -0.44. . 1. l0 . . 0.80 Phe 394 . . , T - . . 0.50 0.24 * . 0.30 . . 0.34 Gly 395 . . . T . . O.SO 0.81 * . 0.00 . ' . 0.18 Lys 396 . . . T . . 0.36 0.79 * . 0.00 . . 0.30 Leu 397 . . . T . . 0.49 0.37 * . 0.64 . * 0.47 Cys 398 . . . T . . 0.89 0.01 * . 0.98 . * 0.74 .

Gln 399 . . . T . . 1.63 -0.41* . 1.92 . . 0.72 3$ His 400 . . . . T C 1.31 -0.41* . 2.411.75 . *

Pro 401 . . . T T . 1.27 -0.53* F 3.401.75 . *

Arg 402 . . . T T . 1.48 -0.70* F 3.061.63 . .

Lys 403 . . . T T . I -0.49. F 2.42 . .83 . 1.
i 8 Cys 404 . . . T . . 1.83 -0.50* . 2.031.11 . .

Asn 405 A . . . . C I -0.93. . I .14 . .87 . 0.98 Met 406 A . , , . C 2.08 -0.93* F 0.95 . . 0.85 SUBSTITUTE SHEET (RULE 26) Thr 407 . A . . . . C 1.67 -0.53* . F 1.44 2.73 Glu 408 A A . . . . . 1.67 -0.71. * F 1.58 2.28 Glu 409 A A . . . . . 2.33 -i.l* . F 1.92 l 4.61 Gln 410 . A . . T . . 1.52 -1.33* . F 2.66 5.13 Ser 411 . . . . T T . 1.46 -1.13* . F 3.40 2.44 Lys 412 . . . T T . 1.77 -0.56* . F 2.910.76 Asn 413 . . . . T C 1.47 -0.56* . F 2.62 0.76 Leu 414 . . . . . T C 0.88 -0.57* . . 2.38 0.76 Cys 415 . . . . T . . 0.88 -0.46* . . 1.99 0.38 1~ Glu 416 . . . . T . . 0.83 -0.46* . F 2.05 0.40 Ser 417 . . . . T T . -0.10-0.43* . F 2.50 0.48 Aia 418 . . . . T T . -0.91-0.43* . F 2.25 0.62 Asp 419 . . . . T T . -0.77-0.31* . F 2.00 0.30 Gly 420 . . . . T T . -0.400.26* . . i.00 0.12 Ile 421 . . B . . . . -0.740.26* . . 0.15 0.16 Leu 422 . . B . . . . -0.400.19* * . 0.
I
S
0.09 Cys 423 . . . . T T . -U.I 0.19* * . 1.00 6 0.19 Ser 424 . . . . T T . -0.460.19* * F 1.40 0.27 Gly 425 . . . . T T . -0.78-0.11. * F 2.25 0.43 2~ Lys 426 . . ~ . . T T . 0.08 -0.23. * F 2.50 0.43 Gly 427 . . . . T . . 0.22 -0.30. * F 2.05 0.44 Ser 428 . . . . T . . 0.54 -0.11* * F 1.80 0.24 Cys 429 . . . . T . . 0.89 -0.11' * . I .40 0.12 His 430 . . . . T T . 0.57 -0.11* . . 1.35 0.24 25 Cys 431 . . . . T T . -0.370.03* . . 0.50 0.10 Gly 432 . . . . T T . -0.690.33* . . 0.50 0.12 Lys 433 . . . . T T . -0.690.33* . . 0.50 0.05 Cys 434 . . . . T . . -0.610.21* . . 0.30 0.12 Ilc 435 . A . . T . . -0.580. * . . 0.10 . t4 0.13 3~ Cys 436 . A B . . . . 0.09 -0.29* . . 0.30 0.11 Ser 437 . A . . . . C 0.14 -0.29* . . 0.50 0.35 Ala 438 . A . . . . C -0.140.06* . . -0.10 0.52 Glu 439 . A . . T . . -0.370.13. . . 0.251.53.

Glu 440 . A . B T . . 0.22 0.24. . . 0.10 0.80 35 Trp 441 . A . B T . . 0.54 0.24. * . 0.251.06 Tyr 442 . A . B T . . 0.84 0.17. * . 0.10 0.61 Ile 443 . . . B T . . 0.73 0.17. * . 0.10 0.61 Ser 444 . . . B T . . 0.07 0.96. * . -0.20 0.50 Gly 445 . . . . T . . 0.07 0.61. * F 0.15 0.17 Glu 446 . . . . T . . -0.31-0.14. * . 0.90 0.41 Phe 447 . . . . T . . -0.07-0.26* . . 1 ?4 0.16 SUBSTITUTE SHEET (RULE 26) Cys 448 . . . T T . 0.82 -0.64* . 2.08 . * 0.28 Asp 449 _ . .~ . T T . 1.23 -1.07* . 2.42 . . 0.27 Cys 450 . . . T T . 1.58 -1.07* . 2.76 . . 0.60 Asp 451 . . . T T . 0.91 -1.86* F 3.401.87 . .

Asp 452 . . . T T . 1.61 -1.86* F 2.910.60 . .

Arg 453 . . . T T . 2.32 -1.86* F 2.721.87 . *

Asp 454 . . . T T . 2.29 -2.43* F 2.72 . * 2.24 Cys 455 . . . T T . 2.96 -1.93. F 2.721.83 . .

Asp 456 . . . T . . 2.61 -1.93. F 2.521.56 . .

1~ Lys 457 . . . T . . 1.80 -1.50* F 2.710.92 . .

His 458 . . . T T . 0.80 -0.81* F 3.401.42 . .

Asp 459 . . T T . 0.13 -0.70. F 2.910.60 . .

Gly 460 . . . T T . 0.49 -0.13. . 2.12 . . 0.16 Leu 461 . B . . T . 0.14 0.36. . 0.78 . . 0.17 lle 462 . B . . . . 0.10 0.29. . 0.24 . . 0.10 Cys 463 . . . T T . -0.210.69* . 0.20 . . 0.16 Thr 4b4 . . . T T . -1.100.69* F 0.35 . . 0.20 Gly 465 . . . T T . -1.420.69* F 0.35 . . 0.20 Asn 466 . . . T T . -0.910.57* F 0.35 . . 0.20 Gly 467 . . . T . . -0.690.39. F 0.45 . . 0:18 Ile 468 . . . T . . -0.370.47* . 0.00 . . 0.10 Cys 469 . . . T T . -0.060.47* . 0.42 . . 0.06 Ser 470 . . . T T . -0.380.47* . 0.64 . . 0.10 Cys 471 . . . T T . -0.380.61. . 0.86 . . 0.08 2$ Gly 472 . . . T T . -0.70-0.07. . 1.98 . . 0.24 Asn 473 . . . T T . -O.IO-0.07. . 2.20 . . 0.10 Cys 474 . . . T T . 0.57 0.46. . 1.08 . . 0.19 Glu 475 . . . T T . 0.52 -0.11. . 1.76 . . 0.32 Cys 476 . . . T T . 0.90 -0.11. . 1.54 . . 0.20 Trp 477 . . . T T . 1.24 0.40. . 0.42 . . 0.39 Asp 478 . . . T T . 0.90 0.23. . 0.50 . . . 0.36 Gly 479 . . . T T . 1.57 0.66. F 0.35 . . 0.67 .

Trp 480 . . . T T . 0.98 0.49. F 0.501.02 . .

Asn 481 . . . . T C 0.98 0.07. F 0.45 . . 0.62 35 Gly 482 . . . . T C 1.27 0.64* F O.15 . . 0.33 Asn 483 . . . . T C 0.38 0.21* . 0.30 . . 0.55 .

Ala 484 . . . . T C 0.43 -0.01. . 0.90 . . 0.24 Cys 485 A . . T . . -0.090.50. . -0.20 . . 0.25 Glu 486 A B . . . . -0.430.76. . -0.60 . . 0.13 11e 487 A . . T . . -0.390.79. . -0.20 . . 0.13 Trp 488 A . . T . . -0.390.67. . -0?0 . . 0.32 SUBSTITUTE SHEET (RULE 26) Leu 489 . A . . . . C -0.040.10 . . . -0.10 0.32 Gly 490 . . . . T T . 0.41 0.86 . * F 0.56 0.72 Ser 491 . . . . . T C 0.02 0.60 . . F 0.721.05 Glu 492 . . . . . T C 052 0.11 . . F 1.231.63 Tyr 493 . . . . . T C 0.42 -0.14. . . 1.89 2. I I

Pro 494 . . . T . . 0.84 -0.14. . . 2.10 2.01 Ter 495 . . . . T . . 0.80 -0.10. . . 1.891.4$

Table III
Res Position I II III IV V VI VII VIII IX X XI XII XIII XIV

Met 1 A A . . . . . 0.10 -0.19 . . . 0.30 0.92
Glu 2 A A . . . . . -0.32 -0.11 * * . 0.30 0.72
Thr 3 A A . . . . . 0.18 0.14 * . . -0.30 0.47
Gly 4 A A . . . . . 0.68 -0.29 * . . 0.30 0.93
Ala 5 A A . . . . . 0.86 -0.90 * . F 0.90 1.05
Leu 6 A A . . . . . 1.46 -0.47 . . F 0.60 1.12
Arg 7 . A B . . . . 0.64 -0.56 . . F 0.90 1.96
Arg 8 . . B . . . . 0.14 -0.30 * . F 0.80 1.60
Pro 9 . A B . . . . 0.28 -0.11 . . F 0.60 1.60
Gln 10 . A B . . . . 0.06 -0.37 . . F 0.60 1.26
Leu 11 . A B . . . . 0.06 0.31 * . . -0.30 0.53
Leu 12 . A B . . . . -0.87 1.00 . . . -0.60 0.28
Pro 13 . A B . . . . -1.79 1.26 . * . -0.60 0.14
Leu 14 . A B . . . . -2.39 1.54 . . . -0.60 0.14
Leu 15 . A B . . . . -3.06 1.54 . . . -0.60 0.14
Leu 16 . A B . . . . -2.59 1.43 . . . -0.60 0.05
Leu 17 . A B . . . . -2.12 1.43 . . . -0.60 0.06
Leu 18 . A B . . . . -2.58 1.17 . . . -0.60 0.07
Cys 19 . . B . . T . -1.98 1.06 * * . -0.20 0.04
Gly 20 . . . . T T . -1.06 0.80 * * . 0.20 0.08
Gly 21 . . . . T T . -0.83 0.11 . * F 0.65 0.20
Cys 22 . . B . . T . -0.37 -0.07 . * F 0.85 0.37
Pro 23 . . B . . . . 0.10 -0.21 * * F 0.96 0.37
Arg 24 . . . . T T . 0.10 -0.21 . * F 1.87 0.37
Ala 25 . . . . T T . 0.44 -0.07 . * F 2.18 0.37
Gly 26 . . . . T T . 0.79 -0.24 . * F 2.49 0.38
Gly 27 . . . . T T . 1.14 -0.67 . * F 3.10 0.34

Cys 28 . . . . T . . 1.01 -0.19. * F 2.29 0.48 Asn 29 . . . . . T C 0.30 -0.26. * F 1.98 0.48 Glu 30 . . B . . T . 0.08 -0.07. . F 1.47 0.48 35 Thr 31 . . B . . T . 0.42 0.19 * . F 0.56 0.74 Gly 32 . . B . . T . 0.88 -0.39* . F 0.85 0.80 Met 33 A A . . . . . 0.73 -0.79* . . 0.60 0.91 Leu 34 A A . . . . . 0.52 -0.10* * . 0.30 0.52 Glu 35 A A . . . . . -0.29-0.16. * . 0.30 0.81 4~ Arg 36 A A . . . . . -0.640.10 * * . -0.30 0.67 SUBSTITUTE SHEET (RULE 26) Leu 37 A A . . . . . -0.64 0.06 * * . -0.30 , 0.44 Pro 38 A A . . . . . 0.00 -0.20* * . 0.30 ~ 0.25 Leu 39 A A . . . . . 0.22 -0.20* * . 0.30 0.26 Cys 40 A A . . . . . -0.48 0.30 * * . -0.30 0.31 Gly 41 A A . . . . . -1.18 0.40 * * . -0.30 0.18 Lys 42 A A . . . . . -0.37 0.47 -0.60 0.21 Ala 43 A A . . . . . -0.76 -0.21* * . 0.30 0.67 Phe 44 A A . . . . . -0.54 -0.17* * . 0.30 0.67 Ala 45 A A . . . . . -0.22 0.01 * * . -0.30 0.33 1 Asp 46 A A . . 0 44 * *
~ 17 0 . . . . . -0.60 . 0.32 Met 47 A A . . . . . -0.73 -0.06* * . 0.30 0.75 Met 48 A A . . . . . -0.14 -0.20* * . 0.30 0.55 Gly 49 A A . . . . . -0.30 -0.70* * . 0.60 0.55 Lys 50 A . . B . . . 0.00 -0.06. * . 0.30 0.41 IS Val 51 A . . B 0 24 *

. . . . . . -0.30 . 0.44 Asp 52 A . . B . . . 0.36 -0.37. * . 0.30 0.89 Val 53 A . . B . . . 0.29 O.II . * . -0.30 0.47 Trp 54 A . . B . . . 0.63 0.69 *
-0.60 0.34 Lys 55 A . , g _ , -022 0.44 . * . -0.60 0.32 Trp 56 A . . B . . . 0.33 1.13 . . . -0.60 0.36 Cys 57 A . , B , 0.33 ~ 0.87 . . . -0.60 0.46 Asn 58 . . . B . . C 0.49 -0.04* . . 0.50 0.40 Leu 59 . . . B . . C -0.11 0.74 * . . -0.40 0.33 Ser 60 . . . B . . C -1.01 0.51 * . . -0.40 0.43 25 Glu 61 . . B B . . . -0.97 0.59 * . . -0.60 0.20 Phe 62 . . B B . . . -0.54 0.94 . . . -0.60 0.38 Ile 63 . . B B . . . -0.54 1.01 * .~. -0.60 0.44 64 . . B B . . . -0.03 0.63 * . . -0.60 0.44 Tyr 65 . . $ B . , . -0.43 1.01 . . . -0.60 - 0.68 Tyr 66 . . B B . . . -0.74 1.01 * . . -0.60 0.84 Glu b7 . . . B T . . -0.04 0.8I * . . -0.05 1.63 Ser 68 . . . B T . . 0.18 0.57 . . . -0.05 1.68 Phe 69 . . . . T T . 0.72 0.39 * . . 0.50 0.57 Thr 70 , . . . T T . 0.97 0.11 * . F 0.65 0.48 35 Asn 71 . , . . , T C 0.61 0.11 . . F 0.45 0.62 Cys 72 A . . . . T . 0.61 0.34 . * F 0.25 0.71 Thr 73 A A . . . . . 0.32 -0.44. * F 0.45 0.85 Giu 74 A A . . . . . 1.02 -0.43* * . 0.30 0.53 Met 75 A A . . . . . 0.48 -0.43. * . 0.451.60 Giu 76 A A . . . . . -0.38 -0.36. * . 0.30 0.82 Ala 77 A A . . . . . -0.06 -0.20. * . 0.30 0.35 SUBSTITUTE SHEET (RULE 26) Asn 78 A A . . . . . -0.410.23. * . -0.30 0.35 Val 79 . A B . . . . -0.660.19. * . -0.30 . 0.11 Val 80 . B . . . . -0.340.94. * . -0.60 A 0.17 Gly 81 . A B . . . . -0.561.36. * . -0.60 0.11 Cys 82 . . . . T . . 0.031.39* . . 0.00 0.23 Tyr 83 . . . . T . . -0.181.14* . . 0.00 0.50 Trp 84 . . B . . T . -0.130.93. . . -0.20 0.78 Pro 85 . . . . . T C 0.131.19* . F 0.301.20 Asn 86 . . . . . T C 0.481.11* . F 0.15 0.77 1 Pro 87 . . . . . T C 0.800.76* . F 0.30 ~ 1.27 I-eu 88 . . . . . . C 0.340.27* . F 0.25 0.81 Ala 89 . . . . . . C -0.260.63* . F -0.05 0.44 Gln 90 . . B B . . . -0.360.91* . . -0.60 0.20 Gly 91 . . B B . . . -0.700.97* . . -0.60 0.35 15 Phe 92 . . B B . . . -1.380.71* . . -0.60 . 0.34 Ile 93 . . B B . . . -0.600.90* . . Ø60 O.14 Thr 94 . . B B . . . 0.101.00* . . -0.60 0.19 Gly 95 . . B B . . . 0.100.57* . . -0.60 ~ 0.43 Ile 96 . . 
B B . . . -0.260.19* . . -0.15 1.06 His 97 . . B B . . . -0.260.29* . . -0.30 0.64 Arg 98 . . . B T . . 0.330.59* . . -0.20 0.56 Gln 99 . . . B T . . 0.640.54* . . -0.05 1.06 Phe 100 . . . B T . . 0.320.26* . . 0.251.26 Phe 101 . . . . T T . 0.900,33* . . 0.50 0.34 25 Ser 102 . . . . T T . 0.080.81. * . 0.20 0.29 Asn 103 . . . . T T . -0.031.06. . . 0.20 0.25 Cys 104 . . . . T T . 0.080.27. * . 0.50 0.47 Thr 105 . . . . T . . -0.08-0.51. . . 1.20 0.69 Val 106 . A . . T . . 0.59-0.26. * . 0.70 0.32 Asp 107 . A B . . . . 0.08-0.16. * . 0.30 0.81 Arg 108 . A B . . . . 0.08-0.04. * . 0.30 0.46 V 109 . A B . . . . 0.74-0.53. * . 0.75 al 1.08 His 110 . A B . . . . 0.84-1.17. * . 0.751.08 L,eu 111 . A . . . . C 1.49-0.74. * . 0.80 0.85 3$ Glu 112 . A . . . . C 1.49-0.31. * F 0.801.78 Asp I . A . . . . C 1.38-0.96* * F 1.10 13 2.18 Pro 114 . , . . . T C 1.38-1.46* * F 1.50 4.58 Pro 115 . . . . T T . 0.60-1.50* . F 1.701.96 Asp 116 A . . . . T . 0.52-0.81* . F I.15 0.97 Glu 117 A . . . . T . 0.31-0.13* * . 0.70 0.44 Val 118 A . . B . . . -0.50-0.13* . . 0.30 0.44 SUBSTITUTE SHEET (R.ULE 26) 23$
Leu 1 . . B B . . . -1.180.13. . -0.30 i9 . 0.22 Ile 120 . . B 8 . . . -1.820.81 . . . -0.60 0.09 Pro 121 . . B B . . . -2.711.46. . -0.60 . 0.09 Leu 122 . . B B . . . -2.921.50. . -0.60 * 0.07 $ Ile 123 . . B B . . . -2.921.24. . -0.60 . , . 0.16 Val 124 . . B B . . . -2.971.20. . -0.60 . 0.08 Ile 125 . . B B . . . -2.891.41. . -0.60 . 0.07 Pro 126 . . B B . . . -2.991.41. . -0.60 * 0.08 Val i27 . . 8 B . . . -3.031.21. . -0.60 . 0.16 1~ Val 128 . B B . . . -2.731.21. . -0.60 * 0.17 Leu 129 . . B B . . . -2.481.03. . -0.60 . 0.11 Thr 130 . . B B . . . -2.181.21 -0.60 0.15 Val 131 . . B B . . . -2.311.07. . -0.60 . . 0.20 Ala 132 A . . B . . . -2.270.86. . -0.60 . 0.25 1$ Met 133 A . . B . . . -2.270.86. . -0.60 . 0.14 Ala 134 A . . B . . . _2.311.01. . -0.60 . 0.14 GIy 135 A . . B . . . -2.291.01* . -0.60 * 0.10 Leu 136 A . . B . . . -1.321.43* . -0.60 * 0.11 Val 137 A . . B . . . -1.030.81. . -0.60 * 0.21 Val 138 A . . B . . . -0.390.70* . -0.26 . 0.29 Trp 139 . . B B . . . 0.31 0.27* . 0.38 . 0.70 Arg 140 . . B B . . . 0.34 -0.41. F 1.621.84 .

Ser I41 . . B , . T . L -0.57. F 2.66 16 . 3.58 Lys 142 . . . . T T . 1.70 -1.21. F 3.40 * 5.68 2$ Arg 143 . . . . T T . 1.74 -1.64* F 3.06 . 4.I9 T~ 1 - - . . T T . I -0.96. F 2.72 ~ .22 * 2.58 Asp 145 . A . . T . . 0.72 -0.66. F 1.981.06 .

Thr 146 . A B . . . . 0.63 -0.23 * . F 0.79 0.69
Leu 147 . A B . . . . 0.20 0.20 * . . -0.30 0.61
Leu 148 . A B . . . . -0.30 0.14 . * . -0.30 0.47
Ter 149 . A B . . . . -0.38 0.57 . . . -0.60 0.42
Table IV
Res Position I II III IV V VI VII VIII IX X XI XII XIII XIV
Met I A A . . . . . -0.760.27. * . -0.30 0.49 Arg 2 A A . . . . . -x.070.34* * . -0.30 0.39 Leu 3 A A . . . . . -1.490.70* * . -0.60 0.26 Leu 4 A A . . . . . -1.400.96. * . -0.60 0.22 Ala 5 A A , , . . . -1.820.73. * -0.60 0.15 1 Phe 6 A A
~

. . . . . -2.03l * * . -0.60 .41 0.15 Leu 7 A A . . . . . -2.731.41. * . -0.60 0.15 Ser 8 A A . . . . . -2.731.23. . . -0.60 0.15 Leu 9 A A . . . . . -2.781.41. . . -0.60 0.14 Leu 10 A A . . . . . . -3.001.27* . . -0.60 0.13 15 Ala 11 A A

. . . . . -2.301.27* . . -0.60 0.08 Leu 12 A A . . . . . -1.49i.29. . . -0.60 0.17 Val 13 A A . . . . . -1.500.60. . . -0.60 0.35 Leu 14 A A . . . . . -1.030.40. . . -0.02 0.50 Cln 15 A A B . . . . -0.530.33. . F 0.410.60 Glu 16 A . . . . T . -0.530.13. . F 1.241.I7 Thr 17 A . . . . T . -0.02-0.01. * F 2.121.43 Gly 18 . . . . T T . 0.02 -0.31. . F 2.801.11 Thr 19 A . . . . T . 0.62 -0.03* * F 1.97 0.53 Ala 20 A . . . . . . 0.73 0.40. . F 0.89 0.57 25 Ser 21 . . . . . . C 0.78 -0.09. . F 1.561.12 Leu 22 . A . . . . C 1.09 -0.51* . F 1.381.55 Pro 23 A A . . . . . I -1.00* . F 0.90 2.66 .54 Arg 24 A A . . . ' . . 1.90 -I.SO* * F 0.90 3.88 Lys 25 A A . . . . . 2.60 -1.89. . F 0.90 9.42 Glu 26 A A . . . . . 3.01 -2.57. . F 0.9011.93 Arg 27 A A . . . . . 3.82 -3.00. . F 0.9011.93.

Lys 28 A A . . . . . 4.03 -3.00 . . F 0.90 10.33
Arg 29 A A . . . . . 3.92 -3.00 . . F 0.90 10.33
Arg 30 A A . . . . . 3.28 -2.60 . * F 0.90 9.13
Glu 31 A A . . . . . 3.07 -1.99 * . F 0.90 4.52
Glu 32 A A . . . . . 3.07 -1.56 . . F 1.24 3.57
Gln 33 A A . . . . . 3.02 -1.56 * . F 1.58 3.57
Met 34 . A . . . . C 2.57 -1.56 * . F 2.12 3.57
Pro 35 A . . . . T . 2.46 -1.13 * . F 2.66 2.04
Arg 36 . . . . T T . 2.16 -1.13 * . F 3.40 1.97
Glu 37 A . . . . T . 1.46 -1.14 * . F 2.66 2.66
Gly 38 . . . . T T . 1.46 -0.97 * * F 2.72 1.49

Asp 39 A . . . . . . 1.20 -1.40* F 1.781.32 .

Ser 40 A . . . . . . 0.60 -0.76* F 1.29 0.57 $ Phe 41 . B

. , . . . 0.28 -0.07. . 0.50 . 0.47 G1u 42 . . B . . . . -0.53-0.07. . 0.50 . 0.44 Val 43 . . B . . . . -0.080.61. . -0.40 . 0.27 Leu 44 . . B , . . . -0.080.23. . -0.10 * 0.61 Pro 45 A . . . . . . 0.22 -0.16* 0.50 1 Leu 46 A . T .
~

. . . . 0.07 -0.16. . 0.85 * 1.27 Arg 47 A . , . . T . -0.74-0.16. F 1.001.14 *

Asn 48 . . B . T T . 0.11 -0.16. F 1.25 * 0.61 Asp 49 . . B . . T . 0.71 -0.19. F 1.00 * L 19 Val 50 . . 8 . . . . 0.92 -0.44* F 0.65 1$ Leu 51 . B .

. . . . . 1.73 -0.44. F 0.95 * 0.97 Asn 52 . . B . . T . 1.38 -0.44. F 1.45 . 0.94 Pro 53 . . . . . T C 1.03 0.3.1. F 1.501.98 .

Asp 54 . . . . T T . 1.03 0.10. F 2.00 * 2.37 Asn 55 . . . . . T C 1.03 -0.59. F 3.00 . 2.56 Tyr 56 . B B

. . . . 0.96 -0.34. F 1.801.23 *

Gly 57 . . B B . . . 0..96-0.09. . 1.20 . 0.52 Glu 58 . . B B . . . 0.36 -0.09* . 0.90 . 0.53 Val 59 . . B B . . . 0.06 0.20 0.00 0.28 Ile 60 . . B B . . . 0.06 -O.I7. 0.30 0.38 2$ Asp 61 . B B

. . . . 0.06 -0.20. . 0.30 . 0.35 Leu 62 . . B . . T . 0.40 0.56. . -0.20 . 0.75 Ser 63 . . . . . T C 0.40 -0.09. . 1.051.85 .

Asn 64 . . , . . T C 0.44 -0.77. F 1.501.9I
.

Tyr 65 A . . . . T . 1.02 -0.09. F 1.341.91 ~ *

Glu 66 A

. . . . . . 1.02 -0.29. F 1.48 . 2.06 Giu 67 A . . . . . . 1.59 -0.67* F 2.12 . 2.14 Leu 68 . . B . . . . 1.54 -0.31* F 2.16 . 2.14 .

Thr 69 . . . . T T . 1.54 -0.64* F 3.401.22 .

Asp 70 . . . , T T . 1.79 -0.64* F 3.061.18 .

3$ Tyr 71 . . T T

. . . 0.98 -0.24* F 2.42 . 2.48 Gly 72 . . . . T T . 0.77 -0.24* F 2.081.42 .

Asp 73 A . . . . . , 1.58 -0.30* F 1.231.31 .

Gln 74 A . . . . . . 1.03 -0.30* F 0.981.45 *

Leu 75 . . B . . . . 1.08 -0.41. F 1.071.09 *

Pro 76 . B

. . . . . 0.47 -0.84. F 1.461.30 *

Giu 77 . . B B . . . 0.50 -0.20. F 0.90 * 0.56 SUBSTITUTE SHEET (RULE 26) Val 78 . . B B . . . 0.20 -O.ll. F 0.810.98 *

Lys 79 . - B B . , . -0.61-0.41. F 0.72 . * 0.85 80 ' ' B B ~ . . -0.39-0.16. F 0.63 * 0.40 Thr 81 ' ' B B . . . -0.390.34. F -0.06 * O
SS

$ Ser $2 . B .

. . . . . -0.980.13. F 0.05 * 0.42 Leu 83 , . B

.
-0.430.63. . -0.40 * 0.58 Ala 84 . , B -- . . -0.7$0.47. . -0.40 * 0.58 Pro 8S A

. . . . . . -0.810.37. . -0.10 . 0.58 Ala 86 . . B B . . . -0.800 F

. . -0.45 1 Th . 0.49 ~

r 87 . . B B . . . -0.710.37. F -0.1 . S 0.65 Ser ' . . B B . . . -0.490 F

. . 0.13 Ile $9 . . B B . . . 0.14 0.37. F 0.65 . 0.410.65 .

Ser 90 . . B . . T . 0.06 -0.13. F 1.59 . 0.90 Pro 91 . . . . . T C 0.33 -0.23. F 2 . 17 0 Ala 92 . . . T T .
.

. . 0.33 -0.13. F 2.801.86 .

Lys 93 . . B . . T . 0.04 -0.33. F 2.12 . 2.00 Ser 94 . . B , . . . 0.72 -0.21. F 1.641.31 .

Thr 95 ., B . . . . 0.68 -0.21. F 1.36 . . 2.00 Thr 96 . . $ . . . . O.SB -0.29. F 0.93 . 0 Ala 97 . B .

. . . . . 0.96 0.20. F 0.201.07 .

Pro 98 ' . B . . . . 0.61 0.24. F 0.481.14 .

Gly 99 . . . . T . . 0.61 0.14. F 1.161.06 .

Thr 100 . . . . . T C 0.92 0.04. F 1.441.41 .

Pro 101 . . . . . T C 1.02 -0.06. F 2.321 . 46 25 Ser 102 . . T T .

. . . 1.30 -0.06. F 2.80 . 2.28 Ser 103 . . . . . T C 0.91 0.00. F 2.32 . 2.28 Asn 104 . . . . . T C 0.94 0.13. F 1.441.46 .

Pro 105 . . . . . T C 1.37 0.19. F 1.361.57 .

Thr 106 . . . . T T . 1.37 -0.20. F 2.08 . 2 Met 107 . B .

. . . T . 1.36 -0.16. F 1.60 . 2.21 Thr 108 . . B . . . . 1.34 -0.07. F 1.60 . 2.07 Arg 109 . . B . . T . 0.76 -0.01. F 2.00 . 2.07 Pro 110 . . B . . T . 0.62 0.00* F 1.80 . 2.11 Thr 111 . . B . . T . 0.12 -0.19. F 1 . 601 35 Thr 112 . B .
.

. . . T . -0.090.01. F 0.65 . 0.61 Ala 113 . A B . . . . -0.590.?0. F -0.25 . 0.32 Gly 114 . A B . . . . -1.000.96. . -0.60 . 0.19 Leu 115 . A B . . . . -1.490.86. . -0.60 . 0.17 Leu 116 . A B . . . . -0.780.76. -0.60 Leu 117 . A B .

. . . . -0.680.66. F -0.45 * 0.40 Ser 118 . A B . . . . -0.090.66. F -0.29 . 0.75 SUBSTITUTE SHEET (RULE 26) WO 00/29435 PC1'/US99/25031 Ser 119 . B . . . . 0.22 0.37. F O.S2 . . L46 Gln 120 . B . . T . 0.69 0.19. F 0.$8 _ . 2.41 . .

Pro 121 . . . T T . 0.69 -0.07. F 2.041.78 . .

Asn 122 . . . T T . 1.29 0.23. F 1.601.10 . *

His 123 . . . T T . 1.28 0.27. F 1.29 . . 0.98 Gly 124 . . . T . . 0.91 0.36. . 0.78 . . 0.91 Leu 125 . . . . T C 0.10 0.50. . 0.32 . . 0.30 Pro 126 . . . T T . -0.54 0.79. . 0.36 . . 0.18 Thr 127 . . . T T . -1.21 0.93. . 0.20 . . 0.14 1 Cys 128 . B . . T -2 1 ~ . 03 07 . . . . -0.20 . . 0.09 Leu 129 . B B . . . -2.36 1.03. . -0.60' . . 0.04 Val 130 . B B . . . -2.3b 1.17. . -0.60 . . 0.02 Cys 131 . B B . . . -2.49 1.37. . -0.60 . . 0.02 Val 132 . B B . . . -2.48 1.23. . -0.60 . . 0.03 15 Cys 133 . B B 2 0 . 11 93 . . . - . . . -0.60 . . 0.05 Leu l34 . B B . . . -2.16 0.67. . -0.60 . . 0.13 Gly 13S . . . T T . -1.54 0.74. F 0.35 . * 0.13 Ser 136 . . . T T . -1.54 0.86. F 0.35 . * 0.39 Ser 137 . B . . T . -0.69 0.86. F -0.05 . . 0.25 Val 138 . B . . T . -0.02 0.17. . 0.10 , . 0.43 Tyr 139 . B . . . . -0. -0.26. . 0.50 . i 0 . 0.53 Cys 140 . B . . T . 0.24 0.04. . 0.10 . * 0.28 Asp 141 . B . . T . -0.27 -0.34. . 0.70 . * 0.63 Asp 142 . B . . T . 0.03 -0.30. F 0.85 . - * 0.33 25 Ile 143 . B . . T . 0.89 -1.06. F 1.301.07 . .

Asp 144 A B . . . . 0.24 -1.63. F 0.901.07 . .

Leu 14S A B . . . . 0.70 -0.94. F 0.75 . . 0.45 Glu 146 A B : . . . 0.49 -0.51. F 0.75 . * 0.99 Asp 147 A B . . . . -0.32 -0.77. F 0.99 . ~ * 0.92 Ile 148 A B . . . . 0.36 -0.09* F 0.93 . * 0.92 Pro 149 . . . . . C 0.47 -0.34* F 1.57 . . 0.82 Pro 150 . . . . . C 1.39 -0.34* F 1.810.96 . .

Leu 151 . . . . T C 1.08 -0.34* F 2.40 . . 2.68 Pro 152 . B . . T . 0.49 -0.54* F 2.26 . . 2.50 35 Arg 153 . . . T T . 1.13 -0.47* F 2.121.63 . .

Arg 154 . B . . T . 0.53 -0.14* F 1.48 .~ . 3.11 Thr 1 SS . B B . . . 0.50 -0.14* . 0.69 . . 1.66 Ala 156 . B B . . . 0.72 0.19* . -0.15 . 1.33 Tyr 157 . B B . . . 1.04 0.69* . -0.60 . * 0.68 Leu 158 . B B . . . 0.23 0.69* . -0.60 . * 0.93 Tyr 159 . B B . . . 0.12 0.99* . -0.60 . ~' 0.80 SUBSTITUTE SHEET (RULE 26) WO 00/29435 PC1'/US99125031 Ala 160 . . B B . . . 0.54 0.89 * * . -0.60 0.82 Arg 161 . _ B B . . . 0.24 0.13 * * . -0.15 . 1.94 Phe 162 . . B B . . . 0.19 0.13 * * . Ø30 0.87 Asn 163 . . B B . . . I.II -0.24* * . 0.451.15 Arg 164 . . B B . . . 0.47 -0.74. * F 1.081.15 Ile 165 . . B B . . . 1.17 -0.06. * F 0.810.93 Ser 166 . . . B . . C 0.47 -0.84* * F 1.641.13 Arg 167 . . B B . . . 1.17 -0.74* * F 1.47 0.59 Ile 168 . . B B . . 1.17 -0.74* * F 1.801.45 .

1~ Arg 169 . . B B . . 0.36 -1.43* . . 1.471.80 .

Ala 170 A A . . . . 1.29 -1.03* * F 1.29 . 0.80 Glu 171 A A . . . . 1.24 -1.03. * F 1.26 . 2.27 Asp 172 A A . . . . 0.32 -1.29. * F 1.08 . 1, i S

Phe 173 A A . . . . 0.90 -0.60. * F 0.75 . 0.94 15 Lys 174 A A . . . . 0.83 -0.61. * F 0.75 . 0.78 Gly 175 A A . . . . 0.61 -0.61* * F 0.75 . 0.94 Leu 176 A A . . . . 0.66 0.07 * * F -0.15 . 0.89 Thr 177 A A . . . . 0..77-0.71* * F 0.75 . 0.89 Lys 178 A A . . . . 0.58 -0.71* * F 0.901.76 .

2~ Leu 179 A A . . . . 0.53 -0.46* * F 0.601.50 .

Lys 180 . A B . . . 0.07 -1.14* * F 0.901.73 .

Arg 181 : A B . . . 0.58 -0.94* * F 0.75 . 0.71 Ile 182 . A B . . . 0.89 -0.56. * F 0.901.16 .

Asp 183 . A B . . . 0.84 -0.84. * . 0.60 . 0.93 25 Leu 184 . . B . T . 0.84 -0.44* * F 0.85 . 0.77 Ser 185 . . B , T . -0.090.24 * * F 0.25 . 0.90 Asn 186 . . . . T C -0.500.24 . * F 0.45 . 0.38 Asn 187 . . . : T C 0.09 0.63 * * F 0.15 . 0.62 Leu 188 . . B . , . . -0.800.33 * * . -0.10 . 0.62 Ile 189 . . B . . . 0.01 0.63 * . . -0.40 . 0.27 Ser 190 . . B . . . 0.31 0.23 * . F 0.05 . 0.28 Ser 191 . . B . . . 0.31 0.23 * * F 0.05 . 0.54 Ile 192 . . B . . . -0.28-0.46* . F 0.80 . 1.29 , Asp 193 . . B . T . -0.17-0.64* . F 1.15 . 0.98 35 Asn 194 A . . . T . 0.83 -0.24* * F 0.85 . 0.63 Asp 195 A . . . T . 0.32 -0.63* . F 1.301.76 . ~

Ala 196 A . . . T . -0.19-0.63* . . 1.00 . 0.87 Phe 197 A A . . . . 0.67 0.06 * . . -0.30 . 0.45 Arg 198 A A . . . . O.U8 0.16 * . . -0.30 . 0.36 40 Leu 199 A A . . . . -0.730.66 * * . -0.60 . 0.36 Leu 200 A A . . . . -0.730.84 * . . -0.60 . 0.35 SUBSTITUTE SHEET (RULE 26) His 201 A A . . . . -0.140.46* . -0.60 . * 0.31 Ala 202 A A , . . . -0.260.46* . -0.60 . . * 0.62 Leu 203 A A . . . . -x.260.46* . -0.60 . * 0.62' Gln 204 A A . . . . -1.260.46* -0.60 . . 0.32 $ Asp 205 . A B . -0 0 . 66 64 . . . . . -0.60 . . 0.26 Leu 206 . A B . . . -0.620.57. . -0.60 . . 0.49 Ile 207 . A B . . . -0.03-0.1. . 0.30 . I . 0.49 Leu 208 . . B . T . 0.78 -0.11* F 0.85 . . 0.47 Pro 209 A . . . T . -0.030.29. F 0.25 . . 0.99 Glu 210 A . . . T . -0.030.29. F 0.401.16 . *

Asn 211 A . . . T . 0.19 -0.40. F L00 2.44 . .

Gln 212 A A . . . . 0.27 -0.59. F 0.901.60 . *

Leu 213 A A . . . . 0.87 -0.33. . 0.30 . . 0.76 Glu 214 A A . . . 0 0 . 22 10 . . . -0.30 1$ . . 0.73 Ala 215 . A B . . . -0.590.34. . -0.30 . . 0.31 Leu 216 . A B . . . -0.800.63. . -0.60 . . 0.31 Pro 217 . . B . . . -1.100.37. . -0.10 . . 0.28 Val 218 . . B . . . -0.630.76* . -0.40 . . 0.37 Leu 219 . . B . T . -1.520.69* F -0.05 . . 0,44 Pro 220 . . . . T C -0.930.69* F 0.15 . 0.20 Ser 221 . . . . T C -0.820.26* F 0.45 . . 0.47 Gly 222 . . B . T . -1.420.40* F 0.25 . . 0.49 Ile 223 . A 8 . . . -0.570.40* F -0.15 . . 0.26 Glu 224 . A B . . . -0.61-0.03. . 0.30 . * 0.33 Phe 225 . A B 0 0 *
. 29 23 . . . - . . . -0.30 . 0.25 Leu 226 . A B . . . -0.80-0.20. . 0.30 . * 0.69 Asp 227 A A . . . . -0.46-0.20* . 0.30 . * 0.33 ~8 A !'~ ~ ~ . . 0.54 0.20* . -0.30 ~ . 0.61 Arg 229 A A . . . . -0.27-0.59. . 0.75 . * 1.45 ~

Leu 230 A A . . . . 0.43 -0.59. . 0.88 . * 0.71 Asn 23I A A . . . . 0.94 -0.19* . 1.011.67 . *

Arg 232 . A . T . . 0.64 -0.44* F 1.84 . * L 14 Leu 233 . A . T . . 1.16 -0.06* F 2.121.85 . .

Gln 234 . . .. T T . 0.16 -0.31* F 2.801.14 . .

3$ Ser 235 . . . T T . 0.97 -0.03* F 2.37 . . 0.41 Ser 236 . . . . T C 0.76 0.37* F 1 ~9 . * 0.86 Gly 237 . . . T T . 0.06 0.11* F 1.210.76 . *

Ile 238 . A B . . . 0.28 0.21. F 0.13 . . 0.58 Gln 239 . A B . . . Ø420.33. F -0.15 . . 0.43 4~ Pro 240 . A B . . . -0.010.73* F -0.45 . * 0.38 Ala 241 A A . . . . Ø300.30* . -0.15 . * 1.06 SUBSTITUTE SKEET (RULE 26) Ala 242 A A . . . . . -0.560.11* . . -0.30 0.62 Phe 243 A A . . . . . 0.33 0.33* * . -0.30 0.40 Arg 244 A A . . . . . 0.38 -0.10* * . 0.30 ~ 0.68 Ala 245 A A . . . . . -0.22-0.60* * . 0.751.35 Met 246 A A . . . . . 0.37 -0.41* * . 0.451.28 Glu 247 A A . . . . 0.26 -0.80* * . 0.751.13 .

Lys 248 A A . . . . 0.14 -0.01* * . 0.30 . 0.97 Leu 249 A A . . . . -0.210.17* * . -0.30 . 0.81 Gln 250 A A . . . . -0.430.31. . . -0.30 . 0.73 1 Phe 251 A A . . . . -0.131.00. . . -0.60 ~ . 0.30 Leu 252 . A B . . . -0.131.39. * . -0.60 . 0.49 Tyr 253 . . B . . . -0.180.70. * . -0.40 . 0.47 Leu 254. . . B . . . -0.180.70* . . -0.23 . 0.88 Ser 255 . . B . T . -0.990.60* . . 0.14 . 0.88 I Asp 256 . . . T T . -0.290.60* . . 0.71 S . . 0.46 Asn 257 . . . T T . 0.22 -0.16* . F 1.93 . 0.94 Leu 258 . . B . T . -0.42-0.46* . F 1.70 . 0.94 Leu 259 . . B . . . 0.18 -0.16* . F 1.33 . 0.39 Asp 260 . . B . . . 0.13 0.27* . F 0.56 . 0.38 Ser 261 . . B . . . -0.080.30* . F 0.39 . 0.45 Ile 262 . . B .. T . -0.890.04* . F 0.42 . 0.85 Pro 263 . . B . T . -0.290.04* . F 0.25 . 0.42 Gly 264 . . . . T C 0.31 0.47* . F 4.15 . 0.48 Pro 265 . . . . T C 0.01 0.51. * F 0.301.07 .

25 Leu 266 . . . . . C -0.500.21* * F 0.42 . 0.93 Pro 267 . . . . T C 0.50 0.47* * F 0.49 . 0.77 Pro 268 . . . T T . 0.41 0.04* * F I .16 . 0.98 Ser 269 . . B . T . -0.100.00* * F 1.681.59 .

Leu 270 . . B . , T . 0.08 -0.04* * F 1.70 . 0.76 Arg 271 . . B . . . 0.08 0.03* * F 0.73 . 0.67 Ser 272 . . B . . . 0.29 0.29. . . 0.410.41 .

Val 273 . . B . . . 0.50 0.30. * . 0.24 . 0.87 His 274 . . B . . . 0.80 0.01. * . 0.07 . 0.71 Leu 275 . . B . T . 0.80 0.41* . . -0.20 . 0.86 35 Gln 276 . . . . T C -0.200.71* . F 0.15 . 0.95 Asn 277 . . . . T C O.IO 0.76. * F 0.15 . 0.49 Asn 278 . . . . T C 0.64 0.26* * F 0.601.03 .

Leu 279 A A . . . . 0.08 0.06* . . -0.30 . 0.86 Ile 280 A A . . . . 0.89 0.27* * . -0.30 . 0.53 4~ Glu 281 . A B , . . 1.00 0.27* . . -0.30 . 0.57 Thr 282 A A . . . . 1.00 -O.I3* . . O.d51.35 .

SUBSTITUTE SHEET (RULE 26) Met 283 . A B . . . . 0.14 -0.81* . F 0.90 3.21 Gln 284 . A B . . . . 0.26 -0.86* . F 0.901.38 _ Arg 285 . A B . . . . 0.48 -0.07* . F 0.45 0.$3 Asp 286 . A . . T . . 0.48 0.01 * . . 0.10 0.45 $ Vat 287 . A B . . . . 0.58 -0.60* . . 0.60 0.43 Phe 288 A A . . . . . 1.18 -0.57* . . 0.60 0.34 Cys 289 A A . . . . . 1.18 -0.57* . . 0.60 0.35 Asp 290 A A . . . . . 1.03 -0.57* . F 0.75 0.82 Pro 291 A A . . . . . 1.08 -0.71. . F 0.901.30 Glu 292 A A . . . . . 1.90 -1.50. . F 0.90 - 4.83 Glu 293 A A . . . . . 2.29 -1.57. . F 0.90 3.94 His 294 A A . . . . . 3.07 -1.09* . F 0.90 3.67 Lys 295 A A . . . . . 3.18 -1.51. . F 0.90 4.16 His 296 A A . . . . . 339 -I * . F 0.90 .51 4.70 1$ Thr 297 A A . . . . . 2.58 - * . F 0.90 L 5.98 Arg 298 A A . . . . . 2.58 -0.93* . F 0.90 2.47 Arg 299 A A . . . . . 2.61 -0.93* . F 0.90 3.14 Gin 300 A A . . . . . 1.68 -1.43* * F 0.90 3.63 Leu 301 A A . . . . . 1.82 -1.23* * F 0.901.30 Glu 302 . A B . . . . 1.32 -1.23* * F 0.901.30 Asp 303 . A B . . . . 1.21 -0.54* * F 0.75 0.62 Ile 304 . A B . . . . 0.76 -0.94* * F 1.111.25 Arg 305 . A B . . . . 0.76 -1.20. * F 1.17 0.72 Leu 306 . A B . . . . 1.36 -0.80* * F 1.38 0.69 2$ Asp 307 . . . . T T . 0.47 -0.37* * F 2.241.52 Gly 308 . . . . . T C 0.47 -0.37. * F 2.10 0.54 Asn 309 . . . . . T C 0.54 0.03 * * F 1.441.06 Pro 310 . . . . . T C 0.13 0.03 * * F 1.08 0.52 Ile 311 . . B . . . . 0.13 0.41 . . F 0.17 0.71 3~ Asn 3I2 . . B . . . . -0.570.67 . . . -0.19 0.36 Leu 313 . . B . . , . -0.431.06 . * . -0.40 0.20 Ser 314 . . B . . . . -0.73L06 . * . -0.40 0.45 Leu 315 . . B . . . . -1.110.76 . * . -0.40 0.37 Phe 316 . . B . . T . -0.4?0.86 . * . -0.20 0.46 3$ Pro 317 . . . . T T . -1.170.93 . * . 0.20 0.54 Ser 318 . . . . T T . -1.021.33 . * . 0.20 0.56 Ala 319 . . B . . T . -1.531.21 . . . -0.20 0.35 Tyr 320 . . B . . . . -0.931.11 * . . -0.40 0.19 Phe 321 . . B . . . . -0.121.11 * . . -0.40 0.21 40 Cys 322 B

. . . . . . -0.720.73 * . . -0.40 0.42 Leu 323 . . B . . . . -0.630.91 . * . -0.40 0?2 SUBSTITUTE SHEET (RULE 26) Pro 324 . . $ . . . . -0.93 0.59 . . -0.40 * 0.39 Arg 325 . . B B . . . -1.03 0.49 . . -0.60 . 0.51 Leu 326 . . B B . . . -0.22 0.34 . . -0.30 . 0.61 Pro 327 . . . B T . . -0.26 -0.34 . 0.70 . * 0.78 S Iie 328 . . . B T . . 0.24 0.01 . . 0.10 . 0.34 Gly 329 . . B B . . . 0.07 0.50 . F -0.60 * 0.60 Arg 330 . . B B . . . -0.43 0.24 . F -0.15 * 0.50 Phe 331 . . B B . . . -0.01 0.24 * . -0.30 . 0.91 Thr 332 . . B B . . . -0. -0.01 . 0.45 i 9 * * 1.17 1~ Ter 333 . . B B . . . 0.31 -0.01 . 0.30 ' * * 0.76 SUBSTITUTE SHEET (RULE 26) Table V
Res Position I II III IV V VI VII VIII IX X XI XII XIII XIV
Met 1 . A B . . . . -1.30 0.70 . . . -0.60 0.39
Leu 2 . A B . . . . -1.72 0.96 . . . -0.60 0.25
Leu 3 . A B . . . . -2.14 1.21 . . . -0.60 0.16
Pro 4 . A B . . . . -2.06 1.47 . . . -0.60 0.14
Leu 5 . A B . . . . -1.97 1.24 * . . -0.60 0.22
Leu 6 . A B . . . . -2.18 0.94 . . . -0.60 0.36
Leu 7 . A B . . . . -2.18 0.94 . . . -0.60 0.19
Ser 8 . A B . . . . -1.71 1.20 * . . -0.60 0.19
Ser 9 . . B B . . . -1.84 0.94 . . F -0.45 0.23
Leu 10 . . B B . . . -1.33 0.69 . . F -0.45 0.27
Leu 11 . . . B . . C -0.52 0.39 * . F 0.05 0.27
Gly 12 . . . . . T C -0.30 0.40 . . F 0.45 0.35
Gly 13 . . . . . T C -0.60 0.51 . . F 0.15 0.43

Ser 14 . . B . . T . -0.30 0.44. . F 0.12 0.52 Gln 15 . . B . . T . 0.17 -0.24. * F 1.19 0.88 20 Ala 16 . . B . . . . I .09 -0.24. * F I .16 0.88 Met 17 . . B . . T . 0.73 -0.67. * F 1.981.28 Asp 18 . . B . . T . 0.79 -0.27. * F 1.70 0.64 Gly 19 . . . . T T . 0.20 0.24* * F 1.33 0.67 Arg 20 . . . . T T . 0.31 0.43* * 0.710.47 2$ Phe 21 . . B B . . . 0.04 -0.19* * . 0.64 0.56 Trp 22 . . B B . . . 0.64 0.46* * . -0.43 0.42 Ile 23 . . B B . . . 0.64 0.43* * . -0.60 0.37 Arg 24 . . B B . , . 0.69 0.43* * . -0.60 ~ 0.74 Va1 25 . . B B . . . -0.28 0.03* *
F -0.30 Gln 26 . . . $ . . C -0.18 -0.24* * 0.94 0.65 0.99 Glu 27 . . . B . . C -0.74 -0.31* * F 0.65 0.50 Ser 28 . . . B T . . -0.07 0.33 0.10 0.50 Val 29 . . B B . . . -0.18 0.11 -0.30 0.45 Met 30 . . B B . . . 0.09 -0.29 0.30 0.45 35 Val 31 . . B B . . . -0.58 0.21. . . -0.30 0.34 Pro 32 . . B B . . . -0.58 0.40. . . -0.30 0.24 Glu 33 . . B . . . . -1.17 -0.24. . . 0.50 0.41 Ala 34 . . . . T . . -0.61 -0.17. . . 0.90 0.39 Cys 35 . . B . . . . -0.87 -0.43* . . 0.50 0.34 Asp 36 . . . B T
-0.22 -0.21* . 0.70 0.14 SUBSTITUTE SHEET (RULE 26) Ile 37 . . B B . . . -0.680.21. . -0.30 . 0.22 Ser 38 . . B B
-0.980.29. . -0.30 * 0.22 Val 39 . . B . . T . -1.090.10. . 0.10 * 0.18 Pro 40 . . B . . T . -0.720.89. -0.20 * 0.22 S Cys 41 . . . . T T . -0.970.59. . 0.20 * 0.22 Ser 42 . . . . T T . -0.290.96. . 0.20 * 0.46 Phe 43 . . . . T . . 0.12 0.74* . 0.28 . 0.46 Ser 44 . . B . . . . , 0.31. . 0.61 0.98 . 1.69 Tyr 45 . . B . . T . 1.19 0.14. . 1.09 . 2.19 Pro 46 . . . . T T . 1.57 -0.24. F 2.52 . 4.22 Arg 47 . . . . T T . 1.56 -0.11. F 2.80 . 3.31 Gln 48 . . . . T T . 1.91 -0.01. F 2.52 . 3.05 Asp 49 . . . . T . . 1.91 -0.34. F 2.041.95 *

Trp 50 . . . . T T . 1.84 -0.39. F 1.961.33 *

IS Thr 51 . . . . . T C 1.84 0.10. F 0.$81.11 *

Gly 52 . . . . T T . 1.14 0.13. F 0.801.03 *

Ser 53 . . . . . T C 0.90 0.63. F O.15 . 0.99 Thr 54 . . . . . . C 0.56 0.47. F 0.101.07 .

Pro 55 . . . . . T C 0.60 0.41. F 0.301.07 .

Ala 56 . . . T T . 0.62 0.74. . 0.351.26 .

Tyr 57 . . . . T T . 0.27 1.27. . 0.20 . 0.91 Gly 58 . . . . T T . 0.61 1.57. : 0.20 . 0.51 Tyr 59 . . B B . . . 0.33 1.14* . Ø45 . 1.01 Trp 60 . . B B . . . -0.311.i4* . -0.60 . 0.65 ZJ' Phe 61 . B B

. . . . -0.031.03* . -0.60 . 0.49 Lys 62 . . B B . . . 0.21 1.09* . -0.60 . 0.45 Ala 63 . . B B . . . 0.24 0.33 -0.30 0.74 . . B B . . . 0.18 -0.10* . 0.791.24 .

Thr 65 . . B B . . . 0.51 -0.40* F 1.13 ~ . 0.89 Glu 66 . . B B . . . 0.87 -0.40* F 1.621.77 .

Thr 67 ' . . B T . . 0.23 -0.47* F 2.36 . 2.36 Thr 68 . . . T T . O.b -0.61* F 3.40 l . I .65 Lys 69 . . . . T T . 0.61 -0.67* F 3.061.48 .

Gly 70 . . . . . T C 0.33 -0.03. F 2.07 . 0.76 35 Ala 71 . . . . . T C 0.02 -0.Ol* F 1.73 . 0.53 Pro 72 . . B . . . . 0.33 -0.01* . 0.84 . 0.38 Val 73 - . B . . . . 0.61 0.39. . -0.10 . 0.62 Ala 74 . . B . . . . 0.57 0.46. . -0.10 . 0.84 Thr 75 . . B . . . . 0.61 0.36* . 0.50 * 0.94 Asn 76 . . . . . . C 1.31 0.31. F 1.301.70 .

His 77 . . . . . T C 1.52 -0.33* F 2.40 . 3.29 SUBSTITUTE SHEET (RULE 26) Gln 78 . . . . T C 1.52 -0.83* . F 3.00 . 3.95 Ser 79 . . . . T C 2.11 -0.67* . F 2.701.82 . .

Arg 80 . $ . . T . 1.82 -1.07* . F 2.20 . 2.32 Glu $1 A B . . . . 1.52 -0.96* . F 1.501.33 .

$ Val 82 A B . . . . 1.24 -0.97* * F 1.541.33 .

Glu 83 A B . . . . 1.36 -0.87* * . 1.28 . 0.98 Met 84 A B . . . . 1.31 -0.87* * . 1.771.10 .

Ser 85 . . . . T C 1.31 -0.44* * F 2.561.47 .

Thr 86 . , . T T . 0,61 -1.09. * F 3.401.67 .

1 Arg 87 . . . T T . 1.47 -0.30. * F 2.76 ~ . 1.46 Gly 88 . . . T T . 0,66 -0.51. * F 2.721.88 .

Arg 89 . B B . . . 0.94 -0.21. * F 1.281.08 .

Phe 90 . B B . . . 0.90 -0.21. * . 0.64 . 0.79 Gln 91 . B B . . . 1.21 0.21. * . -0.30 . ~ 0.79 Leu 92 . B B . . . 0.89 -0.21. * . 0.30 . 0.68 Thr 93 . B $ . . . 0.64 0.21. * F 0.341.21 .

Gly 94 . . B . . C 0.58 -0.07. * F 1.33 . 0.70 Asp 95 . . . . T C 0.93 -0.47* * F 2.221.71 .

Pro 96 . . . . T C 0.93 -0.73. * F 2.861.17 .

Ala 97 . . . T T . 1.08 -0.81. . F 3.401.90 .

Lys 98 . . . T T . 1.09 -0.67. * F 2.910.61 .

Gly 99 . . . T T . 0.62 -0.29* . F 2.27 . 0.53 Asn 100 . B . T T . -0.23-0.03. * F 1.93 . 0.43 Cys 101 . B . . T . -0.910.11* * . 0.44 . 0.16 25 Ser 102 . B . . T . -0.210.80* * . -0.20 . 0.11 Leu 103 A B B . . . -0.260.37* * . -0.30 . 0.14 Val 104 A B B . . . -0.50-0.03* * . 0.30 . 0.43 Ile 105 A B B . . . -0.50-0.10* . . 0.30 . 0.33 Arg 106 A B B . . . -0.43-0.09* . . 0.30 . 0.68 3~ Asp 107 A $ B . . . -0.13-0.16* . . 0.30 . 0.91 Ala 108 A $ . . . . 0.68 -0.40* . . 0.45 . 2.25 Gln 109 A B . . . . 1.53 -1.09* . . 1.031.92 .

Met 110 A . . . . C 2.12 -1.09. . . 1.511.99 .

Gln 11 A B . . . .' 2.01 -0.70. . F 1.74 I 2.64 .
.

Asp 112 . . . T T . 1.77 -0.80. * F 2.82 . 2.64 Glu 113 . . . T T . 1.66 -0.44* . F 2.80 . 4.18 Ser 114 . . . T T . 0.96 -0.27* * F 2.52 . 2.09 Gln 115 . . . T T . 1.67 0.11* * F 1.641.08 .

Tyr 116 . B B . . . 0.81 0.11* * . 0.41 . 1.22 40 Phe 117 . B B . . . 0.81 0.76* . . -0.32 . 0.68 Phe 1 I . B B . . . 0.92 0.37* * . 0.04 8 0.68 .

SUBSTITUTE SHEET (RULE 26) Arg I 19 . . B B . . . 0.88 -0.03* . . 0.98 0.85 Val 120 . . B B . . . O.S8 -0.36* . . 1.32 _ 0.97 Glu 121 . . . . T T . O.S8 -0.76* . F 3.061.50 Arg 122 . . . . T T . 0.42 -0.79* . F 3.401.20 Gly 123 . . . . T T . 1.23 -0.14* * F 2.76 1.20 Ser 124 . . . . T T . 0.88 -0.79* * F 2.721.36 Tyr 12S . . B . . . . 1.73 -O.b3. * . 1.331.09 Val 126 . . B . . . . 1.03 0.37 . * . 0.391.76 Arg 127 . . B . . . . 0.32 0.73 . . . -0.25 1.14 1~ Tyr 128 . . B . . . . 0.67 0.96 . * . -0.40 0.72 Asn 129 . . B . . . 0.97 0.60 . * . -0.25 Phe 130 . . B . . . . 0.87 -0.04 * 0.651.33 Met 131 . . B . . . . 1.02 0.39 . * . -0.10 0.84 Asn 132 . . . . T T . 0.21 0.41 . * . 0.20 0.45 1$ Asp 133 . . . . T T . -0.360.80 . . . 0.20 0.45 Gly i34 . . . . T T . -0.310.70 . * . 0.20 0.38 Phe 13S . . B . . T . -0.470.09 . . . 0.10 0.47 Phe 136 . . B B . . . -0. 0.33 . * . -0.30 i 8 0.21 Leu 137 . . B B . . . -1.030.81 . . . -0.60 0.30 20 Lys 138 . . B B . . . -1.841.03 . . . -0.60 0.26 Val 139 . . B B . . . -1.800.93 . . . -0.60 0.25 Thr 140 . . B B . . . -1.80O.S3 . * . -0.60 0.40 Val 141 . . B B . . . -1.410.63 . . . -0.60 0.17 Leu. i42 . . B B . . . -0.811.11 . * . -0.60 0.34 25 Ser 143 . . B B . , . -0.740.90 . * . -0.60 0.36 Phe 144 , , B B . . . -0.100.41 * * . -0.26 0.96 Thr 14S . . . . . T C 0.21 0.20 * * F 1.281.80 Pro 146 . . . . . T C 1.07 -0.09* * F 2.22 2.32 Arg 147 . . . . . T C 1.84 -0.47. * F 2.56 ~ 4.48 Pro 148 . . . . T T . 2.14 -0.76. * F 3.40 4.22 Gin 149 . . . . T . . 2.53 -0.84* * F 2.86 4.39 Asp ISO . . . . T . . 2.84 -0.79* . F 2.52 3.24 His 1 S 1 . . . T . .. 2.24-0.79* . F 2.18 . 3.50 Asn 1S2 . . . . T T . 1.82 -O.S3* * F 2.04 1.67 3 Thr 1 S3 . . . . T T . 1.37 -0.44. . F 1.40 J' 1.44 Asp 154 . . , . T T . 1.33 0.13 . * F 0.65 O.S7 Leu 1 SS . . B , . T . 0.48 0.13 * * . 0.10 0.48 Thr 1S6 . . B B . . . O.S1 0.37 . * . -0.30 0.25 Cys 157 . . B B . . . -0.19-0.11. * . 0.49 0.25 40 His 1S8 . . B B , . . -0.180.67 * * . -0.22 0.26 Val 1S9 . . B 8 . . . 
-0.07 0.37 * * . 0.27 0.24

JUMBO APPLICATIONS/PATENTS

THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE VOLUME
THIS IS VOLUME ~ OF
NOTE: For additional volumes please contact the Canadian Patent Office

Claims (23)

What Is Claimed Is:
1. An isolated nucleic acid molecule comprising a polynucleotide having a nucleotide sequence at least 95% identical to a sequence selected from the group consisting of:
(a) a polynucleotide fragment of SEQ ID NO:X or a polynucleotide fragment of the cDNA sequence included in ATCC Deposit No:Z, which is hybridizable to SEQ ID NO:X;
(b) a polynucleotide encoding a polypeptide fragment of SEQ ID NO:Y or a polypeptide fragment encoded by the cDNA sequence included in ATCC Deposit No:Z, which is hybridizable to SEQ ID NO:X;
(c) a polynucleotide encoding a polypeptide domain of SEQ ID NO:Y or a polypeptide domain encoded by the cDNA sequence included in ATCC Deposit No:Z, which is hybridizable to SEQ ID NO:X;
(d) a polynucleotide encoding a polypeptide epitope of SEQ ID NO:Y or a polypeptide epitope encoded by the cDNA sequence included in ATCC Deposit No:Z, which is hybridizable to SEQ ID NO:X;
(e) a polynucleotide encoding a polypeptide of SEQ ID NO:Y or the cDNA sequence included in ATCC Deposit No:Z, which is hybridizable to SEQ ID NO:X, having biological activity;
(f) a polynucleotide which is a variant of SEQ ID NO:X;
(g) a polynucleotide which is an allelic variant of SEQ ID NO:X;
(h) a polynucleotide which encodes a species homologue of the SEQ ID NO:Y;
(i) a polynucleotide capable of hybridizing under stringent conditions to any one of the polynucleotides specified in (a)-(h), wherein said polynucleotide does not hybridize under stringent conditions to a nucleic acid molecule having a nucleotide sequence of only A residues or of only T residues.
2. The isolated nucleic acid molecule of claim 1, wherein the polynucleotide fragment comprises a nucleotide sequence encoding a secreted protein.
3. The isolated nucleic acid molecule of claim 1, wherein the polynucleotide fragment comprises a nucleotide sequence encoding the sequence identified as SEQ ID NO:Y or the polypeptide encoded by the cDNA sequence included in ATCC Deposit No:Z, which is hybridizable to SEQ ID NO:X.
4. The isolated nucleic acid molecule of claim 1, wherein the polynucleotide fragment comprises the entire nucleotide sequence of SEQ ID NO:X or the cDNA sequence included in ATCC Deposit No:Z, which is hybridizable to SEQ ID NO:X.
5. The isolated nucleic acid molecule of claim 2, wherein the nucleotide sequence comprises sequential nucleotide deletions from either the C-terminus or the N-terminus.
6. The isolated nucleic acid molecule of claim 3, wherein the nucleotide sequence comprises sequential nucleotide deletions from either the C-terminus or the N-terminus.
7. A recombinant vector comprising the isolated nucleic acid molecule of claim 1.
8. A method of making a recombinant host cell comprising the isolated nucleic acid molecule of claim 1.
9. A recombinant host cell produced by the method of claim 8.
10. The recombinant host cell of claim 9 comprising vector sequences.
11. An isolated polypeptide comprising an amino acid sequence at least 95% identical to a sequence selected from the group consisting of:
(a) a polypeptide fragment of SEQ ID NO:Y or the encoded sequence included in ATCC Deposit No:Z;
(b) a polypeptide fragment of SEQ ID NO:Y or the encoded sequence included in ATCC Deposit No:Z, having biological activity;
(c) a polypeptide domain of SEQ ID NO:Y or the encoded sequence included in ATCC Deposit No:Z;
(d) a polypeptide epitope of SEQ ID NO:Y or the encoded sequence included in ATCC Deposit No:Z;
(e) a secreted form of SEQ ID NO:Y or the encoded sequence included in ATCC Deposit No:Z;
(f) a full length protein of SEQ ID NO:Y or the encoded sequence included in ATCC Deposit No:Z;
(g) a variant of SEQ ID NO:Y;
(h) an allelic variant of SEQ ID NO:Y; or
(i) a species homologue of the SEQ ID NO:Y.
12. The isolated polypeptide of claim 11, wherein the secreted form or the full length protein comprises sequential amino acid deletions from either the C-terminus or the N-terminus.
13. An isolated antibody that binds specifically to the isolated polypeptide of claim 11.
14. A recombinant host cell that expresses the isolated polypeptide of claim 11.
15. A method of making an isolated polypeptide comprising:
(a) culturing the recombinant host cell of claim 14 under conditions such that said polypeptide is expressed; and
(b) recovering said polypeptide.
16. The polypeptide produced by claim 15.
17. A method for preventing, treating, or ameliorating a medical condition, comprising administering to a mammalian subject a therapeutically effective amount of the polypeptide of claim 11 or the polynucleotide of claim 1.
18. A method of diagnosing a pathological condition or a susceptibility to a pathological condition in a subject comprising:
(a) determining the presence or absence of a mutation in the polynucleotide of claim 1; and
(b) diagnosing a pathological condition or a susceptibility to a pathological condition based on the presence or absence of said mutation.
19. A method of diagnosing a pathological condition or a susceptibility to a pathological condition in a subject comprising:
(a) determining the presence or amount of expression of the polypeptide of claim 11 in a biological sample; and
(b) diagnosing a pathological condition or a susceptibility to a pathological condition based on the presence or amount of expression of the polypeptide.
20. A method for identifying a binding partner to the polypeptide of claim 11 comprising:
(a) contacting the polypeptide of claim 11 with a binding partner; and
(b) determining whether the binding partner effects an activity of the polypeptide.
21. The gene corresponding to the cDNA sequence of SEQ ID NO:Y.
22. A method of identifying an activity in a biological assay, wherein the method comprises:
(a) expressing SEQ ID NO:X in a cell;
(b) isolating the supernatant;
(c) detecting an activity in a biological assay; and
(d) identifying the protein in the supernatant having the activity.
23. The product produced by the method of claim 20.
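
Claims 1 and 11 recite sequences "at least 95% identical" to a reference sequence. As a rough illustration only, the Python sketch below computes percent identity for two sequences that have already been aligned to equal length, with "-" marking gaps. The function names, the gap symbol, and the choice to count a gap opposite a residue as a mismatch are assumptions made for this example; the claims above do not themselves specify an alignment algorithm or scoring convention.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over two pre-aligned, equal-length sequences.

    Assumption: columns where both sequences have a gap are ignored,
    and a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0  # alignment columns that contribute to the denominator
    matches = 0   # columns where both sequences carry the same residue
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # skip columns that are gaps in both sequences
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable positions in the alignment")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 95.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Under these assumptions, a 20-residue alignment with a single mismatch sits exactly at the 95% threshold, while two mismatches in 20 positions (90%) fall below it. A real comparison would first need an alignment step, which this sketch deliberately leaves out.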
CA002348824A 1998-10-28 1999-10-27 12 human secreted proteins Abandoned CA2348824A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US10597198P 1998-10-28 1998-10-28
US60/105,971 1998-10-28
PCT/US1999/025031 WO2000029435A1 (en) 1998-10-28 1999-10-27 12 human secreted proteins

Publications (1)

Publication Number Publication Date
CA2348824A1 (en) 2000-05-25

Family

ID=22308773

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002348824A Abandoned CA2348824A1 (en) 1998-10-28 1999-10-27 12 human secreted proteins

Country Status (5)

Country Link
EP (1) EP1124850A4 (en)
JP (1) JP2002530062A (en)
AU (1) AU1231400A (en)
CA (1) CA2348824A1 (en)
WO (1) WO2000029435A1 (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020137890A1 (en) * 1997-03-31 2002-09-26 Genentech, Inc. Secreted and transmembrane polypeptides and nucleic acids encoding the same
ES2353637T3 (en) 1998-08-07 2011-03-03 Immunex Corporation LDCAM MOLECULES.
US20030055231A1 (en) 1998-10-28 2003-03-20 Jian Ni 12 human secreted proteins
EP1141376A4 (en) * 1998-12-23 2002-03-13 Human Genome Sciences Peptidoglycan recognition proteins
SE9902056D0 (en) * 1999-06-03 1999-06-03 Active Biotech Ab An integrin heterodimer and an alpha subunit thereof
EP1514933A1 (en) * 1999-07-08 2005-03-16 Research Association for Biotechnology Secretory protein or membrane protein
US7129338B1 (en) 1999-07-08 2006-10-31 Research Association For Biotechnology Secretory protein or membrane protein
AU5917801A (en) * 2000-04-27 2001-11-07 Millennium Pharm Inc Novel integrin alpha subunit and uses thereof
US6596493B1 (en) * 2000-08-15 2003-07-22 The Johns Hopkins University School Of Medicine Diagnosis and treatment of tumor-suppressor associated disorders
IL149851A0 (en) * 2002-05-26 2002-11-10 Yeda Res & Dev Resistin binding proteins, their preparation and use
AU2015242979B2 (en) * 2002-11-22 2016-12-01 Ganymed Pharmaceuticals Ag Genetic products differentially expressed in tumors and the use thereof
DE10254601A1 (en) * 2002-11-22 2004-06-03 Ganymed Pharmaceuticals Ag Gene products differentially expressed in tumors and their use
ATE554108T1 (en) 2003-07-25 2012-05-15 Amgen Inc PROCEDURES REGARDING LDCAM AND CRTAM
ES2625259T3 (en) 2006-08-29 2017-07-19 Oxford Biotherapeutics Ltd Identification of protein associated with hepatocellular carcinoma, glioblastoma and lung cancer
PT2403878T (en) 2009-03-05 2017-09-01 Squibb & Sons Llc Fully human antibodies specific to cadm1
CN113481185B (en) * 2021-08-05 2022-12-02 云南师范大学 Salt-tolerant beta-galactosidase GalNC2-13 and preparation method and application thereof
CN116082456B (en) * 2022-11-07 2023-08-01 优睿赛思(武汉)生物科技有限公司 Mutant signal peptide, recombinant vector containing mutant signal peptide and application of mutant signal peptide

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0869178A1 (en) * 1997-04-02 1998-10-07 Smithkline Beecham Corporation SAF-3, Sialoadhesin family member-3

Also Published As

Publication number Publication date
WO2000029435A1 (en) 2000-05-25
JP2002530062A (en) 2002-09-17
AU1231400A (en) 2000-06-05
EP1124850A1 (en) 2001-08-22
EP1124850A4 (en) 2005-10-19

Similar Documents

Publication Publication Date Title
CA2348824A1 (en) 12 human secreted proteins
US8372948B2 (en) 12 human secreted proteins
Adams et al. Biphasic modulation of cell growth by recombinant human galectin-1
Girard et al. Cyclin A is required for the onset of DNA replication in mammalian fibroblasts
Roitelman et al. Immunological evidence for eight spans in the membrane domain of 3-hydroxy-3-methylglutaryl coenzyme A reductase: implications for enzyme degradation in the endoplasmic reticulum
Biggs et al. A Drosophila gene that is homologous to a mammalian gene associated with tumor metastasis codes for a nucleoside diphosphate kinase
Hunter et al. Active site of trypanothione reductase: a target for rational drug design
Zolkiewska et al. Integrin alpha 7 as substrate for a glycosylphosphatidylinositol-anchored ADP-ribosyltransferase on the surface of skeletal muscle cells.
Volk et al. A 135‐kd membrane protein of intercellular adherens junctions.
Tate et al. Human kidney gamma-glutamyl transpeptidase. Catalytic properties, subunit structure, and localization of the gamma-glutamyl binding site on the light subunit.
Plevani et al. Polypeptide structure of DNA primase from a yeast DNA polymerase-primase complex.
Xu et al. Molecular basis of the redox regulation of SUMO proteases: a protective mechanism of intermolecular disulfide linkage against irreversible sulfhydryl oxidation
Kivirikko et al. [6] Recent developments in posttranslational modification: Intracellular processing
Hartshorne et al. Regulation of smooth muscle actomyosin
Pacifici et al. Remodeling of the rough endoplasmic reticulum during stimulation of procollagen secretion by ascorbic acid in cultured chondrocytes. A biochemical and morphological study.
Wilk et al. Glutamyl aminopeptidase (aminopeptidase A), the BP-1/6C3 antigen
Volk et al. Cleavage of A-CAM by endogenous proteinases in cultured lens cells and in developing chick embryos
JPH0880195A (en) Gene coding adseverin
Blostein et al. Functional properties of an H, K-ATPase/Na, K-ATPase chimera
JP2892726B2 (en) Platelet blocking peptide
Ankenbauer et al. Proteins regulating actin assembly in oogenesis and early embryogenesis of Xenopus laevis: gelsolin is the major cytoplasmic actin-binding protein.
Pidard et al. Neutrophil proteinase cathepsin G is proteolytically active on the human platelet glycoprotein Ib-IX receptor: characterization of the cleavage sites within the glycoprotein Ib α subunit
Carlsson et al. Purification and properties of cyclostome carbonic anhydrase from erythrocytes of hagfish
SK287034B6 (en) Peptides capable of binding to the gap protein SH3 domain, nucleotide sequences coding therefor, and preparation and use thereof
Thielens et al. Structure and functions of the interaction domains of C1r and C1s: keystones of the architecture of the C1 complex

Legal Events

Date Code Title Description
EEER Examination request
FZDE Dead