EP1608390A2

EP1608390A2 - Correlation of gene expression with chromosome abnormalities in ataxia telangiectasia tumorigenesis

Info

Publication number: EP1608390A2
Application number: EP02786436A
Authority: EP
Inventors: Carrolee Barlow; Christopher J. Winrow; Marie Lei A. Callahan; Daniel G. Pankratz; Cecile Rose T. Vibat; Amy J. Warren
Original assignee: NEUROME Inc
Current assignee: NEUROME Inc
Priority date: 2001-10-17
Filing date: 2002-10-17
Publication date: 2005-12-28
Also published as: WO2003033668A3; WO2003033668A2; AU2002349956A1

Abstract

Polynucleotides, polypeptides, kits and methods are provided related to regulated genes characteristic of ataxia telangiectasia tumorigenesis.

Description

CORRELATION OF GENE EXPRESSION WITH CHROMOSOME ABNORMALITIES IN ATAXIA TELANGIECTASIA TUMORIGENESIS

This application claims priority of U.S. Provisional Application No. 60/330,206, filed October 17, 2001, which application is hereby incorporated by reference.

BACKGROUND OF THE INVENTION

Introduction to Cancer

Cancer is a complex, multistep process involving the evolution of a normal cell into a neoplasm that can grow and divide out of control. Usually, the timing of cell division is under strict constraint, involving a network of signals that work together to determine when a cell can divide, how often it should happen and how errors can be fixed. Mutations or other genomic aberrations that affect specific genes in one or more pathways in this complex network can trigger cancer. Such genomic changes can be initiated by exogenous exposures (e.g., chemicals, radiation or viruses), endogenous exposures (e.g., oxy-radicals) or enzymatic errors (e.g., polymerase or recombinase infidelity). Genetic predisposition (i.e., genetic susceptibility), alone or in combination with environmental exposures, also plays a large role in the cancer process. The specific families of genes that tend to be affected in the cancer process are those that allow cancer cells to grow and replicate out of control, resist normal cell signals to die, and invade other tissues. Some genetic disorders are the direct result of a mutation in one gene. However, most tumors are the result of several genomic changes.

The main mechanisms for cancer formation involve genes that result in 1) the impairment of a DNA repair pathway, 2) the transformation of a normal gene into an oncogene and/or 3) the malfunction of a tumor suppressor gene. Cancer genes can be classified as either caretaker or gatekeeper genes (Kinzler et al. 1997), although there is clearly overlap for some genes. This concept of caretaker and gatekeeper genes acknowledges their respective roles in maintenance of genomic integrity (e.g., DNA repair) and cellular proliferation, respectively. Some examples of caretaker genes are those involved in DNA repair, carcinogen activation or carcinogen detoxification, while examples of gatekeeper genes are those involved in cell cycle control and DNA replication. Dysfunctional caretaker genes increase the probability of mutations in gatekeeper genes, which are necessary to initiate the molecular pathogenesis of cancer. The transformation of a normal cell to a tumor cell appears to depend in part on mutations in genes that normally control the cell cycle. Cell cycle checkpoints are believed to play a major role in maintaining the integrity of the genome. Defects in these control points may contribute to increased incidence of genomic alterations, such as deletions, translocations, and amplifications that are common during the evolution of a normal cell to a cancer cell.

The macromolecular target in cancer initiation is most commonly DNA, where carcinogens directly bind to DNA (i.e., forming adducts) or trigger other types of DNA damage (e.g., through the formation of reactive oxygen species). If left unrepaired, or repaired incorrectly, DNA damage can result in errors in reading the genetic code during DNA replication and the establishment of a mutation, strand break, translocation, or other gross chromosomal change. Genetic predisposition to cancer can also lead to genomic instability, resulting in similar genomic aberrations. In addition, during the cancer process, as cancer cells search for genetic variations that may give them a growth advantage over normal cells or allow them to invade other tissues, new mutations and chromosome aberrations occur. These molecular changes that occur throughout the disease process can be measured and are called biological markers or "biomarkers." For example, a number of specific cytogenetic abnormalities have been recognized that are very closely, and sometimes uniquely, associated with morphologically and clinically distinct subsets of leukemia or lymphoma (Le Beau MM, Larson RA. Cytogenetics and neoplasia. In: Hoffman R, Benz EJ, Shattil SJ, Furie B, Cohen HJ, Silverstein L., eds. Hematology: basic principles and practice. New York: Churchill Livingstone, 1994: 878. Heim S, Mitelman F. Cancer cytogenetics, ed 2. New York: Wiley-Liss, 1995. Heim S, Mitelman F. Cytogenetics of solid tumors. Adv Histopathol 1995;15:37. Sandberg AA. The chromosomes in human cancer and leukemia, ed 2. New York: Elsevier, 1990. Mitelman F, Kaneko Y, Berger R. Report of the committee on chromosome changes in neoplasia. In: Cuticchia AJ, Pearson PL, eds. Human gene mapping. Baltimore: Johns Hopkins University Press, 1993:773. Mitelman F. Catalog of chromosome aberrations in cancer, ed 5. New York: Wiley-Liss, 1994.), and thus can serve as diagnostic and/or prognostic biomarkers. The detection of one of these recurring abnormalities can be helpful in the correct diagnosis and can add information of prognostic importance. Specific examples of genomic aberrations and their links with specific tumor types are discussed in more detail below. Thus a goal for the use of biomarkers is to describe the amount and type of these pro-mutagenic or mutagenic lesions for diagnostic or prognostic purposes, and also to enhance exposure assessments in epidemiological studies, focus chemoprevention strategies, and elucidate carcinogenic mechanisms.

Cancer and Chromosome Abnormalities

Virtually all malignant cells from patients with leukemia, lymphoma, or solid tumors have chromosomal abnormalities. A number of specific cytogenetic abnormalities have been associated with morphologically and clinically distinct subsets of leukemia or lymphoma (Le Beau MM, Larson RA. Cytogenetics and neoplasia. In: Hoffman R, Benz EJ, Shattil SJ, Furie B, Cohen HJ, Silverstein L., eds. Hematology: basic principles and practice. New York: Churchill Livingstone, 1994: 878. Heim S, Mitelman F. Cancer cytogenetics, ed 2. New York: Wiley-Liss, 1995. Heim S, Mitelman F. Cytogenetics of solid tumors. Adv Histopathol 1995;15:37. Sandberg AA. The chromosomes in human cancer and leukemia, ed 2. New York: Elsevier, 1990. Mitelman F, Kaneko Y, Berger R. Report of the committee on chromosome changes in neoplasia. In: Cuticchia AJ, Pearson PL, eds. Human gene mapping. Baltimore: Johns Hopkins University Press, 1993:773. Mitelman F. Catalog of chromosome aberrations in cancer, ed 5. New York: Wiley-Liss, 1994.).

The detection of one of these recurring abnormalities can be helpful in the correct diagnosis and can add information of prognostic importance. In addition, the appearance of new abnormalities in the karyotype of a patient usually indicates a change to a more aggressive disorder.

Interestingly, it is not known how consistent structural rearrangements occur. They could occur randomly, with selection of only those aberrations that provide a growth advantage, or the frequency of the various abnormalities seen in cancer may reflect an inherent genetic instability. It has been proposed that chromosomal fragile sites may mediate rearrangements (Bishop J., Annu Rev Biochem 1983;52:301.). It is likely in the case of AT (ataxia telangiectasia, described below), where normal DNA double-strand break repair is compromised, that novel mechanisms of recombination occur at key chromosomal hotspots. Non-random chromosomal rearrangements have been invaluable for identifying genes involved in oncogenesis and for providing diagnostic clues for certain tumors (Rowley J., Cancer Res 1990;50:3816.). A correlation exists between the type of chromosomal abnormality and the histopathologic type of tumor in which it is found, suggesting that certain lineages are susceptible to the transforming effects caused by deregulation of the particular gene at that translocation (Rabbitts TH., Nature 1994;372:143). For example, breakpoints affecting the T-cell receptor (TCR) locus are common in T-cell malignancies and suggest that the translocations arose by recombination error.

Translocations are a type of chromosome defect in which a chromosome is broken, which allows it to associate with parts of other chromosomes. This can cause genes to become transcriptionally deregulated, structurally altered, or both. For example, a gene can be placed in a new environment, such as next to a region responsible for the regulation of a different gene (i.e., the promoter), resulting in an inappropriate expression level of that gene. Such changes can result in oncogenic effects, where the gene no longer responds to the normal cell signals. Alternatively, a gene could be placed in an inactive region of a chromosome, resulting in the inappropriate suppression of its expression, or a gene may simply be inserted out of frame such that it no longer codes for an active protein. Such a change would be characteristic of a tumor suppressor gene, which can then no longer actively keep cell growth in check. A common feature of several of the genes known to be deregulated by translocation in hematopoietic malignancies is that they encode proteins with significant homology to known transcriptional regulatory proteins (Rabbitts TH., Nature 1994;372:143. Cleary ML., Cell 1991;66:619.).

In addition to translocations, another example of chromosomal abnormalities common to cancer is gene amplification. Several cellular oncogenes are amplified in human tumors. For example, c-MYC is amplified in pro-myelocytic leukemia. Other oncogenes, such as c-ERBB (EGFR), ERBB-2 (HER-2/NEU), and c-MYC family members, are amplified in specific tumor types, and multiple copies of these genes have been associated with poor prognosis. Amplification of N-MYC in human neuroblastoma has been correlated with advanced stages of that disease (Schwab M, et al, Nature 1984;308:288.). Similarly, amplification of ERBB-2 is a prognostic factor in mammary and ovarian cancers (King C, et al, Science 1985;252:554. Slamon D, et al, Science 1989;244:707. Hynes NE, and Stern, DF., Biochim Biophys Acta Rev Cancer 1994;1198:165.) and the c-ERBB gene is amplified in glioblastomas (Libermann T, et al, Nature 1985;313:144.) and squamous carcinomas (Merlino C, et al, Science 1984;224:417. Ullrich A, et al, Nature 1984;309:418.).

Specific Examples of Cancers Involving Specific Genomic Defects

Ras, a model oncogene

One specific example of a situation in which genomic alterations are reflected in both gene expression and tumor promotion is the gene Ras, which is found on chromosome 11. In normal cells, Ras acts as a molecular switch that sends a signal that tells the cell to grow. When mutated (as is the case in about 30% of all human cancers), Ras frequently is constitutively active, serving as an oncogene whose protein product continually stimulates cell growth in cancer cells.

Rb, a model tumor suppressor

Another example of a gene in which genomic defects influence tumorigenesis is the retinoblastoma gene, Rb. The disease retinoblastoma occurs in early childhood and affects about 1 child in 20,000. There are both hereditary and non-hereditary forms of the disease. In the hereditary form, multiple tumors are found in both eyes, while in the non-hereditary form only one eye is affected, and by only one tumor. In the hereditary form, a gene called Rb is lost from chromosome 13. Since the absence of Rb seemed to be linked to retinoblastoma, it has been suggested that the role of Rb in normal cells is to suppress tumor formation. Rb is found in all cells of the body, where under normal conditions it acts as a brake on the cell division cycle by preventing certain regulatory proteins from triggering DNA replication. If Rb is missing, a cell can replicate in an uncontrolled manner, resulting in tumor formation. Untreated, retinoblastoma is almost uniformly fatal, but with early diagnosis and modern methods of treatment the survival rate is over 90%. Since the Rb gene is found in all cell types, it would be useful to study the molecular mechanism of tumor suppression by Rb to gain insight into the progression of many types of cancer, not just retinoblastoma.

p53, a model tumor suppressor gene

Similar to Rb, the p53 gene has been referred to as a tumor suppressor gene, i.e., its activity stops the formation of tumors. Individuals who inherit only one functional copy of the p53 gene from their parents are predisposed to cancer and usually develop several independent tumors in a variety of tissues in early adulthood. This condition is rare, and is known as Li-Fraumeni syndrome. However, it has been found that almost all human tumors contain mutations in p53, which therefore appears to play a role in the network of molecular changes in tumorigenesis. In the cell, p53 serves as the "guardian of the genome" (Kinzler, K. W. and Vogelstein, B. (1997) Nature 386:761-763), where it binds DNA, interacts with a large number of proteins, and triggers cell cycle "checkpoints" to control the cell cycle, particularly in response to DNA damaging agents. Most mutant forms of p53 can no longer effectively bind DNA, and as a consequence cannot stimulate the production of the genes (and protein products) that cause a cell cycle arrest or halt cell division. In the absence of these normal checkpoints, cancer cells can divide uncontrollably and form tumors.

Burkitt lymphoma and translocation of Myc

In addition to being duplicated, the Myc gene can undergo translocations, which have been associated with the disease Burkitt lymphoma. Burkitt lymphoma is a rare form of cancer predominantly affecting young children in central Africa, but the disease has also been reported in other areas. Burkitt lymphoma results from chromosome translocations that involve the Myc gene. The classic chromosome translocation in Burkitt lymphoma involves chromosome 8, the site of the Myc gene. This changes the pattern of Myc's expression, thereby disrupting its usual function in controlling cell growth and proliferation. It is unknown what causes this chromosome translocation or how this process contributes to Burkitt lymphoma and other cancers such as leukemia.

Chronic myeloid leukemia and the Philadelphia chromosome (fusion protein)

One additional and well-characterized translocation has been associated with chronic myeloid leukemia (CML). CML is a cancer of blood cells, characterized by replacement of the bone marrow with malignant, leukemic cells. Many of these leukemic cells can be found circulating in the blood and can cause enlargement of the spleen, liver and other organs. CML is usually diagnosed by finding a specific chromosomal abnormality called the Philadelphia (Ph) chromosome, named after the city where it was first recorded. The Ph chromosome is the result of a translocation or exchange of genetic material between the long arms of chromosomes 9 and 22. This exchange brings together two genes: the BCR (breakpoint cluster region) gene on chromosome 22 and the proto-oncogene ABL (Abelson leukemia virus) on chromosome 9. The resulting hybrid gene BCR-ABL codes for a fusion protein with tyrosine kinase activity, which activates signal transduction pathways, leading to uncontrolled cell growth. A mouse model has been created which develops a CML-like disease when given bone marrow cells infected with a virus containing the BCR-ABL gene. In other animal models, the fusion proteins have been shown to transform normal blood precursor cells to malignant cells. To research the human disease, antisense oligomers (short DNA segments) which block BCR-ABL have been developed; these specifically suppress the formation of leukemic cells while not affecting normal bone marrow cell development. Thus, gene-fusion products present unique opportunities for detection and prognosis, as well as unique therapeutic targets that may lead to treatments more specific for tumor cells over normal cells.

Pancreatic cancer — loss of DPC4 on chromosome 18

Another example of a chromosomal deletion that is associated with a specific type of cancer involves the tumor suppressor gene DPC4. The pancreas is responsible for producing the hormone insulin, along with other substances. It also plays a key role in the digestion of protein. About 90% of human pancreatic carcinomas show a loss of part of chromosome 18. The tumor suppressor gene, DPC4 (Smad4), which is normally found on the part of chromosome 18 that is lost in pancreatic cancer, may play a role in pancreatic cancer development. There is a whole family of Smad proteins in vertebrates, all involved in signal transduction of transforming growth factor-beta (TGF-beta) related pathways.

Ataxia Telangiectasia

Ataxia telangiectasia (AT) is a human autosomal recessive multisystem disorder comprising progressive cerebellar ataxia with onset in infancy, progressive oculocutaneous telangiectasia, and unusual susceptibility to progressive bronchopulmonary disease and to lymphoreticular neoplasia (reviewed in Boder, E., (1987) Ataxia-Telangiectasia, pp. 95-117 in Gomez, M.R. and Adams, R.D., Eds., Neurocutaneous Disease. A Practical Approach. Butterworths, Boston). AT patients appear to have two separate clinical patterns of malignancy. One pattern involves solid tumors and includes malignancies in the oral cavity, breast, stomach, pancreas, ovary, and bladder. The second pattern of neoplasia in AT consists of lymphocytic leukemia and non-Hodgkin's lymphoma (Hecht and Hecht (1990) Cancer Genet. Cytogenet. 46: 9-19). In addition, AT generally includes other progressive neurologic degenerations, such as choreoathetosis and oculomotor dysfunction, recurrent sinopulmonary infections secondary to immunodeficiency, lymphoreticular malignancies, growth retardation, incomplete sexual maturation, endocrine abnormalities, and premature aging of the skin and hair (reviewed in Barlow, C, et al., (1996) Cell. 86:159-171). The disease is progressive, and death generally occurs by the second or third decade of life due to neurologic deterioration or lymphoreticular malignancies (Taylor, A.M., et al. (1996) Blood. 87:423-438). AT homozygotes have an approximately 250-700 fold increased risk of developing leukemia and lymphoblastic lymphomas. No effective treatments have been found to alter the course of the disease. The frequency of AT in the United States and Britain has been estimated to be between 1:40,000 and 1:100,000, resulting in a carrier frequency of 0.5% - 1.0% (Barlow et al, 1996). Heterozygous carriers may also have a predisposition to cancer, particularly breast cancer.
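The relationship between disease frequency and carrier frequency quoted above can be illustrated with a short calculation. The sketch below assumes Hardy-Weinberg equilibrium for a single autosomal recessive locus; that assumption, and the function name `carrier_frequency`, are illustrative and are not taken from the cited reference. Under this assumption the computed carrier frequencies fall roughly within the stated 0.5% - 1.0% range.

```python
import math


def carrier_frequency(disease_incidence: float) -> float:
    """Heterozygote (carrier) frequency 2pq for an autosomal recessive disorder.

    Assumes Hardy-Weinberg equilibrium (an illustrative assumption):
    disease incidence = q**2, so q = sqrt(incidence) and p = 1 - q.
    """
    if not 0.0 < disease_incidence < 1.0:
        raise ValueError("disease_incidence must be between 0 and 1")
    q = math.sqrt(disease_incidence)  # frequency of the disease allele
    p = 1.0 - q                       # frequency of the normal allele
    return 2.0 * p * q


if __name__ == "__main__":
    # Incidence bounds quoted in the text: 1:40,000 and 1:100,000.
    for one_in in (40_000, 100_000):
        freq = carrier_frequency(1.0 / one_in)
        print(f"incidence 1:{one_in:,} -> carrier frequency {freq:.4f}")
```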

AT has been mapped to human chromosome position 11q22-23, and the responsible gene (designated ATM, "AT mutated") has been defined by positional cloning. ATM is a key protein for managing cell cycle perturbations in response to DNA damage and plays a role in genetic stability and cancer susceptibility. A defective copy of both ATM genes, as found in AT patients, results in spontaneous chromosomal loss and genetic instability, which could result in inappropriate replication of damaged DNA and contribute to the malignant transformation of cells. Available evidence indicates that ATM may be a critical regulator of many important cellular processes and thus has potential implications for cancer in the general population. Notably, many studies have indicated that loss of ATM function is related to cancer predisposition in AT heterozygotes. In particular, it has been found that women heterozygous for one mutated allele of ATM have a three- to five-fold increase in breast cancer risk. Importantly, it has also been recently established that patients who have loss of heterozygosity at chromosome position 11q22, and specifically have loss of ATM function, develop particularly aggressive forms of both B and T cell lymphomas. The specific increase in lymphoid tumors in AT patients suggests that the abnormal production of DNA strand breaks and chromosomal rearrangement, in conjunction with lack of repair, may be a rate-limiting step in lymphoid tumorigenesis. These studies suggest that ATM is a tumor suppressor gene whose inactivation is a key event in the development of many forms of cancer, particularly in T-cell prolymphocytic leukemias and aggressive forms of B cell chronic lymphocytic leukemia in patients who do not have AT.

The mouse homolog, Atm, has also been identified. Atm shows an 84% amino acid identity and a 91% similarity with ATM. Sequence comparisons revealed that both ATM and Atm are members of a family of genes involved at several stages in DNA double-strand break repair and cell cycle control. Atm-deficient mice (Atm-/-) have proven an excellent model for studying the cancer phenotype as they invariably succumb to an aggressive T-cell lymphoblastic lymphoma between two and six months of age (Taylor, A.M., et al, (1996) Blood. 87:423-438). These tumors harbor gene rearrangements of Ig and TCR gene families, which suggest that loss of Atm results in the inability of T-cells to manage DNA double-strand breaks, leading to chromosome aberrations and tumors. The Atm-/- mice are an ideal model for characterizing the genes involved in cancer formation. Several T-cell lymphomas from multiple different Atm deficient animals have been successfully established and characterized extensively at the cellular and cytogenetic level (Taylor, A.M., et al, 1996). Translocations at the TCRα locus were found in all tumor cell lines established from these mice. Therefore, it is likely that the tumors arise due to aberrant V(D)J rearrangements of the TCRα locus as seen in human cancers. In addition, specific regions of chromosome 12 were affected, which are syntenic to regions on human chromosomes known to be frequently mutated in human hematopoietic cancer (Narducci, et al, 1997; Rabbitts, 1998). Thus, the characterization of the Atm deficient tumor cell lines will offer an opportunity to precisely characterize the genes involved in the multistep process of cancer formation not only in mice, but also in humans.

What is needed, therefore, is the identification of genes that are dysregulated during AT tumorigenesis as a direct result of specific chromosome aberrations, as well as identification of genes associated with AT, AT tumors, or other cancers. Such molecules are important biomarkers for tumor formation (diagnosis) and progression that may be useful for characterizing the genes involved in the multi-step process of cancer formation, as well as potential markers for monitoring treatment and prevention of cancer.

SUMMARY OF THE INVENTION

Despite advances in recent years, the precise etiology and pathogenesis of AT, AT tumors, or other cancers remain undefined. In order to better understand the mechanistic basis of AT, AT tumors, or other cancers, much effort has been directed towards the discovery and development of various animal models of AT, AT tumors, or other cancers. In particular, a newly identified gene-fusion transcript that is associated with a biological phenotype, such as invasiveness or drug resistance, could be a particularly important biomarker. Such dysregulated genes may be suitable biomarkers not only for AT specific tumors, but also for other human cancers, such as other hematopoietic cancers that are frequently known to be mutated in the same chromosome regions.

The TOtal Gene expression Analysis (TOGA®) method, described in Sutcliffe et al., Proc. Natl. Acad. Sci. USA 97(5): 1976-81 (2000), WO 00/26406, U.S. application serial no. 09/775,217, PCT/US02/02666, U.S. Patent No. 5,459,037, U.S. Patent No. 5,807,680, U.S. Patent No. 6,030,784, U.S. Patent No. 6,096,503, U.S. Patent No. 6,110,680, and U.S. Patent No. 6,309,834, all of which are incorporated herein by reference, is a tool used to identify and analyze polynucleotide expression associated with AT, AT tumors, or other cancers. The TOGA® method is an improved method for the simultaneous sequence-specific identification of mRNAs in an mRNA population which allows the visualization of nearly every mRNA expressed by a tissue as a distinct band on a gel whose intensity corresponds roughly to the concentration of the mRNA. The method can identify changes in expression of mRNA associated with the administration of drugs or with physiological or pathological conditions such as AT, AT tumors, or other cancers.

TOGA was used to identify genes that were aberrantly expressed during AT tumorigenesis in a set of four tumor cell lines derived from Atm deficient mice, and these genes were correlated with specific chromosome aberrations. These cell lines were chosen for study because of the demonstrated correlation between chromosomal aberrations in the mouse model and those of the human disease. Three of the tumor cell lines had similar cytogenetic profiles (AT-4, AT-7, and AT-13) and one tumor line had several unique chromosomal aberrations (AT-12). There appears to be a good correlation between chromosome aberrations and the number of dysregulated genes, as AT-12 not only has the most chromosome aberrations but also demonstrated the largest number of genes differentially expressed by 2-fold or greater compared to each of the other tumor cell lines. Tumor cell lines from these mice are an ideal model for determining how abnormal regulation of genes gives rise to human lymphoma and leukemia. Genes that are abnormally regulated among AT cell lines may be uniquely associated with AT tumorigenesis (as compared to other mechanisms of tumorigenesis) due to genetic abnormalities caused by AT deficiency. These may include phenomena such as translocations of lymphoid specific promoters or other regulatory sequences in or near genes promoting cell growth, or inactivation or deletion of genes that may regulate or suppress lymphoid tumor growth. Thus, the examples shown here were also chosen in part on the basis of their potential location in or near chromosomal aberrations among the AT cell lines studied. The genes identified in these examples may relate to specific biological aspects found both in the mouse model and in human AT tumorigenesis (e.g., the development of particularly aggressive forms of lymphomas).
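The 2-fold criterion described above can be sketched as a simple filter over per-cell-line expression values. The following is a minimal illustration only: the function names, the data layout, and the intensity values are hypothetical and are not data or software from this disclosure. It selects the genes whose intensity in one cell line differs by at least 2-fold from that in each of the other cell lines.

```python
from typing import Dict, List

Expression = Dict[str, Dict[str, float]]  # gene -> {cell line -> intensity}


def fold_change(a: float, b: float) -> float:
    """Symmetric fold change (always >= 1) between two positive intensities."""
    if a <= 0 or b <= 0:
        raise ValueError("intensities must be positive")
    return max(a, b) / min(a, b)


def differential_genes(expr: Expression, line: str, threshold: float = 2.0) -> List[str]:
    """Genes whose intensity in `line` differs by >= threshold-fold from EVERY other line."""
    hits = []
    for gene, by_line in expr.items():
        others = [v for name, v in by_line.items() if name != line]
        if others and all(fold_change(by_line[line], v) >= threshold for v in others):
            hits.append(gene)
    return hits


# Hypothetical band intensities in arbitrary units (NOT data from the source).
EXAMPLE: Expression = {
    "gene_a": {"AT-4": 100.0, "AT-7": 110.0, "AT-12": 450.0, "AT-13": 95.0},
    "gene_b": {"AT-4": 80.0, "AT-7": 85.0, "AT-12": 30.0, "AT-13": 90.0},
    "gene_c": {"AT-4": 50.0, "AT-7": 55.0, "AT-12": 60.0, "AT-13": 52.0},
}

if __name__ == "__main__":
    for cell_line in ("AT-4", "AT-7", "AT-12", "AT-13"):
        print(cell_line, differential_genes(EXAMPLE, cell_line))
```

Counting the length of each returned list per cell line gives the kind of tally that is compared with the number of chromosome aberrations in the text.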

Such molecules are useful in the therapy of cancer, including treatment, prevention, and amelioration of cancer, and in the diagnosis of cancer. In addition, such dysregulated genes may be suitable biomarkers not only for AT specific tumors, but also for the study of other human cancers, such as other hematopoietic cancers that are frequently known to be mutated in the same chromosome regions. Further, this approach of correlating changes in gene expression to genomic changes (i.e., chromosome aberrations) could be applied to a number of different diseases.

In particular, it has been found that because of the close correspondence between human ATM and mouse Atm genes, the tumors that develop in Atm-deficient mice provide the ability to specifically characterize the genes involved in cancer formation as a model of particular forms of human hematopoietic cancer. Cancer cell lines established from several T-cell lymphomas from multiple different Atm deficient animals are useful for the study of human cancers, as well as for the development of suitable diagnostic and therapeutic compositions and methods. An extensive cellular analysis of these tumors has been performed and shows that they invariably arise at a stage of development when the TCR undergoes rearrangement. It has been specifically found that the translocations occur at the TCRα locus in ten of ten tumor cell lines established from these mice.

Therefore, it is likely that the tumors arise due to aberrant V(D)J rearrangements of the TCRα locus as seen in human cancers. In addition, it has been specifically found that regions of chromosome 12 are also involved. These regions are syntenic to the regions on human chromosomes known to be frequently mutated in human hematopoietic cancer. The Atm deficient tumor cell lines disclosed herein are useful for characterizing the genes involved in the multi-step process of cancer formation.

The present invention associates previously known and novel polynucleotides, their corresponding genes and regions thereof and their encoded polypeptides to AT, AT tumors, or other cancers such that the polynucleotides, polypeptides, genes and regions thereof can be useful for diagnosis and treatment of AT, AT tumors, or other cancers. Some embodiments of the invention provide methods for preventing, treating, modulating, or ameliorating a medical condition, such as AT, AT tumors, or other cancers comprising administering to a mammalian subject a therapeutically effective amount of at least one polypeptide of the invention, at least one polynucleotide of the invention, at least one gene of the invention, or a region thereof. A preferred embodiment of the invention provides a method for preventing, treating, modulating, or ameliorating a medical condition, such as AT, AT tumors, or other cancers, comprising administering to a mammalian subject a therapeutically effective amount of an antibody that binds specifically to a polypeptide of the invention.

Additional embodiments of the invention provide a method for using a polynucleotide of the invention, a polypeptide of the invention, an antibody of the invention, or a gene of the invention or a region thereof for the manufacture of a medicament useful in the treatment of AT, AT tumors, or other cancers. An additional embodiment of the invention provides a method of diagnosing a pathological condition or a susceptibility to a pathological condition in a subject. The method comprises determining the presence or absence of a mutation in a polynucleotide or gene of the invention or a region thereof. A pathological condition or a susceptibility to a pathological condition, such as AT, AT tumors, or other cancers is diagnosed based on the presence or absence of the mutation.

Even other embodiments of the invention provide methods of diagnosing a pathological condition or a susceptibility to a pathological condition, such as AT, AT tumors, or other cancers in a subject. The methods comprise detecting an alteration in expression of a polynucleotide, gene or region thereof, or a polypeptide encoded by the polynucleotide or gene of the invention, wherein the presence of an alteration in expression of the polypeptide is indicative of the pathological condition or susceptibility to the pathological condition. The alteration in expression can be an increase in the amount of expression or a decrease in the amount of expression. In a preferred embodiment, a first biological sample is obtained from a patient suspected of having AT, AT tumors, or other cancers and a second sample from a suitable comparable control source is obtained. The amount of at least one polypeptide, polynucleotide or gene of the invention or a region thereof is determined in the first and second sample. A patient is diagnosed as having AT, AT tumors, or other cancers if the amount of the polypeptide, polynucleotide or gene or region thereof in the first sample is greater than or less than the amount of the polypeptide, polynucleotide or gene or region thereof in the second sample.
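The two-sample comparison described above can be sketched as follows. This is a hedged illustration rather than a method defined in this disclosure: the function name `classify_alteration` and the `min_fold` cutoff are assumptions introduced here, since the text only requires that the amount in the first sample be greater than or less than the amount in the second sample.

```python
def classify_alteration(patient_amount: float, control_amount: float,
                        min_fold: float = 2.0) -> str:
    """Compare a marker amount in a patient sample against a control sample.

    Returns "increased", "decreased", or "unchanged". The `min_fold`
    cutoff is a hypothetical tolerance for measurement noise; the source
    text does not specify any threshold.
    """
    if patient_amount < 0 or control_amount <= 0:
        raise ValueError("amounts must be non-negative, and the control must be positive")
    if min_fold < 1.0:
        raise ValueError("min_fold must be at least 1")
    ratio = patient_amount / control_amount
    if ratio >= min_fold:
        return "increased"
    if ratio <= 1.0 / min_fold:
        return "decreased"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical amounts in arbitrary units.
    print(classify_alteration(300.0, 100.0))
    print(classify_alteration(40.0, 100.0))
    print(classify_alteration(110.0, 100.0))
```

Setting `min_fold` to 1.0 reduces the rule to a plain greater-than or less-than comparison, at the cost of treating any measurement noise as an alteration.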

Where a polynucleotide or gene of the invention is down-regulated and is associated with a pathological condition, such as AT, AT tumors, or other cancers, the expression of the polynucleotide or gene can be increased or the level of the intact polypeptide product can be increased in order to treat, prevent, ameliorate, or modulate the pathological condition. This can be accomplished by, for example, administering a polynucleotide, gene, or polypeptide of the invention to the mammalian subject. For example, FX-induced thymoma transcript (SEQ ID NO:78) was down-regulated in the AT-4 tumor cell line. This transcript has been previously observed to show differential mRNA expression in other thymomas compared with normal thymus tissue (Pampeno CL, Meruelo D., (1996) Cell Growth Differ 8:1113-23) and has been postulated to play a role in the processes of T-cell differentiation and regeneration in addition to tumorigenesis. Therefore, it is possible that up-regulation of this transcript in tumor cell lines could inhibit tumor cell proliferation and promote differentiation, thus halting the tumorigenesis process.

Where a polynucleotide or gene of the invention is up-regulated and is associated with a pathological condition in a mammalian subject, such as AT, AT tumors, or other cancers, the expression of the polynucleotide or gene can be blocked or reduced or the level of the intact polypeptide product can be reduced in order to treat, prevent, ameliorate, or modulate the pathological condition. For example, the DST ACGG 458 (SEQ ID NO: 9), which was found to be a granzyme C-granzyme B gene fusion product (SEQ ID NOs: 70, 71-74), was highly expressed in the tumor cell line AT-7 and also up-regulated in tumor cell line AT-13. Further analysis by Northern found it to be also expressed in AT-10. The presence of this gene fusion product in multiple thymomas is suggestive of a common mechanism, such as its importance in tumor formation. As the granzyme family of proteins is involved in the degradation of the extracellular matrix, these proteins, if dysregulated, could play an important role in increased invasiveness (i.e., increased metastatic potential) of tumor cells. Therefore, inhibition of the expression of this aberrant transcript could decrease the ability of tumor cells to metastasize, and therefore prevent spreading of the cancer. Inhibition of metastasis would not only protect normal tissue, but also extend the lifetime and improve the prognosis of the patient. This alteration in expression can be accomplished, for example, by administering an inhibitor of the polynucleotide, gene, or polypeptide of the invention to a mammalian subject such as through the use of antisense oligonucleotides, triple helix base pairing methodology or ribozymes. Alternatively, drugs or antibodies that bind to and inactivate the polypeptide product or otherwise interfere with the activity of the polypeptide in the disease state can be used.

Yet other embodiments of the invention involve assessing the stage of AT, AT tumors, or other cancers by testing for regulation of at least one polynucleotide, polypeptide, antibody or gene of the invention or a region thereof. Further embodiments of the invention involve assessing the efficacy or toxicity of a therapeutic treatment for AT, AT tumors, or other cancers by testing for regulation of at least one polynucleotide, polypeptide, antibody or gene of the invention or a region thereof.

Another embodiment of the present invention provides a method of using a polynucleotide, polypeptide, antibody or gene of the invention or a region thereof for delivering to a patient in need thereof, genes, DNA vaccines, diagnostic reagents, peptides, proteins or macromolecules. Another embodiment of the invention provides a method of using a polypeptide or antibody of the invention to identify a binding partner to a polypeptide of the invention. In a preferred embodiment, a polypeptide of the invention is contacted with a binding partner and it is determined whether the binding partner affects an activity of the polypeptide.

Still another embodiment of the invention provides a substantially pure isolated DNA molecule suitable for use as a probe for genes regulated in AT, AT tumors, or other cancers, chosen from the group consisting of the DNA molecules shown in SEQ ID NO: 1-74, or their corresponding genes or regions thereof, or DNA molecules at least 95% similar to one of the foregoing molecules.

Even another embodiment of the invention provides a kit for detecting the presence of a polypeptide of the invention in a mammalian tissue sample. In one embodiment, the kit comprises a first antibody that immunoreacts with a mammalian protein encoded by a gene corresponding to the polynucleotide of the invention or with a polypeptide encoded by the polynucleotide in an amount sufficient for at least one assay and suitable packaging material. The kit can further comprise a second antibody that binds to the first antibody. The second antibody can be labeled with enzymes, radioisotopes, fluorescent compounds, colloidal metals, chemiluminescent compounds, phosphorescent compounds, bioluminescent compounds, or with an organic moiety, such as biotin.

Another embodiment of the invention provides a kit for detecting the presence of genes encoding a protein comprising a polynucleotide of the invention, or fragment thereof having at least 10 contiguous bases, in an amount sufficient for at least one assay, and suitable packaging material.

An additional embodiment of the invention involves a method for identifying biomolecules associated with AT, AT tumors, or other cancers comprising the steps of: developing a cellular experiment specific for AT, AT tumors, or other cancers, harvesting the RNA from the cells used in the experiment, obtaining a gene expression profile, and using the gene expression profile for identifying biomolecules whose expression was altered during the experiment. The biomolecules identified may be polynucleotides, polypeptides or genes.

Yet another embodiment of the invention provides a method for detecting the presence of a nucleic acid encoding a protein in a mammalian tissue sample. A polynucleotide or gene of the invention or fragment thereof having at least 10 contiguous bases is hybridized with the nucleic acid of the sample. The presence of the hybridization product is detected.
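
As a rough computational analogy to the hybridization step described above, the following sketch checks whether a probe contains a run of at least 10 contiguous bases that is exactly complementary to a target sequence. Exact complementarity is an assumption made for simplicity; actual hybridization depends on the conditions used and can tolerate mismatches. The function names and the sequences are hypothetical.

```python
# Translation table for DNA complementation (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def has_complementary_run(probe, target, min_len=10):
    """True if some stretch of min_len contiguous probe bases is exactly
    complementary to a stretch of the target (a crude stand-in for
    hybridization, assumed here for illustration only)."""
    rc = reverse_complement(probe).upper()
    target = target.upper()
    for i in range(len(rc) - min_len + 1):
        if rc[i:i + min_len] in target:
            return True
    return False


if __name__ == "__main__":
    target = "TTGACCGATTGCAAGGCTTA"                # made-up target sequence
    probe = reverse_complement(target[3:15])       # 12-base complementary probe
    print(has_complementary_run(probe, target))
```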

Still another embodiment of the present invention provides a method for predicting whether a subject afflicted with a pathological condition is likely to respond favorably to a treatment prior to administration of the treatment to the subject. The method comprises the steps of: (a) obtaining a sample from the subject; (b) determining a level of expression of at least one of (i) a first polynucleotide selected from the group consisting of SEQ ID NOs: 1-74; (ii) a second polynucleotide at least 95% identical to the first polynucleotide; (iii) a gene corresponding to any of the foregoing polynucleotides, another gene at least 95% identical to the gene, or a region of any of the foregoing genes; (iv) a first polypeptide encoded by a polynucleotide selected from the group consisting of SEQ ID NOs: 1-74; (v) a second polypeptide encoded by a gene corresponding to any of the foregoing polynucleotides or another gene at least 95% identical to said gene; (vi) a third polypeptide at least 90% identical to one of the foregoing polypeptides; or (vii) a fragment of one of the foregoing polypeptides; and (c) comparing the level of expression to a database comprising expression patterns from patients previously given the treatment, wherein a similar level of expression from the subject as compared to the level of expression from the database of patients that responded favorably to the treatment predicts that the subject will respond favorably to the treatment and wherein the pathological condition is selected from the group consisting of AT, AT tumors and other cancers.
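
One possible way to carry out the comparison of step (c) is sketched below, purely as an illustration. The sketch assumes that each expression pattern is a list of expression levels measured for the same markers in the same order, and that similarity is judged by Euclidean distance to the mean pattern of each patient group. Neither assumption, nor any of the names or values used, is taken from the present disclosure.

```python
import math


def mean_profile(profiles):
    """Average a list of equal-length expression profiles marker by marker."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def distance(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_response(subject, responders, non_responders):
    """Assign the subject to whichever group's mean profile is closer.

    The nearest-mean rule is an assumed similarity measure, used only
    to make the comparison step concrete.
    """
    d_resp = distance(subject, mean_profile(responders))
    d_non = distance(subject, mean_profile(non_responders))
    return "likely responder" if d_resp < d_non else "likely non-responder"


if __name__ == "__main__":
    # Made-up two-marker profiles in arbitrary units.
    responders = [[5.0, 1.0], [6.0, 2.0]]
    non_responders = [[1.0, 5.0], [2.0, 6.0]]
    print(predict_response([5.0, 2.0], responders, non_responders))
```

The same comparison could be applied, with a different reference database, to the prediction of metastatic potential or the determination of tumor stage.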

Still another embodiment of the present invention provides a method for predicting a metastatic potential, remission or regression of a tumor in a subject. The method comprises the steps of: (a) obtaining a sample from the subject; (b) determining a level of expression of at least one of: (i) a first polynucleotide selected from the group consisting of SEQ ID NOs: 1-74; (ii) a second polynucleotide at least 95% identical to the first polynucleotide; (iii) a gene corresponding to any of the foregoing polynucleotides, another gene at least 95% identical to the gene, or a region of any of the foregoing genes; (iv) a first polypeptide encoded by a polynucleotide selected from the group consisting of SEQ ID NOs: 1-74; (v) a second polypeptide encoded by a gene corresponding to any of the foregoing polynucleotides or another gene at least 95% identical to said gene; (vi) a third polypeptide at least 90% identical to one of the foregoing polypeptides; or (vii) a fragment of one of the foregoing polypeptides; and (c) comparing the level of expression to a database comprising levels of expression patterns correlated with metastatic potential, remission or regression from patients having tumors, wherein a similar level of expression from the subject as compared to the level of expression from the database predicts the metastatic potential, remission or regression of the tumor in the subject.

Yet another embodiment of the present invention provides a method for determining a stage of tumor progression in a subject, comprising the steps of: (a) obtaining a sample from the subject; (b) determining a level of expression of at least one of: (i) a first polynucleotide selected from the group consisting of SEQ ID NOs: 1-74; (ii) a second polynucleotide at least 95% identical to the first polynucleotide; (iii) a gene corresponding to any of the foregoing polynucleotides, another gene at least 95% identical to the gene, or a region of any of the foregoing genes; (iv) a first polypeptide encoded by a polynucleotide selected from the group consisting of SEQ ID NOs: 1-74; (v) a second polypeptide encoded by a gene corresponding to any of the foregoing polynucleotides or another gene at least 95% identical to said gene; (vi) a third polypeptide at least 90% identical to one of the foregoing polypeptides; or (vii) a fragment of one of the foregoing polypeptides; and (c) comparing the level of expression to a database comprising expression patterns correlated with the stage of tumor progression from patients previously diagnosed with tumors, wherein a similar level of expression from the subject as compared to the level of expression from the database of patients previously diagnosed with tumors determines the stage of tumor progression in the subject. 
An additional embodiment of the present invention provides a method for identifying genes associated with a chromosomal aberration in AT and AT tumors, the method comprising the steps of: (a) obtaining a sample from a subject; (b) isolating RNA from the sample; (c) performing TOGA on the isolated RNA; (d) identifying genes that are differentially regulated based on TOGA; and (e) determining whether the differentially regulated genes comprise a genetic marker at a site of the chromosomal aberration, wherein the chromosomal aberration is selected from the group consisting of chromosome breaks, amplifications, deletions and translocations.
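
The final determining step of this method can be illustrated with a minimal sketch that tests whether the chromosomal location of each differentially regulated gene overlaps a known aberration site. The data structure, the gene names and all coordinates below are hypothetical and serve only to show the logic of the comparison.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """A chromosomal interval (hypothetical coordinate system)."""
    chromosome: str
    start: int
    end: int


def overlaps(a, b):
    """True if two regions lie on the same chromosome and intersect."""
    return a.chromosome == b.chromosome and a.start <= b.end and b.start <= a.end


def genes_at_aberrations(gene_locations, aberrations):
    """Return, in sorted order, the genes whose location overlaps any aberration site."""
    return sorted(
        gene
        for gene, location in gene_locations.items()
        if any(overlaps(location, site) for site in aberrations)
    )


if __name__ == "__main__":
    # Made-up gene locations and aberration sites.
    genes = {
        "geneA": Region("chr14", 100, 200),
        "geneB": Region("chr9", 500, 650),
        "geneC": Region("chr12", 10, 40),
    }
    sites = [Region("chr14", 150, 400), Region("chr9", 700, 900)]
    print(genes_at_aberrations(genes, sites))
```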

Additionally, the present invention provides novel polynucleotides, genes and their encoded polypeptides. One embodiment of the invention provides an isolated nucleic acid molecule comprising a fusion polynucleotide of SEQ ID NOs:70-74. Also provided is an isolated nucleic acid molecule comprising a polynucleotide at least 95% identical to any one of the isolated nucleic acid molecules of the invention, an isolated nucleic acid molecule at least ten bases in length that is hybridizable to any one of the isolated nucleic acid molecules of the invention under stringent conditions, and an isolated nucleic acid molecule that is a homolog, ortholog, or paralog of any one of the isolated nucleic acid molecules of the invention. Any one of the isolated nucleic acid molecules of the invention can comprise sequential nucleotide deletions from either the 5'-terminus or the 3'-terminus. Also provided is the gene corresponding to the cDNA sequence of any one of the isolated nucleic acids of the invention, an isolated nucleic acid molecule hybridizable to such gene under stringent conditions, and an isolated nucleic acid molecule or gene that is a homolog, paralog or ortholog of such gene.

Another embodiment of the invention provides an isolated or purified fusion polypeptide encoded by a fusion polynucleotide of SEQ ID NOs: 70-74, a polynucleotide at least 95% identical to said polynucleotide or a gene corresponding to one of the foregoing polynucleotides and the complements and degenerate variants thereof. Also provided is an isolated or purified polypeptide 90% identical to one of the foregoing polypeptides, a fragment of one of the foregoing polypeptides, and the homologs, paralogs, and orthologs of the foregoing polypeptides. Also provided is an isolated nucleic acid molecule or gene encoding any of the polypeptides or polypeptide fragments of the invention. Optionally, any one of the isolated polypeptides of the invention comprises sequential amino acid deletions from either the C-terminus or the N-terminus.

Yet another embodiment of the invention comprises an isolated antibody that binds specifically to an isolated fusion polypeptide encoded by a fusion polynucleotide of SEQ ID NOs:70-74, a polynucleotide at least 95% identical to said polynucleotide or a gene corresponding to one of the foregoing polynucleotides and the complements and degenerate variants thereof.

The isolated antibody can be a monoclonal antibody or a polyclonal antibody. The foregoing merely summarizes certain aspects of the invention and is not intended, nor should it be construed as limiting the invention in any way.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other features, aspects, and advantages of the present invention will become better understood with reference to the following description, appended claims, and accompanying drawings where:

Figure 1 is a graphical representation of the results of TOGA^® runs using a 5' PCR primer with parsing bases CCGT (SEQ ID NO:86) and the universal 3' PCR primer (SEQ ID NO:78) showing PCR products produced from mRNA extracted from tumor cell lines AT-4 (Fig. 1A), AT-7 (Fig. 1B), AT-12 (Fig. 1C), and AT-13 (Fig. 1D), where the vertical index line indicates a PCR product of about 151 b.p. that is expressed to a greater level in the tumor AT-13 cell line than in the other samples. The horizontal axis represents the number of base pairs of the molecules in these samples and the vertical axis represents the fluorescence measurement in the TOGA^® analysis (which corresponds to the relative expression of the molecule of that address). The results of the TOGA^® runs have been normalized using the methods described in pending U.S. Patent Application Serial No. 09/318,699/U.S., and PCT Application Serial No. PCT/US00/14159, both entitled Methods and System for Amplitude Normalization and Selection of Data Peaks (Dennis Grace, Jayson Durham); and U.S. Patent 6,334,099, PCT Application Serial No. PCT/US00/14123 and pending U.S. Patent Application Serial Nos. 09/940,987/U.S., 09/940,581/U.S., 09/940,746/U.S., all entitled Methods for Normalization of Experimental Data (Dennis Grace, Jayson Durham), all of which are incorporated herein by reference. The vertical line drawn through the four panels represents the DST molecule identified as BAR1_11 (SEQ ID NO:3).

Figure 2 presents a graphical example of the results obtained when a DST is verified by the Extended TOGA^® method using a primer generated from a cloned product (as described below). The PCR product corresponding to SEQ ID NO:3 (BAR1_11) was cloned and a 5' PCR primer was built from the cloned DST (SEQ ID NO: 87). The product obtained from PCR with this primer (SEQ ID NO: 87) and the universal 3' PCR primer (SEQ ID NO:78) (as shown in Panel A) was compared to the length of the original PCR product that was produced in the TOGA^® reaction with mRNA extracted from the AT-13 cell line using a 5' PCR primer with parsing bases CCGT (SEQ ID NO:86) and the universal 3' PCR primer (SEQ ID NO:78) (as shown in Panel B). Again, for all panels, the number of base pairs is shown on the horizontal axis, and fluorescence intensity (which corresponds to relative expression) is found on the vertical axis. In Panel C, the traces from Panel A and Panel B are overlaid, demonstrating that the peak found using an extended primer from the cloned DST is the same number of base pairs as the original PCR product obtained through TOGA^® as BAR1_11 (SEQ ID NO:3). Panel C thus illustrates that BAR1_11 (SEQ ID NO:3) was the DST amplified in Extended TOGA^®.

Figure 3 shows a Northern blot using a radiolabeled probe derived from the candidate gene soluble lectin (Mac-2; GenBank Accession Number L08649), which corresponds to DST BAR1_11 (SEQ ID NO: 3) and whose human homolog has been mapped to 14q21-q22, a region syntenic to mouse chromosomes 12 and 14. The image in Panel 3A shows hybridization of the probe with a band of mRNA that is expressed at various levels in cell lines AT-4, AT-7, AT-13, and poly A⁺ (pA) thymus mRNA, but not detectable in cell lines AT-10, AT-11, AT-12, APT-3, P53-1, 101-7, and 292-3. The image in Panel 3B shows the methylene blue stained gel for quantification of mRNA loading. As predicted by TOGA^®, the most intense band is present in the AT-13 tumor cell line.

Figure 4 is a graphical representation of the results of TOGA^® analysis, similar to Figure 1, using a 5' PCR primer with parsing bases GCTG (SEQ ID NO: 88) and the universal 3' primer (SEQ ID NO: 78), showing PCR products produced from mRNA extracted from tumor cell lines AT-4 (Fig. 4A), AT-7 (Fig. 4B), AT-12 (Fig. 4C), and AT-13 (Fig. 4D), where the vertical index line indicates a PCR product of about 289 b.p. that is upregulated in the AT-4 tumor cell line. The vertical line drawn through the four panels represents the DST molecule identified as BAR1B_131 (SEQ ID NO:69).

Figure 5 shows a Northern blot using a radiolabeled probe derived from the candidate gene RhoC (GenBank Accession Number X80638) that corresponds to DST BAR1B_131 (SEQ ID NO: 69). The image in Panel 5A shows hybridization of the probe with a band of mRNA that is expressed at various levels in cell lines AT-4, AT-7, AT-13, and poly A⁺ (pA) thymus mRNA, but not detectable in cell lines AT-10, AT-11, AT-12, APT-3, P53-1, 101-7, and 292-3. The image in Panel 5B shows the methylene blue stained gel for quantification of mRNA loading. As predicted by TOGA^®, the most intense band is present in the AT-4 tumor cell line.

Figure 6 is a graphical representation of the results of TOGA^® analysis, similar to Figure 1, using a 5' PCR primer with parsing bases GCTG (SEQ ID NO: 89) and the universal 3' primer (SEQ ID NO: 78), showing PCR products produced from mRNA extracted from tumor cell lines AT-4 (Fig. 6A), AT-7 (Fig. 6B), AT-12 (Fig. 6C), and AT-13 (Fig. 6D), where the vertical index line indicates a PCR product of about 345 b.p. that is upregulated in the AT-12 cell line. The vertical line drawn through the four panels represents the DST molecule identified as BAR1B_79 (SEQ ID NO:32).

Figure 7 shows a Northern blot using a radiolabeled probe derived from the candidate gene ALK-1 (GenBank Accession Number Z31664) that corresponds to DST BAR1B_79 (SEQ ID NO: 32). The image in the upper panel shows hybridization of the probe with two bands of mRNA at 4.0 kb and 3.7 kb that are expressed at various levels in cell lines AT-4, AT-11, AT-12, AT-13, 292-3, and poly A⁺ (pA) thymus mRNA, but not detectable in cell lines AT-7, AT-10, APT-3, P53-1, and 101-7.

Figure 8 is a graphical representation of the results of TOGA^® analysis, similar to Figure 1, using a 5' PCR primer with parsing bases ACAT (SEQ ID NO: 90) and the universal 3' primer (SEQ ID NO: 78), showing PCR products produced from mRNA extracted from tumor cell lines AT-4 (Fig. 8A), AT-7 (Fig. 8B), AT-12 (Fig. 8C), and AT-13 (Fig. 8D), where the vertical index line indicates a PCR product of about 368 b.p. that is down-regulated in the AT-4 tumor cell line. The vertical line drawn through the four panels represents the DST molecule identified as BAR1B_125 (SEQ ID NO:63).

Figure 9 shows a Northern blot using a radiolabeled probe derived from the candidate gene FX-induced thymoma transcript (GenBank Accession Number U38252) that corresponds to DST BAR1B_125 (SEQ ID NO: 63). The image in Panel A shows hybridization with a band of mRNA at 4.1 kb that is expressed at various levels in cell lines AT-7, AT-10, AT-12, AT-13 and APT-3, but not detectable in cell lines AT-4, AT-11, P53-1, 101-7, 292-3 and polyA (pA) thymus mRNA. The image in Panel B shows the methylene blue stained gel for quantification of mRNA loading. Northern analysis confirmed the TOGA^® result indicating lack of expression of this mRNA in AT-4.

Figure 10 is a graphical representation of the results of TOGA^® analysis, similar to Figure 1, using a 5' PCR primer with parsing bases GAGC (SEQ ID NO: 91) and the universal 3' primer (SEQ ID NO: 78), showing PCR products produced from mRNA extracted from cell lines AT-4 (Fig. 10A), AT-7 (Fig. 10B), AT-12 (Fig. 10C), and AT-13 (Fig. 10D), where the vertical index line indicates a PCR product of about 197 b.p. that is down-regulated in the AT-12 tumor cell line. The vertical line drawn through the four panels represents the DST molecule identified as BAR1B_127 (SEQ ID NO:65).

Figure 11 shows a Northern blot using a radiolabeled probe derived from the candidate gene IAP-1 (GenBank Accession Number U88908) that corresponds to DST BAR1B_127 (SEQ ID NO: 65). The image in Panel A shows hybridization with a band of mRNA at 4.2 kb that is expressed at various levels in cell lines AT-4, AT-7, AT-10, AT-11, AT-13, APT-3, P53-1, 101-7, 292-3 and normal thymus, but not detectable in cell line AT-12; the image in Panel B shows the methylene blue stained gel for quantification of mRNA loading. The results of the TOGA^® analysis were confirmed by this Northern.

Figure 12 shows analysis by FISH of the candidate gene IAP-1. The results found with FISH confirmed in normal cells and in tumor cells AT-4, AT-7, AT-10, and AT-13 that the IAP-1 locus is localized to chromosome 9. As shown in the schematic of chromosome 9 (12A), IAP-1 is located toward the end or centromere of the chromosome. This location was positively confirmed by FISH (12B). FISH analysis of the AT-12 tumor line confirmed deletion of chromosome 9 and thus expression of IAP-1 could not be detected (data not shown).

Figure 13 is a graphical representation of the results of TOGA^® analysis, similar to Figure 1, using a 5' PCR primer with parsing bases ACGG (SEQ ID NO: 92) and the universal 3' primer (SEQ ID NO: 78), showing PCR products produced from mRNA extracted from tumor cell lines AT-4 (Fig. 13A), AT-7 (Fig. 13B), AT-12 (Fig. 13C), and AT-13 (Fig. 13D), where the vertical index line indicates a PCR product of about 458 b.p. that is up-regulated in the AT-7 and AT-13 tumor cell lines. The vertical line drawn through the four panels represents the DST molecule identified as BAR1_27 (SEQ ID NO:9).

Figure 14 shows a Northern blot using a radiolabeled probe derived from the candidate gene granzyme C (GrzmC) (GenBank Accession Number M18459) that corresponds to DST BAR1_27 (SEQ ID NO: 9). The image in the upper panel shows hybridization of two bands of mRNA at 1.7 kb and 1.1 kb. The 1.1 kb band, labeled as "normal" as it is the predicted size of granzyme C, was expressed at low levels in tumor cell lines AT-4, AT-7, AT-10, and polyA (pA) thymus. The upper 1.7 kb band, labeled as "alternate," was present only in tumor cell lines AT-7 and AT-10, and to a lesser extent AT-13 and APT-3. Neither transcript was detectable in cell lines AT-11, AT-12, P53-1, 101-7, and 292-3. The image in the lower panel shows the methylene blue stained gel for quantification of mRNA loading.

Figure 15 shows a schematic of chromosome 14, illustrating the close proximity of the granzyme family members (Grzb, Grzc, Grzd, Grze, Grzf, and Grzg) to the TCRalpha locus.

Figure 16 shows RT-PCR results confirming expression of the granzyme B-granzyme C gene fusion product in AT-7 and AT-13 tumor cell lines. Figure 16A shows RT-PCR products amplified with gene-specific 5' Granzyme B and 3' Granzyme C primers. cDNA template was synthesized from DNase-treated RNA prepared from murine C57 Black thymus and the following ATM tumor cell lines: AT-4, AT-7, AT-10, AT-12, AT-13, and APT-3. The left side of panel A shows PCR products amplified with the 5' primer Gzmb-II and 3' primer Gzmc-III. Asterisks in AT-7 and AT-13 indicate bands with the expected size of 869 bp for gene fusion product I. The right side of panel A displays PCR products amplified with 5' primer Gzmb-I and 3' primer Gzmc-III. Asterisks in AT-7 and AT-13 indicate bands with the expected size of 896 bp for gene fusion product I. Figure 16B shows a map of gene fusion product I with the relative locations of 5' Granzyme B primers Gzmb-I and Gzmb-II and 3' Granzyme C primer Gzmc-III. Primers I, II, and III (SEQ ID NOs: 98, 99, and 96) were used for PCR amplification experiments in panel 16A.

Figure 17 shows a map detailing two gene fusion products (gene fusion product 1; SEQ ID NO: 70 & gene fusion product 2; SEQ ID NOs: 71-74). The nucleotide positions of Granzyme B/Granzyme C fusions I and II are shown relative to wild-type Granzyme B (lower left corner) and Granzyme C (upper right corner) transcript sequences. Solid white boxes represent Granzyme C sequences and solid black boxes show Granzyme B sequences. Gray regions indicate overlapping cDNA sequences common to both Granzyme B and Granzyme C. Hatched regions represent 5' Granzyme B sequences that are likely to occur in the full-length fusion transcripts. The hatched regions are 5' to the Granzyme B primers used for RT-PCR, thus these upstream sequences were not observed in the amplification products analyzed.

Figure 18 represents a Northern blot using the gene fusion product 1 (SEQ ID NO: 70) as the radiolabeled probe. The image shows hybridization of one band of mRNA at the appropriate size in only tumor cell lines AT-7, AT-10, and APT-3. Consistent with previous results, no gene fusion product was detectable in cell lines AT-4, AT-12, AT-13, P53-1, 101-7, 292-3, and thymus.

Figure 19 shows by FISH analysis that, in normal cells as well as in tumor cells AT-4, AT-7, AT-10, AT-12 and AT-13, the granzyme C locus is adjacent to the TCRalpha delta locus (Figure 19A). In this study, the granzyme C probe was labeled with green fluorescence and the TCRalpha probe was labeled with red fluorescence, such that regions that co-localize are represented as yellow fluorescence. In the greyscale version of the image shown in Figure 19A, arrows are drawn to show the pair of duplicate spots appearing on a metaphase chromosome spread. FISH analysis of AT-7 cells with a granzyme C specific probe showed duplication events at the TCRalpha and granzyme C loci, and in some cases a translocation of granzyme C to chromosome 12, where it co-localized with Tcl-1 (Figure 19B). In other cases, translocations of 12;14 were not observed, but rather abnormalities on chromosome 14 were observed, suggesting that in these cases an aberrant fusion event occurred at this locus (Figure 19A).

Figure 20 shows the Western analysis of gene fusion product 1. Gene fusion protein 1 was inserted into an arabinose-inducible expression vector containing a histidine tag and was expressed by treating cells with increasing concentrations of arabinose (ara). Histidine-tagged gene fusion protein (Grzb/c-his) was detected using an anti-histidine antibody and was observed at the expected molecular weight of 25.8 kD. Histidine-tagged luciferase (luc-his) was also expressed as a positive control. As a negative control to demonstrate specificity, in the absence of ara (0%), neither the fusion protein nor the luc-his protein was expressed, and thus no band was observed. This study demonstrated that an intact, in-frame fusion protein could be expressed, which will be further characterized for invasiveness in various cell culture models.

Figure 21 is a graphical representation of the results of TOGA analysis, similar to Figure 1, using a 5' PCR primer with parsing bases CTCG (SEQ ID NO: 93) and the universal 3' primer (SEQ ID NO: 78), showing PCR products produced from mRNA extracted from tumor cell lines AT-4 (Fig. 21A), AT-7 (Fig. 21B), AT-12 (Fig. 21C), and AT-13 (Fig. 21D), where the vertical index line indicates a PCR product of about 310 b.p. that is up-regulated in the AT-7 and AT-13 tumor cell lines. The vertical line drawn through the four panels represents the DST molecule identified as BAR1_28 (SEQ ID NO: 10).

Figure 22 shows that the benzodiazepine receptor locus is approximately 10 cM from the c-myc locus (Figure 22A). Figure 22B is an example of Northern blot analysis using a radiolabeled probe derived from c-Myc, GAPDH (as a loading control), or the candidate gene benzodiazepine receptor (GenBank Accession Number D21207). Each panel represents hybridization of the respective probe with a band of mRNA that is expressed at various levels in cell lines AT-4, AT-7, AT-10, AT-11, AT-12, AT-13, APT-3, P53-1, 101-7, and 292-3.

Figure 23 summarizes the results from cDNA microarrays that were used to compare the expression profile of normal and Atm^{-/-} thymus at various time points prior to tumor development.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention and the methods of obtaining and using the present invention will be described in detail after setting forth some preliminary definitions, which are provided to facilitate understanding of certain terms used in the present invention. Many of the techniques described herein are described in Dracopoli et al., Current Protocols in Human Genetics, John Wiley and Sons, New York (1999), and Ausubel et al., Current Protocols in Molecular Biology, John Wiley and Sons, New York (2000), both of which are incorporated herein by reference.

Definitions

"Isolated" refers to material removed from its original environment (e.g., the natural environment if it is naturally occurring), and thus is altered "by the hand of man" from its natural state.

"Stringent hybridization conditions" refers to an overnight incubation at 42°C in a solution comprising 50% formamide, 5X SSC (5X SSC = 750 mM NaCl, 75 mM sodium citrate, 50 mM sodium phosphate pH 7.6), 5X Denhardt's solution, 10%> dextran sulfate, and 20 μg/ml denatured, sheared salmon sperm DNA, followed by washing the filters in 0.1 X SSC at about 65°C. Changes in the stringency of hybridization and signal detection are primarily accomplished through the manipulation of formamide concentration (lower percentages of formamide result in lowered stringency); salt conditions, or temperature. For example, lower stringency conditions include an overnight incubation at 37°C in a solution comprising 6X SSPE (20X SSPE = 3M NaCl; 0.2M NaH₂PO₄; 0.02M EDTA, pH 7.4), 0.5% SDS, 30%> formamide, 100 μg/ml salmon sperm blocking DNA; followed by washes at 50°C with IX SSPE, 0.1% SDS. In addition, to achieve even lower stringency, washes performed following stringent hybridization can be done at higher salt concentrations (e.g., 5X SSC). Note that variations in the above conditions may be accomplished through the inclusion and/or substitution of alternate blocking reagents used to suppress background in hybridization experiments. Typical blocking reagents include Denhardt's reagent, BLOTTO (5% w/v non-fat dried milk in phosphate buffered saline ("PBS")), heparin, denatured salmon sperm DNA, and other commercially available proprietary formulations. The inclusion of specific blocking reagents may require modification of the hybridization conditions described above, due to problems with compatibility. 
Of course, a polynucleotide which hybridizes only to polyA+ sequences (such as any 3 ' terminal polyA+ tract of a cDNA shown in the sequence listing), or to a complementary stretch of T (or U) residues, would not be included in the definition of "polynucleotide," since such a polynucleotide would hybridize to any nucleic acid molecule containing a poly (A) stretch or the complement thereof (e.g., practically any double- stranded cDNA clone).
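
The salt concentrations recited in the definition of stringent hybridization conditions follow from simple dilution of the stated stock compositions. The short sketch below works through that arithmetic using the 20X SSPE and 5X SSC compositions given in the definition; the function and variable names are illustrative assumptions only.

```python
# Stock compositions as recited in the definition of stringent
# hybridization conditions (20X SSPE in mol/L, 5X SSC in mmol/L).
SSPE_20X_MOLAR = {"NaCl": 3.0, "NaH2PO4": 0.2, "EDTA": 0.02}
SSC_5X_MILLIMOLAR = {"NaCl": 750.0, "sodium citrate": 75.0}


def scale(stock, stock_strength, working_strength):
    """Scale every component of a stock to the working strength,
    e.g. from 20X to 6X, keeping the units of the stock."""
    factor = working_strength / stock_strength
    return {name: amount * factor for name, amount in stock.items()}


if __name__ == "__main__":
    # 6X SSPE used for the lower stringency hybridization.
    print(scale(SSPE_20X_MOLAR, 20, 6))
    # 0.1X SSC used for the stringent wash.
    print(scale(SSC_5X_MILLIMOLAR, 5, 0.1))
```

The same helper gives the 1X SSPE wash by calling `scale(SSPE_20X_MOLAR, 20, 1)`.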

"Conservative amino acid substitution" refers to a substitution between similar amino acids that preserves an essential chemical characteristic of the original polypeptide.

"Identity" per se has an art-recognized meaning and can be calculated using published techniques. (See, e.g., Lesk, Ed., Computational Molecular Biology, Oxford University Press, New York, (1988); Smith, Ed., Biocomputing: Informatics And Genome Projects, Academic Press, New York, (1993); Griffin and Griffin, Eds., Computer Analysis Of Sequence Data, Part I, Humana Press, New Jersey, (1994); von Heinje, Sequence Analysis In Molecular Biology, Academic Press, (1987); and Gribskov and Devereux, Eds., Sequence Analysis Primer, M Stockton Press, New York, (1991)). While there exists a number of methods to measure identity between two polynucleotide or polypeptide sequences, the term "identity" is well known to skilled artisans (Carillo et al, SIAMJ Applied Math., 48:1073 (1988)). Methods commonly employed to determine identity or similarity between two sequences include, but are not limited to, those disclosed in "Guide to Huge Computers," Martin J. Bishop, Ed., Academic Press, San Diego, (1994) and Carillo et al., (1988), Supra.

"EST" refers to an Expressed Sequence Tag, i.e., a short sequence of a gene made from cDNA, typically in the range of 200 to 500 base pairs. Since an EST corresponds to a specific region of a gene, it can be used as a tool to help identify unknown genes and map their position in the genome.

"DST" refers to a Digital Sequence Tag, i.e., a polynucleotide that is an expressed sequence tag of the 3' end of an mRNA.

Other terms used in the fields of biotechnology and molecular and cell biology as used herein will be as generally understood by one of ordinary skill in the applicable arts.

Ataxia Telangiectasia mutated (ATM) plays a key role in monitoring and responding to DNA damage, and is essential for the appropriate regulation of cell cycle checkpoint control for the maintenance of genome integrity. In the absence of Atm, defects in these control points may contribute to the increased incidence of genomic alterations, including deletions, amplifications and translocations, and thereby alter genetic stability and cancer susceptibility. p53 is the most commonly mutated gene in human cancer. ATM appears to act upstream of p53 (Morgan and Kastan, (1997) Cancer Res. 57:3386-9), and normal ATM function is required for optimal signaling to p53 after exposure to ionizing irradiation (Barlow et al., (1997) Nat. Genet. 17:453-456; Barlow et al., (1997) Nat. Genet. 17:462-466). Both Atm deficient mice and p53 deficient mice develop lethal lymphomas, although p53 deficient mice develop these lymphomas at a reduced rate compared to the Atm deficient mice. It is possible that in the absence of ATM, defective signaling to p53 occurs during the process of TCRalpha rearrangement, resulting in cell cycle defects and subsequent tumor formation. However, p53 mediated defects may only be partially responsible for transformation, as the Atm deficient mice develop cancer much more quickly than the p53 deficient mice (Liao MJ et al. (1999) Genes and Dev. 13:1246-50; Westphal CS et al. (1997) Nat. Genetics 16(4): 397-401; Nacht M (1998) Cell Growth Differ. 9:131-8; Liao MJ (1998) Mol Cell Biol. 18:3495-501).

In order to specifically define the role of loss of ATM in particular forms of lymphomas as well as in the more common forms of p53 dependent loss in tumor formation, mice deficient in Atm and one or both alleles of p53 were generated. These mice are referred to as Atm^{-/-}p53^{+/-} for loss of one p53 allele and Atm^{-/-}p53^{-/-} for loss of both alleles. These mice are very ill and die young of multiple types of tumors including lymphomas. Several tumor cell lines were also generated from these mice, and it was found that Atm deficient tumor cell lines also have cellular characteristics that are distinct from those of p53 tumor cell lines, and may express cellular markers suggestive of a more generalized defect resulting from gross genetic alterations.

The current invention describes spectral karyotype analysis (SKY) and fluorescent in situ hybridization (FISH) to characterize the cytogenetics of tumors from Atm^{-/-} as well as Atm^{-/-}p53^{+/-}, Atm^{-/-}p53^{-/-}, and p53^{-/-} mice, and the use of the PCR-based TOGA^® platform to examine the RNA expression profiles of a subset of these Atm^{-/-} T-cell lymphomas. Specifically, three Atm^{-/-} tumor cell lines, which had similar cytogenetic profiles (AT-4, AT-7 and AT-13), and one Atm^{-/-} tumor line which had several unique chromosomal aberrations (AT-12), were chosen for analysis.

TOGA^® enabled the classification of cytogenetically similar tumors from a dissimilar tumor line based on the resulting expression profiles. In addition, this approach allowed the identification of chromosomal disruptions including specific fusion events that may account for the unusually aggressive and invasive nature of Atm-deficient tumors. These candidates have been validated by independent methods and several have been shown to be specifically expressed in Atm-deficient lymphomas and not in tumors lacking p53 and p21. By using this combined approach of powerful imaging techniques coupled with sensitive RNA profiling technologies, it has been possible to identify specific loci which appear to be involved in directing the initiation and progression of lymphoreticular malignancies which arise as a result of Atm deficiencies in both mice and humans.

Treatment of AT, AT tumors, or other cancers

Where a polynucleotide, polypeptide or gene of the invention or region thereof is down-regulated and is associated with a pathological condition, such as AT, AT tumors, or other cancers, the expression of the polynucleotide or gene or region thereof can be increased or the level of the intact polypeptide product can be increased in order to treat, prevent, ameliorate, or modulate the pathological condition. This can be accomplished, for example, by administering a polynucleotide, polypeptide or gene of the invention or region thereof (or a set of polynucleotides, polypeptides, genes or regions thereof, including those of the invention) to the mammalian subject. For example, the FX-induced thymoma transcript (SEQ ID NO: 78) was downregulated in the AT-4 tumor cell line. This transcript has been previously observed to show differential mRNA expression in other thymomas compared with normal thymus tissue (Pampeno CL, Meruelo D., (1996) Cell Growth Differ 8:1113-23) and has been postulated to play a role in the processes of T-cell differentiation and regeneration in addition to tumorigenesis. Therefore, it is possible that up-regulation of this transcript in tumor cell lines could inhibit tumor cell proliferation and promote differentiation, thus halting the tumorigenesis process.

A polynucleotide or gene of the invention or region thereof can be administered to a mammalian subject alone or with other polynucleotides or genes by a recombinant expression vector comprising the polynucleotide or gene or region thereof. As used herein, a mammalian subject can be a human, baboon, chimpanzee, macaque, cow, horse, sheep, pig, dog, cat, rabbit, guinea pig, rat or mouse. Preferably, the recombinant vector comprises a polynucleotide shown in SEQ ID NOs: 1-74 inclusive or a polynucleotide which is at least 98% identical to a nucleic acid sequence shown in SEQ ID NOs: 1-74 inclusive or a gene corresponding to one of the foregoing polynucleotides or a region thereof. Also, preferably, the recombinant vector comprises a variant polynucleotide that is at least 80%, 90%, or 95% identical to a polynucleotide comprising at least one of SEQ ID NOs: 1-74 inclusive, a polynucleotide at least ten bases in length hybridizable to a polynucleotide comprising at least one of SEQ ID NOs: 1-74 inclusive, a polynucleotide comprising at least one of SEQ ID NOs: 1-74 inclusive with sequential nucleotide deletions from either the 5' terminus or the 3' terminus, or a species homolog of a polynucleotide comprising at least one of SEQ ID NOs: 1-74 inclusive, or a gene corresponding to any one of the foregoing polynucleotides or a region thereof.

The administration of a polynucleotide or gene of the invention, or region thereof or recombinant expression vector containing such polynucleotide, gene or region thereof to a mammalian subject can be used to express a polynucleotide in said subject for the treatment of, for example, AT. Expression of a polynucleotide or gene in target cells, including but not limited to AT tumor cells, or other cancer cells, would effect greater production of the encoded polypeptide. In some cases, where the encoded polypeptide is a nuclear protein, other genes may be secondarily up- or down-regulated. For example, as described above, the FX-induced thymoma transcript (SEQ ID NO: 78) was downregulated in the AT-4 tumor cell line. This transcript has been previously observed to show differential mRNA expression in other thymomas compared with normal thymus tissue (Pampeno CL, Meruelo D., (1996) Cell Growth Differ 8:1113-23) and has been postulated to play a role in the processes of T-cell differentiation and regeneration in addition to tumorigenesis. Therefore, it is possible that up-regulation of this transcript in tumor cell lines could inhibit tumor cell proliferation and promote differentiation, thus halting the tumorigenesis process.

There are available to one skilled in the art multiple viral and non-viral methods suitable for introduction of a nucleic acid molecule into a target cell, as described above. In addition, a naked polynucleotide, gene or region thereof can be administered to target cells. Polynucleotides and genes of the invention or regions thereof and recombinant expression vectors of the invention can be administered as a pharmaceutical composition (including, without limitation, genes delivered by vectors such as adeno-associated virus, liposomes, PLGA, canarypox virus, adenovirus, retroviruses including IL-1 and GM-CSF antagonists). Such a composition comprises an effective amount of a polynucleotide, gene or region thereof or recombinant expression vector, and a pharmaceutically acceptable formulation agent selected for suitability with the mode of administration. Suitable formulation materials preferably are non-toxic to recipients at the concentrations employed and can modify, maintain, or preserve, for example, the pH, osmolarity, viscosity, clarity, color, isotonicity, odor, sterility, stability, rate of dissolution or release, adsorption, or penetration of the composition. See Remington's Pharmaceutical Sciences (18th Ed., A.R. Gennaro, ed., Mack Publishing Company 1990).

The pharmaceutically active compounds (i.e., a polynucleotide, gene or region thereof or a vector) can be processed in accordance with conventional methods of pharmacy to produce medicinal agents for administration to patients, including humans and other mammals. Thus, the pharmaceutical composition comprising a polynucleotide, gene or region thereof or a recombinant expression vector may be made up in a solid form (including granules, powders or suppositories) or in a liquid form (e.g., solutions, suspensions, or emulsions). The dosage regimen for treating a disease with a composition comprising a polynucleotide, gene or region thereof or expression vector is based on a variety of factors, including the type or severity of AT, the age, weight, sex, and medical condition of the patient, the route of administration, and the particular compound employed. Thus, the dosage regimen may vary widely, but can be determined routinely using standard methods. A typical dosage may range from about 0.1 mg/kg to about 100 mg/kg or more, depending on the factors mentioned above.

The frequency of dosing will depend upon the pharmacokinetic parameters of the polynucleotide, gene or region thereof or vector in the formulation being used. Typically, a clinician will administer the composition until a dosage is reached that achieves the desired effect. The composition may therefore be administered as a single dose, as two or more doses (which may or may not contain the same amount of the desired molecule) over time, or as a continuous infusion via implantation device or catheter. Further refinement of the appropriate dosage is routinely made by those of ordinary skill in the art and is within the ambit of tasks routinely performed by them. Appropriate dosages may be ascertained through use of appropriate dose-response data.

The cells of a mammalian subject may be transfected in vivo, ex vivo, or in vitro. Administration of a polynucleotide, gene or region thereof or a recombinant vector containing a polynucleotide, gene or region thereof to a target cell in vivo may be accomplished using any of a variety of techniques well known to those skilled in the art. For example, U.S. Patent No. 5,672,344 describes an in vivo viral-mediated gene transfer system involving a recombinant neurotrophic HSV-1 vector. The above-described compositions of polynucleotides, genes and regions thereof and recombinant vectors can be transfected in vivo by oral, buccal, parenteral, rectal, or topical administration as well as by inhalation spray. The term "parenteral" as used herein includes subcutaneous, intravenous, intramuscular, intrasternal, and intraperitoneal administration, as well as infusion techniques.

While the nucleic acids and/or vectors of the invention can be administered as the sole active pharmaceutical agent, they can also be used in combination with one or more vectors of the invention or other agents. When administered as a combination, the therapeutic agents can be formulated as separate compositions that are given at the same time or different times, or the therapeutic agents can be given as a single composition.

Another delivery system for polynucleotides or genes of the invention and regions thereof is a "non-viral" delivery system. Techniques that have been used or proposed for gene therapy include DNA-ligand complexes, adenovirus-ligand-DNA complexes, direct injection of DNA, CaPO₄ precipitation, gene gun techniques, electroporation, lipofection, and colloidal dispersion (Mulligan, R., (1993) Science, 260 (5110):926-32). Any of these methods are widely available to one skilled in the art and would be suitable for use in the present invention. Other suitable methods are available to one skilled in the art, and it is to be understood that the present invention may be accomplished using any of the available methods of transfection. Several such methodologies have been utilized by those skilled in the art with varying success. Id.

Where a polynucleotide or gene of the invention is up-regulated and exacerbates a pathological condition in a mammalian subject, such as AT, AT tumors, or other cancers, the expression of the polynucleotide or gene can be blocked or reduced or the level of the intact polypeptide product can be reduced in order to treat, prevent, ameliorate, or modulate the pathological condition. This can be accomplished by, for example, the use of antisense oligonucleotides, triple helix base pairing methodology or ribozymes. Alternatively, drugs or antibodies that bind to and inactivate the polypeptide product can be used. For example, the DST ACGG 458 (SEQ ID NO: 9), which was found to represent granzyme C-granzyme B gene fusion products (SEQ ID NO: 70, 71-74), was highly expressed in the tumor cell line AT-7 and also up-regulated in tumor cell line AT-13. Further analysis by Northern blot found it to be expressed in AT-10 as well. The presence of this gene fusion product in multiple thymomas is suggestive of a common mechanism, such as its importance in tumor formation. Because the granzyme family of proteins is involved in the degradation of the extracellular matrix, these proteins, if dysregulated, could play an important role in increased invasiveness (i.e., increased metastatic potential) of tumor cells. Therefore, inhibition of the expression of this aberrant transcript could decrease the ability of tumor cells to metastasize, and therefore prevent spreading of the cancer. Inhibition of metastasis would not only protect normal tissue, but also extend the lifetime and improve the prognosis of the patient. In addition, the measurement of the presence of this gene fusion transcript could be used to diagnose a tumor and could be used to determine the stage of progression of a tumor, for predicting the aggressiveness or invasive potential (i.e., metastatic potential) of a tumor, for helping to predict the course of treatment, and for predicting remission or regression of the tumor, associated with AT or other cancers for prognostic purposes.

Antisense oligonucleotides are nucleotide sequences that are complementary to a specific DNA or RNA sequence. Once introduced into a cell, the complementary nucleotides combine with natural sequences produced by the cell to form complexes and block either transcription or translation. Preferably, an antisense oligonucleotide is at least 11 nucleotides in length, but can be at least 12, 15, 20, 25, 30, 35, 40, 45, or 50 or more nucleotides long. Longer sequences also can be used. Antisense oligonucleotide molecules can be provided in a DNA construct and introduced into a cell as described above to decrease the level of gene products of the invention in the cell.
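As an illustrative sketch only, the complementarity relationship described above can be expressed as a reverse-complement operation on a target sequence. The function name `antisense_dna`, the example target, and the default minimum of 11 nucleotides (taken from the preferred length stated in the text) are assumptions made for illustration.

```python
# Watson-Crick complements; U in an RNA target pairs with A in the DNA oligo.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}


def antisense_dna(target, min_length=11):
    """Return a DNA antisense oligonucleotide (5'->3') for a target (5'->3').

    The target may be written as RNA or DNA. The result is the reverse
    complement, so it pairs antiparallel with the target.
    """
    target = target.upper()
    if len(target) < min_length:
        raise ValueError(f"target shorter than {min_length} nucleotides")
    try:
        return "".join(_COMPLEMENT[base] for base in reversed(target))
    except KeyError as exc:
        raise ValueError(f"unexpected base: {exc.args[0]}") from None


if __name__ == "__main__":
    # Hypothetical 12-nucleotide target region of an mRNA.
    print(antisense_dna("AUGGCCAAGUUC"))
```

This covers only sequence complementarity; backbone chemistry, accessibility of the target, and specificity would need separate consideration.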

Antisense oligonucleotides can be deoxyribonucleotides, ribonucleotides, or a combination of both. Oligonucleotides can be synthesized manually or by an automated synthesizer, by covalently linking the 5' end of one nucleotide with the 3' end of another nucleotide with non-phosphodiester internucleotide linkages such as alkylphosphonates, phosphorothioates, phosphorodithioates, alkylphosphonothioates, phosphoramidates, phosphate esters, carbamates, acetamidate, carboxymethyl esters, carbonates, and phosphate triesters. See Brown, (1994) Meth. Mol. Biol., 20:1-8; Sonveaux, (1994) Meth. Mol. Biol., 26:1-72; Uhlmann et al., (1990) Chem. Rev., 90:543-583.

Modifications of gene expression can be obtained by designing antisense oligonucleotides that will form duplexes to the control, 5', or regulatory regions of a gene of the invention. Oligonucleotides derived from the transcription initiation site, e.g., between positions -10 and +10 from the start site, are preferred.

Similarly, inhibition can be achieved using "triple helix" base-pairing methodology. Triple helix pairing inhibits the ability of the double helix to open sufficiently for the binding of polymerases, transcription factors, or chaperones. Therapeutic advances using triplex DNA have been described in the literature (e.g., Gee et al., in Huber & Carr, Molecular and Immunologic Approaches, Futura Publishing Co., Mt. Kisco, N.Y., 1994). An antisense oligonucleotide also can be designed to block translation of mRNA by preventing the transcript from binding to ribosomes.

Precise complementarity is not required for successful complex formation between an antisense oligonucleotide and the complementary sequence of a polynucleotide. Antisense oligonucleotides that comprise, for example, 2, 3, 4, or 5 or more stretches of contiguous nucleotides that are precisely complementary to a polynucleotide, each separated by a stretch of contiguous nucleotides which are not complementary to adjacent nucleotides, can provide sufficient targeting specificity for mRNA. Preferably, each stretch of complementary contiguous nucleotides is at least 4, 5, 6, 7, or 8 or more nucleotides in length. Non-complementary intervening sequences are preferably 1, 2, 3, or 4 nucleotides in length. One skilled in the art can easily use the calculated melting point of an antisense-sense pair to determine the degree of mismatching which will be tolerated between a particular antisense oligonucleotide and a particular polynucleotide sequence. Antisense oligonucleotides can be modified without affecting their ability to hybridize to a polynucleotide or gene of the invention or regions thereof. These modifications can be internal or at one or both ends of the antisense molecule. For example, internucleoside phosphate linkages can be modified by adding cholesteryl or diamine moieties with varying numbers of carbon residues between the amino groups and terminal ribose. Modified bases and/or sugars, such as arabinose instead of ribose, or a 3', 5'-substituted oligonucleotide in which the 3' hydroxyl group or the 5' phosphate group are substituted, also can be employed in a modified antisense oligonucleotide. These modified oligonucleotides can be prepared by methods well known in the art. See, e.g., Agrawal et al., (1992) Trends Biotechnol., 10:152-158; Uhlmann et al., (1990) Chem. Rev., 90:543-584; Uhlmann et al., (1987) Tetrahedron Lett., 215:3539-3542.
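By way of example only, the two quantities discussed above (the lengths of the complementary stretches and a calculated melting point) can be sketched as follows. The melting point uses the Wallace rule, Tm = 2(A+T) + 4(G+C) in °C, which is a rough approximation for short oligonucleotides and is substituted here for illustration; the text does not specify which calculation is intended. The function names and example sequences are hypothetical.

```python
# Base pairs accepted as complementary (DNA or RNA letters).
_PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}


def wallace_tm(oligo):
    """Approximate melting temperature (deg C) by the Wallace rule."""
    oligo = oligo.upper()
    at = sum(oligo.count(base) for base in "ATU")
    gc = sum(oligo.count(base) for base in "GC")
    return 2 * at + 4 * gc


def complementary_stretches(antisense, target):
    """Lengths of contiguous base-paired runs between two equal-length strands.

    Both strands are written 5'->3'; the target is reversed so that the
    strands are compared in antiparallel orientation.
    """
    if len(antisense) != len(target):
        raise ValueError("strands must have equal length")
    runs, current = [], 0
    for a, t in zip(antisense.upper(), reversed(target.upper())):
        if (a, t) in _PAIRS:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


if __name__ == "__main__":
    target = "AUGGCCAAGUUC"
    print(wallace_tm("GAACTTGGCCAT"))
    print(complementary_stretches("GAACTAGGCCAT", target))  # one mismatch
```

The run lengths can be checked against the preferred minimum stretch lengths recited above; the Wallace value is only a starting estimate and does not account for salt, formamide, or mismatch position.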

Ribozymes are RNA molecules with catalytic activity. See, e.g., Cech, (1987) Science, 236:1532-1539; Cech, (1990) Ann. Rev. Biochem., 59:543-568; Cech, (1992) Curr. Opin. Struct. Biol., 2:605-609; Couture & Stinchcomb, (1996) Trends Genet., 12:510-515. Ribozymes can be used to inhibit gene function by cleaving an RNA sequence, as is known in the art (e.g., Haseloff et al., U.S. Patent No. 5,641,673). The mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme molecule to complementary target RNA, followed by endonucleolytic cleavage. Examples include engineered hammerhead motif ribozyme molecules that can specifically and efficiently catalyze endonucleolytic cleavage of specific nucleotide sequences.

The coding sequence of a polynucleotide or gene of the invention or a region thereof can be used to generate ribozymes that will specifically bind to mRNA transcribed from the polynucleotide. Methods of designing and constructing ribozymes which can cleave RNA molecules in trans in a highly sequence specific manner have been developed and described in the art (see Haseloff et al. (1988) Nature, 334:585-591). For example, the cleavage activity of ribozymes can be targeted to specific RNAs by engineering a discrete "hybridization" region into the ribozyme. The hybridization region contains a sequence complementary to the target RNA and thus specifically hybridizes with the target (see, e.g., Gerlach et al., EP 321,201).

Specific ribozyme cleavage sites within a RNA target can be identified by scanning the target molecule for ribozyme cleavage sites which include the following sequences: GUA, GUU, and GUC. Once identified, short RNA sequences of between 15 and 20 ribonucleotides corresponding to the region of the target RNA containing the cleavage site can be evaluated for secondary structural features which may render the target inoperable. Suitability of candidate RNA targets also can be evaluated by testing accessibility to hybridization with complementary oligonucleotides using ribonuclease protection assays. The nucleotide sequences shown in SEQ ID NOs: 1-74 inclusive, their complements and their corresponding genes and regions thereof provide sources of suitable hybridization region sequences. Longer complementary sequences can be used to increase the affinity of the hybridization sequence for the target. The hybridizing and cleavage regions of the ribozyme can be integrally related such that upon hybridizing to the target RNA through the complementary regions, the catalytic region of the ribozyme can cleave the target.
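The scanning step described above can be sketched, purely for illustration, as a pattern search for the GUA, GUU, and GUC triplets followed by extraction of a 15 to 20 ribonucleotide window around each site. The function names and the example sequence are hypothetical, and the windowing scheme (roughly centered on the triplet, clipped at the sequence ends) is an assumption; evaluation of secondary structure is not attempted here.

```python
import re


def find_cleavage_sites(rna):
    """Return (position, triplet) for each GUA, GUU, or GUC in the target."""
    rna = rna.upper().replace("T", "U")  # accept a cDNA-style spelling
    return [(m.start(), m.group(0)) for m in re.finditer(r"GU[AUC]", rna)]


def site_window(rna, site_start, window=20):
    """Return up to `window` nucleotides surrounding a cleavage triplet."""
    if not 15 <= window <= 20:
        raise ValueError("window must be between 15 and 20 nucleotides")
    rna = rna.upper().replace("T", "U")
    flank = (window - 3) // 2
    begin = max(0, site_start - flank)
    end = min(len(rna), begin + window)
    begin = max(0, end - window)  # shift left if clipped at the 3' end
    return rna[begin:end]


if __name__ == "__main__":
    example = "AAGUCAAGUAGGGUUCC"  # hypothetical target fragment
    for position, triplet in find_cleavage_sites(example):
        print(position, triplet, site_window(example, position, window=15))
```

Each extracted window is a candidate to be examined further, for example by the ribonuclease protection assays mentioned above.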

Ribozymes can be introduced into cells as part of a DNA construct. Mechanical methods, such as microinjection, liposome-mediated transfection, electroporation, or calcium phosphate precipitation, can be used to introduce a ribozyme-containing DNA construct into cells in which it is desired to decrease polynucleotide expression. Alternatively, if it is desired that the cells stably retain the DNA construct, the construct can be supplied on a plasmid and maintained as a separate element or integrated into the genome of the cells, as is known in the art. A ribozyme-encoding DNA construct can include transcriptional regulatory elements, such as a promoter element, an enhancer or UAS element, and a transcriptional terminator signal, for controlling transcription of ribozymes in the cells.

As taught in Haseloff et al., U.S. Patent No. 5,641,673, ribozymes can be engineered so that ribozyme expression will occur in response to factors that induce expression of a target gene. Ribozymes also can be engineered to provide an additional level of regulation, so that destruction of mRNA occurs only when both a ribozyme and a target gene are induced in the cells.

Polypeptides or antibodies to the polypeptides of the invention can also be used directly as therapeutics to prevent, treat, modulate, or ameliorate disease. The mammalian subject (preferably a human) can be given a recombinant or synthetic form of the polypeptide in one of many possible different formulations, including, but not limited to, subcutaneous, intravenous, intramuscular, intraperitoneal, or intracranial injections of a solution of the polypeptide or antibody, or a suspension of a crystallized form of the polypeptide or antibody; topical creams or slow release cutaneous patch containing the polypeptide; encapsulated forms for oral or other gastrointestinal delivery of the polypeptide or antibody. In some cases, delivery of the polypeptide or antibody may be in the form of injection or transplantation of cells or tissues containing an expression vector such that a recombinant form of the polypeptide will be secreted by the cells or tissues, as described above for transfected cells. The frequency and dosage of the administration of the polypeptides or antibodies will be determined by factors such as the biological activity of the pharmacological preparation, the persistence and clearance of the active protein, and the goals of treatment. In the case of antibody therapies, the frequency and dosage will also depend on the ability of the antibody to bind and neutralize the target molecules in the target tissues.

Diagnostic Tests

Pathological conditions or susceptibility to pathological conditions, such as AT, AT tumors, or other cancers, can be diagnosed using methods of the invention. Testing for expression of a polynucleotide or gene of the invention or regions thereof or for the presence of the polynucleotide or gene product can correlate with the severity of the condition and can also indicate appropriate treatment. Furthermore, testing for regulation of a polynucleotide or gene of the invention or regions thereof or a panel of polynucleotides or genes of the invention or regions thereof can be used in drug development studies to assess the efficacy or toxicity of any experimental therapeutic.

For example, the presence or absence of a mutation in a polynucleotide or gene of the invention or regions thereof can be determined through sequencing techniques known to those skilled in the art and a pathological condition or a susceptibility to a pathological condition can be diagnosed based on the presence or absence of the mutation. Further, an alteration in expression of a polypeptide encoded by a polynucleotide or gene of the invention can be detected, where the presence of an alteration in expression of the polypeptide is indicative of AT, AT tumors, or other cancers, or susceptibility to AT, AT tumors, or other cancers. The alteration in expression can be an increase in the amount of expression or a decrease in the amount of expression, i.e. a modulation in expression. Examples have been described in the current invention of cases where both increases and decreases have been observed that may be associated with increased proliferation and invasive potential. For example, the decreased expression of the FX-induced thymoma transcript in the AT-4 tumor cell line may play a role in suppressing differentiation and enhancing proliferation (Pampeno CL, Meruelo D., (1996) Cell Growth Differ. 8:1113-23). An example of increased expression is the DST ACGG 458 (SEQ ID NO: 9), found to contain granzyme C-granzyme B gene fusion products (SEQ ID NO: 70, 71-74) that were highly expressed in the tumor cell line AT-7 and also up-regulated in tumor cell line AT-13. The presence of this gene fusion product in multiple thymomas is suggestive of a common mechanism, such as its importance in tumor formation. The granzyme family of proteins are involved in the degradation of the extracellular matrix. If dysregulated, these proteins could play an important role in increased invasiveness (i.e., increased metastatic potential) of tumor cells. Therefore, inhibition of the expression of this aberrant transcript could decrease the ability of tumor cells to metastasize, and therefore prevent spreading of the cancer. Inhibition of metastasis would not only protect normal tissue, but also extend the lifetime and prognosis of the patient. In addition, the measurement of the presence of this gene fusion transcript could be used to diagnose a tumor and could be used to determine the stage of progression of a tumor, for predicting the aggressiveness or invasive potential (i.e., metastatic potential) of a tumor, for helping to predict the course of treatment, and for predicting remission or regression of the tumor, associated with AT or other cancers for prognostic purposes.

The use of diagnostic tests is not limited to determining the presence of, or susceptibility to disease. In many cases, the diagnostic test can be used to assess disease stage, especially in situations where such an objective lab test has no alternative reliable subjective test available. These tests can be used to follow the course of disease, help predict the future course of disease, or determine the possible reversal of the disease condition. For example, the level of expression of polynucleotides, genes, polypeptides of the invention or regions thereof may be indicative of disease stage or progression.

Some embodiments of the present invention provide methods for predicting whether a subject afflicted with a pathological condition, such as AT, AT tumors and other cancers, is likely to respond favorably to a particular treatment regimen. The method involves obtaining a sample from the subject and determining a level of expression of at least one of any molecule of the present invention. The level of expression from the subject is then compared with a database comprising expression levels and expression patterns from previous patients who have undergone the treatment. The database may contain and will correlate expression levels of the molecules of the invention with previous patients' treatments. In this way, the molecules of the invention can be used to predict whether an individual subject afflicted with, for example, AT, will respond favorably to a particular treatment regimen based on expression patterns from patients previously afflicted with AT who underwent a treatment regimen.

Other embodiments of the present invention provide methods for predicting the metastatic potential, regression or remission of a tumor in a subject. The method involves obtaining a sample from the subject and determining a level of expression of at least one of any molecule of the present invention. The level of expression from the subject is then compared with a database comprising expression levels and expression patterns correlated with metastatic potential, remission and regression from previous patients afflicted with the same type of tumor, such as, for example, an AT tumor.

Still other embodiments of the present invention provide methods for determining a stage of tumor progression in a subject. The method involves obtaining a sample from the subject and determining a level of expression of at least one of any molecule of the present invention. The level of expression from the subject is then compared with a database comprising expression levels and expression patterns correlated with stages of tumor progression from previous patients afflicted with the same type of tumor, such as, for example, an AT tumor.

An additional embodiment of the present invention provides a method for identifying genes associated with chromosomal aberrations in AT and AT tumors. The method involves obtaining a sample from a subject and isolating the RNA therefrom. The TOGA^® process is then performed on the isolated RNA to identify differentially regulated genes. After the differentially regulated genes are identified, a genetic marker at the site of the chromosomal aberration is identified. Chromosomal aberrations include, but are not limited to chromosome breaks, amplifications, deletions, and translocations, which may be identified according to techniques well known to those of skill in the art.

In drug development studies, these tests can be useful as efficacy markers, so that the ability of any new therapeutics to treat disease can be evaluated on the basis of these objective assays. The utility of these diagnostic tests will first be determined by developing statistical information correlating the specific lab test values with several clinical parameters so that the lab test values can be known to reliably predict certain clinical conditions.

In many cases, the diagnostic lab tests based on the polynucleotides, genes, antibodies or polypeptides of the invention, i.e., gene expression profiles of polynucleotides or polypeptides encoded by the polynucleotides identified in SEQ ID NOs: 1-74, will be important markers of drug or disease toxicity. The markers of toxicity versus drug efficacy will be determined by studies correlating the effects of known toxins or pathological conditions with specific alterations in gene regulation. Toxicity markers generated in this fashion will be useful to distinguish the various therapeutic versus deleterious effects on cells or tissues in the patient.

As an additional method of diagnosis, a first biological sample from a patient suspected of having a pathological condition, such as AT, AT tumors, or other cancers, is obtained along with a second sample from a suitable comparable control source. A biological sample can comprise saliva, blood, cerebrospinal fluid, amniotic fluid, urine, feces, tissue, or the like. Ideally, such a study should compare normal tissue to that of the tumor tissue, as well as blood and urine, in order to identify tumor specific expression as well as indicators of metastasis (i.e., circulating in the blood). A suitable control source can be obtained from one or more mammalian subjects that do not have the pathological condition. For example, the average concentration and distribution of a polynucleotide, gene, or polypeptide of the invention or a region thereof can be determined from biological samples taken from a representative population of mammalian subjects, wherein the mammalian subjects are the same species as the subject from which the test sample was obtained. The amount of at least one polypeptide, gene, polynucleotide of the invention or region thereof is determined in the first and second sample. The amounts of the polypeptide in the first and second samples are compared. A patient is diagnosed as having the pathological condition, i.e. AT, AT tumors, or other cancers, if the amount of the polypeptide, gene, polynucleotide of the invention or a region thereof in the first sample is greater than or less than the amount of the polypeptide, gene, polynucleotide of the invention or a region thereof in the second sample. Preferably, the amount of polypeptide, gene, polynucleotide of the invention or a region thereof in the first sample falls in the range of samples taken from a representative group of patients with the pathological condition.
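As a hedged illustration of the comparison between the first sample and the control source, the following sketch flags a test amount that falls outside the spread of a control population. The use of a mean plus or minus k standard deviations rule, and the default value of k, are assumptions made for the example; the present disclosure does not prescribe a particular statistical threshold.

```python
from statistics import mean, stdev

def differs_from_controls(test_amount, control_amounts, k=2.0):
    """Return True if the test amount lies more than k standard deviations
    above or below the control-population mean.

    k is an assumed cutoff, not a value taken from the specification.
    """
    mu = mean(control_amounts)
    sd = stdev(control_amounts)  # sample standard deviation; needs >= 2 controls
    return abs(test_amount - mu) > k * sd
```

A complementary check, also contemplated above, would test whether the amount falls within the range observed in a representative group of patients having the pathological condition.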

The method for diagnosing a pathological condition can comprise a step of detecting nucleic acid molecules comprising a nucleotide sequence in a panel of at least two nucleotide sequences, wherein at least one sequence in said panel is at least 95% identical to a sequence of at least 50 contiguous nucleotides in a sequence selected from said group.

The present invention also includes a diagnostic system, preferably in kit form, for assaying for the presence of the polypeptide of the present invention in a body sample, including, but not limited to brain tissue, cell suspensions or tissue sections, or a body fluid sample, such as CSF, blood, plasma or serum, where it is desirable to detect the presence, and preferably the amount, of the polypeptide of this invention in the sample according to the diagnostic methods described herein.

In a related embodiment, the discovery of differential expression patterns for the molecules of the invention allows for screening of test compounds with an eye to modulating a particular expression pattern; for example, screening can be done for compounds that will convert an expression profile for a poor prognosis to a better prognosis. These methods can also be done on the protein basis; that is, protein expression levels of the molecules of the invention, such as, for example, polypeptides encoded by the polynucleotides identified in SEQ ID NOs: 1-74, can be evaluated for diagnostic and prognostic purposes or to screen test compounds. In addition, the invention provides methods of conducting high-throughput screening for test compounds capable of inhibiting activity of proteins encoded by the polynucleotides of the invention, i.e., SEQ ID NOs: 1-74. The method of high-throughput screening involves combining test compounds and the polypeptide and measuring an effect of the test compound on the encoded polypeptide. Functional assays such as cytosensor microphysiometer, calcium flux assays such as FLIPR (Molecular Devices Corp, Sunnyvale, CA), or the TUNEL assay may be employed to measure cellular activity.

The invention also provides a method of screening test compounds for inhibitors of AT, AT tumors, or other cancers and the pharmaceutical compositions comprising the test compounds. The method for screening comprises obtaining samples from subjects afflicted with AT, AT tumors, or other cancers, maintaining separate aliquots of the samples with a plurality of test compounds, and comparing expression of molecules of the invention, i.e., SEQ ID NOs: 1-74, in each of the aliquots to determine whether any of the test compounds provides a substantially modulated level of expression relative to samples with other test compounds or to an untreated sample. In addition, methods of screening may be devised by combining a test compound with a protein and thereby determining the effect of the test compound on the polypeptide.
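The aliquot comparison described above can be sketched as follows, purely by way of example. The two-fold cutoff used to represent a "substantially modulated" level of expression and the compound names are assumptions for illustration only.

```python
def modulating_compounds(untreated, treated, min_fold=2.0):
    """Return compounds whose expression level differs from the untreated
    aliquot by at least min_fold in either direction.

    untreated: expression level in the untreated aliquot (must be > 0)
    treated:   dict mapping compound name -> expression level (> 0)
    min_fold:  assumed cutoff for "substantially modulated"
    """
    hits = []
    for compound, level in treated.items():
        fold = level / untreated
        if fold >= min_fold or fold <= 1.0 / min_fold:
            hits.append(compound)
    return sorted(hits)
```

The same function could be applied once per molecule of the invention, so that each test compound is scored across the whole expression profile.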

In a related embodiment, a nucleic acid molecule can be used as a probe (i.e., an oligonucleotide) to detect the presence of a polynucleotide of the present invention, a gene corresponding to a polynucleotide of the present invention or a region thereof, or a mRNA in a cell that is diagnostic for the presence or expression of a polypeptide of the present invention in the cell. The nucleic acid molecule probes can be of a variety of lengths from at least about 10 to about 5000 nucleotides long, although they will typically be about 20 to 500 nucleotides in length. The probe can be used to detect the polynucleotide, gene, gene region or mRNA through hybridization methods that are well known in the art.

In a related embodiment, detection of genes corresponding to the polynucleotides of the present invention can be conducted by primer extension reactions such as the polymerase chain reaction (PCR). To that end, PCR primers are utilized in pairs, as is well known, based on the nucleotide sequence of the gene to be detected. Preferably, the nucleotide sequence is a portion of the nucleotide sequence of a polynucleotide of the present invention. Particularly preferred PCR primers can be derived from any portion of a DNA sequence encoding a polypeptide of the present invention, but are preferentially from regions that are not conserved in other cellular proteins. Preferred PCR primer pairs useful for detecting the genes corresponding to the polynucleotides of the present invention and expression of these genes are described below. Nucleotide primers from the corresponding region of the polypeptides of the present invention described herein are readily prepared and used as PCR primers for detection of the presence or expression of the corresponding gene in any of a variety of tissues.
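For illustration, the use of PCR primers in pairs can be modeled in silico as shown below. This is a minimal sketch that assumes exact primer matches on a single linear template; it does not model mismatches, melting temperature, or primer specificity, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def predicted_amplicon(template, forward, reverse):
    """Return the amplicon bounded by an exact-match primer pair, or None.

    forward matches the top strand as written; reverse anneals to the top
    strand, so its reverse complement is searched for on the top strand.
    """
    start = template.find(forward)
    if start == -1:
        return None
    rc = reverse_complement(reverse)
    end = template.find(rc, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(rc)]
```

A return value of None indicates that at least one primer has no exact binding site on the template, which in this simplified model corresponds to no detectable product.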

In another embodiment, a diagnostic system, preferably in kit form, is contemplated for assaying for the presence of the polypeptide of the present invention or an antibody immunoreactive with the polypeptide of the present invention in a body fluid sample. Such diagnostic kit would be useful for monitoring the fate of a therapeutically administered polypeptide of the present invention or an antibody immunoreactive with the polypeptide of the present invention. The system includes, in an amount sufficient for at least one assay, a polypeptide of the present invention and/or a subject antibody as a separately packaged immunochemical reagent. Instructions for use of the packaged reagent(s) are also typically included.

A diagnostic system of the present invention preferably also includes a label or indicating means capable of signaling the formation of an immunocomplex containing a polypeptide or antibody molecule of the present invention.

Any label or indicating means can be linked to or incorporated in an expressed protein, polypeptide, or antibody molecule that is part of an antibody or monoclonal antibody composition of the present invention or used separately, and those atoms or molecules can be used alone or in conjunction with additional reagents. Such labels are themselves well-known in clinical diagnostic chemistry and constitute a part of this invention only insofar as they are utilized with otherwise novel proteins, methods and/or systems.

The labeling means can be a fluorescent labeling agent that chemically binds to antibodies or antigens without denaturing them to form a fluorochrome (dye) that is a useful immunofluorescent tracer. Suitable fluorescent labeling agents are fluorochromes such as fluorescein isocyanate (FIC), fluorescein isothiocyanate (FITC), 5-dimethylamine-1-naphthalenesulfonyl chloride (DANSC), tetramethylrhodamine isothiocyanate (TRITC), lissamine, rhodamine B200 sulphonyl chloride (RB 200 SC) and the like. A description of immunofluorescence analysis techniques is found in DeLuca, "Immunofluorescence Analysis", in Antibody As a Tool, Marchalonis et al., Eds., John Wiley & Sons, Ltd., pp. 189-231 (1982), which is incorporated herein by reference. Other suitable labeling agents are known to those skilled in the art.

In preferred embodiments, the indicating group is an enzyme, such as horseradish peroxidase (HRP), glucose oxidase, or the like. In such cases where the principal indicating group is an enzyme, such as HRP or glucose oxidase, or a vitamin, such as biotin, additional reagents are required to visualize the formation of the receptor-ligand complex. Such additional reagents for HRP include hydrogen peroxide and an oxidation dye precursor, such as diaminobenzidine. Such additional reagents for biotin include streptavidin. An additional reagent useful with glucose oxidase is 2,2'-azino-di-(3-ethyl-benzthiazoline-6-sulfonic acid) (ABTS).

Radioactive elements are also useful labeling agents and are used illustratively herein. An exemplary radiolabeling agent is a radioactive element that produces gamma ray emissions. Elements which themselves emit gamma rays, such as ¹²⁴I, ¹²⁵I, ¹²⁸I, ¹³²I and ⁵¹Cr represent one class of gamma ray emission-producing radioactive element indicating groups. Particularly preferred is ¹²⁵I. Another group of useful labeling means are those elements such as ¹¹C, ¹⁸F, ¹⁵O and ¹³N which themselves emit positrons. The positrons so emitted produce gamma rays upon encounters with electrons present in the animal's body. Also useful is a beta emitter, such as ¹¹¹indium or ³H.

The linking of labels or labeling of polypeptides and proteins is well known in the art. For instance, antibody molecules produced by a hybridoma can be labeled by metabolic incorporation of radioisotope-containing amino acids provided as a component in the culture medium (see, e.g., Galfre et al., Meth. Enzymol, 73:3-46 (1981)). The techniques of protein conjugation or coupling through activated functional groups are particularly applicable (see, e.g., Aurameas, et al., Scand. J. Immunol, Vol. 8 Suppl. 7:7-23 (1978); Rodwell et al., Biotech., 3:889-894 (1984); and U.S. Patent No. 4,493,795).

The diagnostic systems can also include, preferably as a separate package, a specific binding agent. Exemplary specific binding agents are second antibody molecules, complement proteins or fragments thereof, such as, S. aureus protein A, and the like. Preferably the specific binding agent binds the reagent species when that species is present as part of a complex.

In preferred embodiments, the specific binding agent is labeled. However, when the diagnostic system includes a specific binding agent that is not labeled, the agent is typically used as an amplifying means or reagent. In these embodiments, the labeled specific binding agent is capable of specifically binding the amplifying means when the amplifying means is bound to a reagent species-containing complex. The diagnostic kits of the present invention can be used in an "ELISA" format to detect the quantity of the polypeptide of the present invention in a sample. A description of the ELISA technique is found in Sites et al., Basic and Clinical Immunology, 4th Ed., Chap. 22, Lange Medical Publications, Los Altos, CA (1982) and in U.S. Patent No. 3,654,090; U.S. Patent No. 3,850,752; and U.S. Patent No. 4,016,043, which are all incorporated herein by reference.

Thus, in some embodiments, a polypeptide of the present invention, an antibody or a monoclonal antibody of the present invention can be affixed to a solid matrix to form a solid support that comprises a package in the subject diagnostic systems.

A reagent is typically affixed to a solid matrix by adsorption from an aqueous medium, although other modes of affixation applicable to proteins and polypeptides can be used that are well known to those skilled in the art. Exemplary adsorption methods are described herein.

Useful solid matrices are also well known in the art. Such materials are water insoluble and include the cross-linked dextran available under the trademark SEPHADEX from Pharmacia Fine Chemicals (Piscataway, NJ), agarose, polystyrene beads of about 1 micron (μm) to about 5 millimeters (mm) in diameter available from several suppliers (e.g., Abbott Laboratories, Chicago, IL), polyvinyl chloride, polystyrene, cross-linked polyacrylamide, nitrocellulose- or nylon-based webs (sheets, strips or paddles) or tubes, plates or the wells of a microtiter plate, such as those made from polystyrene or polyvinylchloride.

The reagent species, labeled specific binding agent, or amplifying reagent of any diagnostic system described herein can be provided in solution, as a liquid dispersion or as a substantially dry powder, e.g., in lyophilized form. Where the indicating means is an enzyme, the enzyme's substrate can also be provided in a separate package of a system. A solid support such as the before-described microtiter plate and one or more buffers can also be included as separately packaged elements in this diagnostic assay system.

The packaging materials discussed herein in relation to diagnostic systems are those customarily utilized in diagnostic systems.

Genes

The present invention also relates to the genes corresponding to SEQ ID NOs: 1-74, and the polypeptides encoded by the polynucleotides or genes or regions thereof of SEQ ID NOs: 1-74. The corresponding gene can be isolated in accordance with known methods using the sequence information disclosed herein. Such methods include preparing probes or primers from the disclosed sequence and identifying or amplifying the corresponding gene from appropriate sources of genomic material.

Homologs, Paralogs and Orthologs

Also provided in the present invention are homologs of the polynucleotides, polypeptides and genes of the invention and regions thereof, including paralogous genes and orthologous genes. Nucleic acid homologs may be isolated and identified using suitable probes or primers from the sequences provided herein and screening a suitable nucleic acid source for the desired homolog. Studies of gene and protein evolution often involve the comparison of homologs, which are sequences that have common origins but may or may not have common activity. Sequences that share an arbitrary level of similarity determined by alignment of matching bases are called homologous.

There are many cases in which genes have duplicated, assumed somewhat different functions and been moved to other regions of the genome (e.g., alpha and beta globin). Such related genes in the same species are referred to as paralogs (e.g., Lundin, 1993, who refers to Fitch, 1976 for this distinction). They must be distinguished from orthologs (homologous genes in different species, such as beta globin in human and mouse) if any sensible comparisons are to be made. These terms, as they relate to genes, are formally defined as follows:

As used herein, "paralogous genes" are genes within the same species produced by gene duplication in the course of evolution. They may be arranged in clusters or distributed on different chromosomes, an arrangement which is usually conserved in a wide range of vertebrates.

As used herein, "orthologous genes" describes homologous genes in different species that are descended from the same gene in the nearest common ancestor. Orthologs tend to have similar function.

In reports of previous Human Gene Mapping Workshops, the Comparative Gene Mapping Committee recommended explicit criteria for establishing homology between genes mapped in different species, as well as urging inclusion of specific criteria in comparative gene mapping publications (O'Brien and Graves, 1991). The evidence for gene homology might also be recorded in The Comparative Animal Genome database (TCAGdb). Revised criteria for determining homology can include any of the following (the most stringent are asterisked):

Gene or other nucleotide sequence:

• similar nucleotide sequence*

• cross-hybridization to the same molecular probe*

• conserved map position*

Protein or polypeptide:

• similar amino acid sequence*

• similar subunit structure and formation of functional heteropolymer

• immunological cross-reaction

• similar expression profile

• similar subcellular location

• similar substrate specificity

• similar response to specific inhibitors

Phenotype:

• similar mutant phenotype

• complementation of function*

Two new criteria have recently been added. Because of the accumulation of overwhelming evidence for linkage conservation among mammal and vertebrate species, conserved map position may now itself constitute an important criterion of homology, and is particularly valuable in distinguishing between members of a gene family. Complementation of function has also been added, because it is now possible to establish complementation of function by transfection across even the widest species barriers.

More recent studies have also demonstrated that some of these criteria are much more stringent than others. A strong basis for homology would be a demonstration of high DNA or amino acid sequence similarity, in addition to conservation of map position between flanking homologous markers. Less robust immunological and biochemical criteria for gene homology would need to be confirmed at least by gene position. The assumption of gene homology must be considered a working hypothesis, and may later be further confirmed when further scientific criteria are applied.

Preferred embodiments of the present invention include homologs, paralogs and orthologs of the polynucleotides, polypeptides and genes of the invention and regions thereof.

Polypeptides

The polypeptides of the invention can be prepared in any suitable manner. Such polypeptides include isolated naturally occurring polypeptides, recombinantly produced polypeptides, synthetically produced polypeptides, or polypeptides produced by a combination of these methods. Means for preparing such polypeptides are well understood in the art. See, e.g., Curr. Prot. Mol. Bio., Chapter 16.

The polypeptides may be in the form of the secreted protein, including the mature form, or may be a part of a larger protein, such as a fusion protein (see below). It is often advantageous to include an additional amino acid sequence that contains secretory or leader sequences, pro-sequences, sequences which aid in purification (such as multiple histidine residues), or an additional sequence for stability during recombinant production.

The polypeptides of the present invention are preferably provided in an isolated form, and preferably are substantially purified. A recombinantly produced version of a polypeptide, including the secreted polypeptide, can be substantially purified by the one-step method described in Smith & Johnson (Gene, 67:31-40, 1988). Polypeptides of the invention also can be purified from natural or recombinant sources using antibodies of the invention raised against the secreted protein according to methods that are well known in the art.

Signal Sequences

Methods for predicting whether a protein has a signal sequence, as well as the cleavage point for that sequence, are available. For instance, the method of McGeoch uses the information from a short N-terminal charged region and a subsequent uncharged region of the complete (uncleaved) protein (Virus Res., 3:271-286 (1985)). The method of von Heijne uses the information from the residues surrounding the cleavage site, typically residues -13 to +2, where +1 indicates the amino terminus of the secreted protein (Nucleic Acids Res., 14:4683-4690 (1986)). Therefore, from a deduced amino acid sequence, a signal sequence and mature sequence can be identified.
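The residue numbering used above (-13 to +2, with +1 as the amino terminus of the secreted protein) can be made concrete with a short sketch. The function below only extracts the window of residues around a candidate cleavage site; it does not implement the scoring step of the cited method, and the function name and arguments are hypothetical.

```python
def cleavage_window(protein, signal_length, upstream=13, downstream=2):
    """Return residues -upstream..+downstream around a candidate cleavage site.

    Position +1 is the first residue of the secreted protein, i.e.
    protein[signal_length]; position -1 is the last residue of the signal
    sequence. There is no position 0 in this numbering.
    """
    if signal_length < upstream or signal_length + downstream > len(protein):
        raise ValueError("window extends past the ends of the sequence")
    return protein[signal_length - upstream:signal_length + downstream]
```

With the default arguments the window is 15 residues long: 13 from the end of the signal sequence followed by the first 2 residues of the mature protein.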

The deduced amino acid sequence of a secreted polypeptide can be analyzed by a computer program called SignalP (Nielsen et al., Protein Engineering, 10:1-6 (1997)), which predicts the cellular location of a protein based on the amino acid sequence. As part of this computational prediction of localization, the methods of McGeoch and von Heijne are incorporated.

As one of ordinary skill in the art will appreciate, however, cleavage sites sometimes vary from organism to organism and cannot be predicted with absolute certainty. Accordingly, the present invention provides secreted polypeptides having a sequence corresponding to the translations of SEQ ID NOs: 1-74 and their corresponding genes which have an N-terminus beginning within 5 residues (i.e., + or - 5 residues) of the predicted cleavage point. Similarly, it is also recognized that in some cases, cleavage of the signal sequence from a secreted protein is not entirely uniform, resulting in more than one secreted species. These polypeptides, and the polynucleotides and genes encoding such polypeptides, are contemplated by the present invention.
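By way of example only, the set of secreted forms having an N-terminus within 5 residues of a predicted cleavage point can be enumerated as follows. The function name is hypothetical, and the predicted cleavage point is assumed to be supplied by a separate prediction step.

```python
def candidate_mature_forms(protein, predicted_signal_length, tolerance=5):
    """List mature sequences whose N-terminus begins within +/- tolerance
    residues of the predicted cleavage point.

    predicted_signal_length is the number of residues predicted to be
    removed, so the predicted mature form is protein[predicted_signal_length:].
    """
    forms = []
    for offset in range(-tolerance, tolerance + 1):
        start = predicted_signal_length + offset
        if 0 <= start < len(protein):  # skip start positions outside the protein
            forms.append(protein[start:])
    return forms
```

For a cleavage point located well inside the protein this yields 11 candidate forms, one for each offset from -5 to +5.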

Moreover, the signal sequence identified by the above analysis may not necessarily predict the naturally occurring signal sequence. For example, the naturally occurring signal sequence may be further upstream from the predicted signal sequence. However, it is likely that the predicted signal sequence will be capable of directing the secreted protein to the ER. These polypeptides, and the polynucleotides and genes encoding such polypeptides, are contemplated by the present invention.

Polynucleotide, Polypeptide and Gene Variants

Polynucleotide, polypeptide and gene variants differ from the polynucleotides, polypeptides and genes of the present invention, but retain essential properties thereof. In general, variants have close similarity overall and are identical in many regions to the polynucleotide or polypeptide of the present invention.

Further embodiments of the present invention include polynucleotides having at least 80% identity, more preferably at least 90% identity, and most preferably at least 95%, 96%, 97%, 98% or 99% identity to a sequence contained in SEQ ID NOs: 1-74. Of course, due to the degeneracy of the genetic code, one of ordinary skill in the art will immediately recognize that a large number of the polynucleotides having at least 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity, polynucleotides at least ten bases in length hybridizable to a polynucleotide comprising at least one of SEQ ID NOs: 1-74 inclusive, polynucleotides comprising at least one of SEQ ID NOs: 1-74 inclusive with sequential nucleotide deletions from either the 5' terminus or the 3' terminus, or a species homolog of polynucleotides comprising at least one of SEQ ID NOs: 1-74 inclusive will encode a polypeptide identical to an amino acid sequence contained in the translations of SEQ ID NOs: 1-74.

Further embodiments of the present invention include genes and regions thereof having at least 80% identity, more preferably at least 90% identity, and most preferably at least 95%, 96%, 97%, 98% or 99% identity to genes corresponding to a sequence contained in SEQ ID NOs: 1-74 and regions thereof. Of course, due to the degeneracy of the genetic code, one of ordinary skill in the art will immediately recognize that a large number of the genes having at least 85%, 90%, 95%, 96%, 97%, or 99% identity respectively to genes of the invention, genes hybridizable to genes of the invention, genes of the invention with sequential nucleotide deletions from either the 5' terminus or the 3' terminus, or a species homolog of genes of the invention will encode a polypeptide identical to an amino acid sequence contained in the translations of genes of the invention.

Further embodiments of the present invention also include polypeptides having at least 80% identity, more preferably at least 85% identity, more preferably at least 90% identity, and most preferably at least 95%, 96%, 97%, 98% or 99% identity to an amino acid sequence contained in translations of SEQ ID NOs: 1-74 and their corresponding genes. Preferably, the above polypeptides should exhibit at least one biological activity of the protein. In a preferred embodiment, polypeptides of the present invention include polypeptides having at least 90% similarity, more preferably at least 95% similarity, and still more preferably at least 96%, 97%, 98%, or 99% similarity to an amino acid sequence contained in translations of SEQ ID NOs: 1-74 and their corresponding genes. Methods for aligning polynucleotides, polypeptides, genes or regions thereof are codified in computer programs, including the GCG program package (Devereux et al., Nuc. Acids Res. 12:387 (1984)), BLASTP, BLASTN, FASTA (Altschul et al., J. Molec. Biol. 215:403 (1990)), and the Bestfit program (Wisconsin Sequence Analysis Package, Version 8 for Unix, Genetics Computer Group, University Research Park, 575 Science Drive, Madison, WI 53711), which uses the local homology algorithm of Smith and Waterman (Adv. in App. Math., 2:482-489 (1981)).

When using any of the sequence alignment programs to determine whether a particular sequence is, for instance, 95% identical to a reference sequence, the parameters are set such that the percentage of identity is calculated over the full length of the reference polynucleotide or gene and that gaps in identity of up to 5% of the total number of nucleotides in the reference polynucleotide or gene are allowed.

A preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence, also referred to as a global sequence alignment, uses the FASTDB computer program based on the algorithm of Brutlag et al. (Comp. App. Biosci., 6:237-245 (1990)). The term "sequence" includes nucleotide and amino acid sequences. In a sequence alignment the query and subject sequences are either both nucleotide sequences or both amino acid sequences. The result of said global sequence alignment is presented in terms of percent identity. Preferred parameters used in a FASTDB search of a DNA sequence to calculate percent identity are: Matrix=Unitary, k-tuple=4, Mismatch Penalty=1, Joining Penalty=30, Randomization Group Length=0, Cutoff Score=1, Gap Penalty=5, Gap Size Penalty=0.05, and Window Size=500 or query sequence length in nucleotide bases, whichever is shorter. Preferred parameters employed to calculate percent identity and similarity of an amino acid alignment are: Matrix=PAM 150, k-tuple=2, Mismatch Penalty=1, Joining Penalty=20, Randomization Group Length=0, Cutoff Score=1, Gap Penalty=5, Gap Size Penalty=0.05, and Window Size=500 or query sequence length in amino acid residues, whichever is shorter.
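For convenience, the preferred parameter sets recited above can be recorded in a structured form, as in the following sketch. The dictionary keys and the helper function are illustrative names chosen for this example and do not correspond to any actual FASTDB interface; only the parameter values are taken from the text.

```python
# Parameter values transcribed from the description; key names are hypothetical.
DNA_PARAMS = {
    "matrix": "Unitary",
    "k_tuple": 4,
    "mismatch_penalty": 1,
    "joining_penalty": 30,
    "randomization_group_length": 0,
    "cutoff_score": 1,
    "gap_penalty": 5,
    "gap_size_penalty": 0.05,
}

PROTEIN_PARAMS = {
    "matrix": "PAM 150",
    "k_tuple": 2,
    "mismatch_penalty": 1,
    "joining_penalty": 20,
    "randomization_group_length": 0,
    "cutoff_score": 1,
    "gap_penalty": 5,
    "gap_size_penalty": 0.05,
}

def window_size(query_length, maximum=500):
    """Window size is 500 or the query sequence length, whichever is shorter."""
    return min(maximum, query_length)
```

The two parameter sets differ only in the scoring matrix, the k-tuple and the joining penalty.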

For example, a polynucleotide having a nucleotide sequence of at least 95% "identity" to a sequence contained in SEQ ID NOs: 1-74 means that the polynucleotide is identical to a sequence contained in SEQ ID NOs: 1-74 or the cDNA except that the polynucleotide sequence may include up to five point mutations per each 100 nucleotides of the total length (not just within a given 100 nucleotide stretch). In other words, to obtain a polynucleotide having a nucleotide sequence at least 95% identical to SEQ ID NOs: 1-74, up to 5% of the nucleotides in the sequence contained in SEQ ID NOs: 1-74 or the cDNA can be deleted, inserted, or substituted with other nucleotides. These changes may occur anywhere throughout the polynucleotide.
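The arithmetic behind this definition can be illustrated with a short sketch. It assumes that a pairwise alignment has already been produced by one of the programs mentioned above and is supplied as two equal-length rows with "-" marking gaps; the function name is hypothetical.

```python
def percent_identity(aligned_reference, aligned_query):
    """Percent identity calculated over the full length of the reference.

    Every column in which the two rows differ (a substitution, an insertion
    in the query, or a deletion from the reference) counts as one change.
    """
    if len(aligned_reference) != len(aligned_query):
        raise ValueError("aligned rows must have equal length")
    reference_length = sum(1 for base in aligned_reference if base != "-")
    changes = sum(1 for r, q in zip(aligned_reference, aligned_query) if r != q)
    return 100.0 * (reference_length - changes) / reference_length
```

Under this calculation, a 100-nucleotide reference aligned to a query differing at five positions gives 95% identity, regardless of where along the sequence the five changes fall.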

Similarly, a polypeptide having an amino acid sequence having at least, for example, 95% "identity" to a reference polypeptide, means that the amino acid sequence of the polypeptide is identical to the reference polypeptide except that the polypeptide sequence may include up to five amino acid alterations per each 100 amino acids of the total length of the reference polypeptide. In other words, to obtain a polypeptide having an amino acid sequence at least 95% identical to a reference amino acid sequence, up to 5% of the amino acid residues in the reference sequence may be deleted or substituted with another amino acid, or a number of amino acids up to 5% of the total amino acid residues in the reference sequence may be inserted into the reference sequence. These alterations of the reference sequence may occur at the amino or carboxy terminal positions of the reference amino acid sequence or anywhere between those terminal positions, interspersed either individually among residues in the reference sequence or in one or more contiguous groups within the reference sequence.

The variants may contain alterations in the coding regions, non-coding regions, or both. Especially preferred are polynucleotide variants containing alterations that produce silent substitutions, additions, or deletions, but do not alter the properties or activities of the encoded polypeptide. Nucleotide variants produced by silent substitutions due to the degeneracy of the genetic code are preferred. Moreover, variants in which 5-10, 1-5, or 1-2 amino acids are substituted, deleted, or added in any combination are also preferred. Polynucleotide variants can be produced for a variety of reasons. For instance, a polynucleotide variant may be produced to optimize codon expression for a particular host (i.e., codons in the human mRNA may be changed to those preferred by a bacterial host, such as E. coli). Variants may also arise by the process of ribosomal frameshifting, by translational read-through at naturally occurring stop codons, and by decoding of in-frame translational stop codons UGA through insertion of selenocysteine (See The RNA World, 2nd edition, ed: Gesteland, R.F., Cech, T.R., & Atkins, J.F.; Cold Spring Harbor Laboratory Press, 1999).
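A silent substitution of the kind described above can be checked computationally, as in the following sketch. It assumes the standard genetic code and in-frame DNA coding sequences written with the letters A, C, G and T; it does not model frameshifting, read-through, or decoding of UGA as selenocysteine, and the function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order; "*" marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate an in-frame DNA sequence, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def is_silent_variant(reference, variant):
    """True if the variant differs in nucleotide sequence from the reference
    but encodes the identical amino acid sequence."""
    return reference != variant and translate(reference) == translate(variant)
```

For instance, replacing the leucine codon CTG with TTA leaves the encoded polypeptide unchanged, which is the kind of change used when adapting codon usage to a particular host.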

The variants may be allelic variants. Naturally occurring variants are called "allelic variants," and refer to one of several alternate forms of a gene occupying a given locus on a chromosome of an organism (Lewin, Ed., Genes II, John Wiley & Sons, New York (1985)). These allelic variants can vary at either the polynucleotide and/or polypeptide level. Alternatively, non-naturally occurring variants may be produced by mutagenesis techniques or by direct synthesis. See, e.g., Curr. Prot. Mol. Bio., Chapter 8.

Using known methods of protein engineering and recombinant DNA technology, variants may be generated to improve or alter the characteristics of the polypeptides of the present invention. For example, polypeptide variants containing amino acid substitutions of charged amino acids with other charged or neutral amino acids may produce proteins with improved characteristics, such as decreased aggregation. As known, aggregation of pharmaceutical formulations both reduces activity and increases clearance due to the aggregate's immunogenic activity (see, e.g., Pinckard et al., Clin. Exp. Immunol. 2:331-340 (1967); Robbins et al., Diabetes, 36: 838-845 (1987); Cleland et al., Crit. Rev. Therap. Drug Carrier Sys., 10:307-377 (1993)). Similarly, interferon gamma exhibited up to ten times higher activity after deleting 8-10 amino acid residues from the carboxy terminus of this protein (Dobeli et al., J. Biotechnology, 7:199-216 (1988)).

Moreover, ample evidence demonstrates that variants often retain a biological activity similar to that of the naturally occurring protein. For example, Gayle et al. conducted extensive mutational analysis of human cytokine IL-1a (J. Biol. Chem., 268:22105-22111 (1993)). These investigators used random mutagenesis to generate over 3,500 individual IL-1a mutants that averaged 2.5 amino acid changes per variant over the entire length of the molecule. Multiple mutations were examined at every possible amino acid position. The investigators concluded that "most of the molecule could be altered with little effect on either binding or biological activity." In fact, only 23 unique amino acid sequences, out of more than 3,500 amino acid sequences examined, produced a protein that differed significantly in activity from the wild-type sequence. Another experiment demonstrated that one or more amino acids can be deleted from the N-terminus or C-terminus of the secreted protein without substantial loss of biological function. Ron et al. reported variant KGF proteins having heparin binding activity even after deleting 3, 8, or 27 amino-terminal amino acid residues (J. Biol. Chem. 268: 2984-2988 (1993)).

Furthermore, even if deleting one or more amino acids from the N-terminus or C-terminus of a polypeptide results in modification or loss of one or more biological functions, other biological activities may still be retained. For example, the ability of a deletion variant to induce and/or to bind antibodies which recognize the secreted form will likely be retained when less than the majority of the residues of the secreted form are removed from the N-terminus or C-terminus. Whether a particular polypeptide lacking N- or C-terminal residues of a protein retains such immunogenic activities can readily be determined by routine methods described herein and otherwise known in the art.

Thus, the invention further includes polypeptide variants that show substantial biological activity. Such variants include deletions, insertions, inversions, repeats, frameshifting, read-through translational variants, and substitutions selected according to general rules known in the art so as to have little effect on activity. For example, guidance concerning how to make phenotypically silent amino acid substitutions is provided in Bowie et al., Science, 247:1306-1310 (1990), wherein the authors indicate that there are two main strategies for studying the tolerance of an amino acid sequence to change.

The first strategy exploits the tolerance of amino acid substitutions by natural selection during the process of evolution. By comparing amino acid sequences in different species, the amino acid positions that have been conserved between species can be identified. These conserved amino acids are likely important for protein function. In contrast, the amino acid positions in which substitutions have been tolerated by natural selection indicate positions which are not critical for protein function. Thus, positions tolerating amino acid substitution may be modified while still maintaining biological activity of the protein.

The second strategy uses genetic engineering to introduce amino acid changes at specific positions of a cloned gene to identify regions critical for protein function. For example, site-directed mutagenesis or alanine-scanning mutagenesis (the introduction of single alanine mutations at every residue in the molecule) can be used (Cunningham et al., Science, 244:1081-1085 (1989)).

The resulting mutant molecules can then be tested for biological activity. According to Bowie et al., these two strategies have revealed that proteins are surprisingly tolerant of amino acid substitutions. The authors further indicate which amino acid changes are likely to be permissive at certain amino acid positions in the protein. For example, the most buried or interior (within the tertiary structure of the protein) amino acid residues require nonpolar side chains, whereas few features of surface or exterior side chains are generally conserved. Moreover, tolerated conservative amino acid substitutions involve replacement of the aliphatic or hydrophobic amino acids Ala, Val, Leu and Ile; replacement of the hydroxyl residues Ser and Thr; replacement of the acidic residues Asp and Glu; replacement of the amide residues Asn and Gln; replacement of the basic residues Lys, Arg, and His; replacement of the aromatic residues Phe, Tyr, and Trp; and replacement of the small-sized amino acids Ala, Ser, Thr, Met, and Gly.
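The conservative substitution groups listed above can be expressed as a simple lookup. The following is a minimal sketch, assuming one-letter amino acid codes; the name is_conservative is hypothetical and introduced only for illustration. It tests group membership only and makes no prediction about the activity of any particular variant.

```python
# Substitution groups as listed in the text (one-letter amino acid codes).
CONSERVATIVE_GROUPS = [
    set("AVLI"),   # aliphatic or hydrophobic: Ala, Val, Leu, Ile
    set("ST"),     # hydroxyl: Ser, Thr
    set("DE"),     # acidic: Asp, Glu
    set("NQ"),     # amide: Asn, Gln
    set("KRH"),    # basic: Lys, Arg, His
    set("FYW"),    # aromatic: Phe, Tyr, Trp
    set("ASTMG"),  # small: Ala, Ser, Thr, Met, Gly
]


def is_conservative(original, replacement):
    """True if two distinct residues share at least one group above."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(original in g and replacement in g for g in CONSERVATIVE_GROUPS)
```

For example, a Leu to Ile change falls within the aliphatic group, whereas an Asp to Lys change crosses from the acidic group to the basic group and is not conservative under these rules.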

Besides conservative amino acid substitution, variants of the present invention include: (i) substitutions with one or more of the non-conserved amino acid residues, where the substituted amino acid residues may or may not be one encoded by the genetic code; (ii) substitution with one or more of amino acid residues having a substituent group; (iii) fusion of the mature polypeptide with another compound, such as a compound to increase the stability and/or solubility of the polypeptide (e.g., polyethylene glycol); or (iv) fusion of the polypeptide with additional amino acids, such as an IgG Fc fusion region peptide, a leader or secretory sequence, or a sequence facilitating purification. Such variant polypeptides are deemed to be within the scope of those skilled in the art from the teachings herein.

Polynucleotide and Polypeptide Fragments

In the present invention, a "polynucleotide fragment" and "region of a gene" refers to a short polynucleotide having a nucleic acid sequence contained in SEQ ID NOs: 1-74. The short nucleotide fragments are preferably at least about 15 nucleotides (nt), and more preferably at least about 20 nt, still more preferably at least about 30 nt, and even more preferably, at least about 40 nt in length. A fragment "at least 20 nt in length," for example, is intended to include 20 or more contiguous bases from the cDNA sequence contained in that shown in SEQ ID NOs: 1-74. These nucleotide fragments are useful as diagnostic probes and primers as discussed herein. Of course, larger fragments (e.g., 50, 150, and greater than 150 nucleotides) are preferred.

Moreover, representative examples of polynucleotide fragments of the invention include, for example, fragments having a sequence from about nucleotide number 1-50, 51-100, 101-150, 151-200, 201-250, 251-300, 301-350, 351-400, 401-450, to the end of SEQ ID NOs: 1-74. In this context "about" includes the particularly recited ranges, larger or smaller by several nucleotides (i.e., 5, 4, 3, 2, or 1 nt) at either terminus or at both termini. Preferably, these fragments encode a polypeptide that has biological activity.
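The recited fragment ranges follow a regular windowing pattern, which can be sketched as below. The function name fragment_ranges and its parameters are hypothetical; coordinates are 1-based and inclusive, matching the numbering used in the text, and the final window simply runs to the end of the sequence.

```python
def fragment_ranges(length, size=50):
    """Return 1-based inclusive (start, end) windows covering a sequence."""
    return [(start, min(start + size - 1, length))
            for start in range(1, length + 1, size)]


# A 120-nt sequence split into 50-nt windows; the last window is shorter.
nucleotide_windows = fragment_ranges(120, size=50)

# The same helper with a 20-residue window, as for a 70-residue polypeptide.
peptide_windows = fragment_ranges(70, size=20)
```

The "about" tolerance of 1 to 5 positions at either terminus is not modeled here; it could be layered on by shifting each start or end coordinate.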

In the present invention, a "polypeptide fragment" refers to a short amino acid sequence contained in the translations of SEQ ID NOs: 1-74. Protein fragments may be "free-standing," or comprised within a larger polypeptide of which the fragment forms a part or region, most preferably as a single continuous region. Representative examples of polypeptide fragments of the invention include, for example, fragments from about amino acid number 1-20, 21-40, 41-60, or 61 to the end of the coding region. Moreover, polypeptide fragments can be about 20, 30, 40, 50, or 60 amino acids in length. In this context "about" includes the particularly recited ranges, larger or smaller by several amino acids (5, 4, 3, 2, or 1) at either extreme or at both extremes.

In situations where a DST of the present invention is not a translatable polypeptide, i.e., where the DST is in whole or in part of the 3' untranslated region of its corresponding gene, the translation product or region of the translation product of the gene corresponding to the DST is intended to be encompassed by the terms "polypeptide" or "polypeptide fragment" as used herein.

Preferred polypeptide fragments include the secreted protein as well as the mature form. Further preferred polypeptide fragments include the secreted protein or the mature form having a continuous series of deleted residues from the amino or the carboxy terminus, or both. For example, any number of amino acids ranging from 1-60 can be deleted from the amino terminus of either the secreted polypeptide or the mature form. Similarly, any number of amino acids ranging from 1-30 can be deleted from the carboxy terminus of the secreted protein or mature form. Furthermore, any combination of the above amino and carboxy terminus deletions is preferred. Similarly, polynucleotide fragments encoding these polypeptide fragments are also preferred.

Also preferred are polypeptide and polynucleotide fragments characterized by structural or functional domains, such as fragments that comprise alpha-helix and alpha-helix-forming regions, beta-sheet and beta-sheet-forming regions, turn and turn-forming regions, coil and coil-forming regions, hydrophilic regions, hydrophobic regions, alpha amphipathic regions, beta amphipathic regions, flexible regions, surface-forming regions, substrate binding region, and high antigenic index regions. Polypeptide fragments of the translations of SEQ ID NOs: 1-74 and their corresponding genes falling within conserved domains are specifically contemplated by the present invention. Moreover, polynucleotide fragments encoding these domains are also contemplated.

Other preferred fragments are biologically active fragments or the polynucleotide or gene encoding biologically active fragments. Biologically active fragments are those exhibiting activity similar, but not necessarily identical, to an activity of the polypeptide of the present invention. The biological activity of the fragments may include an improved desired activity, or a decreased undesirable activity.

Epitopes And Antibodies Or Binding Partners To Them

Fragments which function as epitopes may be produced by any conventional means. (See, e.g., Houghten, R. A., Proc. Natl. Acad. Sci. USA, 82:5131-5135 (1985), further described in U.S. Patent No. 4,631,211).

In the present invention, immunogenic epitopes preferably contain a sequence of at least seven, more preferably at least nine, and most preferably between about 15 and about 30 amino acids. Antigenic epitopes are useful to raise antibodies, including monoclonal antibodies, which specifically bind the epitope. (See, e.g., Wilson et al., Cell, 37:767-778 (1984); Sutcliffe et al., Science, 219:660-666 (1983)).

Similarly, immunogenic epitopes can be used to induce antibodies or to select binding partners according to methods well known in the art. (See, e.g., Sutcliffe et al., (1983) supra; Wilson et al., (1984) supra; Chow et al., Proc. Natl. Acad. Sci., USA, 82:910-914; and Bittle et al., J. Gen. Virol., 66:2347-2354 (1985)). A preferred immunogenic epitope includes the secreted protein. The immunogenic epitope may be presented together with a carrier protein, such as an albumin, to an animal system (such as rabbit or mouse). Alternatively, the immunogenic epitope may be presented without a carrier, if the sequence is of sufficient length (at least about 25 amino acids). However, immunogenic epitopes comprising as few as 8 to 10 amino acids have been shown to be sufficient to raise antibodies capable of binding to, at the very least, linear epitopes in a denatured polypeptide (e.g., in Western blotting).

As used herein, the term "antibody" (Ab) or "monoclonal antibody" (mAb) is meant to include intact molecules as well as antibody fragments (such as, for example, Fab and F(ab')2 fragments) which are capable of specifically binding to protein. Fab and F(ab')2 fragments lack the Fc fragment of intact antibody, clear more rapidly from the circulation, and may have less nonspecific tissue binding than an intact antibody (Wahl et al., J. Nucl. Med., 24:316-325, 1983). Thus, these fragments are preferred, as well as the products of a Fab or other immunoglobulin expression library. Moreover, antibodies of the present invention include chimeric, single chain, and human and humanized antibodies.

The antibodies may be chimeric antibodies, e.g., humanized versions of murine monoclonal antibodies. Such humanized antibodies may be prepared by known techniques, and offer the advantage of reduced immunogenicity when the antibodies are administered to humans. See, e.g., Co et al., Nature, 351:501-2 (1991). In one embodiment, a humanized monoclonal antibody comprises the variable region of a murine antibody (or just the antigen binding site thereof) and a constant region derived from a human antibody. Alternatively, a humanized antibody fragment may comprise the antigen binding site of a murine monoclonal antibody and a variable region fragment (lacking the antigen-binding site) derived from a human antibody. Procedures for the production of chimeric and further engineered monoclonal antibodies include those described in Riechmann et al., Nature, 332:323, 1988; Liu et al., PNAS, 84:3439, 1987; Larrick et al., Bio/Technology, 7:934, 1989; Winter and Harris, TIPS, 14:139, May, 1993; Zou et al., Science 262:1271-4, 1993; Zou et al., Curr. Biol., 4:1099-103, 1994; and Walls et al., Nucleic Acids Res., 21:2921-9, 1993.

One method for producing a human antibody comprises immunizing a non-human animal, such as a transgenic mouse, with a polypeptide translated from a nucleotide sequence chosen from SEQ ID NOs: 1-74 or their corresponding genes, whereby antibodies directed against the polypeptide translated from a nucleotide sequence chosen from SEQ ID NOs: 1-74 or their corresponding genes are generated in said animal. Procedures have been developed for generating human antibodies in non-human animals. The antibodies may be partially human, or preferably completely human. For example, mice have been prepared in which one or more endogenous immunoglobulin genes are inactivated by various means and human immunoglobulin genes are introduced into the mice to replace the inactivated mouse genes. Such transgenic mice may be genetically altered in a variety of ways. The genetic manipulation may result in human immunoglobulin polypeptide chains replacing endogenous immunoglobulin chains in at least some (preferably virtually all) antibodies produced by the animal upon immunization. Examples of techniques for production and use of such transgenic animals are described in U.S. Patent Nos. 5,814,318, 5,569,825, and 5,545,806, which are incorporated by reference herein. Antibodies produced by immunizing transgenic animals with a polypeptide translated from a nucleotide sequence chosen from SEQ ID NOs: 1-74 or their corresponding genes and methods of using such antibodies are provided herein.

Monoclonal antibodies may be produced by conventional procedures, e.g., by immortalizing spleen cells harvested from the transgenic animal after completion of the immunization schedule. The spleen cells may be fused with myeloma cells to produce hybridomas by conventional procedures. Examples of such techniques are described in U.S. Patent No. 4,196,265, which is incorporated by reference herein.

A method for producing a hybridoma cell line comprises immunizing such a transgenic animal with an immunogen comprising at least seven contiguous amino acid residues of a polypeptide translated from a nucleotide sequence chosen from SEQ ID NOs: 1-74 or their corresponding genes; harvesting spleen cells from the immunized animal; fusing the harvested spleen cells to a myeloma cell line, thereby generating hybridoma cells; and identifying a hybridoma cell line that produces a monoclonal antibody that binds a polypeptide translated from a nucleotide sequence chosen from SEQ ID NOs: 1-74 or their corresponding genes. Such hybridoma cell lines, and monoclonal antibodies produced therefrom, are encompassed by the present invention. Monoclonal antibodies secreted by the hybridoma cell line are purified by conventional techniques. Examples of such techniques are described in U.S. Patent No. 4,469,630 and U.S. Patent No. 4,361,549.

Antibodies are only one example of binding partners to epitopes or receptor molecules. Other examples include, but are not limited to, synthetic peptides, which can be selected as a binding partner to an epitope or receptor molecule. The peptide may be selected from a peptide library as described by Appel et al., Biotechniques, 13, 901-905; and Dooley et al., J. Biol. Chem. 273, 18848-18856, 1998.

Binding assays can select for those binding partners (antibody, synthetic peptide, or other molecule) with highest affinity for the epitope or receptor molecule, using methods known in the art. Such assays may be done by immobilizing the epitope or receptor on a solid support, allowing binding of the library of antibodies or other molecules, and washing away those molecules with little or no affinity. Those binding partners or antibodies with highest affinity for the epitope or receptor will remain bound to the solid support. Alternatively, arrays of candidate binding partners may be immobilized, and a labeled soluble receptor molecule is allowed to interact with the array, followed by washing away unbound receptors. High affinity binding is detectable by the presence of bound label.

Antibodies or other binding partners may be employed in an in vitro procedure, or administered in vivo to inhibit biological activity induced by a polypeptide translated from a nucleotide sequence chosen from SEQ ID NOs: 1-74 or their corresponding genes. Disorders caused or exacerbated (directly or indirectly) by the interaction of such polypeptides of the present invention with cell surface receptors thus may be treated. A therapeutic method involves in vivo administration of a blocking antibody to a mammal in an amount effective for reducing a biological activity induced by a polypeptide translated from a nucleotide sequence chosen from SEQ ID NOs: 1-74 or their corresponding genes. An example of a case where antibody administration in vivo could be used to inhibit a biological activity induced by a polypeptide translated from a polynucleotide sequence is as follows. In cancer, genes called oncogenes or proto-oncogenes are often overexpressed, providing cancer cells with a growth advantage over normal cells. For example, an antibody that could bind to a dysregulated, constitutively active cell surface growth receptor and inhibit further signaling by this receptor would be useful in reestablishing normal cellular growth.

Generally, antibodies or binding partners to receptors or cell surface polypeptides can also be linked to moieties, such as, for example, drug-loaded particles, antigens, DNA vaccines, immune modulators, other peptides, proteins for specific binding, and the like, to target those moieties to the cells and enhance their delivery. Exemplary vaccines that can be specifically targeted to particular cells include, but are not limited to, rotavirus, influenza, diphtheria, tetanus, pertussis, Hepatitis A, B and C, as well as conjugate vaccines, including S. pneumoniae. Similarly, exemplary drugs that may be specifically targeted to particular cells include, but are not limited to, insulin, LHRH, buserelin, vasopressin and recombinant interleukins, such as IL-2 and IL-12. Additionally, exemplary vectors, such as, for example, adeno-associated virus, canarypox virus, adenovirus, retrovirus, and other delivery vehicles, such as, for example, liposomes and PLGA may be used to specifically target therapeutic moieties, such as, for example, IL-1 antagonist, GM-CSF antagonists, and the like, to particular cells. As is apparent to one skilled in the art, numerous other vaccines, drugs, and vectors may be useful in targeting and delivering therapeutic agents to particular cells.

Also provided herein are conjugates comprising a detectable (e.g., diagnostic) or therapeutic agent, attached to an antibody directed against a polypeptide translated from a nucleotide sequence chosen from SEQ ID NOs: 1-74 or their corresponding genes. Examples of such agents are well known, and include, but are not limited to diagnostic radionuclides, therapeutic radionuclides, and cytotoxic drugs. See, e.g., Thrush et al., Annu. Rev. Immunol., 14:49-71, 1996. The conjugates may be useful in in vitro or in vivo procedures.

Fusion Proteins

Any polypeptide of the present invention can be used to generate fusion proteins. For example, the polypeptides of the present invention, when fused to a second protein, can be used as an antigenic tag. Antibodies raised against the polypeptides of the present invention can be used to indirectly detect the second protein by binding to the polypeptide. Moreover, because secreted proteins target cellular locations based on trafficking signals, the polypeptides of the present invention can be used as targeting molecules once fused to other proteins.

Examples of domains that can be fused to polypeptides of the present invention include not only heterologous signal sequences, but also other heterologous functional regions. The fusion does not necessarily need to be direct, but may occur through linker sequences.

Moreover, fusion proteins may also be engineered to improve characteristics of the polypeptide of the present invention. For instance, a region of additional amino acids, particularly charged amino acids, may be added to the N-terminus of the polypeptide to improve stability and persistence during purification from the host cell or subsequent handling and storage. Also, peptide moieties may be added to the polypeptide to facilitate purification. Such regions may be removed prior to final preparation of the polypeptide. The addition of peptide moieties to facilitate handling of polypeptides is a familiar and routine technique in the art.

In addition, polypeptides of the present invention, including fragments and, specifically, epitopes, can be combined with parts of the constant domain of immunoglobulins (IgG), resulting in chimeric polypeptides. These fusion proteins facilitate purification and show an increased half-life in vivo. One reported example describes chimeric proteins consisting of the first two domains of the human CD4-polypeptide and various domains of the constant regions of the heavy or light chains of mammalian immunoglobulins (EP A 394,827; Traunecker et al., Nature, 331:84-86, 1988). Fusion proteins having disulfide-linked dimeric structures (due to the IgG) can also be more efficient in binding and neutralizing other molecules than the monomeric secreted protein or protein fragment alone (Fountoulakis et al., J. Biochem., 270:3958-3964 (1995)).

Similarly, EP A 0 464 533 (Canadian counterpart 2045869) discloses fusion proteins comprising various portions of constant region of immunoglobulin molecules together with another human protein or part thereof. In many cases, the Fc part in a fusion protein is beneficial in therapy and diagnosis, and thus can result in, for example, improved pharmacokinetic properties (see, e.g., EP A 0 232 262). Alternatively, deleting the Fc part after the fusion protein has been expressed, detected, and purified, would be desired. For example, the Fc portion may hinder therapy and diagnosis if the fusion protein is used as an antigen for immunizations. In drug discovery, for example, human proteins, such as hIL-5, have been fused with Fc portions for the purpose of high-throughput screening assays to identify antagonists of hIL-5 (See, Bennett et al., J. Mol. Recognition 8:52-58 (1995); Johanson et al., J. Biol. Chem., 270:9459-9471,1995).

Moreover, the polypeptides of the present invention can be fused to marker sequences, such as a peptide that facilitates purification of the fused polypeptide. In preferred embodiments, the marker amino acid sequence is a hexa-histidine peptide, such as the tag provided in a pQE vector (QIAGEN, Inc., Chatsworth, CA), among others, many of which are commercially available. As described in Gentz et al., for instance, hexa-histidine provides for convenient purification of the fusion protein (Proc. Natl. Acad. Sci. USA 86:821-824 (1989)). Another peptide tag useful for purification, the "HA" tag, corresponds to an epitope derived from the influenza hemagglutinin protein (Wilson et al., Cell, 37:767 (1984)). Other fusion proteins may use the ability of the polypeptides of the present invention to target the delivery of a biologically active peptide. This might include focused delivery of a toxin to tumor cells, or a growth factor to stem cells.

Thus, any of the above fusions can be engineered using the polynucleotides or the polypeptides of the present invention. See, e.g., Curr. Prot. Mol. Bio., Chapter 9.6.

Vectors, Host Cells, and Protein Production

The present invention also relates to vectors containing the polynucleotide or gene of the present invention or regions thereof, host cells, and the production of polypeptides by recombinant techniques. The vector may be, for example, a phage, plasmid, viral, or retroviral vector. Retroviral vectors may be replication competent or replication defective. In the latter case, viral propagation generally will occur only in complementing host cells.

The polynucleotides, genes or regions thereof may be joined to a vector containing a selectable marker for propagation in a host. Generally, a plasmid vector is introduced in a precipitate, such as a calcium phosphate precipitate, or in a complex with a charged lipid. If the vector is a virus, it may be packaged in vitro using an appropriate packaging cell line and then transduced into host cells. See, e.g., Curr. Prot. Mol. Bio., Chapters 9.9, 16.15.

The polynucleotide or gene or gene region insert should be operatively linked to an appropriate promoter, such as the phage lambda PL promoter, the E. coli lac, trp, phoA and tac promoters, the SV40 early and late promoters and promoters of retroviral LTRs, to name a few. Other suitable promoters will be known to the skilled artisan. The expression constructs will further contain sites for transcription initiation, termination, and, in the transcribed region, a ribosome binding site for translation. The coding portion of the transcripts expressed by the constructs will preferably include a translation initiating codon at the beginning and a termination codon (UAA, UGA or UAG) appropriately positioned at the end of the polypeptide to be translated.
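The requirement that the coding portion begin with an initiating codon and end with an appropriately positioned termination codon can be checked with a short sketch. The function name has_complete_orf is hypothetical, and the check assumes AUG (ATG in DNA) as the initiating codon; it does not examine promoters, ribosome binding sites, or other construct elements.

```python
STOP_CODONS = {"TAA", "TGA", "TAG"}  # DNA forms of UAA, UGA, and UAG


def has_complete_orf(coding):
    """Check for ATG at the start, one in-frame stop at the end, none inside."""
    seq = coding.upper().replace("U", "T")  # accept RNA or DNA letters
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    if not seq.startswith("ATG") or seq[-3:] not in STOP_CODONS:
        return False
    # No premature in-frame termination codon between start and final stop.
    return all(seq[i:i + 3] not in STOP_CODONS
               for i in range(3, len(seq) - 3, 3))
```

A sequence that is out of frame, lacks a terminal stop codon, or contains a premature in-frame stop codon is rejected by this check.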

As indicated, the expression vectors will preferably include at least one selectable marker. Such markers include dihydrofolate reductase, G418 or neomycin resistance for eukaryotic cell culture and tetracycline, kanamycin or ampicillin resistance genes for culturing in E. coli and other bacteria. Representative examples of appropriate hosts include, but are not limited to, bacterial cells, such as E. coli, Streptomyces and Salmonella typhimurium cells; fungal cells, such as yeast cells; insect cells such as Drosophila S2 and Spodoptera Sf9 cells; animal cells such as CHO, COS, 293, and Bowes melanoma cells, and plant cells. Appropriate culture mediums and conditions for the above-described host cells are known in the art.

Vectors preferred for use in bacteria include pQE70, pQE60 and pQE-9, available from QIAGEN, Inc.; pBluescript vectors, Phagescript vectors, pNH8A, pNH16A, pNH18A, pNH46A, available from Stratagene Cloning Systems, Inc.; and ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 available from Pharmacia Biotech, Inc. Among preferred eukaryotic vectors are pWLNEO, pSV2CAT, pOG44, pXT1 and pSG available from Stratagene; and pSVK3, pBPV, pMSG and pSVL available from Pharmacia. Other suitable vectors will be readily apparent to the skilled artisan.

Introduction of the construct into the host cell can be effected by calcium phosphate transfection, DEAE-dextran mediated transfection, cationic lipid-mediated transfection, electroporation, transduction, infection, or other methods. Such methods are described in many standard laboratory manuals, such as Davis et al., Basic Methods In Molecular Biology (1986). It is specifically contemplated that the polypeptides of the present invention may, in fact, be expressed by a host cell lacking a recombinant vector.

A polypeptide of this invention can be recovered and purified from recombinant cell cultures by well-known methods, including ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction chromatography, affinity chromatography, hydroxylapatite chromatography, and lectin chromatography. Most preferably, high performance liquid chromatography ("HPLC") is employed for purification.

Depending upon the host employed in a recombinant production procedure, the polypeptides of the present invention may be glycosylated or may be non-glycosylated. In addition, polypeptides of the invention may also include an initial modified methionine residue, in some cases as a result of host-mediated processes. Thus, it is well known in the art that the N-terminal methionine encoded by the translation initiation codon generally is removed with high efficiency from any protein after translation in all eukaryotic cells. While the N-terminal methionine on most proteins also is efficiently removed in most prokaryotes, for some proteins, this prokaryotic removal process is inefficient, depending on the nature of the amino acid to which the N-terminal methionine is covalently linked.

Polypeptides of the present invention, and preferably the secreted form, can also be recovered from products purified from natural sources, including bodily fluids, tissues and cells, whether directly isolated or cultured; products of chemical synthetic procedures; and products produced by recombinant techniques from a prokaryotic or eukaryotic host, including, for example, bacterial, yeast, higher plant, insect, and mammalian cells.

Other Uses of the Polynucleotides of the Invention

Each of the polynucleotides and genes of the present invention and regions thereof identified herein can be used in numerous ways as reagents. The following description should be considered exemplary and utilizes known techniques.

The polynucleotides and genes of the present invention and regions thereof are useful for chromosome identification. There exists an ongoing need to identify new chromosome markers, since few chromosome marking reagents based on actual sequence data (repeat polymorphisms) are presently available. Each polynucleotide of the present invention can be used as a chromosome marker. In terms of cancer in general, no diagnostic markers exist that can be used to prevent or delay the carcinogenic process. Several markers, such as specific chromosome aberrations (e.g., fusion products), do exist as general indicators or hallmarks of either the tissue origin of the cancer or markers of the aggressive nature of the disease (i.e., metastatic potential). The polynucleotides of the present invention could be used as chromosome markers for diagnosis of cancer or as a gauge of the stage of the disease.

Briefly, sequences can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp) from the sequences shown in SEQ ID NOs: 1-74 or their corresponding genes or regions thereof. Primers can be selected using computer analysis so that primers do not span more than one predicted exon in the genomic DNA. These primers may then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the SEQ ID NOs: 1-74 or their corresponding genes or regions thereof will yield an amplified fragment. Similarly, somatic hybrids provide a rapid method of PCR mapping the polynucleotides to particular chromosomes. Moreover, sublocalization of the polynucleotides can be achieved with panels of specific chromosome fragments. Other gene-mapping strategies that can be used include in situ hybridization, prescreening with labeled flow-sorted chromosomes, and preselection by hybridization to construct chromosome specific-cDNA libraries.
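The preparation of a primer pair in the preferred 15-25 bp range can be sketched as follows. This is a deliberately simplified illustration: the names reverse_complement and end_primers are hypothetical, the primers are simply taken from the two ends of a template, and the exon-boundary analysis mentioned above, as well as melting temperature and specificity checks, are not modeled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def end_primers(template, length=20):
    """Pick a forward and a reverse primer from the ends of a template."""
    if not 15 <= length <= 25:
        raise ValueError("primer length should be 15-25 bp")
    if len(template) < 2 * length:
        raise ValueError("template too short for non-overlapping primers")
    forward = template[:length]                       # matches the top strand
    reverse = reverse_complement(template[-length:])  # anneals to the top strand
    return forward, reverse


# Artificial 40-nt template used only to make the result easy to inspect.
template = "A" * 15 + "C" * 10 + "G" * 15
forward, reverse = end_primers(template, length=15)
```

The reverse primer is the reverse complement of the 3' end of the template, so that the pair brackets the region to be amplified.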

Precise chromosomal location of the polynucleotides, genes of the invention or regions thereof can also be achieved using fluorescence in situ hybridization (FISH) of a metaphase chromosomal spread. This technique uses polynucleotides as short as 500 or 600 bases; however, polynucleotides of 2,000-4,000 bp are preferred. For a review of this technique, see Verma et al., Human Chromosomes: a Manual of Basic Techniques, Pergamon Press, New York (1988).

For chromosome mapping, the polynucleotides, genes of the invention or regions thereof can be used individually (to mark a single chromosome or a single site on that chromosome) or in panels (for marking multiple sites and/or multiple chromosomes). Preferred polynucleotides correspond to the noncoding regions of the cDNAs because the coding sequences are more likely conserved within gene families, thus increasing the chance of cross-hybridization during chromosomal mapping.

Once a polynucleotide, gene of the invention or region thereof has been mapped to a precise chromosomal location, the physical position of the polynucleotide, gene or region thereof can be used in linkage analysis. Linkage analysis establishes coinheritance between a chromosomal location and presentation of a particular disease. Disease mapping data are found, for example, in V. McKusick, Mendelian Inheritance in Man (available on line through Johns Hopkins University Welch Medical Library); Kruglyak et al. (Am. J. Hum. Genet., 56:1212-23, 1995); Curr. Prot. Hum. Genet. Assuming one megabase mapping resolution and one gene per 20 kb, a cDNA precisely localized to a chromosomal region associated with the disease could be one of 50-500 potential causative genes. Thus, once coinheritance is established, differences in the polynucleotide and the corresponding gene or region thereof between affected and unaffected individuals can be examined.
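The candidate-gene estimate above follows from simple division. The sketch below is illustrative only; the function name is hypothetical, and reading the upper figure of 500 as a 10 Mb region at the same gene density is an assumption, not something stated in the text.

```python
def candidate_gene_count(region_bp: int, bp_per_gene: int = 20_000) -> int:
    """Number of genes expected in a mapped region at a uniform gene density.

    bp_per_gene defaults to the one-gene-per-20-kb figure used in the text.
    """
    if region_bp <= 0 or bp_per_gene <= 0:
        raise ValueError("region_bp and bp_per_gene must be positive")
    return region_bp // bp_per_gene


# One megabase of mapping resolution gives the lower bound quoted in the text.
print(candidate_gene_count(1_000_000))   # 50
# Assumption: a ten-fold coarser region (10 Mb) gives the upper bound.
print(candidate_gene_count(10_000_000))  # 500
```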

The polynucleotides of SEQ ID NOs: 1-74 and their corresponding genes or regions thereof can be used for this analysis of individuals. In terms of cancer in general, genetics clearly plays a role in the cancer process, either directly or indirectly through gene-environment interactions. For example, in the case of familial cancers, individuals that inherit specific genetic variations may be predisposed to specific cancers. Further, in the case of cancer in the general population, there is evidence that genetic variations (i.e., polymorphisms) may cause certain individuals to be more sensitive or susceptible to environmental, occupational, or other exposures (e.g., cigarette smoking), and thus this genetic variation may be a risk factor in the cancer process. Identification of the presence of specific polynucleotides might serve to predict individuals that may be at risk of specific cancers, and also to predict individuals who might be sensitive to environmental or occupational exposures. Thus, identification of such polynucleotides could serve as biomarkers to enhance risk assessments, focus cancer prevention strategies, as well as elucidate carcinogenic mechanisms.

First, visible structural alterations in the chromosomes, such as deletions or translocations, are examined in chromosome spreads or by PCR. If no structural alterations exist, the presence of point mutations is ascertained. Mutations observed in some or all affected individuals, but not in normal individuals, indicate that the mutation may cause the disease. However, complete sequencing of the polypeptide and the corresponding gene from several normal individuals is required to distinguish the mutation from a polymorphism. If a new polymorphism is identified, this polymorphic polypeptide can be used for further linkage analysis.

Furthermore, increased or decreased expression of the gene in affected individuals as compared to unaffected individuals can be assessed using polynucleotides or genes of the present invention or regions thereof. Any of these alterations (altered expression, chromosomal rearrangement, or mutation) can be used as a diagnostic or prognostic marker.

In addition to the foregoing, a polynucleotide or gene of the invention or regions thereof can be used to control gene expression through triple helix formation or antisense DNA or RNA.

Both methods rely on binding of the polynucleotide or gene or gene region to DNA or RNA. For these techniques, preferred polynucleotides are usually 20 to 40 bases in length and complementary to either the region of the gene involved in transcription (see, Lee et al., Nuc. Acids Res., 6:3073 (1979); Cooney et al., Science, 241:456 (1988); and Dervan et al., Science, 251:1360 (1991) for discussion of triple helix formation) or to the mRNA itself (see, Okano, J. Neurochem, 56:560 (1991); and Oligodeoxy-nucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, FL (1988) for a discussion of antisense technique). Triple helix formation optimally results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Both techniques are effective in model systems, and the information disclosed herein can be used to design antisense or triple helix polynucleotides in an effort to treat disease. For example, in cases where expression of the gene leads to increased proliferation, which could give tumor cells a growth advantage over normal cells, or in cases where increased expression leads to increased invasiveness, such as the granzyme B-granzyme C gene fusion products (SEQ ID NOs: 70-74), a method to control such gene expression would inhibit the growth and metastasis of tumor cells.

Other Uses of the Polypeptides and Antibodies of the Invention

Each of the polypeptides identified herein can be used in numerous ways. The following description should be considered exemplary and utilizes known techniques.

A polypeptide of the present invention can be used to assay protein levels in a biological sample using antibody-based techniques. For example, protein expression in tissues can be studied with classical immunohistological methods (Jalkanen et al., J. Cell. Biol., 101:976-985, 1985; Jalkanen et al., J. Cell. Biol., 105:3087-3096, 1987). Other antibody-based methods useful for detecting protein gene expression include immunoassays, such as the enzyme linked immunosorbent assay (ELISA) and the radioimmunoassay (RIA). See, e.g., Curr. Prot. Mol. Bio., Chapter 11. Suitable antibody assay labels are known in the art and include enzyme labels, such as glucose oxidase; radioisotopes, such as iodine (¹²⁵I, ¹²¹I), carbon (¹⁴C), sulfur (³⁵S), tritium (³H), indium (¹¹²In), and technetium (⁹⁹ᵐTc); fluorescent labels, such as fluorescein and rhodamine; and organic moieties, such as biotin.

In addition to assaying secreted protein levels in a biological sample, proteins can also be detected in vivo by imaging. Antibody labels or markers for in vivo imaging of protein include those detectable by X-radiography, nuclear magnetic resonance (NMR), or electron spin resonance (ESR). For X-radiography, suitable labels include radioisotopes, such as barium or cesium, which emit detectable radiation but are not overtly harmful to the subject. Suitable markers for NMR and ESR include those with a detectable characteristic spin, such as deuterium, which may be incorporated into the antibody by labeling of nutrients for the relevant hybridoma.

A protein-specific antibody or antibody fragment that has been labeled with an appropriate detectable imaging moiety, such as a radioisotope (e.g., ¹³¹I, ¹¹²In, ⁹⁹ᵐTc), a radio-opaque substance, or a material detectable by NMR, is introduced (e.g., parenterally, subcutaneously, or intraperitoneally) into the mammal. It will be understood in the art that the size of the subject and the imaging system used will determine the quantity of imaging moiety needed to produce diagnostic images. In the case of a radioisotope moiety, the quantity of radioactivity necessary for a human subject will normally range from about 5 to 20 millicuries of ⁹⁹ᵐTc. The labeled antibody or antibody fragment will then preferentially accumulate at the location of cells which contain the specific protein. In vivo tumor imaging is described in Burchiel et al., "Immunopharmacokinetics of Radiolabeled Antibodies and Their Fragments" (Chapter 13 in Tumor Imaging: The Radiochemical Detection of Cancer, Burchiel and Rhodes, Eds., Masson Publishing Inc. (1982)).

Thus, the invention provides a method of diagnosing a pathological condition, which involves (a) assaying the expression of a polypeptide of the present invention in cells or body fluid of an individual; and (b) comparing the level of gene expression with a standard gene expression level, whereby an increase or decrease in the assayed polypeptide gene expression level compared to the standard expression level is indicative of a pathological disorder, such as AT, AT tumors or other cancers. Cancer is a process characterized by the mutation or dysregulation of genes critical in maintaining normal cellular growth and other normal cellular behavior. Often genes called oncogenes or proto-oncogenes are overexpressed, providing cancer cells with a growth advantage over normal cells. Alternatively, "tumor suppressor" genes that normally maintain normal cellular growth often are turned off during the cancer process. The up- or down-regulation of specific polynucleotides and polypeptides can be diagnosed or monitored by assaying changes in polypeptide levels in tissues or blood. For example, lack of expression of the estrogen receptor gene or protein in breast cancer cells is useful information that influences the course of treatment of this disease. As a second example, expression of the prostate specific antigen (PSA) is a diagnostic indicator for prostate cancer, as well as other inflammatory events in the prostate.
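Step (b) of the diagnostic method, comparison of an assayed level with a standard level, can be sketched as follows. This is a minimal illustration only: the function name, the use of a fold-change ratio, and the two-fold threshold are all assumptions made for the example and are not specified in the text.

```python
def compare_to_standard(assayed: float, standard: float,
                        fold_threshold: float = 2.0) -> str:
    """Classify an assayed expression level relative to a standard level.

    fold_threshold is a hypothetical cut-off; the text does not state how
    large an increase or decrease must be to be considered indicative.
    """
    if assayed <= 0 or standard <= 0:
        raise ValueError("expression levels must be positive")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")
    ratio = assayed / standard
    if ratio >= fold_threshold:
        return "increased"
    if ratio <= 1 / fold_threshold:
        return "decreased"
    return "unchanged"


# Arbitrary illustrative values in the same units as the standard.
print(compare_to_standard(10.0, 4.0))  # increased
print(compare_to_standard(1.0, 4.0))   # decreased
print(compare_to_standard(5.0, 4.0))   # unchanged
```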

Moreover, polypeptides of the present invention can be used to treat disease. For example, patients can be administered a polypeptide of the present invention in an effort to replace absent or decreased levels of the polypeptide (e.g., insulin); to supplement absent or decreased levels of a different polypeptide (e.g., hemoglobin S for hemoglobin B); to inhibit the activity of a polypeptide (e.g., an oncogene); to activate the activity of a polypeptide (e.g., by binding to a receptor); to reduce the activity of a membrane bound receptor by competing with it for free ligand (e.g., soluble tumor necrosis factor (TNF) receptors used in reducing inflammation); or to bring about a desired response (e.g., blood vessel growth). For example, the FX-induced thymoma transcript (SEQ ID NO:78) was downregulated in the AT-4 tumor cell line. This transcript has been previously observed to show differential mRNA expression in other thymomas compared with normal thymus tissue and has been postulated to play a role in the processes of T-cell differentiation and regeneration in addition to tumorigenesis. Therefore, it is possible that up-regulation of this transcript and resulting polypeptide in tumor cell lines could inhibit tumor cell proliferation and promote differentiation, thus halting the tumorigenesis process. Similarly, antibodies directed to a polypeptide of the present invention can also be used to treat disease. For example, administration of an antibody directed to a polypeptide of the present invention can bind and reduce overproduction of the polypeptide. Similarly, administration of an antibody can activate the polypeptide, such as by binding to a polypeptide bound to a membrane (receptor). Polypeptides can be used as antigens to trigger immune responses. Two examples of cases where an antibody directed to a polypeptide can be used to treat cancer are as follows: In cancer, oncogenes may be overexpressed, providing cancer cells with a growth advantage over normal cells. An antibody that could block the expression of such gene products would be useful in maintaining normal cellular growth. Similarly, tumor suppressor genes that normally maintain normal cellular growth are frequently turned off during the cancer process. If the inhibition of expression of such genes were due to a repressor protein or transcription factor, a blocking antibody specific for this repressor protein would serve to restore normal cellular regulation.

A mammalian subject (preferably a human) can be given a recombinant or synthetic form of a polypeptide or antibody in one of many possible formulations, preferably in encapsulated or other forms suitable for oral or other gastrointestinal delivery of the polypeptide or antibody. In some cases, delivery of the polypeptide or antibody may be in the form of injection or transplantation of cells or tissues containing an expression vector such that a recombinant form of the polypeptide will be secreted by the cells or tissues, as described above for transfected cells.

The frequency and dosage of the administration of the polypeptides or antibodies will be determined by factors such as the biological activity of the pharmacological preparation and the goals in treatment of AT, AT tumors, or other cancers. In the case of antibody deliveries, the frequency of dosage will also depend on the ability of the antibody to bind and neutralize the target molecules in the target tissues.

Polypeptides can also be used to raise antibodies, which in turn are used to measure protein expression from a recombinant cell, as a way of assessing transformation of the host cell. See, e.g., Curr. Prot. Mol. Bio., Chapter 11.15. Moreover, the polypeptides of the present invention can be used to test the following biological activities.

Biological Activities

The polynucleotides, polypeptides and genes of the present invention and regions thereof can be used in assays to test for one or more biological activities. If these polynucleotides, polypeptides and genes or gene regions exhibit activity in a particular assay, it is likely that these molecules may be involved in the diseases associated with the biological activity. Thus, the polynucleotides, polypeptides, genes and gene regions can be used to prevent or treat the associated disease or pathological condition. Examples of the disease or pathological conditions that may be prevented or treated according to the methods described herein include, but are not limited to, AT, AT tumors, or other cancers.

Immune Activity

A polypeptide, polynucleotide or gene of the present invention or region thereof may be useful in treating deficiencies or disorders of the immune system, by activating or inhibiting the proliferation, differentiation, or mobilization (chemotaxis) of immune cells. Immune cells develop through a process called hematopoiesis, producing myeloid (platelets, red blood cells, neutrophils, and macrophages) and lymphoid (B and T lymphocytes) cells from pluripotent stem cells. The etiology of these immune deficiencies or disorders may be genetic, somatic (such as cancer or some autoimmune disorders), acquired (e.g., by chemotherapy or toxins), or infectious. Moreover, a polynucleotide, polypeptide or gene of the present invention or region thereof can be used as a marker or detector of a particular immune system disease or disorder.

A polynucleotide, polypeptide or gene of the present invention or region thereof may be useful in treating or detecting deficiencies or disorders of hematopoietic cells. A polypeptide, polynucleotide or gene of the present invention or region thereof could be used to increase differentiation and proliferation of hematopoietic cells, including the pluripotent stem cells, in an effort to treat those disorders associated with a decrease in certain (or many) types of hematopoietic cells. Examples of immunologic deficiency syndromes include, but are not limited to, blood protein disorders (e.g., agammaglobulinemia, dysgammaglobulinemia), ataxia telangiectasia, common variable immunodeficiency, Di George's Syndrome, HIV infection, HTLV-BLV infection, leukocyte adhesion deficiency syndrome, lymphopenia, phagocyte bactericidal dysfunction, severe combined immunodeficiency (SCIDs), Wiskott-Aldrich Disorder, anemia, thrombocytopenia, or hemoglobinuria.

Moreover, a polypeptide, polynucleotide or gene of the present invention or region thereof could also be used to modulate hemostatic (bleeding cessation) or thrombolytic activity (clot formation). For example, by increasing hemostatic or thrombolytic activity, a polynucleotide, polypeptide or gene of the present invention or region thereof could be used to treat blood coagulation disorders (e.g., afibrinogenemia, factor deficiencies), blood platelet disorders (e.g. thrombocytopenia), or wounds resulting from trauma, surgery, or other causes. Alternatively, a polynucleotide, polypeptide or gene of the present invention or region thereof that can decrease hemostatic or thrombolytic activity could be used to inhibit or dissolve clotting. These molecules could be important in the treatment of heart attacks (infarction), strokes, or scarring.

A polynucleotide, polypeptide or gene of the present invention or region thereof may also be useful in the treatment or detection of autoimmune disorders. Many autoimmune disorders result from inappropriate recognition of self as foreign material by immune cells. This inappropriate recognition results in an immune response leading to the destruction of the host tissue. Therefore, the administration of a polypeptide, polynucleotide or gene of the present invention or region thereof that inhibits an immune response, particularly the proliferation, differentiation, or chemotaxis of T-cells, or in some way results in the induction of tolerance, may be an effective therapy in preventing autoimmune disorders.

Examples of autoimmune disorders that can be treated or detected by the present invention include, but are not limited to: Addison's Disease, hemolytic anemia, antiphospholipid syndrome, rheumatoid arthritis, dermatitis, allergic encephalomyelitis, glomerulonephritis, Goodpasture's Syndrome, Graves' Disease, Multiple Sclerosis, Myasthenia Gravis, Neuritis, Ophthalmia, Bullous Pemphigoid, Pemphigus, Polyendocrinopathies, Purpura, Reiter's Disease, Stiff-Man Syndrome, Autoimmune Thyroiditis, Systemic Lupus Erythematosus, Autoimmune Pulmonary Inflammation, Guillain-Barre Syndrome, insulin dependent diabetes mellitus, and autoimmune inflammatory eye disease.

Similarly, allergic reactions and conditions, such as asthma (particularly allergic asthma) or other respiratory problems, may also be treated by a polypeptide, polynucleotide or gene of the present invention or a region thereof. Moreover, these molecules can be used to treat anaphylaxis, hypersensitivity to an antigenic molecule, or blood group incompatibility.

A polynucleotide, polypeptide or gene of the present invention or a region thereof may also be used to treat and/or prevent organ rejection or graft-versus-host disease (GVHD). Organ rejection occurs by host immune cell destruction of the transplanted tissue through an immune response. Similarly, an immune response is also involved in GVHD, but, in this case, the foreign transplanted immune cells destroy the host tissues. The administration of a polypeptide, polynucleotide or gene of the present invention or a region thereof that inhibits an immune response, particularly the proliferation, differentiation, or chemotaxis of T-cells, may be an effective therapy in preventing organ rejection or GVHD.

Similarly, a polypeptide, polynucleotide or gene of the present invention or a region thereof may also be used to modulate inflammation. For example, the polypeptide, polynucleotide, gene or a region thereof may inhibit the proliferation and differentiation of cells involved in an inflammatory response. These molecules can be used to treat inflammatory conditions, both chronic and acute conditions, including inflammation associated with infection (e.g., septic shock, sepsis, or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury, endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or chemokine induced lung injury, inflammatory bowel disease, Crohn's disease, or resulting from over production of cytokines (e.g., TNF or IL-1).

Hyperproliferative Disorders

A polypeptide, polynucleotide or gene of the invention or a region thereof can be used to treat or detect hyperproliferative disorders, including neoplasms. A polypeptide, polynucleotide or gene of the present invention or a region thereof may inhibit the proliferation of the disorder through direct or indirect interactions. Alternatively, a polypeptide, polynucleotide or gene of the present invention or region thereof may stimulate the proliferation of other cells that can inhibit the hyperproliferative disorder.

For example, by increasing an immune response, particularly increasing antigenic qualities of the hyperproliferative disorder or by inducing the proliferation, differentiation, or mobilization of T-cells, hyperproliferative disorders can be treated. This immune response may be increased by either enhancing an existing immune response, or by initiating a new immune response. Alternatively, decreasing an immune response may also be a method of treating hyperproliferative disorders, such as by administering the polypeptide, polynucleotide, gene or region thereof, as a chemotherapeutic agent.

Examples of hyperproliferative disorders that can be treated or detected by a polynucleotide, polypeptide or gene of the present invention or a region thereof include, but are not limited to neoplasms located in the abdomen, bone, breast, digestive system, liver, pancreas, peritoneum, endocrine glands (adrenal, parathyroid, pituitary, testicles, ovary, thymus, thyroid), eye, head and neck, nervous system (central and peripheral), lymphatic system, pelvic region, skin, soft tissue, spleen, thoracic region, and urogenital system.

Similarly, other hyperproliferative disorders can also be treated or detected by a polynucleotide or polypeptide of the present invention. Examples of such hyperproliferative disorders include, but are not limited to, hypergammaglobulinemia, lymphoproliferative disorders, paraproteinemias, purpura, sarcoidosis, Sezary Syndrome, Waldenstrom's Macroglobulinemia, Gaucher's Disease, histiocytosis, and any other hyperproliferative disease, besides neoplasia, located in an organ system listed above.

Binding Activity

A polypeptide of the present invention may be used to screen for molecules that bind to the polypeptide or for molecules to which the polypeptide binds. The binding of the polypeptide and the molecule may activate (i.e., an agonist), increase, inhibit (i.e., an antagonist), or decrease activity of the polypeptide or the molecule bound. Examples of such molecules include antibodies, oligonucleotides, proteins (e.g., receptors), or small molecules.

Preferably, the molecule is closely related to the natural ligand of the polypeptide, e.g., a fragment of the ligand, or a natural substrate, a ligand, a structural or functional mimetic (see, e.g., Coligan et al., Current Protocols in Immunology 1(2), Chapter 5 (1991)). Similarly, the molecule can be closely related to the natural receptor to which the polypeptide binds or, at least, related to a fragment of the receptor capable of being bound by the polypeptide (e.g., an active site). In either case, the molecule can be rationally designed using known techniques.

Preferably, the screening for these molecules involves producing appropriate cells which express the polypeptide, either as a secreted protein or on the cell membrane. Preferred cells include cells from mammals, yeast, Drosophila, or E. coli. Cells expressing the polypeptide (or cell membrane containing the expressed polypeptide) are then preferably contacted with a test compound potentially containing the molecule to observe binding, stimulation, or inhibition of activity of either the polypeptide or the molecule.

The assay may simply test binding of a test compound to the polypeptide, wherein binding is detected by a label, or in an assay involving competition with a labeled competitor. Further, the assay may test whether the test compound results in a signal generated by binding to the polypeptide.

Alternatively, the assay can be carried out using cell-free preparations, polypeptide/molecule affixed to a solid support, chemical libraries, or natural product mixtures. The assay may also simply comprise the steps of mixing a test compound with a solution containing a polypeptide, measuring polypeptide/molecule activity or binding, and comparing the polypeptide/molecule activity or binding to a standard.

Preferably, an ELISA assay can measure polypeptide level or activity in a sample (e.g., biological sample) using a monoclonal or polyclonal antibody. The antibody can measure polypeptide level or activity by either binding, directly or indirectly, to the polypeptide or by competing with the polypeptide for a substrate.

All of the above assays can be used as diagnostic or prognostic markers. The molecules discovered using these assays can be used to treat disease or to bring about a particular result in a patient (e.g., blood vessel growth) by activating or inhibiting the polypeptide/molecule. Moreover, the assays can discover agents that may inhibit or enhance the production of the polypeptide from suitably manipulated cells or tissues. At present, early diagnosis of cancers in most tissues depends upon a number of relatively invasive and expensive clinical tests. Assays for the presence of markers in easily obtained specimens (blood, urine or stool) may provide an important diagnostic tool.

Therefore, the invention includes a method of identifying compounds which bind to a polypeptide of the invention comprising the steps of: (a) incubating a candidate binding compound with a polypeptide of the invention; and (b) determining if binding has occurred. Moreover, the invention includes a method of identifying agonists/antagonists comprising the steps of: (a) incubating a candidate compound with a polypeptide of the invention, (b) assaying a biological activity, and (c) determining if a biological activity of the polypeptide has been altered.

The following examples illustrate the combination of cytogenetic and genomic approaches to identify conserved disruptions in aggressive T-cell lymphomas. These experiments are intended to illustrate the invention, and are not to be construed as limiting the scope of the invention.

EXAMPLES

EXAMPLE 1

Identification and Characterization of Regulated Polynucleotides

Spectral Karyotyping (SKY) and Fluorescent In Situ Hybridization (FISH) of Atm deficient T-cell lymphoblastic lymphomas

The tumors from Atm^-/- as well as Atm^-/-p53^+/-, Atm^-/-p53^-/-, and p53^-/- mice were characterized by SKY and FISH in order to determine the specific role of Atm in promoting lymphomagenesis. Characterization of these tumors demonstrated that translocations are a distinctive feature of A-T dependent lymphomagenesis and that this feature is distinct from p53 dependent lymphomagenesis. Further, it was found that genomic instability is additive.

As shown in Table 1, loss of ATM leads to specific recurring chromosomal aberrations.

The genetic alterations in these cell lines were characterized using spectral karyotyping (SKY) and fluorescent in situ hybridization (FISH). Consistent abnormalities of chromosome 14 and chromosome 12, similar to those found in human leukemias and lymphomas, were observed in all tumors studied. The observed abnormalities included insertions (ins6:14) as well as translocations (t12;10, t14;15 and t14;3). The tumors were also found to be CD4 and CD8 double positive, demonstrating they occur at an immature stage of development. Abnormalities at the TCR locus, as found in human hematopoietic cancer, were observed. Genomic clones in bacterial artificial chromosomes (BACs) containing the candidate gene of interest were selected. These BACs were fluorescently labeled and used as probes on metaphase preparations using FISH. The results of these experiments demonstrated that both alleles of the TCRalpha locus are abnormally rearranged in all tumor cell lines studied. These results further demonstrated that the TCR locus is rearranged in both human and mouse lymphomas. In addition, two of ten tumors analyzed showed chromosomal abnormalities of genes known to be involved in human tumors. These studies demonstrate that ATM function is critical for maintaining appropriate recombination pathways and that, in the absence of ATM, both humans and mice develop aggressive tumors. A similar pathway of tumorigenesis is also likely to occur in the absence of ATM in both human and mouse.

Similarly isolated and grown tumors from p53^-/- mice revealed that loss of p53 results in aneuploidy without translocations (for an example see Table 1) and the majority of tumors occur at the CD3+ stage, similar to previous reports (Ward, J.M. (1999) Lab Invest. 79:3-14; Mombaerts, P. (1995) Proc Natl Acad Sci U S A. 92:7420-4). Interestingly, in the setting of Atm^-/- and p53^+/-, the tumors appear to still harbor the Atm translocations, but aneuploidy is now seen more frequently. Finally, loss of both alleles of p53 and Atm results in a markedly additive effect, with mice succumbing to lymphoma generally within the first several weeks of life. In these tumors, there is marked aneuploidy as well as translocations, but in many cases, the translocations are no longer limited to chromosomes 12 and 14. This suggests that p53 deficient tumor cell lines do not exhibit consistent abnormalities of chromosomes 14 or 12. In addition, p53 deficient tumors show more genomic instability with aneuploidy and gene amplification. This suggests that p53 deficiency alone does not result in typical chromosome 14 or 12 alterations as seen in the Atm deficient tumor cell lines or in human hematopoietic cancer. The Atm/p53 deficient cell lines do show chromosome 14 and 12 aberrations in addition to the aneuploidy seen in the p53 deficient tumor cell lines. Further, preliminary data suggest that the Atm deficient tumor cell lines may more precisely mimic the human disease. Therefore, the identification of genes involved in the translocations would allow a more detailed comparison to p53 models and would allow for further elucidation of the roles of p53 and Atm in tumorigenesis.

RNA Isolation

Cytoplasmic RNA was prepared from cells from four tumor cell lines: AT-4, AT-7, AT-12 and AT-13 that had the chromosomal abnormalities summarized in Table 1. The isolated RNA was enriched to form a starting polyA-containing mRNA population by methods known in the art.

The TOGA^® Process

Isolated RNA was analyzed using TOGA^®. Preferably, prior to the application of the TOGA^® technique, the isolated RNA was enriched to form a starting polyA-containing mRNA population by methods known in the art. In a preferred embodiment, the TOGA^® method further comprised an additional PCR step performed using four 5' PCR primers in four separate reactions and cDNA templates prepared from a population of antisense cRNAs. A final PCR step that used 256 5' PCR primers in separate reactions produced PCR products that were cDNA fragments that corresponded to the 3'-region of the starting mRNA population. The produced PCR products were then identified by: a) the initial 5' sequence comprising the sequence remainder of the recognition site of the restriction endonuclease used to cut and isolate the 3' region plus the sequence of the preferably four parsing bases immediately 3' to the remainder of the recognition site, preferably the sequence of the entire fragment, and b) the length of the fragment. These two parameters, sequence and fragment length, were used to compare the obtained PCR products to a database of known polynucleotide sequences. Since the length of the obtained PCR products includes known vector sequences at the 5' and 3' ends of the insert, the sequence of the insert provided in the sequence listing is shorter than the fragment length that forms part of the digital address.
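The two-part identifier described above (initial 5' sequence plus fragment length) can be sketched as follows. This is an illustration under stated assumptions, not the TOGA^® software: the function names are hypothetical, the default site remainder of CGG is taken from the later statement in this description that CGG forms the first three bases of all PCR products, and the fragment length is supplied by the caller because the length of the flanking vector sequence is not given here. The count of 256 primers is consistent with four parsing bases (4 x 4 x 4 x 4).

```python
from itertools import product

BASES = "ACGT"


def parsing_base_combinations(n_parsing: int = 4) -> list[str]:
    """All possible parsing-base strings of the given length."""
    return ["".join(p) for p in product(BASES, repeat=n_parsing)]


def digital_address(product_5prime_seq: str, fragment_length: int,
                    site_remainder: str = "CGG",
                    n_parsing: int = 4) -> tuple[str, int]:
    """Return (site remainder + parsing bases, fragment length).

    site_remainder defaults to CGG (an assumption drawn from the later text);
    fragment_length is the measured PCR product length, vector included.
    """
    seq = product_5prime_seq.upper()
    if not seq.startswith(site_remainder):
        raise ValueError("sequence does not begin with the site remainder")
    parsing = seq[len(site_remainder):len(site_remainder) + n_parsing]
    if len(parsing) < n_parsing:
        raise ValueError("sequence too short to contain the parsing bases")
    return site_remainder + parsing, fragment_length


print(len(parsing_base_combinations()))    # 256
print(digital_address("CGGACGTTTT", 250))  # ('CGGACGT', 250)
```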

The method yields Digital Sequence Tags (DSTs), that is, polynucleotides that are expressed sequence tags (ESTs) of the 3' end of mRNAs. DSTs that showed changes in relative levels when comparing 1 of 4 samples, or showed paired differences in 2 of 4 samples, or were highly expressed and located on mouse chromosomes involved in the translocations were selected for further study. The intensities of the laser-induced fluorescence of the labeled PCR products were compared across samples isolated from the four tumor cell lines AT-4, AT-7, AT-12 and AT-13. The results are presented in Table 2.

In general, double-stranded cDNA is generated from poly(A)-enriched cytoplasmic RNA extracted from the tissue samples of interest using an equimolar mixture or set of all 48 5'-biotinylated anchor primers to initiate reverse transcription. One such suitable set is G-A-A-T-T-C-A-A-C-T-G-G-A-A-G-C-G-G-C-C-G-C-A-G-G-A-A-T-T-T-T-T-T-T-T-T-T-T-T-T-T-T-T-T-T-V-N-N (SEQ ID NO: 75), where V is A, C or G and N is A, C, G or T. One member of this mixture of 48 anchor primers initiates synthesis at a fixed position at the 3' end of all copies of each mRNA species in the sample, thereby defining a 3' endpoint for each species, resulting in biotinylated double-stranded cDNA.
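The count of 48 anchor primers follows from the degenerate V-N-N tail (3 x 4 x 4 combinations). The following is a minimal sketch that enumerates the set; the constant and function names are illustrative only, and the fixed stem and the run of 18 T residues are transcribed from the sequence as printed above, so they should be checked against the sequence listing.

```python
from itertools import product

# Fixed 5' portion of the anchor primer, transcribed from SEQ ID NO: 75 as printed
# in the text (assumption: the poly-T run is 18 residues long).
ANCHOR_STEM = "GAATTCAACTGGAAGCGGCCGCAGGAA"
POLY_T = "T" * 18

def anchor_primers():
    """Enumerate every V-N-N variant (V = A, C or G; N = A, C, G or T)."""
    return [ANCHOR_STEM + POLY_T + v + n1 + n2
            for v, n1, n2 in product("ACG", "ACGT", "ACGT")]

primers = anchor_primers()
print(len(primers))  # 3 * 4 * 4 combinations
```

Every member of the set carries the 8-nucleotide GCGGCCGC site in its stem, which is the site later used to release the captured fragments.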

Each biotinylated double-stranded cDNA sample was cleaved with the restriction endonuclease MspI, which recognizes the sequence CCGG. The resulting fragments of cDNA corresponding to the 3' region of the starting mRNA were then isolated by capture of the biotinylated cDNA fragments on a streptavidin-coated substrate. Suitable streptavidin-coated substrates include microtitre plates, PCR tubes, polystyrene beads, paramagnetic polymer beads, and paramagnetic porous glass particles. A preferred streptavidin-coated substrate is a suspension of paramagnetic polymer beads (Dynal, Inc., Great Neck, NY).
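Because only the biotinylated 3' end is captured, the retained fragment runs from the 3'-most MspI site to the anchored end of the cDNA. A minimal sketch of that selection follows; the function name and the toy transcript are hypothetical and serve only to illustrate the logic.

```python
MSPI_SITE = "CCGG"  # MspI cuts between the first C and the following CGG (C^CGG)

def three_prime_mspi_fragment(cdna):
    """Return the sequence 3' of the last MspI cut, or None if there is no site.

    `cdna` is the sense-strand sequence ending at the anchored poly(A) end.
    """
    pos = cdna.upper().rfind(MSPI_SITE)
    if pos == -1:
        return None  # a transcript without an MspI site yields no captured fragment
    return cdna[pos + 1:]  # cut falls after the first C, so the fragment starts with CGG

# Hypothetical toy transcript containing two MspI sites
toy = "ATGCCGGTTTACCGGCCGTGTGAAAAAA"
print(three_prime_mspi_fragment(toy))
```

In this toy case the captured fragment begins with CGG followed by the four bases CCGT, which is the form the parsing primers later read.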

After washing the streptavidin-coated substrate and captured biotinylated cDNA fragments, the cDNA fragment product was released by digestion with NotI, which cleaves at an 8-nucleotide sequence within the anchor primers but rarely within the mRNA-derived portion of the cDNAs. The 3' MspI-NotI fragments, which are of uniform length for each mRNA species, were directionally ligated into ClaI-NotI-cleaved plasmid pBC SK+ (Stratagene, La Jolla, CA) in an antisense orientation with respect to the vector's T3 promoter, and the product used to transform Escherichia coli SURE cells (Stratagene). The ligation regenerates the NotI site, but not the MspI site, leaving CGG as the first 3 bases of the 5' end of all PCR products obtained. Each library contained in excess of 5 x 10⁵ recombinants to ensure a high likelihood that the 3' ends of all mRNAs with concentrations of 0.001% or greater were multiply represented. Plasmid preps (Qiagen) were made from the cDNA library of each sample under study.
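The library-size figure can be checked with simple arithmetic. The sketch below assumes that clones are sampled independently, so that the number of clones per mRNA species is approximately Poisson distributed; that statistical model and the function names are assumptions for illustration and are not stated in the text.

```python
import math

def expected_clones(library_size, abundance_fraction):
    """Mean number of recombinants derived from one mRNA species."""
    return library_size * abundance_fraction

def prob_at_least(k, mean):
    """Poisson probability of observing at least k clones given the mean."""
    below = sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(k))
    return 1.0 - below

mean = expected_clones(5e5, 0.001 / 100)  # 0.001% abundance in 5 x 10^5 recombinants
print(mean, prob_at_least(1, mean), prob_at_least(2, mean))
```

Under these assumptions a species at the 0.001% threshold is expected about five times per library, and the chance of it appearing at least twice exceeds 95%.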

An aliquot of each library was digested with MspI, which effects linearization by cleavage at several sites within the parent vector while leaving the 3' cDNA inserts and their flanking sequences, including the T3 promoter, intact. The product was incubated with T3 RNA polymerase (MEGAscript kit, Ambion) to generate antisense cRNA transcripts of the cloned inserts containing known vector sequences abutting the MspI and NotI sites from the original cDNAs.

At this stage, each of the cRNA preparations was processed in a three-step fashion. In step one, 250 ng of cRNA was converted to first-strand cDNA using the 5' RT primer (A-G-G-T-C-G-A-C-G-G-T-A-T-C-G-G; SEQ ID NO: 76). In step two, 400 pg of cDNA product was used as PCR template in four separate reactions with each of the four 5' PCR primers of the form G-G-T-C-G-A-C-G-G-T-A-T-C-G-G-N (SEQ ID NO: 77), each paired with a "universal" 3' PCR primer G-A-G-C-T-C-C-A-C-C-G-C-G-G-T (SEQ ID NO: 78) to yield four sets of PCR reaction products ("N1 reaction products").

In step three, the product of each subpool was further divided into 64 subsubpools (2 ng in 20 μl) for the second PCR reaction. This PCR reaction comprised adding 100 ng of the fluoresceinated "universal" 3' PCR primer (SEQ ID NO: 78) conjugated to 6-FAM and 100 ng of the appropriate 5' PCR primer of the form C-G-A-C-G-G-T-A-T-C-G-G-N-N-N-N (SEQ ID NO: 79), and using a program that included an annealing step at a temperature X slightly above the Tm of each 5' PCR primer to minimize artifactual mispriming and promote high fidelity copying. Each polymerase chain reaction step was performed in the presence of TaqStart antibody (Clontech).

The products ("N4 reaction products") from the final polymerase chain reaction step for each of the tissue samples were resolved on a series of denaturing DNA sequencing gels using the automated ABI Prism 377 sequencer. Data were collected using the GeneScan software package (ABI) and normalized for amplitude and migration. Complete execution of this series of reactions generated 64 product subpools for each of the four pools established by the 5' PCR primers of the first PCR reaction, for a total of 256 product subpools for the entire 5' PCR primer set of the second PCR reaction.
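The 4 x 64 = 256 subdivision follows directly from the four parsing bases. As a minimal sketch, the primer sets can be enumerated from the stems of SEQ ID NO: 77 and SEQ ID NO: 79 as printed above; the function names and the grouping by first parsing base are illustrative choices, not part of the described protocol.

```python
from itertools import product

N1_STEM = "GGTCGACGGTATCGG"  # stem of the N1 5' PCR primers (SEQ ID NO: 77)
N4_STEM = "CGACGGTATCGG"     # stem of the N4 5' PCR primers (SEQ ID NO: 79)

def n1_primers():
    """One N1 primer per first parsing base."""
    return {base: N1_STEM + base for base in "ACGT"}

def n4_primers_by_pool():
    """Group the 256 N4 primers by the N1 pool (first parsing base) they derive from."""
    pools = {base: [] for base in "ACGT"}
    for parsing in map("".join, product("ACGT", repeat=4)):
        pools[parsing[0]].append(N4_STEM + parsing)
    return pools

pools = n4_primers_by_pool()
print({base: len(members) for base, members in pools.items()})
```

For instance, the primer with parsing bases CCGT belongs to the "C" pool and reads CGACGGTATCGGCCGT.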

The mRNA samples from each of the tumor cell lines as described above were analyzed. Table 2 is a summary of the expression levels of 839 mRNAs determined from cDNA. These cDNA molecules are identified by their digital address, that is, a partial 5' terminus nucleotide sequence coupled with the length of the molecule, as well as the relative amount of the molecule produced at different time intervals after treatment. The 5' terminus partial nucleotide sequence is determined by the recognition site for MspI and the nucleotide sequence of the parsing bases of the 5' PCR primer used in the final PCR step. The digital length of the fragment was determined by interpolation on a standard curve, and as such, may vary plus or minus 1 or 2 base pairs from the actual length as determined by sequencing.
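As a rough illustration of how a digital length can be read off a standard curve, the sketch below uses plain linear interpolation between the two bracketing size standards. The interpolation scheme, the function names, and the migration values are all assumptions made for the example; the sizing method actually applied by the analysis software is not described in the text.

```python
from bisect import bisect_left

def interpolate_length(migration, standards):
    """Linearly interpolate a fragment length (bp) from its migration value.

    `standards` is a list of (migration, length_bp) pairs sorted by migration.
    """
    xs = [m for m, _ in standards]
    if not xs[0] <= migration <= xs[-1]:
        raise ValueError("peak lies outside the standard curve")
    i = min(max(bisect_left(xs, migration), 1), len(xs) - 1)
    (x0, y0), (x1, y1) = standards[i - 1], standards[i]
    return y0 + (y1 - y0) * (migration - x0) / (x1 - x0)

def lengths_agree(digital_bp, sequenced_bp, tolerance=2):
    """True when the digital length is within the stated 1 to 2 bp of the sequenced length."""
    return abs(digital_bp - sequenced_bp) <= tolerance

# Hypothetical size standards: (migration in arbitrary scan units, length in bp)
ladder = [(1000.0, 100), (1500.0, 150), (2000.0, 200)]
print(interpolate_length(1510.0, ladder))
```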

For example, the entry in Table 2 that describes a DNA molecule identified by the digital address MspI CCGT is further characterized as having a 5' terminus partial nucleotide sequence of CGGCCGT and a digital address length of 151 b.p. The DNA molecule identified as MspI CCGT 151 was further characterized as being uniquely upregulated in the AT-13 tumor cell line. Additionally, the DNA molecule identified as MspI CCGT 151 is described by its nucleotide sequence, which corresponds with SEQ ID NO: 3.
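The relationship between the parsing bases, the 5' terminus partial sequence, and the digital address can be expressed compactly. This is a minimal sketch; the function name and the dictionary layout are hypothetical and chosen only for illustration.

```python
def digital_address(parsing_bases, length_bp, enzyme="MspI"):
    """Combine the parsing bases and the digital length into a digital address."""
    remnant = "CGG"  # what remains of the CCGG recognition site after cleavage
    return {
        "address": f"{enzyme} {parsing_bases} {length_bp}",
        "five_prime": remnant + parsing_bases,
        "length_bp": length_bp,
    }

entry = digital_address("CCGT", 151)
print(entry["address"], entry["five_prime"])
```

For the example in the text, parsing bases CCGT and a length of 151 give the 5' terminus partial sequence CGGCCGT.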

Similarly, the other DNA molecules identified in Table 2 by their MspI digital addresses are further characterized by: 1) the level of gene expression in the AT-4 tumor cells; 2) the level of gene expression in the AT-7 tumor cells; 3) the level of gene expression in AT-12 tumor cells; and 4) the level of gene expression in AT-13 tumor cells. Further examples of TOGA® analysis are shown in Figures 4, 6, 8, 10, 13, and 21.

The TOGA® analyses shown in Figures 6, 8, and 10 are based on data from a single library. These experiments were replicated with a second library from the same source. Any differences between the length of the DST in the digital addresses in the single library versus the duplicate libraries are no more than two base pairs. Table 2 shows results that were obtained from two libraries (e.g. AT-4.1 and AT-4.2).

The data shown in Figure 1 and Table 2 were generated with a 5'-PCR primer (C-G-A-C-G-G-T-A-T-C-G-G-C-C-G-T, SEQ ID NO:86) paired with the "universal" 3' primer (SEQ ID NO: 78) labeled with 6-carboxyfluorescein (6FAM, ABI) at the 5' terminus. PCR reaction products were resolved by gel electrophoresis on 4.5% acrylamide gels and fluorescence data acquired on ABI377 automated sequencers. Data were analyzed using GeneScan software (Perkin-Elmer).

The results of TOGA® analysis using a 5' PCR primer with parsing bases CCGT (SEQ ID NO: 86) and the universal 3' primer (SEQ ID NO: 78) are shown in Figure 1, which shows the PCR products produced from mRNA extracted from cell lines AT-4 (Fig. 1A), AT-7 (Fig. 1B), AT-12 (Fig. 1C), and AT-13 (Fig. 1D). The vertical index line indicates a PCR product of about 151 base pairs that is expressed to a greater level in the AT-13 tumor cell line. The horizontal axis represents the number of base pairs of the molecules in these samples and the vertical axis represents the fluorescence measurement in the TOGA® analysis (which corresponds to the relative expression of the molecule of that address). The results of the TOGA® runs were normalized using the methods of Grace & Durham previously described. The vertical line drawn through the four panels represents the DST molecule identified as BAR1_11 (SEQ ID NO:3).

Some products, which were differentially represented, appeared to migrate in positions that suggest that the products were novel based on comparison to data extracted from GenBank. The sequences of such products were determined by one of two methods: cloning or direct sequencing of the PCR products (see description below).

DST Prioritization Process

TOGA® analysis was performed on the 4 individual tumor cell lines and the levels of gene expression were compared between each of the cell lines. Briefly, from more than 16,000 distinct peaks, we identified 64 candidates showing expression differences of 2-fold or greater for further characterization. These candidates were separated into three categories: increased expression in a single cell line, decreased expression in a single cell line, and increased or decreased expression in two lines. The candidate DSTs were isolated and sequenced, and the identity determined as described below, and the complete list of genes that met the criteria is shown in Tables 6A, B, and C. This information was then used in combination with direct and syntenic mapping data from the Locuslink, Mouse Genome Database and Homologene websites to assess chromosomal locations. By combining the TOGA® expression data with chromosomal mapping data, the power of the TOGA® technology was expanded to pinpoint loci that might have been affected by the specific chromosomal disruptions. Several candidate genes were chosen for evaluation based on their known localization to chromosomes, such as chromosomes 12 and 14 (Table 1), that were found to be abnormal in the specific Atm and Atm/p53 deficient tumors. Candidate EST sequences were determined. EST sequences were used in BLAST searches to identify matching sequences in the GenBank database. These GenBank sequences were then used to generate probes for Northern analysis. Northern blots were used to validate the gene expression patterns in the four lines subjected to TOGA®, as well as to compare the specificity of expression patterns in other tumor cell lines (for example, in Atm/p53 deficient tumors).
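The three candidate categories can be illustrated with a small classifier over the four normalized peak intensities of one digital address. This is only a sketch: the text does not state how the 2-fold differences were computed, so comparing each line against the next-ranked line is an assumption, and the function name and the intensity values are hypothetical.

```python
def classify_peak(levels, fold=2.0):
    """Assign a peak to one of the three candidate categories, or None.

    `levels` maps each of the four cell lines to a positive normalized intensity.
    """
    ranked = sorted(levels, key=levels.get)  # cell lines ordered lowest to highest
    lo1, lo2, hi2, hi1 = (levels[name] for name in ranked)
    if hi1 >= fold * hi2:
        return f"increased in {ranked[3]}"
    if lo2 >= fold * lo1:
        return f"decreased in {ranked[0]}"
    if hi2 >= fold * lo2:
        return f"two-line pattern: {ranked[2]}, {ranked[3]} high"
    return None  # no difference of the required magnitude

# Hypothetical intensities for a peak elevated in one line
print(classify_peak({"AT-4": 120.0, "AT-7": 110.0, "AT-12": 100.0, "AT-13": 900.0}))
```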
As the TOGA® technology is a valuable tool for discovering new genes and identifying genes that may not yet have been mapped to specific chromosomes, such uncharacterized or novel regulated sequences may be useful tools in assisting mapping efforts as they are most likely localized to chromosomes containing aberrations in these Atm tumor cell lines. Such sequences will be pursued in future studies. We will pursue genes that are abnormally regulated due to loss of AT, as this is the clinical phenotype we are most interested in and the one for which the mouse model appears ideal for determining how abnormal regulation of genes gives rise to human lymphoma and leukemia.

Cloning of TOGA® Generated PCR Products

In suitable cases, the PCR product was isolated, cloned into a TOPO vector (Invitrogen) and sequenced on both strands. The database matches for each cloned DST sequence are listed in Table 2. BAR1_11 (SEQ ID NO:3), the DNA molecule identified by MspI CCGT 151, was one such cloned product. In order to verify that the cloned product corresponds to the TOGA® peak of interest, the extended TOGA® assay was performed for each DST (see below).

Direct Sequencing of TOGA® Generated PCR Products

In other cases, the TOGA® PCR product was sequenced using a modification of a direct sequencing methodology (Innis et al., Proc. Nat'l Acad. Sci., 85: 9436-9440 (1988)).

PCR products corresponding to DSTs were gel purified and PCR amplified again to incorporate sequencing primers at 5' and 3' ends. The sequence addition was accomplished through 5' and 3' ds-primers containing M13 sequencing primer sequences (M13 forward and M13 reverse respectively) at their 5' ends, followed by a linker sequence and a sequence complementary to the DST ends. Using the Clontech TaqStart antibody system, a master mix containing all components except the gel purified PCR product template was prepared, which contained sterile H₂O, 10X PCR II buffer, 10 mM dNTP, 25 mM MgCl₂, AmpliTaq/Antibody mix (1.1 μg/μl Taq antibody, 5 U/μl AmpliTaq), 100 ng/μl of 5' ds-primer (5' TCC CAG TCA CGA CGT TGT AAA ACG ACG GCT CAT ATG AAT TAG GTG ACC GAC GGT ATC GG 3', SEQ ID NO: 81), and 100 ng/μl of 3' ds-primer (5' CAG CGG ATA ACA ATT TCA CAC AGG GAG CTC CAC CGC GGT GGC GGC C 3', SEQ ID NO: 82). After addition of the PCR template, PCR was performed using the following program: 94°C, 4 minutes and 25 cycles of 94°C, 20 seconds; 65°C, 20 seconds; 72°C, 20 seconds; and 72°C 4 minutes. The resulting amplified adapted PCR product was gel purified.

The purified PCR product was sequenced using a standard protocol for ABI 3700 sequencing. Briefly, triplicate reactions in forward and reverse orientation (6 total reactions) were prepared, each reaction containing 5 μl of gel purified PCR product as template. In addition, the sequencing reactions contained 2 μl 2.5X sequencing buffer, 2 μl Big Dye Terminator mix, 1 μl of either the 5' sequencing primer (5' CCC AGT CAC GAC GTT GTA AAA CG 3', SEQ ID NO: 83), or the 3' sequencing primer (5' TTT TTT TTT TTT TTT TTT V 3', where V=A, C, or G, SEQ ID NO: 84) in a total volume of 10 μl.

In an alternate embodiment, the 3' sequencing primer was the sequence 5' GGT GGC GGC CGC AGG AAT TTT TTT TTT TTT TTT TT 3', (SEQ ID NO: 85). PCR was performed using the following thermal cycling program: 96°C, 2 minutes and 29 cycles of 96°C, 15 seconds; 50°C, 15 seconds; 60°C, 4 minutes. Table 2 contains the database matches for the sequences determined by this method. In order to verify that the product determined by direct sequencing corresponds to the TOGA® peak of interest, the extended TOGA® assay was performed for each DST (see below).

Verification Using the Extended TOGA® Method

In order to verify that the TOGA® peak of interest corresponds to the identified DST, an extended TOGA® assay was performed for each DST as described below. PCR primers ("Extended TOGA® primers") were designed from sequence determined using one of three methods: (1) in suitable cases, the PCR product was isolated, cloned into a TOPO vector (Invitrogen) and sequenced on both strands; (2) in other cases, the TOGA® PCR product was sequenced using a modification of a direct sequencing methodology (Innis et al., Proc. Nat'l. Acad. Sci., 85: 9436-9440 (1988)) or (3) in many cases, the sequences listed for the TOGA® PCR products were derived from candidate matches to sequences present in available GenBank, EST, or proprietary databases.

PCR was performed using the Extended TOGA® primers and the N1 PCR reaction products as a substrate. Oligonucleotides were synthesized with the sequence G-A-T-C-G-A-A-T-C extended at the 3' end with a partial MspI site (C-G-G), and an additional 18 adjacent nucleotides from the determined sequence of the DST. For example, for the PCR product with the TOGA® address CCGT 151 (BAR1_11; SEQ ID NO:3), the 5' PCR primer was G-A-T-C-G-A-A-T-C-C-G-G-C-C-G-T-G-T-G-T-G-C-C-T-T-A-G-G-A-G (SEQ ID NO:87). This 5' PCR primer was paired with the fluorescence labeled universal 3' PCR primer (SEQ ID NO:78) in a PCR reaction using the PCR N1 reaction product as substrate.
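The primer construction rule above (fixed 5' extension, partial MspI site, then 18 nucleotides of DST sequence) can be sketched as follows. The function and constant names are illustrative, and the 18 DST nucleotides used in the example are read back from SEQ ID NO:87 as printed, not taken from the sequence listing for SEQ ID NO:3.

```python
FIVE_PRIME_EXTENSION = "GATCGAATC"  # fixed 5' portion of every Extended TOGA primer
MSPI_REMNANT = "CGG"                # partial MspI site

def extended_toga_primer(dst_after_cgg, n_adjacent=18):
    """Build an Extended TOGA 5' primer.

    `dst_after_cgg` is the determined DST sequence immediately 3' of the CGG remnant.
    """
    if len(dst_after_cgg) < n_adjacent:
        raise ValueError("need at least %d nucleotides of DST sequence" % n_adjacent)
    return FIVE_PRIME_EXTENSION + MSPI_REMNANT + dst_after_cgg[:n_adjacent].upper()

# The 18 nucleotides following CGG in SEQ ID NO:87 as printed in the text
primer = extended_toga_primer("CCGTGTGTGCCTTAGGAG")
print(primer, len(primer))
```

The first four of the 18 nucleotides are the parsing bases (CCGT here), so the primer extends the original N4 primer a further 14 bases into the insert.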

The length of the PCR product generated with the Extended TOGA® primer was compared to the length of the original PCR product that was produced in the TOGA® reaction. The results for SEQ ID NO:3, for example, are shown in Figure 2. The PCR product corresponding to SEQ ID NO:3 (BAR1_11) was cloned and a 5' PCR primer was built from the cloned DST (SEQ ID NO:87). The product obtained from PCR with this primer (SEQ ID NO:87) and the universal 3' PCR primer (SEQ ID NO:78) (as shown in the top panel, A) was compared to the length of the original PCR product that was produced in the TOGA® reaction with mRNA extracted from AT-13 tumor cells using a 5' PCR primer with parsing bases CCGT (SEQ ID NO: 86) and the universal 3' PCR primer (SEQ ID NO:78) (as shown in the middle panel, B). Again, for all panels, the number of base pairs is shown on the horizontal axis, and fluorescence intensity (which corresponds to relative expression) is found on the vertical axis. In the bottom panel (C), the traces from the top and middle panels are overlaid, demonstrating that the peak found using an extended primer from the cloned DST is the same number of base pairs as the original PCR product obtained through TOGA® as BAR1_11 (SEQ ID NO:3). The bottom panel thus illustrates that BAR1_11 (SEQ ID NO: 3) was the DST amplified in Extended TOGA®. The same method was used to verify that the sequences determined by direct sequencing derive from the PCR product of interest.

In other cases, the sequences listed for the TOGA® PCR products were derived from candidate matches to sequences present in available GenBank, EST, or proprietary databases. Table 4 lists the candidate matches for each by accession number of the GenBank entry or by the accession numbers of a set of computer-assembled ESTs used to create a consensus sequence. Extended TOGA® primers were designed based on these sequences (as mentioned previously), and Extended TOGA® was run to determine if the database sequences were the DSTs amplified in TOGA®.

Assignment of Identities to DSTs

Digital Sequence Tags (DSTs) can be easily associated with the gene encoding the full-length mRNA transcript including both 5' and 3' untranslated regions by methods known to those skilled in the art. For example, searches of the public databases of expressed sequences (e.g., GenBank) can identify cDNA sequences that overlap with the DST. Statistically significant sequence matches with greater than 95% nucleotide sequence matches across the overlap region can be used to generate a contiguous sequence ("contig") and serial searches with the accumulated contig sequence can be used to assemble extended sequence associated with the DST. In cases where the assembled contig includes an open reading frame (a nucleotide sequence encoding a continuous sequence of amino acids), the polypeptide encoded by the expressed mRNA can be predicted.

In other cases, extended sequence can also be generated by making a probe containing the DST sequence. The probe would then be used to select cDNA clones by hybridization methods known in the art. These cDNA clones may be selected from libraries of cDNA clones developed from the original RNA sample, from other RNA samples, from fractionated mRNA samples, or from other widely available cDNA libraries, including those available from commercial sources. Sequences from the selected cDNA clones can be assembled into contigs in the same manner described for database sequences. The cDNA molecules can also be isolated directly from the mRNA by the rapid amplification of cDNA ends (RACE) and long range PCR. This method can be used to isolate the entire full-length cDNA or the intact 5' and 3' ends of the cDNA.

Methods for alignment of biological sequences for pairwise comparison are well known in the art. Local alignments between a query sequence and a subject sequence can be derived by using the algorithm of Smith (J Mol Biol, 1981), by the homology alignment algorithm of Needleman (J Mol Biol, 1970), or by the similarity search algorithm of Pearson (Proc Natl Acad Sci, 1988). The best overall match between a query sequence (a sequence of the present invention) and a subject sequence, also referred to as a sequence database, is preferably determined using the BLAST computer program based on the algorithm of Altschul and colleagues (Altschul, J Mol Biol, 1990; Altschul, Nucleic Acids Res, 1997). The term "sequence" includes nucleotide and amino acid sequences. In a sequence alignment, the query sequence can be either protein or nucleic acid or any combination thereof. BLAST is a statistically driven search method that finds regions of similarity between a query and database sequences. These are called segment pairs, and consist of gapless alignments of any part of two sequences. Within these aligned regions, the sum of the scoring matrix values of their constituent symbol pairs is higher than a level expected to occur by chance alone. The scores obtained in a BLAST search can be interpreted by the experienced investigator to determine real relationships versus random similarities. The BLAST program supports four different search mechanisms:

• Nucleotide Query Searching a Nucleotide Database- Each database sequence is compared to the query in a separate nucleotide-nucleotide pairwise comparison.

• Protein Query Searching a Protein Database- Each database sequence is compared to the query in a separate protein-protein pairwise comparison.

• Nucleotide Query Searching a Protein Database- The query is translated, and each of the six products is compared to each database sequence in a separate protein-protein pairwise comparison.

• Protein Query Searching a Nucleotide Database- Each nucleotide database sequence is translated, and each of the six products is compared to the query in a separate protein-protein pairwise comparison.
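The "six products" mentioned in the translated search modes are the three reading frames of each strand. The following is a minimal, self-contained sketch of six-frame translation using the standard genetic code; it is written for illustration and is not the BLAST implementation, and the function names are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def translate(seq):
    """Translate complete codons only; unknown codons become X."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))

def six_frame_translations(dna):
    """Return the translations of frames +1..+3 and -1..-3."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    return {f"{sign}{frame + 1}": translate(strand[frame:])
            for sign, strand in (("+", dna), ("-", reverse))
            for frame in range(3)}

frames = six_frame_translations("ATGGCCTAA")
print(frames)
```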

By using the BLAST program to search for matches between a sequence of the present invention and sequences in GenBank and EST databases, identities were assigned whenever possible. A portion of these results is listed in Table 2.

DST Validation Using Northern Blot Analysis

Northern blots were used to verify the gene expression patterns in the four lines subjected to TOGA® (AT-4, AT-7, AT-12, and AT-13). Several additional cell lines were selected for Northern analysis, including tumor cell lines derived from mice with mutations in the tumor suppressor gene p53 as well as combinations of mutations of AT and p53. This was done to assist in determining which genes are abnormally regulated due to loss of AT (cell lines labeled AT-#), p53 (p53-1) or combinations of both genes (APT-3 is AT-/- p53+/-, and 101-7 and 292-3 are double mutants).

Poly A enriched mRNA was extracted from four tumor cell lines, AT-4, AT-7, AT-12, and AT-13, as described above. The mRNA samples were electrophoresed through an agarose gel, blotted, and probed using well-known methods. Briefly, 20 μg of total RNA or 2 μg of poly A+ mRNA was electrophoresed through a 1.2% agarose gel containing formamide along with the appropriate molecular weight standards. The gel was blotted overnight using nylon membrane to transfer the RNA. The membrane was prehybridized for one hour at 42°C in hybridization buffer (5X SSPE, 5X Denhardt's solution, 50% formamide, 0.2% SDS, 100 μg/ml salmon sperm DNA, and water). The probe DNA (50 ng) was labeled with ³²[P]-dCTP and ³²[P]-dATP using asymmetric PCR labeling. The membrane was probed with radiolabeled DNA (2-5 x 10 cpm/ml) overnight at 42°C in hybridization buffer. In addition, the Northern blots were probed with radiolabeled cyclophilin DNA to normalize the amount of mRNA in each sample. Band intensities of the probed mRNA samples were quantitated using a PhosphorImager SI and normalized to the hybridization signal of cyclophilin.
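The normalization step amounts to dividing each lane's target band intensity by the cyclophilin intensity of the same lane. A minimal sketch follows; the function name and all intensity values are hypothetical and are shown only to make the arithmetic concrete.

```python
def normalize_to_reference(target_signal, reference_signal):
    """Divide each lane's target band intensity by its reference (cyclophilin) intensity.

    Lanes with a missing or zero reference signal are skipped.
    """
    return {lane: target_signal[lane] / reference_signal[lane]
            for lane in target_signal
            if reference_signal.get(lane)}

# Hypothetical band intensities in arbitrary units
target = {"AT-4": 200.0, "AT-7": 150.0, "AT-13": 900.0}
cyclophilin = {"AT-4": 100.0, "AT-7": 100.0, "AT-13": 150.0}
print(normalize_to_reference(target, cyclophilin))
```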

Figure 3 is an example of Northern blot analysis using a radiolabeled probe derived from a candidate gene soluble lectin (Mac-2; GenBank Accession Number L08649), where the human homolog has been mapped to 14q21-q22, which is syntenic to mouse chromosomes 12 and 14, and corresponds to CCGT 151 (DST BAR1_11; SEQ ID NO: 3). The image in Panel A shows hybridization of the probe with a band of mRNA that is expressed at various levels in cell lines AT-4, AT-7, AT-13, and poly A⁺ (pA) thymus mRNA, but not detectable in cell lines AT-10, AT-11, AT-12, APT-3, P53-1, 101-7, and 292-3. The image in Panel B shows the methylene blue stained gel for quantification of mRNA loading. As predicted by TOGA®, the most intense band is present in the AT-13 tumor cell line. Due to the complete absence of this mRNA in the p53 cell lines, this DST is an ideal biomarker for AT-specific lymphomas.

EXAMPLE 2

Characterization of Polynucleotides Identified in a Single Atm-/- Tumor Cell Line

The TOGA® technology was used to examine the RNA expression profiles of a subset of the Atm-/- T-cell lymphomas shown in Table 1. Specifically, three Atm-/- tumor cell lines, which had similar cytogenetic profiles (AT-4, AT-7 and AT-13), and one Atm-/- tumor line which had several unique chromosomal aberrations (AT-12), were chosen for analysis. As shown in Table 5, the most distinctive tumor at the level of genomic translocations was AT-12, and AT-12 showed the largest number of genes uniquely expressed in a comparison of all tumor cell lines.

As described above, the TOGA® technology was utilized to identify genes exhibiting unique or conserved patterns of expression between the tumor cell lines. Briefly, from more than 16,000 distinct peaks, 64 candidates showing expression differences of 2-fold or greater were selected for further characterization. These candidates were separated into three categories: increased expression in a single cell line, decreased expression in a single cell line, or increased expression in two lines/decreased expression in two lines. The candidate DSTs were isolated and sequenced, and the complete list of genes that met the criteria is shown in Tables 6A, B, and C. This information was then used in combination with direct and syntenic mapping data from the Locuslink, Mouse Genome Database and Homologene websites to assess chromosomal locations. By combining the TOGA® expression data with chromosomal mapping data, loci which might have been affected by the specific chromosomal disruptions could be pinpointed.

Soluble Lectin (Mac-2)

An example of a molecule uniquely upregulated in the AT-13 tumor cell line is shown in Figure 1. As previously described in Example 1, Figure 1 is a graphical representation of the results of TOGA® analysis using a 5' PCR primer with parsing bases CCGT (SEQ ID NO: 86) and the universal 3' primer (SEQ ID NO: 78) showing PCR products produced from mRNA extracted from tumor cell lines AT-4 (Fig. 1A), AT-7 (Fig. 1B), AT-12 (Fig. 1C), and AT-13 (Fig. 1D); the vertical index indicating PCR products of about 151 base pairs in length. The PCR product was cloned and sequenced (DST BAR1_11; SEQ ID NO: 3). The DNA molecule identified as MspI CCGT 151 was further identified as soluble lectin (Mac-2) (GenBank Accession Number L08649), where the human homolog has been mapped to 14q21-q22, which is syntenic to mouse chromosomes 12 and 14 (Tables 3A and 6A). As shown in Figure 1, the soluble lectin transcript was uniquely upregulated in the AT-13 tumor cell line. This upregulation was confirmed by Northern blot analysis (Figure 3), as described in Example 1. In addition, a less intense band was observed in the Atm-/- tumor cell lines AT-4 and AT-7, and thymus, but was not detectable in cell lines AT-10, AT-11, AT-12, APT-3, P53-1, 101-7, and 292-3 (Figure 3A). Due to the complete absence of this mRNA in the p53 cell lines, this DST is an ideal biomarker for AT-specific lymphomas. In addition, as it is not present in all Atm-/- tumor cell lines, it may be associated with other phenotypes, such as stage of tumorigenesis or aggressiveness.

RhoC

An example of a molecule uniquely upregulated in the AT-4 tumor cell line is shown in Figure 4. Figure 4 is a graphical representation of the results of TOGA® analysis using a 5' PCR primer with parsing bases GCTG (SEQ ID NO: 88) and the universal 3' primer (SEQ ID NO: 78) showing PCR products produced from mRNA extracted from tumor cell lines AT-4 (Fig. 4A), AT-7 (Fig. 4B), AT-12 (Fig. 4C), and AT-13 (Fig. 4D); the vertical index identifying the PCR products of about 289 base pairs in length. The sequence of the PCR product was determined (DST BAR1B_131; SEQ ID NO: 69). The DNA molecule identified as MspI GCTC 289 was further identified as RhoC (GenBank Accession Number X80638), where the human homolog has been mapped to 1p21-p13, which is syntenic to mouse chromosome 3 (Tables 4 and 6A).

This upregulation was confirmed by Northern blot analysis (Figure 5). Figure 5 is an example of Northern blot analysis using a radiolabeled probe derived from the candidate gene RhoC (GenBank Accession Number X80638). The image in Panel A shows hybridization of the probe with a band of mRNA that is expressed at various levels in cell lines AT-4, AT-7, AT-13, and poly A⁺ (pA) thymus mRNA, but not detectable in cell lines AT-10, AT-11, AT-12, APT-3, P53-1, 101-7, and 292-3. The image in Panel B shows the methylene blue stained gel for quantification of mRNA loading. As predicted by TOGA®, the most intense band is present in the AT-4 tumor cell line. Due to the complete absence of this mRNA in the p53 cell lines, this DST is an ideal biomarker for AT-specific lymphomas. In addition, as it is not present in all Atm-/- tumor cell lines, it may be associated with other phenotypes, such as stage of tumorigenesis or aggressiveness.

ALK-1

The most distinctive tumor cell line, based on genomic translocations, was AT-12, and this cell line also demonstrated the largest number of genes uniquely expressed in comparison to the other tumor cell lines (Table 5). An example of a molecule uniquely upregulated in the AT-12 tumor cell line is shown in Figure 6. Figure 6 is a graphical representation of the results of TOGA® analysis using a 5' PCR primer with parsing bases GCTG (SEQ ID NO: 89) and the universal 3' primer (SEQ ID NO: 78), showing PCR products produced from mRNA extracted from (top to bottom panels) tumor cell lines AT-4 (Fig. 6A), AT-7 (Fig. 6B), AT-12 (Fig. 6C), and AT-13 (Fig. 6D); the vertical index indicating PCR products of about 345 base pairs in length. The sequence of the PCR product was determined (DST BAR1B_79; SEQ ID NO: 32). The DNA molecule identified as MspI GCTG 345 was further identified as mouse activin-like receptor (ALK-1) (GenBank Accession Number Z31664), where the human homolog has been mapped to 12q11-q14, which is syntenic to mouse chromosomes 10 and 15 (Tables 4 and 6A).

This upregulation was confirmed by Northern blot analysis (Figure 7). Figure 7 is an example of Northern blot analysis using a radiolabeled probe derived from the candidate gene ALK-1 (GenBank Accession Number Z31664). The image in the upper panel shows hybridization of the probe with two bands of mRNA at 4.0 kb and 3.7 kb that are expressed at various levels in cell lines AT-4, AT-11, AT-12, AT-13, 292-3, and poly A⁺ (pA) thymus mRNA, but not detectable in cell lines AT-7, AT-10, APT-3, P53-1, and 101-7. The image in the lower panel shows the methylene blue stained gel for quantification of mRNA loading. It is interesting that the Atm-/-/p53-/- cell line (292-3) demonstrated the presence of this mRNA, but the Atm+/+/p53-/- (p53-1) cell line and the Atm-/-/p53+/- (APT-3) cell line did not express this transcript, suggesting an association of this mRNA with a distinct pathway of tumorigenesis. Due to the selectivity of this mRNA in only one of the p53 cell lines, this DST is an ideal biomarker for AT-specific lymphomas and may be associated with specific but not all p53 lymphomas. In addition, as it is not present in all Atm-/- tumor cell lines, it may be associated with other phenotypes, such as stage of tumorigenesis or aggressiveness.

FX-induced thymoma transcript

An example of a molecule uniquely down-regulated in the AT-4 tumor cell line is shown in Figure 8. Figure 8 is a graphical representation of the results of TOGA® analysis using a 5' PCR primer with parsing bases ACAT (SEQ ID NO: 90) and the universal 3' primer (SEQ ID NO: 78), showing PCR products produced from mRNA extracted from (top to bottom panels) tumor cell lines AT-4 (Fig. 8A), AT-7 (Fig. 8B), AT-12 (Fig. 8C), and AT-13 (Fig. 8D); the vertical index indicates PCR products of about 368 base pairs in length. The sequence of the PCR product was determined (DST BAR1B_125; SEQ ID NO: 63). The DNA molecule identified as MspI ACAT367 was further identified as FX-induced thymoma transcript (GenBank Accession Number U38252), where the human homolog has been mapped to chromosome 12 and mouse chromosome 5 (Tables 4 and 6B).

This down-regulation was confirmed by Northern blot analysis (Figure 9). Figure 9 is an example of Northern blot analysis using a radiolabeled probe derived from the candidate gene FX-induced thymoma transcript (GenBank Accession Number U38252). The image in the upper panel shows hybridization with a band of mRNA at 4.1 kb that is expressed at various levels in cell lines AT-7, AT-10, AT-12, AT-13 and APT-3, but is not detectable in cell lines AT-4, AT-11, P53-1, 101-7, 292-3 and polyA (pA) thymus mRNA. The image in the lower panel shows the methylene blue stained gel for quantification of mRNA loading. Northern analysis thus confirmed the TOGA® result indicating lack of expression of this mRNA in AT-4. Interestingly, this pattern (loss of expression) was also observed in an additional Atm-/- cell line, AT-11, that was not studied by TOGA®, and thus may represent a marker unique to a specific phenotype of some AT-specific lymphomas. It is interesting that the Atm+/+/p53-/- cell line (P53-1) and the Atm-/-/p53-/- cell line (292-3) also lacked this transcript, but the Atm-/-/p53+/- cell line, APT-3, demonstrated the presence of this mRNA, also suggesting an association of this mRNA with a distinct pathway of tumorigenesis. Because this mRNA is selectively lost in specific p53 tumor and Atm tumor cell lines, it is possible that loss of expression of this transcript is related to a specific phenotype of the cells. Thus this DST is an ideal biomarker for certain AT-specific and p53-specific lymphomas that may relate to a specific stage of tumorigenesis or aggressiveness.

In addition, this transcript has previously been observed to show differential mRNA expression in other thymomas compared with normal thymus tissue (Pampeno CL, Meruelo D., (1996) Cell Growth Differ 8:1113-23) and has been postulated to play a role in the processes of T-cell differentiation and regeneration in addition to tumorigenesis. Therefore, it is possible that upregulation of this transcript in tumor cell lines could inhibit tumor cell proliferation and promote differentiation, thus halting the tumorigenesis process.

EXAMPLE 3 Identification of a Polynucleotide Where Loss of Expression Correlates with Genomic Loss

Inhibitor of Apoptosis Protein I

The most distinctive tumor cell line, based on genomic translocations, was AT-12, and this cell line also showed the largest number of uniquely expressed genes in comparison to the other tumor cell lines (Table 5). One of the molecules uniquely down-regulated in the AT-12 tumor cell line was identified as the inhibitor of apoptosis protein-1 (IAP-1) (Table 4). This gene resides on chromosome 9, and AT-12 was the only line with observed disruptions of chromosome 9 (see Tables 1 and 6B).

Figure 10 is a graphical representation of the results of TOGA® analysis using a 5' PCR primer with parsing bases GAGC (SEQ ID NO: 91) and the universal 3' primer (SEQ ID NO: 78), showing PCR products produced from mRNA extracted from (top to bottom panels) tumor cell lines AT-4 (Fig. 10A), AT-7 (Fig. 10B), AT-12 (Fig. 10C), and AT-13 (Fig. 10D); the vertical index indicates PCR products of about 197 base pairs in length. The sequence of the PCR product was determined (DST BAR1B_127; SEQ ID NO: 65). The DNA molecule identified as MspI GAGC 197 was further identified as inhibitor of apoptosis protein-1 (IAP-1) (GenBank Accession Number U88908), where the human homolog has been mapped to chromosome 11q22 and the mouse gene to chromosome 9 (Tables 4 and 6B).
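The TOGA® product labels used in these examples combine a restriction enzyme name, the four parsing bases of the 5' PCR primer, and the approximate product length in base pairs (for example, MspI GAGC 197). The following is a minimal sketch, assuming labels of exactly this form, of how such labels could be split into fields for tabulation. The names `DstLabel` and `parse_dst_label` and the regular expression are illustrative assumptions and are not part of the TOGA® method.

```python
import re
from typing import NamedTuple


class DstLabel(NamedTuple):
    """Fields of a TOGA-style product label (hypothetical container)."""
    enzyme: str         # restriction enzyme name, e.g. "MspI"
    parsing_bases: str  # four parsing bases of the 5' primer
    length_bp: int      # approximate PCR product length in base pairs


# Assumed label layout: enzyme, whitespace, four bases, optional space, length.
_LABEL = re.compile(r"^\s*(\w+)\s+([ACGT]{4})\s*(\d+)\s*$")


def parse_dst_label(label: str) -> DstLabel:
    """Split a label such as 'MspI GAGC 197' or 'MspI GCTG345' into fields."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized label: {label!r}")
    enzyme, bases, length = match.groups()
    return DstLabel(enzyme, bases, int(length))


if __name__ == "__main__":
    for text in ("MspI GAGC 197", "MspI GCTG345", "MspI ACGG458"):
        print(parse_dst_label(text))
```

Parsing the label into separate fields makes it straightforward to group products by parsing bases or to sort them by length when comparing panels.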

In agreement with the TOGA® analysis, lack of expression of this transcript in only AT-12 tumor cells was confirmed by Northern blot analysis (Figure 11). Figure 11 is an example of Northern blot analysis using a radiolabeled probe derived from the candidate gene IAP-1 (GenBank Accession Number U88908). The image in Panel A shows hybridization with a band of mRNA at 4.2 kb that is expressed at various levels in cell lines AT-4, AT-7, AT-10, AT-11, AT-13, APT-3, P53-1, 101-7, 292-3 and normal thymus, but is not detectable in cell line AT-12; the image in Panel B shows the methylene blue stained gel for quantification of mRNA loading.

Interestingly, this pattern (loss of expression) was only observed in the AT-12 tumor cell line, the only cell line demonstrating a loss of chromosome 9. Thus, the loss of expression of this transcript is directly associated with a genomic loss. Therefore, the measurement of the loss of this gene in various lymphomas could be used as an indicator of chromosome 9 deletion, and may be associated with a phenotype unique to this type of tumor, suggesting an association of this mRNA with a distinct pathway of tumorigenesis.

Analysis by FISH confirmed in normal cells (Figure 12) and in tumor cells AT-4, AT-7, AT-10, and AT-13 that the IAP-1 locus is localized to chromosome 9. FISH analysis of the AT-12 tumor line confirmed deletion of chromosome 9 and thus expression of IAP-1 could not be detected (data not shown).

EXAMPLE 4 Identification of Polynucleotides Associated with Genomic Translocations: Identification of a Gene Fusion Product

An example of a molecule up-regulated in two tumor cell lines, AT-7 and AT-13, is shown in Figure 13. Figure 13 shows the results of another example of TOGA® analysis, in this case using a 5' PCR primer with parsing bases ACGG (SEQ ID NO: 92) and the universal 3' primer (SEQ ID NO: 78), showing PCR products produced from mRNA extracted from (top to bottom panels) tumor cell lines AT-4 (Fig. 13A), AT-7 (Fig. 13B), AT-12 (Fig. 13C), and AT-13 (Fig. 13D); the vertical index indicates PCR products of about 458 base pairs in length. The PCR product was cloned (DST BAR1_27; SEQ ID NO: 9). The DNA molecule identified as MspI ACGG458 was further identified as granzyme C (GenBank Accession Number M18459), where the human homolog has been mapped to chromosome 14q11.2 and the mouse gene to chromosome 14, 20 cM (Tables 3A & 6C).

This up-regulation was confirmed by Northern blot analysis (Figure 14) using a radiolabeled probe derived from the candidate gene granzyme C (GrzmC) (GenBank Accession Number M18459). The image in the upper panel shows hybridization of two bands of mRNA at 1.7 kb and 1.1 kb. The 1.1 kb band, labeled as "normal" because it is the predicted size of granzyme C, was expressed at low levels in tumor cell lines AT-4, AT-7, AT-10, and polyA (pA) thymus. Due to the immature stage of development of the AT tumor cells, i.e., at the CD4+/CD8+ stage, little if any expression was expected in the tumor cell lines, while faint expression was expected in normal thymus mRNA. Granzyme C belongs to a family of closely related granzymes that are involved in the degradation of the extracellular matrix and thus, if dysregulated, could play an important role in increased invasiveness (increased metastatic potential) of tumor cells. Thus, the finding that granzyme C was expressed in some of the tumors suggests its expression may play a role in the phenotype of those tumor cell lines. The upper 1.7 kb band, labeled as "alternate" (Figure 14), was present only in tumor cell lines AT-7 and AT-10, and to a lesser extent AT-13 and APT-3. Neither transcript was detectable in cell lines AT-11, AT-12, P53-1, 101-7, and 292-3. The image in the lower panel shows the methylene blue stained gel for quantification of mRNA loading. The pattern of expression of the higher molecular weight mRNA, i.e., highest in AT-7, lesser in AT-13, and absent in AT-4, matches the expression pattern predicted by TOGA®, and thus it was of interest to further characterize this aberrant transcript. In summary, the TOGA® analysis led not only to the discovery that normal granzyme C transcription was misregulated, but also to the identification of a second, abnormally large band, suggesting that an aberrant form of the gene was expressed in some tumors.
As shown in Table 6C, granzyme C, as well as most of its family members, belongs to a family of closely related granzymes clustered on mouse chromosome 14, which was found to be the site of multiple translocations in both human and AT cell lines (Table 1). Specifically, as previously described, all tumor cell lines established from Atm-/- mice demonstrated translocations at the TCRalpha locus, which is in close proximity to granzyme C (Figure 15). Because rearrangements at the site of the TCRalpha locus were common to all AT tumors, it was postulated that AT tumors arise due to aberrant V(D)J rearrangements of the TCRα locus, as seen in human cancers. Thus, the proximity of the granzyme C locus to a region of the chromosome known to undergo rearrangements (i.e., translocations), and the expression of a higher molecular weight band observed by Northern analysis, suggested that the TOGA® technology had identified a novel gene fusion product containing granzyme C at the 3'-end.

In order to confirm that this molecule was a novel gene fusion product, a GeneRacer kit (Invitrogen) was used to obtain sequence information 5' to DST BAR1_27 (SEQ ID NO: 9). This method provides full-length, RNA ligase-mediated rapid amplification of 5' and 3' cDNA ends (RLM-RACE). Briefly, the method involves selecting 5'-capped mRNA. After removal of the 5'-cap, an RNA oligo is attached to the 5'-end, and a PCR product is obtained using primers specific for this 5'-end and for the 3'-end containing the BAR1_27 (SEQ ID NO: 9) (granzyme C) sequence. Applying the GeneRacer methodology to RNA from the AT-7 tumor cell line, followed by cloning, sequencing, and electronic contig building analysis, identified a novel gene fusion product that contained regions identical to granzyme B (CTLA-1, GenBank Accession Number X04072) and granzyme C. Granzyme B is also localized to chromosome 14, just upstream of granzyme C but downstream of TCRα.

Using primers specific to a 5'-region of granzyme B and a 3'-region of granzyme C, this granzyme B-granzyme C gene fusion product was independently confirmed by RT-PCR. Figure 16 shows RT-PCR results confirming expression of the granzyme B-granzyme C gene fusion product in AT-7 and AT-13 tumor cell lines. Figure 16A shows RT-PCR products amplified with gene-specific 5' Granzyme B and 3' Granzyme C primers. cDNA template was synthesized from DNase-treated RNA prepared from murine C57 Black thymus and the following ATM tumor cell lines: AT-4, AT-7, AT-10, AT-12, AT-13, and APT-3. The left side of panel A shows PCR products amplified with the 5' primer Gzmb-II and 3' primer Gzmc-III. Asterisks in AT-7 and AT-13 indicate bands with the expected size of 869 bp for gene fusion product I. The right side of panel A displays PCR products amplified with 5' primer Gzmb-I and 3' primer Gzmc-III. Asterisks in AT-7 and AT-13 indicate bands with the expected size of 896 bp for gene fusion I. Figure 16B shows a map of gene fusion I with the relative locations of 5' Granzyme B primers Gzmb-I and Gzmb-II and 3' Granzyme C primer Gzmc-III. Primers I, II, and III (SEQ ID NOs: 98, 99, and 96) were used for the PCR amplification experiments in panel 16A. The expression pattern observed, greater in AT-7 than AT-13, matched the original pattern predicted by TOGA® analysis (see Figure 13).

Further PCR cloning using primers specific to the 5'-region of granzyme B and the 3'-region of granzyme C identified an additional gene fusion product that was present in both AT-7 and AT-10. The two gene fusion products both contain the granzyme C sequence identified by TOGA® using the parsing primer ACGG458. Gene fusion product 1 (SEQ ID NO: 70), obtained from electronic contig analysis of multiple clones, contains a 284 bp region that aligns with granzyme B (CTLA-1) at positions 84 to 371 of granzyme B and a region representing positions 333 to the polyA tail of granzyme C (Figure 17). The last 42 bases sequenced of granzyme C ending with the polyA tail, identified by TOGA®, add more information about granzyme C that is currently not in the GenBank database. Gene fusion product 2 was independently sequenced from four different clones, obtained using AT-7 and AT-10 RNA, which varied by 2-4 nucleotides in sequence out of ~800 bp (SEQ ID NOs: 71, 72, 73, and 74). Gene fusion product 2 consisted of a 356 bp region that aligns with granzyme B at positions 101-556 and a 453 bp region representing positions 481-894 of granzyme C, plus 42 bases which are outside of the current GenBank sequence but common to the DST and gene fusion 1.

Figure 17 shows a map detailing these two gene fusion products (gene fusion product 1, SEQ ID NO: 70; gene fusion product 2, SEQ ID NOs: 71-74). The nucleotide positions of Granzyme B/Granzyme C fusions I and II are shown relative to wild-type Granzyme B (lower left corner) and Granzyme C (upper right corner) transcript sequences. Solid white boxes represent Granzyme C sequences and solid black boxes show Granzyme B sequences. Gray regions indicate overlapping cDNA sequences common to both Granzyme B and Granzyme C. Hatched regions represent 5' Granzyme B sequences that are likely to occur in the full-length fusion transcripts. The hatched regions are 5' to the Granzyme B primers used for RT-PCR; thus these upstream sequences were not observed in the amplification products analyzed. Interestingly, both gene fusion products are in frame, and thus encode intact fusion proteins. In fact, the region of overlap between granzyme B and granzyme C shows extremely high homology, allowing for a smooth transition from one protein product to the next, suggestive of a functional fusion protein. It is important to note that both gene fusion products contain the same digital sequence tag by TOGA®, which most likely explains the greatly enhanced expression of this peak in AT-7 (see Figure 13).
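Whether a fusion transcript is in frame can be checked arithmetically once the coding start of each parent transcript and the junction coordinates are known. The sketch below is illustrative only: the function name `fusion_in_frame`, the coordinate convention (1-based transcript positions, junction inside the coding region of both partners) and the example numbers are assumptions, not values taken from SEQ ID NOs: 70-74.

```python
def fusion_in_frame(cds_start_5p: int, last_base_5p: int,
                    cds_start_3p: int, first_base_3p: int) -> bool:
    """Return True if the 3' partner continues in its own reading frame.

    All arguments are 1-based transcript positions (assumed convention):
      cds_start_5p  - first base of the start codon in the 5' partner
      last_base_5p  - last base of the 5' partner kept in the fusion
      cds_start_3p  - first base of the start codon in the 3' partner
      first_base_3p - first base of the 3' partner kept in the fusion
    """
    if last_base_5p < cds_start_5p or first_base_3p < cds_start_3p:
        raise ValueError("junction must lie within both coding regions")
    # Number of coding bases contributed by the 5' partner; this is also the
    # 0-based coding offset at which the 3' partner resumes in the fusion.
    coding_bases_from_5p = last_base_5p - cds_start_5p + 1
    # 0-based coding offset of the retained base within the 3' partner.
    offset_in_3p = first_base_3p - cds_start_3p
    # The frame is preserved when both offsets share the same codon phase.
    return coding_bases_from_5p % 3 == offset_in_3p % 3


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    print(fusion_in_frame(10, 30, 5, 8))  # 21 coding bases, offset 3
    print(fusion_in_frame(10, 30, 5, 9))  # 21 coding bases, offset 4
```

With the hypothetical coordinates shown, the first call reports a preserved frame and the second a frameshift. Where the two parent sequences overlap, as in the gray regions of Figure 17, any junction position within the overlap gives the same answer provided the overlap is collinear in both transcripts.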

Final independent confirmation of a granzyme B-granzyme C gene fusion product was obtained by Northern blot analysis. Figure 18 represents a Northern blot using gene fusion product 1 (SEQ ID NO: 70) as the radiolabeled probe. The image shows hybridization of one band of mRNA at the appropriate size in only tumor cell lines AT-7, AT-10, and APT-3. Consistent with previous results, no gene fusion product was detectable in cell lines AT-4, AT-12, AT-13, P53-1, 101-7, 292-3, and thymus.

As observed in Figure 16 and other experiments (data not shown), a number of aberrations in addition to gene fusion products were observed in the Atm-/- cell lines, including deletions and alternative splicing. Thus, the TOGA® technology allowed for identification of aberrantly expressed transcripts that correlated to gene fusion products, which lead to a number of cellular defects that appear to be a hallmark of specific Atm-/- tumor cell lines. No such events were observed in RT-PCR experiments with RNA from normal mouse thymus.

From granzyme B knockout studies, it was found that granzyme B is required in CTL, NK, and LAK cells for the rapid induction of DNA fragmentation in target cells, a hallmark of apoptosis. In fact, NK cells have an absolute requirement for granzyme B for the triggering of apoptosis in susceptible target cells. Both rapid DNA fragmentation and the induction of morphologic changes associated with apoptosis depend on granzyme B expression. Interestingly, granzyme B-/- mice have dramatically reduced levels of granzymes C, D, E, F, and G (all found downstream of granzyme B on chromosome 14), suggesting that dysregulation of granzyme B results in dysregulation of the entire granzyme cluster (Mak, T.W., et al., The Gene Knockout Facts Book, Academic Press: San Diego, 1998). Therefore, due to the role of the granzymes in apoptosis and degradation of the extracellular matrix, expression of these gene fusion proteins could play an important role in increased invasiveness (increased metastatic potential) and survival of tumor cells.

A unique feature of AT lymphomas is their invasive nature (Barlow et al., (1996) 86:159-171). Multiple tumors from Atm-/- mice have been found to invade all organs including the brain and bone (periosteum). In addition, AT patients and patients with particularly aggressive lymphomas have also been shown to have a loss of heterozygosity (LOH) of ATM (Starostik, P et al., (1998) Cancer Res. 58(20):4552-7). Interestingly, granzymes have been linked to invasiveness in several aggressive tumors, and preliminary studies have been described to specifically inhibit granzyme C activity by overexpression of a serine protease inhibitor (Medema JP, et al., (2001) Proc Natl Acad Sci U S A. 98:11515-20; Kim GE, et al., (2001) Cancer 91:2343-52; Motyka B, et al., (2000) Cell 103:491-500; Yamashita Y, et al., (1998) Mod Pathol. 11:313-23).

To further confirm that granzyme C is involved in a translocation, FISH analysis was performed. Analysis by FISH revealed that in normal cells, as well as in tumor cells AT-4, AT-7, AT-10, AT-12 and AT-13, the granzyme C locus is adjacent to the TCRalpha/delta locus (Figure 19A). In this study, the granzyme C probe was labeled with green fluorescence and the TCRalpha probe was labeled with red fluorescence, such that regions that co-localize are represented as yellow fluorescence. In the greyscale version of the image shown in Figure 19A, arrows are drawn to show the pair of duplicate spots appearing on a metaphase chromosome spread. FISH analysis of AT-7 cells with a granzyme C specific probe showed duplication events at the TCRalpha and granzyme C loci and, in some cases, a translocation of granzyme C to chromosome 12, where it co-localized with Tcl-1 (Figure 19B). In other cases, translocations of 12;14 were not observed, but rather abnormalities on chromosome 14 were observed, suggesting that in these cases an aberrant fusion event occurred at this locus (Figure 19A). This may be due to improper recombination events, and is likely a direct result of the absence of ATM in these cells.

Gene fusion product 1 was inserted into an arabinose-inducible expression vector containing a histidine tag. As shown by Western analysis using an anti-histidine antibody (Figure 20), a 25.8 kDa intact fusion protein was expressed after treatment of cells with increasing concentrations of arabinose (ara). Histidine-tagged luciferase (luc-his) was also expressed as a positive control. As a negative control demonstrating specificity, in the absence of ara (0%), neither the fusion protein nor the luc-his protein was expressed, and thus no band was observed. Similar studies will be done with gene fusion product 2. This study demonstrated that an intact, in-frame fusion protein could be expressed, which will be further characterized for invasiveness in various cell culture models.

In summary, the TOGA® technology demonstrated its ability to identify polynucleotides specifically dysregulated due to translocations and other chromosome aberrations, specifically leading to the discovery of novel gene fusion products. This polynucleotide represents an important biomarker that is only expressed in specific AT tumors and in the Atm-/-/p53+/- (APT-3) tumor line, but not any of the p53-/- tumor cell lines. As p53 is commonly mutated in practically any tumor type at various stages of disease, it is not a useful diagnostic or prognostic biomarker. The gene fusion products described in this invention may serve as highly specific biomarkers which can be used to determine the stage of progression of a tumor, for predicting the aggressiveness or invasive potential (i.e., metastatic potential) of a tumor, for helping to predict the course of treatment, and for predicting remission or regression of the tumor, associated with AT or other cancers for prognostic purposes.

EXAMPLE 5 Identification and Characterization of a Polynucleotide Up in Two Cell Lines and Associated with Genomic Amplification

An example of an additional molecule that is upregulated in two tumor cell lines, AT-7 and AT-13, is shown in Figure 21. Figure 21 is a graphical representation of the results of TOGA® analysis using a 5' PCR primer with parsing bases CTCG (SEQ ID NO: 93) and the universal 3' primer (SEQ ID NO: 78), showing PCR products produced from mRNA extracted from tumor cell lines AT-4 (Fig. 21A), AT-7 (Fig. 21B), AT-12 (Fig. 21C), and AT-13 (Fig. 21D); the vertical index identifies PCR products of about 310 base pairs in length. The PCR product was cloned and sequenced (DST BAR1_28; SEQ ID NO: 10). The DNA molecule identified as MspI CTCG 310 was further identified as peripheral benzodiazepine receptor (GenBank Accession Number D21207), where the human homolog has been mapped to 22q13.31, and is located on mouse chromosome 15 (Tables 3A & 6C).

Chromosome 15 is frequently amplified in multiple mouse tumor cell lines and this locus harbors the myc proto-oncogene. The benzodiazepine receptor locus is approximately 10 cM from the c-myc locus (Figure 22A). Although several ATM deficient tumor lines have full or partial duplications of chromosome 15 (i.e., AT-4, AT-7, AT-10, AT-11, AT-12 and AT-13), c-myc is not overexpressed in tumor cell lines relative to normal thymus (Taylor, A.M. et al. (1996) Leukemia and lymphoma in ataxia telangiectasia, Blood, 87:423-438). To further investigate this finding we performed Northern blot analysis (Figure 22B) on the various tumor cell lines. Figure 22B is an example of Northern blot analysis using a radiolabeled probe derived from c-Myc, GAPDH (as a loading control), or the candidate gene benzodiazepine receptor (GenBank Accession Number D21207). Each panel represents hybridization of the respective probe with a band of mRNA that is expressed at various levels in cell lines AT-4, AT-7, AT-10, AT-11, AT-12, AT-13, APT-3, P53-1, 101-7, and 292-3. Although p53 deficient tumors had amplification of chromosome 15 and concomitant upregulation of c-myc, the Atm deficient tumor cells had no increase in c-myc expression, in spite of amplification of chromosome 15. The benzodiazepine locus is expressed 5.3-fold higher than c-myc in normal thymus (data not shown), suggesting that at baseline, c-myc is expressed at decreased levels relative to this transcript. In Atm deficient tumor cell lines, c-myc shows decreased expression relative to benzodiazepine, whereas tumors that are either partially or completely p53 deficient show increased expression of c-myc relative to Bzrp (Figure 22B). Despite the presence of additional copies of c-myc, Atm deficient cell lines express c-myc at levels similar to those seen in normal thymus. In contrast, c-myc is highly expressed in p53 deficient thymic lymphomas, suggesting that tumorigenesis or survival in the absence of Atm may require active suppression of c-myc expression. Further, since tumorigenesis in the absence of Atm shows characteristics distinguishable from tumorigenesis in the absence of p53 function, thymic lymphomagenesis in the absence of Atm is not simply the result of lost p53 function downstream of Atm, but is suggestive of loss of genome maintenance (tumor suppressive) functions unique to the ATM protein. Further examination of the role of c-myc suppression in Atm deficient tumorigenesis and proliferative maintenance is necessary.
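The comparison of c-myc with the benzodiazepine receptor transcript (Bzrp) across lanes can be expressed as a simple ratio calculation. The sketch below assumes that band intensities have already been quantified for each probe in each lane; the function names and all intensity values are hypothetical and are not data from Figure 22B. Because the two probes may differ in labeling efficiency, the raw c-myc/Bzrp ratio in a single lane is not meaningful on its own, so each lane is expressed relative to the normal thymus lane.

```python
from typing import Dict


def myc_to_bzrp(myc: float, bzrp: float, gapdh: float) -> float:
    """c-myc/Bzrp ratio for one lane after GAPDH loading normalization."""
    if min(myc, bzrp, gapdh) <= 0:
        raise ValueError("band intensities must be positive")
    return (myc / gapdh) / (bzrp / gapdh)


def relative_to_reference(lanes: Dict[str, Dict[str, float]],
                          reference: str) -> Dict[str, float]:
    """Express each lane's c-myc/Bzrp ratio relative to a reference lane."""
    ref = myc_to_bzrp(**lanes[reference])
    return {name: myc_to_bzrp(**bands) / ref for name, bands in lanes.items()}


if __name__ == "__main__":
    # Hypothetical band intensities (arbitrary units), for illustration only.
    lanes = {
        "thymus": {"myc": 10.0, "bzrp": 50.0, "gapdh": 100.0},
        "tumor_a": {"myc": 10.0, "bzrp": 100.0, "gapdh": 100.0},
        "tumor_b": {"myc": 40.0, "bzrp": 50.0, "gapdh": 100.0},
    }
    for name, value in relative_to_reference(lanes, "thymus").items():
        print(f"{name}: {value:.2f}")
```

A value below 1 indicates lower c-myc expression relative to Bzrp than in the reference lane, and a value above 1 indicates higher relative c-myc expression.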

In summary, TOGA® analysis identified a polynucleotide that is upregulated in two Atm-/- tumor cell lines and that is correlated with an amplification of chromosome 15. Further characterization of loci on chromosome 15 demonstrated that although it is amplified, gene expression is tightly regulated at specific loci in the absence of ATM. In addition, Atm-/- tumors are distinct from p53 deficient tumors in that c-myc is not concomitantly upregulated with this chromosome amplification. The polynucleotide identified as the benzodiazepine receptor was upregulated in specific Atm-/- tumors, but not p53-/- tumors, suggesting an association of this mRNA with a distinct pathway of tumorigenesis, and suggesting that the expression of this transcript relates to a specific phenotype of AT cells. Thus this DST is an ideal biomarker for certain AT-specific lymphomas that may relate to a specific stage of tumorigenesis or aggressiveness.

EXAMPLE 6 Absence of ATM During Normal Thymus Does Not Result in Significant Differences in Gene Expression

Finally, to ascertain whether differential expression between tumor cell lines reflects tumorigenesis or merely results from loss of Atm expression, cDNA microarrays were used to compare the expression profile of normal and Atm-/- thymus at various time points prior to tumor development (Figure 23). As shown, we examined gene expression in normal and Atm-/- thymus at 4, 5, 8, 9 and 16 weeks relative to age-matched wild type mice. ESTs differentially expressed above a 1.4-fold change at more than one age point were studied. Of the 9094 ESTs profiled, 9 (~0.20% of the ESTs expressed; Figure 23, shown above 1.0) showed elevated expression and 8 (~0.18% of the ESTs expressed; Figure 23, shown below 1.0) showed reduced expression in Atm deficient thymus. Of these, only TCR gamma (Figure 23, asterisk) is differentially expressed both in the tumor cell lines profiled and in tumor free thymus. This suggests that most transcripts differentially expressed between Atm-/- tumor cell lines are unique to those lines and not the result of differences pre-existing in the absence of tumors. Quantitative RT-PCR confirmed differential expression of CD53 and TCRγ in the same samples profiled.
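The selection rule described above (a fold change above 1.4 at more than one age point) can be sketched as follows. This is a minimal illustration under stated assumptions: the input is taken to be a mutant/wild-type expression ratio per age point, reduced expression is assumed to use the reciprocal threshold (ratio below 1/1.4), and the function name, EST names and values are hypothetical rather than data from Figure 23.

```python
from typing import Dict, List, Tuple


def recurrent_changes(ratios: Dict[str, List[float]],
                      fold: float = 1.4,
                      min_points: int = 2) -> Tuple[List[str], List[str]]:
    """Return (elevated, reduced) ESTs changed at >= min_points age points.

    ratios maps an EST name to its mutant/wild-type expression ratios,
    one value per age point (e.g. 4, 5, 8, 9 and 16 weeks).
    """
    elevated, reduced = [], []
    for est, values in ratios.items():
        n_up = sum(1 for r in values if r > fold)
        # Assumed symmetric criterion for reduced expression.
        n_down = sum(1 for r in values if 0 < r < 1.0 / fold)
        if n_up >= min_points:
            elevated.append(est)
        elif n_down >= min_points:
            reduced.append(est)
    return elevated, reduced


if __name__ == "__main__":
    # Hypothetical ratios at five age points, for illustration only.
    example = {
        "EST_A": [1.5, 1.6, 1.0, 1.1, 0.9],  # elevated at two age points
        "EST_B": [0.6, 0.65, 1.0, 1.0, 1.0],  # reduced at two age points
        "EST_C": [1.5, 1.0, 1.0, 1.0, 1.0],  # changed at only one age point
    }
    up, down = recurrent_changes(example)
    print("elevated:", up)
    print("reduced:", down)
```

Requiring the change at more than one age point filters out ESTs such as the hypothetical EST_C, whose single deviation could reflect noise at one time point.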

Several of these differentially expressed transcripts have been implicated in thymocyte development. CD53 and cathepsin S are directly involved in pathways of thymocyte maturation and selection, respectively. Cyclin D2 governs entry into G1 in peripheral T-cells, and is involved in thymocyte proliferation. The general absence of maturing TCR alpha/beta CD4+/CD8+ T-cells in the Atm-/- thymus could result in the observed decrease in the levels of these transcripts. TCR gamma, which is upregulated in the absence of Atm, is repressed during T-cell maturation by expression of TCR alpha/beta. The absence of mature TCR alpha/beta T-cells in the Atm-/- thymus might explain observed increases in the level of this transcript. Ig alpha, also known as the immunoglobulin heavy chain locus (IgH), resides on mouse chromosome 12 telomeric to Tcl-1. In thymic lymphomas derived from Atm deficient mice, both Tcl-1 and IgH loci are deleted on all copies of chromosome 12 involved in translocations, but appear unaffected in intact chromatids. Decreased expression of IgH transcripts could represent loss of heterozygosity or regulatory disruption at this locus in advance of tumorigenesis, and may serve as a marker of premalignancy in Atm deficient thymus.

Summary

Ataxia Telangiectasia mutated (ATM) plays a key role in monitoring and responding to DNA damage, and is essential for the appropriate regulation of cell cycle checkpoint control for the maintenance of genome integrity. In the absence of Atm, defects in these control points may contribute to the increased incidence of genomic alterations, including deletions, amplifications and translocations, and thereby alter genetic stability and cancer susceptibility. Both AT-homozygous patients and mice deficient in Atm develop unusually proliferative and invasive lymphoblastic lymphomas and leukemias. Using spectral karyotype analysis we have determined a number of conserved chromosomal disruptions in T-cell lymphomas derived from separate Atm-deficient mice. The majority of tumor cells were at the CD4⁺/CD8⁺ stage of T-cell maturation, during which the TCRα locus undergoes rearrangement, and fluorescence in situ hybridization with probes for the TCRα locus indicated that both alleles of TCRα are disrupted in these cells. The TOGA® technology was used to examine the global RNA expression profiles of several Atm-/- T-cell lymphomas. TOGA® enabled the classification of cytogenetically similar tumors from a dissimilar tumor line based on the resulting expression profiles. In addition, this approach allowed the identification of chromosomal disruptions, including specific gene fusion events, which may account for the unusually aggressive and invasive nature of Atm-deficient tumors. These candidates have been validated by independent methods and several have been shown to be specifically expressed in Atm-deficient lymphomas and not in tumors lacking p53. By using this combined approach of powerful imaging techniques coupled with sensitive RNA profiling technologies, it has been possible to identify specific loci which appear to be involved in directing the initiation and progression of lymphoreticular malignancies which arise as a result of Atm deficiencies in both mice and humans.

Genes that are abnormally regulated among AT cell lines may be uniquely associated with AT tumorigenesis (as compared to other mechanisms of tumorigenesis) due to genetic abnormalities caused by AT deficiency. These may include phenomena, such as translocations of lymphoid specific promoters or other regulatory sequences in or near genes promoting cell growth, or inactivation or deletion of genes that may regulate or suppress lymphoid tumor growth. Thus, the examples shown here were also chosen in part on the basis of their potential location in or near chromosomal aberrations among the AT cell lines studied. The genes identified in these examples and in Table 2 may relate to specific biological aspects found both in the mouse model and in human AT tumorigenesis (e.g., the development of particularly aggressive forms of lymphomas). In summary, the TOGA^® technology has been shown to be an important tool for the identification of genes associated with chromosomal aberrations, which can then be used as biomarkers for diagnosis, prognosis, and further gene mapping studies.

The polynucleotides, polypeptides, kits and methods of the present invention may be embodied in other specific forms without departure from the teachings or essential characteristics of the invention. The described embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are, therefore, to be embraced within.

Table 1. Phenotypes and Cytogenetic Aberrations in ATM- and p53 Mutant Thymomas.

Abbreviations: Translocation (T), Deletion (Del), Duplication (Dup), Insertion (Ins), Dicentric chromosome (Dic), Robertsonian translocation (Rb), Inversion (Inv)

EST = Expressed Sequence Tag, N/A = Not Applicable

Table 5. TOGA differences >2-fold between tumor lines, demonstrating the uniqueness of AT-12 compared to the other AT tumor lines.

Table 6A. Genes with increased expression in a single tumor cell line.

* confirmed by Northern analysis

[] indicates synteny between mouse/human chromosomes, but not direct mapping evidence.

Table 6B. Genes with decreased expression in a single tumor cell line.

* confirmed by Northern analysis

[] indicates synteny between mouse/human chromosomes, but not direct mapping evidence.

Table 6C. Genes with increased expression in two tumor cell lines.

^* confirmed by northern analysis

[] indicates synteny between mouse/human chromosomes, but not direct mapping evidence

Claims

We claim:

1. A method for preventing, or treating a pathological condition, comprising administering to a mammalian subject a therapeutically effective amount of at least one of:

(a) a first polynucleotide chosen from the group consisting of SEQ ID NOs: 1-74, or a second polynucleotide at least 95% identical to said first polynucleotide; or

(b) a third polynucleotide at least ten bases in length that is hybridizable to said first polynucleotide under stringent conditions; or

(c) a gene corresponding to any of the foregoing polynucleotides, another gene at least 95% identical to said gene, or a region of any of the foregoing genes, wherein the pathological condition is selected from the group consisting of AT, AT tumors and other cancers.

2. A method for preventing, or treating a pathological condition, comprising administering to a mammalian subject a therapeutically effective amount of:

(a) a first polypeptide encoded by a polynucleotide chosen from the group consisting of SEQ ID NOs: 1-74 or another polynucleotide at least 95% identical to said polynucleotide; or

(b) a second polypeptide encoded by a gene corresponding to any of the foregoing polynucleotides or another gene at least 95% identical to said gene; or

(c) a third polypeptide at least 90% identical to one of the foregoing polypeptides; or

(d) a fragment of one of the foregoing polypeptides, wherein the pathological condition is selected from the group consisting of AT, AT tumors and other cancers.

3. A method for preventing, or treating a pathological condition, comprising administering to a mammalian subject a therapeutically effective amount of an antibody that binds specifically to at least one of:

(a) a first polypeptide encoded by a polynucleotide chosen from the group consisting of SEQ ID NOs: 1-74, or another polynucleotide at least 95% identical to said polynucleotide; or

(c) a third polypeptide at least 90% identical to any one of the foregoing polypeptides; or

(d) a fragment of any one of the foregoing polypeptides, wherein the pathological condition is selected from the group consisting of AT, AT tumors and other cancers.

4. A method for assessing the efficacy of a test compound for treating a pathological condition in a mammalian subject, the method comprising the step of comparing:

(a) a level of expression of a marker in a first sample obtained from the subject, wherein the first sample is exposed to the test compound and wherein the marker is selected from the group consisting of polynucleotides listed in SEQ ID NO: 1-74; polypeptides encoded by the polynucleotides listed in SEQ ID NO: 1-74; and fragments thereof; and

(b) a level of expression of the same marker in a second sample obtained from the subject, wherein the second sample is not exposed to the test compound, and wherein a substantially increased or decreased level of expression of the marker in the first sample, relative to the second sample, is an indication that the test compound is efficacious in treating the pathological condition, wherein the pathological condition is selected from the group consisting of AT, AT tumors and other cancers.

5. A method for diagnosing a pathological condition or a susceptibility to a pathological condition in a subject comprising:

(a) determining the presence of a mutation in:

(i) a first polynucleotide chosen from the group consisting of SEQ ID NOs: 1-74, or a second polynucleotide at least 95% identical to said first polynucleotide; or

(ii) a third polynucleotide at least ten bases in length that is hybridizable to said first polynucleotide under stringent conditions; or

(iii) a gene corresponding to any of the foregoing polynucleotides, another gene at least 95% identical to said gene, or a region of any of the foregoing genes; and

(b) diagnosing the pathological condition or a susceptibility to the pathological condition based on the presence of said mutation, wherein the pathological condition is selected from the group consisting of AT, AT tumors and other cancers.

6. A method for diagnosing a pathological condition or a susceptibility to a pathological condition in a subject, the method comprising:

(a) obtaining a first biological sample from a patient suspected of having the pathological condition;

(b) obtaining a second sample from a suitable comparable control source;

(c) determining in the first and second samples a level of expression of at least one polynucleotide selected from the group consisting of:

(i) a first polynucleotide chosen from the group consisting of SEQ ID NOs: 1-74;

(ii) a second polynucleotide at least 95% identical to said first polynucleotide;

(iii) a third polynucleotide at least ten bases in length that is hybridizable to said first polynucleotide under stringent conditions; and

(iv) a gene corresponding to any of the foregoing polynucleotides, another gene at least 95% identical to said gene, or a region of any of the foregoing genes;

(d) comparing in the first and second samples the level of expression of the at least one polynucleotide, wherein a patient is diagnosed as having or susceptible to the pathological condition if the amount of the at least one polynucleotide molecule in the first sample is greater than or less than the amount of the at least one polynucleotide molecule in the second sample, and wherein the pathological condition is selected from the group consisting of AT, AT tumors and other cancers.

7. A method for diagnosing a pathological condition or a susceptibility to a pathological condition in a subject, the method comprising:

(b) obtaining a second sample from a suitable comparable control source;

(c) determining in the first and second samples a level of expression of at least one polypeptide selected from the group consisting of:

(i) a first polypeptide encoded by a polynucleotide chosen from the group consisting of SEQ ID NOs: 1-74 or another polynucleotide at least 95% identical to said polynucleotide;

(ii) a second polypeptide encoded by a gene corresponding to any of the foregoing polynucleotides or another gene at least 95% identical to said gene;

(iii) a third polypeptide at least 90% identical to one of the foregoing polypeptides; and

(iv) a fragment of one of the foregoing polypeptides;

(d) comparing the level of expression of the at least one polypeptide in the first and second samples, wherein a patient is diagnosed as having or susceptible to the pathological condition if the amount of the at least one polypeptide in the first sample is greater than or less than the amount of the at least one polypeptide in the second sample, and wherein the pathological condition is selected from the group consisting of AT, AT tumors and other cancers.

8. A method for assessing the efficacy or toxicity of a therapeutic treatment for a pathological condition by testing for regulation of at least one of:

9. A method for identifying a binding partner comprising:

(a) contacting at least one polypeptide with a binding partner, wherein the at least one polypeptide is selected from the group consisting of:

(iv) a fragment of one of the foregoing polypeptides; and

(b) determining whether the binding partner affects an activity of said polypeptide.

10. A method for identifying a binding partner, the method comprising:

(i) a first polypeptide encoded by a polynucleotide chosen from the group consisting of SEQ ID NOs: 1-74, or another polynucleotide at least 95% identical to said polynucleotide;

(iii) a third polypeptide at least 90% identical to any one of the foregoing polypeptides; and

(iv) a fragment of any one of the foregoing polypeptides;

(b) contacting the at least one polypeptide bound to the binding partner with an antibody capable of immunospecifically binding to the at least one polypeptide bound to the binding partner.

11. A first substantially pure isolated DNA molecule suitable for use as a probe for genes regulated in a pathological condition, wherein said first substantially pure isolated DNA molecule is a polynucleotide chosen from the group consisting of SEQ ID NO: 1-74, a gene corresponding to said polynucleotide, or regions of said gene; or a second substantially pure isolated DNA molecule at least 95% similar to said first substantially pure isolated DNA molecule, wherein the pathological condition is selected from the group consisting of AT, AT tumors and other cancers.

12. A kit for detecting in a sample from a mammalian subject the presence of at least one polypeptide encoded by:

(c) a gene corresponding to any of the foregoing polynucleotides, another gene at least 95% identical to said gene, or a region of any of the foregoing genes, wherein said kit comprises a biomolecule which specifically binds with said at least one polypeptide in an amount sufficient for at least one assay and suitable packaging material.

13. The kit of claim 12, wherein the biomolecule is a first antibody.

14. The kit of claim 13 further comprising a second antibody that binds to the first antibody, wherein the second antibody is labeled.

15. A kit for detecting the presence of a gene encoding a protein comprising a first polynucleotide chosen from the group consisting of SEQ ID NO: 1-74, or a second polynucleotide at least 95% identical to said first polynucleotide, or a third polynucleotide at least ten bases in length that is hybridizable to said first polynucleotide under stringent conditions, or a fragment of any of the foregoing polynucleotides having at least 10 contiguous bases, in an amount sufficient for at least one assay, and suitable packaging material.

16. A method for detecting the presence of a nucleic acid encoding a protein in a mammalian subject, comprising the steps of:

(a) obtaining a biological sample from the subject;

(b) hybridizing with a first polynucleotide from the sample or a first gene corresponding to said first polynucleotide:

(i) a second polynucleotide chosen from the group consisting of SEQ ID NO: 1-74 or a second gene corresponding to said second polynucleotide; or

(ii) a third polynucleotide at least 95% identical to said second polynucleotide or a third gene corresponding to said third polynucleotide; or

(iii) a fourth polynucleotide at least ten bases in length that is hybridizable to said second polynucleotide under stringent conditions or a fourth gene corresponding to said fourth polynucleotide; or

(iv) a fragment of any of the foregoing polynucleotides or any of the foregoing genes having at least 10 contiguous bases; and

(c) detecting the presence of the hybridization product.

17. A method for providing a therapeutic molecule to a mammalian subject afflicted with a pathological condition and in need of a therapeutic molecule, the method comprising:

(a) linking the therapeutic molecule to a polynucleotide selected from the group consisting of:

(i) a first polynucleotide identified in SEQ ID NOs: 1-74;

(iv) a gene corresponding to any of the foregoing polynucleotides, another gene at least 95% identical to said gene, or a region of any of the foregoing genes; and

(b) administering the therapeutic molecule linked to the polynucleotide to the mammalian subject, wherein the therapeutic molecule is selected from the group consisting of genes, vaccines, diagnostic reagents, peptides, proteins and macromolecules and wherein the pathological condition is selected from the group consisting of AT, AT tumors and other cancers.

18. A method for providing a therapeutic molecule to a mammalian subject afflicted with a pathological condition and in need of the therapeutic molecule, the method comprising:

(a) linking the therapeutic molecule to a polypeptide selected from the group consisting of:

(i) a first polypeptide encoded by a polynucleotide identified in SEQ ID NOs: 1-74, or another polynucleotide at least 95% identical to said polynucleotide;

(iv) a fragment of any one of the foregoing polypeptides; and

(b) administering the therapeutic molecule linked to the polypeptide to the mammalian subject, wherein the therapeutic molecule is selected from the group consisting of genes, vaccines, diagnostic reagents, peptides, proteins and macromolecules and wherein the pathological condition is selected from the group consisting of AT, AT tumors and other cancers.

19. A method for providing a therapeutic molecule to a mammalian subject afflicted with a pathological condition and in need of the therapeutic molecule, the method comprising:

(a) linking the therapeutic molecule to an antibody capable of immunospecific binding to a polypeptide selected from the group consisting of:

(iv) a fragment of any one of the foregoing polypeptides; and

(b) administering the therapeutic molecule linked to the antibody to the mammalian subject, wherein the therapeutic molecule is selected from the group consisting of genes, vaccines, diagnostic reagents, peptides, proteins and macromolecules and wherein the pathological condition is selected from the group consisting of AT, AT tumors and other cancers.

20. A method for predicting whether a subject afflicted with a pathological condition is likely to respond favorably to a treatment prior to administration of the treatment to the subject, the method comprising the steps of:

(a) obtaining a sample from the subject;

(b) determining a level of expression of at least one of:

(i) a first polynucleotide selected from the group consisting of SEQ ID NOs: 1-74;

(ii) a second polynucleotide at least 95% identical to the first polynucleotide;

(iii) a gene corresponding to any of the foregoing polynucleotides, another gene at least 95% identical to the gene, or a region of any of the foregoing genes;

(iv) a first polypeptide encoded by a polynucleotide selected from the group consisting of SEQ ID NOs: 1-74;

(v) a second polypeptide encoded by a gene corresponding to any of the foregoing polynucleotides or another gene at least 95% identical to said gene;

(vi) a third polypeptide at least 90% identical to one of the foregoing polypeptides; or

(vii) a fragment of one of the foregoing polypeptides; and

(c) comparing the level of expression to a database comprising expression patterns from patients previously given the treatment, wherein a similar level of expression from the subject as compared to the level of expression from the database of patients that responded favorably to the treatment predicts that the subject will respond favorably to the treatment and wherein the pathological condition is selected from the group consisting of AT, AT tumors and other cancers.

21. A method for predicting a metastatic potential, remission or regression of a tumor in a subject, the method comprising the steps of:

(a) obtaining a sample from the subject;

(b) determining a level of expression of at least one of:

(vii) a fragment of one of the foregoing polypeptides; and

(c) comparing the level of expression to a database comprising levels of expression patterns correlated with metastatic potential, remission or regression from patients having tumors, wherein a similar level of expression from the subject as compared to the level of expression from the database predicts the metastatic potential, remission or regression of the tumor in the subject.

22. A method for determining a stage of tumor progression in a subject, the method comprising the steps of:

(a) obtaining a sample from the subject;

(b) determining a level of expression of at least one of:

(vii) a fragment of one of the foregoing polypeptides; and

(c) comparing the level of expression to a database comprising expression patterns correlated with the stage of tumor progression from patients previously diagnosed with tumors, wherein a similar level of expression from the subject as compared to the level of expression from the database of patients previously diagnosed with tumors determines the stage of tumor progression in the subject.

23. A method for identifying genes associated with a chromosomal aberration in AT and AT tumors, the method comprising the steps of:

(a) obtaining a sample from a subject;

(b) isolating RNA from the sample;

(c) performing TOGA on the isolated RNA;

(d) identifying genes that are differentially regulated based on TOGA;

(e) determining whether the differentially regulated genes comprise a genetic marker at a site of the chromosomal aberration, wherein the chromosomal aberration is selected from the group consisting of chromosome breaks, amplifications, deletions and translocations.

24. A first isolated fusion polynucleotide comprising SEQ ID NOs: 70-74; or a second isolated fusion polynucleotide at least 95% identical to the first isolated fusion polynucleotide.

25. An isolated nucleic acid molecule at least ten bases in length that is hybridizable to any isolated fusion polynucleotide of claim 24 under stringent conditions.

26. An isolated nucleic acid molecule comprising a homolog, paralog, or ortholog of any isolated fusion polynucleotide of claim 24.

27. An isolated fusion polypeptide encoded by:

(a) a first fusion polynucleotide identified in SEQ ID NOs:70-74; or

(b) a second fusion polynucleotide at least 95% identical to the first fusion polynucleotide; or

(c) complements and degenerate variants of any of the foregoing polynucleotides.

28. An isolated fusion polypeptide, the amino acid sequence of which is 90% identical to any polypeptide of claim 27.

29. An isolated fragment of any fusion polypeptide of claim 27.

30. An isolated fusion polypeptide that is a homolog, paralog, or ortholog of any fusion polypeptide of claim 27.

31. An isolated antibody that binds specifically to any fusion polypeptide of claim 27.

32. The isolated antibody of claim 31, wherein the antibody is selected from the group consisting of monoclonal and polyclonal antibodies.