WO2000009655A9 - Genes amplified in cancer cells - Google Patents

Genes amplified in cancer cells

Info

Publication number
WO2000009655A9
Authority
WO
WIPO (PCT)
Prior art keywords
rna
cells
cancer
gene
sequence
Prior art date
Application number
PCT/US1999/018101
Other languages
French (fr)
Other versions
WO2000009655A3 (en)
WO2000009655A2 (en)
Inventor
Helene Di Smith
Ling-Chun Chen
Original Assignee
California Pacific Med Center
Smith Alan Hm
Chen Ling Chun
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by California Pacific Med Center, Smith Alan Hm, Chen Ling Chun filed Critical California Pacific Med Center
Priority to AU54748/99A priority Critical patent/AU5474899A/en
Publication of WO2000009655A2 publication Critical patent/WO2000009655A2/en
Publication of WO2000009655A3 publication Critical patent/WO2000009655A3/en
Publication of WO2000009655A9 publication Critical patent/WO2000009655A9/en

Classifications

    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/82: Translation products from oncogenes

Definitions

  • The present invention relates generally to the field of human genetics. More specifically, it relates to the identification and characterization of novel genes associated with overabundance of RNA in human cancer, such as breast cancer. It pertains especially to those genes and the products thereof which may be important in diagnosis and treatment.
  • Cancer is a heterogeneous disease. It manifests itself in a wide variety of tissue sites, with different degrees of de-differentiation, invasiveness, and aggressiveness. Some forms of cancer are responsive to traditional modes of therapy, but many are not. For most common cancers, there is a pressing need to improve the arsenal of therapies available to provide more precise and more effective treatment in a less invasive way. As an example, breast cancer has an unsatisfactory morbidity and mortality, despite presently available forms of medical intervention. Traditional clinical initiatives are focused on early diagnosis, followed by surgery and chemotherapy. Such interventions are of limited success, particularly in patients where the tumor has undergone metastasis.
  • The heterogeneous nature of cancer arises because different cancer cells achieve their growth and pathological properties by different phenotypic alterations. Alteration of gene expression is intimately related to the uncontrolled growth and de-differentiation that are hallmarks of cancer. Certain similar phenotypic alterations in turn may have a different genetic base in different tumors. Yet, the number of genes central to the malignant process must be a finite one. Accordingly, new pharmaceutical
  • the first type is the decreased expression of recessive genes, known as tumor suppressor genes, that apparently act to prevent malignant growth
  • dominant genes such as oncogenes
  • alteration in the expression of either type of gene is a potential diagnostic indicator
  • a treatment strategy might seek to reinstate the expression of suppressor genes, or reduce the expression of dominant genes
  • the present invention is directed to identifying genes of either type, particularly those of the second type
  • amplification. This is a process whereby the gene is duplicated within the chromosomes of the ancestral cell into multiple copies.
  • The process involves unscheduled replications of the region of the chromosome comprising the gene, followed by recombination of the replicated segments back into the chromosome (Alitalo et al.). As a result, 50 or more copies of the gene may be produced.
  • the duplicated region is sometimes referred to as an "amplicon"
  • the level of expression of the gene, that is, the amount of messenger RNA produced
  • erbB2 gene, also known as HER-2/neu
  • HSRs (homogeneously staining regions) are chromosomal regions that appear in karyotype analysis with intermediate density Giemsa staining throughout their length, rather than with the normal pattern of alternating dark and light bands. They correspond to multiple gene repeats. HSRs are particularly abundant in breast cancers, showing up in 60-65% of tumors surveyed (Dutrillaux et al., Zafrani et al.). When such regions are checked by in situ hybridization with probes for any of 16 known human oncogenes, including erbB2 and myc, only a proportion of tumors show any hybridization to HSR regions. Furthermore, only a proportion of the HSRs within each karyotype are implicated.
  • comparative genomic hybridization (CGH)
  • genes that undergo duplication in cancer are a daunting challenge
  • human oncogenes have been identified by hybridizing with probes for other known growth-promoting genes, particularly known oncogenes in other species
  • The erbB2 gene was identified using a probe from a chemically induced rat neuroglioblastoma (Slamon et al.). Genes with novel sequences and functions will evade this type of search.
  • Genes may be cloned from an area identified as containing a duplicated region by the CGH method. Since CGH is able to indicate only the approximate chromosomal region of duplicated genes, an extensive amount of experimentation is required to walk through the entire region and identify the particular gene involved. Genes may also be overexpressed in cancer without being duplicated. Methods that rely on identification from genetic abnormalities necessarily bypass such genes. Increased expression can come about through a higher level of transcription of the gene, for example, by up-regulation of the promoter or substitution with an alternative promoter. It can also occur if the transcription product
  • RNA preparation and expanded via the polymerase chain reaction using primers of particular specificity. Similar subpopulations are compared across several RNA preparations by gel autoradiography for expression differences. In order to survey the RNA preparations entirely, the assay is repeated with a comprehensive set of PCR primers.
  • the screening strategy more effectively includes multiple positive and negative control samples (Sunday et al.)
  • the method has recently been applied to breast cancer cell lines, and highlights a number of expression differences (Liang et al. 1992b, Chen et al., McKenzie et al., Watson et al. 1994 & 1996, Kocher et al.)
  • By excising the corresponding region of the separating gel, it is possible to recover and sequence the cDNA.
  • differential display problems remain in terms of applying it in the search for new cancer genes
  • the method can be used for any type of cancer, providing a plurality of cell populations or cell lines of the type of cancer are available, in conjunction with a suitable control cell population
  • the method is highly effective in identifying genes and gene products that are intimately related to malignant transformation or maintenance of the malignant properties of the cancer cells
  • An important derivative of applying the method is the selection and retrieval of cDNA and cDNA fragments corresponding to the cancer-associated gene. These fragments can be used, inter alia, to determine the nucleotide sequence of the gene and mRNA, the amino acid sequence of any encoded protein, or to retrieve from a cDNA or genomic library additional polynucleotides related to the gene or its transcripts. Since the genes are typically involved in the malignant process of the cell, the polynucleotides, polypeptides, and antibodies derived by using this method can in turn be used
  • Another objective of this invention is to provide isolated polynucleotides, polypeptides, and antibodies derived from four novel genes which are associated with several different types of cancer, including breast cancer
  • The genes are designated CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1. These designations refer to both strands of the cDNA and fragments thereof, and to the respective corresponding messenger RNA, including splice variants, allelic variants, and fragments of any of these forms.
  • These genes show RNA overabundance in a majority of cancer cell lines tested. A majority of the cells showing RNA overabundance also have duplication of the corresponding gene.
  • Another object of this invention is to provide materials and methods based on these polynucleotides, polypeptides, and antibodies for use in the diagnosis and treatment of cancer, particularly breast cancer
  • one embodiment of this invention is an isolated polynucleotide comprising a linear sequence contained in a polynucleotide selected from the group consisting of CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1
  • the linear sequence is contained in a duplicated gene or overabundant RNA in cancerous cells
  • the RNA may be overabundant due to gene duplication, increased RNA transcription or processing, increased RNA persistence, any combination thereof, or by any other mechanism, in a proportion of breast cancer cells
  • the RNA is overabundant in at least about 20% of a representative panel of breast cancer cell lines such as the panels listed herein, more preferably, it is overabundant in at least about 40% of the panel, even more preferably, it is overabundant in at least 60% or more of the panel
  • the RNA is overabundant in at least about 5% of spontaneously occurring breast cancer tumors, more preferably, it is overa
  • a sequence of at least 10 nucleotides is essentially identical between the isolated polynucleotide of the invention and a cDNA from CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1; more preferably, a sequence of at least about 15 nucleotides is essentially identical; more preferably, a sequence of at least about 20 nucleotides is essentially identical; more preferably, a sequence of at least about 30 nucleotides is essentially identical; more preferably, a sequence of at least about 40 nucleotides is essentially identical; even more preferably, a sequence of at least about 70 nucleotides is essentially identical; still more preferably, a sequence of about 100 nucleotides or more is essentially identical
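The graded length thresholds above can be checked mechanically. The following is a minimal Python sketch, not part of the original disclosure. It treats "essentially identical" as an exact match, which is a simplifying assumption, and the names `longest_shared_run`, `thresholds_met`, and `THRESHOLDS` are hypothetical.

```python
# Length thresholds named in the text, from least to most preferred.
THRESHOLDS = (10, 15, 20, 30, 40, 70, 100)


def longest_shared_run(a: str, b: str) -> int:
    """Return the length of the longest contiguous stretch found in both sequences."""
    a, b = a.upper(), b.upper()
    best = 0
    prev = [0] * (len(b) + 1)  # run lengths ending at the previous position of a
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous pair of positions.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def thresholds_met(candidate: str, reference: str) -> list:
    """List every threshold satisfied by the longest exactly matching run."""
    run = longest_shared_run(candidate, reference)
    return [t for t in THRESHOLDS if run >= t]
```

For example, `thresholds_met(candidate, reference)` returns `[10, 15]` when the longest exactly shared stretch is between 15 and 19 nucleotides long. A practical comparison would also need to tolerate mismatches and consider the complementary strand.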
  • a further embodiment of this invention is an isolated polynucleotide comprising a linear sequence essentially identical to a sequence selected from the group consisting of SEQ ID NO 15, SEQ ID NO 18, SEQ ID
  • This invention also provides an isolated polypeptide comprising a sequence of amino acids essentially identical to the polypeptide encoded by or translated from a polynucleotide selected from the group consisting of CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1
  • a sequence of at least about 5 amino acids is essentially identical between the polypeptide of this invention and that encoded by the polynucleotide
  • a sequence of at least about 10 amino acids is essentially identical, more preferably, a sequence of at least 15 amino acids is essentially identical, even more preferably, a sequence of at least 20 amino acids is essentially identical, still more preferably, a sequence of about 30 amino acids or more is essentially identical
  • the polypeptide comprises a linear sequence of at least 15 amino acids essentially identical to a sequence encoded by said polynucleotide
  • Another embodiment of this invention is a polypeptide comprising a sequence
  • a further embodiment of this invention is a method of using the polynucleotides of this invention for detecting or measuring gene duplication in cancerous cells, especially but not limited to breast cancer cells, comprising the steps of reacting DNA contained in a clinical sample with a reagent comprising the polynucleotide, said clinical sample having been obtained from an individual suspected of having cancerous cells, and comparing the amount of complexes formed between the reagent and the DNA in the clinical sample with the amount of complexes formed between the reagent and DNA in a control sample
  • a further embodiment is a method of using the polynucleotides of this invention for detecting or measuring overabundance of RNA in cancerous cells, especially but not limited to breast cancer cells, comprising the steps of reacting RNA contained in a clinical sample with a reagent comprising the polynucleotide, said clinical sample having been obtained from an individual suspected of having cancerous cells, and comparing the amount of complexes formed between the reagent and the RNA in the clinical sample with the amount of complexes formed between the reagent and RNA in a control sample
  • Another embodiment of this invention is a diagnostic kit for detecting or measuring gene duplication or RNA overabundance in cells contained in an individual as manifest in a clinical sample, comprising a reagent and a buffer in suitable packaging, wherein the reagent comprises a polynucleotide of this invention
  • Another embodiment of this invention is a method of using a polypeptide of this invention for detecting or measuring specific antibodies in a clinical sample, comprising the steps of reacting antibodies contained in the clinical sample with a reagent comprising the polypeptide, said clinical sample having been obtained from an individual suspected of having cancerous cells, especially but not limited to breast cancer cells, and comparing the amount of complexes formed between the reagent and the antibodies in the clinical sample with the amount of complexes formed between the reagent and antibodies in a control sample
  • Another embodiment of this invention is a method of using an antibody of this invention for detecting or measuring altered protein expression in a clinical sample, comprising the steps of reacting a polypeptide contained in the clinical sample with a reagent comprising the antibody, said clinical sample having been obtained from an individual suspected of having cancerous cells, especially but not limited to breast cancer cells, and comparing the amount of complexes formed between the reagent and the polypeptide in the clinical sample with the amount of complexes formed between the reagent and a polypeptide in a control sample
  • diagnostic kits for detecting or measuring a polypeptide or antibody present in a clinical sample comprising a reagent and a buffer in suitable packaging, wherein the reagent respectively comprises either an antibody or a polypeptide of this invention
  • Yet another embodiment of this invention is a host cell transfected by a polynucleotide of this invention
  • a further embodiment of this invention is a method for using a polynucleotide for screening a pharmaceutical candidate, comprising the steps of separating progeny of the transfected host cell into a first group and a second group, treating the first group of cells with the pharmaceutical candidate, not treating the second group of cells with the pharmaceutical candidate, and comparing the phenotype of the treated cells with that of the untreated cells
  • This invention also embodies a pharmaceutical preparation for use in cancer therapy, comprising a polynucleotide or polypeptide embodied by this invention, said preparation being capable of reducing the pathology of cancerous cells, especially for but not limited to breast cancer cells
  • Still another embodiment of this invention is a pharmaceutical preparation or active vaccine comprising a polypeptide embodied by this invention in an immunogenic form and a pharmaceutically compatible excipient
  • a further embodiment is a method for treatment of cancer, especially but not limited to breast cancer, either prophylactically or after cancerous cells are present in an individual being treated, comprising administration of the aforementioned pharmaceutical preparation
  • Another series of embodiments of this invention relate to methods for obtaining cDNA corresponding to a gene associated with cancer, comprising the steps of a) supplying an RNA preparation from uncultured control cells, b) supplying RNA preparations from at least two different cancer cells, c) displaying cDNA corresponding to the RNA preparations of step a) and step b) such that different cDNA corresponding to different RNA in each preparation are displayed separately, d) selecting cDNA corresponding to RNA that is present in greater abundance in the cancer cells of step b) relative to the control cells of step a), e) supplying a digested DNA preparation from control cells, f) supplying digested DNA preparations from at least two different cancer cells, g) hybridizing the cDNA of step d) with the digested DNA preparations of step e) and step f), and h) further selecting cDNA from the cDNA of step d) corresponding to genes that are duplicated in the cancer cells of step f) relative to the control
  • Cancer cells are preferably used for step b) that share a duplicated gene in the same region of a chromosome. If desired, the practitioner may test cancer cells beforehand to detect the duplication or deletion of chromosome regions, or cancer cell lines may be used that have already been characterized in this respect.
  • a higher plurality of cancer cells are preferably used to provide DNA for step b), step f), or preferably both step b) and step f)
  • the use of three cancer cells is preferred over two, the use of four cancer cells is more preferred, about five cancer cells is still more preferred, about eight cancer cells is even more preferred
  • the cDNA of each cancer cell population is displayed or hybridized separately, in accordance with the method
  • a higher plurality of control cells are preferably used to provide DNA for step a), step e), or preferably both step a) and step e)
  • the use of two control cell populations is preferred, the use of three or more is even more preferred
  • Both proliferating and non-proliferating populations are preferably used, if available
  • Control cells are preferably supplied fresh from a tissue source, and are not cultured or transformed into a cell line. This is increasingly important when the number of control cell populations used in step a) is only one or two. Freshly obtained cancer cells may also be used as an alternative to cancer cell lines, although this is less critical.
  • An additional screening step is preferably conducted in which the cDNA corresponding to the putative cancer-associated gene is additionally hybridized with a digested mitochondrial DNA preparation, to eliminate mitochondrial genes
  • This screening step may be conducted before, between, subsequent to, or simultaneously with the other screening steps of the method
  • RNA is supplied from a plurality of cancer cells, and one or preferably more control cell populations, the RNA is contacted with cDNA corresponding to the putative cancer-associated gene under conditions that permit formation of a stable duplex, and cDNA is selected corresponding to RNA that is present in greater abundance in a proportion of the cancer cells relative to the control cells
  • the plurality of cancer cells is a panel of at least five, preferably at least ten cells
  • at least three, more preferably at least five of the cancer cells show greater abundance of RNA
  • at least one and preferably more of the cancer cells shows a greater abundance of RNA compared with control cells, but does not show duplication of the corresponding gene in step h) of the method
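Steps a) through h) of the method amount to a two-stage filter: first on RNA abundance, then on gene duplication. The following is a minimal Python sketch under stated assumptions. Signals are taken to be already quantified as numbers, "greater abundance" and "duplicated" are both read as exceeding a fold threshold over the highest control value, and the `Candidate` fields, the default `fold` of 2.0, and the minimum cell counts are hypothetical choices rather than values from the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    name: str
    rna_control: List[float]  # step a): RNA signal per control population
    rna_cancer: List[float]   # step b): RNA signal per cancer cell population
    dna_control: List[float]  # step e): hybridization signal per control DNA digest
    dna_cancer: List[float]   # step f): hybridization signal per cancer DNA digest


def count_elevated(cancer: List[float], control: List[float], fold: float) -> int:
    """Count cancer samples whose signal exceeds fold x the highest control signal."""
    baseline = max(control)
    return sum(1 for value in cancer if value > fold * baseline)


def screen(candidates, fold=2.0, min_rna_cells=2, min_dup_cells=1):
    """Return names of cDNAs passing the RNA filter (step d) and the DNA filter (step h)."""
    selected = []
    for c in candidates:
        rna_hits = count_elevated(c.rna_cancer, c.rna_control, fold)  # step d)
        dna_hits = count_elevated(c.dna_cancer, c.dna_control, fold)  # step h)
        if rna_hits >= min_rna_cells and dna_hits >= min_dup_cells:
            selected.append(c.name)
    return selected
```

Taking the highest control value as the baseline is a conservative choice: a cDNA is only retained when it stands out against every control population supplied.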
  • inventions are methods for obtaining cDNA corresponding to a gene that is deleted or underexpressed in cancer, comprising the steps of a) supplying an RNA preparation from control cells, b) supplying RNA preparations from at least two different cancer cells that share a deleted gene in the same region of a chromosome, c) displaying cDNA corresponding to the RNA preparations of step a) and step b) such that different cDNA corresponding to different RNA in each preparation are displayed separately, and d) selecting cDNA corresponding to RNA that is present in lower abundance in the cancer cells of step b) relative to the control cells of step a)
  • Such methods typically comprise the following further steps e) supplying a digested DNA preparation from control cells, f) supplying digested DNA preparations from at least two different cancer cells, g) hybridizing the cDNA of step d) with the digested DNA preparations of step e) and step f), and h) further selecting cDNA from the cDNA of
  • Additional embodiments of this invention are methods for screening candidate drugs for cancer treatment, comprising obtaining cDNA corresponding to a gene that is duplicated, overexpressed, deleted, or underexpressed in cancer, and comparing the effect of the candidate drug on a cell genetically altered with the cDNA or fragment thereof with the effect on a cell not genetically altered
  • Cancers of particular interest include lung cancer, glioblastoma, pancreatic cancer, colon cancer, prostate cancer, hepatoma, myeloma, and breast cancer
  • Figure 1 is a half-tone reproduction of an autoradiogram of a differential display experiment, in which radiolabeled cDNA corresponding to a subset of total messenger RNA in different cells are compared. This is used to select cDNA corresponding to particular RNA that are overabundant in breast cancer.
  • Figure 2 is a half-tone reproduction of an autoradiogram of electrophoresed DNA digests from a panel of breast cancer cell lines probed with a CH8-2a13-1 insert (Panel A) or a loading control (Panel B)
  • Figure 3 is a half-tone reproduction of an autoradiogram of electrophoresed total RNA from a panel of breast cancer cell lines probed with a CH8-2a13-1 insert (Panel A) or a loading control (Panel B)
  • Figure 4 is a half-tone reproduction of an autoradiogram of electrophoresed DNA digests from a panel of breast cancer cell lines probed with a CH13-2a12-1 insert
  • Figure 5 is a half-tone reproduction of an autoradiogram of electrophoresed total RNA from a panel of breast cancer cell lines probed with a CH13-2a12-1 insert
  • Figure 6 is a map of cDNA fragments obtained for the breast cancer associated genes CH1-9a11-2, CH8-2a13-1, CH13-2a12-1 and CH14-2a16-1. Regions of the fragments used to deduce sequence data listed in the application are indicated by shading. Nucleotide positions are numbered from the left-most residue for which double-strand sequence data has been obtained, which is not necessarily the 5' terminus of the corresponding message.
  • Figure 7 is a listing of primers used for obtaining the cDNA sequence data for CH1-9a11-2
  • Figure 8 is a listing of cDNA sequence obtained for CH1-9a11-2
  • Figure 9 is a listing of the amino acid sequence corresponding to the longest open reading frame of the DNA sequence of CH1-9a11-2 shown in Figure 8
  • The single-letter amino acid code is used. Stop codons are indicated by a dot (•).
  • the upper panel shows the complete amino acid translation, the lower panel shows the predicted gene product protein sequence
  • a possible transmembrane region is indicated by underlining
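The translation convention described for Figure 9 (single-letter code, stop codons shown as a dot, longest open reading frame) can be sketched in Python as follows. This is an illustration only, assuming the standard genetic code, forward-strand frames only, and an open reading frame defined as running from the first methionine to the next stop. The names `translate` and `longest_orf` are hypothetical, and an ASCII "." stands in for the dot used in the figures.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, frame: int = 0, stop: str = ".") -> str:
    """Translate one forward reading frame; stops become `stop`, unknown codons 'X'."""
    dna = dna.upper()
    residues = []
    for i in range(frame, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        residues.append(stop if aa == "*" else aa)
    return "".join(residues)


def longest_orf(dna: str) -> str:
    """Return the longest Met-initiated, stop-free stretch over the three forward frames."""
    best = ""
    for frame in range(3):
        for segment in translate(dna, frame).split("."):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best):
                best = segment[start:]
    return best
```

Here `translate` corresponds to the "complete translation" of the upper panel and `longest_orf` to one plausible reading of the "predicted gene product" of the lower panel.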
  • Figure 10 is a listing of primers used for obtaining the cDNA sequence data for CH8-2a13-1
  • Figure 11 is a listing of cDNA sequence obtained for CH8-2a13-1
  • Figure 12 is a listing of the amino acid sequence corresponding to the longest open reading frame of the DNA sequence of CH8-2a13-1 shown in Figure 11
  • the upper panel shows the complete amino acid translation
  • the lower panel shows the predicted gene product protein sequence
  • Figure 13 is a listing of the nucleotide sequence predicted for a full-length CH8-2a13-1 cDNA
  • Figure 14 is a listing of the amino acid sequence corresponding to the longest open reading frame of the DNA sequence of CH8-2a13-1 shown in Figure 13
  • Figure 15 is a listing of primers used for obtaining the cDNA sequence data for CH13-2a12-1
  • Figure 16 is a listing of cDNA sequence obtained for CH13-2a12-1. As explained in Example 6, the first 405 base pairs shown in this sequence are believed to be part of an intron and do not typically appear in the mature mRNA. Additional sequence present in the mature mRNA is shown in Figure 35.
  • Figure 17 is a listing of the amino acid sequence corresponding to the longest open reading frame of the DNA sequence of CH13-2a12-1 shown in Figure 16
  • the upper panel shows the complete amino acid translation
  • the lower panel shows the predicted gene product protein sequence
  • Figure 18 is a listing of primers used for obtaining cDNA sequence data for CH13-2a12-1
  • Figure 19 is a listing of the cDNA sequence data obtained by two-directional sequencing for CH14-2a16-1
  • Figure 20 is a listing of the amino acid sequence corresponding to the longest open reading frame of the DNA sequence of CH14-2a16-1 shown in Figure 19
  • the upper panel shows the complete amino acid translation
  • The lower panel shows the predicted gene product protein sequence. Residues corresponding to three zinc finger motifs are underlined, indicating that the protein may have DNA or RNA binding activity.
  • Figure 21 is a listing of additional DNA sequence data towards the 5' end of CH14-2a16-1 obtained by one-directional sequencing of the fragment pCH14-1 3
  • The first two panels show nucleotide and amino acid sequence from the 5' end of the fragment; the second two panels show nucleotide and amino acid sequence from the 3' end of the fragment. Regions of overlap with pCH 14-800 are underlined.
  • Figure 22 is a listing of the nucleotide sequences of initial fragments obtained corresponding to the four breast cancer associated genes, along with their amino acid translations
  • Figure 23 is a listing of additional cDNA sequence obtained for CH1-9a11-2, comprising approximately 1934 base pairs 5' from the sequence of Figure 8. Additional sequence for this gene is shown in Figure 27.
  • Figure 24 is a listing of the amino acid sequence corresponding to the longest open reading frame of the DNA sequence of CH1-9a11-2 shown in Figure 23
  • The single-letter amino acid code is used. Stop codons are indicated by a dot (•).
  • Figure 25 is a listing of additional cDNA sequence obtained for CH14-2a16-1, comprising approximately 1934 base pairs 5' from the sequence of Figure 19
  • Figure 26 is a listing of the amino acid sequence corresponding to the longest open reading frame of the DNA sequence of CH14-2a16-1 shown in Figure 25
  • The single-letter amino acid code is used. Stop codons are indicated by a dot (•).
  • the upper panel shows the complete amino acid translation, the lower panel shows the predicted gene product protein sequence
  • Figure 27 is a listing of the cDNA nucleotide sequence obtained for CH1-9a11-2, containing an additional ~2467 base pairs 5' from the sequence of Figure 23
  • Figure 28 is a listing of the predicted amino acid sequence corresponding to the longest open reading frame of the DNA sequence of CH1-9a11-2 shown in Figure 27
  • Figure 29 and Figure 31 are nucleotide sequence listings for two other cDNA species obtained for CH1-9a11-2. Additional base pairs are present in the encoding region, adding to the predicted protein sequence without altering the reading frame.
  • Figure 30 and Figure 32 are amino acid sequence listings corresponding to the DNA sequences shown in Figures 29 and 31
  • Figure 33 is a map of the CH1-9a11-2 cDNA, compared with the human chromosomal gene and homologous sequences from other species. A large internal region is conserved between distant species, suggesting the gene has a fundamental role in cell metabolism. Some of the members of the family, including the human gene, have predicted transmembrane regions.
  • Figure 34 is a six-panel photostatic reproduction of in situ hybridization analysis for CH1-9a11-2 overexpression at the mRNA level. Positive staining was shown with the 600 PE cell line (Panel A) and 44% of primary breast cancers (exemplified in Panels E and F), compared with control samples or surrounding tissue.
  • Figure 35 is a listing of the cDNA nucleotide sequence obtained for CH13-2a12-1, containing an additional ~708 base pairs at the 5' end
  • Figure 36 is a listing of the predicted amino acid sequence corresponding to the longest open reading frame of the DNA sequence of CH13-2a12-1 shown in Figure 35
  • Figure 37 and Figure 39 are nucleotide sequence listings for two other cDNA species obtained for CH13-2a12-1. Additional base pairs are present in the encoding region, adding to the predicted protein sequence without altering the reading frame.
  • Figure 38 and Figure 40 are amino acid sequence listings corresponding to the DNA sequences shown in Figures 37 and 39
  • Figure 41 is a two panel figure showing standardized values of DNA amplification (Left Panel) and RNA overexpression (Right Panel) for CH13-2a12-1 in 16 breast cancer cell lines compared with normal tissue
  • Figure 42 is a four-panel photostatic reproduction of in situ hybridization analysis for CH13-2a12-1 overexpression at the mRNA level. Positive staining was shown with the 600 PE cell line (Panel B) and a proportion of primary breast cancers (exemplified in Panel C), compared with control samples or surrounding tissue.
  • This invention relates to the discovery and characterization of four novel genes associated with breast cancer
  • the cDNA of these genes, and their sequences as disclosed below, provide the basis of a series of reagents that can be used in diagnosis and therapy
  • Each of the four genes was found to be duplicated in 40-60% of the cells tested. Surprisingly, each of the four genes was duplicated in at least one cell line where studies using comparative genomic hybridization had not revealed any amplification of the corresponding chromosomal region. Levels of expression at the mRNA level were tested in a similar panel for two of these four genes. In addition to those cell lines showing gene duplication, 17 to 37% of the lines showed RNA overabundance without gene duplication, indicating that the malignant cells had used some mechanism other than gene duplication to promote the abundance of RNA corresponding to these genes. All four of the breast cancer genes have open reading frames, and likely are transcribed at various levels in different cell types. Overabundance of the corresponding RNA in a cancerous cell is likely associated with overexpression of the protein gene product. Such overexpression may be manifest as increased secretion of the protein from the cell into blood or the surrounding environment, an increased density of the protein at the cell surface, or an increased accumulation of the protein within the cell.
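The two proportions reported above (lines with gene duplication, and lines with RNA overabundance but no duplication) are simple tallies over a panel. A minimal Python sketch follows. The function name `summarize_panel` and the boolean-pair input format are hypothetical, and any panel passed in is illustrative rather than data from the source.

```python
def summarize_panel(panel):
    """Summarize a panel of cell lines.

    panel maps a cell line name to a pair (gene_duplicated, rna_overabundant).
    Returns the percentage of lines with duplication, and the percentage with
    RNA overabundance in the absence of duplication.
    """
    total = len(panel)
    if total == 0:
        raise ValueError("panel must contain at least one cell line")
    duplicated = sum(1 for dup, _ in panel.values() if dup)
    rna_only = sum(1 for dup, rna in panel.values() if rna and not dup)
    return {
        "duplicated_pct": 100.0 * duplicated / total,
        "rna_only_pct": 100.0 * rna_only / total,
    }
```

A nonzero `rna_only_pct` is the case the text highlights: overabundance achieved by some mechanism other than gene duplication.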
  • All four gene sequences are unrelated to other genes known to be overexpressed in breast cancer, including the erbB2 gene (Adnane et al.), tissue factor (Chen et al.), mammaglobin (Watson et al.), and DD96 (Kocher et al.)
  • the four mRNA sequences each comprise an open reading frame
  • the CH1-9a11-2 gene is expressed at the mRNA level at relatively elevated levels in pancreas and testis
  • the CH8-2a13-1 gene is expressed at relatively elevated levels in adult heart, spleen, thymus, small intestine, colon, and tissues of the reproductive system, and at higher levels in certain tissues of the fetus
  • the CH13-2a12-1 gene is expressed at relatively elevated levels in heart, skeletal muscle, and testis
  • the CH14-2a16-1 gene is expressed at relatively elevated levels in testis
  • the level of expression of all four genes is especially high in a substantial proportion of breast
  • the CH1-9a11-2 gene encodes a protein with a putative transmembrane region, and may be expressed as a surface protein on cancer cells
  • the CH13-2a12-1 gene is distantly related to a C. elegans gene implicated in cell cycle regulation, and may play a role in the regulation of cell proliferation
  • the protein encoded by CH13-2a12-1 is distantly related to a vasopressin-activated calcium binding receptor, and may have Ca++ binding activity
  • the CH14-2a16-1 comprises at least five domains of a zinc finger binding motif and is distantly related to a yeast RNA binding protein
  • the CH14-2a16-1 gene product is suspected of having DNA or RNA binding activity, which may relate to a role in cancer pathogenesis
  • the four genes described here are exemplars of genes that undergo altered expression in cancer, identifiable using the gene screening methods of the invention. The method involves an analysis for both DNA duplication and altered RNA abundance relating to the same gene. Since abnormal gene regulation is central to the malignant process, the identification method may
  • polynucleotide refers to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides may have any three-dimensional structure, and may perform any function, known or unknown. The following are non-limiting examples of polynucleotides: a gene or gene fragment, exons, introns, messenger RNA
  • a polynucleotide may comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure may be imparted before or after assembly of the polymer
  • the sequence of nucleotides may be interrupted by non-nucleotide components
  • a polynucleotide may be further modified after polymerization, such as by conjugation with a labeling component
  • polynucleotide refers interchangeably to double- and single-stranded molecules. Unless otherwise specified or required, any embodiment of the invention described herein that is a polynucleotide encompasses both the double-stranded form, and each of two complementary single-stranded forms known or predicted to make up the double-stranded form
  • a "linear sequence" or a "sequence" is an order of nucleotides in a polynucleotide in a 5' to 3' direction in which residues that neighbor each other in the sequence are contiguous in the primary structure of the polynucleotide
  • a "partial sequence" is a linear sequence of part of a polynucleotide which is known to comprise additional residues in one or both directions
  • Hybridization refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. The hydrogen bonding is sequence-specific, and typically occurs by Watson-Crick base pairing. A hybridization reaction may constitute a step in a more extensive process, such as the initiation of a
  • Hybridization reactions can be performed under conditions of different "stringency". Relevant conditions include temperature, ionic strength, time of incubation, the presence of additional solutes in the reaction mixture such as formamide, and the washing procedure
  • Higher stringency conditions are those conditions, such as higher temperature and lower sodium ion concentration, which require higher minimum complementarity between hybridizing elements for a stable hybridization complex to form
  • Conditions that increase the stringency of a hybridization reaction are widely known and published in the art; see, for example, "Molecular Cloning: A Laboratory Manual", Second Edition (Sambrook, Fritsch & Maniatis, 1989). When hybridization occurs in an antiparallel configuration between two single-stranded polynucleotides, those polynucleotides are described as "complementary"
  • a double-stranded polynucleotide can be "complementary" to another polynucleotide, if hybridization can occur between
  • a linear sequence of nucleotides is "essentially identical" to another linear sequence, if both sequences are capable of hybridizing to form a duplex with the same complementary polynucleotide. Sequences that hybridize under conditions of greater stringency are more preferred. It is understood that hybridization reactions can accommodate insertions, deletions, and substitutions in the nucleotide sequence
  • linear sequences of nucleotides can be essentially identical even if some of the nucleotide residues do not precisely correspond or align
  • essentially identical sequences of about 40 nucleotides in length will hybridize at about 30°C in 10 x SSC (0.15 M NaCl, 15 mM citrate buffer); preferably, they will hybridize at about 40°C in 6 x SSC; more preferably, they will hybridize at about 50°C in 6 x SSC; even more preferably, they will hybridize at about 60°C in 6 x SSC, or at about
  • a sequence that preserves the functionality of the polynucleotide with which it is being compared is particularly preferred. Functionality may be established by different criteria, such as ability to hybridize with a target polynucleotide, and whether the polynucleotide encodes an identical or essentially identical polypeptide
  • nucleotide substitutions which cause a non-conservative substitution in the encoded polypeptide are preferred over nucleotide substitutions that create a stop codon
  • nucleotide substitutions that cause a conservative substitution in the encoded polypeptide are more preferred
  • identical nucleotide sequences are even more preferred
  • Insertions or deletions in the polynucleotide that result in insertions or deletions in the polypeptide are preferred over those that result in the down-stream coding region being rendered out of phase
  • the relative importance of hybridization properties and the polypeptide encoded by a polynucleotide depends on the application of the invention
  • a “reagent” polynucleotide, polypeptide, or antibody is a substance provided for a reaction, the substance having some known and desirable parameters for the reaction
  • a reaction mixture may also contain a "target", such as a polynucleotide, antibody, or polypeptide that the reagent is capable of reacting with
  • a "target" may also be a cell, collection of cells, tissue, or organ that is the object of an administered substance, such as a pharmaceutical compound. "cDNA" or "complementary DNA" is a single- or double-stranded DNA polynucleotide in which one strand is complementary to a messenger RNA. "Full-length cDNA" is cDNA comprised of a strand which is complementary to an entire messenger RNA molecule
  • a "splice variant" is an alternative gene transcript. The term includes both splicing intermediates produced during transcript processing and maturation, and variant species of mature transcript produced from the same chromosomal gene
  • Different polynucleotides are said to "correspond" to each other if one is ultimately derived from another
  • messenger RNA corresponds to the gene from which it is transcribed. cDNA corresponds to the RNA from which it has been produced, such as by a reverse transcription reaction, or by chemical synthesis of a DNA based upon knowledge of the RNA sequence. cDNA also corresponds to the gene that encodes the RNA. Polynucleotides may be said to correspond even when one of the pair is derived from only a portion of the other
  • a "probe" when used in the context of polynucleotide manipulation refers to a polynucleotide which is provided as a reagent to detect a target potentially present in a sample of interest by hybridizing with the target. Usually, a probe will comprise a label or a means by which a label can be attached, either before or subsequent to the hybridization reaction. Suitable labels include, but are not limited to, radioisotopes, fluorochromes, chemiluminescent compounds, dyes, and enzymes
  • a "primer" is a short polynucleotide, generally with a free 3'-OH group, that binds to a target potentially present in a sample of interest by hybridizing with the target, and thereafter promoting polymerization of a polynucleotide complementary to the target
  • a "polymerase chain reaction" ("PCR") is a reaction in which replicate copies are made of a target polynucleotide using one or more primers, and a catalyst of polymerization, such as a reverse transcriptase or a DNA polymerase, and particularly a thermally stable polymerase enzyme. Methods for PCR are taught in U.S. Patent Nos
  • an "operon" is a genetic region comprising a gene encoding a protein and functionally related 5' and 3' flanking regions. Elements within an operon include but are not limited to promoter regions, enhancer regions, repressor binding regions, transcription initiation sites, ribosome binding sites, translation initiation sites, protein encoding regions, introns and exons, and termination sites for transcription and translation
  • a "promoter" is a DNA region capable under certain conditions of binding RNA polymerase and initiating transcription of a coding region located downstream (in the 3' direction) from the promoter
  • "Operably linked" refers to a juxtaposition of genetic elements, wherein the elements are in a relationship permitting them to operate in the expected manner. For instance, a promoter is operably linked to a coding region if the promoter helps initiate transcription of the coding sequence. There may be intervening residues between the promoter and coding region so long as this functional relationship is maintained
  • Gene duplication is a term used herein to describe the process whereby an increased number of copies of a particular gene or a fragment thereof is present in a particular cell or cell line. "Gene amplification" generally is synonymous with gene duplication
  • RNA overexpression reflects the presence of more RNA (as a proportion of total RNA) from a particular gene in a cell being described, such as a cancerous cell, in relation to that of the cell it is being compared with, such as a non-cancerous cell
  • the protein product of the gene may or may not be produced in normal or abnormal amounts
  • Protein overexpression similarly reflects the presence of relatively more protein present in or produced by, for example, a cancerous cell
  • “Abundance” of RNA refers to the amount of a particular RNA present in a particular cell type
  • RNA overabundance or “overabundance of RNA” describes RNA that is present in greater proportion of total RNA in the cell type being described, compared with the same RNA as a proportion of the total RNA in a control cell
  • a number of mechanisms may contribute to RNA overabundance in a particular cell type; for example, gene duplication, increased level of transcription of the gene, increased persistence of the RNA within the cell after it is produced, or any combination of these
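  • As a minimal sketch of the comparison that defines RNA overabundance, the following Python functions express the amount of one RNA as a proportion of total RNA in each cell type and then take the ratio between the cell being described and the control cell. The function names, the arguments, and the example signal values are hypothetical illustrations and are not taken from this disclosure.

```python
def relative_abundance(rna_amount, total_rna):
    """Amount of one RNA expressed as a proportion of total RNA."""
    if total_rna <= 0:
        raise ValueError("total_rna must be positive")
    return rna_amount / total_rna


def overabundance_ratio(test_amount, test_total, control_amount, control_total):
    """Proportional abundance in the test cell divided by that in the control cell.

    A ratio above 1 corresponds to RNA overabundance as defined in the text.
    """
    control = relative_abundance(control_amount, control_total)
    if control == 0:
        raise ValueError("RNA not detected in control; the ratio is undefined")
    return relative_abundance(test_amount, test_total) / control


# Hypothetical signal units, for illustration only.
ratio = overabundance_ratio(test_amount=8.0, test_total=1000.0,
                            control_amount=2.0, control_total=1000.0)
print(f"overabundance ratio: {ratio:.1f}")
```

  • Normalizing to total RNA before comparing means the ratio reflects the share of the transcript in each cell rather than differences in how much RNA was loaded. The ratio alone does not indicate which of the listed mechanisms produced the overabundance.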
  • the terms "polypeptide", "peptide", and "protein" refer to polymers of amino acids of any length. The polymer may be linear or branched, it may comprise modified amino acids, and it may be interrupted by non-amino acids
  • the terms also encompass an amino acid polymer that has been modified, for example, by disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, or any other manipulation, such as conjugation with a labeling component
  • a "linear sequence" or a "sequence" is an order of amino acids in a polypeptide in an N-terminal to C-terminal direction in which residues that neighbor each other in the sequence are contiguous in the primary structure of the polypeptide
  • a "partial sequence” is a linear sequence of part of a polypeptide which is known to comprise additional residues in one or both directions
  • a linear sequence of amino acids is "essentially identical" to another sequence if the two sequences have a substantial degree of sequence identity. It is understood that the functional proteins can accommodate insertions, deletions, and substitutions in the amino acid sequence. Thus linear sequences of amino acids can be essentially identical even if some of the residues do not precisely correspond or align. Sequences that correspond or align more closely to the invention disclosed herein are more preferred. It is also understood that some amino acid substitutions are more easily tolerated. For example, substitution of an amino acid with hydrophobic side chains, aromatic side chains, polar side chains, side chains with a positive or negative charge, or side chains comprising two or fewer carbon atoms, by another amino acid with a side chain of like properties can occur without disturbing the essential identity of the two sequences. Methods for determining homologous regions and scoring the degree of homology are well known in the art; see, for example, Altschul et al. and Henikoff et al. Well-tolerated sequence differences are referred to as "conservative substitutions"
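  • The scoring of identical residues and conservative substitutions can be sketched in Python as below. This is an illustration only: it assumes the two sequences have already been aligned to equal length with "-" marking gaps, it skips gap positions (an assumption, not a rule stated here), and the side-chain groups are a rough, hypothetical rendering of the categories named above rather than a published substitution matrix such as those of Altschul et al. or Henikoff et al.

```python
# Hypothetical side-chain groups, loosely following the categories in the text.
SIDE_CHAIN_GROUPS = [
    set("AVLIM"),  # hydrophobic
    set("FWY"),    # aromatic
    set("STNQC"),  # polar
    set("KRH"),    # positively charged
    set("DE"),     # negatively charged
    set("GASCT"),  # side chains with two or fewer carbon atoms
]


def is_conservative(a, b):
    """True if two different residues share at least one side-chain group."""
    return a != b and any(a in g and b in g for g in SIDE_CHAIN_GROUPS)


def score_alignment(seq_a, seq_b):
    """Return (fraction identical, fraction conservative) over ungapped positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    identical = conservative = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # gap positions are not scored in this sketch
        compared += 1
        if a == b:
            identical += 1
        elif is_conservative(a, b):
            conservative += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return identical / compared, conservative / compared


# Hypothetical aligned peptides, for illustration only.
identity, conserved = score_alignment("MKVLDE", "MRILDQ")
print(f"identical: {identity:.0%}, conservative: {conserved:.0%}")
```

  • Reporting the two fractions separately keeps the distinction between identical residues and conservative substitutions visible, so either can be compared against whatever percentage criterion is being applied.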
  • amino acid sequences that are essentially identical are at least about 15% identical, and comprise at least about another 15% which are either identical or are conservative substitutions, after alignment of homologous regions. More preferably, essentially identical sequences comprise at least about 50% identical residues or conservative substitutions; more preferably, they comprise at least about 70% identical residues or conservative substitutions; more preferably, they comprise at least about 80% identical residues or conservative substitutions; more preferably, they comprise at least about 90% identical residues or conservative substitutions; more preferably, they comprise at least about 95% identical residues or conservative substitutions; even more preferably, they contain 100% identical residues. In determining whether polypeptide sequences are essentially identical, a sequence that preserves the functionality of the polypeptide with which it is being compared is particularly preferred. Functionality may be established by different parameters, such as enzymatic activity, the binding rate or affinity in a receptor-ligand interaction, the binding affinity with an antibody, and X-ray crystallographic structure. An "antibody" (interchangeably used in plural form) is an immunoglobulin
  • antigen refers to the target molecule that is specifically bound by an antibody through its antigen recognition site
  • the antigen may, but need not be chemically related to the immunogen that stimulated production of the antibody
  • the antigen may be polyvalent, or it may be a monovalent hapten
  • kinds of antigens that can be recognized by antibodies include polypeptides, polynucleotides, other antibody molecules, oligosaccharides, complex lipids, drugs, and chemicals
  • An "immunogen" is an antigen capable of stimulating production of an antibody when injected into a suitable host, usually a mammal
  • Compounds may be rendered immunogenic by many techniques known in the art, including crosslinking or conjugating with a carrier to increase valency, mixing with a mitogen to increase the immune response, and combining with an adjuvant to enhance presentation
  • An "active vaccine" is a pharmaceutical preparation for human or animal use, which is used with the intention of eliciting a specific immune response
  • the immune response may be either humoral or cellular, systemic or secretory
  • the immune response may be desired for experimental purposes, for the treatment of a particular condition, for the elimination of a particular substance, or for prophylaxis against a particular condition or substance
  • an "isolated" polynucleotide, polypeptide, protein, antibody, or other substance refers to a preparation of the substance devoid of at least some of the other components that may also be present where the substance or a similar substance naturally occurs or is initially obtained from
  • an isolated substance may be prepared by using a purification technique to enrich it from a source mixture
  • Enrichment can be measured on an absolute basis, such as weight per volume of solution, or it can be measured in relation to a second, potentially interfering substance present in the source mixture. Increasing enrichments of the embodiments of this invention are increasingly more preferred
  • a 2-fold enrichment is preferred, 10-fold enrichment is more preferred, 100-fold enrichment is more preferred, 1000-fold enrichment is even more preferred
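  • Enrichment measured in relation to a second, potentially interfering substance can be sketched in Python as the change in the ratio of target to interfering substance between the source mixture and the final preparation. The function names and the example quantities are hypothetical; only the 2-, 10-, 100-, and 1000-fold levels come from the text.

```python
PREFERENCE_LEVELS = (2, 10, 100, 1000)  # fold-enrichment levels named in the text


def fold_enrichment(target_before, interfering_before,
                    target_after, interfering_after):
    """Fold change in the target:interfering ratio after purification."""
    for value in (target_before, interfering_before,
                  target_after, interfering_after):
        if value <= 0:
            raise ValueError("all amounts must be positive")
    return (target_after * interfering_before) / (interfering_after * target_before)


def preference_levels_met(fold):
    """List the named enrichment levels that a given fold enrichment reaches."""
    return [level for level in PREFERENCE_LEVELS if fold >= level]


# Hypothetical amounts in arbitrary but consistent units.
fold = fold_enrichment(target_before=1.0, interfering_before=100.0,
                       target_after=1.0, interfering_after=1.0)
print(fold, preference_levels_met(fold))
```

  • Because the calculation uses ratios, the amounts may be in any unit so long as the same unit is used before and after purification.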
  • a substance can also be provided in an isolated state by a process of artificial assembly, such as by chemical synthesis or recombinant expression
  • a polynucleotide used in a reaction such as a probe used in a hybridization reaction, a primer used in a PCR, or
  • the "effector component" of a pharmaceutical preparation is a component which modifies target cells by altering their function in a desirable way when administered to a subject bearing the cells
  • Some advanced pharmaceutical preparations also have a "targeting component", such as an antibody, which helps deliver the effector component more efficaciously to the target site
  • the effector component may have any one of a number of modes of action. For example, it may restore or enhance a normal function of a cell, it may eliminate or suppress an abnormal function of a cell, or it may alter a cell's phenotype. Alternatively, it may kill or render dormant a cell with pathological features, such as a cancer cell. Examples of effector components are provided in a later section
  • a “pharmaceutical candidate” or “drug candidate” is a compound believed to have therapeutic potential, that is to be tested for efficacy
  • the “screening” of a pharmaceutical candidate refers to conducting an assay that is capable of evaluating the efficacy and/or specificity of the candidate
  • efficacy refers to the ability of the candidate to affect the cell or organism it is administered to in a beneficial way; for example, the limitation of the pathology of cancerous cells
  • a "cell line" or "cell culture" denotes higher eukaryotic cells grown or maintained in vitro. It is understood that the descendants of a cell may not be completely identical (either morphologically, genotypically, or phenotypically) to the parent cell. Cells described as "uncultured" are obtained directly from a living organism, and have been maintained for a limited amount of time away from the organism, not long enough or under conditions for the cells to undergo substantial replication
  • Genetic alteration refers to a process wherein a genetic element is introduced into a cell other than by mitosis or meiosis
  • the element may be heterologous to the cell, or it may be an additional copy or improved version of an element already present in the cell
  • Genetic alteration may be effected, for example, by transfecting a cell with a recombinant plasmid or other polynucleotide through any process known in the art, such as electroporation, calcium phosphate precipitation, or contacting with a polynucleotide-liposome complex, or by transduction or infection with a DNA or RNA virus or viral vector
  • the alteration is preferably but not necessarily inheritable by progeny of the altered cell
  • a “host cell” is a cell which has been genetically altered, or is capable of being genetically altered, by administration of an exogenous polynucleotide
  • cancer cell refers to cells that have undergone a malignant transformation that makes them pathological to the host organism
  • Malignant transformation is a single- or multi-step process, which involves in part an alteration in the genetic makeup of the cell and/or the expression profile. Malignant transformation may occur either spontaneously, or via an event or combination of events such as drug or chemical treatment, radiation, fusion with other cells, viral infection, or activation or inactivation of particular genes. Malignant transformation may occur in vivo or in vitro, and can if necessary be experimentally induced
  • a frequent feature of cancer cells is the tendency to grow in a manner that is uncontrollable by the host, but the pathology associated with a particular cancer cell may take another form, as outlined infra. Primary cancer cells (that is, cells obtained from near the site of malignant transformation) can be readily distinguished from non-cancerous cells by well-established techniques, particularly histological examination
  • the definition of a cancer cell includes not only a primary cancer cell, but any cell derived from a cancer cell ancestor. This includes metastasized cancer cells, and in vitro cultures and cell lines derived from cancer cells
  • the "pathology" caused by a cancer cell within a host is anything that compromises the well-being or normal physiology of the host. This may involve (but is not limited to) abnormal or uncontrollable growth of the cell, metastasis, release of cytokines or other secretory products at an inappropriate level, manifestation of a function inappropriate for its physiological milieu, interference with the normal function of neighboring cells, aggravation or suppression of an inflammatory or immunological response,
  • Treatment of an individual or a cell is any type of intervention in an attempt to alter the natural course of the individual or cell
  • treatment of an individual may be undertaken to decrease or limit the pathology caused by a cancer cell harbored in the individual
  • Treatment includes (but is not limited to) administration of a composition, such as a pharmaceutical composition, and may be performed either prophylactically, or subsequent to the initiation of a pathologic event or contact with an etiologic agent
  • Effective amounts used in treatment are those which are sufficient to produce the desired effect, and may be given in single or divided doses
  • control cell is an alternative source of cells or an alternative cell line used in an experiment for comparison purposes. Where the purpose of the experiment is to establish a base line for gene copy number or expression level, it is generally preferable to use a control cell that is not a cancer cell
  • cancer gene refers to any gene which is yielding transcription or translation products at a substantially altered level or in a substantially altered form in cancerous cells compared with non-cancerous cells, and which may play a role in supporting the malignancy of the cell. It may be a normally quiescent gene that becomes activated (such as a dominant proto-oncogene), it may be a gene that becomes expressed at an abnormally high level (such as a growth factor receptor), it may be a gene that becomes mutated to produce a variant phenotype, or it may be a gene that becomes expressed at an abnormally low level (such as a tumor suppressor gene)
  • the present invention is directed towards the discovery of genes in all these categories. It is understood that a "clinical sample" encompasses a variety of sample types obtained from a subject and useful in an in vitro procedure, such as a diagnostic test. The definition encompasses solid tissue samples obtained as a surgical removal, a pathology specimen, or a biopsy specimen, tissue cultures or cells derived therefrom and the
  • the relative amount of a reagent forming a complex in a reaction is the amount reacting with a test specimen, compared with the amount reacting with a control specimen
  • the control specimen may be run separately in the same assay, or it may be part of the same sample (for example, normal tissue surrounding a malignant area in a tissue section)
  • a "differential" result is generally obtained from an assay in which a comparison is made between the findings of two different assay samples, such as a cancerous cell line and a control cell line
  • "differential expression" is observed when the level of expression of a particular gene is higher in one cell than another
  • "Differential display" refers to a display of a component, particularly RNA, from different cells to determine if there is a difference in the level of the component amongst different cells
  • Differential display of RNA is conducted, for example, by selective production and display of cDNA corresponding thereto. A method for performing differential display is provided in a later section
  • a polynucleotide derived from or corresponding to CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, or CH14-2a16-1 is any of the following: the respective cDNA fragments; the corresponding messenger RNA, including splice variants and fragments thereof; both strands of the corresponding full-length cDNA and fragments thereof; and the corresponding gene. Isolated allelic variants of any of these forms are included
  • This invention embodies any polynucleotide corresponding to CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, or CH14-2a16-1 in an isolated form. It also embodies any such polynucleotide that has been cloned or transfected into a cell line. CH13-2a12-1 may also be referred to as Cul-4A
  • displaying cDNA is any technique in which DNA copies of RNA (not restricted to mRNA) are rendered detectable in a quantitative or relatively quantitative fashion, in that DNA copies present in a relatively greater amount in a first sample compared with a second sample generate a relatively stronger or weaker signal compared with that of the second sample due to the difference in copy number
  • a preferred method of display is the differential display technique, and enhancements thereupon described in this disclosure and elsewhere
  • hybridizing in this context refers to contacting a first polynucleotide with a second polynucleotide under conditions that permit the formation of a multi-stranded polynucleotide duplex whenever one strand of the first polynucleotide has a sequence of sufficient complementarity to a sequence on the second polynucleotide
  • the duplex may be a long-lived one, such as when one DNA molecule is used as a labeled probe to detect another DNA molecule, that may optionally be bound to a nitrocellulose filter or present in a separating gel
  • the duplex may also be a shorter-lived one, such as when one DNA
  • steps of a method of this invention may be performed in any order, or combined where desired and appropriate
  • steps a) to c) of the method either before or after steps e) to g) of the method, as long as the cDNA ultimately selected fulfills the criteria of both steps d) and step h)
  • screening against different digested DNA preparations, even if outlined separately, may optionally be done at the same time. All permutations of this kind are within the scope of the invention
  • cancer gene screening methods of this invention may be brought to bear to discover novel genes associated with cancer. Exemplars of cancer-associated genes identified by this method are described below. The exemplars were identified using breast cancer cell lines and tissue, but the strategy can be applied to any cancer type of interest
  • a central feature of the cancer gene screening method of this invention is to look for both DNA duplication and RNA overabundance relating to the same gene. This feature is particularly powerful in the discovery of new and potentially important cancer genes. While amplicons occur frequently in cancer, the presently available techniques indicate only the broad chromosomal region involved in the duplication event, not the specific genes involved
  • the present invention provides a way of detecting genes that may be present in an amplicon from a functional basis. Because an early part of the method involves detecting RNA, the method avoids genes that may be duplicated in an amplicon but are quiescent (and therefore irrelevant) in the cancer cells. Furthermore, it recruits active genes from a duplicated region of the chromosome too small to be detectable by the techniques used to describe amplicons
  • RNA expression refers to expression at the RNA transcription level
  • the RNA may in turn be translated into a protein with a particular enzymatic, binding, or regulatory activity which increases after malignant transformation
  • the RNA may encode or participate as a ribozyme, antisense polynucleotide, or other functional nucleic acid molecule during malignancy
  • RNA expression may be incidental but symptomatic of an important event in transformation
  • RNA overabundance without gene duplication such as by increasing the rate of transcription of the gene (e.g., by upregulation of the promoter region), by enhancing transcript promotion or transport, or by increasing mRNA survival
  • the method entails screening at the RNA level, several cancer cell lines or tumors, and several normal cell lines or tissue samples at the same time. RNA are selected that show a consistent elevation amongst the cancer cells as compared with normal cells
  • Additional strategies may be employed in combination with the RNA screening to improve the success rate of the method
  • One such strategy is to use several cancer cell lines that are all known to have duplicated genes in the same region of a particular chromosome
  • the RNA that emerge from the screen are more likely to represent a deliberate overexpression event, and the overexpressed gene is likely to be within the duplicated region
  • a supplemental strategy is to use freshly prepared tissue samples rather than cell lines as controls for base-line expression. This avoids selection of genes that may alter their expression level just as a result of tissue culturing
  • Another supplemental strategy is to conduct an additional level of screening, following identification of shared, overexpressed RNA. The selected RNA are used to screen DNA from suitable cancer cells and normal cells, to ensure that at least a proportion of the cells achieved the overexpression by way
  • the strategy for detecting such genes comprises a number of innovations over those that have been used in previous work
  • the first part of the method is based on a search for particular RNAs that are overabundant in cancer cells
  • a first innovation of the method is to compare RNA abundance between control cells and several different cancer cells or cancer cell lines of the desired type
  • the cDNA fragments that emerge in a greater amount in several different cancer lines, but not in control cells, are more likely to reflect genes that are important in disease progression, rather than those that have undergone secondary or coincidental activation. It is particularly preferred to use cancer cells that are known to share a common duplicated chromosomal region
  • a second innovation of this method is to supply as control, not RNA from a cell line or culture, but from fresh tissue samples of non-malignant origin. There are two reasons for this. First, the tissue will provide the spectrum of expression that is typical to the normal cell phenotype, rather than individual differences that may become more prominent in culture. This establishes a more reliable baseline for normal expression levels. More importantly, the tissue will
  • a third innovation of this method is to undertake a subselection for cDNA corresponding to genes that achieve their RNA overabundance in a substantial proportion of cancer cells by gene duplication
  • appropriate cDNA corresponding to overabundant RNA identified in the foregoing steps are used to probe digests of cellular DNA from a panel of different cancer cells, and from normal genomic DNA. cDNA that shows evidence of higher copy numbers in a proportion of the panel is selected for further characterization
  • An additional advantage of this step is that cDNA corresponding to mitochondrial genes can rapidly be screened away by including a mitochondrial DNA digest as an additional sample for testing the probe. This eliminates most of the false-positive cDNA, which otherwise make up a majority of the cDNA identified
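  • The subselection step can be sketched in Python as a filter over candidate cDNA probes: discard any probe that hybridizes to the mitochondrial DNA digest, then keep those showing a higher copy number in a sufficient proportion of the cancer panel. The data layout, the clone names, and both cut-off values (`min_ratio`, `min_fraction`) are hypothetical choices made for illustration; the text does not specify numerical cut-offs for this step.

```python
def select_duplicated(candidates, min_ratio=2.0, min_fraction=0.25):
    """Return names of cDNA probes that pass the DNA-level screen.

    candidates maps a probe name to a dict with:
      "copy_ratios": copy number in each cancer line relative to normal genomic DNA
      "hybridizes_to_mito": True if the probe lights up the mitochondrial digest
    """
    selected = []
    for name, data in candidates.items():
        if data["hybridizes_to_mito"]:
            continue  # screened away as a likely mitochondrial false positive
        ratios = data["copy_ratios"]
        if not ratios:
            continue  # no panel data for this probe
        amplified = sum(1 for r in ratios if r >= min_ratio)
        if amplified / len(ratios) >= min_fraction:
            selected.append(name)
    return selected


# Hypothetical panel results, for illustration only.
panel = {
    "clone_A": {"copy_ratios": [1.0, 3.0, 4.5, 1.1], "hybridizes_to_mito": False},
    "clone_B": {"copy_ratios": [5.0, 6.0, 5.0, 7.0], "hybridizes_to_mito": True},
    "clone_C": {"copy_ratios": [1.0, 1.1, 0.9, 1.0], "hybridizes_to_mito": False},
}
print(select_duplicated(panel))
```

  • Applying the mitochondrial check first mirrors the point made above: a probe with a strong apparent copy-number signal is still rejected if it corresponds to mitochondrial DNA.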
  • RNA is prepared from both cancerous and control cells by standard techniques
  • Cancer-associated genes may affect cellular metabolism by any one of a number of mechanisms
  • they may encode ribozymes, anti-sense polynucleotides, DNA-binding polynucleotides, altered ribosomal RNA, and the like
  • the gene screening methods of this invention may employ a comparison of RNA abundance levels at the total RNA level, not strictly limited to mRNA
  • the vast majority of cancer-associated genes are predicted to encode a protein gene whose up-regulation is closely linked to the metabolic process
  • the four exemplary breast cancer genes described elsewhere in this application all comprise an open reading frame. Accordingly, a focus on mRNA enriches the selectable pool for candidate cancer-associated genes. Focus towards mRNA can be conducted at any step in the method. It is particularly convenient to use a display method that displays cDNA copied only from mRNA. In this case, whole RNA may be prepared and analyzed from cancer and control
  • RNA transcribed from the duplicated region is expected to be overabundant compared with that of the control cell. Accordingly, a highly effective strategy is to identify overabundant RNA that is present in all (or at least several) of the cancer cell preparations, but none of the control preparations.
  • the RNA comparison will be strongly biased in favor of RNA overabundance transcribed from the shared duplicated region. Since the shared region is optimally only a small segment of a single chromosome, expression differences arising from elsewhere in the genome in one cancer cell or another will not be selected.
  • this is highly effective in eliminating a) RNA abundance differences resulting from normal metabolic variations between cells, and/or b) RNA abundance differences related to cancer cell malignancy, but occurring secondarily to malignant transformation. This is important, because it considerably minimizes the chief deficiency in the use of RNA comparison methods, particularly differential display, for
  • CGH comparative genomic hybridization
  • the particular chromosomal mapping approach is irrelevant once the duplicated region is known. If the location of the chromosome duplication is already established for a cell line to be used in RNA comparison during the course of the present invention, then it is unnecessary to conduct a mapping technique de novo. For example, established cancer cell lines exist for which mapping data is already available in the public domain. Provided in the reference section of this application is a list of over 40 articles in which the locations of duplicated regions in particular cancer cells are described. In the context of the present invention, a plurality of cancer cells is chosen for the screening panel based on such data, so that they share a duplicated chromosomal region. The chromosomal location of a suspected duplication may be confirmed by hybridization analysis, if desired, using a probe specific for the location.
  • the chromosome 13 gene (CH13-2a12-1) was overexpressed in 2 of the 3 cell lines, namely BT474 and SKBR3. Southern analysis subsequently established that the chromosome 13 gene was duplicated in the same two cell lines (Example 6, Table 5).
  • control cell RNA can be derived from in vitro cultures of non-malignant cells, or established cell lines derived from a non-malignant source
  • the transforming event may, in turn, be shared with that of certain cancer cells, at least at the level of RNA abundance
  • comparison of the RNA levels in cancer cells with so-called control cell lines may lead the practitioner to miss genes that are related to malignancy
  • control cells may be maintained in culture for a brief period before the experiment, and even stimulated; however, multiple rounds of cell division are to be avoided if possible
  • Use of both stimulated and unstimulated cells as controls may help provide RNA patterns corresponding to the normal range of abundance within various metabolic events of the cell cycle. In one illustration highlighted in Example 1,
  • RNA is preserved until use in the comparison experiment in such a way to minimize fragmentation
  • reproducibility can also be provided by preparing enough RNA so that it can be preserved in aliquots
  • RNA display methods: For displaying relative overabundance of RNA in the cancer cells, compared with the control cells, many standard techniques are suitable. These would include any form of subtractive hybridization or comparative analysis. Preferred are techniques in which more than two RNA sources are compared at the same time, such as various types of arbitrarily primed PCR fingerprinting techniques (Welsh et al., Yoshikawa et al.). Particularly preferred are differential mRNA display methods and variations thereof, in which the samples are run in neighboring lanes in a separating gel. These techniques are focused towards mRNA by using primers that are specific for the poly-A tail characteristic of mRNA (Liang et al., 1992a; U.S. Patent 5,262,311).
  • RNA is first reverse transcribed by standard techniques. Short primers are used for the selection, preferably chosen such that alternative primers used in a series of like assays can complete a comprehensive survey of the mRNA
  • primers can be used for the 3' region of the mRNAs which have an oligo-dT sequence, followed by two other nucleotides (TiNM, where i ≈ 11, N ∈ {A,C,G}, and M ∈ {A,C,G,T})
  • TiNM oligo-dT sequence
  • two other nucleotides: i ≈ 11, N ∈ {A,C,G}, and M ∈ {A,C,G,T}
  • a random or arbitrary primer of minimal length can then be used for replication towards what corresponds in the sequence to the 5' region of the mRNA
  • the optimal length for the random primer is about 10 nucleotides
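The primer scheme described above can be made concrete with a small sketch. It enumerates the anchored primers written as TiNM (a run of i thymidines, then N from {A,C,G}, then M from {A,C,G,T}) and draws one arbitrary 10-mer. The function names, the default i = 11, and the use of a seeded random generator are assumptions made for illustration; they are not taken from the source.

```python
import random
from itertools import product
from typing import List, Optional

def anchored_primers(t_run: int = 11) -> List[str]:
    """Enumerate TiNM primers: t_run T's, then N in {A,C,G}, then M in {A,C,G,T}."""
    return ["T" * t_run + n + m for n, m in product("ACG", "ACGT")]

def arbitrary_primer(length: int = 10, seed: Optional[int] = None) -> str:
    """Draw one arbitrary primer of the given length (hypothetical helper)."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))

primers = anchored_primers()
print(len(primers))            # 3 choices for N times 4 choices for M
print(primers[:4])
print(arbitrary_primer(seed=1))
```

Under this notation there are 3 × 4 = 12 anchored primers, which is why a series of like assays with alternative primers can cover the mRNA population.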
  • the product of the PCR reaction is labeled with a radioisotope, such as 35S
  • the labeled cDNA is then separated by molecular weight, such as on a polyacrylamide sequencing gel
  • variations on the differential display technique may be employed for example, one-base oh
  • RNAs are chosen which are present as a higher proportion of the RNA in cancerous cells, compared with control cells
  • the cDNA corresponding to overabundant RNA will produce a band with greater proportional intensity amongst neighboring cDNA bands, compared with the proportional intensity in the control lanes
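The band-selection rule can be sketched as follows, assuming band intensities have already been quantified per lane (for example by densitometry). As a simplification, each band is expressed as a fraction of its lane total rather than relative to its immediate neighbors. The 2-fold threshold, the requirement of at least two cancer lanes, and all intensity values are illustrative assumptions, not values from the source.

```python
from typing import List

def proportional_intensity(lane: List[float]) -> List[float]:
    """Express each band as a fraction of the total signal in its lane."""
    total = sum(lane)
    return [x / total for x in lane]

def select_overabundant_bands(cancer_lanes: List[List[float]],
                              control_lanes: List[List[float]],
                              fold: float = 2.0,
                              min_cancer_lanes: int = 2) -> List[int]:
    """Return indices of bands stronger in several cancer lanes than in every control lane."""
    cancer = [proportional_intensity(l) for l in cancer_lanes]
    control = [proportional_intensity(l) for l in control_lanes]
    selected = []
    for b in range(len(cancer[0])):
        ceiling = max(lane[b] for lane in control)   # strongest control signal
        hits = sum(1 for lane in cancer if lane[b] > fold * ceiling)
        if hits >= min_cancer_lanes:
            selected.append(b)
    return selected

# Made-up intensities: rows are lanes, columns are band positions.
cancer_lanes = [[10, 40, 10, 10], [8, 44, 9, 9], [9, 12, 10, 9]]
control_lanes = [[10, 10, 10, 10], [11, 9, 10, 10]]
print(select_overabundant_bands(cancer_lanes, control_lanes))
```

Comparing against the strongest control lane, rather than the average, reflects the stated preference for RNA present in several cancer preparations but in none of the controls.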
  • Desired cDNAs can be recovered most directly by cutting the spot in the gel corresponding to the band, and recovering the DNAs therefrom. Recovered cDNA can be replicated again for further use by any technique or combination of techniques known in the art, including PCR and cloning into a suitable carrier.
  • An optional but highly beneficial additional screening step is aimed at identifying genes that are duplicated in a substantial proportion of cancers
  • This is conducted by using cDNA such as selected from differential display to probe digests of chromosomal DNA obtained from two or more cancerous cells, such as cancer cell lines
  • Chromosomal DNA from non-cancerous cells that essentially reflects the germ line in terms of gene copy number is used for the control
  • a preferred source of control DNA in experiments for human cancer genes is placental DNA, which is readily obtainable
  • the DNA samples are cleaved at sequence-specific sites along the chromosome, most usually with a suitable restriction enzyme into fragments of appropriate size
  • the DNA can be blotted directly onto a suitable medium, or separated on an agarose gel before blotting. The latter method is preferred, because it enables a comparison of the hybridizing chromosomal restriction fragment to determine whether the probe is binding to the same fragment in all samples.
  • the amount of probe binding to DNA digests from each of the cancer cells is compared with the
  • To screen out cDNA for mitochondrial genes, it is preferable to include in a parallel analysis a mitochondrial DNA preparation digested with the same restriction enzyme. Any cDNA probe that hybridizes to the appropriate mitochondrial restriction fragments can be suspected of corresponding to a mitochondrial gene.
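The duplication screen and the mitochondrial counter-screen can be combined in one decision rule, sketched below. It assumes each probe has a quantified hybridization signal per DNA digest. The sample key names, the 2-fold ratio, the 25% panel fraction, and all numbers are hypothetical choices for illustration; only the cell line names BT474 and SKBR3 and the use of placental DNA as control come from the source.

```python
from typing import Dict, List

def classify_probe(signal_by_sample: Dict[str, float],
                   cancer_samples: List[str],
                   control_sample: str = "placenta",
                   mito_sample: str = "mitochondrial",
                   min_ratio: float = 2.0,
                   min_fraction: float = 0.25,
                   mito_background: float = 0.0) -> str:
    """Classify a cDNA probe from its signals on a panel of DNA digests."""
    # Counter-screen: any signal on the mitochondrial digest flags the probe.
    if signal_by_sample.get(mito_sample, 0.0) > mito_background:
        return "suspected mitochondrial"
    baseline = signal_by_sample[control_sample]
    amplified = [s for s in cancer_samples
                 if signal_by_sample[s] / baseline >= min_ratio]
    fraction = len(amplified) / len(cancer_samples)
    return "candidate duplicated gene" if fraction >= min_fraction else "not selected"

panel = ["BT474", "SKBR3", "other_line"]  # "other_line" is a placeholder name
probe_a = {"placenta": 100.0, "BT474": 450.0, "SKBR3": 300.0,
           "other_line": 110.0, "mitochondrial": 0.0}
probe_b = dict(probe_a, mitochondrial=800.0)
print(classify_probe(probe_a, panel))
print(classify_probe(probe_b, panel))
```

Checking the mitochondrial digest first mirrors the text's point that such probes would otherwise dominate the candidates.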
  • the random primer may bind at any location along the RNA sequence
  • the copied and replicated segment may be a fragment of the full-length RNA
  • Longer cDNA corresponding to a greater portion of the sequence can be obtained, if desired, by several techniques known to practitioners of ordinary skill. These include using the cDNA fragment to isolate the corresponding RNA, or to isolate complementary DNA from a cDNA library of the same species.
  • the library is derived from the same tissue source, and more preferably from a cancer cell line of the same type
  • a preferred library is derived from breast cancer cell line BT474, constructed in lambda GT10
  • Sequences of the cDNA can be determined by standard techniques, or by submitting the sample to commercial sequencing services
  • the chromosomal locations of the genes can be determined by any one of several methods known in the art, such as in situ hybridization using chromosomal smears, or panels of somatic cell hybrids of known chromosomal composition
  • the cDNA obtained through the selection process outlined can then be tested against a larger panel of cancer cell lines and/or fresh tumor cells to determine what proportion of the cells have duplicated the gene. This can be accomplished by using the cDNA as a probe for chromosomal DNA digests, as described earlier. As illustrated in the Example section, a preferred method for conducting this determination is Southern analysis.
  • the cDNA can also be used to determine what proportion of the cells have RNA overabundance. This can be accomplished by standard techniques, such as slot blots or blots of agarose gels, using whole RNA or messenger RNA from each of the cells in the panel. The blots are then probed with the cDNA using standard techniques. It is preferable to provide an internal loading and blotting control for this analysis. A preferred method is to re-probe the same blot for transcripts of a gene likely to be present in about the same level in all cells of the same type, such as the gene for a cytoskeletal protein. Thus, a preferred second probe is the cDNA for beta-actin or 36B4, available from the ATCC.
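The role of the loading control can be illustrated with a short calculation. The sketch assumes two quantified signals per lane, one for the test probe and one for the re-probe with a control such as beta-actin or 36B4. The function names, the 2-fold cutoff, and the numbers are illustrative assumptions.

```python
from typing import Dict, Tuple

Signals = Tuple[float, float]  # (test probe signal, loading control signal)

def normalized_abundance(signals: Signals) -> float:
    """Divide the test probe signal by the loading control signal of the same lane."""
    probe, loading = signals
    return probe / loading

def fold_over_control(sample: Signals, control: Signals) -> float:
    """Normalized abundance of a sample relative to the control lane."""
    return normalized_abundance(sample) / normalized_abundance(control)

def fraction_overabundant(panel: Dict[str, Signals], control: Signals,
                          cutoff: float = 2.0) -> float:
    """Proportion of panel members whose fold change meets the cutoff."""
    hits = [name for name, s in panel.items()
            if fold_over_control(s, control) >= cutoff]
    return len(hits) / len(panel)

control_lane = (200.0, 200.0)
panel = {"line_1": (900.0, 300.0), "line_2": (250.0, 220.0)}  # made-up values
print(fold_over_control(panel["line_1"], control_lane))
print(fraction_overabundant(panel, control_lane))
```

Normalizing before comparing keeps unequal loading or transfer from being mistaken for RNA overabundance.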
  • RNA overabundance without gene duplication
  • the strategies for identifying genes that are duplicated and/or associated with RNA overabundance may be reversed appropriately to screen for genes that are deleted and/or associated with RNA underabundance
  • the principles are essentially the same. Genes that are frequently down-regulated in cancer (such as tumor suppressor genes) may be down-regulated by different mechanisms in different cells, and a gene with this behavior is more likely to be central to malignant transformation or persistence of the malignant state.
  • RNA is prepared from a plurality of tumors or cancer cell lines and the abundance is compared with RNA preparation from control cells
  • cancer cells that share a deleted gene in the same chromosomal region, in order to focus any differences at the RNA level towards particular alterations in cancer cells and away from normal variations or coincidental changes
  • the CGH technique may be used to identify deletions in previously uncharacterized cancer cells
  • cancer cells may be chosen on the basis of previous knowledge of deleted regions; there is no need to conduct methods such as CGH on previously characterized lines. cDNA from the RNA of cancer cells is displayed (preferably by differential display) alongside cDNA copied from
  • cDNA is selected that appears to be underrepresented in at least two (preferably more) of the cancer cells compared with the control cells. cDNA thus selected may optionally be further screened against digested DNA preparations, to confirm that the RNA underabundance observed in the cancer cell populations is attributable in at least a proportion of the cells to an actual gene deletion.
  • the cDNA may be used for sequencing or rescuing additional polynucleotides, in this case not from the cancer cells but from cells containing or expressing the gene at normal levels
  • Pharmaceuticals based on deleted genes or those associated with underexpressed RNA are typically oriented at restoring or upregulating the gene, or a functional equivalent of the encoded gene product
  • RNA has been compared between breast cancer cells and control cells
  • the amount of total cellular RNA was compared using a modified differential display method
  • Primers were used for the 3' region of the mRNAs which have an oligo-dT sequence, followed by two other nucleotides as described in the previous section
  • Random or arbitrary primers of about 10 nucleotides were used for replication towards what corresponds in the sequence to the 5' region of the mRNA
  • the labeled amplification product was then separated by molecular weight on a polyacrylamide sequencing gel
  • mRNAs were chosen that were present in a higher proportion of the RNA in cancerous cells, compared with control cells, according to the proportional intensity amongst neighboring cDNA bands
  • the cDNA was recovered directly from the gel and amplified to provide a probe for screening
  • Candidate polynucleotides were screened by a number of criteria, including both Northern and Southern analysis, to determine if the corresponding genes were duplicated or responsible for RNA overabundance in breast cancer cells. Sequence data of the polynucleotides was obtained and compared with sequences in GenBank. Novel polynucleotides with the desired expression patterns were used to probe for longer cDNA inserts in a λgt10 library constructed from the breast cancer cell line BT474, which were then sequenced.
  • Polynucleotides based on the cDNA of CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1 can be rescued from cloned plasmids and phage provided as part of this invention. They may also be obtained from breast cancer cell libraries or mRNA preparations, or from normal human tissues such as placenta, by judicious use of primers or probes based on the sequence data provided herein. Alternatively, the sequence data provided herein can be used in chemical synthesis to produce a polynucleotide with an identical sequence, or one that incorporates occasional variations. Polypeptides encoded by the corresponding mRNA can be prepared by several different methods, all of which will be known to a practitioner of ordinary skill. For example, the appropriate strand of the full-length cDNA can be operably linked to a suitable promoter, and transfected into a suitable host cell. The host cell is then cultured under conditions that allow transcription and translation to occur, and the polypeptide is
  • Antibodies against polypeptides of this invention may be prepared by any method known in the art. For stimulating antibody production in an animal, it is often preferable to enhance the immunogenicity of a polypeptide by such techniques as polymerization with glutaraldehyde, or combining with an adjuvant, such as Freund's adjuvant.
  • the immunogen is injected into a suitable experimental animal: preferably a rodent for the preparation of monoclonal antibodies; preferably a larger animal such as a rabbit or sheep for preparation of polyclonal antibodies. It is preferable to provide a second or booster injection after about 4 weeks, and begin harvesting the antibody source no less than about 1 week later.
  • Sera harvested from the immunized animals provide a source of polyclonal antibodies. Detailed procedures for purifying specific antibody activity from a source material are known within the art. Unwanted activity cross-reacting with other antigens, if present, can be removed, for example, by running the preparation over adsorbants made of those antigens attached to a solid phase, and collecting the unbound fraction. If desired, the specific antibody activity can be further purified by such techniques as protein A chromatography, ammonium sulfate precipitation, ion exchange chromatography, high-performance liquid chromatography and immunoaffinity chromatography on a column of the immunizing polypeptide coupled to a solid support.
  • immune cells such as splenocytes can be recovered from the immunized animals and used to prepare a monoclonal antibody-producing cell line. See, for example, Harlow & Lane (1988), U.S. Patent Nos. 4,491,632 (J. R. Wands et al.), U.S. 4,472,500 (C. Milstein et al.), and U.S. 4,444,887 (M. K. Hoffman et al.).
  • an antibody-producing line can be produced inter alia by cell fusion, or by transfecting antibody-producing cells with Epstein Barr Virus, or transforming with oncogenic DNA
  • the treated cells are cloned and cultured, and clones are selected that produce antibody of the desired specificity. Specificity testing can be performed on culture supernatants by a number of techniques, such as using the immunizing polypeptide as the detecting reagent in a standard immunoassay, or using cells expressing the polypeptide in immunohistochemistry.
  • a supply of monoclonal antibody from the selected clones can be purified from a large volume of tissue culture supernatant, or from the ascites fluid of suitably prepared host animals injected with the clone
  • Antibody fragments and other derivatives can be prepared by methods of standard protein chemistry, such as subjecting the antibody to cleavage with a proteolytic enzyme
  • Genetically engineered variants of the antibody can be produced by obtaining a polynucleotide encoding the antibody, and applying the general methods of molecular biology to introduce mutations and translate the variant
  • Novel cDNA sequences corresponding to genes associated with cancer are potentially useful as diagnostic aids
  • polypeptides encoded by such genes, and antibodies specific for these polypeptides are also potentially useful as diagnostic aids
  • RNA in particular cells can help identify those cells as being cancerous, and thereby play a part in the initial diagnosis
  • Increased levels of RNA corresponding to CH1-9a11-2, CH8-2a13-12, CH13-2a12-1, and CH14-2a16-1 are present in a substantial proportion of breast cancer cell lines and primary breast tumors
  • preliminary Northern analysis using probes for CH8-2a13-12, CH13-2a12-1, and CH14-2a16-1 indicates that these genes may be duplicated or be associated with RNA overabundance in certain cell lines derived from cancers other than breast cancer, including colon cancer, lung cancer, prostate cancer, glioma, and ovarian cancer
  • RNA can assist with clinical management and prognosis
  • overabundance of RNA may be a useful predictor of disease survival, metastasis, susceptibility to various regimens of standard chemotherapy, the stage of the cancer, or its aggressiveness
  • See U.S. Patent No. 4,968,603 (Slamon et al.) and PCT Application WO 94/00601 (Levine et al.). All of these determinations are important in helping the clinician choose between the available treatment options.
  • a particularly important diagnostic application contemplated in this invention is the identification of patients suitable for gene-specific therapy, as outlined in the following section
  • treatment directed against a particular gene or gene product is appropriate in cancers where the gene is duplicated or there is RNA overabundance
  • a diagnostic test specific for the same gene is important in selecting patients likely to benefit from the pharmaceutical
  • diagnostic tests for each gene are important in selecting which pharmaceutical is likely to benefit a particular patient
  • polynucleotide, polypeptide, and antibodies embodied in this invention provide specific reagents that can be used in standard diagnostic procedures
  • the actual procedures for conducting diagnostic tests are extensively known in the art, and are routine for a practitioner of ordinary skill. See, for example, U.S. Patent No. 4,968,603 (Slamon et al.), and PCT Applications WO 94/00601
  • the polynucleotide of this invention can be used as a reagent to detect a DNA or RNA target, such as might be present in a cell with duplication or RNA overabundance of the corresponding gene
  • the polypeptide can be used as a reagent to detect a target for which it has a specific binding site, such as an antibody molecule or (if the polypeptide is a receptor) the corresponding ligand
  • the antibody can be used as a reagent to detect a target it specifically recognizes, such as the polypeptide used as an immunogen to raise it
  • the target is supplied by obtaining a suitable tissue sample from an individual for whom the diagnostic parameter is to be measured
  • Relevant test samples are those obtained from individuals suspected of containing cancerous cells, particularly breast cancer cells
  • Many types of samples are suitable for this purpose, including those that are obtained near the suspected tumor site by biopsy or surgical dissection, in vitro cultures of cells derived therefrom, blood, and blood components
  • the target may be partially purified from the sample or amplified before the assay is conducted
  • the reaction is performed by contacting the reagent with the sample under conditions that will allow a complex to form between the reagent and the target
  • the reaction may be performed in solution, or on a solid tissue sample, for example, using histology sections
  • the formation of the complex is detected by a number of techniques known in the art
  • the reagent may be supplied with a label and unreacted reagent may be removed from the complex, the amount of remaining label thereby indicating the amount of complex formed. Further details and alternatives for complex detection are provided in the descriptions that follow.
  • the assay result is compared with a similar assay conducted on a control sample
  • a control sample which is from a non-cancerous source, and otherwise similar in composition to the clinical sample being tested
  • any control sample may be suitable provided the relative amount of target in the control is known or can be used for comparative purposes
  • suitable control cells with normal histopathology may surround the cancerous cells being tested
  • the amount of complex formed is quantifiable and sufficiently consistent, it is acceptable to assay the test sample and control sample on different days or in different laboratories
  • a polynucleotide embodied in this invention can be used as a reagent for determining gene duplication or RNA overabundance that may be present in a clinical sample
  • the binding of the reagent polynucleotide to a target in a clinical sample generally relies in part on a hybridization reaction between a region of the polynucleotide reagent, and the DNA or RNA in a sample being tested
  • the nucleic acid may be extracted from the sample, and may also be partially purified
  • to measure gene duplication, the preparation is preferably enriched for chromosomal DNA; to measure RNA overabundance, the preparation is preferably enriched for RNA
  • the target polynucleotide can be optionally subjected to any combination of additional treatments, including digestion with restriction endonucleases, size separation, for example by electrophoresis in agarose or polyacrylamide, and affixing to a reaction matrix, such as a blotting material
  • Hybridization is allowed to occur by mixing the reagent polynucleotide with a sample suspected of containing a target polynucleotide under appropriate reaction conditions. This may be followed by washing or separation to remove unreacted reagent. Generally, both the target polynucleotide and the reagent must be at least partly equilibrated into the single-stranded form in order for complementary sequences to hybridize efficiently. Thus, it may be useful (particularly in tests for DNA) to prepare the sample by standard denaturation techniques known in the art.
  • the minimum complementarity between the reagent sequence and the target sequence for a complex to form depends on the conditions under which the complex-forming reaction is allowed to occur. Such conditions include temperature, ionic strength, time of incubation, the presence of additional solutes in the reaction mixture such as formamide, and washing procedure. Higher stringency conditions are those under which higher minimum complementarity is required for stable hybridization to occur. It is generally preferable in diagnostic applications to increase the specificity of the reaction, minimizing cross-reactivity of the reagent polynucleotide with alternative undesired hybridization sites in the sample. Thus, it is preferable to conduct the reaction under conditions of high stringency: for example, in the presence of high temperature, low salt, formamide, a combination of these, or followed by a low-salt wash.
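How temperature, salt, and formamide jointly set stringency can be illustrated with a commonly used empirical approximation for the melting temperature of long DNA:DNA hybrids. This formula is not taken from the source, and its coefficients are approximate and valid only over a limited range of conditions; the function name and the example values are assumptions for illustration.

```python
import math

def estimate_tm(na_molar: float, gc_percent: float,
                formamide_percent: float, length_bp: int) -> float:
    """Approximate Tm (deg C) of a long DNA:DNA hybrid.

    Empirical rule of thumb:
    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(%formamide) - 500/length
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)

# Illustrative conditions for a 500 bp probe with 50% GC content.
print(estimate_tm(1.0, 50.0, 0.0, 500))     # high salt, no formamide
print(estimate_tm(0.015, 50.0, 0.0, 500))   # low-salt wash lowers Tm
print(estimate_tm(1.0, 50.0, 50.0, 500))    # formamide lowers Tm
```

Lowering salt or adding formamide reduces the estimated Tm, so a fixed incubation or wash temperature sits closer to Tm and only better-matched hybrids remain stable, which is the sense in which those conditions are of higher stringency.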
  • In order to detect the complexes formed between the reagent and the target, the reagent is generally provided with a label
  • Some of the labels often used in this type of assay include radioisotopes such as 32P and 33P, chemiluminescent or fluorescent reagents such as fluorescein, and enzymes such as alkaline phosphatase that are capable of producing a colored solute or precipitate
  • the label may be intrinsic to the reagent, it may be attached by direct chemical linkage, or it may be connected through a series of intermediate reactive molecules, such as a biotin-avidin complex, or a series of inter-reactive polynucleotides
  • the label may be added to the reagent before hybridization with the target polynucleotide, or afterwards
  • it is often desirable to increase the signal ensuing from hybridization. This can be accomplished by replicating either the target polynucleotide or the reagent polynucleotide, such as by a polymerase chain reaction.
  • a combination of serially hybridizing polynucleotides or branched polynucleotides can be used in such a way that multiple label components become incorporated into each complex. See U.S. Patent No. 5,124,246 (Urdea et al.).
  • An antibody embodied in this invention can also be used as a reagent in cancer diagnosis, or for determining gene duplication or RNA over
  • any such protein product can be detected in solid tissue samples and cultured cells by immunohistological techniques that will be obvious to a practitioner of ordinary skill
  • the tissue is preserved by a combination of techniques which may include cooling, exchanging into different solvents, fixing with agents such as paraformaldehyde, or embedding in a commercially available medium such as paraffin or OCT
  • a section of the sample is suitably prepared and overlaid with a primary antibody specific for the protein
  • the primary antibody may be provided directly with a suitable label. More frequently, the primary antibody is detected using one of a number of developing reagents which are easily produced or available commercially. Typically, these developing reagents are anti-immunoglobulin or protein A, and they typically bear labels which include, but are not limited to, fluorescent markers such as fluorescein, enzymes such as peroxidase that are capable of precipitating a suitable chemical compound, electron dense markers such as colloidal gold, or radioisotopes such as 125I.
  • the section is then visualized using an appropriate microscopic technique, and the level of labeling is compared between the suspected cancer cell and a control cell, such as cells surrounding the tumor area or those taken from an alternative site
  • the amount of protein corresponding to the cancer-associated gene may be detected in a standard quantitative immunoassay. If the protein is secreted or shed from the cell in any appreciable amount, it may be detectable in plasma or serum samples. Alternatively, the target protein may be solubilized or extracted from a solid tissue sample.
  • the protein may be mixed with a pre-determined non-limiting amount of the reagent antibody specific for the protein
  • the reagent antibody may contain a directly attached label, such as an enzyme or a radioisotope, or a second labeled reagent may be added, such as anti-immunoglobulin or protein A
  • a second labeled reagent may be added, such as anti-immunoglobulin or protein A
  • For a solid-phase assay, unreacted reagents are removed by washing
  • unreacted reagents are removed by some other separation technique, such as filtration or chromatography
  • the amount of label captured in the complex is positively related to the amount of target protein present in the test sample
  • a variation of this technique is a competitive assay, in which the target protein competes with a labeled analog for binding sites on the specific antibody. In this case, the amount of label captured is negatively related to the amount of target protein present in a test sample.
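The difference between the two assay formats shows up in how a signal is read back against a standard curve: the curve rises with concentration in the direct format and falls in the competitive format. The sketch below uses piecewise-linear interpolation between standards, which is a simplifying assumption; the function name and all standard values are made up for illustration.

```python
from typing import List, Tuple

def interpolate_concentration(signal: float,
                              standards: List[Tuple[float, float]]) -> float:
    """Read a concentration off a standard curve of (concentration, signal) pairs.

    Works for rising curves (direct format) and falling curves (competitive format),
    provided the signal changes monotonically with concentration.
    """
    pts = sorted(standards)  # order by concentration
    for (c0, s0), (c1, s1) in zip(pts, pts[1:]):
        lo, hi = min(s0, s1), max(s0, s1)
        if s0 != s1 and lo <= signal <= hi:
            return c0 + (signal - s0) * (c1 - c0) / (s1 - s0)
    raise ValueError("signal lies outside the standard curve")

direct_curve = [(0.0, 0.0), (10.0, 100.0), (20.0, 200.0)]        # label rises with target
competitive_curve = [(0.0, 200.0), (10.0, 100.0), (20.0, 0.0)]   # label falls with target

print(interpolate_concentration(150.0, direct_curve))
print(interpolate_concentration(150.0, competitive_curve))
```

The same label reading therefore maps to a high target concentration in one format and a low one in the other, so the format must be known before a result is interpreted.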
  • a polypeptide embodied in this invention can also be used as a reagent in cancer diagnosis, or for determining gene duplication or RNA overabundance that may be present in a clinical sample
  • Overabundance of RNA in affected cells may result in the corresponding polypeptide being produced by the cells in an abnormal amount
  • overabundance of RNA may occur concurrently with expression of the polypeptide in an unusual form. This in turn may result in stimulation of the immune response of the host to produce its own antibody molecules that are specific for the polypeptide.
  • a number of human hybridomas have been raised from cancer patients that produce antibodies against their own tumor antigens
  • an immunoassay is conducted. Suitable methods are generally the same as the immunoassays outlined in the preceding paragraphs, except that the polypeptide is provided as a reagent, and the antibody is the target in the clinical sample which is to be quantified.
  • diagnostic procedures may be performed by diagnostic laboratories, experimental laboratories, practitioners, or private individuals
  • This invention provides diagnostic kits which can be used in these settings
  • the presence of cancer cells in the individual may be manifest in a clinical sample obtained from that individual as an alteration in the DNA, RNA, protein, or antibodies contained in the sample
  • An alteration in one of these components resulting from the presence of cancer may take the form of an increase or decrease of the level of the component, or an alteration in the form of the component, compared with that in a sample from a healthy individual
  • the clinical sample is optionally pre-treated for enrichment of the target being tested for. The user then applies a reagent contained in the kit in order to detect the changed level or alteration in the diagnostic component.
  • Each kit necessarily comprises the reagent which renders the procedure specific: a reagent polynucleotide, used for detecting target DNA or RNA; a reagent antibody, used for detecting target protein; or a reagent polypeptide, used for detecting target antibody that may be present in a sample to be analyzed
  • the reagent is supplied in a solid form or liquid buffer that is suitable for inventory storage, and later for exchange or addition into the reaction medium when the test is performed. Suitable packaging is provided.
  • the kit may optionally provide additional components that are useful in the procedure. These optional components include buffers, capture reagents, developing reagents, labels, reacting surfaces, means for detection, control samples, instructions, and interpretive information.
  • Embodied in this invention are modes of treating subjects bearing cancer cells that have overabundance of the particular RNA described
  • the strategy used to obtain the cDNAs provided in this invention was deliberately focused on genes that achieve RNA overabundance by gene duplication in some cells, and by alternative mechanisms in other cells
  • These alternative mechanisms may include, for example, translocation or enhancement of transcription enhancing elements near the coding region of the gene, deletion of repressor binding sites, or altered production of gene regulators
  • Such mechanisms would result in more RNA being transcribed from the same gene
  • the same amount of RNA may be transcribed, but may persist longer in the cell, resulting in greater abundance. This could occur, for example, by reduction in the level of ribozymes or protein enzymes that degrade RNA, or in the modification of the RNA to render it more resistant to such enzymes or spontaneous degradation
  • the general screening strategy is to apply the candidate to a manifestation of a gene associated with cancer, and then determine whether the effect is beneficial and specific
  • a composition that interferes with a polynucleotide or polypeptide corresponding to any of the novel cancer-associated genes described herein has the potential to block the associated pathology when administered to a tumor of the appropriate phenotype. It is not necessary that the mechanism of interference be known, only that the interference be preferential for cancerous cells (or cells near the cancer site) but not other cells
  • a preferred method of screening is to provide cells in which a polynucleotide related to a cancer gene has been transfected. See, for example, PCT application WO 93/08701
  • a suitable vector such as a viral vector
  • conveying the vector into the cell such as by electroporation
  • selecting cells that have been transformed such as by using a reporter or drug sensitivity element
  • a cell line which has a phenotype desirable in testing, and which can be maintained well in culture
  • the cell line is transfected with a polynucleotide corresponding to one of the cancer-associated genes identified herein. Transfection is performed such that the polynucleotide is operably linked to a genetic controlling element that permits the correct strand of the polynucleotide to be transcribed within the cell
  • Successful transfection can be determined by the increased abundance of the RNA compared with an untransfected cell. It is not necessary that the cell previously be devoid of the RNA, only that the transfection result in a substantial increase in the level observed. RNA abundance in the cell is measured using the same polynucleotide, according to the hybridization assays outlined earlier
  • Drug screening is performed by adding each candidate to a sample of transfected cells, and monitoring the effect
  • the experiment includes a parallel sample which does not receive the candidate drug
  • the treated and untreated cells are then compared by any suitable phenotypic criteria, including but not limited to microscopic analysis, viability testing, ability to replicate, histological examination, the level of a particular RNA or polypeptide associated with the cells, the level of enzymatic activity expressed by the cells or cell lysates, and the ability of the cells to interact with other cells or compounds. Differences between treated and untreated cells indicate effects attributable to the candidate
  • the effect of the drug on the cell transfected with the polynucleotide is also compared with the effect on a control cell. Suitable control cells include untransfected cells of similar ancestry, cells transfected with an alternative polynucleotide, or cells transfected with the same polynucleotide in an inoperative fashion. Optimally, the drug has a greater effect on operably transfected cells than on
  • Desirable effects of a candidate drug include an effect on any phenotype that was conferred by transfection of the cell line with the polynucleotide from the cancer-associated gene, or an effect that could limit a pathological feature of the gene in a cancerous cell
  • Examples of the first type would be a drug that limits the overabundance of RNA in the transfected cell, limits production of the encoded protein, or limits the functional effect of the protein. The effect of the drug would be apparent when comparing results between treated and untreated cells
  • An example of the second type would be a drug that makes use of the transfected gene or a gene product to specifically poison the cell. The effect of the drug would be apparent when comparing results between operably transfected cells and control cells
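The two comparisons described above (treated versus untreated cells, and operably transfected versus control cells) can be sketched numerically. This is only an illustration under stated assumptions, not the screening procedure of the invention: the readout values, the function names `fractional_effect` and `is_preferential`, and the `margin` threshold are all hypothetical. Any phenotypic readout (viability, RNA level, enzymatic activity) could stand in for the numbers.

```python
def fractional_effect(untreated: float, treated: float) -> float:
    """Fractional change in a phenotypic readout caused by the candidate drug.

    Positive values mean the readout dropped after treatment.
    """
    if untreated <= 0:
        raise ValueError("untreated readout must be positive")
    return (untreated - treated) / untreated


def is_preferential(transfected_effect: float, control_effect: float,
                    margin: float = 0.2) -> bool:
    """True if the drug affects transfected cells more than control cells.

    `margin` is an arbitrary, hypothetical threshold chosen for this sketch.
    """
    return (transfected_effect - control_effect) >= margin


# Hypothetical viability readouts (arbitrary units), not data from the source.
transfected = fractional_effect(untreated=100.0, treated=40.0)
control = fractional_effect(untreated=100.0, treated=90.0)
print(is_preferential(transfected, control))
```

The first function corresponds to the treated/untreated comparison and the second to the transfected/control comparison. A real screen would use replicate measurements and a statistical test rather than a fixed margin.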
  • This invention also provides gene-specific pharmaceuticals in which each of the polynucleotides, polypeptides, and antibodies embodied herein serves as a specific active ingredient in pharmaceutical compositions
  • Such compositions may decrease the pathology of cancer cells on their own, or render the cancer cells more susceptible to treatment by the non-specific agents, such as classical chemotherapy or radiation
  • An example of how polynucleotides embodied in this invention can be effectively used in treatment is gene therapy. See, for example, Morgan et al., Culver et al., and U.S. Patent No. 5,399,346 (French et al.)
  • the general principle is to introduce the polynucleotide into a cancer cell in a patient, and allow it to interfere with the expression of the corresponding gene, such as by complexing with the gene itself or with the RNA transcribed from the gene. Entry into the cell is facilitated by suitable techniques known in the art, such as providing the polynucleotide in the form of a suitable vector, or encapsulation of the poly
  • a preferred mode of gene therapy is to provide the polynucleotide in such a way that it will replicate inside the cell, enhancing and prolonging the interference effect
  • the polynucleotide is operably linked to a suitable promoter, such as the natural promoter of the corresponding gene, a heterologous promoter that is intrinsically active in cancer cells, or a heterologous promoter that can be induced by a suitable agent
  • the construct is designed so that the polynucleotide sequence operably linked to the promoter is complementary to the sequence of the corresponding gene
  • the transcript of the administered polynucleotide will be complementary to the transcript of the gene, and capable of hybridizing with it
  • This approach is known as anti-sense therapy. See, for example, Culver et al. and Roth
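The complementarity requirement described above can be illustrated with a short sketch. It assumes a sense-strand DNA fragment written 5' to 3' and returns the RNA that would base-pair with the transcript of that fragment. The helper names and the example fragment are hypothetical and are not taken from the sequence listings.

```python
# Complement table for unambiguous DNA bases (illustrative helper, not from the source).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_rna(sense_dna: str) -> str:
    """RNA complementary to the transcript of `sense_dna`.

    The gene transcript carries the sense sequence (with U in place of T),
    so the anti-sense RNA is the reverse complement, also written with U.
    """
    return reverse_complement(sense_dna).replace("T", "U").replace("t", "u")


example_sense = "ATGGCCAAG"  # hypothetical fragment for illustration only
print(antisense_rna(example_sense))
```

A construct that places the reverse-complement sequence downstream of the promoter would, when transcribed, yield the anti-sense RNA computed here.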
  • the use of antibodies embodied in this invention in the treatment of cancer partly relies on the fact that genes that show RNA overabundance in cancer frequently encode cell-surface proteins. Location of these proteins at the cell surface may correspond to
  • a specific antibody may be effective in decreasing the pathology of a cancer cell
  • an antibody that blocks the ligand binding site or causes endocytosis of the receptor would decrease the ability of the receptor to provide its signal to the cell
  • the effectiveness of a particular antibody can be predicted empirically by testing with cultured cancer cells expressing the corresponding protein
  • Monoclonal antibodies may be more effective in this form of cancer therapy if several different clones directed at different determinants of the same cancer-associated gene product are used in combination; see PCT application WO 94/00136 (Kasprzyk et al.)
  • Such antibody treatment may directly decrease the pathology of the cancer cells, or render them more susceptible to non-specific cytotoxic agents such as platinum (Lippman)
  • Another example of how antibodies can be used in cancer therapy is in the specific targeting of effector components
  • the protein product of the cancer-associated gene is expected to appear in high frequency on cancer cells compared to unaffected cells, due to the overabundance of the corresponding RNA
  • the protein therefore provides a marker for cancer cells that a specific antibody can bind to
  • An effector component attached to the antibody therefore becomes concentrated near the cancer cells, improving the effect on those cells and decreasing the effect on non-cancer cells. This concentration would generally occur not only near the primary tumor, but also near cancer cells that have metastasized to other tissue sites. Furthermore, if the antibody is able to induce endocytosis, this will enhance entry of the effector into the cell interior
  • an antibody specific for the protein of the cancer-associated gene is conjugated with a suitable effector component, preferably by a covalent or high-affinity bond
  • suitable effector components in such compositions include radionuclides such as 131I, toxic chemicals such as vincristine, and toxic peptides such as diphtheria toxin
  • Other suitable effector components include peptides or polynucleotides capable of altering the phenotype of the cell in a desirable fashion; for example, installing a tumor suppressor gene, or rendering them susceptible to immune attack
  • an active vaccine comprising a polypeptide encoded by the cDNA of this invention would be appropriately administered to subjects having overabundance of the corresponding RNA
  • Peptides may be capable of eliciting an immune response on their own, or they may be rendered more immunogenic by chemical manipulation, such as cross-linking or attaching to a protein carrier like KLH
  • the vaccine also comprises an adjuvant, such as alum, muramyl dipeptides, liposomes, or DETOX™
  • the vaccine may optionally comprise auxiliary substances such as wetting agents, emulsifying agents, and organic or inorganic salts or acids. It also comprises a pharmaceutically acceptable excipient which is compatible with the active ingredient and appropriate for the route of administration
  • the desired dose for peptide vaccines is generally from 10 μg to 1 mg, with a broad effective latitude
  • the vaccine is preferably administered first as a priming dose, and then again as a boosting dose, usually at least four weeks later. Further boosting doses may be given to enhance the effect
  • the dose and its timing are usually determined by the person responsible for the treatment
  • Sequence data for CH8-2a13-1 and CH13-2a12-1 cDNA are believed to comprise the entire translated coding sequence, and 5' and 3' untranslated regions corresponding to those found in typical mRNA transcripts. Multiple mRNA transcripts may be found depending on the patterns of transcript processing in various cell types of interest. Sequence data for CH1-9a11-2 and CH14-2a16-1 cDNA comprise a portion of the coding sequence and 3' untranslated regions. Additional sequence is typically present in the corresponding mRNA transcripts, comprising an additional coding region in the N-terminal direction of the protein, and possibly a 5' untranslated region
  • Certain embodiments of this invention may be practiced by polynucleotide synthesis according to the data provided herein, by rescuing an appropriate insert corresponding to the gene of interest from one of the deposits listed below, or by isolating a corresponding polynucleotide from a suitable tissue source
  • Various useful probes and primers for use in polynucleotide isolation are provided herein, or may be designed from the sequence data. Three deposits have been made on May 31, 1996 with the American Type Culture Collection
  • Sequence databases contain sequences of polynucleotide and polypeptide fragments with various degrees of identity and overlap with certain embodiments of this invention.
  • the following list of accession numbers is provided for the interest of the reader; it is not intended to be comprehensive or a limitation on the invention.
  • the database disclosures do not typically indicate use in cancer diagnosis, drug development, or disease treatment.
  • GenBank accession numbers are listed in relation to CH1-9a11-2: dbEST N32686; N45113; N36176; N22982; AA278830; H88670; AA235936; AA236951; H26301; N28026; H88063; H88064; D61948; H88718; H26460; AA137920; AA145308; W12952; AA200687; N44164; T27279; dbSTS G22044; G04961.
  • GenBank accession numbers are listed in relation to CH8-2a13-1: dbNR D83780
  • GenBank accession numbers are listed in relation to CH13-2a12-1: dbNR
  • GenBank accession numbers are listed in relation to CH14-2a16-1: dbEST N64802, W56903, N31400, W95674, AA233551, AA233636, N24105, W03447, W25821, AA233666, AA233647, N67843, D55778, T66839, N55370, N75650, AA280736, H97110, Z19643, H91250,
  • Example 1: Selecting cDNA for messenger RNA that is overabundant in breast cancer cells
  • RNA was isolated from each breast cancer cell line or control cell by centrifugation through a gradient of guanidine isothiocyanate/CsCl
  • the RNA was treated with RNase-free DNase (Promega, Madison, WI)
  • the RNA preparations were stored at -70°C. Oligo-dT polynucleotides for priming at the 3' end of messenger RNA with the sequence T ⁇ NM (where N ∈ {A,C,G} and M ∈ {A,C,G,T}) were synthesized according to standard protocols
  • Arbitrary decamer polynucleotides OPA01 to OPA20 for priming towards the 5' end were purchased from Operon Biotechnology, Inc., Alameda, CA
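The primer set described above can be enumerated with a short sketch. It lists every anchored oligo-dT primer given the stated choices for N and M, and counts how many primer pairs would exist if each anchored primer were combined with each of the twenty decamers. The length of the T run is an assumption (set to 11 here purely for illustration), the function name is hypothetical, and the source does not state that every combination was actually run.

```python
from itertools import product


def anchored_oligo_dt_primers(t_run_length: int) -> list[str]:
    """List anchored oligo-dT primers of the form T...T-N-M.

    N is one of A, C, G (never T, so the primer anchors at the start of the
    poly-A tail); M is any of A, C, G, T.
    """
    n_bases = "ACG"
    m_bases = "ACGT"
    return ["T" * t_run_length + n + m for n, m in product(n_bases, m_bases)]


# t_run_length=11 is an assumed value for illustration only.
primers = anchored_oligo_dt_primers(t_run_length=11)
decamers = [f"OPA{i:02d}" for i in range(1, 21)]  # OPA01 .. OPA20, names only
pairs = list(product(primers, decamers))
print(len(primers), len(decamers), len(pairs))
```

Three choices for N and four for M give twelve anchored primers, so twenty decamers would yield 240 possible primer pairs.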
  • RNA was reverse-transcribed using AMV reverse transcriptase (obtained from BRL) and an anchored oligo-dT primer in a volume of 20 μL, according to the manufacturer's directions
  • the reaction was incubated at 37°C for 60 min and stopped by incubating at 95°C for 5 min
  • the cDNA obtained was used immediately or stored frozen at -70°C
  • Differential display was conducted according to the following procedure. 1 μL cDNA was replicated in a total volume of 10 μL PCR mixture containing the appropriate T ⁇ NM sequence, 0.5 μM of a decamer primer, 200 μM dNTP, 5 μCi [35S]-dATP (Amersham), Taq polymerase buffer with 2.5 mM MgCl2, and 0.3 unit Taq polymerase (Promega). Forty cycles were conducted in the following sequence: 94°C for 30 sec, 40°C for 2 min, 72°C for 30 sec; the sample was then incubated at 72°C for 5 min. The replicated cDNA was separated on a 6% polyacrylamide sequencing gel. After electrophoresis, the gel was dried and exposed to X-ray film
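The cycling program described above can be written down as data, which makes the total programmed hold time easy to check. This is a sketch only: the `Step` type and `total_hold_seconds` function are hypothetical names, and the calculation ignores the ramp time between temperatures, which depends on the instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees Celsius
    seconds: int    # hold time at that temperature


# One cycle as described: denature, anneal at low stringency, extend.
CYCLE = (Step(94, 30), Step(40, 120), Step(72, 30))
N_CYCLES = 40
FINAL_EXTENSION = Step(72, 300)


def total_hold_seconds(cycle, n_cycles: int, final: Step) -> int:
    """Sum of programmed hold times, excluding temperature ramps."""
    return n_cycles * sum(step.seconds for step in cycle) + final.seconds


total = total_hold_seconds(CYCLE, N_CYCLES, FINAL_EXTENSION)
print(total / 60, "minutes of programmed hold time")
```

Each cycle holds for 180 seconds, so forty cycles plus the final extension come to 7500 seconds, or 125 minutes, before ramping is counted.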
  • FIG. 1 provides an example of an autoradiogram from such an experiment
  • Lane 1 is from non-proliferating normal breast cells
  • lane 2 is from proliferating normal breast cells
  • lanes 3 to 5 are from breast cancer cell lines BT474, SKBR3, and MCF7
  • the left and right sides show the patterns obtained from experiments using the same T ⁇ NM sequence (T ⁇ AC), but two different decamer primers
  • the arrows indicate the cDNA fragments that were more abundant in all three tumor lines compared with controls
  • RNA derived from uncultured normal mammary epithelial cells are obtained from surgical samples resected from healthy breast tissue, which are then coaxed apart by blunt dissection techniques and mild enzyme treatment. Using organoids as the negative control, 33 cDNA fragments were isolated from 15 displays
  • Example 2: Sub-selecting cDNA that corresponds to genes that are duplicated in breast cancer cells
  • BT474, SKBR3 and ZR-75-30 were used to prepare Southern blots to screen the cloned cDNA fragments
  • the cloned cDNA fragments were labeled with [32P]-dCTP, and used individually to probe the blots. A larger relative amount of binding of the probe to the lanes corresponding to the cancer cell DNA indicated that the corresponding gene had been duplicated in the cancer cells
  • the labeled cDNA probes were also used in Northern blots to verify that the corresponding RNA was overabundant in the appropriate cell lines
  • a partial nucleotide sequence was obtained using M13 primers. Each sequence was compared with the known sequences in GenBank. In initial experiments, 5 of the first 7 genes sequenced were mitochondrial genes. To avoid repeated isolation of mitochondrial genes, subsequent screening experiments were done with additional lanes in the DNA blot analysis for EcoRI digested and HindIII digeste
  • the fragments were used as probes to screen a cDNA library from breast cancer cell line BT474, constructed in lambda GT10
  • the longer cDNA obtained from lambda GT10 were sequenced using lambda GT10 primers
  • the chromosomal locations of the cDNAs were determined using panels of somatic cell hybrids
  • Example 3: Using the cDNA to test panels of breast cancer cells
  • the four cDNA obtained according to the selection procedures described were used to probe a panel of breast cancer cell lines and primary tumors. Gene duplication was detected either by Southern analysis or slot-blot analysis.
  • For Southern analysis, 10 μg of EcoRI digested genomic DNA from different cell lines was electrophoresed on 0.8% agarose and transferred to a HYBOND™ N+ membrane (Amersham). The filters were hybridized with 32P-labeled cDNA for the putative breast cancer gene. After an autoradiogram was obtained, the probe was stripped and the blot was re-probed using a reference probe to adjust for differences in sample loading.
  • Either chromosome 2 probe D2S5 or chromosome 21 probe D21S6 was used as a reference. Densities of the signals on the autoradiograms were obtained using a densitometer (Molecular Dynamics). The density ratio between the breast cancer gene and the reference gene was calculated for each sample. Two samples of placental DNA digests were run in each Southern analysis as a control. For slot-blot analysis, 1 μg of genomic DNA was denatured and slotted on the HYBOND™ membrane. D21S5 or human repetitive sequences were used as reference probes for slot blots. The density ratio between the breast cancer gene and the reference gene was calculated for each sample.
  • the standardized ratio calculated as described underestimates the gene copy number, although it is expected to rank in the same order
  • the standardized ratio obtained for the c-myc gene in the SKBR3 breast cancer cell was 5.0
  • SKBR3 has approximately 50 copies of the c-myc gene
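The density-ratio calculation described above can be sketched as follows. The densitometer readings are invented for illustration, the function names are hypothetical, and dividing by a single control ratio is an assumption about how the standardization is applied. The final lines use only the two c-myc figures quoted above to show how far the standardized ratio falls short of the copy number.

```python
def density_ratio(target_density: float, reference_density: float) -> float:
    """Signal of the gene probe divided by signal of the reference probe.

    Dividing by the reference corrects for differences in sample loading.
    """
    if reference_density <= 0:
        raise ValueError("reference density must be positive")
    return target_density / reference_density


def standardized_ratio(sample_ratio: float, control_ratio: float) -> float:
    """Sample ratio expressed relative to a control (e.g. placental DNA) ratio."""
    if control_ratio <= 0:
        raise ValueError("control ratio must be positive")
    return sample_ratio / control_ratio


# Hypothetical densitometer readings (arbitrary units), not data from the source.
control = density_ratio(target_density=200.0, reference_density=400.0)
sample = density_ratio(target_density=1500.0, reference_density=600.0)
print(standardized_ratio(sample, control))

# Figures quoted in the text for c-myc in SKBR3.
reported_ratio = 5.0
approximate_copies = 50
print(approximate_copies / reported_ratio)  # fold underestimation
```

On the quoted figures the ratio understates the copy number by about ten-fold, which is why the values are treated as a ranking rather than as a direct copy count.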
  • RNA from breast cancer cell lines or primary breast cancer tumors were electrophoresed on 0.8% agarose in the presence of the denaturant formamide, and then transferred to a nylon membrane. The membrane was probed first with 32P-labeled cDNA corresponding to the putative breast cancer gene, then stripped and reprobed with
  • Table 4 summarizes the results of the analysis for gene duplication and RNA overabundance
  • control samples were used for the gene duplication experiments, consisting of different preparations of placental DNA. The control sample with the highest level of intensity was used for standardizing the other values. Other sources used for this analysis were breast cancer cell lines with the designations shown. For reasons stated in Example 3, the quantitative number is not a direct indication of the gene copy number, although it is expected to rank in the same order. Similarly, up to 6 control samples were used for the RNA overabundance experiments, consisting of different preparations of breast cell organoids which had been maintained briefly in tissue culture until the experiment was performed. The control sample with the highest level of intensity was used for standardizing the other values. Each cell line was scored + or - according to an arbitrary cut-off value
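The standardization and scoring rule described above can be sketched in a few lines. The cut-off value, the sample names, and the intensity values below are hypothetical, since the text states only that the cut-off was arbitrary. Treating a value equal to the cut-off as positive is likewise an assumption.

```python
def standardize_and_score(samples: dict[str, float],
                          controls: list[float],
                          cutoff: float) -> dict[str, tuple[float, str]]:
    """Standardize each sample to the highest control, then score + or -.

    Using the highest control as the baseline is the conservative choice
    described in the text; `cutoff` is an arbitrary threshold.
    """
    if not controls:
        raise ValueError("at least one control value is required")
    baseline = max(controls)
    if baseline <= 0:
        raise ValueError("control intensities must be positive")
    scored = {}
    for name, value in samples.items():
        standardized = value / baseline
        scored[name] = (standardized, "+" if standardized >= cutoff else "-")
    return scored


# Hypothetical intensities and cut-off, for illustration only.
controls = [0.8, 1.0, 0.9]
samples = {"line_A": 4.0, "line_B": 1.2}
scores = standardize_and_score(samples, controls, cutoff=2.0)
print(scores)
```

Standardizing against the highest control rather than the mean makes a + score harder to reach, which reduces false positives at the cost of some sensitivity.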
  • the CH1-9a11-2 gene was further characterized by obtaining additional sequence information
  • a λ-GT10 cDNA library from the breast cancer cell line BT474 (Example 2) was screened using the initial cDNA insert, and a clone with a 2.5 kilobase insert was identified. The identified clone was subcloned into plasmid vector pCRII. T7 and Sp6 primers for regions flanking the cDNA inserts were used as initial sequencing primers
  • a second clone (designated pCH1-1.1) overlapping on the 5' end was obtained using CLONTECH Marathon™ cDNA Amplification Kit. A map showing the overlapping regions is provided in Figure 6. Briefly, two DNA primers designated CH1a and CH1b (Figure 7) were synthesized
  • RNA from breast cancer cell line 600PE was reverse transcribed using CH1b primer. After second strand synthesis, adaptor DNA provided in the kit was ligated to the double-stranded cDNA
  • the 5' end cDNA of CH1-9a11-2 was then amplified by PCR using primers CH1a and AP1 (provided in the kit)
  • the first PCR products were PCR reamplified using nested primers CH1a and AP2 (provided in the kit)
  • the PCR products were cloned into pCRII vector (Invitrogen) and screened with CH1-9a11-2 probe
  • Figure 23 is a listing of additional cDNA sequence obtained for CH1-9a11-2, comprising approximately 1934 base pairs 5' from the sequence of Figure 8
  • the additional sequence data was obtained by rescuing and amplifying two further fragments of CH1-9a11-2 cDNA. Nested primers were designed about 100 base pairs downstream from the 5' end of the known sequence
  • the primers were used in a nested amplification assay using AP1 and AP2, using the CLONTECH Marathon™ cDNA Amplification Kit as described above
  • the template for the first upstream fragment was reverse-transcribed polyadenylated RNA from breast cancer cell line 600PE, as described earlier. This fragment was sequenced, and another set of nested primers was designed
  • the template for the next upstream fragment was a Marathon™ ready cDNA preparation from human testes, also supplied by CLONTECH
  • the nucleotide sequence shown in Figure 23 comprises an open reading frame through to the 5' end. Figure 24 shows the corresponding protein translation. About another 500-1000 bases are predicted to be present in the CH1-9a11-2 direction, with the protein encoding sequence beginning somewhere within this additional sequence. Sequencing of the encoding region is completed by obtaining additional CH1-9a11-2 fragments in this direction
  • a CH1-9a11-2 cloned insert has been used to probe the level of relative expression in polyadenylated RNA from a panel of tissue sources. The RNA was obtained already prepared for
  • the outermost primer is used to synthesize a first cDNA strand complementary to the mRNA in the upstream direction
  • Second strand synthesis is performed using reagents in a CLONTECH Marathon™ cDNA amplification kit according to manufacturer's directions
  • the double-stranded DNA is then ligated at the 5' end of the coding sequence with the double-stranded adaptor fragment provided in the kit
  • a first PCR amplification (about 30 cycles) is performed using the first adapter primer from the kit and the outermost RNA-specific primer
  • a second amplification is performed using the second adapter primer and the innermost RNA-specific primer
  • a CLONTECH RACE-READY single-stranded cDNA from human placenta is PCR amplified using nested 5' anchor primers in combination with the outermost and innermost RNA-specific primers. Amplified DNA obtained using either approach is analyzed by gel electrophoresis, and cloned into plasmid vector
  • CH1-9a11-2 insert. Clones corresponding to full-length mRNA (4.5 kb or 5.5 kb, Table 1), or cDNA fragments overlapping at the 5' end, are selected for sequencing. Compared with the 4.5 kb form, additional polynucleotide segments may be present in the 5.5 kb form within the encoding region, or in the 5' or 3' untranslated region
  • Figure 27 shows what is believed to be the full length cDNA sequence for the principal transcript of the Chromosome 1 gene CH1-9a11-2. The sequence is 5919 nucleotides in length, which matches the size of mRNA observed by Northern blot analysis
  • the nucleotides 1 to 2467 represent new sequence not shown in preceding figures
  • CH1c and CH1d were prepared. Polyadenylated RNA from breast cancer cell line BT474 was reverse transcribed using CH1c primer. After second strand synthesis, adaptor DNA provided by the kit was ligated to the double-stranded cDNAs. The 5' end cDNA of CH1-9a11-2 was then amplified by PCR using primers CH1c and AP1 (provided by the kit). To increase the specificity of the PCR products, the first PCR products were PCR amplified using nested primers CH1d and AP2. The PCR products were cloned into PCR2.1 vector (Invitrogen) and screened with 32P-labeled pCH1-1.1k probe. Clone pCH1-800 was obtained and the cDNA insert was sequenced. Based upon the pCH1-800 sequences, primers CH1I and CH1J were prepared. Using the method described above, clone pCH1-J8 was obtained. Based upon the pCH1-J8 sequence
  • Figure 28 shows the corresponding amino acid encoding region for the predicted protein product
  • CH1-9A11-2 gene product may have a fundamental function for cell metabolism
  • CH1-9A11-2 may represent another oncogene involved in the oncogenesis of diverse human cancers
  • CH1-9A11-2 3 to 10 fold. Most breast cancer cell lines expressed CH1-9A11-2 at levels comparable to that in cultured normal epithelial cells. Several cell lines, including 600PE, MDA-MB-453 as well as MDA-MB-134, expressed CH1-9A11-2 RNA several fold higher than cultured normal epithelial cells. However, MDA-MB-134 does not have CH1-9A11-2 gene amplification
  • the expression of CH1-9a11-2 was also evaluated by RNA in situ hybridization. Archival paraffin blocks of infiltrating breast cancer were obtained from 41 randomly selected patients from the University of California at San Francisco. Cell lines 600PE and MDA-MB-435 were used as controls. Confluent cultured cells were trypsinized, centrifuged and wrapped in a colloidin bag. The colloidin bag was fixed in 4% buffered formalin for 24 h and embedded routinely in paraffin wax. In situ hybridization was carried out by standard methods. Briefly, deparaffinized 4 μm thick tissue sections were treated with proteinase K and hybridized overnight
  • Figure 34 shows the overexpression of CH1-9a11-2 in primary breast cancer samples as determined by in situ hybridization
  • concentration of the digoxigenin-labeled CH1-9A11-2 antisense RNA probe was titrated to show a strong staining on 600PE cell line which overexpresses CH1-9A11-2 (Panel A) and a negative staining on MDA-MB-435 cell line which does not overexpress CH1-9A11-2 (Panel B)
  • Specificity of the antisense RNA was illustrated by the negative hybridizing signal with CH1-9A11-2 sense RNA probe on 600PE cell line (Panel C)
  • Nine of the ten reduction mammaplasties show negative stain as illustrated in Panel D and the remaining case showed weak staining in some ducts
  • CH1-9A11-2 RNA was detected in 18 of the 41 (44%) breast cancers (Panels E and F). Of these 41 breast carcinomas, 11 also had adjacent non-malignant breast epithelium. In 10 of these 11 (
  • FIG. 1 shows the Southern blot analysis for the corresponding gene in various DNA digests
  • Lane 1 (P12) is the control preparation of placental DNA, the rest show DNA obtained from human breast cancer cell lines
  • Panel A shows the pattern obtained using the 32P-labeled CH8-2a13-1 cDNA probe
  • Panel B shows the pattern obtained with the same blot using the 32P-labeled D2S6 probe as a loading control
  • the sizes of the restriction fragments are indicated on the right
  • Figure 3 shows the Northern blot analysis for RNA overabundance
  • Lanes 1-3 show the level of expression in cultured normal epithelial cells
  • Lanes 4-19 show the level of expression in human breast cancer cell lines
  • Panel A shows the pattern obtained using the CH8-2a13-1 probe
  • panel B shows the pattern obtained with beta-actin cDNA, a loading control
  • a third clone of about 600 bp (designated pCH8-600) overlapping on the 5' end (Figure 6) was obtained using CLONTECH Marathon™ cDNA Amplification Kit. Briefly, two DNA primers CH8a and CH8b (Figure 10) were synthesized. Polyadenylated RNA from breast cancer cell line BT474 was reverse transcribed using CH8b primer. After second strand synthesis, adaptor DNA provided in the kit was ligated to the double-stranded cDNA. The 5' end cDNA of CH8-2a13-1 was then amplified by PCR using primers CH8a and AP1 (provided in the kit). To increase the specificity of the PCR products, the first PCR products were PCR reamplified using nested primers CH8a and AP2 (provided in the kit). The PCR products were cloned into pCRII vector (Invitrogen) and screened with CH8-2a13-
  • GENINFO® BLAST search of nucleotide and peptide sequence databases was performed through the National Center for Biotechnology Information on March 26, 1996. The sequences were found to be about 99% identical at the nucleotide and amino acid level with bases 343-4103 of KIAA0196 protein (N. Nomura et al., in press; sequence submitted to the DDBJ/EMBL/GenBank databases on March 4, 1996)
  • the KIAA0196 was one of 200 different cDNA cloned at random from an immature male human myeloblast cell line. KIAA0196 has no known biological function, and is described by Nomura et al. as being ubiquitously expressed
  • a fourth clone of about 600 bp overlapping pCH8-600 at the 5' end has also been obtained.
  • a DNA primer was synthesized corresponding to about the first 20 nucleotides at the 5' of the predicted cDNA sequence, and used along with a primer based on the pCH8-600 sequence to reverse-transcribe RNA from breast cancer cell line BT474.
  • the product was cloned into pCRII vector (Invitrogen) and screened with a CH8-2a13-1 probe. The new clone is sequenced along both strands to obtain additional 5' untranslated sequence data for the cDNA.
  • the predicted compiled cDNA nucleotide sequence of CH8-2a13-1 cDNA is shown in Figure 13 (SEQ. ID NO:21).
  • the corresponding amino acid sequence of this frame is shown in Figure 14 (SEQ. ID NO:22).
  • a polynucleotide comprising the compiled sequence is assembled by joining the insert of this fourth clone to pCH8-4k within the shared region. Briefly, CH8-4k is cut with XbaI and NotI. The fourth clone is cut with BamHI and XbaI. The ligated polynucleotide is then inserted into pCRII cut with BamHI and NotI.
  • a CH8-2a13-1 cloned insert has been used to probe the level of relative expression in polyadenylated RNA from a panel of tissue sources obtained from CLONTECH, as in Example 4.
  • the relative CH8-2a13-1 expression observed at the mRNA level is shown in Table 7:
  • Relative levels of expression observed were as follows. Low levels of expression were observed in adult peripheral blood leukocytes (PBL), brain, placenta, lung, liver, skeletal muscle, kidney, and pancreas. Medium levels of expression were observed in adult heart, spleen, thymus, prostate, testis, ovary, small intestine, and colon. High levels of expression were observed in four fetal tissues tested: brain, lung, liver and kidney
  • the level of expression in breast cancer cell lines is relatively high (about ++++ on the scale), since the Northern analysis performed on these lines was conducted on total cellular RNA. It is likely that the CH8-2a13-1 gene is involved in a biological process that is typical to the tissue types showing medium to high levels of expression, which may relate to increased tissue growth or metabolism
  • FIG. 1 shows the Southern blot analysis for the corresponding gene in various DNA digests
  • Lanes 1 and 2 are control preparations of placental DNA, the rest show DNA obtained from human breast cancer cell lines
  • Panel A shows the pattern obtained using the CH13-2a12-1 cDNA probe
  • panel B shows the pattern using D2S6 probe as a loading control
  • the sizes of the restriction fragments are indicated on the right
  • Figure 5 shows the Northern blot analysis for RNA overabundance of the CH13-2a12-1 gene
  • Lanes 1-3 show the level of expression in cultured normal epithelial cells
  • Lanes 4-19 show the level of expression in human breast cancer cell lines
  • Panel A shows the pattern obtained using the CH13-2a12-1 probe
  • panel B shows the pattern obtained with beta-actin cDNA
  • the apparent size of the mRNA varied depending upon conditions of electrophoresis. Full-length mRNA is believed to occur at sizes of about 3.2 and 3.5 kb
  • The results of the RNA abundance comparison are summarized in Table 8
  • the scoring method is the same as for Example 4
  • a CH13-2a12-1 cloned insert has been used to probe the level of relative expression in polyadenylated RNA from a panel of tissue sources obtained from CLONTECH, as in Example 4.
  • the relative CH13-2a12-1 expression observed at the mRNA level is shown in Table 9:
  • the level of expression in breast cancer cell lines is relatively high (about ++++ on the scale), since the Northern analysis performed on these lines was conducted on total cellular RNA. It is likely that the CH13-2a12-1 gene is involved in a biological process that is typical to the tissue types showing medium to high levels of expression, which may relate to increased tissue growth or metabolism.
  • overlapping clones corresponding to the 5' end of CH13-2a12-1 cDNA were isolated using the CLONTECH Marathon™ cDNA Amplification Kit. Briefly, DNA primers CH1c and CH1d were prepared. Polyadenylated RNA from breast cancer cell line BT474 was reverse transcribed using the CH1c primer. After second strand synthesis, adaptor DNA provided by the kit was ligated to the double-stranded cDNAs. The 5' end cDNA of CH13-2a12-1 was then amplified by PCR using primers CH13I and AP1 (provided by the kit). To increase the specificity of the PCR products, the first PCR products were PCR amplified using nested primers CH13J and AP2. The PCR products were cloned into pCR2.1™ vector (Invitrogen) and screened by PCR using primer pairs CM3J/AP2 and Ch13J/Ch13H. Clone pCH13-J26 which were amplified by Ch
  • Figure 36: The splice variants are believed to represent partially processed forms of messenger RNA. It is also possible that these are mature RNA produced by variant splicing, and that the proteins they encode are also produced by the cell, perhaps at a lower amount.
  • CH13-2a12-1 was localized to chromosome 13q
  • CH13-2a12-1 was mapped to chromosome 13q34-qter by fluorescence in situ hybridization using a CH13-2a12-1 BAC clone isolated from a human BAC library (Research Genetics Inc.) according to the supplier's protocol. DNA was extracted and labeled with digoxigenin-11-dUTP by nick translation. Fluorescence in situ hybridization (FISH) was carried out in the presence of human Cot-1 DNA to suppress the background signal and hybridized to metaphase chromosomes overnight. The hybridization signal was detected by anti-digoxigenin conjugated with FITC. The chromosomes were counterstained with DAPI. The location of the probe was determined by digital image microscopy following FISH and localized by DAPI banding (Stokke, T. et al., Genomics, 26:134, 1995).
  • Figure 41 shows the relative amplification and overexpression of CH13-2a12-1 in breast cancer cell lines
  • slot analysis was performed, which allowed simultaneous analysis of many normal and cancer cell samples
  • a set of non-malignant (normal) samples were analyzed by Southern or Northern hybridization. Densities of the signals on the autoradiograms were obtained using a densitometer (Molecular Dynamics, Sunnyvale, CA). The density ratio between the CH13-2a12-1 gene and the reference gene was calculated for each sample.
  • Two steps were required to determine the cut-off point. First, the data for normal tissues were transformed so that they became normally distributed (i.e., followed a Gaussian distribution curve). Next, a table of tolerance bounds for a normal distribution was used to define cut points so that the confidence was 90% that no more than P% of the distribution would be above the cut-off point. The cut-off point was then transformed back to the original measurement unit. Southern analysis is shown in the left panel. Based on the defined cut-off point,
  • Figure 42 shows the overexpression of CH13-2a12-1 in primary breast cancer samples as determined by in situ hybridization
  • a negatively stained cell line (BT20, which does not overexpress CH13-2a12-1) and a positively stained cell line (600PE) are shown in Panels A and B respectively
  • CH13-2a12-1 belongs to a conserved family of genes, cullin (Kipreos, E. T. et al., Cell, 85:828, 1996). It is therefore proposed that this gene be designated Cul-4A, the human form of which is Hs-Cul-4A.
  • VACM-1: Vasopressin-Activated Calcium Mobilizing receptor 1
  • Hs-cul-5: The rabbit ortholog of cul-5, known as VACM-1 (Vasopressin-Activated Calcium Mobilizing receptor 1), mobilizes Ca2+ after arginine vasopressin induction, increasing the intracellular calcium concentration
  • Hs-cul-5: The highest expression level of Hs-cul-5 was found to be in the heart and the skeletal muscle, contractile tissues that require a high level of calcium influx
  • This disclosure shows that heart and skeletal muscle also expressed the highest level of Hs-cul-4A, suggesting it may also be involved in calcium mobilization
  • the yeast cul-1 gene, or Cdc53 (Kipreos et al., supra), is part of a protein complex that targets cell cycle proteins for degradation by the ubiquitin-proteasome pathway
  • the Caenorhabditis elegans cul-1 (Ce-cul-1) is required for cells to exit the cell cycle. Moreover, a null mutation of Ce-cul-1 causes hyperplasia in all tissues.
  • human cul-2 gene product has been shown to form a stable complex with the von Hippel-Lindau tumor suppressor gene product (Pause, A. et al.
  • the CH14-2a16-1 gene was further characterized by obtaining additional sequence information
  • a λ-GT10 cDNA library from the breast cancer cell line BT474 (Example 2) was screened using the initial cDNA insert, and two clones were identified: one with a 1.6 kb insert, and the other with a 2.5 kb insert. The identified clones were subcloned into plasmid vector pCRII.
  • the 1.6 kb insert was sequenced by using T7 and Sp6 primers for regions flanking the cDNA inserts as initial sequencing primers. Sequencing continued by walking along the region of interest by standard techniques, using sequencing primers based on data already obtained. Primers used are those designated 1-11 in Figure 18.
  • a third clone (designated pCH14-800) overlapping on the 5' end (Figure 6) was obtained using
  • DNA primers CH14a, CH14b, CH14c and CH14d were prepared. Polyadenylated RNA from breast cancer cell line MDA453 was reverse transcribed using 14b primer. After second strand synthesis, adaptor DNA provided in the kit was ligated to the double-stranded cDNA. The 5' end cDNA of CH14-2a16-1 was then amplified by PCR using primers CH14b (or CH14c) and AP1 (provided in the kit). To increase the specificity of the
  • the first PCR products were PCR reamplified using nested primers CH14a (or CH14d) and AP2 (provided in the kit)
  • the PCR products were cloned into pCRII vector (Invitrogen) and screened with CH14-2a16-1 probe
  • NAB2 is one of the major proteins associated with nuclear polyadenylated RNA in yeast cells, as detected by UV light-induced cross-linking and immunofluorescence. NAB2 is strongly and specifically associated with nuclear poly(A)+ RNA in vivo. Gene knock-out experiments have shown that this protein is essential to yeast cell survival (Anderson et al.). Accordingly, the protein encoded by CH14-2a16-1 is suspected of having DNA or RNA binding activity.
  • a fourth clone (pCH14-1.3) has been obtained that overlaps the pCH14-800 clone at the 5' end (Figure 6)
  • the method of isolation was similar to that for pCH14-800, using primers based on the pCH14-800 sequence
  • Partial sequence data for pCH14-1.3 has been obtained by one-directional sequencing from the 5' and 3' ends of the pCH14-1.3 clone
  • Figure 21 shows the nucleotide sequence of the 5' end (SEQ ID NO: 29) and the amino acid translation of the likely open reading frame (SEQ ID NO: 30), the nucleotide sequence of the 3' end (SEQ ID NO: 31), and the likely open reading frame (SEQ ID NO: 32). This data is confirmed and additional sequence between SEQ ID NOS: 29 and 31 is obtained by fully sequencing both strands of pCH14-1.3.
  • the sequence data from pCH14-1.3, pCH14-800 and pCH14-1.6 may be shorter than the apparent
  • Figure 25 is a listing of additional cDNA sequence obtained for CH14-2a16-1, comprising approximately 1934 base pairs 5' from the sequence of Figure 19. The corresponding amino acid translation is shown in the upper panel of Figure 26.
  • the additional sequence data was obtained by rescuing and amplifying further fragments of CH14-2a16-1 cDNA. Nested primers were designed ~100 base pairs downstream from the 5' end of the known sequence. The primers were used in a nested amplification assay using AP1 and AP2, using the CLONTECH Marathon™ cDNA Amplification Kit as described above.
  • the template was a Marathon™ ready cDNA preparation from human testes, also supplied by CLONTECH
  • the nucleotide sequence shown in Figure 25 is closed at the 5' end
  • the lower panel of Figure 26 shows what is predicted to be the sequence of the gene product, beginning at the first methionine residue
  • the nucleotide sequence shown contains a point difference at the position indicated by the underlining in Figure 25
  • a base determined to be A from the previously obtained polynucleotide fragment was a G in the one used in this part of the experiment. This corresponds to a change from E (glutamic acid) to G (glycine) in the protein sequence, at the position underlined in Figure 26. This may represent a natural allelic variation.
  • a CH14-2a16-1 cloned insert has been used to probe the level of relative expression in polyadenylated RNA from a panel of tissue sources obtained from CLONTECH, as in Example 4
  • the relative CH14-2a16-1 expression observed at the mRNA level is shown in Table 11
  • CH14-2a16-1 mRNA was particularly high in testis.
  • the level of expression in breast cancer cell lines is also quite high, since the Northern analysis performed on these lines was conducted on total cellular RNA. It is likely that the CH14-2a16-1 gene is involved in a biological process that is typical to the tissue types showing medium to high levels of expression, which may relate to increased tissue growth or metabolism.
  • Zinc finger motifs: Five motifs corresponding to a zinc finger protein have been found in the CH14-2a16-1 nucleotide sequence. Further zinc finger motifs may be present in CH14-2a16-1 in the upstream direction. Zinc finger motifs are present, for example, in RNA polymerases I, II, and III from S. cerevisiae, and are related to the zinc knuckle family of RNA/ssDNA-binding proteins found in the HIV nucleocapsid protein. The actual sequence observed in each of the five zinc finger motifs of CH14-2a16-1 is:
  • the CH14-2a16-1 gene product is suspected of having DNA or RNA binding activity, and may be specific for polyadenylated RNA. It may very well play a role in the regulation of gene replication, transcription, the processing of hnRNA into mature mRNA, the export of mRNA from the nucleus to the cytoplasm, or translation into protein. This role in turn may be closely implicated in cell growth or proliferation, particularly as manifest in tumor cells.
  • cDNA fragments corresponding to additional cancer-associated genes are obtained by applying the techniques of Examples 1 & 2 with appropriate adaptations.
  • cancer cells are selected for use in differential display of RNA, based on whether they share a duplicated chromosomal region according to Table 12:
  • Control RNA is prepared from normal tissues to match that of the cancer cells in the experiment. Normal tissue is obtained from autopsy, biopsy, or surgical resection. Absence of neoplastic cells in the control tissue is confirmed, if necessary, by standard histological techniques. cDNA corresponding to RNA that is overabundant in cancer cells and duplicated in a proportion of the same cells is characterized further, as in Examples 3-7. Additional cDNA comprising an entire protein-product encoding region is rescued or selected according to standard molecular biology techniques as described elsewhere in this disclosure.
  • SEQ ID NO: 11 CAATCGCCGT 10
  • SEQ ID NO: 12 TCGGCGATAG 10
  • SEQ ID NO: 13 CAGCACCCAC 10
  • SEQ ID NO: 14 AGCCAGCGAA 10
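
The cut-off procedure described above for CH13-2a12-1 (transform the normal-tissue density ratios so they are roughly Gaussian, take an upper tolerance bound, then transform back) can be sketched as follows. This is only an illustration under stated assumptions, not the procedure actually used: the log transform, the coverage value standing in for the unspecified P, the closed-form Natrella approximation used in place of a printed tolerance table, and every sample value are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def tolerance_factor(n, coverage, confidence):
    """Approximate one-sided normal tolerance factor k (Natrella approximation).

    With the stated confidence, at least `coverage` of a normal population
    lies below mean + k * sd, estimated from n observations.
    """
    z_p = NormalDist().inv_cdf(coverage)
    z_g = NormalDist().inv_cdf(confidence)
    a = 1.0 - z_g ** 2 / (2.0 * (n - 1))
    b = z_p ** 2 - z_g ** 2 / n
    return (z_p + math.sqrt(z_p ** 2 - a * b)) / a


def upper_cutoff(normal_ratios, coverage=0.95, confidence=0.90):
    """Cut-off in the original units of the density ratio.

    Assumption: a natural-log transform makes the ratios roughly Gaussian.
    """
    logs = [math.log(r) for r in normal_ratios]
    k = tolerance_factor(len(logs), coverage, confidence)
    return math.exp(mean(logs) + k * stdev(logs))  # transform back


# Hypothetical gene/reference density ratios from normal samples.
normal_ratios = [0.8, 1.1, 0.9, 1.3, 1.0, 0.7, 1.2, 1.05, 0.95, 1.15]
cutoff = upper_cutoff(normal_ratios)
print(f"samples with a ratio above {cutoff:.2f} would be scored as elevated")
```

A sample from a cancer cell line would then be scored as amplified or overexpressed when its own density ratio exceeds the returned value.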

Abstract

New methods are disclosed for detecting cancer associated genes, and obtaining corresponding cDNA sequences. The methods involve displaying cDNA from control cells, and from a plurality of different cancer cells that share a duplicated or deleted gene in the same region of a chromosome. Four novel genes associated with cancer have been identified. In at least about 60% of the breast cancer cell lines tested, RNA hybridizing with the cDNAs was substantially more abundant than in normal cells. Many of the cell lines also showed a duplication of the corresponding gene, which probably contributed to the increased level of RNA in the cell. However, for each of the four genes, there were some cell lines which had RNA overabundance without gene duplication. This suggests that the gene product is sufficiently important to the cancer process that cells will use several alternative mechanisms to achieve increased expression. One of the protein products is predicted to be a membrane protein; another is predicted to have DNA or RNA binding activity. Several of the genes share conserved regions with distantly related organisms, suggesting a fundamental role in cellular metabolism or proliferation.

Description

GENES AMPLIFIED IN CANCER CELLS
STATEMENT OF RIGHTS TO INVENTIONS MADE UNDER FEDERALLY SPONSORED RESEARCH
This invention was made in part during work supported by a grant from the National Cancer Institute (P01-CA44768). The Government has certain rights in the invention.
REFERENCE TO RELATED APPLICATIONS
This application is a continuation-in-part of International Application PCT/US97/05930 designating the U.S., filed April 9, 1997, currently pending under PCT Chapter II proceedings. PCT/US97/05930 is a continuation-in-part and claims the priority benefit of the following applications: US 60/015,167, filed April 9, 1996, now abandoned; PCT/US96/09286 designating the U.S., filed April 9, 1996, now abandoned; US 60/019,202, filed June 6, 1996, now abandoned; and US 08/678,280, filed July 10, 1996, issued as U.S. Patent No. 5,776,683 on July 7, 1998. PCT/US96/09286 is a continuation-in-part and claims the priority benefit of US 08/463,660, filed June 5, 1995, issued as U.S. Patent No. 5,759,776 on June 2, 1998. The present application claims the priority benefit of all aforelisted applications, which are hereby incorporated herein by reference in their entirety.
TECHNICAL FIELD
The present invention relates generally to the field of human genetics. More specifically, it relates to the identification and characterization of novel genes associated with overabundance of RNA in human cancer such as breast cancer. It pertains especially to those genes and the products thereof which may be important in diagnosis and treatment.
BACKGROUND OF THE INVENTION
Cancer is a heterogeneous disease. It manifests itself in a wide variety of tissue sites, with different degrees of de-differentiation, invasiveness, and aggressiveness. Some forms of cancer are responsive to traditional modes of therapy, but many are not. For most common cancers, there is a pressing need to improve the arsenal of therapies available to provide more precise and more effective treatment in a less invasive way. As an example, breast cancer has an unsatisfactory morbidity and mortality, despite presently available forms of medical intervention. Traditional clinical initiatives are focused on early diagnosis, followed by surgery and chemotherapy. Such interventions are of limited success, particularly in patients where the tumor has undergone metastasis. The heterogeneous nature of cancer arises because different cancer cells achieve their growth and pathological properties by different phenotypic alterations. Alteration of gene expression is intimately related to the uncontrolled growth and de-differentiation that are hallmarks of cancer. Certain similar phenotypic alterations in turn may have a different genetic base in different tumors. Yet, the number of genes central to the malignant process must be a finite one. Accordingly, new pharmaceuticals that are tailored to specific genetic alterations in an individual tumor may be more effective.
There are two types of altered gene expression that take place, together or independently, in different cancer cells (reviewed by Bishop). The first type is the decreased expression of recessive genes, known as tumor suppressor genes, that apparently act to prevent malignant growth. The second type is the increased expression of dominant genes, such as oncogenes, that act to promote malignant growth, or to provide some other phenotype critical for malignancy. Thus, alteration in the expression of either type of gene is a potential diagnostic indicator. Furthermore, a treatment strategy might seek to reinstate the expression of suppressor genes, or reduce the expression of dominant genes. The present invention is directed to identifying genes of either type, particularly those of the second type.
The most frequently studied mechanism for gene overexpression in cancer cells is sometimes referred to as amplification. This is a process whereby the gene is duplicated within the chromosomes of the ancestral cell into multiple copies. The process involves unscheduled replications of the region of the chromosome comprising the gene, followed by recombination of the replicated segments back into the chromosome (Alitalo et al.). As a result, 50 or more copies of the gene may be produced. The duplicated region is sometimes referred to as an "amplicon". The level of expression of the gene (that is, the amount of messenger RNA produced) escalates in the transformed cell in the same proportion as the number of copies of the gene that are made (Alitalo et al.). Several human oncogenes have been described, some of which are duplicated, for example, in a significant proportion of breast tumors. A prototype is the erbB2 gene (also known as HER-2/neu), which encodes a 185 kDa membrane growth factor receptor homologous to the epidermal growth factor receptor. erbB2 is duplicated in 61 of 283 tumors (22%) tested in a recent survey (Adnane et al.). Other oncogenes duplicated in breast cancer are the bek gene, duplicated in 34 out of 286 (12%); the flg gene, duplicated in 37 out of 297 (12%); and the myc gene, duplicated in 43 out of 275 (16%) (Adnane et al.). Work with other oncogenes, particularly those described for neuroblastoma, suggested that gene duplication of the proto-oncogene was an event involved in the more malignant forms of cancer, and could act as a predictor of clinical outcome (reviewed by Schwab et al. and Alitalo et al.). In breast cancer, duplication of the erbB2 gene has been reported as correlating both with recurrence of the disease and decreased survival times (Slamon et al.). There is some evidence that erbB2 helps identify tumors that are responsive to adjuvant chemotherapy with cyclophosphamide, doxorubicin, and fluorouracil (Muss et al.).
It is clear that only a proportion of the genes that can undergo gene duplication in cancer have been identified. First, chromosome abnormalities, such as double minute (DM) chromosomes and homogeneously stained regions (HSRs), are abundant in cancer cells. HSRs are chromosomal regions that appear in karyotype analysis with intermediate density Giemsa staining throughout their length, rather than with the normal pattern of alternating dark and light bands. They correspond to multiple gene repeats. HSRs are particularly abundant in breast cancers, showing up in 60-65% of tumors surveyed (Dutrillaux et al., Zafrani et al.). When such regions are checked by in situ hybridization with probes for any of 16 known human oncogenes, including erbB2 and myc, only a proportion of tumors show any hybridization to HSR regions. Furthermore, only a proportion of the HSRs within each karyotype are implicated.
Second, comparative genomic hybridization (CGH) has revealed the presence of copy number increases in tumors, even in chromosomal regions outside of HSRs. CGH is a new method in which whole chromosome spreads are stained simultaneously with DNA fragments from normal cells and from cancer cells, using two different fluorochromes. The images are computer-processed for the fluorescence ratio, revealing chromosomal regions that have undergone amplification or deletion in the cancer cells (Kallioniemi et al. 1992). This method was recently applied to 15 breast cancer cell lines (Kallioniemi et al. 1994). DNA sequence copy number increases were detected in all 23 chromosome pairs.
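The fluorescence-ratio step of CGH described above can be illustrated with a short sketch. It is a simplified model rather than the published image-processing method: the per-bin intensity lists, the median normalization, and the 1.25 and 0.75 thresholds are all assumptions chosen for illustration.

```python
from statistics import median


def classify_cgh_bins(tumor, normal, gain=1.25, loss=0.75):
    """Label each chromosomal bin from the tumor/normal fluorescence ratio.

    tumor, normal: fluorescence intensities per bin along a chromosome,
    one fluorochrome each (hypothetical inputs).
    gain, loss: illustrative thresholds on the normalized ratio.
    """
    ratios = [t / n for t, n in zip(tumor, normal)]
    baseline = median(ratios)  # most of the genome is assumed unchanged
    calls = []
    for r in ratios:
        norm = r / baseline
        if norm >= gain:
            calls.append("gain")
        elif norm <= loss:
            calls.append("loss")
        else:
            calls.append("balanced")
    return calls


# Hypothetical intensities for five bins.
tumor = [100.0, 105.0, 300.0, 40.0, 98.0]
normal = [100.0, 100.0, 100.0, 100.0, 100.0]
print(classify_cgh_bins(tumor, normal))
```

Normalizing to the median ratio keeps a single strongly amplified region from shifting the baseline for the rest of the chromosome.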
Cloning the genes that undergo duplication in cancer is a formidable challenge. In one approach, human oncogenes have been identified by hybridizing with probes for other known growth-promoting genes, particularly known oncogenes in other species. For example, the erbB2 gene was identified using a probe from a chemically induced rat neuroglioblastoma (Slamon et al.). Genes with novel sequences and functions will evade this type of search. In another approach, genes may be cloned from an area identified as containing a duplicated region by the CGH method. Since CGH is able to indicate only the approximate chromosomal region of duplicated genes, an extensive amount of experimentation is required to walk through the entire region and identify the particular gene involved. Genes may also be overexpressed in cancer without being duplicated. Methods that rely on identification from genetic abnormalities necessarily bypass such genes. Increased expression can come about through a higher level of transcription of the gene, for example, by up-regulation of the promoter or substitution with an alternative promoter. It can also occur if the transcription product is able to persist longer in the cell, for example, by increasing the resistance to cytoplasmic RNase or by reducing the level of such cytoplasmic enzymes. Two examples are the epidermal growth factor receptor, overexpressed in 45% of breast cancer tumors (Klijn et al.), and the IGF-1 receptor, overexpressed in 50-93% of breast cancer tumors (Berns et al.). In almost all cases, the overexpression of each of these receptors is by a mechanism other than gene duplication.
One way of examining overexpression at the messenger RNA level is by subtractive hybridization. It involves producing positive and negative cDNA strands from two RNA preparations, and looking for cDNA which is not completely hybridized by the opposing preparation. This is a laborious procedure which has distinct limitations in cancer research. In particular, since each subtraction involves cDNA from only two cell populations at a time, it is sensitive to individual phenotypic differences due not just to the presence of cancer, but also through natural metabolic variations.
Another way of examining overexpression at the messenger RNA level is by differential display (Liang et al. 1992a). In this technique, cDNA is prepared from only a subpopulation of each RNA preparation, and expanded via the polymerase chain reaction using primers of particular specificity. Similar subpopulations are compared across several RNA preparations by gel autoradiography for expression differences. In order to survey the RNA preparations entirely, the assay is repeated with a comprehensive set of PCR primers. The screening strategy more effectively includes multiple positive and negative control samples (Sunday et al.). The method has recently been applied to breast cancer cell lines, and highlights a number of expression differences (Liang et al. 1992b, Chen et al., McKenzie et al., Watson et al. 1994 & 1996, Kocher et al.). By excising the corresponding region of the separating gel, it is possible to recover and sequence the cDNA. Despite the advancement provided by differential display, problems remain in terms of applying it in the search for new cancer genes. First, because this is a test for RNA levels, any phenotypic difference between cell lines constitutes part of the recovered set, leading to a large proportion of "false positive" identifications. It has been found that cDNA for mitochondrial genes constitutes a large proportion of the differentially expressed bands, and it consumes substantial resources to recover the sample and obtain a partial sequence in order to eliminate them. Second, false positive identifications are made for reasons attributed to multiple cDNA species and competition for the PCR primers by RNA species of different abundance (Debouck). Third, differential display highlights high copy number mRNAs and shorter mRNAs (Bertioli et al., Yeatman et al.), and may therefore miss critical cancer-associated transcripts when used as a survey technique. Fourth, a number of adjustments are made to gene expression levels when a cell undergoes malignant transformation or is cultured in vitro. Most of these adjustments are secondary, and not part of the transformation process. Thus, even when a novel sequence is obtained from the differential display, it is far from certain that the corresponding gene is at the root of the disease process.
An early step in developing gene-specific therapeutic approaches is the identification of genes that are more central to malignant transformation or the persistence of the malignant phenotype.
SUMMARY OF THE INVENTION
It is an objective of this invention to provide a method for identifying and characterizing genes and gene products which are duplicated or associated with overabundant RNA in cancer cells. The method can be used for any type of cancer, providing a plurality of cell populations or cell lines of the type of cancer are available, in conjunction with a suitable control cell population. The method is highly effective in identifying genes and gene products that are intimately related to malignant transformation or maintenance of the malignant properties of the cancer cells. An important derivative of applying the method is the selection and retrieval of cDNA and cDNA fragments corresponding to the cancer-associated gene. These fragments can be used inter alia to determine the nucleotide sequence of the gene and mRNA, the amino acid sequence of any encoded protein, or to retrieve from a cDNA or genomic library additional polynucleotides related to the gene or its transcripts. Since the genes are typically involved in the malignant process of the cell, the polynucleotides, polypeptides, and antibodies derived by using this method can in turn be used to design or screen important diagnostic reagents and therapeutic compounds.
Another objective of this invention is to provide isolated polynucleotides, polypeptides, and antibodies derived from four novel genes which are associated with several different types of cancer, including breast cancer. The genes are designated CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1. These designations refer to both strands of the cDNA and fragments thereof, and to the respective corresponding messenger RNA, including splice variants, allelic variants, and fragments of any of these forms. These genes show RNA overabundance in a majority of cancer cell lines tested. A majority of the cells showing RNA overabundance also have duplication of the corresponding gene. Another object of this invention is to provide materials and methods based on these polynucleotides, polypeptides, and antibodies for use in the diagnosis and treatment of cancer, particularly breast cancer.
Accordingly, one embodiment of this invention is an isolated polynucleotide comprising a linear sequence contained in a polynucleotide selected from the group consisting of CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1. The linear sequence is contained in a duplicated gene or overabundant RNA in cancerous cells. The RNA may be overabundant due to gene duplication, increased RNA transcription or processing, increased RNA persistence, any combination thereof, or by any other mechanism, in a proportion of breast cancer cells. Preferably, the RNA is overabundant in at least about 20% of a representative panel of breast cancer cell lines such as the panels listed herein; more preferably, it is overabundant in at least about 40% of the panel; even more preferably, it is overabundant in at least 60% or more of the panel. Preferably, the RNA is overabundant in at least about 5% of spontaneously occurring breast cancer tumors; more preferably, it is overabundant in at least about 10% of such tumors; more preferably, it is overabundant in at least about 20% of such tumors; more preferably, it is overabundant in at least about 30% of such tumors; even more preferably, it is overabundant in at least about 50% of such tumors.
Preferably, a sequence of at least 10 nucleotides is essentially identical between the isolated polynucleotide of the invention and a cDNA from CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1; more preferably, a sequence of at least about 15 nucleotides is essentially identical; more preferably, a sequence of at least about 20 nucleotides is essentially identical; more preferably, a sequence of at least about 30 nucleotides is essentially identical; more preferably, a sequence of at least about 40 nucleotides is essentially identical; even more preferably, a sequence of at least about 70 nucleotides is essentially identical; still more preferably, a sequence of about 100 nucleotides or more is essentially identical. A further embodiment of this invention is an isolated polynucleotide comprising a linear sequence essentially identical to a sequence selected from the group consisting of SEQ ID NO: 15, SEQ ID NO: 18, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 31, SEQ ID NO: 33, and SEQ ID NO: 35. These embodiments include an isolated polynucleotide which is a DNA polynucleotide, an RNA polynucleotide, a polynucleotide probe, or a polynucleotide primer.
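The length thresholds above (10, 15, 20 nucleotides and so on) can be pictured as a check on the longest contiguous run shared by two sequences. The sketch below is a deliberate simplification: it treats "essentially identical" as exact identity, compares one strand only (no reverse complement), and uses made-up sequences; none of these choices comes from the disclosure.

```python
def longest_shared_run(a, b):
    """Length of the longest contiguous run of bases common to a and b.

    Standard longest-common-substring dynamic programme, keeping only
    the previous row to limit memory use.
    """
    a, b = a.upper(), b.upper()
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def meets_threshold(candidate, reference, min_len=10):
    """True if the two sequences share an exact run of at least min_len bases."""
    return longest_shared_run(candidate, reference) >= min_len


# Hypothetical sequences, for illustration only.
print(longest_shared_run("AAACCCGGGTTT", "TTCCCGGGAA"))
print(meets_threshold("ACGTACGTACGT", "ACGTACGTACGT", min_len=10))
```

A practical comparison would normally use an alignment tool that tolerates mismatches and checks both strands; the exact-match version is shown only because it makes the length criterion easy to see.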
This invention also provides an isolated polypeptide comprising a sequence of amino acids essentially identical to the polypeptide encoded by or translated from a polynucleotide selected from the group consisting of CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1. Preferably, a sequence of at least about 5 amino acids is essentially identical between the polypeptide of this invention and that encoded by the polynucleotide; more preferably, a sequence of at least about 10 amino acids is essentially identical; more preferably, a sequence of at least 15 amino acids is essentially identical; even more preferably, a sequence of at least 20 amino acids is essentially identical; still more preferably, a sequence of about 30 amino acids or more is essentially identical. Preferably, the polypeptide comprises a linear sequence of at least 15 amino acids essentially identical to a sequence encoded by said polynucleotide. Another embodiment of this invention is a polypeptide comprising a linear sequence essentially identical to a sequence selected from the group consisting of SEQ ID NO: 17, SEQ ID NO: 20, SEQ ID NO: 25, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 32, SEQ ID NO: 34, and SEQ ID NO: 37.
A further embodiment of this invention is an antibody specific for a polypeptide embodied in this invention. This encompasses both monoclonal and isolated polyclonal antibodies.
A further embodiment of this invention is a method of using the polynucleotides of this invention for detecting or measuring gene duplication in cancerous cells, especially but not limited to breast cancer cells, comprising the steps of reacting DNA contained in a clinical sample with a reagent comprising the polynucleotide, said clinical sample having been obtained from an individual suspected of having cancerous cells, and comparing the amount of complexes formed between the reagent and the DNA in the clinical sample with the amount of complexes formed between the reagent and DNA in a control sample.
A further embodiment is a method of using the polynucleotides of this invention for detecting or measuring overabundance of RNA in cancerous cells, especially but not limited to breast cancer cells, comprising the steps of reacting RNA contained in a clinical sample with a reagent comprising the polynucleotide, said clinical sample having been obtained from an individual suspected of having cancerous cells, and comparing the amount of complexes formed between the reagent and the RNA in the clinical sample with the amount of complexes formed between the reagent and RNA in a control sample.
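The comparison step in the two methods above (signal in the clinical sample versus signal in a control sample) can be sketched numerically. This assumes each hybridization signal is a densitometer reading that is first divided by a reference or loading-control signal from the same lane, as in the density ratios described elsewhere in this disclosure; the function names and every number below are hypothetical.

```python
def normalized_signal(target_density, reference_density):
    """Target signal corrected for loading, using a reference probe."""
    if reference_density <= 0:
        raise ValueError("reference density must be positive")
    return target_density / reference_density


def fold_over_control(sample, control):
    """Ratio of normalized signal in a clinical sample to that in a control.

    sample, control: (target_density, reference_density) pairs.
    """
    return normalized_signal(*sample) / normalized_signal(*control)


# Hypothetical densitometer readings (arbitrary units).
clinical = (840.0, 200.0)   # target probe, reference probe
control = (210.0, 210.0)
print(f"fold over control: {fold_over_control(clinical, control):.1f}")
```

The fold value would then be judged against a cut-off established from normal samples; no particular threshold is implied here.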
Another embodiment of this invention is a diagnostic kit for detecting or measuring gene duplication or RNA overabundance in cells contained in an individual as manifest in a clinical sample, comprising a reagent and a buffer in suitable packaging, wherein the reagent comprises a polynucleotide of this invention
Another embodiment of this invention is a method of using a polypeptide of this invention for detecting or measuring specific antibodies in a clinical sample, comprising the steps of reacting antibodies contained in the clinical sample with a reagent comprising the polypeptide, said clinical sample having been obtained from an individual suspected of having cancerous cells, especially but not limited to breast cancer cells, and comparing the amount of complexes formed between the reagent and the antibodies in the clinical sample with the amount of complexes formed between the reagent and antibodies in a control sample
Another embodiment of this invention is a method of using an antibody of this invention for detecting or measuring altered protein expression in a clinical sample, comprising the steps of reacting a polypeptide contained in the clinical sample with a reagent comprising the antibody, said clinical sample having been obtained from an individual suspected of having cancerous cells, especially but not limited to breast cancer cells, and comparing the amount of complexes formed between the reagent and the polypeptide in the clinical sample with the amount of complexes formed between the reagent and a polypeptide in a control sample Further embodiments of this invention are diagnostic kits for detecting or measuring a polypeptide or antibody present in a clinical sample, comprising a reagent and a buffer in suitable packaging, wherein the reagent respectively comprises either an antibody or a polypeptide of this invention
Yet another embodiment of this invention is a host cell transfected by a polynucleotide of this invention A further embodiment of this invention is a method for using a polynucleotide for screening a pharmaceutical candidate, comprising the steps of separating progeny of the transfected host cell into a first group and a second group, treating the first group of cells with the pharmaceutical candidate, not treating the second group of cells with the pharmaceutical candidate, and comparing the phenotype of the treated cells with that of the untreated cells
This invention also embodies a pharmaceutical preparation for use in cancer therapy, comprising a polynucleotide or polypeptide embodied by this invention, said preparation being capable of reducing the pathology of cancerous cells, especially for but not limited to breast cancer cells
Further embodiments of this invention are methods for treating an individual bearing cancerous cells, such as breast cancer cells, comprising administering any of the aforementioned pharmaceutical preparations
Still another embodiment of this invention is a pharmaceutical preparation or active vaccine comprising a polypeptide embodied by this invention in an immunogenic form and a pharmaceutically compatible excipient A further embodiment is a method for treatment of cancer, especially but not limited to breast cancer, either prophylactically or after cancerous cells are present in an individual being treated, comprising administration of the aforementioned pharmaceutical preparation
Another series of embodiments of this invention relates to methods for obtaining cDNA corresponding to a gene associated with cancer, comprising the steps of: a) supplying an RNA preparation from uncultured control cells; b) supplying RNA preparations from at least two different cancer cells; c) displaying cDNA corresponding to the RNA preparations of step a) and step b) such that different cDNA corresponding to different RNA in each preparation are displayed separately; d) selecting cDNA corresponding to RNA that is present in greater abundance in the cancer cells of step b) relative to the control cells of step a); e) supplying a digested DNA preparation from control cells; f) supplying digested DNA preparations from at least two different cancer cells; g) hybridizing the cDNA of step d) with the digested DNA preparations of step e) and step f); and h) further selecting cDNA from the cDNA of step d) corresponding to genes that are duplicated in the cancer cells of step f) relative to the control cells of step e). One or more enhancements may optionally be included in the methods of this invention, including the following:
1. Cancer cells are preferably used for step b) that share a duplicated gene in the same region of a chromosome. If desired, the practitioner may test cancer cells beforehand to detect the duplication or deletion of chromosome regions, or cancer cell lines may be used that have already been characterized in this respect.
2. A higher plurality of cancer cells are preferably used to provide DNA for step b), step f), or preferably both step b) and step f). The use of three cancer cells is preferred over two; the use of four cancer cells is more preferred; about five cancer cells is still more preferred; about eight cancer cells is even more preferred. The cDNA of each cancer cell population is displayed or hybridized separately, in accordance with the method.
3. A higher plurality of control cells are preferably used to provide DNA for step a), step e), or preferably both step a) and step e). The use of two control cell populations is preferred; the use of three or more is even more preferred. Both proliferating and non-proliferating populations are preferably used, if available.
4. The control cells are preferably supplied fresh from a tissue source, and are not cultured or transformed into a cell line. This is increasingly important when the number of control cell populations used in step a) is only one or two. Freshly obtained cancer cells may also be used as an alternative to cancer cell lines, although this is less critical.
5. An additional screening step is preferably conducted in which the cDNA corresponding to the putative cancer-associated gene is additionally hybridized with a digested mitochondrial DNA preparation, to eliminate mitochondrial genes. This screening step may be conducted before, between, subsequent to, or simultaneously with the other screening steps of the method.
6. An additional screening step is preferably conducted in which RNA is supplied from a plurality of cancer cells, and one or preferably more control cell populations; the RNA is contacted with cDNA corresponding to the putative cancer-associated gene under conditions that permit formation of a stable duplex; and cDNA is selected corresponding to RNA that is present in greater abundance in a proportion of the cancer cells relative to the control cells. Preferably, the plurality of cancer cells is a panel of at least five, preferably at least ten cells. Preferably at least three, more preferably at least five of the cancer cells show greater abundance of RNA. Preferably at least one and preferably more of the cancer cells show a greater abundance of RNA compared with control cells, but do not show duplication of the corresponding gene in step h) of the method.
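The two-stage selection of steps a) through h) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed procedure itself: the signal tables, the cDNA and cell names, the single control population, the 2-fold cutoff, and the function names are all hypothetical, and real differential display and blot data would call for more careful quantitation.

```python
def greater_in(signals, control, cancers, fold):
    """Cancer cells whose signal is at least `fold` times the control signal."""
    return [c for c in cancers if signals[c] >= fold * signals[control]]


def select_cdna(rna, dna, control, cancers, fold=2.0, min_cells=2):
    """Keep cDNA whose RNA is overabundant (step d) and whose gene appears
    duplicated (step h) in at least `min_cells` cancer cells.

    rna, dna: dict mapping cDNA name -> dict mapping cell name -> signal.
    The fold cutoff and min_cells default are assumptions for illustration.
    """
    selected = []
    for cdna, rna_signals in rna.items():
        if len(greater_in(rna_signals, control, cancers, fold)) < min_cells:
            continue  # fails the RNA abundance selection of step d)
        if len(greater_in(dna[cdna], control, cancers, fold)) >= min_cells:
            selected.append(cdna)  # passes the DNA selection of step h)
    return selected


# Hypothetical signals in arbitrary units.
rna = {
    "cDNA-A": {"control": 1.0, "line1": 4.0, "line2": 3.0, "line3": 1.0},
    "cDNA-B": {"control": 1.0, "line1": 5.0, "line2": 2.5, "line3": 2.0},
    "cDNA-C": {"control": 1.0, "line1": 1.2, "line2": 0.8, "line3": 3.0},
}
dna = {
    "cDNA-A": {"control": 1.0, "line1": 3.0, "line2": 2.0, "line3": 1.0},
    "cDNA-B": {"control": 1.0, "line1": 1.0, "line2": 1.1, "line3": 2.2},
    "cDNA-C": {"control": 1.0, "line1": 4.0, "line2": 4.0, "line3": 4.0},
}
print(select_cdna(rna, dna, "control", ["line1", "line2", "line3"]))
```

In this made-up data set only cDNA-A passes both selections; cDNA-B shows RNA overabundance without matching duplication, and cDNA-C shows duplication without RNA overabundance in enough cells.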
Other embodiments of the invention are methods for obtaining cDNA corresponding to a gene that is deleted or underexpressed in cancer, comprising the steps of: a) supplying an RNA preparation from control cells; b) supplying RNA preparations from at least two different cancer cells that share a deleted gene in the same region of a chromosome; c) displaying cDNA corresponding to the RNA preparations of step a) and step b) such that different cDNA corresponding to different RNA in each preparation are displayed separately; and d) selecting cDNA corresponding to RNA that is present in lower abundance in the cancer cells of step b) relative to the control cells of step a). Such methods typically comprise the following further steps: e) supplying a digested DNA preparation from control cells; f) supplying digested DNA preparations from at least two different cancer cells; g) hybridizing the cDNA of step d) with the digested DNA preparations of step e) and step f); and h) further selecting cDNA from the cDNA of step d) corresponding to a gene that is deleted in the cancer cells of step f) relative to the control cells of step e). Such methods for identifying deleted or underexpressed genes may also comprise enhancements such as those described above.
Additional embodiments of this invention are methods for characterizing cancer genes, comprising obtaining cDNA corresponding to a cancer-associated gene according to a method of this invention, particularly those highlighted above, and then sequencing the cDNA. Alternatively or in addition, the cDNA may be used to rescue additional polynucleotides corresponding to a cancer-associated gene from an mRNA preparation, or a cDNA or genomic DNA library.
Additional embodiments of this invention are methods for screening candidate drugs for cancer treatment, comprising obtaining cDNA corresponding to a gene that is duplicated, overexpressed, deleted, or underexpressed in cancer, and comparing the effect of the candidate drug on a cell genetically altered with the cDNA or fragment thereof with the effect on a cell not genetically altered.
Various embodiments of this invention may be employed in pursuit of any form of cancer for which suitable tissue sources are available. Cancers of particular interest include lung cancer, glioblastoma, pancreatic cancer, colon cancer, prostate cancer, hepatoma, myeloma, and breast cancer.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a half-tone reproduction of an autoradiogram of a differential display experiment, in which radiolabeled cDNA corresponding to a subset of total messenger RNA in different cells are compared. This is used to select cDNA corresponding to particular RNA that are overabundant in breast cancer.
Figure 2 is a half-tone reproduction of an autoradiogram of electrophoresed DNA digests from a panel of breast cancer cell lines probed with a CH8-2a13-1 insert (Panel A) or a loading control (Panel B).
Figure 3 is a half-tone reproduction of an autoradiogram of electrophoresed total RNA from a panel of breast cancer cell lines probed with a CH8-2a13-1 insert (Panel A) or a loading control (Panel B).
Figure 4 is a half-tone reproduction of an autoradiogram of electrophoresed DNA digests from a panel of breast cancer cell lines probed with a CH13-2a12-1 insert.
Figure 5 is a half-tone reproduction of an autoradiogram of electrophoresed total RNA from a panel of breast cancer cell lines probed with a CH13-2a12-1 insert.
Figure 6 is a map of cDNA fragments obtained for the breast cancer associated genes CH1-9a11-2, CH8-2a13-1, CH13-2a12-1 and CH14-2a16-1. Regions of the fragments used to deduce sequence data listed in the application are indicated by shading. Nucleotide positions are numbered from the left-most residue for which double-strand sequence data has been obtained, which is not necessarily the 5' terminus of the corresponding message.
Figure 7 is a listing of primers used for obtaining the cDNA sequence data for CH1-9a11-2.
Figure 8 is a listing of cDNA sequence obtained for CH1-9a11-2.
Figure 9 is a listing of the amino acid sequence corresponding to the longest open reading frame of the DNA sequence of CH1-9a11-2 shown in Figure 8. The single-letter amino acid code is used. Stop codons are indicated by a dot (•). The upper panel shows the complete amino acid translation; the lower panel shows the predicted gene product protein sequence. A possible transmembrane region is indicated by underlining.
Figure 10 is a listing of primers used for obtaining the cDNA sequence data for CH8-2a13-1.
Figure 11 is a listing of cDNA sequence obtained for CH8-2a13-1.
Figure 12 is a listing of the amino acid sequence corresponding to the longest open reading frame of the DNA sequence of CH8-2a13-1 shown in Figure 11. The upper panel shows the complete amino acid translation; the lower panel shows the predicted gene product protein sequence.
Figure 13 is a listing of the nucleotide sequence predicted for a full-length CH8-2a13-1 cDNA.
Figure 14 is a listing of the amino acid sequence corresponding to the longest open reading frame of the DNA sequence of CH8-2a13-1 shown in Figure 13.
Figure 15 is a listing of primers used for obtaining the cDNA sequence data for CH13-2a12-1.
Figure 16 is a listing of cDNA sequence obtained for CH13-2a12-1. As explained in Example 6, the first 405 base pairs shown in this sequence are believed to be part of an intron and do not typically appear in the mature mRNA. Additional sequence present in the mature mRNA is shown in Figure 35.
Figure 17 is a listing of the amino acid sequence corresponding to the longest open reading frame of the DNA sequence of CH13-2a12-1 shown in Figure 16. The upper panel shows the complete amino acid translation; the lower panel shows the predicted gene product protein sequence.
Figure 18 is a listing of primers used for obtaining cDNA sequence data for CH13-2a12-1.
Figure 19 is a listing of the cDNA sequence data obtained by two-directional sequencing for CH14-2a16-1.
Figure 20 is a listing of the amino acid sequence corresponding to the longest open reading frame of the DNA sequence of CH14-2a16-1 shown in Figure 19. The upper panel shows the complete amino acid translation; the lower panel shows the predicted gene product protein sequence. Residues corresponding to three zinc finger motifs are underlined, indicating that the protein may have DNA or RNA binding activity.
Figure 21 is a listing of additional DNA sequence data towards the 5' end of CH14-2a16-1 obtained by one-directional sequencing of the fragment pCH14-1.3. The first two panels show nucleotide and amino acid sequence from the 5' end of the fragment; the second two panels show nucleotide and amino acid sequence from the 3' end of the fragment. Regions of overlap with pCH14-800 are underlined.
Figure 22 is a listing of the nucleotide sequences of initial fragments obtained corresponding to the four breast cancer associated genes, along with their amino acid translations.
Figure 23 is a listing of additional cDNA sequence obtained for CH1-9a11-2, comprising approximately 1934 base pairs 5' from the sequence of Figure 8. Additional sequence for this gene is shown in Figure 27.
Figure 24 is a listing of the amino acid sequence corresponding to the longest open reading frame of the DNA sequence of CH1-9a11-2 shown in Figure 23. The single-letter amino acid code is used. Stop codons are indicated by a dot (•).
Figure 25 is a listing of additional cDNA sequence obtained for CH14-2a16-1, comprising approximately 1934 base pairs 5' from the sequence of Figure 19.
Figure 26 is a listing of the amino acid sequence corresponding to the longest open reading frame of the DNA sequence of CH1-9a11-2 shown in Figure 25. The single-letter amino acid code is used. Stop codons are indicated by a dot (•). The upper panel shows the complete amino acid translation; the lower panel shows the predicted gene product protein sequence.
Figure 27 is a listing of the cDNA nucleotide sequence obtained for CH1-9a11-2, containing an additional ~2467 base pairs 5' from the sequence of Figure 23.
Figure 28 is a listing of the predicted amino acid sequence corresponding to the longest open reading frame of the DNA sequence of CH1-9a11-2 shown in Figure 27.
Figure 29 and Figure 31 are nucleotide sequence listings for two other cDNA species obtained for CH1-9a11-2. Additional base pairs are present in the encoding region, adding to the predicted protein sequence without altering the reading frame.
Figure 30 and Figure 32 are amino acid sequence listings corresponding to the DNA sequences shown in Figures 29 and 31.
Figure 33 is a map of the CH1-9a11-2 cDNA, compared with the human chromosomal gene and homologous sequences from other species. A large internal region is conserved between distant species, suggesting the gene has a fundamental role in cell metabolism. Some of the members of the family, including the human gene, have predicted transmembrane regions.
Figure 34 is a six-panel photostatic reproduction of in situ hybridization analysis for CH1-9a11-2 overexpression at the mRNA level. Positive staining was shown with the 600 PE cell line (Panel A) and 44% of primary breast cancers (exemplified in Panels E and F), compared with control samples or surrounding tissue.
Figure 35 is a listing of the cDNA nucleotide sequence obtained for CH13-2a12-1, containing an additional ~708 base pairs at the 5' end.
Figure 36 is a listing of the predicted amino acid sequence corresponding to the longest open reading frame of the DNA sequence of CH13-2a12-1 shown in Figure 35.
Figure 37 and Figure 39 are nucleotide sequence listings for two other cDNA species obtained for CH13-2a12-1. Additional base pairs are present in the encoding region, adding to the predicted protein sequence without altering the reading frame.
Figure 38 and Figure 40 are amino acid sequence listings corresponding to the DNA sequences shown in Figures 29 and 31.
Figure 41 is a two panel figure showing standardized values of DNA amplification (Left Panel) and RNA overexpression (Right Panel) for CH13-2a12-1 in 16 breast cancer cell lines compared with normal tissue.
Figure 42 is a four-panel photostatic reproduction of in situ hybridization analysis for CH13-2a12-1 overexpression at the mRNA level. Positive staining was shown with the 600 PE cell line (Panel B) and a proportion of primary breast cancers (exemplified in Panel C), compared with control samples or surrounding tissue.
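Several of the figure descriptions above refer to a complete amino acid translation, in which stop codons are shown as a dot (•), and to the longest open reading frame of a cDNA sequence. The following Python sketch illustrates one way such listings could be derived. It is an assumption-laden example, not the procedure used to prepare the figures: it uses the standard genetic code, scans only the three forward frames, and takes an open reading frame to begin at a methionine and run to the next stop codon or the end of the sequence. The function names and the test sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
STOP = "\u2022"  # stop codons are written as a dot


def translate(dna, frame=0):
    """Single-letter translation of one forward reading frame.
    Codons containing unrecognized characters are written as X."""
    dna = dna.upper().replace("U", "T")
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X").replace("*", STOP) for c in codons)


def longest_orf(dna):
    """Longest stretch starting at M and free of stop codons,
    searched over the three forward frames."""
    best = ""
    for frame in range(3):
        for segment in translate(dna, frame).split(STOP):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best):
                best = segment[start:]
    return best


example = "ATGAAATAGATGCCCGGGTTTTAA"  # made-up sequence
print(translate(example))    # complete translation of frame 0
print(longest_orf(example))  # predicted product under the assumptions above
```

A fuller treatment would also scan the complementary strand and might apply a minimum length, neither of which is attempted here.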
DETAILED DESCRIPTION
This invention relates to the discovery and characterization of four novel genes associated with breast cancer. The cDNA of these genes, and their sequences as disclosed below, provide the basis of a series of reagents that can be used in diagnosis and therapy.
Using a panel of about 15 cancer cell lines, each of the four genes was found to be duplicated in 40-60% of the cells tested. Surprisingly, each of the four genes was duplicated in at least one cell line where studies using comparative genomic hybridization had not revealed any amplification of the corresponding chromosomal region. Levels of expression at the mRNA level were tested in a similar panel for two of these four genes. In addition to those cell lines showing gene duplication, 17 to 37% of the lines showed RNA overabundance without gene duplication, indicating that the malignant cells had used some mechanism other than gene duplication to promote the abundance of RNA corresponding to these genes.
All four of the breast cancer genes have open reading frames, and likely are transcribed at various levels in different cell types. Overabundance of the corresponding RNA in a cancerous cell is likely associated with overexpression of the protein gene product. Such overexpression may be manifest as increased secretion of the protein from the cell into blood or the surrounding environment, an increased density of the protein at the cell surface, or an increased accumulation of the protein within the cell, in comparison to the typical level in noncancerous cells of the same tissue type.
Different tumors bear different genotypes and phenotypes, even when derived from the same tissue. Gene therapy in cancer is more likely to be effective if it is aimed at genes that are involved in supporting the malignancy of the cancer. This invention discloses genes that achieve RNA overabundance by several mechanisms, because they are more likely to be directly involved in the pathogenic process, and therefore suitable targets for pharmacological manipulation. Features of the four novel genes, the respective mRNA, and the cDNA used to find them are provided in Table 1.
[Table 1, reproduced as an image in the original publication (imgf000017_0001)]
All four gene sequences are unrelated to other genes known to be overexpressed in breast cancer, including the erbB2 gene (Adnane et al.), tissue factor (Chen et al.), mammaglobin (Watson et al.), and DD96 (Kocher et al.). The four mRNA sequences each comprise an open reading frame. The CH1-9a11-2 gene is expressed at the mRNA level at relatively elevated levels in pancreas and testis. The CH8-2a13-1 gene is expressed at relatively elevated levels in adult heart, spleen, thymus, small intestine, colon, and tissues of the reproductive system, and at higher levels in certain tissues of the fetus. The CH13-2a12-1 gene is expressed at relatively elevated levels in heart, skeletal muscle, and testis. The CH14-2a16-1 gene is expressed at relatively elevated levels in testis. The level of expression of all four genes is especially high in a substantial proportion of breast cancer cell lines.
The CH1-9a11-2 gene encodes a protein with a putative transmembrane region, and may be expressed as a surface protein on cancer cells. The CH13-2a12-1 gene is distantly related to a C. elegans gene implicated in cell cycle regulation, and may play a role in the regulation of cell proliferation. The protein encoded by CH13-2a12-1 is distantly related to a vasopressin-activated calcium binding receptor, and may have Ca++ binding activity. The CH14-2a16-1 comprises at least five domains of a zinc finger binding motif and is distantly related to a yeast RNA binding protein. The CH14-2a16-1 gene product is suspected of having DNA or RNA binding activity, which may relate to a role in cancer pathogenesis.
The four genes described here are exemplars of genes that undergo altered expression in cancer, identifiable using the gene screening methods of the invention. The method involves an analysis for both DNA duplication and altered RNA abundance relating to the same gene. Since abnormal gene regulation is central to the malignant process, the identification method may be brought to bear on any type of cancer. The screening method is superior to any previously available approach in several respects. Particularly significant is that screening is rapidly focused towards genes that are central to the malignant process, and away from those that have variable levels of expression as part of normal metabolic processes. Furthermore, because the end-product is a cDNA corresponding to the gene, the process leads rapidly to detailed characterization of the gene, and any effector molecule it may encode. This in turn leads to development of new diagnostic and therapeutic materials and techniques.
Definitions
Terms used in this application include the following:
The term "polynucleotide" refers to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides may have any three-dimensional structure, and may perform any function, known or unknown. The following are non-limiting examples of polynucleotides: a gene or gene fragment, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. A polynucleotide may comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure may be imparted before or after assembly of the polymer. The sequence of nucleotides may be interrupted by non-nucleotide components. A polynucleotide may be further modified after polymerization, such as by conjugation with a labeling component.
The term polynucleotide, as used herein, refers interchangeably to double- and single-stranded molecules. Unless otherwise specified or required, any embodiment of the invention described herein that is a polynucleotide encompasses both the double-stranded form, and each of two complementary single-stranded forms known or predicted to make up the double-stranded form.
In the context of polynucleotides, a "linear sequence" or a "sequence" is an order of nucleotides in a polynucleotide in a 5' to 3' direction in which residues that neighbor each other in the sequence are contiguous in the primary structure of the polynucleotide. A "partial sequence" is a linear sequence of part of a polynucleotide which is known to comprise additional residues in one or both directions.
"Hybridization" refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. The hydrogen bonding is sequence-specific, and typically occurs by Watson-Crick base pairing. A hybridization reaction may constitute a step in a more extensive process, such as the initiation of a PCR, or the enzymatic cleavage of a polynucleotide by a ribozyme. Hybridization reactions can be performed under conditions of different "stringency". Relevant conditions include temperature, ionic strength, time of incubation, the presence of additional solutes in the reaction mixture such as formamide, and the washing procedure. Higher stringency conditions are those conditions, such as higher temperature and lower sodium ion concentration, which require higher minimum complementarity between hybridizing elements for a stable hybridization complex to form. Conditions that increase the stringency of a hybridization reaction are widely known and published in the art: see, for example, "Molecular Cloning: A Laboratory Manual", Second Edition (Sambrook, Fritsch & Maniatis, 1989).
When hybridization occurs in an antiparallel configuration between two single-stranded polynucleotides, those polynucleotides are described as "complementary". A double-stranded polynucleotide can be "complementary" to another polynucleotide, if hybridization can occur between one of the strands of the first polynucleotide and the second. Complementarity (the degree that one polynucleotide is complementary with another) is quantifiable in terms of the proportion of bases in opposing strands that are expected to form hydrogen bonding with each other, according to generally accepted base-pairing rules.
A linear sequence of nucleotides is "identical" to another linear sequence, if the order of nucleotides in each sequence is the same, and occurs without substitution, deletion, or material substitution. It is understood that purine and pyrimidine nitrogenous bases with similar structures can be functionally equivalent in terms of Watson-Crick base-pairing, and the inter-substitution of like nitrogenous bases, particularly uracil and thymine, or the modification of nitrogenous bases, such as by methylation, does not constitute a material substitution. An RNA and a DNA polynucleotide have identical sequences when the sequence for the RNA reflects the order of nitrogenous bases in the polyribonucleotides, the sequence for the DNA reflects the order of nitrogenous bases in the polydeoxyribonucleotides, and the two sequences satisfy the other requirements of this definition. Where one or both of the polynucleotides being compared is double-stranded, the sequences are identical if one strand of the first polynucleotide is identical with one strand of the second polynucleotide.
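The definitions of "complementary" and "identical" given above can be illustrated with a short Python sketch. This is a simplified example under stated assumptions: sequences are plain strings over A, C, G, T and U; uracil is treated as equivalent to thymine; modified bases are not represented; and complementarity is scored only for two strands of equal length paired end to end in antiparallel orientation. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalize(seq):
    """Upper-case a sequence and write uracil as thymine,
    so that RNA and DNA sequences compare alike."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq):
    """Sequence of the antiparallel Watson-Crick partner strand, read 5' to 3'."""
    return normalize(seq).translate(_COMPLEMENT)[::-1]


def is_identical(a, b, double_stranded=False):
    """True if a and b have the same order of bases. With double_stranded=True,
    b is treated as one strand of a duplex, and a may match either strand."""
    a, b = normalize(a), normalize(b)
    if a == b:
        return True
    return double_stranded and a == reverse_complement(b)


def complementarity(a, b):
    """Proportion of bases in a expected to pair with b
    when the two strands are aligned antiparallel."""
    a, partner = normalize(a), reverse_complement(b)
    if not a or len(a) != len(partner):
        raise ValueError("strands must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, partner)) / len(a)


print(is_identical("AUGC", "ATGC"))     # RNA versus DNA with the same base order
print(complementarity("ATGC", "GCAT"))  # fully complementary pair
print(complementarity("ATGC", "GCAA"))  # one mismatched position
```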
A linear sequence of nucleotides is "essentially identical" to another linear sequence, if both sequences are capable of hybridizing to form a duplex with the same complementary polynucleotide. Sequences that hybridize under conditions of greater stringency are more preferred. It is understood that hybridization reactions can accommodate insertions, deletions, and substitutions in the nucleotide sequence. Thus, linear sequences of nucleotides can be essentially identical even if some of the nucleotide residues do not precisely correspond or align. In general, essentially identical sequences of about 40 nucleotides in length will hybridize at about 30°C in 10 x SSC (0.15 M NaCl, 15 mM citrate buffer); preferably, they will hybridize at about 40°C in 6 x SSC; more preferably, they will hybridize at about 50°C in 6 x SSC; even more preferably, they will hybridize at about 60°C in 6 x SSC, or at about 40°C in 0.5 x SSC, or at about 30°C in 6 x SSC containing 50% formamide; still more preferably, they will hybridize at 40°C or higher in 2 x SSC or lower in the presence of 50% or more formamide. It is understood that the rigor of the test is partly a function of the length of the polynucleotide; hence shorter polynucleotides with the same homology should be tested under lower stringency and longer polynucleotides should be tested under higher stringency, adjusting the conditions accordingly. The relationship between hybridization stringency, degree of sequence identity, and polynucleotide length is known in the art and can be calculated by standard formulae; see, e.g., Meinkoth et al. Sequences that correspond or align more closely to the invention disclosed herein are comparably more preferred.
Generally, essentially identical sequences are at least about 50% identical with each other, after alignment of the homologous regions. Preferably, the sequences are at least about 60% identical; more preferably, they are at least about 70% identical; more preferably, they are at least about 80% identical; more preferably, the sequences are at least about 90% identical; even more preferably, they are at least 95% identical; still more preferably, the sequences are 100% identical. Percent identity is calculated as the percent of residues in the sequence being compared that are identical to those in the reference sequence, which is usually one of those listed or described in this application, unless stated otherwise. No penalty is imposed for introduction of gaps in the reference or comparison sequence for purposes of alignment, but the resulting fragments must be rationally derived; small gaps may not be introduced to trivially improve the identity score.
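The percent identity calculation described above can be sketched in Python as follows. The example assumes that the two sequences have already been aligned and padded to equal length with a gap character, and it reads "no penalty for gaps" as meaning that alignment columns containing a gap in either sequence are left out of the count. That reading, the gap character, and the function name are assumptions made for illustration; the sketch does not itself check that gaps were rationally placed.

```python
def percent_identity(compared, reference, gap="-"):
    """Percent of aligned residues in `compared` that are identical to the
    residue at the same position in `reference`. Columns with a gap in
    either sequence are skipped, so gaps carry no penalty."""
    if len(compared) != len(reference):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(c, r) for c, r in zip(compared.upper(), reference.upper())
             if c != gap and r != gap]
    if not pairs:
        raise ValueError("no aligned residues to compare")
    matches = sum(c == r for c, r in pairs)
    return 100.0 * matches / len(pairs)


# Hypothetical pre-aligned sequences: one gap, one mismatch.
print(percent_identity("ACGT-ACGT", "ACGTTACGA"))
```

Because gapped columns are skipped, a practitioner applying this kind of score would still need to confirm that the alignment itself is reasonable, as the text above requires.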
In determining whether polynucleotide sequences are essentially identical, a sequence that preserves the functionality of the polynucleotide with which it is being compared is particularly preferred Functionality may be established by different criteria, such as ability to hybridize with a target polynucleotide, and whether the polynucleotide encodes an identical or essentially identical polypeptides Thus, nucleotide substitutions which cause a non-conservative substitution in the encoded polypeptide are preferred over nucleotide substitutions that create a stop codon, nucleotide substitutions that cause a conservative substitution in the encoded polypeptide are more preferred, and identical nucleotide sequences are even more preferred Insertions or deletions in the polynucleotide that result in insertions or deletions in the polypeptide are preferred over those that result in the down-stream coding region being rendered out of phase The relative importance of hybridization properties and the polypeptide encoded by a polynucleotide depends on the application of the invention
A "reagent" polynucleotide, polypeptide, or antibody, is a substance provided for a reaction, the substance having some known and desirable parameters for the reaction A reaction mixture may also contain a "target", such as a polynucleotide, antibody, or polypeptide that the reagent is capable of reacting with For example, in some types of diagnostic tests, the amount of the target in a sample is determined by adding a reagent, allowing the reagent and target to react, and measuring the amount of reaction product In the context of clinical management, a "target" may also be a cell, collection of cells, tissue, or organ that is the object of an administered substance, such as a pharmaceutical compound "cDNA" or "complementary DNA" is a single- or double-stranded DNA polynucleotide in which one strand is complementary to a messenger RNA "Full-length cDNA" is cDNA comprised of a strand which is complementary to an entire messenger RNA molecule A "cDNA fragment" as used herein generally represents a sub-region of the full-length form, but the entire full-length cDNA may also be included Unless explicitly specified, the term cDNA encompasses both the full-length form and the fragment form
A "splice variant" is an alternative gene transcript. The term includes both splicing intermediates produced during transcript processing and maturation, and variant species of mature transcript produced from the same chromosomal gene.
Different polynucleotides are said to "correspond" to each other if one is ultimately derived from another. For example, messenger RNA corresponds to the gene from which it is transcribed. cDNA corresponds to the RNA from which it has been produced, such as by a reverse transcription reaction, or by chemical synthesis of a DNA based upon knowledge of the RNA sequence. cDNA also corresponds to the gene that encodes the RNA. Polynucleotides may be said to correspond even when one of the pair is derived from only a portion of the other.
A "probe" when used in the context of polynucleotide manipulation refers to a polynucleotide which is provided as a reagent to detect a target potentially present in a sample of interest by hybridizing with the target. Usually, a probe will comprise a label or a means by which a label can be attached, either before or subsequent to the hybridization reaction. Suitable labels include, but are not limited to, radioisotopes, fluorochromes, chemiluminescent compounds, dyes, and enzymes.
A "primer" is a short polynucleotide, generally with a free 3'-OH group, that binds to a target potentially present in a sample of interest by hybridizing with the target, and thereafter promotes polymerization of a polynucleotide complementary to the target.
A "polymerase chain reaction" ("PCR") is a reaction in which replicate copies are made of a target polynucleotide using one or more primers, and a catalyst of polymerization, such as a reverse transcriptase or a DNA polymerase, and particularly a thermally stable polymerase enzyme. Methods for PCR are taught in U.S. Patent Nos. 4,683,195 (Mullis) and 4,683,202 (Mullis et al.). All processes of producing replicate copies of the same polynucleotide, such as PCR or gene cloning, are collectively referred to herein as "replication."
An "operon" is a genetic region comprising a gene encoding a protein and functionally related 5' and 3' flanking regions. Elements within an operon include but are not limited to promoter regions, enhancer regions, repressor binding regions, transcription initiation sites, ribosome binding sites, translation initiation sites, protein encoding regions, introns and exons, and termination sites for transcription and translation. A "promoter" is a DNA region capable under certain conditions of binding RNA polymerase and initiating transcription of a coding region located downstream (in the 3' direction) from the promoter. "Operably linked" refers to a juxtaposition of genetic elements, wherein the elements are in a relationship permitting them to operate in the expected manner. For instance, a promoter is operably linked to a coding region if the promoter helps initiate transcription of the coding sequence. There may be intervening residues between the promoter and coding region so long as this functional relationship is maintained.
"Gene duplication" is a term used herein to describe the process whereby an increased number of copies of a particular gene or a fragment thereof is present in a particular cell or cell line. "Gene amplification" generally is synonymous with gene duplication.
"Expression" is defined alternately in the scientific literature either as the transcription of a gene into an RNA polynucleotide, or as the transcription and subsequent translation into a polypeptide. As used herein, "expression" or "gene expression" generally refers to the production of the RNA unless specified or required otherwise. Thus, "RNA overexpression" reflects the presence of more RNA (as a proportion of total RNA) from a particular gene in a cell being described, such as a cancerous cell, in relation to that of the cell it is being compared with, such as a non-cancerous cell. The protein product of the gene may or may not be produced in normal or abnormal amounts. "Protein overexpression" similarly reflects the presence of relatively more protein present in or produced by, for example, a cancerous cell. "Abundance" of RNA refers to the amount of a particular RNA present in a particular cell type.
Thus, "RNA overabundance" or "overabundance of RNA" describes RNA that is present in greater proportion of total RNA in the cell type being described, compared with the same RNA as a proportion of the total RNA in a control cell. A number of mechanisms may contribute to RNA overabundance in a particular cell type: for example, gene duplication, increased level of transcription of the gene, increased persistence of the RNA within the cell after it is produced, or any combination of these.
Similarly, "lower abundance" or "underabundance" describes RNA that is present in lower proportion in the cell being described compared with a control cell.
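As an illustration of the proportional comparison in these definitions, the following minimal Python sketch classifies an RNA as overabundant or underabundant relative to a control cell. The function names and the inputs (signal for the RNA of interest and for total RNA, in arbitrary units) are assumptions made for the example and are not taken from this disclosure.

```python
def rna_proportion(rna_amount, total_rna):
    """Return the amount of one RNA as a proportion of total RNA in a cell type."""
    if total_rna <= 0:
        raise ValueError("total_rna must be positive")
    return rna_amount / total_rna


def classify_abundance(test_amount, test_total, control_amount, control_total):
    """Compare the proportion in the cell being described with that in a control cell."""
    test = rna_proportion(test_amount, test_total)
    control = rna_proportion(control_amount, control_total)
    if test > control:
        return "overabundant"
    if test < control:
        return "underabundant"
    return "equal"
```

Because both values are normalized to total RNA, a cell that simply contains more RNA overall is not scored as overabundant for every transcript.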
The terms "polypeptide", "peptide" and "protein" are used interchangeably herein to refer to polymers of amino acids of any length. The polymer may be linear or branched, it may comprise modified amino acids, and it may be interrupted by non-amino acids. The terms also encompass an amino acid polymer that has been modified, for example, by disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, or any other manipulation, such as conjugation with a labeling component.
In the context of polypeptides, a "linear sequence" or a "sequence" is an order of amino acids in a polypeptide in an N-terminal to C-terminal direction in which residues that neighbor each other in the sequence are contiguous in the primary structure of the polypeptide. A "partial sequence" is a linear sequence of part of a polypeptide which is known to comprise additional residues in one or both directions.
A linear sequence of amino acids is "essentially identical" to another sequence if the two sequences have a substantial degree of sequence identity. It is understood that functional proteins can accommodate insertions, deletions, and substitutions in the amino acid sequence. Thus, linear sequences of amino acids can be essentially identical even if some of the residues do not precisely correspond or align. Sequences that correspond or align more closely to the invention disclosed herein are more preferred. It is also understood that some amino acid substitutions are more easily tolerated. For example, substitution of an amino acid with hydrophobic side chains, aromatic side chains, polar side chains, side chains with a positive or negative charge, or side chains comprising two or fewer carbon atoms, by another amino acid with a side chain of like properties can occur without disturbing the essential identity of the two sequences. Methods for determining homologous regions and scoring the degree of homology are well known in the art; see for example Altschul et al. and Henikoff et al. Well-tolerated sequence differences are referred to as "conservative substitutions". Thus, sequences with conservative substitutions are preferred over those with other substitutions in the same positions; sequences with identical residues at the same positions are still more preferred.
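The scoring of identical residues and conservative substitutions between two aligned amino acid sequences can be sketched in Python as follows. This is a simplified illustration only: the side-chain grouping below is a hypothetical assignment loosely based on the categories named above, the sequences are assumed to be already aligned (with gaps written as "-"), and the sketch is not one of the published scoring methods cited in the text.

```python
# Hypothetical, simplified side-chain groups (an assumption for illustration).
# Residues not listed (e.g. G, C, P) can only score as identical matches here.
SIDE_CHAIN_GROUPS = {
    "hydrophobic": "AVLIM",
    "aromatic": "FWY",
    "polar": "STNQ",
    "positive": "KRH",
    "negative": "DE",
}
GROUP_OF = {res: name for name, members in SIDE_CHAIN_GROUPS.items() for res in members}


def compare_aligned(seq_a, seq_b):
    """Return (fraction identical, fraction conservative) for two pre-aligned sequences.

    Positions where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    identical = conservative = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
        elif a in GROUP_OF and GROUP_OF[a] == GROUP_OF.get(b):
            conservative += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return identical / compared, conservative / compared
```

The two returned fractions can be summed to give the proportion of residues that are either identical or conservatively substituted.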
In general, amino acid sequences that are essentially identical are at least about 15% identical, and comprise at least about another 15% which are either identical or are conservative substitutions, after alignment of homologous regions. More preferably, essentially identical sequences comprise at least about 50% identical residues or conservative substitutions; more preferably, they comprise at least about 70% identical residues or conservative substitutions; more preferably, they comprise at least about 80% identical residues or conservative substitutions; more preferably, they comprise at least about 90% identical residues or conservative substitutions; more preferably, they comprise at least about 95% identical residues or conservative substitutions; even more preferably, they contain 100% identical residues.
In determining whether polypeptide sequences are essentially identical, a sequence that preserves the functionality of the polypeptide with which it is being compared is particularly preferred. Functionality may be established by different parameters, such as enzymatic activity, the binding rate or affinity in a receptor-ligand interaction, the binding affinity with an antibody, and X-ray crystallographic structure.
An "antibody" (interchangeably used in plural form) is an immunoglobulin molecule capable of specific binding to a target, such as a polypeptide, through at least one antigen recognition site, located in the variable region of the immunoglobulin molecule. As used herein, the term encompasses not only intact antibodies, but also fragments thereof, mutants thereof, fusion proteins, humanized antibodies, and any other modified configuration of the immunoglobulin molecule that comprises an antigen recognition site of the required specificity.
The term "antigen" refers to the target molecule that is specifically bound by an antibody through its antigen recognition site. The antigen may, but need not, be chemically related to the immunogen that stimulated production of the antibody. The antigen may be polyvalent, or it may be a monovalent hapten. Examples of kinds of antigens that can be recognized by antibodies include polypeptides, polynucleotides, other antibody molecules, oligosaccharides, complex lipids, drugs, and chemicals. An "immunogen" is an antigen capable of stimulating production of an antibody when injected into a suitable host, usually a mammal. Compounds may be rendered immunogenic by many techniques known in the art, including crosslinking or conjugating with a carrier to increase valency, mixing with a mitogen to increase the immune response, and combining with an adjuvant to enhance presentation.
An "active vaccine" is a pharmaceutical preparation for human or animal use, which is used with the intention of eliciting a specific immune response. The immune response may be either humoral or cellular, systemic or secretory. The immune response may be desired for experimental purposes, for the treatment of a particular condition, for the elimination of a particular substance, or for prophylaxis against a particular condition or substance.
An "isolated" polynucleotide, polypeptide, protein, antibody, or other substance refers to a preparation of the substance devoid of at least some of the other components that may also be present where the substance or a similar substance naturally occurs or is initially obtained from. Thus, for example, an isolated substance may be prepared by using a purification technique to enrich it from a source mixture. Enrichment can be measured on an absolute basis, such as weight per volume of solution, or it can be measured in relation to a second, potentially interfering substance present in the source mixture. Increasing enrichments of the embodiments of this invention are increasingly more preferred. Thus, for example, a 2-fold enrichment is preferred, 10-fold enrichment is more preferred, 100-fold enrichment is more preferred, and 1000-fold enrichment is even more preferred. A substance can also be provided in an isolated state by a process of artificial assembly, such as by chemical synthesis or recombinant expression.
A polynucleotide used in a reaction, such as a probe used in a hybridization reaction, a primer used in a PCR, or a polynucleotide present in a pharmaceutical preparation, is referred to as "specific" or "selective" if it hybridizes or reacts with the intended target more frequently, more rapidly, or with greater duration than it does with alternative substances. Similarly, an antibody is referred to as "specific" or "selective" if it binds via at least one antigen recognition site to the intended target more frequently, more rapidly, or with greater duration than it does to alternative substances. A polynucleotide or antibody is said to "selectively inhibit" or "selectively interfere with" a reaction if it inhibits or interferes with the reaction between particular substrates to a greater degree or for a greater duration than it does with the reaction between alternative substrates. An antibody is capable of "specifically delivering" a substance if it conveys or retains that substance near a particular cell type more frequently or for a greater duration compared with other cell types.
The "effector component" of a pharmaceutical preparation is a component which modifies target cells by altering their function in a desirable way when administered to a subject bearing the cells. Some advanced pharmaceutical preparations also have a "targeting component", such as an antibody, which helps deliver the effector component more efficaciously to the target site. Depending on the desired action, the effector component may have any one of a number of modes of action. For example, it may restore or enhance a normal function of a cell, it may eliminate or suppress an abnormal function of a cell, or it may alter a cell's phenotype. Alternatively, it may kill or render dormant a cell with pathological features, such as a cancer cell. Examples of effector components are provided in a later section.
A "pharmaceutical candidate" or "drug candidate" is a compound believed to have therapeutic potential, that is to be tested for efficacy. The "screening" of a pharmaceutical candidate refers to conducting an assay that is capable of evaluating the efficacy and/or specificity of the candidate. In this context, "efficacy" refers to the ability of the candidate to affect the cell or organism it is administered to in a beneficial way: for example, the limitation of the pathology of cancerous cells.
A "cell line" or "cell culture" denotes higher eukaryotic cells grown or maintained in vitro. It is understood that the descendants of a cell may not be completely identical (either morphologically, genotypically, or phenotypically) to the parent cell. Cells described as "uncultured" are obtained directly from a living organism, and have been maintained for a limited amount of time away from the organism: not long enough or under conditions for the cells to undergo substantial replication.
"Genetic alteration" refers to a process wherein a genetic element is introduced into a cell other than by mitosis or meiosis. The element may be heterologous to the cell, or it may be an additional copy or improved version of an element already present in the cell. Genetic alteration may be effected, for example, by transfecting a cell with a recombinant plasmid or other polynucleotide through any process known in the art, such as electroporation, calcium phosphate precipitation, or contacting with a polynucleotide-liposome complex, or by transduction or infection with a DNA or RNA virus or viral vector. The alteration is preferably but not necessarily inheritable by progeny of the altered cell.
A "host cell" is a cell which has been genetically altered, or is capable of being genetically altered, by administration of an exogenous polynucleotide.
The terms "cancerous cell" or "cancer cell", used either in the singular or plural form, refer to cells that have undergone a malignant transformation that makes them pathological to the host organism. Malignant transformation is a single- or multi-step process, which involves in part an alteration in the genetic makeup of the cell and/or the expression profile. Malignant transformation may occur either spontaneously, or via an event or combination of events such as drug or chemical treatment, radiation, fusion with other cells, viral infection, or activation or inactivation of particular genes. Malignant transformation may occur in vivo or in vitro, and can if necessary be experimentally induced.
A frequent feature of cancer cells is the tendency to grow in a manner that is uncontrollable by the host, but the pathology associated with a particular cancer cell may take another form, as outlined infra. Primary cancer cells (that is, cells obtained from near the site of malignant transformation) can be readily distinguished from non-cancerous cells by well-established techniques, particularly histological examination. The definition of a cancer cell, as used herein, includes not only a primary cancer cell, but any cell derived from a cancer cell ancestor. This includes metastasized cancer cells, and in vitro cultures and cell lines derived from cancer cells.
The "pathology" caused by a cancer cell within a host is anything that compromises the well-being or normal physiology of the host. This may involve (but is not limited to) abnormal or uncontrollable growth of the cell, metastasis, release of cytokines or other secretory products at an inappropriate level, manifestation of a function inappropriate for its physiological milieu, interference with the normal function of neighboring cells, aggravation or suppression of an inflammatory or immunological response, or the harboring of undesirable chemical agents or invasive organisms.
"Treatment" of an individual or a cell is any type of intervention in an attempt to alter the natural course of the individual or cell. For example, treatment of an individual may be undertaken to decrease or limit the pathology caused by a cancer cell harbored in the individual. Treatment includes (but is not limited to) administration of a composition, such as a pharmaceutical composition, and may be performed either prophylactically, or subsequent to the initiation of a pathologic event or contact with an etiologic agent. Effective amounts used in treatment are those which are sufficient to produce the desired effect, and may be given in single or divided doses.
A "control cell" is an alternative source of cells or an alternative cell line used in an experiment for comparison purposes. Where the purpose of the experiment is to establish a base line for gene copy number or expression level, it is generally preferable to use a control cell that is not a cancer cell.
The term "cancer gene" as used herein refers to any gene which is yielding transcription or translation products at a substantially altered level or in a substantially altered form in cancerous cells compared with non-cancerous cells, and which may play a role in supporting the malignancy of the cell. It may be a normally quiescent gene that becomes activated (such as a dominant proto-oncogene), it may be a gene that becomes expressed at an abnormally high level (such as a growth factor receptor), it may be a gene that becomes mutated to produce a variant phenotype, or it may be a gene that becomes expressed at an abnormally low level (such as a tumor suppressor gene). The present invention is directed towards the discovery of genes in all these categories.
It is understood that a "clinical sample" encompasses a variety of sample types obtained from a subject and useful in an in vitro procedure, such as a diagnostic test. The definition encompasses solid tissue samples obtained as a surgical removal, a pathology specimen, or a biopsy specimen, tissue cultures or cells derived therefrom and the progeny thereof, and sections or smears prepared from any of these sources. Non-limiting examples are samples obtained from breast tissue, lymph nodes, and tumors. The definition also encompasses blood, spinal fluid, and other liquid samples of biologic origin, and may refer to either the cells or cell fragments suspended therein, or to the liquid medium and its solutes.
The term "relative amount" is used where a comparison is made between a test measurement and a control measurement. Thus, the relative amount of a reagent forming a complex in a reaction is the amount reacting with a test specimen, compared with the amount reacting with a control specimen. The control specimen may be run separately in the same assay, or it may be part of the same sample (for example, normal tissue surrounding a malignant area in a tissue section).
A "differential" result is generally obtained from an assay in which a comparison is made between the findings of two different assay samples, such as a cancerous cell line and a control cell line. Thus, for example, "differential expression" is observed when the level of expression of a particular gene is higher in one cell than another. "Differential display" refers to a display of a component, particularly RNA, from different cells to determine if there is a difference in the level of the component amongst different cells. Differential display of RNA is conducted, for example, by selective production and display of cDNA corresponding thereto. A method for performing differential display is provided in a later section.
A polynucleotide derived from or corresponding to CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, or CH14-2a16-1 is any of the following: the respective cDNA fragments; the corresponding messenger RNA, including splice variants and fragments thereof; both strands of the corresponding full-length cDNA and fragments thereof; and the corresponding gene. Isolated allelic variants of any of these forms are included. This invention embodies any polynucleotide corresponding to CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, or CH14-2a16-1 in an isolated form. It also embodies any such polynucleotide that has been cloned or transfected into a cell line. CH13-2a12-1 may also be referred to as Cul-4A.
When used in referring to the gene screening methods of this invention (such as those outlined in the last paragraph), "displaying cDNA" is any technique in which DNA copies of RNA (not restricted to mRNA) are rendered detectable in a quantitative or relatively quantitative fashion, in that DNA copies present in a relatively greater amount in a first sample compared with a second sample generate a relatively stronger or weaker signal compared with that of the second sample due to the difference in copy number. Separate display of different cDNA in a preparation (particularly but not limited to cDNA of different size) allows comparison of levels of a particular cDNA between different samples. A preferred method of display is the differential display technique, and enhancements thereupon described in this disclosure and elsewhere.
The term "digested" DNA encompasses DNA (particularly chromosomal DNA) that has been fragmented by any suitable chemical or enzymatic means into fragments conveniently separable by standard techniques, particularly gel electrophoresis. Digestion with a restriction endonuclease specific for a particular nucleotide sequence is preferred.
"Hybridizing" in this context refers to contacting a first polynucleotide with a second polynucleotide under conditions that permit the formation of a multi-stranded polynucleotide duplex whenever one strand of the first polynucleotide has a sequence of sufficient complementarity to a sequence on the second polynucleotide. The duplex may be a long-lived one, such as when one DNA molecule is used as a labeled probe to detect another DNA molecule, that may optionally be bound to a nitrocellulose filter or present in a separating gel. The duplex may also be a shorter-lived one, such as when one DNA molecule is used to prime an amplification reaction of the other DNA molecule, and the amplified product is subsequently detected. The practitioner may alter the conditions of the reaction to alter the degree of complementarity required, as long as sequence specificity remains a determining factor in the reaction.
Unless explicitly indicated or otherwise required by the techniques used, the steps of a method of this invention may be performed in any order, or combined where desired and appropriate. In one example, in the method comprising steps a) through h) that is described above, it is entirely appropriate to conduct steps a) to c) of the method either before or after steps e) to g) of the method, as long as the cDNA ultimately selected fulfills the criteria of both step d) and step h). In another example, screening against different digested DNA preparations, even if outlined separately, may optionally be done at the same time. All permutations of this kind are within the scope of the invention.
General methods
The practice of the present invention will employ, unless otherwise indicated, conventional techniques of molecular biology, microbiology, recombinant DNA, and immunology, which are within the skill of the art. Such techniques are explained fully in the literature. See, for example, "Molecular Cloning: A Laboratory Manual", Second Edition (Sambrook, Fritsch & Maniatis, 1989); "Oligonucleotide Synthesis" (M. J. Gait, ed., 1984); "Animal Cell Culture" (R. I. Freshney, ed., 1987); the series "Methods in Enzymology" (Academic Press, Inc.); "Handbook of Experimental Immunology" (D. M. Weir & C. C. Blackwell, eds.); "Gene Transfer Vectors for Mammalian Cells" (J. M. Miller & M. P. Calos, eds., 1987); "Current Protocols in Molecular Biology" (F. M. Ausubel et al., eds., 1987); and "Current Protocols in Immunology" (J. E. Coligan et al., eds., 1991). All patents, patent applications, articles and publications mentioned herein, both supra and infra, are hereby incorporated herein by reference.
Features of the cancer gene screening method
The cancer gene screening methods of this invention may be brought to bear to discover novel genes associated with cancer. Exemplars of cancer-associated genes identified by this method are described below. The exemplars were identified using breast cancer cell lines and tissue, but the strategy can be applied to any cancer type of interest.
A central feature of the cancer gene screening method of this invention is to look for both DNA duplication and RNA overabundance relating to the same gene. This feature is particularly powerful in the discovery of new and potentially important cancer genes. While amplicons occur frequently in cancer, the presently available techniques indicate only the broad chromosomal region involved in the duplication event, not the specific genes involved. The present invention provides a way of detecting genes that may be present in an amplicon from a functional basis. Because an early part of the method involves detecting RNA, the method avoids genes that may be duplicated in an amplicon but are quiescent (and therefore irrelevant) in the cancer cells. Furthermore, it recruits active genes from a duplicated region of the chromosome too small to be detectable by the techniques used to describe amplicons.
Near the heart of this approach are several concepts. One is that genes encoding products implicated positively in the malignant process achieve elevated gene expression as a part of malignant transformation. In this context, "gene expression" refers to expression at the RNA transcription level. Most typically, the RNA is in turn translated into a protein with a particular enzymatic, binding, or regulatory activity which increases after malignant transformation. In a less common example, the RNA may encode or participate as a ribozyme, antisense polynucleotide, or other functional nucleic acid molecule during malignancy. In a third example, RNA expression may be incidental but symptomatic of an important event in transformation.
Another concept is that overexpression, if central to malignant transformation, may be achieved in different tumors by different mechanisms, and that at least one such possible mechanism is gene duplication. Accordingly, a substantial proportion of transformed cells will have an amplicon, or duplicated region of a chromosome, that includes within its compass the overexpressed gene. Other transformed cells may achieve RNA overabundance without gene duplication, such as by increasing the rate of transcription of the gene (e.g., by upregulation of the promoter region), by enhancing transcript promotion or transport, or by increasing mRNA survival.
Thus, the method entails screening, at the RNA level, several cancer cell lines or tumors and several normal cell lines or tissue samples at the same time. RNA are selected that show a consistent elevation amongst the cancer cells as compared with normal cells. Additional strategies may be employed in combination with the RNA screening to improve the success rate of the method. One such strategy is to use several cancer cell lines that are all known to have duplicated genes in the same region of a particular chromosome. Thus, the RNA that emerge from the screen are more likely to represent a deliberate overexpression event, and the overexpressed gene is likely to be within the duplicated region. A supplemental strategy is to use freshly prepared tissue samples rather than cell lines as controls for base-line expression. This avoids selection of genes that may alter their expression level just as a result of tissue culturing. Another supplemental strategy is to conduct an additional level of screening, following identification of shared, overexpressed RNA. The selected RNA are used to screen DNA from suitable cancer cells and normal cells, to ensure that at least a proportion of the cells achieved the overexpression by way of gene duplication.
The strategy for detecting such genes comprises a number of innovations over those that have been used in previous work. The first part of the method is based on a search for particular RNAs that are overabundant in cancer cells. A first innovation of the method is to compare RNA abundance between control cells and several different cancer cells or cancer cell lines of the desired type. The cDNA fragments that emerge in a greater amount in several different cancer lines, but not in control cells, are more likely to reflect genes that are important in disease progression, rather than those that have undergone secondary or coincidental activation. It is particularly preferred to use cancer cells that are known to share a common duplicated chromosomal region. A second innovation of this method is to supply as control, not RNA from a cell line or culture, but from fresh tissue samples of non-malignant origin. There are two reasons for this. First, the tissue will provide the spectrum of expression that is typical to the normal cell phenotype, rather than individual differences that may become more prominent in culture. This establishes a more reliable baseline for normal expression levels. More importantly, the tissue will be devoid of the effects that in vitro culturing may have in altering or selecting particular phenotypes. For example, proto-oncogenes or growth factors may become up-regulated in culture. When cultured cells are used as the control for differential display, these up-regulated genes would be missed.
A third innovation of this method is to undertake a subselection for cDNA corresponding to genes that achieve their RNA overabundance in a substantial proportion of cancer cells by gene duplication. To accomplish this, appropriate cDNA corresponding to overabundant RNA identified in the foregoing steps are used to probe digests of cellular DNA from a panel of different cancer cells, and from normal genomic DNA. cDNA that shows evidence of higher copy numbers in a proportion of the panel are selected for further characterization. An additional advantage of this step is that cDNA corresponding to mitochondrial genes can rapidly be screened away by including a mitochondrial DNA digest as an additional sample for testing the probe. This eliminates most of the false-positive cDNA, which otherwise make up a majority of the cDNA identified.
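The subselection step can be illustrated with a short Python sketch. The data layout, the function name, and the numerical cut-offs (a 2-fold signal increase over normal genomic DNA, observed in at least a quarter of the panel) are hypothetical values chosen for the example; the disclosure does not specify them.

```python
def subselect_probes(probes, fold=2.0, min_fraction=0.25):
    """Return names of cDNA probes showing higher copy number in part of a cancer panel.

    probes maps a probe name to a dict with:
      "cancer": hybridization signals against each cancer cell DNA digest
      "normal": signal against the normal genomic DNA digest
      "mito":   True if the probe hybridizes to the mitochondrial DNA digest
    The fold and min_fraction cut-offs are illustrative assumptions.
    """
    selected = []
    for name, data in probes.items():
        if data["mito"]:
            continue  # screen away cDNA corresponding to mitochondrial genes
        if not data["cancer"] or data["normal"] <= 0:
            continue  # nothing to compare against
        amplified = sum(1 for s in data["cancer"] if s >= fold * data["normal"])
        if amplified / len(data["cancer"]) >= min_fraction:
            selected.append(name)
    return selected
```

In this sketch a probe is retained only if it passes both tests: no hybridization to the mitochondrial digest, and elevated signal in a sufficient proportion of the cancer panel.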
Thus, the identification of genes yielding products that are present at abnormal levels is accomplished by a method comprised of the following steps.
To identify particular RNA that is overabundant in cancer cells, RNA is prepared from both cancerous and control cells by standard techniques. Cancer-associated genes may affect cellular metabolism by any one of a number of mechanisms. For example, they may encode ribozymes, anti-sense polynucleotides, DNA-binding polynucleotides, altered ribosomal RNA, and the like. The gene screening methods of this invention may employ a comparison of RNA abundance levels at the total RNA level, not strictly limited to mRNA. However, the vast majority of cancer-associated genes are predicted to encode a protein gene whose up-regulation is closely linked to the metabolic process. For example, the four exemplary breast cancer genes described elsewhere in this application all comprise an open reading frame. Accordingly, a focus on mRNA enriches the selectable pool for candidate cancer-associated genes. Focus towards mRNA can be conducted at any step in the method. It is particularly convenient to use a display method that displays cDNA copied only from mRNA. In this case, whole RNA may be prepared and analyzed from cancer and control cell populations without separating out mRNA.
In terms of the cancer cells used as an RNA source, it is particularly advantageous to use a plurality of cancer cells known to contain a duplicated gene or chromosomal segment in the same region of the chromosome. The duplicated segment need not be the same size in all the cells, nor is it necessary that the number of duplications be the same, so long as there is at least some part of the duplicated segment that is shared amongst all the cancer cells used in the screen. Thus, a minimum of two, and preferably at least three, cancer cells are used that are sufficiently characterized to identify a shared duplicated region, and can be used as a source of RNA for the screening test. In contrast, the control cell population will not comprise chromosomal duplications.
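The requirement that some part of the duplicated segment be shared amongst all the cancer cells can be expressed as an interval intersection. The following Python sketch assumes, purely for illustration, that each duplicated segment has been mapped to a (start, end) coordinate pair on the same chromosome; the function name and coordinate representation are not part of this disclosure.

```python
def shared_duplicated_region(segments):
    """Return the (start, end) interval common to all segments, or None if there is none.

    segments: one (start, end) pair per cancer cell, all on the same chromosome
    and in the same coordinate units. Segment sizes may differ between cells.
    """
    if not segments:
        return None
    start = max(s for s, _ in segments)  # latest start among the segments
    end = min(e for _, e in segments)    # earliest end among the segments
    return (start, end) if start < end else None
```

A result of None indicates that the chosen cancer cells do not share a duplicated region and would not be a suitable combination for this form of the screen.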
Assuming the duplication to be related to the malignancy of the cancer cells, RNA transcribed from the duplicated region is expected to be overabundant compared with that of the control cell. Accordingly, a highly effective strategy is to identify overabundant RNA that is present in all (or at least several) of the cancer cell preparations, but none of the control preparations. By using cancer cells that share a duplicated chromosomal region, the RNA comparison will be strongly biased in favor of overabundant RNA transcribed from the shared duplicated region. Since the shared region is optimally only a small segment of a single chromosome, expression differences arising from elsewhere in the genome in one cancer cell or another will not be selected. We have found that this is highly effective in eliminating a) RNA abundance differences resulting from normal metabolic variations between cells, and/or b) RNA abundance differences related to cancer cell malignancy, but occurring secondarily to malignant transformation. This is important, because it considerably reduces the chief deficiency in the use of RNA comparison methods, particularly differential display, for the screening of potential cancer genes: namely, the onerous number of false-positives that such techniques generate.
Shared duplicated regions in cancer cells may be identified by a relevant analytical technique, or by reference to such analysis already conducted and published. One approach that has been highly effective in mapping approximate sub-chromosomal locations of duplicated segments is comparative genomic hybridization (CGH). This technique involves extracting, amplifying and labeling DNA from the subject cell, hybridizing to reference metaphase chromosomes treated to remove repetitive sequences, and observing the position of the hybridized DNA on the chromosomes (WO 93/18186, Gray et al.). The greater the signal intensity at a given position, the greater the copy number of the sequences in the subject cell. Thus, regions showing elevated staining correspond to genes duplicated in the cancer cells, while regions showing diminished staining correspond to genes deleted in the cancer cells. Related techniques of which a practitioner in the art will be well aware are methods for preparing and using repeat sequence chromosome-specific nucleic acid probes (US 5,427,932, Weier et al.), methods for staining target chromosomal DNA using labeled nucleic acid fragments in conjunction with blocking fragments complementary to repetitive DNA segments (US 5,447,841, Gray et al.), and methods for detecting amplified or deleted chromosomal regions using a mapped library of labeled polynucleotide probes (US 5,472,842, Stokke et al.). If desired, multiple fluorochromes can be used as labeling agents with CGH and related techniques, to provide a three-color visualization of deleted, normal, and duplicated chromosomal regions (Lucas et al.).
The choice of a particular chromosomal mapping approach is irrelevant, especially once the duplicated region is known. If the location of the chromosome duplication is already established for a cell line to be used in RNA comparison during the course of the present invention, then it is unnecessary to conduct a mapping technique de novo. For example, established cancer cell lines exist for which mapping data is already available in the public domain. Provided in the reference section of this application is a list of over 40 articles in which the locations of duplicated regions in particular cancer cells are described. In the context of the present invention, a plurality of cancer cells is chosen for the screening panel based on such data, so that they share a duplicated chromosomal region. The chromosomal location of a suspected duplication may be confirmed by hybridization analysis, if desired, using a probe specific for the location.
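Choosing a panel of cancer cells that share a duplicated region amounts to intersecting the mapped duplications of each cell line. The following is a minimal sketch of that bookkeeping, not a procedure taken from this disclosure. The function name, the cell line labels, and the coordinates are hypothetical placeholders, and real mapping data (for example from CGH) would be far coarser than exact coordinates.

```python
def shared_duplicated_regions(regions_by_line):
    """Intersect duplicated regions across all cell lines in a panel.

    regions_by_line maps a cell line label to a list of
    (chromosome, start, end) tuples in arbitrary map units.
    Only segments duplicated in every line are returned.
    """
    per_line = list(regions_by_line.values())
    shared = list(per_line[0])
    for regions in per_line[1:]:
        still_shared = []
        for chrom_a, start_a, end_a in shared:
            for chrom_b, start_b, end_b in regions:
                if chrom_a != chrom_b:
                    continue
                # The overlap of two segments on the same chromosome
                start, end = max(start_a, start_b), min(end_a, end_b)
                if start < end:
                    still_shared.append((chrom_a, start, end))
        shared = still_shared
    return shared


# Hypothetical panel; labels and coordinates are placeholders, not data.
panel = {
    "line_A": [("chr1", 100, 500), ("chr8", 50, 300)],
    "line_B": [("chr1", 200, 600), ("chr8", 400, 700)],
    "line_C": [("chr1", 150, 450)],
}
shared = shared_duplicated_regions(panel)
print(shared)
```

The duplicated segments need not be the same size in each line; only the common overlap is kept, which mirrors the requirement that some part of the duplicated segment be shared amongst all the cancer cells in the screen.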
The cancer cells used for RNA comparison are also generally (but not necessarily) derived from the same type of cancer or the same tissue. Using cells derived from the same type of cancer increases the probability that the gene ultimately identified will be common in that type of cancer, and suitable as a type-specific diagnostic marker. Using cells derived from different types of cancer is in effect a search for cancer-related genes that are less tissue specific and more related to the malignant process in general. Both types of genes are of interest for both diagnostic and therapeutic purposes. In one illustration highlighted in Example 1, RNA was screened from the three breast cancer cell lines BT474, SKBR3, and MCF7, which have been determined by CGH or Southern analysis to share duplicated genetic regions in chromosomes 1, 8, 14, 17, and 20. When the RNA from these cells was displayed, a number of RNAs were found to be overabundant in the cancer cells, but not controls (Figure 1). Three RNAs overabundant in all three cancer cell lines corresponded to cancer-associated genes located on chromosomes 1, 8, and 14 that are listed in Table 1. The chromosome 13 gene (CH13-2a12-1) was overexpressed in 2 of the 3 cell lines, namely BT474 and SKBR3. Southern analysis subsequently established that the chromosome 13 gene was duplicated in the same two cell lines (Example 6, Table 5).
Selection of the source or sources of control cell RNA is also a matter of some refinement. The control RNA can be derived from in vitro cultures of non-malignant cells, or established cell lines derived from a non-malignant source. However, it is preferable for the control RNA to be obtained directly from normal human tissue of the same type as the cancer cells. This is because most normal cells do not proliferate indefinitely, hence adaptation of a cell into a cell line involves a degree of transformation. The transforming event may, in turn, be shared with that of certain cancer cells, at least at the level of RNA abundance. Hence, comparison of the RNA levels in cancer cells with so-called control cell lines may lead the practitioner to miss genes that are related to malignancy. For convenience, control cells may be maintained in culture for a brief period before the experiment, and even stimulated; however, multiple rounds of cell division are to be avoided if possible. Use of both stimulated and unstimulated cells as controls may help provide RNA patterns corresponding to the normal range of abundance within various metabolic events of the cell cycle. In one illustration highlighted in Example 1, RNA was screened using both proliferating and non-proliferating cells. As stated, the screening of breast cancer RNA is preferably conducted using uncultured normal mammary epithelial cells (termed "organoids") as sources of control RNA. These cells may be obtained from surgical samples resected from healthy breast tissue.
The RNA is preserved until use in the comparison experiment in such a way as to minimize fragmentation. To facilitate confirmation experiments, it is useful to use RNA of a reproducible character. For this reason, it is convenient to use RNA that has been obtained from stable cancerous cell lines and/or ready tissue sources, although reproducibility can also be provided by preparing enough RNA so that it can be preserved in aliquots.
For displaying relative overabundance of RNA in the cancer cells, compared with the control cells, many standard techniques are suitable. These would include any form of subtractive hybridization or comparative analysis. Preferred are techniques in which more than two RNA sources are compared at the same time, such as various types of arbitrarily primed PCR fingerprinting techniques (Welsh et al., Yoshikawa et al.). Particularly preferred are differential mRNA display methods and variations thereof, in which the samples are run in neighboring lanes in a separating gel. These techniques are focused towards mRNA by using primers that are specific for the poly-A tail characteristic of mRNA (Liang et al., 1992a; U.S. Patent 5,262,311).
Because many thousands of genes are expressed in the cells of higher organisms at any one time, it is preferable to improve the legibility of the display by surveying only a subset of the RNA at a time. Methods for accomplishing this are known in the art. A preferred method is by using selective primers that initiate PCR replication for a subset of the RNA. Thus, the RNA is first reverse transcribed by standard techniques. Short primers are used for the selection, preferably chosen such that alternative primers used in a series of like assays can complete a comprehensive survey of the mRNA.
In a preferred example, primers can be used for the 3' region of the mRNAs which have an oligo-dT sequence, followed by two other nucleotides (TiNM, where i ≈ 11, N ∈ {A,C,G}, and M ∈ {A,C,G,T}). Thus, 12 possible primers are required to complete the survey. A random or arbitrary primer of minimal length can then be used for replication towards what corresponds in the sequence to the 5' region of the mRNA. The optimal length for the random primer is about 10 nucleotides. The product of the PCR reaction is labeled with a radioisotope, such as 35S. The labeled cDNA is then separated by molecular weight, such as on a polyacrylamide sequencing gel. If desired, variations on the differential display technique may be employed. For example, one-base oligo-dT primers may be used (Liang et al., 1993 & 1994), although this is generally less preferred because the display pattern is correspondingly more complex. Selection of primers may be optimized mathematically depending on the number of RNA species in a tissue of interest (Bauer et al.). The method may be adapted for non-denaturing gels, and for use with automatic DNA sequencers (Bauer et al.). Alternative radioisotopes (Trentmann et al.) or fluorochromes (Sun et al.) may be used for labeling the differential display. Differential display may optionally be combined with a ribonuclease protection assay (Yeatman et al.). PCR primers may optionally incorporate a restriction site to facilitate cloning (Linskens et al., Ayala et al.). Using Taq polymerase from multiple manufacturers can increase the amount of variation under otherwise identical conditions (Haag et al.). Nested PCR primers may be used in differential display to decrease background created by oligo-dT primers (WO 95/33760). Other variants of the differential display technique are known in the art and described inter alia in the references cited in this disclosure. The use of such modifications is within the scope of the present invention, but is not required, as evidenced by the examples described below.
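The count of 12 anchored primers follows from three choices for N multiplied by four choices for M. A short illustrative sketch enumerating the TiNM set is given here. The function name is hypothetical, and the default run of 11 T residues is taken from the approximate value of i stated above.

```python
from itertools import product


def anchored_oligo_dt_primers(t_run=11):
    """Enumerate every primer of the form T(i)NM, written 5' to 3'."""
    n_bases = "ACG"   # N: penultimate base, any nucleotide except T
    m_bases = "ACGT"  # M: final 3' base, any nucleotide
    return ["T" * t_run + n + m for n, m in product(n_bases, m_bases)]


primers = anchored_oligo_dt_primers()
print(len(primers))  # 12
for primer in primers:
    print(primer)
```

Excluding T at position N anchors the primer at the junction between the poly-A tail and the transcript body, so each primer selects a distinct subset of the mRNA, and the 12 assays together complete the survey.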
Based on the comparison of relative abundance of RNA, particular RNAs are chosen which are present as a higher proportion of the RNA in cancerous cells, compared with control cells. When using the differential display method, the cDNA corresponding to overabundant RNA will produce a band with greater proportional intensity amongst neighboring cDNA bands, compared with the proportional intensity in the control lanes. Desired cDNAs can be recovered most directly by cutting the spot in the gel corresponding to the band, and recovering the DNAs therefrom. Recovered cDNA can be replicated again for further use by any technique or combination of techniques known in the art, including PCR and cloning into a suitable carrier.
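The band selection rule can be illustrated with a small sketch, assuming band intensities have already been quantified from the gel (for example by densitometry). Everything here is a hypothetical illustration and not a method stated in this disclosure: the function names, the fold threshold, and the intensity values. As a simplification, each band is expressed as a proportion of its whole lane and not only of its neighboring bands.

```python
def proportional_intensity(lane):
    """Express each band as a fraction of the total signal in its lane."""
    total = sum(lane.values())
    return {band: value / total for band, value in lane.items()}


def overabundant_bands(cancer_lanes, control_lanes, min_fold=2.0):
    """Return bands whose proportional intensity in every cancer lane
    exceeds min_fold times the highest value seen in any control lane.
    min_fold is an assumed, illustrative threshold."""
    cancer = [proportional_intensity(lane) for lane in cancer_lanes]
    control = [proportional_intensity(lane) for lane in control_lanes]
    selected = []
    for band in cancer[0]:
        highest_control = max(lane.get(band, 0.0) for lane in control)
        if all(lane.get(band, 0.0) > min_fold * highest_control for lane in cancer):
            selected.append(band)
    return selected


# Placeholder intensities in arbitrary units, not experimental data.
cancer_lanes = [{"a": 10, "b": 10, "c": 60}, {"a": 12, "b": 8, "c": 50}]
control_lanes = [{"a": 10, "b": 10, "c": 5}, {"a": 9, "b": 11, "c": 4}]
selected = overabundant_bands(cancer_lanes, control_lanes)
print(selected)
```

Requiring the band to stand out in every cancer lane, and in none of the control lanes, reflects the strategy of selecting RNA that is overabundant across the whole panel of cancer cells sharing a duplicated region.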
An optional but highly beneficial additional screening step, typically performed subsequently to an RNA comparison as described above, is aimed at identifying genes that are duplicated in a substantial proportion of cancers. This is conducted by using cDNA such as selected from differential display to probe digests of chromosomal DNA obtained from two or more cancerous cells, such as cancer cell lines. Chromosomal DNA from non-cancerous cells that essentially reflects the germ line in terms of gene copy number is used for the control. A preferred source of control DNA in experiments for human cancer genes is placental DNA, which is readily obtainable. The DNA samples are cleaved at sequence-specific sites along the chromosome, most usually with a suitable restriction enzyme, into fragments of appropriate size. The DNA can be blotted directly onto a suitable medium, or separated on an agarose gel before blotting. The latter method is preferred, because it enables a comparison of the hybridizing chromosomal restriction fragment to determine whether the probe is binding to the same fragment in all samples. The amount of probe binding to DNA digests from each of the cancer cells is compared with the amount binding to control DNA. Because the comparison is quantitative, it is preferable to standardize the measurement internally. One method is to administer a second probe to the same blot, probing for a second chromosomal gene unlikely to be duplicated in the cancer cells. This method is preferred, because it standardizes not only for differences in the amount of DNA provided, but also for differences in the amount transferred during blotting. This can be accomplished by using alternative labels for the two probes, or by stripping the first probe with a suitable eluant before administering the second.
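The internal standardization described above reduces to a ratio of ratios. The sketch below shows the arithmetic under the assumption that the signal for each probe has been quantified from the same blot. The function name and the signal values are hypothetical, and no particular cutoff for calling a duplication is implied.

```python
def relative_copy_number(test_signal, reference_signal,
                         control_test_signal, control_reference_signal):
    """Estimate gene copy number in a cancer sample relative to control DNA.

    test_signal / reference_signal: signal from the candidate cDNA probe
    and from the second (reference) probe in the cancer DNA lane.
    control_*: the same two measurements in the control (e.g. placental) lane.
    Dividing by the reference probe cancels differences in the amount of
    DNA loaded and transferred in each lane.
    """
    sample_ratio = test_signal / reference_signal
    control_ratio = control_test_signal / control_reference_signal
    return sample_ratio / control_ratio


# Placeholder signals in arbitrary units, not experimental data.
fold = relative_copy_number(900, 150, 200, 100)
print(fold)  # 3.0
```

Because both probes are measured in the same lane, loading twice as much DNA doubles both signals and leaves the estimate unchanged.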
To eliminate cDNA for mitochondrial genes, it is preferable to include in a parallel analysis a mitochondrial DNA preparation digested with the same restriction enzyme. Any cDNA probe that hybridizes to the appropriate mitochondrial restriction fragments can be suspected of corresponding to a mitochondrial gene.
In the initial replication of the RNA, the random primer may bind at any location along the RNA sequence. Thus, the copied and replicated segment may be a fragment of the full-length RNA. Longer cDNA corresponding to a greater portion of the sequence can be obtained, if desired, by several techniques known to practitioners of ordinary skill. These include using the cDNA fragment to isolate the corresponding RNA, or to isolate complementary DNA from a cDNA library of the same species. Preferably, the library is derived from the same tissue source, and more preferably from a cancer cell line of the same type. For example, for cDNA corresponding to human breast cancer genes, a preferred library is derived from breast cancer cell line BT474, constructed in lambda gt10.
Sequences of the cDNA can be determined by standard techniques, or by submitting the sample to commercial sequencing services. The chromosomal locations of the genes can be determined by any one of several methods known in the art, such as in situ hybridization using chromosomal smears, or panels of somatic cell hybrids of known chromosomal composition.
The cDNA obtained through the selection process outlined can then be tested against a larger panel of cancer cell lines and/or fresh tumor cells to determine what proportion of the cells have duplicated the gene. This can be accomplished by using the cDNA as a probe for chromosomal DNA digests, as described earlier. As illustrated in the Example section, a preferred method for conducting this determination is Southern analysis.
The cDNA can also be used to determine what proportion of the cells have RNA overabundance. This can be accomplished by standard techniques, such as slot blots or blots of agarose gels, using whole RNA or messenger RNA from each of the cells in the panel. The blots are then probed with the cDNA using standard techniques. It is preferable to provide an internal loading and blotting control for this analysis. A preferred method is to re-probe the same blot for transcripts of a gene likely to be present at about the same level in all cells of the same type, such as the gene for a cytoskeletal protein. Thus, a preferred second probe is the cDNA for beta-actin or 36B4, available from the ATCC.
Using a novel cDNA found by this selection procedure, it is anticipated that essentially all cancer cells showing gene duplication will also show RNA overabundance, but that some will show RNA overabundance without gene duplication.
The practitioner will readily appreciate that the strategies for identifying genes that are duplicated and/or associated with RNA overabundance may be reversed appropriately to screen for genes that are deleted and/or associated with RNA underabundance. The principles are essentially the same. Genes that are frequently down-regulated in cancer (such as tumor suppressor genes) may be down-regulated by different mechanisms in different cells, and a gene with this behavior is more likely to be central to malignant transformation or persistence of the malignant state.
To screen for such down-regulated genes according to the present invention, RNA is prepared from a plurality of tumors or cancer cell lines and the abundance is compared with RNA preparations from control cells. Again, it is highly preferable to use cancer cells that share a deleted gene in the same chromosomal region, in order to focus any differences at the RNA level towards particular alterations in cancer cells and away from normal variations or coincidental changes. The CGH technique may be used to identify deletions in previously uncharacterized cancer cells. As before, cancer cells may be chosen on the basis of previous knowledge of deleted regions; there is no need to conduct methods such as CGH on previously characterized lines. cDNA from the RNA of cancer cells is displayed (preferably by differential display) alongside cDNA copied from (preferably uncultured) control cells, and cDNA is selected that appears to be underrepresented in at least two (preferably more) of the cancer cells compared with the control cells. cDNA thus selected may optionally be further screened against digested DNA preparations, to confirm that the RNA underabundance observed in the cancer cell populations is attributable in at least a proportion of the cells to an actual gene deletion.
As before, the cDNA may be used for sequencing or rescuing additional polynucleotides, in this case not from the cancer cells but from cells containing or expressing the gene at normal levels. Pharmaceuticals based on deleted genes or those associated with underexpressed RNA are typically oriented at restoring or upregulating the gene, or a functional equivalent of the encoded gene product.
The identification of four exemplary cancer associated genes
To identify particular RNA that is overabundant in cancer cells, RNA has been compared between breast cancer cells and control cells. The amount of total cellular RNA was compared using a modified differential display method. Primers were used for the 3' region of the mRNAs which have an oligo-dT sequence, followed by two other nucleotides as described in the previous section. Random or arbitrary primers of about 10 nucleotides were used for replication towards what corresponds in the sequence to the 5' region of the mRNA. The labeled amplification product was then separated by molecular weight on a polyacrylamide sequencing gel.
Particular mRNAs were chosen that were present in a higher proportion of the RNA in cancerous cells, compared with control cells, according to the proportional intensity amongst neighboring cDNA bands. The cDNA was recovered directly from the gel and amplified to provide a probe for screening. Candidate polynucleotides were screened by a number of criteria, including both Northern and Southern analysis to determine if the corresponding genes were duplicated or responsible for RNA overabundance in breast cancer cells. Sequence data of the polynucleotides was obtained and compared with sequences in GenBank. Novel polynucleotides with the desired expression patterns were used to probe for longer cDNA inserts in a λgt10 library constructed from the breast cancer cell line BT474, which were then sequenced.
Further description of the actual experimental events that occurred during identification of the four exemplary genes, and sequence data for CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1, are provided in the Example section.
Preparation of polynucleotides, polypeptides and antibodies
Polynucleotides based on the cDNA of CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1 can be rescued from cloned plasmids and phage provided as part of this invention. They may also be obtained from breast cancer cell libraries or mRNA preparations, or from normal human tissues such as placenta, by judicious use of primers or probes based on the sequence data provided herein. Alternatively, the sequence data provided herein can be used in chemical synthesis to produce a polynucleotide with an identical sequence, or that incorporates occasional variations.
Polypeptides encoded by the corresponding mRNA can be prepared by several different methods, all of which will be known to a practitioner of ordinary skill. For example, the appropriate strand of the full-length cDNA can be operably linked to a suitable promoter, and transfected into a suitable host cell. The host cell is then cultured under conditions that allow transcription and translation to occur, and the polypeptide is subsequently recovered. Another convenient method is to determine the polynucleotide sequence of the cDNA, and predict the polypeptide sequence according to the genetic code. A polypeptide can then be prepared directly, for example, by chemical synthesis, either identical to the predicted sequence, or incorporating occasional variations.
Antibodies against polypeptides of this invention may be prepared by any method known in the art. For stimulating antibody production in an animal, it is often preferable to enhance the immunogenicity of a polypeptide by such techniques as polymerization with glutaraldehyde, or combining with an adjuvant, such as Freund's adjuvant. The immunogen is injected into a suitable experimental animal: preferably a rodent for the preparation of monoclonal antibodies, and preferably a larger animal such as a rabbit or sheep for preparation of polyclonal antibodies. It is preferable to provide a second or booster injection after about 4 weeks, and begin harvesting the antibody source no less than about 1 week later.
Sera harvested from the immunized animals provide a source of polyclonal antibodies. Detailed procedures for purifying specific antibody activity from a source material are known within the art. Unwanted activity cross-reacting with other antigens, if present, can be removed, for example, by running the preparation over adsorbents made of those antigens attached to a solid phase, and collecting the unbound fraction. If desired, the specific antibody activity can be further purified by such techniques as protein A chromatography, ammonium sulfate precipitation, ion exchange chromatography, high-performance liquid chromatography and immunoaffinity chromatography on a column of the immunizing polypeptide coupled to a solid support.
Alternatively, immune cells such as splenocytes can be recovered from the immunized animals and used to prepare a monoclonal antibody-producing cell line. See, for example, Harlow & Lane (1988), U.S. Patent Nos. 4,491,632 (J. R. Wands et al.), U.S. 4,472,500 (C. Milstein et al.), and U.S. 4,444,887 (M. K. Hoffman et al.).
Briefly, an antibody-producing line can be produced inter alia by cell fusion, or by transfecting antibody-producing cells with Epstein Barr Virus, or transforming with oncogenic DNA. The treated cells are cloned and cultured, and clones are selected that produce antibody of the desired specificity. Specificity testing can be performed on culture supernatants by a number of techniques, such as using the immunizing polypeptide as the detecting reagent in a standard immunoassay, or using cells expressing the polypeptide in immunohistochemistry. A supply of monoclonal antibody from the selected clones can be purified from a large volume of tissue culture supernatant, or from the ascites fluid of suitably prepared host animals injected with the clone.
Effective variations of this method include those in which the immunization with the polypeptide is performed on isolated cells. Antibody fragments and other derivatives can be prepared by methods of standard protein chemistry, such as subjecting the antibody to cleavage with a proteolytic enzyme. Genetically engineered variants of the antibody can be produced by obtaining a polynucleotide encoding the antibody, and applying the general methods of molecular biology to introduce mutations and translate the variant.
Use in diagnosis
Novel cDNA sequences corresponding to genes associated with cancer are potentially useful as diagnostic aids. Similarly, polypeptides encoded by such genes, and antibodies specific for these polypeptides, are also potentially useful as diagnostic aids.
More specifically, gene duplication or overabundance of RNA in particular cells can help identify those cells as being cancerous, and thereby play a part in the initial diagnosis. Increased levels of RNA corresponding to CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1 are present in a substantial proportion of breast cancer cell lines and primary breast tumors. In addition, preliminary Northern analysis using probes for CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1 indicates that these genes may be duplicated or be associated with RNA overabundance in certain cell lines derived from cancers other than breast cancer, including colon cancer, lung cancer, prostate cancer, glioma, and ovarian cancer.
For patients already diagnosed with cancer, gene duplication or overabundance of RNA can assist with clinical management and prognosis. For example, overabundance of RNA may be a useful predictor of disease survival, metastasis, susceptibility to various regimens of standard chemotherapy, the stage of the cancer, or its aggressiveness. See generally the article by Blast, U.S. Patent No. 4,968,603 (Slamon et al.), and PCT Application WO 94/00601 (Levine et al.). All of these determinations are important in helping the clinician choose between the available treatment options.
A particularly important diagnostic application contemplated in this invention is the identification of patients suitable for gene-specific therapy, as outlined in the following section. For example, treatment directed against a particular gene or gene product is appropriate in cancers where the gene is duplicated or there is RNA overabundance. Given a particular pharmaceutical that is directed at a particular gene, a diagnostic test specific for the same gene is important in selecting patients likely to benefit from the pharmaceutical. Given a selection of such pharmaceuticals specific for different genes, diagnostic tests for each gene are important in selecting which pharmaceutical is likely to benefit a particular patient.
The polynucleotide, polypeptide, and antibodies embodied in this invention provide specific reagents that can be used in standard diagnostic procedures. The actual procedures for conducting diagnostic tests are extensively known in the art, and are routine for a practitioner of ordinary skill. See, for example, U.S. Patent No. 4,968,603 (Slamon et al.), and PCT Applications WO 94/00601 (Levine et al.) and WO 94/17414 (K. Keyomarsi et al.). What follows is a brief non-limiting survey of some of the known procedures that can be applied.
Generally, to perform a diagnostic method of this invention, one of the compositions of this invention is provided as a reagent to detect a target in a clinical sample with which it reacts. Thus, the polynucleotide of this invention can be used as a reagent to detect a DNA or RNA target, such as might be present in a cell with duplication or RNA overabundance of the corresponding gene. The polypeptide can be used as a reagent to detect a target for which it has a specific binding site, such as an antibody molecule or (if the polypeptide is a receptor) the corresponding ligand. The antibody can be used as a reagent to detect a target it specifically recognizes, such as the polypeptide used as an immunogen to raise it.
The target is supplied by obtaining a suitable tissue sample from an individual for whom the diagnostic parameter is to be measured. Relevant test samples are those obtained from individuals suspected of containing cancerous cells, particularly breast cancer cells. Many types of samples are suitable for this purpose, including those that are obtained near the suspected tumor site by biopsy or surgical dissection, in vitro cultures of cells derived therefrom, blood, and blood components. If desired, the target may be partially purified from the sample or amplified before the assay is conducted. The reaction is performed by contacting the reagent with the sample under conditions that will allow a complex to form between the reagent and the target. The reaction may be performed in solution, or on a solid tissue sample, for example, using histology sections. The formation of the complex is detected by a number of techniques known in the art. For example, the reagent may be supplied with a label and unreacted reagent may be removed from the complex, the amount of remaining label thereby indicating the amount of complex formed. Further details and alternatives for complex detection are provided in the descriptions that follow.
To determine whether the amount of complex formed is representative of cancerous or non-cancerous cells, the assay result is compared with a similar assay conducted on a control sample. It is generally preferable to use a control sample which is from a non-cancerous source, and otherwise similar in composition to the clinical sample being tested. However, any control sample may be suitable provided the relative amount of target in the control is known or can be used for comparative purposes. Where the assay is being conducted on tissue sections, suitable control cells with normal histopathology may surround the cancerous cells being tested. It is often preferable to conduct the assay on the test sample and the control sample simultaneously. However, if the amount of complex formed is quantifiable and sufficiently consistent, it is acceptable to assay the test sample and control sample on different days or in different laboratories.
A polynucleotide embodied in this invention can be used as a reagent for determining gene duplication or RNA overabundance that may be present in a clinical sample. The binding of the reagent polynucleotide to a target in a clinical sample generally relies in part on a hybridization reaction between a region of the polynucleotide reagent, and the DNA or RNA in a sample being tested.
If desired, the nucleic acid may be extracted from the sample, and may also be partially purified. To measure gene duplication, the preparation is preferably enriched for chromosomal DNA; to measure RNA overabundance, the preparation is preferably enriched for RNA. The target polynucleotide can be optionally subjected to any combination of additional treatments, including digestion with restriction endonucleases, size separation (for example by electrophoresis in agarose or polyacrylamide), and affixing to a reaction matrix, such as a blotting material.
Hybridization is allowed to occur by mixing the reagent polynucleotide with a sample suspected of containing a target polynucleotide under appropriate reaction conditions. This may be followed by washing or separation to remove unreacted reagent. Generally, both the target polynucleotide and the reagent must be at least partly equilibrated into the single-stranded form in order for complementary sequences to hybridize efficiently. Thus, it may be useful (particularly in tests for DNA) to prepare the sample by standard denaturation techniques known in the art.
The minimum complementarity between the reagent sequence and the target sequence for a complex to form depends on the conditions under which the complex-forming reaction is allowed to occur. Such conditions include temperature, ionic strength, time of incubation, the presence of additional solutes in the reaction mixture such as formamide, and washing procedure. Higher stringency conditions are those under which higher minimum complementarity is required for stable hybridization to occur. It is generally preferable in diagnostic applications to increase the specificity of the reaction, minimizing cross-reactivity of the reagent polynucleotide with alternative undesired hybridization sites in the sample. Thus, it is preferable to conduct the reaction under conditions of high stringency: for example, in the presence of high temperature, low salt, formamide, a combination of these, or followed by a low-salt wash.
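The way salt, formamide and temperature combine to set stringency can be illustrated with a commonly quoted empirical approximation for the melting temperature of long DNA-DNA hybrids. This formula is not taken from this disclosure, its coefficients differ somewhat between references, and the function names below are hypothetical, so the sketch should be read as an order-of-magnitude guide only.

```python
import math


def approx_tm_celsius(na_molar, gc_percent, formamide_percent, length_bp):
    """Approximate melting temperature of a DNA-DNA hybrid (degrees C).

    Empirical approximation (an assumption here, not from the source text):
    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


def stringency_margin(tm_celsius, reaction_temp_celsius):
    """Degrees below Tm at which the reaction is run.
    A smaller margin corresponds to higher stringency."""
    return tm_celsius - reaction_temp_celsius


# Illustrative comparison: the same probe in high salt versus low salt.
tm_high_salt = approx_tm_celsius(0.9, 50.0, 0.0, 500)
tm_low_salt = approx_tm_celsius(0.015, 50.0, 0.0, 500)
print(round(tm_high_salt, 1), round(tm_low_salt, 1))
```

Lowering the salt concentration or adding formamide lowers the melting temperature, so at a fixed reaction or wash temperature only more closely matched hybrids remain stable, which is the sense in which low salt, formamide and high temperature all raise stringency.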
In order to detect the complexes formed between the reagent and the target, the reagent is generally provided with a label Some of the labels often used in this type of assay include radioisotopes such as 32P and 33P, chemiluminescent or fluorescent reagents such as fluorescein, and enzymes such as alkaline phosphatase that are capable of producing a colored solute or precipitant
The label may be intrinsic to the reagent, it may be attached by direct chemical linkage, or it may be connected through a series of intermediate reactive molecules, such as a biotin-avidin complex, or a series of inter-reactive polynucleotides. The label may be added to the reagent before hybridization with the target polynucleotide, or afterwards. To improve the sensitivity of the assay, it is often desirable to increase the signal ensuing from hybridization. This can be accomplished by replicating either the target polynucleotide or the reagent polynucleotide, such as by a polymerase chain reaction. Alternatively, a combination of serially hybridizing polynucleotides or branched polynucleotides can be used in such a way that multiple label components become incorporated into each complex. See U.S. Patent No. 5,124,246 (Urdea et al.).
An antibody embodied in this invention can also be used as a reagent in cancer diagnosis, or for determining gene duplication or RNA overabundance that may be present in a clinical sample. This relies on the fact that overabundance of RNA in affected cells is often associated with increased production of the corresponding polypeptide. Several of the genes up-regulated in cancer cells encode cell surface receptors: for example, erbB-2, c-myc and epidermal growth factor. Alternatively, the RNA may encode a protein kept inside the cell, or it may encode a protein secreted by the cell into the surrounding milieu.
Any such protein product can be detected in solid tissue samples and cultured cells by immunohistological techniques that will be obvious to a practitioner of ordinary skill. Generally, the tissue is preserved by a combination of techniques which may include cooling, exchanging into different solvents, fixing with agents such as paraformaldehyde, or embedding in a commercially available medium such as paraffin or OCT. A section of the sample is suitably prepared and overlaid with a primary antibody specific for the protein.
The primary antibody may be provided directly with a suitable label. More frequently, the primary antibody is detected using one of a number of developing reagents which are easily produced or available commercially. Typically, these developing reagents are anti-immunoglobulin or protein A, and they typically bear labels which include, but are not limited to, fluorescent markers such as fluorescein, enzymes such as peroxidase that are capable of precipitating a suitable chemical compound, electron-dense markers such as colloidal gold, or radioisotopes such as 125I. The section is then visualized using an appropriate microscopic technique, and the level of labeling is compared between the suspected cancer cell and a control cell, such as cells surrounding the tumor area or those taken from an alternative site.
The amount of protein corresponding to the cancer-associated gene may be detected in a standard quantitative immunoassay. If the protein is secreted or shed from the cell in any appreciable amount, it may be detectable in plasma or serum samples. Alternatively, the target protein may be solubilized or extracted from a solid tissue sample. Before quantitating, the protein may optionally be affixed to a solid phase, such as by a blot technique or using a capture antibody. A number of immunoassay methods are established in the art for performing the quantitation.
For example, the protein may be mixed with a pre-determined non-limiting amount of the reagent antibody specific for the protein. The reagent antibody may contain a directly attached label, such as an enzyme or a radioisotope, or a second labeled reagent may be added, such as anti-immunoglobulin or protein A. For a solid-phase assay, unreacted reagents are removed by washing. For a liquid-phase assay, unreacted reagents are removed by some other separation technique, such as filtration or chromatography. The amount of label captured in the complex is positively related to the amount of target protein present in the test sample. A variation of this technique is a competitive assay, in which the target protein competes with a labeled analog for binding sites on the specific antibody. In this case, the amount of label captured is negatively related to the amount of target protein present in a test sample. Results obtained using any such assay on a sample from a suspected cancer-bearing source are compared with those from a non-cancerous source.
A polypeptide embodied in this invention can also be used as a reagent in cancer diagnosis, or for determining gene duplication or RNA overabundance that may be present in a clinical sample. Overabundance of RNA in affected cells may result in the corresponding polypeptide being produced by the cells in an abnormal amount. On occasion, overabundance of RNA may occur concurrently with expression of the polypeptide in an unusual form. This in turn may result in stimulation of the immune response of the host to produce its own antibody molecules that are specific for the polypeptide. Thus, a number of human hybridomas have been raised from cancer patients that produce antibodies against their own tumor antigens.
To use the polypeptide in the detection of such antibodies in a subject suspected of having cancer, an immunoassay is conducted. Suitable methods are generally the same as the immunoassays outlined in the preceding paragraphs, except that the polypeptide is provided as a reagent, and the antibody is the target in the clinical sample which is to be quantified. For example, human IgG antibody molecules present in a serum sample may be captured with solid-phase protein A, and then overlaid with the labeled polypeptide reagent. The amount of antibody would then be proportional to the label attached to the solid phase. Alternatively, cells or tissue sections expressing the polypeptide may be overlaid first with the test sample containing the antibody, and then with a detecting reagent such as labeled anti-immunoglobulin. The amount of antibody would then be proportional to the label attached to the cells. The amount of antibody detected in the sample from a suspected cancerous source would be compared with the amount detected in a control sample.
These diagnostic procedures may be performed by diagnostic laboratories, experimental laboratories, practitioners, or private individuals. This invention provides diagnostic kits which can be used in these settings. The presence of cancer cells in the individual may be manifest in a clinical sample obtained from that individual as an alteration in the DNA, RNA, protein, or antibodies contained in the sample. An alteration in one of these components resulting from the presence of cancer may take the form of an increase or decrease of the level of the component, or an alteration in the form of the component, compared with that in a sample from a healthy individual. The clinical sample is optionally pre-treated for enrichment of the target being tested for. The user then applies a reagent contained in the kit in order to detect the changed level or alteration in the diagnostic component.
Each kit necessarily comprises the reagent which renders the procedure specific: a reagent polynucleotide, used for detecting target DNA or RNA; a reagent antibody, used for detecting target protein; or a reagent polypeptide, used for detecting target antibody that may be present in a sample to be analyzed. The reagent is supplied in a solid form or liquid buffer that is suitable for inventory storage, and later for exchange or addition into the reaction medium when the test is performed. Suitable packaging is provided. The kit may optionally provide additional components that are useful in the procedure. These optional components include buffers, capture reagents, developing reagents, labels, reacting surfaces, means for detection, control samples, instructions, and interpretive information.
Use in pharmaceutical development
Embodied in this invention are modes of treating subjects bearing cancer cells that have overabundance of the particular RNA described. The strategy used to obtain the cDNAs provided in this invention was deliberately focused on genes that achieve RNA overabundance by gene duplication in some cells, and by alternative mechanisms in other cells. These alternative mechanisms may include, for example, translocation or enhancement of transcription enhancing elements near the coding region of the gene, deletion of repressor binding sites, or altered production of gene regulators. Such mechanisms would result in more RNA being transcribed from the same gene. Alternatively, the same amount of RNA may be transcribed, but may persist longer in the cell, resulting in greater abundance. This could occur, for example, by reduction in the level of ribozymes or protein enzymes that degrade RNA, or by modification of the RNA to render it more resistant to such enzymes or to spontaneous degradation.
Thus, different cells make use of at least two different mechanisms to achieve a single result: the overabundance of a particular RNA. This suggests that RNA overabundance of these genes is central to the cancer process in the affected cells. Interfering with the specific gene or gene product would consequently modify the cancer process. It is an objective of this invention to provide pharmaceutical compositions that enable therapy of this kind.
One way this invention achieves this objective is through screening candidate drugs. The general screening strategy is to apply the candidate to a manifestation of a gene associated with cancer, and then determine whether the effect is beneficial and specific. For example, a composition that interferes with a polynucleotide or polypeptide corresponding to any of the novel cancer-associated genes described herein has the potential to block the associated pathology when administered to a tumor of the appropriate phenotype. It is not necessary that the mechanism of interference be known, only that the interference be preferential for cancerous cells (or cells near the cancer site) but not other cells.
A preferred method of screening is to provide cells in which a polynucleotide related to a cancer gene has been transfected. See, for example, PCT application WO 93/08701. A practitioner of ordinary skill will be well acquainted with techniques for transfecting eukaryotic cells, including the preparation of a suitable vector, such as a viral vector; conveying the vector into the cell, such as by electroporation; and selecting cells that have been transformed, such as by using a reporter or drug sensitivity element.
A cell line is chosen which has a phenotype desirable in testing, and which can be maintained well in culture. The cell line is transfected with a polynucleotide corresponding to one of the cancer-associated genes identified herein. Transfection is performed such that the polynucleotide is operably linked to a genetic controlling element that permits the correct strand of the polynucleotide to be transcribed within the cell. Successful transfection can be determined by the increased abundance of the RNA compared with an untransfected cell. It is not necessary that the cell previously be devoid of the RNA, only that the transfection result in a substantial increase in the level observed. RNA abundance in the cell is measured using the same polynucleotide, according to the hybridization assays outlined earlier.
Drug screening is performed by adding each candidate to a sample of transfected cells, and monitoring the effect. The experiment includes a parallel sample which does not receive the candidate drug. The treated and untreated cells are then compared by any suitable phenotypic criteria, including but not limited to microscopic analysis, viability testing, ability to replicate, histological examination, the level of a particular RNA or polypeptide associated with the cells, the level of enzymatic activity expressed by the cells or cell lysates, and the ability of the cells to interact with other cells or compounds. Differences between treated and untreated cells indicate effects attributable to the candidate. In a preferred method, the effect of the drug on the cell transfected with the polynucleotide is also compared with the effect on a control cell. Suitable control cells include untransfected cells of similar ancestry, cells transfected with an alternative polynucleotide, or cells transfected with the same polynucleotide in an inoperative fashion. Optimally, the drug has a greater effect on operably transfected cells than on control cells.
Desirable effects of a candidate drug include an effect on any phenotype that was conferred by transfection of the cell line with the polynucleotide from the cancer-associated gene, or an effect that could limit a pathological feature of the gene in a cancerous cell. Examples of the first type would be a drug that limits the overabundance of RNA in the transfected cell, limits production of the encoded protein, or limits the functional effect of the protein. The effect of the drug would be apparent when comparing results between treated and untreated cells. An example of the second type would be a drug that makes use of the transfected gene or a gene product to specifically poison the cell. The effect of the drug would be apparent when comparing results between operably transfected cells and control cells.
Use in treatment
This invention also provides gene-specific pharmaceuticals in which each of the polynucleotides, polypeptides, and antibodies embodied herein is used as a specific active ingredient in pharmaceutical compositions. Such compositions may decrease the pathology of cancer cells on their own, or render the cancer cells more susceptible to treatment by non-specific agents, such as classical chemotherapy or radiation.
An example of how polynucleotides embodied in this invention can be effectively used in treatment is gene therapy. See, for example, Morgan et al., Culver et al., and U.S. Patent No. 5,399,346 (French et al.). The general principle is to introduce the polynucleotide into a cancer cell in a patient, and allow it to interfere with the expression of the corresponding gene, such as by complexing with the gene itself or with the RNA transcribed from the gene. Entry into the cell is facilitated by suitable techniques known in the art, such as providing the polynucleotide in the form of a suitable vector, or encapsulation of the polynucleotide in a liposome. The polynucleotide may be provided to the cancer site by an antigen-specific homing mechanism, or by direct injection.
A preferred mode of gene therapy is to provide the polynucleotide in such a way that it will replicate inside the cell, enhancing and prolonging the interference effect. Thus, the polynucleotide is operably linked to a suitable promoter, such as the natural promoter of the corresponding gene, a heterologous promoter that is intrinsically active in cancer cells, or a heterologous promoter that can be induced by a suitable agent. Preferably, the construct is designed so that the polynucleotide sequence operably linked to the promoter is complementary to the sequence of the corresponding gene. Thus, once integrated into the cellular genome, the transcript of the administered polynucleotide will be complementary to the transcript of the gene, and capable of hybridizing with it. This approach is known as anti-sense therapy. See, for example, Culver et al. and Roth.
The use of antibodies embodied in this invention in the treatment of cancer partly relies on the fact that genes that show RNA overabundance in cancer frequently encode cell-surface proteins. Location of these proteins at the cell surface may correspond to an important biological function of the cancer cell, such as their interaction with other cells, the modulation of other cell-surface proteins, or triggering by an incoming cytokine.
These mechanisms suggest a variety of ways in which a specific antibody may be effective in decreasing the pathology of a cancer cell. For example, if the gene encodes a growth receptor, then an antibody that blocks the ligand binding site or causes endocytosis of the receptor would decrease the ability of the receptor to provide its signal to the cell. It is unnecessary to have knowledge of the mechanism beforehand; the effectiveness of a particular antibody can be predicted empirically by testing with cultured cancer cells expressing the corresponding protein. Monoclonal antibodies may be more effective in this form of cancer therapy if several different clones directed at different determinants of the same cancer-associated gene product are used in combination; see PCT application WO 94/00136 (Kasprzyk et al.). Such antibody treatment may directly decrease the pathology of the cancer cells, or render them more susceptible to non-specific cytotoxic agents such as platinum (Lippman).
Another example of how antibodies can be used in cancer therapy is in the specific targeting of effector components. The protein product of the cancer-associated gene is expected to appear in high frequency on cancer cells compared to unaffected cells, due to the overabundance of the corresponding RNA. The protein therefore provides a marker for cancer cells that a specific antibody can bind to. An effector component attached to the antibody therefore becomes concentrated near the cancer cells, improving the effect on those cells and decreasing the effect on non-cancer cells. This concentration would generally occur not only near the primary tumor, but also near cancer cells that have metastasized to other tissue sites. Furthermore, if the antibody is able to induce endocytosis, this will enhance entry of the effector into the cell interior.
For the purpose of targeting, an antibody specific for the protein of the cancer-associated gene is conjugated with a suitable effector component, preferably by a covalent or high-affinity bond. Suitable effector components in such compositions include radionuclides such as 131I, toxic chemicals such as vincristine, and toxic peptides such as diphtheria toxin. Other suitable effector components include peptides or polynucleotides capable of altering the phenotype of the cell in a desirable fashion: for example, installing a tumor suppressor gene, or rendering the cells susceptible to immune attack.
In most applications of antibody molecules in human therapy, it is preferable to use human monoclonals, or antibodies that have been humanized by techniques known in the art. This helps prevent the antibody molecules themselves from becoming a target of the host's immune system.
An example of how polypeptides embodied in this invention can be effectively used in treatment is through vaccination. The growth of cancer cells is naturally limited in part due to immune surveillance. This refers to the recognition of cancer cells by immune recognition units, particularly antibodies and T cells, and the consequent triggering of immune effector functions that limit tumor progression. Stimulation of the immune system using a particular tumor-specific antigen enhances the effect towards the tumor expressing the antigen. Thus, an active vaccine comprising a polypeptide encoded by the cDNA of this invention would be appropriately administered to subjects having overabundance of the corresponding RNA. There may also be a prophylactic role for the vaccine in a population predisposed for developing cancer cells with overabundance of the same RNA.
Ways of increasing the effectiveness of cancer vaccines are known in the art (Beardsley, MacLean et al.). For example, synthetic antigens are conjugated to a carrier like keyhole limpet hemocyanin (KLH), and then combined with an adjuvant such as DETOX™, a mixture of mycobacterial cell walls and lipid A. Any polypeptide encoded by the four novel genes described in this invention can be used in analogous compositions.
Methods for preparing and administering polypeptide vaccines are known in the art. Peptides may be capable of eliciting an immune response on their own, or they may be rendered more immunogenic by chemical manipulation, such as cross-linking or attaching to a protein carrier like KLH. Preferably, the vaccine also comprises an adjuvant, such as alum, muramyl dipeptides, liposomes, or DETOX™. The vaccine may optionally comprise auxiliary substances such as wetting agents, emulsifying agents, and organic or inorganic salts or acids. It also comprises a pharmaceutically acceptable excipient which is compatible with the active ingredient and appropriate for the route of administration. The desired dose for peptide vaccines is generally from 10 μg to 1 mg, with a broad effective latitude. The vaccine is preferably administered first as a priming dose, and then again as a boosting dose, usually at least four weeks later. Further boosting doses may be given to enhance the effect. The dose and its timing are usually determined by the person responsible for the treatment.
Sequence data and deposits
The foregoing detailed description provides, inter alia, a detailed explanation of how genes associated with cancer can be identified and their cDNA obtained. Polynucleotide sequences for CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1 are provided. The sequence data listed in this application were obtained by two-directional sequencing, except where indicated otherwise. The data are believed to be accurate; nevertheless, it is readily appreciated that the techniques of the art as used herein have the potential of introducing occasional and infrequent sequence errors. Clones and inserts obtained via PCR may also comprise occasional errors introduced during amplification. Nucleotide sequences predicted from database compilations, and sequence data obtained by one-directional sequencing, may also contain occasional errors in accordance with the limitations of the underlying techniques. In addition, allelic variations to both nucleotide and amino acid sequences may occur naturally or be deliberately induced. Differences of any of these types between the sequences provided herein and the invention as practiced may be present without departing from the spirit of the invention.
Sequence data for CH8-2a13-1 and CH13-2a12-1 cDNA are believed to comprise the entire translated coding sequence, and 5' and 3' untranslated regions corresponding to those found in typical mRNA transcripts. Multiple mRNA transcripts may be found depending on the patterns of transcript processing in various cell types of interest. Sequence data for CH1-9a11-2 and CH14-2a16-1 cDNA comprise a portion of the coding sequence and 3' untranslated regions. Additional sequence is typically present in the corresponding mRNA transcripts, comprising an additional coding region in the N-terminal direction of the protein, and possibly a 5' untranslated region.
Certain embodiments of this invention may be practiced by polynucleotide synthesis according to the data provided herein, by rescuing an appropriate insert corresponding to the gene of interest from one of the deposits listed below, or by isolating a corresponding polynucleotide from a suitable tissue source. Various useful probes and primers for use in polynucleotide isolation are provided herein, or may be designed from the sequence data.
Three deposits were made on May 31, 1996 with the American Type Culture Collection (ATCC), 12301 Parklawn Drive, Rockville, Maryland 20852, under the terms of the Budapest Treaty. The deposits are outlined in Table 2.
[Table 2: ATCC deposits (presented as an image in the original publication)]
Sequence databases contain sequences of polynucleotide and polypeptide fragments with various degrees of identity and overlap with certain embodiments of this invention. The following list of accession numbers is provided for the interest of the reader; it is not intended to be comprehensive or a limitation on the invention. The database disclosures do not typically indicate use in cancer diagnosis, drug development, or disease treatment.
The following GenBank accession numbers are listed in relation to CH1-9a11-2: dbEST N32686; N45113; N36176; N22982; AA278830; H88670; AA235936; AA236951; H26301; N28026; H88063; H88064; D61948; H88718; H26460; AA137920; AA145308; W12952; AA200687; N44164; T27279; dbSTS G22044; G04961.
The following GenBank accession numbers are listed in relation to CH8-2a13-1: dbNR D83780.
The following GenBank accession numbers are listed in relation to CH13-2a12-1: dbNR U58090; dbEST AA182441, AA253924, AA179755, AA112715, AA112640, W67977, AA150317, W68080, AA150243, AA100446, W69636, H46574, AA245889, AA100651, H77368, AA192778, T85671, N32682, T86257, T78239, T77874, AA187866, Z33557, R40816, N99802, R19302, AA100650, N55904, AA257151, H77369, T79014.
The following GenBank accession numbers are listed in relation to CH14-2a16-1: dbEST N64802, W56903, N31400, W95674, AA233551, AA233636, N24105, W03447, W25821, AA233666, AA233647, N67843, D55778, T66839, N55370, N75650, AA280736, H97110, Z19643, H91250, AA230765, R93089, T84665, W94857, R92873.
The examples presented below are provided as a further guide to a practitioner of ordinary skill in the art, and are not meant to be limiting in any way.
EXAMPLES
Example 1: Selecting cDNA for messenger RNA that is overabundant in breast cancer cells
Total RNA was isolated from each breast cancer cell line or control cell by centrifugation through a gradient of guanidine isothiocyanate/CsCl. The RNA was treated with RNase-free DNase (Promega, Madison, WI). After extraction with phenol-chloroform, the RNA preparations were stored at -70°C. Oligo-dT polynucleotides for priming at the 3' end of messenger RNA, with the sequence T11NM (where N ∈ {A,C,G} and M ∈ {A,C,G,T}), were synthesized according to standard protocols. Arbitrary decamer polynucleotides (OPA01 to OPA20) for priming towards the 5' end were purchased from Operon Biotechnology, Inc., Alameda, CA.
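As an illustration of the anchored oligo-dT primer set described above, the following minimal sketch enumerates every primer of the form T11NM. The run of eleven T residues is taken from the primer names in Table 3 (T11AC, T11CC); the function name and its parameters are hypothetical and are not part of the original protocol.

```python
from itertools import product


def anchored_oligo_dt_primers(t_length=11, n_bases="ACG", m_bases="ACGT"):
    """Return all primers T(t_length)-N-M, written 5' to 3'.

    N is restricted to A, C or G so that the primer anchors at the
    start of the poly(A) tail; M may be any of the four bases.
    """
    return ["T" * t_length + n + m for n, m in product(n_bases, m_bases)]


primers = anchored_oligo_dt_primers()
print(len(primers))      # number of distinct anchored primers
print(primers)
```

Three choices for N and four for M give twelve distinct anchored primers, each 13 nucleotides long; T11AC and T11CC are two members of this set.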
The RNA was reverse-transcribed using AMV reverse transcriptase (obtained from BRL) and an anchored oligo-dT primer in a volume of 20 μL, according to the manufacturer's directions. The reaction was incubated at 37°C for 60 min and stopped by incubating at 95°C for 5 min. The cDNA obtained was used immediately or stored frozen at -70°C.
Differential display was conducted according to the following procedure. 1 μL cDNA was replicated in a total volume of 10 μL PCR mixture containing the appropriate T11NM sequence, 0.5 μM of a decamer primer, 200 μM dNTP, 5 μCi [35S]-dATP (Amersham), Taq polymerase buffer with 2.5 mM MgCl2, and 0.3 unit Taq polymerase (Promega). Forty cycles were conducted in the following sequence: 94°C for 30 sec, 40°C for 2 min, 72°C for 30 sec. The sample was then incubated at 72°C for 5 min. The replicated cDNA was separated on a 6% polyacrylamide sequencing gel. After electrophoresis, the gel was dried and exposed to X-ray film.
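For planning purposes, the programmed hold time of the thermal cycling profile above can be worked out with a short calculation. This is only a sketch: the function and dictionary key names are hypothetical, and ramp times between temperatures, which depend on the instrument, are deliberately ignored.

```python
def total_hold_time_seconds(cycles, step_seconds, final_extension_seconds):
    """Sum the programmed hold times of a PCR profile (ramp times excluded)."""
    return cycles * sum(step_seconds) + final_extension_seconds


# Profile from the text: 40 cycles of 94 C / 30 sec, 40 C / 2 min, 72 C / 30 sec,
# followed by a single 5 min incubation at 72 C.
steps = {"denature_94C": 30, "anneal_40C": 120, "extend_72C": 30}
total = total_hold_time_seconds(40, steps.values(), 5 * 60)
print(total, "seconds =", total / 60, "minutes")
```

Each cycle holds for 180 seconds, so forty cycles plus the final incubation amount to 7500 seconds, or 125 minutes, before any ramping overhead.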
The autoradiogram was analyzed for labeled cDNA that was present in larger relative amount in all of the lanes corresponding to breast cancer cells, compared with all of the lanes corresponding to control cells. Figure 1 provides an example of an autoradiogram from such an experiment. Lane 1 is from non-proliferating normal breast cells, lane 2 is from proliferating normal breast cells, and lanes 3 to 5 are from breast cancer cell lines BT474, SKBR3, and MCF7. The left and right sides show the patterns obtained from experiments using the same T11NM sequence (T11AC), but two different decamer primers. The arrows indicate the cDNA fragments that were more abundant in all three tumor lines compared with controls.
The assay illustrated in Figure 1 was conducted using different combinations of oligo-dT primers and decamer primers. A number of differentially expressed bands were detected when different primer combinations were used. However, not all differences seen initially were reproducible after re-screening. We therefore routinely repeated each differential display for each primer combination. Only bands showing RNA overabundance in at least 2 experiments were selected for further analysis.
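The reproducibility rule above (a band is kept only if it shows RNA overabundance in at least 2 experiments) can be sketched as a simple counting filter. The band identifiers and the data layout, one set of overabundant bands per repeat experiment, are hypothetical and chosen only for illustration.

```python
from collections import Counter


def reproducible_bands(experiments, min_experiments=2):
    """Return bands scored as overabundant in at least `min_experiments` repeats.

    `experiments` is a list with one collection of band identifiers per
    repeat of the same primer combination.
    """
    counts = Counter(band for bands in experiments for band in set(bands))
    return sorted(band for band, n in counts.items() if n >= min_experiments)


# Hypothetical scoring of three repeats of one primer combination.
repeats = [{"band_a", "band_b"}, {"band_b", "band_c"}, {"band_a", "band_b"}]
selected = reproducible_bands(repeats)
print(selected)
```

In this made-up example, band_c appears in only one repeat and is dropped, while band_a and band_b pass the filter.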
It is preferable to include in the differential display experiment RNA derived from uncultured normal mammary epithelial cells (termed "organoids"). These cells are obtained from surgical samples resected from healthy breast tissue, which are then coaxed apart by blunt dissection techniques and mild enzyme treatment. Using organoids as the negative control, 33 cDNA fragments were isolated from 15 displays.
Example 2: Sub-selecting cDNA that corresponds to genes that are duplicated in breast cancer cells
cDNA fragments that were differentially expressed in the fashion described in Example 1 were excised from the dried gel and extracted by boiling at 95°C for 10 min. Eluted cDNA was recovered by ethanol precipitation, and replicated by PCR. The product was cloned into the pCRII vector using the TA cloning system (Invitrogen).
EcoRI digested placenta DNA, and EcoRI digested DNA from the breast cancer cell lines BT474, SKBR3 and ZR-75-30, were used to prepare Southern blots to screen the cloned cDNA fragments. The cloned cDNA fragments were labeled with [32P]-dCTP, and used individually to probe the blots. A larger relative amount of binding of the probe to the lanes corresponding to the cancer cell DNA indicated that the corresponding gene had been duplicated in the cancer cells. The labeled cDNA probes were also used in Northern blots to verify that the corresponding RNA was overabundant in the appropriate cell lines.
To determine whether the cDNA fragments obtained by this selection procedure corresponded to novel genes, a partial nucleotide sequence was obtained using M13 primers. Each sequence was compared with the known sequences in GenBank. In initial experiments, 5 of the first 7 genes sequenced were mitochondrial genes. To avoid repeated isolation of mitochondrial genes, subsequent screening experiments were done with additional lanes in the DNA blot analysis for EcoRI digested and HindIII digested mitochondrial DNA. Any cDNA fragment that hybridized to the appropriate mitochondrial restriction fragments was suspected of corresponding to a mitochondrial gene, and was not analyzed further.
From the 33 cDNA fragments detected from differential displays using organoid mRNA, 12 were subcloned. Of these 12, 6 detected suitable gene duplications in the appropriate cell lines. Three cDNA failed to detect duplicated genes, and 3 appeared to correspond to mitochondrial genes. Sequence analysis of the 6 suitable cDNA fragments showed no identity to any known genes.
To obtain longer cDNA corresponding to the cDNA fragments with novel sequences, the fragments were used as probes to screen a cDNA library from breast cancer cell line BT474, constructed in lambda GT10. The longer cDNA obtained from lambda GT10 were sequenced using lambda GT10 primers. The chromosomal locations of the cDNAs were determined using panels of somatic cell hybrids.
Four of the 6 novel cDNA identified so far have been processed in this fashion. The probes used to obtain the 4 new breast cancer genes are shown in Table 3.
TABLE 3: Primers used for Differential Display
cDNA          Oligo-dT primer          Arbitrary primer
CH1-9a11-2    T11CC (SEQ ID NO:9)      SEQ ID NO:11
CH8-2a13-1    T11AC (SEQ ID NO:10)     SEQ ID NO:12
CH13-2a12-1   T11AC (SEQ ID NO:10)     SEQ ID NO:13
CH14-2a16-1   T11AC (SEQ ID NO:10)     SEQ ID NO:14
Example 3: Using the cDNA to test panels of breast cancer cells
To determine the proportion of breast cancers in which the putative breast cancer genes were duplicated, or showed RNA overabundance without gene duplication, the four cDNA obtained according to the selection procedures described were used to probe a panel of breast cancer cell lines and primary tumors. Gene duplication was detected either by Southern analysis or slot-blot analysis. For Southern analysis, 10 μg of EcoRI digested genomic DNA from different cell lines was electrophoresed on 0.8% agarose and transferred to a HYBOND™ N+ membrane (Amersham). The filters were hybridized with 32P-labeled cDNA for the putative breast cancer gene. After an autoradiogram was obtained, the probe was stripped and the blot was re-probed using a reference probe to adjust for differences in sample loading. Either chromosome 2 probe D2S5 or chromosome 21 probe D21S6 was used as a reference. Densities of the signals on the autoradiograms were obtained using a densitometer (Molecular Dynamics). The density ratio between the breast cancer gene and the reference gene was calculated for each sample. Two samples of placental DNA digests were run in each Southern analysis as a control. For slot-blot analysis, 1 μg of genomic DNA was denatured and slotted on the HYBOND™ membrane. D21S5 or human repetitive sequences were used as reference probes for slot blots. The density ratio between the breast cancer gene and the reference gene was calculated for each sample.
10-15 samples of placental DNA digests were used as controls. Amongst the control samples, the highest density ratio was set at 1.0. The density ratios of the tumor cell lines were standardized accordingly. An arbitrary cut-off for the standardized ratio (typically 1.3) was defined to identify samples in which the putative gene had been duplicated. Each of the cell lines in the breast cancer panel was scored positively or negatively for duplication of the gene being tested.
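The standardization and scoring just described can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the sample names, and the density values are hypothetical, and treating a ratio strictly above the cut-off as positive is an assumption, since the text does not say how a ratio exactly at the cut-off is scored.

```python
def standardize_and_score(control_ratios, sample_ratios, cutoff=1.3):
    """Scale density ratios so the highest control equals 1.0, then score each sample.

    control_ratios: density ratios (candidate gene / reference probe) for placental controls.
    sample_ratios:  dict mapping a sample name to its density ratio.
    cutoff:         standardized ratio above which a sample is scored "+" (assumed strict).
    """
    highest_control = max(control_ratios)
    if highest_control <= 0:
        raise ValueError("control density ratios must be positive")
    scored = {}
    for name, ratio in sample_ratios.items():
        standardized = ratio / highest_control
        scored[name] = (standardized, "+" if standardized > cutoff else "-")
    return scored


# Hypothetical densitometry readings, for illustration only.
controls = [1.6, 2.0, 1.8]
samples = {"line_A": 5.0, "line_B": 2.2}
scores = standardize_and_score(controls, samples)
```

Because the highest control is used as the denominator, a sample is only scored positive when it exceeds every control by the cut-off margin, which makes the call conservative.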
Some of the cell lines in the panel were known to have duplicated chromosomal regions from comparative genomic hybridization analysis. In instances where the cDNA being used as probe mapped to the known amplified region, the cDNA indicated that the corresponding gene had also been duplicated. However, duplicated genes were also detected using each of the four cDNAs in instances where comparative genomic hybridization had not revealed any amplification.
Because of the nature of the technique, the standardized ratio calculated as described underestimates the gene copy number, although it is expected to rank in the same order. For example, the standardized ratio obtained for the c-myc gene in the SKBR3 breast cancer cell was 5.0. However, it is known that SKBR3 has approximately 50 copies of the c-myc gene.
To test for overabundance of RNA, 10 μg of total RNA from breast cancer cell lines or primary breast cancer tumors were electrophoresed on 0.8% agarose in the presence of the denaturant formamide, and then transferred to a nylon membrane. The membrane was probed first with 32P-labeled cDNA corresponding to the putative breast cancer gene, then stripped and reprobed with 32P-labeled cDNA for the beta-actin gene (or more recently with a probe for 36B4) to adjust for differences in sample loading. Ratios of densities between the candidate gene and the reference gene were calculated. RNA from three different cultured normal epithelial cells were included in the analysis as a control for the normal level of gene expression. The highest ratio obtained from the normal cell samples was set at 1.0, and the ratios in the various tumor cells were standardized accordingly.
Example 4: Chromosome 1 gene CH1-9a11-2
One of the cDNA obtained through the selection procedures of Examples 1 and 2 corresponded to a gene that mapped to Chromosome 1.
Table 4 summarizes the results of the analysis for gene duplication and RNA overabundance. Both quantitative and qualitative assessment is shown. The numbers shown were obtained by comparing the autoradiograph intensity of the hybridizing band in each sample with that of the controls.
Several control samples were used for the gene duplication experiments, consisting of different preparations of placental DNA. The control sample with the highest level of intensity was used for standardizing the other values. Other sources used for this analysis were breast cancer cell lines with the designations shown. For reasons stated in Example 3, the quantitative number is not a direct indication of the gene copy number, although it is expected to rank in the same order. Similarly, up to 6 control samples were used for the RNA overabundance experiments, consisting of different preparations of breast cell organoids which had been maintained briefly in tissue culture until the experiment was performed. The control sample with the highest level of intensity was used for standardizing the other values. Each cell line was scored + or - according to an arbitrary cut-off value.
Table 4 (image imgf000055_0001 in the original publication)
+ Gene duplication or RNA overabundance; - no duplication or overabundance; nd = not done
* Degree of gene duplication is reported relative to placental DNA preparations.
** Degree of RNA overabundance is reported relative to the highest level observed for several cultures of normal epithelial cells. Two hybridizing species of RNA are calculated and reported separately.
The gene corresponding to the CH1-9a11-2 cDNA was duplicated in 9 out of 15 (60%) of the breast cancer cell lines tested, compared with placental DNA digests (P3 and P12). The sequence of the 115 bases from the 5' end of the cDNA fragment (SEQ ID NO: 1) is shown in Figure 22. There was no substantial homology to any known gene in GenBank. One of the three possible reading frames was found to be open, with the predicted amino acid sequence shown in Figure 22 (SEQ ID NO: 2).
The CH1-9a11-2 gene was further characterized by obtaining additional sequence information. A λ-GT10 cDNA library from the breast cancer cell line BT474 (Example 2) was screened using the initial cDNA insert, and a clone with a 2.5 kilobase insert was identified. The identified clone was subcloned into plasmid vector pCRII. T7 and Sp6 primers for regions flanking the cDNA inserts were used as initial sequencing primers:
T7 primer (SEQ ID NO: 42)
5'-TAATACGACTCACTATAGGGAGA-3'
Sp6 primer (SEQ ID NO: 43)
5'-CATACGATTTAGGTGACACTATAG-3'
Sequencing continued by walking along the region of interest by standard techniques, using sequencing primers based on data already obtained. Primers used in sequencing are designated 1-16 in Figure 7.
A second clone (designated pCH1-1.1) overlapping on the 5' end was obtained using CLONTECH Marathon™ cDNA Amplification Kit. A map showing the overlapping regions is provided in Figure 6. Briefly, two DNA primers designated CH1a and CH1b (Figure 7) were synthesized. Polyadenylated RNA from breast cancer cell line 600PE was reverse transcribed using CH1b primer. After second strand synthesis, adaptor DNA provided in the kit was ligated to the double-stranded cDNA. The 5' end cDNA of CH1-9a11-2 was then amplified by PCR using primers CH1a and AP1 (provided in the kit). To increase the specificity of the PCR products, the first PCR products were PCR reamplified using nested primers CH1a and AP2 (provided in the kit). The PCR products were cloned into pCRII vector (Invitrogen) and screened with CH1-9a11-2 probe.
The sequence of 3452 base pairs between the 5' end of pCH1-1.1 and the poly-A tail of CH1-9a11-2 was determined by standard sequencing techniques. The DNA sequence is shown in Figure 8 (SEQ ID NO: 15). The longest open reading frame is in frame 1 (bases 1-1875), and codes for 624 amino acids before the stop codon. The corresponding amino acid sequence of this frame is shown in the upper panel of Figure 9 (SEQ ID NO: 16). The partial sequence predicted for the translated protein is listed in the lower panel of Figure 9 (SEQ ID NO: 17). Bases 1876 to the end of the sequence are believed to be a 3' untranslated region. A hydrophobicity analysis identified a putative membrane insertion or membrane spanning region at about amino acids 382-400, indicated in Figure 9 by underlining.
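Identifying the longest open reading frame among the three forward frames, as described above, can be illustrated with a short scan for the longest stop-free run of codons. This is a sketch rather than the procedure actually used: the function name and the toy sequence are hypothetical, and no initiating ATG is required, on the assumption that a partial cDNA may be open all the way to its 5' end.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_open_frame(seq):
    """Return (frame, first_base, last_base, n_codons) for the longest stop-free run.

    frame is 1, 2 or 3; base positions are 1-based and inclusive, and the
    terminating stop codon is NOT included in the reported range.
    """
    seq = seq.upper()
    best = (0, 0, 0, 0)
    for frame in range(3):
        run_start = frame
        pos = frame
        while pos + 3 <= len(seq):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - run_start) // 3
                if n_codons > best[3]:
                    best = (frame + 1, run_start + 1, pos, n_codons)
                run_start = pos + 3  # next run begins after the stop codon
            pos += 3
        # Handle a run that stays open to the 3' end of the sequence.
        n_codons = (pos - run_start) // 3
        if n_codons > best[3]:
            best = (frame + 1, run_start + 1, pos, n_codons)
    return best


# Toy sequence for illustration only (not taken from any figure).
toy = "ATGCTAACTGACTAA"
result = longest_open_frame(toy)
```

For the toy sequence the scan reports frame 1 with four codons ending just before the TAA stop. Applied to a sequence laid out as the text describes for SEQ ID NO: 15, the same convention would report 624 codons, with the stop codon occupying the last three bases of the 1-1875 range.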
Figure 23 is a listing of additional cDNA sequence obtained for CH1-9a11-2, comprising approximately 1934 base pairs 5' from the sequence of Figure 8. The additional sequence data was obtained by rescuing and amplifying two further fragments of CH1-9a11-2 cDNA. Nested primers were designed ~100 base pairs downstream from the 5' end of the known sequence. The primers were used in a nested amplification assay using AP1 and AP2, using the CLONTECH Marathon™ cDNA Amplification Kit as described above. The template for the first upstream fragment was reverse-transcribed polyadenylated RNA from breast cancer cell line 600PE, as described earlier. This fragment was sequenced, and another set of nested primers was designed. The template for the next upstream fragment was a Marathon™ ready cDNA preparation from human testes, also supplied by CLONTECH.
The nucleotide sequence shown in Figure 23 comprises an open reading frame through to the 5' end. Figure 24 shows the corresponding protein translation. About another 500-1000 bases of CH1-9a11-2 sequence are predicted to be present in the 5' direction, with the protein encoding sequence beginning somewhere within this additional sequence. Sequencing of the encoding region is completed by obtaining additional CH1-9a11-2 fragments in this direction.
A GENINFO® BLAST search of nucleotide and peptide sequence databases was performed through the National Center for Biotechnology Information on February 23, 1996. Short segments of homology with other reported human sequences were found at the nucleotide level (<500 base pairs), but none with any ascribed function in the respective identifier. At the amino acid level, no identity higher than 30% was found with any reported eukaryotic sequences.
A CH1-9a11-2 cloned insert has been used to probe the level of relative expression in polyadenylated RNA from a panel of tissue sources. The RNA was obtained already prepared for Northern blot analysis (CLONTECH Catalog # 7759-1, 7760-1 and 7756-1). The manufacturer produced the blots from approximately 2 μg of poly-A RNA per lane, run on a denaturing formaldehyde 1-2% agarose gel, transferred to a nylon membrane, and fixed by UV irradiation. The relative CH1-9a11-2 expression observed at the RNA level is shown in Table 5.
Table 5 (image imgf000058_0001 in the original publication)
Relatively elevated levels of expression were observed in heart, placenta, pancreas, prostate, testis and ovary. The level of expression in breast cancer cell lines is also relatively high (about ++++ on the scale), since the Northern analysis performed on these lines (described above) was conducted on total cellular RNA, of which polyadenylated RNA constitutes only about 5%. It is likely that the CH1-9a11-2 gene is involved in a biological process that is typical to the tissue types showing medium to high levels of expression, which may relate to increased tissue growth or metabolism.
Since the obtained sequence is shorter than the apparent size of mRNA observed in Northern analysis (Table 1), an additional polynucleotide segment is believed to be present at the 5' end of the sequence shown in SEQ. ID NO:15. Further sequence data at the 5' end is deduced by obtaining additional cloned cDNA using standard techniques. Briefly, in one approach, mRNA from breast cancer cell lines MDA-453 and/or 600PE is cloned and screened using primers based on sequence data from SEQ. ID NO:15. Two nested primers of about 20 nucleotides are prepared, the innermost about 150 base pairs from the 5' end, and the outermost about 170 base pairs from the 5' end. The outermost primer is used to synthesize a first cDNA strand complementary to the mRNA in the upstream direction. Second strand synthesis is performed using reagents in a CLONTECH Marathon™ cDNA amplification kit according to manufacturer's directions. The double-stranded DNA is then ligated at the 5' end of the coding sequence with the double-stranded adaptor fragment provided in the kit. A first PCR amplification (about 30 cycles) is performed using the first adapter primer from the kit and the outermost RNA-specific primer, and a second amplification (about 30 cycles) is performed using the second adapter primer and the innermost RNA-specific primer. In an alternative approach, a CLONTECH RACE-READY single-stranded cDNA from human placenta is PCR amplified using nested 5' anchor primers in combination with the outermost and innermost RNA-specific primers. Amplified DNA obtained using either approach is analyzed by gel electrophoresis, and cloned into plasmid vector pCRII. Clones are screened, as necessary, using the 2.5 kilobase CH1-9a11-2 insert. Clones corresponding to full-length mRNA (4.5 kb or 5.5 kb, Table 1), or cDNA fragments overlapping at the 5' end, are selected for sequencing. Compared with the 4.5 kb form, additional polynucleotide segments may be present in the 5.5 kb form within the encoding region, or in the 5' or 3' untranslated region.
Figure 27 shows what is believed to be the full-length cDNA sequence for the principal transcript of the Chromosome 1 gene CH1-9a11-2. It is 5919 nucleotides in length, which matches the size of mRNA observed by Northern blot analysis. Nucleotides 1 to 2467 represent new sequence not shown in preceding figures.
To obtain the additional sequence, overlapping clones corresponding to the 5' end of CH1-9a11-2 cDNA were isolated using CLONTECH Marathon cDNA amplification Kit. Briefly, DNA primers CH1c and CH1d were prepared. Polyadenylated RNA from breast cancer cell line BT474 was reverse transcribed using CH1c primer. After second strand synthesis, adaptor DNA provided by the kit was ligated to the double-stranded cDNAs. The 5' end cDNA of CH1-9a11-2 was then amplified by PCR using primers CH1c and AP1 (provided by the kit). To increase the specificity of the PCR products, the first PCR products were PCR amplified using nested primers CH1d and AP2. The PCR products were cloned into pCR2.1 vector (Invitrogen) and screened with 32P-labeled pCH1-1.1k probe. Clone pCH1-800 was obtained and the cDNA insert was sequenced. Based upon the pCH1-800 sequences, primers CH1I and CH1J were prepared. Using the method described above, clone pCH1-J8 was obtained. Based upon the pCH1-J8 sequences, primers CH1M and CH1N were prepared to isolate clone pCH1-N21. Finally, based upon the sequence of pCH1-N21, primers CH1P and CH1Q were synthesized and used to isolate clone pCH1-QE. The position of the primers (numbered as shown in Figure 27) was as follows: CH1c 2588→2564, CH1d 2570→2546, CH1I 2285→2262, CH1J 2240→2217, CH1M 1404→1381, CH1N 1388→1365, CH1Q 657→634, CH1P 677→654.
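The primer coordinates listed above can be checked mechanically for length and for nesting. The sketch below is illustrative only: the helper names are hypothetical, and the pairing of each outer primer with its nested partner follows the order in which the text names them, which is an assumption for the last three pairs.

```python
# Primer positions as listed in the text (Figure 27 numbering, 5' end -> 3' end).
# Descending coordinates mean each primer points toward the 5' end of the cDNA.
PRIMERS = {
    "CH1c": (2588, 2564), "CH1d": (2570, 2546),
    "CH1I": (2285, 2262), "CH1J": (2240, 2217),
    "CH1M": (1404, 1381), "CH1N": (1388, 1365),
    "CH1P": (677, 654), "CH1Q": (657, 634),
}

# (outer, inner) pairs; assumed from the order in which the primers are named.
NESTED_PAIRS = [("CH1c", "CH1d"), ("CH1I", "CH1J"), ("CH1M", "CH1N"), ("CH1P", "CH1Q")]


def primer_length(name):
    """Length in nucleotides, counting both end positions."""
    five_prime, three_prime = PRIMERS[name]
    return abs(five_prime - three_prime) + 1


def is_nested(outer, inner):
    """True if the inner primer starts upstream of (at a lower number than) the outer one."""
    return PRIMERS[inner][0] < PRIMERS[outer][0]


lengths = {name: primer_length(name) for name in PRIMERS}
all_nested = all(is_nested(outer, inner) for outer, inner in NESTED_PAIRS)
```

Under these assumptions every primer comes out at 24 or 25 nucleotides, and each second-round primer lies upstream of its first-round partner, as expected for a nested 5' extension.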
Figure 28 shows the corresponding amino acid encoding region for the predicted protein product.
The largest open reading frame codes for a polypeptide of 1404 amino acids with the first methionine codon at position #124 from the 5'-end. Sequence analysis revealed two potential transmembrane domains spanning amino acids 202 to 223 and 1162 to 1183. There is a coiled-coil domain spanning amino acids 1118 to 1161. There is a slight, possibly significant degree of homology of CH1-9A11-2 to the rod-like tail of myosin, dynein, and other proteins involved in molecular motion. The homology is restricted to the coiled-coil domain, which is made up of four heptapeptide repeats.
Figures 29-32 relate to two cDNA clones representing splice variants. A clone designated P1 comprised 869 base pairs (numbered 2611 to 3470), and is believed to represent genomic sequence. Clone PE16 comprised 715 base pairs in this region, with a deletion of 154 base pairs between residues 3250 and 3403. Clones PE10 and PE1 comprised 601 base pairs, with a deletion of 268 base pairs between residues 3132 and 3399. Clone PE4 comprised 489 base pairs, with a deletion of 380 base pairs between residues 3059 and 3438. The sequence shown in Figure 27 includes the PE4 sequence, is 5919 nucleotides in length, and is believed to be that of the mature transcript. The sequence shown in Figure 29 includes the PE16 sequence and is 6145 nucleotides in length. The sequence shown in Figure 31 includes the PE10 sequence and is 6031 nucleotides in length. The translations of the two splice variants are shown in Figures 30 and 32, respectively. They include an open reading frame of 1210 amino acids (compared to 1504 amino acids for Figure 28). The size of the protein observed in Western analysis corresponds more closely to the sequence shown in Figure 28. Accordingly, Figures 29 and 31 are believed to represent partially processed forms of messenger RNA. It is also possible that these are mature RNA produced by variant splicing, and that the proteins they encode are also produced by the cell, perhaps at a lower amount.
Figure 33 shows a map of the CH1-9A11-2 cDNA, compared with the human chromosomal gene and homologous sequences from other species. Searches of the sequence databases revealed that the cDNA sequence of CH1-9A11-2 is encompassed in two overlapping PAC clones, PAC125H23 and PAC105D12 (database Accession Nos. Z94054 and Z96050). Alignment of the cDNA sequences and the genomic sequences identified 23 exons ranging from 56 bp to 2021 bp (compare lines "A" and "B").
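The clone and transcript lengths given for the splice variants follow from the deletion coordinates, which can be verified with simple arithmetic. The sketch below uses only numbers stated in the text; the function names are hypothetical, and it assumes that "between residues X and Y" is inclusive of both end positions.

```python
P1_LENGTH = 869       # base pairs in clone P1 (undeleted region)
FIG27_LENGTH = 5919   # transcript of Figure 27, which carries the PE4 deletion

# First and last deleted positions for each clone, as stated in the text.
DELETIONS = {
    "PE16": (3250, 3403),
    "PE10": (3132, 3399),
    "PE4": (3059, 3438),
}


def deletion_size(name):
    """Number of bases removed, counting both end positions (assumed inclusive)."""
    first, last = DELETIONS[name]
    return last - first + 1


def clone_length(name):
    """Length of the clone in the P1 region after its deletion."""
    return P1_LENGTH - deletion_size(name)


def transcript_length(name):
    """Full transcript length, derived from Figure 27 by swapping in a different deletion."""
    return FIG27_LENGTH + deletion_size("PE4") - deletion_size(name)


summary = {name: (deletion_size(name), clone_length(name), transcript_length(name))
           for name in DELETIONS}
```

With these inputs the deletions come to 154, 268 and 380 bases, the clones to 715, 601 and 489 base pairs, and the transcripts to 6145, 6031 and 5919 nucleotides, which agrees with the figures quoted for Figures 29, 31 and 27.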
Significant homology was found with four products of predicted protein coding regions from four evolutionarily diverged species: Caenorhabditis elegans (AF067219), Drosophila melanogaster (AC002443), Arabidopsis thaliana (AF000657 and AC002343), Saccharomyces cerevisiae (U55020) and Schizosaccharomyces pombe (AL023534). The degrees of amino acid identity and conservation within the first homologous regions are 70% between CH1-9A11-2 and C. elegans protein (AF067219), 69% between CH1-9A11-2 and D. melanogaster protein (AC001808), 63% between CH1-9A11-2 and A. thaliana protein (AF000569), 61% between CH1-9A11-2 and another A. thaliana protein (AC002343), 68% between CH1-9A11-2 and S. cerevisiae protein (U55020) and 73% between CH1-9A11-2 and S. pombe protein (AL023534). The degrees of amino acid identity and conservation within the second homologous regions are 62% between CH1-9A11-2 and C. elegans protein (AF067219), 66% between CH1-9A11-2 and A. thaliana protein (AF000569) and 55% between CH1-9A11-2 and S. cerevisiae protein (U55020). The presence of highly conserved domains in CH1-9A11-2 protein through evolutionarily diverged species indicates that the CH1-9A11-2 gene product may have a fundamental function for cell metabolism.
Two genomic CH1-9A11-2 P1 clones isolated from an arrayed P1 library were used to map CH1-9A11-2 to chromosome 1q24-31 in normal human lymphocytes by fluorescence in situ hybridization (FISH). Briefly, DNA was extracted and labeled with digoxigenin-11-dUTP by nick translation. Hybridization was carried out in the presence of human Cot-1 DNA to suppress the background signal and hybridized to metaphase chromosome overnight. The hybridization signal was detected by anti-digoxigenin conjugated with FITC. The chromosomes and nuclei were counterstained with propidium iodide and DAPI respectively. The location of the probe was determined by digital image microscopy following FISH and localized by fractional length from the p-terminus.
Localization of CH1-9A11-2 to chromosome 1q24-31 is potentially significant. Alterations of the long arm of chromosome 1 are the most frequent abnormalities found in human breast carcinomas (Bieche, I. et al., Clinical Cancer Research 1:123, 1995). These alterations include both loss and amplification of distinct regions of chromosome 1q as detected by cytogenetic, CGH or RFLP analyses. Alteration of chromosome 1q in breast cancer has been revealed as polysomy of 1q, isochromosomy 1q, translocations and amplification of various lengths of the 1q region. The frequent presence of multiple copies of chromosome 1q in human breast cancer suggests that elevated expression of multiple genes on this chromosome region may contribute to breast cancer progression. Besides human breast cancer, preliminary results indicated that CH1-9A11-2 was also overexpressed in some cell lines derived from human colon cancer and lung cancer (data not shown). Therefore CH1-9A11-2 may represent another oncogene involved in the oncogenesis of diverse human cancers.
Further analysis was performed on the relative amplification and overexpression of CH1-9a11-2 in breast cancer. In order to achieve a statistically significant cut-off point for defining gene amplification, slot analysis was performed, which allowed simultaneous analysis of many normal and cancer cell samples. Seventeen normal DNA samples from placenta or skin and 16 breast cancer cell lines were compared at the same time. 0.5 μg of DNA was loaded in each slot. The blots were then hybridized with a 32P-labeled 2.5 kb CH1-9A11-2 cDNA fragment (corresponding to nucleotides 3408 to 5909). After an autoradiogram was obtained, the probe was stripped and the blot was re-probed using a reference probe to adjust for differences in sample loading. Human repeated sequences (GIBCO) were used as reference probe. The relative CH1-9A11-2 gene copy numbers were expressed as the density ratio of CH1-9A11-2 over total DNA loaded, with the cut-off point calculated as defined below for the CH13-2a12-1 gene. For Northern blot analysis, total RNA was extracted from cultured cells and uncultured mammary epithelial tissues by guanidine isothiocyanate/CsCl step gradients. 10 μg of total RNA was analyzed by Northern hybridization using a 32P-labeled 2.5 kb CH1-9A11-2 cDNA fragment as probe. A 36B4 probe (ATCC, Rockville, MD) was used as reference probe to adjust for sample loading. Based on the defined cut-off point, 10 of 16 cell lines showed significantly increased gene copy number (p<0.001). Using the same methodology, CH1-9A11-2 was found to be amplified in 16 of 98 (16%) primary breast tumors analyzed (p<0.001). Northern hybridization analysis is shown in the right panel. CH1-9A11-2 was expressed at very low levels in uncultured normal breast epithelial cells. When the cells were cultured in complete medium, the expression of CH1-9A11-2 RNA increased by 3 to 10 fold. Most breast cancer cell lines expressed CH1-9A11-2 at levels comparable to that in cultured normal epithelial cells. Several cell lines, including 600PE, MDA-MB-453 as well as MDA-MB-134, expressed CH1-9A11-2 RNA several fold higher than cultured normal epithelial cells. However, MDA-MB-134 does not have CH1-9A11-2 gene amplification.
The expression of CH1-9a11-2 was also evaluated by RNA in situ hybridization. Archival paraffin blocks of infiltrating breast cancer were obtained from 41 randomly selected patients from the University of California at San Francisco. Cell lines 600PE and MDA-MB-435 were used as controls. Confluent cultured cells were trypsinized, centrifuged and wrapped in a colloidin bag. The colloidin bag was fixed in 4% buffered formalin for 24 h and embedded routinely in paraffin wax. In situ hybridization was carried out by standard methods. Briefly, deparaffinized 4 μm thick tissue sections were treated with proteinase K and hybridized overnight at 45°C with digoxigenin-labeled antisense transcripts from an 800 bp CH1-9A11-2 3'UTR clone. Sections were incubated with sheep anti-digoxigenin antibody followed by alkaline phosphatase detection (Boehringer Mannheim). A 200 bp β-actin antisense probe was used as positive control for the negatively stained slides to confirm the quality of the RNA on these slides.
Figure 34 shows the overexpression of CH1-9a11-2 in primary breast cancer samples as determined by in situ hybridization. The concentration of the digoxigenin-labeled CH1-9A11-2 antisense RNA probe was titrated to show a strong staining on 600PE cell line, which overexpresses CH1-9A11-2 (Panel A), and a negative staining on MDA-MB-435 cell line, which does not overexpress CH1-9A11-2 (Panel B). Specificity of the antisense RNA was illustrated by the negative hybridizing signal with CH1-9A11-2 sense RNA probe on 600PE cell line (Panel C). Nine of the ten reduction mammaplasties show negative stain as illustrated in Panel D, and the remaining case showed weak staining in some ducts. In contrast, CH1-9A11-2 RNA was detected in 18 of the 41 (44%) breast cancers (Panels E and F). Of these 41 breast carcinomas, 11 also had adjacent non-malignant breast epithelium. In 10 of these 11 (93%) cases, the adjacent normal breast epithelium stained negatively for CH1-9A11-2 mRNA.
Example 5: Chromosome 8 gene CH8-2a13-1
One of the cDNA obtained corresponded to a gene that mapped to Chromosome 8. Figure 2 shows the Southern blot analysis for the corresponding gene in various DNA digests. Lane 1 (P12) is the control preparation of placental DNA; the rest show DNA obtained from human breast cancer cell lines. Panel A shows the pattern obtained using the 32P-labeled CH8-2a13-1 cDNA probe. Panel B shows the pattern obtained with the same blot using the 32P-labeled D2S6 probe as a loading control. The sizes of the restriction fragments are indicated on the right.
Figure 3 shows the Northern blot analysis for RNA overabundance. Lanes 1-3 show the level of expression in cultured normal epithelial cells. Lanes 4-19 show the level of expression in human breast cancer cell lines. Panel A shows the pattern obtained using the CH8-2a13-1 probe; panel B shows the pattern obtained with beta-actin cDNA, a loading control.
The results are summarized in Table 6. The scoring method is the same as for Example 4.
Table 6 (image imgf000064_0001 in the original publication)
+ Gene duplication or RNA overabundance; - no duplication or overabundance; nd = not done
* Degree of gene duplication is reported relative to placental DNA preparations
** Degree of RNA overabundance is reported relative to the highest level observed for several cultures of normal epithelial cells
The gene corresponding to CH8-2a13-1 showed clear evidence of duplication in 12 out of 17 (71%) of the cells tested. RNA overabundance was observed in 14 out of 17 (82%). Thus, 11% of the cells had achieved RNA overabundance by a mechanism other than gene duplication.
Since the known oncogene c-myc is located on Chromosome 8, the Southern analysis was also conducted using a probe for c-myc. At least 2 of the breast cancer cells showing duplication of the gene corresponding to CH8-2a13-1 did not show duplication of c-myc. This indicates that the gene corresponding to CH8-2a13-1 is not part of the myc amplicon.
The sequence of 150 bases from the 5' end of the cDNA fragment is shown in Figure 22 (SEQ ID NO: 3). There was no substantial homology to any known gene in GenBank. One of the three possible reading frames was found to be open, with the amino acid sequence shown in Figure 22 (SEQ ID NO: 4).
The CH8-2a13-1 gene was further characterized by obtaining additional sequence information. A λ-GT10 cDNA library from the breast cancer cell line BT474 (Example 2) was screened using the initial cDNA insert, and clones with a 3.0 kb and a 4.0 kb insert were identified. The two identified clones were subcloned into plasmid vector pCRII. T7 and Sp6 primers for regions flanking the cDNA inserts were used as initial sequencing primers. Sequencing continued by walking along the region of interest by standard techniques, using sequencing primers based on data already obtained. The two inserts were found to overlap (Figure 6). Primers used are those designated 1-25 in Figure 10.
A third clone of about 600 bp (designated pCH8-600) overlapping on the 5' end (Figure 6) was obtained using CLONTECH Marathon™ cDNA Amplification Kit. Briefly, two DNA primers CH8a and CH8b (Figure 10) were synthesized. Polyadenylated RNA from breast cancer cell line BT474 was reverse transcribed using CH8b primer. After second strand synthesis, adaptor DNA provided in the kit was ligated to the double-stranded cDNA. The 5' end cDNA of CH8-2a13-1 was then amplified by PCR using primers CH8a and AP1 (provided in the kit). To increase the specificity of the PCR products, the first PCR products were PCR reamplified using nested primers CH8a and AP2 (provided in the kit). The PCR products were cloned into pCRII vector (Invitrogen) and screened with CH8-2a13-1 probe.
By sequencing relevant portions of the three clones, a nucleic acid sequence of 3982 base pairs between the 5' end and the poly-A tail of CH8-2a13-1 was determined. The DNA sequence is shown in Figure 11 (SEQ ID NO: 18). Bases 1-152 are believed to be a 5' untranslated region. The longest open reading frame is in frame 3 from base 153 to 3911, and codes for 1252 amino acids before the stop codon. The corresponding amino acid sequence of this frame is shown in the upper panel of Figure 12 (SEQ ID NO: 19). The sequence predicted for the translated protein is shown in the lower panel of Figure 12 (SEQ ID NO: 20).
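The frame number and residue count quoted for an open reading frame can be cross-checked against its base coordinates. The sketch below is a minimal illustration with hypothetical function names; it assumes that the stated base range is 1-based, inclusive, and includes the terminating stop codon, which is consistent with the numbers given in the text.

```python
def reading_frame(first_base):
    """Frame (1, 2 or 3) of an ORF that starts at a 1-based base position."""
    return (first_base - 1) % 3 + 1


def residues_before_stop(first_base, last_base):
    """Amino acids encoded by an ORF whose range includes its stop codon."""
    span = last_base - first_base + 1
    if span % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return span // 3 - 1  # subtract the stop codon


# Coordinates stated in the text for CH8-2a13-1 (SEQ ID NO: 18).
ch8_frame = reading_frame(153)
ch8_residues = residues_before_stop(153, 3911)

# Coordinates stated in the text for CH1-9a11-2 (SEQ ID NO: 15).
ch1_frame = reading_frame(1)
ch1_residues = residues_before_stop(1, 1875)
```

Under these assumptions the CH8-2a13-1 range gives frame 3 and 1252 residues, and the CH1-9a11-2 range gives frame 1 and 624 residues, matching the values reported for both genes.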
A GENINFO® BLAST search of nucleotide and peptide sequence databases was performed through the National Center for Biotechnology Information on March 26, 1996. The sequences were found to be about 99% identical at the nucleotide and amino acid level with bases 343-4103 of KIAA0196 protein (N. Nomura et al., in press; sequence submitted to the DDBJ/EMBL/GenBank databases on March 4, 1996). The KIAA0196 was one of 200 different cDNA cloned at random from an immature male human myeloblast cell line. KIAA0196 has no known biological function, and is described by Nomura et al. as being ubiquitously expressed.
A fourth clone of about 600 bp overlapping pCH8-600 at the 5' end has also been obtained. Briefly, a DNA primer was synthesized corresponding to about the first 20 nucleotides at the 5' end of the predicted cDNA sequence, and used along with a primer based on the pCH8-600 sequence to reverse-transcribe RNA from breast cancer cell line BT474. The product was cloned into pCRII vector (Invitrogen) and screened with a CH8-2a13-1 probe. The new clone is sequenced along both strands to obtain additional 5' untranslated sequence data for the cDNA. The predicted compiled cDNA nucleotide sequence of CH8-2a13-1 cDNA is shown in Figure 13 (SEQ. ID NO:21). The corresponding amino acid sequence of this frame is shown in Figure 14 (SEQ. ID NO:22). A polynucleotide comprising the compiled sequence is assembled by joining the insert of this fourth clone to pCH8-4k within the shared region. Briefly, pCH8-4k is cut with XbaI and NotI. The fourth clone is cut with BamHI and XbaI. The ligated polynucleotide is then inserted into pCRII cut with BamHI and NotI.
A CH8-2a13-1 cloned insert has been used to probe the level of relative expression in polyadenylated RNA from a panel of tissue sources obtained from CLONTECH, as in Example 4. The relative CH8-2a13-1 expression observed at the mRNA level is shown in Table 7:
Table 7 (image imgf000066_0001 in the original publication)
Relative levels of expression observed were as follows. Low levels of expression were observed in adult peripheral blood leukocytes (PBL), brain, placenta, lung, liver, skeletal muscle, kidney, and pancreas. Medium levels of expression were observed in adult heart, spleen, thymus, prostate, testis, ovary, small intestine, and colon. High levels of expression were observed in four fetal tissues tested: brain, lung, liver and kidney. The level of expression in breast cancer cell lines is relatively high (about ++++ on the scale), since the Northern analysis performed on these lines was conducted on total cellular RNA. It is likely that the CH8-2a13-1 gene is involved in a biological process that is typical to the tissue types showing medium to high levels of expression, which may relate to increased tissue growth or metabolism.
Example 6: Chromosome 13 gene CH13-2a12-1
One of the cDNA obtained corresponded to a gene that mapped to Chromosome 13. Figure 4 shows the Southern blot analysis for the corresponding gene in various DNA digests. Lanes 1 and 2 are control preparations of placental DNA; the rest show DNA obtained from human breast cancer cell lines. Panel A shows the pattern obtained using the CH13-2a12-1 cDNA probe; panel B shows the pattern using D2S6 probe as a loading control. The sizes of the restriction fragments are indicated on the right.
Figure 5 shows the Northern blot analysis for RNA overabundance of the CH13-2a12-1 gene. Lanes 1-3 show the level of expression in cultured normal epithelial cells. Lanes 4-19 show the level of expression in human breast cancer cell lines. Panel A shows the pattern obtained using the CH13-2a12-1 probe; panel B shows the pattern obtained with beta-actin cDNA, a loading control. The apparent size of the mRNA varied depending upon conditions of electrophoresis. Full-length mRNA is believed to occur at sizes of about 3.2 and 3.5 kb. The results of the RNA abundance comparison are summarized in Table 8. The scoring method is the same as for Example 4.
Table 8 (image imgf000068_0001 in the original publication)
+ Gene duplication or RNA overabundance; - no duplication or overabundance; nd = not done
* Degree of gene duplication is reported relative to placental DNA preparations
** Degree of RNA overabundance is reported relative to the highest level observed for several cultures of normal epithelial cells
The gene corresponding to CH13-2a12-1 was duplicated in 7 out of 16 (44%) of the cells tested. Three of the positive cell lines (600PE, BT474, and MDA435) had been studied previously by comparative genomic hybridization, but had not shown amplified chromatin in the region where CH13-2A12-1 has been mapped in these studies. RNA overabundance was observed in 13 out of 16 (81%) of the cell lines tested. Thus, 37% of the cells had achieved RNA overabundance by a mechanism other than gene duplication.
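The share of cell lines showing RNA overabundance without gene duplication appears to be obtained by subtracting the two rounded percentages. The sketch below reproduces that arithmetic; the function names are hypothetical, and the result is only a minimum estimate, since it assumes that every line with duplication also shows overabundance (the per-line table data are not used here).

```python
def percent(count, total):
    """Whole-number percentage, rounded as in the text."""
    return round(100 * count / total)


def overabundance_without_duplication(duplicated, overabundant, total):
    """Difference between the rounded overabundance and duplication percentages."""
    return percent(overabundant, total) - percent(duplicated, total)


# Counts stated in the text.
ch13_gap = overabundance_without_duplication(duplicated=7, overabundant=13, total=16)
ch8_gap = overabundance_without_duplication(duplicated=12, overabundant=14, total=17)
```

With the stated counts this gives 37 percentage points for CH13-2a12-1 and 11 for CH8-2a13-1, the same values quoted in the respective examples.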
Cells from primary breast tumors have also been analyzed for duplication of the chromosome 13 gene. Ten of the 82 tumors analyzed (12%) were positive, confirming that duplication of this gene is not an artifact of in vitro culture.
The sequence of 07 bases from the 5' end of the 1.5 kb cDNA fragment is shown in Figure 22 (SEQ ID NO:5). There was no substantial homology to any known gene in GenBank. One of the three possible reading frames was found to be open, with the predicted amino acid sequence shown in Figure 22 (SEQ ID NO:6).
The CH13-2a12-1 gene was further characterized by obtaining additional sequence information. A λ-GT10 cDNA library from the breast cancer cell line BT474 (Example 2) was screened using the initial cDNA insert, and clones with a 3.5 kilobase and a 1.6 kilobase insert were identified. The two identified clones were subcloned into plasmid vector pCRII. T7 and Sp6 primers for regions flanking the cDNA inserts were used as initial sequencing primers. Sequencing continued by walking along the region of interest by standard techniques, using sequencing primers based on data already obtained. The two inserts were found to overlap (Figure 6). Primers used during sequencing are shown in Figure 15.
By sequencing relevant portions of the 3.5 and 1.6 kb clones, a nucleic acid sequence of 3339 base pairs between the 5' end and the poly-A tail of CH13-2a12-1 was determined. The DNA sequence is shown in Figure 16 (SEQ ID NO:23). Bases 1-520 are believed to be a 5' untranslated region. The longest open reading frame is in frame 2 from base 521 to 1838, and codes for 611 amino acids before the stop codon. The corresponding amino acid sequence of this frame is shown in the upper panel of Figure 17 (SEQ ID NO:24). The sequence predicted for the translated protein is shown in the lower panel of Figure 17 (SEQ ID NO:25). Bases 1838 to 3339 of the nucleotide sequence are believed to be a 3' untranslated region, which is present in the 3.5 kb insert. The 3.5 kb insert appears to be a splice variant (Figure 6), in which the 3' untranslated region consists of bases 1838-2797 in the sequence.
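Identifying the longest open reading frame amounts to scanning each of the three forward frames for the longest stretch uninterrupted by a stop codon. The following is a minimal sketch of that scan, not the procedure actually used to analyze SEQ ID NO:23; the function name and the toy sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_open_frame(seq: str):
    """Return (frame, start, end, n_codons) for the longest stop-free stretch.

    frame is 1, 2 or 3; start and end are 1-based inclusive base positions of
    the first and last sense codon; n_codons excludes any stop codon.
    """
    seq = seq.upper()
    best = (0, 0, 0, 0)
    for frame in range(3):
        run_start = frame
        # Iterating slightly past the end closes a run that has no stop codon.
        for pos in range(frame, len(seq) + 3, 3):
            codon = seq[pos:pos + 3]
            if len(codon) < 3 or codon in STOP_CODONS:
                n_codons = (pos - run_start) // 3
                if n_codons > best[3]:
                    best = (frame + 1, run_start + 1, pos, n_codons)
                run_start = pos + 3
                if len(codon) < 3:
                    break
    return best

# Toy sequences invented for illustration
print(longest_open_frame("ATGGCCAAATGA"))
print(longest_open_frame("ATGAAATAGCC"))
```

This sketch reports the longest stop-free stretch regardless of where a methionine codon occurs, which mirrors the distinction drawn above between the open frame (SEQ ID NO:24) and the predicted translated protein (SEQ ID NO:25).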
A GENINFO® BLAST search of nucleotide and peptide sequence databases was performed through the National Center for Biotechnology Information on March 26, 1996. Short segments of homology with other reported human sequences were found at the nucleotide level (<500 base pairs), but none with any ascribed function in the respective identifier. At the amino acid level, the sequence was found to share 33% identities and 54% positives with 228 residues of the lin-19 protein of Caenorhabditis elegans. This protein has been implicated in regulating the cell cycle of C. elegans (ET Kipreos, W He & EM Hedgecock). The CH13-2a12-1 gene is suspected of a role in controlling cell proliferation. "Controlling cell proliferation" in this context means that an abnormally high or low level of gene expression at the RNA or protein level results in a higher or lower rate of cell proliferation, or vice versa, compared with cells with an otherwise similar phenotype. There is also a low-level homology between CH13-2a12-1 and VACM-1, a vasopressin-activated, calcium-mobilizing receptor from rabbit kidney medulla (Burnatowska-Hledin et al). VACM-1 has a transmembrane sequence, whereas none has been detected in CH13-2a12-1. Nevertheless, it is possible that the CH13-2a12-1 protein product has a Ca++ binding or Ca++ mobilizing function.
A CH13-2a12-1 cloned insert has been used to probe the level of relative expression in polyadenylated RNA from a panel of tissue sources obtained from CLONTECH, as in Example 4. The relative CH13-2a12-1 expression observed at the mRNA level is shown in Table 9:
Figure imgf000070_0001
Relatively elevated levels of expression were observed in heart, skeletal muscle and testis.
The level of expression in breast cancer cell lines is relatively high (about ++++ on the scale), since the Northern analysis performed on these lines was conducted on total cellular RNA. It is likely that the CH13-2a12-1 gene is involved in a biological process that is typical of the tissue types showing medium to high levels of expression, which may relate to increased tissue growth or metabolism.
Fragments corresponding to the CH13-2a12-1 gene have also been used to screen cell lines derived from other types of cancer. Southern analysis showed that about 1 out of 4 breast cancer cell lines tested have gene duplication of CH13-2a12-1. Northern analysis showed that about 3 out of 6 lines tested have overexpression of the corresponding RNA transcript.
Figure 35 shows what is believed to be the full length cDNA sequence for the principal transcript of the Chromosome 13 gene. The full length CH13-2a12-1 cDNA sequence is 3642 nucleotides. The 708 nucleotides from the 5' end represent sequence not shown in the preceding figures. The first 405 nucleotides of the sequence shown in Figure 16 are believed to represent intron sequence present only in incompletely processed transcripts. Thus, the 708 nucleotides at the 5' end shown in Figure 33 join the previously obtained sequence beginning at the fragment -GAAA-.
To obtain the additional sequence, overlapping clones corresponding to the 5' end of CH13-2a12-1 cDNA were isolated using the CLONTECH Marathon™ cDNA Amplification Kit. Briefly, DNA primers CH1c and CH1d were prepared. Polyadenylated RNA from breast cancer cell line BT474 was reverse transcribed using CH1c primer. After second strand synthesis, adaptor DNA provided by the kit was ligated to the double-stranded cDNAs. The 5' end cDNA of CH13-2a12-1 was then amplified by PCR using primers CH13I and AP1 (provided by the kit). To increase the specificity of the PCR products, the first PCR products were PCR amplified using nested primers CH13J and AP2. The PCR products were cloned into pCR2.1™ vector (Invitrogen) and screened by PCR using primer pairs CH13J/AP2 and CH13J/CH13H. Clone pCH13-J26, which was amplified by CH13J/AP2 but not by CH13J/CH13H, was selected for sequence analysis. Based upon the pCH13-J26 sequence, primers CH13M and CH13N were prepared. Using the same methods, clone pCH13-N11 was obtained. The primer sequences (numbered according to Figure 35) were: CH13I, 833→810; CH13J, 813→790; CH13M, 291→270; CH13N, 237→214; CH13H (intron), GAA GGG CAG AGC CGA CAT TCC GCC TTC TGC (SEQ ID NO:44).
Figure 36 shows the corresponding amino acid encoding region for the predicted protein product. The largest open reading frame codes for a polypeptide of 659 amino acids with the first methionine codon at position #161 from the 5' end. The first 105 amino acids shown in Figure 17 are contained in what is believed to be an intron. Thus, the predicted protein sequence joins the sequence shown in Figure 17 at the fragment -KPL-.
Figures 37-40 relate to two cDNA clones representing splice variants. The sequence shown in Figure 37 is 3753 nucleotides in length, as compared with 3642 nucleotides in the sequence shown in Figure 35. The longer sequence has two inserts near the 5' terminal end. Figure 39 is 3668 nucleotides in length, and includes only the second of the two inserts of Figure 37. The translations of the two splice variants are obtained from the same reading frame, and are shown in Figures 38 and 40, respectively. They are 749 and 720 amino acids in length, compared with the 659 amino acids for Figure 36. The splice variants are believed to represent partially processed forms of messenger RNA. It is also possible that these are mature RNA produced by variant splicing, and that the proteins they encode are also produced by the cell, perhaps in a lower amount.
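Primer coordinates such as 833→810 run from a higher to a lower position, which is read here as indicating an antisense primer, that is, the reverse complement of the sense strand over that interval. The sketch below shows how a primer could be derived from such coordinates, assuming 1-based inclusive numbering on the sense strand. The function name and the toy sequence are invented for illustration and are not taken from Figure 35.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def primer_from_coords(sense: str, start: int, end: int) -> str:
    """Return the primer (5'->3') spanning 1-based inclusive positions start..end.

    start <= end gives a sense primer; start > end gives an antisense primer
    (the reverse complement of the sense strand over that interval).
    """
    if start <= end:
        return sense[start - 1:end]
    return sense[end - 1:start][::-1].translate(_COMPLEMENT)

toy = "ATGGCCAAGTTTCCC"                  # hypothetical sense-strand sequence
print(primer_from_coords(toy, 1, 6))     # sense primer over positions 1..6
print(primer_from_coords(toy, 9, 4))     # antisense primer over positions 9..4
```

Under this reading, an interval such as 833→810 yields a 24-base primer.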
Using human-rodent somatic cell hybrids (obtained from Coriell Institute for Medical Research, Camden, N.J.), the CH13-2a12-1 gene was localized to chromosome 13q. Subsequently, CH13-2a12-1 was mapped to chromosome 13q34-qter by fluorescence in situ hybridization using a CH13-2a12-1 BAC clone isolated from a human BAC library (Research Genetics Inc.) according to the supplier's protocol. DNA was extracted and labeled with digoxigenin-11-dUTP by nick translation. Fluorescence in situ hybridization (FISH) was carried out in the presence of human Cot-1 DNA to suppress the background signal and hybridized to metaphase chromosomes overnight. The hybridization signal was detected by anti-digoxigenin conjugated with FITC. The chromosomes were counterstained with DAPI. The location of the probe was determined by digital image microscopy following FISH and localized by DAPI banding (Stokke, T. et al., Genomics 26:134, 1995).
Figure 41 shows the relative amplification and overexpression of CH13-2a12-1 in breast cancer cell lines. In order to achieve a statistically significant cut-off point for defining gene amplification, slot analysis was performed, which allowed simultaneous analysis of many normal and cancer cell samples. To calculate the cut-off points, a set of non-malignant (normal) samples was analyzed by Southern or Northern hybridization. Densities of the signals on the autoradiograms were obtained using a densitometer (Molecular Dynamics, Sunnyvale, CA). The density ratio between the CH13-2a12-1 gene and the reference gene was calculated for each sample. Two steps were required to determine the cut-off point. First, the data for normal tissues were transformed so that they became normally distributed (i.e., followed a Gaussian distribution curve). Next, a table of tolerance bounds for a normal distribution was used to define cut points so that the confidence was 90% that no more than P% of the distribution would be above the cut-off point. The cut-off point was then transformed back to the original measurement unit.
Southern analysis is shown in the left panel. Based on the defined cut-off point, three (600PE, MDA-MB157, and SKBR3) of the 15 (20%) cell lines analyzed showed significantly increased gene copy number (p<0.01). In addition, using the same methodology, CH13-2a12-1 was found to be amplified in 17 of 105 (16%) untreated primary breast tumors analyzed (p<0.01).
Northern hybridization analysis is shown in the right panel. The blots were quantitated by densitometry and expressed as a density ratio of CH13-2a12-1 mRNA to β-actin or 36B4 mRNA. Based on the defined cut-off point, ten of fourteen breast cancer cell lines significantly overexpressed CH13-2a12-1 mRNA when compared to normal breast epithelial cells derived from reduction mammoplasty specimens (p<0.01).
In summary, CH13-2a12-1 was found to be amplified and overexpressed in three of fourteen breast cancer cell lines analyzed, while the gene was overexpressed in 8 additional cell lines in which it is not amplified. CH13-2a12-1 was found to be amplified in 17 of 105 (16%) cases of untreated primary breast cancers, while 14 of 30 cases analyzed (47%) were shown by RNA in situ hybridization to overexpress CH13-2a12-1.
The expression of CH13-2a12-1 RNA was also evaluated by RNA in situ hybridization. Archival paraffin blocks of infiltrating breast cancer were obtained from 30 randomly selected patients from the University of California at San Francisco. Cultured breast cancer cell lines (600PE and BT20) were used as controls. The cultured cells were spun down and wrapped in a colloidin bag. The colloidin bag was fixed in 4% buffered formalin for 24 h and embedded routinely in paraffin wax. In situ hybridization was carried out as described (25, 26). Briefly, deparaffinized 4 μm thick tissue sections were treated with proteinase K and hybridized overnight at 45°C with digoxigenin-labeled antisense transcripts from a 1.2 kb CH13-2a12-1 3'-UTR clone. Sections were incubated with sheep anti-digoxigenin antibody followed by alkaline phosphatase detection (Boehringer Mannheim). The concentration of the probe was titrated to show strong staining on the 600PE cell line and negative staining on the BT20 cell line. A 200 bp β-actin antisense probe was used as a positive control for the negatively stained slides to confirm the quality of the RNA on such slides.
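The two-step cut-off procedure described for Figure 41 (transform the normal-sample ratios toward normality, apply a one-sided tolerance bound, transform back) can be sketched as follows. Several choices here are assumptions rather than details from the disclosure: a logarithmic transform stands in for the unspecified transformation, P is set to 1%, and a closed-form approximation to the one-sided tolerance factor replaces the published table. The function name and sample values are invented for illustration.

```python
import math
from statistics import NormalDist, mean, stdev

def upper_cutoff(normal_ratios, p_above=0.01, confidence=0.90):
    """One-sided upper tolerance bound on log-transformed density ratios.

    Returns a cut-off in the original ratio units such that, with the given
    confidence, no more than a fraction p_above of the normal-sample
    distribution is expected to lie above it.
    """
    logs = [math.log(r) for r in normal_ratios]   # assumed transform
    n = len(logs)
    z_p = NormalDist().inv_cdf(1 - p_above)
    z_g = NormalDist().inv_cdf(confidence)
    # Closed-form approximation to the one-sided normal tolerance factor k
    # (used here in place of a lookup table).
    a = 1 - z_g ** 2 / (2 * (n - 1))
    b = z_p ** 2 - z_g ** 2 / n
    k = (z_p + math.sqrt(z_p ** 2 - a * b)) / a
    return math.exp(mean(logs) + k * stdev(logs))  # back-transform

# Hypothetical density ratios from normal samples
ratios = [0.8, 0.9, 1.0, 1.1, 1.2, 1.0, 0.95, 1.05]
cutoff = upper_cutoff(ratios)
print(f"cut-off ratio: {cutoff:.2f}")
```

A sample whose gene-to-reference density ratio exceeds the returned value would be scored as amplified or overexpressed under these assumptions. With small numbers of normal samples the tolerance factor grows quickly, so the cut-off is correspondingly conservative.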
Figure 42 shows the overexpression of CH13-2a12-1 in primary breast cancer samples as determined by in situ hybridization. A negatively stained cell line (BT20, which does not overexpress CH13-2a12-1) and a positively stained cell line (600PE) are shown in Panels A and B, respectively. This result is consistent with the data obtained by Northern hybridization. Panel C shows a representative tumor that overexpressed the gene for CH13-2a12-1, while adjacent normal breast epithelium stained negatively, as shown in Panel D. CH13-2a12-1 mRNA was detected in 14 of 30 (47%) of breast tumors. In contrast, in 14 of 15 (93%) cases, the adjacent normal breast epithelium stained negatively for CH13-2a12-1 mRNA.
It is a hypothesis of this disclosure that CH13-2a12-1 belongs to a conserved family of genes, cullin (Kipreos, E. T. et al., Cell 85:828, 1996). It is therefore proposed that this gene be designated Cul-4A, the human form of which is Hs-Cul-4A.
There are at least six human, five nematode and three yeast cullin genes. The rabbit ortholog of cul-5, known as VACM-1 (Vasopressin-Activated Calcium Mobilizing receptor 1), mobilizes Ca2+ after arginine vasopressin induction, increasing the intracellular calcium concentration. The highest expression level of Hs-cul-5 was found to be in the heart and the skeletal muscle, contractile tissues that require a high level of calcium influx. This disclosure shows that heart and skeletal muscle also expressed the highest level of Hs-cul-4A, suggesting it may also be involved in calcium mobilization. However, there is no obvious transmembrane sequence in Hs-cul-4A.
On the other hand, the yeast cul-1 gene, or Cdc53 (Kipreos et al., supra), is part of a protein complex that targets cell cycle proteins for degradation by the ubiquitin-proteasome pathway. The Caenorhabditis elegans cul-1 (Ce-cul-1) is required for cells to exit the cell cycle. Moreover, a null mutation of Ce-cul-1 causes hyperplasia in all tissues. Recently, the human cul-2 gene product has been shown to form a stable complex with the von Hippel-Lindau tumor suppressor gene product (Pause, A. et al., Proc Natl Acad Sci USA 94:2156, 1997). Taken together, these results suggest that cul-1 and cul-2 gene products might be candidate tumor suppressors. In contrast, the data provided here suggest that overabundance of cul-4A contributes to malignancy; thus cul-4A may function as an oncogene. For instance, Hs-cul-4A may target tumor suppressors or other proteins which negatively regulate the cell cycle. It is also possible that overexpression of Hs-cul-4A may have a profound effect on localization of the protein to the proper cellular compartment. Both Ce-cul-1 and Hs-cul-2 contain potential bipartite nuclear targeting signals, and the latter is a cytosolic protein that can be translocated to the nucleus by binding to the pVHL complex. Overexpression of Hs-cul-2 without pVHL or coexpression of Hs-cul-2 with a mutant pVHL results in localization of Hs-cul-2 to the cytosol exclusively. However, Hs-cul-4 contains a variant nuclear localization signal. The lack of conservation of the nuclear localization motif in some members of the cullin family suggests that this motif is not required for their functions.
Example 7: Chromosome 14 gene CH14-2a16-1
One of the cDNA obtained corresponded to a gene that mapped to Chromosome 14. Results of the analysis are summarized in Table 10. The scoring method is the same as for Example 4.
Figure imgf000075_0001
+ = gene duplication or overabundance; - = no duplication or overabundance; nd = not done
* Degree of gene duplication is reported relative to placental DNA preparations
** Degree of RNA overabundance is reported relative to the highest level observed for several cultures of normal epithelial cells
The gene corresponding to CH14-2a16-1 was duplicated in 8 out of 15 (53%) of the cell lines tested.
The sequence of 114 bases from the 5' end of the cDNA fragment is shown in Figure 22 (SEQ ID NO:7). There was no substantial homology to any known gene in GenBank. One of the three possible reading frames was found to be open, with the predicted amino acid sequence shown in Figure 22 (SEQ ID NO:8).
The CH14-2a16-1 gene was further characterized by obtaining additional sequence information. A λ-GT10 cDNA library from the breast cancer cell line BT474 (Example 2) was screened using the initial cDNA insert, and two clones were identified: one with a 1.6 kb insert, and the other with a 2.5 kb insert. The identified clones were subcloned into plasmid vector pCRII. The 1.6 kb insert was sequenced by using T7 and Sp6 primers for regions flanking the cDNA inserts as initial sequencing primers. Sequencing continued by walking along the region of interest by standard techniques, using sequencing primers based on data already obtained. Primers used are those designated 1-11 in Figure 18.
A third clone (designated pCH14-800) overlapping on the 5' end (Figure 6) was obtained using the CLONTECH Marathon™ cDNA Amplification Kit. Briefly, DNA primers CH14a, CH14b, CH14c and CH14d (Figure 18) were prepared. Polyadenylated RNA from breast cancer cell line MDA453 was reverse transcribed using the CH14b primer. After second strand synthesis, adaptor DNA provided in the kit was ligated to the double-stranded cDNA. The 5' end cDNA of CH14-2a16-1 was then amplified by PCR using primers CH14b (or CH14c) and AP1 (provided in the kit). To increase the specificity of the PCR products, the first PCR products were reamplified by PCR using nested primers CH14a (or CH14d) and AP2 (provided in the kit). The PCR products were cloned into pCRII vector (Invitrogen) and screened with the CH14-2a16-1 probe.
By sequencing pCH14-1.6 and pCH14-800, a nucleic acid sequence of 2021 base pairs between the 5' end and the poly-A tail of CH14-2a16-1 has been determined. The DNA sequence is shown in Figure 19 (SEQ ID NO:26). The longest open reading frame is in frame 1 from base 1 to 792, and codes for 263 amino acids before the stop codon. The corresponding amino acid sequence of this frame is shown in the upper panel of Figure 20 (SEQ ID NO:27). The partial sequence predicted for the translated protein is shown in the lower panel of Figure 20 (SEQ ID NO:28). The 2.1 kb clone has not been sequenced, but is believed to span about the same region of the CH14-2a16-1 cDNA as pCH14-1.6 and pCH14-800 combined.
A GENINFO® BLAST search of nucleotide and peptide sequence databases was performed through the National Center for Biotechnology Information on March 26, 1996. Short segments of homology with other reported human sequences were found at the nucleotide level (<500 base pairs), but none with any ascribed function in the respective identifier. At the amino acid level, the sequence was found to share homologies within the first 106 residues with an RNA binding protein from Saccharomyces cerevisiae with the designation NAB2. NAB2 is one of the major proteins associated with nuclear polyadenylated RNA in yeast cells, as detected by UV light-induced cross-linking and immunofluorescence. NAB2 is strongly and specifically associated with nuclear poly(A)+ RNA in vivo. Gene knock-out experiments have shown that this protein is essential to yeast cell survival (Anderson et al). Accordingly, the protein encoded by CH14-2a16-1 is suspected of having DNA or RNA binding activity.
A fourth clone (pCH14-1.3) has been obtained that overlaps the pCH14-800 clone at the 5' end (Figure 6). The method of isolation was similar to that for pCH14-800, using primers based on the pCH14-800 sequence. Partial sequence data for pCH14-1.3 has been obtained by one-directional sequencing from the 5' and 3' ends of the pCH14-1.3 clone. Figure 21 shows the nucleotide sequence of the 5' end (SEQ ID NO:29) and the amino acid translation of the likely open reading frame (SEQ ID NO:30), and the nucleotide sequence of the 3' end (SEQ ID NO:31) and the likely open reading frame (SEQ ID NO:32). This data is confirmed and additional sequence between SEQ ID NOS:29 and 31 is obtained by fully sequencing both strands of pCH14-1.3. Once compiled, the sequence data from pCH14-1.3, pCH14-800 and pCH14-1.6 may be shorter than the apparent size of mRNA observed in Northern analysis (Table 1). If necessary, further sequence data at the 5' end is deduced by obtaining additional cloned cDNA according to approaches described in this Example or Example 4.
Figure 25 is a listing of additional cDNA sequence obtained for CH14-2a16-1, comprising approximately 1934 base pairs 5' from the sequence of Figure 19. The corresponding amino acid translation is shown in the upper panel of Figure 26. The additional sequence data was obtained by rescuing and amplifying further fragments of CH14-2a16-1 cDNA. Nested primers were designed about 100 base pairs downstream from the 5' end of the known sequence. The primers were used in a nested amplification assay with AP1 and AP2, using the CLONTECH Marathon™ cDNA Amplification Kit as described above. The template was a Marathon™-ready cDNA preparation from human testes, also supplied by CLONTECH.
The nucleotide sequence shown in Figure 25 is closed at the 5' end. The lower panel of Figure 26 shows what is predicted to be the sequence of the gene product, beginning at the first methionine residue. The nucleotide sequence shown contains a point difference at the position indicated by the underlining in Figure 25. A base determined to be A in the previously obtained polynucleotide fragment was a G in the one used in this part of the experiment. This corresponds to a change from E (glutamic acid) to G (glycine) in the protein sequence, at the position underlined in Figure 26. This may represent a natural allelic variation.
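Under the standard genetic code, a single A-to-G substitution can convert a glutamic acid codon into a glycine codon only at the second codon position. The disclosure does not state which codon is involved, so the sketch below simply enumerates the possibilities; the function and variable names are illustrative.

```python
# Standard-genetic-code codons for the two residues involved
GLU_CODONS = {"GAA", "GAG"}
GLY_CODONS = {"GGA", "GGC", "GGG", "GGT"}

def a_to_g_glu_to_gly(codon: str):
    """List (position, new_codon) for single A->G changes turning Glu into Gly."""
    hits = []
    if codon in GLU_CODONS:
        for i, base in enumerate(codon):
            if base == "A":
                mutated = codon[:i] + "G" + codon[i + 1:]
                if mutated in GLY_CODONS:
                    hits.append((i + 1, mutated))
    return hits

for codon in sorted(GLU_CODONS):
    print(codon, a_to_g_glu_to_gly(codon))
```

Either glutamic acid codon (GAA or GAG) becomes a glycine codon (GGA or GGG) through this one-base change, which is consistent with the E-to-G difference described above arising from a single point difference.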
A CH14-2a16-1 cloned insert has been used to probe the level of relative expression in polyadenylated RNA from a panel of tissue sources obtained from CLONTECH, as in Example 4. The relative CH14-2a16-1 expression observed at the mRNA level is shown in Table 11:
Figure imgf000078_0001
CH14-2a16-1 mRNA was particularly high in testis. The level of expression in breast cancer cell lines is also quite high, since the Northern analysis performed on these lines was conducted on total cellular RNA. It is likely that the CH14-2a16-1 gene is involved in a biological process that is typical to the tissue types showing medium to high levels of expression, which may relate to increased tissue growth or metabolism.
Five motifs corresponding to a zinc finger protein have been found in the CH14-2a16-1 nucleotide sequence. Further zinc finger motifs may be present in CH14-2a16-1 in the upstream direction. Zinc finger motifs are present, for example, in RNA polymerases I, II, and III from S. cerevisiae, and are related to the zinc knuckle family of RNA/ssDNA-binding proteins found in the HIV nucleocapsid protein. The actual sequence observed in each of the five zinc finger motifs of CH14-2a16-1 is:
Cys-(Xaa)5-Cys-(Xaa)4-Cys-(Xaa)3-His (SEQ ID NO:38) or
Cys-(Xaa)5-Cys-(Xaa)5-Cys-(Xaa)3-His (SEQ ID NO:39)
which is indicated in Figure 20 by underlining. This is identical to the 7 zinc finger motifs of NAB2, which make up an RNA/ssDNA binding region (Anderson et al.). Accordingly, the CH14-2a16-1 gene product is suspected of having DNA or RNA binding activity, and may be specific for polyadenylated RNA. It may very well play a role in the regulation of gene replication, transcription, the processing of hnRNA into mature mRNA, the export of mRNA from the nucleus to the cytoplasm, or translation into protein. This role in turn may be closely implicated in cell growth or proliferation, particularly as manifest in tumor cells.
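A consensus of this kind can be located in a protein sequence with a regular expression. The sketch below reads both motifs as Cys-X5-Cys-X4/5-Cys-X3-His, which is an assumption made here for illustration; the function name and the toy sequence are invented and are not taken from Figure 20.

```python
import re

# C-X5-C-X4-C-X3-H or C-X5-C-X5-C-X3-H, written as one pattern.
# The lookahead allows overlapping candidate motifs to be reported as well.
ZINC_FINGER = re.compile(r"(?=(C.{5}C.{4,5}C.{3}H))")

def find_zinc_fingers(protein: str):
    """Return (1-based start, matched motif) for each candidate CCCH motif."""
    return [(m.start() + 1, m.group(1))
            for m in ZINC_FINGER.finditer(protein.upper())]

# Hypothetical toy sequence containing one motif of each spacing
toy = "MK" + "CAAAAACGGGGCSSSH" + "LL" + "CAAAAACGGGGGCSSSH"
print(find_zinc_fingers(toy))
```

A pattern match only flags candidate motifs by spacing of the Cys and His residues; whether a given match actually coordinates zinc or binds nucleic acid would need experimental confirmation.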
Example 8: Identification of other cancer-associated genes
cDNA fragments corresponding to additional cancer-associated genes are obtained by applying the techniques of Examples 1 & 2 with appropriate adaptations. As before, cancer cells are selected for use in differential display of RNA, based on whether they share a duplicated chromosomal region according to Table 12:
Figure imgf000079_0001
Figure imgf000080_0001
Figure imgf000081_0001
Control RNA is prepared from normal tissues to match that of the cancer cells in the experiment. Normal tissue is obtained from autopsy, biopsy, or surgical resection. Absence of neoplastic cells in the control tissue is confirmed, if necessary, by standard histological techniques. cDNA corresponding to RNA that is overabundant in cancer cells and duplicated in a proportion of the same cells is characterized further, as in Examples 3-7. Additional cDNA comprising an entire protein-product encoding region is rescued or selected according to standard molecular biology techniques as described elsewhere in this disclosure.
REFERENCES
Articles on general topics
1 Adnane J et al. (1991), "BEK and FLG, two receptors to members of the FGF family, are amplified in subsets of human breast cancers", Oncogene 6:659-661
2 Alitalo K et al. (1986), "Oncogene amplification in tumor cells", Adv Cancer Res 47:235-281
3 Altschul et al. (1986), Bull Math Bio 48:603-616
4 Beardsley T (1994), "Crabshoot: manufacturers gamble on cancer vaccines again", Scientific American, Sept:102
5 Berns E M et al. (1992), "Sporadic amplification of the insulin-like growth factor 1 receptor gene in human breast tumors", Cancer Res 52:1036-1039
6 Bishop J M (1991), "Molecular themes in oncogenesis", Cell 64:235-248
7 Bast R C Jr (1993), "Perspectives on the future of cancer markers", Clin Chem 31:2444-2451
8 Brison O (1993), "Gene amplification and tumor progression", Biochim Biophys Acta 1155:25-41
9 Culver K W et al. (1994), "Gene therapy for cancer", Trends Genet 10:174-178
10 Henikoff et al. (1992), Proc Natl Acad Sci USA 89:10915-10919
11 Kallioniemi A et al. (1992), "Comparative genomic hybridization for molecular cytogenetic analysis of solid tumors", Science 258:818-821
12 Kocher O et al. (1995), "Identification of a novel gene, selectively up-regulated in human carcinomas, using the differential display technique", Clin Cancer Res 1:1209-1215
13 Lippman M E (1993), "The development of biological therapies for breast cancer", Science 259:631-632
14 MacLean G D et al. (1992), "The immune system, cancer antigens and immunotherapy", Contemp Oncol Aug/Sept
15 McKenzie D et al. (1994), "Using the RNA arbitrarily primed polymerase chain reaction (RAP-PCR) to analyze gene expression in human breast cancer cell lines" [abstract], J Cell Biochem 18D:248
16 Muss H B et al. (1994), "c-erbB-2 expression and response to adjuvant therapy in women with node-positive early breast cancer", New Engl J Med 330:1260-1266
17 Morgan R A et al. (1993), "Human gene therapy", Annu Rev Biochem 62:191-217
18 Roth J A (1994), "Modulation of oncogene and tumor-suppressor gene expression: a novel strategy for cancer prevention and treatment", Ann Surg Oncol 1:79-86
19 Saint-Ruf C et al. (1990), "Proto-oncogene amplification and homogeneously staining regions in human breast carcinomas", Genes Chromosomes Cancer 2:18-26
20 Slamon D J et al. (1987), "Human breast cancer: correlation of relapse and survival with amplification of the HER-2/neu oncogene", Science 235:178-182
21 Schwab M et al. (1990), "Amplification of cellular oncogenes: a predictor of clinical outcome of human cancer", Genes Chromosomes Cancer 1:181-193
22 Thompson C T et al. (1993), "Cytogenetic profiling using fluorescence in situ hybridization (FISH) and comparative genomic hybridization (CGH)", J Cell Biochem 17G:139-143
23 Unsigned (1994), "Synthetic vaccine stabilizes advanced cancer, prolongs survival", Oncol News 3:1
24 Watson M A et al. (1994), "Isolation of differentially expressed sequence tags from human breast cancer", Cancer Res 54:4598-4602
25 Watson M A et al. (1996), "Mammaglobulin, a mammary-specific member of the uteroglobulin gene family, is overexpressed in human breast cancer", Cancer Res 56:860-865
26 Zafrani B et al. (1992), "Cytogenetic study of breast cancer", Hum Pathol 23:542-547
Articles on Differential Display, RNA fingerprinting, and related techniques
1 Ayala M et al. (1995), "New primer strategy improves precision of differential display", BioTechniques 18:842-850
2 Bauer D et al. (1993), "Identification of differentially expressed mRNA species by an improved display technique (DDRT-PCR)", Nucl Acids Res 21:4272-4280
3 Bertioli D J et al. (1995), "An analysis of differential display shows a strong bias towards high copy number mRNAs", Nucl Acids Res 23:4520-4523
4 Chen Z et al. (1995), "Differential expression of human tissue factor in normal mammary epithelial cells and in carcinomas", Molecular Med 1:153-160
5 Haag E et al. (1994), "Effects of primer choice and source of Taq DNA polymerase on the banding patterns of differential display RT-PCR", BioTechniques 17:226-228
6 Hadman M et al. (1995), "Modifications to the differential display technique reduce background and increase sensitivity", Anal Biochem 226:383-386
7 Ito T et al. (1994), "Fluorescent differential display: arbitrarily primed RT-PCR fingerprinting on an automated DNA sequencer", FEBS Lett 351:231-236
8 Liang P et al. (1992a), "Differential display of eukaryotic messenger RNA by means of the polymerase chain reaction", Science 257:967-971
9 Liang P et al. (1992b), "Differential display and cloning of messenger RNAs in human breast cancer versus mammary epithelial cells", Cancer Res 52:6966-6968
10 Liang P et al. (1993), "Distribution and cloning of eukaryotic mRNAs by means of differential display: refinements and optimization", Nucl Acids Res 21:3269-3275
11 Liang P et al. (1994), "Differential display using one-base anchored oligo-dT primers", Nucl Acids Res 22:5763-5764
12 Liang P et al. (1995a), "Recent advances in differential display", Curr Opin Immunol 7:274-280
13 Liang P et al. (1995b), "Analysis of altered gene expression by differential display", Methods Enzymol 254:304-321
14 Linskens M H K et al. (1995), "Cataloging altered gene expression in young and senescent cells using enhanced differential display", Nucl Acids Res 23:3244-3251
15 Sager R et al. (1993), "Identification by differential display of alpha-6 integrin as a candidate tumor suppressor gene", FASEB J 7:964-970
16 Sompayrac L et al. (1995), "Overcoming limitations of the mRNA differential display technique", Nucl Acids Res 23:4738-4739
17 Sun Y et al. (1994), "Molecular cloning of five messenger RNAs differentially expressed in preneoplastic or neoplastic JB6 mouse epidermal cells: one is homologous to human tissue inhibitor of metalloproteinases-3", Cancer Res 54:1139-1144
18 Sunday M E et al. (1995), "Differential display RT-PCR for identifying novel gene expression in the lung", Am J Physiol 269:L273-L284
19 Trentmann S M et al. (1995), "Alternatives to 35S as a label for the differential display of eukaryotic messenger RNA", Science 267:1186-1187
20 Welsh J et al. (1992), "Arbitrarily primed PCR fingerprinting of RNA", Nucl Acids Res 20:4965-4970
21 Yeatman T J et al. (1995), "Identification of a differentially-expressed message associated with colon cancer liver metastasis using an improved method of differential display", Nucl Acids Res 23:4007-4008
22 Yoshikawa T et al. (1995), "Detection, simultaneous display and direct sequencing of multiple nuclear hormone receptor genes using bilaterally targeted RNA fingerprinting", Biochim Biophys Acta 1264:63-71
Articles on Duplicated Chromosome Regions in Cancer
1 Bentz M et al. (1994), "Fluorescent in situ hybridization in leukemias: 'the FISH are spawning'", Leukemia 8:1447-1452
2 Bentz M et al. (1995a), "Comparative genomic hybridization in chronic B-cell leukemias shows a high incidence of chromosomal gains and losses", Blood 85:3610-3618
3 Bentz M et al. (1995b), "Comparative genomic hybridization in the investigation of myeloid leukemias", Genes Chrom Cancer 12:193-200
4 Bryndorf T et al. (1995), "Comparative genomic hybridization in clinical cytogenetics", Am J Hu Genetics 57:1211-1220
5 Cher ML et al. (1994), "Comparative genomic hybridization, allelic imbalance, and fluorescence in situ hybridization on chromosome 8 in prostate cancer", Genes Chrom Cancer 11:153-162
6 Dutrillaux B et al. (1990), "Characterization of chromosomal anomalies in human breast cancer", Cancer Genet Cytogenet 49:203-217
7 Feuerstein BG et al. (1995), "Molecular cytogenetic quantitation of gains and losses of genetic material from human gliomas", J Neuro-Oncol 24:47-55
8 Forus A et al. (1995a), "Comparative genomic hybridization analysis of human sarcomas: I. Occurrence of genomic imbalances and identification of a novel major amplicon at 1q21-q22 in soft tissue sarcomas", Genes Chrom Cancer 14:8-14
9 Forus A et al. (1995b), "Comparative genomic hybridization analysis of human sarcomas: II. Identification of novel amplicons at 6p and 17p in osteosarcomas", Genes Chrom Cancer 14:15-21
10 Gordon KB et al. (1994), "Comparative genomic hybridization in the detection of DNA copy number abnormalities in uveal melanoma", Cancer Res 54:4764-4768
11 Gray JW et al. (1994), "Fluorescence in situ hybridization in cancer and radiation biology", Radiation Res 137:275-289
12 Houldsworth J et al. (1994), "Comparative genomic hybridization: an overview", Am J Path 145:1253-1260
13 Isola JJ et al. (1995), "Genetic aberrations detected by comparative genomic hybridization predict outcome in node-negative breast cancer", Am J Path 147:905-911
14 Iwabuchi H et al. (1995), "Genetic analysis of benign, low-grade, and high-grade ovarian tumors", Cancer Res 55:6172-6180
15 Kallioniemi A et al. (1994), "Detection and mapping of amplified DNA sequences in breast cancer by comparative genomic hybridization", Proc Natl Acad Sci USA 91:2156-2160
16 Kallioniemi A et al. (1995), "Identification of gains and losses of DNA sequences in primary bladder cancer by comparative genomic hybridization", Genes Chrom Cancer 12:213-219
17 Kim DH et al. (1995), "Chromosomal abnormalities in glioblastoma multiforme tumors and glioma cell lines detected by comparative genomic hybridization", Int J Cancer 60:812-819
18 Levin NA et al. (1994), "Identification of frequent novel genetic alterations in small cell lung carcinoma", Cancer Res 5086-5091
19 Levin NA et al. (1995), "Identification of novel regions of altered DNA copy number in small cell lung tumors", Genes Chrom Cancer 13:175-185
20 Lisitsyn NA et al. (1995), "Comparative genomic analysis of tumors: detection of DNA losses and amplification", Proc Natl Acad Sci USA 92:151-155
21 Mohamed AN et al. (1994), "Extrachromosomal gene amplification in acute myeloid leukemia, characterization by metaphase analysis, comparative genomic hybridization, and semi-quantitative PCR", Genes Chrom Cancer 8:185-189
22 Mohapatra G et al. (1995), "Detection of multiple gains and losses of genetic material in ten glioma cell lines by comparative genomic hybridization", Genes Chrom Cancer 13:86-93
23 Muleris M et al. (1994a), "Detection of DNA amplification in 17 primary breast carcinomas with homogeneously staining regions by a modified comparative genomic hybridization technique", Genes Chrom Cancer 10:160-170
24 Muleris M et al. (1994b), "Oncogene amplification in human gliomas: a molecular cytogenetic analysis", Oncogene 9:2717-2722
25 Nacheva E et al. (1995), "Comparative genomic hybridization in acute myeloid leukemia: a comparison with G-banding and chromosome painting", Cancer Genetics Cytogenetics 82:9-16
26 Ried T et al. (1994), "Mapping of multiple DNA gains and losses in primary small cell lung carcinomas by comparative genomic hybridization", Cancer Res 54:1801-1806
27 Ried T et al. (1995), "Comparative genomic hybridization of formalin-fixed, paraffin-embedded breast tumors reveals different patterns of chromosomal gains and losses in fibroadenomas and diploid and aneuploid carcinomas", Cancer Res 55:5415-5423
28 Schlegel J et al. (1994), "Detection of amplified DNA sequences by comparative genomic in situ hybridization with human glioma tumor DNA as probe", Verhand Deut G Path 78:204-207
29 Schlegel J et al. (1995), "Comparative genomic in situ hybridization of colon carcinomas with replication error", Cancer Res 55:6002-6005
30 Schlegel J et al. (1996), "Detection of complex genetic alterations in human glioblastoma multiforme using comparative genomic hybridization", J Neuropa Mol Exp Neurol 55:81-87
31 Schrock E et al. (1994), "Comparative genomic hybridization of human malignant gliomas reveals multiple amplification sites and nonrandom chromosomal gains and losses", Am J Path 144:1203-1218
32 Seruca R et al. (1995), "Increasing levels of MYC and MET co-amplification during tumor progression of a case of gastric cancer", Cancer Genetics Cytogenetics 82:140-145
33 Speicher MR et al. (1994), "Chromosomal gains and losses in uveal melanomas detected by comparative genomic hybridization", Cancer Res 54:3817-3823
34 Speicher MR et al. (1995), "Comparative genomic hybridization detects novel deletions and amplifications in head and neck squamous cell carcinomas", Cancer Res 55:1010-1013
35 Steilen-Gimbel H et al. (1996), "A novel site of DNA amplification on chromosome 1p32-33 in a rhabdomyosarcoma revealed by comparative genomic hybridization", Hu Genetics 97:87-90
36 Suijkerbuijk RF et al. (1994), "Comparative genomic hybridization as a tool to define two distinct chromosome 12-derived amplification units in well-differentiated liposarcomas", Genes Chrom Cancer 9:292-295
37 Tanner MM et al. (1994), "Increased copy number at 20q13 in breast cancer: defining the critical region and exclusion of candidate genes", Cancer Res 54:4257-4260
38 Tarkkanen M et al. (1995), "Gains and losses of DNA sequences in osteosarcomas by comparative genomic hybridization", Cancer Res 55:1334-1338
39 Visakorpi T et al. (1995a), "Genetic changes in primary and recurrent prostate cancer by comparative genomic hybridization", Cancer Res 55:342-347
40 Visakorpi T et al. (1995b), "In vivo amplification of the androgen receptor gene and progression of human prostate cancer", Nature Genetics 9:401-406
41 Voorter C et al. (1995), "Detection of chromosomal 
imbalances in transitional cell carcinoma of the bladder by comparative genomic hybridization", Am J Path 146 1341 -1354 Wiltshire RN et al (1995), "Direct visualization of the clonal progression of primary cutaneous melanoma application of tissue microdissection and comparative genomic hybridization", Cancer Res 55.3954-3957
TABLE OF SEQUENCE LISTINGS:
Figure imgf000088_0001
Figure imgf000089_0001
SEQ ID NO: 9: l lll TCC 13
SEQ ID NO: 10: l lll TAC 13
SEQ ID NO: 11: CAATCGCCGT 10
SEQ ID NO: 12: TCGGCGATAG 10
SEQ ID NO: 13: CAGCACCCAC 10
SEQ ID NO: 14: AGCCAGCGAA 10

Claims

What is claimed as the invention is:
1. An isolated polynucleotide comprising a linear sequence of at least 40 consecutive nucleotides at least 90% identical to a linear sequence contained in a sequence selected from the group consisting of SEQ ID NOS: 15, 33, 45, 47, and 49 (CH1-9a11-2), SEQ ID NOS: 18 and 21 (CH8-2a13-1), SEQ ID NOS: 23, 51, 53, and 55 (CH13-2a12), and SEQ ID NOS: 26, 29, 31, and 35 (CH14-2a16-1), but not any of SEQ ID NOS: 3, 5, and 7.
2. The isolated polynucleotide of claim 1, comprising a linear sequence of at least 100 consecutive nucleotides at least 90% identical to a sequence contained in the selected sequence.
3. The isolated polynucleotide of claim 1, comprising a linear sequence of at least 40 consecutive nucleotides at least 95% identical to a sequence contained in the selected sequence.
4. The isolated polynucleotide of claim 1, comprising a linear sequence of polynucleotides essentially identical to a sequence selected from the group consisting of SEQ ID NOS: 15, 33, 45, 47, and 49 (CH1-9a11-2), SEQ ID NOS: 18 and 21 (CH8-2a13-1), SEQ ID NOS: 23, 51, 53, and 55 (CH13-2a12), and SEQ ID NOS: 26, 29, 31, and 35 (CH14-2a16-1).
5. An isolated polynucleotide comprising a linear sequence of at least 40 consecutive nucleotides having one or more of the following properties:
   i) the ability to hybridize with DNA having a sequence selected from the group consisting of SEQ ID NOS: 15, 33, 45, 47, and 49 (CH1-9a11-2), SEQ ID NOS: 18 and 21 (CH8-2a13-1), SEQ ID NOS: 23, 51, 53, and 55 (CH13-2a12), and SEQ ID NOS: 26, 29, 31, and 35 (CH14-2a16-1), under conditions where it does not hybridize with DNA consisting of SEQ ID NOS: 3, 5, 7, or any other DNA from a human cell; or
   ii) the ability to hybridize with RNA having a sequence selected from the group consisting of SEQ ID NOS: 15, 33, 45, 47, and 49 (CH1-9a11-2), SEQ ID NOS: 18 and 21 (CH8-2a13-1), SEQ ID NOS: 23, 51, 53, and 55 (CH13-2a12), and SEQ ID NOS: 26, 29, 31, and 35 (CH14-2a16-1), under conditions where it does not hybridize with RNA consisting of SEQ ID NOS: 3, 5, 7, or any other RNA from a human cell.
6. The isolated polynucleotide of claim 5, wherein the linear sequence is at least 100 consecutive nucleotides.
7. The isolated polynucleotide of claim 1, wherein said linear sequence is contained in a duplicated gene or overabundant RNA in cancerous cells.
8. The isolated polynucleotide of claim 1, which is a CH13-2a12-1 polynucleotide, and is contained in an encoding region for a protein or RNA molecule that controls cell proliferation.
9. The isolated polynucleotide of claim 1, which is a CH14-2a16-1 polynucleotide, and is contained in an encoding region for a protein with DNA or RNA binding activity.
10. The isolated polynucleotide of claim 1, having a nucleotide sequence contained in a recombinant plasmid deposited under ATCC Accession No. 98074, a recombinant phage deposited under ATCC Accession No. 97595, or the λBCBT474 cDNA library deposited under ATCC Accession No. 97594.
11. An isolated polypeptide comprising a linear sequence of at least 5 amino acid residues identical to a sequence encoded by a polynucleotide according to claim 1.
12. The isolated polypeptide of claim 11, comprising a linear sequence of at least 15 consecutive amino acids at least 90% identical to a linear sequence contained in the selected sequence.
13. An isolated polypeptide comprising a linear sequence of at least five amino acids essentially identical to a sequence selected from the group consisting of SEQ ID NOS: 17, 34, 46, 48, and 50 (CH1-9a11-2), SEQ ID NOS: 20 and 22 (CH8-2a13-1), SEQ ID NOS: 25, 52, 54, and 56 (CH13-2a12), and SEQ ID NOS: 28, 30, 32, and 37 (CH14-2a16-1).
14. The isolated polypeptide of claim 13, which is overexpressed in cancerous cells.
15. The isolated polypeptide of claim 13, wherein the polynucleotide selected from said group is a CH1-9a11-2 polynucleotide, and the polypeptide is a transmembrane polypeptide.
16. An isolated polynucleotide comprising an encoding sequence for the polypeptide of claim 13.
17. A monoclonal or isolated polyclonal antibody specific for the polypeptide of claim 13.
18. A host cell genetically altered with the polynucleotide of claim 1, or any progeny thereof retaining the genetic alteration.
19. A method of detecting gene duplication in cancerous cells, comprising the steps of:
   a) reacting DNA contained in a clinical sample with a reagent comprising the polynucleotide of claim 1, said clinical sample having been obtained from an individual suspected of having cancerous cells; and
   b) comparing the amount of any complexes formed between the reagent and the DNA in the clinical sample with the amount of any complexes formed between the reagent and DNA in a control sample.
20. A method of detecting overabundance of RNA in cancerous cells, comprising the steps of:
   a) reacting RNA contained in a clinical sample with a reagent comprising the polynucleotide of claim 1, said clinical sample having been obtained from an individual suspected of having cancerous cells; and
   b) comparing the amount of any complexes formed between the reagent and the RNA in the clinical sample with the amount of any complexes formed between the reagent and RNA in a control sample.
21. A method of determining gene duplication or overabundance of RNA in cancerous cells, comprising the steps of:
   a) amplifying DNA or RNA in a clinical sample with a primer comprising the polynucleotide of claim 1 to yield an amplified polynucleotide, said clinical sample having been obtained from an individual suspected of having cancerous cells; and
   b) comparing the amount of polynucleotide amplified from the DNA or RNA with the amount of polynucleotide amplified from DNA or RNA from a control sample.
22. A method of screening for cancer associated with a gene duplication in an individual, comprising the steps of:
   a) determining gene duplication in cells from the individual according to the method of claim 19; and
   b) correlating any gene duplication determined in step a) with an increased risk for the cancer.
23. A method of screening for cancer associated with overexpression of RNA in an individual, comprising the steps of:
   a) determining overexpression of RNA in cells from the individual according to the method of claim 20; and
   b) correlating any RNA overexpression determined in step a) with an increased risk for the cancer.
24. A method of screening for cancer associated with a gene duplication or overexpression of RNA in an individual, comprising the steps of:
   a) determining gene duplication or overexpression of RNA in cells from the individual according to the method of claim 21; and
   b) correlating any gene duplication or overexpression of RNA determined in step a) with an increased risk for the cancer.
25. The method of claim 23, which is a screening method for breast cancer.
26. A method for detecting altered protein expression in cancerous cells, comprising the steps of:
   a) reacting a polypeptide contained in a clinical sample with a reagent comprising the antibody of claim 17, said clinical sample having been obtained from an individual suspected of having cancerous cells; and
   b) comparing the amount of any complexes formed between the reagent and the polypeptide in the clinical sample with the amount of any complexes formed between the reagent and a polypeptide in a control sample.
27. A method of screening a candidate drug for cancer treatment, comprising the steps of:
   a) providing a host cell genetically altered by the polynucleotide of claim 1;
   b) separating progeny of the cell of step a) into a first group and a second group;
   c) treating the first group of cells with the pharmaceutical candidate;
   d) not treating the second group of cells with the pharmaceutical candidate; and
   e) comparing the phenotype of the treated cells with that of the untreated cells.
28. A method of screening a candidate drug for cancer treatment, comprising obtaining cDNA corresponding to a gene that is duplicated, deleted or has altered expression in cancer according to a method comprising the following steps:
   a) supplying an RNA preparation from control cells;
   b) supplying RNA preparations from at least two different cancer cells that share a duplicated or deleted gene in the same region of a chromosome;
   c) displaying cDNA corresponding to the RNA preparations of step a) and step b) such that different cDNA corresponding to different RNA in each preparation are displayed separately; and
   d) selecting cDNA corresponding to RNA that is present in higher abundance in the cancer cells of step b) relative to the control cells of step a) if the cancer cells share a duplicated gene, or selecting cDNA corresponding to RNA that is present in lower abundance in the cancer cells of step b) relative to the control cells of step a) if the cancer cells share a deleted gene;
   and then comparing the effect of the candidate drug on a cell genetically altered to affect the level of expression of the cDNA with the effect on a cell not so altered.
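Illustrative note (not part of the claims): the sequence criterion recited in claim 1, "at least 40 consecutive nucleotides at least 90% identical", can be sketched computationally as follows. This is a minimal sketch under stated assumptions, not the method of the specification: it compares ungapped windows of exactly `min_len` positions, so it is a sufficient test only, and the specification may define percent identity differently (for example, with alignment gaps). The function names `count_matches` and `has_identical_stretch` are hypothetical.

```python
def count_matches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences carry the same base."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b))


def has_identical_stretch(query: str, reference: str,
                          min_len: int = 40, min_identity_pct: int = 90) -> bool:
    """Return True if some ungapped min_len-long window of `query` is at least
    min_identity_pct percent identical to some window of `reference`.

    Assumption: identity is computed position by position without gaps.
    """
    q, r = query.upper(), reference.upper()
    for i in range(len(q) - min_len + 1):
        q_window = q[i:i + min_len]
        for j in range(len(r) - min_len + 1):
            matches = count_matches(q_window, r[j:j + min_len])
            # Integer comparison avoids floating-point rounding at the threshold.
            if matches * 100 >= min_identity_pct * min_len:
                return True
    return False
```

With the defaults, a 40-nucleotide window passes with up to 4 mismatches (36 of 40 matching positions). Passing `min_len=100` or `min_identity_pct=95` corresponds to the thresholds of claims 2 and 3 respectively, under the same ungapped assumption.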
PCT/US1999/018101 1998-08-10 1999-08-10 Genes amplified in cancer cells WO2000009655A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU54748/99A AU5474899A (en) 1998-08-10 1999-08-10 Genes amplified in cancer cells

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US13202998A 1998-08-10 1998-08-10
US09/132,029 1998-08-10

Publications (3)

Publication Number Publication Date
WO2000009655A2 WO2000009655A2 (en) 2000-02-24
WO2000009655A3 WO2000009655A3 (en) 2000-05-18
WO2000009655A9 WO2000009655A9 (en) 2000-11-09

Family

ID=22452121

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/018101 WO2000009655A2 (en) 1998-08-10 1999-08-10 Genes amplified in cancer cells

Country Status (2)

Country Link
AU (1) AU5474899A (en)
WO (1) WO2000009655A2 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1659177A3 (en) * 2000-06-02 2006-05-31 Genentech, Inc. Secreted and transmembrane polypeptides and nucleic acids encoding the same
US7157558B2 (en) 2001-06-01 2007-01-02 Genentech, Inc. Polypeptide encoded by a polynucleotide overexpresses in tumors

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6071715A (en) * 1993-08-12 2000-06-06 Board Of Regents, The University Of Texas System Nucleic acids encoding novel proteins which bind to retinoblastoma protein
US5710001A (en) * 1994-08-12 1998-01-20 Myriad Genetics, Inc. 17q-linked breast and ovarian cancer susceptibility gene
US5795726A (en) * 1996-11-12 1998-08-18 Millennium Pharmaceuticals, Inc. Methods for identifying compounds useful in treating type II diabetes

Also Published As

Publication number Publication date
WO2000009655A3 (en) 2000-05-18
AU5474899A (en) 2000-03-06
WO2000009655A2 (en) 2000-02-24

Similar Documents

Publication Publication Date Title
US5776683A (en) Methods for identifying genes amplified in cancer cells
US5759776A (en) Targets for breast cancer diagnosis and treatment
WO1996039516A9 (en) Targets for breast cancer diagnosis and treatment
WO1997038085A2 (en) Genes amplified in cancer cells
EP0578909B1 (en) Human prohibitin and DNA coding for the same
JP2002533056A (en) Compounds and methods for treatment and diagnosis of lung cancer
US7335475B2 (en) PR/SET-domain containing nucleic acids, polypeptides, antibodies and methods of use
US5866372A (en) Nucleic acids encoding lymphoid CD30 antigen
US5552526A (en) MDC proteins and DNAS encoding the same
WO2005026205A2 (en) Genetic products which are differentially expressed in tumours and use thereof
US6800730B1 (en) Isolated peptides which bind to MHC class II molecules, and uses thereof
Dragani et al. Major urinary protein as a negative tumor marker in mouse hepatocarcinogenesis
WO2000009655A9 (en) Genes amplified in cancer cells
EP0974652A1 (en) Cancerous metastasis-associated gene
KR20020026530A (en) T-CELL RECEPTOR γALTERNATE READING FRAME PROTEIN(TARP) AND USES THEREOF
EP0831148B1 (en) Human adhesion molecule occludin
WO1997028193A1 (en) Compositions and methods useful in the detection and/or treatment of cancerous conditions
WO1997028193A9 (en) Compositions and methods useful in the detection and/or treatment of cancerous conditions
PL204844B1 (en) Polynucleotide useful for modulating cancer cell proliferation
US7365158B2 (en) PR-domain containing nucleic acids, polypeptides, antibodies and methods
US5559023A (en) Tumor suppressor gene
JP2002523015A (en) Human vault RNA
US20030049623A1 (en) PR/SET-domain containing nucleic acids, polypeptides, antibodies and methods of use
US20030040029A1 (en) Detection of tumor marker transcript and protein recognized by naive natural killer cells
US6982317B2 (en) C21 polypeptide that modulates the stability of transcriptional regulatory complexes regulating nuclear hormone receptor activity

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM EE ES FI GB GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW SD SL SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
AK Designated states

Kind code of ref document: A3

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM EE ES FI GB GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW SD SL SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
AK Designated states

Kind code of ref document: C2

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM EE ES FI GB GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: C2

Designated state(s): GH GM KE LS MW SD SL SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

COP Corrected version of pamphlet

Free format text: PAGES 1/60-60/60, DRAWINGS, REPLACED BY NEW PAGES 1/60-60/60; DUE TO LATE TRANSMITTAL BY THE RECEIVING OFFICE

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase