EP1856289A2 - Predicting chemosensitivity to cytotoxic agents - Google Patents

Predicting chemosensitivity to cytotoxic agents

Info

Publication number
EP1856289A2
EP1856289A2 EP06736375A EP06736375A EP1856289A2 EP 1856289 A2 EP1856289 A2 EP 1856289A2 EP 06736375 A EP06736375 A EP 06736375A EP 06736375 A EP06736375 A EP 06736375A EP 1856289 A2 EP1856289 A2 EP 1856289A2
Authority
EP
European Patent Office
Prior art keywords
genes
chemosensitivity
gene
polynucleotide probes
array
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP06736375A
Other languages
German (de)
French (fr)
Inventor
Wolfgang Sadee
Ying Huang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ohio State University Research Foundation
Original Assignee
Ohio State University Research Foundation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ohio State University Research Foundation
Publication of EP1856289A2
Legal status: Withdrawn

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813 Hybridisation assays
    • C12Q1/6834 Enzymatic or biochemical coupling of nucleic acids to a solid phase
    • C12Q1/6837 Enzymatic or biochemical coupling of nucleic acids to a solid phase using probe arrays or probe chips
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/136 Screening for pharmacological compounds
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/142 Toxicological screening, e.g. expression profiles which identify toxicity

Definitions

  • the present invention provides methods for detecting the chemosensitivity gene expression profile for a cancer cell.
  • the chemosensitivity gene expression profile reflects the expression levels of a plurality of target polynucleotides in a sample, wherein the target polynucleotides encode gene products that are markers for cancer cell chemosensitivity.
  • the method comprises contacting a polynucleotide sample obtained from cells of the specific cancer of interest to polynucleotide probes to detect and measure the amount of target polynucleotides in the sample.
  • the measured levels of expression of target polynucleotides provide an expression profile for the cancer cells that is compared to the drug-gene correlations described in the appendices hereto.
  • the chemosensitivity expression profile can be used, for example: (a) in the prediction of the chemosensitivity of a particular cancer cell or cell type to a therapeutic agent; (b) in the choice of drug therapy for a patient in need of the same; (c) in the identification of targets for altering the chemosensitivity of a cancer; and (d) in the identification of novel agents for modulating the chemosensitivity of a cancer.
  • the present invention provides new methods for identifying and characterizing new agents that modulate the chemosensitivity of a cancer by altering the expression of one or more growth factor signaling genes, which are markers for cancer cell chemosensitivity.
  • the method comprises treating a sample of cells from the cancer with a test agent, obtaining polynucleotide samples from untreated cancer cells and the treated cancer cells, and contacting the polynucleotide samples to polynucleotide probes to detect and measure the amount of target polynucleotides in the sample and thereby obtain an expression profile of genes, such as genes that are involved in growth factor signaling, which are markers for chemosensitivity.
  • the method further comprises comparing the growth factor signaling gene expression profiles of the control and treated cells to determine whether the agent altered the expression of any of the genes correlated with chemosensitivity or chemoresistance to various drugs.
  • Figure 1 shows a hierarchical cluster analysis of 69 negatively correlated genes against the 119 anticancer drugs using gene-drug Pearson correlation coefficients. Genes and drugs cluster into two main groups, while drugs further cluster into subgroups according to their mechanisms of reaction.
  • Figure 2 shows gene-drug correlation profiles for EGFR (panel A) and ERBB2 (panel B) with 119 anticancer drugs. Pearson correlation coefficients against 119 drugs are sorted for EGFR, while the order of drugs is maintained for ERBB2.
  • Figure 3 shows the relative mRNA (panel A, log2 transformed from the 70-mer microarray hybridization) and protein level (panel B) of EGFR and ERBB2 in cell lines SK-OV-
  • Figure 4 shows the drug combination indexes with respect to fraction affected (Fa) for the combination of EGFR inhibitor AG1478 with camptothecin 10-OH (Panel A), and AG1478 with paclitaxel (Panel B).
  • the combination index of AG1478/camptothecin 10-OH is <1, an indication of synergism, while that of AG1478/paclitaxel is >1, indicative of antagonism.
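The interpretation of the combination index can be sketched in code. The sketch below assumes the Chou-Talalay median-effect formulation, which this text does not name explicitly; the parameters `dm` (median-effect dose) and `m` (slope), and all function names, are illustrative assumptions rather than values taken from the figures.

```python
def dose_for_effect(fa, dm, m):
    """Dose of a single drug expected to give fraction affected `fa`.

    Assumed median-effect equation: D = Dm * (fa / (1 - fa)) ** (1 / m).
    """
    if not 0.0 < fa < 1.0:
        raise ValueError("fa must lie strictly between 0 and 1")
    return dm * (fa / (1.0 - fa)) ** (1.0 / m)


def combination_index(fa, d1, d2, dm1, m1, dm2, m2):
    """Combination index for doses d1 and d2 that together give effect `fa`."""
    return d1 / dose_for_effect(fa, dm1, m1) + d2 / dose_for_effect(fa, dm2, m2)


def interpret(ci):
    """Map a combination index to the interpretation used in the text."""
    if ci < 1.0:
        return "synergism"
    if ci > 1.0:
        return "antagonism"
    return "additive"
```

With hypothetical single-drug parameters `dm1=1.0`, `dm2=2.0` and slopes of 1, a pair of doses `(0.25, 0.5)` that reaches `fa = 0.5` gives a combination index below 1 and is read as synergism.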
  • Figure 5 shows the genes positively and negatively correlated with drug response. Only genes correlated with at least 5 drugs with P < 0.001 are included. For all genes with at least 1 drug at P < 0.001, see Supplemental Tables 3 and 4.
  • Figure 6 shows the drug combination effects between AG1478 or AG825 and paclitaxel, cisplatin, or CPT, 10-OH, using the combination index (see Figure 4).
  • Figure 7 shows the genes occurring in predictive models more often than expected by chance, sorted by number of drugs.
  • Figure 8 is a comparison of prediction accuracy for select drugs, between a heuristic approach with all relevant genes (69, 49, and 343), and using only a short list of genes (top 12, 9, and 13) showing high frequencies as predictors (Figure 7). In the latter case all possible combinations were tested and the highest scoring set selected. A complete list of the 68 drugs tested is available in Figure 20.
  • Figure 9 is an example of a compound (mitomycin) with bimodal distribution of growth inhibition (-log(GI50)).
  • Figure 10 shows a hierarchical cluster analysis of the NCI-60 cell-cell correlations. The analysis was based on gene expression of 107 probes representing genes that are differentially
  • BR breast cancer
  • CNS CNS cancer
  • CO colon cancer
  • LC lung cancer
  • LE leukemia
  • ME melanoma
  • OV ovarian cancer
  • PR prostate cancer
  • RE renal cancer.
  • Figure 11 shows a hierarchical cluster analysis of gene-cell correlations using expression of 69 genes that are negatively correlated with at least one drug with P < 0.001.
  • EGFR expression values from two independent experiments were included (EGFR and EGFR_1), showing close proximity in the cluster.
  • Cell lines tend to cluster together according to tissue of origin, such as leukemia, colon cancer, melanoma, and renal cell carcinoma, which is consistent with Figure 1.
  • Figure 12 shows a hierarchical cluster analysis of gene-gene correlations using the expression for 69 genes with negative correlations as distance measures.
  • the two major gene clusters are similar to the two main groups in Figure 2, based on cell-gene expression correlation.
  • Figure 13 shows a hierarchical cluster analysis of gene-gene correlation for 69 genes negatively correlated with at least one drug, using gene-drug correlation coefficients as distance measures.
  • the two major cluster patterns are nearly identical to the pattern in Figure 3 based on
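The hierarchical cluster analyses described for these figures can be illustrated with a small sketch. It assumes a distance of 1 minus the Pearson correlation coefficient and average linkage; the text does not state which linkage rule was used, so both choices, the function names, and the toy profiles are assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / sqrt(sxx * syy)


def average_linkage_clusters(profiles):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    names = sorted(profiles)
    dist = {(a, b): 1.0 - pearson(profiles[a], profiles[b])
            for a in names for b in names if a < b}

    def d(a, b):
        return dist[(a, b)] if a < b else dist[(b, a)]

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average linkage: mean distance over all cross-cluster pairs.
                pairs = [d(a, b) for a in clusters[i] for b in clusters[j]]
                avg = sum(pairs) / len(pairs)
                if best is None or avg < best[0]:
                    best = (avg, i, j)
        avg, i, j = best
        merges.append((clusters[i], clusters[j], avg))
        merged = tuple(sorted(clusters[i] + clusters[j]))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges
```

Each profile could be a gene's expression across cell lines or its correlation coefficients across drugs, matching the two kinds of distance measure mentioned for Figures 12 and 13.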
  • Figure 14 shows 343 genes for which probes were included in the 70-mer oligonucleotide microarray.
  • Figure 15 shows the number of genes (out of 343 genes) negatively or positively correlated with at least 4, 2, or 1 drug(s) out of 119 drugs (using bootstrap P < 0.001 and Pearson correlation coefficient > 0.4 as the cutoffs).
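The cutoffs quoted for Figure 15 can be applied as in the following sketch. It assumes that the P value for each gene-drug pair has already been computed elsewhere, since the bootstrap procedure is not described here, and that the 0.4 cutoff applies to the magnitude of the coefficient; the function names and data layout are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between gene expression and drug activity across cell lines."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / sqrt(sxx * syy)


def classify(r, p, r_cut=0.4, p_cut=0.001):
    """Label one gene-drug pair as 'positive', 'negative', or 'none'."""
    if p < p_cut and r > r_cut:
        return "positive"
    if p < p_cut and r < -r_cut:
        return "negative"
    return "none"


def genes_with_min_drugs(calls, direction, k):
    """Genes whose per-drug labels include `direction` for at least k drugs.

    `calls` maps a gene name to a list of labels, one per drug.
    """
    return sorted(g for g, labels in calls.items() if labels.count(direction) >= k)
```

Calling `genes_with_min_drugs(calls, "negative", 4)`, then with `2` and `1`, would reproduce the kind of tally the figure reports.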
  • Figure 16 shows the genes with predominantly negative correlations to drug response. Only three genes showed also positive correlations.
  • Figure 17 shows the genes with predominantly positive correlations with drug response.
  • Figure 18 shows prediction accuracy for 68 drugs using positively correlated or negatively correlated or all the 343 genes. Note the very low and variable accuracy for some drugs in all three groups, reflecting the heuristic nature of the selection algorithm.
  • Figure 19 shows predictive genes sorted by number of affected drugs where the gene is present in the optimized gene panel. Each included gene has at least one drug correlation at P < 0.001 for both positively and negatively correlated genes.
  • Figure 20 shows a comparison of prediction accuracy for select drugs, between a heuristic approach with all relevant genes (69, 49, and 343), and using only a short list of genes (top 12, 9, and 13) showing high frequencies as predictors (see Figures 7 and 8). In the latter case all possible combinations were tested and the highest scoring set selected.
  • Figure 21 shows prediction accuracy and predictive gene sets using only genes occurring in the predictive models for drugs more than by chance listed in Figure 7. Note the small
  • “Chemosensitivity” refers to the propensity of a cell to be affected by a cytotoxic agent, wherein a cell may range from sensitive to resistant to such an agent.
  • the expression of a chemosensitivity gene can be a marker for or indicator of chemosensitivity.
  • “Chemosensitivity gene” refers to a gene whose protein product influences the chemosensitivity of a cell to one or more cytotoxic agents. According to the instant invention,
  • chemosensitivity genes may themselves render cells more sensitive or more resistant to the effects of one or more cytotoxic agents, or may be associated with other factors that directly influence chemosensitivity.
  • chemosensitivity genes may or may not directly participate in rendering a cell sensitive or resistant to a drug, but expression of such genes may be related to the expression of other factors which may influence chemosensitivity.
  • Expression of a chemosensitivity gene can be correlated with the sensitivity of a cell or cell type to an agent, wherein a negative correlation may indicate that the gene affects cellular resistance to the drug, and a positive correlation may indicate that the gene affects cellular sensitivity to a drug.
  • chemosensitivity genes have been identified among known and putative growth factor signaling genes. The appendices hereto list the accession numbers for the known genes, whereby the full sequences of the genes may be referenced, and which are expressly incorporated herein by reference thereto as of the filing of this application for patent.
  • Array or “microarray” refers to an arrangement of hybridizable array elements, such as polynucleotides, which in some embodiments may be on a substrate.
  • the arrangement of polynucleotides may be ordered.
  • the array elements are arranged so that there are at least ten or more different array elements, and in other embodiments at least 100 or more array elements.
  • the hybridization signal from each of the array elements may be individually distinguishable.
  • the array elements comprise nucleic acid molecules.
  • the array comprises probes to two or more chemosensitivity genes, and in other embodiments the array comprises probes to 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250 or more chemosensitivity genes. In some embodiments, the array comprises probes to genes that encode products other than chemosensitivity proteins. In some embodiments, the array comprises probes to 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, or more genes that encode products other than chemosensitivity proteins.
  • Gene when used herein, broadly refers to any region or segment of DNA associated with a biological molecule or function. Thus, genes include coding sequence, and may further include regulatory regions or segments required for their expression. Genes may also include non-expressed DNA segments that, for example, form recognition sequences for other proteins. Genes can be obtained from a variety of sources, including cloning from a source of interest or synthesizing from known or predicted sequence information, and may include sequences encoding desired parameters.
  • “Hybridization complex” refers to a complex between two nucleic acid molecules by virtue of the formation of hydrogen bonds between purines and pyrimidines.
  • nucleic acid or polypeptide sequences refer to two or more sequences or subsequences that may be the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence.
  • for sequence comparison, typically one sequence acts as a reference sequence to which test sequences are compared.
  • when using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.
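The percent sequence identity calculation can be sketched as follows. This is a minimal illustration that assumes the two sequences are already aligned to equal length with `-` marking gaps, and that identity is counted over all alignment columns; real comparison programs differ in how they align and in which columns they count. The function name is hypothetical.

```python
def percent_identity(ref_aligned, test_aligned):
    """Percent identity of a test sequence relative to a reference, from a pairwise alignment."""
    if len(ref_aligned) != len(test_aligned):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b)
               for a, b in zip(ref_aligned.upper(), test_aligned.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)
```

A gap opposite a residue counts as a non-identical column under this convention, which is one of several reasonable choices.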
  • isolated when used herein in the context of a nucleic acid or protein, denotes that the nucleic acid or protein is essentially free of other cellular components with which it is associated in the natural state. It is preferably in a homogeneous state although it can be in either a dry or aqueous solution. Purity and homogeneity are typically determined using analytical chemistry techniques such as polyacrylamide gel electrophoresis or high performance liquid chromatography. A protein that is the predominant molecular species present in a preparation is substantially purified. An isolated gene is separated from open reading frames that flank the gene and encode a protein other than the gene of interest.
  • Marker as used herein in reference to a chemosensitivity gene, means an indicator of chemosensitivity.
  • a marker may either directly or indirectly influence the chemosensitivity of a cell to a cytotoxic agent, or it may be associated with other factors that influence chemosensitivity.
  • Naturally-occurring and wild-type are used herein to describe something that can be found in nature as distinct from being artificially produced by man.
  • a polypeptide or polynucleotide sequence that is present in an organism (including viruses) that can be isolated from a source in nature and that has not been intentionally modified by man in the laboratory is naturally-occurring.
  • wild-type is used herein to refer to the naturally-occurring or native forms of growth factor signaling proteins and their encoding nucleic acid sequences. Therefore, in the context of this application, 'wild-type' includes naturally occurring variant forms for growth factor signaling genes, either representing splice variants or genetic variants between individuals, which may require different probes for selective detection.
  • Nucleic acid when used herein, refers to deoxyribonucleotides or ribonucleotides, nucleotides, oligonucleotides, polynucleotide polymers and fragments thereof in either single- or double-stranded form.
  • a nucleic acid may be of natural or synthetic origin, double-stranded or single-stranded, and separate from or combined with carbohydrate, lipids, protein, other nucleic acids, or other materials, and may perform a particular activity such as transformation or form a useful composition such as a peptide nucleic acid (PNA).
  • PNA peptide nucleic acid
  • nucleic acids containing known analogues of natural nucleotides that have similar binding properties as the reference nucleic acid and may be metabolized in a manner similar to naturally-occurring nucleotides.
  • a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g. degenerate codon substitutions) and complementary sequences, as well as the sequence explicitly indicated.
  • degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al. (1991) Nucleic Acid Res.
  • nucleic acid is used interchangeably with gene, cDNA, and mRNA encoded by a gene.
  • An "oligonucleotide” or “oligo” is a nucleic acid and is substantially equivalent to the terms amplimer, primer, oligomer, element, target, and probe, and may be either double or single stranded.
  • Polynucleotide refers to nucleic acid having a length from 25 to 3,500 nucleotides.
  • Probe or “Polynucleotide Probe” refers to a nucleic acid capable of hybridizing under stringent conditions with a target region of a target sequence to form a polynucleotide probe/target complex. Probes comprise polynucleotides that are 15 consecutive nucleotides in length. Probes may be 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
  • probes are 70 nucleotides in length. Probes may be less than 100% complementary to a target region, and may comprise sequence alterations in the form of one or more deletions, insertions, or substitutions, as compared to probes that are 100% complementary to a target region.
  • nucleic acid or protein when used herein in the context of nucleic acids or proteins, denotes that a nucleic acid or protein gives rise to essentially one band in an electrophoretic gel. Particularly, it means that the nucleic acid or protein is at least 50, 55, 60, 65, 70, 75, 80, 85, 90, 91, 92, 93, 94,
  • sample refers to an isolated sample of material, such as material obtained from an organism, containing nucleic acid molecules.
  • a sample may comprise a bodily fluid; a cell; an extract from a cell, chromosome, organelle, or membrane isolated from a cell; genomic DNA, RNA, or cDNA in solution or bound to a substrate; or a biological tissue or biopsy thereof.
  • a sample may be obtained from any bodily fluid (blood, urine, saliva, phlegm, gastric juices, etc.), cultured cells, biopsies, or other tissue preparations.
  • “Stringent hybridization conditions” and “stringent hybridization wash conditions” in the context of nucleic acid hybridization experiments such as Southern and northern hybridizations are sequence dependent, and are different under different environmental parameters.
  • nucleic acids having longer sequences hybridize specifically at higher temperatures.
  • An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology — Hybridization with Nucleic Acid Probes part I chapter 2 "Overview of principles of hybridization and the strategy of nucleic acid probe assays," Elsevier, N. Y.
  • highly stringent hybridization and wash conditions are selected to be 5 °C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
  • Tm thermal melting point
  • a probe will hybridize to its target subsequence, but to no other sequences.
  • the Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe.
  • Very stringent conditions are selected to be equal to the Tm for a particular probe.
  • An example of stringent hybridization conditions for hybridization of complementary nucleic acids that have more than 100 complementary residues on a filter in a Southern or northern blot is 50% formamide with 1 mg of heparin at 42 °C, with the hybridization being carried out overnight.
  • An example of highly stringent wash conditions is 0.15 M NaCl at 72 °C for 15 minutes.
  • An example of stringent wash conditions is a 0.2x SSC wash at 65 °C for 15 minutes (see Sambrook, infra, for a description of SSC buffer).
  • a high stringency wash is preceded by a low stringency wash to remove background probe signal.
  • An example medium stringency wash for a duplex of, e.g., more than 100 nucleotides, is 1x SSC at 45 °C for 15 minutes.
  • An example low stringency wash for a duplex of, e.g., more than 100 nucleotides is 4-6x SSC at 40 °C for 15 minutes.
  • stringent conditions typically involve salt concentrations of less than 1.0 M Na ion, typically 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3, and the temperature is typically at least 30 °C.
  • Stringent conditions can also be achieved with the addition of destabilizing agents such as formamide.
  • a signal to noise ratio of 2x (or higher) than that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization.
  • Nucleic acids that do not hybridize to each other under stringent conditions are still substantially similar if the polypeptides that they encode are substantially similar. This occurs, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code.
  • Substrate refers to a support, such as a rigid or semi-rigid support, to which nucleic acid molecules or proteins are applied or bound, and includes membranes, filters, chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, capillaries or other tubing, plates, polymers, and microparticles, and other types of supports, which may have a variety of surface forms including wells, trenches, pins, channels and pores.
  • Target polynucleotide refers to a nucleic acid to which a polynucleotide probe can hybridize by base pairing and that comprises all or a fragment of a gene that encodes a protein that is a marker for chemosensitivity in cancer cells.
  • the sequences of target and probes may be 100% complementary (no mismatches) when aligned. In other instances, there may be up to a 10% mismatch.
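The complementarity criterion can be checked as in the sketch below. It assumes DNA sequences written 5' to 3', equal probe and target-region lengths, and no insertions or deletions, so that a perfectly complementary probe equals the reverse complement of the target region; the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence given 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def mismatch_fraction(probe, target_region):
    """Fraction of probe positions that do not base-pair with the target region."""
    expected = reverse_complement(target_region)
    if len(probe) != len(expected):
        raise ValueError("probe and target region must have the same length")
    mismatches = sum(1 for a, b in zip(probe.upper(), expected) if a != b)
    return mismatches / len(probe)


def within_mismatch_limit(probe, target_region, max_mismatch=0.10):
    """True if the probe/target pair has at most `max_mismatch` mismatched positions."""
    return mismatch_fraction(probe, target_region) <= max_mismatch
```

For a 70-nucleotide probe, a 10% limit corresponds to at most 7 mismatched positions.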
  • Target polynucleotides represent a subset of all of the polynucleotides in a sample that encode the expression products of all transcribed and expressed genes in the cell or tissue from which the polynucleotide sample is prepared.
  • the gene products of target polynucleotides are markers for chemosensitivity of cancer cells; some may directly influence chemosensitivity through involvement in growth factor signaling. Alternatively, they may direct or influence cancer cell characteristics that indirectly confer or influence sensitivity or resistance.
  • Target Region means a stretch of consecutive nucleotides comprising all or a portion of a target sequence such as a gene or an oligonucleotide encoding a protein that is a marker for chemosensitivity.
  • Target regions may be 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 200 or more
  • Polynucleotide Probes can be genomic DNA or cDNA or mRNA, or any RNA-like or DNA-like material, such as peptide nucleic acids, branched DNAs and the like.
  • the polynucleotide probes can be sense or antisense polynucleotide probes. Where target polynucleotides are double stranded, the probes may be either sense or antisense strands. Where the target polynucleotides are single stranded, the nucleotide probes are complementary single strands.
  • the polynucleotide probes can be prepared by a variety of synthetic or enzymatic schemes that are well known in the art.
  • the probes can be synthesized, in whole or in part, using chemical methods well known in the art (Caruthers et al. (1980) Nucleic Acids Res. Symp. Ser. 215-233). Alternatively, the probes can be generated, in whole or in part, enzymatically.
  • Nucleotide analogues can be incorporated into the polynucleotide probes by methods well known in the art. The only requirement is that the incorporated nucleotide analogues must serve to base pair with target polynucleotide sequences.
  • certain guanine nucleotides can be substituted with hypoxanthine that base pairs with cytosine residues. However, these base pairs are less stable than those between guanine and cytosine.
  • adenine nucleotides can be substituted with 2,6-diaminopurine that can form stronger base pairs than those between adenine and thymidine.
  • the polynucleotide probes can include nucleotides that have been derivatized chemically or enzymatically. Typical chemical modifications include derivatization with acyl, alkyl, aryl or amino groups.
  • the polynucleotide probes may be labeled with one or more labeling moieties to allow for detection of hybridized probe/target polynucleotide complexes.
  • the labeling moieties can include compositions that can be detected by spectroscopic, photochemical, biochemical, bioelectronic, immunochemical, electrical, optical or chemical means.
  • the labeling moieties include radioisotopes, such as P³², P³³ or S³⁵, chemiluminescent compounds, labeled binding proteins, heavy metal atoms, spectroscopic markers, such as fluorescent markers and dyes, magnetic labels, linked enzymes, mass spectrometry tags, spin labels, electron transfer donors and acceptors, and the like.
  • the polynucleotide probes can be immobilized on a substrate.
  • Preferred substrates are any suitable rigid or semi-rigid support, including membranes, filters, chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers, microparticles and capillaries.
  • the substrate can have a variety of surface forms, such as wells, trenches, pins, channels and pores, to which the polynucleotide probes are bound.
  • the substrates are optically transparent.
  • a sample containing polynucleotides that will be assessed for the presence of target polynucleotides is obtained.
  • the samples can be any sample containing target polynucleotides and obtained from any bodily fluid (blood, urine, saliva, phlegm, gastric juices, etc.), cultured cells, biopsies, or other tissue preparations.
  • DNA or RNA can be isolated from the sample according to any of a number of methods well known to those of skill in the art. For example, methods of purification of nucleic acids are described in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization With Nucleic Acid Probes, Part I.
  • RNA is isolated using the TRIZOL reagent (Life Technologies, Gaithersburg, Md.), and mRNA is isolated using oligo d(T) column chromatography or glass beads.
  • the polynucleotides can be a cDNA reverse transcribed from an mRNA, an RNA transcribed from that cDNA, a DNA amplified from that cDNA, an RNA transcribed from the amplified DNA, and the like.
  • the polynucleotide is derived from DNA
  • the polynucleotide can be DNA amplified from DNA or RNA reverse transcribed from DNA.
  • Suitable methods for measuring the relative amounts of the target polynucleotide transcripts in samples of polynucleotides are Northern blots, RT-PCR, real-time PCR, or RNase protection assays. For ease in measuring the transcripts for target polynucleotides, it is preferred that arrays as described above be used.
  • the target polynucleotides may be labeled with one or more labeling moieties to allow for detection of hybridized probe/target polynucleotide complexes.
  • the labeling moieties can include compositions that can be detected by spectroscopic, photochemical, biochemical, bioelectronic, immunochemical, electrical, optical or chemical means.
  • the labeling moieties include radioisotopes, such as P³², P³³ or S³⁵, chemiluminescent compounds, labeled binding proteins, heavy metal atoms, spectroscopic markers, such as fluorescent markers and dyes, magnetic labels, linked enzymes, mass spectrometry tags, spin labels, electron transfer donors and acceptors, and the like.
  • Hybridization complexes: Hybridization causes a denatured polynucleotide probe and a denatured complementary target polynucleotide to form a stable duplex through base pairing.
  • Hybridization methods are well known to those skilled in the art (see, e.g., Ausubel (1997) Short Protocols in Molecular Biology, John Wiley & Sons, New York, N.Y., units 2.8-2.11, 3.18-3.19 and 4.6-4.9).
  • Conditions can be selected for hybridization where exactly complementary target and polynucleotide probe can hybridize, i.e., each base pair must interact with its complementary base pair.
  • conditions can be selected where target and polynucleotide probes have mismatches but are still able to hybridize. Suitable conditions can be selected, for example, by varying the concentrations of salt in the prehybridization, hybridization and wash solutions, or by varying the hybridization and wash temperatures. With some membranes, the temperature can be decreased by adding formamide to the prehybridization and hybridization solutions.
  • Hybridization conditions are based on the melting temperature (Tm) of the nucleic acid binding complex or probe, as described in Berger and Kimmel (1987) Guide to Molecular Cloning Techniques, Methods in Enzymology, vol 152, Academic Press.
  • Tm melting temperature
  • stringent conditions correspond to the stringency that occurs within a range from Tm - 5° C (5° C below the melting temperature of the probe) to 20° C below Tm.
  • highly stringent conditions employ at least 0.2 x SSC buffer and at least 65° C.
  • stringency conditions can be attained by varying a number of factors such as the length and nature, i.e., DNA or RNA, of the probe; the length and nature of the target sequence; and the concentration of the salts and other components, such as formamide, dextran sulfate, and polyethylene glycol, of the hybridization solution. All of these factors may be varied to generate conditions of stringency that are equivalent to the conditions listed above.
  • Hybridization can be performed at low stringency with buffers, such as 6x SSPE with 0.005% Triton X-100 at 37° C, which permits hybridization between target and polynucleotide probes that contain some mismatches to form target polynucleotide/probe complexes. Subsequent washes are performed at higher stringency with buffers, such as 0.5x SSPE with 0.005% Triton X-100 at 50° C, to retain hybridization of only those target/probe complexes that contain exactly complementary sequences. Alternatively, hybridization can be performed with buffers, such as 5x SSC/0.2% SDS at 60° C.
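The stringency window described above (5° C to 20° C below Tm) can be sketched in a few lines. This is a minimal illustration only: the function names are hypothetical, and the Tm estimate uses the Wallace rule (2° C per A/T, 4° C per G/C), a common rough approximation for short oligonucleotides that is assumed here and is not taken from the specification.

```python
def wallace_tm(seq):
    """Rough Tm estimate in deg C for a short oligo: 2*(A+T) + 4*(G+C).

    Assumption: Wallace rule; not the method used in the specification.
    """
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2.0 * at + 4.0 * gc


def stringent_window(tm):
    """Return (low, high) temperatures spanning Tm - 20 to Tm - 5 deg C."""
    return (tm - 20.0, tm - 5.0)


tm = wallace_tm("ATGCATGCATGCATGC")  # 8 A/T and 8 G/C bases
print(tm, stringent_window(tm))
```

For longer probes such as 70-mers, salt- and formamide-dependent formulas would be more appropriate than this rule of thumb.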
  • nucleic acid sequences can be used in the construction of arrays, for example, microarrays. Methods for construction of microarrays, and the use of such microarrays, are known in the art, examples of which can be found in U.S. Patent Nos. 5,445,934, 5,744,305, 5,700,637, and 5,945,334, the entire disclosure of each of which is hereby incorporated by reference.
  • Microarrays can be arrays of nucleic acid probes, arrays of peptide or oligopeptide probes, or arrays of chimeric probes — peptide nucleic acid (PNA) probes.
  • PNA peptide nucleic acid
  • the in situ synthesized oligonucleotide Affymetrix GeneChip system is widely used in many research applications with rigorous quality control standards (Rouse R. and Hardiman G., "Microarray technology - an intellectual property retrospective," Pharmacogenomics 5:623-632 (2003)).
  • the Affymetrix GeneChip uses eleven 25-oligomer probe pair sets containing both a perfect match and a single nucleotide mismatch for each gene sequence to be identified on the array.
  • highly dense glass oligo probe array sets (>1,000,000 25-oligomer probes) can be constructed in a ~3 x 3-cm plastic cartridge that serves as the hybridization chamber.
  • the ribonucleic acid to be hybridized is isolated, amplified, fragmented, labeled with a fluorescent reporter group, and stained with fluorescent dye after incubation. Light is emitted from the fluorescent reporter group only when it is bound to the probe.
  • the intensity of the light emitted from the perfect match oligoprobe, as compared to the single base pair mismatched oligoprobe, is detected in a scanner, which in turn is analyzed by bioinformatics software (http://www.affymetrix.com).
  • the GeneChip system provides a standard platform for array fabrication and data analysis, which permits data comparisons among different experiments and laboratories.
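The comparison of perfect-match and mismatch probe intensities described above can be illustrated with a simple summary statistic. The sketch below averages the perfect-match minus mismatch difference over the probe pairs of one gene. The function name and the choice of a plain average difference are assumptions for illustration; the vendor's software may use a different algorithm.

```python
def average_difference(perfect_match, mismatch):
    """Mean of (PM - MM) intensities over the probe pairs of one gene."""
    if len(perfect_match) != len(mismatch) or not perfect_match:
        raise ValueError("need equal-length, non-empty PM and MM lists")
    diffs = [pm - mm for pm, mm in zip(perfect_match, mismatch)]
    return sum(diffs) / len(diffs)


# Hypothetical intensities for two probe pairs (a real gene would have eleven).
print(average_difference([10.0, 20.0], [5.0, 10.0]))
```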
  • Microarrays according to the invention can be used for a variety of purposes, as further described herein, including but not limited to, screening for the resistance or susceptibility of a cancer to a drug based on the genetic expression profile of the cancer.
  • the present invention provides a chemosensitivity gene expression analysis system comprising a plurality of polynucleotide probes, wherein each of said polynucleotide probes comprises a nucleic acid sequence that is complementary under strict hybridization conditions to at least a portion of a gene that encodes a protein that is a marker for the sensitivity of cancer cells to cytotoxic agents, as presented in the appendices hereto.
  • polynucleotide probes are provided on an array.
  • the array elements are organized in an ordered fashion so that each element is present at a specified location on the substrate. Because the array elements are at specified locations on the substrate, the hybridization patterns and intensities (which together create a unique expression profile) can be interpreted in terms of expression levels of particular genes and can be correlated with a particular disease or condition or treatment.
  • the gene expression analysis system, in some embodiments in the form of an array, can be used for gene expression analysis of target polynucleotides that represent the expression products of cells of interest, particularly cancer cells.
  • the array can also be used in the prediction of the responsiveness of a patient to a therapeutic agent, such as the response of a cancer patient to a chemotherapeutic agent. Further, as described below, the array can be employed to investigate the profile of a cancer cell in terms of its likely sensitivity or resistance to chemotherapeutic agents. Furthermore, as described below, the array can be employed to characterize a therapeutic agent's chemosensitivity profile for use in treating various cancers.
  • the array can also be used to identify new agents, as described below, which can modulate the chemosensitivity of a cancer cell to one or more therapeutic agents by altering the expression of genes that are markers for and influence chemosensitivity.
  • the gene expression analysis system can be used to purify a subpopulation of mRNAs, cDNAs, genomic fragments and the like, in a sample.
  • samples will include target polynucleotides and other non-target nucleic acids that may undesirably affect the hybridization background. Therefore, it may be advantageous to remove these non-target nucleic acids from the sample.
  • One method for removing the non-target nucleic acids is by contacting the polynucleotide sample with the array, hybridizing the target polynucleotides contained therein with immobilized polynucleotide probes under hybridizing conditions.
  • the non-target nucleic acids that do not hybridize to the polynucleotide probes are then washed away, and thereafter, the immobilized target polynucleotide probes can be released in the form of purified target polynucleotides.
  • Examples of the types of molecules that may be used as probes are cDNA molecules, oligonucleotides that contain 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100 or more nucleotides, and other gene probes that comprise nucleobases including synthetic gene probes such as, for example, peptide nucleic acids.
  • At least some of said polynucleotide probes comprise a polynucleotide sequence that is complementary to a target region of a gene that encodes a protein associated with growth factor signaling and that is a marker for the sensitivity or resistance of cancer cells to cytotoxic agents.
  • the plurality of polynucleotide probes comprises at least two or more probes, each of which comprises a polynucleotide sequence that is complementary to a target region of a chemosensitivity gene as described in the appendices hereto.
  • the present invention provides a physical embodiment of the expression profile for a cancer cell of proteins that are involved in growth factor signaling and that are markers for the sensitivity of cancer cells to cytotoxic agents.
  • the expression profile comprises the polynucleotide probes of the invention.
  • the expression profile also includes a plurality of detectable complexes, in some embodiments in the form of a gene expression analysis system, and in some embodiments in the form of an array.
  • Each complex is formed by hybridization of one or more polynucleotide probes to one or more complementary target polynucleotides in a sample.
  • the polynucleotide probes are hybridized to a complementary target polynucleotide forming target/probe complexes.
  • a complex is detected by incorporating at least one labeling moiety in the complex. Labeling moieties are described herein and are well known in the art.
  • the chemosensitivity expression profile comprises a printed report that shows the expression of the analysis of an array.
  • the printed report may be in the form of a developed or digital film of the hybridized and developed gene expression analysis system.
  • the printed report may also be a manually or computer generated numerical analysis of the developed gene expression analysis system.
  • the printed report may optionally contain gene- drug correlation information.
  • the expression profiles provide "snapshots" that can show unique expression patterns that are characteristic of susceptibility or resistance of a cell to one or more cytotoxic chemotherapeutic agents.
  • the chemosensitivity expression profile can be used, as further described below: (a) in the prediction of the chemosensitivity of a particular cancer cell or cell type to a therapeutic agent; (b) in the choice of drug therapy for a patient in need of the same; (c) in the identification of targets for altering the chemosensitivity of a cancer; and (d) in the identification of novel agents for modulating the chemosensitivity of a cancer.
  • the present invention provides a method of predicting the response of a specific cancer, and more particularly a cancer in a patient, to treatment with a therapeutic agent.
  • the method comprises contacting a polynucleotide sample obtained from the cells of the specific cancer to polynucleotide probes to measure the levels of expression of one or, in some embodiments, a plurality of target polynucleotides.
  • the expression levels of the target polynucleotides are then used to provide an expression profile for the cancer cells that is then compared to the drug-gene correlations described in the appendices hereto, wherein a positive correlation between a drug and a gene expressed in the cancer cells indicates that the cancer cells would be sensitive to the drug, and wherein a negative correlation between a drug and a gene expressed in the cancer cells indicates that the cancer cells would be resistant to the drug.
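The decision rule above (positive gene-drug correlation suggests sensitivity, negative correlation suggests resistance) can be sketched as a lookup. The function name, the data layout, and the gene and drug names are all hypothetical placeholders; the actual correlations are those tabulated in the appendices, which are not reproduced here.

```python
def predict_response(expressed_genes, gene_drug_r, drug):
    """Label what each expressed gene implies for `drug` from the sign of r."""
    calls = {}
    for gene in expressed_genes:
        r = gene_drug_r.get((gene, drug))
        if r is None:
            continue  # no correlation on record for this gene-drug pair
        if r > 0:
            calls[gene] = "sensitive"
        elif r < 0:
            calls[gene] = "resistant"
        else:
            calls[gene] = "no call"
    return calls


# Hypothetical correlation values, for illustration only.
example_r = {("GENE_A", "drug_x"): 0.45, ("GENE_B", "drug_x"): -0.52}
print(predict_response(["GENE_A", "GENE_B", "GENE_C"], example_r, "drug_x"))
```

A practical profile would weigh several genes together rather than reading each sign in isolation.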
  • Methods of Identifying New Therapeutic Agents The present invention provides novel methods for identifying and characterizing new agents that modulate the chemosensitivity of a cancer by altering the expression of one or more growth factor signaling genes.
  • the method comprises treating a sample of cells from the cancer with an agent, and thereafter determining any change in expression of genes, such as growth factor signaling genes, which are markers for chemosensitivity. This is done by obtaining polynucleotide samples from untreated cancer cells and the treated cancer cells, and contacting the polynucleotide samples to polynucleotide probes to determine the levels of target polynucleotides to obtain growth factor signaling gene chemosensitivity expression profiles. In some embodiments, the measurement is made using an array or micro array as described above that comprises one or more probes. The method further comprises comparing the growth factor signaling gene expression profiles of the control and treated cells to determine whether the agent alters the expression of any of the chemosensitive or chemoresistant genes.
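The comparison of control and treated expression profiles described above can be illustrated as follows. This is a sketch under stated assumptions: the function name is hypothetical, expression values are assumed to be positive linear-scale intensities, and the two-fold cutoff (an absolute log2 ratio of 1) is an arbitrary illustrative threshold that does not come from the specification.

```python
import math


def changed_genes(control, treated, min_abs_log2_fc=1.0):
    """Return {gene: log2(treated/control)} for genes passing the cutoff."""
    changed = {}
    for gene in control.keys() & treated.keys():
        c, t = control[gene], treated[gene]
        if c <= 0 or t <= 0:
            continue  # a log ratio is undefined for non-positive intensities
        log2_fc = math.log2(t / c)
        if abs(log2_fc) >= min_abs_log2_fc:
            changed[gene] = log2_fc
    return changed


# Hypothetical intensities: GENE_A rises four-fold, GENE_B barely moves.
print(changed_genes({"GENE_A": 100.0, "GENE_B": 100.0},
                    {"GENE_A": 400.0, "GENE_B": 110.0}))
```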
  • separate cultures of cells are exposed to different dosages of the candidate agent.
  • the effectiveness of the agent's ability to alter chemosensitivity can be tested using standard assays that use, for example, one or more of the NCI60 cancer cell lines.
  • the agent is tested by conducting assays in which sample cancer cells are co-treated with the newly identified agent along with a previously known therapeutic agent.
  • the choice of previously known therapeutic agent is determined based upon the gene-drug correlation between the gene or genes whose expression is affected by the new agent.
  • the present invention further provides novel methods for identifying and characterizing new agents that modulate the chemosensitivity of a cancer by altering the activity of one or more growth factor signaling genes.
  • the method comprises treating a sample of cells from the cancer with an agent, which is capable of inhibiting the activity of a protein implicated in chemosensitivity by correlation analysis between gene expression and drug potency in multiple cancer cell lines.
  • an inhibitor of an efflux pump will increase the potency of an anticancer drug if the efflux pump is highly expressed. This permits one to search either for inhibitors of the chemosensitivity gene or to test whether an anticancer agent is subject to influence by the chemosensitivity gene product.
  • the cell line is a human cell line, such as, for example, any one of the cells from the NCI60 cell lines.
  • RNA is extracted from such cells, converted to cDNA and applied to arrays to which probes have been applied, as described above.
  • Cytotoxic potencies of 119 drugs against 60 neoplastic cell lines were correlated with expression of 343 genes, including 90 growth factors and receptors, 63 metalloproteinases, and 92 ras-like GTPases as downstream signaling factors.
  • Correlating gene expression with drug potency yielded two sets of genes showing either negative or positive drug correlations (P < 0.001), indicative of a role in chemoresistance and chemosensitivity, respectively.
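A gene-drug correlation of the kind described above can be computed as a Pearson coefficient between one gene's expression and one drug's potency across the same cell lines. The sketch below is a plain-Python illustration; the function name is hypothetical, potency is assumed to be expressed so that larger values mean a more potent drug, and the significance test (P < 0.001) is not implemented.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values across three cell lines: expression rises as potency falls.
expression = [1.0, 2.0, 3.0]
potency = [6.0, 5.0, 4.0]
print(pearson_r(expression, potency))
```

Under the stated assumption, a negative coefficient like the one in this toy example would point to a candidate chemoresistance gene.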
  • Known chemoresistance factors showed significant negative correlations with multiple drugs, but several novel candidate genes also scored highly.
  • Negatively correlated genes clustered into two main groups with distinct expression profiles and drug correlations, represented by EGFR and ERBB2 (Her-2/Neu), which displayed distinct drug correlation patterns. Synergism and antagonism between EGFR and ERBB2 inhibitors, in combination with classical anticancer drugs, were not directly related to EGFR and ERBB2 expression in four cell lines tested. Good accuracy in predicting drug potency against the NCI-60 was attainable with a subset of only 13 genes (RAB5B, TGFBR3, RAB6A, ARF4, ARHC, ERBB3 (negative correlations), and PLCL2, RAN, PLCD4, RAB37, RAC2, RAB39B (positive)). Our approach reveals known and potentially novel biomarkers and drug targets in cancer chemotherapy.
  • Growth factor signaling regulates cell proliferation, differentiation, and apoptosis (5, 9). Amplification, point mutations, or chromosomal translocation can result in uncontrolled activation of growth factor signaling pathways, as exemplified by ERBB2 in breast cancer (10). This has prompted the targeting of growth factor signaling as a therapeutic strategy, including growth factor receptors such as EGFR (11) and ERBB2 (HER-2) (9, 12). Growth factor signaling also conveys chemoresistance against classical anticancer drugs (13), involving suppression of apoptosis (14) and activation of multi-drug resistance genes, such as the drug efflux pump MDR1 (15).
  • MDR1 multi-drug resistance genes
  • the NCI-60, a set of 60 diverse human cancer cell lines (22), has served in the screening of more than 100,000 candidate drugs.
  • Gene-drug correlations can reveal novel drug targets or mechanisms of chemoresistance (1, 23, 24), while clustering of drug potency against the NCI60 cells can reveal mechanisms of action (25).
  • mRNA expression profiles of subsets of 20-200 genes can serve as predictors of cytotoxic drug potencies (8, 26).
  • the underlying mechanisms remain unclear.
  • With a more focused approach measuring expression of genes involved in transmembrane transport in the NCI-60 we have identified numerous new drug-transporter relationships relevant to drug targeting and potency (28) that can be extrapolated to in vivo studies because causal interactions are implied.
  • Oligonucleotide microarrays: 70-mer oligonucleotide probes (total 343 genes) were designed for 90 growth factors and their receptors (e.g., EGF, IGF, FGF, PDGF, VEGF, TGF and TNF families), 63 metalloproteinases and their inhibitors (20 MMPs, 4 TIMPs, 20 ADAMs and 19 ADAMTSs), and 92 small GTP-binding proteins (GTPases).
  • 98 probes were targeted to GPCRs, heterotrimeric G proteins, phospholipases, etc. (Figure 14), for comparison.
  • the 70-mer oligonucleotides were designed and synthesized by Operon (Alameda, CA), Qiagen (http://omad.qiagen.com/human2/index.php). Probes for these genes were added to the transporter and channel gene microarray described previously (27, 29), with 25 genes overlapping with the current study. Each probe was printed 4 times on poly-L-lysine glass slides to permit assessment of intra-assay variability of mRNA measurements. The 25 genes analyzed in duplicate served as quality control, in addition to comparison with cDNA arrays with 144 overlapping genes.
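Printing each probe four times allows intra-assay variability to be summarized per probe. One simple way to do this, sketched below, is the coefficient of variation (standard deviation divided by mean) across the replicate spots. The function name and the choice of the coefficient of variation are assumptions for illustration; the source does not state which variability measure was used.

```python
import statistics


def replicate_cv(spots):
    """Return (mean, coefficient of variation) for replicate spot intensities."""
    if len(spots) < 2:
        raise ValueError("need at least two replicate spots")
    mean = statistics.mean(spots)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return mean, statistics.stdev(spots) / mean


# Hypothetical intensities for the four printed copies of one probe.
print(replicate_cv([95.0, 100.0, 105.0, 100.0]))
```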
  • NCI-60 cancer cell lines purchased from the Division of Cancer Treatment and Diagnosis, National Cancer Institute, National Institutes of Health
  • EGFR inhibitor AG1478 and ERBB2 inhibitor AG825 were purchased from Calbiochem (San Diego, CA). Cisplatin was obtained from Sigma (St. Louis, MO). Paclitaxel and camptothecin, 10-OH (CPT, 10-OH) were from the Developmental Therapeutics Program at NCI (Bethesda, MD).
  • Clustering of cell lines, genes, and drugs on the basis of gene expression and drug potency profile can be used to group cell lines and genes in terms of their patterns of gene expression (23, 33).
  • To obtain cell-cell cluster trees for 107 genes that showed distinct expression patterns across the 60 cell lines (i.e., genes that passed the filter S.D. > 0.35), we used the programs "Cluster" and "TreeView" (34) with average linkage clustering and a correlation metric.
  • Cells and drugs were also clustered by drug potency profiles (23), and moreover, genes and drugs were classified using correlations between each gene (expression across the NCI-60) and each drug (potency across the NCI-60) as distance measure.
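Two ingredients of the clustering procedure above lend themselves to a short sketch: the variability filter (S.D. > 0.35) and a correlation-based distance. The code below is an illustration only and does not reproduce the "Cluster" program; the function names are hypothetical, and expression values are assumed to be on the same scale as the one to which the 0.35 cutoff was applied.

```python
import math
import statistics


def filter_by_sd(expr, min_sd=0.35):
    """Keep genes whose expression S.D. across cell lines exceeds min_sd."""
    return {g: v for g, v in expr.items() if statistics.stdev(v) > min_sd}


def correlation_distance(x, y):
    """1 - Pearson r, so strongly co-varying profiles are close together."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 1.0 - sxy / math.sqrt(sxx * syy)


# Hypothetical profiles across three cell lines.
profiles = {"GENE_A": [0.0, 0.1, 0.05], "GENE_B": [0.0, 1.0, 2.0]}
print(sorted(filter_by_sd(profiles)))
print(correlation_distance([0.0, 1.0, 2.0], [0.0, 2.0, 4.0]))
```

The resulting pairwise distances could then be passed to any average-linkage hierarchical clustering routine.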
  • Cytotoxicity assay: Drug potency was tested using a proliferation assay with sulforhodamine B (SRB) (35). 3000-5000 cells per well were seeded in 96-well plates and incubated for 24 hours. Drugs were added in a dilution series in 3 replicate wells. After 3 days, incubation was
  • Unbound dye was washed off with 1% acetic acid. After air-drying and re-solubilization of the protein-bound dye in 10 mM Tris-HCl (pH 8.0), absorbance was read in a micro-plate reader at 570 nm.
  • combination index a measure of synergism or antagonism between two co-administrated agents.
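The excerpt does not give the formula for the combination index. A widely used form, assumed here, is the Chou-Talalay index for two agents, CI = d1/Dx1 + d2/Dx2, where d1 and d2 are the doses used in combination to reach a given effect and Dx1 and Dx2 are the doses of each agent alone that reach the same effect. Values below 1 suggest synergism, values near 1 additivity, and values above 1 antagonism. The function names and the tolerance band around 1 are illustrative assumptions.

```python
def combination_index(d1, d2, dx1, dx2):
    """Chou-Talalay style index: d1/Dx1 + d2/Dx2 (assumed form)."""
    if dx1 <= 0 or dx2 <= 0:
        raise ValueError("single-agent doses must be positive")
    return d1 / dx1 + d2 / dx2


def interpret(ci, tol=0.1):
    """Classify a combination index; `tol` is an arbitrary illustrative band."""
    if ci < 1.0 - tol:
        return "synergism"
    if ci > 1.0 + tol:
        return "antagonism"
    return "additive"


# Hypothetical doses: each agent at a quarter of its single-agent dose.
ci = combination_index(1.0, 1.0, 4.0, 4.0)
print(ci, interpret(ci))
```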
  • This step differs from the one used in (26) by using information from all cell lines for prediction purposes and by incorporating into the analysis the bimodal behavior of growth inhibition distributions.
  • the final predictor genes are then selected as the smallest set of genes producing a model within 5% from the highest percentage of correct classifications obtained in the stepwise search described above. This reduces the number of predictor genes, by accepting predictor sets within 5% of the optimally observed values, which may lead to more robust predictor sets. This led to a list of genes ranked by frequency of presence in the predictor sets for a subgroup of 68 drugs (pruned from 119 to avoid redundancy between similar drugs).
  • Basal mRNA expression of 343 genes was measured in the NCI-60 panel.
  • gene-gene Pearson correlation coefficient ranged from 0.3 to 0.78.
  • expression data were compared to previous results obtained with a cDNA array platform (23).
  • MMP24, ADAM9, and the inhibitor TIMP2 ranked highly, with multiple negative correlations. Furthermore, small GTPases scored strongly, with seven genes showing negative correlations with 10 drugs or more, including ARHC, RRAS2, RAB5B, and RALB, consistent with their pervasive role in cellular signaling. Among the other signaling factors (98 genes involved in various signaling pathways, such as GPCRs and G protein subunits), considerably fewer genes produced multiple negative correlations (Figure 16), such as GNG10 and GNG11 with 15 and 7 negatively correlated genes, respectively. Of 29 G-protein coupled receptors, only 4 showed strong negative correlation (P < 0.001) with 1 or 2 out of 119 drugs. This result suggests that these pathways are less germane to chemoresistance.
  • ADAM9 a metalloproteinase mediating release of membrane-tethered growth factors such as HB-EGF
  • GTPase ARHC (RhoC) clustered within the same group in all analyses and hence may represent members of a signaling pathway relevant to chemoresistance for a portion of the drug panel.
  • Network analysis indicates that EGFR and ADAM9 are close neighbors (to be published).
  • ERBB2 and RALB clustered together implying a functional relationship.
  • the close receptor homologues, EGFR and ERBB2, presumed to have similar signaling pathways, clustered at some distance
  • EGFR and ERBB2 examples from the two gene groups with distinct drug correlations
  • Figure 3A shows the mRNA levels of EGFR and ERBB2 in four cancer cells based on our array data, which is consistent with reported protein levels
  • Prediction accuracy ranged mainly from 0.6 - 0.9, usually with 5-10 genes selected as predictors. Using all 343 genes or only the 69 negatively correlated genes yielded similar results in most cases, while only the 49 positively correlated genes tended to score somewhat lower. As not all possible combinations of genes can be tested in our heuristic algorithm with a large number of genes, better scoring gene sets may well exist.
  • the top genes with negative correlations include TGFBR3, RAB6A, RALB, TIMP2, ARHC, RAB17 and ARF4, while the top positive genes are RAB37, RAC2, PLCL2, PDE1B, PLCD4 and ADAM12. It is noted that these genes are not always the highest-ranking genes sorted by number of highly correlated drugs (Figure 5 and Figures 16 and 17). It remains to be determined which measure is more accurate. Further tests will be needed to determine whether the selected genes are functionally most relevant. To limit the effect of the heuristic approach on the prediction models, we reduced the number of candidate predictor genes to those appearing most frequently in predictive sets of drug potency.
  • Predictive accuracy for only these highest scoring genes is listed in Figure 8 (and Figure 20), with optimal predictions derived from all possible gene combinations for each drug. This improved the prediction accuracy, especially for those drugs with previously low prediction values, resulting from the heuristic approach used to detect predictor genes.
  • Figure 21 shows examples of predictive gene sets for individual compounds. Hence, starting from 343 genes in this study, we have identified a small subset of genes yielding good predictions for a majority of drugs. Discussion: We have evaluated the role of three gene families related to growth factor signaling in chemoresistance.
  • CYR61 was included because of its association with breast cancer chemoresistance (47), converging on growth factor signaling through the NF-kappaB/XIAP pathway (48). Furthermore, vascular endothelial growth factor-165 receptor (VEGF165R), showing strong negative drug correlations, had been shown to be involved in tumor angiogenesis, progression, chemoresistance, and poor prognosis (49, 50).
  • VEGF165R vascular endothelial growth factor-165 receptor
  • TGFBR3 scored highest as a predictor for chemoresistance. While devoid of
  • TGFBR3 appears to be a necessary component of the TGF-β receptor signaling complex (52)
  • Additional growth factors implicated by negative correlations include FGF17, 18, and 19, IGF2, and NRG1. Their relevance to chemoresistance needs to be validated in each case.
  • ADAM9 a disintegrin and metalloproteinase domain 9
  • ADAM9 is highly expressed in hepatocellular carcinoma (53), and in pancreatic ductal adenocarcinomas where cytoplasmic expression is correlated with poor prognosis (54).
  • ADAM9 is part of the signaling cascade evading apoptosis induced by cytotoxic drugs (55), possibly by mediating release of heparin-binding EGF-like growth factor (HB-EGF) (56).
  • TC21/RRAS2 mediates transformation of cancer cells involving phosphatidylinositol 3-kinase (PI3-K) (59, 60), and is activated by growth factors (61), including FGF1 and FGF2 shown to convey chemoresistance (62).
  • PI3-K phosphatidylinositol 3-kinase
  • RALB clusters together with ERBB2 by gene expression and drug potency correlations. Lastly, ARHC (RhoC) promotes tumor metastasis (64), and appears to contribute to chemoresistance via growth factor signaling (65).
  • RAB37 shows multiple drug correlations.
  • ARHGDIA encodes a Rho GDP dissociation inhibitor, and is predictive as a chemosensitivity factor for 14 drugs, possibly by regulating Rho activity.
  • IGFALS is an insulin-like growth factor-binding protein (acid labile subunit), which complexes IGF and IGFBP3 into a 150 kD aggregate (see OMIM, 601489). This could account for the positive correlation and predictive power for 15 drugs.
  • EGFL4 and L5 encode EGF-like polypeptides containing multiple EGF repeats, and function in cell adhesion, but the physiological role remains uncertain.
  • EGFR and ERBB2 belong to different gene clusters with distinct expression patterns, which indicated that different mechanisms of drug resistance might be involved. This was surprising as EGFR and ERBB2 are coexpressed in some tumors and can heterodimerize (9).
  • AG1478 not only inhibits EGFR, but could also inhibit ERBB4 (68), and parallel signaling pathways can bypass the block. Inhibition of growth factor signaling at different junctions might improve anticancer potency, as shown with combined inhibition of both mutated EGFR and
  • ERBB receptors might induce cell death under some circumstances (71, 72).
  • Step A Cytotoxic drug potency in training cell lines.
  • the probability of cell line i to be resistant can be computed by:
  • the class assignment for the test cell lines will be based on the predictive probabilities of class membership (p, 1- p) (3).
  • Step B Ranking genes according to their ability to discriminate between resistant and sensitive cell lines.
  • pi is the previously determined predictive probability of sample i for being in the resistant class
  • Step C Predicting drug resistance in test cell lines.
  • Step D Prediction accuracy.
  • the final set of predictor genes is then selected as the smallest set of genes producing a model with prediction accuracy within 5% from the highest percentage of correct classifications obtained in the stepwise search described above. This reduces the number of predictor genes, by accepting predictor sets within 5% of the optimally observed values, which may lead to more robust predictor sets. In this way we can generate a list of genes ranked by frequency of presence in the predictor sets for a subgroup of 68 drugs (pruned from 119 to avoid redundancy between similar drugs). In a further iteration, we compared the observed frequencies of presence in the predictor sets (see Figure 19) with those one would expect if these models were random.
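The selection rule above (take the smallest gene set whose accuracy is within 5% of the best observed accuracy) can be sketched as follows. The function name and data layout are hypothetical, the gene names and accuracies are placeholders, and "within 5%" is interpreted here as 5 percentage points of classification accuracy, which is an assumption about the intended meaning.

```python
def select_predictor_set(candidates, tolerance=0.05):
    """Pick the smallest gene set with accuracy >= best accuracy - tolerance.

    `candidates` is a list of (genes, accuracy) pairs from a stepwise search.
    Ties in size are broken in favor of the higher accuracy.
    """
    if not candidates:
        raise ValueError("no candidate predictor sets supplied")
    best = max(acc for _, acc in candidates)
    eligible = [(genes, acc) for genes, acc in candidates
                if acc >= best - tolerance]
    return min(eligible, key=lambda c: (len(c[0]), -c[1]))


# Hypothetical stepwise search results for one drug.
search = [
    (("GENE_A",), 0.70),
    (("GENE_A", "GENE_B"), 0.86),
    (("GENE_A", "GENE_B", "GENE_C"), 0.90),
]
print(select_predictor_set(search))
```

Accepting a slightly less accurate but smaller set trades a little fit for a predictor that is less likely to be tuned to noise.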
  • Average gene expression values for all probes on each array in the dataset were 200.
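A common way to bring every array to the same average signal is global scaling, in which all intensities on an array are multiplied by one factor so that their mean equals a target value. The sketch below assumes that this is how the average of 200 was obtained; the source states only the resulting average, and the function name is hypothetical.

```python
def scale_to_target_mean(intensities, target=200.0):
    """Rescale one array's intensities so that their mean equals `target`."""
    if not intensities:
        raise ValueError("no intensities supplied")
    mean = sum(intensities) / len(intensities)
    if mean <= 0:
        raise ValueError("mean intensity must be positive to rescale")
    factor = target / mean
    return [value * factor for value in intensities]


# Hypothetical raw intensities with a mean of 100, rescaled to a mean of 200.
print(scale_to_target_mean([50.0, 100.0, 150.0]))
```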
  • Gene expression levels were also evaluated for the top-scoring 12 negatively and 9 positively correlated genes, and 13 genes selected from all 343 genes ( Figure 7) showing significantly higher frequencies as predictors than expected.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Wood Science & Technology (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Hospice & Palliative Care (AREA)
  • Biophysics (AREA)
  • Oncology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

Gene expression analysis systems are provided for identifying the chemosensitivity gene profile of a cancer cell, the analysis systems comprising a plurality of polynucleotide probes, wherein each of said polynucleotide probes comprises a polynucleotide sequence that is complementary to a target region of a gene that encodes a protein associated with transport of molecules into and out of cells and that is a marker for the sensitivity or resistance of cancer cells to cytotoxic agents.

Description

PREDICTING CHEMOSENSITIVITY TO CYTOTOXIC AGENTS
STATEMENT ON FEDERALLY FUNDED RESEARCH
The present invention was made with support from National Institutes of Health Grant No. GM61390. The United States Government has certain rights in the invention.
PRIORITY CLAIM
This application claims priority to US Provisional Patent Application 60/656,195, filed February 25, 2005, and PCT application US2004/032280, filed October 1, 2004 and US Utility Patent Application 10/957,432, filed October 1, 2004, each of which is incorporated herein by reference in its entirety.
BACKGROUND
Use of cytotoxic agents is an important mode of treatment for many forms of cancer. However, only a limited proportion of cancer patients respond favorably to most chemotherapeutic drugs, and drug efficacy varies widely among these patients. Treatment according to standard drug protocols can result in the selection of more resistant and aggressive cancer cells. Previous studies have revealed several genetic factors that influence the chemosensitivity of cancer cells. But due to the lack of predictability regarding the genetic bases for development of drug resistance, there are few clear options for treatment. Thus, cancer patients are often treated according to a standard regimen without any consideration of individual differences in chemosensitivity. This approach commonly leads to the development of resistance of the patient's cancer during treatment, and often results in treatment failure.
What are lacking are tools for predicting the likelihood that a particular cancer will be responsive to a chemotherapy regimen, and in particular, identifying agents to which the cancer will be sensitive or resistant. Also lacking are tools for profiling genetic factors influencing sensitivity and resistance of cancers to therapeutic agents. Such tools, and the resulting gene expression profiles, would be predictive of treatment response of a cancer to a particular drug, and would allow for increased predictability regarding chemosensitivity or chemoresistance of cancers to enable the design of optimal treatment regimens for patients. Such tools would likewise enable the identification of new drugs that modulate expression of genes that affect chemosensitivity, particularly new agents that alter expression of these genes to overcome drug resistance or enhance chemosensitivity.
SUMMARY OF THE INVENTION
The present invention provides methods for detecting the chemosensitivity gene expression profile for a cancer cell. The chemosensitivity gene expression profile reflects the expression levels of a plurality of target polynucleotides in a sample, wherein the target polynucleotides encode gene products that are markers for cancer cell chemosensitivity. In one embodiment, the method comprises contacting a polynucleotide sample obtained from cells of the specific cancer of interest to polynucleotide probes to detect and measure the amount of target polynucleotides in the sample. The measured levels of expression of target polynucleotides provide an expression profile for the cancer cells that is compared to the drug-gene correlations described in the appendices hereto. Expression in the cancer cells of a gene that has a positive correlation (r>0) with a drug indicates that the cancer cells would be sensitive to the drug. Expression in the cancer cells of a gene that has a negative correlation (r<0) with a drug indicates that the cancer cells would be resistant to the drug. The chemosensitivity expression profile can be used, for example: (a) in the prediction of the chemosensitivity of a particular cancer cell or cell type to a therapeutic agent; (b) in the choice of drug therapy for a patient in need of the same; (c) in the identification of targets for altering the chemosensitivity of a cancer; and (d) in the identification of novel agents for modulating the chemosensitivity of a cancer.
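By way of illustration only, the sign rule stated above can be sketched in Python. The sketch computes a Pearson correlation coefficient between the expression of one gene and the potency of one drug across a panel of cell lines, then maps the sign of r to a predicted phenotype. The function names, the use of -log(GI50) as the potency measure, and the numerical values are assumptions made for this example and are not taken from the appendices.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

def interpret_correlation(r):
    """Map the sign of a gene-drug correlation to the phenotype it suggests."""
    if r > 0:
        return "sensitive"
    if r < 0:
        return "resistant"
    return "no association"

# Hypothetical values for five cell lines (not data from this application).
expression = [0.2, 1.1, 1.9, 3.0, 4.1]   # log2 expression of one gene
potency = [7.9, 7.1, 6.4, 5.8, 5.0]      # -log(GI50) of one drug

r = pearson_r(expression, potency)
print(interpret_correlation(r))
```

In this hypothetical panel, potency falls as expression rises, so r is negative and high expression of the gene would be read as a marker of resistance to the drug.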
In another aspect the present invention provides new methods for identifying and characterizing new agents that modulate the chemosensitivity of a cancer by altering the expression of one or more growth factor signaling genes, which are markers for cancer cell chemosensitivity. The method comprises treating a sample of cells from the cancer with a test agent, obtaining polynucleotide samples from untreated cancer cells and the treated cancer cells, and contacting the polynucleotide samples to polynucleotide probes to detect and measure the amount of target polynucleotides in the sample and thereby obtain an expression profile of genes, such as genes that are involved in growth factor signaling, which are markers for chemosensitivity. The method further comprises comparing the growth factor signaling gene expression profiles of the control and treated cells to determine whether the agent altered the expression of any of the genes correlated with chemosensitivity or chemoresistance to various drugs.
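The comparison of control and treated expression profiles described above may be sketched as follows. This is a minimal illustration, assuming that each profile is a mapping from gene symbol to signal intensity and that a two-fold change is used as the cutoff; the cutoff, the pseudocount, the placeholder gene name GENE_X, and the intensities are all assumptions of the example rather than parameters stated in this application.

```python
import math

def log2_ratios(control, treated, pseudocount=1.0):
    """Return log2(treated/control) for every gene measured in both profiles.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes with no detectable signal.
    """
    shared = control.keys() & treated.keys()
    return {
        gene: math.log2((treated[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in sorted(shared)
    }

def altered_genes(control, treated, min_abs_log2=1.0):
    """Genes whose expression changed by at least the given log2 ratio."""
    ratios = log2_ratios(control, treated)
    return {gene: r for gene, r in ratios.items() if abs(r) >= min_abs_log2}

# Hypothetical signal intensities for untreated and agent-treated cells.
control = {"EGFR": 800.0, "ERBB2": 400.0, "GENE_X": 150.0}
treated = {"EGFR": 190.0, "ERBB2": 420.0, "GENE_X": 640.0}

changed = altered_genes(control, treated)
for gene, ratio in changed.items():
    direction = "down" if ratio < 0 else "up"
    print(gene, direction)
```

Genes reported by such a comparison would then be checked against the gene-drug correlations to judge whether the test agent shifts expression in the direction associated with sensitivity or with resistance.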
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows a hierarchical cluster analysis of 69 negatively correlated genes against the 119 anticancer drugs using gene-drug Pearson correlation coefficients. Genes and drugs cluster into two main groups, while drugs further cluster into subgroups according to their mechanisms of reaction.
Figure 2 shows gene-drug correlation profiles for EGFR (panel A) and ERBB2 (panel B) with 119 anticancer drugs. Pearson correlation coefficients against 119 drugs are sorted for EGFR, while the order of drugs is maintained for ERBB2.
Figure 3 shows the relative mRNA (panel A, log2 transformed from the 70-mer microarray hybridization) and protein level (panel B) of EGFR and ERBB2 in cell lines SK-OV-3, TK-10, EKVX and SK-MEL-2. EGFR and ERBB2 protein levels (Western blots, relative expression level compared to A431 cells) are taken from (http://dtp.nci.nih.gov/mtweb/tarRetinfo?moltid=MT1173&moltnbi-=813y
Figure 4 shows the drug combination indexes with respect to fraction affected (Fa) for the combination of EGFR inhibitor AG1478 with camptothecin 10-OH (Panel A), and AG1478 with paclitaxel (Panel B). The combination index of AG1478/camptothecin 10-OH is < 1, an indication of synergism, while that of AG1478/paclitaxel is > 1, indicative of antagonism.
Figure 5 shows the genes negatively and positively correlated with drug response. Only genes correlated with at least 5 drugs with P < 0.001 are included. For all genes with at least 1 drug at P < 0.001, see Supplemental Tables 3 and 4.
Figure 6 shows the drug combination effects between AG1478 or AG825 and paclitaxel, cisplatin, or CPT, 10-OH, using the combination index (see Figure 4).
Figure 7 shows the genes occurring in predictive models more often than expected by chance, sorted by number of drugs.
Figure 8 is a comparison of prediction accuracy for select drugs, between a heuristic approach with all relevant genes (69, 49, and 343), and using only a short list of genes (top 12, 9, and 13) showing high frequencies as predictors (Figure 7). In the latter case all possible combinations were tested and the highest scoring set selected. A complete list of the 68 drugs tested is available in Figure 20.
Figure 9 is an example of a compound (mitomycin) with bimodal distribution of growth inhibition (-log(GI50)).
Figure 10 shows a hierarchical cluster analysis of the NCI-60 cell-cell correlations. The analysis was based on gene expression of 107 probes representing genes that are differentially expressed in the NCI-60 (Standard deviation S.D. > 0.35). BR: breast cancer; CNS: CNS cancer; CO: colon cancer; LC: lung cancer; LE: leukemia; ME: melanoma; OV: ovarian cancer; PR: prostate cancer; RE: renal cancer.
Figure 11 shows a hierarchical cluster analysis of gene-cell correlations using expression of 69 genes that are negatively correlated with at least one drug with P < 0.001. In this analysis, EGFR expression data from 2 independent experiments were included (EGFR and EGFR_1), showing close proximity in the cluster. Cell lines tend to cluster together according to tissue of origin, such as leukemia, colon cancer, melanoma, and renal cell carcinoma, which is consistent with Figure 1.
Figure 12 shows a hierarchical cluster analysis of gene-gene correlations using the expression for 69 genes with negative correlations as distance measures. The two major gene clusters are similar to the two main groups in Figure 2, based on cell-gene expression correlation.
Figure 13 shows a hierarchical cluster analysis of gene-gene correlation for 69 genes negatively correlated with at least one drug, using gene-drug correlation coefficients as distance measures. The two major cluster patterns are nearly identical to the pattern in Figure 3 based on Pearson correlation coefficients, and to that shown in Figure 12, based on gene-gene correlation using expression.
Figure 14 shows 343 genes for which probes were included in the 70-mer oligonucleotide microarray.
Figure 15 shows the number of genes (out of 343 genes) negatively or positively correlated with at least 4, 2, or 1 drug(s) out of 119 drugs (using bootstrap P < 0.001 and Pearson correlation coefficient > 0.4 as the cutoffs).
Figure 16 shows the genes with predominantly negative correlations to drug response. Only three genes also showed positive correlations.
Figure 17 shows the genes with predominantly positive correlations with drug response. Only two genes also had negative correlations.
Figure 18 shows prediction accuracy for 68 drugs using positively correlated or negatively correlated or all the 343 genes. Note the very low and variable accuracy for some drugs in all three groups, reflecting the heuristic nature of the selection algorithm.
Figure 19 shows predictive genes sorted by number of affected drugs where the gene is present in the optimized gene panel. Each included gene has at least one drug correlation at P < 0.001 for both positively and negatively correlated genes.
Figure 20 shows a comparison of prediction accuracy for select drugs, between a heuristic approach with all relevant genes (69, 49, and 343), and using only a short list of genes (top 12, 9, and 13) showing high frequencies as predictors (see Figures 7 and 8). In the latter case all possible combinations were tested and the highest scoring set selected.
Figure 21 shows prediction accuracy and predictive gene sets using only the genes listed in Figure 7, which occur in the predictive models for drugs more often than expected by chance. Note the small number of genes in each set, and the partial overlap between the negative and positive genes versus all genes.
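The combination index referred to in the descriptions of Figures 4 and 6 can be illustrated with a short Python sketch. The application does not state the formula used; the sketch assumes the widely used median-effect (Chou-Talalay) form for mutually exclusive drugs, in which the index is the sum of each drug's dose in the combination divided by the dose of that drug alone that would produce the same fraction affected. The parameter names and all numerical values are hypothetical.

```python
def dose_for_effect(fa, dm, m):
    """Dose of a single drug giving fraction affected `fa`.

    Median-effect equation (assumed here): D = Dm * (fa / (1 - fa)) ** (1 / m),
    where Dm is the median-effect dose and m is the slope of the
    median-effect plot.
    """
    if not 0.0 < fa < 1.0:
        raise ValueError("fa must lie strictly between 0 and 1")
    if dm <= 0 or m <= 0:
        raise ValueError("dm and m must be positive")
    return dm * (fa / (1.0 - fa)) ** (1.0 / m)

def combination_index(fa, d1, d2, dm1, m1, dm2, m2):
    """Combination index for doses d1 and d2 that together produce `fa`."""
    return d1 / dose_for_effect(fa, dm1, m1) + d2 / dose_for_effect(fa, dm2, m2)

def classify(ci, tol=1e-9):
    """Index below 1 indicates synergism, above 1 antagonism."""
    if ci < 1.0 - tol:
        return "synergism"
    if ci > 1.0 + tol:
        return "antagonism"
    return "additive"  # conventional label for an index of 1 (assumption)

# Hypothetical single-drug parameters and combination doses at Fa = 0.5.
ci_low = combination_index(0.5, d1=2.5, d2=0.5, dm1=10.0, m1=1.0, dm2=2.0, m2=1.0)
ci_high = combination_index(0.5, d1=8.0, d2=1.6, dm1=10.0, m1=1.0, dm2=2.0, m2=1.0)
print(classify(ci_low), classify(ci_high))
```

Evaluating the index over a range of Fa values, rather than at a single point, yields a plot of the kind described for Figure 4.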
DETAILED DESCRIPTION OF THE INVENTION
The present invention will now be described with occasional reference to the specific embodiments of the invention. This invention may, however, be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for describing particular embodiments only and is not intended to be limiting of the invention. As used in the description of the invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety.
Unless otherwise indicated, all numbers expressing quantities of ingredients, properties such as molecular weight, reaction conditions, and so forth as used in the specification and claims are to be understood as being modified in all instances by the term "about." Accordingly, unless otherwise indicated, the numerical properties set forth in the following specification and claims are approximations that may vary depending on the desired properties sought to be obtained in embodiments of the present invention. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the invention are approximations, the numerical values set forth in the specific examples are reported as precisely as possible. Any numerical values, however, inherently contain certain errors necessarily resulting from error found in their respective measurements.
The disclosure of all patents, patent applications (and any patents that issue thereon, as well as any corresponding published foreign patent applications), GenBank and other accession numbers and associated data, and publications mentioned throughout this description are hereby incorporated by reference herein. It is expressly not admitted, however, that any of the documents incorporated by reference herein teach or disclose the present invention.
The present invention may be understood more readily by reference to the following detailed description of the embodiments of the invention and the Examples included herein. However, before the present methods, compounds and compositions are disclosed and described, it is to be understood that this invention is not limited to specific methods, specific nucleic acids, specific polypeptides, specific cell types, specific host cells or specific conditions, etc., as such may, of course, vary, and the numerous modifications and variations therein will be apparent to those skilled in the art. It is also to be understood that the terminology used herein is for the purpose of describing specific embodiments only and is not intended to be limiting.
Definitions
"Chemosensitivity" refers to the propensity of a cell to be affected by a cytotoxic agent, wherein a cell may range from sensitive to resistant to such an agent. The expression of a chemosensitivity gene, either alone or in combination with other factors or gene expression products, can be a marker for or indicator of chemosensitivity.
"Chemosensitivity gene" refers to a gene whose protein product influences the chemosensitivity of a cell to one or more cytotoxic agents. According to the instant invention, along a scale that is a continuum, relatively high expression of a given gene in drug-sensitive cell lines is considered a positive correlation, and high expression in drug resistant cells is considered a negative correlation. Thus, negative correlation indicates that a chemosensitivity gene is associated with resistance of a cancer cell to a drug, whereas positive correlation indicates that a chemosensitivity gene is associated with sensitivity of a cancer cell to a drug. Chemosensitivity genes may themselves render cells more sensitive or more resistant to the effects of one or more cytotoxic agents, or may be associated with other factors that directly influence chemosensitivity. That is to say, some chemosensitivity genes may or may not directly participate in rendering a cell sensitive or resistant to a drug, but expression of such genes may be related to the expression of other factors which may influence chemosensitivity. Expression of a chemosensitivity gene can be correlated with the sensitivity of a cell or cell type to an agent, wherein a negative correlation may indicate that the gene affects cellular resistance to the drug, and a positive correlation may indicate that the gene affects cellular sensitivity to a drug. According to the instant disclosure, chemosensitivity genes have been identified among known and putative growth factor signaling genes. The appendices hereto list the accession numbers for the known genes, whereby the full sequences of the genes may be referenced, and which are expressly incorporated herein by reference thereto as of the filing of this application for patent.
"Array" or "microarray" refers to an arrangement of hybridizable array elements, such as polynucleotides, which in some embodiments may be on a substrate. The arrangement of polynucleotides may be ordered. In some embodiments, the array elements are arranged so that there are at least ten different array elements, and in other embodiments at least 100 array elements. Furthermore, the hybridization signal from each of the array elements may be individually distinguishable. In one embodiment, the array elements comprise nucleic acid molecules. In some embodiments, the array comprises probes to two or more chemosensitivity genes, and in other embodiments the array comprises probes to 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250 or more chemosensitivity genes. In some embodiments, the array comprises probes to genes that encode products other than chemosensitivity proteins. In some embodiments, the array comprises probes to 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, or more genes that encode products other than chemosensitivity proteins.
"Gene," when used herein, broadly refers to any region or segment of DNA associated with a biological molecule or function. Thus, genes include coding sequence, and may further include regulatory regions or segments required for their expression. Genes may also include non-expressed DNA segments that, for example, form recognition sequences for other proteins. Genes can be obtained from a variety of sources, including cloning from a source of interest or synthesizing from known or predicted sequence information, and may include sequences encoding desired parameters.
"Hybridization complex" refers to a complex between two nucleic acid molecules by virtue of the formation of hydrogen bonds between purines and pyrimidines.
"Identical" or percent "identity," when used herein in the context of two or more nucleic acid or polypeptide sequences, refer to two or more sequences or subsequences that may be the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence. For sequence comparison, typically one sequence acts as a reference sequence to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.
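As a minimal sketch of the percent identity calculation described above, the following Python function compares a test sequence to a reference sequence that has already been aligned to it. The alignment step itself is not shown. The gap character, the choice to count gapped columns in the denominator, and the example sequences are assumptions of this sketch; a given sequence comparison program may apply different conventions through its own parameters.

```python
def percent_identity(ref_aligned, test_aligned, gap="-"):
    """Percent of aligned columns in which both sequences carry the same residue.

    Both inputs must already be aligned (equal length, gaps marked with `gap`).
    Columns where both sequences have a gap are ignored; a column with a gap
    in only one sequence counts as a non-identical column.
    """
    if len(ref_aligned) != len(test_aligned):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(ref_aligned.upper(), test_aligned.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared

# Hypothetical aligned nucleotide sequences: one substitution, one gap.
reference = "ACGTACGT-A"
test_seq = "ACGAACGTTA"
identity = percent_identity(reference, test_seq)
print(identity)
```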
"Isolated," when used herein in the context of a nucleic acid or protein, denotes that the nucleic acid or protein is essentially free of other cellular components with which it is associated in the natural state. It is preferably in a homogeneous state although it can be in either a dry or aqueous solution. Purity and homogeneity are typically determined using analytical chemistry techniques such as polyacrylamide gel electrophoresis or high performance liquid chromatography. A protein that is the predominant molecular species present in a preparation is substantially purified. An isolated gene is separated from open reading frames that flank the gene and encode a protein other than the gene of interest.
"Marker," as used herein in reference to a chemosensitivity gene, means an indicator of chemosensitivity. A marker may either directly or indirectly influence the chemosensitivity of a cell to a cytotoxic agent, or it may be associated with other factors that influence chemosensitivity.
"Naturally-occurring" and "wild-type," are used herein to describe something that can be found in nature as distinct from being artificially produced by man. For example, a polypeptide or polynucleotide sequence that is present in an organism (including viruses) that can be isolated from a source in nature and that has not been intentionally modified by man in the laboratory is naturally-occurring. In particular, "wild-type" is used herein to refer to the naturally-occurring or native forms of growth factor signaling proteins and their encoding nucleic acid sequences. Therefore, in the context of this application, 'wild-type' includes naturally occurring variant forms for growth factor signaling genes, either representing splice variants or genetic variants between individuals, which may require different probes for selective detection.
"Nucleic acid," when used herein, refers to deoxyribonucleotides or ribonucleotides, nucleotides, oligonucleotides, polynucleotide polymers and fragments thereof in either single- or double-stranded form. A nucleic acid may be of natural or synthetic origin, double-stranded or single-stranded, and separate from or combined with carbohydrate, lipids, protein, other nucleic acids, or other materials, and may perform a particular activity such as transformation or form a useful composition such as a peptide nucleic acid (PNA). Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides that have similar binding properties as the reference nucleic acid and may be metabolized in a manner similar to naturally-occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g. degenerate codon substitutions) and complementary sequences, as well as the sequence explicitly indicated. Specifically, degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al. (1991) Nucleic Acid Res. 19: 5081; Ohtsuka et al. (1985) J. Biol. Chem. 260: 2605-2608; Cassol et al. (1992); Rossolini et al. (1994) Mol. Cell. Probes 8: 91-98). The term nucleic acid is used interchangeably with gene, cDNA, and mRNA encoded by a gene. An "oligonucleotide" or "oligo" is a nucleic acid and is substantially equivalent to the terms amplimer, primer, oligomer, element, target, and probe, and may be either double or single stranded.
"Plurality" refers to a group of at least two or more members.
"Polynucleotide" refers to nucleic acid having a length from 25 to 3,500 nucleotides.
"Probe" or "Polynucleotide Probe" refers to a nucleic acid capable of hybridizing under stringent conditions with a target region of a target sequence to form a polynucleotide probe/target complex. Probes comprise polynucleotides that are 15 consecutive nucleotides in length. Probes may be 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100 nucleotides in length. In some embodiments, probes are 70 nucleotides in length. Probes may be less than 100% complementary to a target region, and may comprise sequence alterations in the form of one or more deletions, insertions, or substitutions, as compared to probes that are 100% complementary to a target region.
"Purified," when used herein in the context of nucleic acids or proteins, denotes that a nucleic acid or protein gives rise to essentially one band in an electrophoretic gel. Particularly, it means that the nucleic acid or protein is at least 50, 55, 60, 65, 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99 or 100% pure with respect to the presence of any other nucleic acid or protein species.
"Sample" refers to an isolated sample of material, such as material obtained from an organism, containing nucleic acid molecules. A sample may comprise a bodily fluid; a cell; an extract from a cell, chromosome, organelle, or membrane isolated from a cell; genomic DNA, RNA, or cDNA in solution or bound to a substrate; or a biological tissue or biopsy thereof. A sample may be obtained from any bodily fluid (blood, urine, saliva, phlegm, gastric juices, etc.), cultured cells, biopsies, or other tissue preparations.
"Stringent hybridization conditions" and "stringent hybridization wash conditions" in the context of nucleic acid hybridization experiments such as Southern and northern hybridizations are sequence dependent, and are different under different environmental parameters. Nucleic acids having longer sequences hybridize specifically at higher temperatures. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes, part I, chapter 2, "Overview of principles of hybridization and the strategy of nucleic acid probe assays," Elsevier, N.Y. Generally, highly stringent hybridization and wash conditions are selected to be 5 °C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. Typically, under "stringent conditions" a probe will hybridize to its target subsequence, but to no other sequences. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Very stringent conditions are selected to be equal to the Tm for a particular probe.
An example of stringent hybridization conditions for hybridization of complementary nucleic acids that have more than 100 complementary residues on a filter in a Southern or northern blot is 50% formamide with 1 mg of heparin at 42 °C, with the hybridization being carried out overnight. An example of highly stringent wash conditions is 0.15 M NaCl at 72 °C for 15 minutes. An example of stringent wash conditions is a 0.2x SSC wash at 65 °C for 15 minutes (see Sambrook, infra, for a description of SSC buffer). Often, a high stringency wash is preceded by a low stringency wash to remove background probe signal. An example medium stringency wash for a duplex of, e.g., more than 100 nucleotides, is 1x SSC at 45 °C for 15 minutes. An example low stringency wash for a duplex of, e.g., more than 100 nucleotides, is 4-6x SSC at 40 °C for 15 minutes. For short probes (e.g., 10 to 50 nucleotides), stringent conditions typically involve salt concentrations of less than 1.0 M Na ion, typically 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3, and the temperature is typically at least 30 °C. Stringent conditions can also be achieved with the addition of destabilizing agents such as formamide. In general, a signal to noise ratio of 2x (or higher) than that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization. Nucleic acids that do not hybridize to each other under stringent conditions are still substantially similar if the polypeptides that they encode are substantially similar. This occurs, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code.
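The signal to noise criterion stated above lends itself to a one-line check, sketched here in Python. The function name and the intensity values are hypothetical; the only element taken from the text is the 2x threshold relative to an unrelated probe measured in the same hybridization assay.

```python
def is_specific_hybridization(probe_signal, unrelated_probe_signal, min_ratio=2.0):
    """True when the probe signal is at least `min_ratio` times the signal
    of an unrelated probe measured in the same hybridization assay."""
    if unrelated_probe_signal <= 0:
        raise ValueError("the unrelated-probe signal must be positive")
    return probe_signal / unrelated_probe_signal >= min_ratio

# Hypothetical intensities: one probe clears the 2x threshold, one does not.
strong_call = is_specific_hybridization(1200.0, 400.0)
weak_call = is_specific_hybridization(500.0, 400.0)
print(strong_call, weak_call)
```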
"Substrate" refers to a support, such as a rigid or semi-rigid support, to which nucleic acid molecules or proteins are applied or bound, and includes membranes, filters, chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, capillaries or other tubing, plates, polymers, and microparticles, and other types of supports, which may have a variety of surface forms including wells, trenches, pins, channels and pores.
"Target polynucleotide," as used herein, refers to a nucleic acid to which a polynucleotide probe can hybridize by base pairing and that comprises all or a fragment of a gene that encodes a protein that is a marker for chemosensitivity in cancer cells. In some instances, the sequences of target and probes may be 100% complementary (no mismatches) when aligned. In other instances, there may be up to a 10% mismatch. Target polynucleotides represent a subset of all of the polynucleotides in a sample that encode the expression products of all transcribed and expressed genes in the cell or tissue from which the polynucleotide sample is prepared. The gene products of target polynucleotides are markers for chemosensitivity of cancer cells; some may directly influence chemosensitivity through involvement in growth factor signaling. Alternatively, they may direct or influence cancer cell characteristics that indirectly confer or influence sensitivity or resistance.
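The degree of mismatch between a probe and a target region can be checked computationally. The following Python sketch is offered as an illustration only: it assumes a DNA probe of the same length as the target region, aligned without gaps to the reverse complement of that region, and it applies the 10% mismatch limit mentioned above. The function names and the 20-nucleotide example sequence are hypothetical.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def mismatch_fraction(probe, target_region):
    """Fraction of probe positions that do not base pair with the target.

    The probe is compared, position by position and without gaps, to the
    reverse complement of the target region.
    """
    if not probe or len(probe) != len(target_region):
        raise ValueError("probe and target region must be non-empty and equal in length")
    expected = reverse_complement(target_region).upper()
    mismatches = sum(1 for p, e in zip(probe.upper(), expected) if p != e)
    return mismatches / len(probe)

def within_tolerance(probe, target_region, max_mismatch=0.10):
    """True when the mismatch fraction does not exceed `max_mismatch`."""
    return mismatch_fraction(probe, target_region) <= max_mismatch

# Hypothetical 20-nucleotide target region and three candidate probes.
target = "ATGGCCAAGTTCGATCGGAT"
perfect = reverse_complement(target)   # 100% complementary
one_off = "T" + perfect[1:]            # 1 mismatch in 20 (5%)
three_off = "TAG" + perfect[3:]        # 3 mismatches in 20 (15%)
print(within_tolerance(perfect, target),
      within_tolerance(one_off, target),
      within_tolerance(three_off, target))
```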
"Target Region" means a stretch of consecutive nucleotides comprising all or a portion of a target sequence such as a gene or an oligonucleotide encoding a protein that is a marker for chemosensitivity. Target regions may be 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 200 or more nucleotides in length. In some embodiments, target regions are 70 nucleotides in length, and lack secondary structure. Target regions may be identified using computer software programs such as OLIGO 4.06 software (National Biosciences, Plymouth MN), LASERGENE software (DNASTAR, Madison Wis.), MACDNASIS (Hitachi Software Engineering Co., San Francisco, Calif.) and the like.
Polynucleotide Probes
The polynucleotide probes can be genomic DNA or cDNA or mRNA, or any RNA-like or DNA-like material, such as peptide nucleic acids, branched DNAs and the like. The polynucleotide probes can be sense or antisense polynucleotide probes. Where target polynucleotides are double stranded, the probes may be either sense or antisense strands. Where the target polynucleotides are single stranded, the nucleotide probes are complementary single strands.
The polynucleotide probes can be prepared by a variety of synthetic or enzymatic schemes that are well known in the art. The probes can be synthesized, in whole or in part, using chemical methods well known in the art (Caruthers et al. (1980) Nucleic Acids Res. Symp. Ser. 215-233). Alternatively, the probes can be generated, in whole or in part, enzymatically. Nucleotide analogues can be incorporated into the polynucleotide probes by methods well known in the art. The only requirement is that the incorporated nucleotide analogues must serve to base pair with target polynucleotide sequences. For example, certain guanine nucleotides can be substituted with hypoxanthine that base pairs with cytosine residues. However, these base pairs are less stable than those between guanine and cytosine. Alternatively, adenine nucleotides can be substituted with 2,6-diaminopurine that can form stronger base pairs than those between adenine and thymidine. Additionally, the polynucleotide probes can include nucleotides that have been derivatized chemically or enzymatically. Typical chemical modifications include derivatization with acyl, alkyl, aryl or amino groups.
The polynucleotide probes may be labeled with one or more labeling moieties to allow for detection of hybridized probe/target polynucleotide complexes. The labeling moieties can include compositions that can be detected by spectroscopic, photochemical, biochemical, bioelectronic, immunochemical, electrical, optical or chemical means. The labeling moieties include radioisotopes, such as P32, P33 or S35, chemiluminescent compounds, labeled binding proteins, heavy metal atoms, spectroscopic markers, such as fluorescent markers and dyes, magnetic labels, linked enzymes, mass spectrometry tags, spin labels, electron transfer donors and acceptors, and the like. The polynucleotide probes can be immobilized on a substrate. Preferred substrates are any suitable rigid or semi-rigid support, including membranes, filters, chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers, microparticles and capillaries. The substrate can have a variety of surface forms, such as wells, trenches, pins, channels and pores, to which the polynucleotide probes are bound. Preferably, the substrates are optically transparent.
Target Polynucleotides
In order to conduct sample analysis, a sample containing polynucleotides that will be assessed for the presence of target polynucleotides is obtained. The samples can be any sample containing target polynucleotides and obtained from any bodily fluid (blood, urine, saliva, phlegm, gastric juices, etc.), cultured cells, biopsies, or other tissue preparations. DNA or RNA can be isolated from the sample according to any of a number of methods well known to those of skill in the art. For example, methods of purification of nucleic acids are described in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization With Nucleic Acid Probes, Part I. Theory and Nucleic Acid Preparation, Elsevier, New York N.Y. In one case, total RNA is isolated using the TRIZOL reagent (Life Technologies, Gaithersburg Md.), and mRNA is isolated using oligo d(T) column chromatography or glass beads. Alternatively, when polynucleotide samples are derived from an mRNA, the polynucleotides can be a cDNA reverse transcribed from an mRNA, an RNA transcribed from that cDNA, a DNA amplified from that cDNA, an RNA transcribed from the amplified DNA, and the like. When the polynucleotide is derived from DNA, the polynucleotide can be DNA amplified from DNA or RNA reverse transcribed from DNA.
Suitable methods for measuring the relative amounts of the target polynucleotide transcripts in samples of polynucleotides are Northern blots, RT-PCR, real-time PCR, or RNase protection assays. For ease in measuring the transcripts for target polynucleotides, it is preferred that arrays as described above be used. The target polynucleotides may be labeled with one or more labeling moieties to allow for detection of hybridized probe/target polynucleotide complexes. The labeling moieties can include compositions that can be detected by spectroscopic, photochemical, biochemical, bioelectronic, immunochemical, electrical, optical or chemical means. The labeling moieties include radioisotopes, such as P32, P33 or S35, chemiluminescent compounds, labeled binding proteins, heavy metal atoms, spectroscopic markers, such as fluorescent markers and dyes, magnetic labels, linked enzymes, mass spectrometry tags, spin labels, electron transfer donors and acceptors, and the like.
Hybridization complexes
Hybridization causes a denatured polynucleotide probe and a denatured complementary target polynucleotide to form a stable duplex through base pairing. Hybridization methods are well known to those skilled in the art (see, e.g., Ausubel (1997) Short Protocols in Molecular Biology, John Wiley & Sons, New York N.Y., units 2.8-2.11, 3.18-3.19 and 4.6-4.9). Conditions can be selected for hybridization where exactly complementary target and polynucleotide probe can hybridize, i.e., each base must pair with its complementary base. Alternatively, conditions can be selected where target and polynucleotide probes have mismatches but are still able to hybridize. Suitable conditions can be selected, for example, by varying the concentrations of salt in the prehybridization, hybridization and wash solutions, or by varying the hybridization and wash temperatures.
With some membranes, the temperature can be decreased by adding formamide to the prehybridization and hybridization solutions.
Hybridization conditions are based on the melting temperature (Tm) of the nucleic acid binding complex or probe, as described in Berger and Kimmel (1987) Guide to Molecular Cloning Techniques, Methods in Enzymology, vol 152, Academic Press. The term "stringent conditions", as used herein, is the "stringency" that occurs within a range from Tm-5 (5° C below the melting temperature of the probe) to 20° C below Tm. As used herein, "highly stringent" conditions employ at least 0.2 x SSC buffer and at least 65° C. As recognized in the art, stringency conditions can be attained by varying a number of factors, such as the length and nature, i.e., DNA or RNA, of the probe; the length and nature of the target sequence; and the concentration of the salts and other components, such as formamide, dextran sulfate, and polyethylene glycol, of the hybridization solution. All of these factors may be varied to generate conditions of stringency that are equivalent to the conditions listed above.
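The stringency range defined above can be illustrated with a short sketch. It only restates the definition in the text (stringent conditions lie between Tm-20 and Tm-5); the function names and the example Tm of 72° C are hypothetical and are not part of the disclosure.

```python
def stringent_window(tm_celsius):
    """Return (low, high) temperatures in deg C for 'stringent conditions',
    taken from the text as the range from Tm-20 to Tm-5."""
    return (tm_celsius - 20.0, tm_celsius - 5.0)


def is_stringent(temp_celsius, tm_celsius):
    """True if a hybridization temperature falls inside the stringent window."""
    low, high = stringent_window(tm_celsius)
    return low <= temp_celsius <= high


# Hypothetical probe with an assumed Tm of 72 deg C (illustrative value only).
window = stringent_window(72.0)
print(window)
print(is_stringent(65.0, 72.0))
```

Salt and formamide concentrations also shift the effective Tm, so a real protocol would first adjust Tm for the buffer in use; that adjustment is deliberately left out of this sketch.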
Hybridization can be performed at low stringency with buffers, such as 6 x SSPE with 0.005% Triton X-100 at 37° C, which permits hybridization between target and polynucleotide probes that contain some mismatches to form target polynucleotide/probe complexes. Subsequent washes are performed at higher stringency with buffers, such as 0.5 x SSPE with 0.005% Triton X-100 at 50° C, to retain hybridization of only those target/probe complexes that contain exactly complementary sequences. Alternatively, hybridization can be performed with buffers, such as 5 x SSC/0.2% SDS at 60° C, and washes are performed in 2 x SSC/0.2% SDS and then in 0.1 x SSC. Background signals can be reduced by the use of detergent, such as sodium dodecyl sulfate, Sarcosyl or Triton X-100, or a blocking agent, such as salmon sperm DNA.
Array Construction
The nucleic acid sequences can be used in the construction of arrays, for example, microarrays. Methods for construction of microarrays, and the use of such microarrays, are known in the art, examples of which can be found in U.S. Patent Nos. 5,445,934, 5,744,305, 5,700,637, and 5,945,334, the entire disclosure of each of which is hereby incorporated by reference. Microarrays can be arrays of nucleic acid probes, arrays of peptide or oligopeptide probes, or arrays of chimeric probes - peptide nucleic acid (PNA) probes. Those of skill in the art will recognize the uses of the collected information.
One particular example, the in situ synthesized oligonucleotide Affymetrix GeneChip system, is widely used in many research applications with rigorous quality control standards (Rouse R. and Hardiman G., "Microarray technology - an intellectual property retrospective," Pharmacogenomics 5:623-632 (2003)). Currently the Affymetrix GeneChip uses eleven 25-oligomer probe pair sets containing both a perfect match and a single nucleotide mismatch for each gene sequence to be identified on the array. Using a light-directed chemical synthesis process (photolithography technology), highly dense glass oligo probe array sets (>1,000,000 25-oligomer probes) can be constructed in a ~3 x 3 cm plastic cartridge that serves as the hybridization chamber. The ribonucleic acid to be hybridized is isolated, amplified, fragmented, labeled with a fluorescent reporter group, and stained with fluorescent dye after incubation. Light is emitted from the fluorescent reporter group only when it is bound to the probe. The intensity of the light emitted from the perfect match oligoprobe, as compared to the single base pair mismatched oligoprobe, is detected in a scanner, which in turn is analyzed by bioinformatics software (http://www.affymetrix.com). The GeneChip system provides a standard platform for array fabrication and data analysis, which permits data comparisons among different experiments and laboratories. Microarrays according to the invention can be used for a variety of purposes, as further described herein, including but not limited to, screening for the resistance or susceptibility of a cancer to a drug based on the genetic expression profile of the cancer.
Chemosensitivity Gene Expression Analysis System
In one aspect, the present invention provides a chemosensitivity gene expression analysis system comprising a plurality of polynucleotide probes, wherein each of said polynucleotide probes comprises a nucleic acid sequence that is complementary under strict hybridization conditions to at least a portion of a gene that encodes a protein that is a marker for the sensitivity of cancer cells to cytotoxic agents, as presented in the appendices hereto. In some embodiments, the polynucleotide probes are provided on an array.
When the polynucleotide probes are employed as hybridizable array elements in an array, the array elements are organized in an ordered fashion so that each element is present at a specified location on the substrate. Because the array elements are at specified locations on the substrate, the hybridization patterns and intensities (which together create a unique expression profile) can be interpreted in terms of expression levels of particular genes and can be correlated with a particular disease or condition or treatment.
The gene expression analysis system, in some embodiments in the form of an array, can be used for gene expression analysis of target polynucleotides that represent the expression products of cells of interest, particularly cancer cells. The array can also be used in the prediction of the responsiveness of a patient to a therapeutic agent, such as the response of a cancer patient to a chemotherapeutic agent. Further, as described below, the array can be employed to investigate the profile of a cancer cell in terms of its likely sensitivity or resistance to chemotherapeutic agents. Furthermore, as described below, the array can be employed to characterize a therapeutic agent's chemosensitivity profile for use in treating various cancers. The array can also be used to identify new agents, as described below, which can modulate the chemosensitivity of a cancer cell to one or more therapeutic agents by altering the expression of genes that are markers for and influence chemosensitivity. The gene expression analysis system can be used to purify a subpopulation of mRNAs, cDNAs, genomic fragments and the like, in a sample. Typically, samples will include target polynucleotides and other non-target nucleic acids that may undesirably affect the hybridization background. Therefore, it may be advantageous to remove these non-target nucleic acids from the sample. One method for removing the non-target nucleic acids is by contacting the polynucleotide sample with the array and hybridizing the target polynucleotides contained therein with immobilized polynucleotide probes under hybridizing conditions. The non-target nucleic acids that do not hybridize to the polynucleotide probes are then washed away, and thereafter, the target polynucleotides bound to the immobilized probes can be released in the form of purified target polynucleotides.
Examples of the types of molecules that may be used as probes are cDNA molecules, oligonucleotides that contain 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100 or more nucleotides, and other gene probes that comprise nucleobases including synthetic gene probes such as, for example, peptide nucleic acids. At least some of said polynucleotide probes comprise a polynucleotide sequence that is complementary to a target region of a gene that encodes a protein associated with growth factor signaling and that is a marker for the sensitivity or resistance of cancer cells to cytotoxic agents. In one embodiment, the plurality of polynucleotide probes comprises at least two or more probes, each of which comprises a polynucleotide sequence that is complementary to a target region of a chemosensitivity gene as described in the appendices hereto. In another aspect, the present invention provides a physical embodiment of the expression profile for a cancer cell of proteins that are involved in growth factor signaling and that are markers for the sensitivity of cancer cells to cytotoxic agents. The expression profile comprises the polynucleotide probes of the invention. The expression profile also includes a plurality of detectable complexes, in some embodiments in the form of a gene expression analysis system, and in some embodiments in the form of an array. Each complex is formed by hybridization of one or more polynucleotide probes to one or more complementary target polynucleotides in a sample. The polynucleotide probes are hybridized to a complementary target polynucleotide forming target/probe complexes.
A complex is detected by incorporating at least one labeling moiety in the complex. Labeling moieties are described herein and are well known in the art.
In another embodiment, the chemosensitivity expression profile comprises a printed report that shows the results of the expression analysis of an array. The printed report may be in the form of a developed or digital film of the hybridized and developed gene expression analysis system. The printed report may also be a manually or computer generated numerical analysis of the developed gene expression analysis system. The printed report may optionally contain gene-drug correlation information. The expression profiles provide "snapshots" that can show unique expression patterns that are characteristic of susceptibility or resistance of a cell to one or more cytotoxic chemotherapeutic agents. The chemosensitivity expression profile can be used, as further described below: (a) in the prediction of the chemosensitivity of a particular cancer cell or cell type to a therapeutic agent; (b) in the choice of drug therapy for a patient in need of the same; (c) in the identification of targets for altering the chemosensitivity of a cancer; and (d) in the identification of novel agents for modulating the chemosensitivity of a cancer.
Methods of Predicting Response to Therapeutic Agents
In another aspect, the present invention provides a method of predicting the response of a specific cancer, and more particularly a cancer in a patient, to treatment with a therapeutic agent. The method comprises contacting a polynucleotide sample obtained from the cells of the specific cancer to polynucleotide probes to measure the levels of expression of one or, in some embodiments, a plurality of target polynucleotides. The expression levels of the target polynucleotides are then used to provide an expression profile for the cancer cells that is then compared to the drug-gene correlations described in the appendices hereto, wherein a positive correlation between a drug and a gene expressed in the cancer cells indicates that the cancer cells would be sensitive to the drug, and wherein a negative correlation between a drug and a gene expressed in the cancer cells indicates that the cancer cells would be resistant to the drug.
Methods of Identifying New Therapeutic Agents
The present invention provides novel methods for identifying and characterizing new agents that modulate the chemosensitivity of a cancer by altering the expression of one or more growth factor signaling genes. The method comprises treating a sample of cells from the cancer with an agent, and thereafter determining any change in expression of genes, such as growth factor signaling genes, which are markers for chemosensitivity. This is done by obtaining polynucleotide samples from untreated cancer cells and the treated cancer cells, and contacting the polynucleotide samples to polynucleotide probes to determine the levels of target polynucleotides to obtain growth factor signaling gene chemosensitivity expression profiles. In some embodiments, the measurement is made using an array or microarray as described above that comprises one or more probes.
The method further comprises comparing the growth factor signaling gene expression profiles of the control and treated cells to determine whether the agent alters the expression of any of the chemosensitive or chemoresistant genes. In some embodiments, separate cultures of cells are exposed to different dosages of the candidate agent. The effectiveness of the agent's ability to alter chemosensitivity can be tested using standard assays that use, for example, one or more of the NCI60 cancer cell lines. The agent is tested by conducting assays in which sample cancer cells are co-treated with the newly identified agent along with a previously known therapeutic agent. The choice of previously known therapeutic agent is determined based upon the gene-drug correlation between the gene or genes whose expression is affected by the new agent. The present invention further provides novel methods for identifying and characterizing new agents that modulate the chemosensitivity of a cancer by altering the activity of one or more growth factor signaling genes. The method comprises treating a sample of cells from the cancer with an agent that is capable of inhibiting the activity of a protein implicated in chemosensitivity by correlation analysis between gene expression and drug potency in multiple cancer cell lines. For example, an inhibitor of an efflux pump will increase the potency of an anticancer drug if the efflux pump is highly expressed. This permits one to search either for inhibitors of the chemosensitivity gene or to test whether an anticancer agent is subject to influence by the chemosensitivity gene product.
Any cell line that is capable of being maintained in culture may be used in the method. In some embodiments, the cell line is a human cell line, such as, for example, any one of the cells from the NCI60 cell lines. According to one approach, RNA is extracted from such cells, converted to cDNA and applied to arrays to which probes have been applied, as described above.
EXAMPLES
Example 1:
We have investigated the relationship between cytotoxicity of anticancer drugs and expression of genes involved in growth factor signaling. Cytotoxic potencies of 119 drugs against 60 neoplastic cell lines (NCI-60) were correlated with expression of 343 genes, including 90 growth factors and receptors, 63 metalloproteinases, and 92 ras-like GTPases as downstream signaling factors. Correlating gene expression with drug potency yielded two sets of genes showing either negative or positive drug correlations (P < 0.001), indicative of a role in chemoresistance and -sensitivity. Known chemoresistance factors showed significant negative correlations with multiple drugs, but several novel candidate genes also scored highly. Negatively correlated genes clustered into two main groups with distinct expression profiles and drug correlations, represented by EGFR and ERBB2 (Her-2/Neu), which displayed distinct drug correlation patterns. Synergism and antagonism between EGFR and ERBB2 inhibitors, in combination with classical anticancer drugs, were not directly related to EGFR and ERBB2 expression in four cell lines tested. Good accuracy in predicting drug potency against the NCI-60 was attainable with a subset of only 13 genes (RAB5B, TGFBR3, RAB6A, ARF4, ARHC, ERBB3 (negative correlations), and PLCL2, RAN, PLCD4, RAB37, RAC2, RAB39B (positive)). Our approach reveals known and potentially novel biomarkers and drug targets in cancer chemotherapy.
Introduction: Classification of tumors and prediction of drug response have advanced with the use of mRNA expression profiles (1-4). Pathways underlying drug response include membrane transport, drug metabolism, apoptosis, DNA repair, and cell cycle control (3, 5). However, the full potential of transcriptional profiling in understanding of chemoresistance has yet to be achieved (4, 6), owing to poor reproducibility between array platforms (1, 7), inherent variability of gene expression in vitro and in vivo, and variable processing of the vast amount of high-dimensional data. This has resulted in different sets of candidate genes with little overlap between studies. Holleman et al. (8) have identified a set of 124 genes predictive of ALL response to four anticancer drugs, but only 3 had been previously associated with drug resistance. Because the 20-40 predictive genes differed entirely between each of the 4 drugs, the authors concluded that upstream mechanisms specific to each drug determine the response (8). Yet, one would have expected common downstream cell survival pathways to determine chemoresistance against multiple drugs. Therefore, we have focused here on genes that affect cytotoxic potencies of multiple drugs, comprising three gene families involved in growth factor signaling.
Growth factor signaling regulates cell proliferation, differentiation, and apoptosis (5, 9). Amplification, point mutations, or chromosomal translocation can result in uncontrolled activation of growth factor signaling pathways, as exemplified by ERBB2 in breast cancer (10). This has prompted the targeting of growth factor signaling as a therapeutic strategy, including growth factor receptors such as EGFR (11) and ERBB2 (HER-2) (9, 12). Growth factor signaling also conveys chemoresistance against classical anticancer drugs (13), involving suppression of apoptosis (14) and activation of multi-drug resistance genes, such as the drug efflux pump MDR1 (15). As a result, combined administration of growth factor inhibitors and conventional cytotoxic drugs can result in synergistic effects (9, 16-18). However, this strategy has yielded variable results in clinical trials, possibly because of multiple parallel signaling pathways capable of bypassing the blocked signal (13). For example, signaling via type I insulin-like growth factor receptor (IGF-IR) may render anti-ERBB2 Herceptin therapy ineffective (19).
Growth factors and their receptors targeted by current drugs involve only a small subset of the encoding genes. In addition, gene families encoding metalloproteinases and small GTPases are potential drug targets in the growth factor signaling cascades. Matrix and membrane associated metalloproteinases are involved in transformation, proliferation and metastasis, and growth factor release, while GTPases are downstream integrators of cellular signaling, with NRAS and HRAS already implicated in chemoresistance (20, 21). The present study assesses the role of these pivotal components in growth factor signaling in chemosensitivity and -resistance. We exploit correlations between gene expression patterns in the NCI-60 cancer cell lines with cytotoxic drug potency. The NCI-60, a set of 60 diverse human cancer cell lines (22), has served in the screening of more than 100,000 candidate drugs. Gene-drug correlations can reveal novel drug targets or mechanisms of chemoresistance (1, 23, 24), while clustering of drug potency against the NCI60 cells can reveal mechanisms of action (25). Moreover, mRNA expression profiles of subsets of 20-200 genes can serve as predictors of cytotoxic drug potencies (8, 26). However, the underlying mechanisms remain unclear. On the other hand, with a more focused approach measuring expression of genes involved in transmembrane transport in the NCI-60 (27), we have identified numerous new drug-transporter relationships relevant to drug targeting and potency (28) that can be extrapolated to in vivo studies because causal interactions are implied. In this study, we have measured expression of 343 genes related to growth factor signaling, and other signaling pathways for comparison (Figure 14), in the NCI-60 (27), and have correlated the results with cytotoxic drug potencies. Genes with positive correlations were evaluated for a role in chemosensitivity, and those with negative correlations for chemoresistance.
Among the latter, EGFR and ERBB2 displayed sharply distinct drug correlation patterns, representing two main clusters of putative resistance genes. We further tested synergism and antagonism between combinations of EGFR and ERBB2 inhibitors and traditional cytotoxic anticancer drugs. Finally, we identified subsets of a minimal number of genes as predictors of cytotoxic potency for multiple drugs, and as potential biomarkers and drug targets.
Materials and Methods:
Oligonucleotide microarrays. 70-mer oligonucleotide probes (total 343 genes) were designed for 90 growth factors and their receptors (e.g., EGF, IGF, FGF, PDGF, VEGF, TGF and TNF families), 63 metalloproteinases and their inhibitors (20 MMPs, 4 TIMPs, 20 ADAMs and 19 ADAMTSs), and 92 small GTP-binding proteins (GTPases). In addition, 98 probes were targeted to GPCRs, heterotrimeric G proteins, phospholipases, etc. (Figure 14), for comparison. The 70-mer oligonucleotides were designed and synthesized by Operon (Alameda, CA), Qiagen (http://omad.qiagen.com/human2/index.php). Probes for these genes were added to the transporter and channel gene microarray described previously (27, 29), with 25 genes overlapping with the current study. Each probe was printed 4 times on poly-L-lysine glass slides to permit assessment of intra-assay variability of mRNA measurements. The 25 genes analyzed in duplicate served as quality control, in addition to comparison with cDNA arrays with 144 overlapping genes.
NCI-60 cancer cell lines. Cell lines, purchased from the Division of Cancer Treatment and Diagnosis, National Cancer Institute, National Institutes of Health (http://www.dtp.nci.nih.gov), were cultured in RPMI 1640 medium with L-glutamine, supplemented with 10% fetal bovine serum, 100 U/ml sodium penicillin G and 100 μg/ml streptomycin. Cells were grown in tissue culture flasks at 37°C in a 5% CO2 atmosphere.
Chemicals. EGFR inhibitor AG1478 and ERBB2 inhibitor AG825 were purchased from Calbiochem (San Diego, CA). Cisplatin was obtained from Sigma (St. Louis, MO). Paclitaxel and camptothecin, 10-OH (CPT, 10-OH) were from the Developmental Therapeutics Program at NCI (Bethesda, MD).
Microarray hybridization. The hybridization was performed following published procedures (27, 29). Total RNA was extracted from cell cultures using TriZol (Invitrogen, Carlsbad, CA) and purified by RNeasy® mini kit (Qiagen, Valencia, CA). Expression of each gene was assessed by the ratio of expression level in the sample against a pooled control sample from 12 diverse cell lines of the NCI-60 (23). 12.5 μg total RNA was used for cDNA synthesis and then labeled with Cy5 or Cy3. The samples were then mixed and hybridized to the slides, and analyzed with an Affymetrix 428 scanner.
Data analysis. Microarray data analysis was performed as previously described (27). Background subtraction and calculation of medians of pixel measurements/spot was carried out using GenePix Software 3.0 (Foster City, CA). Spots were filtered out if they had both red and green intensity < 500 units after subtraction of the background or if they were flagged for any visual reason (odd shapes, background noise). Data normalization was carried out using the statistical software package R (www.r-project.org). To correct for intensity and dye bias, we used location and scale normalization methods, which are based on robust, locally linear fits, implemented in the SMA R package. This method is based on the transformations:
$\log_2 (R/G) \rightarrow \log_2 (R/G) - c_j(A) = \log_2 \left( R / (k_j(A)\, G) \right) \rightarrow (1/a_j) \log_2 \left( R / (k_j(A)\, G) \right)$,
where $c_j(A)$ is the Lowess fit of the M vs. A plot for spots on the jth grid of each slide, and $a_j$ is the scale factor for the jth grid (to obtain equal variances along individual slides). After performing these transformations, the gene expression level of each probe was set to be the median of the four copies of that probe.
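The spot-level steps described above (filtering, the M and A quantities, location and scale adjustment, and the median over four replicate spots) can be sketched as follows. This is a minimal illustration and not the SMA package: the Lowess fit is not implemented, so the location function c_j(A) and the scale a_j are supplied by the caller, and all function names and intensity values are hypothetical.

```python
import math
from statistics import median


def passes_filter(red, green, flagged=False, min_intensity=500.0):
    """Reject a spot if it was flagged, or if BOTH background-subtracted
    channels fall below the intensity cutoff (500 units in the text)."""
    if flagged:
        return False
    return not (red < min_intensity and green < min_intensity)


def m_and_a(red, green):
    """M = log2(R/G) is the log ratio; A = 0.5*log2(R*G) is the mean log intensity."""
    return math.log2(red / green), 0.5 * math.log2(red * green)


def normalize_spot(red, green, location_fit, scale):
    """Apply M -> (M - c_j(A)) / a_j for one spot.
    location_fit stands in for the per-grid Lowess fit c_j (assumed, caller-supplied);
    scale stands in for the per-grid scale factor a_j (assumed, caller-supplied)."""
    m, a = m_and_a(red, green)
    return (m - location_fit(a)) / scale


def probe_level(replicate_values):
    """Expression level of a probe: median over its replicate spots."""
    return median(replicate_values)


# Four hypothetical replicate spots (red, green) for one probe.
spots = [(2000.0, 1000.0), (4000.0, 2000.0), (300.0, 200.0), (1600.0, 800.0)]
kept = [s for s in spots if passes_filter(*s)]

# Placeholder fit and scale: no intensity-dependent bias, unit scale.
levels = [normalize_spot(r, g, lambda a: 0.0, 1.0) for r, g in kept]
print(len(kept), probe_level(levels))
```

In practice the placeholder `lambda a: 0.0` would be replaced by a robust local regression fitted separately to each print-tip grid, which is what corrects the intensity-dependent dye bias.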
Correlation analysis between gene expression and drug activity. Growth inhibition data for 119 standard anticancer drugs (23) (GI50 values for the 60 human tumor cell lines) were obtained from the Developmental Therapeutics Program (http://www.dtp.nci.nih.gov), expressed as the negative log of the molar concentration calculated in the NCI screen (30). Pearson correlation coefficients were calculated for each gene-drug pair (119 x 343 pairs). Confidence intervals and unadjusted P-values were obtained using Efron's bootstrap resampling method (31), with 10,000 bootstrap samples for each gene-drug comparison. Because controlling false discovery rate by the method of Benjamini et al (32) proved too stringent, an arbitrary cut-off of P = 0.001 was used for the unadjusted bootstrap P value. This is expected to detect more "true" gene-drug associations, at the expense of increasing the number of false positive ones, to be validated by other means.
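The gene-drug correlation step can be sketched in a few lines. The sketch below computes a Pearson coefficient and a percentile bootstrap interval by resampling cell lines with replacement. The text does not state how the bootstrap P value was derived, so the two-sided rule used here (twice the smaller fraction of bootstrap correlations on either side of zero) is an assumption, as are the function names and the example data.

```python
import math
import random


def neg_log_gi50(gi50_molar):
    """Drug potency as the negative log10 of the molar GI50."""
    return -math.log10(gi50_molar)


def pearson(x, y):
    """Pearson correlation; returns None if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def bootstrap_pearson(x, y, n_boot=10000, seed=0):
    """Return (r, (ci_low, ci_high), p) from resampling (x, y) pairs.
    The P-value rule is an assumed two-sided percentile rule, not taken from the text."""
    rng = random.Random(seed)
    n = len(x)
    reps = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        r = pearson([x[i] for i in idx], [y[i] for i in idx])
        if r is not None:  # skip degenerate resamples
            reps.append(r)
    reps.sort()
    ci_low = reps[int(0.025 * len(reps))]
    ci_high = reps[min(len(reps) - 1, int(0.975 * len(reps)))]
    frac_le0 = sum(r <= 0 for r in reps) / len(reps)
    frac_ge0 = sum(r >= 0 for r in reps) / len(reps)
    p = min(1.0, 2 * min(frac_le0, frac_ge0))
    return pearson(x, y), (ci_low, ci_high), p


# Hypothetical expression (log ratios) and potency (-log10 GI50) for 8 cell lines.
expression = [1, 2, 3, 4, 5, 6, 7, 8]
potency = [8.1, 7.5, 7.0, 6.2, 6.0, 5.1, 4.9, 4.0]
r, ci, p = bootstrap_pearson(expression, potency, n_boot=2000, seed=1)
print(round(r, 3), ci, p)
```

With 10,000 bootstrap samples the smallest nonzero P value this rule can report is 0.0002, which is compatible with a cut-off of P = 0.001 but leaves little resolution below it.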
Clustering of cell lines, genes, and drugs on the basis of gene expression and drug potency profile. Hierarchical clustering can be used to group cell lines and genes in terms of their patterns of gene expression (23, 33). To obtain cell-cell cluster trees for 107 genes that showed distinct expression patterns across the 60 cell lines (i.e., genes that passed the filter S.D. >= 0.35), we used the programs "Cluster" and "TreeView" (34) with average linkage clustering and a correlation metric. Cells and drugs were also clustered by drug potency profiles (23), and moreover, genes and drugs were classified using correlations between each gene (expression across the NCI-60) and each drug (potency across the NCI-60) as distance measure. To reduce noise, we used stringent filters for the selection of genes included in this analysis, requiring a correlation with at least one drug at P < 0.001.
Cytotoxicity assay. Drug potency was tested using a proliferation assay with sulforhodamine B (SRB) (35). 3000 - 5000 cells per well were seeded in 96-well plates and incubated for 24 hours. Drugs were added in a dilution series in 3 replicate wells. After 3 days, incubation was terminated by replacing the medium with 100 μl 10% trichloroacetic acid (Sigma, St. Louis, MO), followed by incubation at 4° C for 1 hour. Plates were washed with water, air-dried, and stained with 100 μl 0.4% SRB (Sigma) in 1% acetic acid for 30 min at room temperature. Unbound dye was washed off with 1% acetic acid. After air-drying and re-solubilization of the protein-bound dye in 10 mM Tris-HCl (pH 8.0), absorbance was read in a micro-plate reader at 570 nm.
Determination of combination index, a measure of synergism or antagonism between two co-administered agents. The combination index (CI) was calculated according to the equation: CI = d1/D1 + d2/D2 (36, 37). D1 and D2 represent the doses of drug 1 and drug 2 alone required to produce x% effect, and d1 and d2 are the doses of drugs 1 and 2 in combination required to produce the same effect. The combined effect of the two drugs could be synergistic (CI < 1), additive (CI = 1) or antagonistic (CI > 1). Since the CI could differ at different levels of growth inhibition, combination indexes were obtained at different levels of growth inhibition, using increasing concentrations at a fixed ratio between the two drugs. The combination index was plotted against fraction affected (Fa of 0.25 would equal 75% viable cells).
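As a worked illustration of the combination index defined above, the following sketch evaluates CI = d1/D1 + d2/D2 and classifies the result. The doses are invented for illustration, and the small tolerance used to decide when CI counts as equal to 1 is an assumption rather than part of the method.

```python
def combination_index(d1, D1, d2, D2):
    """CI = d1/D1 + d2/D2, where D1, D2 are single-agent doses and d1, d2 are
    the doses in combination that produce the same x% effect."""
    return d1 / D1 + d2 / D2


def classify(ci, tol=1e-9):
    """Synergistic (CI < 1), additive (CI = 1) or antagonistic (CI > 1).
    tol is an assumed numerical tolerance around 1."""
    if ci < 1 - tol:
        return "synergistic"
    if ci > 1 + tol:
        return "antagonistic"
    return "additive"


def percent_viable(fraction_affected):
    """Convert fraction affected (Fa) to percent viable cells."""
    return (1 - fraction_affected) * 100


# Hypothetical doses: alone, 10 and 4 units give the effect; combined, 2.5 and 1.
ci = combination_index(2.5, 10.0, 1.0, 4.0)
print(ci, classify(ci))        # 0.5 synergistic
print(percent_viable(0.25))    # 75.0
```

Repeating the calculation at several effect levels, with the two drugs held at a fixed ratio, gives the CI versus Fa curve described in the text.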
Using gene expression data as predictors of cytotoxic potency. We developed predictive models for drug response against the 60 cancer cell lines based on gene expression profiles. Three groups of genes were used as candidate predictors: all 343 genes, genes showing strong negative correlations, and genes with strong positive correlations (each gene correlated with at least 1 drug at P < 0.001). For each compound we separated cell lines into training and test samples. To assess the accuracy of this method, we used a leave-one-out cross-validation procedure. We first estimated the probability of the training cell lines to be resistant to a given compound by modeling drug response levels (-log10(GI50)) as a mixture of normal distributions (Figure 9). Class assignment for a test sample was based on the predictive probabilities of class membership. This step differs from the one used in (26) by using information from all cell lines for prediction purposes and by incorporating into the analysis the bimodal behavior of growth inhibition distributions. Next, to select predictive genes, we sorted genes by their ability to discriminate between the two classes (resistant and sensitive), using the BW measure. In the final step we used a quadratic discriminant rule to predict resistance of the test cell line based on top-scoring genes according to the previous sorting criteria, and identified the set of predictor genes capable of producing the maximal percentage of correct classifications of the 60 cell lines for each drug. We searched, stepwise, for a subset of genes that improved the percentage of correct classifications using the quadratic discriminant rule, by adding one new gene at each step. Once the model could not be improved, the final predictor genes were selected as the smallest set of genes producing a model within 5% of the highest percentage of correct classifications obtained in the stepwise search described above.
This reduces the number of predictor genes, by accepting predictor sets within 5% of the optimally observed values, which may lead to more robust predictor sets. This led to a list of genes ranked by frequency of presence in the predictor sets for a subgroup of 68 drugs (pruned from 119 to avoid redundancy between similar drugs).
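The gene-ranking step can be illustrated with a short sketch. The text does not define the BW measure; the code assumes it is the ratio of between-class to within-class sums of squares for each gene, which is a common choice for ranking genes by class separation. The mixture model, the quadratic discriminant rule, and the stepwise search are not shown. Gene names, labels, and expression values are hypothetical.

```python
from statistics import mean


def bw_ratio(values, labels):
    """Between-class over within-class sum of squares for one gene
    (assumed definition of the 'BW measure')."""
    overall = mean(values)
    between = 0.0
    within = 0.0
    for cls in set(labels):
        group = [v for v, lab in zip(values, labels) if lab == cls]
        g_mean = mean(group)
        between += len(group) * (g_mean - overall) ** 2
        within += sum((v - g_mean) ** 2 for v in group)
    return float("inf") if within == 0 else between / within


def rank_genes(expression, labels):
    """Sort gene names from most to least discriminating."""
    scores = {gene: bw_ratio(vals, labels) for gene, vals in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical data: three resistant (R) and three sensitive (S) cell lines.
labels = ["R", "R", "R", "S", "S", "S"]
expression = {
    "GENE_A": [2.0, 2.2, 1.8, 0.1, -0.1, 0.0],   # separates the classes well
    "GENE_B": [0.5, -0.4, 0.1, 0.3, -0.2, 0.0],  # does not separate them
}
print(rank_genes(expression, labels))
```

In a leave-one-out setting this ranking would be recomputed on each training set of 59 cell lines, so that the held-out line never influences which genes are selected.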
In a further iteration, we compared the observed frequencies of presence in the predictor sets (see Figure 19) with those expected if the models were random. Only the top 9 positively and the top 12 negatively correlated genes, and the top 13 genes among all 343 genes, show significantly higher frequencies as predictors than expected from chance. Finally we used only these top-scoring genes (affecting multiple drugs each) to scan all possible combinations for predicting drug response, resulting in a refined model with the least number of highly predictive genes for each individual drug (Figure 20).
Results:
Expression of genes involved in growth factor signaling and other signaling pathways.
Basal mRNA expression of 343 genes was measured in the NCI-60 panel. In a previous independent experiment (27), we had measured 25 of these 343 genes with the same array method, using mRNA extracts from cells grown at different times and locations. Among the 12 genes with sufficiently robust expression to permit analysis, gene-gene Pearson correlation coefficients ranged from 0.3 to 0.78. Moreover, expression data were compared to previous results obtained with a cDNA array platform (23). Based on 144 probes for genes common between the two arrays, the average Pearson correlation coefficient between the 70-mer oligo and the cDNA array is 0.42, with 90 out of 144 probes with r > 0.30 (P < 0.05) - similar to a previous study comparing cDNA arrays (23) and the Affymetrix oligonucleotide HU6800 array (7, 24). These results indicate that the array results are robust, in particular with highly expressed genes, while requiring experimental validation as needed.
The expression of 107 genes with differential expression across the NCI-60 (SD >0.35) served to cluster the 60 cell lines (Supplemental Fig. 2). Further validating the array results, leukemia, colon cancer, melanoma, and renal cell carcinoma cells each clustered into groups, whereas breast, ovarian and lung cancer cells did not (23, 38).
Pearson correlation of NCI-60 gene expression and drug potency.
To assess the relationship between gene expression and cytotoxic potency of 119 chemotherapeutic drugs, Pearson correlation coefficients were calculated, together with statistical significance (bootstrap P value). On the basis of previous results (27), we used P = 0.001 as a stringent cut-off, which yielded results similar to those obtained with an absolute correlation coefficient value of > 0.4 (Figure 15). This resulted in 49 positively and 69 negatively correlated genes, many showing correlations with multiple drugs. Among the 245 genes in the three main families studied in this report, 53 had significant correlations with at least one drug, whereas in the comparator group of 98 genes, 16 qualified, most of them with fewer drug correlations.
We first focused on chemoresistance genes with significant negative correlations (Figure 5, Figure 16). Among the 69 genes encoding growth factors and their receptors, many showed significant negative correlations with multiple drugs, but were entirely lacking positive drug correlations; therefore, they are candidates as broad chemoresistance factors. Several genes were already known to induce chemoresistance, being involved in EGFR, ERBB2, VEGF, IGF, and PDGF signaling pathways (13). The number of significant negative drug correlations provides a crude measure of the relevance of each gene in mediating multi-drug resistance, yielding a rank order of CYR61, EGFR, IGFBP7, PDGFC, and ERBB2 (Figure 5). Both EGFR and ERBB2 are targets of current anticancer therapy, indicating that our approach identified known drug targets. However, other growth factors and receptors scored equally well, and therefore, represent interesting candidates for further study.
Among the metalloproteinases, MMP24, ADAM9, and the inhibitor TIMP2 ranked highly, with multiple negative correlations. Furthermore, small GTPases scored strongly, with seven genes showing negative correlations with 10 drugs or more, including ARHC, RRAS2, RAB5B, and RALB, consistent with their pervasive role in cellular signaling. Among the other signaling factors (98 genes involved in various signaling pathways, such as GPCRs and G protein subunits), considerably fewer genes produced multiple negative correlations (Figure 16), such as GNG10 and GNG11 with 15 and 7 negatively correlated drugs, respectively. Of 29 G-protein coupled receptors, only 4 showed strong negative correlation (P < 0.001) with 1 or 2 out of 119 drugs. This result suggests that these pathways are less germane to chemoresistance.
Genes with positive correlations, indicative of a possible role in chemosensitivity, are listed in Figure 5 (and Figure 17). Representation of growth factors and associated signaling proteins in this group is consistent with the dual nature of growth factor signaling, supporting either cell survival or apoptosis. Positively correlated genes are less prominent, with only RAB37 and RAN scoring against 9-10 drugs at P < 0.001. Among the other signaling pathways, phospholipases, G protein γ subunits (GNGs), and a phosphodiesterase scored highly (Figure 17). The mechanisms of a potential role in chemotherapy remain uncertain, each candidate gene requiring experimental validation. Therefore, we focus mainly on chemoresistance genes in this report.
Hierarchical cluster analysis of genes negatively correlated with at least one drug.
Clustering the NCI-60 cells against expression of the 69 negatively correlated genes (Figure 11) still resulted in the expected cell line clusters obtained from gene expression results alone (Figure 10). Moreover, the 69 genes clustered into two main groups, suggesting some level of coexpression. To explore this further, genes were compared pairwise to each other with respect to their expression across the NCI-60. This process again resulted in the same two main groups of genes (Figure 13), supporting coordinate expression between the two groups.
To determine whether the grouping of the negatively correlated genes is relevant to chemoresistance, we performed hierarchical cluster analysis of genes versus the drug panel, using gene-drug correlations as the distance measure. In this fashion, genes that affect the same set of drugs cluster together. This resulted again in the same two major groups of genes, and two large clusters of drugs (Figure 1). Moreover, drugs with similar functions tend to cluster together into smaller subgroups, such as tubulin-active anti-mitotic agents (taxol analogs, colchicine, vincristine sulfate and vinblastine sulfate), and topoisomerase I inhibitors (e.g., camptothecin and its derivatives). We next grouped the genes further by performing a gene-gene cluster analysis, on the basis of gene-drug correlations. In this analysis, each gene is compared to all others, using the correlation coefficients against the 119-drug panel. Remarkably, this analysis revealed a sharp contrast between the two main groups of genes observed in the preceding cluster experiments (Figure 13), showing that the two gene clusters potentially related to chemoresistance contain genes with similar expression patterns and drug correlations.
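The gene-gene cluster analysis on gene-drug correlation profiles can be sketched as follows. The linkage method and the exact distance are not stated in the text, so this sketch assumes average linkage and a distance of 1 minus the Pearson correlation between two genes' drug-correlation profiles. Gene names and profile values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_linkage_clusters(profiles, n_clusters):
    """Agglomerative clustering of genes by their drug-correlation profiles.

    profiles maps a gene name to its list of gene-drug correlation
    coefficients (one value per drug, same drug order for every gene).
    Assumed distance: 1 - Pearson r between two profiles.
    """
    names = list(profiles)
    dist = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            dist[frozenset((a, b))] = 1.0 - pearson_r(profiles[a], profiles[b])

    def linkage(c1, c2):
        # Average of all pairwise distances between the two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [[name] for name in names]
    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical correlation profiles of four genes against four drugs.
toy_profiles = {
    "G1": [-0.5, -0.4, 0.1, 0.2],
    "G2": [-0.6, -0.3, 0.0, 0.3],
    "G3": [0.2, 0.1, -0.5, -0.4],
    "G4": [0.3, 0.0, -0.6, -0.3],
}
groups = average_linkage_clusters(toy_profiles, n_clusters=2)
```

Genes whose negative correlations fall on the same drugs end up in the same group, which is the behavior the paragraph above describes for the two main gene clusters.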
Genes showing similar NCI-60 expression and drug potency profiles are likely to be part of the same signaling pathway. For example, ADAM9 (a metalloproteinase mediating release of membrane-tethered growth factors such as HB-EGF), EGFR, and the GTPase ARHC (RhoC) clustered within the same group in all analyses and hence may represent members of a signaling pathway relevant to chemoresistance for a portion of the drug panel. Network analysis indicates that EGFR and ADAM9 are close neighbors (to be published). Similarly, ERBB2 and RALB clustered together, implying a functional relationship. On the other hand, the close receptor homologues EGFR and ERBB2, presumed to have similar signaling pathways, clustered at some distance in either of the two main groups, suggesting that they serve independent functions across the NCI-60 panel.
EGFR and ERBB2: examples from the two gene groups with distinct drug correlations
The expression results of EGFR and ERBB2 were reproducible between 70-mer oligonucleotide and cDNA arrays (r = 0.80 and 0.60, respectively). Each receptor correlated negatively with a number of drugs, but unsupervised cluster analysis (Fig. 1) suggested that the correlated drug sets are distinct. Figure 2 (upper panel) displays the sorted Pearson correlation coefficients for EGFR versus the 119 anticancer drugs. Keeping the same order of the 119 drugs, a clearly distinct and even opposite profile is observed for ERBB2 correlation coefficients (Fig. 2, lower panel, correlation between EGFR and ERBB2 r = -0.25, P < 0.002, t test). This illustrates profound and unexpected differences in the interactions between EGFR and ERBB2 expression and cytotoxic drugs.
Synergism and antagonism between EGFR- and ERBB2-selective inhibitors, and classical cytotoxic drugs.
We tested the potency of, and interactions between, the EGFR inhibitor AG1478 and the ERBB2 inhibitor AG825 in cell lines with different levels of EGFR and ERBB2 expression. Figure 3A shows the mRNA levels of EGFR and ERBB2 in four cancer cell lines based on our array data, which are consistent with reported protein levels (http://dtp.nci.nih.gov/mtweb/targetinfo?moltid=MT1173&moltnbr=::813) determined by Western hybridization (Fig. 3B). Genotyping of 37 cell lines of the NCI-60 had revealed a lack of activating mutations for EGFR in exons 19 and 21 (39), while mutational status remains unknown for ERBB2. In addition to AG1478 and AG825, we included the conventional anticancer drugs paclitaxel, cisplatin, and camptothecin 10-OH in drug-drug interaction studies. The Pearson correlation coefficients between EGFR and paclitaxel, cisplatin, and camptothecin 10-OH were -0.43, -0.14, and -0.08, respectively. ERBB2 correlated with the three drugs at -0.03, -0.33, and -0.13, respectively. Synergism or antagonism was determined by calculation of the combination index (CI) versus fraction affected (36). For example, AG1478 and camptothecin 10-OH were synergistic in SK-MEL-2 cells (CI < 1, Fig. 4A). In contrast, antagonism was observed (CI > 1, Fig. 4B) between AG1478 and cisplatin in SK-MEL-2 cells. AG1478 and AG825 did not act synergistically in the 4 cell lines tested. Cytotoxic effects of all dual drug combinations in four cell lines are listed in Figure 6, using the combination index (CI). For some fixed-ratio drug combinations, the CI value varied over different levels of growth inhibition (fraction affected, Fa). Synergistic effects were observed between AG1478 or AG825 and paclitaxel, cisplatin or CPT, 10-OH in some cells, whereas antagonism occurred in others, with no clear relationship to EGFR or ERBB2 expression. Furthermore, the gene-drug correlations for EGFR and ERBB2 did not predict synergism or antagonism when the inhibitor of EGFR (AG1478) or ERBB2 (AG825) was combined with cytotoxic anticancer drugs in the four cell lines tested. Whereas the EGFR inhibitor AG1478 potentiated camptothecin 10-OH (CPT, 10-OH), no significant correlation was observed between EGFR and CPT, 10-OH (r = -0.08).
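The CI calculation referred to above can be illustrated with a minimal sketch. The text cites reference 36 without reproducing the formula, so the sketch assumes the widely used median-effect form, in which the dose of a single drug producing fraction affected Fa is Dm * (Fa / (1 - Fa))^(1/m), and CI is the sum of the two ratios of combined dose to equally effective single dose. Function names, parameters, and numbers are hypothetical.

```python
def dose_for_effect(fa, dm, m):
    """Single-drug dose giving fraction affected fa (median-effect form).

    dm is the median-effect dose (dose at fa = 0.5) and m the slope.
    Assumption: this form is used because the source does not state one.
    """
    return dm * (fa / (1.0 - fa)) ** (1.0 / m)


def combination_index(fa, d1, d2, dm1, m1, dm2, m2):
    """CI for doses d1 and d2 that together produce fraction affected fa."""
    return d1 / dose_for_effect(fa, dm1, m1) + d2 / dose_for_effect(fa, dm2, m2)


def interpret(ci):
    """CI < 1 indicates synergism and CI > 1 antagonism, as in the text."""
    if ci < 1.0:
        return "synergism"
    if ci > 1.0:
        return "antagonism"
    return "additive"


# Hypothetical example: each drug alone needs 2.0 or 4.0 units for Fa = 0.5,
# but the combination reaches Fa = 0.5 with 0.5 and 1.0 units.
ci = combination_index(fa=0.5, d1=0.5, d2=1.0, dm1=2.0, m1=1.0, dm2=4.0, m2=1.0)
label = interpret(ci)
```

Because the single-drug doses depend on Fa, evaluating `combination_index` at several Fa values reproduces the observation that CI can vary over different levels of growth inhibition.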
Prediction of chemosensitivity using gene expression profiles and a learning algorithm
For this analysis, 68 drugs out of the 119-drug panel were selected to avoid redundancy and bias stemming from compounds with similar chemical structure, mechanism of action, and potency. Different predictive strategies were performed using all 343 genes, or only the 69 negatively or 49 positively correlated genes with P < 0.001 against at least one drug (Figure 18). Overlap between the positively and negatively correlated genes is minimal. Using a heuristic leave-one-out strategy (see supplemental methods), we identified predictive sets of genes. For each set of predictive genes, the minimum set of genes was determined within 5% of the optimal prediction of the entire set.
Prediction accuracy ranged mainly from 0.6 - 0.9, usually with 5-10 genes selected as predictors. Using all 343 genes or only the 69 negatively correlated genes yielded similar results in most cases, while only the 49 positively correlated genes tended to score somewhat lower. As not all possible combinations of genes can be tested in our heuristic algorithm with a large number of genes, better scoring gene sets may well exist.
Several genes recur more frequently as predictors of drug potency (Figure 19). The top genes with negative correlations include TGFBR3, RAB6A, RALB, TIMP2, ARHC, RAB17 and ARF4, while the top positive genes are RAB37, RAC2, PLCL2, PDE1B, PLCD4 and ADAM12. It is noted that these genes are not always the highest-ranking genes sorted by number of highly correlated drugs (Figure 5 and Figures 16 and 17). It remains to be determined which measure is more accurate. Further tests will be needed to determine whether the selected genes are functionally most relevant. To limit the effect of the heuristic approach on the prediction models, we reduced the number of candidate predictor genes to those appearing most frequently in predictive sets of drug potency. Only the top-scoring 9 positively and 12 negatively correlated genes, and 13 genes among all 343 genes showed significantly higher frequencies as predictors than expected from random occurrence (Figure 7). The 13 genes with both positive or negative correlations overlapped partially with those obtained from predictions using only negatively or positively correlated genes, suggesting that sampling variability is kept at a reasonable level. The small subset of genes, including TGFBR3, FGF19, FGFR2, TIMP2, RAB6A, ARHC (negative correlations), and PLCL2, ADAM12, MMPL1, RAB37, RAC2, RAB39B (positive), was found to be highly predictive for many anticancer drugs (Figure 7). RAB39B appeared in many predictive models because of multiple positive correlations (maximally r = 0.47), even though the lowest P value was only 0.002. Predictive accuracy for only these highest scoring genes is listed in Figure 8 (and Figure 20), with optimal predictions derived from all possible gene combinations for each drug. This improved the prediction accuracy, especially for those drugs with previously low prediction values, resulting from the heuristic approach used to detect predictor genes.
Figure 21 shows examples of predictive gene sets for individual compounds. Hence, starting from 343 genes in this study, we have identified a small subset of genes yielding good predictions for a majority of drugs.
Discussion:
We have evaluated the role of three gene families related to growth factor signaling in chemoresistance. Comparing basal gene expression patterns with potency of 119 standard anticancer drugs in the NCI-60 panel revealed numerous significant gene-drug correlations (P < 0.001; r > 0.4). Strong negative gene-drug correlations were observed for genes already known to be involved in chemoresistance, including growth factors and receptors, metalloproteinases, and GTPases, which overall scored more frequently and higher than genes encoding other signaling factors tested for comparison (e.g., GPCRs). In addition, novel candidate genes were revealed that scored at least as strongly as the known chemoresistance factors. It is implicitly acknowledged that basal mRNA expression profiles yield only a partial window onto all factors that determine drug responses. Moreover, gene expression profiles relevant to growth factor signaling could have been associated with tumor cell lines inherently resistant or sensitive to drugs because of other factors. Nevertheless, where significant correlations are found, these can then be further validated by other means. The identification of many known drug resistance factors validates our approach.
Growth factors and chemoresistance
EGFR and ERBB2 are chemoresistance factors (40-42) showing negative correlation with multiple drugs in this study. Consistent with its negative correlation with EGFR (r = -0.4), paclitaxel was shown to act synergistically with the EGFR inhibitor ZD1839 in vitro and against xenografts of human renal cancer SKRC-49 (43). Similarly, ERBB2 displays multiple negative correlations, with ERBB2/PI-3K/Akt signaling conveying multi-drug resistance (44). Moreover, Herceptin® (ERBB2/HER2 antibody) causes chemosensitivity in animal models and clinical studies (41, 42, 45). Showing significant negative correlations to multiple drugs, CYR61 is an integrin receptor αvβ3 ligand converging downstream on heregulin-ERBB2/3/4 receptor-mediated signaling (46). CYR61 was included because of its association with breast cancer chemoresistance (47), converging on growth factor signaling through the NF-kappaB/XIAP pathway (48). Furthermore, vascular endothelial growth factor-165 receptor (VEGF165R), showing strong negative drug correlations, had been shown to be involved in tumor angiogenesis, progression, chemoresistance, and poor prognosis (49, 50).
Our results also point to potentially novel growth factors and receptors involved in chemoresistance. TGFBR3 scored highest as a predictor for chemoresistance. While devoid of serine/threonine kinase activity (51), TGFBR3 appears to be a necessary component of the TGFβ receptor signaling complex (52), and therefore may represent an interesting target for cancer treatment, or a predictor of treatment response. Additional growth factors implicated by negative correlations include FGF17, 18, and 19, IGF2, and NRG1. Their relevance to chemoresistance needs to be validated in each case.
Metalloproteases and chemoresistance
Further putative chemoresistance factors include members of the matrix metalloproteinase family, such as ADAM9 (a disintegrin and metalloproteinase domain 9), which is negatively correlated with many drugs in the NCI-60. ADAM9 is highly expressed in hepatocellular carcinoma (53), and in pancreatic ductal adenocarcinomas where cytoplasmic expression is correlated with poor prognosis (54). Furthermore, ADAM9 is part of the signaling cascade evading apoptosis induced by cytotoxic drugs (55), possibly by mediating release of heparin-binding EGF-like growth factor (HB-EGF) (56). Expression and chemoresistance profiles in the NCI-60 indicate that ADAM9 and EGFR are functionally interacting. Several other metalloproteinases and inhibitors (Table 1), such as TIMP2 and MMP24, also showed strong negative correlations with multiple anticancer drugs, and moreover, served as predictors of drug potency. Their possible role in chemoresistance needs to be further validated.
GTPases and chemoresistance
Among the 92 GTPases tested, several were negatively correlated with multiple drugs (e.g., ARHC, RAB5B, ARF4, RRAS2/TC21, RAB6A, RALB). A few GTPases showed multiple positive correlations (e.g., RAB37, RAN, RAC2, RAB39B), indicating that signaling networks can have dual outcomes, promoting either apoptosis or survival. RAB6A and RALB (negative), and RAB37 and RAC2 (positive) were heavily represented in gene panels predictive of drug sensitivity. Previous work supports a role of GTPases in cell transformation and survival (consistent with negative drug correlations). Ras couples extracellular signaling to downstream RAF/MEK/ERK and PI3-K/AKT cascades (57), activation of which via EGFR contributes to BCNU resistance in gliomas (58). Also negatively correlated with multiple drugs, TC21/RRAS2 mediates transformation of cancer cells involving phosphatidylinositol 3-kinase (PI3-K) (59, 60), and is activated by growth factors (61), including FGF1 and FGF2, shown to convey chemoresistance (62). Similarly implicated in our study, and a known key modulator of cell proliferation (63), RALB clusters together with ERBB2 by gene expression and drug potency correlations. Lastly, ARHC (RhoC) promotes tumor metastasis (64), and appears to contribute to chemoresistance via growth factor signaling (65).
Negative correlations further point to a series of GTPases not yet directly implicated in chemoresistance, such as ARF4, RHEB2, RAB5B, RAB6A, RAB18 and RAB32 (Figure 5). Further studies are needed to validate these genes as chemoresistance factors.
Chemosensitivity genes
Positive gene-drug correlations imply a possible role in chemosensitivity (Figure 5, Figure 17), for example RAB37, which shows multiple drug correlations. RAB39B is a member of the ras oncogene family, but displayed exclusively positive correlations (highest r = 0.47 with iproplatin); this observation requires further study. ARHGDIA encodes a Rho GDP dissociation inhibitor, and is predictive as a chemosensitivity factor for 14 drugs, possibly by regulating Rho activity. IGFALS is an insulin-like growth factor-binding protein (acid labile subunit), which complexes IGF and IGFBP3 into a 150 kD aggregate (see OMIM, 601489). This could account for the positive correlation and predictive power for 15 drugs. EGFL4 and EGFL5 encode EGF-like polypeptides containing multiple EGF repeats, and function in cell adhesion, but their physiological role remains uncertain.
EGFR and ERBB2 interactions
EGFR and ERBB2 belong to different gene clusters with distinct expression patterns, which indicated that different mechanisms of drug resistance might be involved. This was surprising as EGFR and ERBB2 are coexpressed in some tumors and can heterodimerize (9).
Although ZD1839 (Iressa) in combination with trastuzumab (Herceptin) acted synergistically against human breast cancer cell growth (66), no synergism between the EGFR inhibitor AG1478 and the ERBB2 inhibitor AG825 was observed in the current study. In addition, the combined effects of AG1478 or AG825 with cytotoxic anticancer drugs in different cell lines were not directly related to the expression level of EGFR or ERBB2. Indeed, overexpression of ERBB2 in cancer cells could result in either chemoresistance or chemosensitivity for different anticancer drugs (67). The mechanism of the combined effects appears to be context-dependent and is not currently predictable. A confounding factor is the lack of specificity of the inhibitors. For example, AG1478 not only inhibits EGFR, but could also inhibit ERBB4 (68), and parallel signaling pathways can bypass the block. Inhibition of growth factor signaling at different junctions might improve anticancer potency, as shown with combined inhibition of both mutated EGFR and PI3K (69). As another confounding factor, activating mutations in EGFR predict clinical response to the EGFR inhibitor gefitinib in lung cancer patients (39, 70); however, cell lines SK-OV-3 and SK-MEL-2 (39) used here do not carry EGFR mutations. Whereas members of the ERBB receptor family are mediators of cell survival, ERBB receptors might induce cell death under some circumstances (71, 72).
In this study, strong negative correlations of EGFR and ERBB2 with multiple drugs, implicating chemoresistance, failed to predict synergism in some cases. The mechanisms of observed synergism and antagonism require further study to facilitate design of effective combination treatments.
Prediction of drug potency
A promising avenue is the use of biomarkers to predict anticancer drug response (3, 26). Here we have used expression of growth factor signaling genes for prediction of cytotoxic potency in the NCI-60. For each drug, we have evaluated several predictive models, using only negatively or positively correlated genes, or both. For example, EGFR, ERBB2, RRAS2, ARHC, RALB, ARF4, CYR61 and ADAM9, all negatively correlated with multiple anticancer drugs, have predictive power for many drugs (Figure 19). When both positively and negatively correlated genes are used, RAB37, RAB6A, RAC2, PLCL2, and TGFB3 (Figure 19) have the highest predictive value for multiple drugs. When we used only the top scoring genes (9 from positively and 12 from negatively correlated genes, and 13 genes selected from all 343 genes) in prediction analyses, this small set of genes was sufficient and even exceeded predictions with larger sets. Compared with previous studies using expression of many genes, the attained accuracy is at least as good. This result indicates that random sampling noise poses a problem to selecting optimally predictive gene sets. By focusing on gene families known to function in chemoresistance, and exhaustively testing through all possible combinations, we have improved prediction of drug potency in the NCI-60. These genes are potential key factors in chemoresistance and sensitivity, and could serve as novel drug targets per se. In future studies, we will extend this approach (progressive selection of candidate genes) to expression arrays covering the entire genome. Prediction of drug potencies in vitro may not be directly applicable to in vivo therapy; however, the present study guides the selection of biomarkers in cancer chemotherapy.
Example 2:
Using gene expression data as predictors of cytotoxic potency.
We developed predictive models for drug response against the 60 cancer cell lines based on gene expression profiles. For each compound we separated cell lines into training and test samples and used a leave-one-out cross-validation procedure to assess the accuracy of candidate models. Our method contains four main steps. First, we estimate the probability of the training cell lines to be resistant to a given compound. Next, in order to select predictive genes, we sort genes by their ability to discriminate between the two classes (resistant and sensitive), using the BW measure (4). In the third step we use a quadratic discriminant rule to predict resistance of the test cell lines based on top-scoring genes (by number of drugs correlated with a gene) according to the previous sorting criteria, and identify the set of candidate predictor genes capable of producing the maximal percentage of correct classification of the 60 cell lines for the given drug. Finally, we prune the sets of candidate predictor genes for each drug, in order to obtain highly predictive models with a minimal number of genes.
Step A. Cytotoxic drug potency in training cell lines.
To determine the probability of the training cell lines to be resistant to a given compound, we model drug response levels ($-\log_{10}(GI_{50})$) as a mixture of two normal distributions, one for the resistant and one for the sensitive cell lines. This general modeling framework is a flexible scheme even for multi-type classification. Thus, the distribution of drug responses of the n training cell lines is defined by (see Figure 9):
$$F(X_i \mid \theta) = \pi_1 f(X_i \mid \mu_1, \sigma_1) + \pi_2 f(X_i \mid \mu_2, \sigma_2) \qquad (1)$$
where $f(X \mid \mu_1, \sigma_1)$ and $f(X \mid \mu_2, \sigma_2)$ are the normal density functions of the resistant/sensitive cell lines and $\pi_1$, $\pi_2 = 1 - \pi_1$ represent the probabilities of a cell line to be resistant or sensitive. The maximum likelihood estimates (MLE) for the parameters $\mu_1, \sigma_1, \mu_2, \sigma_2$ are obtained using the EM algorithm. Let $Z_i$ denote the unknown type of cell line i, 1 if the cell line is resistant and 0 otherwise. Thus, the complete data is $\{(X_i, Z_i),\ i = 1, \ldots, n\}$, and the corresponding complete data likelihood is:
where $I(Z_i = c)$ is the indicator function, taking value 1 if $Z_i = c$ and 0 otherwise.
The EM iterates for the parameters are given by:
where $E(I(Z_i = c))$ is evaluated under the current parameter estimates and $c = 1$ or $2$. For the starting iteration, the mixture weights can be initialized to 1/2, and K-means with 2 clusters can be used for $\theta_1$ and $\theta_2$.
Once the EM estimators for the parameters $\pi_1, \mu_1, \sigma_1, \mu_2, \sigma_2$ are obtained, then the probability of cell line i to be resistant can be computed by:
The class assignment for the test cell lines will be based on the predictive probabilities of class membership (p, 1- p) (3).
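Step A can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the exact procedure of the example: it uses the textbook EM updates for a two-component normal mixture, takes the component with the lower mean of -log10(GI50) as the resistant class, starts from a median split as a stand-in for 2-cluster K-means, and floors the standard deviations to avoid collapse. All names and data are hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def resistant_probabilities(x, pi1, mu1, sigma1, mu2, sigma2):
    """E-step: posterior probability that each cell line is in component 1."""
    probs = []
    for v in x:
        a = pi1 * normal_pdf(v, mu1, sigma1)
        b = (1.0 - pi1) * normal_pdf(v, mu2, sigma2)
        probs.append(a / (a + b) if a + b > 0.0 else 0.5)
    return probs


def fit_two_normals(x, n_iter=200, tol=1e-9, min_sigma=1e-3):
    """EM fit of a two-component normal mixture to drug responses x."""
    n = len(x)
    xs = sorted(x)
    low, high = xs[: n // 2], xs[n // 2:]

    def mean_sd(values):
        m = sum(values) / len(values)
        sd = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
        return m, max(sd, min_sigma)

    # Assumed initialisation: equal weights, median split instead of K-means.
    pi1 = 0.5
    (mu1, sigma1), (mu2, sigma2) = mean_sd(low), mean_sd(high)
    for _ in range(n_iter):
        p = resistant_probabilities(x, pi1, mu1, sigma1, mu2, sigma2)
        w1 = sum(p)
        w2 = n - w1
        if min(w1, w2) < 1e-12:
            break  # one component has vanished; keep the last estimates
        # M-step: weighted means, standard deviations and mixture weight.
        new_mu1 = sum(r * v for r, v in zip(p, x)) / w1
        new_mu2 = sum((1.0 - r) * v for r, v in zip(p, x)) / w2
        sigma1 = max(math.sqrt(sum(r * (v - new_mu1) ** 2 for r, v in zip(p, x)) / w1), min_sigma)
        sigma2 = max(math.sqrt(sum((1.0 - r) * (v - new_mu2) ** 2 for r, v in zip(p, x)) / w2), min_sigma)
        pi1 = w1 / n
        shift = abs(new_mu1 - mu1) + abs(new_mu2 - mu2)
        mu1, mu2 = new_mu1, new_mu2
        if shift < tol:
            break
    p = resistant_probabilities(x, pi1, mu1, sigma1, mu2, sigma2)
    return pi1, (mu1, sigma1), (mu2, sigma2), p


# Hypothetical -log10(GI50) values: five resistant-like, five sensitive-like.
responses = [4.0, 4.1, 3.9, 4.2, 3.8, 7.0, 7.1, 6.9, 7.2, 6.8]
pi1, resistant, sensitive, p_resistant = fit_two_normals(responses)
```

The returned `p_resistant` values play the role of the per-cell-line probabilities p used as weights in the following steps.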
Step B. Ranking genes according to their ability to discriminate between resistant and sensitive cell lines.
Denote by $X_{ji}$ the expression level of gene j in cell line i. The ability of each gene j to discriminate between the two classes (resistant and sensitive) is measured by the ratio:
$$BW(j) = \frac{BSS_j}{WSS_j} = \frac{\sum_i p_i (\bar{x}_{j1} - \bar{x}_j)^2 + \sum_i (1 - p_i)(\bar{x}_{j2} - \bar{x}_j)^2}{\sum_i p_i (X_{ji} - \bar{x}_{j1})^2 + \sum_i (1 - p_i)(X_{ji} - \bar{x}_{j2})^2} \qquad (4)$$
where summation is over the training cell lines, $p_i$ is the previously determined predictive probability of sample i for being in the resistant class, $\bar{x}_{j1}$ and $\bar{x}_{j2}$ are estimates of the mean expression level for the two classes (resistant or sensitive) and $\bar{x}_j$ denotes the average expression level of gene j across all samples. Genes are sorted according to the BW measure (4), and the top scoring genes are selected for the remaining analysis.
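The ranking in Step B can be sketched as below. The sketch assumes that the BW measure is the ratio of between-class to within-class sums of squares, weighted by the resistance probabilities p, and that the class means are the corresponding probability-weighted averages; the source equations are only partly legible, so both are assumptions. Gene names and values are hypothetical.

```python
def weighted_class_means(expr, p):
    """Probability-weighted mean expression for the resistant and sensitive class.

    Both classes must carry non-zero total weight.
    """
    w1 = sum(p)
    w2 = sum(1.0 - q for q in p)
    m1 = sum(q * x for q, x in zip(p, expr)) / w1
    m2 = sum((1.0 - q) * x for q, x in zip(p, expr)) / w2
    return m1, m2


def bw_ratio(expr, p):
    """Between/within sum-of-squares ratio for one gene (assumed form)."""
    m1, m2 = weighted_class_means(expr, p)
    overall = sum(expr) / len(expr)
    bss = sum(q * (m1 - overall) ** 2 + (1.0 - q) * (m2 - overall) ** 2 for q in p)
    wss = sum(q * (x - m1) ** 2 + (1.0 - q) * (x - m2) ** 2 for q, x in zip(p, expr))
    return bss / wss


def rank_genes(expression, p):
    """Gene names sorted from most to least discriminating."""
    return sorted(expression, key=lambda g: bw_ratio(expression[g], p), reverse=True)


# Hypothetical data: gene A separates the classes, gene B does not.
p = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
expression = {
    "A": [1.0, 1.2, 0.8, 3.0, 3.2, 2.8],
    "B": [1.0, 3.0, 2.0, 1.1, 2.9, 2.0],
}
ranking = rank_genes(expression, p)
```

With hard 0/1 probabilities this reduces to the usual two-class BW ratio; fractional p values let uncertain cell lines contribute to both classes.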
Step C. Predicting drug resistance in test cell lines.
We use a quadratic discriminant rule to predict the class (sensitive or resistant) of a testing cell line: if $(V_1, \ldots, V_k)$ are the expression levels of a selected set of k predictor genes for a testing cell line, then the cell line is declared resistant if
where $\bar{x}_{j1}$ and $\bar{x}_{j2}$ are defined in (5) and
are estimates for the variance of expression levels of the jth selected gene in the training cell lines.
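A quadratic discriminant rule of the kind used in Step C can be sketched as follows. The inequality itself is not legible in the source, so this sketch assumes the common diagonal form: genes are treated as independent, each class has its own per-gene mean and variance, class priors are ignored, and the cell line is assigned to the class with the smaller score. All names and numbers are hypothetical.

```python
import math


def class_score(v, means, variances):
    """Sum over genes of squared standardized distance plus log variance."""
    return sum((x - m) ** 2 / s2 + math.log(s2)
               for x, m, s2 in zip(v, means, variances))


def predict_resistant(v, means_r, vars_r, means_s, vars_s):
    """True if expression vector v is closer to the resistant class."""
    return class_score(v, means_r, vars_r) < class_score(v, means_s, vars_s)


# Hypothetical class estimates for two predictor genes.
means_r, vars_r = [1.0, 2.0], [0.25, 0.25]
means_s, vars_s = [3.0, 4.0], [0.25, 0.25]
call_a = predict_resistant([1.2, 2.1], means_r, vars_r, means_s, vars_s)
call_b = predict_resistant([2.9, 3.8], means_r, vars_r, means_s, vars_s)
```

Because each class keeps its own variances, the decision boundary is quadratic rather than linear: a gene with equal class means can still discriminate if one class is much more variable than the other.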
Step D. Prediction accuracy.
We use a leave-one-out cross-validation approach in order to assess the prediction accuracy of a model based on a given set of k top scoring (4) genes. We search, stepwise, for a subset of genes that improves the percentage of correct classifications using the quadratic discriminant rule, by adding one new gene at each step. We stop once the prediction accuracy cannot be improved.
The final set of predictor genes is then selected as the smallest set of genes producing a model with prediction accuracy within 5% of the highest percentage of correct classifications obtained in the stepwise search described above. This reduces the number of predictor genes, by accepting predictor sets within 5% of the optimally observed values, which may lead to more robust predictor sets. In this way we can generate a list of genes ranked by frequency of presence in the predictor sets for a subgroup of 68 drugs (pruned from 119 to avoid redundancy between similar drugs). In a further iteration, we compared the observed frequencies of presence in the predictor sets (see Figure 19) with the ones one would expect if these models were random. Only the top 9 genes among positively and 12 among negatively correlated genes, and 13 genes among all 343 genes showed significantly higher frequencies as predictors than expected. Finally, we used only these top-scoring genes (affecting multiple drugs each) to scan all possible combinations for predicting drug response, resulting in a refined model with the least number of highly predictive genes (Figure 20).
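The stepwise search and the pruning rule of Step D can be sketched as below. The sketch assumes a greedy search that, at each step, adds the candidate gene giving the highest accuracy, and it reads "within 5%" as an absolute difference of 0.05 in the fraction of correct classifications; neither detail is stated explicitly in the text. The `accuracy_of` callable is hypothetical and is assumed to return the leave-one-out accuracy for a list of genes.

```python
def forward_selection(candidate_genes, accuracy_of):
    """Greedy stepwise search; returns [(gene_subset, accuracy), ...] per step."""
    selected = []
    remaining = list(candidate_genes)
    history = []
    best = 0.0
    while remaining:
        gene, acc = max(((g, accuracy_of(selected + [g])) for g in remaining),
                        key=lambda item: item[1])
        if acc <= best:
            break  # accuracy cannot be improved by adding another gene
        selected.append(gene)
        remaining.remove(gene)
        best = acc
        history.append((tuple(selected), acc))
    return history


def prune(history, tolerance=0.05):
    """Smallest subset whose accuracy is within tolerance of the best one."""
    if not history:
        return (), 0.0
    best = max(acc for _, acc in history)
    for subset, acc in history:  # history is ordered by increasing subset size
        if acc >= best - tolerance:
            return subset, acc


# Hypothetical accuracy table standing in for leave-one-out results.
toy_accuracy = {
    ("A",): 0.70, ("B",): 0.60, ("C",): 0.55,
    ("A", "B"): 0.72, ("A", "C"): 0.80, ("A", "C", "B"): 0.80,
}
steps = forward_selection(["A", "B", "C"], lambda genes: toy_accuracy.get(tuple(genes), 0.0))
final_set, final_accuracy = prune(steps)
```

A greedy search of this kind does not test every gene combination, which matches the remark in the text that better-scoring gene sets may exist and motivates the later exhaustive scan over the reduced gene list.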
Gene expression levels in the NCI-60 cell lines:
The precision and accuracy of the array results depend strongly on expression level, with high levels commonly yielding high correlations between array platforms (1). Spurious results from genes with low levels of expression across the NCI-60 could jeopardize the gene-drug correlation analysis. However, robust gene-drug correlations depend on strong gene expression in a portion of the cell lines of the NCI-60 (1), indicating that the selected gene-drug correlations in this report are likely to come from robust expression data. To confirm that the selected genes were expressed at relatively high levels, we searched the NCI-60 gene expression database (http://symatlas.gnf.org/SymAtlas/, Affymetrix U133A) to determine the absolute gene expression levels across the NCI-60 cell lines. Average gene expression values for all probes on each array in the dataset were 200. For the 69 genes that are negatively correlated with at least one drug at P < 0.001, only 11 showed expression lower than 200 (average value) in all NCI-60 cell lines, and for 3 genes no data are available. Gene expression levels were also evaluated for the top-scoring 12 negatively and 9 positively correlated genes, and 13 genes selected from all 343 genes (Figure 7) showing significantly higher frequencies as predictors than expected. For the 12, 9 and 13 genes, there are only 1, 1, and 0 genes, respectively, with expression lower than 200 in all the cell lines. Therefore, most genes important for the prediction analysis were expressed at above average levels in at least a portion of the NCI-60 cell lines.

Claims

What is claimed is:
1. An array for determining the chemosensitivity of a cancer cell to a particular agent, comprising a plurality of polynucleotide probes designed to be complementary to and hybridize under stringent conditions with a target region of at least one gene listed in one of Figures 5, 7, 16, 17, 19, or 21, wherein at least one of the polynucleotide probes is a control probe.
2. The array of claim 1, wherein the polynucleotide probes are immobilized on a substrate.
3. The array of claim 2, wherein the polynucleotide probes are between 10 and 80 nucleotides in length.
4. The array of claim 3 wherein the polynucleotide probes are 70 nucleotides in length.
5. The array of claim 4 wherein the polynucleotide probes are selected from the group consisting of oligonucleotides, cDNA molecules, and synthetic gene probes comprising nucleobases.
6. The array of claim 1, comprising at least 10 control probes and at least 10 polynucleotide probes designed to be complementary to and hybridize under stringent conditions with a target region of at least one gene listed in one of Figures 5, 7, 16, 17, 19, or 21.
7. A method for detecting a chemosensitivity gene expression profile of a cancer cell, comprising hybridizing at least one target nucleic acid from a sample containing the cancer cell to an array of polynucleotide probes immobilized on a surface, said array comprising a plurality of polynucleotide probes, at least one of which is a control probe, and wherein at least one of said polynucleotide probes is complementary to a target region of at least one chemosensitivity gene listed in one of Figures 5, 7, 16, 17, 19, or 21; and quantifying the hybridization of said target nucleic acids to said array, wherein the expression profile of the cell provides an indication of the likely chemosensitivity or chemoresistance of the cells to a variety of different cytotoxic agents.
8. The method of claim 7, wherein said array comprises mismatch control polynucleotide probes.
9. The method of claim 8, wherein said quantifying comprises calculating the difference in hybridization signal intensity between each of said polynucleotide probes and its corresponding mismatch control probe.
10. The method of claim 9, wherein said quantifying comprises calculating the average difference in hybridization signal intensity between each of said polynucleotide probes and its corresponding mismatch control probe for each gene.
11. The method of claim 7, wherein said plurality of polynucleotide probes is 100 or more.
12. The method of claim 7, wherein for each target region of at least one chemosensitivity gene, said array comprises at least 10 different polynucleotide probes complementary to a target region of each chemosensitivity gene.
13. The method of claim 7, wherein said oligonucleotides are from 15 to 100 nucleotides in length.
14. The method of claim 7, wherein said oligonucleotides are 70 nucleotides in length.
15. The method of claim 7, wherein said pool of target nucleic acids is a pool of mRNAs.
16. The method of claim 7, wherein said pool of target nucleic acids is a pool of RNAs in vitro transcribed from a pool of cDNAs.
17. The method of claim 7, wherein said pool of target nucleic acids is amplified from a biological sample by an in vivo or an in vitro method.
18. The method of claim 7, wherein said pool of target nucleic acids comprises fluorescently labeled nucleic acids.
19. The method of claim 7, wherein each different polynucleotide probe is localized in a predetermined region of said surface, and the density of said different polynucleotide probes is greater than about 60 different polynucleotide probes per 1 cm².
20. The method of claim 7, comprising the step of comparing the pattern of chemosensitivity gene expression with gene-drug correlations shown in Figures 5, 7, 16, 17, 19, or 21 to identify matches between the genes expressed in the cells and genes that correlate with chemosensitivity or chemoresistance.
21. A method for predicting the effect of a cytotoxic agent on a cancer cell obtained from a mammalian subject, comprising hybridizing a sample containing target nucleic acids obtained from a cancer cell from a mammalian subject to an array of polynucleotide probes immobilized on a surface, said array comprising a plurality of different polynucleotide probes, at least one of which is a control probe, and wherein at least one of said polynucleotide probes is complementary to a target region of at least one chemosensitivity gene listed in one of Figures 5, 7, 16, 17, 19, or 21; and quantifying the hybridization of said nucleic acids to said array, wherein the expression profile of the cells provides an indication of the chemosensitivity or chemoresistance of the cells to a variety of different cytotoxic agents.
22. The method of claim 21, comprising the step of comparing the pattern of chemosensitivity gene expression with the gene-drug correlations listed in Figures 5, 7, 16, 17, 19, or 21 to identify matches between the genes expressed in the cells and genes that correlate with chemosensitivity or chemoresistance.
23. A method of identifying and characterizing an agent that modulates the expression or activity of one or more chemosensitivity genes, comprising: exposing a culture of mammalian cells to said candidate agent; determining the effect of the candidate agent on expression of one or more chemosensitivity genes listed in one of Figures 5, 7, 16, 17, 19, or 21.
24. The method of claim 23 wherein the effect of the candidate agent on transcription of chemosensitivity genes is determined by measuring the levels of transcripts of said chemosensitivity genes in said cells.
25. The method of claim 23 wherein the levels of transcripts are measured using an array that comprises polynucleotide probes that hybridize with at least 10 chemosensitivity gene transcripts, wherein not more than 100 polynucleotide probes are complementary to genes that do not influence chemosensitivity.
26. The method of claim 23 wherein the array comprises 10 or more of said oligonucleotides.
27. The method of claim 23 wherein the oligonucleotides comprise polynucleotide probes designed to be complementary to, or hybridize under stringent conditions with, 10 or more chemosensitivity genes listed in one of Figures 5, 7, 16, 17, 19, or 21.
28. The method of claim 23 wherein the oligonucleotides comprise nucleotide probes designed to be complementary to, or hybridize under stringent conditions with, target regions of 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, or more chemosensitivity genes listed in one of Figures 5, 7, 16, 17, 19, or 21.
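
Illustrative note on claims 9 and 10: the quantifying step recited there (the difference in hybridization signal intensity between each probe and its corresponding mismatch control probe, averaged for each gene) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the applicants' implementation: the function name average_difference, the (gene, probe intensity, mismatch intensity) tuple layout, the gene names, and the intensity values are all hypothetical and do not come from the claims or figures.

```python
from collections import defaultdict
from statistics import mean


def average_difference(probe_pairs):
    """Return {gene: mean(probe - mismatch)} over all probe pairs of each gene.

    probe_pairs: iterable of (gene, probe_intensity, mismatch_intensity)
    tuples. This layout is an assumption made for illustration only.
    """
    diffs = defaultdict(list)
    for gene, probe_signal, mismatch_signal in probe_pairs:
        # Claim 9: per-probe difference against its mismatch control.
        diffs[gene].append(probe_signal - mismatch_signal)
    # Claim 10: average of those differences for each gene.
    return {gene: mean(values) for gene, values in diffs.items()}


# Hypothetical intensities in arbitrary units; gene names are placeholders.
pairs = [
    ("GENE_A", 1200.0, 300.0),
    ("GENE_A", 1000.0, 200.0),
    ("GENE_B", 400.0, 350.0),
    ("GENE_B", 500.0, 450.0),
]
scores = average_difference(pairs)
```

In this sketch a gene whose probes hybridize well above their mismatch controls receives a large positive average difference, while a gene with mostly nonspecific signal stays near zero. Normalization and background correction, which a real analysis would likely need, are deliberately left out.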
EP06736375A 2005-02-25 2006-02-27 Predicting chemosensitivity to cytotoxic agents Withdrawn EP1856289A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US65619505P 2005-02-25 2005-02-25
PCT/US2006/007045 WO2006091969A2 (en) 2005-02-25 2006-02-27 Predicting chemosensitivity to cytotoxic agents

Publications (1)

Publication Number Publication Date
EP1856289A2 EP1856289A2 (en) 2007-11-21

Family

ID=36928133

Family Applications (1)

Application Number Title Priority Date Filing Date
EP06736375A Withdrawn EP1856289A2 (en) 2005-02-25 2006-02-27 Predicting chemosensitivity to cytotoxic agents

Country Status (2)

Country Link
EP (1) EP1856289A2 (en)
WO (1) WO2006091969A2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113584133B (en) * 2021-08-30 2024-03-26 中国药科大学 Multi-target in-situ detection method based on color coding and programmable fluorescent probe

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO2006091969A3 *

Also Published As

Publication number Publication date
WO2006091969A2 (en) 2006-08-31
WO2006091969A3 (en) 2008-02-07

Similar Documents

Publication Publication Date Title
JP4938672B2 (en) Methods, systems, and arrays for classifying cancer, predicting prognosis, and diagnosing based on association between p53 status and gene expression profile
US8349555B2 (en) Methods and compositions for predicting death from cancer and prostate cancer survival using gene expression signatures
Rahbari et al. Identification of differentially expressed microRNA in parathyroid tumors
CN107881234B (en) Lung adenocarcinoma related gene labels and application thereof
US20240110249A1 (en) Method of diagnosis, staging and monitoring of melanoma using microrna gene expression
CA2480045C (en) Method and compositions for the diagnosis and treatment of non-small cell lung cancer using gene expression profiles
WO2017215230A1 (en) Use of a group of gastric cancer genes
WO2021036620A1 (en) Application of a group of genes related to ovarian cancer prognosis
US10113201B2 (en) Methods and compositions for diagnosis of glioblastoma or a subtype thereof
US20070015148A1 (en) Gene expression profiles in breast tissue
JP2007049991A (en) Prediction of recurrence of breast cancer in bone
WO2010042831A2 (en) Diagnosis, prognosis and treatment of glioblastoma multiforme
EP2419540B1 (en) Methods and gene expression signature for assessing ras pathway activity
JP2008520251A (en) Methods and systems for prognosis and treatment of solid tumors
US20130005597A1 (en) Methods and compositions for analysis of clear cell renal cell carcinoma (ccrcc)
CA2753971C (en) Accelerated progression relapse test
WO2005032347A2 (en) Determining the chemosensitivity of cells to cytotoxic agents
EP2808815A2 (en) Identification of biologically and clinically essential genes and gene pairs, and methods employing the identified genes and gene pairs
US20080014579A1 (en) Gene expression profiling in colon cancers
US20080119367A1 (en) Prognosis of Renal Cell Carcinoma
US9580756B2 (en) Stratification of left-side and right-side colon cancer
US20120264639A1 (en) Methods and compositions for predicting survival in subjects with cancer
WO2006091969A2 (en) Predicting chemosensitivity to cytotoxic agents
WO2015131095A1 (en) Methods and compositions for prognostic risk analysis of clear cell renal cell carcinoma
WO2013077859A1 (en) Gene expression signature for the prognosis of epithelial cancer

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20070914

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL BA HR MK YU

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN

18W Application withdrawn

Effective date: 20071206

R17D Deferred search report published (corrected)

Effective date: 20080207