WO2006125195A2 - Leukemia disease genes and uses thereof - Google Patents

Publication number
WO2006125195A2
Authority
WO
WIPO (PCT)
Prior art keywords
leukemia
genes
mds
disease
gene
Prior art date
Application number
PCT/US2006/019614
Other languages
French (fr)
Other versions
WO2006125195A3 (en)
Inventor
Michael E. Burczynski
Jennifer Ann Stover
Andrew J. Dorner
Natalie C. Twine
Original Assignee
Wyeth
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wyeth
Priority to CA002608092A (CA2608092A1)
Priority to EP06770765A (EP1888784A2)
Priority to JP2008512570A (JP2008545399A)
Priority to AU2006247027A (AU2006247027A1)
Priority to MX2007014537A (MX2007014537A)
Publication of WO2006125195A2
Publication of WO2006125195A3

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/574 Immunoassay; Biospecific binding assay; Materials therefor for cancer
    • G01N33/57407 Specifically defined cancers
    • G01N33/57426 Specifically defined cancers leukemia
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P35/00 Antineoplastic agents
    • A61P35/02 Antineoplastic agents specific for leukemia
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813 Hybridisation assays
    • C12Q1/6834 Enzymatic or biochemical coupling of nucleic acids to a solid phase
    • C12Q1/6837 Enzymatic or biochemical coupling of nucleic acids to a solid phase using probe arrays or probe chips
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/106 Pharmacogenomics, i.e. genetic variability in individual responses to drugs and drug metabolism
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/112 Disease subtyping, staging or classification

Definitions

  • This invention relates to leukemia disease genes and methods of using the same for diagnosis and treatment of leukemia.
  • MDS Myelodysplastic syndromes
  • AML acute myelogenous leukemia
  • peripheral blood samples suitable for the present invention include, but are not limited to, whole blood samples or samples comprising un-fractionated PBMCs.
  • the peripheral blood samples employed comprise enriched un-fractionated PBMCs. "Enriched" means that the percentage of PBMCs in a sample is higher than that in whole blood.
  • PBMCs peripheral blood mononuclear cells
  • Enriched un-fractionated PBMCs can be prepared from whole blood by Ficoll gradient centrifugation or using cell purification tubes (CPTs). Other conventional methods can also be used to prepare enriched un-fractionated PBMCs.
  • CPTs cell purification tubes
  • the invention provides genes whose expression profiles are indicative of the existence, status, progression or treatment of a leukemia. Leukemias that are amenable to the present invention include, but are not limited to, AML and MDS.
  • Table 4 recites genes differentially expressed in PBMCs from MDS patients versus PBMCs from disease-free subjects.
  • Table 6 recites genes differentially expressed in PBMCs from AML patients versus PBMCs from MDS patients.
  • Table 8 recites genes whose expression levels are useful for distinguishing humans with AML from humans with MDS, humans with AML from disease-free humans, and humans with MDS from disease-free humans.
  • Acute lymphoblastic leukemia, chronic lymphocytic leukemia, chronic myelogenous leukemia, or hairy cell leukemia may also be analyzed according to the present invention.
  • the invention provides methods for diagnosis, or monitoring the occurrence, development, progression or treatment of leukemia (such as, for example, AML or MDS) in a subject using genes from Table 4 or Table 6.
  • the methods include generating a gene expression profile from a peripheral blood sample from the subject and comparing the gene expression profile to one or more reference expression profiles (e.g. an expression profile representing a disease-free human, an expression profile representing a human with a leukemia, or an expression profile representing a human of borderline diagnosis).
  • the gene expression profile and reference expression profiles include the expression patterns of one or more genes selected from Table 4 or 6 in PBMCs.
  • genes different from those recited in Table 2 are selected from Table 4 or 6, although genes recited in Table 2 can additionally be included.
  • the genes selected from Table 4 or 6 are those also recited in Table 8.
  • the difference or similarity between the gene expression profile and the one or more reference expression profiles is indicative of the presence, absence, occurrence, development, progression, or effectiveness of treatment of leukemia in the subject.
  • the gene expression profile and the reference expression profiles can include the expression pattern of only one gene or of two or more (e.g. three or more, four or more, five or more, six or more, eight or more, ten or more, fifteen or more, twenty or more, forty or more, sixty or more, 100 or more, 200 or more, 300 or more, or 400 or more). In some embodiments, smaller numbers of genes (e.g.
  • the expression profile of the leukemia disease gene(s) in a subject of interest can be determined by measuring the RNA transcript level of each of the gene(s) in a peripheral blood sample of the subject. Methods suitable for this purpose include, but are not limited to, quantitative RT-PCR, nucleic acid arrays, Northern blot, in situ hybridization, slot-blotting, and nuclease protection assay.
  • the expression profile of the leukemia disease gene(s) can also be determined by measuring the protein product level of each of the gene(s) in the peripheral blood sample of the subject.
  • Methods suitable for this purpose include, but are not limited to, immunoassays (e.g., ELISA, RIA, FACS, or Western Blot), protein arrays, two-dimensional gel electrophoresis, and mass spectrometry.
  • a typical reference expression profile employed in the present invention includes values or ranges that are suggestive of the expression pattern of the leukemia disease gene(s) in peripheral blood samples of disease-free humans or patients with known leukemias.
  • a reference expression profile comprises the average expression levels of each of the leukemia disease gene(s) in peripheral blood samples of disease-free humans.
  • a reference expression profile comprises the average expression levels of each of the leukemia disease gene(s) in peripheral blood samples of patients having a known leukemia.
  • a reference expression profile comprises two or more individual expression profiles, each of which is the expression profile of the leukemia disease gene(s) in a peripheral blood sample of a different leukemia patient or disease-free human.
  • a reference expression profile comprises ranges that reflect variations in the expression levels of each of the leukemia disease gene(s) in peripheral blood samples of disease-free humans or patients with known leukemias.
  • a reference expression profile employed in the present invention can be prepared using the same type of peripheral blood samples as the peripheral blood sample of the subject of interest and following the same preparation procedure and methodology.
  • a reference expression profile can be predetermined or prerecorded. It can also be determined concurrently with or after the measurement of the expression profile of the subject of interest.
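The reference-profile variants described above (per-gene averages, or per-gene ranges) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names `GENE_A` and `GENE_B`, the numeric values, and the helper names `build_reference_profile` and `genes_outside_range` are hypothetical and do not come from the source.

```python
import statistics


def build_reference_profile(samples):
    """Summarize a list of {gene: expression level} dicts.

    Returns {gene: {"mean": ..., "low": ..., "high": ...}} so that the
    profile carries both an average and an observed range per gene.
    """
    profile = {}
    for gene in samples[0]:
        levels = [sample[gene] for sample in samples]
        profile[gene] = {
            "mean": statistics.fmean(levels),
            "low": min(levels),
            "high": max(levels),
        }
    return profile


def genes_outside_range(subject, reference):
    """List genes whose level in the subject falls outside the reference range."""
    return [
        gene
        for gene, summary in reference.items()
        if not summary["low"] <= subject[gene] <= summary["high"]
    ]


# Hypothetical expression levels from three disease-free samples.
disease_free = [
    {"GENE_A": 100.0, "GENE_B": 50.0},
    {"GENE_A": 120.0, "GENE_B": 55.0},
    {"GENE_A": 110.0, "GENE_B": 45.0},
]
reference = build_reference_profile(disease_free)

# Hypothetical subject of interest.
subject = {"GENE_A": 400.0, "GENE_B": 52.0}
flagged = genes_outside_range(subject, reference)
```

A simple range check like this is only one possible comparison; the nearest-neighbor and weighted-voting approaches named elsewhere in the text use the whole profile at once.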
  • the comparison of the expression profile of a subject of interest to a reference expression profile can be performed manually or electronically.
  • the difference or similarity between the expression profile of the subject of interest and the reference expression profile is indicative of the presence or absence, or progression or non-progression, of leukemia in the subject.
  • the expression level of each of the leukemia disease genes employed in the comparison is correlated with a class distinction under a nearest-neighbor analysis or a significance analysis of microarrays.
  • the class distinction represents an ideal expression pattern of the gene in un-fractionated PBMCs of disease-free humans and patients who have a specified leukemia (e.g., uniformly high in PBMCs of the disease-free humans and uniformly low in PBMCs of the leukemia patients, or vice versa).
  • the disease status of a subject of interest can be predicted by comparing the expression profile of the leukemia disease genes in the subject of interest to a reference expression profile of the same genes using a k-nearest-neighbors or weighted voting algorithm.
  • the invention also provides a general method for diagnosing or monitoring the occurrence, development, progression or treatment of MDS.
  • the method includes generating a gene expression profile from a peripheral blood sample of a subject and comparing the gene expression profile to one or more reference expression profiles (e.g. an expression profile representing a disease-free human, an expression profile representing a human with MDS, an expression profile representing a human with a non-MDS leukemia such as AML, or an expression profile representing a human of borderline diagnosis).
  • the gene expression profile and the one or more reference expression profiles comprise the expression patterns of one or more MDS disease genes in PBMCs.
  • the difference or similarity between the gene expression profile and the one or more reference expression profiles is indicative of the presence, absence, occurrence, development, progression, or effectiveness of treatment of MDS in the subject.
  • the MDS disease genes can optionally include one or more genes selected from Tables 4, 6, or 8.
  • the gene expression profile and the reference expression profiles can include the expression pattern of only one gene or of two or more (e.g. three or more, four or more, five or more, six or more, eight or more, ten or more, fifteen or more, twenty or more, forty or more, sixty or more, 100 or more, 200 or more, 300 or more, or 400 or more).
  • smaller numbers of genes are used.
  • the comparison of the gene expression profile to the reference expression profiles can be done, for example, by a k-nearest neighbor analysis or a weighted voting algorithm. Based on the comparison, the subject from whom the sample was taken can be diagnosed with MDS or diagnosed as MDS-free or disease-free; or an existing MDS can be assessed for changes, such as those associated with progression or treatment.
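As a rough illustration of the k-nearest-neighbor comparison mentioned above, the following Python sketch assigns a subject to the majority class among its k closest reference profiles. The Euclidean distance metric, the choice k = 3, the three-gene profiles, and the numeric values are all assumptions made for illustration; the source does not specify them.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(sample, references, k=3):
    """Return the majority label among the k reference profiles nearest to sample.

    references is a list of (expression_vector, label) pairs.
    """
    ranked = sorted(references, key=lambda ref: euclidean(sample, ref[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical log-scale expression values for three genes per profile.
references = [
    ([8.1, 3.2, 5.0], "disease-free"),
    ([7.9, 3.0, 5.2], "disease-free"),
    ([8.3, 3.4, 4.9], "disease-free"),
    ([4.0, 6.8, 5.1], "MDS"),
    ([4.2, 7.1, 5.3], "MDS"),
    ([3.8, 6.5, 4.8], "MDS"),
]

subject = [4.1, 6.9, 5.0]
predicted = knn_predict(subject, references, k=3)
```

An odd k with two classes avoids tied votes; with more classes or an even k, a tie-breaking rule would need to be chosen.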
  • the invention also provides a method for identifying an MDS patient who is likely to progress to acute myelogenous leukemia (AML) using one or more genes from Table 6.
  • the method includes generating a gene expression profile from a peripheral blood sample from an MDS patient and comparing the gene expression profile to one or more reference expression profiles (e.g. an expression profile representing a human with AML, an expression profile representing a human with MDS known to progress to AML, or an expression profile representing a human with MDS known not to progress to AML).
  • the gene expression profile and the one or more reference expression profiles include the expression patterns in PBMCs of one or more leukemia disease genes selected from Table 6.
  • the difference or similarity between the gene expression profile and the one or more reference expression profiles is indicative that the MDS patient is likely to progress to AML.
  • the leukemia disease genes selected from Table 6 are optionally different from those recited in Table 2, although genes from Table 2 could also be included.
  • the leukemia disease genes selected from Table 6 are optionally among those also recited in Table 8.
  • the present invention features methods for evaluating the effectiveness of a treatment of leukemia in a patient of interest. These methods comprise comparing an expression profile of at least one leukemia disease gene in a peripheral blood sample of the patient of interest to a reference expression profile of the same gene(s), where the peripheral blood sample is isolated from the patient after initiation of the treatment, and each of the leukemia disease gene(s) employed is differentially expressed in un-fractionated PBMCs of patients who have the leukemia being evaluated, as compared to in un-fractionated PBMCs of disease-free humans.
  • the leukemia being assessed is MDS
  • the leukemia disease gene(s) employed includes one or more genes selected from Table 4.
  • the present invention features methods for evaluating the effectiveness of a treatment in preventing the progression of MDS to AML in a patient of interest.
  • These methods comprise comparing an expression profile of at least one leukemia disease gene in a peripheral blood sample of the patient of interest to a reference expression profile of the same gene(s), where the peripheral blood sample is isolated from the patient after initiation of the treatment, and each of the leukemia disease gene(s) employed is differentially expressed in un-fractionated PBMCs of MDS patients as compared to in AML patients.
  • leukemia disease genes suitable for this purpose include, but are not limited to, those depicted in Table 6.
  • the expression profile of the leukemia disease gene(s) in the patient of interest during the course of the treatment is indicative of the effectiveness of the treatment in preventing the progression of MDS to AML in the patient.
  • the invention also provides arrays useful, for example, for diagnosing MDS or other leukemias.
  • the arrays include a substrate having several addresses; distinct probes, such as distinct nucleic acid sequences or distinct antibody variable regions, are disposed on each address.
  • at least 15% (or at least 30% or at least 50%) of the addresses have probes that can specifically detect MDS disease genes in PBMCs; the MDS disease genes are optionally selected from Table 4.
  • at least 15% (or at least 30% or at least 50%) of the addresses have probes that can specifically detect genes selected from Tables 4 or 6; the selected genes are different from those recited in Table 2, although genes from Table 2 could also be included.
  • the invention also provides digitally-encoded expression profiles, as may be encoded in a computer-readable medium, useful, for example, as reference expression profiles to evaluate a gene expression profile from a peripheral blood sample.
  • Each expression profile includes one or more digitally-encoded expression signals including a value representing the expression of a gene selected from Tables 4 or 6; the selected genes are different from those recited in Table 2, although digitally-encoded expression signals including values representing the expression of genes from Table 2 could additionally be included in the expression profile.
  • the values in the digitally-encoded expression signals can represent, for example, the expression of the genes in a PBMC of a human with MDS or a human with AML.
  • Each expression profile can include a single digitally-encoded expression signal or can include two, three, four, five, six, seven, eight, nine, or more digitally-encoded expression signals, such as at least ten, at least 20, at least 30, at least 40, at least 50, at least 100, or at least 200.
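One way to picture a digitally-encoded expression profile is as a small structured record that can be written to and read back from a computer-readable medium. The sketch below uses JSON purely as an assumed encoding; the field names (`label`, `sample_type`, `signals`), gene names, and values are hypothetical and are not taken from the source.

```python
import json

# Hypothetical profile: each signal carries a gene identifier and a value
# representing that gene's expression in PBMCs.
profile = {
    "label": "MDS reference (hypothetical)",
    "sample_type": "PBMC",
    "signals": [
        {"gene": "GENE_A", "value": 7.2},
        {"gene": "GENE_B", "value": 3.9},
    ],
}

# Encode to text for storage, then decode to confirm the round trip.
encoded = json.dumps(profile, sort_keys=True)
decoded = json.loads(encoded)
```

Any serialization format would serve equally well; JSON is chosen here only because it ships with the Python standard library.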
  • the invention provides kits useful for diagnosis of a leukemia.
  • the kit includes one or more probes that can specifically detect MDS disease genes (optionally selected from Table 4) in PBMCs.
  • the probes are optionally polynucleotides that hybridize under stringent or nucleic acid array hybridization conditions to the RNA transcripts, or the complements thereof, of the MDS disease genes, or, optionally, are antibody variable domains that bind the products of the MDS disease genes.
  • the kit includes one or more probes that can specifically detect genes selected from Tables 4 and 6; the selected genes are different from those recited in Table 2, although probes for genes from Table 2 could additionally be included. Genes selected from Tables 4 and 6 can optionally be among those also recited in Table 8.
  • the kits also include one or more controls, each representing a reference expression level of a gene detectable by the probes.
  • the invention features a method of making a decision, e.g. selecting a payment class, for a course of treatment for a leukemia such as AML or MDS.
  • the method includes assigning an individual to a class based on a value that is a function of the expression of one or more genes in a peripheral blood sample from the individual, thereby making a decision regarding the individual.
  • the genes include one or more genes from among those recited in Tables 4 and 6 but not recited in Table 2, although the expression of genes recited in Table 2 could also be considered.
  • the one or more genes are selected from those also recited in Table 8.
  • the decision can include, for example, selecting a treatment, such as an AML treatment, MDS treatment, other leukemia treatment, or an absence of treatment, based on the assignment of the individual to the class.
  • the decision also can include administering or declining to administer a treatment based on the assignment; issuing, transmitting or receiving a prescription; or authorizing, paying for, or causing a transfer of funds to pay for a treatment.
  • Treatment refers to any action to deal with a disease or condition, regardless of whether the action is intended as preventative, curative, or palliative, for example; or to address a cause or symptom of the disease or condition; or to improve a second treatment by, for example, improving its efficacy or addressing a side effect.
  • the decision may be recorded, such as in a computer-readable medium.
  • the invention also features a method of providing information on which to make a decision about an individual.
  • the method includes providing (e.g. by receiving) an evaluation of a subject, wherein the evaluation was made by a method described herein, such as by determining the level of expression of one or more genes in a peripheral blood sample of the subject, thereby providing a value.
  • the genes include one or more genes from among those recited in Tables 4 and 6 but not recited in Table 2, although the expression of genes recited in Table 2 could also be considered.
  • the method also includes providing a comparison of the value with a reference value, thereby providing information on which to make a decision about the subject.
  • the method can also include making the decision or communicating the information to another party, such as by computer, compact disc, telephone, facsimile, or letter.
  • the decision can include selecting a subject for payment or making or authorizing payment for a first course of action if the subject demonstrates a gene expression level, pattern or profile observed in a leukemia (e.g. AML or MDS) and a second course of action if the subject demonstrates a gene expression level, pattern or profile observed in a different leukemia (e.g. MDS or AML) or in leukemia-free humans.
  • Payment can be from a first party to a second party.
  • the first party can be a party other than the patient, such as a third party payor, an insurance company, employer, employer-sponsored health plan, HMO, or governmental entity.
  • the second party is selected from the subject, a healthcare provider, a treating physician, an HMO, a hospital, a governmental entity, or an entity that sells or supplies the drug.
  • the invention features a method of making a data record. The method includes entering the result of a method described herein into a record, e.g. a computer readable record.
  • the record is evaluated and/or transmitted to a third party payor, an insurance company, employer, employer sponsored health plan, HMO, or governmental entity, or a healthcare provider, a treating physician, an HMO, a hospital, a governmental entity, or an entity which sells or supplies the drug.
  • the disclosure features a method of providing data.
  • the method includes providing data described herein, e.g., generated by a method described herein, to provide a record, e.g., a record described herein, for determining if a payment will be provided.
  • the data is provided by computer, compact disc, telephone, facsimile, email, or letter.
  • the data is provided by a first party to a second party.
  • the first party is selected from the subject, a healthcare provider, a treating physician, an HMO, a hospital, a governmental entity, or an entity which sells or supplies the drug.
  • the second party is a third party payor, an insurance company, employer, employer sponsored health plan, HMO, or governmental entity.
  • the first party is selected from the subject, a healthcare provider, a treating physician, an HMO, a hospital, an insurance company, or an entity which sells or supplies the drug and the second party is a governmental entity.
  • the first party is selected from the subject, a healthcare provider, a treating physician, an HMO, a hospital, an insurance company, or an entity which sells or supplies the drug and the second party is an insurance company.
  • the disclosure features a method of transmitting a record described herein.
  • the method includes a first party transmitting the record to a second party, such as by computer, compact disc, telephone, facsimile, email, or letter.
  • the second party is selected from the subject, a healthcare provider, a treating physician, an HMO, a hospital, a governmental entity, or an entity which sells or supplies the drug.
  • the first party is an insurance company or government entity and the second party is selected from the subject, a healthcare provider, a treating physician, an HMO, a hospital, a governmental entity, or an entity which sells or supplies the drug.
  • the first party is a governmental entity or insurance company and the second party is selected from the subject, a healthcare provider, a treating physician, an HMO, a hospital, an insurance company, or an entity which sells or supplies the drug.
  • the present invention features the use of whole blood samples or samples comprising un-fractionated PBMCs for diagnosing or monitoring the progression or treatment of AML and MDS.
  • Genes that are differentially expressed in un-fractionated PBMCs of AML (or MDS) patients as compared to in disease-free humans can be identified. These genes can be used as surrogate markers for diagnosing or evaluating the treatment of AML (or MDS) in a subject of interest.
  • Genes that are differentially expressed in un-fractionated PBMCs of AML patients as compared to in MDS patients can also be identified. These genes can be used to monitor the progression of MDS in a patient of interest.
  • the present invention does not require positive selection of specific cell subtypes (e.g., CD34+ or AC133+), thereby allowing for rapid diagnosis and evaluation of AML and MDS.
  • Other leukemias such as acute lymphoblastic leukemia, chronic lymphocytic leukemia, chronic myelogenous leukemia, or hairy cell leukemia, can be similarly assessed according to the present invention.
  • This invention features the use of nucleic acid arrays for the identification of genes that are differentially expressed in un-fractionated PBMCs of leukemia patients as compared to in disease-free humans or in patients who have a different type of leukemia.
  • Nucleic acid arrays allow for quantitative detection of expression profiles of a large number of genes at one time.
  • Non-limiting examples of nucleic acid arrays suitable for this purpose include GeneChip® microarrays (Affymetrix, Santa Clara, CA), cDNA microarrays (Agilent Technologies, Palo Alto, CA), and bead arrays (U.S. Patent Nos. 6,288,220 and 6,391,562).
  • Polynucleotides to be hybridized to a nucleic acid array can be labeled with one or more labeling moieties to allow for detection of hybridized polynucleotide complexes.
  • the labeling moieties can include compositions that are detectable by spectroscopic, photochemical, biochemical, bioelectronic, immunochemical, electrical, optical or chemical means.
  • Exemplary labeling moieties include, but are not limited to, radioisotopes, chemiluminescent compounds, labeled binding proteins, heavy metal atoms, spectroscopic markers (such as fluorescent markers or dyes), magnetic labels, linked enzymes, mass spectrometry tags, spin labels, electron transfer donors and acceptors, and the like.
  • Polynucleotides to be hybridized to a nucleic acid array can be cDNA, cRNA, or other types of nucleic acid molecules.
  • Hybridization reactions can be performed in absolute or differential hybridization formats.
  • in the absolute hybridization format, polynucleotides derived from one sample, such as un-fractionated PBMCs from an AML or MDS patient or a disease-free human, are hybridized to a nucleic acid array.
  • Signals detected after the formation of hybridization complexes correlate to the polynucleotide levels in the sample.
  • polynucleotides derived from two biological samples such as one from an AML or MDS patient and the other from a disease-free human, are labeled with different labeling moieties (e.g., Cy3 and Cy5, respectively).
  • a mixture of these differently labeled polynucleotides is hybridized to a nucleic acid array. The nucleic acid array is then examined under conditions in which the emissions from the two different labels are individually detectable.
  • Signals gathered from nucleic acid arrays can be analyzed using commercially available software, such as software provided by Affymetrix or Agilent Technologies. Controls, such as for scan sensitivity, probe labeling or cDNA quantitation, can be included in the hybridization experiments.
  • signals from nucleic acid arrays are scaled or normalized before being further analyzed.
  • the expression signals of a gene can be normalized to take into account variations in hybridization intensities when more than one array is used under similar test conditions.
  • Signals for individual polynucleotide complex hybridization can also be normalized using the intensities derived from internal normalization controls contained on each array.
  • genes with relatively consistent expression levels across the samples can be used to normalize the expression levels of other genes.
  • the expression levels are normalized across the samples such that the mean is zero and the standard deviation is one.
  • the expression signals from a nucleic acid array are subject to a variation filter which excludes genes showing minimal or insignificant variation across different classes of samples.
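The normalization to zero mean and unit standard deviation, and the variation filter, can be sketched as follows. The filter criterion used here (ratio of maximum to minimum signal of at least `min_fold`) is an assumption chosen for illustration, since the text does not state a threshold; the gene names and signal values are likewise hypothetical.

```python
import statistics


def zscore(values):
    """Rescale values so that their mean is zero and standard deviation is one."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        # A constant gene carries no information; map it to all zeros.
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


def variation_filter(signals, min_fold=2.0):
    """Keep genes whose max/min signal ratio across samples is at least min_fold.

    signals maps gene name -> list of positive expression signals.
    The ratio criterion and the default threshold are illustrative assumptions.
    """
    return {
        gene: values
        for gene, values in signals.items()
        if min(values) > 0 and max(values) / min(values) >= min_fold
    }


# Hypothetical raw signals across four samples.
signals = {
    "GENE_A": [100.0, 120.0, 400.0, 380.0],  # varies about 4-fold, kept
    "GENE_B": [200.0, 210.0, 205.0, 198.0],  # nearly constant, filtered out
}

kept = variation_filter(signals)
normalized = {gene: zscore(values) for gene, values in kept.items()}
```

The population standard deviation (`pstdev`) is used so that the normalized values have a standard deviation of exactly one; using the sample standard deviation instead would be an equally defensible choice.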
  • Expression profiles in un-fractionated PBMCs of leukemia patients are compared to the corresponding expression profiles in disease-free humans.
  • Genes that are differentially expressed in un-fractionated PBMCs of leukemia patients as compared to in un-fractionated PBMCs of disease-free humans can be identified. These genes are hereinafter referred to as leukemia disease genes.
  • By "differentially expressed," it is meant that the average expression level of a leukemia disease gene in un-fractionated PBMCs of leukemia patients is statistically significantly different from that in un-fractionated PBMCs of disease-free humans.
  • the p-value of a Student's t-test (e.g., two-tailed distribution, two-sample unequal variance) for the observed difference is no more than 0.05, 0.01, 0.005, 0.001, 0.0005, 0.0001, or less.
  • the average expression level of a leukemia disease gene in un-fractionated PBMCs of leukemia patients can be substantially higher or lower than that in disease-free PBMCs.
  • the average expression level of a leukemia disease gene in PBMCs of leukemia patients can be at least 1-, 2-, 3-, 4-, 5-, 10-, or 20-fold, or more, higher or lower than that in PBMCs of disease-free humans.
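The two criteria above, an unequal-variance (Welch) t-test and a fold difference between group averages, can be sketched in Python using only the standard library. The sketch computes the t statistic and the Welch-Satterthwaite degrees of freedom; turning these into a p-value requires the t-distribution CDF, which the standard library does not provide, so that step is left to a statistics package. The expression values are hypothetical.

```python
import math
import statistics


def welch_t(a, b):
    """Two-sample unequal-variance t statistic and approximate degrees of freedom."""
    var_a = statistics.variance(a) / len(a)
    var_b = statistics.variance(b) / len(b)
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(a) - 1) + var_b ** 2 / (len(b) - 1)
    )
    return t, df


def fold_change(a, b):
    """Ratio of the average level in group a to the average level in group b."""
    return statistics.fmean(a) / statistics.fmean(b)


# Hypothetical expression levels of one gene in PBMCs.
patients = [410.0, 390.0, 405.0, 420.0]
disease_free = [100.0, 110.0, 95.0, 105.0]

t_stat, dof = welch_t(patients, disease_free)
fold = fold_change(patients, disease_free)
```

In practice both criteria are often applied together, so that a gene must show a statistically significant difference and a minimum fold difference before being treated as differentially expressed.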
  • Leukemia disease genes that are differentially expressed in patients who have different leukemias (e.g., AML versus MDS) can be similarly identified.
  • Leukemia disease genes can also be identified using supervised or unsupervised clustering algorithms.
  • supervised clustering algorithms include the nearest-neighbor analysis, support vector machines, the SAM (Significance Analysis of Microarrays) method, artificial neural networks, and SPLASH.
  • unsupervised clustering algorithms include self-organized maps (SOMs), k-means, principal component analysis, and hierarchical clustering.
  • Class 0 includes subjects having a first disease status (e.g., disease-free), and class 1 includes subjects having a second disease status (e.g. AML or MDS).
  • Other forms of class distinction can also be employed.
  • a class distinction represents an idealized expression pattern, where the expression level of a gene is uniformly high for samples in one class and uniformly low for samples in the other class.
  • the correlation between gene "g" and the class distinction can be measured by a signal-to-noise score:
  • $P(g,c) = \dfrac{\mu_1(g) - \mu_2(g)}{\sigma_1(g) + \sigma_2(g)}$
  • $\mu_1(g)$ and $\mu_2(g)$ represent the means of the log-transformed expression levels of gene "g" in class 0 and class 1, respectively
  • $\sigma_1(g)$ and $\sigma_2(g)$ represent the standard deviations of the log-transformed expression levels of gene "g" in class 0 and class 1, respectively.
  • a higher absolute value of a signal-to-noise score indicates that the gene is more highly expressed in one class than in the other.
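The signal-to-noise score can be computed directly from its definition. The sketch below assumes a base-2 logarithm and the sample standard deviation, neither of which is fixed by the text; the expression values are invented for illustration.

```python
from math import log2
from statistics import mean, stdev

def signal_to_noise(class0, class1):
    """P(g,c): difference of class means over the sum of class standard
    deviations, both taken on log-transformed expression levels."""
    x0 = [log2(v) for v in class0]
    x1 = [log2(v) for v in class1]
    return (mean(x0) - mean(x1)) / (stdev(x0) + stdev(x1))

# Made-up expression levels of one gene in class 0 and class 1 samples.
class0_levels = [100.0, 120.0, 90.0, 110.0]
class1_levels = [400.0, 380.0, 450.0, 420.0]

score = signal_to_noise(class0_levels, class1_levels)
```

With this sign convention a negative score means the gene is higher in class 1, and swapping the two classes only flips the sign.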
  • the samples used to derive the signal-to-noise scores comprise enriched or purified un-fractionated PBMCs and, therefore, the signal-to-noise score P(g,c) represents a correlation between the class distinction and the expression level of gene "g" in un-fractionated PBMCs.
  • the correlation between gene "g” and the class distinction can also be measured by other methods, such as the Pearson correlation coefficient or the Euclidean distance, as appreciated by those skilled in the art.
  • the significance of the correlation between gene expression profiles in un-fractionated PBMCs and a class distinction can be evaluated using a random permutation test.
  • the correlation between genes and a class distinction can be diagrammatically viewed through a neighborhood analysis plot, in which the y-axis represents the number of genes within various neighborhoods around the class distinction and the x-axis indicates the size of the neighborhood (i.e., P(g,c)). Curves showing different significance levels for the number of genes within corresponding neighborhoods of randomly permuted class distinctions can also be included in the plot.
  • the leukemia disease genes identified by the present invention are above the median significance level in the neighborhood analysis plot. This means that the correlation measure P(g,c) for each of these leukemia disease genes is such that the number of genes within the neighborhood of the class distinction having the size of P(g,c) is greater than the number of genes within the corresponding neighborhoods of randomly permuted class distinctions at the median significance level.
  • the leukemia disease genes identified by the present invention can also be above the 40%, 30%, 20%, 10%, 5%, 2%, or 1% significance level.
  • x% significance level means that x% of random neighborhoods contain as many genes as the real neighborhood around the class distinction.
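A simplified, single-gene version of the random permutation test is sketched below. It is not the full neighborhood analysis, which counts genes within neighborhoods. It only estimates how often a randomly permuted class distinction yields a score at least as extreme as the observed one. The function names, the number of permutations, and the data are assumptions made for the example.

```python
import random
from statistics import mean, stdev

def s2n(values, labels):
    """Signal-to-noise score of already log-transformed values."""
    c0 = [v for v, lab in zip(values, labels) if lab == 0]
    c1 = [v for v, lab in zip(values, labels) if lab == 1]
    return (mean(c0) - mean(c1)) / (stdev(c0) + stdev(c1))

def permutation_fraction(values, labels, n_perm=500, seed=0):
    """Fraction of random label permutations whose absolute score is at
    least as large as the score for the real class labels."""
    rng = random.Random(seed)
    observed = abs(s2n(values, labels))
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(s2n(values, shuffled)) >= observed:
            hits += 1
    return hits / n_perm

# Made-up log-scale expression values for eight samples.
values = [6.6, 6.9, 6.5, 6.8, 8.6, 8.9, 8.7, 8.8]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
fraction = permutation_fraction(values, labels)
```

A small fraction suggests that the observed correlation is unlikely to arise from a random class assignment.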
  • the leukemia disease genes identified by the nearest-neighbor analysis can be used to construct class predictors.
  • Each class predictor includes two or more leukemia disease genes, and can be used to assign a subject of interest to a disease status (e.g., AML, MDS, or disease-free).
  • a class predictor includes or consists of leukemia disease genes that are significantly correlated with a class distinction by the permutation test (e.g., genes above the 1%, 2%, 5%, 10%, 20%, 30%, 40%, or 50% significance level).
  • a class predictor includes or consists of leukemia disease genes that have top absolute values of P(g,c).
  • the SAM method can also be used to correlate disease statuses with gene expression profiles in un-fractionated PBMCs.
  • the prediction analysis of microarrays (PAM) method can be used to identify class predictors that can best characterize a predefined disease or disease-free class and predict the class membership of new samples. See, for example, Tibshirani et al., (2002) Proc. Natl. Acad. Sci. U.S.A. 99:6567-6572.
  • the prediction accuracy of a class predictor of the present invention can be evaluated by k-fold cross validation, such as 10-fold cross validation, 4-fold cross validation, or leave-one-out cross validation.
  • the data is divided into k subsets of approximately equal size.
  • the model is trained k times, each time leaving out one of the subsets from training and using the omitted subset as the test samples to calculate the prediction error. Where k equals the sample size, it becomes the leave-one-out cross validation.
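The k-fold procedure can be sketched as below. The fold assignment by stride and the toy nearest-centroid classifier (`train_centroids`, `predict_centroid`) are hypothetical stand-ins for whatever class predictor is being evaluated; setting `k` to the sample size gives leave-one-out cross validation.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k folds of near-equal size."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(n_samples) if i not in held_out]
        yield train_idx, test_idx

def cross_validate(samples, labels, k, train_fn, predict_fn):
    """Return the fraction of held-out samples that are misclassified."""
    errors = 0
    for train_idx, test_idx in k_fold_indices(len(samples), k):
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        errors += sum(predict_fn(model, samples[i]) != labels[i]
                      for i in test_idx)
    return errors / len(samples)

def train_centroids(xs, ys):
    """Toy model: mean expression value per class."""
    return {c: sum(x for x, y in zip(xs, ys) if y == c) / ys.count(c)
            for c in set(ys)}

def predict_centroid(model, x):
    """Assign the class whose centroid is closest to x."""
    return min(model, key=lambda c: abs(x - model[c]))

samples = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
loo_error = cross_validate(samples, labels, len(samples),
                           train_centroids, predict_centroid)
```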
  • Other methods can also be used to identify leukemia disease genes.
  • the above-described methods can also be used to identify genes whose expression profiles in un-fractionated PBMCs are predictive of different stages of leukemia progression, or different clinical responses of leukemia patients to a therapeutic treatment. For instance, gene expression profiles in PBMCs of MDS patients who eventually progress to AML can be compared to the corresponding gene expression profiles in MDS patients who do not progress to AML. Genes that are differentially expressed in these two classes of patients can be identified and used for the prediction of progression from MDS to AML. For another instance, leukemia patients can be grouped based on their responses to a therapeutic treatment. The global gene expression analysis is then used to identify genes that are differentially expressed in PBMCs of one group of patients versus another group. Genes thus identified are predictive of clinical outcome of a leukemia patient in response to the therapeutic treatment.
  • HG-U133A Genechips® (Affymetrix, Inc.) were used to identify AML or MDS disease genes. Genes that were differentially expressed in un-fractionated PBMCs of AML (or MDS) patients as compared to in disease-free humans were identified. Genes that were differentially expressed in un-fractionated PBMCs of AML patients as compared to in MDS patients were also identified. Table 1 lists qualifiers on HG-U133A Genechips® that showed elevated or decreased signals when hybridized to AML samples as compared to disease-free samples. Each qualifier in Table 1 corresponds to an AML disease gene which is differentially expressed in un-fractionated PBMCs of AML patients as compared to in disease-free humans.
  • the hybridization signal at each qualifier represents the expression level of the corresponding gene in un-fractionated PBMCs.
  • Table 1 also illustrates the average hybridization signals at each qualifier for AML ("AML Average”) or disease-free samples ("Disease-Free Average”). The standard deviations of these signals (“AML StDev” and “Disease-Free StDev,” respectively) are also provided.
  • AML/Disease-Free: ratios between AML and disease-free hybridization signals
  • Student's t-test: two-tailed distribution, two-sample unequal variance
  • Each qualifier on a HG-U133A Genechip® represents a set of oligonucleotide probes (PM or perfect match probe) that are stably attached to the respective regions on the Genechip®.
  • the RNA transcript (or the complement thereof) of the gene identified by a qualifier can hybridize under nucleic acid array hybridization conditions to at least one oligonucleotide probe of the qualifier.
  • the RNA transcript (or the complement thereof) of the gene does not hybridize under nucleic acid array hybridization conditions to the mismatch (MM) probes of the qualifier.
  • a mismatch probe is identical to the corresponding PM probe except for a single, homomeric substitution at or near the center of the mismatch probe.
  • the MM probe for a 25-mer PM probe has a homomeric base change at the 13th position.
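To make the PM/MM relationship concrete, the sketch below derives a mismatch probe from a perfect match probe. It assumes that a "homomeric" change means swapping the central base for its complement (A with T, G with C), which is the convention commonly described for such arrays but is not spelled out in the text. The probe sequence is invented.

```python
# Assumed homomeric substitution: each base is swapped for its complement.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def mismatch_probe(pm_probe):
    """Return the MM probe: identical to the PM probe except for the
    central base (the 13th position of a 25-mer)."""
    center = len(pm_probe) // 2  # index 12 is the 13th base of a 25-mer
    swapped = COMPLEMENT[pm_probe[center]]
    return pm_probe[:center] + swapped + pm_probe[center + 1:]

pm = "ACGTACGTACGTACGTACGTACGTA"  # hypothetical 25-mer
mm = mismatch_probe(pm)
```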
  • the RNA transcript (or the complement thereof) of the gene identified by a qualifier can hybridize under nucleic acid array hybridization conditions to at least 50%, 60%, 70%, 80%, 90% or 100% of the PM probes of the qualifier, but not to the corresponding mismatch probes.
  • the discrimination score (R) for each of these PM probes as measured by the ratio of the hybridization intensity difference of the corresponding probe pair (i.e., PM - MM) over the overall hybridization intensity (i.e., PM + MM), can be no less than 0.015, 0.02, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5 or greater.
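The discrimination score follows directly from the definition above. The sketch also reports the fraction of probe pairs whose score exceeds a threshold. The probe intensities and the helper name `fraction_above` are assumptions, and the statistical test that a full detection call would apply to these scores is not reproduced here.

```python
def discrimination_score(pm, mm):
    """R = (PM - MM) / (PM + MM) for one probe pair."""
    return (pm - mm) / (pm + mm)

def fraction_above(probe_pairs, tau=0.015):
    """Fraction of probe pairs whose discrimination score exceeds tau."""
    scores = [discrimination_score(pm, mm) for pm, mm in probe_pairs]
    return sum(s > tau for s in scores) / len(scores)

# Made-up (PM, MM) hybridization intensities for one qualifier.
pairs = [(1200.0, 300.0), (900.0, 850.0), (400.0, 410.0), (1500.0, 200.0)]
share = fraction_above(pairs)
```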
  • the RNA transcript (or the complement thereof) of the gene, when hybridized to a HG-U133A Genechip® according to the manufacturer's instructions, produces a "present" call at the corresponding qualifier under the default settings (i.e., the threshold Tau is 0.015 and the significance level $\alpha_1$ is 0.4).
  • Table 2 lists the genes that are represented by the qualifiers in Table 1. These genes, as well as their corresponding unigene IDs and Entrez accession numbers, were identified according to Affymetrix Genechip® annotation.
  • a unigene is composed of a non-redundant set of gene-oriented clusters. Each unigene cluster is believed to include sequences that represent a unique gene.
  • the Entrez database collects sequences from a variety of sources, such as GenBank, RefSeq and PDB. The oligonucleotide probes of each qualifier can be derived from its corresponding Entrez sequence.
  • Table 3 describes qualifiers that showed elevated or decreased signals when hybridized to MDS samples as compared to disease-free samples.
  • the average hybridization signals at each qualifier for MDS (“MDS Average”) or disease-free (“Disease-Free Average”) samples are provided, together with their corresponding standard deviations (“MDS StDev” and “Disease-Free StDev,” respectively).
  • Table 4 further describes the genes that are represented by the qualifier in Table 3.
  • Table 5 illustrates qualifiers that showed elevated or decreased signals when hybridized to AML samples as compared to MDS samples. Like Tables 1 and 3, the average hybridization signals at each qualifier for AML or MDS samples ("AML Average" and "MDS Average," respectively), the corresponding standard deviations ("AML StDev" and "MDS StDev," respectively), the ratios between the hybridization signals ("AML/MDS"), and the p-values for the observed differences are provided in Table 5. The genes represented by the qualifiers in Table 5 are further described in Table 6.
  • Table 3. Genes Differentially Expressed in MDS vs. Disease-Free PBMCs
  • genes depicted in Tables 2, 4, and 6 were identified according to Affymetrix annotation. Genes that correspond to the qualifiers in Tables 1, 3, and 5 can also be identified by BLAST searching the target sequences of these qualifiers against human genome sequence databases. Databases suitable for this purpose include, but are not limited to, the human genome database at the National Center for Biotechnology Information (NCBI), Bethesda, MD. NCBI also provides BLAST programs, such as "blastn," for searching its sequence databases. A BLAST search of a gene that corresponds to a qualifier can be conducted using an unambiguous segment of the target sequence of the qualifier (i.e., a sequence segment that does not contain any unknown nucleotide residue).
  • the qualifiers in Tables 1, 3, and 5 represent not only genes that are explicitly depicted in the tables, but also genes that are not listed but nonetheless can hybridize under stringent or nucleic acid array hybridization conditions to the PM probes of the qualifiers.
  • stringent conditions are at least as stringent as conditions G-L in Table 7.
  • “Highly stringent conditions” are at least as stringent as conditions A-F in Table 7. For each condition, hybridization is carried out under the corresponding hybridization conditions (Hybridization Temperature and Buffer) for about four hours, followed by two 20-minute washes under the corresponding wash conditions (Wash Temp, and Buffer).
  • the hybrid length is that anticipated for the hybridized region(s) of the hybridizing polynucleotides.
  • the hybrid length is assumed to be that of the hybridizing polynucleotide.
  • the hybrid length can be determined by aligning the sequences of the polynucleotides and identifying the region or regions of optimal sequence complementarity.
  • SSPE: 0.15M NaCl, 10mM NaH2PO4, and 1.25mM EDTA, pH 7.4
  • SSC: 0.15M NaCl and 15mM sodium citrate
  • the leukemia disease genes of the present invention can be used for diagnosis and prognosis of MDS, AML or other leukemias.
  • the disease genes can be used to identify an MDS patient who is likely to progress to acute myelogenous leukemia (AML).
  • the leukemia disease genes can also be used to evaluate the progression or effectiveness of a treatment of leukemia in a patient of interest. Any type of leukemia can be assessed according to the present invention. Examples of these leukemias include, but are not limited to, AML, MDS, acute lymphoblastic leukemia, chronic lymphocytic leukemia, chronic myelogenous leukemia, and hairy cell leukemia.
  • the diagnosis and prognosis typically involve comparison of the peripheral blood expression profile of one or more disease genes in the leukemia patient of interest to at least one reference expression profile.
  • the disease genes employed for diagnosis and prognosis are selected such that the peripheral blood expression profile of each disease gene is correlated with a class distinction under a class-based correlation analysis (such as the nearest-neighbor analysis), where the class distinction represents an idealized expression pattern of the selected genes in peripheral blood samples of leukemia patients who have different clinical outcomes.
  • the selected disease genes are correlated with the class distinction at above the 50%, 25%, 10%, 5%, or 1% significance level under a random permutation test.
  • the disease genes can also be selected such that the average expression profile of each disease gene in peripheral blood samples of one class of leukemia patients is statistically different from that in another class of leukemia patients or disease-free humans.
  • the p-value under a Student's t-test for the observed difference can be no more than 0.05, 0.01, 0.005, 0.001, or less.
  • the disease genes can be selected such that the average peripheral blood expression level of each disease gene in one class of patients is at least 2-, 3-, 4-, 5-, 10-, or 20-fold different from that in another class of patients or disease-free humans.
  • the expression profile of the leukemia disease gene(s) in a peripheral blood sample of a subject of interest can be compared to a reference expression profile of the same gene(s) for diagnosing or evaluating the progression or treatment of leukemia in the subject of interest.
  • the reference expression profile can be prepared using the same type of peripheral blood samples (e.g., whole blood samples or blood samples comprising enriched un-fractionated PBMCs) as the peripheral blood sample of the subject of interest. Both expression profiles can be prepared using the same preparation procedure or methodology. As a consequence, for each component in the expression profile of the subject of interest, there is at least one corresponding component in the reference expression profile.
  • a reference expression profile can be pre-determined or pre-recorded.
  • a reference expression profile employed in the present invention typically includes or consists of values or ranges that are suggestive of the expression pattern of the leukemia disease gene(s) in peripheral blood samples of disease-free humans or patients having known leukemias.
  • a reference expression profile comprises the average expression levels of each of the leukemia disease gene(s) in peripheral blood samples of disease-free humans.
  • a reference expression profile comprises the average expression levels of each of the leukemia disease gene(s) in peripheral blood samples of patients who have the leukemia being investigated.
  • the reference expression profiles may include a plurality of expression profiles, each of which represents the peripheral blood expression pattern of the disease gene(s) in a particular leukemia patient whose clinical outcome is known or determinable.
  • a reference expression profile may include two or more individual expression profiles, each of which represents the expression profile of the leukemia disease gene(s) in a peripheral blood sample of a different leukemia patient or disease-free volunteer.
  • the expression profile of a subject of interest can be compared to these individual reference expression profiles using a pattern recognition algorithm, such as weighted voting, k-nearest neighbors, or support vector machines.
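As one illustration of comparing a subject's profile to individual reference profiles, the following is a minimal k-nearest neighbors sketch. The Euclidean distance, the choice of `k`, and the two-gene reference profiles are assumptions made purely for the example.

```python
from collections import Counter
from math import dist

def knn_predict(profile, references, k=3):
    """Assign the majority label among the k reference profiles that are
    closest (Euclidean distance) to the subject's profile."""
    nearest = sorted(references, key=lambda ref: dist(profile, ref[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up normalized expression levels of two genes per reference sample.
references = [
    ([1.0, 5.0], "disease-free"),
    ([1.2, 4.8], "disease-free"),
    ([0.9, 5.1], "disease-free"),
    ([4.0, 1.0], "AML"),
    ([4.2, 0.8], "AML"),
    ([3.9, 1.1], "AML"),
]
call = knn_predict([4.1, 1.0], references)
```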
  • a reference expression profile suitable for the invention may contain ranges for the expression levels of each leukemia disease gene employed.
  • Each range can be selected to reflect variations in the expression levels of the corresponding gene in peripheral blood samples of disease-free humans or patients who have known leukemias.
  • the range can be selected to be one standard deviation (or a multiple or fraction thereof) from the mean expression level of the corresponding gene in peripheral blood samples of disease-free humans (or patients having a known leukemia). Where the expression level of the gene in a subject of interest falls within that range, a "similar" call can be made with respect to that gene.
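A range-based "similar" call of this kind might look like the sketch below. The use of the sample standard deviation, the multiplier `n_sd`, and the reference values are assumptions for illustration.

```python
from statistics import mean, stdev

def reference_range(levels, n_sd=1.0):
    """Range spanning n_sd standard deviations on either side of the mean."""
    m, s = mean(levels), stdev(levels)
    return m - n_sd * s, m + n_sd * s

def is_similar(level, ref_range):
    """True when the subject's level falls within the reference range."""
    low, high = ref_range
    return low <= level <= high

# Made-up expression levels of one gene in disease-free samples.
disease_free_levels = [200.0, 220.0, 180.0, 210.0, 190.0]
ref = reference_range(disease_free_levels)
```

Widening `n_sd` makes the call more permissive; narrowing it makes it stricter.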
  • the expression profile of the patient of interest and the reference expression profile(s) can be constructed in any form.
  • the expression profiles comprise the expression level of each disease gene used in outcome prediction.
  • the expression levels can be absolute, normalized, or relative levels. Suitable normalization procedures include, but are not limited to, those used in nucleic acid array gene expression analyses or those described in Hill et al., (2001) Genome Biol., 2:research0055.1-0055.13.
  • the expression levels are normalized such that the mean is zero and the standard deviation is one.
  • the expression levels are normalized based on internal or external controls, as appreciated by those skilled in the art.
  • the expression levels are normalized against one or more control transcripts with known abundances in blood samples.
  • the expression profile of the patient of interest and the reference expression profile(s) are constructed using the same or comparable methodologies.
  • each expression profile being compared comprises one or more ratios between the expression levels of different disease genes.
  • An expression profile can also include other measures that are capable of representing gene expression patterns.
  • Peripheral blood samples suitable for the present invention include, but are not limited to, whole blood samples or samples comprising un-fractionated PBMCs.
  • peripheral blood samples comprising enriched un-fractionated PBMCs are employed.
  • By "enriched," it is meant that the percentage of PBMCs in a sample is higher than that in whole blood.
  • at least 75%, 80%, 85%, 90%, 95%, 99%, or 100% of the cells in an enriched sample are PBMCs.
  • Methods suitable for preparing enriched un-fractionated PBMCs include, but are not limited to, Ficoll gradient centrifugation or cell purification tubes (CPTs). Other conventional methods can also be used to prepare enriched un-fractionated PBMCs.
  • the expression level of a gene can be determined by measuring the level of the RNA transcript(s) of the gene. Suitable methods include, but are not limited to, quantitative RT-PCR, Northern Blot, in situ hybridization, slot-blotting, nuclease protection assay, and nucleic acid array (including bead array).
  • the expression level of a gene can also be determined by measuring the level of the polypeptide(s) encoded by the gene. Suitable methods include, but are not limited to, immunoassays (such as ELISA, RIA, FACS, or Western blot), 2-dimensional gel electrophoresis, mass spectrometry, or protein arrays.
  • the expression profile of the leukemia disease gene(s) in a subject of interest can be determined by measuring the RNA transcript level of each of the gene(s) in a peripheral blood sample of the subject. Methods suitable for this purpose include, but are not limited to, quantitative RT-PCR, competitive RT-PCR, real time RT-PCR, differential display RT-PCR, Northern Blot, in situ hybridization, slot-blotting, nuclease protection assays, and nucleic acid arrays (including bead arrays).
  • the expression profile of the leukemia disease gene(s) can also be determined by measuring the protein product level of each of the gene(s) in the peripheral blood sample of the subject of interest.
  • Methods suitable for this purpose include, but are not limited to, immunoassays (e.g., ELISA (enzyme-linked immunosorbent assay), RIA (radioimmunoassay), FACS (fluorescence-activated cell sorter), Western Blot, dot blot, immunohistochemistry, or antibody-based radioimaging), protein arrays, high-throughput protein sequencing, two-dimensional SDS-polyacrylamide gel electrophoresis, and mass spectrometry.
  • the biological activity (e.g., enzymatic activity or protein/DNA binding activity) of the protein product encoded by a leukemia disease gene can also be used to measure the expression level of the gene in a peripheral blood sample of interest.
  • the expression profile of the leukemia disease gene(s) can have any form.
  • the expression profile includes the expression level of each leukemia disease gene employed.
  • Each expression level can be an absolute expression level, or a normalized or relative expression level. Methods suitable for normalizing expression levels of different genes include, but are not limited to, those described in Hill et al., (2001) Genome Biol, 2:research0055.1-0055.13, and Genechip® Expression Analysis - Data Analysis Fundamentals (Part No. 701190 Rev. 2, Affymetrix, Inc., 2002), both of which are incorporated herein by reference in their entireties.
  • the expression level of each leukemia disease gene is normalized based on internal or external controls. The expression level of each leukemia disease gene can also be normalized against one or more control transcripts with known abundances in the samples used.
  • the expression profile of the leukemia disease gene(s) can also include ratio or ratios between the expression levels of different leukemia disease genes (e.g., ratios between the expression levels of genes that are up-regulated in PBMCs of leukemia patients versus genes that are down-regulated). Ratios between the expression levels of leukemia disease genes versus non-leukemia disease genes can also be used to construct the expression profiles of leukemia disease genes. Other measures that are indicative of gene expression patterns can also be used to prepare gene expression profiles.
  • the difference or similarity between the expression profile of a subject of interest and a reference expression profile can be determined by assessing the differences or similarities between the corresponding components in the two profiles. Methods suitable for this purpose include, but are not limited to, fold changes or absolute differences.
  • the expression level of a leukemia disease gene in a subject of interest is considered similar to the corresponding reference level in the reference expression profile if the difference between the two levels is less than 50%, 40%, 30%, 20%, or 10% of the reference level.
  • the expression level of a leukemia disease gene in a subject of interest is considered similar to the corresponding reference level in the reference expression profile if the former level falls within the standard deviation (or a multiple or fraction thereof) of the reference level.
  • the criteria for the overall similarity between the expression profile of a subject of interest and a reference expression profile can be selected such that the accuracy (the ratio of correct calls over the total of correct and incorrect calls) for leukemia diagnosis or assessment is relatively high.
  • the similarity criteria can be selected such that the accuracy for leukemia diagnosis or assessment is at least 50%, 60%, 70%, 80%, 90%, or more.
  • an overall similarity call is made if at least 50%, 60%, 70%, 80%, 90%, or more of the components in the expression profile of the subject of interest are considered similar to the corresponding components in the reference expression profile. Different components in the expression profiles may have the same or different weights in comparison.
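The per-gene and overall similarity criteria can be combined as in the sketch below. The thresholds (`max_rel_diff`, `min_fraction`), the optional per-gene weights, and the gene names and values are illustrative assumptions rather than values prescribed by the text.

```python
def component_similar(level, reference, max_rel_diff=0.2):
    """Similar when the difference is less than a set fraction of the
    reference level (here 20%)."""
    return abs(level - reference) < max_rel_diff * abs(reference)

def overall_similar(profile, reference, min_fraction=0.8, weights=None):
    """Overall call: the weighted share of similar components must reach
    min_fraction. Equal weights are used when none are given."""
    genes = list(reference)
    weights = weights or {gene: 1.0 for gene in genes}
    total = sum(weights[gene] for gene in genes)
    matched = sum(weights[gene] for gene in genes
                  if component_similar(profile[gene], reference[gene]))
    return matched / total >= min_fraction

# Made-up reference and subject expression levels for five genes.
reference = {"g1": 100.0, "g2": 500.0, "g3": 50.0, "g4": 1000.0, "g5": 250.0}
subject = {"g1": 110.0, "g2": 480.0, "g3": 120.0, "g4": 950.0, "g5": 260.0}
call = overall_similar(subject, reference)
```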
  • the gene expression-based methods can also be combined with other clinical tests to improve the accuracy of leukemia diagnosis or assessment.
  • the weighted voting algorithm is capable of assigning a class membership to a subject of interest. See Golub et al., supra, and Slonim et al., supra. Software programs suitable for this purpose include, but are not limited to, the GeneCluster 2 software (Broad Institute, Cambridge, MA).
  • a subject of interest is being assigned to one of two classes (i.e., class 0 and class 1), each class representing a different disease status (e.g., AML, MDS, or disease-free).
  • class 0 can include disease-free humans and class 1 includes MDS (or AML) patients.
  • class 0 can include AML patients and class 1 includes MDS patients.
  • a set of AML or MDS disease genes can be selected from Tables 2, 4, or 6 to form a classifier (i.e., class predictor). Each gene in the classifier casts a weighted vote for one of the two classes (class 0 or class 1).
  • $b_g$ equals $[x_0(g) + x_1(g)]/2$, which is the average of the mean logs of the expression levels of gene "g" in class 0 and class 1.
  • $x_g$ represents the normalized log of the expression level of gene "g" in the sample of interest.
  • a positive $v_g$ indicates a vote for class 0, and a negative $v_g$ indicates a vote for class 1.
  • $V_0$ denotes the sum of all positive votes, and $V_1$ denotes the absolute value of the sum of all negative votes.
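A minimal sketch of weighted voting is given below. The vote formula is not stated in the passage above; the sketch assumes the form used in the weighted voting method of Golub et al., in which each gene casts a vote equal to a_g × (x_g − b_g), where a_g is the gene's signal-to-noise score. The prediction strength (difference of vote totals over their sum) is likewise an assumption taken from that method. All gene names and numbers are invented.

```python
def weighted_vote(sample, predictor):
    """predictor maps gene -> (a_g, b_g); sample maps gene -> x_g.
    Assumed vote: v_g = a_g * (x_g - b_g). Positive votes count toward
    class 0 (total V0); negative votes count toward class 1 (total V1)."""
    v0 = v1 = 0.0
    for gene, (a_g, b_g) in predictor.items():
        v_g = a_g * (sample[gene] - b_g)
        if v_g > 0:
            v0 += v_g
        else:
            v1 += -v_g
    winner = 0 if v0 > v1 else 1
    # Assumed confidence measure: margin of victory over total votes.
    strength = abs(v0 - v1) / (v0 + v1) if (v0 + v1) else 0.0
    return winner, strength

# Hypothetical three-gene class predictor and one sample of interest.
predictor = {"g1": (2.0, 7.0), "g2": (-1.5, 8.0), "g3": (1.0, 6.5)}
sample = {"g1": 8.0, "g2": 7.0, "g3": 6.0}
winner, strength = weighted_vote(sample, predictor)
```

A low prediction strength can be used as a reason to withhold a class call rather than force an assignment.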
  • Cross-validation can be used to evaluate the accuracy of a class predictor created under the weighted voting algorithm.
  • cross-validation includes withholding a sample which has been used in the neighborhood analysis for the identification of the disease genes. A class predictor is created based on the remaining samples, and then used to predict the class of the sample withheld. This process is repeated for each sample that has been used in the neighborhood analysis. Class predictors with different leukemia disease genes are evaluated by cross-validation, and the best class predictor with the most accurate prediction can be identified.
  • any number of leukemia disease genes can be employed in the present invention.
  • at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, or more genes selected from Table 2 are used for the diagnosis or evaluation of the effectiveness of a treatment of AML in a subject of interest.
  • at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, or more genes selected from Table 4 are used for the diagnosis or evaluation of the effectiveness of a treatment of MDS in a subject of interest.
  • at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, or more genes selected from Table 6 are used for the diagnosis or evaluation of the progression or treatment of MDS in a subject of interest.
  • a combination of genes selected from Tables 2, 4, or 6 is used for the diagnosis or evaluation of the progression or treatment of AML or MDS in a subject of interest.
  • the leukemia disease gene(s) employed in the present invention can be selected to have p-values of no greater than 0.05, 0.01, 0.005, 0.001, 0.0005, 0.0001, or less.
  • the leukemia disease gene(s) can also be selected to include gene(s) that is upregulated in leukemia patients as compared to in disease-free humans, as well as gene(s) that is downregulated in leukemia patients as compared to in disease-free humans.
  • the leukemia disease gene(s) can also be selected through the use of optimization algorithms such as the mean variance algorithm as described in U.S. Patent Application 20040214179.
  • the leukemia disease genes in a class predictor can be selected such that they are significantly correlated with the class distinction in the neighborhood analysis. For instance, the leukemia disease genes that are above the 1%, 5%, or 10% significance level in the neighborhood analysis can be selected. See Golub et al, supra, and Slonim et al, supra.
  • the leukemia disease genes in a class predictor can also include top upregulated leukemia disease gene(s), as well as top downregulated leukemia disease gene(s).
  • a class predictor employed in the present invention comprises or consists of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 16, 18, 20, 40, or more genes selected from Table 2 or Table 4.
  • the class predictor can include at least two groups of genes.
  • the first group includes gene or genes having AML/Disease-Free ratios (or MDS/Disease-Free ratios) of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, or more
  • the second group includes gene or genes having AML/Disease-Free ratios (or MDS/Disease-Free ratios) of no greater than 0.5, 0.333, 0.25, 0.2, 0.1, or less.
  • a class predictor employed in the present invention comprises or consists of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 16, 18, 20, 40, or more genes selected from Table 6.
  • the class predictor can also include at least two groups of genes.
  • the first group includes gene or genes having AML/MDS ratios of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, or more
  • the second group includes gene or genes having AML/MDS ratios of no greater than 0.5, 0.333, 0.25, 0.2, 0.1, or less.
  • the present invention also contemplates the use of other leukemia disease genes for the diagnosis or assessment of the progression or treatment of leukemia in a subject of interest.
  • the RNA transcript (or the complement thereof, depending on the strandedness of the oligonucleotide probes of the qualifier) of such a gene can hybridize to at least one oligonucleotide probe of the qualifier.
  • the RNA transcript (or the complement thereof) of the gene can hybridize under stringent or nucleic acid array hybridization conditions to at least 50%, 60%, 70%, 80%, 90% or 100% of the oligonucleotide probes of the qualifier and produce a "present" call at the qualifier on an Affymetrix Genechip® under the default settings (i.e., the threshold Tau is 0.015 and the significance level $\alpha_1$ is 0.4). See Genechip® Expression Analysis - Data Analysis Fundamentals (Part No. 701190 Rev. 2, Affymetrix, Inc., 2002).
  • a clinical challenge concerning AML, MDS and other blood or bone marrow diseases is the highly variable response of patients to a therapy.
  • the basic concept of pharmacogenomics is to understand a patient's genotype in relation to available treatment options and then individualize the most appropriate option for the patient.
  • different classes of patients can be created based on their different responses to a given therapy.
  • Genes differentially expressed in un-fractionated PBMCs of one response class as compared to in another response class can be identified using the global gene expression analysis. These genes are molecular markers for predicting whether a patient of interest will be more or less responsive to the therapy. For patients predicted to have a favorable outcome, efforts to minimize toxicity of the therapy may be considered, whereas for those predicted not to respond to the therapy, treatment with other therapies or experimental regimes can be explored.
  • patients are grouped into at least two classes (class 0 and class 1).
  • Class 0 includes patients who die within a specified period of time (such as one year) after initiation of a treatment.
  • Class 1 includes patients who survive beyond the specified period of time after initiation of the treatment.
  • Genes that are differentially expressed in un-fractionated PBMCs of class 0 patients as compared to in un-fractionated PBMCs of class 1 patients can be identified. These genes are prognostic markers of patient clinical outcome.
  • Other clinical outcome criteria such as remission/non-remission, time to progression, complete response, partial response, stable disease, or progressive disease, can also be used to group leukemia patients to identify the corresponding prognostic genes.
  • the leukemia disease genes of the present invention can also be used to identify or test drugs for the treatment of AML or MDS.
  • the ability of a drug candidate to reduce or abolish the abnormal expression of AML or MDS disease genes in un-fractionated PBMCs is suggestive of the effectiveness of the drug candidate in treating AML or MDS.
  • Methods for screening or evaluating drug candidates are well known in the art. These methods can be carried out either in animal models or during human clinical trials.
  • the present invention also contemplates expression vectors encoding AML or MDS disease genes. These AML or MDS disease genes may be under-expressed in AML or MDS tumor cells. By introducing the expression vectors into the patients in need thereof, abnormal expression of these genes can be corrected. Expression vectors and gene delivery techniques suitable for this purpose are well known in the art.
  • this invention contemplates sequences that are antisense to AML or MDS disease genes or expression vectors encoding the same.
  • the AML or MDS disease genes may be over-expressed in AML or MDS tumor cells. By introducing the antisense sequences or expression vectors encoding the same, abnormal expression of these disease genes can be corrected.
  • Expression of an AML or MDS disease gene can also be inhibited by RNA interference ("RNAi").
  • RNAi is a technique used in post-transcriptional gene silencing ("PTGS"), in which the targeted gene activity is specifically abolished.
  • RNAi resembles in many aspects PTGS in plants and has been detected in many invertebrates including trypanosome, hydra, planaria, nematode and fruit fly (Drosophila melanogaster). It may be involved in the modulation of transposable element mobilization and antiviral state formation.
  • RNAi in mammalian systems is disclosed in PCT application WO00/63364.
  • dsRNA of at least about 21 nucleotides is introduced into cells to silence the expression of the target gene.
  • the present invention features antibodies that specifically recognize the polypeptides encoded by AML or MDS disease genes. These antibodies can be administered to patients in need thereof.
  • an antibody of the present invention can substantially reduce or inhibit the activity of a disease gene.
  • the antibody can reduce the activity of a disease gene by at least about 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or more.
  • Suitable antibodies for the present invention include, but are not limited to, polyclonal antibodies, monoclonal antibodies, chimeric antibodies, humanized antibodies, single chain antibodies, Fab fragments, or fragments produced by a Fab expression library.
  • the antibodies of the present invention can bind to the respective AML or MDS disease gene products or other desired antigens with a binding affinity constant Ka of at least 10⁶ M⁻¹, 10⁷ M⁻¹, 10⁸ M⁻¹, 10⁹ M⁻¹, or more.
  • a pharmaceutical composition comprising an antibody or a polynucleotide of the present invention can be prepared.
  • the pharmaceutical composition can be formulated to be compatible with its intended route of administration. Examples of routes of administration include, but are not limited to, parenteral, intravenous, intradermal, subcutaneous, oral, inhalational, transdermal, topical, transmucosal, and rectal administration. Methods for preparing desirable pharmaceutical compositions are well known in the art.
  • kits or apparatuses for diagnosing or monitoring the progression or treatment of AML or MDS.
  • a kit or apparatus of the present invention includes or consists essentially of one or more polynucleotides, each of which is capable of hybridizing under stringent conditions to a gene selected from Tables 2, 4, or 6.
  • the polynucleotide(s) can be labeled with fluorescent, radioactive, or other detectable moieties.
  • the polynucleotide(s) can be also un-labeled. Any number of polynucleotides can be included in a kit or apparatus.
  • polynucleotides can be included in a kit or apparatus, each polynucleotide being capable of hybridizing under stringent conditions to a different respective gene selected from Tables 2, 4, or 6.
  • the polynucleotide(s) included in a kit or apparatus is enclosed in a vial, a tube, a bottle or another containing means.
  • the polynucleotide(s) is stably attached to one or more substrates. Nucleic acid hybridization can be directly conducted on the substrate(s).
  • Hybridization reagents can also be included in a kit or apparatus of the present invention.
  • a kit or apparatus of the present invention includes or consists essentially of one or more antibodies specific for the polypeptide(s) encoded by the gene(s) selected from Tables 2, 4, or 6.
  • the antibody or antibodies can be labeled with one or more detectable moieties to allow for detection of antibody-antigen complexes.
  • the antibody or antibodies can also be un-labeled.
  • Any number of antibodies can be included in a kit or apparatus. For instance, at least 1, 2, 3, 4, 5, 10, 15, 20, or more antibodies can be included in a kit or apparatus, and each of these antibodies can specifically recognize a different respective AML or MDS disease gene product.
  • Immunodetection reagents can also be included in a kit or apparatus of the present invention.
  • a kit of the present invention includes one or more containers which enclose the antibody or antibodies.
  • the antibody or antibodies in an apparatus of the present invention are stably attached to one or more substrates. Substrates suitable for this purpose include, but are not limited to, films, membranes, column matrices, or microtiter plate wells.
  • the present invention features systems capable of comparing an expression profile of interest to at least one reference expression profile.
  • the reference expression profiles are stored in a database.
  • the comparison between the expression profile of interest and the reference expression profile(s) can be carried out electronically, such as by using a computer system.
  • the computer system typically comprises a processor coupled to a memory which stores data representing the expression profiles to be compared.
  • the memory is readable as well as rewritable.
  • the expression profiles can be retrieved or modified.
  • the computer system includes one or more programs capable of causing the processor to compare the expression profiles.
  • the computer system includes a program capable of executing a weighted voting or a k-nearest-neighbors algorithm.
  • the computer system is coupled to a nucleic acid array from which hybridization signals can be directly fed into the system.
  • Kits for prognosis, diagnosis or selection of treatment of MDS, AML, and other leukemias
  • kits useful for the prognosis, diagnosis or selection of treatment of MDS, AML or other leukemias. Each kit includes or consists essentially of at least one probe for a leukemia disease gene (e.g., a gene selected from Tables 2, 4, or 6). Reagents or buffers that facilitate the use of the kit can also be included. Any type of probe can be used in the present invention, such as hybridization probes, amplification primers, or antibodies.
  • a kit of the present invention includes or consists essentially of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more polynucleotide probes or primers.
  • Each probe/primer can hybridize under stringent conditions or nucleic acid array hybridization conditions to a different respective leukemia disease gene.
  • a polynucleotide can hybridize to a gene if the polynucleotide can hybridize to an RNA transcript, or the complement thereof, of the gene.
  • a kit of the present invention includes one or more antibodies, each of which is capable of binding to a polypeptide encoded by a different respective leukemia disease gene.
  • a kit of the present invention includes or consists essentially of probes (e.g., hybridization or PCR amplification probes or antibodies) for at least 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75 or more genes selected from Table 2, 4 or 6.
  • the probes employed in the present invention can be either labeled or unlabeled.
  • Labeled probes can be detectable by spectroscopic, photochemical, biochemical, bioelectronic, immunochemical, electrical, optical, chemical, or other suitable means.
  • Exemplary labeling moieties for a probe include radioisotopes, chemiluminescent compounds, labeled binding proteins, heavy metal atoms, spectroscopic markers, such as fluorescent markers and dyes, magnetic labels, linked enzymes, mass spectrometry tags, spin labels, electron transfer donors and acceptors, and the like.
  • kits of the present invention can also have containers containing buffer(s) or reporter means.
  • the kits can include reagents for conducting positive or negative controls.
  • the probes employed in the present invention are stably attached to one or more substrate supports. Nucleic acid hybridization or immunoassays can be directly carried out on the substrate support(s). Suitable substrate supports for this purpose include, but are not limited to, glasses, silica, ceramics, nylons, quartz wafers, gels, metals, papers, beads, tubes, fibers, films, membranes, column matrices, or microtiter plate wells.
  • the kits of the present invention may also contain one or more controls, each representing a reference expression level of a disease gene detectable by one or more probes contained in the kits.
  • MDS patients were primarily of Caucasian descent and had a mean age of 66 years (range of 52-84 years).
  • AML patients were exclusively of Caucasian descent and had a mean age of 45 years (range of 19-65 years).
  • Disease-free volunteers were exclusively of Caucasian descent with a mean age of 23 years (range of 18-32 years).
  • Inclusion criteria for AML patients included blasts in excess of 20% in the bone marrow, morphologic diagnosis of AML according to the FAB classification system and flow cytometry analysis indicating CD33+ status.
  • Inclusion criteria for MDS patients included morphologic diagnosis of MDS and FAB classification as refractory anemia, refractory anemia with ringed sideroblasts, refractory anemia with excess blasts, or refractory anemia with excess blasts in transformation (where disease stability had been demonstrated for a minimum of 2 months).
  • the blood samples were drawn into CPT Cell Preparation Vacutainer Tubes (Becton Dickinson).
  • PBMCs were isolated over Ficoll gradients according to the manufacturer's protocol (Becton Dickinson). Total RNA was isolated from un-fractionated PBMC pellets using Qiagen RNeasy® mini-kits (Qiagen, Valencia, CA).
  • Labeled target for oligonucleotide arrays was prepared using a modification of the procedure described in Lockhart et al. (1996) Nature Biotechnology 14:1675-1680. Two micrograms of total RNA were converted to cDNA using an oligo-d(T)24 primer containing a T7 DNA polymerase promoter at the 5' end. The cDNA was used as the template for in vitro transcription using a T7 DNA polymerase kit (Ambion, Woodlands, TX, USA) and biotinylated CTP and UTP (Enzo, Farmingdale, NY, USA). Labeled cRNA was fragmented in 40 mM Tris-acetate pH 8.0, 100 mM KOAc, 30 mM MgOAc for 35 min at 94 °C in a final volume of 40 µl.
  • HG-U133A Genechips® (Affymetrix)
  • 10 µg of labeled target is diluted in 1x MES buffer with 100 µg/ml herring sperm DNA and 50 µg/ml acetylated BSA.
  • in vitro synthesized transcripts of 11 bacterial genes are included in each hybridization reaction, as described in Hill et al. (2000) Science 290:809-812.
  • the abundance of these transcripts ranges from 1:300,000 (3 ppm) to 1:1000 (1000 ppm) stated in terms of the number of control transcripts per total transcripts.
  • the sensitivity of detection of the arrays can range between about 1:300,000 and 1:100,000 copies/million.
  • Labeled probes are denatured at 99 °C for 5 minutes and then 45 °C for 5 minutes and hybridized to oligonucleotide arrays comprised of over 12,500 human genes (HG-U133A, Affymetrix). Arrays are hybridized for 16 hours at 45 °C.
  • the hybridization buffer includes 100 mM MES, 1 M [Na+], 20 mM EDTA, and 0.01% Tween 20. After hybridization, the cartridges are washed extensively with wash buffer 6x SSPET (e.g., three times at room temperature for at least 10 minutes each time).
  • these hybridization and washing conditions are collectively referred to as "nucleic acid array hybridization conditions."
  • the washed cartridges are subsequently stained with phycoerythrin coupled to streptavidin.
  • 12x MES stock contains 1.22 M MES and 0.89 M [Na+].
  • the stock can be prepared by mixing 70.4 g MES free acid monohydrate, 193.3 g MES sodium salt and 800 ml of molecular biology grade water, and adjusting volume to 1000 ml.
  • the pH should be between 6.5 and 6.7.
  • 2x hybridization buffer can be prepared by mixing 8.3 ml of 12x MES stock, 17.7 ml of 5 M NaCl, 4.0 ml of 0.5 M EDTA, 0.1 ml of 10% Tween 20 and 19.9 ml of water.
  • 6x SSPET contains 0.9 M NaCl, 60 mM NaH2PO4, 6 mM EDTA, pH 7.4, and 0.005% Triton X-100.
  • the wash buffer is replaced with a more stringent wash buffer. 1000 ml of the stringent wash buffer can be prepared by mixing 83.3 ml of 12x MES stock, 5.2 ml of 5 M NaCl, 1.0 ml of 10% Tween 20 and 910.5 ml of water.
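The buffer recipes above can be checked with simple dilution arithmetic (final concentration = sum of stock concentration × volume, divided by total volume). The sketch below is a hedged sanity check, not a preparation protocol; the helper name `final_conc` is hypothetical, and sodium contributed by the EDTA stock is ignored as a simplifying assumption.

```python
def final_conc(components, total_ml):
    """Final concentration from (stock_concentration, volume_ml) pairs in a total volume."""
    return sum(conc * vol for conc, vol in components) / total_ml


# 2x hybridization buffer: volumes (ml) taken from the recipe above.
total_2x = 8.3 + 17.7 + 4.0 + 0.1 + 19.9            # total volume in ml
mes_2x = final_conc([(1.22, 8.3)], total_2x)         # M MES from the 12x stock
na_2x = final_conc([(0.89, 8.3), (5.0, 17.7)], total_2x)  # M Na+ (stock + NaCl only)
edta_2x = final_conc([(0.5, 4.0)], total_2x)         # M EDTA
tween_2x = final_conc([(10.0, 0.1)], total_2x)       # % Tween 20

# Stringent wash buffer (1000 ml): 12x MES stock, 5 M NaCl, 10% Tween 20, water.
total_wash = 83.3 + 5.2 + 1.0 + 910.5                # total volume in ml
mes_wash = final_conc([(1.22, 83.3)], total_wash)
na_wash = final_conc([(0.89, 83.3), (5.0, 5.2)], total_wash)

print(total_2x, mes_2x, na_2x, edta_2x, tween_2x)
print(total_wash, mes_wash, na_wash)
```

Under these assumptions the 2x buffer works out to about 0.20 M MES, about 1.92 M Na+, 40 mM EDTA and 0.02% Tween 20, so a 1:1 dilution gives roughly the 100 mM MES, 1 M [Na+], 20 mM EDTA and 0.01% Tween 20 stated for the hybridization buffer. The stringent wash buffer comes out near 0.1 M MES and 0.1 M Na+.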
  • Genechip® 3.2 software uses algorithms to calculate the likelihood as to whether a gene is "absent” or “present” as well as a specific hybridization intensity value or "average difference” for each transcript represented on the array. The algorithms used in these calculations are described in the Affymetrix Genechip® Analysis Suite User Guide.
  • transcripts can be evaluated further if they meet the following criteria.
  • genes that are designated "absent" by the Genechip® 3.2 software in all samples are excluded from the analysis.
  • a fourth criterion which requires that average fold changes in frequency values across the statistically significant subset of genes be 2-fold or greater, is also used.
  • Unsupervised hierarchical clustering of genes and/or arrays on the basis of similarity of their expression profiles can be performed using the procedure described in Eisen et al. (1998) Proc. Nat. Acad. Sci. U.S.A., 95: 14863-14868.
  • Nearest-neighbor prediction analysis and supervised cluster analysis can be performed using metrics illustrated in Golub et al., supra.
  • data can be first log-transformed and then normalized to have a mean value of zero and a variance of one.
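The normalization step above can be sketched as follows. The source does not state the logarithm base or whether the variance is the population or sample variance; base 2 and population variance are assumptions made here so that the result has a variance of exactly one. The function name `log_standardize` and the input values are hypothetical.

```python
import math
import statistics


def log_standardize(values):
    """Log2-transform positive expression values, then scale to mean 0 and variance 1."""
    logs = [math.log2(v) for v in values]
    mean = statistics.fmean(logs)
    sd = statistics.pstdev(logs)  # population standard deviation (assumption)
    return [(x - mean) / sd for x in logs]


# Invented expression values for one gene across four samples.
normalized = log_standardize([8.0, 16.0, 32.0, 64.0])
print(normalized)
```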
  • a Student's t-test can be used to compare disease-free, AML and MDS PBMC expression profiles.
  • a p value of no more than 0.05 (e.g., no more than 0.01, 0.001, or less)
  • the measures of correlation for the most statistically significant genes observed in real class distinctions can be compared to the most statistically significant measures of correlation observed in randomly permuted class distinctions.
  • the top 1%, 5% and median distance measurements of 100 randomly permuted classes compared to the observed distance measurements for AML versus disease-free, MDS versus disease-free, or AML versus MDS can be plotted to show the statistical verification of the leukemia disease genes identified by this invention.
  • a 24-qualifier signature (8 cDNAs representing 7 genes defining AML, 8 cDNAs representing 7 genes defining MDS, and 8 cDNAs representing 8 genes defining disease-free) was identified. This signature can accurately predict and classify PBMC samples of disease-free individuals, MDS patients, or AML patients. This signature also identifies rapid MDS progressors as "AML,” with implications for early detection of AML progression in MDS patients.
  • the qualifiers in the 24-qualifier signature are listed in Table 8, below.
  • the signal-to-noise value associated with the qualifier is provided in the column labeled "Score.”
  • Each signal-to-noise value was greater than the value in the adjacent "Perm 1%" column, representing the signal-to-noise values observed for the top 1% of random permutations when the labels of the profiles were scrambled and then compared using identical class sizes.
  • the actual signal-to-noise values for the qualifiers were superior to those in the top 1% of random permutations.
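The comparison of an observed score with scores from scrambled labels can be sketched for a single gene as below. This assumes the signal-to-noise metric in the form commonly attributed to Golub et al., (mean of class A − mean of class B) / (SD of class A + SD of class B); the source does not spell out the formula, and the per-gene permutation shown here is a simplification of a genome-wide permutation analysis. All data values and function names are hypothetical.

```python
import random
import statistics


def signal_to_noise(class_a, class_b):
    """(mean_a - mean_b) / (sd_a + sd_b), using sample standard deviations."""
    return (statistics.fmean(class_a) - statistics.fmean(class_b)) / (
        statistics.stdev(class_a) + statistics.stdev(class_b)
    )


def permutation_threshold(class_a, class_b, n_perm=100, top_fraction=0.01, seed=0):
    """Score at the top `top_fraction` of |signal-to-noise| over label permutations.

    Labels are scrambled while keeping the class sizes identical.
    """
    rng = random.Random(seed)
    pooled = list(class_a) + list(class_b)
    n_a = len(class_a)
    scores = []
    for _ in range(n_perm):
        rng.shuffle(pooled)
        scores.append(abs(signal_to_noise(pooled[:n_a], pooled[n_a:])))
    scores.sort(reverse=True)
    index = max(0, int(n_perm * top_fraction) - 1)
    return scores[index]


# Invented log-scale expression values for one gene.
aml = [9.1, 8.7, 9.4, 8.9, 9.0]
disease_free = [5.2, 5.9, 5.5, 6.1, 5.0]

observed = signal_to_noise(aml, disease_free)
threshold = permutation_threshold(aml, disease_free)
print(observed, threshold)
```

A gene would be retained in this sketch when the absolute observed score is at least the permutation threshold, which mirrors the "Score" versus "Perm 1%" comparison described above.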
  • the corresponding human genes are identified by name, by symbol, by chromosomal location ("Cyto Band”), by Unigene number, and by GenBank accession number.
  • Human genes used to identify AML include human myb; human neuronal protein 3.1; human myeloperoxidase; human catalase; human CGI-49; human stem cell growth factor; and human serine peptidase inhibitor, Kazal type 2 (acrosin-trypsin inhibitor).
  • Human genes used to identify MDS include human NEDD4L; human glutathione peroxidase 3; human X-linked Kx blood group; human synuclein, alpha; human chromosome 8 open reading frame 51/hypothetical protein MGC3113; human interferon, alpha-inducible protein 27; and human transglutaminase 3.
  • Human genes used to identify PBMCs from disease-free individuals include human chromosome 21 open reading frame 7; human amyloid beta A4 precursor protein-binding family A member 2; human KIAA0449; human F-box only protein 21; human death effector filament-forming ced-4-like apoptosis protein; human zinc finger protein 14; human vasoactive intestinal peptide receptor 1; and human KIAA0443.
  • a supervised approach on a training set of healthy, AML and non-progressor MDS samples was used to identify a gene classifier correlated with profiles in healthy individuals, stable MDS patients, and AML patients.
  • An 8-gene classifier was optimally predictive, exhibiting an overall accuracy of 94% in the training set (62/66 subjects correctly assigned by leave-one-out cross validation).
  • One of the four misclassified samples in the training set was from an MDS patient with a conflicting diagnosis who was "misclassified” upon cross validation as AML.
  • this classifier identified the remaining unambiguous samples in the test set with similar accuracy (87% overall accuracy of class assignment).
  • This 8-gene predictor also assigned both samples from patients with conflicting diagnosis and all three samples from MDS patients with rapid times to disease progression as originating from patients with AML.
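Leave-one-out cross validation, as mentioned above, holds out each sample in turn, trains on the rest, and counts how often the held-out sample is assigned correctly (62 of 66 corresponds to about 94%). The sketch below uses a k-nearest-neighbors vote as a stand-in classifier; the source does not state which algorithm produced the reported accuracy, and the two-feature data points and function names are invented for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority vote among the k training samples closest to `query` (Euclidean distance)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(samples, k=3):
    """Fraction of samples classified correctly when each is held out in turn."""
    correct = 0
    for i, (vector, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if knn_predict(rest, vector, k) == label:
            correct += 1
    return correct / len(samples)


# Invented two-gene expression vectors for three classes.
samples = [
    ((0.0, 0.0), "disease-free"), ((0.1, 0.2), "disease-free"), ((0.2, 0.1), "disease-free"),
    ((5.0, 5.0), "MDS"), ((5.1, 5.2), "MDS"), ((5.2, 4.9), "MDS"),
    ((10.0, 0.0), "AML"), ((10.2, 0.1), "AML"), ((9.9, 0.3), "AML"),
]

accuracy = leave_one_out_accuracy(samples, k=3)
print(accuracy)
```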

Abstract

The invention features the use of whole blood samples or samples comprising peripheral blood mononuclear cells (PBMCs) for diagnosing or evaluating the progression or treatment of leukemia. Genes that are differentially expressed in un-fractionated PBMCs of leukemia patients, as compared to in disease-free humans or in patients who have a different type of leukemia, can be identified according to the invention. These genes are leukemia disease genes and can be used to detect the presence or progression of leukemia in a subject of interest. Leukemias that are amenable to the present invention include, but are not limited to, acute myelogenous leukemia (AML) and myelodysplastic syndromes (MDS). Non-limiting examples of AML or MDS disease genes are provided in Tables 2, 4 and 6.

Description

PCT International Application
Express Mail Mailing Label No. EV832483451US
Attorney Docket No. WYE-O35PC
LEUKEMIA DISEASE GENES AND USES THEREOF
TECHNICAL FIELD
[0001] This invention relates to leukemia disease genes and methods of using the same for diagnosis and treatment of leukemia.
BACKGROUND
[0002] Myelodysplastic syndromes (MDS) are a heterogeneous group of clonal disorders of bone marrow cell precursors characterized by variable clinical courses and outcomes. Approximately 30 percent of patients with MDS eventually progress to acute myelogenous leukemia (AML) and a clinical diagnostic assay especially suited to early identification of this subset of patients would help focus therapeutic options in these individuals.
[0003] Recent expression profiling studies have revealed differences in AC133+ hematopoietic stem cell fractions from patients with MDS versus AML (Miyazato et al. (2001) Blood 98:422-427). Similar results have been observed in transcriptional profiles of CD34+ cells purified from bone marrow of patients with MDS, which are radically altered from the transcriptional profiles of CD34+ cells from disease-free individuals (Hofmann et al. (2002) Blood 100:3553-3560). These studies, however, involved positive selection of specific cell subtypes, which is laborious and time-consuming.
SUMMARY OF THE INVENTION
[0004] The present invention features the use of peripheral blood samples containing peripheral blood mononuclear cells (PBMCs) for diagnosis or evaluation of the progression or treatment of AML and MDS. The present invention does not require positive selection of specific cell subtypes from the blood sample, thereby allowing rapid diagnosis and assessment of a leukemia. Accordingly, peripheral blood samples suitable for the present invention include, but are not limited to, whole blood samples or samples comprising un-fractionated PBMCs. In many cases, the peripheral blood samples employed comprise enriched un-fractionated PBMCs. By "enriched," it is meant that the percentage of PBMCs in a sample is higher than that in whole blood. In many cases, at least 75%, 80%, 85%, 90%, 95%, 99%, or 100% of the cells in an enriched sample are PBMCs. Enriched un-fractionated PBMCs can be prepared from whole blood by Ficoll gradient centrifugation or using cell purification tubes (CPTs). Other conventional methods can also be used to prepare enriched un-fractionated PBMCs.
[0005] The invention provides genes whose expression profiles are indicative of the existence, status, progression or treatment of a leukemia. Leukemias that are amenable to the present invention include, but are not limited to, AML and MDS. For example, Table 4 recites genes differentially expressed in PBMCs from MDS patients versus PBMCs from disease-free subjects. Table 6 recites genes differentially expressed in PBMCs from AML patients versus PBMCs from MDS patients. Table 8 recites genes whose expression levels are useful for distinguishing humans with AML from humans with MDS, humans with AML from disease-free humans, and humans with MDS from disease-free humans. Acute lymphoblastic leukemia, chronic lymphocytic leukemia, chronic myelogenous leukemia, or hairy cell leukemia may also be analyzed according to the present invention.
[0006] Thus, in one aspect, the invention provides methods for diagnosis, or monitoring the occurrence, development, progression or treatment of leukemia (such as, for example, AML or MDS) in a subject using genes from Table 4 or Table 6. The methods include generating a gene expression profile from a peripheral blood sample from the subject and comparing the gene expression profile to one or more reference expression profiles (e.g. an expression profile representing a disease-free human, an expression profile representing a human with a leukemia, or an expression profile representing a human of borderline diagnosis). The gene expression profile and reference expression profiles include the expression patterns of one or more genes selected from Table 4 or 6 in PBMCs. In some embodiments, genes different from those recited in Table 2 are selected from Table 4 or 6, although genes recited in Table 2 can additionally be included. In some embodiments, the genes selected from Table 4 or 6 are those also recited in Table 8. The difference or similarity between the gene expression profile and the one or more reference expression profiles is indicative of the presence, absence, occurrence, development, progression, or effectiveness of treatment of leukemia in the subject. The gene expression profile and the reference expression profiles can include the expression pattern of only one gene or of two or more (e.g. three or more, four or more, five or more, six or more, eight or more, ten or more, fifteen or more, twenty or more, forty or more, sixty or more, 100 or more, 200 or more, 300 or more, or 400 or more). In some embodiments, smaller numbers of genes (e.g. two, up to three, up to four, up to five, up to six, up to eight, up to ten, up to fifteen, up to twenty, up to forty, up to sixty, up to 100, or up to 200) are used. 
[0007] The expression profile of the leukemia disease gene(s) in a subject of interest can be determined by measuring the RNA transcript level of each of the gene(s) in a peripheral blood sample of the subject. Methods suitable for this purpose include, but are not limited to, quantitative RT-PCR, nucleic acid arrays, Northern blot, in situ hybridization, slot-blotting, and nuclease protection assay. The expression profile of the leukemia disease gene(s) can also be determined by measuring the protein product level of each of the gene(s) in the peripheral blood sample of the subject. Methods suitable for this purpose include, but are not limited to, immunoassays (e.g., ELISA, RIA, FACS, or Western Blot), protein arrays, two-dimensional gel electrophoresis, and mass spectroscopy.
[0008] A typical reference expression profile employed in the present invention includes values or ranges that are suggestive of the expression pattern of the leukemia disease gene(s) in peripheral blood samples of disease-free humans or patients with known leukemias. In one example, a reference expression profile comprises the average expression levels of each of the leukemia disease gene(s) in peripheral blood samples of disease-free humans. In another example, a reference expression profile comprises the average expression levels of each of the leukemia disease gene(s) in peripheral blood samples of patients having a known leukemia. In still another embodiment, a reference expression profile comprises two or more individual expression profiles, each of which is the expression profile of the leukemia disease gene(s) in a peripheral blood sample of a different leukemia patient or disease-free human. In a further embodiment, a reference expression profile comprises ranges that reflect variations in the expression levels of each of the leukemia disease gene(s) in peripheral blood samples of disease-free humans or patients with known leukemias.
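A reference expression profile built from averages and ranges, as described in paragraph [0008], can be sketched as follows. This is an illustration under stated assumptions: the gene names, expression values and helper names (`build_reference`, `outside_range`) are hypothetical, and the minimum and maximum observed levels are used as one simple way of expressing a range.

```python
import statistics


def build_reference(profiles):
    """Build gene -> (mean, low, high) from a list of gene -> expression-level dicts."""
    reference = {}
    for gene in sorted(profiles[0]):
        levels = [profile[gene] for profile in profiles]
        reference[gene] = (statistics.fmean(levels), min(levels), max(levels))
    return reference


def outside_range(profile, reference):
    """Genes whose level in `profile` falls outside the reference range."""
    return [
        gene
        for gene, (_, low, high) in reference.items()
        if not low <= profile[gene] <= high
    ]


# Invented expression levels from three disease-free individuals.
disease_free_profiles = [
    {"GENE_A": 10.0, "GENE_B": 100.0},
    {"GENE_A": 14.0, "GENE_B": 90.0},
    {"GENE_A": 12.0, "GENE_B": 110.0},
]
reference = build_reference(disease_free_profiles)

# Invented profile for a subject of interest.
subject = {"GENE_A": 40.0, "GENE_B": 95.0}
flagged = outside_range(subject, reference)
print(reference, flagged)
```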
[0009] A reference expression profile employed in the present invention can be prepared using the same type of peripheral blood samples as the peripheral blood sample of the subject of interest and following the same preparation procedure and methodology. A reference expression profile can be predetermined or prerecorded. It can also be determined concurrently with or after the measurement of the expression profile of the subject of interest.
[0010] The comparison of the expression profile of a subject of interest to a reference expression profile can be performed manually or electronically. The difference or similarity between the expression profile of the subject of interest and the reference expression profile is indicative of the presence or absence, or progression or non-progression, of leukemia in the subject.
[0011] In some embodiments, the expression level of each of the leukemia disease genes employed in the comparison is correlated with a class distinction under a nearest-neighbor analysis or a significance analysis of microarrays. The class distinction represents an ideal expression pattern of the gene in un-fractionated PBMCs of disease-free humans and patients who have a specified leukemia (e.g., uniformly high in PBMCs of the disease-free humans and uniformly low in PBMCs of the leukemia patients, or vice versa). The disease status of a subject of interest (disease-free versus leukemia) can be predicted by comparing the expression profile of the leukemia disease genes in the subject of interest to a reference expression profile of the same genes using a k-nearest-neighbors or weighted voting algorithm. Based on the comparison, the subject from whom the sample was taken can be diagnosed with leukemia or diagnosed as disease-free; or an existing leukemia can be assessed for changes, such as those associated with progression or treatment.
[0012] The invention also provides a general method for diagnosing or monitoring the occurrence, development, progression or treatment of MDS. The method includes generating a gene expression profile from a peripheral blood sample of a subject and comparing the gene expression profile to one or more reference expression profiles (e.g. an expression profile representing a disease-free human, an expression profile representing a human with MDS, an expression profile representing a human with a non-MDS leukemia such as AML, or an expression profile representing a human of borderline diagnosis). The gene expression profile and the one or more reference expression profiles comprise the expression patterns of one or more MDS disease genes in PBMCs. The difference or similarity between the gene expression profile and the one or more reference expression profiles is indicative of the presence, absence, occurrence, development, progression, or effectiveness of treatment of MDS in the subject. The MDS disease genes can optionally include one or more genes selected from Tables 4, 6, or 8. The gene expression profile and the reference expression profiles can include the expression pattern of only one gene or of two or more (e.g. three or more, four or more, five or more, six or more, eight or more, ten or more, fifteen or more, twenty or more, forty or more, sixty or more, 100 or more, 200 or more, 300 or more, or 400 or more). In some embodiments, smaller numbers of genes (e.g. two, up to three, up to four, up to five, up to six, up to eight, up to ten, up to fifteen, up to twenty, up to forty, up to sixty, up to 100, or up to 200) are used. The comparison of the gene expression profile to the reference expression profiles can be done, for example, by a k-nearest neighbor analysis or a weighted voting algorithm. Based on the comparison, the subject from whom the sample was taken can be diagnosed with MDS or diagnosed as MDS-free or disease-free; or an existing MDS can be assessed for changes, such as those associated with progression or treatment.
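A weighted voting comparison of the kind mentioned in this paragraph can be sketched as below. The sketch assumes the scheme commonly attributed to Golub et al.: each gene votes with weight (mean₁ − mean₂)/(SD₁ + SD₂) multiplied by the distance of the sample's expression from the midpoint of the two class means, and the class with the larger vote total wins. The source does not give the formula, so this form, the function names, and all data values are assumptions for illustration.

```python
import statistics


def train_weighted_voting(class1, class2):
    """Return one (weight, boundary) pair per gene from two lists of expression vectors."""
    params = []
    for g in range(len(class1[0])):
        x1 = [sample[g] for sample in class1]
        x2 = [sample[g] for sample in class2]
        m1, m2 = statistics.fmean(x1), statistics.fmean(x2)
        weight = (m1 - m2) / (statistics.stdev(x1) + statistics.stdev(x2))
        boundary = (m1 + m2) / 2  # midpoint between the class means
        params.append((weight, boundary))
    return params


def predict(params, sample, labels=("class1", "class2")):
    """Sum per-gene votes; return (winning label, prediction strength in [0, 1])."""
    votes = [weight * (x - boundary) for (weight, boundary), x in zip(params, sample)]
    v1 = sum(v for v in votes if v > 0)    # total vote for the first class
    v2 = -sum(v for v in votes if v < 0)   # total vote for the second class
    winner = labels[0] if v1 >= v2 else labels[1]
    strength = abs(v1 - v2) / (v1 + v2) if (v1 + v2) else 0.0
    return winner, strength


# Invented two-gene expression vectors for the two reference classes.
mds = [[8.0, 2.0], [9.0, 3.0], [10.0, 2.5]]
disease_free = [[3.0, 7.0], [2.0, 8.0], [4.0, 7.5]]
params = train_weighted_voting(mds, disease_free)

labels = ("MDS", "disease-free")
clear_call = predict(params, [9.5, 2.0], labels)
borderline_call = predict(params, [7.0, 6.0], labels)
print(clear_call, borderline_call)
```

The prediction strength gives a rough margin of victory; a low value, as in the second invented sample, could be treated as a borderline call rather than a firm class assignment.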
[0013] The invention also provides a method for identifying an MDS patient who is likely to progress to acute myelogenous leukemia (AML) using one or more genes from Table 6. The method includes generating a gene expression profile from a peripheral blood sample from an MDS patient and comparing the gene expression profile to one or more reference expression profiles (e.g. an expression profile representing a human with AML, an expression profile representing a human with MDS known to progress to AML, or an expression profile representing a human with MDS known not to progress to AML). The gene expression profile and the one or more reference expression profiles include the expression patterns in PBMCs of one or more leukemia disease genes selected from Table 6. The difference or similarity between the gene expression profile and the one or more reference expression profiles is indicative that the MDS patient is likely to progress to AML. The leukemia disease genes selected from Table 6 are optionally different from those recited in Table 2, although genes from Table 2 could also be included. The leukemia disease genes selected from Table 6 are optionally among those also recited in Table 8.
[0014] In another aspect, the present invention features methods for evaluating the effectiveness of a treatment of leukemia in a patient of interest. These methods comprise comparing an expression profile of at least one leukemia disease gene in a peripheral blood sample of the patient of interest to a reference expression profile of the same gene(s), where the peripheral blood sample is isolated from the patient after initiation of the treatment, and each of the leukemia disease gene(s) employed is differentially expressed in un-fractionated PBMCs of patients who have the leukemia being evaluated, as compared to in un-fractionated PBMCs of disease-free humans. In one example, the leukemia being assessed is MDS, and the leukemia disease gene(s) employed includes one or more genes selected from Table 4. An elimination or reduction in the difference between the expression profile of the leukemia disease gene(s) in the patient of interest and the corresponding expression profile in disease-free humans during the course of the treatment is indicative of the effectiveness of the treatment for the patient of interest. As compared to conventional methods, the gene expression profiling-based methods may have improved sensitivity for the detection of disease progression or remission.
[0015] In still another aspect, the present invention features methods for evaluating the effectiveness of a treatment in preventing the progression of MDS to AML in a patient of interest. These methods comprise comparing an expression profile of at least one leukemia disease gene in a peripheral blood sample of the patient of interest to a reference expression profile of the same gene(s), where the peripheral blood sample is isolated from the patient after initiation of the treatment, and each of the leukemia disease gene(s) employed is differentially expressed in un-fractionated PBMCs of MDS patients as compared to in AML patients.
Examples of leukemia disease genes suitable for this purpose include, but are not limited to, those depicted in Table 6. The expression profile of the leukemia disease gene(s) in the patient of interest during the course of the treatment is indicative of the effectiveness of the treatment in preventing the progression of MDS to AML in the patient. [0016] The invention also provides arrays useful, for example, for diagnosing MDS or other leukemias. The arrays include a substrate having several addresses; distinct probes, such as distinct nucleic acid sequences or distinct antibody variable regions, are disposed on each address. In some embodiments, at least 15% (or at least 30% or at least 50%) of the addresses have probes that can specifically detect MDS disease genes in PBMCs; the MDS disease genes are optionally selected from Table 4. In other embodiments, at least 15% (or at least 30% or at least 50%) of the addresses have probes that can specifically detect genes selected from Tables 4 or 6; the selected genes are different from those recited in Table 2, although genes from Table 2 could also be included.
[0017] The invention also provides digitally-encoded expression profiles, as may be encoded in a computer-readable medium, useful, for example, as reference expression profiles to evaluate a gene expression profile from a peripheral blood sample. Each expression profile includes one or more digitally-encoded expression signals including a value representing the expression of a gene selected from Tables 4 or 6; the selected genes are different from those recited in Table 2, although digitally-encoded expression signals including values representing the expression of genes from Table 2 could additionally be included in the expression profile. The values in the digitally-encoded expression signals can represent, for example, the expression of the genes in a PBMC of a human with MDS or a human with AML. Each expression profile can include a single digitally-encoded expression signal or can include two, three, four, five, six, seven, eight, nine, or more digitally-encoded expression signals, such as at least ten, at least 20, at least 30, at least 40, at least 50, at least 100, or at least 200.
[0018] In another aspect, the invention provides kits useful for diagnosis of a leukemia. In one embodiment, the kit includes one or more probes that can specifically detect MDS disease genes (optionally selected from Table 4) in PBMCs. The probes are optionally polynucleotides that hybridize under stringent or nucleic acid array hybridization conditions to the RNA transcripts, or the complements thereof, of the MDS disease genes, or, optionally, are antibody variable domains that bind the products of the MDS disease genes. In another embodiment, the kit includes one or more probes that can specifically detect genes selected from Tables 4 and 6; the selected genes are different from those recited in Table 2, although probes for genes from Table 2 could additionally be included. Genes selected from Tables 4 and 6 can optionally be among those also recited in Table 8. The kits also include one or more controls, each representing a reference expression level of a gene detectable by the probes.
[0019] In another aspect, the invention features a method of making a decision, e.g. selecting a payment class, for a course of treatment for a leukemia such as AML or MDS. The method includes assigning an individual to a class based on a value that is a function of the expression of one or more genes in a peripheral blood sample from the individual, thereby making a decision regarding the individual. The genes include one or more genes from among those recited in Tables 4 and 6 but not recited in Table 2, although the expression of genes recited in Table 2 could also be considered. In some embodiments, the one or more genes are selected from those also recited in Table 8. The decision can include, for example, selecting a treatment, such as an AML treatment, MDS treatment, other leukemia treatment, or an absence of treatment, based on the assignment of the individual to the class. The decision also can include administering or declining to administer a treatment based on the assignment; issuing, transmitting or receiving a prescription; or authorizing, paying for, or causing a transfer of funds to pay for a treatment. "Treatment" as used herein, refers to any action to deal with a disease or condition, regardless of whether the action is intended as preventative, curative, or palliative, for example; or to address a cause or symptom of the disease or condition; or to improve a second treatment by, for example, improving its efficacy or addressing a side effect. The decision may be recorded, such as in a computer-readable medium. [0020] The invention also features a method of providing information on which to make a decision about an individual. The method includes providing (e.g. by receiving) an evaluation of a subject, wherein the evaluation was made by a method described herein, such as by determining the level of expression of one or more genes in a peripheral blood sample of the subject, thereby providing a value. 
The genes include one or more genes from among those recited in Tables 4 and 6 but not recited in Table 2, although the expression of genes recited in Table 2 could also be considered. The method also includes providing a comparison of the value with a reference value, thereby providing information on which to make a decision about the subject. The method can also include making the decision or communicating the information to another party, such as by computer, compact disc, telephone, facsimile, or letter. The decision can include selecting a subject for payment or making or authorizing payment for a first course of action if the subject demonstrates a gene expression level, pattern or profile observed in a leukemia (e.g. AML or MDS) and a second course of action if the subject demonstrates a gene expression level, pattern or profile observed in a different leukemia (e.g. MDS or AML) or in leukemia-free humans. Payment can be from a first party to a second party. The first party can be a party other than the patient, such as a third party payor, an insurance company, employer, employer-sponsored health plan, HMO, or governmental entity. In some embodiments, the second party is selected from the subject, a healthcare provider, a treating physician, an HMO, a hospital, a governmental entity, or an entity that sells or supplies the drug.
[0021] In one aspect, the invention features a method of making a data record. The method includes entering the result of a method described herein into a record, e.g. a computer readable record. In some embodiments, the record is evaluated and/or transmitted to a third party payor, an insurance company, employer, employer sponsored health plan, HMO, or governmental entity, or a healthcare provider, a treating physician, an HMO, a hospital, a governmental entity, or an entity which sells or supplies the drug.
[0022] In one aspect, the disclosure features a method of providing data. The method includes providing data described herein, e.g., generated by a method described herein, to provide a record, e.g., a record described herein, for determining if a payment will be provided. In some embodiments, the data is provided by computer, compact disc, telephone, facsimile, email, or letter. In some embodiments, the data is provided by a first party to a second party. In some embodiments, the first party is selected from the subject, a healthcare provider, a treating physician, an HMO, a hospital, a governmental entity, or an entity which sells or supplies the drug. In some embodiments, the second party is a third party payor, an insurance company, employer, employer sponsored health plan, HMO, or governmental entity. In some embodiments, the first party is selected from the subject, a healthcare provider, a treating physician, an HMO, a hospital, an insurance company, or an entity which sells or supplies the drug and the second party is a governmental entity. In some embodiments, the first party is selected from the subject, a healthcare provider, a treating physician, an HMO, a hospital, an insurance company, or an entity which sells or supplies the drug and the second party is an insurance company.
[0023] In one aspect, the disclosure features a method of transmitting a record described herein. The method includes a first party transmitting the record to a second party, such as by computer, compact disc, telephone, facsimile, email, or letter. In some embodiments, the second party is selected from the subject, a healthcare provider, a treating physician, an HMO, a hospital, a governmental entity, or an entity which sells or supplies the drug. In some embodiments, the first party is an insurance company or government entity and the second party is selected from the subject, a healthcare provider, a treating physician, an HMO, a hospital, a governmental entity, or an entity which sells or supplies the drug. In some embodiments, the first party is a governmental entity or insurance company and the second party is selected from the subject, a healthcare provider, a treating physician, an HMO, a hospital, an insurance company, or an entity which sells or supplies the drug.
[0024] Other features, objects, and advantages of the present invention are apparent in the detailed description that follows. It should be understood, however, that the detailed description, while indicating preferred embodiments of the invention, does so by way of illustration only, not limitation. Various changes and modifications within the scope of the invention will become apparent to those skilled in the art from the detailed description.
DETAILED DESCRIPTION
[0025] The present invention features the use of whole blood samples or samples comprising un-fractionated PBMCs for diagnosing or monitoring the progression or treatment of AML and MDS. Genes that are differentially expressed in un-fractionated PBMCs of AML (or MDS) patients as compared to in disease-free humans can be identified. These genes can be used as surrogate markers for diagnosing or evaluating the treatment of AML (or MDS) in a subject of interest. Genes that are differentially expressed in un-fractionated PBMCs of AML patients as compared to in MDS patients can also be identified. These genes can be used to monitor the progression of MDS in a patient of interest. The present invention does not require positive selection of specific cell subtypes (e.g., CD34+ or AC133+), thereby allowing for rapid diagnosis and evaluation of AML and MDS. Other leukemias, such as acute lymphoblastic leukemia, chronic lymphocytic leukemia, chronic myelogenous leukemia, or hairy cell leukemia, can be similarly assessed according to the present invention.
[0026] Various aspects of the invention are described in further detail in the following subsections. The use of subsections is not meant to limit the invention. Each subsection may apply to any aspect of the invention. In this application, the use of "or" means "and/or" unless otherwise stated.
A. General Methods for Identifying Leukemia Disease Genes
[0027] This invention features the use of nucleic acid arrays for the identification of genes that are differentially expressed in un-fractionated PBMCs of leukemia patients as compared to in disease-free humans or in patients who have a different type of leukemia. Nucleic acid arrays allow for quantitative detection of expression profiles of a large number of genes at one time. Non-limiting examples of nucleic acid arrays suitable for this purpose include Genechip® microarrays (Affymetrix, Santa Clara, CA), cDNA microarrays (Agilent Technologies, Palo Alto, CA), and bead arrays (U.S. Patent Nos. 6,288,220 and 6,391,562). [0028] Polynucleotides to be hybridized to a nucleic acid array can be labeled with one or more labeling moieties to allow for detection of hybridized polynucleotide complexes. The labeling moieties can include compositions that are detectable by spectroscopic, photochemical, biochemical, bioelectronic, immunochemical, electrical, optical or chemical means. Exemplary labeling moieties include, but are not limited to, radioisotopes, chemiluminescent compounds, labeled binding proteins, heavy metal atoms, spectroscopic markers (such as fluorescent markers or dyes), magnetic labels, linked enzymes, mass spectrometry tags, spin labels, electron transfer donors and acceptors, and the like. Polynucleotides to be hybridized to a nucleic acid array can be cDNA, cRNA, or other types of nucleic acid molecules.
[0029] Hybridization reactions can be performed in absolute or differential hybridization formats. In the absolute hybridization format, polynucleotides derived from one sample, such as un-fractionated PBMCs from an AML or MDS patient or a disease-free human, are hybridized to the probes on a nucleic acid array. Signals detected after the formation of hybridization complexes correlate to the polynucleotide levels in the sample. In the differential hybridization format, polynucleotides derived from two biological samples, such as one from an AML or MDS patient and the other from a disease-free human, are labeled with different labeling moieties (e.g., Cy3 and Cy5, respectively). A mixture of these differently labeled polynucleotides is hybridized to a nucleic acid array. The nucleic acid array is then examined under conditions in which the emissions from the two different labels are individually detectable.
[0030] Signals gathered from nucleic acid arrays can be analyzed using commercially available software, such as software provided by Affymetrix or Agilent Technologies. Controls, such as for scan sensitivity, probe labeling or cDNA quantitation, can be included in the hybridization experiments. In many embodiments, signals from nucleic acid arrays are scaled or normalized before being further analyzed. The expression signals of a gene can be normalized to take into account variations in hybridization intensities when more than one array is used under similar test conditions. Signals for individual polynucleotide complex hybridization can also be normalized using the intensities derived from internal normalization controls contained on each array. In addition, genes with relatively consistent expression levels across the samples can be used to normalize the expression levels of other genes. In one embodiment, the expression levels are normalized across the samples such that the mean is zero and the standard deviation is one. In another embodiment, the expression signals from a nucleic acid array are subject to a variation filter which excludes genes showing minimal or insignificant variation across different classes of samples.
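By way of illustration only, the normalization to zero mean and unit standard deviation, and a variation filter of the kind described above, might be sketched as follows. The max/min fold criterion, the 2.0 cut-off, the use of the sample standard deviation, the function names, and the signal values are all assumptions of this sketch rather than parameters stated in this application.

```python
import statistics

def standardize(values):
    """Scale one gene's signals across samples to mean 0 and standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (an assumption)
    return [(v - mean) / sd for v in values]

def variation_filter(signals, min_fold=2.0):
    """Keep genes whose max/min signal ratio across samples reaches min_fold.

    The max/min criterion and the default cut-off are illustrative assumptions;
    signals are assumed to be strictly positive.
    """
    return {gene: vals for gene, vals in signals.items()
            if max(vals) / min(vals) >= min_fold}

# Hypothetical hybridization signals for two genes across four samples.
signals = {
    "gene_a": [100.0, 400.0, 120.0, 380.0],
    "gene_b": [200.0, 210.0, 190.0, 205.0],
}
kept = variation_filter(signals)
z = standardize(signals["gene_a"])
```

In this sketch `gene_b` is excluded because its signal barely varies across the samples, while `gene_a` is retained and standardized.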
[0031] Expression profiles in un-fractionated PBMCs of leukemia patients are compared to the corresponding expression profiles in disease-free humans. Genes that are differentially expressed in un-fractionated PBMCs of leukemia patients as compared to in un-fractionated PBMCs of disease-free humans can be identified. These genes are hereinafter referred to as leukemia disease genes. By "differentially expressed," it means that the average expression level of a leukemia disease gene in un-fractionated PBMCs of leukemia patients is statistically significantly different from that in un-fractionated PBMCs of disease-free humans. In many instances, the p-value of a Student's t-test (e.g., two-tailed distribution, two-sample unequal variance) for the observed difference is no more than 0.05, 0.01, 0.005, 0.001, 0.0005, 0.0001, or less. The average expression level of a leukemia disease gene in un-fractionated PBMCs of leukemia patients can be substantially higher or lower than that in disease-free PBMCs. For instance, the average expression level of a leukemia disease gene in PBMCs of leukemia patients can be at least 1, 2, 3, 4, 5, 10, 20, or more folds higher or lower than that in PBMCs of disease-free humans. Leukemia disease genes that are differentially expressed in patients who have different leukemias (e.g., AML versus MDS) can be similarly identified. [0032] Leukemia disease genes can also be identified using supervised or unsupervised clustering algorithms. Non-limiting examples of supervised clustering algorithms include the nearest-neighbor analysis, support vector machines, the SAM (Significance Analysis of Microarrays) method, artificial neural networks, and SPLASH. Non-limiting examples of unsupervised clustering algorithms include self-organized maps (SOMs), k-means, principal component analysis, and hierarchical clustering.
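The unequal-variance t-test and fold-change criteria of paragraph [0031] can be illustrated with the following sketch, which computes the Welch t statistic, the Welch-Satterthwaite degrees of freedom, and the ratio of class averages. Converting the statistic to a p-value requires the t-distribution, which is left to a statistics package here. The function names and signal values are hypothetical.

```python
import math
import statistics

def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def fold_change(a, b):
    """Ratio of the average signal in group a to the average signal in group b."""
    return statistics.fmean(a) / statistics.fmean(b)

# Hypothetical signals for one qualifier (illustration only).
leukemia = [410.0, 380.0, 450.0, 400.0]
disease_free = [100.0, 120.0, 90.0, 110.0]

t_stat, dof = welch_t(leukemia, disease_free)
ratio = fold_change(leukemia, disease_free)
```

A two-tailed p-value would then be read from the t-distribution with `dof` degrees of freedom and compared with the chosen threshold (e.g., 0.05).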
[0033] The nearest-neighbor analysis, also known as the neighborhood analysis, is described in Golub et al., (1999) Science 286:531-537; Slonim et al., (2000) Procs. of the Fourth Annual International Conference on Computational Molecular Biology, Tokyo, Japan, April 8-11, pp. 263-272; and U.S. Patent No. 6,647,341, all of which are incorporated herein by reference. In the analysis, the expression profile of each gene is represented by an expression vector $g = (e_1, e_2, e_3, \ldots, e_n)$, where $e_i$ corresponds to the expression level of gene "g" in the $i$th sample. A class distinction can be represented by an idealized expression pattern $c = (c_1, c_2, c_3, \ldots, c_n)$, where $c_i = 1$ or $-1$, depending on whether the $i$th sample is isolated from class 0 or class 1. Class 0 includes subjects having a first disease status (e.g., disease-free), and class 1 includes subjects having a second disease status (e.g. AML or MDS). Other forms of class distinction can also be employed. Typically, a class distinction represents an idealized expression pattern, where the expression level of a gene is uniformly high for samples in one class and uniformly low for samples in the other class.
[0034] The correlation between gene "g" and the class distinction can be measured by a signal-to-noise score:
$$P(g,c) = \frac{\mu_1(g) - \mu_2(g)}{\sigma_1(g) + \sigma_2(g)}$$
where $\mu_1(g)$ and $\mu_2(g)$ represent the means of the log-transformed expression levels of gene "g" in class 0 and class 1, respectively, and $\sigma_1(g)$ and $\sigma_2(g)$ represent the standard deviations of the log-transformed expression levels of gene "g" in class 0 and class 1, respectively. A higher absolute value of a signal-to-noise score indicates that the gene is more highly expressed in one class than in the other. In one example, the samples used to derive the signal-to-noise scores comprise enriched or purified un-fractionated PBMCs and, therefore, the signal-to-noise score P(g,c) represents a correlation between the class distinction and the expression level of gene "g" in un-fractionated PBMCs. The correlation between gene "g" and the class distinction can also be measured by other methods, such as the Pearson correlation coefficient or the Euclidean distance, as appreciated by those skilled in the art.
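For illustration, the signal-to-noise score defined above can be computed as in the following sketch. The base-2 logarithm, the sample (n-1) standard deviation, the function name, and the signal values are assumptions of this sketch; the text above specifies only that the expression levels are log-transformed.

```python
import math
import statistics

def signal_to_noise(class0, class1):
    """P(g,c) = (mu1 - mu2) / (sigma1 + sigma2) on log-transformed signals.

    class0, class1: raw (positive) expression levels of one gene in each class.
    """
    log0 = [math.log2(v) for v in class0]  # log base 2 is an assumption
    log1 = [math.log2(v) for v in class1]
    return ((statistics.fmean(log0) - statistics.fmean(log1))
            / (statistics.stdev(log0) + statistics.stdev(log1)))

# Hypothetical signals for one gene (illustration only).
disease_free = [800.0, 1000.0, 900.0]   # class 0
leukemia = [100.0, 120.0, 90.0]         # class 1

score = signal_to_noise(disease_free, leukemia)
```

A positive score indicates higher expression in class 0, and swapping the two classes changes only the sign of the score.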
[0035] The significance of the correlation between gene expression profiles in un-fractionated PBMCs and a class distinction can be evaluated using a random permutation test. An unusually high density of genes within the neighborhoods of the class distinction, as compared to random patterns, suggests that many of these genes have expression patterns that are significantly correlated with the class distinction. The correlation between genes and a class distinction can be diagrammatically viewed through a neighborhood analysis plot, in which the y-axis represents the number of genes within various neighborhoods around the class distinction and the x-axis indicates the size of the neighborhood (i.e., P(g,c)). Curves showing different significance levels for the number of genes within corresponding neighborhoods of randomly permuted class distinctions can also be included in the plot.
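The random permutation test described above might be sketched as follows: the number of genes whose score reaches a given neighborhood size is counted for the real class distinction and again for randomly permuted class labels. The number of permutations, the neighborhood size of 2.0, the function names, and the expression values (assumed to be already log-transformed) are hypothetical choices for this sketch.

```python
import random
import statistics

def s2n(values, labels):
    """Signal-to-noise score of one gene; values are assumed log-transformed."""
    a = [v for v, c in zip(values, labels) if c == 0]
    b = [v for v, c in zip(values, labels) if c == 1]
    return ((statistics.fmean(a) - statistics.fmean(b))
            / (statistics.stdev(a) + statistics.stdev(b)))

def neighborhood_size(matrix, labels, radius):
    """Count genes whose score P(g,c) is at least `radius`."""
    return sum(1 for row in matrix if s2n(row, labels) >= radius)

def permutation_counts(matrix, labels, radius, n_perm=200, seed=0):
    """Neighborhood counts obtained with randomly permuted class labels."""
    rng = random.Random(seed)
    shuffled = list(labels)
    counts = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # class sizes are preserved by shuffling
        counts.append(neighborhood_size(matrix, shuffled, radius))
    return counts

# Hypothetical data: four genes (rows) by six samples (columns).
labels = [0, 0, 0, 1, 1, 1]
matrix = [
    [9.0, 9.2, 8.8, 5.0, 5.3, 4.9],
    [8.5, 8.9, 8.7, 6.0, 6.2, 5.9],
    [5.0, 7.1, 6.2, 6.8, 5.1, 6.0],
    [3.0, 3.4, 3.1, 3.3, 2.9, 3.2],
]
observed = neighborhood_size(matrix, labels, radius=2.0)
counts = permutation_counts(matrix, labels, radius=2.0)
# Fraction of permuted class distinctions with at least as many genes nearby.
fraction = sum(c >= observed for c in counts) / len(counts)
```

A small `fraction` suggests that the density of genes around the real class distinction is unlikely to arise from random patterns.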
[0036] In many embodiments, the leukemia disease genes identified by the present invention are above the median significance level in the neighborhood analysis plot. This means that the correlation measure P(g,c) for each of these leukemia disease genes is such that the number of genes within the neighborhood of the class distinction having the size of P(g,c) is greater than the number of genes within the corresponding neighborhoods of randomly permuted class distinctions at the median significance level. The leukemia disease genes identified by the present invention can also be above the 40%, 30%, 20%, 10%, 5%, 2%, or 1% significance level. As used herein, x% significance level means that x% of random neighborhoods contain as many genes as the real neighborhood around the class distinction.
[0037] The leukemia disease genes identified by the nearest-neighbor analysis can be used to construct class predictors. Each class predictor includes two or more leukemia disease genes, and can be used to assign a subject of interest to a disease status (e.g., AML, MDS, or disease-free). In one embodiment, a class predictor includes or consists of leukemia disease genes that are significantly correlated with a class distinction by the permutation test (e.g., genes above the 1%, 2%, 5%, 10%, 20%, 30%, 40%, or 50% significance level). In another embodiment, a class predictor includes or consists of leukemia disease genes that have top absolute values of P(g,c).
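One way a class predictor could assign a sample to a class is the weighted voting scheme of Golub et al. (1999), in which each gene votes with weight P(g,c) according to how far the sample's expression level lies from the midpoint of the two class means. The sketch below follows that published scheme under stated assumptions: the function names, gene names, and expression values (taken to be log-transformed) are hypothetical.

```python
import statistics

def train_weighted_voting(class0, class1):
    """Build per-gene (weight, boundary) pairs from two training classes.

    class0, class1: dict mapping gene -> list of log-scale expression values.
    """
    model = {}
    for gene in class0:
        m0 = statistics.fmean(class0[gene])
        m1 = statistics.fmean(class1[gene])
        s0 = statistics.stdev(class0[gene])
        s1 = statistics.stdev(class1[gene])
        weight = (m0 - m1) / (s0 + s1)   # signal-to-noise score P(g,c)
        boundary = (m0 + m1) / 2         # midpoint between the class means
        model[gene] = (weight, boundary)
    return model

def predict(model, sample):
    """Return (predicted class, summed vote); a positive sum favours class 0."""
    total = sum(w * (sample[g] - b) for g, (w, b) in model.items())
    return (0 if total > 0 else 1), total

# Hypothetical training data for two genes (illustration only).
class0 = {"gene_a": [9.0, 9.2, 8.8], "gene_b": [3.0, 3.3, 2.9]}   # e.g., disease-free
class1 = {"gene_a": [5.0, 5.3, 4.9], "gene_b": [6.1, 5.8, 6.3]}   # e.g., MDS

model = train_weighted_voting(class0, class1)
label_1, vote_1 = predict(model, {"gene_a": 5.2, "gene_b": 6.0})
label_0, vote_0 = predict(model, {"gene_a": 9.1, "gene_b": 3.1})
```

The magnitude of the summed vote can serve as a rough indication of confidence; a threshold on it could be used to leave borderline samples unassigned.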
[0038] The SAM method can also be used to correlate disease statuses with gene expression profiles in un-fractionated PBMCs. The prediction analysis of microarrays (PAM) method can be used to identify class predictors that can best characterize a predefined disease or disease-free class and predict the class membership of new samples. See, for example, Tibshirani et al., (2002) Proc. Natl. Acad. Sci. U.S.A. 99:6567-6572.
[0039] The prediction accuracy of a class predictor of the present invention can be evaluated by k-fold cross validation, such as 10-fold cross validation, 4-fold cross validation, or leave-one-out cross validation. In a typical k-fold cross validation, the data is divided into k subsets of approximately equal size. The model is trained k times, each time leaving out one of the subsets from training and using the omitted subset as the test samples to calculate the prediction error. Where k equals the sample size, it becomes the leave-one-out cross validation.
[0040] Other methods can also be used to identify leukemia disease genes. These methods include, but are not limited to, quantitative RT-PCR, Northern Blot, in situ hybridization, protein arrays, immunoassays (e.g., ELISA, RIA or Western Blot), differential display, serial analysis of gene expression (SAGE), representational difference analysis (RDA), subtractive hybridization, GeneCalling® (CuraGen, New Haven, CT), and total gene expression analysis (TOGA). Genes thus identified are differentially expressed in un-fractionated PBMCs of one class of subjects relative to another class of subjects, each class of subjects having a different disease status (e.g., AML, MDS, or disease-free).
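The k-fold cross validation of paragraph [0039] can be illustrated with the sketch below. The interleaved fold assignment, the one-gene nearest-centroid classifier used as a stand-in predictor, the function names, and the data are assumptions made only for this example.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists; fold sizes differ by at most one."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test

def cross_validation_error(samples, labels, k, fit, predict):
    """Fraction of samples misclassified when each fold is held out once."""
    errors = 0
    for train, test in k_fold_indices(len(samples), k):
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        errors += sum(predict(model, samples[i]) != labels[i] for i in test)
    return errors / len(samples)

# Stand-in classifier (hypothetical): nearest class mean on a single gene.
def fit_centroids(xs, ys):
    return {c: sum(x for x, y in zip(xs, ys) if y == c) / ys.count(c)
            for c in set(ys)}

def nearest_centroid(model, x):
    return min(model, key=lambda c: abs(x - model[c]))

# Hypothetical one-gene expression values for eight samples.
samples = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

error_4_fold = cross_validation_error(samples, labels, 4, fit_centroids, nearest_centroid)
# Setting k to the number of samples gives leave-one-out cross validation.
error_loo = cross_validation_error(samples, labels, len(samples), fit_centroids, nearest_centroid)
```

In practice the samples would usually be shuffled or stratified by class before folds are assigned, so that every training set contains members of each class.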
[0041] The above-described methods can also be used to identify genes whose expression profiles in un-fractionated PBMCs are predictive of different stages of leukemia progression, or different clinical responses of leukemia patients to a therapeutic treatment. For instance, gene expression profiles in PBMCs of MDS patients who eventually progress to AML can be compared to the corresponding gene expression profiles in MDS patients who do not progress to AML. Genes that are differentially expressed in these two classes of patients can be identified and used for the prediction of progression from MDS to AML. For another instance, leukemia patients can be grouped based on their responses to a therapeutic treatment. The global gene expression analysis is then used to identify genes that are differentially expressed in PBMCs of one group of patients versus another group. Genes thus identified are predictive of clinical outcome of a leukemia patient in response to the therapeutic treatment.
B. Identification of AML and MDS Disease Genes
[0042] HG-U133A Genechips® (Affymetrix, Inc.) were used to identify AML or MDS disease genes. Genes that were differentially expressed in un-fractionated PBMCs of AML (or MDS) patients as compared to in disease-free humans were identified. Genes that were differentially expressed in un-fractionated PBMCs of AML patients as compared to in MDS patients were also identified.
[0043] Table 1 lists qualifiers on HG-U133A Genechips® that showed elevated or decreased signals when hybridized to AML samples as compared to disease-free samples. Each qualifier in Table 1 corresponds to an AML disease gene which is differentially expressed in un-fractionated PBMCs of AML patients as compared to in disease-free humans. The hybridization signal at each qualifier represents the expression level of the corresponding gene in un-fractionated PBMCs.
[0044] Table 1 also illustrates the average hybridization signals at each qualifier for AML ("AML Average") or disease-free samples ("Disease-Free Average"). The standard deviations of these signals ("AML StDev" and "Disease-Free StDev," respectively) are also provided. In addition, the ratios between AML and disease-free hybridization signals ("AML/Disease-Free") and the p-values of Student's t-test (two-tailed distribution, two-sample unequal variance) for the observed differences are provided.
Table 1. Genes Differentially Expressed in AML vs. Disease-Free PBMCs
[Table 1: image pages imgf000019_0001 to imgf000034_0001]
Table 2. Annotation of Genes Differentially Expressed in AML vs. Disease-Free PBMCs
[Table 2: image pages imgf000034_0002 to imgf000052_0001]
[0045] Each qualifier on a HG-U133A Genechip® represents a set of oligonucleotide probes (PM or perfect match probe) that are stably attached to the respective regions on the Genechip®. The RNA transcript (or the complement thereof) of the gene identified by a qualifier can hybridize under nucleic acid array hybridization conditions to at least one oligonucleotide probe of the qualifier. Preferably, the RNA transcript (or the complement thereof) of the gene does not hybridize under nucleic acid array hybridization conditions to the mismatch (MM) probes of the qualifier. A mismatch probe is identical to the corresponding PM probe except for a single, homomeric substitution at or near the center of the mismatch probe. For instance, the MM probe for a 25-mer PM probe has a homomeric base change at the 13th position.
[0046] In one embodiment, the RNA transcript (or the complement thereof) of the gene identified by a qualifier can hybridize under nucleic acid array hybridization conditions to at least 50%, 60%, 70%, 80%, 90% or 100% of the PM probes of the qualifier, but not to the corresponding mismatch probes. The discrimination score (R) for each of these PM probes, as measured by the ratio of the hybridization intensity difference of the corresponding probe pair (i.e., PM - MM) over the overall hybridization intensity (i.e., PM + MM), can be no less than 0.015, 0.02, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5 or greater. In another embodiment, the RNA transcript (or the complement thereof) of the gene, when hybridized to a HG-U133A Genechip® according to the manufacturer's instructions, produces a "present" call at the corresponding qualifier under the default settings (i.e., the threshold Tau is 0.015 and the significance level α1 is 0.4). See Genechip® Expression Analysis - Data Analysis Fundamentals (Part No. 701190 Rev. 2, Affymetrix, Inc., 2002), the entire content of which is incorporated herein by reference.
[0047] The sequences of each PM probe on HG-U133A Genechips®, and the target sequences from which the PM probes are derived, can be readily obtained from Affymetrix's sequence database at the Affymetrix website. See, for example, HG-U133A_probe_tab.zip, the entire content of which is incorporated herein by reference.
[0048] Table 2 lists the genes that are represented by the qualifiers in Table 1. These genes, as well as their corresponding unigene IDs and Entrez accession numbers, were identified according to Affymetrix Genechip® annotation. A unigene is composed of a non-redundant set of gene-oriented clusters. Each unigene cluster is believed to include sequences that represent a unique gene. The Entrez database collects sequences from a variety of sources, such as GenBank, RefSeq and PDB. The oligonucleotide probes of each qualifier can be derived from its corresponding Entrez sequence.
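The discrimination score R defined in paragraph [0046] can be illustrated with a short Python sketch. The function names and the probe-pair intensities are hypothetical and chosen only for illustration. The sketch computes R = (PM - MM)/(PM + MM) for each probe pair and the fraction of pairs meeting a threshold; it does not reproduce the statistical test that underlies a "present" call.

```python
def discrimination_score(pm: float, mm: float) -> float:
    """Return R = (PM - MM) / (PM + MM) for one probe pair."""
    total = pm + mm
    if total == 0:
        raise ValueError("PM + MM must be non-zero")
    return (pm - mm) / total


def fraction_at_or_above(probe_pairs, tau=0.015):
    """Fraction of (PM, MM) pairs whose score R is no less than tau."""
    scores = [discrimination_score(pm, mm) for pm, mm in probe_pairs]
    return sum(score >= tau for score in scores) / len(scores)


# Hypothetical intensities for a qualifier with four probe pairs.
example_pairs = [(1200.0, 300.0), (800.0, 790.0), (500.0, 450.0), (950.0, 200.0)]
print(fraction_at_or_above(example_pairs))
```

In this hypothetical case three of the four probe pairs have R of at least 0.015, so the qualifier would satisfy a "at least 70% of the PM probes" criterion but not a "at least 80%" criterion.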
[0049] Table 3 describes qualifiers that showed elevated or decreased signals when hybridized to MDS samples as compared to disease-free samples. The average hybridization signals at each qualifier for MDS ("MDS Average") or disease-free ("Disease-Free Average") samples are provided, together with their corresponding standard deviations ("MDS StDev" and "Disease-Free StDev," respectively). In addition, the ratios between the average hybridization signals ("MDS/Disease-Free") and the p-values of Student's t-test (two-tailed distribution, two-sample unequal variance) for the observed differences are also provided. Table 4 further describes the genes that are represented by the qualifiers in Table 3.
[0050] Table 5 illustrates qualifiers that showed elevated or decreased signals when hybridized to AML samples as compared to MDS samples. As in Tables 1 and 3, the average hybridization signals at each qualifier for AML or MDS samples ("AML Average" and "MDS Average," respectively), the corresponding standard deviations ("AML StDev" and "MDS StDev," respectively), the ratios between the average hybridization signals ("AML/MDS"), and the p-values for the observed differences are provided in Table 5. The genes represented by the qualifiers in Table 5 are further described in Table 6.
Table 3. Genes Differentially Expressed in MDS vs. Disease-Free PBMCs
Figure imgf000055_0001
Figure imgf000056_0001
Figure imgf000057_0001
Figure imgf000058_0001
Figure imgf000059_0001
Figure imgf000060_0001
Figure imgf000061_0001
Table 4. Annotation of Genes Differentially Expressed in MDS vs. Disease-Free PBMCs
Figure imgf000061_0002
Figure imgf000062_0001
Figure imgf000063_0001
Figure imgf000064_0001
Figure imgf000065_0001
Figure imgf000066_0001
Figure imgf000067_0001
Figure imgf000068_0001
Table 5. Genes Differentially Expressed in AML vs. MDS PBMCs
Figure imgf000068_0002
Figure imgf000069_0001
Figure imgf000070_0001
Table 6. Annotation of Genes Differentially Expressed in AML vs. MDS PBMCs
Figure imgf000070_0002
Figure imgf000071_0001
Figure imgf000072_0001
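The per-qualifier summary columns described for Tables 3 and 5 (class averages, standard deviations, ratio of averages, and an unequal-variance two-sample t-test) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name and the signal values are hypothetical, sample standard deviations are assumed, and only the Welch t statistic and its degrees of freedom are computed. The two-tailed p-value would additionally require a Student's t distribution function from a statistics library.

```python
import math
from statistics import mean, stdev


def summarize_qualifier(signals_a, signals_b):
    """Summarize one qualifier for two sample classes (e.g., MDS vs. disease-free)."""
    mean_a, mean_b = mean(signals_a), mean(signals_b)
    sd_a, sd_b = stdev(signals_a), stdev(signals_b)  # sample SD (assumption)
    # Squared standard errors of the two class means.
    se2_a = sd_a ** 2 / len(signals_a)
    se2_b = sd_b ** 2 / len(signals_b)
    # Welch (unequal variance) t statistic.
    t_stat = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(signals_a) - 1) + se2_b ** 2 / (len(signals_b) - 1)
    )
    return {
        "average_a": mean_a,
        "average_b": mean_b,
        "stdev_a": sd_a,
        "stdev_b": sd_b,
        "ratio": mean_a / mean_b,
        "t": t_stat,
        "df": df,
    }


# Hypothetical hybridization signals for one qualifier.
print(summarize_qualifier([10.0, 12.0, 14.0], [4.0, 5.0, 6.0]))
```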
[0051] The genes depicted in Tables 2, 4, and 6 were identified according to Affymetrix annotation. Genes that correspond to the qualifiers in Tables 1, 3, and 5 can also be identified by BLAST searching the target sequences of these qualifiers against human genome sequence databases. Databases suitable for this purpose include, but are not limited to, the human genome database at the National Center for Biotechnology Information (NCBI), Bethesda, MD. NCBI also provides BLAST programs, such as "blastn," for searching its sequence databases. A BLAST search for a gene that corresponds to a qualifier can be conducted using an unambiguous segment of the target sequence of the qualifier (i.e., a sequence segment that does not contain any unknown nucleotide residue). Gene(s) whose protein-coding sequence has significant sequence identity with the unambiguous segment (e.g., having at least 95%, 96%, 97%, 98%, 99%, or more sequence identity) can be identified. The RNA transcript (or the complement thereof) of the gene(s) thus identified can hybridize under stringent or nucleic acid array hybridization conditions to the PM probes of the qualifier. Accordingly, the qualifiers in Tables 1, 3, and 5 represent not only genes that are explicitly depicted in the tables, but also genes that are not listed but nonetheless can hybridize under stringent or nucleic acid array hybridization conditions to the PM probes of the qualifiers.
[0052] As used herein, "stringent conditions" are at least as stringent as conditions G-L in Table 7. "Highly stringent conditions" are at least as stringent as conditions A-F in Table 7. For each condition, hybridization is carried out under the corresponding hybridization conditions (Hybridization Temperature and Buffer) for about four hours, followed by two 20-minute washes under the corresponding wash conditions (Wash Temp. and Buffer).
Table 7. Stringency Conditions
Figure imgf000074_0001
1: The hybrid length is that anticipated for the hybridized region(s) of the hybridizing polynucleotides. When hybridizing a polynucleotide to a target polynucleotide of unknown sequence, the hybrid length is assumed to be that of the hybridizing polynucleotide. When polynucleotides of known sequence are hybridized, the hybrid length can be determined by aligning the sequences of the polynucleotides and identifying the region or regions of optimal sequence complementarity.
H: SSPE (1xSSPE is 0.15M NaCl, 10mM NaH2PO4, and 1.25mM EDTA, pH 7.4) can be substituted for SSC (1xSSC is 0.15M NaCl and 15mM sodium citrate) in the hybridization and wash buffers.
TB* - TR*: The hybridization temperature for hybrids anticipated to be less than 50 base pairs in length should be 5-10°C less than the melting temperature (Tm) of the hybrid, where Tm is determined according to the following equations. For hybrids less than 18 base pairs in length, Tm(°C) = 2(# of A + T bases) + 4(# of G + C bases). For hybrids between 18 and 49 base pairs in length, Tm(°C) = 81.5 + 16.6(log10[Na+]) + 0.41(%G + C) - (600/N), where N is the number of bases in the hybrid, and [Na+] is the molar concentration of sodium ions in the hybridization buffer ([Na+] for 1xSSC = 0.165M).
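The two melting-temperature equations in the footnote above can be worked through with a short Python sketch. The function names are hypothetical, the input is assumed to be a DNA sequence containing only A, C, G, and T, and the default sodium concentration is the 1xSSC value of 0.165M given in the footnote.

```python
import math


def melting_temperature(hybrid: str, sodium_molar: float = 0.165) -> float:
    """Tm in degrees C for a hybrid shorter than 50 base pairs."""
    hybrid = hybrid.upper()
    n = len(hybrid)
    at = hybrid.count("A") + hybrid.count("T")
    gc = hybrid.count("G") + hybrid.count("C")
    if n == 0 or at + gc != n:
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    if n < 18:
        # Tm = 2(# of A + T bases) + 4(# of G + C bases)
        return 2.0 * at + 4.0 * gc
    if n <= 49:
        # Tm = 81.5 + 16.6(log10[Na+]) + 0.41(%G + C) - (600/N)
        percent_gc = 100.0 * gc / n
        return 81.5 + 16.6 * math.log10(sodium_molar) + 0.41 * percent_gc - 600.0 / n
    raise ValueError("these equations apply only to hybrids shorter than 50 base pairs")


def hybridization_temperature_range(hybrid: str, sodium_molar: float = 0.165):
    """Return (low, high) hybridization temperatures, 10 to 5 degrees C below Tm."""
    tm = melting_temperature(hybrid, sodium_molar)
    return tm - 10.0, tm - 5.0


print(melting_temperature("ATGCATGC"))              # 8-mer, short-hybrid equation
print(melting_temperature("ATGC" * 5))              # 20-mer in 1xSSC
print(hybridization_temperature_range("ATGCATGC"))
```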
C. Prognosis, Diagnosis and Selection of Treatment of MDS, AML or Other Leukemias
[0053] The leukemia disease genes of the present invention can be used for diagnosis and prognosis of MDS, AML or other leukemias. For example, the disease genes can be used to identify an MDS patient who is likely to progress to acute myelogenous leukemia (AML). The leukemia disease genes can also be used to evaluate the progression or effectiveness of a treatment of leukemia in a patient of interest. Any type of leukemia can be assessed according to the present invention. Examples of these leukemias include, but are not limited to, AML, MDS, acute lymphoblastic leukemia, chronic lymphocytic leukemia, chronic myelogenous leukemia, and hairy cell leukemia. The diagnosis and prognosis typically involve comparison of the peripheral blood expression profile of one or more disease genes in the leukemia patient of interest to at least one reference expression profile. [0054] In one embodiment, the disease genes employed for diagnosis and prognosis are selected such that the peripheral blood expression profile of each disease gene is correlated with a class distinction under a class-based correlation analysis (such as the nearest-neighbor analysis), where the class distinction represents an idealized expression pattern of the selected genes in peripheral blood samples of leukemia patients who have different clinical outcomes. In many cases, the selected disease genes are correlated with the class distinction at above the 50%, 25%, 10%, 5%, or 1% significance level under a random permutation test. [0055] The disease genes can also be selected such that the average expression profile of each disease gene in peripheral blood samples of one class of leukemia patients is statistically different from that in another class of leukemia patients or disease-free humans. For instance, the p-value under a Student's t-test for the observed difference can be no more than 0.05, 0.01, 0.005, 0.001, or less. 
In addition, the disease genes can be selected such that the average peripheral blood expression level of each disease gene in one class of patients is at least 2-, 3-, 4-, 5-, 10-, or 20-fold different from that in another class of patients or disease-free humans.
[0056] The expression profile of the leukemia disease gene(s) in a peripheral blood sample of a subject of interest can be compared to a reference expression profile of the same gene(s) for diagnosing or evaluating the progression or treatment of leukemia in the subject of interest. The reference expression profile can be prepared using the same type of peripheral blood samples (e.g., whole blood samples or blood samples comprising enriched un-fractionated PBMCs) as the peripheral blood sample of the subject of interest. Both expression profiles can be prepared using the same preparation procedure or methodology. As a consequence, for each component in the expression profile of the subject of interest, there is at least one corresponding component in the reference expression profile. A reference expression profile can be pre-determined or pre-recorded. It can also be prepared concurrently with or after the determination of the expression profile of the subject of interest. Each expression profile employed in the present invention can have any format, such as a table format, a graphic format, or an electronic or digital format. [0057] A reference expression profile employed in the present invention typically includes or consists of values or ranges that are suggestive of the expression pattern of the leukemia disease gene(s) in peripheral blood samples of disease-free humans or patients having known leukemias. In one embodiment, a reference expression profile comprises the average expression levels of each of the leukemia disease gene(s) in peripheral blood samples of disease-free humans. In another example, a reference expression profile comprises the average expression levels of each of the leukemia disease gene(s) in peripheral blood samples of patients who have the leukemia being investigated. 
Any averaging method can be used, including but not limited to arithmetic means, harmonic means, average of absolute values, average of log-transformed values, and weighted average.
[0058] The reference expression profiles may include a plurality of expression profiles, each of which represents the peripheral blood expression pattern of the disease gene(s) in a particular leukemia patient whose clinical outcome is known or determinable. For example, a reference expression profile may include two or more individual expression profiles, each of which represents the expression profile of the leukemia disease gene(s) in a peripheral blood sample of a different leukemia patient or disease-free volunteer. The expression profile of a subject of interest can be compared to these individual reference expression profiles using a pattern recognition algorithm, such as weighted voting, k-nearest neighbors, or support vector machines.
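As an illustration of comparing a subject's profile to a plurality of individual reference profiles, the following Python sketch applies a k-nearest neighbors vote. It is a minimal example under stated assumptions: Euclidean distance is assumed as the similarity measure, the function name is hypothetical, and the two-gene profiles are invented values.

```python
import math
from collections import Counter


def knn_classify(profile, references, k=3):
    """Assign the majority label among the k reference profiles nearest to `profile`.

    `references` is a list of (label, expression_levels) pairs, where each
    expression_levels sequence lists the same genes in the same order.
    """
    by_distance = sorted(references, key=lambda ref: math.dist(profile, ref[1]))
    votes = Counter(label for label, _ in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical reference profiles (two genes each) with known status.
reference_profiles = [
    ("disease-free", [1.0, 1.0]),
    ("disease-free", [1.2, 0.9]),
    ("disease-free", [0.8, 1.1]),
    ("MDS", [3.0, 3.1]),
    ("MDS", [2.9, 3.3]),
]
print(knn_classify([2.8, 3.0], reference_profiles, k=3))
```

An odd k is used here so that a two-class vote cannot tie.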
[0059] A reference expression profile suitable for the invention may contain ranges for the expression levels of each leukemia disease gene employed. Each range can be selected to reflect variations in the expression levels of the corresponding gene in peripheral blood samples of disease-free humans or patients who have known leukemias. For instance, the range can be selected to be one standard deviation (or a multiple or fraction thereof) from the mean expression level of the corresponding gene in peripheral blood samples of disease-free humans (or patients having a known leukemia). Where the expression level of the gene in a subject of interest falls within that range, a "similar" call can be made with respect to that gene.
[0060] Other types of reference expression profiles can also be used in the present invention. For example, a numerical threshold can be used as a reference.
[0061] The expression profile of the patient of interest and the reference expression profile(s) can be constructed in any form. In one embodiment, the expression profiles comprise the expression level of each disease gene used in outcome prediction. The expression levels can be absolute, normalized, or relative levels. Suitable normalization procedures include, but are not limited to, those used in nucleic acid array gene expression analyses or those described in Hill et al., (2001) Genome Biol., 2:research0055.1-0055.13. In one example, the expression levels are normalized such that the mean is zero and the standard deviation is one. In another example, the expression levels are normalized based on internal or external controls, as appreciated by those skilled in the art. In still another example, the expression levels are normalized against one or more control transcripts with known abundances in blood samples. In many cases, the expression profile of the patient of interest and the reference expression profile(s) are constructed using the same or comparable methodologies.
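The normalization to a mean of zero and a standard deviation of one mentioned in paragraph [0061] can be sketched in Python as follows. The function name is hypothetical, and the use of the population standard deviation is an assumption, since the text does not specify which estimator is intended.

```python
from statistics import mean, pstdev


def standardize(levels):
    """Rescale expression levels to mean zero and standard deviation one."""
    mu = mean(levels)
    sigma = pstdev(levels)  # population SD (assumption)
    if sigma == 0:
        raise ValueError("cannot standardize constant expression levels")
    return [(level - mu) / sigma for level in levels]


# Hypothetical expression levels of one gene across three samples.
print(standardize([2.0, 4.0, 6.0]))
```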
[0062] In another embodiment, each expression profile being compared comprises one or more ratios between the expression levels of different disease genes. An expression profile can also include other measures that are capable of representing gene expression patterns.
[0063] Peripheral blood samples suitable for the present invention include, but are not limited to, whole blood samples or samples comprising un-fractionated PBMCs. In many embodiments, peripheral blood samples comprising enriched un-fractionated PBMCs are employed. By "enriched," it is meant that the percentage of PBMCs in a sample is higher than that in whole blood. In many cases, at least 75%, 80%, 85%, 90%, 95%, 99%, or 100% of the cells in an enriched sample are PBMCs. Methods suitable for preparing enriched un-fractionated PBMCs include, but are not limited to, Ficoll gradient centrifugation or cell purification tubes (CPTs). Other conventional methods can also be used to prepare enriched un-fractionated PBMCs.
[0064] Construction of the expression profiles typically involves detection of the expression level of each disease gene used in the diagnosis or prognosis. Numerous methods are available for this purpose. For instance, the expression level of a gene can be determined by measuring the level of the RNA transcript(s) of the gene. Suitable methods include, but are not limited to, quantitative RT-PCR, Northern Blot, in situ hybridization, slot-blotting, nuclease protection assay, and nucleic acid array (including bead array). The expression level of a gene can also be determined by measuring the level of the polypeptide(s) encoded by the gene. Suitable methods include, but are not limited to, immunoassays (such as ELISA, RIA, FACS, or Western blot), 2-dimensional gel electrophoresis, mass spectrometry, or protein arrays.
[0065] The expression profile of the leukemia disease gene(s) in a subject of interest can be determined by measuring the RNA transcript level of each of the gene(s) in a peripheral blood sample of the subject. Methods suitable for this purpose include, but are not limited to, quantitative RT-PCR, competitive RT-PCR, real time RT-PCR, differential display RT-PCR, Northern Blot, in situ hybridization, slot-blotting, nuclease protection assays, and nucleic acid arrays (including bead arrays). The expression profile of the leukemia disease gene(s) can also be determined by measuring the protein product level of each of the gene(s) in the peripheral blood sample of the subject of interest. Methods suitable for this purpose include, but are not limited to, immunoassays (e.g., ELISA (enzyme-linked immunosorbent assay), RIA (radioimmunoassay), FACS (fluorescence-activated cell sorter), Western Blot, dot blot, immunohistochemistry, or antibody-based radioimaging), protein arrays, high-throughput protein sequencing, two-dimensional SDS-polyacrylamide gel electrophoresis, and mass spectrometry. In addition, the biological activity (e.g., enzymatic activity or protein/DNA binding activity) of the protein product encoded by a leukemia disease gene can also be used to measure the expression level of the gene in a peripheral blood sample of interest.
[0066] The expression profile of the leukemia disease gene(s) can have any form. In one embodiment, the expression profile includes the expression level of each leukemia disease gene employed. Each expression level can be an absolute expression level, or a normalized or relative expression level. Methods suitable for normalizing expression levels of different genes include, but are not limited to, those described in Hill et al., (2001) Genome Biol., 2:research0055.1-0055.13, and Genechip® Expression Analysis - Data Analysis Fundamentals (Part No. 701190 Rev. 2, Affymetrix, Inc., 2002), both of which are incorporated herein by reference in their entireties. In one example, the expression level of each leukemia disease gene is normalized based on internal or external controls. The expression level of each leukemia disease gene can also be normalized against one or more control transcripts with known abundances in the samples used.
[0067] The expression profile of the leukemia disease gene(s) can also include ratio or ratios between the expression levels of different leukemia disease genes (e.g., ratios between the expression levels of genes that are up-regulated in PBMCs of leukemia patients versus genes that are down-regulated). Ratios between the expression levels of leukemia disease genes versus non-leukemia disease genes can also be used to construct the expression profiles of leukemia disease genes. Other measures that are indicative of gene expression patterns can also be used to prepare gene expression profiles.
[0068] The difference or similarity between the expression profile of a subject of interest and a reference expression profile can be determined by assessing the differences or similarities between the corresponding components in the two profiles. Methods suitable for this purpose include, but are not limited to, fold changes or absolute differences. In one example, the expression level of a leukemia disease gene in a subject of interest is considered similar to the corresponding reference level in the reference expression profile if the difference between the two levels is less than 50%, 40%, 30%, 20%, or 10% of the reference level. In another example, the expression level of a leukemia disease gene in a subject of interest is considered similar to the corresponding reference level in the reference expression profile if the former level falls within the standard deviation (or a multiple or fraction thereof) of the reference level.
[0069] The criteria for the overall similarity between the expression profile of a subject of interest and a reference expression profile can be selected such that the accuracy (the ratio of correct calls over the total of correct and incorrect calls) for leukemia diagnosis or assessment is relatively high. For instance, the similarity criteria can be selected such that the accuracy for leukemia diagnosis or assessment is at least 50%, 60%, 70%, 80%, 90%, or more. In one example, an overall similarity call is made if at least 50%, 60%, 70%, 80%, 90%, or more of the components in the expression profile of the subject of interest are considered similar to the corresponding components in the reference expression profile. Different components in the expression profiles may have the same or different weights in comparison. The gene expression-based methods can also be combined with other clinical tests to improve the accuracy of leukemia diagnosis or assessment.
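The per-gene and overall similarity criteria of paragraphs [0068] and [0069] can be illustrated with the following Python sketch. It is only one possible reading: the function names, gene labels, and expression values are hypothetical, all genes are weighted equally, and the 20% per-gene and 80% overall thresholds are two of the example values listed above.

```python
def is_similar(level: float, reference: float, max_fraction: float = 0.2) -> bool:
    """Per-gene call: the difference is less than max_fraction of the reference level."""
    return abs(level - reference) < max_fraction * abs(reference)


def overall_similar(profile, reference_profile, max_fraction=0.2, min_share=0.8) -> bool:
    """Overall call: at least min_share of the genes receive a per-gene similar call."""
    calls = [
        is_similar(profile[gene], reference_level, max_fraction)
        for gene, reference_level in reference_profile.items()
    ]
    return sum(calls) / len(calls) >= min_share


# Hypothetical reference and subject profiles keyed by placeholder gene labels.
reference = {"gene_a": 100.0, "gene_b": 50.0, "gene_c": 200.0, "gene_d": 10.0, "gene_e": 80.0}
subject = {"gene_a": 110.0, "gene_b": 45.0, "gene_c": 190.0, "gene_d": 20.0, "gene_e": 85.0}
print(overall_similar(subject, reference))
```

Here four of the five genes are within 20% of the reference level, so the overall call is made at the 80% criterion but would not be made at a 90% criterion.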
[0070] The weighted voting algorithm is capable of assigning a class membership to a subject of interest. See Golub et al., supra, and Slonim et al., supra. Software programs suitable for this purpose include, but are not limited to, the GeneCluster 2 software (Broad Institute, Cambridge, MA).
[0071] Under one form of the weighted voting analysis, a subject of interest is assigned to one of two classes (i.e., class 0 and class 1), each class representing a different disease status (e.g., AML, MDS, or disease-free). For instance, class 0 can include disease-free humans and class 1 can include MDS (or AML) patients. As another instance, class 0 can include AML patients and class 1 can include MDS patients. A set of AML or MDS disease genes can be selected from Tables 2, 4, or 6 to form a classifier (i.e., class predictor). Each gene in the classifier casts a weighted vote for one of the two classes (class 0 or class 1). The vote of gene "g" can be defined as v_g = a_g(x_g - b_g), wherein a_g equals P(g,c) and reflects the correlation between the expression level of gene "g" and the class distinction between class 0 and class 1. b_g equals [x0(g) + x1(g)]/2, which is the average of the mean logs of the expression levels of gene "g" in class 0 and class 1. x_g represents the normalized log of the expression level of gene "g" in the sample of interest. A positive v_g indicates a vote for class 0, and a negative v_g indicates a vote for class 1. V0 denotes the sum of all positive votes, and V1 denotes the absolute value of the sum of all negative votes. A prediction strength PS is defined as PS = (V0 - V1)/(V0 + V1).
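The vote and prediction-strength definitions of paragraph [0071] can be traced in a short Python sketch. It rests on one labeled assumption: P(g,c) is taken to be the signal-to-noise statistic of Golub et al., i.e., the difference of the class means divided by the sum of the class standard deviations, which this section does not itself define. Function names, gene labels, and expression values are hypothetical.

```python
from statistics import mean, stdev


def signal_to_noise(class0_logs, class1_logs):
    """Assumed form of P(g,c): (mean0 - mean1) / (sd0 + sd1)."""
    return (mean(class0_logs) - mean(class1_logs)) / (
        stdev(class0_logs) + stdev(class1_logs)
    )


def weighted_vote(sample, training):
    """Return (predicted_class, PS) for one sample.

    `sample` maps gene -> normalized log expression level (x_g).
    `training` maps gene -> (class 0 log levels, class 1 log levels).
    """
    v0 = 0.0  # sum of positive votes (class 0)
    v1 = 0.0  # absolute value of the sum of negative votes (class 1)
    for gene, (class0_logs, class1_logs) in training.items():
        a_g = signal_to_noise(class0_logs, class1_logs)
        b_g = (mean(class0_logs) + mean(class1_logs)) / 2
        v_g = a_g * (sample[gene] - b_g)
        if v_g > 0:
            v0 += v_g
        else:
            v1 += -v_g
    if v0 + v1 == 0:
        raise ValueError("no votes were cast")
    prediction_strength = (v0 - v1) / (v0 + v1)
    return (0 if v0 > v1 else 1), prediction_strength


# Hypothetical training data for two placeholder genes.
training_data = {
    "gene_a": ([8.0, 9.0, 10.0], [4.0, 5.0, 6.0]),
    "gene_b": ([2.0, 3.0, 4.0], [6.0, 7.0, 8.0]),
}
print(weighted_vote({"gene_a": 9.0, "gene_b": 3.5}, training_data))
print(weighted_vote({"gene_a": 6.0, "gene_b": 4.5}, training_data))
```

With PS defined as (V0 - V1)/(V0 + V1), a value near +1 indicates a strong call for class 0 and a value near -1 a strong call for class 1.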
[0072] Cross-validation can be used to evaluate the accuracy of a class predictor created under the weighted voting algorithm. In one embodiment, cross-validation includes withholding a sample which has been used in the neighborhood analysis for the identification of the disease genes. A class predictor is created based on the remaining samples, and then used to predict the class of the sample withheld. This process is repeated for each sample that has been used in the neighborhood analysis. Class predictors with different leukemia disease genes are evaluated by cross-validation, and the class predictor with the most accurate prediction can be identified.
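The leave-one-out procedure of paragraph [0072] can be sketched in Python as follows. The sketch is generic: any function that builds a predictor from training samples can be passed in. A simple nearest-centroid predictor is used here purely as a stand-in so that the example is self-contained; it is not the weighted voting predictor, and all names and values are hypothetical.

```python
import math


def leave_one_out_accuracy(samples, build_predictor):
    """Withhold each sample in turn, train on the rest, and score the withheld sample.

    `samples` is a list of (label, profile) pairs.
    `build_predictor(training)` must return a callable mapping profile -> label.
    """
    correct = 0
    for index, (label, profile) in enumerate(samples):
        training = samples[:index] + samples[index + 1:]
        predict = build_predictor(training)
        if predict(profile) == label:
            correct += 1
    return correct / len(samples)


def nearest_centroid(training):
    """Stand-in predictor: assign the label of the closest class centroid."""
    grouped = {}
    for label, profile in training:
        grouped.setdefault(label, []).append(profile)
    centroids = {
        label: [sum(column) / len(column) for column in zip(*profiles)]
        for label, profiles in grouped.items()
    }
    return lambda profile: min(
        centroids, key=lambda label: math.dist(profile, centroids[label])
    )


# Hypothetical two-gene profiles for class 0 and class 1 samples.
example_samples = [
    (0, [1.0, 1.0]), (0, [1.2, 0.8]), (0, [0.9, 1.1]),
    (1, [3.0, 3.0]), (1, [3.2, 2.9]), (1, [2.8, 3.1]),
]
print(leave_one_out_accuracy(example_samples, nearest_centroid))
```

Candidate class predictors built from different gene sets could each be scored this way and the one with the highest accuracy retained.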
[0073] Any number of leukemia disease genes can be employed in the present invention. In one embodiment, at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, or more genes selected from Table 2 are used for the diagnosis or evaluation of the effectiveness of a treatment of AML in a subject of interest. In another embodiment, at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, or more genes selected from Table 4 are used for the diagnosis or evaluation of the effectiveness of a treatment of MDS in a subject of interest. In still another embodiment, at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, or more genes selected from Table 6 are used for the diagnosis or evaluation of the progression or treatment of MDS in a subject of interest. In a further embodiment, a combination of genes selected from Tables 2, 4, or 6 are used for the diagnosis or evaluation of the progression or treatment of AML or MDS in a subject of interest.
[0074] The leukemia disease gene(s) employed in the present invention can be selected to have p-values of no greater than 0.05, 0.01, 0.005, 0.001, 0.0005, 0.0001, or less. The leukemia disease gene(s) can also be selected to include gene(s) that are upregulated in leukemia patients as compared to disease-free humans, as well as gene(s) that are downregulated in leukemia patients as compared to disease-free humans. To improve the accuracy of leukemia diagnosis or assessment, the leukemia disease gene(s) can also be selected through the use of optimization algorithms such as the mean variance algorithm as described in U.S. Patent Application 20040214179.
[0075] Where a weighted voting or k-nearest neighbors algorithm is used, the leukemia disease genes in a class predictor can be selected such that they are significantly correlated with the class distinction in the neighborhood analysis. For instance, the leukemia disease genes that are above the 1%, 5%, or 10% significance level in the neighborhood analysis can be selected. See Golub et al., supra, and Slonim et al., supra. The leukemia disease genes in a class predictor can also include top upregulated leukemia disease gene(s), as well as top downregulated leukemia disease gene(s).
[0076] In one example, a class predictor employed in the present invention comprises or consists of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 16, 18, 20, 40, or more genes selected from Table 2 or Table 4. The class predictor can include at least two groups of genes. The first group includes gene or genes having AML/Disease-Free ratios (or MDS/Disease-Free ratios) of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, or more, and the second group includes gene or genes having AML/Disease-Free ratios (or MDS/Disease-Free ratios) of no greater than 0.5, 0.333, 0.25, 0.2, 0.1, or less.
[0077] In another example, a class predictor employed in the present invention comprises or consists of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 16, 18, 20, 40, or more genes selected from Table 6. The class predictor can also include at least two groups of genes. The first group includes gene or genes having AML/MDS ratios of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, or more, and the second group includes gene or genes having AML/MDS ratios of no greater than 0.5, 0.333, 0.25, 0.2, 0.1, or less.
[0078] In addition to genes depicted in Tables 2, 4, and 6, the present invention also contemplates the use of other leukemia disease genes for the diagnosis or assessment of the progression or treatment of leukemia in a subject of interest. These genes can hybridize under stringent or nucleic acid array hybridization conditions to the qualifiers selected from Tables 2, 4, and 6. As used herein, a gene can hybridize to a qualifier if the RNA transcript (or the complement thereof, depending on the strandedness of the oligonucleotide probes of the qualifier) of the gene can hybridize to at least one oligonucleotide probe of the qualifier. In many instances, the RNA transcript (or the complement thereof) of the gene can hybridize under stringent or nucleic acid array hybridization conditions to at least 50%, 60%, 70%, 80%, 90% or 100% of the oligonucleotide probes of the qualifier and produce a "present" call at the qualifier on an Affymetrix Genechip® under the default settings (i.e., the threshold Tau is 0.015 and the significance level α1 is 0.4). See Genechip® Expression Analysis - Data Analysis Fundamentals (Part No. 701190 Rev. 2, Affymetrix, Inc., 2002).
D. Other Applications
[0079] A clinical challenge concerning AML, MDS and other blood or bone marrow diseases is the highly variable response of patients to a therapy. The basic concept of pharmacogenomics is to understand a patient's genotype in relation to available treatment options and then individualize the most appropriate option for the patient. According to the present invention, different classes of patients can be created based on their different responses to a given therapy. Genes differentially expressed in un-fractionated PBMCs of one response class as compared to another response class can be identified using the global gene expression analysis. These genes are molecular markers for predicting whether a patient of interest will be more or less responsive to the therapy. For patients predicted to have a favorable outcome, efforts to minimize toxicity of the therapy may be considered, whereas for those predicted not to respond to the therapy, treatment with other therapies or experimental regimens can be explored.
[0080] In one embodiment, patients are grouped into at least two classes (class 0 and class 1). Class 0 includes patients who die within a specified period of time (such as one year) after initiation of a treatment. Class 1 includes patients who survive beyond the specified period of time after initiation of the treatment. Genes that are differentially expressed in un-fractionated PBMCs of class 0 patients as compared to in un-fractionated PBMCs of class 1 patients can be identified. These genes are prognostic markers of patient clinical outcome. Other clinical outcome criteria, such as remission/non-remission, time to progression, complete response, partial response, stable disease, or progressive disease, can also be used to group leukemia patients to identify the corresponding prognostic genes. [0081] The leukemia disease genes of the present invention can also be used to identify or test drugs for the treatment of AML or MDS. The ability of a drug candidate to reduce or abolish the abnormal expression of AML or MDS disease genes in un-fractionated PBMCs is suggestive of the effectiveness of the drug candidate in treating AML or MDS. Methods for screening or evaluating drug candidates are well known in the art. These methods can be carried out either in animal models or during human clinical trials.
[0082] The present invention also contemplates expression vectors encoding AML or MDS disease genes. These AML or MDS disease genes may be under-expressed in AML or MDS tumor cells. By introducing the expression vectors into the patients in need thereof, abnormal expression of these genes can be corrected. Expression vectors and gene delivery techniques suitable for this purpose are well known in the art.
[0083] In addition, this invention contemplates sequences that are antisense to AML or MDS disease genes or expression vectors encoding the same. The AML or MDS disease genes may be over-expressed in AML or MDS tumor cells. By introducing the antisense sequences or expression vectors encoding the same, abnormal expression of these disease genes can be corrected. [0084] Expression of an AML or MDS disease gene can also be inhibited by RNA interference ("RNAi"). RNAi is a technique used in post transcriptional gene silencing ("PTGS"), in which the targeted gene activity is specifically abolished. RNAi resembles in many aspects PTGS in plants and has been detected in many invertebrates including trypanosome, hydra, planaria, nematode and fruit fly (Drosophila melanogaster). It may be involved in the modulation of transposable element mobilization and antiviral state formation. RNAi in mammalian systems is disclosed in PCT application WO00/63364. In one embodiment, dsRNA of at least about 21 nucleotides is introduced into cells to silence the expression of the target gene.
[0085] In addition, the present invention features antibodies that specifically recognize the polypeptides encoded by AML or MDS disease genes. These antibodies can be administered to patients in need thereof. In one embodiment, an antibody of the present invention can substantially reduce or inhibit the activity of a disease gene. For instance, the antibody can reduce the activity of a disease gene by at least about 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or more. Suitable antibodies for the present invention include, but are not limited to, polyclonal antibodies, monoclonal antibodies, chimeric antibodies, humanized antibodies, single chain antibodies, Fab fragments, or fragments produced by a Fab expression library. In many embodiments, the antibodies of the present invention can bind to the respective AML or MDS disease gene products or other desired antigens with a binding affinity constant Ka of at least 10^6 M^-1, 10^7 M^-1, 10^8 M^-1, 10^9 M^-1, or more.
[0086] A pharmaceutical composition comprising an antibody or a polynucleotide of the present invention can be prepared. The pharmaceutical composition can be formulated to be compatible with its intended route of administration. Examples of routes of administration include, but are not limited to, parenteral, intravenous, intradermal, subcutaneous, oral, inhalational, transdermal, topical, transmucosal, and rectal administration. Methods for preparing desirable pharmaceutical compositions are well known in the art.
[0087] The present invention further features kits or apparatuses for diagnosing or monitoring the progression or treatment of AML or MDS. In one embodiment, a kit or apparatus of the present invention includes or consists essentially of one or more polynucleotides, each of which is capable of hybridizing under stringent conditions to a gene selected from Tables 2, 4, or 6. The polynucleotide(s) can be labeled with fluorescent, radioactive, or other detectable moieties. The polynucleotide(s) can also be un-labeled. Any number of polynucleotides can be included in a kit or apparatus. For instance, at least 1, 2, 3, 4, 5, 10, 15, 20, or more polynucleotides can be included in a kit or apparatus, each polynucleotide being capable of hybridizing under stringent conditions to a different respective gene selected from Tables 2, 4, or 6. In one example, the polynucleotide(s) included in a kit or apparatus is enclosed in a vial, a tube, a bottle or another containing means. In another example, the polynucleotide(s) is stably attached to one or more substrates. Nucleic acid hybridization can be directly conducted on the substrate(s). Hybridization reagents can also be included in a kit or apparatus of the present invention.
[0088] In another embodiment, a kit or apparatus of the present invention includes or consists essentially of one or more antibodies specific for the polypeptide(s) encoded by the gene(s) selected from Tables 2, 4, or 6. The antibody or antibodies can be labeled with one or more detectable moieties to allow for detection of antibody-antigen complexes. The antibody or antibodies can also be un-labeled.
[0089] Any number of antibodies can be included in a kit or apparatus. For instance, at least 1, 2, 3, 4, 5, 10, 15, 20, or more antibodies can be included in a kit or apparatus, and each of these antibodies can specifically recognize a different respective AML or MDS disease gene product. Immunodetection reagents (such as secondary antibodies, controls or enzyme substrates) can also be included in a kit or apparatus of the present invention. In one example, a kit of the present invention includes one or more containers which enclose the antibody or antibodies. In another example, the antibody or antibodies in an apparatus of the present invention are stably attached to one or more substrates. Substrates suitable for this purpose include, but are not limited to, films, membranes, column matrices, or microtiter plate wells. Immunoassays can be performed directly on the substrate(s).
[0090] Furthermore, the present invention features systems capable of comparing an expression profile of interest to at least one reference expression profile. In many embodiments, the reference expression profiles are stored in a database. The comparison between the expression profile of interest and the reference expression profile(s) can be carried out electronically, such as by using a computer system. The computer system typically comprises a processor coupled to a memory which stores data representing the expression profiles to be compared. In one embodiment, the memory is readable as well as rewritable. The expression profiles can be retrieved or modified. The computer system includes one or more programs capable of causing the processor to compare the expression profiles. In one embodiment, the computer system includes a program capable of executing a weighted voting or a k-nearest-neighbors algorithm. In another embodiment, the computer system is coupled to a nucleic acid array from which hybridization signals can be directly fed into the system.
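As a minimal sketch of the comparison step described in paragraph [0090], the following Python fragment assigns an expression profile of interest to a class by a k-nearest-neighbors vote over stored reference profiles. The function name, the Euclidean distance, the value of k, and the toy expression values are illustrative assumptions and are not taken from this disclosure.

```python
import math
from collections import Counter

def knn_classify(profile, references, k=3):
    """Assign `profile` to the majority class among its k nearest reference profiles.

    `references` is a list of (class_label, expression_vector) pairs.
    Euclidean distance is an assumption; another metric could be substituted.
    """
    ranked = sorted(references, key=lambda ref: math.dist(profile, ref[1]))
    votes = Counter(label for label, _ in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical reference database: three genes per profile, arbitrary units.
REFERENCES = [
    ("disease-free", [1.0, 1.1, 0.9]),
    ("disease-free", [0.9, 1.0, 1.0]),
    ("MDS", [2.0, 0.5, 1.5]),
    ("MDS", [2.1, 0.4, 1.6]),
    ("AML", [3.0, 0.1, 2.5]),
    ("AML", [3.2, 0.2, 2.4]),
]

# The two closest references are the MDS profiles, so the vote returns "MDS".
print(knn_classify([2.05, 0.45, 1.5], REFERENCES))
```

In a real system the reference list would be read from the database of reference expression profiles, and k would be chosen by cross validation.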
Kits for prognosis, diagnosis or selection of treatment of MDS, AML, and other leukemias
[0091] In addition, the present invention features kits useful for the prognosis, diagnosis or selection of treatment of MDS, AML or other leukemias. Each kit includes or consists essentially of at least one probe for a leukemia disease gene (e.g., a gene selected from Tables 2, 4, or 6). Reagents or buffers that facilitate the use of the kit can also be included. Any type of probe can be used in the present invention, such as hybridization probes, amplification primers, or antibodies.
[0092] In one embodiment, a kit of the present invention includes or consists essentially of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more polynucleotide probes or primers. Each probe/primer can hybridize under stringent conditions or nucleic acid array hybridization conditions to a different respective leukemia disease gene. As used herein, a polynucleotide can hybridize to a gene if the polynucleotide can hybridize to an RNA transcript, or the complement thereof, of the gene. In another embodiment, a kit of the present invention includes one or more antibodies, each of which is capable of binding to a polypeptide encoded by a different respective leukemia disease gene.
[0093] In one example, a kit of the present invention includes or consists essentially of probes (e.g., hybridization or PCR amplification probes or antibodies) for at least 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75 or more genes selected from Table 2, 4 or 6.
[0094] The probes employed in the present invention can be either labeled or unlabeled. Labeled probes can be detectable by spectroscopic, photochemical, biochemical, bioelectronic, immunochemical, electrical, optical, chemical, or other suitable means. Exemplary labeling moieties for a probe include radioisotopes, chemiluminescent compounds, labeled binding proteins, heavy metal atoms, spectroscopic markers, such as fluorescent markers and dyes, magnetic labels, linked enzymes, mass spectrometry tags, spin labels, electron transfer donors and acceptors, and the like.
[0095] The kits of the present invention can also have containers containing buffer(s) or reporter means. In addition, the kits can include reagents for conducting positive or negative controls. In one embodiment, the probes employed in the present invention are stably attached to one or more substrate supports. Nucleic acid hybridization or immunoassays can be directly carried out on the substrate support(s). Suitable substrate supports for this purpose include, but are not limited to, glasses, silica, ceramics, nylons, quartz wafers, gels, metals, papers, beads, tubes, fibers, films, membranes, column matrices, or microtiter plate wells. The kits of the present invention may also contain one or more controls, each representing a reference expression level of a disease gene detectable by one or more probes contained in the kits.
[0096] It should be understood that the above-described embodiments and the following examples are given by way of illustration, not limitation. Various changes and modifications within the scope of the present invention will become apparent to those skilled in the art from the present description.
E. Examples
Example 1. Purification of PBMCs and RNA
[0097] Whole blood was collected from disease-free volunteers, patients with MDS, and patients with AML. Informed consents for the pharmacogenomic portions of these clinical studies were received and the project was approved by the local Institutional Review Boards at the participating clinical sites. MDS patients were primarily of Caucasian descent and had a mean age of 66 years (range of 52-84 years). AML patients were exclusively of Caucasian descent and had a mean age of 45 years (range of 19-65 years). Disease-free volunteers were exclusively of Caucasian descent with a mean age of 23 years (range of 18-32 years).
[0098] Inclusion criteria for AML patients included blasts in excess of 20% in the bone marrow, morphologic diagnosis of AML according to the FAB classification system and flow cytometry analysis indicating CD33+ status. Inclusion criteria for MDS patients included morphologic diagnosis of MDS and FAB classification as refractory anemia, refractory anemia with ringed sideroblasts, refractory anemia with excess blasts, or refractory anemia with excess blasts in transformation (where disease stability had been demonstrated for a minimum of 2 months).
[0099] The blood samples were drawn into CPT Cell Preparation Vacutainer Tubes (Becton Dickinson). PBMCs were isolated over Ficoll gradients according to the manufacturer's protocol (Becton Dickinson). Total RNA was isolated from un-fractionated PBMC pellets using Qiagen RNeasy® mini-kits (Qiagen, Valencia, CA).
Example 2. RNA Amplification and Generation of Genechip® Hybridization Probes
[00100] Labeled target for oligonucleotide arrays was prepared using a modification of the procedure described in Lockhart et al. (1996) Nature Biotechnology 14:1675-1680. Two micrograms of total RNA were converted to cDNA using an oligo-d(T)24 primer containing a T7 DNA polymerase promoter at the 5' end. The cDNA was used as the template for in vitro transcription using a T7 DNA polymerase kit (Ambion, Woodlands, TX, USA) and biotinylated CTP and UTP (Enzo, Farmingdale, NY, USA). Labeled cRNA was fragmented in 40 mM Tris-acetate pH 8.0, 100 mM KOAc, 30 mM MgOAc for 35 min at 94 °C in a final volume of 40 μl.
Example 3. Hybridization to Affymetrix Microarrays and Detection of Fluorescence
[00101] Individual diseased and disease-free samples are hybridized to HG-U133A Genechips® (Affymetrix). No samples are pooled. 10 μg of labeled target is diluted in 1x MES buffer with 100 μg/ml herring sperm DNA and 50 μg/ml acetylated BSA. To normalize arrays to each other and to estimate the sensitivity of the oligonucleotide arrays, in vitro synthesized transcripts of 11 bacterial genes are included in each hybridization reaction, as described in Hill et al. (2000) Science 290:809-812. The abundance of these transcripts ranges from 1:300,000 (3 ppm) to 1:1000 (1000 ppm) stated in terms of the number of control transcripts per total transcripts. As determined by the signal response from these control transcripts, the sensitivity of detection of the arrays can range between about 1:300,000 and 1:100,000 copies/million.
[00102] Labeled probes are denatured at 99 °C for 5 minutes and then 45 °C for 5 minutes and hybridized to oligonucleotide arrays comprised of over 12,500 human genes (HG-U133A, Affymetrix). Arrays are hybridized for 16 hours at 45 °C. The hybridization buffer includes 100 mM MES, 1 M [Na+], 20 mM EDTA, and 0.01% Tween 20. After hybridization, the cartridges are washed extensively with wash buffer 6x SSPET (e.g., three times at room temperature for at least 10 minutes each time). These hybridization and washing conditions are collectively referred to as "nucleic acid array hybridization conditions." The washed cartridges are subsequently stained with phycoerythrin coupled to streptavidin.
[00103] 12x MES stock contains 1.22 M MES and 0.89 M [Na+]. For 1000 ml, the stock can be prepared by mixing 70.4 g MES free acid monohydrate, 193.3 g MES sodium salt and 800 ml of molecular biology grade water, and adjusting volume to 1000 ml. The pH should be between 6.5 and 6.7. 2x hybridization buffer can be prepared by mixing 8.3 ml of 12x MES stock, 17.7 ml of 5 M NaCl, 4.0 ml of 0.5 M EDTA, 0.1 ml of 10% Tween 20 and 19.9 ml of water. 6x SSPET contains 0.9 M NaCl, 60 mM NaH2PO4, 6 mM EDTA, pH 7.4, and 0.005% Triton X-100. In some cases, the wash buffer is replaced with a more stringent wash buffer. 1000 ml of the stringent wash buffer can be prepared by mixing 83.3 ml of 12x MES stock, 5.2 ml of 5 M NaCl, 1.0 ml of 10% Tween 20 and 910.5 ml of water.
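The 2x hybridization buffer recipe in paragraph [00103] can be checked against the 1x concentrations stated in paragraph [00102] with simple dilution arithmetic. The sketch below is illustrative only: the function and variable names are hypothetical, and sodium contributed by the EDTA stock is ignored because the disclosure does not state which EDTA salt is used.

```python
def final_concentration(components, solute):
    """Return the concentration of `solute` after mixing all components.

    `components` is a list of (volume_ml, {solute: stock_concentration}) pairs.
    """
    total_volume = sum(volume for volume, _ in components)
    amount = sum(volume * stock.get(solute, 0.0) for volume, stock in components)
    return amount / total_volume

# 2x hybridization buffer recipe from paragraph [00103] (volumes in ml).
# Concentrations are molar, except Tween 20, which is in percent.
HYB_2X = [
    (8.3, {"MES": 1.22, "Na+": 0.89}),  # 12x MES stock
    (17.7, {"Na+": 5.0}),               # 5 M NaCl
    (4.0, {"EDTA": 0.5}),               # 0.5 M EDTA (sodium counter-ions ignored)
    (0.1, {"Tween 20": 10.0}),          # 10% Tween 20
    (19.9, {}),                         # water
]

for solute in ("MES", "Na+", "EDTA", "Tween 20"):
    # Halve the 2x value to obtain the 1x working concentration.
    print(solute, round(final_concentration(HYB_2X, solute) / 2, 4))
```

Under these assumptions the 1x values come out near 0.10 M MES, 0.02 M EDTA and 0.01% Tween 20, matching paragraph [00102]. The sodium figure from the MES stock and NaCl alone is about 0.96 M; the remainder toward the stated 1 M would presumably come from the EDTA salt, which is an assumption rather than something the text states.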
Example 4. Gene Expression Data Analysis
[00104] Data analysis and absent/present call determination are performed on raw fluorescent intensity values using Genechip® 3.2 software (Affymetrix). The "average difference" values for each transcript are normalized to "frequency" values using the scaled frequency normalization method in which the average differences for 11 control cRNAs with known abundance spiked into each hybridization solution were used to generate a global calibration curve. See Hill et al. (2001) Genome Biol., 2(12):research0055.1-0055.13, the entire content of which is incorporated herein by reference. This calibration is then used to convert average difference values for all transcripts to frequency estimates, stated in units of parts per million ranging from 1:300,000 (3 parts per million (ppm)) to 1:1000 (1000 ppm).
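The idea of converting average-difference values to frequencies through a spike-in calibration can be sketched as follows. This is a deliberately simplified stand-in, assuming a straight line through the origin fitted by least squares; it is not the fitting procedure of Hill et al. (2001), and the spike-in readings and function names are hypothetical.

```python
def fit_calibration(avg_diffs, known_ppm):
    """Least-squares slope of a line through the origin: ppm = slope * average difference."""
    numerator = sum(x * y for x, y in zip(avg_diffs, known_ppm))
    denominator = sum(x * x for x in avg_diffs)
    return numerator / denominator

def to_frequency(avg_diff, slope):
    """Convert an average-difference value to an estimated frequency in ppm."""
    return slope * avg_diff

# Hypothetical spike-in readings: (average difference, known abundance in ppm).
SPIKES = [(60.0, 3.0), (200.0, 10.0), (1000.0, 50.0), (4000.0, 200.0), (20000.0, 1000.0)]

slope = fit_calibration([x for x, _ in SPIKES], [y for _, y in SPIKES])
print(round(to_frequency(500.0, slope), 2))
```

Because one calibration is fitted per array from that array's own spike-in signals, frequencies from different arrays end up on a common ppm scale.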
[00105] Genechip® 3.2 software uses algorithms to calculate the likelihood as to whether a gene is "absent" or "present" as well as a specific hybridization intensity value or "average difference" for each transcript represented on the array. The algorithms used in these calculations are described in the Affymetrix Genechip® Analysis Suite User Guide.
[00106] Specific transcripts can be evaluated further if they meet the following criteria. First, genes that are designated "absent" by the Genechip® 3.2 software in all samples are excluded from the analysis. Second, in comparisons of transcript levels between arrays, a gene is required to be present in at least one of the arrays. Third, for comparisons of transcript levels between groups, a Student's t-test is applied to identify a subset of transcripts that had a significant difference (p < 0.05) in frequency values. In many cases, a fourth criterion, which requires that average fold changes in frequency values across the statistically significant subset of genes be 2-fold or greater, is also used.
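The selection criteria of paragraph [00106] can be sketched as a per-transcript filter. In the fragment below the names are hypothetical and the data are invented. To stay within the standard library, the p < 0.05 test is expressed as a comparison of the pooled-variance t statistic with a caller-supplied critical value; a statistics package (for example SciPy's `scipy.stats.ttest_ind`) would return the p value directly.

```python
import math
from statistics import mean, variance

def student_t(group_a, group_b):
    """Two-sample Student's t statistic with pooled variance (equal variances assumed)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled * (1 / n_a + 1 / n_b))

def passes_filters(freq_a, freq_b, calls, t_critical, min_fold=2.0):
    """Apply the present-call, t-test, and fold-change criteria to one transcript.

    `calls` holds the "P" (present) or "A" (absent) call for every sample.
    `t_critical` is the two-sided critical value for p < 0.05 at
    len(freq_a) + len(freq_b) - 2 degrees of freedom.
    """
    if "P" not in calls:  # absent in all samples: excluded
        return False
    if abs(student_t(freq_a, freq_b)) <= t_critical:
        return False
    high = max(mean(freq_a), mean(freq_b))
    low = min(mean(freq_a), mean(freq_b))
    return low > 0 and high / low >= min_fold

# Invented frequency values (ppm) for one transcript in two groups of four samples.
AML = [40.0, 44.0, 38.0, 42.0]
HEALTHY = [10.0, 12.0, 9.0, 11.0]
T_CRIT_DF6 = 2.447  # two-sided 5% critical value of Student's t at 6 degrees of freedom

print(passes_filters(AML, HEALTHY, ["P"] * 8, T_CRIT_DF6))
```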
[00107] Unsupervised hierarchical clustering of genes and/or arrays on the basis of similarity of their expression profiles can be performed using the procedure described in Eisen et al. (1998) Proc. Nat. Acad. Sci. U.S.A., 95: 14863-14868. Nearest-neighbor prediction analysis and supervised cluster analysis can be performed using metrics illustrated in Golub et al., supra. For hierarchical clustering and nearest-neighbor prediction analysis, data can be first log-transformed and then normalized to have a mean value of zero and a variance of one. A Student's t-test can be used to compare disease-free, AML and MDS PBMC expression profiles. A p value of no more than 0.05 (e.g., no more than 0.01, 0.001, or less) can be used to indicate statistical significance.
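The log-transformation and normalization step mentioned in paragraph [00107] can be illustrated with a few lines of Python. The natural logarithm and the population variance (division by n) are assumptions, since the text specifies neither, and the input values are invented.

```python
import math
from statistics import mean, pstdev

def standardize(frequencies):
    """Log-transform positive values, then rescale to mean zero and variance one."""
    logged = [math.log(value) for value in frequencies]
    center, spread = mean(logged), pstdev(logged)
    return [(value - center) / spread for value in logged]

# Invented frequency values (ppm) for one gene across four samples.
z = standardize([3.0, 10.0, 50.0, 200.0])
print([round(value, 3) for value in z])
```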
[00108] A k-nearest-neighbors approach can be used to perform a neighborhood analysis of real and randomly permuted data using a correlation metric [P(g,c) = (μ1 - μ2) / (σ1 + σ2)], where g is the expression vector of gene g, c is a class vector, μ1 and σ1 define the mean expression level and standard deviation of gene g in class 1, respectively, and μ2 and σ2 define the mean expression level and standard deviation of gene g in class 2, respectively. The measures of correlation for the most statistically significant genes observed in real class distinctions (AML versus disease-free, MDS versus disease-free, or AML versus MDS) can be compared to the most statistically significant measures of correlation observed in randomly permuted class distinctions. The top 1%, 5% and median distance measurements of 100 randomly permuted classes compared to the observed distance measurements for AML versus disease-free, MDS versus disease-free, or AML versus MDS can be plotted to show the statistical verification of the leukemia disease genes identified by this invention.
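The correlation metric of paragraph [00108] and its comparison against permuted class labels can be sketched for a single gene as follows. The function names, the seed, and the expression values are hypothetical; the sample standard deviation is an assumption; and a full neighborhood analysis would repeat the calculation over every gene on the array rather than one.

```python
import random
from statistics import mean, stdev

def signal_to_noise(values, labels):
    """P(g,c) = (mu1 - mu2) / (sigma1 + sigma2) for one gene; labels are 1 or 2."""
    class1 = [v for v, c in zip(values, labels) if c == 1]
    class2 = [v for v, c in zip(values, labels) if c == 2]
    return (mean(class1) - mean(class2)) / (stdev(class1) + stdev(class2))

def permutation_threshold(values, labels, n_permutations=100, top_fraction=0.01, seed=0):
    """Return the |P| value at the top `top_fraction` of randomly permuted labelings.

    Shuffling the label list keeps the class sizes identical to the real ones.
    """
    rng = random.Random(seed)
    shuffled = list(labels)
    scores = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        scores.append(abs(signal_to_noise(values, shuffled)))
    scores.sort(reverse=True)
    index = max(0, int(top_fraction * n_permutations) - 1)
    return scores[index]

# Invented expression values for one gene: six class-1 and six class-2 samples.
VALUES = [9.1, 8.7, 9.4, 8.9, 9.0, 9.2, 2.1, 2.5, 1.9, 2.3, 2.2, 2.0]
LABELS = [1] * 6 + [2] * 6

observed = abs(signal_to_noise(VALUES, LABELS))
threshold = permutation_threshold(VALUES, LABELS)
print(observed >= threshold)
```

A gene whose observed score exceeds the top 1% permutation score is unlikely to separate the classes by chance alone.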
Example 5. Gene Classifiers for Prediction of Disease-Free versus MDS versus AML
[00109] A 24-qualifier signature (8 cDNAs representing 7 genes defining AML, 8 cDNAs representing 7 genes defining MDS, and 8 cDNAs representing 8 genes defining disease-free) was identified. This signature can accurately predict and classify PBMC samples of disease-free individuals, MDS patients, or AML patients. This signature also identifies rapid MDS progressors as "AML," with implications for early detection of AML progression in MDS patients.
[00110] The qualifiers in the 24-qualifier signature are listed in Table 8, below. For each qualifier, the signal-to-noise value associated with the qualifier is provided in the column labeled "Score." Each signal-to-noise value was greater than the value in the adjacent "Perm 1%" column, representing the signal-to-noise values observed for the top 1% of random permutations when the labels of the profiles were scrambled and then compared using identical class sizes. Thus, the actual signal-to-noise values for the qualifiers were superior to those in the top 1% of random permutations. The corresponding human genes are identified by name, by symbol, by chromosomal location ("Cyto Band"), by Unigene number, and by GenBank accession number. Human genes used to identify AML include human myb; human neuronal protein 3.1; human myeloperoxidase; human catalase; human CGI-49; human stem cell growth factor; and human serine peptidase inhibitor, Kazal type 2 (acrosin-trypsin inhibitor). Human genes used to identify MDS include human NEDD4L; human glutathione peroxidase 3; human X-linked Kx blood group; human synuclein, alpha; human chromosome 8 open reading frame 51/hypothetical protein MGC3113; human interferon, alpha-inducible protein 27; and human transglutaminase 3. Human genes used to identify PBMCs from disease-free individuals include human chromosome 21 open reading frame 7; human amyloid beta A4 precursor protein-binding family A member 2; human KIAA0449; human F-box only protein 21; human death effector filament-forming ced-4-like apoptosis protein; human zinc finger protein 14; human vasoactive intestinal peptide receptor 1; and human KIAA0443.
[00111] Gene expression patterns in peripheral blood mononuclear cells were measured by oligonucleotide arrays for 45 disease-free subjects, 36 patients diagnosed with AML and 20 patients with initial diagnoses of MDS. Comparisons of these groups identified transcriptional differences that easily separated AML and MDS from healthy volunteers, and annotation revealed that many of the differences appeared due to proliferation of CD34+ blasts in the circulation of these patients. The possibility of discriminating between MDS and AML patients on the basis of transcriptional profiles in peripheral blood was next explored. Of the 20 patients with initial diagnoses of MDS, six of the patient samples were determined to come from either 1) MDS patients with conflicting diagnoses between the site pathologist and a central pathologist (n=3) or 2) MDS patients who rapidly progressed after blood sampling (< 3 months, n=3). A supervised approach on a training set of healthy, AML and non-progressor MDS samples was used to identify a gene classifier correlated with profiles in healthy individuals, stable MDS patients, and AML patients. An 8 gene classifier was optimally predictive, exhibiting an overall accuracy of 94% in the training set (62/66 subjects correctly assigned by leave-one-out cross validation). One of the four misclassified samples in the training set was from an MDS patient with a conflicting diagnosis who was "misclassified" upon cross validation as AML. When the 8 gene classifier was applied to the remaining samples in the test set, this classifier identified the remaining unambiguous samples in the test set with similar accuracy (87% overall accuracy of class assignment). This 8-gene predictor also assigned both samples from patients with conflicting diagnosis and all three samples from MDS patients with rapid times to disease progression as originating from patients with AML.
These preliminary results imply that AML-like transcriptional profiles of MDS-diagnosed patients can precede standard clinical evidence of AML progression (e.g., blast hyperproliferation). The results from these studies indicate that the expression pattern of select transcripts in peripheral blood can provide early indicators of AML progression in leukemic patients with blast percentages that are commonly associated with a diagnosis of MDS.
Table 8. 24-qualifier classifier for AML, MDS, and normal PBMCs
[Table 8 appears as an image in the original publication (imgf000094_0001).]
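The leave-one-out cross validation reported in paragraph [00111] can be sketched with a two-class weighted voting classifier of the kind referred to earlier in this description. The sketch makes several simplifying assumptions: it handles two classes rather than three, it uses every gene rather than selecting the best genes within each fold, and the names and expression values are invented for illustration.

```python
from statistics import mean, stdev

def train(samples):
    """Compute a weight and decision boundary per gene from (label, vector) samples."""
    class1 = [vec for label, vec in samples if label == 1]
    class2 = [vec for label, vec in samples if label == 2]
    model = []
    for g in range(len(samples[0][1])):
        x1 = [vec[g] for vec in class1]
        x2 = [vec[g] for vec in class2]
        weight = (mean(x1) - mean(x2)) / (stdev(x1) + stdev(x2))
        boundary = (mean(x1) + mean(x2)) / 2
        model.append((weight, boundary))
    return model

def predict(model, vector):
    """Each gene votes weight * (x - boundary); a positive total favors class 1."""
    total = sum(w * (x - b) for (w, b), x in zip(model, vector))
    return 1 if total > 0 else 2

def leave_one_out_accuracy(samples):
    """Hold out each sample in turn, train on the rest, and score the held-out call."""
    correct = 0
    for i, (label, vector) in enumerate(samples):
        model = train(samples[:i] + samples[i + 1:])
        correct += predict(model, vector) == label
    return correct / len(samples)

# Invented two-gene profiles: four samples of class 1 and four of class 2.
SAMPLES = [
    (1, [9.0, 2.0]), (1, [8.5, 2.4]), (1, [9.3, 1.8]), (1, [8.8, 2.1]),
    (2, [3.0, 6.0]), (2, [3.4, 6.3]), (2, [2.7, 5.8]), (2, [3.1, 6.1]),
]

print(leave_one_out_accuracy(SAMPLES))
```

The accuracy figure is simply the number of correctly assigned held-out samples divided by the total, which is how 62 correct assignments out of 66 give the roughly 94% stated above.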
[00112] The foregoing description of the present invention provides illustration and description, but is not intended to be exhaustive or to limit the invention to the precise form disclosed. Modifications and variations are possible in light of the above teachings or may be acquired from practice of the invention. Thus, it is noted that the scope of the invention is defined by the claims and their equivalents.

Claims

We claim:
1. A method for diagnosis, or monitoring the occurrence, development, progression or treatment, of myelodysplastic syndromes (MDS), the method comprising the steps of:
(1) generating a gene expression profile from a peripheral blood sample of a subject; and
(2) comparing the gene expression profile to one or more reference expression profiles, wherein the gene expression profile and the one or more reference expression profiles comprise the expression patterns of one or more MDS disease genes in peripheral blood mononuclear cells (PBMCs), and wherein the difference or similarity between the gene expression profile and the one or more reference expression profiles is indicative of the presence, absence, occurrence, development, progression, or effectiveness of treatment of MDS in the subject.
2. The method of claim 1, wherein the peripheral blood sample comprises whole blood or enriched un-fractionated PBMCs.
3. The method of any one of the preceding claims, wherein the one or more MDS disease genes comprise one or more genes selected from Tables 4 or 6.
4. The method of any one of the preceding claims, wherein the one or more MDS disease genes comprise ten or more genes selected from Tables 4 or 6.
5. The method of any one of the preceding claims, wherein the one or more MDS disease genes comprise one or more MDS disease genes selected from Table 8.
6. The method of any one of the preceding claims, wherein the one or more reference expression profiles comprise a reference expression profile representing a disease-free human.
7. The method of any one of the preceding claims, wherein step (2) comprises comparing the gene expression profile to the one or more reference expression profiles by a k-nearest neighbor analysis or a weighted voting algorithm.
8. The method of any one of the preceding claims, further comprising the step of diagnosing or assessing MDS in the subject based on the comparison of step (2).
9. A method for diagnosis, or monitoring the occurrence, development, progression or treatment, of leukemia in a subject, the method comprising the steps of:
(1) generating a gene expression profile from a peripheral blood sample from the subject; and
(2) comparing the gene expression profile to one or more reference expression profiles, wherein the gene expression profile and the one or more reference expression profiles comprise the expression patterns of one or more leukemia disease genes selected from Table 4 or 6 in peripheral blood mononuclear cells (PBMCs), wherein the difference or similarity between the gene expression profile and the one or more reference expression profiles is indicative of the presence, absence, occurrence, development, progression, or effectiveness of treatment of leukemia in the subject, wherein the one or more leukemia disease genes are not recited in Table 2.
10. The method of claim 9, wherein the peripheral blood sample comprises whole blood or enriched un-fractionated PBMCs.
11. The method of claim 9 or 10, wherein the one or more leukemia disease genes comprise ten or more genes selected from Table 4 or 6.
12. The method of any one of claims 9-11, wherein the one or more leukemia disease genes selected from Table 4 or 6 comprise one or more genes also recited in Table 8.
13. The method of any one of claims 9-12, wherein the one or more reference expression profiles comprise a reference expression profile representing a disease-free human.
14. The method of any one of claims 9-13, wherein step (2) comprises comparing the gene expression profile to the one or more reference expression profiles by a k-nearest neighbor analysis or a weighted voting algorithm.
15. The method of any one of claims 9-14, wherein the leukemia is an acute myelogenous leukemia.
16. The method of any one of claims 9-14, wherein the leukemia is a myelodysplastic syndrome.
17. The method of any one of claims 9-16, further comprising the step of diagnosing or assessing leukemia in the subject based on the comparison of step (2).
18. A method for identifying an MDS patient who is likely to progress to acute myelogenous leukemia (AML), the method comprising the steps of:
(1) generating a gene expression profile from a peripheral blood sample from an MDS patient;
(2) comparing the gene expression profile to one or more reference expression profiles, wherein the gene expression profile and the one or more reference expression profiles comprise the expression patterns of one or more leukemia disease genes selected from Table 6 in peripheral blood mononuclear cells (PBMCs), wherein the difference or similarity between the gene expression profile and the one or more reference expression profiles is indicative that the MDS patient is likely to progress to AML.
19. The method of claim 18, wherein the one or more reference expression profiles comprises a reference expression profile representing an AML patient.
20. The method of claim 18 or 19, wherein the peripheral blood sample comprises whole blood or enriched un-fractionated PBMCs.
21. The method of any one of claims 18-20, wherein the one or more leukemia disease genes selected from Table 6 are also recited in Table 8.
22. An array for use in a method for diagnosing a myelodysplastic syndrome (MDS) comprising a substrate having a plurality of addresses, each address comprising a distinct probe disposed thereon, wherein at least 15% of the plurality of addresses have disposed thereon probes that can specifically detect MDS disease genes in peripheral blood mononuclear cells.
23. The array of claim 22, wherein at least 30% of the plurality of addresses have disposed thereon probes that can specifically detect MDS disease genes in peripheral blood mononuclear cells.
24. The array of claim 22, wherein at least 50% of the plurality of addresses have disposed thereon probes that can specifically detect MDS disease genes in peripheral blood mononuclear cells.
25. The array of any one of claims 22-24, wherein the MDS disease genes are selected from Table 4.
26. The array of any one of claims 22-25, wherein the probe is a nucleic acid probe.
27. The array of any one of claims 22-25, wherein the probe is an antibody probe.
28. An array for use in a method for diagnosis of leukemia comprising a substrate having a plurality of addresses, each address comprising a distinct probe disposed thereon, wherein at least 15% of the plurality of addresses have disposed thereon probes that can specifically detect genes selected from Tables 4 or 6, wherein the genes are not recited in Table 2.
29. The array of claim 28, wherein at least 30% of the plurality of addresses have disposed thereon probes that can specifically detect genes selected from Tables 4 or 6, wherein the genes are not recited in Table 2.
30. The array of claim 28, wherein at least 50% of the plurality of addresses have disposed thereon probes that can specifically detect genes selected from Tables 4 or 6, wherein the genes are not recited in Table 2.
31. The array of any one of claims 28-30, wherein the probe is a nucleic acid probe.
32. The array of any one of claims 28-30, wherein the probe is an antibody probe.
33. A computer-readable medium comprising a digitally-encoded expression profile comprising a plurality of digitally-encoded expression signals, wherein each of the plurality of digitally-encoded expression signals comprises a value representing the expression of a gene selected from Tables 4 or 6, wherein the gene is not recited in Table 2.
34. The computer-readable medium of claim 33, wherein the value represents the expression of the gene in a peripheral blood mononuclear cell of a patient with a myelodysplastic syndrome (MDS).
35. The computer-readable medium of claim 33, wherein the value represents the expression of the gene in a peripheral blood mononuclear cell of a patient with acute myelogenous leukemia (AML).
36. The computer-readable medium of claim 33, wherein the digitally-encoded expression profile comprises at least ten digitally-encoded expression signals.
37. A kit for diagnosis of a myelodysplastic syndrome (MDS), the kit comprising: a) one or more probes that can specifically detect MDS disease genes in peripheral blood mononuclear cells; and b) one or more controls, each representing a reference expression level of an MDS disease gene detectable by the one or more probes.
38. The kit of claim 37, wherein the MDS disease genes are selected from Table 4.
39. The kit of claim 38, wherein the MDS disease genes selected from Table 4 are also recited in Table 8.
40. A kit for diagnosis of leukemia, the kit comprising: a) one or more probes that can specifically detect genes selected from Tables 4 or 6, wherein the genes are not recited in Table 2; and b) one or more controls, each representing a reference expression level of a disease gene detectable by the one or more probes.
41. The kit of claim 40, wherein the genes selected from Tables 4 or 6 are also recited in Table 8.
42. A method of making a decision regarding an individual, the method comprising the step of: assigning the individual to a class based on a value that is a function of the expression, in a peripheral blood sample from the individual, of one or more genes selected from Tables 4 or 6, wherein the genes are not recited in Table 2, thereby making a decision regarding the individual.
43. The method of claim 42, wherein the one or more genes selected from Tables 4 or 6 are also recited in Table 8.
44. The method of claim 42 or 43, wherein the decision is recorded.
45. The method of claim 44, wherein the decision is recorded in a computer-readable medium.
46. The method of any one of claims 42-45, wherein the method further includes selecting a leukemia treatment based on the assignment.
47. The method of any one of claims 42-46, wherein the method further includes administering a leukemia treatment based on the assignment.
48. The method of any one of claims 42-47, wherein the method further includes issuing, transmitting or receiving a prescription for a leukemia treatment based on the assignment.
49. The method of any one of claims 42-48, wherein the method further includes authorizing, paying for, or causing a transfer of funds to pay for a leukemia treatment based on the assignment.
PCT/US2006/019614 2005-05-18 2006-05-18 Leukemia disease genes and uses thereof WO2006125195A2 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
CA002608092A CA2608092A1 (en) 2005-05-18 2006-05-18 Leukemia disease genes and uses thereof
EP06770765A EP1888784A2 (en) 2005-05-18 2006-05-18 Leukemia disease genes and uses thereof
JP2008512570A JP2008545399A (en) 2005-05-18 2006-05-18 Leukemia disease genes and uses thereof
AU2006247027A AU2006247027A1 (en) 2005-05-18 2006-05-18 Leukemia disease genes and uses thereof
MX2007014537A MX2007014537A (en) 2005-05-18 2006-05-18 Leukemia disease genes and uses thereof.

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US68198405P 2005-05-18 2005-05-18
US60/681,984 2005-05-18

Publications (2)

Publication Number Publication Date
WO2006125195A2 (en) 2006-11-23
WO2006125195A3 (en) 2007-05-31

Family

ID=37266886

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2006/019614 WO2006125195A2 (en) 2005-05-18 2006-05-18 Leukemia disease genes and uses thereof

Country Status (7)

Country Link
EP (1) EP1888784A2 (en)
JP (1) JP2008545399A (en)
CN (1) CN101180407A (en)
AU (1) AU2006247027A1 (en)
CA (1) CA2608092A1 (en)
MX (1) MX2007014537A (en)
WO (1) WO2006125195A2 (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7754431B2 (en) 2007-11-30 2010-07-13 Applied Genomics, Inc. TLE3 as a marker for chemotherapy
WO2012085188A1 (en) 2010-12-22 2012-06-28 Universite Francois Rabelais De Tours Method for diagnosing hematological disorders
WO2012156515A1 (en) * 2011-05-18 2012-11-22 Rheinische Friedrich-Wilhelms-Universität Bonn Molecular analysis of acute myeloid leukemia
US8420333B2 (en) * 2009-07-14 2013-04-16 Temple University Of The Commonwealth System Of Higher Education G-protein coupled receptor-associated sorting protein 1 as a cancer biomarker
US8980269B2 (en) 2009-07-14 2015-03-17 Temple University Of The Commonwealth System Of Higher Education G-protein coupled receptor-associated sorting protein 1 as a cancer biomarker
EP2883954A1 (en) * 2012-08-08 2015-06-17 Daiichi Sankyo Company, Limited Peptide library and use thereof
US9140704B2 (en) 2007-07-26 2015-09-22 Temple University Of The Commonwealth System Of Higher Education Serum markers associated with early and other stages of breast cancer
US20170199193A1 (en) * 2016-01-08 2017-07-13 Celgene Corporation Methods for treating cancer and the use of biomarkers as a predictor of clinical sensitivity to therapies
CN107982519A (en) * 2017-12-15 2018-05-04 中国人民解放军陆军军医大学第附属医院 Application of the P311 albumen in prevention and/or treatment surface of a wound angiogenesis obstacle or the medicine of deficiency is prepared
EP3532964A4 (en) * 2016-10-27 2020-06-10 Nantomics, LLC Mds to aml transition and prediction methods therefor
US11142570B2 (en) 2017-02-17 2021-10-12 Bristol-Myers Squibb Company Antibodies to alpha-synuclein and uses thereof

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100617467B1 (en) * 2005-09-27 2006-09-01 디지탈 지노믹스(주) Markers for predicting the response of a patient with acute myeloid leukemia to anti-cancer drugs
JP6476861B2 (en) * 2012-10-24 2019-03-06 日本電気株式会社 Electromagnetic field feature classification presentation device
CN107151672B (en) * 2017-05-11 2021-05-28 成都医学院 Recombinant plasmid and application thereof
CN108588068B (en) * 2018-05-11 2021-10-01 金晖 Acute erythroleukemia KEL gene and circular RNA molecular marker transcribed by same
CN113767179A (en) * 2018-12-13 2021-12-07 国立研究开发法人国立循环器病研究中心 Method for predicting risk of developing cerebral infarction
CN114712381B (en) * 2022-03-30 2024-04-26 浙江大学 Application of AK2 gene in preparation of leukemia induced differentiation therapeutic drug

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004097051A2 (en) * 2003-04-29 2004-11-11 Wyeth Methods for diagnosing AML and MDS differential gene expression
WO2006048262A2 (en) * 2004-11-04 2006-05-11 Roche Diagnostics Gmbh Classification of acute myeloid leukemia
WO2006089233A2 (en) * 2005-02-16 2006-08-24 Wyeth Methods and systems for diagnosis, prognosis and selection of treatment of leukemia

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
NZ525421A (en) * 1998-12-23 2004-10-29 Univ Sydney Method of treating cancer by determining the presence of the disease condition using an assay to determine the binding pattern of immobilised immunoglobulins

Non-Patent Citations (9)

* Cited by examiner, † Cited by third party
Title
GJERTSEN BT ET AL: "Analysis of acute myelogenous leukemia: preparation of samples for genomic and proteomic analyses." JOURNAL OF HEMATOTHERAPY & STEM CELL RESEARCH. JUN 2002, vol. 11, no. 3, June 2002 (2002-06), pages 469-481, XP009074800 ISSN: 1525-8165 *
GOLUB T R ET AL: "Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring" SCIENCE, AMERICAN ASSOCIATION FOR THE ADVANCEMENT OF SCIENCE, US, vol. 286, no. 5439, 15 October 1999 (1999-10-15), pages 531-537, XP002207658 ISSN: 0036-8075 *
LEE Y T ET AL: "Transcription patterning of uncoupled proliferation and differentiation in myelodysplastic bone marrow with erythroid-focused arrays" BLOOD, W.B.SAUNDERS COMPANY, ORLANDO, FL, US, vol. 98, no. 6, 15 September 2001 (2001-09-15), pages 1914-1921, XP002247841 ISSN: 0006-4971 *
MIYAZATO A ET AL: "IDENTIFICATION OF MYELODYSPLASTIC SYNDROME-SPECIFIC GENES BY DNA MICROARRAY ANALYSIS WITH PURIFIED HEMATOPOIETIC STEM CELL FRACTION" BLOOD, W.B.SAUNDERS COMPANY, ORLANDO, FL, US, vol. 98, no. 2, 15 July 2001 (2001-07-15), pages 422-427, XP002952629 ISSN: 0006-4971 *
OYAN ANNE MARGRETE ET AL: "CD34 expression in native human acute myelogenous leukemia blasts: Differences in CD34 membrane molecule expression are associated with different gene expression profiles" CYTOMETRY, vol. 64B, no. 1, March 2005 (2005-03), pages 18-27, XP002407419 ISSN: 0196-4763 *
See also references of EP1888784A2 *
TSUTSUMI C ET AL: "DNA microarray analysis of dysplastic morphology associated with acute myeloid leukemia" EXPERIMENTAL HEMATOLOGY, NEW YORK, NY, US, vol. 32, no. 9, September 2004 (2004-09), pages 828-835, XP004550087 ISSN: 0301-472X *
WALLOCH J ET AL: "CARBONIC ANHYDRASE A MARKER FOR THE ERYTHROID PHENOTYPE IN ACUTE NONLYMPHOCYTIC LEUKEMIA" BLOOD, vol. 68, no. 1, 1 July 1986 (1986-07-01), pages 304-306, XP002408919 ISSN: 0006-4971 *
YEOH E-J ET AL: "Classification, subtype discovery, and prediction of outcome in pediatric acute lymphoblastic leukemia by gene expression profiling" CANCER CELL, vol. 1, no. 2, March 2002 (2002-03), pages 133-143, XP002253604 ISSN: 1535-6108 *

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9140704B2 (en) 2007-07-26 2015-09-22 Temple University Of The Commonwealth System Of Higher Education Serum markers associated with early and other stages of breast cancer
US7816084B2 (en) 2007-11-30 2010-10-19 Applied Genomics, Inc. TLE3 as a marker for chemotherapy
US7754431B2 (en) 2007-11-30 2010-07-13 Applied Genomics, Inc. TLE3 as a marker for chemotherapy
US9005900B2 (en) 2007-11-30 2015-04-14 Clarient Diagnostic Services, Inc. TLE3 as a marker for chemotherapy
US8785156B2 (en) 2007-11-30 2014-07-22 Clarient Diagnostic Services, Inc. TLE3 as a marker for chemotherapy
US8980269B2 (en) 2009-07-14 2015-03-17 Temple University Of The Commonwealth System Of Higher Education G-protein coupled receptor-associated sorting protein 1 as a cancer biomarker
US8420333B2 (en) * 2009-07-14 2013-04-16 Temple University Of The Commonwealth System Of Higher Education G-protein coupled receptor-associated sorting protein 1 as a cancer biomarker
US9493836B2 (en) 2010-12-22 2016-11-15 Universite Francois Rabelais De Tours Method for diagnosing hematological disorders
WO2012085188A1 (en) 2010-12-22 2012-06-28 Universite Francois Rabelais De Tours Method for diagnosing hematological disorders
US20130288924A1 (en) * 2010-12-22 2013-10-31 Universite Francois Rabelais De Tours Method for diagnosing hematological disorders
WO2012156515A1 (en) * 2011-05-18 2012-11-22 Rheinische Friedrich-Wilhelms-Universität Bonn Molecular analysis of acute myeloid leukemia
US10550154B2 (en) 2012-08-08 2020-02-04 Daiichi Sankyo Company, Limited Peptide library and use thereof
EP2883954A4 (en) * 2012-08-08 2016-03-02 Daiichi Sankyo Co Ltd Peptide library and use thereof
AU2013300549B2 (en) * 2012-08-08 2019-04-11 Daiichi Sankyo Company,Limited Peptide library and use thereof
EP2883954A1 (en) * 2012-08-08 2015-06-17 Daiichi Sankyo Company, Limited Peptide library and use thereof
EP3748001A1 (en) * 2012-08-08 2020-12-09 Daiichi Sankyo Company, Limited Peptide library and use thereof
US11319345B2 (en) 2012-08-08 2022-05-03 Daiichi Sankyo Company, Limited Peptide library and use thereof
US20170199193A1 (en) * 2016-01-08 2017-07-13 Celgene Corporation Methods for treating cancer and the use of biomarkers as a predictor of clinical sensitivity to therapies
US10648983B2 (en) * 2016-01-08 2020-05-12 Celgene Corporation Methods for treating cancer and the use of biomarkers as a predictor of clinical sensitivity to therapies
US11460471B2 (en) 2016-01-08 2022-10-04 Celgene Corporation Methods for treating cancer and the use of biomarkers as a predictor of clinical sensitivity to therapies
EP3532964A4 (en) * 2016-10-27 2020-06-10 Nantomics, LLC MDS to AML transition and prediction methods therefor
US11142570B2 (en) 2017-02-17 2021-10-12 Bristol-Myers Squibb Company Antibodies to alpha-synuclein and uses thereof
US11827695B2 (en) 2017-02-17 2023-11-28 Bristol-Myers Squibb Company Antibodies to alpha-synuclein and uses thereof
CN107982519A (en) * 2017-12-15 2018-05-04 中国人民解放军陆军军医大学第附属医院 Application of P311 protein in the preparation of a medicine for preventing and/or treating wound angiogenesis disorder or insufficiency

Also Published As

Publication number Publication date
CA2608092A1 (en) 2006-11-23
JP2008545399A (en) 2008-12-18
AU2006247027A1 (en) 2006-11-23
CN101180407A (en) 2008-05-14
MX2007014537A (en) 2008-02-12
EP1888784A2 (en) 2008-02-20
WO2006125195A3 (en) 2007-05-31

Similar Documents

Publication Publication Date Title
EP1888784A2 (en) Leukemia disease genes and uses thereof
EP2864500B1 (en) Molecular malignancy in melanocytic lesions
WO2004097051A2 (en) Methods for diagnosing AML and MDS differential gene expression
US20080032299A1 (en) Methods for prognosis and treatment of solid tumors
US11591655B2 (en) Diagnostic transcriptomic biomarkers in inflammatory cardiomyopathies
US20060134671A1 (en) Methods and systems for prognosis and treatment of solid tumors
WO2017215230A1 (en) Use of a group of gastric cancer genes
US20070015148A1 (en) Gene expression profiles in breast tissue
SG188397A1 (en) Molecular diagnostic test for cancer
WO2008137586A1 (en) Transcriptomic biomarkers for individual risk assessment in new onset heart failure
US20060240441A1 (en) Gene expression profiles and methods of use
WO2014165753A1 (en) Methods and compositions for diagnosis of glioblastoma or a subtype thereof
JP2008529554A (en) Pharmacogenomic markers for prognosis of solid tumors
EP2152916B1 (en) A transcriptomic biomarker of myocarditis
CA3085464A1 (en) Compositions and methods for diagnosing lung cancers using gene expression profiles
WO2007137366A1 (en) Diagnostic and prognostic indicators of cancer
US20130217656A1 (en) Methods and compositions for diagnosing and treating lupus
EP1308522A1 (en) Novel genetic markers for leukemias
EP2607494A1 (en) Biomarkers for lung cancer risk assessment
WO2003016476A2 (en) Gene expression profiles in glomerular diseases
AU2014259525B2 (en) A transcriptomic biomarker of myocarditis

Legal Events

Date Code Title Description
WWE WIPO information: entry into national phase. Ref document number: 200680017012.9; Country of ref document: CN
WWE WIPO information: entry into national phase. Ref document number: 2006247027; Country of ref document: AU
WWE WIPO information: entry into national phase. Ref document number: 8083/DELNP/2007; Country of ref document: IN
ENP Entry into the national phase in: Ref document number: 2006247027; Country of ref document: AU; Date of ref document: 20060518; Kind code of ref document: A
ENP Entry into the national phase in: Ref document number: 2608092; Country of ref document: CA
ENP Entry into the national phase in: Ref document number: 2008512570; Country of ref document: JP; Kind code of ref document: A
WWE WIPO information: entry into national phase. Ref document number: MX/a/2007/014537; Country of ref document: MX
NENP Non-entry into the national phase in: Ref country code: DE
WWE WIPO information: entry into national phase. Ref document number: 2006770765; Country of ref document: EP
NENP Non-entry into the national phase in: Ref country code: RU