WO2006105252A2 - Diagnosis of chronic pulmonary obstructive disease and monitoring of therapy using gene expression analysis of peripheral blood cells - Google Patents

Info

Publication number
WO2006105252A2
Authority
WO
WIPO (PCT)
Prior art keywords
expression
gene
genes
copd
seq
Application number
PCT/US2006/011570
Other languages
French (fr)
Other versions
WO2006105252A3 (en)
Inventor
Mark W. Geraci
Christopher D. Coldren
Michael P. Gruber
Andrew K. Sullivan
Norbert F. Voelkel
Original Assignee
The Regents Of The University Of Colorado
Application filed by The Regents Of The University Of Colorado
Publication of WO2006105252A2
Publication of WO2006105252A3

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/5005 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving human or animal cells
    • G01N33/5091 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving human or animal cells for testing the pathological state of an organism
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6893 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids related to diseases not provided for elsewhere
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/136 Screening for pharmacological compounds
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2800/00 Detection or diagnosis of diseases
    • G01N2800/12 Pulmonary diseases
    • G01N2800/122 Chronic or obstructive airway disorders, e.g. asthma COPD

Definitions

  • COPD chronic obstructive pulmonary disease
  • COPD chronic lung parenchyma destruction.
  • Studies examining airway histology, bronchoalveolar lavage, and sputum of smokers with COPD have demonstrated increases in macrophages, T-lymphocytes, and neutrophils relative to smokers without COPD, nonsmokers, or asthmatics [22, 24-29].
  • Clinically, the quantitative presence of airway neutrophils [30], macrophages [30], and T-lymphocytes [20] has been correlated with disease severity, suggesting a role for these immunoregulatory cells in the progression of disease.
  • CD8+ T-lymphocytes in the airways of COPD patients is a consistent finding [28, 31-33].
  • the role played by CD8+ T-lymphocytes in underlying disease pathogenesis remains speculative.
  • Current hypotheses include: (1) enhanced apoptosis of alveolar epithelial cells leading to parenchymal destruction [31], (2) persistent recruitment to the lung parenchyma as a result of recurrent or chronic viral infection causing TNF-α-mediated alveolar epithelial cell destruction [34], or (3) an autoimmune phenomenon [35].
  • Table 1 Summary of previously performed studies of peripheral blood mononuclear cells in COPD demonstrating role for circulating immunoregulatory cells in the disease process
  • Miller et al. [44] analyzed peripheral blood lymphocyte populations in 60 smokers and 35 nonsmokers. They found that although there was no difference in the total number of T-lymphocytes or in CD4+/CD8+ ratios between mild smokers and normal individuals, heavy smokers showed an increase in CD8+ lymphocytes and a decreased CD4+/CD8+ ratio. Interestingly, these changes were reversible with smoking cessation. Ekberg-Jansson et al. found that peripheral blood T-cell activating markers were higher in 60-year-old male smokers than in age-matched nonsmokers [46]. These studies support the hypothesis that tobacco exposure causes alterations in circulating T-lymphocytes and that these changes may be reversible with smoking cessation [44].
  • T-lymphocytes and monocytes circulating immunoregulatory cells
  • the present invention provides a method to diagnose chronic obstructive pulmonary disease (COPD) or a predisposition to develop COPD, comprising, detecting the level of expression of at least one gene in a sample of peripheral blood cells isolated from a patient to be tested, wherein the gene is chosen from the genes represented by SEQ ID NO: 1-323, and wherein the level of expression of each of the genes in any one or more of Tables 2-5 is associated with COPD as measured by either upregulation or downregulation of gene expression in peripheral blood cells from patients with COPD as compared to the level of expression of the genes in peripheral blood cells from normal controls; and comparing the level of expression of the gene from the patient sample to the level of expression of the gene in normal control peripheral blood cells, wherein detection of regulation of the expression of the gene in the patient sample in the direction associated with COPD indicated in Table 2, 3, 4 and/or 5 indicates a diagnosis of COPD in the patient.
  • the detecting comprises detecting expression of at least 5 genes chosen from the genes represented by SEQ ID NO: 1-323. In some embodiments, the detecting comprises detecting expression of at least 10 genes chosen from the genes represented by SEQ ID NO: 1-323. In some embodiments, the detecting comprises detecting expression of at least 15 genes chosen from the genes represented by SEQ ID NO: 1-323. In some embodiments, the detecting comprises detecting expression of at least 20 genes chosen from the genes represented by SEQ ID NO: 1-323. In some embodiments, the detecting comprises detecting expression of at least 25 genes chosen from the genes represented by SEQ ID NO: 1-323.
  • the detecting comprises detecting expression of at least 50 genes chosen from the genes represented by SEQ ID NO: 1-323. In some embodiments, the detecting comprises detecting expression of at least 75 genes chosen from the genes represented by SEQ ID NO: 1-323. In some embodiments, the detecting comprises detecting expression of at least 100 genes chosen from the genes represented by SEQ ID NO: 1-323. In some embodiments, the detecting comprises detecting expression of at least 125 genes chosen from the genes represented by SEQ ID NO: 1-323. In some embodiments, the detecting comprises detecting expression of at least 150 genes chosen from the genes represented by SEQ ID NO: 1-323.
  • the detecting comprises detecting expression of at least 175 genes chosen from the genes represented by SEQ ID NO: 1-323. In some embodiments, the detecting comprises detecting expression of at least 200 genes chosen from the genes represented by SEQ ID NO: 1-323. In some embodiments, the detecting comprises detecting expression of at least 225 genes chosen from the genes represented by SEQ ID NO: 1-323. In some embodiments, the detecting comprises detecting expression of all of the genes represented by SEQ ID NO: 1-323.
  • expression of the gene is detected by measuring amounts of transcripts of the gene in the patient peripheral blood cells. In other embodiments, expression of the gene is detected by detecting hybridization of at least a portion of the gene or a transcript thereof to a nucleic acid molecule comprising a portion of the gene or a transcript thereof in a nucleic acid array. In some embodiments, expression of the gene is detected by detecting the production of a protein encoded by the gene. In further embodiments, the level of expression of the gene in the peripheral blood cells of a normal control has been predetermined.
  • the present invention also provides a method to monitor the treatment of a patient with COPD, comprising detecting the level of expression of at least one gene in a sample of peripheral blood cells isolated from a patient undergoing treatment for COPD, wherein the gene is chosen from the genes represented by any one of SEQ ID NO: 1-324, and wherein the level of expression of each of the genes represented by any one of SEQ ID NO: 1-324 is associated with COPD as measured by either upregulation or downregulation of gene expression in peripheral blood cells from patients with COPD as compared to the level of expression of the genes in peripheral blood cells from normal controls; and comparing the level of expression of the gene from the patient sample to the level of expression of the gene in a prior sample of peripheral blood cells from the patient, wherein detection of a change in the level of expression of the gene, as compared to the level of expression in the prior sample, toward the level of the expression of the gene in a normal control sample, indicates that the treatment for COPD is producing a beneficial result.
  • detection of a change in the level of expression of the gene, as compared to the level of expression in the prior sample, away from the level of the expression of the gene in a normal control sample indicates a progression of the COPD.
  • the detection of no significant change in the level of expression of the gene, as compared to the level of expression in the prior sample indicates no significant change in the progression or treatment of the COPD in the patient.
  • the invention also provides a plurality of polynucleotides for the detection of the expression of genes that are indicative of COPD in a patient or a preclinical disposition therefore, wherein the plurality of polynucleotides consists of at least two polynucleotides, wherein each polynucleotide is at least 5 nucleotides in length, and wherein each polynucleotide is complementary to an RNA transcript, or nucleotide derived therefrom, of a gene that is regulated differently in individuals with COPD as compared to individuals that do not have COPD.
  • each polynucleotide is complementary to an RNA transcript, or a polynucleotide derived therefrom, of a gene represented by any one of SEQ ID NO: 1-324.
  • the plurality of polynucleotides comprises polynucleotides that are complementary to an RNA transcript, or a nucleotide derived therefrom, of at least two genes represented by any one of SEQ ID NO: 1-324.
  • the plurality of polynucleotides comprises polynucleotides that are complementary to an RNA transcript, or a nucleotide derived therefrom, of at least five genes represented by any one of SEQ ID NO: 1-324.
  • the plurality of polynucleotides comprises polynucleotides that are complementary to an RNA transcript, or a nucleotide derived therefrom, of at least 10 genes represented by any one of SEQ ID NO: 1-324.
  • the plurality of polynucleotides comprises polynucleotides that are complementary to an RNA transcript, or a nucleotide derived therefrom, of at least 50 genes represented by any one of SEQ ID NO: 1-324.
  • the plurality of polynucleotides comprises polynucleotides that are complementary to an RNA transcript, or a nucleotide derived therefrom, of at least 100 genes represented by any one of SEQ ID NO: 1-229, and 320-324. In some embodiments, the plurality of polynucleotides comprises polynucleotides that are complementary to an RNA transcript, or a nucleotide derived therefrom, of at least 150 genes represented by any one of SEQ ID NO: 1-229, and 320-324.
  • the plurality of polynucleotides comprises polynucleotides that are complementary to an RNA transcript, or a nucleotide derived therefrom, of at least 200 genes represented by any one of SEQ ID NO: 1-229, and 320-324.
  • the polynucleotides are immobilized on a substrate.
  • the polynucleotides are hybridizable array elements in a microarray.
  • the polynucleotides are conjugated to detectable markers.
  • the invention further provides a method to diagnose chronic obstructive pulmonary disease (COPD) in a patient, comprising, detecting the level of expression of at least one gene in a sample of peripheral blood cells isolated from a patient to be tested, wherein the gene is chosen from a group of genes, each of which has been previously identified to be upregulated or downregulated in the peripheral blood cells of patients who have been diagnosed with COPD, as compared to the level of expression of the gene in normal control peripheral blood cells; and comparing the level of expression of the gene from the patient sample to the level of expression of the gene in normal control peripheral blood cells, wherein detection of regulation of the expression of the gene in the patient sample in the direction associated with COPD as indicated by the previous identification, indicates a diagnosis of COPD in the patient.
  • the invention further provides a method to identify a compound with the potential to treat or prevent chronic obstructive pulmonary disease (COPD), comprising contacting a test compound with a cell that expresses a gene selected from any one or more of the genes represented by any one of SEQ ID NO: 1-324; identifying compounds that increase the expression or activity of genes represented by any one of SEQ ID NO: 1-324 or the proteins encoded thereby that are downregulated in peripheral blood cells of patients with COPD as compared to peripheral blood cells of normal controls, or that decrease the expression or activity of genes represented by any one of SEQ ID NO: 1-324 or the proteins encoded thereby that are upregulated in peripheral blood cells of patients with COPD as compared to peripheral blood cells of normal controls.
  • the invention provides a method to treat a patient with COPD, comprising administering to the patient a therapeutic composition comprising a compound identified by the method above.
  • detection of a change in the level of expression of at least one gene in methods of the invention comprises detecting the presence of a protein. In other embodiments, the method further comprises detecting the presence of the protein using a reagent that specifically binds to the protein.
  • the reagent is selected from the group consisting of an antibody, an antibody derivative, and an antibody fragment.
  • the invention also provides a plurality of reagents for the detection of the expression of genes that are indicative of COPD in a patient or a preclinical disposition therefore; wherein the plurality of reagents consists of at least two reagents, each of which specifically binds to a protein, wherein each protein is at least 15 amino acids in length, and wherein each protein is encoded by a gene that is regulated differently in individuals with COPD as compared to individuals that do not have COPD.
  • each protein is encoded by a gene represented by any one of SEQ ID NO: 1-324.
  • at least two proteins are encoded by a gene represented by any one of SEQ ID NO: 1-324.
  • at least five proteins are encoded by a gene represented by any one of SEQ ID NO: 1-324.
  • at least 10 proteins are encoded by a gene represented by any one of SEQ ID NO: 1-324.
  • at least 50 proteins are encoded by a gene represented by any one of SEQ ID NO: 1-324.
  • at least 100 proteins are encoded by a gene represented by any one of SEQ ID NO: 1-229, and 320-324.
  • At least 150 proteins are encoded by a gene represented by any one of SEQ ID NO: 1-229, and 320-324. In some embodiments, at least 200 proteins are encoded by a gene represented by any one of SEQ ID NO: 1-229, and 320-324.
  • the reagents are immobilized on a substrate. In some embodiments, the reagents are selected from the group consisting of an antibody, an antibody derivative, and an antibody fragment, and each of said reagents is an element in a microarray. In some embodiments, reagents are conjugated to detectable markers.
  • Fig. 1 shows the relationship of tobacco exposure to chronic obstructive pulmonary disease (COPD).
  • Fig. 2 shows examples of normal and accelerated loss of lung function over time.
  • S1 and S2 denote two individuals with tobacco exposure. S1 has accelerated loss of lung function with onset of symptoms at a young age, resulting in early diagnosis of COPD. S2 also has greater than expected loss of lung function but remains asymptomatic over a lifetime, without clinical diagnosis of COPD.
  • NS is normal nonsmoker with expected rate of decline in lung function over time.
  • Figs. 3A and 3B are dendrograms of COPD and Normal samples, clustered using centered correlation and average linkage, and the tile plot of differentially expressed genes.
  • Fig. 3A shows unsupervised clustering based on 15022 genes.
  • Fig. 3B shows supervised clustering based on 240 genes.
  • Fig. 3C shows a tile plot of the 240 genes which preliminarily discriminate between COPD and normal PBMC samples. Darker areas represent high expression and lighter areas represent relatively low expression.
  • Fig. 4 is a schematic drawing showing the study methods used in Example 2.
  • Fig. 5 is a schematic drawing showing the validation protocol for differentially expressed genes.
  • the present invention generally relates to the identification of a large number of genes that are regulated differentially in individuals with chronic pulmonary obstructive disease (COPD) as compared to individuals that do not have this disease, and particularly, to the identification of how these genes are regulated during disease.
  • this invention generally relates to diagnostic and prognostic assays and kits for COPD, as well as the identification of targets for therapeutic prevention and intervention strategies.
  • the terms "chronic pulmonary obstructive disease", its acronym “COPD”, and “emphysema” can be used interchangeably to describe the same condition.
  • PBMC peripheral blood mononuclear cells
  • gene expression microarrays to study tobacco exposed individuals both with and without COPD enables the comparison of thousands of expression transcripts between groups and has resulted in the discovery of novel genes of interest, new diagnostic tools, disease subclassifications, and new candidate therapeutic targets.
  • the present inventors' analysis methods have the advantage of generating new hypotheses and investigative pathways based on the study of fewer individuals through minimally invasive methods.
  • the ability to conduct research in human subjects and not just animal models is further likely to enhance the discovery and understanding of this common and underappreciated disease within the target population.
  • the large amounts of data that can be generated from a microarray gene expression study enable the inventors to capture many simultaneous processes and convert these findings into meaningful, quantitative, and reproducible data.
  • the present inventors have identified multiple genes, the expression of which is regulated differentially in peripheral blood cells (PBC; also referred to herein as peripheral blood mononuclear cells, or PBMC) of patients with COPD as compared to subjects without COPD.
  • PBC peripheral blood cells
  • Table 2 shows the geometric means of intensities for the genes in both COPD patients (COPD column) and in normal controls (normal column) and provides a fold difference of the mean of intensities (fold change column). Using this information, one can clearly see whether a given gene is upregulated or downregulated in the peripheral blood cells of patients with COPD as compared to the normal control.
  • This table shows the differentially expressed transcripts significant at p < 0.01, sorted by fold change in the geometric means of intensities. Therefore, the first transcript in this table has the highest fold difference between COPD and normal control, and the last transcript in this table has the lowest fold difference, meaning that there is much greater expression in normal controls versus COPD.
  • the genes are identified by name, by probe set identifier and by GenBank Accession numbers.
  • The SEQ ID NOs in Table 2 refer to the nucleotide sequence for the coding region of the gene, or, if the entire coding region is not available, whatever fragment of the coding sequence or genomic sequence is available.
  • Table 3 also reflects a listing of genes identified by the inventors as being differentially expressed in patients with COPD versus normal controls and in this table, results have been grouped into the following main categories: (1) genes that are selectively (i.e., exclusively or uniquely) upregulated in PBMCs of patients with COPD as compared to normal controls (Table 3); and (2) genes that are selectively downregulated in PBMCs of patients with COPD as compared to normal controls (Table 3). Again, the genes are identified by name, by probe set identifier and by GenBank Accession numbers.
  • Tables 4 and 5 also reflect a listing of genes identified by the inventors as being differentially expressed in patients with COPD versus normal controls. These tables show the differentially expressed transcripts significant at p < 0.005, sorted by fold change in the geometric means of intensities.
  • PSEM refers to "past smoker emphysema” and is equivalent to the category of "COPD” as set forth in Table 2.
  • NSNL refers to "non-smoker normal” and is equivalent to the category of "Normal” as set forth in Table 2.
  • the genes are identified by name, by probe set identifier and by GenBank Accession numbers. In Tables 4 and 5, 22277 genes were filtered, and 13909 of these passed filtering criteria.
  • a two-sample T-test (with randomized variance model) was used.
  • the Multivariate Permutations test was computed based on 1000 permutations.
  • the nominal significance level of each univariate test was 0.005.
  • the confidence level of false discovery rate assessment was 50%.
  • the maximum allowed number of false-positive genes was 10, and the maximum allowed proportion of false-positive genes was 0.1.
  • For Table 4, the number of genes significant at the 0.005 level of the univariate test was 66, and the probability of getting at least 66 genes significant by chance (at the 0.005 level) if there are no real differences between the classes was 0.283.
  • the number of genes significant at the 0.005 level of the univariate test was 77, and the probability of getting at least 77 genes significant by chance (at the 0.005 level) if there are no real differences between the classes was 0.285.
  • the predicted number of false discoveries among the first 10 genes is 10
  • the predicted proportion of false discoveries among the first 0 genes is 10%
  • the predicted number of false discoveries among the first 21 genes is 10
  • the predicted proportion of false discoveries among the first 6 genes is 10%.
  • mRNA Homo sapiens mRNA; cDNA DKFZp434G012 (from clone DKFZp434G012), mRNA
  • GLI-Kruppel family member GLI2
  • AT2 receptor-interacting protein 1
  • genes appear more than once in the tables provided herein, and in some cases, a gene may appear by name in both the "upregulated" and the "downregulated" category. This is because well-annotated genes often have multiple probe sets that one can use to identify the gene, and also because various isotypes of certain genes may be included, where there is some variation in the isotype sequence that is reflected by the various probe sets on the microarray chip (i.e., the probe sets are capable of differentiating between different isotypes of the same gene). As such, one isotype may be upregulated as compared to normal controls, where a second isotype may be downregulated as compared to normal controls.
  • the genes identified as being regulated (upregulated or downregulated) in PBMCs of patients with COPD can be used as endpoints or markers (also called “biomarkers") in a diagnostic or prognostic assay for COPD.
  • the biomarkers include any of the genes listed in any of the tables presented herein (e.g., Tables 2-5).
  • Diagnostic assays include assays that determine whether a patient has overt COPD or preclinical stage COPD.
  • Prognostic assays can be used to stage a patient's development of COPD, predict a patient's outcome or disease progression, and/or monitor the effectiveness of various treatment protocols on COPD.
  • biomarker can refer to an endpoint gene described herein or to the protein encoded by that gene.
  • biomarker can be generally used to refer to any portion of such a gene or protein that can identify or correlate with the full-length gene or protein, for example, in an assay of the invention.
  • an "endpoint gene” or “biomarker gene” is any gene, the expression of which is regulated (up or down) in a patient with a condition as compared to a normal control.
  • Selected sets of one, two, three, and more preferably several more of the genes of this invention can be used as end-points for rapid diagnostics or prognostics for COPD.
  • larger numbers of the genes identified in any one or more of Tables 2-5 are used in an assay of the invention (e.g., at least 10 genes or more), since the accuracy of the assay improves as the number of genes screened increases.
  • the method includes the step of detecting the expression of at least one, and preferably more than one (e.g., 2, 3, 4, 5, 6,...and so on, in increments of whole numbers up to all of the genes) of the genes that have now been shown to be selectively regulated in PBMCs of patients with COPD by the present inventors.
  • expression when used in connection with detecting the expression of a gene of the present invention, can refer to detecting transcription of the gene and/or to detecting translation of the gene. To detect expression of a gene refers to the act of actively determining whether a gene is expressed or not.
  • the step of detecting expression does not require that expression of the gene actually is upregulated or downregulated, but rather, can also include detecting no expression of the gene or detecting that the expression of the gene has not changed or is not different (i.e., detecting no significant expression of the gene or no significant change in expression of the gene as compared to a control).
  • the present method includes the step of detecting the expression of at least one gene that is selectively regulated in PBMCs of a patient with COPD.
  • the step of detecting includes detecting the expression of at least 2 genes, and preferably at least 3 genes, and more preferably at least 4 genes, and more preferably at least 5 genes, and more preferably at least 6 genes, and more preferably at least 7 genes, and more preferably at least 8 genes, and more preferably at least 9 genes, and more preferably at least 10 genes, and more preferably at least 11 genes, and more preferably at least 12 genes, and more preferably at least 13 genes, and more preferably at least 14 genes, and more preferably at least 15 genes, and more preferably at least 20 genes, and more preferably at least 25 genes, and more preferably at least 50 genes, and more preferably at least 75 genes, and more preferably at least 100 genes, and so on, in whole integer increments (i.e., 1, 2, 3,...10, 11, 12,...35, 36, 37,...56, 57,
  • Analysis of a number of genes greater than one can be accomplished simultaneously, sequentially, or cumulatively. As discussed above, it is preferred that several (e.g., at least 10) and up to most or all of the genes be detected in the present methods, as the accuracy of the method improves as the number of genes detected increases. However, it is to be understood that in some circumstances, it may be desirable and sufficient to detect the expression of only one or a few genes.
  • the gene(s) to be detected are preferably selected from the genes described in any one or more of Tables 2-5 (i.e., Table 2, Table 3, Table 4 or Table 5, or any combination thereof). These tables have been discussed above in detail and disclose genes that the present inventors have discovered to be selectively regulated in the PBMCs of patients with COPD. More specifically, these tables disclose the manner in which the genes are regulated (e.g., upregulated or downregulated) in a patient with COPD as compared to a normal control.
  • genes to be detected in any given method can include any one or more of the genes in any one or more of Tables 2-5, and can include the detection of any combination of two or more of the genes in any one or more of Tables 2-5, and preferably includes the detection of any combination of multiple genes (e.g., at least 3, 4, 5, 6,...up to all of the genes) in any one or more of Tables 2-5. It is not mandatory that a given assay be restricted to the detection of all of the various genes in a single table, or to at least one gene in each table.
  • the present method is not limited exclusively to detection of the genes identified herein, although the invention is primarily directed to the detection of one or more of these genes and includes the detection of at least one or more of these genes. In addition, provided with this disclosure, one of skill in the art may proceed to identify additional genes that are differentially regulated in the PBMCs of patients with COPD, and detection of any of such genes may be used in the methods of the present invention, including in combination with detection of any of the genes disclosed herein. Indeed, the present inventors have now provided a powerful method to detect and evaluate biomarkers for COPD and have also provided data demonstrating the application of such technology.
  • one of skill in the art will be able to select one or more genes (at least one gene, and preferably, two, three, four, or any number of additional genes) to detect in a method of the present invention, and the selection of the one or more genes can be determined based on the preferences of the person using the assays described herein. In one aspect, it may be desirable to preferentially select those genes for detection that are particularly highly regulated in patients with COPD in that they display the largest increases or decreases in expression levels in patients as compared to normal controls or as compared to the other form of COPD. The detection of such genes can be advantageous because the endpoint may be clearer and require less quantitation.
  • the relative expression levels of the genes identified in the present invention are listed in the tables.
  • a “baseline” or “control” can include a normal or negative control and/or a disease or positive control, against which a test level of gene expression can be compared. Therefore, it can be determined, based on the control or baseline level of gene expression, whether a sample to be evaluated for COPD has a measurable difference or substantially no difference in gene expression, as compared to the baseline level.
  • the baseline control is indicative of the level of gene expression as expected in the PBMCs of a normal individual (e.g., healthy individual, negative control, or non-COPD patient).
  • the term "negative control" used in reference to a baseline level of gene expression typically refers to a baseline level of expression from a population of individuals which is believed to be normal (i.e., not having or developing COPD). In some embodiments of the invention, it may also be useful to compare the gene expression in a test sample of PBMCs to a baseline that has previously been established from a patient or population of patients with COPD. Such a baseline level, also referred to herein as a "positive control", refers to a level of gene expression established in PBMCs from one or preferably a population of individuals who had been positively diagnosed with COPD.
  • one baseline control can include the measurements of gene expression in a sample of PBMCs from the patient that was taken from a prior test in the same patient.
  • a new sample is evaluated periodically (e.g., at annual or more regular physicals), and any changes in gene expression in the patient PBMCs as compared to the prior measurement and most typically, also with reference to the above-described normal and/or positive controls, are monitored.
  • Monitoring of a patient's PBMC gene expression profile can be used by the clinician to prescribe or modify treatment for the patient based on whether any differences in gene expression in the PBMCs are indicated.
  • control or baseline levels of gene expression are obtained from PBMCs collected from "matched individuals".
  • matched individuals refers to a matching of the control individuals on the basis of one or more characteristics, such as gender, age, race, or any relevant biological or sociological factor that may affect the baseline of the control individuals and the patient (e.g., preexisting conditions, consumption of particular substances, levels of other biological or physiological factors).
  • the number of matched individuals from whom control samples must be obtained to establish a suitable control level (e.g., a population) can be determined by those of skill in the art, but should be statistically appropriate to establish a suitable baseline for comparison with the patient to be evaluated (i.e., the test patient).
  • the values obtained from the control samples are statistically processed using any suitable method of statistical analysis to establish a suitable baseline level using methods standard in the art for establishing such values. It will be appreciated by those of skill in the art that a baseline need not be established for each assay as the assay is performed but rather, a baseline can be established by referring to a form of stored information regarding a previously determined control level of gene expression. Such a form of stored information can include, for example, but is not limited to, a reference chart, listing or electronic file of population or individual data regarding "normal" (negative control) or COPD-positive gene expression; a medical chart for the patient recording data from previous evaluations; or any other source of data regarding control gene expression that is useful for the patient to be diagnosed or evaluated.
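The baseline described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the stored control data for one gene are summarized as a mean and sample standard deviation and that a test value is compared as a z-score; the text itself permits any suitable statistical method. The function names and the numeric values are hypothetical.

```python
from statistics import mean, stdev


def build_baseline(control_values):
    """Summarize control expression values as (mean, sample standard deviation)."""
    if len(control_values) < 2:
        raise ValueError("need at least two control samples")
    return mean(control_values), stdev(control_values)


def z_score(test_value, baseline):
    """Distance of a test value from the control mean, in control standard deviations."""
    mu, sd = baseline
    if sd == 0:
        raise ValueError("control values show no variation")
    return (test_value - mu) / sd


# Hypothetical expression values for one gene in matched control individuals.
controls = [8.0, 10.0, 12.0, 10.0]
baseline = build_baseline(controls)  # could be stored and reused across assays
print(z_score(14.0, baseline))
```

Because the baseline is just a pair of numbers, it can be kept in a reference file and reused, which mirrors the point that a baseline need not be re-established for each assay.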
  • expression of the transcripts and/or proteins encoded by the genes of the invention is measured by any of a variety of methods known in the art.
  • the nucleic acid sequence of a nucleic acid molecule (e.g., DNA or RNA)
  • suitable methods or techniques of measuring or detecting gene sequence or expression include, but are not limited to, polymerase chain reaction (PCR), reverse transcriptase-PCR (RT-PCR), in situ PCR, in situ hybridization, Southern blot, Northern blot, sequence analysis, microarray analysis, detection of a reporter gene, or other DNA/RNA hybridization platforms.
  • for RNA expression, preferred methods include but are not limited to: extraction of cellular mRNA and Northern blotting using labeled probes that hybridize to transcripts encoding all or part of one or more of the genes of this invention; amplification of mRNA expressed from one or more of the genes of this invention using gene-specific primers, polymerase chain reaction (PCR), and reverse transcriptase-polymerase chain reaction (RT-PCR), followed by quantitative detection of the product by any of a variety of means; extraction of total RNA from the cells, which is then labeled and used to probe cDNAs or oligonucleotides encoding all or part of the genes of this invention, arrayed on any of a variety of surfaces; in situ hybridization; and detection of a reporter gene.
  • quantifying or “quantitating” when used in the context of quantifying transcription levels of a gene can refer to absolute or to relative quantification.
  • Absolute quantification may be accomplished by inclusion of known concentration(s) of one or more target nucleic acids and referencing the hybridization intensity of unknowns with the known target nucleic acids (e.g. through generation of a standard curve).
  • relative quantification can be accomplished by comparison of hybridization signals between two or more genes, or between two or more treatments to quantify the changes in hybridization intensity and, by implication, transcription level.
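The two modes of quantification can be sketched as follows. This is an illustrative example only: it assumes the hybridization intensity is linear in target concentration over the range of the standards, which the text does not state, and the function names and numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_intensity(intensity, slope, intercept):
    """Absolute quantification: read an unknown off the standard curve."""
    return (intensity - intercept) / slope


def fold_change(test_signal, reference_signal):
    """Relative quantification: ratio of two hybridization signals."""
    return test_signal / reference_signal


# Hypothetical standards: known concentrations and their measured intensities.
standard_conc = [1.0, 2.0, 4.0, 8.0]
standard_signal = [110.0, 210.0, 410.0, 810.0]
slope, intercept = fit_line(standard_conc, standard_signal)
print(concentration_from_intensity(510.0, slope, intercept))
print(fold_change(300.0, 100.0))
```

The absolute route needs spiked-in standards of known concentration, whereas the relative route needs only two signals measured under comparable conditions.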
  • the present invention includes isolated proteins encoded by the genes identified by the inventors as being differentially expressed in patients with COPD versus normal controls; that is, the proteins listed in Tables 2-5 and encoded by SEQ ID NOs: 1- ), isolated proteins encoded by a sequence complementary thereto, or polypeptides encoded by a fragment, homologue, or variant of genes represented in Tables 2-5. These proteins, peptides and polypeptides of the invention, which can be made using the genes or derived from the sequence information of the genes, are also disclosed in the present invention. Functional forms of the proteins can be prepared as purified preparations by using a cloned gene as described herein. Alternatively, the proteins, peptides and polypeptides of the invention can be produced synthetically.
  • Full length proteins or fragments corresponding to one or more particular motifs and/or domains or to arbitrary sizes, for example, at least about 5, 10, 25, 50, 75, or 100 amino acids in length are within the scope of the present invention.
  • Methods to measure protein expression levels of selected genes of this invention include, but are not limited to: Western blot, immunoblot, enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA), immunoprecipitation, surface plasmon resonance, chemiluminescence, fluorescent polarization, phosphorescence, immunohistochemical analysis, matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry, microcytometry, microarray, microscopy, fluorescence activated cell sorting (FACS), flow cytometry, and assays based on a property of the protein including but not limited to DNA binding, ligand binding, or interaction with other protein partners.
  • Nucleic acid arrays are particularly useful for detecting the expression of the genes of the present invention.
  • the production and application of high-density arrays in gene expression monitoring have been disclosed previously in, for example, PCT Publication No. WO 97/10365; PCT Publication No. WO 92/10588; U.S. Patent No. 6,040,138; U.S. Patent No. 5,445,934; or PCT Publication No. WO 95/35505, all of which are incorporated herein by reference in their entireties.
  • arrays see Hacia et al. (1996) Nature Genetics 14:441-447; Lockhart et al. (1996) Nature Biotechnol. 14:1675-1680; and De Risi et al.
  • an oligonucleotide, a cDNA, or genomic DNA that is a portion of a known gene, occupies a known location on a substrate.
  • a nucleic acid target sample is hybridized with an array of such oligonucleotides and then the amount of target nucleic acids hybridized to each probe in the array is quantified.
  • One preferred quantifying method is to use a confocal microscope and fluorescent labels.
  • the Affymetrix GeneChip™ Array system (Affymetrix, Santa Clara, Calif.) and the Atlas™ Human cDNA Expression Array system are particularly suitable for quantifying the hybridization; however, it will be apparent to those of skill in the art that any similar systems or other effectively equivalent detection methods can also be used.
  • Such novel pluralities of polynucleotides are contemplated to be a part of the present invention and are described in detail below.
  • Suitable nucleic acid samples for screening on an array contain transcripts of interest or nucleic acids derived from the transcripts of interest (i.e., transcripts derived from the genes associated with COPD of the present invention).
  • a nucleic acid derived from a transcript refers to a nucleic acid for whose synthesis the mRNA transcript or a subsequence thereof has ultimately served as a template.
  • a cDNA reverse transcribed from a transcript, an RNA transcribed from that cDNA, a DNA amplified from the cDNA, an RNA transcribed from the amplified DNA, etc. are all derived from the transcript and detection of such derived products is indicative of the presence and/or abundance of the original transcript in a sample.
  • suitable samples include, but are not limited to, transcripts of the gene or genes, cDNA reverse transcribed from the transcript, cRNA transcribed from the cDNA, DNA amplified from the genes, RNA transcribed from amplified DNA, and the like.
  • such a sample is a total RNA preparation of a biological sample (e.g., peripheral blood mononuclear cells or PBMCs). More preferably in some embodiments, such a nucleic acid sample is the total mRNA isolated from such a biological sample.
  • the nucleic acids for screening are obtained from a homogenate of cells (e.g., peripheral blood mononuclear cells or PBMCs).
  • typical clinical samples include, but are not limited to, sputum, blood, blood cells (e.g., peripheral blood mononuclear cells), tissue or fine needle biopsy samples, urine, peritoneal fluid, and pleural fluid, or cells therefrom.
  • the present invention is primarily related to the detection of genes in peripheral blood mononuclear cells (PBMC or PBC).
  • amplification method if a quantitative result is desired, care must be taken to use a method that maintains or controls for the relative frequencies of the amplified nucleic acids to achieve quantitative amplification.
  • Methods of "quantitative" amplification are well known to those of skill in the art. For example, quantitative PCR involves simultaneously co-amplifying a known quantity of a control sequence using the same primers. This provides an internal standard that may be used to calibrate the PCR reaction. The high-density array may then include probes specific to the internal standard for quantification of the amplified nucleic acid.
  • LCR ligase chain reaction
  • Nucleic acid hybridization involves contacting a probe and target nucleic acid under conditions where the probe and its complementary target can form stable hybrid duplexes through complementary base pairing.
  • hybridization conditions refer to standard hybridization conditions under which nucleic acid molecules are used to identify similar nucleic acid molecules. Such standard conditions are disclosed, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Labs Press, 1989. Sambrook et al., ibid., is incorporated by reference herein in its entirety (see specifically, pages 9.31-9.62).
  • hybrid duplexes (e.g., DNA:DNA, RNA:RNA, or RNA:DNA)
  • specificity of hybridization is reduced at lower stringency.
  • higher stringency (e.g., higher temperature or lower salt)
  • High stringency hybridization and washing conditions refer to conditions which permit isolation of nucleic acid molecules having at least about 80% nucleic acid sequence identity with the nucleic acid molecule being used to probe in the hybridization reaction (i.e., conditions permitting about 20% or less mismatch of nucleotides).
  • Very high stringency hybridization and washing conditions refer to conditions which permit isolation of nucleic acid molecules having at least about 90% nucleic acid sequence identity with the nucleic acid molecule being used to probe in the hybridization reaction (i.e., conditions permitting about 10% or less mismatch of nucleotides).
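The identity thresholds in the two definitions above can be made concrete with a small sketch. It assumes two pre-aligned sequences of equal length and treats the approximate figures of 80% and 90% as hard cutoffs, which is a simplification; actual hybridization behavior also depends on length, base composition, and conditions. All names and sequences are hypothetical.

```python
# Minimum percent identity recoverable at each stringency, per the definitions above.
STRINGENCY_MIN_IDENTITY = {"high": 80.0, "very high": 90.0}


def percent_identity(probe, target):
    """Percent of matching positions between two equal-length aligned sequences."""
    if not probe or len(probe) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(probe.upper(), target.upper()))
    return 100.0 * matches / len(probe)


def recoverable_under(probe, target, stringency):
    """True if the target meets the identity floor for the named stringency."""
    return percent_identity(probe, target) >= STRINGENCY_MIN_IDENTITY[stringency]


# Hypothetical 10-mers differing at the last two positions.
probe = "ACGTACGTAC"
target = "ACGTACGTTT"
print(percent_identity(probe, target))
print(recoverable_under(probe, target, "high"))
print(recoverable_under(probe, target, "very high"))
```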
  • stringent hybridization conditions for DNA:DNA hybrids include hybridization at an ionic strength of 6X SSC (0.9 M Na+) at a temperature of between about 20°C and about 35°C (lower stringency), more preferably, between about 28°C and about 40°C (more stringent), and even more preferably, between about 35°C and about 45°C (even more stringent), with appropriate wash conditions.
  • stringent hybridization conditions for DNA:RNA hybrids include hybridization at an ionic strength of 6X SSC (0.9 M Na+) at a temperature of between about 30°C and about 45°C, more preferably, between about 38°C and about 50°C, and even more preferably, between about 45°C and about 55°C, with similarly stringent wash conditions.
  • Tm can be calculated empirically as set forth in Sambrook et al., supra, pages 9.31 to 9.62.
  • the wash conditions should be as stringent as possible, and should be appropriate for the chosen hybridization conditions.
  • hybridization conditions can include a combination of salt and temperature conditions that are approximately 20-25°C below the calculated Tm of a particular hybrid.
  • wash conditions typically include a combination of salt and temperature conditions that are approximately 12-20°C below the calculated Tm of the particular hybrid.
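The offsets below the melting temperature stated above translate directly into temperature windows. The sketch below takes the calculated Tm as an input (the text defers its calculation to Sambrook et al.) and returns the corresponding ranges; the function names and the example Tm of 70°C are hypothetical.

```python
def hybridization_window(tm_celsius):
    """Hybridization range: approximately 20-25 degrees C below the calculated Tm."""
    return (tm_celsius - 25.0, tm_celsius - 20.0)


def wash_window(tm_celsius):
    """Wash range: approximately 12-20 degrees C below the calculated Tm."""
    return (tm_celsius - 20.0, tm_celsius - 12.0)


# Hypothetical hybrid with a calculated Tm of 70 degrees C.
tm = 70.0
print(hybridization_window(tm))
print(wash_window(tm))
```

The wash window sits at or above the hybridization window, consistent with the statement that wash conditions should be as stringent as possible.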
  • hybridization conditions suitable for use with DNA:DNA hybrids include a 2-24 hour hybridization in 6X SSC (50% formamide) at about 42°C, followed by washing steps that include one or more washes at room temperature in about 2X SSC, followed by additional washes at higher temperatures and lower ionic strength (e.g., at least one wash at about 37°C in about 0.1X-0.5X SSC, followed by at least one wash at about 68°C in about 0.1X-0.5X SSC).
  • Other hybridization conditions and for example, those most useful with nucleic acid arrays, will be known to those of skill in the art.
  • the hybridized nucleic acids are detected by detecting one or more labels attached to the sample nucleic acids.
  • Detectable labels suitable for use in the present invention include any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means.
  • Useful labels in the present invention include biotin for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g., fluorescein, Texas red, rhodamine, green fluorescent protein, yellow fluorescent protein and the like), radiolabels (e.g., ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads.
  • radiolabels may be detected using photographic film or scintillation counters
  • fluorescent markers may be detected using a photodetector to detect emitted light
  • Enzymatic labels are typically detected by providing the enzyme with a substrate and detecting the reaction product produced by the action of the enzyme on the substrate, and colorimetric labels are detected by simply visualizing the colored label.
  • the method of the present invention includes a step of comparing the results of detecting the expression of the one or more genes that are selectively regulated in patients with COPD as compared to a control (baseline normal or negative control) in order to determine whether there is any observed change or difference in expression of each gene in the patient as compared to the control.
  • a positive control (baseline COPD control)
  • the present inventors have identified the expression profile of multiple genes that are differentially regulated in PBMCs of patients with COPD, as compared to a "normal" control (i.e., a patient that does not have or cannot be detected to have COPD), including the manner in which the genes are regulated (i.e., up- or downregulated). Therefore, one can determine whether peripheral blood cells from a test patient have a gene expression profile that is statistically substantially similar to the profile of gene expression of a patient with COPD, or whether a profile of gene expression in the peripheral blood cells of the test patient is statistically more similar to the negative or normal, non-disease control.
  • an expression profile is substantially similar to a given profile of expression established for a group (e.g., COPD group, normal control group) if the expression profile of the gene or genes detected (including the identity of the gene, the manner in which expression is regulated, and/or the level of expression of the gene) is similar enough to the expected result so as to be statistically significant (i.e., with at least a 95% confidence level, or p < 0.05, and more preferably, with a confidence level of p < 0.01, and even more preferably, with a confidence level of p < 0.005, and even more preferably, with a confidence level of p < 0.001).
  • detection of the regulation of the expression of a gene in the "manner" associated with the established group refers to the detection of the regulation of a gene that has now been shown by the present inventors to be selectively regulated in PBMCs of patients having COPD, at least in the same direction (i.e., upregulation or downregulation) and preferably at a similar or comparable level, as compared to a normal or baseline control established for the expression of that gene.
  • a gene identified as being upregulated or downregulated, as compared to a baseline control is regulated in the same direction as the level of expression of the gene that is seen in established or confirmed patients with COPD as compared to a normal control.
  • a gene identified as being upregulated or downregulated, as compared to a baseline control is regulated to at least about 10%, and more preferably at least 20%, and more preferably at least 25%, and more preferably at least 30%, and more preferably at least 35%, and more preferably at least 40%, and more preferably at least 45%, and more preferably at least 50%, and preferably at least 55%, and more preferably at least 60%, and more preferably at least 65%, and more preferably at least 70%, and more preferably at least 75%, and more preferably at least 80%, and more preferably at least 85%, and more preferably at least 90%, and more preferably at least at least
  • Statistical significance should be at least p < 0.05, and more preferably, at least p < 0.01, and more preferably, p < 0.005, and even more preferably, p < 0.001.
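The direction and magnitude of regulation relative to a baseline control can be sketched as follows. The example interprets "regulated to at least about 10%" as a percent change of the test level relative to the baseline level; that interpretation, the default threshold, and the function names are assumptions made for illustration.

```python
def regulation(test_level, baseline_level, min_percent=10.0):
    """Return (direction, percent change) of a test level versus a baseline level."""
    if baseline_level <= 0:
        raise ValueError("baseline level must be positive")
    percent = 100.0 * (test_level - baseline_level) / baseline_level
    if percent >= min_percent:
        direction = "up"
    elif percent <= -min_percent:
        direction = "down"
    else:
        direction = "unchanged"
    return direction, percent


def matches_expected(test_level, baseline_level, expected_direction, min_percent=10.0):
    """True if the gene is regulated in the direction reported for COPD patients."""
    direction, _ = regulation(test_level, baseline_level, min_percent)
    return direction == expected_direction


# Hypothetical expression levels for one gene reported as upregulated in COPD.
print(regulation(150.0, 100.0))
print(matches_expected(150.0, 100.0, "up"))
```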
  • one of skill in the art can use software programs available in the art that use algorithms to analyze gene expression profiles and identify significant differences among samples and controls.
  • one of skill in the art can apply various types of analyses as discussed above (e.g., cross-validation and/or permutation testing) to validate the results of the methods described herein.
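Permutation testing, mentioned above as one validation approach, can be sketched with an exact two-sided test on the difference in group means. This is a generic illustration rather than the inventors' analysis: it enumerates every relabeling, so it is practical only for small groups, and the expression values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for the difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance so relabelings tied with the observed split are counted.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical expression values for one gene in two small groups.
copd = [9.1, 9.4, 9.8, 10.2]
control = [6.0, 6.3, 6.8, 7.1]
p = permutation_p_value(copd, control)
print(p, p < 0.05)
```

With four samples per group there are only 70 possible relabelings, so the smallest attainable two-sided p-value is 2/70; the stricter thresholds listed above would require larger groups.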
  • a profile of individual gene biomarkers identified in a method of the invention, including a matrix of two or more markers, can be generated by one or more of the methods described above.
  • a profile of the genes regulated in a PBMC sample refers to a reporting of the expression level of a given gene that has been identified in any one or more of the tables presented herein, which, based on the knowledge of the regulation of the genes provided by the tables, includes a classification of the gene with regard to how the gene is regulated in PBMCs of a patient with COPD.
  • the profile for the blood cell sample will include the reporting of the expression of this gene as compared to one or more baseline controls (e.g., a negative/normal and/or a positive/COPD control).
  • the profile includes data for more than one (e.g., at least two), and preferably several genes (e.g., at least five, six, seven, eight, nine, ten, or more genes), such that a profile for the patient sample is created that can be compared to the control(s).
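Comparing a multi-gene patient profile against control profiles can be sketched as a nearest-profile lookup. Euclidean distance is used here only as a simple stand-in for the statistical similarity test the text calls for; the gene names, expression values, and function names are all hypothetical.

```python
from math import sqrt


def distance(profile, reference):
    """Euclidean distance over the genes present in both profiles."""
    genes = profile.keys() & reference.keys()
    if not genes:
        raise ValueError("profiles share no genes")
    return sqrt(sum((profile[g] - reference[g]) ** 2 for g in genes))


def closest_control(profile, controls):
    """Name of the control profile nearest to the patient profile."""
    return min(controls, key=lambda name: distance(profile, controls[name]))


# Hypothetical baseline profiles (negative/normal and positive/COPD controls).
controls = {
    "normal": {"GENE_A": 1.0, "GENE_B": 5.0, "GENE_C": 3.0},
    "COPD": {"GENE_A": 4.0, "GENE_B": 2.0, "GENE_C": 3.5},
}
patient = {"GENE_A": 3.6, "GENE_B": 2.5, "GENE_C": 3.4}
print(closest_control(patient, controls))
```

A nearest-profile call gives no measure of confidence by itself, so in practice it would be paired with a significance test of the kind described above.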
  • the data can be reported as raw data, and/or statistically analyzed by any of a variety of methods, and/or combined with any other prognostic marker(s) for COPD, including any markers that are expressed in cells or tissues other than PBMCs and are useful for evaluating COPD in a patient.
  • differences between the expression of genes in PBMCs of patients with COPD and without COPD may be small or large. Some small differences may be very reproducible and therefore are preferred for use in the diagnostic and prognostic methods of the invention. For other purposes, large differences may be desirable for ease of detection of the regulatory activity. It will therefore be appreciated that the exact boundary between a positive diagnosis and a negative diagnosis can shift, depending on the goal of the screening assay, the patient samples, the number of genes to be screened and the baseline controls used. For some assays, a given patient may be sampled over time to detect the efficacy of a treatment, and so changes in gene expression from a disease state toward a normal state may be detected.
  • the patient may still be positive for COPD as compared to a normal, disease-free control, but may show a shift toward the normal control gene expression profile if treatment is successful.
  • the technique being used for detection, as well as on the number of genes which are being tested, may impact how the assay is evaluated by those of skill in the art.
  • the profile of genes provided as a result of the screening of peripheral blood cells of a patient can be used by the patient or physician for decision-making regarding the usefulness of therapies for COPD in general.
  • the profile can be used to estimate how the disease is likely to respond and progress in any individual patient.
  • Clinical trials can be developed to correlate the relationship between COPD regulated genes and the biological behavior of the diseased tissues, including in response to particular treatments for COPD.
  • the profiling of genes expressed by peripheral blood cells can be extended to other diseases, and particularly, to other pulmonary diseases wherein diagnosis or prognosis of disease is difficult due to access to diseased tissue or difficulty distinguishing between subtypes of the disease based on conventional assays (e.g., histology).
  • one of skill in the art can use the techniques described herein to screen other gene arrays, including arrays of expressed tag sequences, to discover additional novel genes that are regulated in the peripheral blood cells of patients with COPD.
  • the extension of the gene profiles within COPD and also to other diseases will allow for the development of a variety of diagnostic assays in such diseases, as well as the identification of additional targets for therapeutic strategies.
  • nucleotide or protein array wherein hundreds or thousands of genes could be detected if desired.
  • the array can be designed to test for more than one disease condition in order to confirm or rule out other potential causes of a patient condition. For example, one may design an assay to screen for COPD as described herein, and also for pulmonary hypertension.
  • nucleotide or protein arrays that are specifically designed to test for the expression of any combination of the genes of interest as described herein, alone or in combination with any other combination of genes that may be useful in evaluating a patient for COPD.
  • Another embodiment of the present invention relates to a plurality of polynucleotides for the detection of the expression of genes that are selectively regulated in peripheral blood cells of patients with COPD.
  • the plurality of polynucleotides consists of, or consists essentially of, polynucleotide probes that are complementary to RNA transcripts, or nucleotides derived therefrom, of genes that have been identified herein as being selectively regulated in the peripheral blood cells of patients with COPD, and is therefore distinguished from previously known nucleic acid arrays and primer sets.
  • the plurality of polynucleotides within the above-limitation includes at least two or more polynucleotide probes (e.g., at least 2, 3, 4, 5, 6, and so on, in whole integer increments, up to all of the possible probes) that are complementary to RNA transcripts, or nucleotides derived therefrom, of genes identified by the present inventors. Such genes are selected from any of the genes listed in the tables provided herein. Multiple probes can also be used to detect the same gene or to detect different splice variants of the same gene.
  • genes that are not regulated in the peripheral blood cells of patients with COPD, or that are not presently known to be regulated in the peripheral blood cells of patients with COPD can be added to the set of genes to be identified by the plurality of polynucleotides.
  • Such genes would not be random genes, or large groups of unselected human genes, as are commercially available for detection now, but rather, would be specifically selected to complement the sets of genes identified by the present invention.
  • one of skill in the art may wish to add to the above-described plurality of polynucleotides one or more polynucleotides corresponding to (useful for identifying) genes that are of relevance because they are expressed by a particular tissue of interest (e.g., pulmonary tissue), are associated with the particular disease (COPD) but not necessarily with peripheral blood cells, or are associated with a particular cell, tissue or body function.
  • a plurality of polynucleotides refers to at least 2, and more preferably at least 3, and more preferably at least 4, and more preferably at least 5, and more preferably at least 6, and more preferably at least 7, and more preferably at least 8, and more preferably at least 9, and more preferably at least 10, and so on, in increments of one, up to any suitable number of polynucleotides, including polynucleotides representing all of the genes described herein (e.g., 106), 500, 1000, 10⁴, 10⁵, or at least 10⁶ or more polynucleotides.
  • an isolated polynucleotide, or an isolated nucleic acid molecule is a nucleic acid molecule that has been removed from its natural milieu (i.e., that has been subject to human manipulation), its natural milieu being the genome or chromosome in which the nucleic acid molecule is found in nature.
  • isolated does not necessarily reflect the extent to which the nucleic acid molecule has been purified, but indicates that the molecule does not include an entire genome or an entire chromosome in which the nucleic acid molecule is found in nature.
  • the polynucleotides useful in the plurality of polynucleotides of the present invention are typically a portion of a gene (sense or non-sense strand) of the present invention that is suitable for use as a hybridization probe or PCR primer for the identification of a full-length gene (or portion thereof) in a given sample (e.g., a peripheral blood cell sample).
  • An isolated nucleic acid molecule can include a gene or a portion of a gene (e.g., the regulatory region or promoter), for example, to produce a reporter construct according to the present invention.
  • An isolated nucleic acid molecule that includes a gene is not a fragment of a chromosome that includes such gene, but rather includes the coding region and regulatory regions associated with the gene, but no additional genes naturally found on the same chromosome.
  • An isolated nucleic acid molecule can also include a specified nucleic acid sequence flanked by (i.e., at the 5' and/or the 3' end of the sequence) additional nucleic acids that do not normally flank the specified nucleic acid sequence in nature (i.e., heterologous sequences).
  • An isolated nucleic acid molecule can include DNA, RNA (e.g., mRNA), or derivatives of either DNA or RNA (e.g., cDNA).
  • Although the phrase “nucleic acid molecule” primarily refers to the physical nucleic acid molecule and the phrase “nucleic acid sequence” primarily refers to the sequence of nucleotides on the nucleic acid molecule, the two phrases can be used interchangeably, especially with respect to a nucleic acid molecule, or a nucleic acid sequence, being capable of encoding a protein.
  • an isolated nucleic acid molecule of the present invention is produced using recombinant DNA technology (e.g., polymerase chain reaction (PCR) amplification, cloning) or chemical synthesis.
  • the minimum size of a nucleic acid molecule or polynucleotide of the present invention is a size sufficient to encode a protein having a desired biological activity, sufficient to form a probe or oligonucleotide primer that is capable of forming a stable hybrid with the complementary sequence of a nucleic acid molecule encoding the natural protein (e.g., under moderate, high or very high stringency conditions), or to otherwise be used as a target in an assay or in any therapeutic method discussed herein.
  • the size of the polynucleotide can be dependent on nucleic acid composition and percent homology or identity between the nucleic acid molecule and a complementary sequence as well as upon hybridization conditions per se (e.g., temperature, salt concentration, and formamide concentration).
  • the minimum size of a polynucleotide that is used as an oligonucleotide probe or primer is at least about 5 nucleotides in length, and preferably ranges from about 5 to about 50 or about 500 nucleotides or greater (1000, 2000, etc.), including any length in between, in whole number increments (i.e., 5, 6, 7, 8, 9, 10,...33, 34,...256, 257,...500...1000...), and more preferably from about 10 to about 40 nucleotides, and most preferably from about 15 to about 40 nucleotides in length.
  • the oligonucleotide primer or probe is typically at least about 12 to about 15 nucleotides in length if the nucleic acid molecules are GC-rich and at least about 15 to about 18 bases in length if they are AT-rich.
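  • By way of illustration only, the length guideline above can be sketched in Python. The function names and the 50% GC cutoff used to call a sequence GC-rich are assumptions of this sketch and are not specified herein; the minimum lengths use the lower bounds of the ranges stated above (about 12 nucleotides for GC-rich and about 15 for AT-rich sequences).

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)


def meets_minimum_length(
    seq: str,
    gc_rich_cutoff: float = 0.5,   # assumed cutoff, not taken from the text
    min_len_gc_rich: int = 12,     # lower bound of "about 12 to about 15"
    min_len_at_rich: int = 15,     # lower bound of "about 15 to about 18"
) -> bool:
    """Check a candidate primer or probe against the length guideline."""
    if gc_fraction(seq) >= gc_rich_cutoff:
        required = min_len_gc_rich
    else:
        required = min_len_at_rich
    return len(seq) >= required
```

  • A 12-mer composed only of G and C passes this check, whereas a 12-mer composed only of A and T does not until it is extended to 15 bases.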
  • the nucleic acid molecule can include a portion of a protein-encoding sequence or a nucleic acid sequence encoding a full-length protein.
  • the polynucleotide probes are conjugated to detectable markers.
  • Detectable labels suitable for use in the present invention include any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means.
  • Useful labels in the present invention include biotin for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g., fluorescein, Texas red, rhodamine, green fluorescent protein, and the like), radiolabels (e.g., ³H,
  • polynucleotide probes are immobilized on a substrate.
  • the polynucleotide probes are hybridizable array elements in a microarray or high density array.
  • Nucleic acid arrays are well known in the art and are described for use in comparing expression levels of particular genes of interest, for example, in U.S. Patent No. 6,177,248, which is incorporated herein by reference in its entirety. Nucleic acid arrays are suitable for quantifying small variations in expression levels of a gene in the presence of a large population of heterogeneous nucleic acids. Knowing the identity of the genes set forth by the present invention, nucleic acid arrays can be fabricated either by de novo synthesis on a substrate or by spotting or transporting nucleic acid sequences onto specific locations of the substrate.
  • Nucleic acids are purified and/or isolated from biological materials, such as a bacterial plasmid containing a cloned segment of sequence of interest. It is noted that all of the genes identified by the present invention have been previously sequenced, at least in part, such that oligonucleotides suitable for the identification of such nucleic acids can be produced. The database accession number for each of the genes identified by the present inventors is provided in the tables of the invention. Suitable nucleic acids are also produced by amplification of template, such as by polymerase chain reaction or in vitro transcription.
  • An array will typically include a number of probes that specifically hybridize to the sequences of interest.
  • the array will include one or more control probes.
  • the high-density array chip includes "test probes". Test probes could be oligonucleotides having a minimum or maximum length as described above for other oligonucleotides.
  • test probes are double or single strand DNA sequences. DNA sequences are isolated or cloned from natural sources or amplified from natural sources using natural nucleic acids as templates, or produced synthetically.
  • the test probes have sequences complementary to particular subsequences of the genes whose expression they are designed to detect.
  • the test probes are capable of specifically hybridizing to the target nucleic acid they are to detect.
  • Another embodiment of the present invention relates to a plurality of antibodies, or antigen binding fragments thereof, for the detection of the expression of genes regulated in peripheral blood cells in patients with COPD.
  • the plurality of antibodies, or antigen binding fragments thereof consists of antibodies, or antigen binding fragments thereof, that selectively bind to proteins encoded by genes that are regulated in peripheral blood cells in patients with COPD, and that can be detected as protein products using antibodies.
  • the plurality of antibodies, or antigen binding fragments thereof comprises antibodies, or antigen binding fragments thereof, that selectively bind to proteins or portions thereof (peptides) encoded by any of the genes from the tables provided herein.
  • a plurality of antibodies, or antigen binding fragments thereof refers to at least 2, and more preferably at least 3, and more preferably at least 4, and more preferably at least 5, and more preferably at least 6, and more preferably at least 7, and more preferably at least 8, and more preferably at least 9, and more preferably at least 10, and so on, in increments of one, up to any suitable number of antibodies, or antigen binding fragments thereof, including antibodies representing all of the genes described herein (e.g., 246) or more, such as 500, or at least 1000 antibodies, or antigen binding fragments thereof.
  • the phrase “selectively binds to” refers to the ability of an antibody, antigen binding fragment or binding partner (antigen binding peptide) to preferentially bind to specified proteins. More specifically, the phrase “selectively binds” refers to the specific binding of one protein to another (e.g., an antibody, fragment thereof, or binding partner to an antigen), wherein the level of binding, as measured by any standard assay (e.g., an immunoassay), is statistically significantly higher than the background control for the assay.
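  • As a non-limiting sketch of the "statistically significantly higher than the background control" criterion, the following Python function compares assay readings from test wells with readings from background control wells using an exact one-sided permutation test. The choice of a permutation test, the function name, and any significance cutoff applied to the result are assumptions of this sketch; the text requires only statistical significance by any standard assay.

```python
from itertools import combinations
from statistics import mean


def one_sided_permutation_p(test_wells, background_wells):
    """Exact p-value that the test mean exceeds the background mean.

    Enumerates every way of relabelling the pooled readings and counts
    the relabellings whose mean difference is at least the observed one.
    Suitable only for small numbers of wells, since the enumeration
    grows combinatorially.
    """
    pooled = list(test_wells) + list(background_wells)
    n_test = len(test_wells)
    n_background = len(pooled) - n_test
    if n_test == 0 or n_background == 0:
        raise ValueError("both groups need at least one reading")

    observed = mean(test_wells) - mean(background_wells)
    total = sum(pooled)
    at_least_as_extreme = 0
    n_relabellings = 0
    for indices in combinations(range(len(pooled)), n_test):
        group_sum = sum(pooled[i] for i in indices)
        diff = group_sum / n_test - (total - group_sum) / n_background
        n_relabellings += 1
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            at_least_as_extreme += 1
    return at_least_as_extreme / n_relabellings
```

  • With four test wells and four background wells there are 70 possible relabellings, so the smallest attainable p-value is 1/70; more replicate wells are needed if a stricter threshold is desired.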
  • When performing an immunoassay, controls typically include a reaction well/tube that contains antibody or antigen binding fragment alone (i.e., in the absence of antigen), wherein an amount of reactivity (e.g., non-specific binding to the well) by the antibody or antigen binding fragment thereof in the absence of the antigen is considered to be background. Binding can be measured using a variety of methods standard in the art, including enzyme immunoassays (e.g., ELISA), immunoblot assays, etc. Limited digestion of an immunoglobulin with a protease may produce two fragments.
  • An antigen binding fragment is referred to as an Fab, an Fab', or an F(ab')2 fragment.
  • a fragment lacking the ability to bind to antigen is referred to as an Fc fragment.
  • An Fab fragment comprises one arm of an immunoglobulin molecule containing an L chain (VL + CL domains) paired with the VH region and a portion of the CH region (CH1 domain).
  • An Fab' fragment corresponds to an Fab fragment with part of the hinge region attached to the CH1 domain.
  • An F(ab')2 fragment corresponds to two Fab' fragments that are normally covalently linked to each other through a disulfide bond, typically in the hinge regions.
  • Isolated antibodies of the present invention can include serum containing such antibodies, or antibodies that have been purified to varying degrees.
  • Whole antibodies of the present invention can be polyclonal or monoclonal.
  • functional equivalents of whole antibodies such as antigen binding fragments in which one or more antibody domains are truncated or absent (e.g., Fv, Fab, Fab', or F(ab)2 fragments), as well as genetically-engineered antibodies or antigen binding fragments thereof, including single chain antibodies or antibodies that can bind to more than one epitope (e.g., bi-specific antibodies), or antibodies that can bind to one or more different antigens (e.g., bi- or multi-specific antibodies), may also be employed in the invention.
  • a suitable experimental animal such as, for example, but not limited to, a rabbit, a sheep, a hamster, a guinea pig, a mouse, a rat, or a chicken, is exposed to an antigen against which an antibody is desired.
  • an animal is immunized with an effective amount of antigen that is injected into the animal.
  • An effective amount of antigen refers to an amount needed to induce antibody production by the animal.
  • the animal's immune system is then allowed to respond over a pre-determined period of time. The immunization process can be repeated until the immune system is found to be producing antibodies to the antigen.
  • To obtain polyclonal antibodies specific for the antigen, serum that contains the desired antibodies is collected from the animal (or in the case of a chicken, antibody can be collected from the eggs). Such serum is useful as a reagent.
  • Polyclonal antibodies can be further purified from the serum (or eggs) by, for example, treating the serum with ammonium sulfate.
  • Monoclonal antibodies may be produced according to the methodology of Kohler and Milstein (Nature 256:495-497, 1975). For example, B lymphocytes are recovered from the spleen (or any suitable tissue) of an immunized animal and then fused with myeloma cells to obtain a population of hybridoma cells capable of continual growth in suitable culture medium. Hybridomas producing the desired antibody are selected by testing the ability of the antibody produced by the hybridoma to bind to the desired antigen.
  • any of the genes of this invention can serve as targets for therapeutic strategies.
  • regulatory compounds that regulate (e.g., upregulate or downregulate) the expression and/or biological activity of a target gene or its expression product can be identified and/or designed using the information regarding the biomarker targets described herein.
  • Having identified genes that are highly regulated in patients with COPD, one can use such genes and their products to further investigate the molecular or biochemical mechanisms associated with the development and progression of COPD and then design or establish assays to identify therapeutic compounds that affect the molecular or biochemical mechanism with the goal of providing a therapeutic benefit to the patient.
  • one embodiment of the present invention relates to methods for identifying compounds that regulate the expression or activity of at least one of the biomarkers described herein.
  • such compounds can be used to further study mechanisms associated with COPD or more preferably, serve as a therapeutic agent for use in the treatment or prevention of at least one symptom or aspect of COPD, or as a lead compound for the development of such a therapeutic agent.
  • an assay can be used for screening and selecting a chemical compound or a biological compound having regulatory activity as a candidate reagent or therapeutic based on the ability of the compound to regulate the expression or activity of the target biomarker.
  • Reference herein to regulating a target can refer to one or both of regulating transcription of a target gene and regulating the translation and/or activity of its corresponding expression product.
  • such a compound can be referred to herein as a therapeutic compound, in one embodiment.
  • a cell line that naturally expresses the gene of interest or has been transfected with the gene (or suitable portions or derivatives thereof for assaying putative regulatory compounds) or other recombinant nucleic acid molecule encoding the protein of interest is incubated with various compounds, also referred to as candidate compounds, test compounds, or putative regulatory compounds.
  • a regulation of the expression of the gene of interest or regulation of the activities of its encoded product may be used to identify a therapeutic compound.
  • Therapeutic compounds identified in this manner can then be re-tested, if desired, in other assays to confirm their activities with regard to the target biomarker or a cellular or other activity related thereto.
  • compounds that increase the expression or activity of genes in any one or more of Tables 2-5, or the proteins encoded thereby, that are downregulated in peripheral blood cells of patients with COPD as compared to peripheral blood cells of normal controls, or compounds that decrease the expression or activity of genes in any one or more of Tables 2-5, or the proteins encoded thereby, that are upregulated in peripheral blood cells of patients with COPD as compared to peripheral blood cells of normal controls, are predicted to be useful as therapeutic reagents or lead compounds therefor in the prevention and treatment of COPD.
  • one embodiment of the present invention relates to a method of using the differentially expressed genes described herein or the proteins encoded thereby (i.e., the biomarkers of the invention) as a target to identify a regulatory compound for regulation of a biological function associated with that gene or protein.
  • a method can include the steps of: (a) contacting a test compound with a cell that expresses the target biomarker or a useful portion thereof (i.e., useful being any portion of a gene, transcript or protein that can be used to identify a compound as discussed herein); and (b) identifying compounds that regulate the expression or activity of the gene or protein.
  • the biological activity or biological action of a protein refers to any function(s) exhibited or performed by the protein that is ascribed to the naturally occurring form of the protein as measured or observed in vivo (i.e., in the natural physiological environment of the protein) or in vitro (i.e., under laboratory conditions).
  • Modifications, activities or interactions which result in a decrease in protein expression or a decrease in the activity of the protein can be referred to as inactivation (complete or partial), down-regulation, reduced action, or decreased action or activity of a protein.
  • modifications, activities or interactions which result in an increase in protein expression or an increase in the activity of the protein can be referred to as amplification, overproduction, activation, enhancement, up-regulation or increased action of a protein.
  • the biological activity of a protein according to the invention can be measured or evaluated using any assay for the biological activity of the protein as known in the art.
  • assays can include, but are not limited to, binding assays, assays to determine internalization of the protein and/or associated proteins, enzyme assays, cell signal transduction assays (e.g., phosphorylation assays), and/or assays for determining downstream cellular events that result from activation or binding of the cell surface protein (e.g., expression of downstream genes, production of various biological mediators, etc.).
  • a biologically active fragment or homologue of a gene, nucleic acid transcript or derivative thereof, or protein maintains the ability to be useful in a method of the present invention. Therefore, the biologically active fragment or homologue maintains the ability to be used to identify regulators (e.g., inhibitors) of the native gene or protein when, for example, the biologically active fragment or homologue is expressed by a cell or used in another assay format. Therefore, the biologically active fragment or homologue has a structure that is sufficiently similar to the structure of the native gene or protein that a regulatory compound can be identified by its ability to bind to and/or regulate the expression or activity of the fragment or homologue in a manner consistent with the regulation of the native gene or protein.
  • Compounds to be screened in the methods of the invention include known organic compounds such as antibodies, products of peptide libraries, and products of chemical combinatorial libraries. Compounds may also be identified using rational drug design relying on the structure of the product of a gene. Such methods are known to those of skill in the art and involve the use of three-dimensional imaging software programs. For example, various methods of drug design, useful to design or select mimetics or other therapeutic compounds useful in the present invention are disclosed in Maulik et al., 1997, Molecular Biotechnology: Therapeutic Applications and Strategies, Wiley-Liss, Inc., which is incorporated herein by reference in its entirety.
  • a mimetic refers to any peptide or non-peptide compound that is able to mimic the biological action of a naturally occurring peptide, often because the mimetic has a basic structure that mimics the basic structure of the naturally occurring peptide and/or has the salient biological properties of the naturally occurring peptide.
  • Mimetics can include, but are not limited to: peptides that have substantial modifications from the prototype such as no side chain similarity with the naturally occurring peptide (such modifications, for example, may decrease its susceptibility to degradation); anti-idiotypic and/or catalytic antibodies, or fragments thereof; non-proteinaceous portions of an isolated protein (e.g., carbohydrate structures); or synthetic or natural organic molecules, including nucleic acids and drugs identified through combinatorial chemistry, for example.
  • Such mimetics can be designed, selected and/or otherwise identified using a variety of methods known in the art.
  • a mimetic can be obtained, for example, from molecular diversity strategies (a combination of related strategies allowing the rapid construction of large, chemically diverse molecule libraries), libraries of natural or synthetic compounds, in particular from chemical or combinatorial libraries (i.e., libraries of compounds that differ in sequence or size but that have similar building blocks) or by rational, directed or random drug design. See, for example, Maulik et al., supra.
  • In a molecular diversity strategy, large compound libraries are synthesized, for example, from peptides, oligonucleotides, carbohydrates and/or synthetic organic molecules, using biological, enzymatic and/or chemical approaches.
  • the critical parameters in developing a molecular diversity strategy include subunit diversity, molecular size, and library diversity.
  • the general goal of screening such libraries is to utilize sequential application of combinatorial selection to obtain high-affinity ligands for a desired target, and then to optimize the lead molecules by either random or directed design strategies.
  • Methods of molecular diversity are described in detail in Maulik, et al., ibid.
  • Maulik et al. also disclose, for example, methods of directed design, in which the user directs the process of creating novel molecules from a fragment library of appropriately selected fragments; random design, in which the user uses a genetic or other algorithm to randomly mutate fragments and their combinations while simultaneously applying a selection criterion to evaluate the fitness of candidate ligands; and a grid-based approach in which the user calculates the interaction energy between three dimensional receptor structures and small fragment probes, followed by linking together of favorable probe sites.
  • the term “test compound”, “putative inhibitory compound” or “putative regulatory compound” refers to compounds having an unknown or previously unappreciated regulatory activity in a particular process.
  • the term “identify” with regard to methods to identify compounds is intended to include all compounds, the usefulness of which as a regulatory compound for the purposes of regulating the expression or activity of a target biomarker or otherwise regulating some activity that may be useful in the study or treatment of COPD is determined by a method of the present invention.
  • regulatory compounds are identified by exposing a target gene to a test compound; measuring the expression of a target; and selecting a compound that regulates (up or down) the expression of the target.
  • the putative regulatory compound can be exposed to a cell that expresses the target gene (endogenously or recombinantly).
  • a preferred cell to use in an assay includes a mammalian cell that either naturally expresses the target gene or has been transformed with a recombinant form of the target gene, such as a recombinant nucleic acid molecule comprising a nucleic acid sequence encoding the target protein or a useful fragment thereof. Methods to determine expression levels of a gene are well known in the art.
  • the conditions under which a cell, cell lysate, nucleic acid molecule or protein of the present invention is exposed to or contacted with a putative regulatory compound, such as by mixing, are any suitable culture or assay conditions.
  • the conditions include an effective medium in which the cell can be cultured or in which the cell lysate can be evaluated in the presence and absence of a putative regulatory compound.
  • Cells of the present invention can be cultured in a variety of containers including, but not limited to, tissue culture flasks, test tubes, microtiter dishes, and petri plates. Culturing is carried out at a temperature, pH and carbon dioxide content appropriate for the cell. Such culturing conditions are also within the skill in the art.
  • Cells are contacted with a putative regulatory compound under conditions which take into account the number of cells per container contacted, the concentration of putative regulatory compound(s) administered to a cell, and the incubation time of the putative regulatory compound with the cell. Determination of effective protocols can be accomplished by those skilled in the art based on variables such as the size of the container, the volume of liquid in the container, conditions known to be suitable for the culture of the particular cell type used in the assay, and the chemical composition of the putative regulatory compound (i.e., size, charge etc.) being tested.
  • a preferred amount of putative regulatory compound(s) can comprise from about 1 nM to about 10 mM of putative regulatory compound(s) per well of a 96-well plate.
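  • For illustration only, a screening series spanning the range stated above (about 1 nM to about 10 mM) might be laid out as logarithmically spaced concentrations. The function name, the default of eight points (tenfold steps), and the use of log spacing are assumptions of this sketch; the text specifies only the range.

```python
import math


def log_spaced_concentrations(low_molar=1e-9, high_molar=1e-2, points=8):
    """Return `points` concentrations (mol/L) evenly spaced on a log10 scale.

    Defaults cover 1 nM to 10 mM in tenfold steps (7 decades, 8 points).
    """
    if points < 2:
        raise ValueError("need at least two points")
    if low_molar <= 0 or high_molar <= low_molar:
        raise ValueError("require 0 < low_molar < high_molar")
    log_low = math.log10(low_molar)
    step = (math.log10(high_molar) - log_low) / (points - 1)
    return [10 ** (log_low + i * step) for i in range(points)]
```

  • Each concentration would then be assigned to one or more wells of the plate, alongside wells lacking the putative regulatory compound.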
  • To detect expression of a target refers to the act of actively determining whether a target is expressed or not. This can include determining whether the target expression is upregulated as compared to a control, downregulated as compared to a control, or unchanged as compared to a control. Therefore, the step of detecting expression does not require that expression of the target actually is upregulated or downregulated, but rather, can also include detecting that the expression of the target has not changed (i.e., detecting no expression of the target or no change in expression of the target).
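  • The three outcomes described above (upregulated, downregulated, or unchanged as compared to a control) can be sketched as a simple classification of a measured expression level against a control level. The twofold threshold and the function name below are assumptions chosen for illustration; no particular threshold is specified herein.

```python
import math


def classify_expression(treated_level, control_level, min_fold=2.0):
    """Classify target expression relative to a control measurement.

    Returns "upregulated", "downregulated", or "unchanged". A result of
    "unchanged" is still a valid detection outcome.
    """
    if treated_level <= 0 or control_level <= 0:
        raise ValueError("expression levels must be positive")
    if min_fold <= 1:
        raise ValueError("min_fold must be greater than 1")
    log2_ratio = math.log2(treated_level / control_level)
    threshold = math.log2(min_fold)
    if log2_ratio >= threshold:
        return "upregulated"
    if log2_ratio <= -threshold:
        return "downregulated"
    return "unchanged"
```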
  • Expression of transcripts and/or proteins is measured by any of a variety of known methods in the art, and such methods have been discussed previously herein.
  • measurement of translation of a protein includes any suitable method for detecting and/or measuring proteins from a cell or cell extract, and such methods have been described previously herein.
  • Designing a compound for testing in a method of the present invention can include creating a new chemical compound or searching databases of libraries of known compounds (e.g., a compound listed in a computational screening database containing three dimensional structures of known compounds). Designing can also be performed by simulating chemical compounds having substitute moieties at certain structural features.
  • the step of designing can include selecting a chemical compound based on a known function of the compound.
  • a preferred step of designing comprises computational screening of one or more databases of compounds in which the three dimensional structure of the compound is known and is interacted (e.g., docked, aligned, matched, interfaced) with the three dimensional structure of a target by computer (e.g., as described by Humblet and Dunbar, Annual Reports in Medicinal Chemistry, vol.
  • Candidate compounds identified or designed by the above-described methods can be synthesized using techniques known in the art, and depending on the type of compound. Synthesis techniques for the production of non-protein compounds, including organic and inorganic compounds are well known in the art. For example, for smaller peptides, chemical synthesis methods are preferred. For example, such methods include well known chemical procedures, such as solution or solid-phase peptide synthesis, or semi-synthesis in solution beginning with protein fragments coupled through conventional solution methods. Such methods are well known in the art and may be found in general texts and articles in the area such as: Merrifield, 1997, Methods Enzymol. 289:3-13; Wade et al., 1993, Australas Biotechnol.
  • peptides may be synthesized by solid-phase methodology utilizing a commercially available peptide synthesizer and synthesis cycles supplied by the manufacturer.
  • solid phase synthesis could also be accomplished using the FMOC strategy and a TFA/scavenger cleavage mixture.
  • a compound that is a protein or peptide can also be produced using recombinant DNA technology and methods standard in the art, particularly if larger quantities of a protein are desired.
  • putative regulatory compounds are identified by exposing a target to a candidate compound; measuring the binding of the candidate compound to the target; and selecting a compound that binds to the target at a desired concentration, affinity, or avidity.
  • the assay is performed under conditions conducive to promoting the interaction or binding of the compound to the target.
  • a BIAcore machine can be used to determine the binding constant of a complex between the target protein (a protein encoded by the target gene) and a natural ligand in the presence and absence of the candidate compound.
  • the target protein or a ligand binding fragment thereof can be immobilized on a substrate.
  • a natural or synthetic ligand is contacted with the substrate to form a complex.
  • the dissociation constant for the complex can be determined by monitoring changes in the refractive index with respect to time as buffer is passed over the chip (O'Shannessy et al. Anal. Biochem. 212:457-468 (1993); Schuster et al., Nature 365:343-347 (1993)).
  • Contacting a candidate compound at various concentrations with the complex and monitoring the response function allows the complex dissociation constant to be determined in the presence of the test compound and indicates whether the candidate compound is either an inhibitor or an agonist of the complex.
  • the candidate compound can be contacted with the immobilized target protein at the same time as the ligand to see if the candidate compound inhibits or stabilizes the binding of the ligand to the target protein.
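  • One common way to analyze the dissociation phase of such a measurement is sketched below in Python. The sketch assumes a simple 1:1 binding model in which the response decays as a single exponential, R(t) = R0·exp(−k_off·t), while buffer flows over the chip, and it assumes that an association rate constant k_on has been determined separately. The model, the function names, and the log-linear least-squares fit are assumptions of this sketch rather than a procedure specified herein.

```python
import math


def estimate_off_rate(times_s, responses):
    """Estimate k_off (1/s) from dissociation-phase readings.

    Fits ln(response) against time by ordinary least squares; under the
    assumed single-exponential model the slope equals -k_off.
    """
    if len(times_s) != len(responses) or len(times_s) < 2:
        raise ValueError("need matching lists with at least two points")
    if any(r <= 0 for r in responses):
        raise ValueError("responses must be positive to take logarithms")
    logs = [math.log(r) for r in responses]
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, logs))
    return -sxy / sxx


def dissociation_constant(k_off_per_s, k_on_per_molar_s):
    """Equilibrium dissociation constant K_D = k_off / k_on, in mol/L."""
    if k_on_per_molar_s <= 0:
        raise ValueError("k_on must be positive")
    return k_off_per_s / k_on_per_molar_s
```

  • Repeating the estimate in the presence and absence of the candidate compound gives two values of K_D whose comparison indicates whether the compound weakens or stabilizes the complex.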
  • suitable assays for measuring the binding of a candidate compound to a target protein or for measuring the ability of a candidate compound to affect the binding of the target protein to another protein or molecule include, but are not limited to, Western blot, immunoblot, enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA), immunoprecipitation, surface plasmon resonance, chemiluminescence, fluorescent polarization, phosphorescence, immunohistochemical analysis, matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry, microcytometry, microarray, microscopy, fluorescence activated cell sorting (FACS), and flow cytometry.
  • assays include those that are suitable for monitoring the effects of protein binding, including, but not limited to, cell-based assays such as: cytokine secretion assays, or intracellular signal transduction assays that determine, for example, protein or lipid phosphorylation, mediator release or intracellular Ca++ mobilization.
  • putative regulatory compounds are identified by exposing a target protein of the present invention (or a cell expressing the protein naturally or recombinantly) to a candidate compound and measuring the ability of the compound to inhibit or enhance a biological activity of the protein.
  • the biological activity of a protein encoded by the target gene is measured by measuring the amount of product generated in a biochemical reaction mediated by the protein encoded by the target gene.
  • the activity of the protein encoded by the target gene is measured by measuring the amount of substrate generated in a biochemical reaction mediated by the protein encoded by the target gene.
  • a biological activity is measured by measuring a specific event in a cell-based assay, such as release or secretion of a biological mediator or compound that is regulated by the activity of the target protein, or by using intracellular signal transduction assays that determine, for example, protein or lipid phosphorylation, mediator release or intracellular Ca++ mobilization.
  • the activity of the protein is measured in the presence and absence of the candidate compound, or in the presence of another suitable control compound.
  • a therapeutic compound is identified by exposing the enzyme encoded by a target gene to a test compound; measuring the activity of the enzyme encoded by the target gene in the presence and absence of the compound; and selecting a compound that down-regulates or inhibits the activity of the enzyme encoded by the target gene.
  • Methods to measure enzymatic activity are well known to those skilled in the art and are selected based on the identity of the enzyme being tested. For example, if the enzyme is a kinase, phosphorylation assays can be used.
  • methods used to identify therapeutic compounds are customized for each target gene or product.
  • if the target product is an enzyme, then the enzyme will be expressed in cell culture and purified.
  • the enzyme will then be screened in vitro against therapeutic compounds to look for inhibition of that enzymatic activity.
  • if the target is a non-catalytic protein, then it will also be expressed and purified.
  • Therapeutic compounds will then be tested for their ability to regulate, for example, the binding of a site-specific antibody or a target-specific ligand to the target product.
  • if therapeutic compounds that bind to target products are identified, then those compounds can be further tested in biological assays that test for other desirable characteristics and activities, such as utility as a reagent for the study of COPD or utility as a therapeutic compound for the prevention or treatment of COPD.
  • a composition, and particularly a therapeutic composition, of the present invention generally includes the therapeutic compound and a carrier, and preferably, a pharmaceutically acceptable carrier.
  • a pharmaceutically acceptable carrier includes pharmaceutically acceptable excipients and/or pharmaceutically acceptable delivery vehicles, which are suitable for use in administration of the composition to a suitable in vitro, ex vivo or in vivo site.
  • a suitable in vitro, in vivo or ex vivo site is preferably a pulmonary tissue or a cell that is associated with or travels to a pulmonary tissue.
  • Preferred pharmaceutically acceptable carriers are capable of maintaining a compound, a protein, a peptide, nucleic acid molecule or mimetic (drug) in a form that, upon arrival of the compound, protein, peptide, nucleic acid molecule or mimetic at the target site in a culture (in the case of an in vitro or ex vivo protocol) or in patient (in vivo), the compound, protein, peptide, nucleic acid molecule or mimetic is capable of providing the desired effect at the target site.
  • Suitable excipients of the present invention include excipients or formularies that transport or help transport, but do not specifically target a composition to a cell (also referred to herein as non-targeting carriers).
  • examples of pharmaceutically acceptable excipients include, but are not limited to water, phosphate buffered saline, Ringer's solution, dextrose solution, serum-containing solutions, Hank's solution, other aqueous physiologically balanced solutions, oils, esters and glycols.
  • Aqueous carriers can contain suitable auxiliary substances required to approximate the physiological conditions of the recipient, for example, by enhancing chemical stability and isotonicity.
  • a controlled release formulation that is capable of slowly releasing a composition of the present invention into a patient or culture.
  • a controlled release formulation comprises a therapeutic compound in a controlled release vehicle.
  • Suitable controlled release vehicles include, but are not limited to, biocompatible polymers, other polymeric matrices, capsules, microcapsules, microparticles, bolus preparations, osmotic pumps, diffusion devices, liposomes, lipospheres, and transdermal delivery systems.
  • Other carriers include liquids that, upon administration to a patient, form a solid or a gel in situ. Preferred carriers are also biodegradable (i.e., bioerodible).
  • suitable delivery vehicles include, but are not limited to liposomes, viral vectors or other delivery vehicles, including ribozymes.
  • Natural lipid-containing delivery vehicles include cells and cellular membranes.
  • Artificial lipid-containing delivery vehicles include liposomes and micelles.
  • a delivery vehicle of the present invention can be modified to target to a particular site in a patient, thereby targeting and making use of a therapeutic compound at that site. Suitable modifications include manipulating the chemical formula of the lipid portion of the delivery vehicle and/or introducing into the vehicle a targeting agent capable of specifically targeting a delivery vehicle to a preferred site, for example, a preferred cell type.
  • a compound or composition can be delivered to a cell culture or patient by any suitable method. Selection of such a method will vary with the type of compound being administered or delivered (i.e., compound, protein, peptide, nucleic acid molecule, or mimetic), the mode of delivery (i.e., in vitro, in vivo, ex vivo) and the goal to be achieved by administration/delivery of the compound or composition.
  • an effective administration protocol (i.e., administering a composition in an effective manner)
  • suitable dose parameters and modes of administration that result in delivery of a composition to a desired site (i.e., to a desired cell) and/or in the desired regulatory event.
  • Administration routes include in vivo, in vitro and ex vivo routes.
  • In vivo routes include, but are not limited to, oral, nasal, intratracheal injection, inhaled, transdermal, rectal, and parenteral routes.
  • Preferred parenteral routes can include, but are not limited to, subcutaneous, intradermal, intravenous, intramuscular and intraperitoneal routes.
  • Intravenous, intraperitoneal, intradermal, subcutaneous and intramuscular administrations can be performed using methods standard in the art. Aerosol (inhalation) delivery can also be performed using methods standard in the art (see, for example, Stribling et al., Proc. Natl. Acad.
  • Oral delivery can be performed by complexing a therapeutic composition of the present invention to a carrier capable of withstanding degradation by digestive enzymes in the gut of an animal.
  • examples of such carriers include plastic capsules or tablets, such as those known in the art.
  • Direct injection techniques include, for example, injecting the composition directly into a site.
  • Ex vivo refers to performing part of the regulatory step outside of the patient, such as by transfecting a population of cells removed from a patient with a recombinant molecule comprising a nucleic acid sequence encoding a protein according to the present invention under conditions such that the recombinant molecule is subsequently expressed by the transfected cell, and returning the transfected cells to the patient.
  • In vitro and ex vivo routes of administration of a composition to a culture of host cells can be accomplished by a method including, but not limited to, transfection, transformation, electroporation, microinjection, lipofection, adsorption, protoplast fusion, use of protein carrying agents, use of ion carrying agents, use of detergents for cell permeabilization, and simply mixing (e.g., combining) a compound in culture with a target cell.
  • a therapeutic compound, as well as compositions comprising such compounds can be administered to any organism, and particularly, to any member of the Vertebrate class, Mammalia, including, without limitation, primates, rodents, livestock and domestic pets.
  • Livestock include mammals to be consumed or that produce useful products (e.g., sheep for wool production).
  • Preferred mammals to protect include humans.
  • a therapeutic benefit is not necessarily a cure for a particular disease or condition, but rather, preferably encompasses a result which can include alleviation of the disease or condition, elimination of the disease or condition, reduction of a symptom associated with the disease or condition, prevention or alleviation of a secondary disease or condition resulting from the occurrence of a primary disease or condition, and/or prevention of the disease or condition.
  • the phrase "protected from a disease" refers to reducing the symptoms of the disease, reducing the occurrence of the disease, and/or reducing the severity of the disease.
  • Protecting a patient can refer to the ability of a composition of the present invention, when administered to a patient, to prevent a disease from occurring and/or to cure or to alleviate disease symptoms, signs or causes.
  • to protect a patient from a disease includes both preventing disease occurrence (prophylactic treatment) and treating a patient that has a disease (therapeutic treatment) to reduce the symptoms of the disease.
  • a beneficial effect can easily be assessed by one of ordinary skill in the art and/or by a trained clinician who is treating the patient.
  • disease refers to any deviation from the normal health of a mammal and includes a state when disease symptoms are present, as well as conditions in which a deviation (e.g., infection, gene mutation, genetic defect, etc.) has occurred, but symptoms are not yet manifested.
  • PAH pulmonary arterial hypertension
  • PBMC gene expression from individuals with COPD has a distinct immunologic phenotype from normal controls and patients with other chronic lung diseases (PAH).
  • Example 2. The following example describes a larger study demonstrating the utility of peripheral blood mononuclear cells as surrogate markers for COPD.
  • the outline of the study is as follows.
  • Exclusion criteria are: history of underlying lung disease other than COPD (e.g.
  • rheumatoid arthritis, systemic lupus erythematosus
  • known history of genetic predisposition to COPD (alpha-one antitrypsin, cystic fibrosis, etc.)
  • potential occupational exposure (mining, metal work, etc.)
  • use of immunosuppressive medication including oral or inhaled corticosteroids; chronic oxygen therapy; and history of malignancy.
  • Additional inclusion criteria for groups (3) and (4) include: no evidence of airflow limitation by spirometry; FEV1 > 80% predicted; and FEV1/FVC ratio > 70.
  • Exclusion criteria for groups (3) and (4) includes: history of underlying lung disease (e.g.
  • RNA isolated from PBMC separation will be stored for directed (quantitative reverse transcriptase PCR) validation analysis of observed differentially expressed genes.
  • an additional peripheral PBMC isolation tube will be collected and cryopreserved for future validation studies of differentially expressed cell marker transcripts (Fig. 4).
  • Sample size and power. Using the data presented in Example 1 on differential gene expression, the inventors have determined that a sample size of 10 per group will allow the identification of large differences between any two groups.
  • Table 2 shows minimally detectable mean differences (Mean2 - Mean1) on the log2 intensity scale using an α-level of 0.001 (two-sided) and power of
  • Table 2: Sample size and power.
  • the estimated standard deviations (S1 and S2) were obtained from the COPD and normal distributions of standard deviations from the 15,022 genes that passed the filtering criteria applied in BRB ArrayTools. Values are shown for the median, 75th and 90th percentiles of standard deviation as suggested by Yang and Speed⁵⁸. It is assumed for these calculations that an unmatched analysis will be performed (or that the matching will not induce correlation between COPD and non-COPD individuals), which gives a conservative estimate of effect size. As the true proportion of differentially expressed genes in the population varies from 0.005, 0.05 to 0.20, the expected false discovery rate will vary from 0.20, 0.02, to 0.005⁵⁹.
  • Isolation of PBMCs and total RNA. The mononuclear cells are obtained from patient blood in an identical manner to that employed for the PAH study discussed in Example 1, with slight modifications to improve reproducibility. Specifically, eight milliliters of peripheral blood is collected into BD-CPT tubes⁵⁴ and processed following the manufacturer's instructions. In recent testing, the inventors have determined that PBMCs isolated in this fashion contain less than 5% granulocytes.
  • Total RNA from samples selected for array analysis is isolated using standard methods and quantified by spectrophotometry, and the absence of degradation is confirmed using the Agilent Bioanalyzer.
  • Biotinylated cRNA for array hybridization is generated from total RNA using the methods previously developed by the inventors and others. Briefly, total RNA (2-5 µg) is converted to double-stranded cDNA using a standard oligo-dT-T7 primer, followed by in vitro transcription by T7 with the incorporation of biotin-nucleotide triphosphates. Labeled cRNA is fragmented, added to a hybridization buffer, and applied to the microarray. After hybridization, the unbound cRNA is washed away, and the bound probe is stained with streptavidin-phycoerythrin. The array is scanned, and the quantity of hybridization is inferred from the intensity of fluorescence at each feature of the array.
  • Affymetrix® arrays for human gene expression are used: e.g., the Affymetrix Hu-133 Plus 2.0 GeneChip™.
  • This fifth-generation microarray measures approximately 47,000 transcripts, including essentially all of the well-characterized genes in the human genome (e.g., ~38,500 genes). Analysis of this microarray includes the use of a high-resolution scanner that is required for arrays with this increased density (1,300,000 individual array features).
  • the first step of the data flow is the conversion of image data into tabular data.
  • the inventors use the statistical algorithms implemented in the Affymetrix GeneChip™ Operating System (GCOS) (Affymetrix, Santa Clara, Calif.) for this task. A number of alternatives to GCOS exist (such as dChip, RMA, and PerfectMatch); the inventors' experience indicates that, although these tools may provide advantages in the direct comparison of paired samples, they provide no advantage in class comparison and class discovery applications. Internal measurements of chip and sample quality (brightness scaling factor, noise, % present calls, and control gene 3'/5' ratios) are collected in GCOS, and only chips that meet the inventors' quality threshold are included for analysis.
  • GCOS GeneChip™ Operating System
  • the analysis of the array data is similar to that outlined for the pulmonary hypertension analysis in Example 1 above. Briefly, the data is normalized (mean centered) and examined in an unsupervised fashion (see description of study design above). The goal of this analysis is to reduce bias in the discovery of meaningful groups in the patient population. Subsequent supervised analyses follow a class comparison paradigm, with the aim of discovering patterns of gene expression that support, or co-vary with assigned membership in new classes discovered in the unsupervised analysis, or with the clinical parameters.
  • Clinical data obtained through additional studies is the basis for potential classification groups, including pulmonary physiologic testing and demographic data. For each discriminator a binary class comparison analysis is conducted, and the reliability statistics (as determined by permutation testing) associated with the postulated classification are determined.
  • RT-PCR Quantitative reverse transcription-PCR
  • RNA is maintained from the original isolation for quantitative PCR of differentially expressed transcripts.
  • Lymphocytes in the bronchoalveolar space reenter the lung tissue by means of the alveolar epithelium, migrate to regional lymph nodes, and subsequently rejoin the systemic immune system. Anat.Rec. 264:229-236.
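By way of a non-limiting illustration, the sample size and power statement of Example 2 above can be sketched in Python. The sketch makes several assumptions that are not taken from the text: the power value is truncated in the source, so 0.80 is assumed here because it approximately reproduces the quoted false discovery rates; the minimal detectable difference uses a normal approximation, which understates the difference for small groups relative to a t-based calculation; and the standard deviations passed in are hypothetical, since the Table 2 values are not reproduced. The function names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def expected_fdr(alpha: float, power: float, prop_de: float) -> float:
    """Expected false discovery rate when a fraction prop_de of genes truly differs."""
    false_pos = alpha * (1.0 - prop_de)  # unchanged genes called significant
    true_pos = power * prop_de           # changed genes called significant
    return false_pos / (false_pos + true_pos)


def min_detectable_diff(s1: float, s2: float, n: int,
                        alpha: float, power: float) -> float:
    """Normal-approximation minimal detectable mean difference (log2 scale)
    for two unmatched groups of n samples each."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    z_power = z.inv_cdf(power)
    return (z_alpha + z_power) * sqrt((s1 ** 2 + s2 ** 2) / n)


ALPHA = 0.001         # two-sided alpha-level stated in the text
ASSUMED_POWER = 0.80  # assumption: the power value is truncated in the text

for prop in (0.005, 0.05, 0.20):
    print(prop, round(expected_fdr(ALPHA, ASSUMED_POWER, prop), 3))

# Hypothetical standard deviations of 0.3 log2 units in each group, n = 10.
print(round(min_detectable_diff(0.3, 0.3, 10, ALPHA, ASSUMED_POWER), 2))
```

Under these assumptions, the three computed false discovery rates round to the 0.20, 0.02 and 0.005 values given in the text.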

Abstract

Genes and proteins that are regulated differentially in individuals with chronic pulmonary obstructive disease (COPD) are provided, as well as methods for their use.

Description

DIAGNOSIS OF CHRONIC PULMONARY OBSTRUCTIVE DISEASE AND MONITORING OF THERAPY USING GENE EXPRESSION ANALYSIS OF PERIPHERAL BLOOD CELLS
Background of the Invention
Chronic obstructive pulmonary disease (COPD) is a global epidemic. Worldwide, COPD is predicted to become the third most common cause of death and the fifth most common cause of disability by 2020¹. In the United States, COPD is the fourth leading cause of death and is projected to become the third leading cause of death for both males and females by the year 2020. Estimates suggest that approximately 14 million people in the United States have COPD and the prevalence continues to rise³. Additionally, COPD is an increasingly common cause of chronic morbidity, lost economic productivity, and consumption of health care resources². COPD is caused by long-term exposure to inhaled noxious gases and particles; most notably, tobacco smoke accounts for more than 90% of COPD cases in developed countries. Despite the strong causal link between tobacco smoke and COPD, only 10-20% of cigarette smokers develop clinically evident COPD (Fig. 1)³,⁴.
Epidemiologic studies demonstrate that the progressive airflow limitation in COPD patients is due to an accelerated decline in lung function (measured as the forced expiratory volume in one second, FEV1) from the normal rate in adults of approximately 30 ml per year to 60 ml per year¹⁴,¹⁵. As illustrated in Figure 2, the progressive decline in lung function (FEV1) begins with a long asymptomatic period during which ongoing functional loss occurs without associated symptoms. Although somewhat variable, the onset of symptoms signifies a substantial loss of function associated with an FEV1 of approximately 50% of the predicted normal. One of the major challenges in the diagnosis and management of patients with COPD is that individuals typically present at the time of symptom onset, when loss of lung function is advanced and interventions such as smoking cessation offer limited clinical improvement. Thus, the ability to identify, while they are still asymptomatic, the subset of smokers who will go on to develop accelerated airflow limitation and chronic respiratory failure may have a profound impact on the rate of disease progression and may open a therapeutic window early in the disease process, when intervention is more efficacious.
The reason why only a small fraction of tobacco users develop clinically evident COPD remains a fundamental question. A reasonable conclusion is that there are certain subpopulations of tobacco users who have unknown predisposing factors resulting in increased disease susceptibility. Efforts to identify the subpopulations of tobacco users who develop COPD and their associated risk factors remain an active area of COPD research. It is likely that both environmental and genetic factors have a role in the development of COPD. Environmental factors that have been identified in COPD include smoking, air pollution, tobacco use, and viral infections¹⁶.
Evidence for a genetic link associated with the development of COPD is supported by both twin studies¹⁷ and familial clustering¹⁸. Investigative methods including linkage analysis¹⁸, whole genome screens¹⁹ and candidate gene approaches¹⁹ have been utilized to further identify potential genetic links. From these approaches, a number of candidate genes involved in immune response, cytokine regulation, metabolic and enzymatic processes have been identified. These include alpha-1 antitrypsin, neutrophil elastase, matrix metalloproteases, interleukin-8, tumor necrosis factor-α (TNF-α), transforming growth factor-β (TGF-β), immunoglobulins, vitamin D-binding protein, blood group antigens, and human leukocyte antigen (HLA) status¹⁹,²⁰. These data support a role for genetic susceptibility to COPD; however, there likely are multiple genes operating in concert with environmental factors in a synergistic fashion leading to the diverse and complex COPD phenotype. Although tobacco exposure is the major cause of COPD, smoking cessation does not appear to resolve ongoing airway inflammation once the disease is established⁵. This finding, in combination with the fact that most smokers do not develop COPD, supports the hypothesis that a distinct "immunologic phenotype" exists in smokers who develop COPD. The mechanisms by which the chronic inflammatory response is maintained are not known. However, a distinct "signature" of inflammatory cells, cytokine mediators, and proteases specific to COPD has been identified²¹⁻²³.
The pathologic hallmarks of COPD are persistent airway inflammation, mucus hypersecretion, and lung parenchyma destruction. Studies examining airway histology, bronchoalveolar lavage, and sputum of smokers with COPD have demonstrated increases in macrophages, T-lymphocytes, and neutrophils relative to smokers without COPD, nonsmokers, or asthmatics²²,²⁴⁻²⁹. Clinically, the quantitative presence of airway neutrophils³⁰, macrophages³⁰, and T-lymphocytes²⁰ has been correlated with disease severity, suggesting a role for these immunoregulatory cells in the progression of disease. In particular, the presence of increased CD8+ T-lymphocytes in the airways of COPD patients is a consistent finding²⁸,³¹⁻³³. The role played by CD8+ T-lymphocytes in underlying disease pathogenesis remains speculative. Current hypotheses include: (1) enhanced apoptosis of alveolar epithelial cells leading to parenchymal destruction³¹, (2) persistent recruitment to the lung parenchyma as a result of recurrent or chronic viral infection causing TNF-α mediated alveolar epithelial cell destruction³⁴, or (3) an autoimmune phenomenon³⁵. In addition to the abnormal inflammatory response in the airways and parenchyma, there is increasing evidence for systemic (non-pulmonary) inflammation including oxidative stress³⁶, neutrophil activation³⁷⁻³⁹ and apoptosis⁴⁰, and increased proinflammatory cytokines⁴¹,⁴³. Several studies have compared peripheral blood T-lymphocytes in smokers versus nonsmokers (Table 1)⁴⁴⁻⁴⁶.
Table 1: Summary of previously performed studies of peripheral blood mononuclear cells in COPD demonstrating a role for circulating immunoregulatory cells in the disease process
[Table 1 is presented as an image in the original publication.]
Miller et al.⁴⁴ analyzed peripheral blood lymphocyte populations in 60 smokers and 35 nonsmokers. They found that although there was no difference in the total number of T-lymphocytes or CD4+/CD8+ ratios in mild smokers compared to normal individuals, an increase in CD8+ lymphocytes and a decreased CD4+/CD8+ ratio were observed in heavy smokers. Interestingly, these changes were reversible with smoking cessation. Ekberg-Jansson et al. found that peripheral blood T-cell activating markers were higher in 60-year-old male smokers than in age-matched nonsmokers⁴⁶. These studies support the hypothesis that tobacco exposure causes alterations in circulating T-lymphocytes and that these changes may be reversible with smoking cessation⁴⁴. Following these studies, two other groups have investigated the role of smoking and the presence or absence of COPD on circulating immunoregulatory T-lymphocytes (Table 1). De Jong et al. studied lymphocyte subsets in the peripheral blood of 42 individuals with COPD (both former and active smokers) and 24 subjects without COPD (a mix of smokers and nonsmokers). Results showed no difference in the number of T-lymphocyte subsets when comparing the entire groups or only smoking subjects. However, the percentage of CD8+ T-lymphocytes was significantly higher in the nonsmoking COPD subjects compared to the nonsmoking healthy controls. Additionally, results showed that in the nonsmoking COPD patients, a higher CD4+/CD8+ ratio correlated with improved lung function, again suggesting that the balance of T-lymphocyte subsets may play a role in disease pathogenesis⁴⁷. Kim et al. investigated the relationship between T-lymphocyte subsets and pulmonary function in three groups: smokers with COPD, smokers without COPD, and healthy nonsmokers. The authors found that the proportion of CD8+ T-cells and the ratio of CD4+/CD8+ T-cells correlated with physiologic measures of gas exchange⁴⁸. Recently, Hodge et al. analyzed peripheral blood T-cells from 18 patients with COPD and found increased T-cell apoptosis and Fas expression compared to normal controls. In addition, intracellular production of TNF-α and TGF-β by T-cells and monocytes in COPD patients was significantly increased over controls⁴⁹. These findings are consistent with the report of Takabatake et al. demonstrating increased circulating TNF-α in the peripheral blood of patients with COPD⁵⁰. Additionally, De Godoy et al. demonstrated significantly higher levels of TNF-α production in circulating peripheral blood monocytes in a subset of "weight-losing" COPD patients compared to "weight-stable" COPD patients, suggesting a possible pathogenetic relationship between accelerated metabolism and TNF-α production⁴³. Thus, there exists strong evidence not only for a quantitative change in T-lymphocyte populations in COPD but also for a functional alteration in the circulating immunoregulatory cells (T-lymphocytes and monocytes) that may result in disease progression, phenotypic variation, or lack of resolution.
Questions remain as to whether the development of COPD in a susceptible smoker is the result of an abnormal inflammatory response to a normal stimulus (i.e., tobacco smoke) or, conversely, a normal inflammatory response to an abnormal stimulus (i.e., autoantigen). Arguments for the latter are supported by the ongoing communication and trafficking of inflammatory cells between the pulmonary and systemic circulation. Alternatively, or additionally, there is evidence that variation in T-lymphocyte subpopulations is genetically influenced⁵²,⁵³. Thus, genetic alterations in the host inflammatory response may result in the "immunologic phenotype" observed in tobacco-related COPD patients.
Chronic obstructive pulmonary disease is a growing worldwide epidemic. Despite the increasing individual and societal costs related to COPD, funding and support for COPD research have been relatively neglected. Thus, the understanding of the pathogenetic and cellular mechanisms involved in the development of COPD remains limited, and therefore, so do therapeutic approaches for preventing and treating the disease.
Summary of the Invention
The present invention provides a method to diagnose chronic obstructive pulmonary disease (COPD) or a predisposition to develop COPD, comprising detecting the level of expression of at least one gene in a sample of peripheral blood cells isolated from a patient to be tested, wherein the gene is chosen from the genes represented by SEQ ID NO:1-323, and wherein the level of expression of each of the genes in any one or more of Tables 2-5 is associated with COPD as measured by either upregulation or downregulation of gene expression in peripheral blood cells from patients with COPD as compared to the level of expression of the genes in peripheral blood cells from normal controls; and comparing the level of expression of the gene from the patient sample to the level of expression of the gene in normal control peripheral blood cells, wherein detection of regulation of the expression of the gene in the patient sample in the direction associated with COPD indicated in Table 2, 3, 4 and/or 5 indicates a diagnosis of COPD in the patient. In some embodiments, the detecting comprises detecting expression of at least 5 genes chosen from the genes represented by SEQ ID NO:1-323. In some embodiments, the detecting comprises detecting expression of at least 10 genes chosen from the genes represented by SEQ ID NO:1-323. In some embodiments, the detecting comprises detecting expression of at least 15 genes chosen from the genes represented by SEQ ID NO:1-323. In some embodiments, the detecting comprises detecting expression of at least 20 genes chosen from the genes represented by SEQ ID NO:1-323. In some embodiments, the detecting comprises detecting expression of at least 25 genes chosen from the genes represented by SEQ ID NO:1-323. In some embodiments, the detecting comprises detecting expression of at least 50 genes chosen from the genes represented by SEQ ID NO:1-323. In some embodiments, the detecting comprises detecting expression of at least 75 genes chosen from the genes represented by SEQ ID NO:1-323. In some embodiments, the detecting comprises detecting expression of at least 100 genes chosen from the genes represented by SEQ ID NO:1-323. In some embodiments, the detecting comprises detecting expression of at least 125 genes chosen from the genes represented by SEQ ID NO:1-323. In some embodiments, the detecting comprises detecting expression of at least 150 genes chosen from the genes represented by SEQ ID NO:1-323. In some embodiments, the detecting comprises detecting expression of at least 175 genes chosen from the genes represented by SEQ ID NO:1-323. In some embodiments, the detecting comprises detecting expression of at least 200 genes chosen from the genes represented by SEQ ID NO:1-323. In some embodiments, the detecting comprises detecting expression of at least 225 genes chosen from the genes represented by SEQ ID NO:1-323. In some embodiments, the detecting comprises detecting expression of all of the genes represented by SEQ ID NO:1-323.
In some embodiments, expression of the gene is detected by measuring amounts of transcripts of the gene in the patient peripheral blood cells. In other embodiments, expression of the gene is detected by detecting hybridization of at least a portion of the gene or a transcript thereof to a nucleic acid molecule comprising a portion of the gene or a transcript thereof in a nucleic acid array. In some embodiments, expression of the gene is detected by detecting the production of a protein encoded by the gene. In further embodiments, the level of expression of the gene in the peripheral blood cells of a normal control has been predetermined.
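By way of a non-limiting illustration, the comparison step of the diagnostic method can be sketched in Python as follows. The gene identifiers, the direction table, and the log2 fold-change threshold are all hypothetical placeholders: the actual genes and directions are those indicated in Tables 2-5, and the text does not specify a numerical cutoff. Expression values are assumed to be on a log2 intensity scale.

```python
from typing import Dict, List

# Hypothetical identifiers and directions, for illustration only.
# The real genes and directions are those listed in Tables 2-5.
COPD_DIRECTION: Dict[str, str] = {"GENE_A": "up", "GENE_B": "down"}


def regulated_in_copd_direction(patient: float, control: float,
                                direction: str,
                                min_log2_fold: float = 1.0) -> bool:
    """Return True if the patient value differs from the normal control
    in the COPD-associated direction by at least min_log2_fold
    (an assumed cutoff, not taken from the text)."""
    diff = patient - control
    if direction == "up":
        return diff >= min_log2_fold
    if direction == "down":
        return diff <= -min_log2_fold
    raise ValueError(f"unknown direction: {direction!r}")


def genes_matching_copd(patient: Dict[str, float],
                        control: Dict[str, float],
                        min_log2_fold: float = 1.0) -> List[str]:
    """List the genes whose expression is regulated in the COPD direction."""
    return [
        gene for gene, direction in COPD_DIRECTION.items()
        if gene in patient and gene in control
        and regulated_in_copd_direction(patient[gene], control[gene],
                                        direction, min_log2_fold)
    ]


patient_sample = {"GENE_A": 9.5, "GENE_B": 6.0}  # hypothetical log2 values
normal_control = {"GENE_A": 8.0, "GENE_B": 7.5}  # predetermined control levels
print(genes_matching_copd(patient_sample, normal_control))
```

The control dictionary stands in for the predetermined normal control levels mentioned above. How many matching genes would support a diagnosis is left open here, as the embodiments contemplate panels of varying size.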
The present invention also provides a method to monitor the treatment of a patient with COPD, comprising detecting the level of expression of at least one gene in a sample of peripheral blood cells isolated from a patient undergoing treatment for COPD, wherein the gene is chosen from the genes represented by any one of SEQ ID NO: 1-324, and wherein the level of expression of each of the genes represented by any one of SEQ ID NO: 1-324 is associated with COPD as measured by either upregulation or downregulation of gene expression in peripheral blood cells from patients with COPD as compared to the level of expression of the genes in peripheral blood cells from normal controls; and comparing the level of expression of the gene from the patient sample to the level of expression of the gene in a prior sample of peripheral blood cells from the patient, wherein detection of a change in the level of expression of the gene, as compared to the level of expression in the prior sample, toward the level of the expression of the gene in a normal control sample, indicates that the treatment for COPD is producing a beneficial result.
In some embodiments, detection of a change in the level of expression of the gene, as compared to the level of expression in the prior sample, away from the level of the expression of the gene in a normal control sample, indicates a progression of the COPD. In other embodiments, the detection of no significant change in the level of expression of the gene, as compared to the level of expression in the prior sample, indicates no significant change in the progression or treatment of the COPD in the patient.
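The three monitoring outcomes described above (change toward the normal level, change away from it, or no significant change) can be sketched in code. This is a minimal illustration under stated assumptions, not the claimed method: the function name `classify_change` and the tolerance `tol` are hypothetical, and a real assay would define "significant change" statistically rather than with a fixed threshold.

```python
def classify_change(prior, current, normal, tol=0.1):
    """Classify the change in one gene's expression between two patient samples.

    prior, current: expression level in the earlier and the later patient sample
    normal: expression level of the same gene in a normal control
    tol: hypothetical minimum difference treated as a significant change
    """
    # No significant change between the prior and the current sample.
    if abs(current - prior) <= tol:
        return "no significant change"

    # Compare distance from the normal level before and after.
    d_prior = abs(prior - normal)
    d_current = abs(current - normal)
    if d_current < d_prior:
        return "toward normal"      # interpreted as a beneficial treatment result
    if d_current > d_prior:
        return "away from normal"   # interpreted as progression of COPD
    return "indeterminate"          # same distance from normal, e.g. an overshoot


if __name__ == "__main__":
    # Hypothetical values for a gene that is downregulated in COPD.
    print(classify_change(prior=12.0, current=40.0, normal=63.0))
```

In practice the same comparison would be repeated for each monitored gene, and the per-gene results combined into an overall assessment.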
The invention also provides a plurality of polynucleotides for the detection of the expression of genes that are indicative of COPD in a patient or a preclinical disposition therefor, wherein the plurality of polynucleotides consists of at least two polynucleotides, wherein each polynucleotide is at least 5 nucleotides in length, and wherein each polynucleotide is complementary to an RNA transcript, or nucleotide derived therefrom, of a gene that is regulated differently in individuals with COPD as compared to individuals that do not have COPD. In some embodiments, each polynucleotide is complementary to an RNA transcript, or a polynucleotide derived therefrom, of a gene represented by any one of SEQ ID NO: 1-324. In some embodiments, the plurality of polynucleotides comprises polynucleotides that are complementary to an RNA transcript, or a nucleotide derived therefrom, of at least two genes represented by any one of SEQ ID NO: 1-324. In some embodiments, the plurality of polynucleotides comprises polynucleotides that are complementary to an RNA transcript, or a nucleotide derived therefrom, of at least five genes represented by any one of SEQ ID NO: 1-324. In some embodiments, the plurality of polynucleotides comprises polynucleotides that are complementary to an RNA transcript, or a nucleotide derived therefrom, of at least 10 genes represented by any one of SEQ ID NO: 1-324. In some embodiments, the plurality of polynucleotides comprises polynucleotides that are complementary to an RNA transcript, or a nucleotide derived therefrom, of at least 50 genes represented by any one of SEQ ID NO: 1-324. In some embodiments, the plurality of polynucleotides comprises polynucleotides that are complementary to an RNA transcript, or a nucleotide derived therefrom, of at least 100 genes represented by any one of SEQ ID NO: 1-229, and 320-324.
In some embodiments, the plurality of polynucleotides comprises polynucleotides that are complementary to an RNA transcript, or a nucleotide derived therefrom, of at least 150 genes represented by any one of SEQ ID NO: 1-229, and 320-324. In some embodiments, the plurality of polynucleotides comprises polynucleotides that are complementary to an RNA transcript, or a nucleotide derived therefrom, of at least 200 genes represented by any one of SEQ ID NO: 1-229, and 320-324. In some embodiments, the polynucleotides are immobilized on a substrate. In some embodiments, the polynucleotides are hybridizable array elements in a microarray. In some embodiments, the polynucleotides are conjugated to detectable markers.
The invention further provides a method to diagnose chronic obstructive pulmonary disease (COPD) in a patient, comprising, detecting the level of expression of at least one gene in a sample of peripheral blood cells isolated from a patient to be tested, wherein the gene is chosen from a group of genes, each of which has been previously identified to be upregulated or downregulated in the peripheral blood cells of patients who have been diagnosed with COPD, as compared to the level of expression of the gene in normal control peripheral blood cells; and comparing the level of expression of the gene from the patient sample to the level of expression of the gene in normal control peripheral blood cells, wherein detection of regulation of the expression of the gene in the patient sample in the direction associated with COPD as indicated by the previous identification, indicates a diagnosis of COPD in the patient.
The invention further provides a method to identify a compound with the potential to treat or prevent chronic obstructive pulmonary disease (COPD), comprising contacting a test compound with a cell that expresses a gene selected from any one or more of the genes represented by any one of SEQ ID NO: 1-324; identifying compounds that increase the expression or activity of genes represented by any one of SEQ ID NO: 1-324 or the proteins encoded thereby that are downregulated in peripheral blood cells of patients with COPD as compared to peripheral blood cells of normal controls, or that decrease the expression or activity of genes represented by any one of SEQ ID NO: 1-324 or the proteins encoded thereby that are upregulated in peripheral blood cells of patients with COPD as compared to peripheral blood cells of normal controls.
In some embodiments, the invention provides a method to treat a patient with COPD, comprising administering to the patient a therapeutic composition comprising a compound identified by the method above.
In some embodiments, detection of a change in the level of expression of at least one gene in methods of the invention comprises detecting the presence of a protein. In other embodiments, the method further comprises detecting the presence of the protein using a reagent that specifically binds to the protein. In some embodiments, the reagent is selected from the group consisting of an antibody, an antibody derivative, and an antibody fragment.
The invention also provides a plurality of reagents for the detection of the expression of genes that are indicative of COPD in a patient or a preclinical disposition therefor, wherein the plurality of reagents consists of at least two reagents, each of which specifically binds to a protein, wherein each protein is at least 15 amino acids in length, and wherein each protein is encoded by a gene that is regulated differently in individuals with COPD as compared to individuals that do not have COPD.
In some embodiments, each protein is encoded by a gene represented by any one of SEQ ID NO: 1-324. In some embodiments, at least two proteins are encoded by a gene represented by any one of SEQ ID NO: 1-324. In some embodiments, at least five proteins are encoded by a gene represented by any one of SEQ ID NO: 1-324. In some embodiments, at least 10 proteins are encoded by a gene represented by any one of SEQ ID NO: 1-324. In some embodiments, at least 50 proteins are encoded by a gene represented by any one of SEQ ID NO: 1-324. In some embodiments, at least 100 proteins are encoded by a gene represented by any one of SEQ ID NO: 1-229, and 320-324. In some embodiments, at least 150 proteins are encoded by a gene represented by any one of SEQ ID NO: 1-229, and 320-324. In some embodiments, at least 200 proteins are encoded by a gene represented by any one of SEQ ID NO: 1-229, and 320-324. In some embodiments, the reagents are immobilized on a substrate. In some embodiments, the reagents are selected from the group consisting of an antibody, an antibody derivative, and an antibody fragment, and each of said reagents is an element in a microarray. In some embodiments, the reagents are conjugated to detectable markers.
Brief Description of the Figures of the Invention
Fig. 1 shows the relationship of tobacco exposure to chronic obstructive pulmonary disease (COPD). Only approximately 10-20% of all tobacco exposed individuals develop COPD. However, tobacco exposure accounts for over 90% of COPD in developed countries. *Other causes of COPD include alpha-one antitrypsin deficiency, cystic fibrosis, environmental exposures, etc.
Fig. 2 shows examples of normal and accelerated loss of lung function over time. S1 and S2 denote two individuals with tobacco exposure. S1 has accelerated loss of lung function with onset of symptoms at a young age, resulting in early diagnosis of COPD. S2 also has greater than expected loss of lung function but remains asymptomatic over a lifetime without clinical diagnosis of COPD. NS is a normal nonsmoker with the expected rate of decline in lung function over time.
Fig. 3A and B are dendrograms of COPD and Normal samples, clustered using centered correlation and average linkage, and Fig. 3C is the tile plot of differentially expressed genes. Fig. 3A shows unsupervised clustering based on 15022 genes. Fig. 3B shows supervised clustering based on 240 genes. Fig. 3C shows a tile plot of the 240 genes which preliminarily discriminate between COPD and normal PBMC samples. Darker areas represent high expression and lighter areas represent relatively low expression.
Fig. 4 is a schematic drawing showing the study methods used in Example 2.
Fig. 5 is a schematic drawing showing the validation protocol for differentially expressed genes.
Description of the Invention
The present invention generally relates to the identification of a large number of genes that are regulated differentially in individuals with chronic pulmonary obstructive disease (COPD) as compared to individuals that do not have this disease, and particularly, to the identification of how these genes are regulated during disease. In addition, this invention generally relates to diagnostic and prognostic assays and kits for COPD, as well as the identification of targets for therapeutic prevention and intervention strategies. According to the present invention, the terms "chronic pulmonary obstructive disease", its acronym "COPD", and "emphysema" can be used interchangeably to describe the same condition.
More specifically, the present inventors have studied gene expression profiles in circulating peripheral blood mononuclear cells (PBMC) as surrogate markers for COPD. Given the systemic inflammatory changes associated with COPD and the knowledge that circulating lymphocytes traffic between the lungs and the peripheral circulation, the present inventors hypothesized that clinically relevant changes in gene expression could be observed in PBMC of COPD patients. The use of gene expression microarray technology is a powerful high-throughput method to investigate relative differences in gene expression between tobacco exposed individuals with and without COPD as well as to identify novel candidate genes and/or candidate therapeutic targets for future study and intervention. PBMC expression profiling is a powerful tool to identify prognostic indicators, disease subclassifications, candidate genes, and response to therapy in a diverse array of disease states (6,12). However, the global analysis of gene expression utilizing minimally invasive methods in COPD has never been conducted. The present inventors provide data herein showing that there is a distinct gene expression profile among individuals with COPD. The identification of distinct expression profiles between tobacco exposed individuals both with and without COPD is useful for the analysis of disease pathogenesis and immunoregulatory mechanisms, and such profiles are further useful as biomarkers that can identify COPD susceptible individuals while in a pre-clinical state.
The use of peripheral blood mononuclear cells (PBMC) as surrogate markers of other diseases has been previously described in the medical literature. PBMC expression analyses have been used to identify prognostic indicators, disease subclassifications, candidate genes, and response to therapy in disease states including acute leukemia (6,7), sickle cell disease (9), connective tissue disease (8,10), multiple sclerosis (11), and colorectal cancer (12). Recently, the present inventors and colleagues published a report using PBMC expression profiling to identify differentially expressed genes between idiopathic pulmonary arterial hypertension and secondary pulmonary hypertension. These studies showed that one can prospectively discriminate between patients with pulmonary hypertension and normal controls based solely on expression profiles, thus supporting the concept of an "immunologic signature" associated with a certain disease process. However, as discussed above, global analysis of peripheral blood gene expression in COPD had never been conducted prior to the present invention. Moreover, it is not necessarily intuitive that circulating blood cells will carry information related to COPD. T lymphocytes, which recognize antigens presented by PBMCs, respond to specific antigens, but not with a disease-specific response to a single antigen. Therefore, while screening PBMCs might be useful to distinguish between normal patients and those with one type of disease, the application of such technology to distinguish between normal patients and those with an entirely different disease is not intuitively operable, nor has such an approach been validated for COPD until the present invention. Current evidence supports a role for both genetic and environmental factors in COPD disease pathogenesis. However, past focus has been limited to a few genes and/or inflammatory mediators of known interest.
Although these approaches are valid, without being bound by theory, the present inventors believe that the study of one or two genes in a population of 5000 individuals may be less informative about the disease process than the study of 30,000 genes in fewer affected individuals. The use of gene expression microarrays to study tobacco exposed individuals both with and without COPD enables the comparison of thousands of expression transcripts between groups and has resulted in the discovery of novel genes of interest, new diagnostic tools, disease subclassifications, and new candidate therapeutic targets. The present inventors' analysis methods have the advantage of generating new hypotheses and investigative pathways based on the study of fewer individuals through minimally invasive methods. The ability to conduct research in human subjects and not just animal models is further likely to enhance the discovery and understanding of this common and underappreciated disease within the target population. The large amounts of data that can be generated from a microarray gene expression study enable the inventors to capture many simultaneous processes and convert these findings into meaningful, quantitative, and reproducible data.
In addition to the discovery of biomarkers that can be used individually or in any combination in assays and kits for the diagnosis of, prognosis of, or other evaluation or study of COPD, the present inventors have uncovered a number of genes not previously recognized to play a role in the disease process of COPD, which can now be studied in more detail and/or be used as targets for the discovery of other modulators of disease or therapeutic agents.
The present inventors have identified multiple genes, the expression of which is regulated differentially in peripheral blood cells (PBC; also referred to herein as peripheral blood mononuclear cells, or PBMC) of patients with COPD as compared to subjects without COPD. One set of genes identified as being differentially expressed in patients with COPD versus normal controls is shown in Table 2.
Table 2. Genes Identified As Being Differentially Expressed In Patients With COPD Versus Normal Controls
Figure imgf000014_0001
Figure imgf000015_0001
Figure imgf000016_0001
Figure imgf000017_0001
Figure imgf000018_0001
Figure imgf000019_0001
Figure imgf000020_0001
Figure imgf000021_0001
Figure imgf000022_0001
Figure imgf000023_0001
Table 2 shows the geometric means of intensities for the genes in both COPD patients (COPD column) and in normal controls (normal column) and provides a fold difference of the mean of intensities (fold change column). Using this information, one can clearly see whether a given gene is upregulated or downregulated in the peripheral blood cells of patients with COPD as compared to the normal control. This table shows the differentially expressed transcripts significant at p<0.01, sorted by fold change in the geometric mean intensities. Therefore, the first transcript in this table has the highest fold difference between COPD and normal control, and the last transcript in this table has the lowest fold difference, meaning that there is much greater expression in normal controls versus COPD. The genes are identified by name, by probe set identifier and by GenBank Accession numbers. All information associated with the publicly available identifiers and accession numbers in any of the tables described herein, including the nucleic acid sequences of the genes and probes, is incorporated herein by reference in its entirety. The SEQ ID NOs in Table 2 refer to the nucleotide sequence for the coding region of the gene, or, if the entire coding region is not available, whatever fragment of the coding sequence or genomic sequence is available.
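The fold difference described for Table 2 can be illustrated with a short sketch: take the geometric mean of the intensities in each group, divide the COPD value by the normal value, and sort by the result. The function names and the input intensities below are hypothetical and serve only to show the arithmetic; they are not data from the tables.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive intensities, computed on the log scale."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def fold_difference(copd, normal):
    """Ratio of geometric means: >1 suggests upregulation in COPD, <1 downregulation."""
    return geometric_mean(copd) / geometric_mean(normal)


def rank_by_fold(genes):
    """Sort genes from highest to lowest fold difference.

    genes: dict mapping a gene name to (copd_intensities, normal_intensities)
    """
    rows = [(name, fold_difference(c, n)) for name, (c, n) in genes.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical intensities, for illustration only.
    example = {
        "geneA": ([4.0, 16.0], [2.0, 2.0]),
        "geneB": ([1.0, 1.0], [2.0, 8.0]),
    }
    for name, fold in rank_by_fold(example):
        print(name, round(fold, 3))
```

With this ordering, the most strongly upregulated transcripts appear first and the most strongly downregulated transcripts appear last, matching the layout described for the table.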
Table 3 also reflects a listing of genes identified by the inventors as being differentially expressed in patients with COPD versus normal controls and in this table, results have been grouped into the following main categories: (1) genes that are selectively (i.e., exclusively or uniquely) upregulated in PBMCs of patients with COPD as compared to normal controls (Table 3); and (2) genes that are selectively downregulated in PBMCs of patients with COPD as compared to normal controls (Table 3). Again, the genes are identified by name, by probe set identifier and by GenBank Accession numbers.
Table 3. Genes Selectively Upregulated In Patients With COPD Versus Normal Controls
Figure imgf000024_0001
Figure imgf000025_0001
Figure imgf000026_0001
Tables 4 and 5 also reflect a listing of genes identified by the inventors as being differentially expressed in patients with COPD versus normal controls. These tables show the differentially expressed transcripts significant at p<0.005, sorted by fold change in the geometric mean intensities. PSEM refers to "past smoker emphysema" and is equivalent to the category of "COPD" as set forth in Table 2. NSNL refers to "non-smoker normal" and is equivalent to the category of "Normal" as set forth in Table 2. Again, the genes are identified by name, by probe set identifier and by GenBank Accession numbers. In Tables 4 and 5, 22277 genes were filtered, and 13909 of these passed filtering criteria. A two-sample T-test (with randomized variance model) was used. The Multivariate Permutations test was computed based on 1000 permutations. The nominal significance level of each univariate test was 0.005. The confidence level of false discovery rate assessment was 50%. The maximum allowed number of false-positive genes was 10, and the maximum allowed proportion of false-positive genes was 0.1. In Table 4, the number of genes significant at the 0.005 level of the univariate test was 66, and the probability of getting at least 66 genes significant by chance (at the 0.005 level) if there are no real differences between the classes was 0.283. In Table 5, the number of genes significant at the 0.005 level of the univariate test was 77, and the probability of getting at least 77 genes significant by chance (at the 0.005 level) if there are no real differences between the classes was 0.285. In Table 4, the predicted number of false discoveries among the first 10 genes is 10, and the predicted proportion of false discoveries among the first 0 genes is 10%. In Table 5, the predicted number of false discoveries among the first 21 genes is 10, and the predicted proportion of false discoveries among the first 6 genes is 10%.
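To put the reported chance probabilities in context, the sketch below computes how many genes would be expected to pass a 0.005 threshold by chance alone among 13909 tested genes, and shows a simple label-permutation test for a single gene. This is a simplified stand-in and an assumption on our part: it is not the randomized variance T-test or the Multivariate Permutations test named above, and the function names and example values are hypothetical.

```python
import random


def expected_false_positives(n_tests, alpha):
    """Expected number of tests passing the threshold if no gene truly differs."""
    return n_tests * alpha


def permutation_pvalue(group_a, group_b, n_perm=1000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    Class labels are shuffled n_perm times; the p-value is the fraction of
    shuffles whose absolute mean difference is at least the observed one
    (with the usual +1 correction so the estimate is never zero).
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)

    def mean_diff(values):
        a, b = values[:n_a], values[n_a:]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = mean_diff(pooled)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if mean_diff(pooled) >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)


if __name__ == "__main__":
    # 13909 genes passed filtering; nominal per-gene significance level 0.005.
    print(expected_false_positives(13909, 0.005))
    # Hypothetical intensities for one gene in two classes.
    print(permutation_pvalue([10, 11, 12, 13, 14], [1, 2, 3, 4, 5]))
```

About 69.5 genes would be expected to pass by chance under this simplification, which is of the same order as the 66 and 77 genes reported. That is consistent with the reported probabilities of roughly 0.28 for obtaining at least that many significant genes when no real differences exist.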
Table 4. Genes identified as being differentially expressed in patients with COPD versus normal controls.
Figure imgf000027_0001
Figure imgf000028_0001
Figure imgf000029_0001
Figure imgf000030_0001
Figure imgf000031_0001
Table 5. Genes identified as being differentially expressed in patients with COPD versus normal controls.
Figure imgf000031_0002
| Parametric p-value | Mean of intensities in PSEM | Mean of intensities in NSNL | Fold difference of geom means | Probe set | Description | UG cluster | Gene symbol |
|---|---|---|---|---|---|---|---|
| 0.0001105 | 5277.87 | 1116.924 | 4.725 | 211654_x_at | major histocompatibility complex, class II, DQ beta 1; SEQ ID NO:231 | Hs.73931 | HLA-DQB1 |
| 0.0001106 | 598.891 | 34.639 | 17.29 | 212999_x_at | major histocompatibility complex, class II, DQ beta 1; SEQ ID NO:231 | Hs.73931 | HLA-DQB1 |
| 0.0001255 | 34.683 | 10.679 | 3.248 | 216814_at | ; SEQ ID NO:290 | | |
| 0.0001685 | 12.268 | 63.498 | 0.193 | 207798_s_at | ataxin 2 related protein; SEQ ID NO:258 | Hs.43509 | A2LP |
| 0.0003715 | 239.102 | 96.001 | 2.491 | 220162_s_at | caspase recruitment domain family, member 9; SEQ ID NO:234 | Hs.271815 | CARD9 |
| 0.0003822 | 326.747 | 641.328 | 0.509 | 209870_s_at | amyloid beta (A4) precursor protein-binding, family A, member 2 (X11-like); SEQ ID NO: | Hs.26468 | APBA2 |
| 0.0004507 | 11.835 | 38.314 | 0.309 | 215692_s_at | chromosome 11 open reading frame 8; SEQ ID NO:259 | Hs.46638 | C11orf8 |
| 0.0007448 | 71.787 | 119.867 | 0.599 | 207498_s_at | cytochrome P450, subfamily IID (debrisoquine, sparteine, etc., -metabolizing), polypeptide 6; SEQ ID NO:279 | Hs.333497 | CYP2D6 |
| 0.000888 | 140.533 | 251.152 | 0.56 | 221645_s_at | zinc finger protein 83 (HPF1); SEQ ID NO:276 | Hs.305953 | ZNF83 |
| 0.0011876 | 196.535 | 343.84 | 0.572 | 202438_x_at | Homo sapiens mRNA; cDNA DKFZp434G012 (from clone DKFZp434G012), mRNA sequence; SEQ ID NO:315 | Hs.303154 | |
| 0.001225 | 1240.817 | 3406.64 | 0.364 | 211656_x_at | major histocompatibility complex, class II, DQ beta 1; SEQ ID NO:231 | Hs.73931 | HLA-DQB1 |
| 0.0012391 | 193.937 | 366.926 | 0.529 | 204553_x_at | inositol polyphosphate-4-phosphatase, type I, 107kDa; SEQ ID NO:271 | Hs.32944 | INPP4A |
| 0.0012462 | 96.222 | 51.635 | 1.864 | 207034_s_at | GLI-Kruppel family member GLI2; SEQ ID NO:240 | Hs.111867 | GLI2 |
| 0.0012933 | 399.647 | 894.951 | 0.447 | 215177_s_at | integrin, alpha 6; SEQ ID NO:266 | Hs.227730 | ITGA6 |
| 0.0014275 | 1737.224 | 1068.632 | 1.626 | 204214_s_at | RAB32, member RAS oncogene family; SEQ ID NO:53 | Hs.32217 | RAB32 |
| 0.001619 | 5236.698 | 2859.16 | 1.832 | 201666_at | tissue inhibitor of metalloproteinase 1 (erythroid potentiating activity, collagenase inhibitor); SEQ ID NO:241 | Hs.433425 | TIMP1 |
| 0.0018609 | 259.179 | 480.63 | 0.539 | 214698_at | ROD1 regulator of differentiation 1 (S. pombe); SEQ ID NO:273 | Hs.374634 | ROD1 |
| 0.0020119 | 146.181 | 315.775 | 0.463 | 218340_s_at | hypothetical protein FLJ10808; SEQ ID NO:267 | Hs.59838 | FLJ10808 |
| 0.002022 | 55.264 | 146.737 | 0.377 | 212096_s_at | AT2 receptor-interacting protein 1; SEQ ID NO:262 | Hs.7946 | ATIP1 |
| 0.0021169 | 495.018 | 253.682 | 1.951 | 214151_s_at | phosphatidylinositol glycan, class P; SEQ ID NO:238 | Hs.247118 | PIGB |
| 0.0021708 | 160.517 | 258.718 | 0.62 | 202613_at | CTP synthase; SEQ ID NO:284 | Hs.251871 | CTPS |
| 0.0021732 | 156.556 | 51.186 | 3.059 | 204555_s_at | protein phosphatase 1, regulatory subunit 3D; SEQ ID NO:3 | Hs.42215 | PPP1R3D |
| 0.0022344 | 236.002 | 382.391 | 0.617 | 217223_s_at | breakpoint cluster region; SEQ ID NO:282 | Hs.234799 | BCR |
| 0.0023812 | 310.282 | 549.738 | 0.564 | 219118_at | FK506 binding protein 11, 19 kDa; SEQ ID NO:277 | Hs.24048 | FKBP11 |
Figure imgf000033_0001
Figure imgf000034_0001
It should be noted that certain genes appear more than once in the tables provided herein, and in some cases, a gene may appear by name in both the "upregulated" and the "downregulated" category. This is because well-annotated genes often have multiple probe sets that one can use to identify the gene, and also because various isotypes of certain genes may be included, where there is some variation in the isotype sequence that is reflected by the various probe sets on the microarray chip (i.e., the probe sets are capable of differentiating between different isotypes of the same gene). As such, one isotype may be upregulated as compared to normal controls, whereas a second isotype may be downregulated as compared to normal controls.
Accordingly, in one embodiment of the present invention, the genes identified as being regulated (upregulated or downregulated) in PBMCs of patients with COPD can be used as endpoints or markers (also called "biomarkers") in a diagnostic or prognostic assay for COPD. The biomarkers include any of the genes listed in any of the tables presented herein (e.g., Tables 2-5). Diagnostic assays include assays that determine whether a patient has overt COPD or preclinical stage COPD. Prognostic assays can be used to stage a patient's development of COPD, predict a patient's outcome or disease progression, and/or monitor the effectiveness of various treatment protocols on COPD.
The term "biomarker" as used herein can refer to an endpoint gene described herein or to the protein encoded by that gene. In addition, the term "biomarker" can be generally used to refer to any portion of such a gene or protein that can identify or correlate with the full-length gene or protein, for example, in an assay of the invention. According to the present invention, an "endpoint gene" or "biomarker gene" is any gene, the expression of which is regulated (up or down) in a patient with a condition as compared to a normal control. Selected sets of one, two, three, and more preferably several more of the genes of this invention (up to the number equivalent to all of the genes, including any intervening number, in whole number increments, e.g., 1, 2, 3, 4, 5, 6...) can be used as end-points for rapid diagnostics or prognostics for COPD. Preferably, larger numbers of the genes identified in any one or more of Tables 2-5 are used in an assay of the invention (e.g., at least 10 genes or more), since the accuracy of the assay improves as the number of genes screened increases.
According to the present invention, the method includes the step of detecting the expression of at least one, and preferably more than one (e.g., 2, 3, 4, 5, 6,...and so on, in increments of whole numbers up to all of the genes) of the genes that have now been shown to be selectively regulated in PBMCs of patients with COPD by the present inventors. As used herein, the term "expression", when used in connection with detecting the expression of a gene of the present invention, can refer to detecting transcription of the gene and/or to detecting translation of the gene. To detect expression of a gene refers to the act of actively determining whether a gene is expressed or not. This can include determining whether the gene expression is upregulated as compared to a control, downregulated as compared to a control, or substantially unchanged as compared to a control. Therefore, the step of detecting expression does not require that expression of the gene actually is upregulated or downregulated, but rather, can also include detecting no expression of the gene or detecting that the expression of the gene has not changed or is not different (i.e., detecting no significant expression of the gene or no significant change in expression of the gene as compared to a control).
The present method includes the step of detecting the expression of at least one gene that is selectively regulated in PBMCs of a patient with COPD. In a preferred embodiment, the step of detecting includes detecting the expression of at least 2 genes, and preferably at least 3 genes, and more preferably at least 4 genes, and more preferably at least 5 genes, and more preferably at least 6 genes, and more preferably at least 7 genes, and more preferably at least 8 genes, and more preferably at least 9 genes, and more preferably at least 10 genes, and more preferably at least 11 genes, and more preferably at least 12 genes, and more preferably at least 13 genes, and more preferably at least 14 genes, and more preferably at least 15 genes, and more preferably at least 20 genes, and more preferably at least 25 genes, and more preferably at least 50 genes, and more preferably at least 75 genes, and more preferably at least 100 genes, and so on, in whole integer increments (i.e., 1, 2, 3,...10, 11, 12,...35, 36, 37,...56, 57, 58,...98, 99, 100,...106), up to detecting expression of all of the genes that can be used to detect COPD as disclosed herein. Analysis of a number of genes greater than one can be accomplished simultaneously, sequentially, or cumulatively. As discussed above, it is preferred that several (e.g., at least 10) and up to most or all of the genes be detected in the present methods, as the accuracy of the method improves as the number of genes detected increases. However, it is to be understood that in some circumstances, it may be desirable and sufficient to detect the expression of only one or a few genes.
In the diagnostic or prognostic method of the present invention, the gene(s) to be detected are preferably selected from the genes described in any one or more of Tables 2-5 (i.e., Table 2, Table 3, Table 4 or Table 5, or any combination thereof). These tables have been discussed above in detail and disclose genes that the present inventors have discovered to be selectively regulated in the PBMCs of patients with COPD. More specifically, these tables disclose the manner in which the genes are regulated (e.g., upregulated or downregulated) in a patient with COPD as compared to a normal control.
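Because the tables disclose the direction in which each gene is regulated in COPD, a multi-gene comparison can be sketched as a simple concordance count: for each measured gene, check whether the patient's expression differs from the normal control in the COPD-associated direction. This is an illustrative assumption, not the scoring used by the inventors. The directions and normal-control intensities for the four genes below are taken from Table 5 (fold difference above or below 1, and the NSNL means); the patient values and the function name `concordance` are hypothetical.

```python
# Direction of regulation in COPD versus normal, per Table 5 fold differences.
COPD_DIRECTION = {
    "CARD9": "up",     # fold difference 2.491
    "TIMP1": "up",     # fold difference 1.832
    "A2LP": "down",    # fold difference 0.193
    "ITGA6": "down",   # fold difference 0.447
}


def concordance(patient, normal, directions=COPD_DIRECTION):
    """Fraction of measured genes regulated in the COPD-associated direction.

    patient, normal: dicts mapping gene symbol to expression intensity
    """
    hits = 0
    measured = 0
    for gene, direction in directions.items():
        if gene not in patient or gene not in normal:
            continue  # skip genes that were not measured in both samples
        measured += 1
        ratio = patient[gene] / normal[gene]
        if (direction == "up" and ratio > 1) or (direction == "down" and ratio < 1):
            hits += 1
    return hits / measured if measured else 0.0


if __name__ == "__main__":
    # Normal values are the NSNL means from Table 5; patient values are hypothetical.
    normal = {"CARD9": 96.001, "TIMP1": 2859.16, "A2LP": 63.498, "ITGA6": 894.951}
    patient = {"CARD9": 200.0, "TIMP1": 5000.0, "A2LP": 70.0, "ITGA6": 400.0}
    print(concordance(patient, normal))
```

A count of this kind treats every gene equally and ignores the size of each change. A practical assay would more likely weight genes by effect size or use a trained classifier.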
It is to be understood that the organization of various genes into the present tables is for purposes of illustrating various experimental data described in the Examples section. The selection of genes to be detected in any given method can include any one or more of the genes in any one or more of Tables 2-5, and can include the detection of any combination of two or more of the genes in any one or more of Tables 2-5, and preferably includes the detection of any combination of multiple genes (e.g., at least 3, 4, 5, 6,...up to all of the genes) in any one or more of Tables 2-5. It is not mandatory that a given assay be restricted to the detection of all of the various genes in a single table, or to at least one gene in each table. In addition, one may choose also to detect other genes that are believed to be useful in the evaluation of a patient for COPD, and therefore, the present method is not limited exclusively to detection of the genes identified herein, although the invention is primarily directed to the detection of one or more of these genes and includes the detection of at least one or more of these genes. In addition, provided with this disclosure, one of skill in the art may proceed to identify additional genes that are differentially regulated in the PBMCs of patients with COPD, and detection of any of such genes may be used in the methods of the present invention, including in combination with detection of any of the genes disclosed herein. Indeed, the present inventors have now provided a powerful method to detect and evaluate biomarkers for COPD and have also provided data demonstrating the application of such technology.
Given the knowledge of the genes regulated in COPD according to the present invention, one of skill in the art will be able to select one or more genes (at least one gene, and preferably, two, three, four, or any number of additional genes) to detect in a method of the present invention, and the selection of the one or more genes can be determined based on the preferences of the person using the assays described herein. In one aspect, it may be desirable to preferentially select those genes for detection that are particularly highly regulated in patients with COPD in that they display the largest increases or decreases in expression levels in patients as compared to normal controls or as compared to the other form of COPD. The detection of such genes can be advantageous because the endpoint may be clearer and require less quantitation. The relative expression levels of the genes identified in the present invention are listed in the tables.
According to the present invention, a "baseline" or "control" can include a normal or negative control and/or a disease or positive control, against which a test level of gene expression can be compared. Therefore, it can be determined, based on the control or baseline level of gene expression, whether a sample to be evaluated for COPD has a measurable difference or substantially no difference in gene expression, as compared to the baseline level. In one aspect, the baseline control is indicative of the level of gene expression as expected in the PBMCs of a normal individual (e.g., healthy individual, negative control, or non-COPD patient). Therefore, the term "negative control" used in reference to a baseline level of gene expression typically refers to a baseline level of expression from a population of individuals which is believed to be normal (i.e., not having or developing COPD). In some embodiments of the invention, it may also be useful to compare the gene expression in a test sample of PBMCs to a baseline that has previously been established from a patient or population of patients with COPD. Such a baseline level, also referred to herein as a "positive control", refers to a level of gene expression established in PBMCs from one or preferably a population of individuals who had been positively diagnosed with COPD.
In one embodiment, when the goal is to monitor the progression or regression of COPD in a patient, for example, to monitor the efficacy of treatment of the disease or to determine whether a patient that appears to be predisposed to the disease begins to develop the disease, one baseline control can include the measurements of gene expression in a sample of PBMCs that was taken from the same patient in a prior test. In this embodiment, a new sample is evaluated periodically (e.g., at annual or more regular physicals), and any changes in gene expression in the patient PBMCs as compared to the prior measurement and, most typically, also with reference to the above-described normal and/or positive controls, are monitored. Monitoring of a patient's PBMC gene expression profile can be used by the clinician to prescribe or modify treatment for the patient based on whether any differences in gene expression in the PBMCs are indicated.
In a preferred embodiment, the control or baseline levels of gene expression are obtained from PBMCs collected from "matched individuals". According to the present invention, the phrase "matched individuals" refers to a matching of the control individuals on the basis of one or more characteristics, such as gender, age, race, or any relevant biological or sociological factor that may affect the baseline of the control individuals and the patient (e.g., preexisting conditions, consumption of particular substances, levels of other biological or physiological factors). The number of matched individuals from whom control samples must be obtained to establish a suitable control level (e.g., a population) can be determined by those of skill in the art, but should be statistically appropriate to establish a suitable baseline for comparison with the patient to be evaluated (i.e., the test patient). The values obtained from the control samples are statistically processed using any suitable method of statistical analysis to establish a suitable baseline level using methods standard in the art for establishing such values. It will be appreciated by those of skill in the art that a baseline need not be established for each assay as the assay is performed but rather, a baseline can be established by referring to a form of stored information regarding a previously determined control level of gene expression. Such a form of stored information can include, for example, but is not limited to, a reference chart, listing or electronic file of population or individual data regarding "normal" (negative control) or COPD-positive gene expression; a medical chart for the patient recording data from previous evaluations; or any other source of data regarding control gene expression that is useful for the patient to be diagnosed or evaluated.
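By way of illustration only, the establishment of a stored baseline from matched control samples can be sketched as follows. This is a minimal Python sketch under stated assumptions: the gene names, the expression values and the function names are hypothetical and are not taken from the tables herein, and the mean and standard deviation summary is merely one of many suitable methods of statistical analysis.

```python
from statistics import mean, stdev

def build_baseline(control_samples):
    """Summarize matched-control expression as {gene: (mean, sd)}.

    control_samples is a list of dicts mapping a gene name to an
    expression value, one dict per matched control individual.
    """
    genes = list(control_samples[0].keys())
    baseline = {}
    for gene in genes:
        values = [sample[gene] for sample in control_samples]
        baseline[gene] = (mean(values), stdev(values))
    return baseline

def z_score(baseline, gene, test_value):
    """Express a test value as standard deviations from the control mean."""
    mu, sd = baseline[gene]
    return (test_value - mu) / sd

# Hypothetical expression values for three matched control individuals.
controls = [
    {"GENE_A": 10.0, "GENE_B": 5.0},
    {"GENE_A": 12.0, "GENE_B": 5.5},
    {"GENE_A": 11.0, "GENE_B": 4.5},
]
baseline = build_baseline(controls)
test_z = z_score(baseline, "GENE_A", 14.0)
```

Once computed, such a baseline dictionary could be saved and reused as the stored information described above, so that it need not be re-established each time the assay is performed.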
Expression of the transcripts and/or proteins encoded by the genes of the invention is measured by any of a variety of known methods in the art. In general, the nucleic acid sequence of a nucleic acid molecule (e.g., DNA or RNA) in a patient sample can be detected by any suitable method or technique of measuring or detecting gene sequence or expression. Such methods include, but are not limited to, polymerase chain reaction (PCR), reverse transcriptase-PCR (RT-PCR), in situ PCR, in situ hybridization, Southern blot, Northern blot, sequence analysis, microarray analysis, detection of a reporter gene, or other DNA/RNA hybridization platforms. For RNA expression, preferred methods include but are not limited to: extraction of cellular mRNA and Northern blotting using labeled probes that hybridize to transcripts encoding all or part of one or more of the genes of this invention; amplification of mRNA expressed from one or more of the genes of this invention using gene-specific primers, polymerase chain reaction (PCR), and reverse transcriptase-polymerase chain reaction (RT-PCR), followed by quantitative detection of the product by any of a variety of means; extraction of total RNA from the cells, which is then labeled and used to probe cDNAs or oligonucleotides encoding all or part of the genes of this invention, arrayed on any of a variety of surfaces; in situ hybridization; and detection of a reporter gene. The term "quantifying" or "quantitating" when used in the context of quantifying transcription levels of a gene can refer to absolute or to relative quantification. Absolute quantification may be accomplished by inclusion of known concentration(s) of one or more target nucleic acids and referencing the hybridization intensity of unknowns with the known target nucleic acids (e.g., through generation of a standard curve).
Alternatively, relative quantification can be accomplished by comparison of hybridization signals between two or more genes, or between two or more treatments to quantify the changes in hybridization intensity and, by implication, transcription level.
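The two forms of quantification can be illustrated with a short Python sketch. The example assumes, purely for illustration, that hybridization intensity is linear in target concentration over the range of the standards; the concentrations, the intensities and the function names are hypothetical.

```python
def fit_standard_curve(concentrations, intensities):
    """Least-squares line through (concentration, intensity) standards.

    Returns (slope, intercept) of intensity = slope * concentration + intercept.
    """
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(intensities) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, intensities))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def absolute_quantity(intensity, slope, intercept):
    """Read an unknown's concentration off the standard curve."""
    return (intensity - intercept) / slope

def relative_change(intensity_a, intensity_b):
    """Relative quantification: ratio of two hybridization signals."""
    return intensity_a / intensity_b

# Hypothetical standards of known concentration (arbitrary units).
slope, intercept = fit_standard_curve([1.0, 2.0, 4.0, 8.0],
                                      [105.0, 205.0, 405.0, 805.0])
unknown = absolute_quantity(305.0, slope, intercept)
fold = relative_change(400.0, 100.0)
```

In practice a non-linear or log-transformed curve may fit the standards better; the linear fit is used here only to keep the sketch short.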
The present invention includes isolated proteins encoded by the genes identified by the inventors as being differentially expressed in patients with COPD versus normal controls; that is, the proteins listed in Tables 2-5 and encoded by SEQ ID NOs: 1- ), isolated proteins encoded by a sequence complementary thereto, or polypeptides encoded by a fragment, homologue, or variant of genes represented in Tables 2-5. Proteins, peptides and polypeptides that can be made using the genes, or derived from the sequence information of the genes, are also disclosed in the present invention. Functional forms of the proteins can be prepared as purified preparations by using a cloned gene as described herein. Alternatively, the proteins, peptides and polypeptides of the invention can be produced synthetically. Full length proteins or fragments corresponding to one or more particular motifs and/or domains or to arbitrary sizes, for example, at least about 5, 10, 25, 50, 75, or 100 amino acids in length, are within the scope of the present invention. Methods to measure protein expression levels of selected genes of this invention include, but are not limited to: Western blot, immunoblot, enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA), immunoprecipitation, surface plasmon resonance, chemiluminescence, fluorescent polarization, phosphorescence, immunohistochemical analysis, matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry, microcytometry, microarray, microscopy, fluorescence activated cell sorting (FACS), flow cytometry, and assays based on a property of the protein including but not limited to DNA binding, ligand binding, or interaction with other protein partners.
Nucleic acid arrays are particularly useful for detecting the expression of the genes of the present invention. The production and application of high-density arrays in gene expression monitoring have been disclosed previously in, for example, PCT Publication No. WO 97/10365; PCT Publication No. WO 92/10588; U.S. Patent No. 6,040,138; U.S. Patent No. 5,445,934; or PCT Publication No. WO 95/35505, all of which are incorporated herein by reference in their entireties. For further examples of arrays, see Hacia et al. (1996) Nature Genetics 14:441-447; Lockhart et al. (1996) Nature Biotechnol. 14:1675-1680; and De Risi et al. (1996) Nature Genetics 14:457-460, each of which is incorporated by reference in its entirety. In general, in an array, an oligonucleotide, a cDNA, or genomic DNA that is a portion of a known gene occupies a known location on a substrate. A nucleic acid target sample is hybridized with an array of such oligonucleotides and then the amount of target nucleic acids hybridized to each probe in the array is quantified. One preferred quantifying method is to use a confocal microscope and fluorescent labels. The Affymetrix GeneChip™ Array system (Affymetrix, Santa Clara, Calif.) and the Atlas™ Human cDNA Expression Array system are particularly suitable for quantifying the hybridization; however, it will be apparent to those of skill in the art that any similar systems or other effectively equivalent detection methods can also be used. In a particularly preferred embodiment, one can use the knowledge of the genes described herein to design novel arrays of polynucleotides, cDNAs or genomic DNAs for screening methods described herein. Such novel pluralities of polynucleotides are contemplated to be a part of the present invention and are described in detail below.
Suitable nucleic acid samples for screening on an array contain transcripts of interest or nucleic acids derived from the transcripts of interest (i.e., transcripts derived from the genes associated with COPD of the present invention). As used herein, a nucleic acid derived from a transcript refers to a nucleic acid for whose synthesis the mRNA transcript or a subsequence thereof has ultimately served as a template. Thus, a cDNA reverse transcribed from a transcript, an RNA transcribed from that cDNA, a DNA amplified from the cDNA, an RNA transcribed from the amplified DNA, etc., are all derived from the transcript and detection of such derived products is indicative of the presence and/or abundance of the original transcript in a sample. Thus, suitable samples include, but are not limited to, transcripts of the gene or genes, cDNA reverse transcribed from the transcript, cRNA transcribed from the cDNA, DNA amplified from the genes, RNA transcribed from amplified DNA, and the like. Preferably, such a sample is a total RNA preparation of a biological sample (e.g., peripheral blood mononuclear cells or PBMCs). More preferably in some embodiments, such a nucleic acid sample is the total mRNA isolated from such a biological sample. Preferably, the nucleic acids for screening are obtained from a homogenate of cells (e.g., peripheral blood mononuclear cells or PBMCs).
In general, typical clinical samples include, but are not limited to, sputum, blood, blood cells (e.g., peripheral blood mononuclear cells), tissue or fine needle biopsy samples, urine, peritoneal fluid, and pleural fluid, or cells therefrom. The present invention is primarily related to the detection of genes in peripheral blood mononuclear cells (PBMC or PBC).
In one embodiment, it is desirable to amplify the nucleic acid sample prior to hybridization. One of skill in the art will appreciate that whatever amplification method is used, if a quantitative result is desired, care must be taken to use a method that maintains or controls for the relative frequencies of the amplified nucleic acids to achieve quantitative amplification. Methods of "quantitative" amplification are well known to those of skill in the art. For example, quantitative PCR involves simultaneously co-amplifying a known quantity of a control sequence using the same primers. This provides an internal standard that may be used to calibrate the PCR reaction. The high-density array may then include probes specific to the internal standard for quantification of the amplified nucleic acid. Other suitable amplification methods include, but are not limited to, polymerase chain reaction (PCR) (Innis, et al., PCR Protocols. A guide to Methods and Application. Academic Press, Inc. San Diego, (1990)), ligase chain reaction (LCR) (see Wu and Wallace, Genomics, 4: 560 (1989), Landegren, et al., Science, 241: 1077 (1988) and Barringer, et al., Gene, 89: 117 (1990)), transcription amplification (Kwoh, et al., Proc. Natl. Acad. Sci. USA, 86: 1173 (1989)), and self-sustained sequence replication (Guatelli, et al., Proc. Nat. Acad. Sci. USA, 87: 1874 (1990)).
Nucleic acid hybridization involves contacting a probe and target nucleic acid under conditions where the probe and its complementary target can form stable hybrid duplexes through complementary base pairing. As used herein, hybridization conditions refer to standard hybridization conditions under which nucleic acid molecules are used to identify similar nucleic acid molecules. Such standard conditions are disclosed, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Labs Press, 1989. Sambrook et al., ibid., is incorporated by reference herein in its entirety (see specifically, pages 9.31-9.62). In addition, formulae to calculate the appropriate hybridization and wash conditions to achieve hybridization permitting varying degrees of mismatch of nucleotides are disclosed, for example, in Meinkoth et al., 1984, Anal. Biochem, 138, 267-284; Meinkoth et al., ibid., is incorporated by reference herein in its entirety. Nucleic acids that do not form hybrid duplexes are washed away from the hybridized nucleic acids and the hybridized nucleic acids can then be detected, typically through detection of an attached detectable label. It is generally recognized that nucleic acids are denatured by increasing the temperature or decreasing the salt concentration of the buffer containing the nucleic acids. Under low stringency conditions (e.g., low temperature and/or high salt) hybrid duplexes (e.g., DNA:DNA, RNA:RNA, or RNA:DNA) will form even where the annealed sequences are not perfectly complementary. Thus specificity of hybridization is reduced at lower stringency. Conversely, at higher stringency (e.g., higher temperature or lower salt) successful hybridization requires fewer mismatches.
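As a hedged illustration of the kind of calculation referred to above, the following Python sketch estimates a melting temperature for a DNA:DNA hybrid using the form of the equation commonly attributed to Meinkoth et al., ibid., and then lowers it by roughly 1°C per 1% of nucleotide mismatch, a widely used rule of thumb. The coefficients should be verified against the cited reference before any practical use; the function names and input values are illustrative assumptions.

```python
import math

def melting_temp_dna_dna(na_molar, percent_gc, percent_formamide, length_nt):
    """Approximate Tm (degrees C) of a perfectly matched DNA:DNA hybrid.

    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * percent_gc
            - 0.61 * percent_formamide
            - 500.0 / length_nt)

def mismatch_adjusted_tm(tm, percent_mismatch, drop_per_percent=1.0):
    """Lower Tm by an assumed ~1 degree C for each 1% mismatch."""
    return tm - drop_per_percent * percent_mismatch

# Illustrative inputs: 0.9 M Na+, 40% G + C, no formamide, 100-nt hybrid.
tm_perfect = melting_temp_dna_dna(0.9, 40.0, 0.0, 100)
tm_20_percent_mismatch = mismatch_adjusted_tm(tm_perfect, 20.0)
```

With these inputs the sketch gives a Tm of roughly 92°C for a perfect match and roughly 72°C when about 20% mismatch is to be tolerated; hybridization and wash temperatures would then be chosen some margin below the adjusted value.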
High stringency hybridization and washing conditions, as referred to herein, refer to conditions which permit isolation of nucleic acid molecules having at least about 80% nucleic acid sequence identity with the nucleic acid molecule being used to probe in the hybridization reaction (i.e., conditions permitting about 20% or less mismatch of nucleotides). Very high stringency hybridization and washing conditions, as referred to herein, refer to conditions which permit isolation of nucleic acid molecules having at least about 90% nucleic acid sequence identity with the nucleic acid molecule being used to probe in the hybridization reaction (i.e., conditions permitting about 10% or less mismatch of nucleotides). As discussed above, one of skill in the art can use the formulae in Meinkoth et al., ibid., to calculate the appropriate hybridization and wash conditions to achieve these particular levels of nucleotide mismatch. Such conditions will vary, depending on whether DNA:RNA or DNA:DNA hybrids are being formed. Calculated melting temperatures for DNA:DNA hybrids are 10°C less than for DNA:RNA hybrids. In particular embodiments, stringent hybridization conditions for DNA:DNA hybrids include hybridization at an ionic strength of 6X SSC (0.9 M Na+) at a temperature of between about 20°C and about 35°C (lower stringency), more preferably, between about 28°C and about 40°C (more stringent), and even more preferably, between about 35°C and about 45°C (even more stringent), with appropriate wash conditions. In particular embodiments, stringent hybridization conditions for DNA:RNA hybrids include hybridization at an ionic strength of 6X SSC (0.9 M Na+) at a temperature of between about 30°C and about 45°C, more preferably, between about 38°C and about 50°C, and even more preferably, between about 45°C and about 55°C, with similarly stringent wash conditions.
These values are based on calculations of a melting temperature for molecules larger than about 100 nucleotides, 0% formamide and a G + C content of about 40%. Alternatively, Tm can be calculated empirically as set forth in Sambrook et al., supra, pages 9.31 to 9.62. In general, the wash conditions should be as stringent as possible, and should be appropriate for the chosen hybridization conditions. For example, hybridization conditions can include a combination of salt and temperature conditions that are approximately 20-25°C below the calculated Tm of a particular hybrid, and wash conditions typically include a combination of salt and temperature conditions that are approximately 12-20°C below the calculated Tm of the particular hybrid. One example of hybridization conditions suitable for use with DNA:DNA hybrids includes a 2-24 hour hybridization in 6X SSC (50% formamide) at about 42°C, followed by washing steps that include one or more washes at room temperature in about 2X SSC, followed by additional washes at higher temperatures and lower ionic strength (e.g., at least one wash at about 37°C in about 0.1X-0.5X SSC, followed by at least one wash at about 68°C in about 0.1X-0.5X SSC). Other hybridization conditions, for example, those most useful with nucleic acid arrays, will be known to those of skill in the art. The hybridized nucleic acids are detected by detecting one or more labels attached to the sample nucleic acids. The labels may be incorporated by any of a number of means well known to those of skill in the art. Detectable labels suitable for use in the present invention include any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means.
Useful labels in the present invention include biotin for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g., fluorescein, Texas red, rhodamine, green fluorescent protein, yellow fluorescent protein and the like), radiolabels (e.g., ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads. Means of detecting such labels are well known to those of skill in the art. Thus, for example, radiolabels may be detected using photographic film or scintillation counters, and fluorescent markers may be detected using a photodetector to detect emitted light. Enzymatic labels are typically detected by providing the enzyme with a substrate and detecting the reaction product produced by the action of the enzyme on the substrate, and colorimetric labels are detected by simply visualizing the colored label.
The method of the present invention includes a step of comparing the results of detecting the expression of the one or more genes that are selectively regulated in patients with COPD as compared to a control (baseline normal or negative control) in order to determine whether there is any observed change or difference in expression of each gene in the patient as compared to the control. In one embodiment, a positive control (baseline COPD control) can also be used to assist in the confirmation of a diagnosis or prognosis. As discussed above, the present inventors have identified the expression profile of multiple genes that are differentially regulated in PBMCs of patients with COPD, as compared to a "normal" control (i.e., a patient that does not have or cannot be detected to have COPD), including the manner in which the genes are regulated (i.e., up- or downregulated). Therefore, one can determine whether peripheral blood cells from a test patient have a gene expression profile that is statistically substantially similar to the profile of gene expression of a patient with COPD, or whether a profile of gene expression in the peripheral blood cells of the test patient is statistically more similar to the negative or normal, non-disease control. According to the present invention, an expression profile is substantially similar to a given profile of expression established for a group (e.g., COPD group, normal control group) if the expression profile of the gene or genes detected (including the identity of the gene, the manner in which expression is regulated, and/or the level of expression of the gene) is similar enough to the expected result so as to be statistically significant (i.e., with at least a 95% confidence level, or p<0.05, and more preferably, with a confidence level of p<0.01, and even more preferably, with a confidence level of p<0.005, and even more preferably, with a confidence level of p<0.001).
Software programs and various statistical analysis techniques are available in the art that are capable of analyzing the expression of multiple genes and determining whether differences from a control are significant or not significant. For example, as discussed in the Examples, the gene expression measurements determined in patient samples were mean-centered and analyzed using the clustering, class comparison and class discovery functions of BRB ArrayTools and genes were selected that met the p value requirement (p<0.01). In addition, statistical analysis methods are known in the art and described herein (see above and the Examples) that are preferably used to analyze the expression data generated for patient samples (e.g., independent and "leave-one-out" cross-validation and/or permutation testing).
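For illustration, permutation testing of a single gene can be sketched in Python as follows. This is a generic exact two-sided permutation test on the difference in group means and is not a reproduction of the algorithms implemented in BRB ArrayTools; the expression values and the function name are hypothetical.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in group means.

    Every way of relabeling the pooled samples into groups of the
    original sizes is enumerated, so this suits small groups only.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    total = 0
    at_least_as_extreme = 0
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        relabeled_a = [pooled[i] for i in chosen]
        relabeled_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(relabeled_a) - mean(relabeled_b)) >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / total

# Hypothetical expression values for one gene in five patients per group.
copd = [8.1, 8.4, 8.0, 8.6, 8.3]
control = [6.9, 7.2, 7.0, 7.1, 6.8]
p_value = permutation_p_value(copd, control)
```

Because the two hypothetical groups do not overlap, only the observed labeling and its mirror image are as extreme as the data, giving p = 2/252, which would satisfy a p<0.01 requirement. For larger groups, random sampling of relabelings is normally substituted for full enumeration.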
By way of example, detection of the regulation of the expression of a gene in the "manner" associated with the established group, at a minimum, refers to the detection of the regulation of a gene that has now been shown by the present inventors to be selectively regulated in PBMCs of patients having COPD, at least in the same direction (i.e., upregulation or downregulation) and preferably at a similar or comparable level, as compared to a normal or baseline control established for the expression of that gene. Preferably, a gene identified as being upregulated or downregulated, as compared to a baseline control, is regulated in the same direction as the level of expression of the gene that is seen in established or confirmed patients with COPD as compared to a normal control. In other words, if "gene X" is upregulated in patients with COPD as compared to a normal control based on the inventors' discovery presented herein, then one determines whether the expression of gene X is upregulated in a patient test sample as compared to a normal control, or whether the expression of gene X is more similar to the level of expression of the normal control. In addition, a gene identified as being upregulated or downregulated, as compared to a baseline control, is regulated to at least about 10%, and more preferably at least 20%, and more preferably at least 25%, and more preferably at least 30%, and more preferably at least 35%, and more preferably at least 40%, and more preferably at least 45%, and more preferably at least 50%, and preferably at least 55%, and more preferably at least 60%, and more preferably at least 65%, and more preferably at least 70%, and more preferably at least 75%, and more preferably at least 80%, and more preferably at least 85%, and more preferably at least 90%, and more preferably at least 95%, or even higher (e.g., above 100%) of the level of expression of the gene that is seen in established or confirmed patients with COPD.
Statistical significance should be at least p<0.05, and more preferably, at least p<0.01, and more preferably, p<0.005, and even more preferably, p<0.001. As discussed above, one of skill in the art can use software programs available in the art that use algorithms to analyze gene expression profiles and identify significant differences among samples and controls. In addition, one of skill in the art can apply various types of analyses as discussed above (e.g., cross-validation and/or permutation testing) to validate the results of the methods described herein.
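The direction and magnitude comparison described above can be sketched in Python as follows. The sketch adopts one possible reading, namely that the change in the test sample relative to the normal control is expressed as a fraction of the change seen in confirmed COPD patients relative to the same control; that reading, the function name and the numerical values are assumptions made for illustration only.

```python
def regulation_matches(test_level, normal_level, copd_level, min_fraction=0.10):
    """Check direction and magnitude of regulation against a COPD reference.

    Returns True when the test sample departs from the normal control in
    the same direction as confirmed COPD samples do, by at least
    min_fraction of the COPD-versus-normal change.
    """
    copd_change = copd_level - normal_level
    if copd_change == 0:
        return False  # gene is not regulated in the reference group
    fraction = (test_level - normal_level) / copd_change
    # A negative fraction means regulation in the opposite direction.
    return fraction >= min_fraction

# Hypothetical levels: normal control 10, confirmed COPD 20 (upregulated).
up_match = regulation_matches(14.0, 10.0, 20.0, min_fraction=0.25)
opposite = regulation_matches(8.0, 10.0, 20.0)
# Hypothetical downregulated gene: normal control 10, confirmed COPD 4.
down_match = regulation_matches(7.0, 10.0, 4.0, min_fraction=0.50)
```

Such a check addresses only direction and magnitude; whether the difference is statistically significant would be assessed separately.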
A profile of individual gene biomarkers identified in a method of the invention, including a matrix of two or more markers, can be generated by one or more of the methods described above. According to the present invention, a profile of the genes regulated in a PBMC sample refers to a reporting of the expression level of a given gene that has been identified in any one or more of the tables presented herein, which, based on the knowledge of the regulation of the genes provided by the tables, includes a classification of the gene with regard to how the gene is regulated in PBMCs of a patient with COPD. For example, if a specific gene is identified as being expressed by a peripheral blood cell sample in a test patient, the profile for the blood cell sample will include the reporting of the expression of this gene as compared to one or more baseline controls (e.g., a negative/normal and/or a positive/COPD control). Preferably, the profile includes data for more than one (e.g., at least two), and preferably several genes (e.g., at least five, six, seven, eight, nine, ten, or more genes), such that a profile for the patient sample is created that can be compared to the control(s). The data can be reported as raw data, and/or statistically analyzed by any of a variety of methods, and/or combined with any other prognostic marker(s) for COPD, including any markers that are expressed in cells or tissues other than PBMCs and are useful for evaluating COPD in a patient. Prior to the present invention, one of skill in the art would not have known to screen patient peripheral blood cells for the particular genes in the tables provided herein, and particularly for any combinations of these genes, and one of skill in the art would not have been able to classify these genes or combinations thereof on the basis of COPD versus normal.
It will be appreciated by those of skill in the art that differences between the expression of genes in PBMCs of patients with COPD and without COPD may be small or large. Some small differences may be very reproducible and therefore are preferred for use in the diagnostic and prognostic methods of the invention. For other purposes, large differences may be desirable for ease of detection of the regulatory activity. It will therefore be appreciated that the exact boundary between a positive diagnosis and a negative diagnosis can shift, depending on the goal of the screening assay, the patient samples, the number of genes to be screened and the baseline controls used. For some assays, a given patient may be sampled over time to detect the efficacy of a treatment, and so changes in gene expression from a disease state toward a normal state may be detected. In this case, the patient may still be positive for COPD as compared to a normal, disease-free control, but may show a shift toward the normal control gene expression profile if treatment is successful. In addition, the technique being used for detection, as well as the number of genes which are being tested, may impact how the assay is evaluated by those of skill in the art.
The profile of genes provided as a result of the screening of peripheral blood cells of a patient can be used by the patient or physician for decision-making regarding the usefulness of therapies for COPD in general. The profile can be used to estimate how the disease is likely to respond and progress in any individual patient. Clinical trials can be developed to correlate the relationship between COPD regulated genes and the biological behavior of the diseased tissues, including in response to particular treatments for COPD.
In one aspect of this embodiment of the invention, the profiling of genes expressed by peripheral blood cells can be extended to other diseases, and particularly, to other pulmonary diseases wherein diagnosis or prognosis of disease is difficult due to limited access to diseased tissue or difficulty distinguishing between subtypes of the disease based on conventional assays (e.g., histology). For example, as discussed above, using the guidance provided herein, it is within the ability of those of skill in the art to perform a de novo screening assay for the identification of genes regulated in peripheral blood cells in patients having a different disease, and particularly, a pulmonary disease, and to develop gene expression profiles for use in diagnostic and/or prognostic screening for these diseases. Moreover, one of skill in the art can use the techniques described herein to screen other gene arrays, including arrays of expressed tag sequences, to discover additional novel genes that are regulated in the peripheral blood cells of patients with COPD. The extension of the gene profiles within COPD and also to other diseases will allow for the development of a variety of diagnostic assays in such diseases, as well as the identification of additional targets for therapeutic strategies.
It is to be understood that to perform the methods of the present invention, one of skill in the art can make use of any commercially available nucleotide or protein array, wherein hundreds or thousands of genes could be detected if desired. However, in the present method, one would use such an array to selectively screen only for the genes that are described as being useful for detection of COPD as disclosed herein, or to screen for such genes plus any other genes that would be useful as a predictor or analysis tool for COPD or for patients that have or may have COPD. In addition, the array can be designed to test for more than one disease condition in order to confirm or rule out other potential causes of a patient condition. For example, one may design an assay to screen for COPD as described herein, and also for pulmonary hypertension. In the specifically designed assays described herein, expression of the other non-informative genes in a large array can effectively be "ignored" or not screened. Alternatively, one of skill in the art can prepare nucleotide or protein arrays that are specifically designed to test for the expression of any combination of the genes of interest as described herein, alone or in combination with any other combination of genes that may be useful in evaluating a patient for COPD.
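The use of a large commercially available array while "ignoring" the non-informative genes can be illustrated by a short Python sketch. The gene names, signal values and function name are hypothetical; in practice the informative set would be drawn from the tables provided herein.

```python
def select_informative(array_readout, informative_genes):
    """Keep only the genes of interest from a full array readout.

    Returns (selected, missing): the signals for informative genes that
    are present on the array, and the informative genes that the array
    does not carry.
    """
    selected = {gene: signal for gene, signal in array_readout.items()
                if gene in informative_genes}
    missing = sorted(set(informative_genes) - set(array_readout))
    return selected, missing

# Hypothetical readout from a large array; only two genes are of interest.
readout = {"GENE_A": 812.0, "GENE_B": 95.5, "GENE_C": 40.2, "GENE_D": 1203.7}
selected, missing = select_informative(readout, {"GENE_A", "GENE_D", "GENE_Z"})
```

Reporting the informative genes that are absent from the array is a design choice intended to make clear when a commercially available array cannot cover the full set of genes to be screened.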
Another embodiment of the present invention relates to a plurality of polynucleotides for the detection of the expression of genes that are selectively regulated in peripheral blood cells of patients with COPD. The plurality of polynucleotides consists of, or consists essentially of, polynucleotide probes that are complementary to RNA transcripts, or nucleotides derived therefrom, of genes that have been identified herein as being selectively regulated in the peripheral blood cells of patients with COPD, and is therefore distinguished from previously known nucleic acid arrays and primer sets. The plurality of polynucleotides within the above limitation includes at least two or more polynucleotide probes (e.g., at least 2, 3, 4, 5, 6, and so on, in whole integer increments, up to all of the possible probes) that are complementary to RNA transcripts, or nucleotides derived therefrom, of genes identified by the present inventors. Such genes are selected from any of the genes listed in the tables provided herein. Multiple probes can also be used to detect the same gene or to detect different splice variants of the same gene.
In one embodiment, it is contemplated that additional genes that are not regulated in the peripheral blood cells of patients with COPD, or that are not presently known to be regulated in the peripheral blood cells of patients with COPD, can be added to the set of genes to be identified by the plurality of polynucleotides. Such genes would not be random genes, or large groups of unselected human genes, as are commercially available for detection now, but rather, would be specifically selected to complement the sets of genes identified by the present invention. For example, one of skill in the art may wish to add to the above-described plurality of polynucleotides one or more polynucleotides corresponding to (useful for identifying) genes that are of relevance because they are expressed by a particular tissue of interest (e.g., pulmonary tissue), are associated with the particular disease (COPD) but not necessarily with peripheral blood cells, or are associated with a particular cell, tissue or body function. The development of additional pluralities of polynucleotides (and antibodies, as disclosed below), which include both the above-described plurality and such additional selected polynucleotides, is explicitly contemplated by the present invention. In addition, using the techniques described herein, one of skill in the art may identify additional genes that are regulated in the peripheral blood cells of patients with COPD, and polynucleotides derived from such genes can be included in the plurality of polynucleotides described herein.
According to the present invention, a plurality of polynucleotides refers to at least 2, and more preferably at least 3, and more preferably at least 4, and more preferably at least 5, and more preferably at least 6, and more preferably at least 7, and more preferably at least 8, and more preferably at least 9, and more preferably at least 10, and so on, in increments of one, up to any suitable number of polynucleotides, including polynucleotides representing all of the genes described herein (e.g., 106), 500, 1000, 10⁴, 10⁵, or at least 10⁶ or more polynucleotides.
In accordance with the present invention, an isolated polynucleotide, or an isolated nucleic acid molecule, is a nucleic acid molecule that has been removed from its natural milieu (i.e., that has been subject to human manipulation), its natural milieu being the genome or chromosome in which the nucleic acid molecule is found in nature. As such, "isolated" does not necessarily reflect the extent to which the nucleic acid molecule has been purified, but indicates that the molecule does not include an entire genome or an entire chromosome in which the nucleic acid molecule is found in nature. The polynucleotides useful in the plurality of polynucleotides of the present invention are typically a portion of a gene (sense or non-sense strand) of the present invention that is suitable for use as a hybridization probe or PCR primer for the identification of a full-length gene (or portion thereof) in a given sample (e.g., a peripheral blood cell sample). An isolated nucleic acid molecule can include a gene or a portion of a gene (e.g., the regulatory region or promoter), for example, to produce a reporter construct according to the present invention. An isolated nucleic acid molecule that includes a gene is not a fragment of a chromosome that includes such gene, but rather includes the coding region and regulatory regions associated with the gene, but no additional genes naturally found on the same chromosome.
An isolated nucleic acid molecule can also include a specified nucleic acid sequence flanked by (i.e., at the 5' and/or the 3' end of the sequence) additional nucleic acids that do not normally flank the specified nucleic acid sequence in nature (i.e., heterologous sequences). An isolated nucleic acid molecule can include DNA, RNA (e.g., mRNA), or derivatives of either DNA or RNA (e.g., cDNA). Although the phrase "nucleic acid molecule" primarily refers to the physical nucleic acid molecule and the phrase "nucleic acid sequence" primarily refers to the sequence of nucleotides on the nucleic acid molecule, the two phrases can be used interchangeably, especially with respect to a nucleic acid molecule, or a nucleic acid sequence, being capable of encoding a protein. Preferably, an isolated nucleic acid molecule of the present invention is produced using recombinant DNA technology (e.g., polymerase chain reaction (PCR) amplification, cloning) or chemical synthesis.
The minimum size of a nucleic acid molecule or polynucleotide of the present invention is a size sufficient to encode a protein having a desired biological activity, sufficient to form a probe or oligonucleotide primer that is capable of forming a stable hybrid with the complementary sequence of a nucleic acid molecule encoding the natural protein (e.g., under moderate, high or very high stringency conditions), or to otherwise be used as a target in an assay or in any therapeutic method discussed herein. If the polynucleotide is an oligonucleotide probe or primer, the size of the polynucleotide can be dependent on nucleic acid composition and percent homology or identity between the nucleic acid molecule and a complementary sequence as well as upon hybridization conditions per se (e.g., temperature, salt concentration, and formamide concentration). The minimum size of a polynucleotide that is used as an oligonucleotide probe or primer is at least about 5 nucleotides in length, and preferably ranges from about 5 to about 50 or about 500 nucleotides or greater (1000, 2000, etc.), including any length in between, in whole number increments (i.e., 5, 6, 7, 8, 9, 10,...33, 34,...256, 257,...500...1000...), and more preferably from about 10 to about 40 nucleotides, and most preferably from about 15 to about 40 nucleotides in length. In one aspect, the oligonucleotide primer or probe is typically at least about 12 to about 15 nucleotides in length if the nucleic acid molecules are GC-rich and at least about 15 to about 18 bases in length if they are AT-rich. There is no limit, other than a practical limit, on the maximal size of a nucleic acid molecule of the present invention, in that the nucleic acid molecule can include a portion of a protein-encoding sequence or a nucleic acid sequence encoding a full-length protein.
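The dependence of usable probe length on base composition can be illustrated with a minimal sketch. It assumes the Wallace rule of thumb for short oligonucleotides (roughly 2 °C per A/T base and 4 °C per G/C base), which is not taken from this specification and is only a rough approximation; the function names are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature in degrees C using the Wallace rule.

    Assumption: Tm ~ 2*(A+T) + 4*(G+C), a rule of thumb for short
    oligonucleotides only; it ignores salt and formamide concentration.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    # A 12-mer made only of G/C and an 18-mer made only of A/T
    for probe in ("GC" * 6, "AT" * 9):
        print(len(probe), gc_fraction(probe), wallace_tm(probe))
```

Under this approximation a 12-nucleotide GC-only probe has a higher estimated melting temperature than an 18-nucleotide AT-only probe, which is consistent with shorter minimum lengths being workable for GC-rich sequences.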
In one embodiment, the polynucleotide probes are conjugated to detectable markers. Detectable labels suitable for use in the present invention include any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means. Useful labels in the present invention include biotin for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g., fluorescein, Texas red, rhodamine, green fluorescent protein, and the like), radiolabels (e.g., ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads. Preferably, the polynucleotide probes are immobilized on a substrate.
In one embodiment, the polynucleotide probes are hybridizable array elements in a microarray or high density array. Nucleic acid arrays are well known in the art and are described for use in comparing expression levels of particular genes of interest, for example, in U.S. Patent No. 6,177,248, which is incorporated herein by reference in its entirety. Nucleic acid arrays are suitable for quantifying small variations in expression levels of a gene in the presence of a large population of heterogeneous nucleic acids. Knowing the identity of the genes set forth by the present invention, nucleic acid arrays can be fabricated either by de novo synthesis on a substrate or by spotting or transporting nucleic acid sequences onto specific locations of the substrate. Nucleic acids are purified and/or isolated from biological materials, such as a bacterial plasmid containing a cloned segment of sequence of interest. It is noted that all of the genes identified by the present invention have been previously sequenced, at least in part, such that oligonucleotides suitable for the identification of such nucleic acids can be produced. The database accession number for each of the genes identified by the present inventors is provided in the tables of the invention. Suitable nucleic acids are also produced by amplification of template, such as by polymerase chain reaction or in vitro transcription.
One of skill in the art will appreciate that an enormous number of array designs are suitable for the practice of this invention. An array will typically include a number of probes that specifically hybridize to the sequences of interest. In addition, in a preferred embodiment, the array will include one or more control probes. The high-density array chip includes "test probes". Test probes could be oligonucleotides having a minimum or maximum length as described above for other oligonucleotides. In another preferred embodiment, test probes are double or single strand DNA sequences. DNA sequences are isolated or cloned from natural sources or amplified from natural sources using natural nucleic acids as templates, or produced synthetically. These probes have sequences complementary to particular subsequences of the genes whose expression they are designed to detect. Thus, the test probes are capable of specifically hybridizing to the target nucleic acid they are to detect.
Another embodiment of the present invention relates to a plurality of antibodies, or antigen binding fragments thereof, for the detection of the expression of genes regulated in peripheral blood cells in patients with COPD. The plurality of antibodies, or antigen binding fragments thereof, consists of antibodies, or antigen binding fragments thereof, that selectively bind to proteins encoded by genes that are regulated in peripheral blood cells in patients with COPD, and that can be detected as protein products using antibodies. In addition, the plurality of antibodies, or antigen binding fragments thereof, comprises antibodies, or antigen binding fragments thereof, that selectively bind to proteins or portions thereof (peptides) encoded by any of the genes from the tables provided herein.
According to the present invention, a plurality of antibodies, or antigen binding fragments thereof, refers to at least 2, and more preferably at least 3, and more preferably at least 4, and more preferably at least 5, and more preferably at least 6, and more preferably at least 7, and more preferably at least 8, and more preferably at least 9, and more preferably at least 10, and so on, in increments of one, up to any suitable number of antibodies, or antigen binding fragments thereof, including antibodies representing all of the genes described herein (e.g., 246) or more, such as 500, or at least 1000 antibodies, or antigen binding fragments thereof.
According to the present invention, the phrase "selectively binds to" refers to the ability of an antibody, antigen binding fragment or binding partner (antigen binding peptide) to preferentially bind to specified proteins. More specifically, the phrase "selectively binds" refers to the specific binding of one protein to another (e.g., an antibody, fragment thereof, or binding partner to an antigen), wherein the level of binding, as measured by any standard assay (e.g., an immunoassay), is statistically significantly higher than the background control for the assay. For example, when performing an immunoassay, controls typically include a reaction well/tube that contains antibody or antigen binding fragment alone (i.e., in the absence of antigen), wherein an amount of reactivity (e.g., non-specific binding to the well) by the antibody or antigen binding fragment thereof in the absence of the antigen is considered to be background. Binding can be measured using a variety of methods standard in the art, including enzyme immunoassays (e.g., ELISA), immunoblot assays, etc. Limited digestion of an immunoglobulin with a protease may produce two fragments.
An antigen binding fragment is referred to as an Fab, an Fab', or an F(ab')2 fragment. A fragment lacking the ability to bind to antigen is referred to as an Fc fragment. An Fab fragment comprises one arm of an immunoglobulin molecule containing a L chain (VL + CL domains) paired with the VH region and a portion of the CH region (CH1 domain). An Fab' fragment corresponds to an Fab fragment with part of the hinge region attached to the CH1 domain. An F(ab')2 fragment corresponds to two Fab' fragments that are normally covalently linked to each other through a disulfide bond, typically in the hinge regions.
Isolated antibodies of the present invention can include serum containing such antibodies, or antibodies that have been purified to varying degrees. Whole antibodies of the present invention can be polyclonal or monoclonal. Alternatively, functional equivalents of whole antibodies, such as antigen binding fragments in which one or more antibody domains are truncated or absent (e.g., Fv, Fab, Fab', or F(ab')2 fragments), as well as genetically engineered antibodies or antigen binding fragments thereof, including single chain antibodies or antibodies that can bind to more than one epitope (e.g., bi-specific antibodies), or antibodies that can bind to one or more different antigens (e.g., bi- or multi-specific antibodies), may also be employed in the invention.
Generally, in the production of an antibody, a suitable experimental animal, such as, for example, but not limited to, a rabbit, a sheep, a hamster, a guinea pig, a mouse, a rat, or a chicken, is exposed to an antigen against which an antibody is desired. Typically, an animal is immunized with an effective amount of antigen that is injected into the animal. An effective amount of antigen refers to an amount needed to induce antibody production by the animal. The animal's immune system is then allowed to respond over a pre-determined period of time. The immunization process can be repeated until the immune system is found to be producing antibodies to the antigen. In order to obtain polyclonal antibodies specific for the antigen, serum is collected from the animal that contains the desired antibodies (or in the case of a chicken, antibody can be collected from the eggs). Such serum is useful as a reagent. Polyclonal antibodies can be further purified from the serum (or eggs) by, for example, treating the serum with ammonium sulfate. Monoclonal antibodies may be produced according to the methodology of Kohler and Milstein (Nature 256:495-497, 1975). For example, B lymphocytes are recovered from the spleen (or any suitable tissue) of an immunized animal and then fused with myeloma cells to obtain a population of hybridoma cells capable of continual growth in suitable culture medium. Hybridomas producing the desired antibody are selected by testing the ability of the antibody produced by the hybridoma to bind to the desired antigen.
Finally, any of the genes of this invention, or their RNA or protein products, can serve as targets for therapeutic strategies. For example, regulatory compounds that regulate (e.g., upregulate or downregulate) the expression and/or biological activity of a target gene or its expression product (whether the product is intracellular, membrane or secreted) can be identified and/or designed using the information regarding the biomarker targets described herein. Alternatively, through the identification of particular genes that are highly regulated in patients with COPD, one can use such genes and their products to further investigate the molecular or biochemical mechanisms associated with the development and progression of COPD and then design or establish assays to identify therapeutic compounds that affect the molecular or biochemical mechanism with the goal of providing a therapeutic benefit to the patient.
For example, one embodiment of the present invention relates to methods for identifying compounds that regulate the expression or activity of at least one of the biomarkers described herein. Preferably, such compounds can be used to further study mechanisms associated with COPD or, more preferably, serve as a therapeutic agent for use in the treatment or prevention of at least one symptom or aspect of COPD, or as a lead compound for the development of such a therapeutic agent. Once a biomarker has been identified as a target according to the present invention, an assay can be used for screening and selecting a chemical compound or a biological compound having regulatory activity as a candidate reagent or therapeutic based on the ability of the compound to regulate the expression or activity of the target biomarker. Reference herein to regulating a target can refer to one or both of regulating transcription of a target gene and regulating the translation and/or activity of its corresponding expression product. Such a compound can be referred to herein as a therapeutic compound, in one embodiment. For example, a cell line that naturally expresses the gene of interest or has been transfected with the gene (or suitable portions or derivatives thereof for assaying putative regulatory compounds) or other recombinant nucleic acid molecule encoding the protein of interest is incubated with various compounds, also referred to as candidate compounds, test compounds, or putative regulatory compounds. A regulation of the expression of the gene of interest or regulation of the activities of its encoded product (e.g., biological activity) may be used to identify a therapeutic compound. Therapeutic compounds identified in this manner can then be re-tested, if desired, in other assays to confirm their activities with regard to the target biomarker or a cellular or other activity related thereto.
In the method of the invention, compounds that increase the expression or activity of genes in any one or more of Tables 2-5, or the proteins encoded thereby, that are downregulated in peripheral blood cells of patients with COPD as compared to peripheral blood cells of normal controls, and compounds that decrease the expression or activity of genes in any one or more of Tables 2-5, or the proteins encoded thereby, that are upregulated in peripheral blood cells of patients with COPD as compared to peripheral blood cells of normal controls, are predicted to be useful as therapeutic reagents, or lead compounds therefor, in the prevention and treatment of COPD.
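The direction rule in the preceding paragraph can be expressed as a minimal sketch. The log2 fold-change cutoff of 1.0 (a two-fold change) and the function name are hypothetical choices made for illustration; the specification does not prescribe a threshold.

```python
import math


def desired_compound_effect(copd_level: float, control_level: float,
                            min_abs_log2_fc: float = 1.0) -> str:
    """Return the direction a candidate compound should move a gene's expression.

    Assumption: a gene counts as regulated when |log2(COPD / control)| meets
    the hypothetical cutoff min_abs_log2_fc.
    """
    if copd_level <= 0 or control_level <= 0:
        raise ValueError("expression levels must be positive")
    log2_fc = math.log2(copd_level / control_level)
    if log2_fc >= min_abs_log2_fc:
        return "decrease"  # upregulated in COPD, so seek compounds that lower it
    if log2_fc <= -min_abs_log2_fc:
        return "increase"  # downregulated in COPD, so seek compounds that raise it
    return "none"          # not regulated under this cutoff


if __name__ == "__main__":
    print(desired_compound_effect(400.0, 100.0))  # decrease
    print(desired_compound_effect(50.0, 200.0))   # increase
```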
For example, one embodiment of the present invention relates to a method of using the differentially expressed genes described herein or the proteins encoded thereby (i.e., the biomarkers of the invention) as a target to identify a regulatory compound for regulation of a biological function associated with that gene or protein. Such a method can include the steps of: (a) contacting a test compound with a cell that expresses the target biomarker or a useful portion thereof (i.e., useful being any portion of a gene, transcript or protein that can be used to identify a compound as discussed herein); and (b) identifying compounds that regulate the expression or activity of the gene or protein.
In general, the biological activity or biological action of a protein refers to any function(s) exhibited or performed by the protein that is ascribed to the naturally occurring form of the protein as measured or observed in vivo (i.e., in the natural physiological environment of the protein) or in vitro (i.e., under laboratory conditions). Modifications, activities or interactions which result in a decrease in protein expression or a decrease in the activity of the protein can be referred to as inactivation (complete or partial), down-regulation, reduced action, or decreased action or activity of a protein. Similarly, modifications, activities or interactions which result in an increase in protein expression or an increase in the activity of the protein can be referred to as amplification, overproduction, activation, enhancement, up-regulation or increased action of a protein. The biological activity of a protein according to the invention can be measured or evaluated using any assay for the biological activity of the protein as known in the art. Such assays can include, but are not limited to, binding assays, assays to determine internalization of the protein and/or associated proteins, enzyme assays, cell signal transduction assays (e.g., phosphorylation assays), and/or assays for determining downstream cellular events that result from activation or binding of the cell surface protein (e.g., expression of downstream genes, production of various biological mediators, etc.).
According to the present invention, a biologically active fragment or homologue of a gene, nucleic acid transcript or derivative thereof, or protein maintains the ability to be useful in a method of the present invention. Therefore, the biologically active fragment or homologue maintains the ability to be used to identify regulators (e.g., inhibitors) of the native gene or protein when, for example, the biologically active fragment or homologue is expressed by a cell or used in another assay format. Therefore, the biologically active fragment or homologue has a structure that is sufficiently similar to the structure of the native gene or protein that a regulatory compound can be identified by its ability to bind to and/or regulate the expression or activity of the fragment or homologue in a manner consistent with the regulation of the native gene or protein.
Compounds to be screened in the methods of the invention include known organic compounds such as antibodies, products of peptide libraries, and products of chemical combinatorial libraries. Compounds may also be identified using rational drug design relying on the structure of the product of a gene. Such methods are known to those of skill in the art and involve the use of three-dimensional imaging software programs. For example, various methods of drug design, useful to design or select mimetics or other therapeutic compounds useful in the present invention are disclosed in Maulik et al., 1997, Molecular Biotechnology: Therapeutic Applications and Strategies, Wiley-Liss, Inc., which is incorporated herein by reference in its entirety.
As used herein, a mimetic refers to any peptide or non-peptide compound that is able to mimic the biological action of a naturally occurring peptide, often because the mimetic has a basic structure that mimics the basic structure of the naturally occurring peptide and/or has the salient biological properties of the naturally occurring peptide. Mimetics can include, but are not limited to: peptides that have substantial modifications from the prototype such as no side chain similarity with the naturally occurring peptide (such modifications, for example, may decrease its susceptibility to degradation); anti-idiotypic and/or catalytic antibodies, or fragments thereof; non-proteinaceous portions of an isolated protein (e.g., carbohydrate structures); or synthetic or natural organic molecules, including nucleic acids and drugs identified through combinatorial chemistry, for example. Such mimetics can be designed, selected and/or otherwise identified using a variety of methods known in the art.
A mimetic can be obtained, for example, from molecular diversity strategies (a combination of related strategies allowing the rapid construction of large, chemically diverse molecule libraries), libraries of natural or synthetic compounds, in particular from chemical or combinatorial libraries (i.e., libraries of compounds that differ in sequence or size but that have similar building blocks) or by rational, directed or random drug design. See, for example, Maulik et al., supra. In a molecular diversity strategy, large compound libraries are synthesized, for example, from peptides, oligonucleotides, carbohydrates and/or synthetic organic molecules, using biological, enzymatic and/or chemical approaches. The critical parameters in developing a molecular diversity strategy include subunit diversity, molecular size, and library diversity. The general goal of screening such libraries is to utilize sequential application of combinatorial selection to obtain high-affinity ligands for a desired target, and then to optimize the lead molecules by either random or directed design strategies. Methods of molecular diversity are described in detail in Maulik et al., ibid.
Maulik et al. also disclose, for example, methods of directed design, in which the user directs the process of creating novel molecules from a fragment library of appropriately selected fragments; random design, in which the user uses a genetic or other algorithm to randomly mutate fragments and their combinations while simultaneously applying a selection criterion to evaluate the fitness of candidate ligands; and a grid-based approach in which the user calculates the interaction energy between three dimensional receptor structures and small fragment probes, followed by linking together of favorable probe sites. As used herein, the term "test compound", "putative inhibitory compound" or "putative regulatory compound" refers to compounds having an unknown or previously unappreciated regulatory activity in a particular process. As such, the term "identify" with regard to methods to identify compounds is intended to include all compounds, the usefulness of which as a regulatory compound for the purposes of regulating the expression or activity of a target biomarker or otherwise regulating some activity that may be useful in the study or treatment of COPD is determined by a method of the present invention.
In one embodiment of the invention, regulatory compounds are identified by exposing a target gene to a test compound; measuring the expression of a target; and selecting a compound that regulates (up or down) the expression of the target. For example, the putative regulatory compound can be exposed to a cell that expresses the target gene (endogenously or recombinantly). A preferred cell to use in an assay includes a mammalian cell that either naturally expresses the target gene or has been transformed with a recombinant form of the target gene, such as a recombinant nucleic acid molecule comprising a nucleic acid sequence encoding the target protein or a useful fragment thereof. Methods to determine expression levels of a gene are well known in the art.
The conditions under which a cell, cell lysate, nucleic acid molecule or protein of the present invention is exposed to or contacted with a putative regulatory compound, such as by mixing, are any suitable culture or assay conditions. In the case of a cell-based assay, the conditions include an effective medium in which the cell can be cultured or in which the cell lysate can be evaluated in the presence and absence of a putative regulatory compound. Cells of the present invention can be cultured in a variety of containers including, but not limited to, tissue culture flasks, test tubes, microtiter dishes, and petri plates. Culturing is carried out at a temperature, pH and carbon dioxide content appropriate for the cell. Such culturing conditions are also within the skill in the art. Cells are contacted with a putative regulatory compound under conditions which take into account the number of cells per container contacted, the concentration of putative regulatory compound(s) administered to a cell, and the incubation time of the putative regulatory compound with the cell. Determination of effective protocols can be accomplished by those skilled in the art based on variables such as the size of the container, the volume of liquid in the container, conditions known to be suitable for the culture of the particular cell type used in the assay, and the chemical composition of the putative regulatory compound (i.e., size, charge, etc.) being tested. A preferred amount of putative regulatory compound(s) can range from about 1 nM to about 10 mM of putative regulatory compound(s) per well of a 96-well plate.
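As an illustration of how the stated range of about 1 nM to about 10 mM might be covered in a screening plate, the following minimal sketch lists a ten-fold dilution series. The ten-fold step and the function name are assumptions made for the example; the specification does not define a dilution scheme.

```python
def tenfold_series(low_exp: int = -9, high_exp: int = -2) -> list:
    """Return molar concentrations from 10**low_exp to 10**high_exp in ten-fold steps.

    Assumption: defaults span 1 nM (1e-9 M) to 10 mM (1e-2 M); each value is
    parsed from its decimal text so the floats match literals such as 1e-9.
    """
    if low_exp > high_exp:
        raise ValueError("low_exp must not exceed high_exp")
    return [float(f"1e{exp}") for exp in range(low_exp, high_exp + 1)]


if __name__ == "__main__":
    for conc in tenfold_series():
        print(f"{conc:.0e} M")
```

With the default arguments the series has eight concentrations, so one such series occupies a single eight-well column of a 96-well plate.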
To detect expression of a target refers to the act of actively determining whether a target is expressed or not. This can include determining whether the target expression is upregulated as compared to a control, downregulated as compared to a control, or unchanged as compared to a control. Therefore, the step of detecting expression does not require that expression of the target actually is upregulated or downregulated, but rather, can also include detecting that the expression of the target has not changed (i.e., detecting no expression of the target or no change in expression of the target). Expression of transcripts and/or proteins is measured by any of a variety of known methods in the art, and such methods have been discussed previously herein. Similarly, measurement of translation of a protein includes any suitable method for detecting and/or measuring proteins from a cell or cell extract, and such methods have been described previously herein.
Designing a compound for testing in a method of the present invention can include creating a new chemical compound or searching databases of libraries of known compounds (e.g., a compound listed in a computational screening database containing three dimensional structures of known compounds). Designing can also be performed by simulating chemical compounds having substitute moieties at certain structural features. The step of designing can include selecting a chemical compound based on a known function of the compound. A preferred step of designing comprises computational screening of one or more databases of compounds in which the three dimensional structure of the compound is known and is interacted (e.g., docked, aligned, matched, interfaced) with the three dimensional structure of a target by computer (e.g., as described by Humblet and Dunbar, Annual Reports in Medicinal Chemistry, vol. 28, pp. 275-283, 1993, M. Venuti, ed., Academic Press). Methods to synthesize suitable chemical compounds are known to those of skill in the art and depend upon the structure of the chemical being synthesized. Methods to evaluate the bioactivity of the synthesized compound depend upon the bioactivity of the compound (e.g., inhibitory or stimulatory).
Candidate compounds identified or designed by the above-described methods can be synthesized using techniques known in the art, and depending on the type of compound. Synthesis techniques for the production of non-protein compounds, including organic and inorganic compounds, are well known in the art. For example, for smaller peptides, chemical synthesis methods are preferred. For example, such methods include well known chemical procedures, such as solution or solid-phase peptide synthesis, or semi-synthesis in solution beginning with protein fragments coupled through conventional solution methods. Such methods are well known in the art and may be found in general texts and articles in the area such as: Merrifield, 1997, Methods Enzymol. 289:3-13; Wade et al., 1993, Australas Biotechnol. 3(6):332-336; Wong et al., 1991, Experientia 47(11-12):1123-1129; Carey et al., 1991, Ciba Found Symp. 158:187-203; Plaue et al., 1990, Biologicals 18(3):147-157; Bodanszky, 1985, Int. J. Pept. Protein Res. 25(5):449-474; or H. Dugas and C. Penney, BIOORGANIC CHEMISTRY, (1981) at pages 54-92, all of which are incorporated herein by reference in their entirety. For example, peptides may be synthesized by solid-phase methodology utilizing a commercially available peptide synthesizer and synthesis cycles supplied by the manufacturer. One skilled in the art recognizes that the solid phase synthesis could also be accomplished using the FMOC strategy and a TFA/scavenger cleavage mixture. A compound that is a protein or peptide can also be produced using recombinant DNA technology and methods standard in the art, particularly if larger quantities of a protein are desired.
In another embodiment of the invention, putative regulatory compounds are identified by exposing a target to a candidate compound; measuring the binding of the candidate compound to the target; and selecting a compound that binds to the target at a desired concentration, affinity, or avidity. In a preferred embodiment, the assay is performed under conditions conducive to promoting the interaction or binding of the compound to the target. One of skill in the art can determine such conditions based on the target and the compound being used in the assay. In one embodiment, a BIAcore machine can be used to determine the binding constant of a complex between the target protein (a protein encoded by the target gene) and a natural ligand in the presence and absence of the candidate compound. For example, the target protein or a ligand binding fragment thereof can be immobilized on a substrate. A natural or synthetic ligand is contacted with the substrate to form a complex. The dissociation constant for the complex can be determined by monitoring changes in the refractive index with respect to time as buffer is passed over the chip (O'Shannessy et al., Anal. Biochem. 212:457-468 (1993); Schuster et al., Nature 365:343-347 (1993)). Contacting a candidate compound at various concentrations with the complex and monitoring the response function (e.g., the change in the refractive index with respect to time) allows the complex dissociation constant to be determined in the presence of the test compound and indicates whether the candidate compound is either an inhibitor or an agonist of the complex. Alternatively, the candidate compound can be contacted with the immobilized target protein at the same time as the ligand to see if the candidate compound inhibits or stabilizes the binding of the ligand to the target protein.
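A minimal sketch of how a dissociation rate might be estimated from the response recorded while buffer flows over the chip is given below. It assumes a simple 1:1 binding model in which the response decays as a single exponential, R(t) = R0·exp(-k_off·t), and fits the logarithm of the response by ordinary least squares. The model, the function name and the example data are illustrative assumptions and do not reproduce the instrument's own analysis software. Under the same 1:1 assumption, the equilibrium dissociation constant is the ratio of the dissociation rate to the association rate.

```python
import math


def estimate_k_off(times_s, responses):
    """Estimate a dissociation rate constant (1/s) from a dissociation-phase trace.

    Assumption: single-exponential decay, so ln(R) is linear in t with
    slope -k_off; all responses must be positive.
    """
    if len(times_s) != len(responses) or len(times_s) < 2:
        raise ValueError("need at least two paired time/response points")
    if any(r <= 0 for r in responses):
        raise ValueError("responses must be positive to take logarithms")
    log_r = [math.log(r) for r in responses]
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(log_r) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, log_r))
    return -sxy / sxx  # negative of the fitted slope


if __name__ == "__main__":
    # Synthetic, noise-free trace generated with k_off = 0.05 per second
    times = [0.0, 10.0, 20.0, 30.0, 40.0]
    trace = [100.0 * math.exp(-0.05 * t) for t in times]
    print(estimate_k_off(times, trace))
```

Comparing the rate estimated in the presence and absence of a candidate compound would indicate, under these assumptions, whether the compound destabilizes or stabilizes the complex.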
Other suitable assays for measuring the binding of a candidate compound to a target protein or for measuring the ability of a candidate compound to affect the binding of the target protein to another protein or molecule include, but are not limited to, Western blot, immunoblot, enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA), immunoprecipitation, surface plasmon resonance, chemiluminescence, fluorescent polarization, phosphorescence, immunohistochemical analysis, matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry, microcytometry, microarray, microscopy, fluorescence activated cell sorting (FACS), and flow cytometry. Other assays include those that are suitable for monitoring the effects of protein binding, including, but not limited to, cell-based assays such as cytokine secretion assays, or intracellular signal transduction assays that determine, for example, protein or lipid phosphorylation, mediator release or intracellular Ca++ mobilization.
In yet another embodiment, putative regulatory compounds are identified by exposing a target protein of the present invention (or a cell expressing the protein naturally or recombinantly) to a candidate compound and measuring the ability of the compound to inhibit or enhance a biological activity of the protein. In one embodiment, the biological activity of a protein encoded by the target gene is measured by measuring the amount of product generated in a biochemical reaction mediated by the protein encoded by the target gene. In still another embodiment, the activity of the protein encoded by the target gene is measured by measuring the amount of substrate generated in a biochemical reaction mediated by the protein encoded by the target gene. In another embodiment, a biological activity is measured by measuring a specific event in a cell-based assay, such as release or secretion of a biological mediator or compound that is regulated by the activity of the target protein, measuring intracellular signal transduction assays that determine, for example, protein or lipid phosphorylation, mediator release or intracellular Ca++ mobilization. Preferably, the activity of the protein is measured in the presence and absence of the candidate compound, or in the presence of another suitable control compound. In one embodiment of the invention, when the protein encoded by a target gene is an enzyme, a therapeutic compound is identified by exposing the enzyme encoded by a target gene to a test compound; measuring the activity of the enzyme encoded by the target gene in the presence and absence of the compound; and selecting a compound that down-regulates or inhibits the activity of the enzyme encoded by the target gene. Methods to measure enzymatic activity are well known to those skilled in the art and are selected based on the identity of the enzyme being tested. For example, if the enzyme is a kinase, phosphorylation assays can be used.
Preferably, methods used to identify therapeutic compounds are customized for each target gene or product. For example, if the target product is an enzyme, then the enzyme will be expressed in cell culture and purified. The enzyme will then be screened in vitro against therapeutic compounds to look for inhibition of that enzymatic activity. If the target is a non-catalytic protein, then it will also be expressed and purified. Therapeutic compounds will then be tested for their ability to regulate, for example, the binding of a site-specific antibody or a target-specific ligand to the target product. In a preferred embodiment, once therapeutic compounds that bind to target products are identified, those compounds can be further tested in biological assays that test for other desirable characteristics and activities, such as utility as a reagent for the study of COPD or utility as a therapeutic compound for the prevention or treatment of COPD.
If a suitable therapeutic compound is identified using the methods and genes of the present invention, a composition can be formulated. A composition, and particularly a therapeutic composition, of the present invention generally includes the therapeutic compound and a carrier, and preferably, a pharmaceutically acceptable carrier. According to the present invention, a "pharmaceutically acceptable carrier" includes pharmaceutically acceptable excipients and/or pharmaceutically acceptable delivery vehicles, which are suitable for use in administration of the composition to a suitable in vitro, ex vivo or in vivo site. A suitable in vitro, in vivo or ex vivo site is preferably a pulmonary tissue or a cell that is associated with or travels to a pulmonary tissue. Preferred pharmaceutically acceptable carriers are capable of maintaining a compound, a protein, a peptide, nucleic acid molecule or mimetic (drug) in a form that, upon arrival of the compound, protein, peptide, nucleic acid molecule or mimetic at the target site in a culture (in the case of an in vitro or ex vivo protocol) or in patient (in vivo), the compound, protein, peptide, nucleic acid molecule or mimetic is capable of providing the desired effect at the target site.
Suitable excipients of the present invention include excipients or formularies that transport or help transport, but do not specifically target a composition to a cell (also referred to herein as non-targeting carriers). Examples of pharmaceutically acceptable excipients include, but are not limited to water, phosphate buffered saline, Ringer's solution, dextrose solution, serum-containing solutions, Hank's solution, other aqueous physiologically balanced solutions, oils, esters and glycols. Aqueous carriers can contain suitable auxiliary substances required to approximate the physiological conditions of the recipient, for example, by enhancing chemical stability and isotonicity.
One type of pharmaceutically acceptable carrier includes a controlled release formulation that is capable of slowly releasing a composition of the present invention into a patient or culture. As used herein, a controlled release formulation comprises a therapeutic compound in a controlled release vehicle. Suitable controlled release vehicles include, but are not limited to, biocompatible polymers, other polymeric matrices, capsules, microcapsules, microparticles, bolus preparations, osmotic pumps, diffusion devices, liposomes, lipospheres, and transdermal delivery systems. Other carriers include liquids that, upon administration to a patient, form a solid or a gel in situ. Preferred carriers are also biodegradable (i.e., bioerodible). When the compound is a recombinant nucleic acid molecule, suitable delivery vehicles include, but are not limited to, liposomes, viral vectors or other delivery vehicles, including ribozymes. Natural lipid-containing delivery vehicles include cells and cellular membranes. Artificial lipid-containing delivery vehicles include liposomes and micelles. A delivery vehicle of the present invention can be modified to target to a particular site in a patient, thereby targeting and making use of a therapeutic compound at that site. Suitable modifications include manipulating the chemical formula of the lipid portion of the delivery vehicle and/or introducing into the vehicle a targeting agent capable of specifically targeting a delivery vehicle to a preferred site, for example, a preferred cell type. Other suitable delivery vehicles include gold particles, poly-L-lysine/DNA-molecular conjugates, and artificial chromosomes. A compound or composition can be delivered to a cell culture or patient by any suitable method.
Selection of such a method will vary with the type of compound being administered or delivered (i.e., compound, protein, peptide, nucleic acid molecule, or mimetic), the mode of delivery (i.e., in vitro, in vivo, ex vivo) and the goal to be achieved by administration/delivery of the compound or composition. According to the present invention, an effective administration protocol (i.e., administering a composition in an effective manner) comprises suitable dose parameters and modes of administration that result in delivery of a composition to a desired site (i.e., to a desired cell) and/or in the desired regulatory event.
Administration routes include in vivo, in vitro and ex vivo routes. In vivo routes include, but are not limited to, oral, nasal, intratracheal injection, inhaled, transdermal, rectal, and parenteral routes. Preferred parenteral routes can include, but are not limited to, subcutaneous, intradermal, intravenous, intramuscular and intraperitoneal routes. Intravenous, intraperitoneal, intradermal, subcutaneous and intramuscular administrations can be performed using methods standard in the art. Aerosol (inhalation) delivery can also be performed using methods standard in the art (see, for example, Stribling et al., Proc. Natl. Acad. Sci. USA 89:11277-11281, 1992, which is incorporated herein by reference in its entirety). Oral delivery can be performed by complexing a therapeutic composition of the present invention to a carrier capable of withstanding degradation by digestive enzymes in the gut of an animal. Examples of such carriers include plastic capsules or tablets, such as those known in the art. Direct injection techniques include, for example, injecting the composition directly into a site. Ex vivo refers to performing part of the regulatory step outside of the patient, such as by transfecting a population of cells removed from a patient with a recombinant molecule comprising a nucleic acid sequence encoding a protein according to the present invention under conditions such that the recombinant molecule is subsequently expressed by the transfected cell, and returning the transfected cells to the patient. In vitro and ex vivo routes of administration of a composition to a culture of host cells can be accomplished by a method including, but not limited to, transfection, transformation, electroporation, microinjection, lipofection, adsorption, protoplast fusion, use of protein carrying agents, use of ion carrying agents, use of detergents for cell permeabilization, and simply mixing (e.g., combining) a compound in culture with a target cell.
In the method of the present invention, a therapeutic compound, as well as compositions comprising such compounds, can be administered to any organism, and particularly, to any member of the Vertebrate class, Mammalia, including, without limitation, primates, rodents, livestock and domestic pets. Livestock include mammals to be consumed or that produce useful products (e.g., sheep for wool production). Preferred mammals to protect include humans.
Typically, it is desirable to obtain a therapeutic benefit in a patient. A therapeutic benefit is not necessarily a cure for a particular disease or condition, but rather, preferably encompasses a result which can include alleviation of the disease or condition, elimination of the disease or condition, reduction of a symptom associated with the disease or condition, prevention or alleviation of a secondary disease or condition resulting from the occurrence of a primary disease or condition, and/or prevention of the disease or condition. As used herein, the phrase "protected from a disease" refers to reducing the symptoms of the disease, reducing the occurrence of the disease, and/or reducing the severity of the disease. Protecting a patient can refer to the ability of a composition of the present invention, when administered to a patient, to prevent a disease from occurring and/or to cure or to alleviate disease symptoms, signs or causes. As such, to protect a patient from a disease includes both preventing disease occurrence (prophylactic treatment) and treating a patient that has a disease (therapeutic treatment) to reduce the symptoms of the disease. A beneficial effect can easily be assessed by one of ordinary skill in the art and/or by a trained clinician who is treating the patient. The term, "disease" refers to any deviation from the normal health of a mammal and includes a state when disease symptoms are present, as well as conditions in which a deviation (e.g., infection, gene mutation, genetic defect, etc.) has occurred, but symptoms are not yet manifested.
The following examples are provided for the purpose of illustration and are not intended to limit the scope of the present invention.
Examples
Example 1
The following example demonstrates the utility of peripheral blood mononuclear cells as surrogate markers for COPD.
The key initial discovery of the present inventors' studies was the demonstration of the utility of PBMC gene expression as a surrogate marker for COPD. A recent study has shown that the gene expression profile of PBMC compared among normal individuals is remarkably homogeneous, and distinct from the expression profiles of diseased individuals (54). In the first experiments, the present inventors collected PBMC data on five former smokers with advanced COPD (Fig. 3). This group was used as a comparative group in a study of pulmonary arterial hypertension (PAH) versus normal controls (non-PAH samples) to validate distinct immunologic profiles within different disease processes. The PAH study is described in detail in Bull et al. (13), which is incorporated herein by reference in its entirety, although the COPD data presented herein are described for the first time.
The Bull et al. study was designed to emphasize the importance of measure validation, which is again used to evaluate the gene expression in the experiments of the present invention. Specifically, due to the high measurement-to-sample ratio inherent in microarray studies, steps must be taken to avoid "over-fitting" any predictive model (55). For this reason, both independent and "leave-one-out" cross-validation of the predictor were employed. In addition, permutation testing was used to assess the significance of the cross-validation error rate. This procedure was performed by randomly permuting class labels among the gene expression measurements (for example, normal vs. COPD), considering 2000 permutations of the class labels. The proportion of the permuted data sets that have a classification error rate less than or equal to the unpermuted misclassification rate serves as the achieved significance level in a test against the null hypothesis, namely that there is no difference in gene expression profiles between the two classes. Finally, validation was performed using an independent measure of gene expression magnitude, and the differences in gene expression were demonstrated again using this independent measure of mRNA abundance. These measurements were extended in the PAH study with a larger cohort with q-PCR analysis of two genes. The power of the microarray analysis to determine biomarkers is made evident by the fact that the larger cohort also demonstrated the differences in gene expression profiles predicted by the smaller subset of diseased and normal patients.
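The permutation procedure described above can be sketched in a few lines. The example below is a minimal illustration only: the nearest-centroid classifier, the gene-ranking rule, the function names, and the synthetic data are all assumptions made for the sketch and are not the predictor or software used in the Bull et al. study. What it does follow from the text is that gene selection is repeated inside each leave-one-out fold, and that the achieved significance level is the proportion of label permutations whose cross-validated error is less than or equal to the unpermuted error.

```python
import random


def _class_means(samples, labels, cls):
    """Per-gene mean expression over the samples carrying label `cls`."""
    rows = [s for s, y in zip(samples, labels) if y == cls]
    return [sum(r[g] for r in rows) / len(rows) for g in range(len(samples[0]))]


def loocv_error(samples, labels, n_top=5):
    """Leave-one-out error of a nearest-centroid classifier (labels 0/1).

    Genes are re-selected within every fold using the training samples only,
    so the left-out sample never influences gene selection.
    """
    if min(labels.count(0), labels.count(1)) < 2:
        raise ValueError("need at least two samples per class")
    errors = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        m0 = _class_means(train_x, train_y, 0)
        m1 = _class_means(train_x, train_y, 1)
        ranked = sorted(range(len(m0)), key=lambda g: abs(m1[g] - m0[g]),
                        reverse=True)
        genes = ranked[:n_top]
        d0 = sum((samples[i][g] - m0[g]) ** 2 for g in genes)
        d1 = sum((samples[i][g] - m1[g]) ** 2 for g in genes)
        predicted = 0 if d0 <= d1 else 1
        if predicted != labels[i]:
            errors += 1
    return errors / len(samples)


def permutation_p_value(samples, labels, n_perm=2000, n_top=5, seed=0):
    """Return (unpermuted error, proportion of permutations with error <= it)."""
    rng = random.Random(seed)
    observed = loocv_error(samples, labels, n_top)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute class labels among the samples
        if loocv_error(samples, shuffled, n_top) <= observed:
            hits += 1
    return observed, hits / n_perm


# Synthetic data (hypothetical): 5 "normal" and 5 "COPD" samples, 30 genes,
# with the first 5 genes strongly shifted in the second class.
data_rng = random.Random(1)
labels = [0] * 5 + [1] * 5
samples = []
for y in labels:
    row = [data_rng.gauss(0.0, 1.0) for _ in range(30)]
    if y == 1:
        for g in range(5):
            row[g] += 8.0
    samples.append(row)

# Fewer permutations than the 2000 described in the text, to keep the demo fast.
observed_error, p_value = permutation_p_value(samples, labels, n_perm=300)
```

With only five samples per class there are 252 distinct labelings, so the smallest achievable significance level is bounded regardless of how many permutations are drawn; larger cohorts relax that bound.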
As shown in Fig. 3, the initial comparisons with PBMCs from age- and gender-matched non-smokers that have COPD support distinct expression profiles between the groups. In particular, immunologic profiles including cytokine and chemokine mediators, major histocompatibility antigen markers, G-protein coupled receptors, and markers of oxidative stress are differentially expressed (Table 2 shows the entire list of 246 transcripts). The present inventors' prior experience using this technique in other disease models has consistently led to statistical confirmation of similar preliminary results when performed with larger samples. Therefore, larger samplings are predicted to support these conclusions.
The initial PBMC expression profile data in COPD described above support several important points. First, the inventors have demonstrated the technical requirements for successfully isolating mononuclear cells from peripheral blood and extracting high quality RNA of appropriate quantity to generate microarray gene expression profiles. Second, PBMC gene expression from individuals with COPD has a distinct immunologic phenotype from normal controls and patients with other chronic lung diseases (PAH). Lastly, the differentially expressed transcripts in the PBMC of patients with COPD provide preliminary support for the inventors' hypothesis that immunoregulatory differences are present between tobacco-exposed individuals with and without COPD.
Example 2
The following example describes a larger study demonstrating the utility of peripheral blood mononuclear cells as surrogate markers for COPD.
Experimental Design
The outline of the study is as follows. Four groups (n=10) of tobacco-exposed individuals are studied: (1) current smokers with at least Stage I GOLD classification COPD (56), (2) former smokers with at least Stage I GOLD classification COPD, (3) current smokers without physiologic evidence of COPD matched to group 1 for quantitative smoking history, age, gender, and ethnicity, and (4) former smokers without physiologic evidence of COPD matched to group 2 for quantitative smoking history, age, gender, and ethnicity. For groups (1) and (2), exclusion criteria are: history of underlying lung disease other than COPD (e.g., asthma) or systemic disease with known pulmonary manifestations (rheumatoid arthritis, systemic lupus erythematosus, etc.); known history of genetic predisposition to COPD (alpha-one antitrypsin, cystic fibrosis, etc.) or potential occupational exposure (mining, metal work, etc.); use of immunosuppressive medication including oral or inhaled corticosteroids; chronic oxygen therapy; and history of malignancy. Additional inclusion criteria for groups (3) and (4) include: no evidence of airflow limitation by spirometry; FEV1 > 80% predicted; and FEV1/FVC ratio > 70. Exclusion criteria for groups (3) and (4) include: history of underlying lung disease (e.g., asthma, chronic bronchitis, etc.) or systemic disease with known pulmonary manifestations (e.g., SLE, RA, etc.); use of immunosuppressive medications; and history of malignancy. In addition to array analysis on all individuals, RNA isolated from PBMC separation will be stored for directed (quantitative reverse transcriptase PCR) validation analysis of observed differentially expressed genes. At the time of the initial blood draw, an additional peripheral PBMC isolation tube will be collected and cryopreserved for future validation studies of differentially expressed cell marker transcripts (Fig. 4).
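The screening rules for the non-COPD comparison groups (3) and (4) lend themselves to a simple programmatic check. The sketch below is illustrative only: the class and field names are hypothetical, and it assumes that the FEV1/FVC threshold of 70 is expressed as a percentage. Matching to groups (1) and (2) on smoking history, age, gender, and ethnicity is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Screening data for one prospective control subject (hypothetical fields)."""
    fev1_percent_predicted: float
    fev1_fvc_percent: float  # assumed to be expressed as a percentage
    other_lung_disease: bool = False
    systemic_disease_with_lung_involvement: bool = False
    immunosuppressive_medication: bool = False
    history_of_malignancy: bool = False


def eligible_as_control(c: Candidate) -> bool:
    """Apply the spirometric inclusion and the exclusion criteria for groups (3)/(4)."""
    normal_spirometry = (c.fev1_percent_predicted > 80.0
                         and c.fev1_fvc_percent > 70.0)
    excluded = (c.other_lung_disease
                or c.systemic_disease_with_lung_involvement
                or c.immunosuppressive_medication
                or c.history_of_malignancy)
    return normal_spirometry and not excluded


healthy_smoker = Candidate(fev1_percent_predicted=95.0, fev1_fvc_percent=78.0)
obstructed = Candidate(fev1_percent_predicted=85.0, fev1_fvc_percent=65.0)
on_steroids = Candidate(fev1_percent_predicted=95.0, fev1_fvc_percent=78.0,
                        immunosuppressive_medication=True)
```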
Sample size and power
Using the data presented in Example 1 on differential gene expression, the inventors have determined that a sample size of 10 per group will allow the identification of large differences between any two groups. Table 2 shows minimally detectable mean differences (Mean2-Mean1) on the log2 intensity scale using an α-level of 0.001 (two-sided) and power of 80% (57).
Table 2: sample size and power.
[Table 2 is reproduced as an image (imgf000066_0001) in the original publication.]
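For orientation, the quantities involved in this kind of power calculation can be sketched as follows. The first function uses a normal approximation for a two-sample comparison with equal group sizes; this is an assumption, since the exact method behind Table 2 is not stated here, and the sketch does not reproduce the values in that table. The second function gives the expected false discovery rate for an assumed proportion of truly differentially expressed genes; with α = 0.001 and 80% power, it gives approximately 0.20, 0.02, and 0.005 for proportions of 0.005, 0.05, and 0.20, consistent with the rates quoted in this section. Function and parameter names are illustrative.

```python
from statistics import NormalDist


def minimal_detectable_difference(s1, s2, n_per_group, alpha=0.001, power=0.80):
    """Smallest mean difference detectable in a two-sided two-sample test.

    Normal approximation with group standard deviations s1 and s2 and
    n_per_group samples in each group (an assumption of this sketch).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_power = z.inv_cdf(power)
    standard_error = ((s1 ** 2 + s2 ** 2) / n_per_group) ** 0.5
    return (z_alpha + z_power) * standard_error


def expected_fdr(prop_true, alpha=0.001, power=0.80):
    """Expected share of false positives among all genes called significant."""
    false_positives = alpha * (1.0 - prop_true)
    true_positives = power * prop_true
    return false_positives / (false_positives + true_positives)


# Hypothetical standard deviations of 1.0 log2 units in both groups.
mdd_example = minimal_detectable_difference(1.0, 1.0, n_per_group=10)

fdr_by_proportion = {p: expected_fdr(p) for p in (0.005, 0.05, 0.20)}
```

Because the normal approximation ignores the small-sample correction of a t-based calculation, it will tend to understate the detectable difference somewhat at n = 10 per group.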
The estimated standard deviations (S1 and S2) were obtained from the COPD and normal distributions of standard deviations from the 15,022 genes that passed the filtering criteria applied in BRB ArrayTools. Values are shown for the median, 75th and 90th percentiles of standard deviation as suggested by Yang and Speed (58). It is assumed for these calculations that an unmatched analysis will be performed (or that the matching will not induce correlation between COPD and non-COPD individuals), which gives a conservative estimate of effect size. As the true proportion of differentially expressed genes in the population varies over 0.005, 0.05, and 0.20, the expected false discovery rate varies over 0.20, 0.02, and 0.005, respectively (59).
Isolation of PBMCs and total RNA
The mononuclear cells are obtained from patient blood in an identical manner to that employed for the PAH study discussed in Example 1, with slight modifications to improve reproducibility. Specifically, eight milliliters of peripheral blood is collected into BD-CPT tubes (54) and processed following the manufacturer's instructions. In recent testing, the inventors have determined that PBMCs isolated in this fashion contain less than 5% granulocytes. Total RNA from samples selected for array analysis is isolated using standard methods and quantified by spectrophotometry, and the absence of degradation is confirmed using the Agilent Bioanalyzer.
Target labeling and array hybridization
Biotinylated cRNA for array hybridization is generated from total RNA using the methods previously developed by the inventors and others. Briefly, total RNA (2-5 μg) is converted to double-stranded cDNA using a standard oligo-dT-T7 primer, followed by in vitro transcription by T7 with the incorporation of biotin-nucleotide triphosphates. Labeled cRNA is fragmented, added to a hybridization buffer, and applied to the microarray. After hybridization, the unbound cRNA is washed away, and the bound probe is stained with streptavidin-phycoerythrin. The array is scanned, and the quantity of hybridization is inferred from the intensity of fluorescence at each feature of the array.
The most current Affymetrix® arrays for human gene expression are used: e.g., the Affymetrix Hu-133 Plus 2.0 GeneChip™. This fifth-generation microarray measures approximately 47,000 transcripts, including essentially all of the well-characterized genes in the human genome (e.g., ~38,500 genes). Analysis of this microarray includes the use of a high-resolution scanner that is demanded for arrays with this increased density (1,300,000 individual array features).
Microarray data analysis
The first step of the data flow is the conversion of image data into tabular data. The inventors use the statistical algorithms implemented in Affymetrix GeneChip™ Operating System (GCOS) (Affymetrix, Santa Clara, Calif.) for this task. While a number of alternatives to GCOS exist (such as d-chip, RMA, and PerfectMatch), the inventors' experience indicates that while these tools may provide advantages in the direct comparison of paired samples, they provide no advantage in class comparison and class discovery applications. Internal measurements of chip and sample quality (brightness scaling factor, noise, % present calls, and control gene 3'/5' ratios) are collected in GCOS, and only chips that meet the inventors' quality threshold are included for analysis. The bulk of data analysis is conducted with the software package BRB ArrayTools v3.0.2E developed by Dr. Richard Simon and Amy Peng Lam at the National Cancer Institute. This software emphasizes conservative, defensible analysis of array data, with a high level of validation through the use of permutation testing and leave-one-out cross-validation with removal of the left-out data set prior to gene selection. This approach leads to the association of reliability estimates with each discovery.
The analysis of the array data is similar to that outlined for the pulmonary hypertension analysis in Example 1 above. Briefly, the data is normalized (mean centered) and examined in an unsupervised fashion (see description of study design above). The goal of this analysis is to reduce bias in the discovery of meaningful groups in the patient population. Subsequent supervised analyses follow a class comparison paradigm, with the aim of discovering patterns of gene expression that support, or co-vary with assigned membership in new classes discovered in the unsupervised analysis, or with the clinical parameters.
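As a small illustration of the normalization and unsupervised steps, the sketch below mean-centers each gene across samples and then computes a correlation-based distance between sample profiles, which is a common input to hierarchical clustering. The choice to center per gene, the distance measure, the function names, and the toy values are assumptions of this sketch; the text does not specify them, and this is not the BRB ArrayTools implementation.

```python
import math


def mean_center_genes(matrix):
    """Subtract each gene's mean across samples; matrix[s][g] is log2 expression."""
    n_samples = len(matrix)
    n_genes = len(matrix[0])
    means = [sum(row[g] for row in matrix) / n_samples for g in range(n_genes)]
    return [[row[g] - means[g] for g in range(n_genes)] for row in matrix]


def correlation_distance(a, b):
    """1 - Pearson correlation between two sample profiles."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)


# Toy data (hypothetical): four samples by four genes; the first two samples
# and the last two samples were constructed to have similar profiles.
profiles = [
    [8.0, 5.0, 7.0, 6.0],
    [8.2, 5.1, 7.1, 6.1],
    [5.0, 8.0, 6.0, 7.0],
    [5.1, 8.3, 6.2, 7.1],
]
centered = mean_center_genes(profiles)
d_within = correlation_distance(centered[0], centered[1])
d_between = correlation_distance(centered[0], centered[2])
```

Samples that group together under such a distance, without reference to clinical labels, are the kind of candidate classes that the subsequent supervised class comparison would then test.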
Clinical data obtained through additional studies is the basis for potential classification groups, including pulmonary physiologic testing and demographic data. For each discriminator a binary class comparison analysis is conducted, and the reliability statistics (as determined by permutation testing) associated with the postulated classification are determined.
Once a signature or discriminator list of genes between the groups is determined, the expression data is evaluated using alternative techniques (Fig. 5). Quantitative reverse transcription-PCR (RT-PCR) is accepted as the "gold standard" validation method for array data. If differentially expressed cell signaling transcripts are present, these findings are validated with flow cytometry cell marker experiments. This methodology not only validates the array findings, but also provides useful information regarding the biology of any interesting candidate genes identified in the experiment. More extensive validation via protein abundance measurements in PBMC or in other tissues is considered in light of the biological function of the biomarkers discovered.
The independent validation (Fig. 5) of the differentially expressed genes is directed by the signature expression profile for sample discrimination obtained as discussed above. Since this methodologic approach is determined by the observed result, this study is designed to collect and maintain appropriate samples for future validation. First, additional RNA is maintained from the original isolation for quantitative PCR of differentially expressed transcripts. Second, an aliquot of PBMC is cryopreserved at the time of initial phlebotomy for future analysis by either western blot or flow cytometry if a cell surface marker of interest is differentially expressed. Specific antibodies of interest pertaining to highly differentially expressed genes or genes of biologic importance are used.
Reference List
1. Lopez, A. D. and C. C. Murray. 1998. The global burden of disease, 1990-2020. Nat.Med. 4:1241-1243.
2. U. S. Department of Health and Human Services, National Institutes of Health National Heart Lung and Blood Institute. NIH Publication No. 03-5229. 2003.
3. Barnes, P. J. 2000. Chronic obstructive pulmonary disease. N.Engl. J.Med. 343:269-280.
4. Fletcher, C. and R. Peto. 1977. The natural history of chronic airflow obstruction. Br.Med.J. 1:1645-1648.
5. Rutgers, S. R., D. S. Postma, N. H. ten Hacken, H. F. Kauffman, T. W. Der Mark, G. H. Koeter, and W. Timens. 2000. Ongoing airway inflammation in patients with COPD who do not currently smoke. Chest 117:262S.
6. Bullinger, L., K. Dohner, E. Bair, S. Frohling, R. F. Schlenk, R. Tibshirani, H. Dohner, and J. R. Pollack. 2004. Use of gene-expression profiling to identify prognostic subclasses in adult acute myeloid leukemia. N.Engl.J.Med. 350:1605-1616.
7. Valk, P. J., R. G. Verhaak, M. A. Beijen, C. A. Erpelinck, Barjesteh van Waalwijk van Doorn-Khosrovani, J. M. Boer, H. B. Beverloo, M. J. Moorhouse, P. J. van der Spek, B. Lowenberg, and R. Delwel. 2004. Prognostically useful gene-expression profiles in acute myeloid leukemia. N.Engl.J.Med. 350:1617-1628.
8. Bovin, L. F., K. Rieneck, C. Workman, H. Nielsen, S. F. Sorensen, H. Skjodt, A. Florescu, S. Brunak, and K. Bendtzen. 2004. Blood cell gene expression profiling in rheumatoid arthritis. Discriminative genes and effect of rheumatoid factor. Immunol.Lett. 93:217-226.
9. Jison, M. L., P. J. Munson, J. J. Barb, A. F. Suffredini, S. Talwar, C. Logun, N. Raghavachari, J. H. Beigel, J. H. Shelhamer, R. L. Danner, and M. T. Gladwin. 2004. Blood mononuclear cell gene expression profiles characterize the oxidant, hemolytic, and inflammatory stress of sickle cell disease. Blood 104:270-280.
10. Olsen, N. J., J. H. Moore, and T. M. Aune. 2004. Gene expression signatures for autoimmune disease in peripheral blood mononuclear cells. Arthritis Res.Ther. 6:120-128.
11. Ramanathan, M., B. Weinstock-Guttman, L. T. Nguyen, D. Badgett, C. Miller, K. Patrick, C. Brownscheidle, and L. Jacobs. 2001. In vivo gene expression revealed by cDNA arrays: the pattern in relapsing-remitting multiple sclerosis patients compared with normal subjects. J.Neuroimmunol. 116:213-219.
12. DePrimo, S. E., L. M. Wong, D. B. Khatry, S. L. Nicholas, W. C. Manning, B. D. Smolich, A. M. O'Farrell, and J. M. Cherrington. 2003. Expression profiling of blood samples from an SU5416 Phase III metastatic colorectal cancer clinical trial: a novel strategy for biomarker identification. BMC. Cancer 3:3.
13. Bull, T. M., C. D. Coldren, M. Moore, S. M. Sotto-Santiago, D. V. Pham, S. P. Nana-Sinkam, N. F. Voelkel, and M. W. Geraci. 2004. Gene Microarray Analysis of Peripheral Blood Cells in Pulmonary Arterial Hypertension. Am.J.Respir.Crit Care Med.
14. Anthonisen, N. R., J. E. Connett, and R. P. Murray. 2002. Smoking and lung function of Lung Health Study participants after 11 years. Am.J.Respir.Crit Care Med. 166:675-679.
15. Sutherland, E. R. and R. M. Cherniack. 2004. Management of chronic obstructive pulmonary disease. N.Engl.J.Med. 350:2689-2697.
16. Silverman, E. K. and F. E. Speizer. 1996. Risk factors for the development of chronic obstructive pulmonary disease. Med.Clin.North Am. 80:501-522.
17. Redline, S., P. V. Tishler, F. I. Lewitter, I. B. Tager, A. Munoz, and F. E. Speizer. 1987. Assessment of genetic and nongenetic influences on pulmonary function. A twin study. Am.Rev.Respir.Dis. 135:217-222.
18. Tager, I. B., B. Rosner, P. V. Tishler, F. E. Speizer, and E. H. Kass. 1976. Household aggregation of pulmonary function and chronic bronchitis. Am.Rev.Respir.Dis. 114:485-492.
19. Barnes, P. J. 1999. Genetics and pulmonary medicine. 9. Molecular genetics of chronic obstructive pulmonary disease. Thorax 54:245-252.
20. Barnes, P. J. 2003. New concepts in chronic obstructive pulmonary disease. Annu.Rev.Med. 54:113-129.
21. Barnes, P. J. 2000. Mechanisms in COPD: differences from asthma. Chest 117:10S-14S.
22. Keatings, V. M., P. D. Collins, D. M. Scott, and P. J. Barnes. 1996. Differences in interleukin-8 and tumor necrosis factor-alpha in induced sputum from patients with chronic obstructive pulmonary disease or asthma. Am.J.Respir.Crit Care Med. 153:530-534.
23. Saetta, M., G. Turato, P. Maestrelli, C. E. Mapp, and L. M. Fabbri. 2001. Cellular and structural bases of chronic obstructive pulmonary disease. Am.J.Respir.Crit Care Med. 163:1304-1309.
24. Di Stefano, A., G. Turato, P. Maestrelli, C. E. Mapp, M. P. Ruggieri, A. Roggeri, P. Boschetto, L. M. Fabbri, and M. Saetta. 1996. Airflow limitation in chronic bronchitis is associated with T-lymphocyte and macrophage infiltration of the bronchial mucosa. Am.J.Respir.Crit Care Med. 153:629-632.
25. Jeffery, P. K. 1998. Structural and inflammatory changes in COPD: a comparison with asthma. Thorax 53:129-136.
26. Saetta, M., A. Di Stefano, P. Maestrelli, A. Ferraresso, R. Drigo, A. Potena, A. Ciaccia, and L. M. Fabbri. 1993. Activated T-lymphocytes and macrophages in bronchial mucosa of subjects with chronic bronchitis. Am.Rev.Respir.Dis. 147:301-306.
27. Thompson, A. B., D. Daughton, R. A. Robbins, M. A. Ghafouri, M. Oehlerking, and S. I. Rennard. 1989. Intraluminal airway inflammation in chronic bronchitis. Characterization and correlation with clinical parameters. Am.Rev.Respir.Dis. 140:1527-1537.
28. O'Shaughnessy, T. C., T. W. Ansari, N. C. Barnes, and P. K. Jeffery. 1997. Inflammation in bronchial biopsies of subjects with chronic bronchitis: inverse relationship of CD8+ T lymphocytes with FEV1. Am.J.Respir.Crit Care Med. 155:852-857.
29. Tzanakis, N., G. Chrysofakis, M. Tsoumakidou, D. Kyriakou, J. Tsiligianni, D. Bouros, and N. M. Siafakas. 2004. Induced sputum CD8+ T-lymphocyte subpopulations in chronic obstructive pulmonary disease. Respir.Med. 98:57-65.
30. Di Stefano, A., A. Capelli, M. Lusuardi, P. Balbo, C. Vecchio, P. Maestrelli, C. E. Mapp, L. M. Fabbri, C. F. Donner, and M. Saetta. 1998. Severity of airflow limitation is associated with severity of airway inflammation in smokers. Am.J.Respir.Crit Care Med. 158:1277-1285.
31. Majo, J., H. Ghezzo, and M. G. Cosio. 2001. Lymphocyte population and apoptosis in the lungs of smokers and their relation to emphysema. Eur.Respir.J. 17:946-953.
32. Saetta, M., S. Baraldo, L. Corbino, G. Turato, F. Braccioni, F. Rea, G. Cavallesco, G. Tropeano, C. E. Mapp, P. Maestrelli, A. Ciaccia, and L. M. Fabbri. 1999. CD8+ve cells in the lungs of smokers with chronic obstructive pulmonary disease. Am.J.Respir.Crit Care Med. 160:711-717.
33. Saetta, M., A. Di Stefano, G. Turato, F. M. Facchini, L. Corbino, C. E. Mapp, P. Maestrelli, A. Ciaccia, and L. M. Fabbri. 1998. CD8+ T-lymphocytes in peripheral airways of smokers with chronic obstructive pulmonary disease. Am.J.Respir.Crit Care Med. 157:822-826.
34. Liu, A. N., A. Z. Mohammed, W. R. Rice, D. T. Fiedeldey, J. S. Liebermann, J. A. Whitsett, T. J. Braciale, and R. I. Enelow. 1999. Perforin-independent CD8(+) T-cell-mediated cytotoxicity of alveolar epithelial cells is preferentially mediated by tumor necrosis factor-alpha: relative insensitivity to Fas ligand. Am.J.Respir.Cell Mol.Biol. 20:849-858.
35. Enelow, R. I., A. Z. Mohammed, M. H. Stoler, A. N. Liu, J. S. Young, Y. H. Lou, and T. J. Braciale. 1998. Structural and functional consequences of alveolar cell recognition by CD8(+) T lymphocytes in experimental lung disease. J.Clin.Invest 102:1653-1661.
36. Rahman, I., D. Morrison, K. Donaldson, and W. MacNee. 1996. Systemic oxidative stress in asthma, COPD, and smokers. Am.J.Respir.Crit Care Med. 154:1055-1060.
37. Burnett, D., A. Chamba, S. L. Hill, and R. A. Stockley. 1987. Neutrophils from subjects with chronic obstructive lung disease show enhanced chemotaxis and extracellular proteolysis. Lancet 2:1043-1046.
38. Noguera, A., S. Batle, C. Miralles, J. Iglesias, X. Busquets, W. MacNee, and A. G. Agusti. 2001. Enhanced neutrophil response in chronic obstructive pulmonary disease. Thorax 56:432-437.
39. Noguera, A., X. Busquets, J. Sauleda, J. M. Villaverde, W. MacNee, and A. G. Agusti. 1998. Expression of adhesion molecules and G proteins in circulating neutrophils in chronic obstructive pulmonary disease. Am.J.Respir.Crit Care Med. 158:1664-1668.
40. Noguera, A., E. Sala, A. R. Pons, J. Iglesias, W. MacNee, and A. G. Agusti. 2004. Expression of adhesion molecules during apoptosis of circulating neutrophils in COPD. Chest 125:1837-1842.
41. Di Francia, M., D. Barbier, J. L. Mege, and J. Orehek. 1994. Tumor necrosis factor-alpha levels and weight loss in chronic obstructive pulmonary disease. Am.J.Respir.Crit Care Med. 150:1453-1455.
42. Schols, A. M., W. A. Buurman, A. J. Staal-van den Brekel, M. A. Dentener, and E. F. Wouters. 1996. Evidence for a relation between metabolic derangements and increased levels of inflammatory mediators in a subgroup of patients with chronic obstructive pulmonary disease. Thorax 51:819-824.
43. de Godoy, I., M. Donahoe, W. J. Calhoun, J. Mancino, and R. M. Rogers. 1996. Elevated TNF-alpha production by peripheral blood monocytes of weight-losing COPD patients. Am.J.Respir.Crit Care Med. 153:633-637.
44. Miller, L. G., G. Goldstein, M. Murphy, and L. C. Ginns. 1982. Reversible alterations in immunoregulatory T cells in smoking. Analysis by monoclonal antibodies and flow cytometry. Chest 82:526-529.
45. Costabel, U., K. J. Bross, C. Reuter, K. H. Ruhle, and H. Matthys. 1986. Alterations in immunoregulatory T-cell subsets in cigarette smokers. A phenotypic analysis of bronchoalveolar and blood lymphocytes. Chest 90:39-44.
46. Ekberg-Jansson, A., B. Andersson, E. Avra, O. Nilsson, and C. G. Lofdahl. 2000. The expression of lymphocyte surface antigens in bronchial biopsies, bronchoalveolar lavage cells and blood cells in healthy smoking and never-smoking men, 60 years old. Respir.Med. 94:264-272.
47. de Jong, J. W., B. Belt-Gritter, G. H. Koeter, and D. S. Postma. 1997. Peripheral blood lymphocyte cell subsets in subjects with chronic obstructive pulmonary disease: association with smoking, IgE and lung function. Respir.Med. 91:67-76.
48. Kim, W. D., W. S. Kim, Y. Koh, S. D. Lee, C. M. Lim, D. S. Kim, and Y. J. Cho. 2002. Abnormal peripheral blood T-lymphocyte subsets in a subgroup of patients with COPD. Chest 122:437-444.
49. Hodge, S. J., G. L. Hodge, P. N. Reynolds, R. Scicchitano, and M. Holmes. 2003. Increased production of TGF-beta and apoptosis of T lymphocytes isolated from peripheral blood in COPD. Am.J.Physiol Lung Cell Mol.Physiol 285:L492-L499.
50. Takabatake, N., H. Nakamura, S. Inoue, K. Terashita, H. Yuki, S. Kato, S. Yasumura, and H. Tomoike. 2000. Circulating levels of soluble Fas ligand and soluble Fas in patients with chronic obstructive pulmonary disease. Respir.Med. 94:1215-1220.
51. Lehmann, C., A. Wilkening, D. Leiber, A. Markus, N. Krug, R. Pabst, and T. Tschernig. 2001. Lymphocytes in the bronchoalveolar space reenter the lung tissue by means of the alveolar epithelium, migrate to regional lymph nodes, and subsequently rejoin the systemic immune system. Anat.Rec. 264:229-236.
52. Hall, M. A., K. R. Ahmadi, P. Norman, H. Snieder, A. J. MacGregor, R. W. Vaughan, T. D. Spector, and J. S. Lanchbury. 2000. Genetic influence on peripheral blood T lymphocyte levels. Genes Immun. 1:423-427.
53. Amadori, A., R. Zamarchi, G. De Silvestro, G. Forza, G. Cavatton, G. A. Danieli, M. Clementi, and L. Chieco-Bianchi. 1995. Genetic control of the CD4/CD8 T-cell ratio in humans. Nat.Med. 1:1279-1283.
54. Whitney, A. R., M. Diehn, S. J. Popper, A. A. Alizadeh, J. C. Boldrick, D. A. Relman, and P. O. Brown. 2003. Individuality and variation in gene expression patterns in human blood. Proc.Natl.Acad.Sci.U.S.A 100:1896-1901.
55. Rus, V., S. P. Atamas, V. Shustova, I. G. Luzina, F. Selaru, L. S. Magder, and C. S. Via. 2002. Expression of cytokine- and chemokine-related genes in peripheral blood mononuclear cells from lupus patients by cDNA array. Clin.Immunol. 102:283-290.
56. Fabbri, L. M. and S. S. Hurd. 2003. Global Strategy for the Diagnosis, Management and Prevention of COPD: 2003 update. Eur.Respir.J. 22:1-2.
57. Hintze JL. PASS 2001. NCSS, Kaysville, Utah.
58. Yang, Y. H. and T. Speed. 2002. Design issues for cDNA microarray experiments. Nat.Rev.Genet. 3:579-588.
59. Dobbin, K. and R. M. Simon. 2005. Sample size determination in microarray experiments for class comparison and prognostic classification. Biostatistics 6:27-38.
Each reference disclosed or cited herein is incorporated herein by reference in its entirety.
While various embodiments of the present invention have been described in detail, it is apparent that modifications and adaptations of those embodiments will occur to those skilled in the art. It is to be expressly understood, however, that such modifications and adaptations are within the scope of the present invention, as set forth in the following claims.

Claims

What is claimed is:
1. A method to diagnose chronic obstructive pulmonary disease (COPD) or a predisposition to develop COPD, comprising: a) detecting the level of expression of at least one gene in a sample of peripheral blood cells isolated from a patient to be tested, wherein the gene is chosen from the genes represented by SEQ ID NO: 1-323, and wherein the level of expression of each of the genes in any one or more of Tables 2-5 is associated with COPD as measured by either upregulation or downregulation of gene expression in peripheral blood cells from patients with COPD as compared to the level of expression of the genes in peripheral blood cells from normal controls; and b) comparing the level of expression of the gene from the patient sample to the level of expression of the gene in normal control peripheral blood cells, wherein detection of regulation of the expression of the gene in the patient sample in the direction associated with COPD indicated in Table 2, 3, 4 and/or 5 indicates a diagnosis of COPD in the patient.
2. The method of Claim 1, wherein the step (a) of detecting comprises detecting expression of at least 5 genes chosen from the genes represented by SEQ ID NO: 1-323.
3. The method of Claim 1, wherein the step (a) of detecting comprises detecting expression of at least 10 genes chosen from the genes represented by SEQ ID NO: 1-323.
4. The method of Claim 1, wherein the step (a) of detecting comprises detecting expression of at least 15 genes chosen from the genes represented by SEQ ID NO: 1-323.
5. The method of Claim 1, wherein the step (a) of detecting comprises detecting expression of at least 20 genes chosen from the genes represented by SEQ ID NO: 1-323.
6. The method of Claim 1, wherein the step (a) of detecting comprises detecting expression of at least 25 genes chosen from the genes represented by SEQ ID NO: 1-323.
7. The method of Claim 1, wherein the step (a) of detecting comprises detecting expression of at least 50 genes chosen from the genes represented by SEQ ID NO: 1-323.
8. The method of Claim 1, wherein the step (a) of detecting comprises detecting expression of at least 75 genes chosen from the genes represented by SEQ ID NO: 1-323.
9. The method of Claim 1, wherein the step (a) of detecting comprises detecting expression of at least 100 genes chosen from the genes represented by SEQ ID NO: 1-229, and 320-324.
10. The method of Claim 1, wherein the step (a) of detecting comprises detecting expression of at least 125 genes chosen from the genes represented by SEQ ID NO: 1-229, and 320-324.
11. The method of Claim 1, wherein the step (a) of detecting comprises detecting expression of at least 150 genes chosen from the genes represented by SEQ ID NO: 1-229, and 320-324.
12. The method of Claim 1, wherein the step (a) of detecting comprises detecting expression of at least 175 genes chosen from the genes represented by SEQ ID NO: 1-229, and 320-324.
13. The method of Claim 1, wherein the step (a) of detecting comprises detecting expression of at least 200 genes chosen from the genes represented by SEQ ID NO: 1-229, and 320-324.
14. The method of Claim 1, wherein the step (a) of detecting comprises detecting expression of at least 225 genes chosen from the genes represented by SEQ ID NO: 1-229, and 320-324.
15. The method of Claim 1, wherein the step (a) of detecting comprises detecting expression of all of the genes represented by SEQ ID NO: 1-323.
16. The method of any one of Claims 1-15, wherein expression of the gene is detected by measuring amounts of transcripts of the gene in the patient peripheral blood cells.
17. The method of any one of Claims 1-15, wherein expression of the gene is detected by detecting hybridization of at least a portion of the gene or a transcript thereof to a nucleic acid molecule comprising a portion of the gene or a transcript thereof in a nucleic acid array.
18. The method of any one of Claims 1-15, wherein expression of the gene is detected by detecting the production of a protein encoded by the gene.
19. The method of any one of Claims 1-18, wherein the level of expression of the gene in the peripheral blood cells of a normal control has been predetermined.
20. A method to monitor the treatment of a patient with COPD, comprising: a) detecting the level of expression of at least one gene in a sample of peripheral blood cells isolated from a patient undergoing treatment for COPD, wherein the gene is chosen from the genes represented by any one of SEQ ID NO: 1-324, and wherein the level of expression of each of the genes represented by any one of SEQ ID NO: 1-324 is associated with COPD as measured by either upregulation or downregulation of gene expression in peripheral blood cells from patients with COPD as compared to the level of expression of the genes in peripheral blood cells from normal controls; b) comparing the level of expression of the gene from the patient sample to the level of expression of the gene in a prior sample of peripheral blood cells from the patient, wherein detection of a change in the level of expression of the gene, as compared to the level of expression in the prior sample, toward the level of the expression of the gene in a normal control sample, indicates that the treatment for COPD is producing a beneficial result.
21. The method of Claim 20, wherein detection of a change in the level of expression of the gene, as compared to the level of expression in the prior sample, away from the level of the expression of the gene in a normal control sample, indicates a progression of the COPD.
22. The method of Claim 20, wherein detection of no significant change in the level of expression of the gene, as compared to the level of expression in the prior sample, indicates no significant change in the progression or treatment of the COPD in the patient.
23. A plurality of polynucleotides for the detection of the expression of genes that are indicative of COPD in a patient or a preclinical disposition therefor; wherein the plurality of polynucleotides consists of at least two polynucleotides, wherein each polynucleotide is at least 5 nucleotides in length, and wherein each polynucleotide is complementary to an RNA transcript, or nucleotide derived therefrom, of a gene that is regulated differently in individuals with COPD as compared to individuals that do not have COPD.
24. The plurality of polynucleotides of Claim 23, wherein each polynucleotide is complementary to an RNA transcript, or a polynucleotide derived therefrom, of a gene represented by any one of SEQ ID NO: 1-324.
25. The plurality of polynucleotides of Claim 23, wherein the plurality of polynucleotides comprises polynucleotides that are complementary to an RNA transcript, or a nucleotide derived therefrom, of at least two genes represented by any one of SEQ ID NO: 1-324.
26. The plurality of polynucleotides of Claim 23, wherein the plurality of polynucleotides comprises polynucleotides that are complementary to an RNA transcript, or a nucleotide derived therefrom, of at least five genes represented by any one of SEQ ID NO: 1-324.
27. The plurality of polynucleotides of Claim 23, wherein the plurality of polynucleotides comprises polynucleotides that are complementary to an RNA transcript, or a nucleotide derived therefrom, of at least 10 genes represented by any one of SEQ ID NO: 1-324.
28. The plurality of polynucleotides of Claim 23, wherein the plurality of polynucleotides comprises polynucleotides that are complementary to an RNA transcript, or a nucleotide derived therefrom, of at least 50 genes represented by any one of SEQ ID NO: 1-324.
29. The plurality of polynucleotides of Claim 23, wherein the plurality of polynucleotides comprises polynucleotides that are complementary to an RNA transcript, or a nucleotide derived therefrom, of at least 100 genes represented by any one of SEQ ID NO: 1-229, and 320-324.
30. The plurality of polynucleotides of Claim 23, wherein the plurality of polynucleotides comprises polynucleotides that are complementary to an RNA transcript, or a nucleotide derived therefrom, of at least 150 genes represented by any one of SEQ ID NO: 1-229, and 320-324.
31. The plurality of polynucleotides of Claim 23, wherein the plurality of polynucleotides comprises polynucleotides that are complementary to an RNA transcript, or a nucleotide derived therefrom, of at least 200 genes represented by any one of SEQ ID NO: 1-229, and 320-324.
32. The plurality of polynucleotides of any one of Claims 23-31, wherein said polynucleotides are immobilized on a substrate.
33. The plurality of polynucleotides of any one of Claims 23-31, wherein said polynucleotides are hybridizable array elements in a microarray.
34. The plurality of polynucleotides of any one of Claims 23-31, wherein said polynucleotides are conjugated to detectable markers.
35. A method to diagnose chronic obstructive pulmonary disease (COPD) in a patient, comprising: a) detecting the level of expression of at least one gene in a sample of peripheral blood cells isolated from a patient to be tested, wherein the gene is chosen from a group of genes, each of which has been previously identified to be upregulated or downregulated in the peripheral blood cells of patients who have been diagnosed with COPD, as compared to the level of expression of the gene in normal control peripheral blood cells; and b) comparing the level of expression of the gene from the patient sample to the level of expression of the gene in normal control peripheral blood cells, wherein detection of regulation of the expression of the gene in the patient sample in the direction associated with COPD as indicated by the previous identification, indicates a diagnosis of COPD in the patient.
36. A method to identify a compound with the potential to treat or prevent chronic obstructive pulmonary disease (COPD), comprising: a) contacting a test compound with a cell that expresses a gene selected from any one or more of the genes represented by any one of SEQ ID NO: 1-324; b) identifying compounds that increase the expression or activity of genes represented by any one of SEQ ID NO: 1-324 or the proteins encoded thereby that are downregulated in peripheral blood cells of patients with COPD as compared to peripheral blood cells of normal controls, or that decrease the expression or activity of genes represented by any one of SEQ ID NO: 1-324 or the proteins encoded thereby that are upregulated in peripheral blood cells of patients with COPD as compared to peripheral blood cells of normal controls.
37. A method to treat a patient with COPD, comprising administering to the patient a therapeutic composition comprising a compound identified by the method of Claim 36.
38. The method of any of Claim 1, Claim 20, or Claim 35, wherein detection of a change in the level of expression of at least one gene comprises detecting the presence of a protein.
39. The method of Claim 38, wherein the method further comprises detecting the presence of the protein using a reagent that specifically binds to the protein.
40. The method of Claim 39, wherein the reagent is selected from the group consisting of an antibody, an antibody derivative, and an antibody fragment.
41. A plurality of reagents for the detection of the expression of genes that are indicative of COPD in a patient or a preclinical disposition therefor; wherein the plurality of reagents consists of at least two reagents, each of which specifically binds to a protein, wherein each protein is at least 15 amino acids in length, and wherein each protein is encoded by a gene that is regulated differently in individuals with COPD as compared to individuals that do not have COPD.
42. The plurality of reagents of Claim 41, wherein each protein is encoded by a gene represented by any one of SEQ ID NO: 1-324.
43. The plurality of reagents of Claim 41, wherein at least two proteins are encoded by a gene represented by any one of SEQ ID NO: 1-324.
44. The plurality of reagents of Claim 41, wherein at least five proteins are encoded by a gene represented by any one of SEQ ID NO: 1-324.
45. The plurality of reagents of Claim 41, wherein at least 10 proteins are encoded by a gene represented by any one of SEQ ID NO: 1-324.
46. The plurality of reagents of Claim 41, wherein at least 50 proteins are encoded by a gene represented by any one of SEQ ID NO: 1-324.
47. The plurality of reagents of Claim 41, wherein at least 100 proteins are encoded by a gene represented by any one of SEQ ID NO: 1-229, and 320-324.
48. The plurality of reagents of Claim 41, wherein at least 150 proteins are encoded by a gene represented by any one of SEQ ID NO: 1-229, and 320-324.
49. The plurality of reagents of Claim 41, wherein at least 200 proteins are encoded by a gene represented by any one of SEQ ID NO: 1-229, and 320-324.
50. The plurality of reagents of any one of Claims 41-49, wherein said reagents are immobilized on a substrate.
51. The plurality of reagents of any one of Claims 41-49, wherein each of said reagents is selected from the group consisting of an antibody, an antibody derivative, and an antibody fragment, and wherein each of said reagents is an element in a microarray.
52. The plurality of reagents of any one of Claims 41-49, wherein said reagents are conjugated to detectable markers.
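
By way of illustration only, and not as part of the claims, the comparison steps recited in Claims 1 and 35 (diagnosis) and Claims 20-22 (monitoring) can be sketched in Python. Everything in the sketch is an assumption made for the example: the claims do not state a numerical cutoff, so the log2 fold-change cutoff and the tolerance are hypothetical, the gene labels are placeholders rather than entries from Tables 2-5, and expression levels are assumed to be positive values on a linear scale.

```python
from math import log2

# Hypothetical cutoff: the claims do not say how large a change must be.
LOG2_FC_CUTOFF = 1.0


def regulation(sample_level, control_level, cutoff=LOG2_FC_CUTOFF):
    """Return 'up', 'down', or 'none' for a sample level relative to a control level."""
    change = log2(sample_level / control_level)
    if change >= cutoff:
        return "up"
    if change <= -cutoff:
        return "down"
    return "none"


def genes_matching_copd(patient, control, copd_direction):
    """List genes regulated in the patient in the COPD-associated direction."""
    return [
        gene
        for gene, direction in copd_direction.items()
        if regulation(patient[gene], control[gene]) == direction
    ]


def monitor(prior_level, current_level, control_level, tolerance=0.5):
    """Classify the change between a prior and a current sample from one patient.

    'tolerance' (log2 units) is a hypothetical threshold for a significant change.
    """
    if abs(log2(current_level / prior_level)) < tolerance:
        return "no significant change"
    # Distance of each sample from the normal control level, on the log2 scale.
    prior_gap = abs(log2(prior_level / control_level))
    current_gap = abs(log2(current_level / control_level))
    return "toward normal" if current_gap < prior_gap else "away from normal"


# Placeholder data: gene labels and levels are invented for illustration.
patient = {"GENE_A": 8.0, "GENE_B": 1.0, "GENE_C": 4.0}
control = {"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 4.0}
copd_direction = {"GENE_A": "up", "GENE_B": "down", "GENE_C": "up"}

matches = genes_matching_copd(patient, control, copd_direction)
```

In this sketch a gene counts toward a diagnosis only when its regulation in the patient sample matches the direction previously associated with COPD, and a change between two samples from the same patient is read as beneficial when the level moves toward the normal control level. How many matching genes would support a diagnosis is not fixed by the sketch.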
PCT/US2006/011570 2005-03-28 2006-03-28 Diagnosis of chronic pulmonary obstructive disease and monitoring of therapy using gene expression analysis of peripheral blood cells WO2006105252A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US66608505P 2005-03-28 2005-03-28
US60/666,085 2005-03-28

Publications (2)

Publication Number Publication Date
WO2006105252A2 true WO2006105252A2 (en) 2006-10-05
WO2006105252A3 WO2006105252A3 (en) 2009-06-04

Family

ID=37054110

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2006/011570 WO2006105252A2 (en) 2005-03-28 2006-03-28 Diagnosis of chronic pulmonary obstructive disease and monitoring of therapy using gene expression analysis of peripheral blood cells

Country Status (1)

Country Link
WO (1) WO2006105252A2 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5763158A (en) * 1997-02-06 1998-06-09 The United States Of America As Represented By The Secretary Of The Army Detection of multiple antigens or antibodies
US6607879B1 (en) * 1998-02-09 2003-08-19 Incyte Corporation Compositions for the detection of blood cell and immunological response gene expression

Cited By (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2330424A1 (en) * 2008-07-02 2011-06-08 Aposcience AG COPD diagnosis
EP2141498A1 (en) * 2008-07-02 2010-01-06 Apoptec AG Cellular COPD diagnosis
WO2010000820A2 (en) * 2008-07-02 2010-01-07 Aposcience Ag Copd diagnosis
WO2010000819A1 (en) * 2008-07-02 2010-01-07 Aposcience Ag Cellular copd diagnosis
US8415112B2 (en) 2008-07-02 2013-04-09 Aposcience Ag COPD diagnosis
WO2010000820A3 (en) * 2008-07-02 2010-03-04 Aposcience Ag Copd diagnosis
EP2141499A1 (en) * 2008-07-02 2010-01-06 Apoptec AG COPD diagnosis
CN102170910A (en) * 2008-07-17 2011-08-31 独立行政法人理化学研究所 Novel use application of sugar chain-recognizing receptor
JP5413915B2 (en) * 2008-07-17 2014-02-12 独立行政法人理化学研究所 Novel uses of sugar chain recognition receptors
US8440187B2 (en) 2008-07-17 2013-05-14 Riken Use application of sugar chain-recognizing receptor
WO2010008084A1 (en) * 2008-07-17 2010-01-21 独立行政法人理化学研究所 Novel use application of sugar chain-recognizing receptor
US12110554B2 (en) 2009-05-07 2024-10-08 Veracyte, Inc. Methods for classification of tissue samples as positive or negative for cancer
EP2537025A1 (en) * 2010-02-16 2012-12-26 CRC For Asthma And Airways Ltd Protein biomarkers for obstructive airways diseases
WO2011100792A1 (en) 2010-02-16 2011-08-25 Crc For Asthma And Airways Ltd Protein biomarkers for obstructive airways diseases
EP2537025A4 (en) * 2010-02-16 2013-08-21 Newcastle Innovation Ltd Protein biomarkers for obstructive airways diseases
CN103403556A (en) * 2011-03-11 2013-11-20 霍夫曼-拉罗奇有限公司 ARMET as marker for chronic obstructive pulmonary disease (COPD)
JP2014509741A (en) * 2011-03-11 2014-04-21 エフ.ホフマン−ラ ロシュ アーゲー ASC as a marker of chronic obstructive pulmonary disease (COPD)
WO2012123297A1 (en) 2011-03-11 2012-09-20 Roche Diagnostics Gmbh Nnmt as marker for chronic obstructive pulmonary disease (copd)
CN103415770B (en) * 2011-03-11 2015-09-02 霍夫曼-拉罗奇有限公司 NNMT is as the mark of chronic obstructive pulmonary disease (COPD)
US9116156B2 (en) 2011-03-11 2015-08-25 Roche Diagnostics Operations, Inc. ASC as a marker for chronic obstructive pulmonary disease (COPD)
WO2012123294A1 (en) 2011-03-11 2012-09-20 Roche Diagnostics Gmbh Apex1 as marker for chronic obstructive pulmonary disease (copd)
CN103403555A (en) * 2011-03-11 2013-11-20 霍夫曼-拉罗奇有限公司 ASC as marker for chronic obstructive pulmonary disease (COPD)
WO2012123296A1 (en) 2011-03-11 2012-09-20 Roche Diagnostics Gmbh Armet as marker for chronic obstructive pulmonary disease (copd)
CN103415770A (en) * 2011-03-11 2013-11-27 霍夫曼-拉罗奇有限公司 NNMT as marker for chronic obstructive pulmonary disease (COPD)
WO2012123293A1 (en) 2011-03-11 2012-09-20 Roche Diagnostics Gmbh Seprase as marker for chronic obstructive pulmonary disease (copd)
WO2012123299A1 (en) 2011-03-11 2012-09-20 Roche Diagnostics Gmbh Asc as marker for chronic obstructive pulmonary disease (copd)
JP2014509739A (en) * 2011-03-11 2014-04-21 エフ.ホフマン−ラ ロシュ アーゲー ARMET as a marker for chronic obstructive pulmonary disease (COPD)
JP2014509740A (en) * 2011-03-11 2014-04-21 エフ.ホフマン−ラ ロシュ アーゲー NNMT as a marker for chronic obstructive pulmonary disease (COPD)
WO2012123295A1 (en) 2011-03-11 2012-09-20 Roche Diagnostics Gmbh Fen1 as marker for chronic obstructive pulmonary disease (copd)
JP2014509738A (en) * 2011-03-11 2014-04-21 エフ.ホフマン−ラ ロシュ アーゲー FEN1 as a marker for chronic obstructive pulmonary disease (COPD)
WO2013113852A1 (en) * 2012-01-31 2013-08-08 Universite De Strasbourg Biomarker for chronic inflammatory lung diseases
FR2986239A1 (en) * 2012-01-31 2013-08-02 Univ Strasbourg BIOMARKER FOR CHRONIC PULMONARY INFLAMMATORY DISEASES
WO2013190092A1 (en) * 2012-06-21 2013-12-27 Philip Morris Products S.A. Gene signatures for copd diagnosis
US10526655B2 (en) 2013-03-14 2020-01-07 Veracyte, Inc. Methods for evaluating COPD status
EP2968988A4 (en) * 2013-03-14 2016-11-16 Allegro Diagnostics Corp Methods for evaluating copd status
EP3626308A1 (en) * 2013-03-14 2020-03-25 Veracyte, Inc. Methods for evaluating copd status
US11976329B2 (en) 2013-03-15 2024-05-07 Veracyte, Inc. Methods and systems for detecting usual interstitial pneumonia
US9952225B2 (en) 2014-01-24 2018-04-24 National Jewish Health Methods for detection of respiratory diseases
US10684292B2 (en) 2014-01-24 2020-06-16 National Jewish Health Methods for detection of emphysema
WO2015112848A1 (en) * 2014-01-24 2015-07-30 National Jewish Health Methods for detection of respiratory diseases
US11639527B2 (en) 2014-11-05 2023-05-02 Veracyte, Inc. Methods for nucleic acid sequencing
WO2017158152A1 (en) * 2016-03-17 2017-09-21 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e. V. Diagnosis of chronic obstructive pulmonary disease (copd)
WO2017158146A1 (en) * 2016-03-17 2017-09-21 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e. V. Method for the diagnosis of chronic diseases based on monocyte transcriptome analysis
WO2020145041A1 (en) * 2019-01-11 2020-07-16 日本たばこ産業株式会社 In vitro evaluation method for risk of chronic obstructive pulmonary disease associated with smoking or inhalation
JPWO2020145041A1 (en) * 2019-01-11 2021-11-18 日本たばこ産業株式会社 In vitro assessment of the risk of chronic obstructive pulmonary disease from smoking or inhalation
JP7203124B2 (en) 2019-01-11 2023-01-12 日本たばこ産業株式会社 In vitro assessment method for risk of chronic obstructive pulmonary disease from smoking or inhalation

Also Published As

Publication number Publication date
WO2006105252A3 (en) 2009-06-04

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application
NENP Non-entry into the national phase (Ref country code: DE)
NENP Non-entry into the national phase (Ref country code: RU)
122 Ep: PCT application non-entry in European phase (Ref document number: 06740011; Country of ref document: EP; Kind code of ref document: A2)