WO2014173905A2 - Methods and kits for prognosis of stage i nsclc by determining the methylation pattern of cpg dinucleotides - Google Patents

Info

Publication number
WO2014173905A2
Authority
WO
WIPO (PCT)
Prior art keywords
gene
cpg
cpg site
nsclc
methylation
Prior art date
Application number
PCT/EP2014/058150
Other languages
French (fr)
Other versions
WO2014173905A3 (en)
Inventor
Manel Esteller Badosa
Juan SANDOVAL DEL AMOR
Jesús MÉNDEZ GONZÁLEZ
Original Assignee
Institut D'investigació Biomèdica De Bellvitge (Idibell)
Fundació Institució Catalana De Recerca I Estudis Avançats
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institut D'investigació Biomèdica De Bellvitge (Idibell), Fundació Institució Catalana De Recerca I Estudis Avançats
Publication of WO2014173905A2
Publication of WO2014173905A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/118 Prognosis of disease development
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/154 Methylation markers

Definitions

  • the present invention relates to the field of pharmacogenomics and in particular to an in vitro method for prognosis of a stage I NSCLC patient, comprising determining in a biological sample of said patient the methylation pattern in one or more genes. It further relates to an in vitro method for selecting a stage I NSCLC patient for an anti-cancer treatment based on the method of prognosis. Also, it relates to nucleic acids and kits useful for these methods and an anti-cancer treatment for the selected patients.
  • NSCLC Non-small-cell lung cancer
  • patients with NSCLC can be divided into different groups that reflect both the extent of the disease and the treatment approach.
  • the group of patients that has tumors which are surgically resectable (generally stage I, stage II, and selected stage III tumors) still has the best prognosis.
  • the poor prognosis of NSCLC patients is associated with several factors, among them the late diagnosis of the disease and the small number of effective drugs.
  • the absence of validated prognostic biomarkers could also be relevant, because even patients with stage I NSCLC, who undergo potentially curative surgical resection, are at high risk of dying from recurrent disease, with a 5-year relapse rate of 35-50%.
  • NSCLC is a tumor in which only small improvements in clinical outcome have been achieved. The issue is critical for stage I patients for whom there are no available biomarkers that indicate which high-risk patients should receive adjuvant chemotherapy.
  • the present invention relates to an in vitro method for the prognosis of a NSCLC patient, comprising determining in a biological sample of said patient the methylation pattern in at least one gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9, and wherein hypermethylation in at least one gene is indicative of a bad prognosis.
  • the present invention relates to an in vitro method for selecting a NSCLC patient for an anti-cancer treatment, comprising determining in a biological sample of said patient the methylation pattern in at least one gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9, and wherein hypermethylation in at least one gene selects the patient for the cancer treatment.
  • the present invention relates to an anti-cancer agent for the treatment of a NSCLC patient, wherein the patient is selected using the method of the second aspect.
  • the present invention relates to a kit comprising
  • At least one CpG site-binding oligonucleotide capable of specifically hybridizing to a sequence spanning a CpG site in a gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9 in a methylation-specific manner; or
  • At least one CpG site-flanking oligonucleotide capable of specifically hybridizing to an upstream or downstream sequence of a CpG site in a gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9, wherein unmethylated cytosine in at least part of said gene has been converted to uracil or to another base that is detectably dissimilar to cytosine in terms of hybridization properties.
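The conversion step that the kit oligonucleotides rely on can be illustrated with a minimal sketch. It assumes an idealised, complete conversion in which every unmethylated cytosine becomes uracil and every methylated cytosine is left unchanged; the function name `bisulfite_convert` and the toy sequence are hypothetical and do not come from the patent.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Idealised conversion: unmethylated C -> U, methylated C stays C.

    `methylated_positions` holds the 0-based indices of methylated
    cytosines (a hypothetical input format chosen for this sketch).
    """
    seq = seq.upper()
    return "".join(
        "U" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


# Toy sequence (not a real gene fragment) with CpG sites at indices 1 and 5.
toy = "ACGTCCGA"
print(bisulfite_convert(toy))          # all cytosines unmethylated
print(bisulfite_convert(toy, {1, 5}))  # both CpG cytosines methylated
```

After conversion the two methylation states differ in sequence, which is what allows an oligonucleotide to hybridize in a methylation-specific manner.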
  • the present invention relates to a nucleic acid selected from the group consisting of (i) a nucleic acid comprising at least 9 contiguous nucleotides comprising a CpG site of a gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9;
  • nucleic acid comprising at least 9 contiguous nucleotides comprising a CpG site of a gene selected from the group consisting of HIST1H4F,
  • the present invention relates to the use of a kit or a nucleic acid of the foregoing aspects for the prognosis of a NSCLC patient or for selecting a NSCLC patient for anti-cancer treatment.
  • Figure 1 DNA Methylation Signatures Associated with Recurrence-Free Survival (RFS) in NSCLC samples.
  • Panel A displays the Kaplan-Meier analysis for RFS among the 198 patients with RFS information according to the two groups obtained from the clustering. The p-value corresponds to the hazard ratio (HR) adjusted by multivariate regression (including age, gender, smoking history, stage and histological type).
  • Panel B shows the Kaplan-Meier estimates for RFS among the subset of 147 patients with RFS information according to the two groups obtained in the clustering. The p-value reflects the HR adjusted as in the analysis in Panel A.
  • Figure 2 Kaplan-Meier Estimates of RFS in Stage I NSCLC Patients by Methylation Status of the Five Validated Genes. Top panel for each gene shows the Kaplan-Meier estimates for RFS of the final five validated genes in the subset of 147 patients in stage I from the discovery cohort. Methylation status was determined by the Infinium 450k Methylation Array. Bottom panel for each gene shows the corresponding Kaplan-Meier estimates for the same genes in the 142 patients in stage I included in the validation cohort. In this case, methylation status was determined by pyrosequencing. The p-values correspond to HRs adjusted by multivariate regression (including age, gender, smoking history and histological type).
  • Figure 3 Kaplan-Meier Estimates of RFS by Number of Methylated Genes (A-B) and Forest Plot with Hazard Ratios for Recurrence in Stage I NSCLC (C).
  • A-B In each panel, patients are grouped into methylation low or methylation high groups according to the number of methylated genes (0-1 versus 2-5) from the five-gene signature (including HIST1H4F, NPBWR1, PCDHGB6, ALX1 and HOXA9).
  • Panel A shows patients from the validation cohort analyzed by pyrosequencing.
  • Panel B includes patients from the discovery cohort analyzed by the DNA methylation microarray.
  • the p-values correspond to Hazard Ratios adjusted by multivariate regression (including age, gender, smoking history and histological type).
  • C The forest plot shows the multivariate Cox regression for the various DNA methylation classifiers of stage I NSCLC patients.
  • Data for the Group 1 heatmap in stage I was obtained from the discovery cohort with the 450K array.
  • Data from each of the five significant genes and the bimodal signature model were obtained both from the discovery cohort with the 450K array and from the validation cohort by pyrosequencing.
  • the prognostic value for each gene or signature was adjusted for age, gender, smoking history and histological type.
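The grouping used in Figure 3 (0-1 versus 2-5 methylated genes from the five-gene signature) can be sketched as follows. The per-gene cutoff of 0.5 for calling a gene "methylated" is purely an illustrative assumption, since the text above does not state one, and the function names are hypothetical.

```python
SIGNATURE_GENES = ("HIST1H4F", "PCDHGB6", "NPBWR1", "ALX1", "HOXA9")


def count_methylated(levels, cutoff=0.5):
    """Count signature genes whose methylation level reaches `cutoff`.

    `levels` maps gene name -> methylation level in [0, 1].
    The cutoff is a hypothetical value used only for this sketch.
    """
    return sum(1 for gene in SIGNATURE_GENES if levels[gene] >= cutoff)


def methylation_group(levels, cutoff=0.5):
    """Return 'low' for 0-1 methylated genes and 'high' for 2-5."""
    return "high" if count_methylated(levels, cutoff) >= 2 else "low"


# Made-up example values, not patient data.
patient = {"HIST1H4F": 0.7, "PCDHGB6": 0.1, "NPBWR1": 0.6, "ALX1": 0.2, "HOXA9": 0.3}
print(count_methylated(patient), methylation_group(patient))
```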
  • the authors of the present invention have found that hypermethylation of five genes was significantly associated with relapse-free survival (RFS) in stage I NSCLC patients. These genes are HIST1H4F, PCDHGB6, NPBWR1, ALX1 and HOXA9. This finding is useful for prognosis of NSCLC patients and the selection of these patients for anti-cancer treatment as set out below. Further described are kits and nucleic acids for the use in these methods.
  • the invention relates to an in vitro method for the prognosis of a NSCLC patient (hereinafter referred to as the "method of the first aspect” or the “prognostic method of the invention”), comprising determining in a biological sample of said patient the methylation pattern in at least one gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9, and wherein hypermethylation in at least one gene is indicative of a bad prognosis.
  • prognosis as referred to herein is understood as the expected progression of a disease and relates to the assessment of the probability according to which a subject suffers from a disease as well as to the assessment of its onset, state of development, progression, or of its regression, and/or the prognosis of the course of the disease in the future. As will be understood by persons skilled in the art, such assessment normally may not be correct for 100% of the subjects to be diagnosed, although it preferably is correct. The term, however, requires that a correct prognosis can be made for a statistically significant part of the subjects.
  • Whether a part is statistically significant can be determined by the person skilled in the art using well-known statistical evaluation tools, for example, determination of confidence intervals, determination of p values, Student's t-test, Mann-Whitney test, etc. Details are provided in Dowdy and Wearden, Statistics for Research, John Wiley & Sons, New York 1983.
  • the preferred confidence intervals are at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%.
  • the p values are preferably 0.05, 0.01, or 0.005.
  • the prediction of the clinical outcome can be done using any assessment criterion used in oncology and known by the person skilled in the art.
  • the assessment parameters useful for describing the progression of a disease include, for example, relapse-free survival (RFS).
  • bad prognosis as referred to herein generally means an outcome which would be regarded negative for the patient and preferably depends on the prognosis type.
  • a bad prognosis of relapse-free survival (RFS) would mean that the patient will live without recurrence of the tumour for a shorter time than, e.g., the average in a group of stage I NSCLC patients.
  • the prognosis is determined as relapse-free survival.
  • it is determined as the likelihood of relapse-free survival in a period of 3 years after tumour resection surgery.
  • the likelihood is equivalent to the percentage of patients that survive for 3 years after resection without relapse among a larger group of patients (some of which may relapse earlier).
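One way to estimate this likelihood when some patients leave follow-up before 3 years is a Kaplan-Meier estimate, the method named in the figure descriptions. The sketch below is a plain-Python illustration under the assumption that the event is an observed relapse and that follow-up times are given in years; the function name and the example data are hypothetical.

```python
def kaplan_meier(times, events, horizon):
    """Kaplan-Meier estimate of being relapse-free at `horizon`.

    times  : follow-up time per patient (years)
    events : True if relapse was observed at that time, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    i = 0
    while i < len(data) and data[i][0] <= horizon:
        t = data[i][0]
        relapses = sum(1 for tt, e in data if tt == t and e)
        leaving = sum(1 for tt, _ in data if tt == t)
        if relapses:
            # Multiply by the fraction of at-risk patients who did not relapse at t.
            survival *= 1 - relapses / at_risk
        at_risk -= leaving
        i += leaving
    return survival


# Made-up follow-up data: two relapses before 3 years, two patients relapse-free.
print(kaplan_meier([1, 2, 4, 5], [True, True, False, False], horizon=3))
```

Without censoring before the horizon, the estimate reduces to the simple percentage of patients who remain relapse-free at 3 years.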
  • patient refers to an individual, such as a human, a non-human primate (e.g. chimpanzees and other apes and monkey species); farm animals, such as birds, fish, cattle, sheep, pigs, goats and horses; domestic mammals, such as dogs and cats; laboratory animals including rodents, such as mice, rats and guinea pigs.
  • the term does not denote a particular age or sex.
  • the subject is a mammal.
  • the subject is a human.
  • the subject has undergone tumour resection.
  • the subject has not undergone chemotherapy prior to the determination.
  • NSCLC patient encompasses patients having NSCLC and patients having been diagnosed with NSCLC, even though they may be in remission or may have no visible signs of NSCLC anymore.
  • the NSCLC patient is a stage I NSCLC patient.
  • NSCLC non-small cell lung cancer
  • SCC squamous cell carcinoma
  • adenocarcinoma is the most common subtype of NSCLC, accounting for 50% to 60% of NSCLC, which starts near the gas-exchanging surface of the lung and which includes a subtype, the bronchioalveolar carcinoma, which may have different responses to treatment.
  • large cell carcinoma is a fast-growing form that grows near the surface of the lung. It is primarily a diagnosis of exclusion, and when more investigation is done, it is usually reclassified to squamous cell carcinoma or adenocarcinoma.
  • adenosquamous carcinoma is a type of cancer that contains two types of cells: squamous cells (thin, flat cells that line certain organs) and gland-like cells.
  • carcinomas with pleomorphic, sarcomatoid or sarcomatous elements. This is a group of rare tumours reflecting a continuum in histological heterogeneity as well as epithelial and mesenchymal differentiation.
  • carcinoid tumour is a slow-growing neuroendocrine lung tumour and begins in cells that are capable of releasing a hormone in response to a stimulus provided by the nervous system.
  • carcinomas of salivary gland type begin in salivary gland cells located inside the large airways of the lung.
  • unclassified carcinomas include cancers that do not fit into any of the aforementioned lung cancer categories.
  • stage I NSCLC refers to tumor which is present in the lungs but the cancer has not been found in the chest lymph nodes or in other locations outside of the chest.
  • Stage I NSCLC is subdivided into stages IA and IB, usually based upon the size of the tumor or involvement of the pleura, which is the lining along the outside of the lung.
  • Stage IA: the tumor is 3 centimetres (cm) or less in size and has invaded nearby tissue minimally, if at all. The cancer has not spread to the lymph nodes or to any distant sites.
  • Stage IB: the tumor is more than 3 cm in size, has invaded the pleural lining around the lung, or has caused a portion of the lung to collapse.
  • Stage IA corresponds to T1N0M0 of the TNM classification.
  • Stage IB corresponds to T2N0M0 of the TNM classification.
  • Stage I NSCLC includes stages IA and/or IB NSCLC.
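The correspondence between TNM descriptors and the stage I substages can be written as a small lookup. This is only a sketch of the mapping stated above (T1 with N0 and M0 gives IA, T2 with N0 and M0 gives IB); the function name and the string encoding of the descriptors are assumptions.

```python
def stage_i_substage(t, n, m):
    """Return 'IA' or 'IB' for stage I tumours, or None otherwise.

    Encodes only the correspondence given in the text:
    T1 N0 M0 -> IA and T2 N0 M0 -> IB.
    """
    if n != "N0" or m != "M0":
        return None  # nodal or distant spread: not stage I
    return {"T1": "IA", "T2": "IB"}.get(t)


print(stage_i_substage("T1", "N0", "M0"))
print(stage_i_substage("T2", "N0", "M0"))
print(stage_i_substage("T2", "N1", "M0"))
```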
  • TNM classification is a staging system for malignant cancer.
  • TNM classification refers to the 6th edition of the TNM stage grouping as defined in Sobin et al. (International Union against Cancer (UICC), TNM Classification of Malignant tumors, 6th ed. New York; Springer, 2002, pp. 191-203) (TNM6) and AJCC Cancer Staging Manual 6th edition; Chapter 19; Lung - original pages 167-177 whereby the tumors are classified by several factors, namely, T for tumor, N for nodes, M for metastasis as follows:
  • TX Primary tumor cannot be assessed, or tumor proven by the presence of malignant cells in sputum or bronchial washings but not visualized by imaging or bronchoscopy;
  • T2 Tumor more than 3 cm but 7 cm or less or tumor with any of the following features: involves main bronchus, 2 cm or more distal to the carina; invades visceral pleura (PL1 or PL2); associated with atelectasis or obstructive pneumonitis that extends to the hilar region but does not involve the entire lung,
  • T3 Tumor more than 7 cm or one that directly invades any of the following: parietal pleural (PL3), chest wall (including superior sulcus tumors), diaphragm, phrenic nerve, mediastinal pleura, parietal pericardium; or tumor in the main bronchus less than 2 cm distal to the carina but without involvement of the carina; or associated atelectasis or obstructive pneumonitis of the entire lung or separate tumor nodule(s) in the same lobe and
  • NSCLC is of the subtype adenocarcinoma, squamous carcinoma or large cell carcinoma. In a more preferred embodiment, NSCLC is of the subtype adenocarcinoma or squamous carcinoma. In the most preferred embodiment, NSCLC is of the subtype adenocarcinoma.
  • sample or “biological sample”, as used herein, refers to biological material isolated from a subject. The biological sample contains any biological material suitable for detecting the desired methylation pattern in one or more CpG site(s) and is a material comprising genetic material from the subject. The biological sample can comprise cell and/or non-cell material of the subject.
  • the sample comprises genetic material, e.g., DNA, genomic DNA (gDNA), complementary DNA (cDNA), RNA, heterogeneous nuclear RNA (hnRNA), mRNA, etc., from the subject under study.
  • the genetic material is DNA.
  • the DNA is genomic DNA.
  • the DNA is circulating DNA.
  • the sample can be isolated from any suitable tissue or biological fluid such as, for example blood, saliva, plasma, serum, urine, cerebrospinal liquid (CSF), feces, a buccal or buccal-pharyngeal swab, a surgical specimen, a specimen obtained from a biopsy, and a tissue sample embedded in paraffin.
  • said sample comprises a cancer cell, preferably a NSCLC cell.
  • it is a tumour tissue sample or portion thereof.
  • said tumour tissue sample is a lung tumour tissue sample, more preferably a pulmonary tumour tissue sample from a subject suffering from NSCLC.
  • Said sample can be obtained by conventional methods, e.g., biopsy, by using methods well known to those of ordinary skill in the related medical arts. Methods for obtaining the sample from the biopsy include gross apportioning of a mass, or microdissection or other art-known cell-separation methods.
  • Tumour cells can additionally be obtained from fine needle aspiration cytology. In order to simplify conservation and handling of the samples, these can be formalin-fixed and paraffin-embedded or first frozen and then embedded in a cryosolidifiable medium, such as OCT-Compound, through immersion in a highly cryogenic medium that allows for rapid freeze.
  • methylation or "DNA methylation”, as used herein, refers to a biochemical process involving the addition of a methyl group to the cytosine or adenine DNA nucleotides. DNA methylation at the 5 position of cytosine may have the specific effect of reducing gene expression and has been found in every vertebrate examined. In adult non-gamete cells, DNA methylation typically occurs in a CpG site.
  • CpG site refers to regions of DNA where a cytosine nucleotide occurs next to a guanine nucleotide in the linear sequence of bases along its length.
  • CpG is shorthand for "C-phosphate-G", that is, cytosine and guanine separated by only one phosphate; phosphate links any two nucleosides together in DNA.
  • the "CpG” notation is used to distinguish this linear sequence from the CG base-pairing of cytosine and guanine. Cytosines in CpG dinucleotides can be methylated to form 5-methylcytosine.
  • CpG island relates to a DNA sequence, generally in a window of less than 200 bp, with a GC content greater than 50% and an observed:expected CpG ratio of more than 0.6.
  • CpG islands associated with genes are located near the promoters; nevertheless, it has been noticed that many CpG islands appear within the body of the gene, at the 3' end of the gene and/or intergenic regions. In a particular embodiment of the invention, CpG islands are near the promoter region of a gene.
  • CpG shore relates to the DNA sequences, up to 2kb long, flanking a CpG island and showing a comparatively low GC density.
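The two composition criteria in the CpG island definition can be computed directly from a sequence. The sketch below assumes the widely used observed:expected formula (number of CpG dinucleotides times sequence length, divided by the product of the C count and the G count); the text above does not state which formula is meant, so this choice, the function names, and the toy sequences are assumptions. The window-length criterion is left out.

```python
def gc_content(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def cpg_obs_exp(seq):
    """Observed:expected CpG ratio = (#CpG * length) / (#C * #G).

    This formula is an assumption; returns 0.0 when C or G is absent.
    """
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def meets_island_composition(seq):
    """Check GC content > 50% and observed:expected CpG ratio > 0.6."""
    return gc_content(seq) > 0.5 and cpg_obs_exp(seq) > 0.6


# Toy sequences, not genomic data.
for s in ("CGCGCGCG", "CCCCGGGG", "ATATATAT"):
    print(s, gc_content(s), cpg_obs_exp(s), meets_island_composition(s))
```

The second toy sequence shows why both criteria matter: it is entirely G and C, yet contains a single CpG, so its ratio falls below 0.6.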
  • methylation pattern refers to, but is not limited to, the presence or absence of methylation of one or more nucleotides. Thereby said one or more nucleotides are comprised in a single nucleic acid molecule. Said one or more nucleotides have the ability of being methylated or being non-methylated.
  • methylation status may also be used, wherein only a single nucleotide is considered.
  • a methylation pattern can be quantified, wherein it is considered over more than one nucleic acid molecule.
  • the term “hypermethylation” refers to an aberrant methylation pattern or status, wherein one or more nucleotides, preferably C(s) of a CpG site(s), are methylated compared to a control sample or to the known methylation status in healthy subjects or subjects not suffering or having suffered from NSCLC (control).
  • hypermethylation as a methylation status/pattern can be determined at one or more CpG site(s). If more than one CpG sites are used, hypermethylation can be determined at each site separately or as an average of the CpG sites taken together.
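The two ways of calling hypermethylation described above, site by site or as an average over the CpG sites, can be sketched as follows. The margin of 0.2 over the control is a hypothetical threshold introduced only for illustration, as are the function names and the site identifiers.

```python
from statistics import mean


def hypermethylated_sites(sample, control, margin=0.2):
    """Per-site call: True where the sample exceeds the control by `margin`.

    `sample` and `control` map a CpG site identifier -> methylation level.
    The margin is a hypothetical value, not taken from the patent.
    """
    return {site: sample[site] - control[site] > margin for site in sample}


def hypermethylated_on_average(sample, control, margin=0.2):
    """Single call based on the mean level across all sites in `sample`."""
    sample_mean = mean(sample.values())
    control_mean = mean(control[site] for site in sample)
    return sample_mean - control_mean > margin


# Made-up values for two hypothetical CpG sites.
sample = {"cpg_1": 0.8, "cpg_2": 0.3}
control = {"cpg_1": 0.1, "cpg_2": 0.25}
print(hypermethylated_sites(sample, control))
print(hypermethylated_on_average(sample, control))
```

The example shows that the two approaches can disagree for an individual site: only one site is hypermethylated on its own, while the average still exceeds the control.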
  • HIST1H4F refers to the gene Histone cluster 1, H4F, also known as H4/M3, H4FC, H4, H4/A, H4/B, H4/C, H4/D, H4/E, H4/G, H4/H, H4/I, H4/J, H4/K, H4/M, H4/N, H4/0, H4fl, H4FB, H4FD, H4/K, H4/A, H4FE, H4/B, H4FG, H4/C, H4FH, H4/D, H4FI, H4/E3, H4FJ, H4/G, H4FK, H4FM, H4FN, H4F03 or HIST2H4.
  • the gene is located on chromosome 6p22.1. Its sequence reference is NM_003540.3.
  • the protein is located in the nucleus and it is involved in CenH3-containing nucleosome assembly at the centromere, negative regulation of megakaryocyte differentiation, phosphatidylinositol-mediated signaling, and telomere maintenance.
  • PCDHGB6 refers to the gene Protocadherin gamma subfamily B 6, located on chromosome 5q31.3. Its sequence reference is NM_032011.1. The protein is located in the cytoplasm and it is involved in cell to cell adhesion in the brain and calcium ion binding.
  • NPBWR1 or “GPR7” refers to the gene Neuropeptides B/W Receptor 1, also known as G-protein-coupled receptor 7, located on chromosome 8p22-q21.13. Its sequence reference is NM_005285.3.
  • the protein is located in the cell membrane and it is involved in G-protein coupled receptor (opioid receptor) activity and the regulation of processes such as neuroendocrine system regulation, food intake and synaptic transmission.
  • ALX1 refers to the gene ALX homeobox protein 1, located on chromosome 12q21.31. Its sequence reference is NM_006982.1. The protein is located in the nucleus and it is involved in sequence-specific DNA binding, the regulation of transcription from the RNA polymerase II promoter, brain development, and cartilage condensation.
  • HOXA9 refers to the gene Homeobox A9, located on chromosome
  • the protein is located in the nucleus and it is involved in sequence-specific DNA binding, transcription regulation, development, cancer as a proto-oncogene, anterior/posterior pattern specification, embryonic forelimb morphogenesis, embryonic skeletal system development, endothelial cell activation, multicellular organismal development, and proximal/distal pattern formation.
  • the present invention relates to an in vitro method for selecting a NSCLC patient for an anti-cancer treatment (hereinafter referred to as the "method of the second aspect” or the “selection method of the invention"), comprising determining in a biological sample of said patient the methylation pattern in at least one gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9, and wherein hypermethylation in at least one gene selects the patient for the cancer treatment.
  • selection refers to the action of choosing said patient for an anti-cancer treatment, preferably an NSCLC treatment.
  • treatment refers to a therapeutic treatment, as well as a prophylactic or prevention method, wherein the goal is to prevent or reduce an unwanted physiological change or disease, such as cancer or NSCLC.
  • Beneficial or desired clinical results include, but are not limited to, relief of symptoms, reduction of the length of the disease, stabilized pathological state (specifically not deteriorated), delay in the disease's progression, improvement of the pathological state and remission (both partial and total), both detectable and not detectable.
  • the terms “treat” and “treatment” are synonyms of the term “therapy” and can be used interchangeably throughout the present description. Treatment can also mean prolonging survival, compared to the expected survival if the treatment is not applied.
  • anti-cancer treatment refers to the use of chemical, physical or biological agents or compounds with antiproliferative, antioncogenic and/or carcinostatic properties which can be used to inhibit tumor growth, proliferation and/or development.
  • Anti-cancer agents are such agents or compounds. Examples of anti-cancer agents are alkylating agents, antimetabolites, plant alkaloids and terpenoids, topoisomerase inhibitors and the like.
  • a preferred anti-cancer therapy is chemotherapy or platinum-based chemotherapy.
  • the NSCLC patient is a stage I NSCLC patient.
  • the anti-cancer agent is a platinum-based chemotherapeutic composition.
  • platinum-based chemotherapy refers to any chemotherapy using compounds which contain a platinum atom and which are capable of modifying the DNA inducing the activation of the DNA repair and subsequent cell death.
  • Suitable platinum-based chemotherapeutic compounds include, without limitation, cisplatin, ELOXATIN (oxaliplatin), eptaplatin, lobaplatin, nedaplatin, PARAPLATIN (carboplatin), iproplatin, tetraplatin, DCP, PLD-147, JM118, JM216, JM335, and satraplatin.
  • platinum-based chemotherapy also includes compositions of two or more chemotherapeutic agents wherein at least one of the components is a platinum-based compound, such as cisplatin-gemcitabine, carboplatin-gemcitabine, cisplatin-gemcitabine-vinorelbine, cisplatin-vinorelbine, cisplatin-etoposide, cisplatin-etoposide-vincristine, cisplatin-paclitaxel, cisplatin-docetaxel, carboplatin-docetaxel and the like.
  • the methylation pattern of said gene(s) is determined in the promoter region of said gene.
  • promoter region refers to a region of DNA that initiates transcription of a particular gene. Promoters are located near the genes they transcribe, on the same strand and upstream on the DNA, and can be about 100-1000 base pairs long. In particular:
  • the NPBWR1 promoter starts at position 53850967 (according to TSS1500, as in Infinium HumanMethylation450 BeadChip, Manifest v1.2 or according to UCSC database, as in Genome Reference Consortium Human Build 37 (GRCh37) and UCSC hg19 as released in February 2009) and ends at position 53853454 (according to 1st exon, as in Infinium HumanMethylation450 BeadChip, Manifest v1.2 or according to UCSC database, as in Genome Reference Consortium Human Build 37 (GRCh37) and UCSC hg19 as released in February 2009).
  • the HOXA9 promoter starts at position 27200556 (according to TSS1500, as in Infinium HumanMethylation450 BeadChip, Manifest v1.2 or according to UCSC database, as in Genome Reference Consortium Human Build 37 (GRCh37) and UCSC hg19 as released in February 2009) and ends at position 27203460 (according to 1st exon, as in Infinium HumanMethylation450 BeadChip, Manifest v1.2 or according to UCSC database, as in Genome Reference Consortium Human Build 37 (GRCh37) and UCSC hg19 as released in February 2009).
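The promoter coordinates quoted above can be used to test whether a given CpG position falls inside a promoter region. The sketch below uses only the two coordinate pairs stated in the text (GRCh37/hg19 numbering) and assumes that both end positions are inclusive; the names `PROMOTERS` and `in_promoter` are hypothetical.

```python
# Promoter start and end positions as quoted in the text (GRCh37/hg19).
PROMOTERS = {
    "NPBWR1": (53850967, 53853454),
    "HOXA9": (27200556, 27203460),
}


def in_promoter(gene, position):
    """True if `position` lies within the gene's promoter region.

    Treating both boundaries as inclusive is an assumption of this sketch.
    """
    start, end = PROMOTERS[gene]
    return start <= position <= end


print(in_promoter("HOXA9", 27202000))
print(in_promoter("NPBWR1", 53853455))
```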
  • the methylation pattern of one or more of the genes HIST1H4F, PCDHGB6, NPBWR1, ALX1 and HOXA9 is determined at a CpG site of said gene(s).
  • the CpG site is located at a CpG island of said gene.
  • the CpG site is located at a CpG shore of said gene.
  • the methylation pattern of one or more of the genes HIST1H4F, PCDHGB6, NPBWR1, ALX1 and HOXA9 is determined at at least two CpG sites, at least three CpG sites, at least four CpG sites, at least five CpG sites, at least six CpG sites, at least seven CpG sites, at least eight CpG sites, at least nine CpG sites, at least ten CpG sites, at least twelve CpG sites, or at least fifteen CpG sites of said gene(s), wherein the methylation pattern of said gene(s) is determined as the mean value of said at least two CpG sites, at least three CpG sites, at least four CpG sites, at least five CpG sites, at least six CpG sites, at least seven CpG sites, at least eight CpG sites, at least nine CpG sites, at least ten CpG sites, at least twelve CpG sites, or at least fifteen CpG sites of said
  • the methylation pattern of one or more of the genes HIST1H4F, PCDHGB6, NPBWR1, ALX1 and HOXA9 is determined at one or more, preferably all, CpG sites located at a CpG island of said gene(s). Where methylation is determined at more than one CpG site, the methylation pattern of said gene(s) can be determined as the mean value of said CpG sites located at the CpG island of said gene(s).
  • the methylation pattern of one or more of the genes HIST1H4F, PCDHGB6, NPBWR1, ALX1 and HOXA9 is determined at one or more, preferably all, CpG sites located at the N shore, at the S shore or at both the N and the S shores of said gene(s). Where methylation is determined at more than one CpG site, the methylation pattern of said gene(s) can be determined as the mean value of said CpG sites located at the CpG shore or shores of said gene(s).
  • the methylation pattern of one or more of the genes HIST1H4F, PCDHGB6, NPBWR1, ALX1 and HOXA9 is determined at one or more, preferably all, CpG sites located at the CpG island and at the N shore of said gene(s). Where methylation is determined at more than one CpG site, the methylation pattern of said gene(s) can be determined as the mean value of said CpG sites located at the CpG island and the N shore of said gene(s).
  • the methylation pattern of one or more of the genes HIST1H4F, PCDHGB6, NPBWR1, ALX1 and HOXA9 is determined at one or more, preferably all, CpG sites located at the CpG island and at the S shore of said gene(s). Where methylation is determined at more than one CpG site, the methylation pattern of said gene(s) can be determined as the mean value of said CpG sites located at the CpG island and the S shore of said gene(s).
  • the methylation pattern of one or more of the genes HIST1H4F, PCDHGB6, NPBWR1, ALX1 and HOXA9 is determined at one or more, preferably all, CpG sites located at the CpG island and at the N shore and the S shore of said gene(s). Where methylation is determined at more than one CpG site, the methylation pattern of said gene(s) can be determined as the mean value of said CpG sites located at the CpG island, the N shore and the S shore of said gene(s).
  • the location of the N shore, the CpG island, the S shore and the promoter for each of the genes is shown in Table 1.
  • Table 1 Start and end positions of the CpG island, of the shores flanking the CpG island and of the promoter regions in the NPBWR1, HIST1H4F, PCDHGB6, ALX1 and HOXA9 genes.
  • Island start and island end indicate, respectively, the starting and ending positions of the CpG island by reference to the chromosome numbering according to Infinium HumanMethylation450 BeadChip, Manifest v1.2 or according to UCSC database, as in Genome Reference Consortium Human Build 37 (GRCh37) and UCSC hg19 as released in February 2009.
  • Shore/Island/Shore start indicates the starting position of the shore 5' to the CpG island by reference to the chromosome numbering as indicated above (Infinium/UCSC).
  • Shore/Island/Shore end indicates the end position of the shore located 3' with respect to the CpG island by reference to the chromosome numbering as indicated above (Infinium/UCSC).
  • the end position of the shore located 5' of the CpG island is the position adjacent in 5' to the island start position.
  • the start position of the shore located 3' of the CpG island is the position adjacent in 3' to the island end position.
  • Start promoter indicates the start position of the promoter region by reference to the chromosome numbering as indicated above (Infinium/UCSC).
  • End promoter (1st exon) indicates the last position of the first exon of the gene, which is adjacent to the last position of the promoter, by reference to the chromosome numbering as indicated above (Infinium/UCSC).
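  • As an illustration of how the region boundaries described above relate to each other, the following minimal Python sketch derives the N shore, CpG island and S shore intervals from the four tabulated positions and averages the methylation values of the CpG sites falling in the selected regions. The function names, the coordinates and the methylation values are hypothetical and are not taken from Table 1.

```python
from statistics import mean


def split_regions(shore_start, island_start, island_end, shore_end):
    """Derive inclusive (start, end) intervals for N shore, island and S shore."""
    return {
        # The 5' shore ends at the position adjacent in 5' to the island start.
        "N_shore": (shore_start, island_start - 1),
        "island": (island_start, island_end),
        # The 3' shore starts at the position adjacent in 3' to the island end.
        "S_shore": (island_end + 1, shore_end),
    }


def mean_methylation(site_values, regions, selected):
    """Mean methylation over the CpG sites located in any selected region."""
    values = [
        value
        for position, value in site_values.items()
        if any(regions[name][0] <= position <= regions[name][1] for name in selected)
    ]
    return mean(values) if values else None


# Hypothetical coordinates and methylation values, for illustration only.
regions = split_regions(shore_start=1000, island_start=3000,
                        island_end=3500, shore_end=5500)
site_values = {1500: 0.25, 3100: 0.5, 3400: 0.75, 4000: 0.125}

island_mean = mean_methylation(site_values, regions, ["island"])
island_and_shores = mean_methylation(
    site_values, regions, ["N_shore", "island", "S_shore"]
)
print(regions)
print(island_mean, island_and_shores)
```

  • Passing a different list of region names (for example only "island" and "S_shore") reproduces the other combinations listed above.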
  • the methods according to the invention involve determining methylation at some specific CpG site(s) which are significantly associated with a bad prognosis and selection of a stage I NSCLC in a subject.
  • the methylation pattern of said gene(s) is determined at the CpG site(s) located at positions 26240782 (cg10723962), 26240528 (cg22723502), 26240519 (cg12260798), 26240762, 26240771, 26240774, 26240776, 26240779, 26240789 and/or 26240796 in HIST1H4F,
  • 140787507 (cg18507379), 140787504 (cg18617005), 140787474, 140787487, 140787491, 140787504 and/or 140787513 in PCDHGB6,
  • the method of the first and/or the second aspect of the invention comprise determining the mean methylation level calculated based on the above positions.
  • determination of the methylation pattern in a CpG site refers to the determination of the methylation status of a particular CpG site.
  • the determination of the methylation pattern in a CpG site can be performed by means of multiple processes known by the person skilled in the art.
  • the nucleic acid is extracted from cells which are present in a biological fluid, a tissue or a cell as an initial step, and, in such cases, the total nucleic acid extracted from said samples would represent the working material suitable for subsequent analysis. Isolating the nucleic acid of the sample can be performed by standard methods known by the person skilled in the art. Said methods can be found, for example, in Sambrook et al., 2001, "Molecular cloning: a Laboratory Manual", 3rd ed., Cold Spring Harbor Laboratory Press, N.Y., Vol. 1-3 and in the commonly used QIAamp DNA mini kit protocol by Qiagen.
  • the methylation pattern of one or more CpG site(s) is determined.
  • the analysis of the methylation pattern present in one or several of the CpG sites disclosed herein in a subject's nucleic acid can be done by any method or technique capable of measuring the methylation pattern present in a CpG site. For instance, one may determine the methylation pattern in the first method of the invention by a method selected from the group consisting of Methylation-Specific PCR (MSP), an enrichment-based method (e.g. MeDIP, MBD-seq and MethylCap), bisulfite sequencing and a bisulfite-based method (e.g. RRBS, bisulfite sequencing, Infinium, GoldenGate, COBRA, MSP, MethyLight), a restriction-digestion method (e.g. MRE-seq or HELP assay), ChIP-on-chip assay, or differential conversion, differential restriction or differential weight of the DNA methylated CpG site(s).
  • the genomic DNA sample is chemically treated in such a way that all of the unmethylated cytosine bases are modified to uracil bases, or another base which is dissimilar to cytosine in terms of base pairing behaviour, while the 5-methylcytosine bases remain unchanged.
  • modify means the conversion of an unmethylated cytosine to another nucleotide which will distinguish the unmethylated from the methylated cytosine.
  • the conversion of unmethylated, but not methylated, cytosine bases within the DNA sample is conducted with a converting agent.
  • converting agent or "converting reagent”, as used herein, relates to a reagent capable of converting an unmethylated cytosine to uracil or to another base that is detectably dissimilar to cytosine in terms of hybridization properties.
  • the converting agent is preferably a bisulfite such as disulfite or hydrogen sulfite.
  • methylated cytosine can also be used in the method of the invention, such as hydrogen sulfite.
  • the reaction is performed according to standard procedures (Frommer et al, 1992, Proc Natl Acad Sci USA 89: 1827-31; Olek, 1996, Nucleic Acids Res 24:5064-6; EP 1394172). It is also possible to conduct the conversion enzymatically, e.g. by use of methylation specific cytidine deaminases.
  • the sample has been treated with a reagent capable of converting an unmethylated cytosine to uracil or to another base that is detectably dissimilar to cytosine in terms of hybridization properties. More preferably, the reagent used for modifying unmethylated cytosine is sodium bisulfite.
  • the region containing the one or more CpG site(s) is amplified using primers that distinguish the unmethylated sequence (wherein the cytosine of the CpG site is converted into uracil) from the methylated sequence (wherein the cytosine of the CpG site remains cytosine).
  • amplification methods rely on an enzymatic chain reaction such as, for example, a polymerase chain reaction (PCR), Ligase Chain Reaction (LCR), Polymerase Ligase Chain Reaction, Gap-LCR, Repair Chain Reaction, 3SR, and NASBA.
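  • The effect of the conversion on a template can be sketched in silico. The following minimal Python example is an illustration only: it assumes that methylation is modelled solely at user-supplied positions, and that uracil is read as thymine after amplification. The function names and the short template are hypothetical.

```python
def cpg_positions(seq):
    """Return the 0-based positions of the C in every CpG dinucleotide."""
    s = seq.upper()
    return [i for i in range(len(s) - 1) if s[i:i + 2] == "CG"]


def bisulfite_convert(seq, methylated_positions=()):
    """Return the sequence as read after conversion and amplification.

    Unmethylated C is converted to U and read as T after PCR; a C at one of
    the given (0-based) methylated positions is protected and remains C.
    """
    protected = set(methylated_positions)
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in protected:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


template = "ATCGGCATCGTC"  # hypothetical sequence containing two CpG sites
sites = cpg_positions(template)

methylated_read = bisulfite_convert(template, methylated_positions=sites)
unmethylated_read = bisulfite_convert(template)

print(sites)              # [2, 8]
print(methylated_read)    # ATCGGTATCGTT
print(unmethylated_read)  # ATTGGTATTGTT
```

  • The two reads differ only at the CpG positions, which is the difference that methylation-specific primers or probes exploit.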
  • RNA molecules there are strand displacement amplification (SDA), transcription mediated amplification (TMA), and Qβ-amplification, etc.; this list is merely illustrative and in no way limiting.
  • Methods for amplifying nucleic acid are described in Sambrook et al., 2001 (cited at supra).
  • A particularly preferred amplification method according to the invention is the methylation specific PCR method (MSP) disclosed in US 5,786,146, which combines bisulfite treatment and allele-specific PCR (see e.g. US 5,137,806, US 5,595,890, US 5,639,611).
  • Uracil is recognized as a thymine by Taq polymerase and therefore upon PCR, the resultant
  • oligonucleotides capable of specifically hybridizing to a bisulfite-treated polynucleotide comprising at least one sequence of a CpG site of above-mentioned genes are described further below (see nucleic acids or kits of the invention).
  • the amplification products are then detected according to standard procedures in the art.
  • the amplified nucleic acid may be determined or detected by standard analytical methods known to the person skilled in the art and described e.g. in Sambrook et al., 2001 (cited at supra). There may be also further purification steps before the target nucleic acid is detected e.g. a precipitation step.
  • the detection methods may include but are not limited to the binding or intercalating of specific dyes such as ethidium bromide, which intercalates into the double-stranded DNA and changes its fluorescence thereafter.
  • the purified nucleic acids may also be separated by electrophoretic methods optionally after a restriction digest and visualized thereafter.
  • probe-based assays which exploit the oligonucleotide hybridization to specific sequences and subsequent detection of the hybrid. It is also possible to sequence the target nucleic acid after further steps known to the expert in the field. Other methods apply a diversity of nucleic acid sequences to a silicon chip to which specific probes are bound and yield a signal when a complementary sequence binds.
  • the methylation pattern or status is determined by sequencing, preferably pyrosequencing.
  • the nucleic acid amplification is carried out by real time PCR and real time probes are used to detect the presence of the extension product.
  • real time probes e.g. LightCycler, TaqMan, Scorpion, Sunrise, Molecular Beacon or Eclipse probes. Details concerning structure or detection of these probes are known in the state of the art.
  • methylation pattern of the nucleic acid can be confirmed by restriction enzyme digestion and Southern blot analysis.
  • methylation sensitive restriction endonucleases which can be used to detect CpG methylation include SmaI, SacII, EagI, MspI, HpaII, BstUI and BssHII, for example.
  • the methylation pattern obtained is compared with the methylation pattern of a reference sample.
  • reference sample means a sample obtained from a pool of healthy subjects which do not have a disease state or particular phenotype of NSCLC. It may also mean a sample obtained from non-tumoral adjacent tissue obtained from a subject suffering from NSCLC.
  • the reference sample is a biological sample as defined above (wherein tumour cells are normal corresponding cells and biopsies are equivalent and contain no tumour material) from subjects which do not suffer from NSCLC or which do not have a history of NSCLC.
  • the reference sample is a sample of subjects matched on age and body mass index to the subject analysed.
  • the level of methylation of one or more CpG site(s) is increased (i.e. hypermethylation) when the level of methylation of said one or more CpG site(s) in a sample is higher than in the reference sample.
  • the level of methylation of one or more CpG site(s) is considered to be higher than in the reference sample when it is at least 1.5%, at least 2%, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 100%, at least 110%, at least 120%, at least 130%, at least 140%, at least 150% or more higher than in the reference sample.
  • the level of methylation of one or more CpG site(s) is increased (i.e. hypermethylation) when the mean methylation value of a number of CpG site(s) in a sample is higher than in the reference sample.
  • the mean level of methylation of a number of CpG site(s) is considered to be higher than in the reference sample when it is at least 1.5%, at least 2%, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 100%, at least 110%, at least 120%, at least 130%, at least 140%, at least 150% or more higher than in the reference sample.
  • the methylation pattern is determined in at least two genes selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9. In further embodiments, the methylation pattern is determined in at least three, four or five of these genes.
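  • The comparison with the reference sample can be sketched as follows. This minimal Python example assumes that the percentages above are read as a relative increase over the reference value (an interpretation, since an absolute difference in percentage points would also be conceivable); the function names and the default cut-off are hypothetical.

```python
def relative_increase_pct(sample_level, reference_level):
    """Relative increase of the sample over the reference, in percent."""
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    return (sample_level - reference_level) / reference_level * 100.0


def is_hypermethylated(sample_level, reference_level, min_increase_pct=10.0):
    """True when the sample exceeds the reference by at least the cut-off.

    min_increase_pct is an illustrative choice among the listed thresholds
    (1.5%, 2%, 5%, 10%, ...); it is not prescribed by the text.
    """
    return relative_increase_pct(sample_level, reference_level) >= min_increase_pct


# Hypothetical mean methylation levels (fractions between 0 and 1).
print(is_hypermethylated(0.60, 0.20, min_increase_pct=150.0))  # True
print(is_hypermethylated(0.21, 0.20, min_increase_pct=10.0))   # False
```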
  • hypermethylation in all genes determined i.e. at least two, three, four or five indicates a bad prognosis or selects the patient. It is also envisaged, that hypermethylation also in at least one, two, three or four of the genes determined can indicate a bad prognosis or select the patient.
  • the methylation pattern is determined in the genes HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9, and the hypermethylation in at least two of these genes is indicative of a bad prognosis or selects the patient for the cancer treatment. It has been shown by the inventors that this embodiment (also referred to herein as the "bimodal signature") allows a particularly accurate and reliable indication and selection.
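  • The decision rule of this embodiment (hypermethylation in at least two of the five genes) can be written compactly. The following Python sketch is illustrative only; the function names and the per-gene calls in the example are hypothetical, and the per-gene hypermethylation status is assumed to have been determined beforehand.

```python
SIGNATURE_GENES = ("HIST1H4F", "PCDHGB6", "NPBWR1", "ALX1", "HOXA9")


def count_hypermethylated(gene_status):
    """Count signature genes flagged as hypermethylated.

    gene_status maps a gene name to True (hypermethylated) or False;
    genes missing from the mapping are treated as not hypermethylated.
    """
    return sum(1 for gene in SIGNATURE_GENES if gene_status.get(gene, False))


def indicates_bad_prognosis(gene_status, min_genes=2):
    """Apply the rule: at least min_genes hypermethylated signature genes."""
    return count_hypermethylated(gene_status) >= min_genes


# Hypothetical patient calls, for illustration only.
patient_a = {"HIST1H4F": True, "PCDHGB6": False, "NPBWR1": True,
             "ALX1": False, "HOXA9": False}
patient_b = {"HIST1H4F": False, "PCDHGB6": False, "NPBWR1": False,
             "ALX1": False, "HOXA9": True}

print(indicates_bad_prognosis(patient_a))  # True
print(indicates_bad_prognosis(patient_b))  # False
```

  • Setting min_genes to another value reproduces the alternative embodiments in which a different number of hypermethylated genes is required.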
  • Techniques for detection of DNA methylation include, without limitation, bisulfite modification based technologies [including bisulfite sequencing, pyrosequencing, ConLight-MSP (Conversion-specific Detection of DNA Methylation Using Real-time Polymerase Chain Reaction), SMART MSP (Sensitive Melting Analysis after Real Time-Methylation Specific PCR), Matrix-assisted laser desorption/ionization-time of flight (Mass Array Epityper Sequenom), HPLC (High performance liquid chromatography), methyl-beaming and COBRA (Combined Bisulfite Restriction Analysis)], enzymatic digestions based methodologies [including reduced representation bisulfite sequencing (RRBS), HELP assay (HpaII tiny fragment Enrichment by Ligation-mediated PCR) and MethDet (methylation detection)], affinity-enriched based technologies [including MeDIP (Methylated DNA immunoprecipitation), Methyl-Cap and methylation binding domain as
  • the methylation pattern is determined using pyrosequencing.
  • pyrosequencing relates to a method of DNA sequencing based on the "sequencing by synthesis” principle. It differs from Sanger sequencing in that it relies on the detection of pyrophosphate release on nucleotide incorporation, rather than chain termination.
  • the desired DNA sequence is able to be determined by light emitted upon incorporation of the next complementary nucleotide by the fact that only one out of four of the possible A/T/C/G nucleotides are added and available at a time so that only one letter can be incorporated on the single stranded template (which is the sequence to be determined).
  • the intensity of the light determines if there are more than one of these "letters" in a row.
  • the previous nucleotide letter (one out of four possible dNTP) is degraded before the next nucleotide letter is added for synthesis: allowing for the possible revealing of the next nucleotide(s) via the resulting intensity of light (if the nucleotide added was the next complementary letter in the sequence).
  • This process is repeated with each of the four letters until the DNA sequence of the single stranded template is determined.
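  • The principle described above (one nucleotide dispensed at a time, light intensity proportional to the number of identical bases incorporated in a row) can be illustrated with a small simulation. This Python sketch is a simplified model under stated assumptions: the cyclic dispensation order "ACGT", the idealised integer signal and the function name are hypothetical, and the template is given in the order in which it is read by the polymerase.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def pyrosequence(template, dispensation_order="ACGT", max_flows=200):
    """Simulate an idealised pyrosequencing run.

    Returns (flowgram, read), where flowgram is a list of
    (dispensed nucleotide, signal) pairs and the signal equals the number
    of consecutive bases incorporated in that flow.
    """
    to_synthesize = [COMPLEMENT[base] for base in template.upper()]
    position = 0
    flowgram = []
    for flow in range(max_flows):
        if position >= len(to_synthesize):
            break
        nucleotide = dispensation_order[flow % len(dispensation_order)]
        incorporated = 0
        # A run of identical bases is incorporated in one flow (stronger signal).
        while position < len(to_synthesize) and to_synthesize[position] == nucleotide:
            incorporated += 1
            position += 1
        flowgram.append((nucleotide, incorporated))
    read = "".join(nucleotide * signal for nucleotide, signal in flowgram)
    return flowgram, read


flowgram, read = pyrosequence("TTGCA")
print(flowgram)  # [('A', 2), ('C', 1), ('G', 1), ('T', 1)]
print(read)      # AACGT
```

  • A signal of 0 corresponds to a dispensed nucleotide that is not complementary to the next template base and is degraded without incorporation.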
  • Anti-cancer agent of the invention: in a third aspect, the invention relates to an anti-cancer agent for the treatment of a NSCLC patient (hereinafter referred to as the "anti-cancer agent of the invention" or the "anti-cancer agent of the third aspect of the invention"), wherein the patient is selected using the method of the second aspect.
  • the NSCLC patient is a stage I NSCLC patient.
  • kits of the invention or the “kit of the fourth aspect of the invention" comprising
  • At least one CpG site-binding oligonucleotide capable of specifically hybridizing to a sequence of a CpG site in a gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9 in a methylation-specific manner; or
  • said CpG site is in the promoter region of said gene(s). In a more preferred embodiment, said CpG site is selected from the group as defined in the methods of the first and second aspects of the invention.
  • said part of said gene comprises at least the gene sequence between and including the upstream or downstream sequence and a CpG site.
  • oligonucleotide refers to a single-stranded DNA or RNA molecule, with up to 30, 25, 20, 19, 18, 17, 16, 15, 14 or 13 bases in length (upper limit).
  • the oligonucleotides of the invention are preferably DNA or RNA molecules of at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, or 13 bases in length (lower limit). Ranges of base lengths can be combined in all different manners using the afore-mentioned lower and upper limits, for example at least 2 and up to 30 bases, at least 8 and up to 15 bases, at least 5 and up to 15 bases or at least 8 and up to 18 bases.
  • hybridizing refers to the capacity of an oligonucleotide or polynucleotide of recognizing specifically the sequence of a CpG site.
  • hybridization is the process of combining two complementary single-stranded nucleic acid molecules, or molecules with a high degree of similarity, and allowing them to form a single double-stranded molecule through base pairing. Normally, the hybridization occurs under high stringent conditions or moderately stringent conditions.
  • the "similarity" between two nucleic acid molecules is determined by comparing the nucleotide sequence of one molecule to the nucleotide sequence of a second molecule.
  • Variants according to the present invention include nucleotide sequences that are at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95% similar or identical to the sequence of the CpG site.
  • the degree of identity between two nucleic acid molecules is determined using computer algorithms and methods that are widely known for the persons skilled in the art.
  • the identity between two nucleic acid sequences is preferably determined by using the BLASTN algorithm (BLAST Manual, Altschul et al, 1990, NCBI NLM NIH Bethesda, Md. 20894, Altschul, S., et al, J. Mol Biol 215:403-10).
  • “Stringency” of hybridization reactions is readily determinable by one of ordinary skill in the art, and generally is an empirical calculation dependent upon probe length, washing temperature, and salt concentration. In general, longer probes require higher temperatures for proper annealing, while shorter probes need lower temperatures. Hybridization generally depends on the ability of denatured DNA to reanneal when complementary strands are present in an environment below their melting temperature. The higher the degree of desired homology between the probe and hybridizable sequence, the higher the relative temperature which can be used. As a result, it follows that higher relative temperatures would tend to make the reaction conditions more stringent, while lower temperatures less so. For additional details and explanation of stringency of hybridization reactions, see Ausubel et al., Current Protocols in Molecular Biology, Wiley Interscience Publishers, (1995).
  • stringent conditions typically: (1) employ low ionic strength and high temperature for washing, for example 0.015 M sodium chloride/0.0015 M sodium citrate/0.1% sodium dodecyl sulfate at 50°C; (2) employ during hybridization a denaturing agent, such as formamide, for example, 50% (v/v) formamide with 0.1% bovine serum albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50 mM sodium phosphate buffer at pH 6.5 with 750 mM sodium chloride, 75 mM sodium citrate at 42°C; or (3) employ 50% formamide, 5x SSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5x Denhardt's solution, sonicated salmon sperm DNA (50 μg/ml), 0.1% SDS, and 10% dextran sulfate
  • Moderately stringent conditions may be identified as described by Sambrook et al, Molecular Cloning: A Laboratory Manual, New York: Cold Spring Harbor Press, 1989, and include the use of washing solution and hybridization conditions (e.g., temperature, ionic strength and % SDS) less stringent than those described above.
  • An example of moderately stringent conditions is overnight incubation at 37°C.
  • CpG site-binding oligonucleotide refers to an oligonucleotide capable of specifically hybridizing to a nucleotide sequence, wherein the oligonucleotide covers the CpG site of the nucleotide sequence.
  • CpG site-binding oligonucleotides can be used, for example, as probes for the determination of the methylation status of a bisulfite- treated sequence containing a CpG site (binding or non-binding indicates methylation) or as PCR primers for the determination of the methylation status of a bisulfite-treated sequence containing a CpG site (presence/ absence/amount of an amplificate indicates the degree of methylation) in methods for determining a methylation pattern such as the ones described above.
  • CpG site-flanking oligonucleotide refers to an oligonucleotide capable of specifically hybridizing to a nucleotide sequence, wherein the target CpG site(s) of the nucleotide sequence is/are not covered by the oligonucleotide.
  • the oligonucleotide hybridizes on one side of one or more CpG site(s) (upstream or downstream), but not necessarily directly adjacent to the one or more CpG site(s), i.e. there may be one or more nucleotides between the one or more CpG site(s) and the oligonucleotide hybridizing site.
  • there may be at least 1 nucleotide, at least 2 nucleotides, at least 3 nucleotides, at least 4 nucleotides, at least 5 nucleotides, at least 6 nucleotides, at least 7 nucleotides, at least 8 nucleotides, at least 9 nucleotides, at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 25 nucleotides, at least 30 nucleotides, at least 35 nucleotides, at least 40 nucleotides, at least 45 nucleotides, at least 50 nucleotides, at least 60 nucleotides, at least 70 nucleotides, at least 80 nucleotides, at least 90 nucleotides or at least 100 nucleotides between the one or more CpG site(s) and the oligonucleotide hybridizing site.
  • CpG site-flanking oligonucleotides can be used, for example, as PCR primers to amplify a bisulfite-treated sequence containing one or more CpG site(s) or as sequencing primers for the determination of the methylation status of a bisulfite-treated sequence containing one or more CpG sites(s) in methods for determining a methylation pattern such as the ones described above.
  • the kit comprises a pair of CpG site-flanking oligonucleotides, one of which hybridizes upstream and the other downstream of the CpG site(s) of interest (i.e. a pair of forward and reverse primers).
  • Preferred examples of sets of three CpG site-flanking oligonucleotides per gene, PCR primer pairs or single CpG site-flanking oligonucleotides are:
  • These primers allow amplifying/sequencing 6 CpG sites, among them cg18507379, over 188 bp.
  • PCR forward GGGAAATAGYGATAGGGGAGTTTAAGATTG (SEQ ID NO: 7)
  • PCR reverse ACTCTCTACTTATCCACACACTTAC (SEQ ID NO: 8)
  • PCR forward GGGAATTGGTTGGTATTAGTATAATGG (SEQ ID NO: 10)
  • PCR reverse AACCCAAAAAACCAAATACATTAAC (SEQ ID NO: 11)
  • These primers allow amplifying/sequencing 9 CpG sites, among them cg14996220, over 234 bp.
  • PCR forward GTAGATTTTATGTAATAATTTGGTGGTAT (SEQ ID NO: 13)
  • PCR reverse CCCTTTACATAAAAACATATAACTTTTACT (SEQ ID NO: 14)
  • Sequencing GGGGAAGTATAGTTATTTAATAAG (SEQ ID NO: 15)
  • These primers allow amplifying/sequencing 9 CpG sites, among them cg16104915 and cg12600174, over 182 bp.
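  • One of the listed primers (SEQ ID NO: 7) contains the degenerate IUPAC base Y, which stands for C or T. The following minimal Python sketch expands such a primer into its explicit variants. Reading the Y as covering a CpG cytosine that remains C when methylated and becomes T when unmethylated after conversion is an interpretation, not a statement from the text; the function name and the reduced IUPAC table are hypothetical.

```python
from itertools import product

# Reduced IUPAC table: only the codes needed for this illustration.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "R": "AG"}


def expand_degenerate(primer):
    """List every explicit sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combination) for combination in product(*choices)]


forward_primer = "GGGAAATAGYGATAGGGGAGTTTAAGATTG"  # SEQ ID NO: 7
variants = expand_degenerate(forward_primer)

for variant in variants:
    print(variant)
# GGGAAATAGCGATAGGGGAGTTTAAGATTG
# GGGAAATAGTGATAGGGGAGTTTAAGATTG
```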
  • methylation-specific manner refers to an oligonucleotide that is either capable of hybridizing to a polynucleotide in which unmethylated cytosine has been converted to uracil or to another base that is detectably dissimilar to cytosine in terms of hybridization properties and which comprises at least one sequence of a CpG site when said CpG site is methylated, or to the same polynucleotide when said CpG site is unmethylated, but not to both.
  • the kit of the invention comprises (i) at least one first CpG site-binding oligonucleotide capable of specifically hybridizing to a polynucleotide in which unmethylated cytosine has been converted to uracil or to another base that is detectably dissimilar to cytosine in terms of hybridization properties and which comprises at least one sequence of a CpG site according to the kit of the fourth aspect of the invention when said CpG site is methylated, and
  • kits may include various reagents for use in accordance with the present invention in suitable containers and packaging materials, including tubes, vials, and shrink-wrapped and blow-moulded packages. Additionally, the kits of the invention can contain instructions for the simultaneous, sequential or separate use of the different components which are in the kit. Said instructions can be in the form of printed material or in the form of an electronic support capable of storing instructions such that they can be read by a subject, such as electronic storage media (magnetic disks, tapes and the like), optical media (CD-ROM, DVD) and the like. Additionally or alternatively, the media can contain Internet addresses that provide said instructions.
  • Materials suitable for inclusion in an exemplary kit in accordance with the present invention comprise one or more of the following: reagents required to discriminate between the various possible alleles in the sequence domains amplified by PCR or non-PCR amplification (e.g., restriction endonucleases, oligonucleotides that anneal preferentially to methylated or to unmethylated CpG sites, including those modified to contain enzymes or fluorescent chemical groups that amplify the signal from the oligonucleotide and make discrimination of methylated or unmethylated CpG sites more robust); or reagents required to physically separate products derived from the various amplified regions (e.g. agarose or polyacrylamide and a buffer to be used in electrophoresis, HPLC columns, SSCP gels, formamide gels or a matrix support for MALDI-TOF).
  • the kit of the invention further comprises one or more reagents for converting an unmethylated cytosine to uracil or to another base that is detectably dissimilar to cytosine in terms of hybridization properties (i.e. a converting agent as defined above).
  • the one or more reagents for converting an unmethylated cytosine to uracil or to another base that is detectably dissimilar to cytosine in terms of hybridization properties is a bisulfite, preferably sodium bisulfite.
  • the reagent capable of converting an unmethylated cytosine to uracil or to another base that is detectably dissimilar to cytosine in terms of hybridization properties is metabisulfite, preferably sodium metabisulfite.
  • kits of the invention have the meaning as defined for the prognostic and the selection method of the invention.
  • the present invention relates to a nucleic acid (hereinafter referred to as the "nucleic acid of the invention” or the “nucleic acid of the fifth aspect”) selected from the group consisting of
  • nucleic acid comprising at least 9 contiguous nucleotides comprising a CpG site of a gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9;
  • nucleic acid comprising at least 9 contiguous nucleotides comprising a CpG site of a gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9, wherein the position corresponding to the C within the CpG site is a uracil;
  • the nucleic acid of the invention comprises at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 20, at least 25, at least 30 or more contiguous nucleotides of said genes.
  • said CpG site is in the promoter region of said gene(s). In a more preferred embodiment, said CpG site is selected from the group as defined in the methods of the first and second aspects of the invention.
  • nucleic acids of the invention have been described in detail in the context of the methods and kits of the invention and are used with the same meaning.
  • kits and nucleic acids of the invention are particularly useful in the prognosis of a NSCLC patient or for selecting a NSCLC patient for anti-cancer treatment according to the methods of the first and second aspect of the invention.
  • the present invention relates to the use of a kit or a nucleic acid of the foregoing aspects for the prognosis of a NSCLC patient or for selecting a NSCLC patient for anti-cancer treatment (hereinafter referred to as the "use of the invention").
  • the NSCLC patient is a stage I NSCLC patient.
  • Table 2 Data shown in Table 2 are average (range) or number (%). * in Table 2 indicates patients from the discovery cohort who had undergone resection of NSCLC and received neither adjuvant nor neoadjuvant chemotherapy before relapse. ** in Table 2 indicates that all patients from the validation cohort had undergone resection of NSCLC and received neither adjuvant nor neoadjuvant chemotherapy before relapse.
  • the DNA methylation status of 450,000 CpG sites was established by using the Infinium 450K Methylation array.
  • the methylation score of each CpG is represented as a β value.
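  • For orientation, a β value is commonly computed from the methylated and unmethylated probe intensities. The following Python sketch uses the conventional definition with a stabilising offset of 100; the text does not state the formula used, so both the offset and the function name are assumptions made for illustration.

```python
def beta_value(methylated_intensity, unmethylated_intensity, offset=100.0):
    """Beta value between 0 (unmethylated) and 1 (fully methylated).

    Negative intensities (possible after background correction) are
    clipped to zero. The offset of 100 is the conventional default and
    is an assumption here, not a value stated in the text.
    """
    m = max(methylated_intensity, 0.0)
    u = max(unmethylated_intensity, 0.0)
    return m / (m + u + offset)


# Hypothetical intensities, for illustration only.
print(beta_value(900.0, 0.0))     # 0.9
print(beta_value(4950.0, 4950.0))  # 0.495
```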
  • Samples were clustered in an unsupervised manner using the 10,000 most variable β values for CpG methylation according to the standard deviation for the CpG sites located in promoter regions by hierarchical clustering using the complete method for agglomerating the Manhattan distances.
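  • The clustering step can be sketched in pure Python on a toy data set. This is a minimal illustration, not the inventors' implementation: it selects the most variable CpGs by standard deviation, computes Manhattan distances between samples and agglomerates with complete linkage until a requested number of clusters remains. All names and values are hypothetical, and restriction to promoter CpGs is assumed to have been done beforehand.

```python
from statistics import stdev


def most_variable_cpgs(betas, k):
    """Return the k CpG identifiers with the highest standard deviation."""
    return sorted(betas, key=lambda cpg: stdev(betas[cpg]), reverse=True)[:k]


def manhattan(a, b):
    """Manhattan (city-block) distance between two profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


def complete_linkage(profiles, n_clusters):
    """Agglomerative clustering; cluster distance = largest pairwise distance."""
    clusters = [[name] for name in profiles]

    def distance(c1, c2):
        return max(manhattan(profiles[a], profiles[b]) for a in c1 for b in c2)

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: distance(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical beta values: one list per CpG, one entry per sample.
samples = ["T1", "T2", "N1", "N2"]
betas = {
    "cgA": [0.9, 0.8, 0.1, 0.1],
    "cgB": [0.7, 0.9, 0.2, 0.1],
    "cgC": [0.5, 0.5, 0.5, 0.5],  # invariant, so it is not selected
}

selected = most_variable_cpgs(betas, k=2)
profiles = {name: [betas[cpg][i] for cpg in selected]
            for i, name in enumerate(samples)}
clusters = complete_linkage(profiles, n_clusters=2)
print(selected)
print(clusters)
```

  • On real data the same steps would be applied to the 10,000 most variable CpGs; a dedicated numerical library would normally replace this quadratic toy implementation.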
  • DNA methylation microarray data are available from:
  • MIAME standards were accomplished following GEO specifications. Chip analysis was performed using Illumina HiScan SQ fluorescent scanner. Bisulphite converted DNA was amplified, fragmented and hybridised to Illumina Infinium Human Methylation450 Beadchip using standard Illumina protocol. Arrays were imaged using HiScan SQ following standard recommended Illumina scanner setting. Methylation score of each CpG is represented as beta value. Normalization
  • the intensities of the images were extracted using Genome Studio (2011.1) Methylation module (1.9.0) software.
  • a three-step based normalization procedure was performed using the lumi package available in the R statistical environment, consisting of color bias adjustment (normalization between the two color channels red and green), background level adjustment and quantile normalization across arrays based on color balance adjusted data, as specified (Du et al, Bioinformatics 24(13): 1547-1548, 2008).
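  • The third step, quantile normalization across arrays, can be illustrated generically. The Python sketch below is not the R package's implementation and does not cover color bias or background adjustment; it shows the textbook procedure in which each array's values are replaced by the mean of the values sharing the same rank across arrays. Ties are broken by input order, which is a simplification, and the function name and intensities are hypothetical.

```python
def quantile_normalize(arrays):
    """Quantile-normalize equally long lists of intensities (one per array)."""
    n_values = len(arrays[0])
    if any(len(a) != n_values for a in arrays):
        raise ValueError("all arrays must have the same length")

    sorted_arrays = [sorted(a) for a in arrays]
    # Mean intensity at each rank across all arrays.
    rank_means = [
        sum(s[rank] for s in sorted_arrays) / len(arrays)
        for rank in range(n_values)
    ]

    normalized = []
    for a in arrays:
        order = sorted(range(n_values), key=lambda i: a[i])
        result = [0.0] * n_values
        for rank, index in enumerate(order):
            result[index] = rank_means[rank]
        normalized.append(result)
    return normalized


# Hypothetical intensities for two arrays and three probes.
normalized = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
print(normalized)  # [[5.5, 1.5, 3.5], [3.5, 1.5, 5.5]]
```

  • After normalization every array has the same distribution of values, while the rank order of probes within each array is preserved.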
  • Probes and sample filtering involved a two-step process.
  • probes overlapping with known single nucleotide polymorphisms (SNPs) were removed.
  • SNPs are known to have given rise to inaccurate measurements of DNA methylation in earlier generations of Illumina's BeadArray assay and this needs to be taken into consideration for the Human Methylation 450K technology. Therefore, the inventors first removed all probes containing a SNP in the assayed CpG dinucleotide, as well as those for which two or more SNPs were located in the probe sequence. The inventors extracted the known SNPs in the human genome from the dbSNP database.
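  • The SNP-based filtering rule described above can be sketched as a simple predicate. In this minimal Python example a probe is discarded when a SNP lies within the assayed CpG dinucleotide or when two or more SNPs fall within the probe sequence. The coordinate convention (inclusive probe interval, CpG given by the position of its cytosine), the function name and the positions are hypothetical.

```python
def keep_probe(probe_start, probe_end, cpg_position, snp_positions):
    """Return True when the probe passes the SNP filter.

    probe_start and probe_end are inclusive genomic coordinates of the
    probe sequence; cpg_position is the coordinate of the C of the assayed
    CpG, so the dinucleotide spans cpg_position and cpg_position + 1.
    """
    snp_in_cpg = any(p in (cpg_position, cpg_position + 1) for p in snp_positions)
    snps_in_probe = sum(1 for p in snp_positions if probe_start <= p <= probe_end)
    return not snp_in_cpg and snps_in_probe < 2


# Hypothetical probe covering positions 100 to 149 and assaying the CpG at 149.
print(keep_probe(100, 149, 149, []))          # True: no SNP
print(keep_probe(100, 149, 149, [120]))       # True: one SNP outside the CpG
print(keep_probe(100, 149, 149, [149]))       # False: SNP in the CpG
print(keep_probe(100, 149, 149, [110, 130]))  # False: two SNPs in the probe
```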
  • the second step involves the detection p-values of all measurements.
  • the inventors considered every methylation β value to be unreliable if its corresponding detection p-value was not below the threshold T = 0.05.
  • Greedycut was used, an algorithm that filters out the probe or sample with the highest fraction of unreliable measurements in successive iterations, producing a matrix of retained measurements and increasing the set of removed measurements.
  • the inventors selected the matrix that maximized the value of the expression s + 1 - a, thereby giving equal weights to the sensitivity and specificity. Presented geometrically on a ROC curve, this is the point that is furthest from the diagonal. The final step was the removal of CpGs included in the sex chromosomes. Collectively, the filtering steps removed 76,545 probes and 24 samples. The analyses presented in this publication focus on the remaining 409,219 CpGs in 515 samples (490 primary tumors and 25 normal lung tissue counterparts). To study possible batch effects for the Human Methylation 450K technology, multivariate Cox regression analyses were developed to show that the obtained hazard ratios were not associated with the center of origin of each sample or the microarray hybridization date.
  • Differentially methylated CpG search. Differentially methylated CpGs between tumor and normal groups were discovered using the following procedure: for each probe/CpG, the sets of methylation β values T (belonging to the tumor samples: first group) and N (belonging to the normal lung tissue samples: second group) were compared. The following three measures were calculated and differentially methylated CpGs were selected on the basis of their fulfillment of these three conditions:
  • To facilitate the classification of lung cancer patients with respect to their methylation levels in specific CpGs, the inventors set a threshold β-value of 0.4 to define non-methylation (β < 0.4) and methylation (β > 0.4). Subsequently, the inventors examined whether these two groups of patients were associated with a different pattern of relapse-free survival (RFS). The inventors also defined a CpG to be consistently unmethylated in normal donors when its mean β-value was at most 0.15. This condition was met by all of the selected differentially methylated CpGs.
  • DNA methylation in the validation cohort was evaluated with a pyrosequencing assay.
  • a minimum of 500 ng of DNA was converted using the EZ DNA Methylation Gold (ZYMO RESEARCH) bisulfite conversion kit following the manufacturer's recommendations.
  • Specific sets of primers for PCR amplification and sequencing were designed using specific software (PyroMark assay design version 2.0.01.15). Primer sequences were designed, when possible, to hybridize with CpG-free sites to ensure methylation-independent amplification.
  • PCR was performed under standard conditions with biotinylated primers and the PyroMark Vacuum Prep Tool (Biotage, Sweden) was used to prepare single-stranded PCR products according to the manufacturer's instructions. PCR products were observed on 2% agarose gels before pyrosequencing. Reactions were performed in a PyroMark Q96 System version 2.0.6 (Qiagen) using appropriate reagents and protocols, and the methylation value was obtained from the average of each of the CpG dinucleotides included in the sequence analyzed. Controls to assess correct bisulfite conversion of the DNA were included in each run, as well as sequencing controls to ensure the fidelity of the measurements.
  • the threshold value derived from the discovery cohort for RFS analysis was not re-estimated but was applied directly to the validation cohort.
  • For the bimodal signature based on the accumulation of the five validated hypermethylated genes, only those samples in which all of them showed valid results were included (n = 102).
  • the tumor size, when included, was added categorically (≤3 cm or >3 cm) according to T classification.
  • the multiple-testing adjustment (false discovery rate: FDR) was calculated as described (Storey J., Ann. Statist. 31(6):2013-2035, 2003): it is defined as the calculation of the positive false discovery rate of the p-value. All statistical analyses were performed and graphical output produced using SPSS, R-2.15.0 and R packages. Gene ontology by PANTHER, INTERPRO and KEGG pathway enrichment analyses were done using the Database for Annotation, Visualization and Integrated Discovery (DAVID; v6.7).
  • DNA methylation profiles identify two groups with different relapse-free survival
  • the inventors first evaluated a genome-wide DNA methylation profile of the original cohort of 490 lung tumor patients including three NSCLC subtypes (adenocarcinoma, squamous and large cell carcinomas) using a previously validated 450,000 CpG methylation microarray (Sandoval et al., Epigenetics 6(6):692-702, 2011). In addition, 25 normal lung tissue counterparts without any histological evidence of malignancy were also analyzed.
  • Chi-square tests showed a significantly higher proportion of the adenocarcinoma histological type in Group A (Chi-square test P < 0.001), but no other significant differences in the distribution of the tumors according to their stage, gender or smoking history between Group A and Group B were observed.
  • the inventors investigated whether these two DNA methylation groups had any effect on the RFS of these patients.
  • the inventors analyzed the subset of patients who had undergone resection of NSCLC and not received adjuvant chemotherapy before relapse, due to the possible confounding effect of chemotherapy in the RFS. Only 6% (31 of 490) of the discovery cohort samples received neoadjuvant therapy and none of these cases were included in the RFS analysis.
  • stage I cases received neoadjuvant therapy and none of these were included in the RFS analysis.
  • the DNA methylation levels at the described CpG sites were analyzed by pyrosequencing (Fernandez AF et al., Genome Res 22(2):407-19, 2012) to test a more affordable large-scale approach. The methylation value by pyrosequencing was obtained from the average of each of the CpG dinucleotides included in the sequence analyzed. Due to the limited DNA material, the inventors selected the top 10 genes with an HR > 2 at a 10% FDR.
  • Histone cluster 1 H4F (HIST1H4F), HR 3.55, p < 0.001
  • the inventors also observed a greater risk of shorter RFS, according to Kaplan-Meier plots, when stage I NSCLCs harbored a high number of the five statistically significant hypermethylated markers (HIST1H4F, PCDHGB6, NPBWR1, ALX1 and HOXA9).
  • the inventors chose the cut-off of 0-1 vs ≥2 hypermethylated markers, because it best matched the expected percentage of recurrences.
  • the described bimodal methylation signature divides the stage I tumors into two arms: patients with 0-1 methylated markers, who show longer RFS, and those with ≥2 hypermethylated genes, who were associated with a higher risk of poor RFS by Kaplan-Meier estimates (Fig. 3A).
  • Since 80% of recurrences of stage I NSCLC occur within three years of surgery (Martini N et al., J Thorac Cardiovasc Surg 109:120-9, 1995), the inventors also calculated how many patients relapsed in this period. The inventors observed that 48% (95% CI 39.8-56.4) of patients from the enriched methylated group (2-5 methylated markers) relapsed, but only 18% (95% CI 16.1-19.5) of those in the low methylated group (0-1 methylated markers).
  • Indicated numbers correspond to the C nucleotide of a CpG site, according to MAPINFO/Illumina Infinium HumanMethylation450 BeadChip, Manifest v1.2, or according to the UCSC database, as in Genome Reference Consortium Human Build 37 (GRCh37) and UCSC hg19 as released in February 2009.
  • DNA methylation classifiers that, at a different level of resolution, are potential prognostic biomarkers of shorter RFS in stage I NSCLC (Fig. 3C).
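The β-value threshold and the five-marker count described above can be illustrated with a minimal Python sketch. It is not the inventors' code: the function names are hypothetical, and treating a β value of exactly 0.4 as unmethylated is an assumption, since the text only states β > 0.4 for methylation. Requiring a valid result for all five genes follows the rule stated for the bimodal signature.

```python
# Illustrative sketch only; function names are hypothetical.
MARKERS = ("HIST1H4F", "PCDHGB6", "NPBWR1", "ALX1", "HOXA9")
BETA_THRESHOLD = 0.4  # threshold stated in the text


def is_methylated(beta, threshold=BETA_THRESHOLD):
    """Call a marker methylated when its beta value exceeds the threshold.

    Assumption: a value exactly equal to the threshold counts as unmethylated.
    """
    return beta > threshold


def count_methylated(betas):
    """Count methylated markers; all five genes must have a value."""
    missing = [gene for gene in MARKERS if gene not in betas]
    if missing:
        raise ValueError("no valid result for: " + ", ".join(missing))
    return sum(1 for gene in MARKERS if is_methylated(betas[gene]))


def signature_group(betas):
    """Return 'low' for 0-1 methylated markers and 'high' for 2-5."""
    return "high" if count_methylated(betas) >= 2 else "low"
```

A sample with β values above 0.4 in two or more of the five genes is assigned to the high group, mirroring the 0-1 versus 2-5 split used in the Kaplan-Meier comparison.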

Abstract

The present invention relates to the field of pharmacogenomics and in particular to an in vitro method for prognosis of a stage I NSCLC patient, comprising determining in a biological sample of said patient the methylation pattern in one or more genes. It further relates to an in vitro method for selecting a stage I NSCLC patient for an anti-cancer treatment based on the method of prognosis. Also, it relates to nucleic acids and kits useful for these methods and an anti-cancer treatment for the selected patients.

Description

METHODS AND KITS FOR PROGNOSIS OF STAGE I NSCLC BY DETERMINING THE METHYLATION PATTERN OF CpG DINUCLEOTIDES
FIELD OF THE INVENTION
The present invention relates to the field of pharmacogenomics and in particular to an in vitro method for prognosis of a stage I NSCLC patient, comprising determining in a biological sample of said patient the methylation pattern in one or more genes. It further relates to an in vitro method for selecting a stage I NSCLC patient for an anti-cancer treatment based on the method of prognosis. Also, it relates to nucleic acids and kits useful for these methods and an anti-cancer treatment for the selected patients.
BACKGROUND OF THE INVENTION
Non-small-cell lung cancer (NSCLC) accounts for approximately 80% of all lung cancers, with 1.2 million new cases worldwide each year. NSCLC resulted in more than one million deaths worldwide in 2001 and is the leading cause of cancer-related mortality in both men and women (31% and 25%, respectively). The prognosis of advanced NSCLC is dismal. A recent Eastern Cooperative Oncology Group trial of 1155 patients showed no differences among the chemotherapies used: cisplatin/paclitaxel, cisplatin/gemcitabine, cisplatin/docetaxel and carboplatin/paclitaxel. Overall median time to progression was 3.6 months, and median survival was 7.9 months.
At diagnosis, patients with NSCLC can be divided into different groups that reflect both the extent of the disease and the treatment approach. The group of patients that has tumors which are surgically resectable (generally stage I, stage II, and selected stage III tumors) still has the best prognosis.
The overall five-year survival of patients with NSCLC has remained at less than 15% for the past 20 years. Stage grouping of TNM subsets (T=primary tumor; N=regional lymph nodes; M=distant metastases) permits the identification of patient groups with similar prognosis and treatment options.
Multiple studies have attempted to identify prognostic determinants after surgery and have yielded conflicting evidence as to the prognostic importance of a variety of clinicopathologic factors. Factors that have been observed to correlate with adverse prognosis include the presence of pulmonary symptoms, large tumor size (>3 cm), non-squamous histology, metastases to multiple lymph nodes within a TNM-defined nodal station, vascular invasion, and increased numbers of tumor blood vessels in the tumor specimen.
The poor prognosis of NSCLC patients is associated with several factors, among them the late diagnosis of the disease and the small number of effective drugs. The absence of validated prognostic biomarkers could also be relevant, because even patients with stage I NSCLC - who undergo potentially curative surgical resection - are at high risk of dying from recurrent disease, with a 5-year relapse rate of 35-50%.
The importance of identifying early in time the patients that are suffering or will suffer a relapse before the physical relapse symptoms appear (cough, pain or tumoral mass observed by PET/CT) is very high since it will allow the practitioners to choose an appropriate therapy. NSCLC is a tumor in which only small improvements in clinical outcome have been achieved. The issue is critical for stage I patients for whom there are no available biomarkers that indicate which high-risk patients should receive adjuvant chemotherapy.
Although studies have demonstrated that adjuvant platinum-based chemotherapy is beneficial in more advanced resected disease, in which most of the patients have a high risk of recurrence, they have failed to show a survival benefit for patients at stage I. One explanation for these negative data in the early stages could be the lack of biological factors predicting their recurrence and the fact that, in the absence of useful biomarkers, all stage I NSCLCs are pooled, making it more difficult to draw meaningful clinical conclusions.
Thus, there is a need in the art for prognostic markers for stage I NSCLC. In the search for new potential biomarkers of human cancer, the hypermethylation of the CpG island sequences located in the promoter regions of tumor suppressor genes is gaining prominence. The present inventors investigated whether DNA methylation markers could also be used to provide a prognostic snapshot of lung tumors. As a result, they have obtained DNA methylation signatures associated with shorter relapse-free survival (RFS) in stage I NSCLCs that will be useful in the design of clinical trials for adjuvant chemotherapy in the expanding population of diagnosed early-stage lung cancer. The DNA methylation signature of NSCLC can be practically determined by user-friendly PCR assays. The analysis of the best DNA methylation biomarkers improves prognostic accuracy beyond standard staging.
SUMMARY OF THE INVENTION
In a first aspect, the present invention relates to an in vitro method for the prognosis of a NSCLC patient, comprising determining in a biological sample of said patient the methylation pattern in at least one gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9, and wherein hypermethylation in at least one gene is indicative of a bad prognosis.
In a second aspect, the present invention relates to an in vitro method for selecting a NSCLC patient for an anti-cancer treatment, comprising determining in a biological sample of said patient the methylation pattern in at least one gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9, and wherein hypermethylation in at least one gene selects the patient for the cancer treatment.
In a third aspect, the present invention relates to an anti-cancer agent for the treatment of a NSCLC patient, wherein the patient is selected using the method of the second aspect.
In a fourth aspect, the present invention relates to a kit comprising
(i) at least one CpG site-binding oligonucleotide capable of specifically hybridizing to a sequence spanning a CpG site in a gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9 in a methylation-specific manner; or
(ii) at least one CpG site-flanking oligonucleotide capable of specifically hybridizing to an upstream or downstream sequence of a CpG site in a gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9, wherein unmethylated cytosine in at least part of said gene has been converted to uracil or to another base that is detectably dissimilar to cytosine in terms of hybridization properties.
In a fifth aspect, the present invention relates to a nucleic acid selected from the group consisting of
(i) a nucleic acid comprising at least 9 contiguous nucleotides comprising a CpG site of a gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9;
(ii) a nucleic acid comprising at least 9 contiguous nucleotides comprising a CpG site of a gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9, wherein the position corresponding to the C within the CpG site is a uracil; and
(iii) a polynucleotide which specifically hybridizes to a nucleic acid of (i) or (ii).
In a sixth aspect, the present invention relates to the use of a kit or a nucleic acid of the foregoing aspects for the prognosis of a NSCLC patient or for selecting a NSCLC patient for anti-cancer treatment.
LEGENDS TO THE FIGURES
Figure 1: DNA Methylation Signatures Associated with Recurrence-Free Survival (RFS) in NSCLC samples. Panel A displays the Kaplan-Meier analysis for RFS among the 198 patients with RFS information according to the two groups obtained from the clustering. The p-value corresponds to the hazard ratio (HR) adjusted by multivariate regression (including age, gender, smoking history, stage and histological type). Panel B shows the Kaplan-Meier estimates for RFS among the subset of 147 patients with RFS information according to the two groups obtained in the clustering. The p-value reflects the HR adjusted as in the analysis in Panel A.
Figure 2: Kaplan-Meier Estimates of RFS in Stage I NSCLC Patients by Methylation Status of the Five Validated Genes. Top panel for each gene shows the Kaplan-Meier estimates for RFS of the final five validated genes in the subset of 147 patients in stage I from the discovery cohort. Methylation status was determined by the Infinium 450K Methylation Array. Bottom panel for each gene shows the corresponding Kaplan-Meier estimates for the same genes in the 142 patients in stage I included in the validation cohort. In this case, methylation status was determined by pyrosequencing. The p-values correspond to HRs adjusted by multivariate regression (including age, gender, smoking history and histological type).
Figure 3: Kaplan-Meier Estimates of RFS by Number of Methylated Genes (A-B) and Forest Plot with Hazard Ratios for Recurrence in Stage I NSCLC (C).
(A-B) In each panel, patients are grouped into methylation low or methylation high groups according to the number of methylated genes (0-1 versus 2-5) from the five-gene signature (including HIST1H4F, NPBWR1, PCDHGB6, ALX1 and HOXA9). Panel A shows patients from the validation cohort analyzed by pyrosequencing. Panel B includes patients from the discovery cohort analyzed by the DNA methylation microarray. The p-values correspond to hazard ratios adjusted by multivariate regression (including age, gender, smoking history and histological type). (C) The forest plot shows the multivariate Cox regression for the various DNA methylation classifiers of stage I NSCLC patients. Data for the Group 1 heatmap in stage I was obtained from the discovery cohort with the 450K array. Data from each of the five significant genes and the bimodal signature model were obtained both from the discovery cohort with the 450K array and from the validation cohort by pyrosequencing. The prognostic value for each gene or signature was adjusted for age, gender, smoking history and histological type.
DETAILED DESCRIPTION OF THE INVENTION
The authors of the present invention have found that hypermethylation of five genes was significantly associated with relapse-free survival (RFS) in stage I NSCLC patients. These genes are HIST1H4F, PCDHGB6, NPBWR1, ALX1 and HOXA9. This finding is useful for prognosis of NSCLC patients and the selection of these patients for anti-cancer treatment as set out below. Further described are kits and nucleic acids for use in these methods.
Method for prognosis of a stage I NSCLC patient
In a first aspect, the invention relates to an in vitro method for the prognosis of a NSCLC patient (hereinafter referred to as the "method of the first aspect" or the "prognostic method of the invention"), comprising determining in a biological sample of said patient the methylation pattern in at least one gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9, and wherein hypermethylation in at least one gene is indicative of a bad prognosis.
The term "prognosis" as referred to herein is understood as the expected progression of a disease and relates to the assessment of the probability according to which a subject suffers from a disease as well as to the assessment of its onset, state of development, progression, or of its regression, and/or the prognosis of the course of the disease in the future. As will be understood by persons skilled in the art, such assessment normally may not be correct for 100% of the subjects to be diagnosed, although it preferably is correct. The term, however, requires that a correct prognosis can be made for a statistically significant part of the subjects. Whether a part is statistically significant can be determined readily by the person skilled in the art using several well-known statistical evaluation tools, for example, determination of confidence intervals, determination of p values, Student's t-test, Mann-Whitney test, etc. Details are provided in Dowdy and Wearden, Statistics for Research, John Wiley & Sons, New York 1983. The preferred confidence intervals are at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%. The p values are preferably 0.05, 0.01, or 0.005. The prediction of the clinical outcome can be done using any assessment criterion used in oncology and known by the person skilled in the art. The assessment parameters useful for describing the progression of a disease include, for example, relapse-free survival (RFS).
The term "bad prognosis" as referred to herein generally means an outcome which would be regarded as negative for the patient and preferably depends on the prognosis type. For example, a bad prognosis of relapse-free survival (RFS) would mean that the patient will live for a shorter time without recurrence of the tumour than, e.g., the average in a group of stage I NSCLC patients.
In a preferred embodiment of the method of the first aspect, the prognosis is determined as relapse-free survival. Preferably, it is determined as the likelihood of relapse-free survival in a period of 3 years after tumour resection surgery. The likelihood is equivalent to the percentage of patients that survive for 3 years after resection without relapse among a larger group of patients (some of which may relapse earlier).
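As an illustration of how a 3-year relapse-free survival likelihood might be estimated from follow-up data, the following minimal Python sketch implements a Kaplan-Meier estimate, the estimator used for the figures of this document. The function name and the input layout are hypothetical. When no patient is censored before the horizon, the estimate reduces to the simple percentage of patients without relapse described above.

```python
# Illustrative sketch only; the function name and input layout are hypothetical.
def kaplan_meier_rfs(times, relapsed, horizon):
    """Kaplan-Meier estimate of relapse-free survival at `horizon`.

    times    -- follow-up time per patient (same unit as horizon, e.g. years)
    relapsed -- True if a relapse was observed at that time, False if censored
    """
    # Distinct relapse times up to the horizon, in increasing order.
    event_times = sorted(
        {t for t, event in zip(times, relapsed) if event and t <= horizon}
    )
    survival = 1.0
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        # Relapses observed exactly at time t.
        events = sum(1 for u, event in zip(times, relapsed) if event and u == t)
        survival *= 1.0 - events / at_risk
    return survival
```

For example, calling `kaplan_meier_rfs(times, relapsed, horizon=3)` on each methylation group separately would give the two group-wise 3-year estimates that can then be compared.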
The term "patient", as used herein, refers to an individual, such as a human, a non-human primate (e.g. chimpanzees and other apes and monkey species); farm animals, such as birds, fish, cattle, sheep, pigs, goats and horses; domestic mammals, such as dogs and cats; laboratory animals including rodents, such as mice, rats and guinea pigs. The term does not denote a particular age or sex. In a particular embodiment of the invention, the subject is a mammal. In a preferred embodiment of the invention, the subject is a human. In another embodiment, the subject has undergone tumour resection. Preferably, the subject has not undergone chemotherapy prior to the determination.
The term "NSCLC patient" encompasses patients having NSCLC and patients having been diagnosed with NSCLC, even though they may be in remission or may have no visible signs of NSCLC anymore. In a preferred embodiment, the NSCLC patient is a stage I NSCLC patient.
The term "NSCLC" or "non-small cell lung cancer", as used herein, refers to a group of heterogeneous diseases grouped together because their prognosis and management is roughly identical and includes, according to the histological classification of the World Health Organization/International Association for the Study of Lung Cancer (Travis WD et al. Histological typing of lung and pleural tumours. 3rd ed. Berlin: Springer-Verlag, 1999):
1. squamous cell carcinoma (SCC), accounting for 30% to 40% of NSCLC, starts in the larger breathing tubes but grows slower meaning that the size of these tumours varies on diagnosis.
2. adenocarcinoma is the most common subtype of NSCLC, accounting for 50% to 60% of NSCLC, which starts near the gas-exchanging surface of the lung and which includes a subtype, the bronchioalveolar carcinoma, which may have different responses to treatment.
3. large cell carcinoma is a fast-growing form that grows near the surface of the lung. It is primarily a diagnosis of exclusion, and when more investigation is done, it is usually reclassified to squamous cell carcinoma or adenocarcinoma.
4. adenosquamous carcinoma is a type of cancer that contains two types of cells: squamous cells (thin, flat cells that line certain organs) and gland-like cells.
5. carcinomas with pleomorphic, sarcomatoid or sarcomatous elements. This is a group of rare tumours reflecting a continuum in histological heterogeneity as well as epithelial and mesenchymal differentiation.
6. carcinoid tumour is a slow-growing neuroendocrine lung tumour and begins in cells that are capable of releasing a hormone in response to a stimulus provided by the nervous system.
7. carcinomas of salivary gland type begin in salivary gland cells located inside the large airways of the lung.
8. unclassified carcinomas include cancers that do not fit into any of the aforementioned lung cancer categories.
The term "stage I NSCLC", as used herein, refers to a tumor that is present in the lungs, where the cancer has not been found in the chest lymph nodes or in other locations outside of the chest. Stage I NSCLC is subdivided into stages IA and IB, usually based upon the size of the tumor or involvement of the pleura, which is the lining along the outside of the lung. In Stage IA, the tumor is 3 centimetres (cm) or less in size and has invaded nearby tissue minimally, if at all. The cancer has not spread to the lymph nodes or to any distant sites. In Stage IB, the tumor is more than 3 cm in size, has invaded the pleural lining around the lung, or has caused a portion of the lung to collapse. The cancer has not spread to the lymph nodes or to any distant sites. Stage IA corresponds to T1N0M0 of the TNM classification. Stage IB corresponds to T2N0M0 of the TNM classification. Stage I NSCLC includes stages IA and/or IB NSCLC.
The TNM classification is a staging system for malignant cancer. As used herein the term "TNM classification" refers to the 6th edition of the TNM stage grouping as defined in Sobin et al. (International Union Against Cancer (UICC), TNM Classification of Malignant tumors, 6th ed. New York; Springer, 2002, pp. 191-203) (TNM6) and AJCC Cancer Staging Manual 6th edition; Chapter 19; Lung - original pages 167-177 whereby the tumors are classified by several factors, namely, T for tumor, N for nodes, M for metastasis as follows:
T (Primary tumor):
- TX Primary tumor cannot be assessed, or tumor proven by the presence of malignant cells in sputum or bronchial washings but not visualized by imaging or bronchoscopy,
- T0 No evidence of primary tumor,
- Tis Carcinoma in situ,
- T1 Tumor 3 cm or less in greatest dimension, surrounded by lung or visceral pleura, without bronchoscopic evidence of invasion more proximal than the lobar bronchus (for example, not in the main bronchus),
- T2 Tumor more than 3 cm but 7 cm or less or tumor with any of the following features (T2 tumors with these features are classified T2a if 5 cm or less): involves main bronchus, 2 cm or more distal to the carina; invades visceral pleura (PL1 or PL2); associated with atelectasis or obstructive pneumonitis that extends to the hilar region but does not involve the entire lung,
- T3 Tumor more than 7 cm or one that directly invades any of the following: parietal pleura (PL3), chest wall (including superior sulcus tumors), diaphragm, phrenic nerve, mediastinal pleura, parietal pericardium; or tumor in the main bronchus less than 2 cm distal to the carina but without involvement of the carina; or associated atelectasis or obstructive pneumonitis of the entire lung or separate tumor nodule(s) in the same lobe, and
- T4 Tumor of any size that invades any of the following: mediastinum, heart, great vessels, trachea, recurrent laryngeal nerve, esophagus, vertebral body, carina, separate tumor nodule(s) in a different ipsilateral lobe.
N (Regional Lymph Nodes):
- NX Regional lymph nodes cannot be assessed
- N0 No regional lymph node metastases
- N1 Metastasis in ipsilateral peribronchial and/or ipsilateral hilar lymph nodes and intrapulmonary nodes, including involvement by direct extension
- N2 Metastasis in ipsilateral mediastinal and/or subcarinal lymph node(s)
- N3 Metastasis in contralateral mediastinal, contralateral hilar, ipsilateral or contralateral scalene, or supraclavicular lymph node(s)
M (Distant metastasis):
- M0 No distant metastasis
- M1 Distant metastasis
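The correspondence between TNM categories and stage I stated above (IA for T1N0M0, IB for T2N0M0) can be written as a small lookup. The Python sketch below is illustrative only: the function name is hypothetical and it encodes nothing beyond the two mappings given in this description.

```python
# Illustrative sketch only; the function name is hypothetical.
def stage_i_substage(t, n, m):
    """Return 'IA' or 'IB' for a stage I tumor, or None otherwise.

    Arguments are TNM category strings such as 'T1', 'N0' and 'M0'.
    Only the mappings stated in the text are encoded:
    T1N0M0 -> IA and T2N0M0 -> IB.
    """
    if n != "N0" or m != "M0":
        # Nodal involvement or distant metastasis excludes stage I.
        return None
    return {"T1": "IA", "T2": "IB"}.get(t)
```

A call such as `stage_i_substage("T2", "N0", "M0")` selects the IB branch, while any nodal or metastatic involvement returns `None`.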
In a preferred embodiment, NSCLC is of the subtype adenocarcinoma, squamous carcinoma or large cell carcinoma. In a more preferred embodiment, NSCLC is of the subtype adenocarcinoma or squamous carcinoma. In the most preferred embodiment, NSCLC is of the subtype adenocarcinoma.
The term "sample" or "biological sample", as used herein, refers to biological material isolated from a subject. The biological sample contains any biological material suitable for detecting the desired methylation pattern in one or more CpG site(s) and is a material comprising genetic material from the subject. The biological sample can comprise cell and/or non-cell material of the subject. In the present invention, the sample comprises genetic material, e.g., DNA, genomic DNA (gDNA), complementary DNA (cDNA), RNA, heterogeneous nuclear RNA (hnRNA), mRNA, etc., from the subject under study. In a particular embodiment, the genetic material is DNA. In a preferred embodiment the DNA is genomic DNA. In another preferred embodiment, the DNA is circulating DNA. The sample can be isolated from any suitable tissue or biological fluid such as, for example, blood, saliva, plasma, serum, urine, cerebrospinal fluid (CSF), feces, a buccal or buccal-pharyngeal swab, a surgical specimen, a specimen obtained from a biopsy, and a tissue sample embedded in paraffin. Methods for isolating samples are well known to those skilled in the art. In a particular embodiment, said sample comprises a cancer cell, preferably a NSCLC cell. In a preferred embodiment it is a tumour tissue sample or portion thereof. Preferably, said tumour tissue sample is a lung tumour tissue sample, more preferably a pulmonary tumour tissue sample from a subject suffering from NSCLC. Said sample can be obtained by conventional methods, e.g., biopsy, by using methods well known to those of ordinary skill in the related medical arts.
Methods for obtaining the sample from the biopsy include gross apportioning of a mass, or microdissection or other art-known cell-separation methods. Tumour cells can additionally be obtained from fine needle aspiration cytology. In order to simplify conservation and handling of the samples, these can be formalin-fixed and paraffin-embedded or first frozen and then embedded in a cryosolidifiable medium, such as OCT-Compound, through immersion in a highly cryogenic medium that allows for rapid freeze.
The term "methylation" or "DNA methylation", as used herein, refers to a biochemical process involving the addition of a methyl group to the cytosine or adenine DNA nucleotides. DNA methylation at the 5 position of cytosine may have the specific effect of reducing gene expression and has been found in every vertebrate examined. In adult non-gamete cells, DNA methylation typically occurs in a CpG site. The term "CpG site", as used herein, refers to regions of DNA where a cytosine nucleotide occurs next to a guanine nucleotide in the linear sequence of bases along its length. "CpG" is shorthand for "C-phosphate-G", that is, cytosine and guanine separated by only one phosphate; phosphate links any two nucleosides together in DNA. The "CpG" notation is used to distinguish this linear sequence from the CG base-pairing of cytosine and guanine. Cytosines in CpG dinucleotides can be methylated to form 5-methylcytosine.
The term "CpG island", as used herein, relates to a DNA sequence, generally in a window of less than 200 bp, with a GC content greater than 50% and an observed:expected CpG ratio of more than 0.6. In general, CpG islands associated with genes are located near the promoters; nevertheless, it has been noticed that many CpG islands appear within the body of the gene, at the 3' end of the gene and/or in intergenic regions. In a particular embodiment of the invention, CpG islands are near the promoter region of a gene.
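By way of non-limiting illustration, the two sequence criteria named in this definition can be sketched in Python as follows. The function names are hypothetical, and the formula used for the expected number of CpG dinucleotides (number of C multiplied by number of G, divided by the window length) is an assumption taken from common usage, not from the present description.

```python
def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed:expected CpG ratio.

    Assumed formula: (number of CpG * window length) / (number of C * number of G).
    """
    seq = seq.upper()
    n_c, n_g = seq.count("C"), seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (n_c * n_g)


def meets_island_criteria(seq, min_gc=0.5, min_ratio=0.6):
    """True if seq exceeds both thresholds given in the definition above."""
    return gc_content(seq) > min_gc and cpg_obs_exp(seq) > min_ratio
```

The window-length criterion is deliberately left out of the sketch; only the GC content and the observed:expected ratio are evaluated.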
The term "CpG shore", as used herein, relates to the DNA sequences, up to 2kb long, flanking a CpG island and showing a comparatively low GC density.
The term "methylation pattern", as used herein, refers to, but is not limited to, the presence or absence of methylation of one or more nucleotides. Thereby said one or more nucleotides are comprised in a single nucleic acid molecule. Said one or more nucleotides have the ability of being methylated or being non-methylated. The term "methylation status" may also be used, wherein only a single nucleotide is considered. A methylation pattern can be quantified, wherein it is considered over more than one nucleic acid molecule.
The term "hypermethylation" refers to an aberrant methylation pattern or status, wherein one or more nucleotides, preferably C(s) of a CpG site(s), are methylated compared to a control sample or the known methylation status in healthy subjects or subjects not suffering or having suffered from NSCLC (control). In particular, it refers to an increased presence of 5-mCyt at one or a plurality of CpG dinucleotides within a DNA sequence of a test DNA sample, relative to the amount of 5-mCyt found at corresponding CpG dinucleotides within a normal control DNA sample. Hypermethylation as a methylation status/pattern can be determined at one or more CpG site(s). If more than one CpG site is used, hypermethylation can be determined at each site separately or as an average of the CpG sites taken together.
The term "HIST1H4F" refers to the gene Histone cluster 1, H4F, also known as H4/M3, H4FC, H4, H4/A, H4/B, H4/C, H4/D, H4/E, H4/G, H4/H, H4/I, H4/J, H4/K, H4/M, H4/N, H4/0, H4fl, H4FB, H4FD, H4/K, H4/A, H4FE, H4/B, H4FG, H4/C, H4FH, H4/D, H4FI, H4/E3, H4FJ, H4/G, H4FK, H4FM, H4FN, H4F03 or HIST2H4. The gene is located on chromosome 6p22.1. Its sequence reference is NM_003540.3. The protein is located in the nucleus and it is involved in CenH3-containing nucleosome assembly at the centromere, negative regulation of megakaryocyte differentiation, phosphatidylinositol-mediated signaling, and telomere maintenance.
The term "PCDHGB6" refers to the gene Protocadherin gamma subfamily B 6, located on chromosome 5q31.3. Its sequence reference is NM_032011.1. The protein is located in the cytoplasm and it is involved in cell to cell adhesion in the brain and calcium ion binding.
The term "NPBWR1" or "GPR7" refers to the gene Neuropeptides B/W Receptor 1 or G-protein-coupled receptor 7, located on chromosome 8p22-q21.13. Its sequence reference is NM_005285.3. The protein is located in the cell membrane and it is involved in G-protein coupled receptor (opioid receptor) activity and the regulation of processes such as neuroendocrine system regulation, food intake and synaptic transmission.
The term "ALX1" or "CART1" refers to the gene ALX homeobox protein 1, located on chromosome 12q21.31. Its sequence reference is NM_006982.1. The protein is located in the nucleus and it is involved in sequence-specific DNA binding, the regulation of transcription from the RNA polymerase II promoter, brain development, and cartilage condensation.
The term "HOXA9" refers to the gene Homeobox A9, located on chromosome 7p15.2. Its sequence reference is NM_152739. The protein is located in the nucleus and it is involved in sequence-specific DNA binding, transcription regulation, development, cancer as a proto-oncogene, anterior/posterior pattern specification, embryonic forelimb morphogenesis, embryonic skeletal system development, endothelial cell activation, multicellular organismal development, and proximal/distal pattern formation.
Method for selection of a stage I NSCLC patient
In a second aspect, the present invention relates to an in vitro method for selecting a NSCLC patient for an anti-cancer treatment (hereinafter referred to as the "method of the second aspect" or the "selection method of the invention"), comprising determining in a biological sample of said patient the methylation pattern in at least one gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9, and wherein hypermethylation in at least one gene selects the patient for the cancer treatment.
The term "selection" refers to the action of choosing said patient for an anti-cancer treatment, preferably an NSCLC treatment.
The term "treatment", as used herein, refers to a therapeutic treatment, as well as a prophylactic or prevention method, wherein the goal is to prevent or reduce an unwanted physiological change or disease, such as cancer or NSCLC. Beneficial or desired clinical results include, but are not limited to, relief of symptoms, reduction of the duration of the disease, a stabilized pathological state (specifically, not deteriorated), delay of disease progression, improvement of the pathological state and remission (both partial and total), whether detectable or not. The terms "treat" and "treatment" are synonyms of the term "therapy" and are used interchangeably throughout the present description. Treatment can also mean prolonging survival compared to the expected survival if the treatment is not applied.
The terms "anti-cancer treatment" or "NSCLC treatment", as used herein, refer to the use of chemical, physical or biological agents or compounds with antiproliferative, antioncogenic and/or carcinostatic properties which can be used to inhibit tumor growth, proliferation and/or development. "Anti-cancer agents" are such agents or compounds. Examples of anti-cancer agents are alkylating agents, antimetabolites, plant alkaloids and terpenoids, topoisomerase inhibitors and the like. A preferred anti-cancer therapy is chemotherapy or platinum-based chemotherapy.
In a preferred embodiment, the NSCLC patient is a stage I NSCLC patient. In a preferred embodiment, the anti-cancer agent is a platinum-based chemotherapeutic composition. The term "platinum-based chemotherapy", as used herein, refers to any chemotherapy using compounds which contain a platinum atom and which are capable of modifying the DNA, inducing the activation of DNA repair and subsequent cell death. Suitable platinum-based chemotherapeutic compounds include, without limitation, cisplatin, ELOXATIN (oxaliplatin), eptaplatin, lobaplatin, nedaplatin, PARAPLATIN (carboplatin), iproplatin, tetraplatin, DCP, PLD-147, JM118, JM216, JM335, and satraplatin. The term "platinum-based chemotherapy" also includes compositions of two or more chemotherapeutic agents wherein at least one of the components is a platinum-based compound, such as cisplatin-gemcitabine, carboplatin-gemcitabine, cisplatin-gemcitabine-vinorelbine, cisplatin-vinorelbine, cisplatin-etoposide, cisplatin-etoposide-vincristine, cisplatin-paclitaxel, cisplatin-docetaxel, carboplatin-docetaxel and the like.
Other terms used in context with the selection method of the invention have the meaning as defined for the prognostic method of the invention.
Preferred embodiments of both the prognostic and the selection method of the invention
In a preferred embodiment of the method of the first and/or the second aspect of the invention, the methylation pattern of said gene(s) is determined in the promoter region of said gene.
The term "promoter region", as used herein, refers to a region of DNA that initiates transcription of a particular gene. Promoters are located near the genes they transcribe, on the same strand and upstream on the DNA, and can be about 100-1000 base pairs long. In the particular case of the
- HIST1H4F promoter, it starts at position 26239153 (according to TSS1500, as in Infinium HumanMethylation450 BeadChip, Manifest v1.2 or according to UCSC database, as in Genome Reference Consortium Human Build 37 (GRCh37) and UCSC hg19 as released in February 2009) and ends at position 26241021 (according to 1st exon, as in Infinium HumanMethylation450 BeadChip, Manifest v1.2 or according to UCSC database, as in Genome Reference Consortium Human Build 37 (GRCh37) and UCSC hg19 as released in February 2009);
- PCDHGB6 promoter, it starts at position 140786269 (according to 1st exon, as in Infinium HumanMethylation450 BeadChip, Manifest v1.2 or according to UCSC database, as in Genome Reference Consortium Human Build 37 (GRCh37) and UCSC hg19 as released in February 2009) and ends at position 140790187 (according to 1st exon, as in Infinium HumanMethylation450 BeadChip, Manifest v1.2 or according to UCSC database, as in Genome Reference Consortium Human Build 37 (GRCh37) and UCSC hg19 as released in February 2009);
- NPBWR1 promoter, it starts at position 53850967 (according to TSS1500, as in Infinium HumanMethylation450 BeadChip, Manifest v1.2 or according to UCSC database, as in Genome Reference Consortium Human Build 37 (GRCh37) and UCSC hg19 as released in February 2009) and ends at position 53853454 (according to 1st exon, as in Infinium HumanMethylation450 BeadChip, Manifest v1.2 or according to UCSC database, as in Genome Reference Consortium Human Build 37 (GRCh37) and UCSC hg19 as released in February 2009);
- ALX1 promoter, it starts at position 85672535 (according to TSS1500, as in Infinium HumanMethylation450 BeadChip, Manifest v1.2 or according to UCSC database, as in Genome Reference Consortium Human Build 37 (GRCh37) and UCSC hg19 as released in February 2009) and ends at position 85674265 (according to 1st exon, as in Infinium HumanMethylation450 BeadChip, Manifest v1.2 or according to UCSC database, as in Genome Reference Consortium Human Build 37 (GRCh37) and UCSC hg19 as released in February 2009);
- HOXA9 promoter, it starts at position 27200556 (according to TSS1500, as in Infinium HumanMethylation450 BeadChip, Manifest v1.2 or according to UCSC database, as in Genome Reference Consortium Human Build 37 (GRCh37) and UCSC hg19 as released in February 2009) and ends at position 27203460 (according to 1st exon, as in Infinium HumanMethylation450 BeadChip, Manifest v1.2 or according to UCSC database, as in Genome Reference Consortium Human Build 37 (GRCh37) and UCSC hg19 as released in February 2009).
In an embodiment of the invention, the methylation pattern of one or more of the genes HIST1H4F, PCDHGB6, NPBWR1, ALX1 and HOXA9 is determined at a CpG site of said gene(s). In a particular embodiment, the CpG site is located at a CpG island of said gene. In a particular alternative embodiment, the CpG site is located at a CpG shore of said gene.
According to the present invention, the methylation pattern of one or more of the genes HIST1H4F, PCDHGB6, NPBWR1, ALX1 and HOXA9 is determined at at least two CpG sites, at least three CpG sites, at least four CpG sites, at least five CpG sites, at least six CpG sites, at least seven CpG sites, at least eight CpG sites, at least nine CpG sites, at least ten CpG sites, at least twelve CpG sites, or at least fifteen CpG sites of said gene(s), wherein the methylation pattern of said gene(s) is determined as the mean value of said CpG sites of said gene(s). Said CpG sites may be located at a CpG island, at a CpG shore or both at a CpG island and a CpG shore.
In a particular alternative embodiment of the invention, the methylation pattern of one or more of the genes HIST1H4F, PCDHGB6, NPBWR1, ALX1 and HOXA9 is determined at one or more, preferably all, CpG sites located at a CpG island of said gene(s). Where methylation is determined at more than one CpG site, the methylation pattern of said gene(s) can be determined as the mean value of said CpG sites located at the CpG island of said gene(s).
In a particular alternative embodiment of the invention, the methylation pattern of one or more of the genes HIST1H4F, PCDHGB6, NPBWR1, ALX1 and HOXA9 is determined at one or more, preferably all, CpG sites located at the N-shore, at the S-shore or at both the N- and the S-shores of said gene(s). Where methylation is determined at more than one CpG site, the methylation pattern of said gene(s) can be determined as the mean value of said CpG sites located at the CpG shore or shores of said gene(s).
In another particular embodiment, the methylation pattern of one or more of the genes HIST1H4F, PCDHGB6, NPBWR1, ALX1 and HOXA9 is determined at one or more, preferably all, CpG sites located at the CpG island and at the N-shore of said gene(s). Where methylation is determined at more than one CpG site, the methylation pattern of said gene(s) can be determined as the mean value of said CpG sites located at the CpG island and the N-shore of said gene(s).
In another particular embodiment, the methylation pattern of one or more of the genes HIST1H4F, PCDHGB6, NPBWR1, ALX1 and HOXA9 is determined at one or more, preferably all, CpG sites located at the CpG island and at the S-shore of said gene(s). Where methylation is determined at more than one CpG site, the methylation pattern of said gene(s) can be determined as the mean value of said CpG sites located at the CpG island and the S-shore of said gene(s).
In another particular embodiment, the methylation pattern of one or more of the genes HIST1H4F, PCDHGB6, NPBWR1, ALX1 and HOXA9 is determined at one or more, preferably all, CpG sites located at the CpG island and at the N-shore and the S-shore of said gene(s). Where methylation is determined at more than one CpG site, the methylation pattern of said gene(s) can be determined as the mean value of said CpG sites located at the CpG island, the N-shore and the S-shore of said gene(s).
The location of the N shore, the CpG island, the S shore and the promoter for each of the genes is shown in Table 1.
Gene       Island start   Island end   Shore/island/shore start   Shore/island/shore end   Start promoter (TSS1500)   End promoter (1st exon)
NPBWR1     53851701       53854426     53849701                   53856426                 53850967                   53853454
HIST1H4F   26240697       26240951     26238697                   26242951                 26239153                   26241021
PCDHGB6    140787447      140788044    140785447                  140790044                140786269                  140790187
ALX1       85673878       85674700     85671878                   85676700                 85672535                   85674265
HOXA9      27203915       27206462     27201915                   27208462                 27200556                   27203460
Table 1: Start and end positions of the CpG island, of the shores flanking the CpG island and of the promoter regions in the NPBWR1, HIST1H4F, PCDHGB6, ALX1 and HOXA9 genes.
Island start and island end indicate, respectively, the starting and ending positions of the CpG island by reference to the chromosome numbering according to Infinium HumanMethylation450 BeadChip, Manifest v1.2 or according to UCSC database, as in Genome Reference Consortium Human Build 37 (GRCh37) and UCSC hg19 as released in February 2009.
Shore/Island/Shore start indicates the starting position of the shore 5' to the CpG island by reference to the chromosome numbering as indicated above (Infinium/UCSC).
Shore/Island/Shore end indicates the end position of the shore located 3' with respect to the CpG island by reference to the chromosome numbering as indicated above (Infinium/UCSC). The end position of the shore located 5' of the CpG island is the position adjacent in 5' to the island start position. The start position of the shore located 3' of the CpG island is the position adjacent in 3' to the island end position.
Start promoter (TSS1500) indicates the start position of the promoter region by reference to the chromosome numbering as indicated above (Infinium/UCSC).
End promoter (1st exon) indicates the last position of the first exon of the gene, which is adjacent to the last position of the promoter, by reference to the chromosome numbering as indicated above (Infinium/UCSC).
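By way of non-limiting illustration, the coordinates of Table 1 can be used to decide whether a given CpG position lies in the CpG island, in the shore located 5' or 3' of the island, and/or in the promoter region. The following Python sketch copies the coordinates from Table 1; the dictionary layout, the function name and the treatment of all boundaries as inclusive are assumptions made only for this example.

```python
# Coordinates copied from Table 1 (GRCh37/hg19). The layout is illustrative.
TABLE_1 = {
    "NPBWR1": {"island": (53851701, 53854426),
               "shores": (53849701, 53856426),
               "promoter": (53850967, 53853454)},
    "HIST1H4F": {"island": (26240697, 26240951),
                 "shores": (26238697, 26242951),
                 "promoter": (26239153, 26241021)},
    "PCDHGB6": {"island": (140787447, 140788044),
                "shores": (140785447, 140790044),
                "promoter": (140786269, 140790187)},
    "ALX1": {"island": (85673878, 85674700),
             "shores": (85671878, 85676700),
             "promoter": (85672535, 85674265)},
    "HOXA9": {"island": (27203915, 27206462),
              "shores": (27201915, 27208462),
              "promoter": (27200556, 27203460)},
}


def classify_position(gene, pos):
    """Return the Table 1 regions of `gene` that contain position `pos`."""
    rec = TABLE_1[gene]
    island_start, island_end = rec["island"]
    shore_start, shore_end = rec["shores"]
    prom_start, prom_end = rec["promoter"]
    labels = []
    if island_start <= pos <= island_end:
        labels.append("island")
    elif shore_start <= pos < island_start:
        # The 5' shore ends at the position adjacent to the island start.
        labels.append("5' shore")
    elif island_end < pos <= shore_end:
        # The 3' shore starts at the position adjacent to the island end.
        labels.append("3' shore")
    if prom_start <= pos <= prom_end:  # inclusive ends are an assumption
        labels.append("promoter")
    return labels
```

For example, position 26240782 of HIST1H4F falls inside both the island and the promoter interval of Table 1, whereas position 53851156 of NPBWR1 falls in the shore located 5' of the island and in the promoter interval.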
In a preferred embodiment, the methods according to the invention involve determining methylation at some specific CpG site(s) which are significantly associated with a bad prognosis and selection of a stage I NSCLC in a subject. Accordingly, in another preferred embodiment of the method of the first and/or the second aspect of the invention, the methylation pattern of said gene(s) is determined at the CpG site(s) located at positions
- 26240782 (cg10723962), 26240528 (cg22723502), 26240519 (cg12260798), 26240762, 26240771, 26240774, 26240776, 26240779, 26240789 and/or 26240796 in HIST1H4F,
- 140787507 (cg18507379), 140787504 (cg18617005), 140787474, 140787487, 140787491, 140787504 and/or 140787513 in PCDHGB6,
- 53851156 (cg26205771), 53852422 (cg07770968), 53851151, 53851160 and/or 53851189 in NPBWR1,
- 85673270 (cg14996220), 85673276, 85673296, 85673303, 85673325, 85673330, 85673344, 85673347 and/or 85673353 in ALX1, and/or
- 27205217 (cg16104915), 27205230 (cg12600174), 27205187, 27205200, 27205204, 27205206, 27205211, 27205219 and/or 27205224 (cg18447772) in HOXA9,
wherein the indicated positions correspond to the C nucleotide of a CpG site according to MAPINFO/Illumina Infinium HumanMethylation450 BeadChip, Manifest v1.2 or according to UCSC database, as in Genome Reference Consortium Human Build 37 (GRCh37) and UCSC hg19 as released in February 2009; the identifiers in brackets are indicated according to IlmnID/Illumina.
In another embodiment, the method of the first and/or the second aspect of the invention comprise determining the mean methylation level calculated based on the above positions.
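By way of non-limiting illustration, a mean methylation level over a set of listed positions can be computed as in the following Python sketch. The positions are those listed above for HOXA9; the function name, the representation of methylation levels as fractions between 0 and 1, and the example values are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean

# Positions listed above for HOXA9 (GRCh37/hg19).
HOXA9_SITES = [27205217, 27205230, 27205187, 27205200, 27205204,
               27205206, 27205211, 27205219, 27205224]


def mean_methylation(levels, sites):
    """Mean methylation over the sites in `sites` that were measured.

    `levels` maps a CpG position to its methylation level (assumed 0..1).
    """
    measured = [levels[p] for p in sites if p in levels]
    if not measured:
        raise ValueError("no measured CpG site among the requested positions")
    return mean(measured)


# Hypothetical measurements for three of the listed positions.
example_levels = {27205217: 0.50, 27205230: 0.75, 27205187: 0.25}
gene_level = mean_methylation(example_levels, HOXA9_SITES)
```

In this sketch, positions without a measurement are simply skipped; whether a minimum number of measured sites should be required is a design choice that the sketch leaves open.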
The term "determination of the methylation pattern in a CpG site", as used herein, refers to the determination of the methylation status of a particular CpG site. The determination of the methylation pattern in a CpG site can be performed by means of multiple processes known by the person skilled in the art.
In some embodiments, for example, when the determination of the methylation pattern of a particular CpG site is carried out in a sample from whole blood, the sample may be used directly for the determination of said particular CpG site. In other embodiments, the nucleic acid is extracted from cells which are present in a biological fluid, a tissue or a cell as an initial step, and, in such cases, the total nucleic acid extracted from said samples would represent the working material suitable for subsequent analysis. Isolating the nucleic acid of the sample can be performed by standard methods known by the person skilled in the art. Said methods can be found, for example, in Sambrook et al., 2001, "Molecular cloning: a Laboratory Manual", 3rd ed., Cold Spring Harbor Laboratory Press, N.Y., Vol. 1-3 and in the commonly used QIAamp DNA mini kit protocol by Qiagen.
After isolating and amplifying (if necessary) the nucleic acid, the methylation pattern of one or more CpG site(s) is determined. Those skilled in the art will readily recognize that the analysis of the methylation pattern present in one or several of the CpG sites disclosed herein in a subject's nucleic acid can be done by any method or technique capable of measuring the methylation pattern present in a CpG site. For instance, one may determine the methylation pattern in the first method of the invention by a method selected from the group consisting of Methylation-Specific PCR (MSP), an enrichment-based method (e.g. MeDIP, MBD-seq and MethylCap), bisulfite sequencing and bisulfite-based method (e.g. RRBS, bisulfite sequencing, Infinium, GoldenGate, COBRA, MSP, MethyLight) and a restriction-digestion method (e.g., MRE-seq, or HELP assay), ChIP-on-chip assay, or differential-conversion, differential restriction, differential weight of the DNA methylated CpG site(s).
When performing bisulfite sequencing or a bisulfite-based method, the genomic DNA sample is chemically treated in such a way that all of the unmethylated cytosine bases are modified to uracil bases, or another base which is dissimilar to cytosine in terms of base pairing behaviour, while the 5-methylcytosine bases remain unchanged.
The term "modify", as used herein, means the conversion of an unmethylated cytosine to another nucleotide which will distinguish the unmethylated from the methylated cytosine. The conversion of unmethylated, but not methylated, cytosine bases within the DNA sample is conducted with a converting agent. The term "converting agent" or "converting reagent", as used herein, relates to a reagent capable of converting an unmethylated cytosine to uracil or to another base that is detectably dissimilar to cytosine in terms of hybridization properties. The converting agent is preferably a bisulfite such as disulfite or hydrogen sulfite. However, other agents that similarly modify unmethylated cytosine, but not methylated cytosine, can also be used in the method of the invention, such as hydrogen sulfite. The reaction is performed according to standard procedures (Frommer et al., 1992, Proc Natl Acad Sci USA 89:1827-31; Olek, 1996, Nucleic Acids Res 24:5064-6; EP 1394172). It is also possible to conduct the conversion enzymatically, e.g. by use of methylation-specific cytidine deaminases. Preferably, the sample has been treated with a reagent capable of converting an unmethylated cytosine to uracil or to another base that is detectably dissimilar to cytosine in terms of hybridization properties. More preferably, the reagent used for modifying unmethylated cytosine is sodium bisulfite.
Once the DNA has been treated with a bisulfite, the region containing the one or more CpG site(s) is amplified using primers that allow the unmethylated sequence (wherein the cytosine of the CpG site is converted into uracil) to be distinguished from the methylated sequence (wherein the cytosine of the CpG site remains cytosine). Many amplification methods rely on an enzymatic chain reaction such as, for example, a polymerase chain reaction (PCR), Ligase Chain Reaction (LCR), Polymerase Ligase Chain Reaction, Gap-LCR, Repair Chain Reaction, 3SR, and NASBA. Further, there are strand displacement amplification (SDA), transcription mediated amplification (TMA), and Qβ-amplification, etc.; this list is merely illustrative and in no way limiting. Methods for amplifying nucleic acid are described in Sambrook et al., 2001 (cited supra). A particularly preferred amplification method according to the invention is the methylation specific PCR method (MSP) disclosed in US 5,786,146, which combines bisulfite treatment and allele-specific PCR (see e.g. US 5,137,806, US 5,595,890, US 5,639,611). Uracil is recognized as a thymine by Taq polymerase and therefore, upon PCR, the resultant product contains cytosine only at the position where 5-methylcytosine occurs in the starting template DNA.
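By way of non-limiting illustration, the effect of bisulfite treatment followed by PCR on a single DNA strand can be sketched in Python as follows. The function name and the use of 0-based indices to mark methylated cytosines are hypothetical; the sketch only models the base changes described above and not the chemistry or the second strand.

```python
def bisulfite_convert(seq, methylated_positions):
    """Return the strand as read after bisulfite treatment and PCR.

    Unmethylated C is converted to U and amplified as T; C at the 0-based
    indices in `methylated_positions` (5-methylcytosine) remains C.
    """
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


# The CpG cytosine at index 1 is methylated; the other cytosines are not.
methylated_read = bisulfite_convert("ACGTCCGA", {1})
unmethylated_read = bisulfite_convert("ACGTCCGA", set())
```

The two reads differ only at the methylated position, which is the difference that methylation-specific primers or sequencing exploit.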
Details and particulars of the oligonucleotides capable of specifically hybridizing to a bisulfite-treated polynucleotide comprising at least one sequence of a CpG site of above-mentioned genes are described further below (see nucleic acids or kits of the invention).
The amplification products are then detected according to standard procedures in the art. The amplified nucleic acid may be determined or detected by standard analytical methods known to the person skilled in the art and described e.g. in Sambrook et al., 2001 (cited supra). There may also be further purification steps before the target nucleic acid is detected, e.g. a precipitation step. The detection methods may include but are not limited to the binding or intercalating of specific dyes such as ethidium bromide, which intercalates into the double-stranded DNA and changes its fluorescence thereafter. The purified nucleic acids may also be separated by electrophoretic methods, optionally after a restriction digest, and visualized thereafter. There are also probe-based assays which exploit the oligonucleotide hybridization to specific sequences and subsequent detection of the hybrid. It is also possible to sequence the target nucleic acid after further steps known to the expert in the field. Other methods apply a diversity of nucleic acid sequences to a silicon chip to which specific probes are bound and yield a signal when complementary sequences bind.
In a particularly preferred embodiment of the invention, the methylation pattern or status is determined by sequencing, preferably pyrosequencing.
Alternatively, the nucleic acid amplification is carried out by real time PCR and real time probes are used to detect the presence of the extension product. Several versions of real time probes are known, e.g. LightCycler, TaqMan, Scorpion, Sunrise, Molecular Beacon or Eclipse probes. Details concerning structure or detection of these probes are known in the state of the art.
Also, the methylation pattern of the nucleic acid can be confirmed by restriction enzyme digestion and Southern blot analysis. Examples of methylation sensitive restriction endonucleases which can be used to detect CpG methylation include SmaI, SacII, EagI, MspI, HpaII, BstUI and BssHII, for example.
In another embodiment of the method of the first and/or the second aspect of the invention, the methylation pattern obtained is compared with the methylation pattern of a reference sample.
The term "reference sample", as used herein, means a sample obtained from a pool of healthy subjects which do not have a disease state or particular phenotype of NSCLC. It may also mean a sample obtained from non-tumoral adjacent tissue obtained from a subject suffering from NSCLC. The reference sample is a biological sample as defined above (wherein tumour cells are normal corresponding cells and biopsies are equivalent and contain no tumour material) from subjects which do not suffer from NSCLC or which do not have a history of NSCLC. Preferably, the reference sample is a sample of subjects matched on age and body mass index to the subject analysed.
According to the present invention, the level of methylation of one or more CpG site(s) is increased (i.e. hypermethylation) when the level of methylation of said one or more CpG site(s) in a sample is higher than in the reference sample. The level of methylation of one or more CpG site(s) is considered to be higher than in the reference sample when it is at least 1.5%, at least 2%, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 100%, at least 110%, at least 120%, at least 130%, at least 140%, at least 150% or more higher than in the reference sample.
In another embodiment, the level of methylation of one or more CpG site(s) is increased (i.e. hypermethylation) when the mean methylation value of a number of CpG site(s) in a sample is higher than in the reference sample. The mean level of methylation of a number of CpG site(s) is considered to be higher than in the reference sample when it is at least 1.5%, at least 2%, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 100%, at least 110%, at least 120%, at least 130%, at least 140%, at least 150% or more higher than in the reference sample.
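By way of non-limiting illustration, the comparison with the reference sample can be sketched in Python as follows. The sketch assumes that "at least X% higher" is meant as a relative increase over the reference level; the description does not state whether a relative increase or a difference in percentage points is intended, so this reading, like the function names and example values, is an assumption.

```python
def percent_increase(sample_level, reference_level):
    """Relative increase of the sample over the reference, in percent."""
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    return (sample_level - reference_level) / reference_level * 100.0


def is_hypermethylated(sample_level, reference_level, min_increase_pct=1.5):
    """True if the sample exceeds the reference by at least the threshold.

    The default of 1.5% is the lowest threshold listed above.
    """
    return percent_increase(sample_level, reference_level) >= min_increase_pct


# Hypothetical mean methylation levels (fractions between 0 and 1).
increase = percent_increase(0.75, 0.5)
```

The same functions apply whether the levels are those of a single CpG site or the mean over several CpG sites.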
In another embodiment of the method of the first and/or the second aspect of the invention, the methylation pattern is determined in at least two genes selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9. In further embodiments, the methylation pattern is determined in at least three, four or five of these genes.
While this is preferred, it is not required that all of the genes determined (i.e. at least two, three, four or five) be hypermethylated in order to indicate a bad prognosis or select the patient. It is also envisaged that hypermethylation in at least one, two, three or four of the genes determined can indicate a bad prognosis or select the patient.
In a particularly preferred embodiment of the method of the first and/or the second aspect of the invention, the methylation pattern is determined in the genes HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9, and the hypermethylation in at least two of these genes is indicative of a bad prognosis or selects the patient for the cancer treatment. It has been shown by the inventors that this embodiment (also referred to herein as the "bimodal signature") allows a particularly accurate and reliable indication and selection.
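By way of non-limiting illustration, the decision rule of this embodiment (hypermethylation in at least two of the five genes) can be sketched in Python as follows. The function name and the input format, a mapping from gene name to a hypermethylation call, are hypothetical.

```python
GENES = ("HIST1H4F", "PCDHGB6", "NPBWR1", "ALX1", "HOXA9")


def signature_positive(hypermethylated, min_genes=2):
    """True if at least `min_genes` of the five genes are hypermethylated.

    `hypermethylated` maps a gene name to True/False; genes that are
    missing from the mapping are counted as not hypermethylated.
    """
    n_hyper = sum(1 for gene in GENES if hypermethylated.get(gene, False))
    return n_hyper >= min_genes


# Hypothetical calls for one sample.
calls = {"HIST1H4F": False, "PCDHGB6": False, "NPBWR1": False,
         "ALX1": True, "HOXA9": True}
result = signature_positive(calls)
```

Setting `min_genes` to another value covers the alternative embodiments in which hypermethylation in at least one, three, four or five genes is required.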
Techniques for detection of DNA methylation are known in the art and include, without limitation, bisulfite modification based technologies [including bisulfite sequencing, pyrosequencing, ConLight-MSP (Conversion-specific Detection of DNA Methylation Using Real-time Polymerase Chain Reaction), SMART MSP (Sensitive Melting Analysis after Real Time-Methylation Specific PCR), Matrix-assisted laser desorption/ionization-time of flight (Mass Array Epityper Sequenom), HPLC (High performance liquid chromatography), methyl-beaming and COBRA (Combined Bisulfite Restriction Analysis)], enzymatic digestion based methodologies [including reduced representation bisulfite sequencing (RRBS), HELP assay (HpaII tiny fragment Enrichment by Ligation-mediated PCR) and MethDet (methylation detection)], affinity-enriched based technologies [including MeDIP (Methylated DNA immunoprecipitation), Methyl-Cap and methylation binding domain assays] and high throughput analysis [including arrays and Whole Genome Bisulfite Sequencing].
As indicated above, in another embodiment of the method of the first and/or the second aspect of the invention, the methylation pattern is determined using pyrosequencing. The term "pyrosequencing" relates to a method of DNA sequencing based on the "sequencing by synthesis" principle. It differs from Sanger sequencing in that it relies on the detection of pyrophosphate release on nucleotide incorporation, rather than chain termination. The desired DNA sequence is able to be determined by light emitted upon incorporation of the next complementary nucleotide by the fact that only one out of four of the possible A/T/C/G nucleotides are added and available at a time so that only one letter can be incorporated on the single stranded template (which is the sequence to be determined). The intensity of the light determines if there are more than one of these "letters" in a row. The previous nucleotide letter (one out of four possible dNTP) is degraded before the next nucleotide letter is added for synthesis: allowing for the possible revealing of the next nucleotide(s) via the resulting intensity of light (if the nucleotide added was the next complementary letter in the sequence). This process is repeated with each of the four letters until the DNA sequence of the single stranded template is determined. Methods for pyrosequencing are well known in the art and described, for example, in Nyren, P. (2007). "The History of Pyrosequencing". Methods Mol Biology 373: 1-14.
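By way of non-limiting illustration, the principle described above (one nucleotide dispensed at a time, with a light signal proportional to the number of identical bases incorporated) can be simulated in Python as follows. The sketch is idealised: it ignores the enzymatic steps and signal noise, and the function names, the cyclic dispensation order "ACGT" and the cycle limit are assumptions made only for this example.

```python
def pyrogram(synthesized, dispensation="ACGT", max_cycles=100):
    """Simulate idealised pyrosequencing signals.

    `synthesized` is the strand built on the single-stranded template.
    Returns one (nucleotide, intensity) pair per dispensation, where the
    intensity equals the number of bases incorporated in that step.
    """
    signals = []
    pos = 0
    for _ in range(max_cycles):
        for nt in dispensation:
            if pos >= len(synthesized):
                return signals
            run = 0
            # A run of identical bases is incorporated in a single step.
            while pos < len(synthesized) and synthesized[pos] == nt:
                run += 1
                pos += 1
            signals.append((nt, run))
    return signals


def read_from_pyrogram(signals):
    """Rebuild the sequence from the (nucleotide, intensity) pairs."""
    return "".join(nt * n for nt, n in signals)
```

Dispensations with intensity 0 correspond to nucleotides that were not complementary to the next template base and therefore produced no light.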
Anti-cancer agent of the invention
In a third aspect, the present invention relates to an anti-cancer agent for the treatment of a NSCLC patient (hereinafter referred to as the "anti-cancer agent of the invention" or the "anti-cancer agent of the third aspect of the invention"), wherein the patient is selected using the method of the second aspect.
Preferably, the NSCLC patient is a stage I NSCLC patient.
Terms used in context with the anti-cancer agent of the invention, in particular the terms "treatment", "anti-cancer treatment" and "anti-cancer agent", have the meaning as defined for the prognostic and the selection method of the invention.
Kits of the invention
In a fourth aspect, the present invention relates to a kit (hereinafter referred to as the "kit of the invention" or the "kit of the fourth aspect of the invention") comprising
(i) at least one CpG site-binding oligonucleotide capable of specifically hybridizing to a sequence of a CpG site in a gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9 in a methylation-specific manner; or
(ii) at least one CpG site-flanking oligonucleotide capable of specifically hybridizing to an upstream or downstream sequence of a CpG site in a gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9, wherein unmethylated cytosine in at least part of said gene has been converted to uracil or to another base that is detectably dissimilar to cytosine in terms of hybridization properties.
In a preferred embodiment, said CpG site is in the promoter region of said gene(s). In a more preferred embodiment, said CpG site is selected from the group as defined in the methods of the first and second aspects of the invention.
Preferably, said part of said gene comprises at least the gene sequence between and including the upstream or downstream sequence and a CpG site.
The term "oligonucleotide", as used herein, refers to a single-stranded DNA or RNA molecule, with up to 30, 25, 20, 19, 18, 17, 16, 15, 14 or 13 bases in length (upper limit). The oligonucleotides of the invention are preferably DNA or RNA molecules of at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, or 13 bases in length (lower limit). Ranges of base lengths can be combined in all different manners using the afore-mentioned lower and upper limits, for example at least 2 and up to 30 bases, at least 8 and up to 15 bases, at least 5 and up to 15 bases or at least 8 and up to 18 bases.
The term "capable of hybridizing" or "capable of specifically hybridizing", as used herein, refers to the capacity of an oligonucleotide or polynucleotide to specifically recognize the sequence of a CpG site. As used herein, "hybridization" is the process of combining two complementary single-stranded nucleic acid molecules, or molecules with a high degree of similarity, and allowing them to form a single double-stranded molecule through base pairing. Normally, the hybridization occurs under high stringency conditions or moderately stringent conditions.
As known in the art, the "similarity" between two nucleic acid molecules is determined by comparing the nucleotide sequence of one molecule to the nucleotide sequence of a second molecule. Variants according to the present invention include nucleotide sequences that are at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95% similar or identical to the sequence of the CpG site. The degree of identity between two nucleic acid molecules is determined using computer algorithms and methods that are widely known to persons skilled in the art. The identity between two nucleotide sequences is preferably determined by using the BLASTN algorithm (BLAST Manual, Altschul et al., 1990, NCBI NLM NIH Bethesda, Md. 20894; Altschul, S., et al., J. Mol. Biol. 215:403-10).
"Stringency" of hybridization reactions is readily determinable by one of ordinary skill in the art, and generally is an empirical calculation dependent upon probe length, washing temperature, and salt concentration. In general, longer probes require higher temperatures for proper annealing, while shorter probes need lower temperatures. Hybridization generally depends on the ability of denatured DNA to reanneal when complementary strands are present in an environment below their melting temperature. The higher the degree of desired homology between the probe and hybridizable sequence, the higher the relative temperature which can be used. As a result, it follows that higher relative temperatures would tend to make the reaction conditions more stringent, while lower temperatures less so. For additional details and explanation of stringency of hybridization reactions, see Ausubel et al., Current Protocols in Molecular Biology, Wiley Interscience Publishers, (1995).
The term "stringent conditions" or "high stringency conditions", as used herein, typically: (1) employ low ionic strength and high temperature for washing, for example 0.015 M sodium chloride/0.0015 M sodium citrate/0.1% sodium dodecyl sulfate at 50°C; (2) employ during hybridization a denaturing agent, such as formamide, for example, 50% (v/v) formamide with 0.1% bovine serum albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50 mM sodium phosphate buffer at pH 6.5 with 750 mM sodium chloride, 75 mM sodium citrate at 42°C; or (3) employ 50% formamide, 5xSSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5x Denhardt's solution, sonicated salmon sperm DNA (50 μg/ml), 0.1% SDS, and 10% dextran sulfate at 42°C, with washes at 42°C in 0.2xSSC (sodium chloride/sodium citrate) and 50% formamide, followed by a high-stringency wash consisting of 0.1xSSC containing EDTA at 55°C.
"Moderately stringent conditions" may be identified as described by Sambrook et al., Molecular Cloning: A Laboratory Manual, New York: Cold Spring Harbor Press, 1989, and include the use of washing solution and hybridization conditions (e.g., temperature, ionic strength and % SDS) less stringent than those described above. An example of moderately stringent conditions is overnight incubation at 37°C in a solution comprising: 20% formamide, 5xSSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20 mg/ml denatured sheared salmon sperm DNA, followed by washing the filters in 1xSSC at about 37-50°C. The skilled artisan will recognize how to adjust the temperature, ionic strength, etc. as necessary to accommodate factors such as probe length and the like.
The term "CpG site-binding" oligonucleotide refers to an oligonucleotide capable of specifically hybridizing to a nucleotide sequence, wherein the oligonucleotide covers the CpG site of the nucleotide sequence. "CpG site-binding" oligonucleotides can be used, for example, as probes for the determination of the methylation status of a bisulfite-treated sequence containing a CpG site (binding or non-binding indicates methylation) or as PCR primers for the determination of the methylation status of a bisulfite-treated sequence containing a CpG site (presence/absence/amount of an amplificate indicates the degree of methylation) in methods for determining a methylation pattern such as the ones described above.
The term "CpG site-flanking" oligonucleotide refers to an oligonucleotide capable of specifically hybridizing to a nucleotide sequence, wherein the target CpG site(s) of the nucleotide sequence is/are not covered by the oligonucleotide. In other words, the oligonucleotide hybridizes on one side of one or more CpG site(s) (upstream or downstream), but not necessarily directly adjacent to the one or more CpG site(s), i.e. there may be one or more nucleotides between the one or more CpG site(s) and the oligonucleotide hybridizing site. In particular, there may be at least 1 nucleotide, at least 2 nucleotides, at least 3 nucleotides, at least 4 nucleotides, at least 5 nucleotides, at least 6 nucleotides, at least 7 nucleotides, at least 8 nucleotides, at least 9 nucleotides, at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 25 nucleotides, at least 30 nucleotides, at least 35 nucleotides, at least 40 nucleotides, at least 45 nucleotides, at least 50 nucleotides, at least 60 nucleotides, at least 70 nucleotides, at least 80 nucleotides, at least 90 nucleotides or at least 100 nucleotides between the one or more CpG site(s) and the oligonucleotide hybridizing site. CpG site-flanking oligonucleotides can be used, for example, as PCR primers to amplify a bisulfite-treated sequence containing one or more CpG site(s) or as sequencing primers for the determination of the methylation status of a bisulfite-treated sequence containing one or more CpG site(s) in methods for determining a methylation pattern such as the ones described above. In case of PCR primers, it is preferred that the kit comprises a pair of CpG site-flanking oligonucleotides, one of which hybridizes upstream and the other downstream of the CpG site(s) of interest (i.e. a pair of forward and reverse primers).
Preferred examples of sets of three CpG site-flanking oligonucleotides per gene, PCR primer pairs or single CpG site-flanking oligonucleotides (e.g. sequencing primers) are:
HIST1H4F:
PCR forward: AGGTAAAGGTGGTAAAGGTTTAG (SEQ ID NO: 1)
PCR reverse: AACAACATCCATTACAATAACTATCT (SEQ ID NO: 2)
Sequencing: CTCCTCATAAATAAAACCC (SEQ ID NO: 3)
These primers allow amplifying/sequencing 8 CpG sites, among them cg10723962, over 253 bp.
PCDHGB6:
PCR forward: AGTAAAATTTGAGGGGGATGTAT (SEQ ID NO: 4)
PCR reverse: ATTAACTACRCAAAAAATCCCAAACCAA (SEQ ID NO: 5)
Sequencing: GGAGATYGAATTTAAAATGAAAAA (SEQ ID NO: 6)
These primers allow amplifying/sequencing 6 CpG sites, among them cg18507379, over 188 bp.
NPBWR1:
PCR forward: GGGAAATAGYGATAGGGGAGTTTAAGATTG (SEQ ID NO: 7)
PCR reverse: ACTCTCTACTTATCCACACACTTAC (SEQ ID NO: 8)
Sequencing: TTGTTAGTTTTTTTTTGGTTATTT (SEQ ID NO: 9)
These primers allow amplifying/sequencing 4 CpG sites, among them cg26205771, over 125 bp.
ALX1:
PCR forward: GGGAATTGGTTGGTATTAGTATAATGG (SEQ ID NO: 10)
PCR reverse: AACCCAAAAAACCAAATACATTAAC (SEQ ID NO: 11)
Sequencing: ATTTTAGAGAAAAAGAAAAGGTTA (SEQ ID NO: 12)
These primers allow amplifying/sequencing 9 CpG sites, among them cg14996220, over 234 bp.
HOXA9:
PCR forward: GTAGATTTTATGTAATAATTTGGTGGTAT (SEQ ID NO: 13)
PCR reverse: CCCTTTACATAAAAACATATAACTTTTACT (SEQ ID NO: 14)
Sequencing: GGGGAAGTATAGTTATTTAATAAG (SEQ ID NO: 15)
These primers allow amplifying/sequencing 9 CpG sites, among them cg16104915 and cg12600174, over 182 bp.
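Some of the primers listed above contain the degenerate IUPAC codes Y (C or T) and R (A or G), which allow a single oligonucleotide design to match bisulfite-converted templates regardless of the methylation status of a CpG covered by the primer. The following Python sketch, given purely as an illustration, expands such an oligonucleotide into the concrete sequences it represents; the helper name `expand_degenerate` is hypothetical, and only the codes that occur in the primers above are handled.

```python
from itertools import product

# Subset of the IUPAC nucleotide codes; only those used in SEQ ID NO: 1-15.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT"}

def expand_degenerate(oligo):
    """Return every concrete sequence encoded by a degenerate oligonucleotide."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in oligo))]

# SEQ ID NO: 6 (PCDHGB6 sequencing primer) contains a single Y.
for variant in expand_degenerate("GGAGATYGAATTTAAAATGAAAAA"):
    print(variant)
```

An oligonucleotide with n degenerate positions of this kind expands to 2^n concrete sequences, so the primers above correspond to at most two variants each.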
The term "methylation-specific manner", as used herein, refers to an oligonucleotide that is either capable of hybridizing to a polynucleotide in which unmethylated cytosine has been converted to uracil or to another base that is detectably dissimilar to cytosine in terms of hybridization properties and which comprises at least one sequence of a CpG site when said CpG site is methylated, or to the same polynucleotide when said CpG site is unmethylated, but not to both.
In a preferred embodiment, the kit of the invention comprises (i) at least one first CpG site-binding oligonucleotide capable of specifically hybridizing to a polynucleotide in which unmethylated cytosine has been converted to uracil or to another base that is detectably dissimilar to cytosine in terms of hybridization properties and which comprises at least one sequence of a CpG site according to the kit of the fourth aspect of the invention when said CpG site is methylated, and
(ii) at least one second CpG site-binding oligonucleotide capable of specifically hybridizing to the same polynucleotide when said CpG site is unmethylated.
Suitable kits may include various reagents for use in accordance with the present invention in suitable containers and packaging materials, including tubes, vials, and shrink-wrapped and blow-moulded packages. Additionally, the kits of the invention can contain instructions for the simultaneous, sequential or separate use of the different components which are in the kit. Said instructions can be in the form of printed material or in the form of an electronic support capable of storing instructions such that they can be read by a subject, such as electronic storage media (magnetic disks, tapes and the like), optical media (CD-ROM, DVD) and the like. Additionally or alternatively, the media can contain Internet addresses that provide said instructions.
Materials suitable for inclusion in an exemplary kit in accordance with the present invention comprise one or more of the following: reagents required to discriminate between the various possible alleles in the sequence domains amplified by PCR or non-PCR amplification (e.g., restriction endonucleases, oligonucleotides that anneal preferentially to methylated or to unmethylated CpG sites, including those modified to contain enzymes or fluorescent chemical groups that amplify the signal from the oligonucleotide and make discrimination of methylated or unmethylated CpG sites more robust); or reagents required to physically separate products derived from the various amplified regions (e.g. agarose or polyacrylamide and a buffer to be used in electrophoresis, HPLC columns, SSCP gels, formamide gels or a matrix support for MALDI-TOF).
In another particular embodiment, the kit of the invention further comprises one or more reagents for converting an unmethylated cytosine to uracil or to another base that is detectably dissimilar to cytosine in terms of hybridization properties (i.e. a converting agent as defined above). In a preferred embodiment, the one or more reagents for converting an unmethylated cytosine to uracil or to another base that is detectably dissimilar to cytosine in terms of hybridization properties is a bisulfite, preferably sodium bisulfite. In another embodiment, the reagent capable of converting an unmethylated cytosine to uracil or to another base that is detectably dissimilar to cytosine in terms of hybridization properties is metabisulfite, preferably sodium metabisulfite.
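The effect of such a converting agent on a sequence can be illustrated in silico. The following Python sketch is an idealized model only: it assumes complete conversion of every unmethylated cytosine, treats a single strand, and writes the converted base as T (uracil is read as thymine after PCR amplification). The function name `bisulfite_convert` and the example sequence are hypothetical.

```python
def bisulfite_convert(seq, methylated=frozenset()):
    """Idealized bisulfite conversion of one DNA strand.

    `methylated` holds the 0-based positions of methylated cytosines, which
    are protected from conversion; every other C is written as T.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")   # unmethylated C -> U, read as T
        else:
            converted.append(base)  # methylated C and all other bases unchanged
    return "".join(converted)

# Hypothetical 6-mer with one CpG (C at position 1):
print(bisulfite_convert("ACGTCC", methylated={1}))  # CpG methylated
print(bisulfite_convert("ACGTCC"))                  # CpG unmethylated
```

The two outputs differ only at the CpG cytosine, which is the sequence difference that CpG site-binding oligonucleotides or sequencing reads detect.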
Terms used in context with the kits of the invention have the meaning as defined for the prognostic and the selection method of the invention.
Nucleic acids of the invention
In a fifth aspect, the present invention relates to a nucleic acid (hereinafter referred to as the "nucleic acid of the invention" or the "nucleic acid of the fifth aspect") selected from the group consisting of
(i) a nucleic acid comprising at least 9 contiguous nucleotides comprising a CpG site of a gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9;
(ii) a nucleic acid comprising at least 9 contiguous nucleotides comprising a CpG site of a gene selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9, wherein the position corresponding to the C within the CpG site is a uracil; and
(iii) a polynucleotide which specifically hybridizes to a nucleic acid of (i) or (ii).
In a particular embodiment, the nucleic acid of the invention comprises at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 20, at least 25, at least 30 or more contiguous nucleotides of said genes.
In a preferred embodiment, said CpG site is in the promoter region of said gene(s). In a more preferred embodiment, said CpG site is selected from the group as defined in the methods of the first and second aspects of the invention.
The terms and particulars of the nucleic acids of the invention have been described in detail in the context of the methods and kits of the invention and are used with the same meaning.
Use of the kits and nucleic acids of the invention
The person skilled in the art will understand that the kits and nucleic acids of the invention are particularly useful in the prognosis of a NSCLC patient or for selecting a NSCLC patient for anti-cancer treatment according to the methods of the first and second aspect of the invention. Thus, in a sixth aspect, the present invention relates to the use of a kit or a nucleic acid of the foregoing aspects for the prognosis of a NSCLC patient or for selecting a NSCLC patient for anti-cancer treatment (hereinafter referred to as the "use of the invention").
In a preferred embodiment, the NSCLC patient is a stage I NSCLC patient.
The terms and particulars of the use of the invention have been described in detail in the context of the methods, kits and nucleic acids of the invention and are used with the same meaning in the context of the uses according to the invention.
All publications mentioned herein are hereby incorporated in their entirety by reference.
While the foregoing invention has been described in some detail for purposes of clarity and understanding, it will be appreciated by one skilled in the art from a reading of this disclosure that various changes in form and detail can be made without departing from the true scope of the invention and appended claims.
The invention is described by way of the following examples which are to be construed as merely illustrative and not limitative of the scope of the invention.
EXAMPLE 1
Patients and Methods
Study design and patient population
Patients were eligible to enter the study as part of both discovery and validation cohorts if they underwent surgical resection of NSCLC in any of the internationally participating institutions. The clinical characteristics of the NSCLC surgical samples obtained are shown in Table 1. Tumors were collected by surgical resection from patients who had provided consent, and protocols were approved by the Institutional Review Boards. The median clinical follow-up was 7.2 years. Follow-up was performed with the use of radiographic imaging (chest X-ray and CT scans) and time of recurrence was noted. In addition, 25 normal lung tissue counterparts without any histological evidence of malignancy were also analyzed. The NSCLC tumor samples were studied in a consecutive manner as they arrived at the centralized DNA methylation facility and passed the technical quality checks.
Data shown in Table 2 are average (range) or number (%). * in Table 1 indicates patients from the discovery cohort who had undergone resection of NSCLC and received neither adjuvant nor neoadjuvant chemotherapy before relapse. ** in Table 2 indicates that all patients from the validation cohort had undergone resection of NSCLC and received neither adjuvant nor neoadjuvant chemotherapy before relapse.
[Tables shown as images in the original publication: imgf000035_0001 and imgf000036_0001]
Procedures
The DNA methylation status of 450,000 CpG sites was established by using the Infinium 450K Methylation array. The methylation score of each CpG is represented as a β value. Samples were clustered in an unsupervised manner using the 10,000 most variable β values (ranked by standard deviation) for the CpG sites located in promoter regions, by hierarchical clustering with the complete-linkage agglomeration method on Manhattan distances. DNA methylation microarray data are available from:
http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?token=nzmlvqwmiqsgobi&acc=GSE39279
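The clustering procedure described above (ranking CpG sites by standard deviation, then agglomerating samples by complete linkage on Manhattan distances) can be sketched as follows. This is a naive, pure-Python illustration for a handful of samples, not the software actually used for the analysis; the function names and the toy β values are hypothetical.

```python
from statistics import stdev

def most_variable(beta, k):
    """beta: dict CpG id -> list of β values across samples.
    Return the k CpG ids with the largest standard deviation."""
    return sorted(beta, key=lambda cpg: stdev(beta[cpg]), reverse=True)[:k]

def manhattan(a, b):
    """Manhattan (L1) distance between two β-value profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))

def complete_linkage_clusters(profiles, n_clusters=2):
    """profiles: dict sample id -> list of β values (same CpG order).
    Merge the two closest clusters until n_clusters remain, where the
    distance between clusters is the largest pairwise sample distance."""
    clusters = [[sample] for sample in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(manhattan(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Toy data: two low-methylation and two high-methylation samples.
profiles = {"s1": [0.10, 0.10], "s2": [0.15, 0.10],
            "s3": [0.90, 0.80], "s4": [0.85, 0.90]}
print(complete_linkage_clusters(profiles))
```

For data of the size described here (10,000 CpG sites in several hundred samples), an optimized library implementation would be used in practice; the sketch only makes the distance and linkage definitions explicit.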
Preparation of tumor specimens
All frozen specimens were labeled with study-specific code identifiers only. DNA was extracted from frozen specimens using a standard phenol chloroform extraction method. DNA from paraffin-embedded tissue blocks was extracted from two sequential unstained sections, each 10 μm thick. For each sample of tumor tissue, subsequent sections were stained with hematoxylin and eosin for histological confirmation of the presence (>50%) of tumor cells. Unstained tissue sections were deparaffinized, and DNA was extracted using the same protocol as for frozen specimens. Extracted DNA was checked for integrity and quantity with 1.3% agarose gel electrophoresis and picogreen quantification, respectively. Bisulfite conversion of 600 ng of DNA for each sample was performed according to the manufacturer's recommendation for the Illumina Infinium Assay. MIAME standards were met following GEO specifications. Bisulfite-converted DNA was amplified, fragmented and hybridized to the Illumina Infinium Human Methylation450 BeadChip using the standard Illumina protocol. Arrays were imaged using the Illumina HiScan SQ fluorescent scanner with the standard recommended Illumina scanner settings. The methylation score of each CpG is represented as a β value.
Normalization
The intensities of the images were extracted using GenomeStudio (2011.1) Methylation module (1.9.0) software. A three-step normalization procedure was performed using the lumi package available in the R statistical environment, consisting of color bias adjustment (normalization between the two color channels, red and green), background level adjustment and quantile normalization across arrays based on color balance adjusted data, as specified (Du et al., Bioinformatics 24(13): 1547-1548, 2008).
Prefiltering
Probe and sample filtering involved a two-step process. In the first step, probes overlapping with known single nucleotide polymorphisms (SNPs) were removed. SNPs are known to have given rise to inaccurate measurements of DNA methylation in earlier generations of Illumina's BeadArray assay and this needs to be taken into consideration for the Human Methylation 450K technology. Therefore, the inventors first removed all probes containing a SNP in the assayed CpG dinucleotide, as well as those for which two or more SNPs were located in the probe sequence. The inventors extracted the known SNPs in the human genome from the dbSNP database. The inventors selected the 2008 release of the database because it is the last release before the massive introduction of personal genomes, which would lead to false-positive probe removal. The second step involves the detection p-values of all measurements. The inventors considered every methylation β value to be unreliable if its corresponding detection p-value was not below the threshold T = 0.05. Greedycut, an algorithm that filters out the probe or sample with the highest fraction of unreliable measurements in successive iterations, was used; each iteration produces a matrix of retained measurements and enlarges the set of removed measurements. The inventors calculated the false-positive rate (a) and sensitivity (s) obtained when the retained measurements were used to predict the reliable ones. The inventors selected the matrix that maximized the value of the expression s + 1 - a, thereby giving equal weights to the sensitivity and specificity. Presented geometrically on a ROC curve, this is the point that is furthest from the diagonal. The final step was the removal of CpGs included in the sex chromosomes. Collectively, the filtering steps removed 76,545 probes and 24 samples. The analyses presented in this publication focus on the remaining 409,219 CpGs in 515 samples (490 primary tumors and 25 normal lung tissue counterparts).
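The iterative filtering described above can be illustrated with the following Python sketch. It is a simplified reading of the procedure as worded in this section and is not the Greedycut implementation actually used: the function name `greedycut`, the tie-breaking rule (a probe is removed before a sample when both have the same unreliable fraction) and the toy matrix are assumptions made for the example.

```python
def greedycut(unreliable):
    """unreliable[i][j] is 1 if the measurement of probe i in sample j has a
    detection p-value >= 0.05, else 0. Returns (kept_probes, kept_samples)
    for the retained matrix that maximizes s + 1 - a."""
    n_rows, n_cols = len(unreliable), len(unreliable[0])
    rows, cols = set(range(n_rows)), set(range(n_cols))
    total_bad = sum(sum(row) for row in unreliable)
    total_good = n_rows * n_cols - total_bad

    def score(rows, cols):
        kept_bad = sum(unreliable[i][j] for i in rows for j in cols)
        kept_good = len(rows) * len(cols) - kept_bad
        s = kept_good / total_good if total_good else 0.0  # sensitivity
        a = kept_bad / total_bad if total_bad else 0.0     # false-positive rate
        return s + 1 - a

    best = (score(rows, cols), set(rows), set(cols))
    while len(rows) > 1 and len(cols) > 1:
        row_frac = {i: sum(unreliable[i][j] for j in cols) / len(cols) for i in rows}
        col_frac = {j: sum(unreliable[i][j] for i in rows) / len(rows) for j in cols}
        worst_row = max(row_frac, key=row_frac.get)
        worst_col = max(col_frac, key=col_frac.get)
        if max(row_frac[worst_row], col_frac[worst_col]) == 0:
            break  # nothing unreliable is left in the retained matrix
        if row_frac[worst_row] >= col_frac[worst_col]:
            rows.remove(worst_row)   # assumption: ties remove the probe
        else:
            cols.remove(worst_col)
        current = score(rows, cols)
        if current > best[0]:
            best = (current, set(rows), set(cols))
    return sorted(best[1]), sorted(best[2])

# Toy matrix: probe 1 fails in every sample and sample 3 fails for every probe.
flags = [[0, 0, 0, 1],
         [1, 1, 1, 1],
         [0, 0, 0, 1],
         [0, 0, 0, 1]]
print(greedycut(flags))
```

In the toy matrix, removing the failing probe and the failing sample retains every reliable measurement and no unreliable one, so s = 1 and a = 0 and the expression s + 1 - a reaches its maximum of 2.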
To study possible batch effects for the Human Methylation 450K technology, multivariate Cox regression analyses were developed to show that the obtained hazard ratios were not associated with the center of origin of each sample or the microarray hybridization date.
Differentially methylated CpG search
Differentially methylated CpGs between tumor and normal groups were discovered using the following procedure: for each probe/CpG, the sets of methylation β values T (belonging to the tumor samples: first group) and N (belonging to the normal lung tissue samples: second group) were compared. The following three measures were calculated and differentially methylated CpGs were selected on the basis of their fulfillment of these three conditions:
(1) Mean difference: MD = |μ_T - μ_N| > 0.2, where μ_T and μ_N are the mean β values of the sets T and N
(2) Mean quotient: MT = log2(μ_T / μ_N) > 2
(3) Significance of the multiple testing-corrected Wilcoxon test (FDR): adjusted p < 0.05
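The three conditions above can be expressed as a single check per CpG. In the following illustrative Python sketch the adjusted p-value of the Wilcoxon test is taken as an input rather than computed; the function name and the example β values are hypothetical, and returning False when a mean is zero (where the logarithm is undefined) is an assumption of the sketch.

```python
from math import log2
from statistics import mean

def is_differentially_methylated(tumor, normal, adjusted_p):
    """tumor, normal: β values of one CpG in the tumor (T) and normal (N) sets.
    adjusted_p: FDR-adjusted p-value of the Wilcoxon test, computed elsewhere."""
    mu_t, mu_n = mean(tumor), mean(normal)
    if mu_t <= 0 or mu_n <= 0:
        return False  # assumption: log2 quotient undefined, CpG not selected
    mean_difference = abs(mu_t - mu_n)   # condition (1)
    mean_quotient = log2(mu_t / mu_n)    # condition (2)
    return mean_difference > 0.2 and mean_quotient > 2 and adjusted_p < 0.05

# Hypothetical CpG: mean β of 0.7 in tumors versus 0.1 in normal tissue.
print(is_differentially_methylated([0.6, 0.7, 0.8], [0.05, 0.10, 0.15], 0.01))
```

Since log2(x) > 2 is equivalent to x > 4, condition (2) as written selects CpGs whose mean β value in tumors is more than four times the mean in normal tissue, i.e. hypermethylated sites.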
Categorization of primary tumor samples according to methylation levels
To facilitate the classification of lung cancer patients with respect to their methylation levels in specific CpGs, the inventors set a threshold β-value of 0.4 to define non-methylation (β<0.4) and methylation (β>0.4). Subsequently, the inventors examined whether these two groups of patients were associated with a different pattern in relapse-free survival (RFS). The inventors also defined a CpG to be consistently unmethylated in normal donors when its mean β-value was at most 0.15. This condition was met by all of the selected differentially methylated CpGs.
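The categorization described above amounts to two simple threshold checks, sketched below in Python for illustration. The function names are hypothetical. The text does not state how a β-value of exactly 0.4 is handled; assigning it to the methylated group is an assumption of this sketch.

```python
from statistics import mean

BETA_THRESHOLD = 0.4      # threshold between non-methylation and methylation
NORMAL_MEAN_LIMIT = 0.15  # maximum mean β-value in normal donors

def methylation_status(beta, threshold=BETA_THRESHOLD):
    """Classify one tumor β-value for a given CpG."""
    # Assumption: β exactly equal to the threshold counts as methylated.
    return "methylated" if beta >= threshold else "unmethylated"

def consistently_unmethylated_in_normals(normal_betas, limit=NORMAL_MEAN_LIMIT):
    """True if the mean β-value across normal donors is at most the limit."""
    return mean(normal_betas) <= limit

print(methylation_status(0.62))                                   # methylated
print(methylation_status(0.18))                                   # unmethylated
print(consistently_unmethylated_in_normals([0.05, 0.10, 0.12]))   # True
```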
Pyrosequencing
DNA methylation in the validation cohort was evaluated with a pyrosequencing assay. A minimum of 500 ng of DNA was converted using the EZ DNA Methylation Gold (ZYMO RESEARCH) bisulfite conversion kit following the manufacturer's recommendations. There were 143 samples in the validation set. Specific sets of primers for PCR amplification and sequencing were designed using specific software (PyroMark assay design version 2.0.01.15). Primer sequences were designed, when possible, to hybridize with CpG-free sites to ensure methylation-independent amplification. PCR was performed under standard conditions with biotinylated primers and the PyroMark Vacuum Prep Tool (Biotage, Sweden) was used to prepare single-stranded PCR products according to the manufacturer's instructions. PCR products were visualized on 2% agarose gels before pyrosequencing. Reactions were performed in a PyroMark Q96 System version 2.0.6 (Qiagen) using appropriate reagents and protocols, and the methylation value was obtained from the average of each of the CpG dinucleotides included in the sequence analyzed. Controls to assess correct bisulfite conversion of the DNA were included in each run, as well as sequencing controls to ensure the fidelity of the measurements. The threshold value derived from the discovery cohort for RFS analysis was not re-estimated but was applied directly to the validation cohort. For the bimodal signature based on the accumulation of the five validated hypermethylated genes, only those samples in which all of them showed valid results were included (n=102).
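The averaging of the CpG dinucleotides in each amplicon, and the exclusion of samples lacking a valid result for any of the five genes, can be sketched as follows. This Python example is illustrative only. It assumes that methylation values are expressed as fractions between 0 and 1, that the threshold applied is the 0.4 value derived from the discovery cohort, and that the "accumulation" of hypermethylated genes is a simple count; none of these details is specified in this section, and the function name and sample values are hypothetical.

```python
from statistics import mean

GENES = ("HIST1H4F", "PCDHGB6", "NPBWR1", "ALX1", "HOXA9")

def methylated_gene_count(sample, threshold=0.4):
    """sample: dict gene -> list of per-CpG methylation fractions (0 to 1),
    or None when the pyrosequencing assay gave no valid result.
    Returns the number of genes whose average methylation exceeds the
    threshold, or None if any of the five genes lacks a valid result."""
    if any(sample.get(gene) is None for gene in GENES):
        return None  # sample excluded from the five-gene signature
    return sum(mean(sample[gene]) > threshold for gene in GENES)

# Hypothetical sample with valid results for all five assays.
sample = {
    "HIST1H4F": [0.55, 0.60, 0.65],
    "PCDHGB6": [0.10, 0.20],
    "NPBWR1": [0.45, 0.50],
    "ALX1": [0.05, 0.10, 0.15],
    "HOXA9": [0.70, 0.80],
}
print(methylated_gene_count(sample))
```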
Statistical analysis
Assay results were compared with actual patient outcomes in a double-blind manner. Median follow-up duration was calculated according to the inverse Kaplan-Meier method. Differences in distributions between groups were examined by the chi-square test. The Kaplan-Meier method was used to estimate RFS, and differences among the groups were analyzed with the log-rank test. Hazard ratios (HRs) from univariate Cox regression analysis were used to determine the association of age, gender, smoking history, histological type and tumor stage with relapse. Multivariate Cox proportional hazards regression was used to evaluate independent prognostic factors associated with RFS. Age, gender, smoking history, histological type and tumor stage were included as covariates. The tumor size, when included, was added categorically (<3 cm or >3 cm) according to T classification. The multiple testing adjustment (false discovery rate: FDR) was calculated as described (Storey J., Ann. Statist. 31(6):2013-2035, 2003): it is defined as the calculation of the positive false discovery rate of the p-value. All statistical analyses were performed and graphical output produced using SPSS, R 2.15.0 and R packages. Gene ontology by PANTHER, INTERPRO and KEGG pathway enrichment analyses were done using the Database for Annotation, Visualization and Integrated Discovery (DAVID; v6.7). In order to address the problem of picking predictive CpG sites from the whole array by chance, the inventors performed a test with 10,000 permutations, randomly selecting the same number of CpG sites as the number of predictive CpG sites determined by the inventors (n=10) and calculating the likelihood of getting more than 5 CpGs which induce a hazard higher than 2 (p=0.002) (p=0.01 using only the 150 CpGs in promoter, island/shore that were different in cancer vs. control samples).
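The permutation test described at the end of the preceding paragraph can be sketched in Python as follows. The sketch assumes that a hazard ratio has already been computed for every CpG on the array (or for the 150 candidate CpGs) and only illustrates the resampling step; the function name, the fixed random seed and the toy hazard ratios are hypothetical.

```python
import random

def permutation_p_value(hazard_ratios, n_select=10, more_than=5,
                        hr_cutoff=2.0, n_permutations=10_000, seed=0):
    """Fraction of random draws of `n_select` CpGs in which more than
    `more_than` of the drawn CpGs have a hazard ratio above `hr_cutoff`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    extreme = 0
    for _ in range(n_permutations):
        drawn = rng.sample(hazard_ratios, n_select)
        hits = sum(hr > hr_cutoff for hr in drawn)
        if hits > more_than:
            extreme += 1
    return extreme / n_permutations

# Toy example: 1,000 CpGs, of which 5% have a hazard ratio above 2.
toy_hazard_ratios = [2.5] * 50 + [1.0] * 950
print(permutation_p_value(toy_hazard_ratios))
```

A small returned fraction indicates that a set of this size enriched in high-hazard CpGs is unlikely to arise from random selection alone.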
EXAMPLE 2
DNA methylation profiles identify two groups with different relapse-free survival
The inventors first evaluated a genome-wide DNA methylation profile of the original cohort of 490 lung tumor patients including three NSCLC subtypes (adenocarcinoma, squamous and large cell carcinomas) using a previously validated 450,000 CpG methylation microarray (Sandoval et al., Epigenetics 6(6):692-702, 2011). In addition, 25 normal lung tissue counterparts without any histological evidence of malignancy were also analyzed.
The analyses of CpG methylation β-values from the DNA methylation microarray within all primary NSCLCs (n=490) and normal tissues (n=25) identified 10,000 promoter CpGs with the most variable CpG methylation levels. These 10,000 top-ranked CpG sites were plotted in an unsupervised manner in the 490 primary NSCLCs (Fig. 1A). The hierarchical clustering distinguished two main types of tumors, accounting for 65 (13%) (Group A) and 425 (87%) (Group B) cases. Chi-square tests showed a significantly higher proportion of the adenocarcinoma histological type in Group A (Chi-square test P<0.001), but no other significant differences in the distribution of the tumors according to their stage, gender or smoking history between Group A and Group B were observed. The inventors investigated whether these two DNA methylation groups had any effect on the RFS of these patients. The inventors analyzed the subset of patients who had undergone resection of NSCLC and not received adjuvant chemotherapy before relapse, due to the possible confounding effect of chemotherapy on the RFS. Only 6% (31 of 490) of the discovery cohort samples received neoadjuvant therapy and none of these cases were included in the RFS analysis. For a similar reason, overall survival was not selected as an end-point of the study because it could be affected by subsequent therapies received at relapse. The inventors also focused the RFS analysis on the two main histological groups, adenocarcinomas and squamous cell carcinomas, since the large cell tumors constituted a small minority of the discovery set (4%, 18 of 490), where only 2.6% (4 of 151) stage I cases would have been useful for the RFS analyses.
Overall, 198 NSCLC cases met the criteria for inclusion in the RFS cohort. Most importantly, these NSCLC Group A patients had a significantly shorter RFS, as shown in the Kaplan-Meier survival analysis (log-rank test: p<0.001) (Fig. 1B) and in the univariate (HR 2.23; p=0.001) and multivariate (HR 2.30; p=0.002) Cox regression analyses of stage, histology, smoking history, age and gender. Related to histology, sample size was not sufficient to study the squamous type alone, but the unsupervised clustering analysis of the adenocarcinomas also identified a group associated with shorter RFS (HR 2.11; p=0.004). The inventors wanted to extend these observations to an important clinical issue: identifying those NSCLC tumors that, despite their low stage, are prone to recurrence. The selection of these patients is a critical issue because approximately 30-40% of patients with stage I NSCLC die of recurrent disease (see, e.g., Martini et al., J Thorac Cardiovasc Surg 109:120-9, 1995). To address this, the profile of the aforementioned 10,000 promoter CpGs, which had already shown their prognostic value throughout all NSCLC stages, was plotted in an unsupervised manner in the 253 cases of stage I NSCLC (Fig. 1C). Hierarchical clustering distinguished two main types of tumors, accounting for 68 (27%) (Group 1) and 185 (73%) (Group 2) cases. Chi-square tests revealed no significant differences in the distribution of the tumors in the two groups by gender, smoking history and histological type. Among the 253 stage I NSCLC cases, 147 patients met the described criteria for inclusion in the RFS cohort. The ineligible cases (n=106) had a higher recurrence rate (Chi-square test, P=0.05) probably associated with the increased number of stage IB cases in this setting (Chi-square test, P=0.001).
Only 4% (10 of 253) of stage I cases received neoadjuvant therapy and none of these were included in the RFS analysis. Group 1 identified high-risk stage I NSCLC patients that had lower RFS, as revealed by the Kaplan-Meier survival analysis (log-rank test: p=0.021) (Fig. 1D) and in the univariate (HR 1.96; p=0.023) and multivariate (HR 2.46; p=0.005) Cox regressions of histology, smoking history, age and gender. Related to histology, sample size was not sufficient to study the squamous type alone, but the unsupervised clustering analysis of the adenocarcinomas in stage I also identified a group associated with shorter RFS (HR 2.64; p=0.008). For all the stage I cases, since stage IA and IB have different outcomes, the inventors also added this particular feature, according to the revised 6th TNM classification criteria, to the Cox regression multivariate analysis, and Group 1 remained significantly associated with shorter RFS (HR 2.20; p=0.018). The inclusion of tumor size within Stage I, also an indicator of poor prognosis in NSCLC, in the Cox analysis did not alter the significant association of Group 1 tumors with shorter RFS (HR 2.21; p=0.035). The reclassification of the stage I tumors according to the latest 7th TNM classification criteria also confirmed that Group 1 patients remained significantly associated with shorter RFS (HR 2.47; p=0.024).
EXAMPLE 3
Identification of candidate genes as DNA methylation biomarkers of shorter RFS in the discovery cohort of stage I NSCLC
The identification of a DNA methylation signature for stage I NSCLC that predicts the likelihood of early recurrence might be useful in managing these patients, but the finding of a smaller panel of DNA methylation biomarkers could simplify the whole process. To achieve this goal, the inventors developed an integrative approach to rank the CpG sites that, according to their methylation status (β-values), were best at discriminating the 490 NSCLC samples from the 25 normal lung tissue samples. This analysis identified 363 highly ranked CpG sites. From these, the inventors focused on those CpGs located in gene regulatory regions: promoter CpG islands and shores. The inventors found that 150 of the 363 CpG sites were located in the described regions. All these 150 CpG sites were among the 10,000 CpG sites used in the clustering. CpG hypermethylation of these 150 sites was significantly enriched in Group A vs Group B (Student's t-test, p<0.001) and in Group 1 vs Group 2 (Student's t-test, p<0.001), supporting their potential prognostic value. Thus, the inventors tested the methylation value of each of these 150 CpG sites for RFS in the 147 stage I tumors by Kaplan-Meier survival and multivariate Cox regression. The inventors identified 53 CpGs, corresponding to 40 genes, significantly associated with shorter RFS at a 10% False Discovery Rate (FDR).
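The text states that CpGs were selected at a 10% FDR but does not name the correction procedure. As a hedged illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up procedure; the p-values are invented and the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.10):
    """Return the indices of the tests retained at the given FDR.

    Assumption: Benjamini-Hochberg step-up procedure (not confirmed by the text).
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank k whose p-value satisfies p_(k) <= (k / m) * fdr
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])

# Toy p-values, one per CpG site (illustrative only)
p_values = [0.001, 0.20, 0.012, 0.04, 0.65, 0.03]
selected = benjamini_hochberg(p_values, fdr=0.10)
```

With these toy values, the four smallest p-values pass their rank-dependent thresholds and the remaining two are discarded.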
EXAMPLE 4
Validation of candidate genes as DNA methylation biomarkers of shorter RFS in an independent cohort of stage I NSCLC
Once the inventors had identified 40 genes with CpG promoter methylation that influenced RFS in the initial discovery cohort of 147 stage I tumors, the inventors sought to validate these single DNA methylation markers in an additional cohort of 143 stage I NSCLC patients (Table 1). All these new NSCLC cases were obtained from patients who had undergone a resection and did not receive adjuvant chemotherapy. The validation cohort, in comparison to the discovery set, was significantly enriched in European samples and, thus, in affected men and squamous carcinomas (Travis WD. The WHO: Classification of the Lung Cancer 2004). The DNA methylation levels at the described CpG sites were analyzed by pyrosequencing (Fernandez AF et al., Genome Res 22(2):407-19, 2012) to test a more affordable large-scale approach. The methylation value by pyrosequencing was obtained as the average of the CpG dinucleotides included in the sequence analyzed. Due to the limited DNA material, the inventors selected the top 10 genes with an HR >2 at a 10% FDR. Of these 10 candidate DNA methylation biomarkers associated with recurrence in the discovery cohort using the DNA methylation microarray, five (50%) were significantly associated with recurrence (p<0.05) in the validation cohort of 143 stage I NSCLC samples analyzed by pyrosequencing. These were the genes Histone cluster 1 H4F (HIST1H4F, HR 3.55, p<0.001), Protocadherin gamma subfamily B6 (PCDHGB6, HR 2.95, p=0.002), Neuropeptides B/W receptor 1 (NPBWR1, HR 2.71, p=0.004), ALX homeobox protein 1 (ALX1, HR 2.29, p=0.015) and Homeobox A9 (HOXA9, HR 2.03, p=0.027) (Fig. 2). In addition, three other genes (30%) showed a trend towards significance (OTX2, HR 1.82, p=0.11; TRIM58, HR 1.57, p=0.14; TRH, HR 4.23, p=0.17).
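The averaging step described above, in which the pyrosequencing methylation value of a gene is the mean over the CpG dinucleotides in the analyzed sequence, can be sketched as follows. The per-CpG percentages are invented and the function name `amplicon_methylation` is hypothetical.

```python
from statistics import mean

def amplicon_methylation(cpg_percentages):
    """Average the per-CpG methylation percentages of one pyrosequenced sequence."""
    if not cpg_percentages:
        raise ValueError("at least one CpG measurement is required")
    return mean(cpg_percentages)

# Hypothetical per-CpG methylation percentages for one sample and one gene
cpg_values = [42.0, 55.0, 47.0, 60.0]
value = amplicon_methylation(cpg_values)
```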
The inventors also observed a greater risk of shorter RFS, according to Kaplan-Meier plots, when stage I NSCLCs harbored a high number of the five statistically significant hypermethylated markers (HIST1H4F, PCDHGB6, NPBWR1, ALX1 and HOXA9). To obtain the most useful bimodal methylation signature, the inventors chose the cut-off 0-1 vs ≥2 hypermethylated markers, because it best matched the percentage of expected recurrences. The described bimodal methylation signature divides the stage I tumors into two arms: patients with 0-1 methylated markers, who show longer RFS, and those with ≥2 hypermethylated genes, who were associated with a higher risk of poor RFS by Kaplan-Meier estimates (Fig. 3A). The heavily hypermethylated group identified high-risk stage I NSCLC patients that had shorter RFS, as shown by the Kaplan-Meier survival analysis (log-rank test: p=0.010) (Fig. 3A) and the univariate (HR 2.26; p=0.012) and multivariate (HR 3.24; p=0.001) Cox regressions. The bimodal methylation signature remained significantly associated with shorter RFS in the Cox regression multivariate analysis even when stage I tumors were subdivided into IA and IB according to the revised 6th TNM classification (HR 3.09, p=0.002). The reclassification of the stage I tumors according to the most recent 7th TNM classification criteria also confirmed the relevance of the enriched hypermethylation group for shorter RFS in 103 original stage I tumors where all the necessary clinicopathological information was available (HR 2.89, p=0.010). The inclusion of tumor size in the Cox analysis within stage I did not alter the significant association of tumors with ≥2 methylated markers with shorter RFS (HR 2.88, p=0.011). Since 80% of recurrences of stage I NSCLC occur within three years of surgery (Martini N et al., J Thorac Cardiovasc Surg 109:120-9, 1995), the inventors also calculated how many patients relapsed in this period. 
The inventors observed that 48% (95% CI 39.8-56.4) of patients from the enriched methylated group (2-5 methylated markers) relapsed, but only 18% (95% CI 16.1-19.5) of those in the low methylated group (0-1 methylated markers). Finally, as expected, the prognostic bimodal methylation signature obtained by pyrosequencing of the five genes in the validation cohort was also observed in the 147 stage I NSCLC cases from the discovery cohort studied by the DNA methylation microarray (HR 1.95, p=0.023) (Fig. 3B).
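The bimodal signature described above, which separates tumors with 0-1 hypermethylated markers from those with 2-5, can be sketched as a simple counting rule. The per-gene cut-offs used to call a marker hypermethylated are not given in this passage, so the value of 30.0 below is purely a placeholder, as are the sample values and the function name `risk_group`.

```python
MARKERS = ("HIST1H4F", "PCDHGB6", "NPBWR1", "ALX1", "HOXA9")

def risk_group(methylation, cutoffs):
    """Classify one tumor by its number of hypermethylated markers.

    methylation -- methylation value per gene (e.g. mean pyrosequencing percentage)
    cutoffs     -- per-gene threshold above which a marker counts as hypermethylated
    """
    n_hyper = sum(methylation[gene] > cutoffs[gene] for gene in MARKERS)
    group = "high-risk" if n_hyper >= 2 else "low-risk"
    return group, n_hyper

# Placeholder cut-offs (assumption): the passage does not state the real thresholds
cutoffs = {gene: 30.0 for gene in MARKERS}

# Hypothetical tumor sample
sample = {"HIST1H4F": 45.0, "PCDHGB6": 12.0, "NPBWR1": 38.0, "ALX1": 5.0, "HOXA9": 22.0}
group, count = risk_group(sample, cutoffs)
```

Here two markers exceed the placeholder cut-off, so the sample falls in the group with 2-5 methylated markers.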
CpG sites analysed by the inventors of the invention are shown in Table 3.
Indicated numbers correspond to the C nucleotide of a CpG site, according to MAPINFO/Illumina Infinium HumanMethylation450 BeadChip, Manifest v1.2, or according to the UCSC database, as in Genome Reference Consortium Human Build 37 (GRCh37) and UCSC hg19 as released in February 2009.
Figure imgf000046_0001
Table 3: CpG sites analysed according to the present invention. Indicated numbers correspond to the C nucleotide of a CpG site, as in Infinium/UCSC.
Overall, the inventors have identified DNA methylation classifiers that, at a different level of resolution, are potential prognostic biomarkers of shorter RFS in stage I NSCLC (Fig. 3C).

Claims

1. An in vitro method for the prognosis of a NSCLC patient, comprising determining in a biological sample of said patient the methylation pattern in at least one gene selected from the group consisting of PCDHGB6, HIST1H4F, NPBWR1, ALX1 and HOXA9, and wherein hypermethylation in at least one gene is indicative of a bad prognosis.
2. The method according to claim 1 wherein the NSCLC patient is a stage I NSCLC patient.
3. The method according to claims 1 or 2 wherein the prognosis is determined as relapse-free survival.
4. An in vitro method for selecting a NSCLC patient for an anti-cancer treatment, comprising determining in a biological sample of said patient the methylation pattern in at least one gene selected from the group consisting of PCDHGB6, HIST1H4F, NPBWR1, ALX1 and HOXA9, and wherein hypermethylation in at least one gene selects the patient for the cancer treatment.
5. The method according to claim 4 wherein the NSCLC patient is a stage I NSCLC patient.
6. The method according to any of claims 1 to 5, wherein the methylation pattern of said gene(s) is determined in the promoter region of said gene(s).
7. The method according to any of claims 1 to 6, wherein the methylation pattern of said gene(s) is determined at a CpG site of said gene(s).
8. The method according to claim 7, wherein said CpG site is located at a CpG island of said gene(s) or at a CpG shore of said gene(s).
9. The method according to any of claims 1 to 8, wherein the methylation pattern of said gene(s) is determined as the methylation mean value of at least two CpG sites of said gene(s).
10. The method according to any of claims 1 to 9, wherein the CpG sites of said gene(s) are located at positions:
- 26240782 (cg10723962), 26240528 (cg22723502), 26240519 (cg12260798), 26240762, 26240771, 26240774, 26240776, 26240779, 26240789 and/or 26240796 in HIST1H4F,
- 140787507 (cg18507379), 140787504 (cg18617005), 140787474, 140787487, 140787491, 140787504 and/or 140787513 in PCDHGB6,
- 53851156 (cg26205771) and/or 53852422 (cg07770968), 53851151, 53851160 and/or 53851189 in NPBWR1,
- 85673270 (cg14996220), 85673276, 85673296, 85673303, 85673325, 85673330, 85673344, 85673347 and/or 85673353 in ALX1, and/or
- 27205217 (cg16104915), 27205230 (cg12600174), 27205187, 27205200, 27205204, 27205206, 27205211, 27205219 and/or 27205224 in HOXA9,
wherein indicated positions correspond to the C nucleotide of a CpG site according to MAPINFO/Illumina Infinium HumanMethylation450 BeadChip, Manifest v1.2 or according to the UCSC database, as in Genome Reference Consortium Human Build 37 (GRCh37) and UCSC hg19 as released in February 2009; and wherein positions in brackets are indicated according to IlmnID/Illumina.
11. The method according to any of claims 1 to 10, wherein the methylation pattern is determined in at least two genes selected from the group consisting of HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9.
12. The method according to any of claims 1 to 11, wherein the methylation pattern is determined in the genes HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9, and wherein hypermethylation in at least two of these genes is indicative of a bad prognosis or selects the patient for the cancer treatment.
13. The method according to any of claims 1 to 12, wherein the methylation pattern is determined by a technique selected from the group comprising bisulfite modification based techniques, enzymatic digestion based techniques, affinity-enriched based techniques and high-throughput techniques.
14. The method according to any of claims 1 to 13, wherein the methylation pattern is determined by pyrosequencing.
15. An anti-cancer agent for use in the treatment of a NSCLC patient, wherein the patient is selected using the method for selecting a NSCLC patient for a cancer treatment according to any one of claims 3 to 12.
16. The anti-cancer treatment for use according to claim 15 wherein the NSCLC patient is a stage I NSCLC patient.
17. A kit comprising
(i) at least one CpG site-binding oligonucleotide capable of specifically hybridizing to a sequence spanning a CpG site in a gene selected from the group consisting of PCDHGB6, HIST1H4F, NPBWR1, ALX1, and HOXA9 in a methylation-specific manner; or
(ii) at least one CpG site-flanking oligonucleotide capable of specifically hybridizing to an upstream or downstream sequence of a CpG site in a gene selected from the group consisting of PCDHGB6, HIST1H4F, NPBWR1, ALX1, and HOXA9, wherein unmethylated cytosine in at least part of said gene has been converted to uracil or to another base that is detectably dissimilar to cytosine in terms of hybridization properties.
18. The kit according to claim 17, comprising (i) at least one first CpG site-binding oligonucleotide capable of specifically hybridizing to a polynucleotide in which unmethylated cytosine has been converted to uracil or to another base that is detectably dissimilar to cytosine in terms of hybridization properties and which comprises at least one sequence of a CpG site according to claim 10 when said CpG site is methylated, and
(ii) at least one second CpG site-binding oligonucleotide capable of specifically hybridizing to the same polynucleotide when said CpG site is unmethylated.
19. The kit according to claim 17, wherein said part of said gene comprises at least the gene sequence between and including the upstream or downstream sequence and the CpG site.
20. The kit according to any of claims 17 to 19, further comprising one or more reagents for converting an unmethylated cytosine to uracil or to another base that is detectably dissimilar to cytosine in terms of hybridization properties.
21. A nucleic acid selected from the group consisting of
(i) a nucleic acid comprising at least 9 contiguous nucleotides comprising a CpG site of a gene selected from the group consisting of PCDHGB6, HIST1H4F, NPBWR1, ALX1, and HOXA9;
(ii) a nucleic acid comprising at least 9 contiguous nucleotides comprising a CpG site of a gene selected from the group consisting of PCDHGB6, HIST1H4F, NPBWR1, ALX1, and HOXA9, wherein the position corresponding to the C within the CpG site is a uracil; and
Figure imgf000050_0001
22. The kit according to any of claims 17 to 20 or the nucleic acid of claim 21, wherein said CpG site is in the promoter region of said gene(s).
23. The kit according to any of claims 17 to 20 or 21 to 22, or the nucleic acid of any of claims 18 to 19 or 21, wherein said CpG site is located at a CpG island or at a CpG shore of said gene(s).
24. The kit or the nucleic acid according to any of claims 22 or 23, wherein said CpG site is selected from the group consisting of a CpG located at positions
- 26240782 (cg10723962), 26240528 (cg22723502), 26240519 (cg12260798), 26240762, 26240771, 26240774, 26240776, 26240779, 26240789 and/or 26240796 in HIST1H4F,
- 140787507 (cg18507379), 140787504 (cg18617005), 140787474, 140787487, 140787491, 140787504 and/or 140787513 in PCDHGB6,
- 53851156 (cg26205771) and/or 53852422 (cg07770968), 53851151, 53851160 and/or 53851189 in NPBWR1,
- 85673270 (cg14996220), 85673276, 85673296, 85673303, 85673325, 85673330, 85673344, 85673347 and/or 85673353 in ALX1, and/or
- 27205217 (cg16104915), 27205230 (cg12600174), 27205187, 27205200, 27205204, 27205206, 27205211, 27205219 and/or 27205224 in HOXA9,
wherein indicated positions correspond to the C nucleotide of a CpG site according to MAPINFO/Illumina Infinium HumanMethylation450 BeadChip, Manifest v1.2 or according to the UCSC database, as in Genome Reference Consortium Human Build 37 (GRCh37) and UCSC hg19 as released in February 2009; and wherein positions in brackets are indicated according to IlmnID/Illumina.
25. Use of a kit according to any of claims 17 to 20 or 22 to 24, or of a nucleic acid according to any of claims 21 to 24 for prognosis of a stage I NSCLC patient or for selecting a stage I NSCLC patient for anti-cancer treatment.
26. The method according to any of claims 1 to 14, the anti-cancer agent according to any of claims 15 or 16 or the use according to claim 25, wherein NSCLC is of the subtype adenocarcinoma or squamous carcinoma.
27. The method according to any of claims 1 to 14 or 26, the anti-cancer agent according to claim 15, 16 or 26 or the use according to claim 25 or 26, wherein said patient has undergone resection of NSCLC and has optionally not yet received chemotherapy.
PCT/EP2014/058150 2013-04-23 2014-04-22 Methods and kits for prognosis of stage i nsclc by determining the methylation pattern of cpg dinucleotides WO2014173905A2 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
EP13382147 2013-04-23
EP13382147.0 2013-04-23
EP13185321 2013-09-20
EP13185321.0 2013-09-20

Publications (2)

Publication Number Publication Date
WO2014173905A2 (en) 2014-10-30
WO2014173905A3 WO2014173905A3 (en) 2015-04-23

Family

ID=50680012

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2014/058150 WO2014173905A2 (en) 2013-04-23 2014-04-22 Methods and kits for prognosis of stage i nsclc by determining the methylation pattern of cpg dinucleotides

Country Status (1)

Country Link
WO (1) WO2014173905A2 (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002044331A2 (en) * 2000-11-29 2002-06-06 Cangen International Dap-kinase and hoxa9, two human genes associated with genesis, progression, and aggressiveness of non-small cell lung cancer
WO2008063655A2 (en) * 2006-11-20 2008-05-29 The Johns Hopkins University Dna methylation markers and methods of use
WO2012175562A2 (en) * 2011-06-21 2012-12-27 University Of Tartu Methylation and microrna markers of early-stage non-small cell lung cancer

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3190191A1 (en) * 2016-01-11 2017-07-12 Institut d'Investigació Biomèdica de Bellvitge (IDIBELL) Method and kit for the diagnosis of lung cancer
WO2017121712A1 (en) * 2016-01-11 2017-07-20 Institut D'investigació Biomèdica De Bellvitge (Idibell) Method and kit for the diagnosis of lung cancer
EP3839070A4 (en) * 2018-08-16 2022-05-18 Shanghai Public Health Clinical Center Dna methylation-related marker for diagnosing tumor, and application thereof
CN110964811A (en) * 2018-09-29 2020-04-07 广州市康立明生物科技有限责任公司 HOXA9 methylation detection reagent
CN110964811B (en) * 2018-09-29 2022-04-29 广州康立明生物科技股份有限公司 HOXA9 methylation detection reagent
WO2022187246A1 (en) * 2021-03-01 2022-09-09 National Taiwan University Method and kit for monitoring non-small cell lung cancer
CN113652490A (en) * 2021-09-27 2021-11-16 广州凯普医药科技有限公司 Primer probe combination and kit for early screening and/or prognosis monitoring of bladder cancer
CN113652490B (en) * 2021-09-27 2022-07-22 广州凯普医药科技有限公司 Primer probe combination and kit for early screening and/or prognosis monitoring of bladder cancer

Also Published As

Publication number Publication date
WO2014173905A3 (en) 2015-04-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14722139

Country of ref document: EP

Kind code of ref document: A2

122 Ep: pct application non-entry in european phase

Ref document number: 14722139

Country of ref document: EP

Kind code of ref document: A2