WO2016066797A2 - Ovarian cancer prognostic subgrouping - Google Patents

Ovarian cancer prognostic subgrouping

Publication number
WO2016066797A2
Authority
WO
WIPO (PCT)
Prior art keywords
cancer
biomarkers
poor
patients
platinum
Prior art date
Application number
PCT/EP2015/075246
Other languages
French (fr)
Other versions
WO2016066797A3 (en)
Inventor
Sampsa Hautaniemi
Ping Chen
Olli Carpen
Kaisa Huhtinen
Original Assignee
University Of Helsinki
University Of Turku
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University Of Helsinki, University Of Turku
Publication of WO2016066797A2
Publication of WO2016066797A3

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/231 Hierarchical techniques, i.e. dividing or merging pattern sets so as to obtain a dendrogram
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations

Definitions

  • the present invention relates to molecular profiling in order to identify clinically applicable biomarkers for predicting cancer patients' response to a specific therapy regimen.
  • the present invention particularly relates to cancer patients treated first line with platinum and taxane combination therapy.
  • the invention provides substantial help in the clinical decision making and is especially useful in the selection of first line chemotherapy regimen.
  • HGS-OvCa: high-grade serous ovarian cancer
  • HGS-OvCa is the most common ovarian cancer subtype, with an overall 5-year survival rate of 35-40% [Berns EM and Bowtell DD, 2012].
  • Although HGS-OvCa is a single morphological and clinical entity, its molecular features are diverse. The genomic alterations are very complex and there are no common driver mutations, or other known targetable mutations, that could serve as uniform therapeutic targets.
  • The standard treatment for HGS-OvCa is platinum and taxane combination therapy. While most patients at the advanced stage of the disease initially respond to chemotherapy, the treatment is very seldom curative and the majority suffer relapse within 18 months. The development of platinum resistance is a complex and poorly understood process. There are currently very few clinically relevant treatment options available for platinum resistant ovarian cancer patients. Thus, biomarkers for alternative treatments that would allow the design of personalized combinatorial therapy for platinum and taxane resistant cancer patients are urgently needed.
  • AOCS and TCGA expression data have been used to identify four subtypes in HGS-OvCa [Tothill R. W. et al., 2008; Verhaak R. G. et al., 2013]. Patients classified to any of these subtypes can, however, still have different responses to platinum and taxane combination therapy.
  • Characterization of cancer patient subgroups from massive high-throughput data for personalized medicine requires cost-efficient systematic approaches.
  • Most of the existing approaches for identifying cancer subgroups use a two-step approach, where the first step is to use computational methods, such as unsupervised learning, Bayesian methods or dimension reduction, for identifying clusters using molecular data. This is followed by assessment of the biological or medical relevance of the identified clusters using Gene Ontology, pathway or clinical data.
  • the major limitation of this approach is that most clusters typically have weak or no association to outcome or disease relevant biological processes because clusters are formed without fusing clinical data.
  • the clusters identified without fusing clinical data with molecular data could lose association to outcome or disease relevant biological processes and may contain hundreds of genes that are loosely correlated with each other making the choice of putative biomarkers challenging.
  • isoform-level signatures are often lineage-specific, rather than tissue-specific as shown by gene-level signatures.
  • isoform-level signatures may therefore become useful biomarkers for cancer progression.
  • the treatment of ovarian cancer remains challenging despite many advances in therapeutic options. There is still room for examining better ways of using known drugs.
  • An object of the present invention is thus to provide a method for stratifying cancer patients having different responses to a certain therapy.
  • an object of the present invention is to provide clinically applicable methods and means for predicting the response to the first line platinum and taxane combination therapy in cancer, especially in high-grade serous ovarian cancer (HGS-OvCa) and basal like breast cancer (BL-BrCa).
  • a further object of the invention is to provide means for developing alternative treatment strategies and to select patients for future therapeutic trials.
  • the objects of the invention are achieved by a method of stratifying cancer patients to prognostic subgroups, a panel of biomarkers, a method for designing of personalized combinatorial therapy for platinum and taxane resistant cancer patients and a kit which are characterized by what is stated in the independent claims.
  • the present invention provides a set of features that stratify patients to prognostic groups and result in the identification of patient groups that are directly associated with a clinical endpoint.
  • the invention provides a panel of co-expressed features that are applicable as predictive biomarkers.
  • the present invention is highly applicable in HGS-OvCa and BL-BrCa.
  • the preferred embodiments of the invention are disclosed in the dependent claims.
  • the invention is based on a novel computational approach called Prognostic Subgroup Finder (PSFinder), which stratifies patients to prognostic groups using co-expressed isoform- and gene-level markers identified from large-scale transcriptomics and clinical data.
  • PSFinder was applied to HGS-OvCa patients treated with the standard platinum and taxane combination therapy from the TCGA HGS-OvCa cohort among others.
  • the results of this approach identified three prognostic groups stratified by 61 transcript isoform-level markers and 32 gene-level markers; two with poor response and one with good response to platinum and taxane chemotherapy.
  • the prognostic value was further increased when the BRCA1/2 mutation status was used in addition to subgroup information. It was further discovered that the invention may be extended to other cancers, for example to BL-BrCa, wherein the patient receives platinum and taxane combination therapy as a first line treatment.
  • the novel method of the invention provides an easily applicable and fast way of identifying therapy resistant/sensitive patients.
  • the method is used in predicting outcome of platinum and taxane combination therapy at treatment naive stage.
  • the method enables identification of poor and good responsive cancer patients to the first line therapy using expression profiles and BRCA1/2 status.
  • the invention enables identification of poor and good responsive ovarian cancer patients to first line therapy using expression profiles and BRCA1/2 status. This approach can be useful in selecting the first line chemotherapy regimen in the future.
  • the invention provides a restricted set of previously unknown isoform-level biomarkers.
  • it provides a panel of gene-level markers.
  • biomarkers in the panels may individually or in combination with the BRCA mutation status be used for stratification of cancer patients treated with the state-of-the-art platinum and taxane combination, especially HGS-OvCa and BL-BrCa patients. Identification of the extreme responders and primarily chemoresistant patients can help to develop alternative treatment strategies and to select patients for future therapeutic trials.
  • the invention enables functional analysis on subgroup-specific genes revealing the molecular signaling pathways that lead to different prognosis in subgroups of the cancer. These may lead to therapeutic benefits in designing therapy for cancer and allow designing personalized combinatorial therapy for platinum-taxane resistant patients.
  • the present invention provides means for developing accurate diagnostic tests that identify patients who can benefit from targeted therapies.
  • the invention provides advantages not only for individual patient care but also for better selection and stratification in clinical trials and molecular studies.
  • FIG. 1 shows the overall workflow of PSFinder and the outcome by using isoform-level expression and clinical data from 180 HGS-OvCa patient samples.
  • PSFinder is an iterative rule-based approach. Inputs consist of feature (transcript isoforms or genes) expression data from a cohort of samples and survival times of the sample donors.
  • In Step 1, PSFinder identifies individual features whose discretized expression values (high/low) are associated with a statistically significant survival effect (Kaplan-Meier with log-rank test, p < 0.05). These are used as seeds in Step 2, where highly correlated transcripts (r ⁇
  • PSFinder iteratively merges initial cliques into larger cliques.
  • the initial seed merging iterations are done until either 1) the correlation between the expression profiles or 2) the survival association in the merged clique falls under the threshold.
  • the outcome of PSFinder is the merged clique based on the maximum score and all the features in the clique.
  • Figure 2 shows the outcome of PSFinder using isoform-level expression data from 180 high-grade serous ovarian cancer patients treated with platinum and taxane combination therapy.
  • A Expression profile of isoform-level markers in Poor I, Poor II and Good prognostic patients. In the heatmap, rows correspond to transcript isoforms and columns to samples. PSFinder identified two poor prognosis groups: Poor I, Poor II, and one Good prognosis group.
  • Figure 3 shows prognostic outcome prediction in 29 MUPET HGS-OvCa prospective patient cohorts with isoform-specific expression from TaqMan qRT-PCR.
  • A Patient prognostic outcome prediction with nine isoform markers. Markers and pathway genes are shown at x-axis in this heat map, with pathway genes being indicated with asterisks (*). Patients predicted to be Poor I, Poor II and Good prognosis are indicated at y-axis.
  • C Expression boxplot of validated characteristic markers in Poor I, Poor II and Good prognostic patients.
  • D Expression boxplot of validated pathway genes in Poor I, Poor II and Good prognostic patients.
  • Figure 4 shows the association of BRCA1/2 mutation and PSFinder identified prognosis groups with patients' survival.
  • A BRCA1/2 mutation status and patient prognosis in the homogeneously treated TCGA cohort.
  • B Kaplan-Meier survival curves for patients in the discovery set given platinum and taxane combination therapy with 1) poor prognosis and wild-type BRCA1/2, 2) poor prognosis and mutant BRCA1/2, 3) good prognosis and wild-type BRCA1/2, 4) good prognosis and mutant BRCA1/2.
  • the log-rank test p is 0.0002.
  • Figure 5 shows the outcome of PSFinder using gene-level expression data from 180 high-grade serous ovarian cancer patients treated with platinum and taxane combination therapy.
  • FIG 10. The application (LPS prediction) of HGS-OvCa study identified markers (Table 4) to basal-like breast cancer samples.
  • the Kaplan-Meier analysis with log-rank test revealed statistically significant survival associations (p < 0.05) between the three groups (Poor I, Poor II and Good) also in basal-like breast cancer.
  • the heatmap shows the expression values for 32 genes identified by PSFinder in HGS-OvCa in basal-like breast cancer samples. The Poor I (red color in side bar), Poor II (blue) and Good (green) subtypes are marked to the heatmap.
  • FIG. 11 Optimization of biomarker panel for predicting the survival of HGS-OvCa patients.
  • Receiver operating characteristic (ROC) curves for the best five gene-level biomarker predictor.
  • the x-axis represents the false-positive rate (FPR) and the y-axis the true-positive rate (TPR).
  • the area under ROC curve (AUC) tells how well the biomarkers are able to predict patient response.
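As a rough illustration of how a ROC curve and its AUC can be computed from predictor scores, the following plain-Python sketch sweeps a threshold over the scores and integrates with the trapezoidal rule. The function names `roc_points` and `auc` and the toy scores and labels are invented for this example; they are not taken from the described biomarker panel.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points obtained by sweeping a threshold over the scores.

    scores: predictor output, higher = more likely positive (assumption).
    labels: 1 for the positive class, 0 for the negative class.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    i = 0
    while i < len(ranked):
        # Samples sharing the same score are processed together (ties).
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data, invented for illustration only.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_auc = auc(roc_points(toy_scores, toy_labels))
```

An AUC of 1.0 corresponds to perfect separation of responders and non-responders, and 0.5 to a predictor no better than chance.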
  • the present invention relates to a method of stratifying cancer patients treated with platinum and taxane combination therapy to prognostic subgroups, wherein the expression level of a set of biomarkers defined in Table 3 and/or Table 4 is determined from a patient sample, and the expression level of the biomarkers is used for identifying prognostic groups of patients with distinct responses to a standardized treatment.
  • First-line therapy or first line treatment is the first therapy that will be tried in view of prevailing therapy guidelines. Its priority over other options is usually either formally recommended on the basis of clinical trial evidence for its best-available combination of efficacy, safety, and tolerability or chosen based on the clinical experience of the physician.
  • the first-line therapy might not be similarly efficient in different patient subgroups because the expected or presumed efficiency is typically based on the observed or extrapolated average outcome in the entire heterogeneous group of patients suffering of the same cancer type. Identification of patients or patient subgroups associated with poor prognosis, i.e.
  • prognosis that is poorer than said average prognosis, provides a valuable means for example for selecting and/or adjusting alternative and/or additional therapy for the patients predicted to be resistant for the current first-line therapy.
  • Patients predicted to be resistant to the platinum and taxane combination therapy, i.e. stratified to a poor prognostic subgroup as enclosed in the current invention, might hence benefit from an alternative treatment strategy equally well or, optimally, better than from said platinum and taxane combination therapy prevailingly used as first-line therapy.
  • alternative treatment strategy could be reasoned for use as first line therapy instead of or before or after the platinum and taxane combination therapy.
  • platinum-based chemotherapy drugs are generally used against advanced, metastatic forms of adenocarcinoma of the colon and rectum, small cell and non-small cell lung cancer, breast cancer, adrenocortical cancer, anal cancer, endometrial cancer, cervix cancer, non-Hodgkin lymphoma, glioblastoma, melanoma, ovarian cancer, testicular cancer, and head and neck cancers.
  • platinum based drugs are used in combination with taxane, which is a chemotherapy drug from another class.
  • the cancer treated with platinum and taxane combination therapy is preferably selected from a group consisting of ovarian cancer, breast cancer, endometrial cancer, melanoma, cervix cancer, pancreatic cancer, esophageal cancer, colorectal cancer and glioblastoma.
  • the cancer is high grade serous ovarian cancer (HGS-OvCa) or basal-like breast cancer (BL-BrCa).
  • the method of the invention comprises a) determining a set of biomarkers in patient samples obtained from patients suffering from cancer, which is treated first line with platinum and taxane combination therapy; b) determining outcome groups for cancer patients based on the expression of the set of biomarkers; wherein Poor I subgroup is associated with poor prognosis and is characterized by the transcripts and/or genes specifically co-expressed in Poor I outcome group; Poor II subgroup is associated with poor prognosis and is characterized by the transcripts and/or genes specifically co-expressed in Poor II outcome group, and the Good subgroup is associated with good prognosis and is characterized by all the Poor I and Poor II specific transcripts and/or genes with their expression in opposite directions.
  • the median overall survival times for patients in Poor I and Poor II are 45 and 50 months, respectively, whereas for patients in the Good prognostic group it is >120 months.
  • the five-year survival rates for Poor I, Poor II and Good prognostic patients are 42%, 40% and 68%, respectively.
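Median survival times and five-year survival rates of this kind are typically read off a Kaplan-Meier curve. The sketch below is a minimal plain-Python Kaplan-Meier estimator, given only to show how such quantities are derived; the function names and the toy follow-up data are hypothetical and do not reproduce the cohort figures quoted above.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times: follow-up time per patient (e.g. months).
    events: 1 if death was observed, 0 if the patient was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


def survival_at(curve, t):
    """Estimated probability of surviving beyond time t."""
    prob = 1.0
    for time, surv in curve:
        if time <= t:
            prob = surv
    return prob


def median_survival(curve):
    """First time at which the survival estimate drops to 0.5 or below."""
    for time, surv in curve:
        if surv <= 0.5:
            return time
    return None  # median not reached within follow-up


# Toy follow-up data in months, invented for illustration only.
toy_times = [10, 20, 30, 40, 50, 60]
toy_events = [1, 0, 1, 1, 0, 1]
toy_curve = kaplan_meier(toy_times, toy_events)
```

A five-year survival rate would correspond to `survival_at(curve, 60)` when times are in months; a median reported as ">120 months" corresponds to the case where `median_survival` finds no time at which the curve reaches 0.5.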
  • the outcome groups are defined as those tumors containing the same set of molecular alterations and their associated pathways.
  • a set of biomarkers comprises at least 5 biomarkers of those defined in Table 3 and/or 4. It may also comprise all the biomarkers defined in Table 3 and/or 4 or it may comprise any combination of at least 5 biomarkers defined in Table 3 and 4. In one embodiment a set of biomarkers comprises at least 5 biomarkers defined in Table 3. In another embodiment a set of biomarkers comprises at least 5 biomarkers defined in Table 4.
  • BRCA1/2 mutation status is integrated to the identified prognostic subgroups. Integration of this information leads to further distinctions of the survival groups.
  • the information of the three prognostic groups is independent of BRCA1/2 mutation status.
  • gene silencing by mutations or hypermethylation in BRCA1 and BRCA2 is known to significantly contribute to ovarian cancer risk, survival and sensitivity to estrogen based therapy.
  • the BRCA1/2 mutation status may be analysed by any suitable method known in the art.
  • the combination of the information of the three prognostic groups and BRCA1/2 dysfunction can be used to identify patients that i) are very likely to benefit from platinum and taxane treatment and ii) are very likely not to respond.
  • the ability to separate poor and good responders is crucially important for example in stratifying HGS-OvCa patients to clinical trials.
  • the identification of the biomarkers of the invention was achieved by applying PSFinder to isoform-level expression data of 180 homogeneously treated HGS-OvCa patient primary cancer samples from the TCGA repository. All patients had disseminated disease and were treated with surgery and first line platinum and taxane combination therapy. Identification of the biomarkers utilized a transcriptomic study, which may also be referred to as expression profiling, wherein the expression data, i.e. the expression level of mRNAs in a given cell population, is examined using high-throughput techniques based on DNA microarray technology.
  • the transcriptome is the set of all RNA molecules, including mRNA, rRNA, tRNA, and other non-coding RNA produced in one or a population of cells.
  • the transcriptome includes all mRNA transcripts in the cell, thus reflecting the genes that are being actively expressed at any given time.
  • PSFinder is a computational approach that uses an iterative rule-based approach to search for co-expressed features, which are able to divide the samples into groups that have significant association to the clinical data.
  • the iterative process is executed until there are no such features available in any subset of all samples. Each iteration round starts with seed search (Step 1), continues with initial clique collection (Step 2) and iterative clique merging (Step 3) as shown in Figure 1A.
  • Clinical data is a principal resource for most health and medical research. Clinical data is either collected during the course of ongoing patient care or as part of a formal clinical trial program. Clinical data relates to the observed symptoms and course of a disease. Clinical endpoint data relates to the outcome of the disease, e.g. survival of the patient.
  • a biomarker refers to a measured characteristic which is used as an indicator of a biological state, condition or disease.
  • a cancer biomarker refers to a substance or process that is indicative of the presence of cancer in the body.
  • a biomarker may be a molecule secreted by a tumor or a specific response of the body to the presence of cancer.
  • the term biomarker can be defined as a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention or other health care intervention.
  • a biomarker is specifically a transcript and/or a gene, whose expression level correlates with the risk or progression of cancer, or with the susceptibility of the cancer to a given treatment.
  • the transcript expression is analysed at isoform level.
  • Transcription of genes is a very dynamic process, allowing cells to adapt rapidly to external, environmental or physiological changes affecting target tissues, organs or cells.
  • the gene expression profiling allows identifying biomarkers that describe a given physiological status, a disease, an exposure to drugs, or other exogenous stimuli.
  • the expression of transcripts or genes is analysed from a cancer patient sample.
  • the sample refers to a sample from a tissue or an organ, or to a sample of a body fluid.
  • Tissue or organ samples may be obtained from any tissue or organ by, e.g., biopsy.
  • Samples of body fluids can be obtained by well-known techniques and include, for example, samples of blood, plasma, serum, or urine.
  • Separated cells may be obtained from the body fluids or the tissues or organs by separating techniques such as centrifugation or cell sorting.
  • the sample is a tumor specimen. Most preferably it is a naïve tumor specimen.
  • the expression level of a biomarker may be determined using several techniques including methods of quantifying nucleic acid encoding a target biomarker, such as PCR-based techniques, microarrays, different gene expression systems using color-coded probe pairs, RNA sequencing and Northern or Southern blotting techniques, and/or methods of quantifying protein biomarkers, such as immunological assay, western blotting, or mass spectrometry.
  • the expression levels of biomarkers in a sample are determined based on the levels of nucleic acid (i.e. DNA or mRNA transcript) encoding the target protein biomarkers in the biological sample, using for example nuclear run-on assay, RNase protection assay, ChIP-chip of RNAP, RT-PCR, DNA microarrays, in situ hybridization, MS2 tagging, Northern blot, RNA-Seq, or any other technically suitable method.
  • the present invention relates further to a panel of biomarkers for stratifying cancer patients treated first line with platinum and taxane combination therapy to prognostic subgroups.
  • This panel comprises the 61 outcome-related biomarkers on isoform-level shown in Table 3.
  • the panel further comprises 32 outcome-related biomarkers on gene-level shown in Table 4.
  • These marker panels may be used alone, in combination or in parallel.
  • the 32 characteristic genes may be used as gene-level biomarkers when isoform-level data is not available.
  • the biomarkers of the invention have been obtained from the PSFinder analysis of HGS-OvCa patients treated with platinum and taxane combination therapy.
  • the gene markers identified initially in the HGS-OvCa study are applicable directly in subgroup prediction also on other cancers treated first line with platinum and taxane combination therapy, especially on basal-like breast cancer samples.
  • the prognostic subgroup of a patient is determined by the expression status of a set of biomarkers in the biomarker panel.
  • the present invention relates to the use of the biomarkers included in Table 3 and/or 4 for identifying a prognostic subgroup of HGS-OvCa patient or BL-BrCa patient.
  • the biomarkers defined in Table 3 and 4 may be used individually or in combination. In one embodiment of the invention at least 5 biomarkers of those defined in Table 3 and/or Table 4 are used. In another embodiment at least 5 biomarkers of those defined in Table 4 are used.
  • the biomarkers of the invention can be utilized as a means for predicting patient responses to the cancer therapy, for identification of patient subgroups that are directly associated with a clinical endpoint.
  • the biomarkers facilitate a development of personalized treatment to overcome drug resistance.
  • the biomarkers may be utilized in a personalized cancer therapy, which aims to develop accurate diagnostic tests that identify patients who can benefit from targeted therapies.
  • the biomarkers may be utilized in finding alternative treatments to patients who are predicted to be resistant to the combination of platinum and taxane by identifying the related signaling pathways. Hence they are useful in the selection of first line chemotherapy.
  • the present invention relates further to a method for designing of personalized combinatorial therapy for a platinum and taxane resistant cancer patient, comprising a) determining transcript isoform expression level or gene expression level data of a set of biomarkers defined in Table 3 and/or in Table 4 from a patient sample, b) determining a patient subgroup based on the expression status of the biomarkers and c) designing the therapy regimens based on the results obtained from step b).
  • the co-expression pattern of the biomarkers defines the patient subgroup.
  • the cancer patient is high grade serous ovarian cancer (HGS-OvCa) patient or basal-like breast cancer (BL-BrCa) patient.
  • the cancer patient is HGS-OvCa patient.
  • the method comprises the additional assay of analyzing the BRCA1/2 mutation status and combining it with the data obtained from step b).
  • the good-outcome group gets the standard platinum and taxane combination therapy, whereas the treatment of Poor I and Poor II requires other alternatives and/or additional treatment.
  • the invention relates to a kit for use in evaluating the probability of survival of a patient suffering from HGS-OvCa or BL-BrCa, wherein the kit comprises the biomarkers defined in Table 3 and/or Table 4.
  • the kit may also contain means for determining the expression status of the biomarkers.
  • the kit may be used for carrying out the method of stratifying prognostic subgroups of high grade serous ovarian cancer patients or basal like breast cancer patients.
  • the kit may also be used for carrying out the method of stratifying prognostic subgroups of any other cancer patient treated with platinum and taxane combination therapy.
  • the kit may comprise e.g. a chip, such as a microarray, suitable for use in biochip technology. Also e.g. deep sequencing data with quantified expression may be used in relation to the kit.
  • the kit further comprises instructions for screening a sample taken from a subject having, or suspected of having HGS-OvCa.
  • Prognostic Subgroup Finder identifies groups of patients that 1) consist of features (e.g., transcript isoforms or genes) whose expression correlates strongly in the samples belonging to each subgroup and exhibit distinct expression patterns between patient groups, 2) have statistically significant association with the clinical data, such as survival time.
  • the overall workflow and outcome of PSFinder is shown in Figure 1.
  • PSFinder uses an iterative rule-based approach to search for co-expressed features, which are able to divide the samples into groups that have significant association to the clinical data. The iterative process is executed until there are no such features available to further divide a subset of samples into groups. Each iteration round starts with seed search (Step 1), continues with initial clique collection (Step 2) and iterative clique merging (Step 3) as shown in Figure 1. After iteration the samples at the leaf nodes form subgroups and the features used in the sample divisions are returned as the output.
  • Step 1: Finding seed features with association to the clinical data
  • the PSFinder starts with finding a fairly large set of features (seeds) that form the basis for subsequent steps.
  • the main objective of Step 1 is to discard features that do not have association to the clinical data.
  • a univariate Cox proportional hazards regression model is used on the expression (in log2 scale) for each feature (p < 0.2).
  • Each feature with discretized data is analyzed individually with Kaplan-Meier survival analysis with log-rank test, and the features that have significance below the threshold (in our case study p < 0.05) are used in Step 2.
  • the threshold for the Cox and Kaplan-Meier significance can be set by the user.
  • the default values used in this step are relaxed and p-values were not corrected for multiple testing in order to maximize sensitivity and obtain relatively large set of features for
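The seed search of Step 1 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the PSFinder implementation: the high/low discretization is taken to be a median split (the text does not say how values are discretized), the univariate Cox pre-filter is omitted, and the names `logrank_p` and `find_seeds` as well as the toy data are hypothetical.

```python
import math
from statistics import median


def logrank_p(times, events, groups):
    """Two-group log-rank test; returns the p-value (chi-square, 1 d.f.).

    groups: 0/1 label per sample. events: 1 = death observed, 0 = censored.
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for x in times if x >= t)  # at risk, both groups
        n1 = sum(1 for x, g in zip(times, groups) if x >= t and g == 1)
        d = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        d1 = sum(1 for x, e, g in zip(times, events, groups)
                 if x == t and e == 1 and g == 1)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of chi-square with 1 d.f.: P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def find_seeds(expression, times, events, alpha=0.05):
    """Keep features whose high/low split has a log-rank p-value below alpha.

    expression: dict feature_name -> list of log2 expression values.
    The median split is an assumption of this sketch.
    """
    seeds = {}
    for name, values in expression.items():
        cut = median(values)
        groups = [1 if v > cut else 0 for v in values]
        if 0 < sum(groups) < len(groups):
            p = logrank_p(times, events, groups)
            if p < alpha:
                seeds[name] = p
    return seeds


# Toy cohort, invented for illustration: six early and six late deaths.
toy_times = [5, 6, 7, 8, 9, 10, 50, 55, 60, 65, 70, 75]
toy_events = [1] * 12
toy_expression = {
    "A": [9.0, 9.1, 9.2, 9.3, 9.4, 9.5, 4.0, 4.1, 4.2, 4.3, 4.4, 4.5],
    "B": [1.0, 2.0] * 6,
}
toy_seeds = find_seeds(toy_expression, toy_times, toy_events)
```

In the toy cohort, feature "A" is high exactly in the patients who die early and is retained as a seed, while the alternating feature "B" shows no survival association and is discarded. As in the text, the p-values are not corrected for multiple testing.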
  • the main objective in Step2 is to form cliques in which all the features correlate strongly with each other and together divide samples into groups that have a significant association to clinical data.
  • PSFinder executes correlation analysis (Pearson correlation measure).
  • a Boolean matrix is generated from the pairwise correlations of features so that '1 ' denotes high correlation (
  • Features that correlate with each other form (initial) cliques.
  • Each clique is required to separate samples into two groups so that the groups have significant survival association (Kaplan-Meier and log-rank test, here p < 0.05).
  • the cliques are ranked based on the average pairwise correlation, taken over all pairs of distinct features (i, j) in the clique.
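A minimal sketch of the Step 2 bookkeeping is given below: Pearson correlations are thresholded into a Boolean matrix, a clique is grown around a seed, and the clique is scored by its average pairwise correlation. The correlation threshold is not recoverable from the text, so the value 0.7 used here is purely a placeholder; the greedy clique growth, the function names and the toy expression values are likewise assumptions of this example, and the survival check on each clique is left out.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant value lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def boolean_matrix(expression, threshold):
    """adj[a][b] is True when features a and b correlate at or above threshold."""
    names = list(expression)
    return {a: {b: a != b and pearson(expression[a], expression[b]) >= threshold
                for b in names} for a in names}


def initial_clique(seed, adj):
    """Greedily grow a clique: add a feature only if it is linked to every member."""
    clique = [seed]
    for cand in adj:
        if cand not in clique and all(adj[cand][m] for m in clique):
            clique.append(cand)
    return clique


def average_pairwise_correlation(clique, expression):
    """Mean Pearson correlation over all pairs of distinct features."""
    pairs = list(combinations(clique, 2))
    if not pairs:
        return 0.0
    return sum(pearson(expression[a], expression[b]) for a, b in pairs) / len(pairs)


# Toy expression profiles over five samples, invented for illustration.
toy_expression = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.0, 4.0, 6.0, 8.0, 10.5],
    "f3": [5.0, 3.0, 4.0, 1.0, 2.0],
}
toy_adj = boolean_matrix(toy_expression, threshold=0.7)  # placeholder threshold
toy_clique = initial_clique("f1", toy_adj)
toy_score = average_pairwise_correlation(toy_clique, toy_expression)
```

Here "f1" and "f2" form a clique, whereas "f3" is anti-correlated with both and stays outside.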
  • In Step 3, the objective is to merge the cliques from Step 2 into subgroups so that the correlation of the features inside each subgroup and the association to the clinical data are retained.
  • the workflow in Step3 is iterative and there are two sub-steps.
  • initial cliques are merged if the features in the cliques correlated strongly (
  • the cliques are then merged iteratively until feature correlation or survival association rules are violated.
  • the merged cliques are ranked based on average pairwise correlation scores that are calculated from the expression values of all the features in the merged clique.
  • the outcome of PSFinder is the merged clique based on the maximum score and all the features in the clique.
  • Linear predictor score (LPS) to predict the most likely group for a sample into
  • … φ(LPS_{Poor I, Good}(X); μ_{Poor I}, σ_{Poor I}) + φ(LPS_{Poor I, Good}(X); μ_{Good}, σ_{Good}) … performing comparisons 1) and 3), the sample is classified to the Poor I type if P(sample ∈ Poor I) > P(sample ∈ Good) and P(sample ∈ Poor I) > P(sample ∈ Poor II). In cases where a sample cannot be clearly assigned to any of the groups, the sample is classified as "unknown."
  • Exon array data (level 1; Affymetrix Human Exon 1.0 ST v2) and associated clinical information of two separate HGS-OvCa cohorts were downloaded from The Cancer Genome Atlas (TCGA) repository (Cancer Genome Atlas Research Network, 2011).
  • The first cohort included 180 patients treated with the standard platinum and taxane combination, and the second included 327 ovarian cancer samples from patients who were given heterogeneous treatments.
  • Exon array data were processed at transcript isoform level by the Multiple Exon Array Preprocessing (MEAP) algorithm (Chen et al, 2011).
  • MEAP produced a data matrix of 55,594 rows (transcript isoforms) and 180 (discovery set) or 327 (validation set) columns (samples) with normalized data in log2-scale.
  • Treatment modality for each patient was determined based on preoperative imaging studies and diagnostic laparoscopy. Treatment response was defined with clinical examination, CA125 level and contrast-enhanced CT utilizing RECIST 1.1 criteria. Tumor and ascites samples were collected during operation. One part of each specimen was snap-frozen and another part was directly prepared for cell culture to generate primary cell lines.
  • The primers and probes were designed using Universal ProbeLibrary Assay Design Center (Roche Applied Science). To select the optimal endogenous controls for the normalization of gene expression data, we studied the expression of ACTB, GAPDH, GUSB, HPRT1, RPL19, PPIA, and TBP. Based on the SLqPCR R-package (Vandesompele et al, 2002) and BestKeeper (Pfaffl et al, 2004), PPIA and TBP were selected as the best reference genes in both MUPET patient samples and cell lines. Raw qRT-PCR Ct values were thus normalized against the geometric mean of PPIA and TBP.
  • ITGA11 integrin alpha 11 68 acgcttccacctcaaatacg tgggcttgacctcgtagtg
  • KLF16 Kruppel-like factor 16 51 cagggctgcgacaagaag gaagcgcttggagcacag
  • MMP2 gelatinase, 72kDa type IV 70 ataacctggatgccgtcgt aggcacccttgaagaagtagc
  • PPP1CC catalytic subunit gamma 34 cttatatgtagagcccatcaggtg aataattgggcgcagaaaac
  • ribosomal protein L19 46 agcgagctctttcctttcg gagcctcttctgaagcctga peptidylprolyl
  • TATA box binding protein 51 cccatgactcccatgacc tttacaaccaagattcactgtgg
  • BRCA1/2 mutation status analysis for primary cell lines
  • PSFinder-identified expression signature predicts HGS-OvCa patient response to platinum and taxane combination therapy
  • PSFinder identified 61 transcripts on isoform-level for Poor I, Poor II and Good prognostic patient characterization.
  • Transcript IDs and gene names are based on Ensembl version 64.
  • Median log2-scale expression of transcripts in each patient group is shown in columns “Poor I”, “Poor II” and “Good”, where prognostic-group-specific features are marked with an up-arrow or down-arrow symbol for up-regulated or down-regulated features, respectively.
  • A two-sided t-test and the median log2-scale fold change (log2FC) were calculated for each feature between Poor I and Good, and between Poor II and Good. P-values are shown by *** (p < 0.001), ** (p < 0.01), * (p < 0.05) or empty (p ≥ 0.05).
  • PSFinder identified 32 genes for Poor I, Poor II and Good prognostic patient characterization by gene-level. Genes characterizing Poor I or Poor II groups are annotated in the "Group" column.
  • the first validation cohort consists of transcript level data from 327 ovarian cancer patients treated with heterogeneous treatment regimens available in the TCGA repository.
  • The predicted Poor I and Poor II samples are associated with short overall survival and the predicted Good samples are associated with good overall survival.
  • The log-rank test p = 0.046.
  • Optimization of biomarker panel for predicting the survival of HGS-OvCa patients
  • Verhaak RG, Tamayo P, Yang JY, Hubbard D, Zhang H, Creighton CJ, Fereday S, Lawrence M, Carter SL, Mermel CH, Kostic AD, Etemadmoghadam D, Saksena G, Cibulskis K, Duraisamy S, Levanon K, Sougnez C, Tsherniak A, Gomez S, Onofrio R et al (2013) Prognostically relevant gene signatures of high-grade serous ovarian carcinoma. The Journal of clinical investigation 123: 517-525
  • Hynninen J, Auranen A, Carpen O, Dean K, Seppanen M, Kemppainen J, Lavonius M, Lisinen I, Virtanen J, Grenman S (2012) FDG PET/CT in staging of advanced epithelial ovarian cancer: frequency of supradiaphragmatic lymph node metastasis challenges the traditional pattern of disease spread. Gynecologic oncology 126: 64-68.
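
The linear predictor score (LPS) classification listed above can be illustrated with a minimal Python sketch. It assumes the common formulation in which the LPS is a weighted sum of a sample's expression values and each group's LPS distribution is modelled as a normal density; the function names, the weights and the group parameters are hypothetical and are not taken from the source. Only the Poor I decision rule is stated in the text; the Poor II and Good rules below are written by analogy and should be read as an assumption.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of the normal distribution N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def lps(expression, weights):
    """Linear predictor score: weighted sum of one sample's expression values.

    The weights are hypothetical here (for example per-feature t-statistics).
    """
    return sum(w * x for w, x in zip(weights, expression))


def pairwise_probability(score, group_a, group_b):
    """P(sample in A) for an A-versus-B comparison.

    group_a and group_b are (mean, sd) of the LPS in each training group.
    """
    dens_a = normal_pdf(score, *group_a)
    dens_b = normal_pdf(score, *group_b)
    return dens_a / (dens_a + dens_b)


def classify(p_poor1_vs_good, p_poor2_vs_good, p_poor1_vs_poor2):
    """Combine three pairwise comparisons into one label.

    Within one comparison, P(A) > P(B) is equivalent to P(A) > 0.5.
    The Poor II and Good rules are assumed by analogy with the Poor I rule.
    """
    if p_poor1_vs_good > 0.5 and p_poor1_vs_poor2 > 0.5:
        return "Poor I"
    if p_poor2_vs_good > 0.5 and p_poor1_vs_poor2 < 0.5:
        return "Poor II"
    if p_poor1_vs_good < 0.5 and p_poor2_vs_good < 0.5:
        return "Good"
    return "unknown"
```

A sample whose pairwise results contradict each other (for example, favoured over Good in one comparison but not favoured in the remaining two) falls through to "unknown", which mirrors the fallback described above.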

Abstract

The present invention relates to molecular profiling in order to identify clinically applicable biomarkers for stratifying cancer patients having different responses to a certain therapy. Specifically, the present invention provides clinically applicable methods and means for predicting the response to the platinum and taxane combination therapy in cancer patients, especially in high-grade serous ovarian cancer (HGS-OvCa) and basal-like breast cancer (BL-BrCa) patients. The method of the invention comprises determining the set of biomarkers in patient samples obtained from patients suffering from HGS-OvCa; and determining outcome groups for cancer patients based on the expression of the set of biomarkers. A panel of biomarkers and a kit are also provided. The invention provides substantial help in clinical decision making and is especially useful in the selection of a first line chemotherapy regimen.

Description

OVARIAN CANCER PROGNOSTIC SUBGROUPING
FIELD OF THE INVENTION
The present invention relates to molecular profiling in order to identify clinically applicable biomarkers for predicting cancer patients' response to a specific therapy regimen. The present invention particularly relates to cancer patients treated first line with platinum and taxane combination therapy. The invention provides substantial help in clinical decision making and is especially useful in the selection of a first line chemotherapy regimen.
BACKGROUND OF THE INVENTION
Choosing an effective treatment for a cancer patient is one of the most important and difficult decisions in the clinic. While histological and some molecular markers facilitate treatment choice, a large number of cancers still cannot be optimally managed, resulting in adverse side-effects and health costs without benefit. An example of such a disease is high-grade serous ovarian cancer (HGS-OvCa), the most common ovarian cancer subtype, with an overall 5-year survival rate of 35 - 40% [Berns EM and Bowtell DD, 2012]. Although HGS-OvCa is a single morphological and clinical entity, its molecular features are diverse. The genomic alterations are very complex and there are no common driver mutations, or other known targetable mutations, that could serve as uniform therapeutic targets.
The standard treatment for HGS-OvCa is platinum and taxane combination therapy. While most patients at the advanced stage of the disease initially respond to chemotherapy, the treatment is very seldom curative and the majority suffer relapse within 18 months. The development of platinum resistance is a complex and poorly understood process. There are currently very few clinically relevant treatment options available for platinum resistant ovarian cancer patients. Thus, biomarkers for alternative treatments that would allow the design of personalized combinatorial therapy for platinum and taxane resistant cancer patients are urgently needed.
Morphologically similar cancers originating from the same organ can be remarkably heterogeneous at the molecular level, whereas some cancers originating from different organs can show many molecular commonalities, indicating a related etiology and similar therapeutic opportunities. Discovery of cancer prognostic subgroups is of vital importance for personalized cancer therapy. Thus, establishing effective diagnosis, prognosis and treatment options requires coordinated efforts to gather large numbers of clinical samples with a wide range of molecular and clinical data, such as The Cancer Genome Atlas (TCGA) project [Cancer Genome Atlas Research Network, 2011] or the Australian Ovarian Cancer Study (AOCS) [Tothill R. W. et al., 2008]. AOCS and TCGA expression data have been used to identify four subtypes in HGS-OvCa [Tothill R. W. et al., 2008, Verhaak R. G et al., 2013]. Patients classified to any of these subtypes can, however, still have different responses to platinum and taxane combination therapy.
Characterization of cancer patient subgroups from massive high-throughput data for personalized medicine requires cost-efficient systematic approaches. Most of the existing approaches for identifying cancer subgroups use a two-step approach, where the first step is to use computational methods, such as unsupervised learning, Bayesian methods or dimension reduction, for identifying clusters using molecular data. This is followed by assessment of the biological or medical relevance of the identified clusters using Gene Ontology, pathway or clinical data. The major limitation of this approach is that clusters formed without fusing clinical data with molecular data typically have weak or no association to outcome or disease relevant biological processes. Such clusters may also contain hundreds of genes that are loosely correlated with each other, making the choice of putative biomarkers challenging.
So far, patient groups with prognostic association in ovarian cancer have been mostly identified on gene-level but not yet on isoform-level. During the evolution of mammals, isoform-level signatures are often lineage-specific, rather than tissue-specific as shown by gene-level signatures [Merkin J et al., 2012]. Because the evolution of mammals and the adaptation of cancer cells to chemotherapy share similar principles [Merlo L et al., 2006, Ostrow S et al., 2014, Greaves M and Maley CC, 2012], isoform-level signatures may become useful biomarkers for cancer progression. The treatment of ovarian cancer remains challenging despite many advances in therapeutic options. There is still room for examining better ways of using known drugs. To date, no applicable biomarkers have been established that characterize the cancer patient subtypes and are able to predict whether a patient is likely to be sensitive to the treatment. The search for such markers is especially challenging in HGS-OvCa, which is characterized by extensive copy number alterations and a low level of prevalent somatic mutations. Thus, identifying reliable patient subgroups and predictive biomarkers that allow directing chemotherapy to responsive ovarian cancer patients is crucially needed.
BRIEF DESCRIPTION OF THE INVENTION
An object of the present invention is thus to provide a method for stratifying cancer patients having different responses to a certain therapy. Specifically, an object of the present invention is to provide clinically applicable methods and means for predicting the response to the first line platinum and taxane combination therapy in cancer, especially in high-grade serous ovarian cancer (HGS-OvCa) and basal-like breast cancer (BL-BrCa). A further object of the invention is to provide means for developing alternative treatment strategies and for selecting patients for future therapeutic trials.
The objects of the invention are achieved by a method of stratifying cancer patients to prognostic subgroups, a panel of biomarkers, a method for designing personalized combinatorial therapy for platinum and taxane resistant cancer patients, and a kit, which are characterized by what is stated in the independent claims. Specifically, the present invention provides a set of features that stratify patients to prognostic groups and result in the identification of patient groups that are directly associated with a clinical endpoint. Moreover, the invention provides a panel of co-expressed features that are applicable as predictive biomarkers. The present invention is highly applicable in HGS-OvCa and BL-BrCa. The preferred embodiments of the invention are disclosed in the dependent claims.
The invention is based on a novel computational approach called Prognostic Subgroup Finder (PSFinder), which stratifies patients to prognostic groups using co-expressed isoform- and gene-level markers identified from large-scale transcriptomics and clinical data. PSFinder was applied to HGS-OvCa patients treated with the standard platinum and taxane combination therapy from the TCGA HGS-OvCa cohort, among others. This approach identified three prognostic groups stratified by 61 transcript isoform-level markers and 32 gene-level markers: two with poor response and one with good response to platinum and taxane chemotherapy. The prognostic value was further increased when the BRCA1/2 mutation status was used in addition to subgroup information. It was further discovered that the invention may be extended to other cancers, for example to BL-BrCa, wherein the patient receives platinum and taxane combination therapy as a first line treatment.
The novel method of the invention provides an easily applicable and fast way of identifying therapy resistant/sensitive patients. The method is used in predicting the outcome of platinum and taxane combination therapy at the treatment-naive stage. The method enables identification of cancer patients with poor and good response to the first line therapy using expression profiles and BRCA1/2 status. In one embodiment the invention enables identification of ovarian cancer patients with poor and good response to first line therapy using expression profiles and BRCA1/2 status. This approach can be useful in selecting the first line chemotherapy regimen in the future.
The invention provides a restricted set of previously unknown isoform-level biomarkers. In addition it provides a panel of gene-level markers. The biomarkers in these panels may be used, individually or in combination with the BRCA mutation status, for stratification of cancer patients treated with the state-of-the-art platinum and taxane combination, especially HGS-OvCa and BL-BrCa patients. Identification of the extreme responders and primarily chemoresistant patients can help to develop alternative treatment strategies and to select patients for future therapeutic trials.
For a personalized therapy it is necessary to stratify cancer patients into homogeneous subgroups according to which molecular alterations their tumors exhibit. The invention enables functional analysis of subgroup-specific genes, revealing the molecular signaling pathways that lead to different prognosis in subgroups of the cancer. These may lead to therapeutic benefits in designing therapy for cancer and allow designing personalized combinatorial therapy for platinum-taxane resistant patients. Moreover, the present invention provides means for developing accurate diagnostic tests that identify patients who can benefit from targeted therapies. The invention provides advantages not only for individual patient care but also for better selection and stratification in clinical trials and molecular studies.
BRIEF DESCRIPTION OF THE DRAWINGS
In the following the invention will be described in greater detail by means of preferred embodiments with reference to the attached drawings, in which
Figure 1. shows the overall workflow of PSFinder and the outcome by using isoform-level expression and clinical data from 180 HGS-OvCa patient samples. PSFinder is an iterative rule-based approach. Inputs consist of feature (transcript isoforms or genes) expression data from a cohort of samples and survival times of the sample donors. In Step 1, PSFinder identifies individual features whose discretized expression values (high/low) are associated with a statistically significant survival effect (Kaplan-Meier with log-rank test p < 0.05). These are used as seeds in Step 2, where highly correlated transcripts (|r| ≥ 0.6 in our case study) form initial cliques. In Step 3, PSFinder iteratively merges initial cliques into larger cliques. In this step, the merging iterations are continued until either 1) the correlation between the expression profiles or 2) the survival association in the merged clique falls under the threshold. The outcome of PSFinder is the merged clique based on the maximum score and all the features in the clique.
Figure 2. shows outcome of PSFinder using isoform-level expression data from 180 high-grade serous ovarian cancer patients treated with platinum and taxane combination therapy. (A) Expression profile of isoform-level markers in Poor I, Poor II and Good prognostic patients. In the heatmap, rows correspond to transcript isoforms and columns to samples. PSFinder identified two poor prognosis groups, Poor I and Poor II, and one Good prognosis group. (B) Kaplan-Meier survival curves on overall 5-year survival for Poor I, Poor II and Good prognosis patients identified from isoform-level data. The log-rank test p = 0.007.
Figure 3. shows prognostic outcome prediction in 29 Mupet HGS-OvCa prospective patient cohorts with isoform-specific expression from Taqman qRT-PCR. (A) Patient prognostic outcome prediction with nine isoform markers. Markers and pathway genes are shown on the x-axis in this heat map, with pathway genes being indicated with asterisks (*). Patients predicted to be Poor I, Poor II and Good prognosis are indicated on the y-axis. (B) Kaplan-Meier survival curves on time to progression in months for predicted poor and good prognostic patients in the 29 Mupet HGS-OvCa patient cohorts. The significance was measured with the log-rank test (p = 0.03). (C) Expression boxplot of validated characteristic markers in Poor I, Poor II and Good prognostic patients. (D) Expression boxplot of validated pathway genes in Poor I, Poor II and Good prognostic patients.
Figure 4. shows association of BRCA1/2 mutation and PSFinder identified prognosis groups with patients' survival. (A) BRCA1/2 mutation status and patient prognosis in the homogeneously treated TCGA cohort. (B) Kaplan-Meier survival curves for patients in the discovery set given platinum and taxane combination therapy with 1) poor prognosis and wild-type BRCA1/2, 2) poor prognosis and mutant BRCA1/2, 3) good prognosis and wild-type BRCA1/2, 4) good prognosis and mutant BRCA1/2. The log-rank test p is 0.0002. (C) Kaplan-Meier survival curves for patients in the validation set with 1) poor prognosis and wild-type BRCA1/2, 2) poor prognosis and mutant BRCA1/2, 3) good prognosis and wild-type BRCA1/2, 4) good prognosis and mutant BRCA1/2. The log-rank test p is 0.019.
Figure 5. Shows outcome of PSFinder using gene-level expression data from 180 high-grade serous ovarian cancer patients treated with platinum and taxane combination therapy.
Figure 6. Predicted poor and good prognostic patients using isoform-level markers in TCGA ovarian cancer validation set (N=327).
Figure 7. Kaplan-Meier survival curves on time to progression in months for Poor I, Poor II and Good Mupet HGS-OvCa patients.
Figure 8. Predicted poor and good prognostic patients from AOCS, Dressman, Yoshihara, GSE30161 and Crijns cohorts on gene-level.
Figure 9. Kaplan-Meier survival curves for BRCA1/2 wild-type and BRCA1/2 mutated patients from the 180 homogeneously treated TCGA HGS-OvCa cohort.
Figure 10. The application (LPS prediction) of HGS-OvCa study identified markers (Table 4) to basal-like breast cancer samples. The Kaplan-Meier analysis with log-rank test revealed statistically significant survival associations (p < 0.05) between the three groups (Poor I, Poor II and Good) also in basal-like breast cancer. The basal-like breast cancer samples (n=96) were fetched from TCGA. The heatmap shows the expression values for 32 genes identified by PSFinder in HGS-OvCa in basal-like breast cancer samples. The Poor I (red color in side bar), Poor II (blue) and Good (green) subtypes are marked to the heatmap.
Figure 11. Optimization of biomarker panel for predicting the survival of HGS-OvCa patients. Receiver operating characteristic (ROC) curves for the best five gene-level biomarker predictor. In the ROC figure the x-axis represents the false-positive rate (FPR) and the y-axis the true-positive rate (TPR). The area under the ROC curve (AUC) indicates how well the biomarkers are able to predict patient response.
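
As an illustration of the ROC quantities named in the description of Figure 11, the following minimal Python sketch computes (FPR, TPR) points by sweeping a threshold over predictor scores and integrates them with the trapezoidal rule. The scores and labels are invented toy values, not data from the study, and the function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points obtained by sweeping a threshold over scores.

    labels: 1 for the positive class, 0 for the negative class.
    Both classes must be present, otherwise the rates are undefined.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Toy example (invented values): six samples, three in each class.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
toy_auc = auc(roc_points(scores, labels))
```

An AUC of 0.5 corresponds to a predictor that ranks the two classes no better than chance, and 1.0 to a perfect ranking.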
DETAILED DESCRIPTION OF THE INVENTION
The present invention relates to a method of stratifying cancer patients treated with platinum and taxane combination therapy to prognostic subgroups, wherein the expression level of a set of biomarkers defined in Table 3 and/or Table 4 is determined from a patient sample, and the expression level of the biomarkers is used for identifying prognostic groups of patients with distinct responses to a standardized treatment.
First-line therapy or first line treatment is the first therapy that will be tried in view of prevailing therapy guidelines. Its priority over other options is usually either formally recommended on the basis of clinical trial evidence for its best-available combination of efficacy, safety, and tolerability or chosen based on the clinical experience of the physician. However, the first-line therapy might not be similarly efficient in different patient subgroups because the expected or presumed efficiency is typically based on the observed or extrapolated average outcome in the entire heterogeneous group of patients suffering from the same cancer type. Identification of patients or patient subgroups associated with poor prognosis, i.e. prognosis that is poorer than said average prognosis, provides a valuable means for example for selecting and/or adjusting alternative and/or additional therapy for the patients predicted to be resistant to the current first-line therapy. Patients predicted to be resistant to the platinum and taxane combination therapy, i.e. stratified to a poor prognostic subgroup as disclosed in the current invention, might hence benefit from an alternative treatment strategy equally well or, optimally, better than from said platinum and taxane combination therapy prevailingly used as first-line therapy. For such patients an alternative treatment strategy could be justified for use as first line therapy instead of, before or after the platinum and taxane combination therapy.
The various platinum-based chemotherapy drugs are generally used against advanced, metastatic forms of adenocarcinoma of the colon and rectum, small cell and non-small cell lung cancer, breast cancer, adrenocortical cancer, anal cancer, endometrial cancer, cervix cancer, non-Hodgkin lymphoma, glioblastoma, melanoma, ovarian cancer, testicular cancer, and head and neck cancers. For efficient treatment of these cancers, platinum based drugs are used in combination with taxane, which is a chemotherapy drug from another class. In the present invention the cancer treated with platinum and taxane combination therapy is preferably selected from a group consisting of ovarian cancer, breast cancer, endometrial cancer, melanoma, cervix cancer, pancreatic cancer, esophageal cancer, colorectal cancer and glioblastoma. Most preferably the cancer is high grade serous ovarian cancer (HGS-OvCa) or basal-like breast cancer (BL-BrCa).
Specifically, the method of the invention comprises a) determining a set of biomarkers in patient samples obtained from patients suffering from cancer, which is treated first line with platinum and taxane combination therapy; b) determining outcome groups for cancer patients based on the expression of the set of biomarkers; wherein the Poor I subgroup is associated with poor prognosis and is characterized by the transcripts and/or genes specifically co-expressed in the Poor I outcome group; the Poor II subgroup is associated with poor prognosis and is characterized by the transcripts and/or genes specifically co-expressed in the Poor II outcome group; and the Good subgroup is associated with good prognosis and is characterized by all the Poor I and Poor II specific transcripts and/or genes with their expression in opposite directions. The median overall survival times for patients in Poor I and Poor II are 45 and 50 months, respectively, whereas for patients in the Good prognostic group it is >120 months. The five-year survival rates for Poor I, Poor II and Good prognostic patients are 42%, 40% and 68%, respectively. The outcome groups are defined as those tumors containing the same set of molecular alterations and their associated pathways.
A set of biomarkers comprises at least 5 biomarkers of those defined in Table 3 and/or 4. It may also comprise all the biomarkers defined in Table 3 and/or 4 or it may comprise any combination of at least 5 biomarkers defined in Table 3 and 4. In one embodiment a set of biomarkers comprises at least 5 biomarkers defined in Table 3. In another embodiment a set of biomarkers comprises at least 5 biomarkers defined in Table 4.
In one embodiment of the invention, BRCA1/2 mutation status is integrated with the identified prognostic subgroups. Integration of this information leads to further distinctions of the survival groups. The information of the three prognostic groups is independent of BRCA1/2 mutation status. Gene silencing by mutation or hypermethylation of BRCA1 and BRCA2 is known to significantly contribute to ovarian cancer risk, survival and sensitivity to estrogen based therapy. The BRCA1/2 mutation status may be analysed by any suitable method known in the art. The combination of the information of the three prognostic groups and BRCA1/2 dysfunction can be used to identify patients that i) are very likely to benefit from platinum and taxane treatment and ii) are very likely not to respond. The ability to separate poor and good responders is crucially important for example in stratifying HGS-OvCa patients to clinical trials. The identification of the biomarkers of the invention was achieved by applying PSFinder to isoform-level expression data of 180 homogeneously treated HGS-OvCa patient primary cancer samples from the TCGA repository. All patients had disseminated disease and were treated with surgery and first line platinum and taxane combination therapy. Identification of the biomarkers utilized a transcriptomic study, which may also be referred to as expression profiling, wherein the expression data, i.e. the expression level of mRNAs in a given cell population, is examined using high-throughput techniques based on DNA microarray technology. The transcriptome is the set of all RNA molecules, including mRNA, rRNA, tRNA, and other non-coding RNA, produced in one cell or a population of cells. The transcriptome includes all mRNA transcripts in the cell, thus reflecting the genes that are being actively expressed at any given time.
In addition to the isoform-level expression data from the ovarian cancer project in The Cancer Genome Atlas (TCGA) (Cancer Genome Atlas Research Network, Nature 474, 609, Jun 30, 2011), the present study has utilized both isoform and gene level expression data from the Australian Ovarian Cancer Study (AOCS) (R. W. Tothill et al., Clin Cancer Res 14, 5198 Aug 15, 2008), the Japanese Ovarian Cancer Study (Yoshihara K et al., 2010), ovarian cancer cohorts from Dressman's study (Dressman et al, 2007) and a study by Ferriss JS et al., 2012. PSFinder is a computational approach that uses an iterative rule-based approach to search for co-expressed features, which are able to divide the samples into groups that have significant association to the clinical data. The iterative process is executed until there are no such features available in any subset of all samples. Each iteration round starts with seed search (Step1), continues with initial clique collection (Step2) and ends with iterative clique merging (Step3), as shown in Figure 1A. When there are no features left that are able to divide the samples with significant survival association to the clinical data, samples at the leaf nodes form each subgroup and the features used in dividing the samples are returned as the output.
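
The initial clique collection (Step2) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the PSFinder implementation: two features are treated as linked when the absolute Pearson correlation is at least 0.6, a clique is grown greedily from a seed rather than by exhaustive enumeration, and the survival check (Kaplan-Meier with log-rank test) required of each clique is omitted. All function names and the toy expression profiles are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def boolean_matrix(profiles, threshold=0.6):
    """Map each feature pair to True when |r| >= threshold (assumed rule)."""
    return {
        frozenset((a, b)): abs(pearson(profiles[a], profiles[b])) >= threshold
        for a, b in combinations(profiles, 2)
    }


def grow_clique(seed, profiles, linked):
    """Greedily add features that are linked to every current member."""
    clique = [seed]
    for candidate in profiles:
        if candidate != seed and all(
            linked[frozenset((candidate, member))] for member in clique
        ):
            clique.append(candidate)
    return clique


def average_pairwise_correlation(clique, profiles):
    """Ranking score: mean |r| over all feature pairs (i, j), i < j."""
    pairs = list(combinations(clique, 2))
    if not pairs:
        return 0.0
    return sum(abs(pearson(profiles[a], profiles[b])) for a, b in pairs) / len(pairs)


# Toy log2 expression profiles for three hypothetical transcripts.
profiles = {
    "t1": [1.0, 2.0, 3.0, 4.0],
    "t2": [2.0, 4.0, 6.0, 8.0],
    "t3": [4.0, 1.0, 3.0, 2.0],
}
linked = boolean_matrix(profiles)
clique = grow_clique("t1", profiles, linked)
score = average_pairwise_correlation(clique, profiles)
```

In this toy data t1 and t2 are perfectly correlated while t3 correlates only weakly with both, so the clique seeded at t1 contains t1 and t2 only.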
Clinical data is a principal resource for most health and medical research. Clinical data is either collected during the course of ongoing patient care or as part of a formal clinical trial program. Clinical data relates to the observed symptoms and course of a disease. Clinical endpoint data relates to the outcome of the disease, e.g. survival of the patient.
A biomarker, or a biological marker, refers to a measured characteristic which is used as an indicator of a biological state, condition or disease. A cancer biomarker refers to a substance or process that is indicative of the presence of cancer in the body. A biomarker may be a molecule secreted by a tumor or a specific response of the body to the presence of cancer. The term biomarker can be defined as a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention or other health care intervention.
In the present context, a biomarker is specifically a transcript and/or a gene, whose expression level correlates with the risk or progression of cancer, or with the susceptibility of the cancer to a given treatment. The transcript expression is analysed at isoform level. Transcription of genes is a very dynamic process, allowing cells to adapt rapidly to external, environmental or physiological changes affecting target tissues, organs or cells. Gene expression profiling allows identifying biomarkers that describe a given physiological status, a disease, an exposure to drugs, or other exogenous stimuli.
In the present invention the expression of transcripts or genes is analysed from a cancer patient sample. The sample refers to a sample from a tissue or an organ, or to a sample of a body fluid. Tissue or organ samples may be obtained from any tissue or organ by, e.g., biopsy. Samples of body fluids can be obtained by well-known techniques and include, for example, samples of blood, plasma, serum, or urine. Separated cells may be obtained from the body fluids or the tissues or organs by separating techniques such as centrifugation or cell sorting. Preferably the sample is a tumor specimen. Most preferably it is a naïve tumor specimen.
Once a suitable sample is obtained, it is analyzed to determine the expression level of a set of biomarkers in the sample. As a person skilled in the art will appreciate, the expression level of a biomarker may be determined using several techniques including methods of quantifying nucleic acid encoding a target biomarker, such as PCR-based techniques, microarrays, different gene expression systems using color-coded probe pairs, RNA sequencing and Northern or Southern blotting techniques, and/or methods of quantifying protein biomarkers, such as immunological assay, western blotting, or mass spectrometry. Preferably the expression levels of biomarkers in a sample are determined based on the levels of nucleic acid (i.e. DNA or mRNA transcript) encoding the target protein biomarkers in the biological sample, using for example a nuclear run-on assay, RNase protection assay, ChIP-chip of RNAP, RT-PCR, DNA microarrays, in situ hybridization, MS2 tagging, Northern blot, RNA-Seq, or any other technically suitable method.
The present invention relates further to a panel of biomarkers for stratifying cancer patients treated first line with platinum and taxane combination therapy to prognostic subgroups. Especially the invention relates to a panel of biomarkers for stratifying high grade serous ovarian cancer (HGS-OvCa) patients or basal-like breast cancer (BL-BrCa) patients to prognostic subgroups. This panel comprises the 61 outcome-related biomarkers on isoform-level shown in Table 3. The panel further comprises 32 outcome-related biomarkers on gene-level shown in Table 4. These marker panels may be used alone, in combination or in parallel. For example, the 32 characteristic genes may be used as gene-level biomarkers when isoform-level data is not available.
The biomarkers of the invention have been obtained from the PSFinder analysis of HGS-OvCa patients treated with platinum and taxane combination therapy. The gene markers identified initially in the HGS-OvCa study are directly applicable in subgroup prediction also in other cancers treated first line with platinum and taxane combination therapy, especially in basal-like breast cancer samples. The prognostic subgroup of a patient is determined by the expression status of a set of biomarkers in the biomarker panel.
In one aspect the present invention relates to the use of the biomarkers included in Table 3 and/or 4 for identifying a prognostic subgroup of an HGS-OvCa patient or a BL-BrCa patient. The biomarkers defined in Table 3 and 4 may be used individually or in combination. In one embodiment of the invention at least 5 biomarkers of those defined in Table 3 and/or Table 4 are used. In another embodiment at least 5 biomarkers of those defined in Table 4 are used.
The biomarkers of the invention can be utilized as a means for predicting patient responses to the cancer therapy and for identifying patient subgroups that are directly associated with a clinical endpoint. The biomarkers facilitate the development of personalized treatment to overcome drug resistance. In one aspect the biomarkers may be utilized in a personalized cancer therapy, which aims to develop accurate diagnostic tests that identify patients who can benefit from targeted therapies. Furthermore the biomarkers may be utilized in finding alternative treatments for patients who are predicted to be resistant to the combination of platinum and taxane by identifying the related signaling pathways. Hence they are useful in the selection of first line chemotherapy.
The present invention relates further to a method for designing a personalized combinatorial therapy for a platinum and taxane resistant cancer patient, comprising a) determining transcript isoform expression level or gene expression level data of a set of biomarkers defined in Table 3 and/or in Table 4 from a patient sample, b) determining a patient subgroup based on the expression status of the biomarkers and c) designing the therapy regimens based on the results obtained from step b). The co-expression pattern of the biomarkers defines the patient subgroup. In a preferred embodiment the cancer patient is a high grade serous ovarian cancer (HGS-OvCa) patient or a basal-like breast cancer (BL-BrCa) patient. In a most preferred embodiment the cancer patient is an HGS-OvCa patient. In one embodiment the method comprises the additional assay of analyzing the BRCA1/2 mutation status and combining it with the data obtained from step b). The good-outcome group gets the standard platinum and taxane combination therapy, whereas the treatment of Poor I and Poor II requires other alternatives and/or additional treatment.
In one aspect the invention relates to a kit for use in evaluating the probability of survival of a patient suffering from HGS-OvCa or BL-BrCa, wherein the kit comprises the biomarkers defined in Table 3 and/or Table 4. The kit may also contain means for determining the expression status of the biomarkers. The kit may be used for carrying out the method of stratifying prognostic subgroups of high grade serous ovarian cancer patients or basal-like breast cancer patients. The kit may also be used for carrying out the method of stratifying prognostic subgroups of any other cancer patient treated with platinum and taxane combination therapy. The kit may comprise e.g. a chip, such as a microarray, suitable for use in biochip technology. Also e.g. deep sequencing data with quantified expression may be used in relation to the kit. In certain embodiments, the kit further comprises instructions for screening a sample taken from a subject having, or suspected of having, HGS-OvCa.
It will be obvious to a person skilled in the art that, as the technology advances, the inventive concept can be implemented in various ways. The invention and its embodiments are not limited to the examples described but may vary within the scope of the claims.
EXAMPLES
Materials and methods
PSFinder approach
Prognostic Subgroup Finder (PSFinder) identifies groups of patients that 1) are defined by features (e.g., transcript isoforms or genes) whose expression correlates strongly in the samples belonging to each subgroup and exhibits distinct expression patterns between patient groups, and 2) have a statistically significant association with the clinical data, such as survival time. The overall workflow and outcome of PSFinder are shown in Figure 1.
PSFinder uses an iterative rule-based approach to search for co-expressed features which are able to divide the samples into groups that have a significant association with the clinical data. The iterative process is executed until there are no such features available to further divide a subset of samples into groups. An iteration round starts with seed search (Step1), continues with initial clique collection (Step2) and iterative clique merging (Step3) as shown in Figure 1. After iteration the samples at the leaf nodes form subgroups and the features used in the sample divisions are returned as the output.
Step1: Finding seed features with association to the clinical data
PSFinder starts with finding a fairly large set of features (seeds) that form the basis for subsequent steps. The main objective of Step1 is to discard features that do not have an association with the clinical data. First, a univariate Cox proportional hazards regression model is fitted to the expression (in log2-scale) of each feature, and features with p < 0.2 are retained. The features passing the Cox analysis are then discretized into k groups using the k-means clustering method. In our case study, we focus on high and low expression values and thus k = 2. Each feature with discretized data is analyzed individually with Kaplan-Meier survival analysis with log-rank test, and the features that have significance below the threshold (in our case study p < 0.05) are used in Step2. The thresholds for the Cox and Kaplan-Meier significance can be set by the user. The default values used in this step are relaxed and p-values were not corrected for multiple testing in order to maximize sensitivity and obtain a relatively large set of features for further steps.
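The discretization and log-rank parts of Step1 can be illustrated with a minimal sketch. This is not the PSFinder implementation: the function names `two_means_1d` and `logrank_p` and the toy data are assumptions made for illustration, the Cox pre-filter is omitted, and the log-rank p-value uses the standard chi-square approximation with one degree of freedom.

```python
import math

def two_means_1d(values, max_iter=100):
    """Discretize one feature into low (0) and high (1) with k-means, k = 2."""
    lo, hi = min(values), max(values)  # initial centroids (an assumption)
    labels = None
    for _ in range(max_iter):
        new_labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        if new_labels == labels:  # assignments are stable, so stop
            break
        labels = new_labels
        low = [v for v, g in zip(values, labels) if g == 0]
        high = [v for v, g in zip(values, labels) if g == 1]
        if low:
            lo = sum(low) / len(low)
        if high:
            hi = sum(high) / len(high)
    return labels

def logrank_p(times, events, groups):
    """Two-group log-rank test; events are 1 (death) or 0 (censored)."""
    observed = expected = variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = [g for tt, g in zip(times, groups) if tt >= t]
        n = len(at_risk)
        n1 = sum(at_risk)  # number at risk in group 1
        d = sum(1 for tt, e in zip(times, events) if tt == t and e)
        d1 = sum(1 for tt, e, g in zip(times, events, groups)
                 if tt == t and e and g == 1)
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = (observed - expected) ** 2 / variance
    return math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function, 1 d.f.

# Toy data (hypothetical): high expression coincides with short survival.
expression = [1.0, 1.2, 0.9, 1.1, 1.0, 5.0, 5.2, 4.9, 5.1, 5.0]
months = [50, 60, 55, 65, 70, 5, 6, 7, 8, 9]
died = [1] * 10
labels = two_means_1d(expression)
is_seed = logrank_p(months, died, labels) < 0.05  # threshold from the case study
```

A feature would be kept as a seed when `is_seed` is true; in practice a survival analysis library would normally replace the hand-written test.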
Step2: Initial clique collection
With a set of seed features that have an association with the clinical data, the main objective in Step2 is to form cliques in which all the features correlate strongly with each other and together divide samples into groups that have a significant association with the clinical data. For all discretized data of features passing Step1, PSFinder executes correlation analysis (Pearson correlation measure). A Boolean matrix is generated from the pairwise correlations of features so that '1' denotes high correlation (|r| > 0.6 in our case study) and '0' low correlation (|r| < 0.6 in our case study). Features that correlate with each other form (initial) cliques. Each clique is required to separate samples into two groups so that the groups have a significant survival association (Kaplan-Meier and log-rank test, here p < 0.05). The cliques are ranked based on the average pairwise correlation scores

$$\frac{\sum_{i,j;\, i \neq j} r_{ij}}{n(n-1)}$$

using expression values of genes or isoforms within the clique, where $r_{ij}$ means the correlation coefficient value between gene/isoform $i$ and $j$, and $n$ is the number of genes/isoforms in the clique. The top cliques (25% in our analysis) are analyzed in Step3.
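As an illustration of Step2, the following sketch builds the Boolean correlation matrix and the average pairwise correlation score for a candidate clique. The helper names (`pearson`, `boolean_matrix`, `clique_score`) and the toy feature vectors are hypothetical; clique enumeration and the survival check are left out, and constant features (zero variance) are assumed to have been removed beforehand.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equally long vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)  # fails for constant vectors

def boolean_matrix(features, threshold=0.6):
    """Map each ordered feature pair to 1 if |r| > threshold, else 0."""
    return {
        (a, b): int(abs(pearson(features[a], features[b])) > threshold)
        for a in features for b in features if a != b
    }

def clique_score(features, clique):
    """Average pairwise correlation within a clique.

    Averaging over unordered pairs equals the sum over all ordered pairs
    i != j divided by n(n - 1), because r_ij = r_ji.
    """
    pairs = list(combinations(clique, 2))
    return sum(pearson(features[a], features[b]) for a, b in pairs) / len(pairs)

# Toy data (hypothetical): f1 and f2 are co-expressed, f3 is not.
features = {
    "f1": [1.0, 2.0, 3.0, 4.0],
    "f2": [2.0, 4.0, 6.0, 8.0],
    "f3": [4.0, 1.0, 3.0, 2.0],
}
matrix = boolean_matrix(features)
score = clique_score(features, ["f1", "f2"])
```

Cliques would then be ranked by `score`, and only the top fraction (25% in the case study) passed on to Step3.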
Step3: Clique merging
In Step3, the objective is to merge cliques from Step2 into subgroups so that the correlation of the features inside the subgroup and the association with the clinical data are retained. The workflow in Step3 is iterative and there are two sub-steps. First, initial cliques are merged if the features in the cliques correlate strongly (|r| > 0.6 in our case study) and the merged clique still has a significant survival association (Kaplan-Meier and log-rank test). The cliques are then merged iteratively until the feature correlation or survival association rules are violated. The merged cliques are ranked based on average pairwise correlation scores that are calculated from the expression values of all the features in the merged clique. The outcome of PSFinder is the merged clique with the maximum score and all the features in the clique.
Linear predictor score (LPS) to predict the most likely group for a sample
To predict which prognostic group (Poor I, Poor II or Good) a new sample in a validation set with expression data belongs to, we adopted a Bayesian predictor method described in (Wright et al, 2003). Briefly, a linear predictor score (LPS) for a feature vector X is formed as follows:

$$\mathrm{LPS}(X) = \sum_j a_j X_j$$

where $a_j$ is the scaling factor and $X_j$ is the expression of the feature $j$.
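The formulation presented in (Wright et al, 2003) assumes two groups, and here we have extended LPS to the three-group case via pairwise group comparison. Here, LPS were calculated three times: 1) between Poor I and Poor II, 2) between Poor II and Good, and 3) between Poor I and Good. For example, for comparison 3), the sample is assigned to Poor I type if P(sample ∈ Poor I) > P(sample ∈ Good), where

$$P(\text{sample} \in \text{Poor I}) = \frac{\phi(\mathrm{LPS}_{\text{Poor I},\text{Good}}(X);\ \mu_{\text{Poor I}}, \sigma_{\text{Poor I}})}{\phi(\mathrm{LPS}_{\text{Poor I},\text{Good}}(X);\ \mu_{\text{Poor I}}, \sigma_{\text{Poor I}}) + \phi(\mathrm{LPS}_{\text{Poor I},\text{Good}}(X);\ \mu_{\text{Good}}, \sigma_{\text{Good}})}$$

and

$$P(\text{sample} \in \text{Good}) = \frac{\phi(\mathrm{LPS}_{\text{Poor I},\text{Good}}(X);\ \mu_{\text{Good}}, \sigma_{\text{Good}})}{\phi(\mathrm{LPS}_{\text{Poor I},\text{Good}}(X);\ \mu_{\text{Poor I}}, \sigma_{\text{Poor I}}) + \phi(\mathrm{LPS}_{\text{Poor I},\text{Good}}(X);\ \mu_{\text{Good}}, \sigma_{\text{Good}})}$$

Here $\phi(x; \mu, \sigma)$ denotes the normal density function, and $\mu$ and $\sigma$ are the mean and standard deviation of the LPS values within the indicated group (Wright et al, 2003). After performing comparisons 1) and 3), the sample is classified to the Poor I type if P(sample ∈ Poor I) > P(sample ∈ Good) and P(sample ∈ Poor I) > P(sample ∈ Poor II). In cases where a sample cannot be clearly assigned to any of the groups, the sample is classified as "unknown."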
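The pairwise scheme can be sketched in a few lines. The code below is an illustrative reading of the description, not the authors' implementation: the function names, the example numbers and the rule that a group must win both of its pairwise comparisons are assumptions drawn from the text, and the scaling factors and per-group LPS means and standard deviations are taken as given inputs.

```python
import math

GROUPS = ("Poor I", "Poor II", "Good")

def lps(x, a):
    """Linear predictor score: sum over features j of a_j * X_j."""
    return sum(aj * xj for aj, xj in zip(a, x))

def normal_pdf(v, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    z = (v - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def posterior(score, mu1, sigma1, mu2, sigma2):
    """P(sample in group 1) for one two-group comparison."""
    p1 = normal_pdf(score, mu1, sigma1)
    p2 = normal_pdf(score, mu2, sigma2)
    return p1 / (p1 + p2)  # assumes the densities do not both underflow to 0

def classify(prob_first):
    """prob_first[(a, b)] is P(sample in a) from the a-versus-b comparison.

    A group is assigned only if it wins both comparisons it takes part in;
    otherwise the sample is labeled "unknown".
    """
    for group in GROUPS:
        wins = 0
        for (a, b), p in prob_first.items():
            if group == a and p > 0.5:
                wins += 1
            elif group == b and p < 0.5:
                wins += 1
        if wins == 2:
            return group
    return "unknown"

# Hypothetical posteriors for one sample from the three comparisons.
example = {
    ("Poor I", "Poor II"): 0.9,
    ("Poor II", "Good"): 0.3,
    ("Poor I", "Good"): 0.8,
}
assigned = classify(example)
```

Because each comparison involves only two groups, P(a) > P(b) is equivalent to P(a) > 0.5, which is why the sketch compares each posterior against 0.5.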
Transcript isoform level expression for TCGA ovarian cancer patients
Exon array data (level 1; Affymetrix Human Exon 1.0 ST v2) and associated clinical information of two separate HGS-OvCa cohorts were downloaded from The Cancer Genome Atlas (TCGA) repository (Cancer Genome Atlas Research, 2011). The first cohort included 180 patients treated with the standard platinum and taxane combination, and the second 327 ovarian cancer samples from patients who were given heterogeneous treatments. Exon array data were processed at transcript isoform level by the Multiple Exon Array Preprocessing (MEAP) algorithm (Chen et al, 2011). MEAP produced a data matrix of 55,594 rows (transcript isoforms) and 180 (discovery set) or 327 (validation set) columns (samples) with normalized data in log2-scale.
Gene-level expression data from validation cohorts
Affymetrix Human Genome U133 Plus 2.0 Array data of 172 of 285 high-grade serous ovarian cancer samples from the Australian Ovarian Cancer Study (AOCS) by Tothill and colleagues (Tothill et al, 2008), Agilent G4112A Array data of 84 high-grade serous ovarian cancer samples from the Japanese Ovarian Cancer Study, Affymetrix Human Genome U133 Plus 2.0 Array data of 44 high-grade serous ovarian cancer samples from XXX study and Affymetrix Human Genome U133A Array data of 110 high-grade serous ovarian cancer patients (Dressman et al, 2007) were used as validation cohorts in this study. All patients were selected based on platinum and taxane based chemotherapy and were diagnosed with serous high-grade ovarian cancer. Data were normalized with fRMA as described in the gene expression meta-analysis study of ovarian cancer (Ganzfried et al, 2013).
Prospective ovarian cancer patient cohort (MUPET cohort)
As an independent validation set we used a prospective ovarian cancer cohort consisting of 29 high-grade serous cancer (HGS-OvCa) patients treated at the Department of Obstetrics and Gynaecology, Turku University Hospital (ClinicalTrials.gov Id: NCT01276574). The study and use of all clinical material have been approved by 1) The Ethics Committee of the Hospital District of Southwest Finland (ETMK): ETMK 53/180/2009 § 238 and ETMK 69/180/2010, and 2) National Supervisory Authority for Welfare and Health (Valvira): DNRO 6550/05.01.00.06/2010 and STH507A.
Patients were treated with either primary surgery followed by six cycles of platinum and taxane based chemotherapy or three cycles of neoadjuvant chemotherapy (NACT) followed by interval debulking surgery and 3-6 chemotherapy cycles. The treatment modality for each patient was determined based on preoperative imaging studies and diagnostic laparoscopy. Treatment response was defined with clinical examination, CA125 level and contrast-enhanced CT utilizing RECIST 1.1 criteria. Tumor and ascites samples were collected during operation. One part of each specimen was snap-frozen and another part was directly prepared for cell culture to generate primary cell lines.
HGS-OvCa cell lines
Primary cell lines M019i, its cisplatin-resistant variant M019iCis, and M022i, in which no pathogenic BRCA1/2 mutations were identified, together with three established high-grade serous ovarian cancer cell lines (CAOV4, NIHOVCAR3 and TYKNU) with wild-type BRCA1/2 (Domcke et al, 2013), were used for biomarker validation.
Basal-like breast cancer samples
Exon-array data at gene-level for 96 basal-like breast cancer samples were downloaded from The Cancer Genome Atlas (TCGA) [REF: PMID: 23000897]. The 32 genes from Table 4 were used with the LPS method to see whether they have predictive power also in basal-like breast cancer.
Quantitative real time RT-PCR
RNA was isolated using TRIsure (Bioline, UK), purified with RNeasy kit (Qiagen), and reverse-transcribed to cDNA using Tetro cDNA synthesis kit (Bioline, UK) and oligo dT primers. Expression of selected markers and pathway genes was determined in triplicate samples using TaqMan qRT-PCR with an Applied Biosystems 7900HT instrument (Finnish DNA Microarray Centre, Turku Centre for Biotechnology, University of Turku, Finland).
The primers and probes (Table 1) were designed using Universal ProbeLibrary Assay Design Center (Roche Applied Science). To select the optimal endogenous controls for the normalization of gene expression data, we studied the expression of ACTB, GAPDH, GUSB, HPRT1, RPL19, PPIA, and TBP. Based on SLqPCR R-package (Vandesompele et al, 2002) and BestKeeper (Pfaffl et al, 2004), PPIA and TBP were selected as the best reference genes in both MUPET patient samples and cell lines. Raw qRT-PCR Ct values were thus normalized against the geometric mean of PPIA and TBP.
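The normalization step can be illustrated with a short sketch. It takes the sentence above literally and subtracts the geometric mean of the two reference Ct values from the target Ct; this is an assumption about the exact arithmetic, since normalization in the style of Vandesompele et al. is often applied to relative quantities rather than to raw Ct values. The function name and the Ct values are hypothetical.

```python
from statistics import geometric_mean

def normalize_ct(ct_target, ct_references):
    """Delta-Ct of a target gene against the geometric mean of reference Cts."""
    return ct_target - geometric_mean(ct_references)

# Hypothetical Ct values for one sample: target gene, then PPIA and TBP.
delta_ct = normalize_ct(25.0, [16.0, 25.0])
```

With these example values the reference level is the geometric mean of 16 and 25, so the target sits about five cycles above it.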
Table 1. Primers and probes used in TaqMan qRT-PCR. (A) Primers and probes used for target genes. (B) Primers and probes used for reference genes.

A.

| Gene code | Gene name | Probe | Forward primer | Reverse primer |
|---|---|---|---|---|
| AKT1 | v-akt murine thymoma viral oncogene homolog 1 | 69 | ggctattgtgaaggagggttg | tccttgtagccaatgaaggtg |
| AKT3 | v-akt murine thymoma viral oncogene homolog 3 | 22 | ttgctttcagggctcttgat | cataatttcttttgcatcatctgg |
| ATP1A1OS | ATP1A1 opposite strand | 87 | cgagtgaaatcgtgcatttg | tcaagccccagaagactgag |
| RP11-247A12.2.1 | | 63 | atcctcctcgtcgtcttcc | aggtatccaaagggcacaag |
| IDUA | iduronidase, alpha-L- | 4 | caggacggtaaggcgtaca | ggagccagagacagcacct |
| ITGA11 | integrin, alpha 11 | 68 | acgcttccacctcaaatacg | tgggcttgacctcgtagtg |
| KLF16 | Kruppel-like factor 16 | 51 | cagggctgcgacaagaag | gaagcgcttggagcacag |
| MAP1S | microtubule-associated protein 1S | 25 | gagctcactgctcctcgtg | caggatcgacatcccaagac |
| MMP14 | matrix metallopeptidase 14 (membrane-inserted) | 14 | ccccaagaacatcaaagtctg | ttccccttgtagaagtaagtgaaga |
| MMP2 | matrix metallopeptidase 2 (gelatinase A, 72kDa gelatinase, 72kDa type IV collagenase) | 70 | ataacctggatgccgtcgt | aggcacccttgaagaagtagc |
| NR4A1 | nuclear receptor subfamily 4, group A, member 1 | 17 | gcactgccaaactggactact | cggagagcaggtcgtagaac |
| PPP1CC | protein phosphatase 1, catalytic subunit, gamma isozyme | 34 | cttatatgtagagcccatcaggtg | aataattgggcgcagaaaac |

B.

| Gene code | Gene name | Probe # | Forward primer | Reverse primer |
|---|---|---|---|---|
| ACTB | actin, beta | atgccctcccccatgccatcctgcgt | tcacccacactgtgcccatctacgc | cagcggaaccgctcattgccaatgg |
| GAPDH | glyceraldehyde-3-phosphate dehydrogenase | acgaccactttgtcaagctcatttcctggt | acccactcctccacctttga | ttgctgtagccaaattcgttgt |
| GUSB | glucuronidase, beta | 57 | cgccctgcctatctgtattc | tccccacagggagtgtgtag |
| HPRT1 | hypoxanthine phosphoribosyltransferase 1 | 73 | tgaccttgatttattttgcatacc | cgagcaagacgttcagtcct |
| RPL19 | ribosomal protein L19 | 46 | agcgagctctttcctttcg | gagcctcttctgaagcctga |
| PPIA | peptidylprolyl isomerase A (cyclophilin A) | 48 | atgctggacccaacacaaat | tctttcactttgccaaacacc |
| TBP | TATA box binding protein | 51 | cccatgactcccatgacc | tttacaaccaagattcactgtgg |

BRCA1/2 mutation status analysis for primary cell lines
We analyzed exome-seq data (sequenced at Beijing Genome Institute, Beijing, China) for the BRCA1 and BRCA2 genes to define their mutation status in the M022i and M019i cell lines, for which 87% and 89% of the coding region, respectively, were covered by at least 10 reads, in order to call germline and somatic protein coding and untranslated region (UTR) point mutations and small insertions and deletions. We used the Anduril (Ovaska et al, 2010) integrative exome-seq analysis pipeline followed by the GATK (DePristo et al, 2011) variant analysis workflow, which proceeds from data preprocessing (read mapping, duplicate marking, sorting, indel realignment, base recalibration) to variant calling, variant recalibration and variant database annotation (clinvar_20140211, cosmic68, 1000g2012apr, snp138) with ANNOVAR (Wang et al, 2010). The UTR, synonymous and non-synonymous variants were manually compared to two databases, Catalogue of Somatic Mutations in Cancer (COSMIC, http://cancer.sanger.ac.uk/cancergenome/projects/cosmic/) and Breast Cancer Information Core (BIC, http://research.nhgri.nih.gov/bic/index.shtml).
Results
PSFinder identified expression signature predicts HGS-OvCa patient response to platinum and taxane combination therapy
We analyzed the isoform expression data from 180 homogeneously treated HGS-OvCa patient primary cancer samples available in the TCGA repository (Cancer Genome Atlas Research, 2011) using PSFinder (Figure 1). All patients had disseminated disease and were treated with surgery and first line platinum and taxane combination therapy (clinical data in Table 2).
Table 2. Clinical features of the cohorts used in this study
PSFinder analysis resulted in 61 co-expressed transcripts that divide the discovery cohort of 180 samples into three groups (Figure 2A and Table 3).
Table 3. PSFinder identified 61 transcripts on isoform-level for Poor I, Poor II and Good prognostic patient characterization
Transcript IDs and gene names are based on Ensembl version 64. Median log2-scale expression of transcripts in each patient group is shown in columns "Poor I", "Poor II" and "Good", where prognostic group specific features are marked with an up-arrow or down-arrow symbol for up-regulated or down-regulated features, respectively. Two-sided t-test and median log2-scale fold change (log2FC) were calculated for each feature between Poor I and Good, and between Poor II and Good. P-values are shown by *** (p < 0.001), ** (p < 0.01), * (p < 0.05) or empty (p ≥ 0.05).
Two groups are associated with relatively short overall survival (Figure 2B) even though they have markedly different expression signatures, and are subsequently called the Poor I and Poor II groups, whereas one group (Good group) is associated with longer survival (log-rank test p = 0.007, Figure 2B). The median overall survival times for patients in Poor I and Poor II are 45 and 50 months, respectively, whereas for patients in the Good group it is >120 months. The five-year survival rates for Poor I, Poor II and Good prognostic patients are 42%, 40% and 68%, respectively. The three prognostic groups were reproduced when gene-level expression data were analyzed with PSFinder (Figure 5). The gene-level analysis resulted in 32 characteristic genes (Table 4), of which 14 were also found as isoform-level markers (Figure 5C).
Table 4. PSFinder identified 32 genes for Poor I, Poor II and Good prognostic patient characterization by gene-level.
Genes characterizing Poor I or Poor II groups are annotated in the "Group" column.
| MarkerID | GeneName | Group | SEQ ID NO |
|---|---|---|---|
| ENSG00000232729 | AC083884.8 | Poor I | 62 |
| ENSG00000227617 | CERS6-AS1 | Poor I | 63 |
| ENSG00000236200 | KDM4A-AS1 | Poor I | 64 |
| ENSG00000129911 | KLF16 | Poor I | 65 |
| ENSG00000130479 | MAP1S | Poor I | 66 |
| ENSG00000257550 | RP11-793H13.3 | Poor I | 67 |
| ENSG00000245156 | RP11-867G23.3 | Poor I | 68 |
| ENSG00000167685 | ZNF444 | Poor I | 69 |
| ENSG00000148848 | ADAM12 | Poor II | 70 |
| ENSG00000151388 | ADAMTS12 | Poor II | 71 |
| ENSG00000106624 | AEBP1 | Poor II | 72 |
| ENSG00000122870 | BICC1 | Poor II | 73 |
| ENSG00000123500 | COL10A1 | Poor II | 74 |
| ENSG00000164692 | COL1A2 | Poor II | 75 |
| ENSG00000168542 | COL3A1 | Poor II | 76 |
| ENSG00000130635 | COL5A1 | Poor II | 77 |
| ENSG00000204262 | COL5A2 | Poor II | 78 |
| ENSG00000142156 | COL6A1 | Poor II | 79 |
| ENSG00000142173 | COL6A2 | Poor II | 80 |
| ENSG00000158270 | COLEC12 | Poor II | 81 |
| ENSG00000143387 | CTSK | Poor II | 82 |
| ENSG00000143369 | ECM1 | Poor II | 83 |
| ENSG00000078098 | FAP | Poor II | 84 |
| ENSG00000115414 | FN1 | Poor II | 85 |
| ENSG00000161638 | ITGA5 | Poor II | 86 |
| ENSG00000099953 | MMP11 | Poor II | 87 |
| ENSG00000137745 | MMP13 | Poor II | 88 |
| ENSG00000157227 | MMP14 | Poor II | 89 |
| ENSG00000087245 | MMP2 | Poor II | 90 |
| ENSG00000116132 | PRRX1 | Poor II | 91 |
| ENSG00000165124 | SVEP1 | Poor II | 92 |
| ENSG00000186340 | THBS2 | Poor II | 93 |
Confirmation of the PSFinder results
We determined whether the three survival associated groups and the 61 transcript isoforms have predictive power in two independent HGS-OvCa cohorts (TCGA heterogeneous and MUPET). The first validation cohort consists of transcript level data from 327 ovarian cancer patients treated with heterogeneous treatment regimens available in the TCGA repository. The use of various treatment regimens introduces a bias to validation because the 61 transcripts and three groups were identified using samples from patients who received only platinum and taxane combination therapy. Still, the three groups are visible also in the heterogeneously treated cohort (Figure 6A) and the survival difference between the groups is significant with p = 0.048 (Figure 6B). When we combined the Poor I and Poor II groups into a single Poor group the survival difference was even clearer with p = 0.02 (Figure 6C). The five-year survival rate of the Good responder group was only 28% in the cohort receiving heterogeneous treatment regimens, while it was 68% in the homogeneously treated Good prognosis group.
We performed transcript-isoform specific qRT-PCR assays in a prospective cohort (MUPET) consisting of 29 HGS-OvCa patients (Hynninen et al, 2012) (clinical data in Table 2). We tested 24 marker transcripts out of the 61 transcripts for which specific assays based on Universal ProbeLibrary Assay Design Center (lifescience.roche.com; Roche Diagnostics) were available. Only nine had an isoform specific assay and a robust signal in all samples. An LPS predictor similar to that used for the isoform-level data was constructed using the nine markers, resulting in three patient groups (Figure 3A). Due to the small size of the cohort we combined the Poor I and Poor II groups, and the patients in the combined Poor group have significantly shorter survival times than patients belonging to the good prognosis group (log-rank test p = 0.03, Figure 3B and Figure 7). The expression levels of these nine markers in the MUPET cohort are in line with the expression data for the Poor I and Poor II groups in TCGA samples (Figure 3C).
To further check the reproducibility of the PSFinder results we used gene expression data from the AOCS (Tothill et al, 2008), Yoshihara (Yoshihara et al, 2010), Ferriss (Ferriss et al, 2012) and Dressman (Dressman et al, 2007) cohorts, which consist of 172, 84, 44 and 110 HGS-OvCa patients, respectively. All these patients were treated with platinum and/or taxane, so these cohorts are free from bias due to heterogeneous treatments. However, due to microarray platform design differences, expression data were available for 84% of the PSFinder predictive genes. Despite the suboptimal number of genes used in the LPS prediction for these cohorts, the groups based on the TCGA discovery set were reproduced. The patients in the predicted Poor I, Poor II and Good groups have similar expression profiles and survival associations as in the TCGA discovery cohort (Figure 8).
Combining PSFinder expression signature with the mutation data of BRCA1 and BRCA2 improves identification of poor and good responsive HGS-OvCa patients
We examined the relation of the BRCA1/2 dysfunction and PSFinder identified prognostic subgroups using BRCA1/2 mutation data for the homogeneously treated HGS-OvCa patients (Cancer Genome Atlas Research, 2011).
The mutation status of BRCA1/2 does not show clear enrichment in any of the groups, as shown in Figure 4A. This indicates that the PSFinder identified Good response group cannot be explained by the enrichment of BRCA1/2 mutations. BRCA1/2 mutation data were available for 117 samples in the homogeneously treated TCGA cohort, and BRCA1/2 mutation carriers had longer overall survival than BRCA1/2 wild-type patients, as expected (log-rank test p = 0.007, Figure 9). Due to the small number of HGS-OvCa patients with known BRCA1/2 status, we combined the prognostic groups and BRCA1/2 mutation status information and divided patients as follows: 1) good prognosis with BRCA1/2 mutation ("good-mu"), 2) good prognosis with wild-type BRCA1/2 ("good-wt"), 3) poor prognosis with BRCA1/2 mutation ("poor-mu") and 4) poor prognosis with BRCA1/2 wild-type ("poor-wt") groups. Survival analysis shows that all "good-mu" patients were alive at 80 months, whereas only 10% of "poor-wt" patients were alive at that point (Figure 4B). Thus, the combination of PSFinder and BRCA1/2 dysfunction can be used to identify patients that i) are very likely to benefit from platinum and taxane combination treatment and ii) are very likely not to respond. The ability to separate poor and good responders is crucially important in stratifying HGS-OvCa patients to clinical trials.
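The four-way grouping can be written down as a small mapping. This is only an illustrative sketch: the function name is hypothetical, and collapsing Poor I and Poor II into a single "poor" label is an assumption based on how the groups are combined elsewhere in the text.

```python
def combined_group(prognostic_group, brca_mutated):
    """Combine a PSFinder prognostic group with BRCA1/2 mutation status."""
    if prognostic_group == "Good":
        base = "good"
    elif prognostic_group in ("Poor I", "Poor II"):
        base = "poor"  # assumption: both Poor groups are pooled
    else:
        raise ValueError(f"unrecognized prognostic group: {prognostic_group!r}")
    return f"{base}-{'mu' if brca_mutated else 'wt'}"

# Hypothetical patients: (prognostic group, BRCA1/2 mutated?)
labels = [combined_group(g, m) for g, m in
          [("Good", True), ("Good", False), ("Poor I", True), ("Poor II", False)]]
```

Samples classified as "unknown" by the LPS step are rejected here rather than silently assigned to a group.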
Application (LPS prediction) of the markers identified in the HGS-OvCa study (Table 4) to basal-like breast cancer samples
We did subgroup prediction on basal-like breast cancer samples using the 32 gene markers for HGS-OvCa prognostic subgroup stratification. By using linear predictor score (LPS) method, we observed the same three groups in basal-like breast cancer samples (Figure 10).
The predicted Poor I and Poor II samples are associated with short overall survival and the predicted Good samples are associated with good overall survival (log-rank test p = 0.046).
Optimization of biomarker panel for predicting the survival of HGS-OvCa patients
For the genes in Table 4 we tested all combinations of 1, 2, 3, 4 and 5 genes for their performance in predicting overall survival (OS) and progression-free survival (PFS) (Figure 11). The area under the ROC curve (AUC) tells how well the biomarkers are able to predict patient response. In OS the best result was 84% and for PFS 77%. The prediction was done with the Random Forest (RF) ensemble machine learning method. Random forest shows the best performance with AUC 0.79 for OS and AUC 0.7 for PFS based on five-gene combinations. Based on overall survival, the best candidates are ADAM12, COL1A2, COL5A1, CTSK and FAP. Based on progression-free survival, the best candidates are KLF16, COL5A2, FAP, MMP14 and MMP2.
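To give a sense of the size of this search, the sketch below counts and enumerates the candidate panels of one to five genes drawn from the 32 genes of Table 4. The helper name `candidate_panels` is hypothetical, and the Random Forest training and AUC evaluation of each panel are not shown.

```python
from itertools import combinations
from math import comb

N_GENES = 32  # number of gene-level markers in Table 4

# Number of distinct panels of each size k = 1..5, and in total.
panel_counts = {k: comb(N_GENES, k) for k in range(1, 6)}
total_panels = sum(panel_counts.values())

def candidate_panels(genes, max_size=5):
    """Yield every gene combination of size 1..max_size as a tuple."""
    for k in range(1, max_size + 1):
        yield from combinations(genes, k)

# Small illustration with three of the Table 4 genes.
small = list(candidate_panels(["FAP", "MMP2", "CTSK"], max_size=2))
```

Under these assumptions the exhaustive search covers 242,824 panels, most of them (201,376) five-gene combinations, so each panel evaluation needs to be reasonably cheap.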
REFERENCES
Berns EM, Bowtell DD (2012) The changing view of high-grade serous ovarian cancer. Cancer Res 72: 2701-2704.
Cancer Genome Atlas Research Network (2011) Nature 474: 609.
Tothill RW et al, Australian Ovarian Cancer Study (AOCS) (2008) Clin Cancer Res 14: 5198.
Verhaak RG, Tamayo P, Yang JY, Hubbard D, Zhang H, Creighton CJ, Fereday S, Lawrence M, Carter SL, Mermel CH, Kostic AD, Etemad-Moghadam D, Saksena G, Cibulskis K, Duraisamy S, Levanon K, Sougnez C, Tsherniak A, Gomez S, Onofrio R et al (2013) Prognostically relevant gene signatures of high-grade serous ovarian carcinoma. The Journal of clinical investigation 123: 517-525.
Merkin J, Russell C, Chen P, Burge CB (2012) Evolutionary dynamics of gene and isoform regulation in Mammalian tissues. Science 338: 1593-1599.
Merlo LM, Pepper JW, Reid BJ, Maley CC (2006) Cancer as an evolutionary and ecological process. Nat Rev Cancer 6: 924-935.
Ostrow SL, Barshir R, DeGregori J, Yeger-Lotem E, Hershberg R (2014) Cancer evolution is associated with pervasive positive selection on globally expressed genes. PLoS Genet 10: e1004239.
Greaves M, Maley CC (2012) Clonal evolution in cancer. Nature 481: 306-313.
Chen P, Lepikhova T, Hu Y, Monni O, Hautaniemi S (2011) Comprehensive exon array data processing method for quantitative analysis of alternative spliced variants. Nucleic acids research 39: e123.
Dressman HK, Berchuck A, Chan G, Zhai J, Bild A, Sayer R, Cragun J, Clarke J, Whitaker RS, Li L, Gray J, Marks J, Ginsburg GS, Potti A, West M, Nevins JR, Lancaster JM (2007) An integrated genomic-based approach to individualized treatment of patients with advanced-stage ovarian cancer. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 25: 517-525
Ganzfried BF, Riester M, Haibe-Kains B, Risch T, Tyekucheva S, Jazic I, Wang XV, Ahmadifar M, Birrer MJ, Parmigiani G, Huttenhower C, Waldron L (2013) curatedOvarianData: clinically annotated data for the ovarian cancer transcriptome. Database : the journal of biological databases and curation 2013: bat013.
Yoshihara K, Tajima A, Yahata T, Kodama S, Fujiwara H, Suzuki M, Onishi Y, Hatae M, Sueyoshi K, Fujiwara H, Kudo Y, Kotera K, Masuzaki H, Tashiro H, Katabuchi H, Inoue I, Tanaka K (2010) Gene expression profile for predicting survival in advanced-stage serous ovarian cancer across two independent datasets. PLoS ONE 5(3): e9615.
Ferriss JS, Kim Y, Duska L, Birrer M, Levine DA, Moskaluk C, Theodorescu D, Lee JK (2012) Multi-gene expression predictors of single drug responses to adjuvant chemotherapy in ovarian carcinoma: predicting platinum resistance. PLoS ONE 7(2): e30550.
Tsai LL, Yu CC, Chang YC, Yu CH, Chou MY (2011) Markedly increased Oct4 and Nanog expression correlates with cisplatin resistance in oral squamous cell carcinoma. Journal of oral pathology & medicine : official publication of the International Association of Oral Pathologists and the American Academy of Oral Pathology 40: 621-628.
Domcke S, Sinha R, Levine DA, Sander C, Schultz N (2013) Evaluating cell lines as tumour models by comparison of genomic profiles. Nature communications 4: 2126
Hynninen J, Auranen A, Carpen O, Dean K, Seppanen M, Kemppainen J, Lavonius M, Lisinen I, Virtanen J, Grenman S (2012) FDG PET/CT in staging of advanced epithelial ovarian cancer: frequency of supradiaphragmatic lymph node metastasis challenges the traditional pattern of disease spread. Gynecologic oncology 126: 64-68.
Wright G, Tan B, Rosenwald A, Hurt EH, Wiestner A, Staudt LM (2003) A gene expression-based method to diagnose clinically distinct subgroups of diffuse large B cell lymphoma. Proceedings of the National Academy of Sciences of the United States of America 100: 9991-9996.
Vandesompele J, De Preter K, Pattyn F, Poppe B, Van Roy N, De Paepe A, Speleman F (2002) Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes. Genome biology 3: RESEARCH0034.
Pfaffl MW, Tichopad A, Prgomet C, Neuvians TP (2004) Determination of stable housekeeping genes, differentially regulated target genes and sample integrity: BestKeeper-Excel-based tool using pair-wise correlations. Biotechnology letters 26: 509-515.
Ovaska K, Laakso M, Haapa-Paananen S, Louhimo R, Chen P, Aittomaki V, Valo E, Nunez-Fontarnau J, Rantanen V, Karinen S, Nousiainen K, Lahesmaa-Korpinen AM, Miettinen M, Saarinen L, Kohonen P, Wu J, Westermarck J, Hautaniemi S (2010) Large-scale data integration framework provides a comprehensive view on glioblastoma multiforme. Genome medicine 2: 65
Wang K, Li M, Hakonarson H (2010) ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic acids research 38: e164
DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, McKenna A, Fennell TJ, Kernytsky AM, Sivachenko AY, Cibulskis K, Gabriel SB, Altshuler D, Daly MJ (2011) A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet 43: 491-498

Claims

1. A method of stratifying cancer patients treated first line with platinum and taxane combination therapy into prognostic subgroups, comprising determining the expression level of a set of biomarkers defined in Table 3 and/or Table 4 from a patient sample, wherein the expression level of the biomarkers is used for identifying prognostic subgroups of patients.
2. The method of claim 1, wherein the cancer treated with platinum and taxane combination therapy is selected from a group consisting of ovarian cancer, breast cancer, endometrial cancer, melanoma, cervix cancer, pancreatic cancer, esophageal cancer, colorectal cancer and glioblastoma.
3. The method according to claim 2, wherein the ovarian cancer treated with platinum and taxane combination therapy is HGS-Ova.
4. The method according to claim 2, wherein the breast cancer treated with platinum and taxane combination therapy is BL-BrCa.
5. The method according to any of the previous claims, wherein the prognostic subgroup is survival associated.
6. The method according to any of the previous claims, wherein a set of biomarkers comprising at least 5 biomarkers of Table 3 and/or Table 4 is determined.
7. The method according to any one of the previous claims, comprising
a) determining a set of biomarkers in samples obtained from patients suffering from cancer, which is treated first line with platinum and taxane combination therapy;
b) determining outcome groups for the cancer patients based on the expression of the set of biomarkers; wherein the Poor I subgroup is associated with poor prognosis and is characterized by the transcripts and/or genes specifically co-expressed in the Poor I outcome group; the Poor II subgroup is associated with poor prognosis and is characterized by the transcripts and/or genes specifically co-expressed in the Poor II outcome group; and the Good subgroup is associated with good prognosis and is characterized by all the Poor I and Poor II specific transcripts and/or genes with their expression in opposite directions.
8. The method according to any one of the previous claims, wherein the expression level data of the biomarkers is further combined with BRCA1/2 mutation status.
9. The method according to any one of the previous claims, wherein the patient sample is a naïve tumour specimen.
10. A panel of biomarkers for stratifying cancer patients treated first line with platinum and taxane combination therapy into prognostic subgroups, said panel comprising biomarkers defined in Table 3.
11. The panel of biomarkers according to claim 10, wherein the panel further comprises a set of genes defined in Table 4.
12. The panel of biomarkers according to claim 10 or 11, wherein the cancer treated with platinum and taxane combination therapy is selected from a group consisting of ovarian cancer, breast cancer, endometrial cancer, melanoma, cervix cancer, pancreatic cancer, esophageal cancer, colorectal cancer and glioblastoma.
13. The panel of biomarkers according to claim 12, wherein the ovarian cancer treated with platinum and taxane combination therapy is HGS-Ova.
14. The panel of biomarkers according to claim 12, wherein the breast cancer treated with platinum and taxane combination therapy is BL-BrCa.
15. A method for designing a personalized combinatorial therapy for a platinum and taxane resistant cancer patient, comprising a) determining expression level data of a set of biomarkers defined in Table 3 and/or in Table 4 from a patient sample, b) determining a patient subgroup based on the expression status of the biomarkers and c) designing the therapy regimen based on the results obtained from step b).
16. The method of claim 15, wherein the method comprises an additional assay of analyzing the BRCA1/2 mutation status and combining it with the data obtained from step b).
17. The method according to any one of claims 15 to 16, wherein the cancer the patient is suffering from is HGS-Ova.
18. The method according to any one of claims 15 to 16, wherein the cancer the patient is suffering from is BL-BrCa.
19. Use of the biomarkers defined in Table 3 and/or 4 for identifying a prognostic subgroup of a high grade serous ovarian cancer (HGS-Ova) patient or a basal-like breast cancer (BL-BrCa) patient.
20. A kit for use in evaluating the probability of survival of a patient suffering from HGS-Ova or BL-BrCa, wherein the kit comprises the biomarkers defined in Table 3 and/or in Table 4.
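
As a rough illustration of the subgroup logic recited in claim 7, the following Python sketch assigns a sample to the Poor I, Poor II or Good subgroup from standardized expression values. It is a minimal sketch under stated assumptions and not the applicants' implementation: the gene names are placeholders rather than the biomarkers of Table 3 or Table 4, and the signed-mean scoring rule, the zero threshold and the function names are all hypothetical choices made for the example.

```python
from statistics import mean

# Hypothetical signatures: gene -> expected direction in that poor-outcome
# group (+1 = higher expression, -1 = lower expression).
# Placeholder names only; these are NOT the biomarkers of Table 3 or Table 4.
POOR_I = {"GENE_A": +1, "GENE_B": +1, "GENE_C": -1}
POOR_II = {"GENE_D": +1, "GENE_E": -1, "GENE_F": +1}


def signature_score(zscores, signature):
    """Mean z-score signed by expected direction.

    A positive score means the sample follows the signature; a negative
    score means its expression runs in the opposite direction.
    """
    return mean(direction * zscores[gene] for gene, direction in signature.items())


def assign_subgroup(zscores, threshold=0.0):
    """Return 'Poor I', 'Poor II' or 'Good' (assumed rule, for illustration)."""
    score_i = signature_score(zscores, POOR_I)
    score_ii = signature_score(zscores, POOR_II)
    # Good: both poor-outcome signatures are expressed in opposite directions.
    if score_i < threshold and score_ii < threshold:
        return "Good"
    # Otherwise pick the poor-outcome signature the sample follows more closely.
    return "Poor I" if score_i >= score_ii else "Poor II"


# Example sample with z-scored expression values (made-up numbers).
sample = {"GENE_A": 1.2, "GENE_B": 0.8, "GENE_C": -1.0,
          "GENE_D": -0.5, "GENE_E": 0.4, "GENE_F": -0.9}
subgroup = assign_subgroup(sample)
```

In this sketch a sample that matches neither poor-outcome signature falls into the Good subgroup, mirroring the claim's description of Good as the Poor I and Poor II transcripts expressed in opposite directions. An additional input such as BRCA1/2 mutation status (claims 8 and 16) could be carried alongside the returned label.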

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
FI20145951 2014-10-30
FI20145951 2014-10-30
FI20145953 2014-10-30
FI20145953 2014-10-30
EP2015074023 2015-10-16
EP2015074021 2015-10-16
EPPCT/EP2015/074021 2015-10-16
EPPCT/EP2015/074023 2015-10-16

Publications (2)

Publication Number Publication Date
WO2016066797A2 true WO2016066797A2 (en) 2016-05-06
WO2016066797A3 WO2016066797A3 (en) 2016-06-30

Family

ID=54476932

Family Applications (2)

Application Number Title Priority Date Filing Date
PCT/EP2015/075246 WO2016066797A2 (en) 2014-10-30 2015-10-30 Ovarian cancer prognostic subgrouping
PCT/EP2015/075252 WO2016066800A1 (en) 2014-10-30 2015-10-30 Method and system for finding prognostic biomarkers

Family Applications After (1)

Application Number Title Priority Date Filing Date
PCT/EP2015/075252 WO2016066800A1 (en) 2014-10-30 2015-10-30 Method and system for finding prognostic biomarkers

Country Status (1)

Country Link
WO (2) WO2016066797A2 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113785075A (en) * 2019-05-03 2021-12-10 皇家飞利浦有限公司 Method for prognosis of high-grade serous ovarian cancer
WO2021188881A3 (en) * 2020-03-20 2021-12-23 Applied Dna Sciences, Inc. Compositions and methods for detecting and treating sars-cov-2

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2017278261A1 (en) * 2016-06-05 2019-01-31 Berg Llc Systems and methods for patient stratification and identification of potential biomarkers

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2006243782A1 (en) * 2005-05-04 2006-11-09 University Of South Florida Predicting treatment response in cancer subjects
US20090006055A1 (en) * 2007-06-15 2009-01-01 Siemens Medical Solutions Usa, Inc. Automated Reduction of Biomarkers
US20120270233A1 (en) * 2008-02-11 2012-10-25 Historx, Inc. Association of biomarkers with patient outcome
US20140274780A1 (en) * 2013-03-15 2014-09-18 PSertain Technologies Methods of improving survival in cancer
US20140314750A1 (en) * 2013-04-19 2014-10-23 Wisconsin Alumni Research Foundation Six-Gene Biomarker of Survival and Response to Platinum Based Chemotherapy in Serious Ovarian Cancer Patients

Also Published As

Publication number Publication date
WO2016066797A3 (en) 2016-06-30
WO2016066800A1 (en) 2016-05-06

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15790886

Country of ref document: EP

Kind code of ref document: A2

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15790886

Country of ref document: EP

Kind code of ref document: A2