WO2017216559A1

WO2017216559A1 - Predicting responsiveness to therapy in prostate cancer

Info

Publication number: WO2017216559A1
Application number: PCT/GB2017/051740
Authority: WO
Inventors: Laura Knight; Steven Walker; Richard Kennedy; Paul Harkin; Catherine Davidson
Original assignee: Almac Diagnostics Limited
Priority date: 2016-06-14
Filing date: 2017-06-14
Publication date: 2017-12-21

Abstract

A method of predicting responsiveness of a subject having a prostate cancer to a mitotic inhibitor and/or a DNA damaging therapeutic agent comprises measuring expression levels of at least one gene selected from Table 1-45 in a sample from the subject. The measured expression levels are used to determine whether the prostate cancer has a deficiency in DNA damage repair and/or displays immune activation (to abnormal DNA). If the prostate cancer does not have a deficiency in DNA damage repair and/or does not display immune activation to abnormal DNA, responsiveness to a mitotic inhibitor is predicted. If the prostate cancer has a deficiency in DNA damage repair and/or displays immune activation (to abnormal DNA), non-responsiveness to a mitotic inhibitor is predicted. If the prostate cancer has a deficiency in DNA damage repair and/or displays elevated immune signalling, responsiveness to a DNA damaging therapeutic agent is predicted. If the prostate cancer does not have a deficiency in DNA damage repair and/or does not display immune activation (to abnormal DNA), non-responsiveness to a DNA damaging therapeutic agent is predicted. Corresponding products and methods of treatment are provided.

Description

PREDICTING RESPONSIVENESS TO THERAPY IN PROSTATE CANCER

FIELD OF THE INVENTION

The present invention relates to a molecular diagnostic test useful for identifying prostate cancers within a high risk metastatic group. Such prostate cancers are more likely to recur and are considered more aggressive. They are characterised by deficiency in DNA damage repair and immune activation (as a consequence of abnormal DNA produced in the cells). The test also predicts responsiveness of prostate cancers to particular treatments that include the use of mitotic inhibitors such as taxanes. The invention includes the generation and use of various classifiers derived from identification of this subtype in prostate cancer patients, such as use of a 44-gene classification model that is used to identify this DNA damage repair/immune activation molecular subtype. One application is the stratification of response to, and selection of patients for, prostate cancer therapeutic drug classes, including mitotic inhibitors, DNA damage causing agents and DNA repair targeted therapies. The present invention provides a test that can guide conventional therapy selection as well as selecting patient groups for enrichment strategies during clinical trial evaluation of novel therapeutics. DNA repair deficient subtypes can be identified, for example, from fresh/frozen (FF) or formalin fixed paraffin embedded (FFPE) patient samples. They may also be identified in liquid biopsy samples, e.g. blood.

BACKGROUND OF THE INVENTION

Prostate cancer is the most common malignancy in men, with a lifetime incidence of 15.3% (Howlader 2012). Based upon data from 1999-2006, approximately 80% of prostate cancer patients present with early disease clinically confined to the prostate (Altekruse et al 2010), of which around 65% are cured by surgical resection or radiotherapy (Kattan et al 1999, Pound et al 1999). The remaining 35% will develop PSA recurrence, of which approximately 35% will develop local or metastatic recurrence, which is non-curable.

The typical treatment pathway for prostate cancer which has relapsed following primary treatment (surgery or radiation) and/or metastatic prostate cancer is primarily androgen deprivation. Practically all prostate tumours initially respond to this treatment; however, resistance normally occurs, leading to castrate resistant prostate cancer (CRPC). Unfortunately, once a tumour is castrate resistant, subsequent therapies including chemotherapy have limited efficacy.

Recent clinical trials (CHAARTED and STAMPEDE) have demonstrated improved outcome for de novo metastatic and high risk localized prostate cancer by the addition of chemotherapy (docetaxel) to primary androgen deprivation. Furthermore, targeted DNA repair inhibitors (e.g. PARPi) have also demonstrated efficacy in early phase clinical trials in mCRPC. Currently there are no molecular biomarkers used in the clinic to guide chemotherapy treatment for prostate cancer. The CHAARTED and STAMPEDE trials have established the use of chemotherapy earlier in the clinical care pathway, and the ongoing trial activity will result in greater treatment choices. Therefore the development of robust predictive molecular biomarkers to guide treatment decisions in prostate cancer is of high priority.

It is now clear that most solid tumours originating from the same anatomical site represent a number of distinct entities at a molecular level (Perou et al 2000). The inventors have previously performed gene expression analysis of a cohort of prostate cancer samples enriched with metastatic disease and have identified a distinct molecular subgroup of primary prostate cancers that clustered with metastatic disease and prostate cancers known to have concomitant metastases. This cluster was hence termed the 'Metastatic-like' subgroup (see WO2015087088, incorporated herein by reference). The inventors then developed a 70-gene signature to prospectively identify the 'Metastatic-like' subgroup of patients (see GB1510684.2, filed 17 June 2015, incorporated herein by reference). This 70-gene assay can be used to prospectively assess disease progression from a primary tumour, to determine the likelihood of disease recurrence and/or metastatic progression.

SUMMARY OF THE INVENTION

The inventors have further analysed this prostate cancer dataset to identify molecular groups of relevance for prediction of response to treatment. They have discovered that within the metastatic biology group there is a subgroup of prostate cancer with a deficiency in DNA damage repair. This subgroup also displays immune activation. The immune activation is postulated to be responsive to cytosolic DNA produced as a consequence of the deficiency in DNA damage repair. This is termed "abnormal DNA". This subgroup responds poorly to therapy with mitotic inhibitors such as docetaxel. Where this subgroup is positively identified, alternative therapies should be provided as discussed herein.

Thus, the invention provides a method of predicting responsiveness of a subject having a prostate cancer to a mitotic inhibitor and/or a DNA damaging therapeutic agent comprising:

a. measuring expression levels of at least one gene selected from Table 1-45 in a sample from the subject;

b. using the measured expression levels to determine whether the prostate cancer has a deficiency in DNA damage repair and/or displays immune activation (to abnormal DNA); wherein:

i. if the prostate cancer does not have a deficiency in DNA damage repair and/or does not display immune activation to abnormal DNA, responsiveness to a mitotic inhibitor is predicted; or

ii. if the prostate cancer has a deficiency in DNA damage repair and/or displays immune activation (to abnormal DNA), non-responsiveness to a mitotic inhibitor is predicted; and/or

iii. if the prostate cancer has a deficiency in DNA damage repair and/or displays elevated immune signalling, responsiveness to a DNA damaging therapeutic agent is predicted; or

iv. if the prostate cancer does not have a deficiency in DNA damage repair and/or does not display immune activation (to abnormal DNA), non-responsiveness to a DNA damaging therapeutic agent is predicted.

According to some embodiments, the measured expression levels are used by generating a test score derived from the measured expression levels. Generating a test score may comprise steps of deriving a test score that captures the expression levels; providing a threshold score comprising information correlating the test score and responsiveness; and comparing the test score to the threshold score. Responsiveness to a mitotic inhibitor is predicted when the test score does not exceed the threshold score. Non-responsiveness to a mitotic inhibitor is predicted when the test score exceeds the threshold score. Additionally or alternatively, responsiveness to a DNA damaging therapeutic agent is predicted when the test score exceeds the threshold score. Non-responsiveness to a DNA damaging therapeutic agent is predicted when the test score does not exceed the threshold score.
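The score-and-threshold comparison described above can be sketched as follows. This is a minimal illustration only: the weighted-sum form of the score, the gene weights, the bias and the threshold value are all hypothetical assumptions made for the example, and the function names are not taken from the specification.

```python
# Illustrative sketch only. The weights, bias and threshold below are
# hypothetical placeholders, not values disclosed in the specification.
HYPOTHETICAL_WEIGHTS = {"CXCL10": 0.5, "MX1": 0.3, "CD274": 0.2}
HYPOTHETICAL_BIAS = 0.0
HYPOTHETICAL_THRESHOLD = 1.0


def derive_test_score(expression, weights=HYPOTHETICAL_WEIGHTS,
                      bias=HYPOTHETICAL_BIAS):
    """Combine measured expression levels into one score.

    A weighted sum is assumed here as one simple way of deriving a
    score that "captures" the expression levels.
    """
    return bias + sum(w * expression[gene] for gene, w in weights.items())


def predict_responsiveness(score, threshold=HYPOTHETICAL_THRESHOLD):
    """Apply the comparison rules stated in the text.

    Score exceeds threshold: non-responsive to a mitotic inhibitor,
    responsive to a DNA damaging therapeutic agent. Otherwise the reverse.
    """
    exceeds = score > threshold
    return {
        "mitotic_inhibitor": "non-responsive" if exceeds else "responsive",
        "dna_damaging_agent": "responsive" if exceeds else "non-responsive",
    }


if __name__ == "__main__":
    sample = {"CXCL10": 2.0, "MX1": 1.0, "CD274": 0.5}  # made-up measurements
    print(predict_responsiveness(derive_test_score(sample)))
```

Note that a score exactly equal to the threshold "does not exceed" it, so in this sketch such a sample is treated as predicted responsive to a mitotic inhibitor.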

The invention also provides a method of predicting outcome of treatment of a subject having a prostate cancer with a mitotic inhibitor and/or a DNA damaging therapeutic agent comprising:

a. measuring expression levels of at least one gene selected from Table 1-45 in a sample from the subject;

b. using the measured expression levels to determine whether the prostate cancer has a deficiency in DNA damage repair and/or displays immune activation (to abnormal DNA); wherein:

i. if the prostate cancer does not have a deficiency in DNA damage repair and/or does not display immune activation (to abnormal DNA) an improved outcome of treatment with a mitotic inhibitor is predicted; or

ii. if the prostate cancer has a deficiency in DNA damage repair and/or displays immune activation (to abnormal DNA) a poorer outcome of treatment with a mitotic inhibitor is predicted; and/or

iii. if the prostate cancer has a deficiency in DNA damage repair and/or displays immune activation (to abnormal DNA) an improved outcome of treatment with a DNA damaging therapeutic agent is predicted; or

iv. if the prostate cancer does not have a deficiency in DNA damage repair and/or does not display immune activation (to abnormal DNA) a poorer outcome of treatment with a DNA damaging therapeutic agent is predicted.

According to some embodiments, the measured expression levels are used by generating a test score derived from the measured expression levels. Generating a test score may comprise steps of deriving a test score that captures the expression levels; providing a threshold score comprising information correlating the test score and predicted outcome; and comparing the test score to the threshold score. An improved outcome of treatment with a mitotic inhibitor is predicted when the test score does not exceed the threshold score. A poorer outcome of treatment with a mitotic inhibitor is predicted when the test score exceeds the threshold score. In some embodiments, an improved outcome of treatment with a DNA damaging therapeutic agent is predicted when the test score exceeds the threshold score. A poorer outcome of treatment with a DNA damaging therapeutic agent is predicted when the test score does not exceed the threshold score.

Also provided is a method of selecting an appropriate therapy to treat a subject having a prostate cancer comprising:

a. measuring expression levels of at least one gene selected from Table 1-45 in a sample from the subject;

b. using the measured expression levels to determine whether the prostate cancer has a deficiency in DNA damage repair and/or displays immune activation (to abnormal DNA); wherein:

i. if the prostate cancer does not have a deficiency in DNA damage repair and/or does not display immune activation (to abnormal DNA) a mitotic inhibitor is selected for treatment; or

ii. if the prostate cancer has a deficiency in DNA damage repair and/or displays immune activation (to abnormal DNA) a mitotic inhibitor is not selected for treatment; and/or

iii. if the prostate cancer has a deficiency in DNA damage repair and/or displays immune activation (to abnormal DNA) a DNA damaging therapeutic agent is selected for treatment; or

iv. if the prostate cancer does not have a deficiency in DNA damage repair and/or does not display immune activation (to abnormal DNA) a DNA damaging therapeutic agent is not selected for treatment.

According to some embodiments, the measured expression levels are used by generating a test score derived from the measured expression levels. Generating a test score may comprise steps of deriving a test score that captures the expression levels; providing a threshold score comprising information correlating the test score and therapy selection; and comparing the test score to the threshold score. A mitotic inhibitor is selected for treatment when the test score does not exceed the threshold score. A mitotic inhibitor is not selected for treatment when the test score exceeds the threshold score. In some embodiments, a DNA damaging therapeutic agent is selected for treatment when the test score exceeds the threshold score. A DNA damaging therapeutic agent is not selected for treatment when the test score does not exceed the threshold score.

Also provided is a method of treating a subject having a prostate cancer comprising administering a mitotic inhibitor to the subject, wherein the subject is predicted to be responsive to the mitotic inhibitor, or the therapy is selected, on the basis of measured expression levels of at least one gene selected from Table 1-45 in a sample from the subject. This may be phrased as a mitotic inhibitor for use in a method of treating a subject having a prostate cancer, wherein the subject is predicted to be responsive to the mitotic inhibitor, or the therapy is selected, on the basis of measured expression levels of at least one gene selected from Table 1-45 in a sample from the subject. This may also be phrased as use of a mitotic inhibitor in the manufacture of a medicament for treating a subject having a prostate cancer, wherein the subject is predicted to be responsive to the mitotic inhibitor, or the therapy is selected, on the basis of measured expression levels of at least one gene selected from Table 1-45 in a sample from the subject.

The invention also provides a method of treating a subject having a prostate cancer comprising administering a DNA damaging therapeutic agent to the subject, wherein the subject is predicted to be responsive to the DNA damaging therapeutic agent, or the therapy is selected, on the basis of measured expression levels of at least one gene selected from Table 1-45 in a sample from the subject. This may be phrased as a DNA damaging therapeutic agent for use in a method of treating a subject having a prostate cancer, wherein the subject is predicted to be responsive to the DNA damaging therapeutic agent, or the therapy is selected, on the basis of measured expression levels of at least one gene selected from Table 1-45 in a sample from the subject. This may also be phrased as use of a DNA damaging therapeutic agent in the manufacture of a medicament for treating a subject having a prostate cancer, wherein the subject is predicted to be responsive to the DNA damaging therapeutic agent, or the therapy is selected, on the basis of measured expression levels of at least one gene selected from Table 1-45 in a sample from the subject.

According to all such methods of treatment and medical uses, the subject may be selected for treatment according to a method as described herein. According to some embodiments, the measured expression levels are used by generating a test score derived from the measured expression levels. Generating a test score may comprise steps of deriving a test score that captures the expression levels; providing a threshold score comprising information correlating the test score and responsiveness; and comparing the test score to the threshold score. A mitotic inhibitor is used to treat the subject when the test score does not exceed the threshold score. A mitotic inhibitor is not used to treat the subject when the test score exceeds the threshold score. In some embodiments, a DNA damaging therapeutic agent is used to treat the subject when the test score exceeds the threshold score. A DNA damaging therapeutic agent is not used to treat the subject when the test score does not exceed the threshold score.

According to all aspects of the invention the prostate cancer may be metastatic prostate cancer or may be predicted to be aggressive, metastatic or potentially metastatic prostate cancer on the basis of performance of a method as disclosed herein.

According to all aspects of the invention the at least one gene may be selected from CXCL10, MX1, IDO1, IFI44L, CD2, GBP5, PRAME, ITGAL, LRP4, and APOL3. Expression levels of all of these genes may be determined in some embodiments. Additionally or alternatively at least one gene may be selected from CDR1, FYB, TSPAN7, RAC2, KLHDC7B, GRB14, AC138128.1, KIF26A, CD274, CD109, ETV7, MFAP5, OLFM4, PI15, FOSB, FAM19A5, NLRC5, PRICKLE1, EGR1, CLDN10, ADAMTS4, SP140L, ANXA1, RSAD2, ESR1, IKZF3, OR2I1P, EGFR, NAT1, LATS2, CYP2B6, PTPRC, PPP1R1A and AL137218.1. Expression levels of all of these genes (34 or 44) may be determined in some embodiments. Expression levels of additional genes may also be determined in some embodiments.

According to all aspects of the invention the at least one gene may be selected from CXCL10, MX1, IDO1, IFI44L, CD2, GBP5, PRAME, ITGAL, LRP4, APOL3, CDR1, FYB, TSPAN7, RAC2, KLHDC7B, GRB14, KIF26A, CD274, CD109, ETV7, MFAP5, OLFM4, PI15, FOSB, FAM19A5, NLRC5, PRICKLE1, EGR1, CLDN10, ADAMTS4, SP140L, ANXA1, RSAD2, ESR1, IKZF3, OR2I1P, EGFR, NAT1, LATS2, CYP2B6, PTPRC and PPP1R1A. The expression level of at least one gene from this list together with at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40 or 41 etc. further genes may be measured. Thus, for example, the expression level of IDO1 together with at least one further gene may be measured. The at least one further gene may be selected from CXCL10, MX1, IFI44L, CD2, GBP5, PRAME, ITGAL, LRP4, APOL3, CDR1, FYB, TSPAN7, RAC2, KLHDC7B, GRB14, KIF26A, CD274, CD109, ETV7, MFAP5, OLFM4, PI15, FOSB, FAM19A5, NLRC5, PRICKLE1, EGR1, CLDN10, ADAMTS4, SP140L, ANXA1, RSAD2, ESR1, IKZF3, OR2I1P, EGFR, NAT1, LATS2, CYP2B6, PTPRC and PPP1R1A. As a further example, the expression level of CD274 together with at least one further gene may be measured. The at least one further gene may be selected from CXCL10, MX1, IDO1, IFI44L, CD2, GBP5, PRAME, ITGAL, LRP4, APOL3, CDR1, FYB, TSPAN7, RAC2, KLHDC7B, GRB14, KIF26A, CD109, ETV7, MFAP5, OLFM4, PI15, FOSB, FAM19A5, NLRC5, PRICKLE1, EGR1, CLDN10, ADAMTS4, SP140L, ANXA1, RSAD2, ESR1, IKZF3, OR2I1P, EGFR, NAT1, LATS2, CYP2B6, PTPRC and PPP1R1A. In some embodiments, the expression level of IDO1 and CD274 may be measured. In some embodiments, the expression level of IDO1 and CD274 together with at least one further gene may be measured. The at least one further gene may be selected from CXCL10, MX1, IFI44L, CD2, GBP5, PRAME, ITGAL, LRP4, APOL3, CDR1, FYB, TSPAN7, RAC2, KLHDC7B, GRB14, KIF26A, CD109, ETV7, MFAP5, OLFM4, PI15, FOSB, FAM19A5, NLRC5, PRICKLE1, EGR1, CLDN10, ADAMTS4, SP140L, ANXA1, RSAD2, ESR1, IKZF3, OR2I1P, EGFR, NAT1, LATS2, CYP2B6, PTPRC and PPP1R1A.

Target sequences for use according to all aspects of the invention (i.e. sequences to which primers and/or probes may hybridize and/or which may be amplified or otherwise detected) may comprise, consist essentially of, or consist of the nucleotide sequences of any one or more of SEQ ID Nos 1751-3500. Such target sequences may represent an aspect of the invention. The methods may comprise measuring the expression level of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43 or 44 of the genes from Table 2B. The methods may comprise measuring the expression level of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41 or 42 of the genes from Table 2C.

According to all aspects of the invention, any suitable mitotic inhibitor may be employed. In some embodiments, the mitotic inhibitor comprises a vinca alkaloid and/or a taxane. A suitable vinca alkaloid is vinorelbine. In some embodiments, the taxane is docetaxel or paclitaxel. Also, according to all aspects, a mitotic inhibitor may be used as a sole therapy. In some embodiments, the mitotic inhibitor is administered and a DNA damaging therapeutic agent is not administered. Thus, in some embodiments the therapy is not a combination therapy.

According to all relevant aspects, any suitable DNA damaging therapeutic agent may be employed. In some embodiments, the DNA-damaging therapeutic agent comprises one or more substances selected from the group consisting of: a DNA damaging agent, a DNA repair targeted therapy, an inhibitor of DNA damage signalling, an inhibitor of DNA damage induced cell cycle arrest, a histone deacetylase inhibitor, a heat shock protein inhibitor and an inhibitor of DNA synthesis.

In some embodiments, the DNA-damaging therapeutic agent comprises one or more of a platinum-containing agent, a nucleoside analogue such as gemcitabine or 5-fluorouracil or a prodrug thereof such as capecitabine, an anthracycline such as epirubicin or doxorubicin, an alkylating agent such as cyclophosphamide, an ionising radiation or a combination of radiation and chemotherapy (chemoradiation).

In some embodiments, the DNA-damaging therapeutic agent comprises a platinum-containing agent. The platinum based agent may be selected from cisplatin, carboplatin and oxaliplatin.

In certain embodiments, the DNA damaging therapeutic agent comprises a PARP inhibitor.

According to all aspects of the invention, the therapy may be adjuvant treatment and/or neoadjuvant treatment.

It is also shown herein that the subgroup of prostate cancer is more aggressive and likely to recur. Thus, there is provided a method of predicting recurrence of prostate cancer in a subject and/or identifying a prostate cancer likely to recur comprising:

a. measuring expression levels of at least one gene selected from Table 1-45 in a sample from the subject;

b. using the measured expression levels to determine whether the prostate cancer has a deficiency in DNA damage repair and/or displays immune activation (to abnormal DNA); wherein if the prostate cancer has a deficiency in DNA damage repair and/or displays immune activation (to abnormal DNA) a high likelihood of recurrence is predicted and/or a prostate cancer likely to recur is identified.

Recurrence may be considered coterminous with relapse, as would be understood by the skilled person.

Recurrence may be clinical recurrence, metastatic recurrence or biochemical recurrence. In the context of prostate cancer biochemical recurrence means a rise in the level of PSA in a subject after treatment for prostate cancer. Biochemical recurrence may indicate that the prostate cancer has not been treated effectively or has recurred. Recurrence may be following surgery, for example radical prostatectomy and/or following radiotherapy.

According to some embodiments, the measured expression levels are used by generating a test score derived from the measured expression levels. Generating a test score may comprise steps of deriving a test score that captures the expression levels; providing a threshold score comprising information correlating the test score and the relevant biology; and comparing the test score to the threshold score. A high likelihood of recurrence is predicted and/or a prostate cancer likely to recur is identified when the test score exceeds the threshold score. A lower likelihood of recurrence is predicted and/or a prostate cancer less likely to recur is identified when the test score does not exceed the threshold score.

The present invention relates to prediction of response to therapeutic agents (such as mitotic inhibitors and DNA-damaging therapeutic agents) using different classifications of response, such as overall survival, progression free survival, disease free survival, radiological response as defined by RECIST, complete response, partial response, stable disease and serological markers. In specific embodiments this invention can be used to evaluate standard chest roentgenography, computed tomography (CT), perfusion CT, dynamic contrast material-enhanced magnetic resonance (MR), diffusion-weighted (DW) MR or positron emission tomography (PET) with the glucose analog fluorine 18 fluorodeoxyglucose (FDG) (FDG-PET) response in prostate cancer treated with therapeutic agents, including combination therapies.

The present invention relies upon identification, within a larger group of metastatic prostate cancer, of a specific molecular subtype. This molecular subtype is characterized by deficiency in DNA damage repair and/or immune activation (to abnormal DNA). This molecular subtype can, in some embodiments, be detected by the use of various different gene classifiers as disclosed herein, each termed a "DDRD classifier".

In another aspect, the present invention relates to kits for performing the methods of the invention. Such kits may be for performing nucleic acid amplification, including PCR and all variants thereof such as real-time and end point methods and qPCR, next generation sequencing (NGS), including RNA-seq, microarray, branched DNA/RNA (bDNA/RNA) assays and immunoassays such as immunohistochemistry, ELISA, Western blot and the like. Such kits include appropriate reagents and directions to assay the expression of the genes or gene products and quantify mRNA or protein expression. The kits may include suitable primers and/or probes to detect the expression levels of at least one of the genes in Table 1A, 1B and/or 1C (or in any of Tables 1-45). In some embodiments, the kits may also contain the specific therapeutic agent to be administered in the event that the test predicts responsiveness. This agent may be provided in a form, such as a dosage form, that is tailored to prostate cancer treatment specifically. The kit may be provided with suitable instructions for administration according to prostate cancer treatment regimens.

The invention provides, and the kits of the invention may incorporate, the probe sequences disclosed herein with reference to Table 1A. Table 1A lists the SEQ ID Nos for the individual probes used to measure expression levels of the genes identified in the table. Thus, the invention provides a probe comprising, consisting essentially of or consisting of the sequence of any one of SEQ ID NOs 1-1750. Any one or more, up to all, of the probes may be included in the kits of the invention. The kits of the invention may incorporate primers and/or probes that hybridize with the target sequences of any one or more of SEQ ID Nos 1751-3500. The kits of the invention may incorporate primers and/or probes that generate an amplicon comprising at least a portion, up to all of, the nucleotide sequence of any one or more of SEQ ID Nos 1751-3500.

The invention also provides methods for identifying prostate tumours with deficiency in DNA damage repair and/or that display immune activation (to abnormal DNA). The invention can be used to identify patients that are sensitive to and respond, or are resistant to and do not respond, to therapeutic agents such as mitotic inhibitors and DNA-damaging therapeutic agents, such as drugs that damage DNA directly, damage DNA indirectly or inhibit normal DNA damage signaling and/or repair processes.

The invention also relates to guiding conventional treatment of patients. The invention also relates to selecting patients for clinical trials where novel therapeutic agents, such as mitotic inhibitors and drugs of the classes that directly or indirectly affect DNA damage and/or DNA damage repair are to be tested. The present invention and methods accommodate the use of archived formalin fixed paraffin-embedded (FFPE) biopsy material, including fine needle aspiration (FNA) as well as fresh/frozen (FF) tissue, for assay of all transcripts in the invention, and are therefore compatible with the most widely available type of biopsy material. The expression level may be determined using RNA obtained from FFPE tissue, fresh frozen tissue or fresh tissue that has been stored in solutions such as RNAlater®. Liquid biopsies are also contemplated.

DESCRIPTION OF THE FIGURES

Figure 1: Molecular subgroups of prostate cancer

Figure 2: Molecular cluster 2 is defined by activated immune gene expression

Figure 3: DDRD scores across the groups

Figure 4: DDRD scores in Prostate Cancer TCGA samples

Figure 5: DDRD signature is prognostic in MSKCC dataset for PSA relapse

Figure 6: DDRD test identifies mCRPC patients that have worse outcome following Docetaxel treatment

DETAILED DESCRIPTION OF THE INVENTION

Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods, devices, and materials similar or equivalent to those described herein can be used in the practice or testing of the invention, the preferred methods, devices and materials are now described.

All publications, published patent documents, and patent applications cited in this application are indicative of the level of skill in the art(s) to which the application pertains.

All publications, published patent documents, and patent applications cited herein are hereby incorporated by reference to the same extent as though each individual publication, published patent document, or patent application was specifically and individually indicated as being incorporated by reference.

The articles "a" and "an" are used herein to refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, "an element" means one element or more than one element, unless explicitly indicated to the contrary.

A major goal of current research efforts in cancer is to increase the efficacy of perioperative systemic therapy in patients by incorporating molecular parameters into clinical therapeutic decisions. Pharmacogenetics/genomics is the study of genetic/genomic factors involved in an individual's response to a foreign compound or drug. Agents or modulators which have a stimulatory or inhibitory effect on expression of a marker of the invention can be administered to individuals to treat (prophylactically or therapeutically) prostate cancer in a patient. It is ideal to also consider the pharmacogenomics of the individual in conjunction with such treatment. Differences in metabolism of therapeutics may possibly lead to severe toxicity or therapeutic failure by altering the relationship between dose and blood concentration of the pharmacologically active drug. Thus, understanding the pharmacogenomics of an individual permits the selection of effective agents (e.g., drugs) for prophylactic or therapeutic treatments. Such pharmacogenomics can further be used to determine appropriate dosages and therapeutic regimens. Accordingly, the level of expression of a marker of the invention in an individual can be determined to thereby select appropriate agent(s) for therapeutic or prophylactic treatment of the individual.

The invention is directed to the application of a collection of gene or gene product markers (hereinafter referred to as "biomarkers") expressed in certain prostate cancer cells/tissue for predicting responsiveness to treatment using mitotic inhibitors (and/or DNA-damaging therapeutic agents in some embodiments). In different aspects, this biomarker list may form the basis of a single parameter or multiparametric predictive test that could be delivered using methods known in the art such as microarray, Q-PCR, NGS (e.g. RNA-seq), immunohistochemistry, bDNA/bRNA, ELISA or other technologies that can quantify mRNA or protein expression.

The present invention also relates to kits and methods that are useful for prognosis following cytotoxic chemotherapy or selection of specific treatments for prostate cancer. Methods are provided such that when some or all of the transcripts are over or under-expressed, the expression profile indicates responsiveness or resistance to mitotic inhibitors and/or DNA-damaging therapeutic agents. These kits and methods employ gene or gene product markers that are differentially expressed in tumours of patients with prostate cancer, in particular prostate cancer with a deficiency in DNA damage repair and/or displaying immune activation (to abnormal DNA). In one embodiment of the invention, the expression profiles of these biomarkers are correlated with clinical outcome (response or survival) in archival tissue samples under a statistical method or a correlation model to create a database or model correlating expression profile with responsiveness to one or more therapeutic agents (e.g. mitotic inhibitors). The predictive model may then be used to predict the responsiveness in a patient whose responsiveness to the therapeutic agent(s) is unknown. In many other embodiments, a patient population can be divided into at least two classes based on patients' clinical outcome, prognosis, or responsiveness to therapeutic agents, and the biomarkers are substantially correlated with a class distinction between these classes of patients. The biological pathways described herein have been shown to be predictive of responsiveness to treatment of prostate cancer using therapeutic agents such as mitotic inhibitors.
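As a hedged illustration of correlating expression-derived scores with known clinical outcome to build a predictive model, the sketch below picks a score cut-off from archival samples whose response is already known. The specification does not state which statistical method is used; maximising Youden's J statistic (sensitivity + specificity - 1) is an assumption chosen purely for illustration, and the function name and example data are hypothetical.

```python
def choose_threshold(scores, non_responder):
    """Pick a score cut-off from samples with known outcome.

    scores        : per-sample test scores (floats)
    non_responder : per-sample booleans, True if the patient did NOT
                    respond to the mitotic inhibitor

    Assumption for illustration: the cut-off maximising Youden's J
    (sensitivity + specificity - 1) is chosen, with a score above the
    cut-off read as a non-responder call. Ties keep the lowest cut-off.
    """
    pairs = list(zip(scores, non_responder))
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in pairs if s > cutoff and y)
        fn = sum(1 for s, y in pairs if s <= cutoff and y)
        tn = sum(1 for s, y in pairs if s <= cutoff and not y)
        fp = sum(1 for s, y in pairs if s > cutoff and not y)
        sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff


if __name__ == "__main__":
    # Made-up training data: three responders, two non-responders.
    training_scores = [0.1, 0.2, 0.3, 0.8, 0.9]
    training_labels = [False, False, False, True, True]
    print(choose_threshold(training_scores, training_labels))
```

The resulting cut-off could then serve as the threshold score against which the test score of a patient with unknown responsiveness is compared.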

Predictive Marker Panels/Expression Classifiers

A unique collection of biomarkers as a genetic classifier expressed in prostate cancer cells/tissue is provided that is useful in determining responsiveness or resistance to therapeutic agents, such as mitotic inhibitors (and potentially DNA-damaging therapeutic agents), used to treat prostate cancer. Such a collection may be termed a "marker panel", "expression classifier", or "classifier". One such collection is shown in Table 1A, together with an indication of relevant accession numbers. This collection was derived from an original collection of biomarkers as shown in Tables 1B and 1C (see WO 2012/037378) which were then mapped to a prostate cancer platform (see the Examples herein). A hierarchical clustering analysis identified a DDRD cluster that defines those individuals likely to respond to certain treatments of prostate cancer. This cluster, or collection, of biomarkers makes up Table 1A. This represents 42 different genes and 152 different target sequences within those 42 genes. The invention may involve determining expression levels of any one or more of these genes or target sequences. Evidence is also presented herein that a related 44 gene classifier (Table 2B) is effective in predicting responsiveness to mitotic inhibitors in prostate cancer.

The biomarkers useful in the present methods are thus identified in the tables herein. These biomarkers are identified as having predictive value to determine the response, or lack thereof, of a patient (having a particular subtype of prostate cancer) to a therapeutic agent. Their expression correlates with the response to an agent, and more specifically, a mitotic inhibitor or a DNA-damaging therapeutic agent. By examining the expression of a collection of the identified biomarkers in cells from a prostate tumour, it is possible to determine which therapeutic agent or combination of agents will be most likely to reduce the growth rate of the cancer, and in some embodiments, prostate cancer cells. By examining a collection of identified transcript gene or gene product markers, it is also possible to determine which therapeutic agent or combination of agents will be the least likely to reduce the growth rate of the cancer. By examining the expression of a collection of biomarkers, it is therefore possible to eliminate ineffective or inappropriate therapeutic agents.

Importantly, in certain embodiments, these determinations can be made on a patient-by-patient basis or on an agent-by-agent basis. Thus, one can determine whether or not a particular therapeutic regimen is likely to benefit a particular patient or type of patient, and/or whether a particular regimen should be continued.

Table 1A - Genes (biomarkers), probeset IDs and number of probes aligned for defining DDRD status in prostate cancer patients

Probe Set ID | Orientation | No. probes aligned | Probe SEQ ID NOs | Ensembl Gene | Gene Symbol | Target SEQ ID NOs | Entrez Gene ID
PC3P.15254.C1_s_at | Sense (Fully Exonic) | 11 | 243-253 | ENSG00000158859 | ADAMTS4 | 1993-2003 | 9507
PCEM.1191_at | Sense (Fully Exonic) | 11 | 1410-1420 | ENSG00000158859 | ADAMTS4 | 3160-3170 | 9507
PCADNP.1286_s_at | Sense (Fully Exonic) | 11 | 1080-1090 | ENSG00000158859 | ADAMTS4 | 2830-2840 | 9507
PCADNP.9420_at | Sense (Fully Exonic) | 11 | 1355-1376 | ENSG00000158859 | ADAMTS4 | 3105-3126 | 9507
PCHP.583_s_at | Sense (Fully Exonic) | 11 | 1619-1629 | ENSG00000135046 | ANXA1 | 3369-3379 | 301
PC3P.3611.C2_at | Sense (Fully Exonic) | 11 | 360-370 | ENSG00000135046 | ANXA1 | 2110-2120 | 301
PC3P.3611.C2_x_at | Sense (Fully Exonic) | 11 | 371-381 | ENSG00000135046 | ANXA1 | 2121-2131 | 301
PCADNP.6684_s_at | Sense (Fully Exonic) | 11 | 1322-1332 | ENSG00000135046 | ANXA1 | 3072-3082 | 301
PC3SNGnh.3559_at | Sense (Fully Exonic) | 11 | 596-606 | ENSG00000128284 | APOL3 | 2346-2356 | 80833
PC3SNGnh.3559_x_at | Sense (Fully Exonic) | 11 | 607-617 | ENSG00000128284 | APOL3 | 2357-2367 | 80833
PC3P.562.C1_s_at | Sense (Fully Exonic) | 11 | 404-414 | ENSG00000128284 | APOL3 | 2154-2164 | 80833
PCADNP.10333_s_at | Sense (Fully Exonic) | 11 | 1058-1068 | ENSG00000156535 | CD109 | 2808-2818 | 135228
PCRS2.4932_s_at | Sense (Fully Exonic) | 11 | 1729-1739 | ENSG00000156535 | CD109 | 3479-3489 | 135228
PCADNP.5793_x_at | Sense (Fully Exonic) | 11 | 1311-1321 | ENSG00000156535 | CD109 | 3061-3071 | 135228
PC3P.10604.C1_s_at | Sense (Fully Exonic) | 11 | 111-121 | ENSG00000116824 | CD2 | 1861-1871 | 914
PCADA.12514_at | Sense (Fully Exonic) | 11 | 728-738 | ENSG00000120217 | CD274 | 2478-2488 | 29126
PCADNP.12746_at | Sense (Fully Exonic) | 11 | 1069-1079 | ENSG00000120217 | CD274 | 2819-2829 | 29126
PCADNP.17113_s_at | Sense (Fully Exonic) | 11 | 1190-1200 | ENSG00000184258 | CDR1 | 2940-2950 | 1038
PCADNP.16641_s_at | Sense (Fully Exonic) | 11 | 1157-1167 | ENSG00000134873 | CLDN10 | 2907-2917 | 9071
PCRS.1827_s_at | Sense (Fully Exonic) | 11 | 1685-1695 | ENSG00000134873 | CLDN10 | 3435-3445 | 9071
PCHP.1577_s_at | Sense (Fully Exonic) | 11 | 1575-1585 | ENSG00000169245 | CXCL10 | 3325-3335 | 3627
PCADNP.16954_s_at | Sense (Fully Exonic) | 11 | 1168-1189 | ENSG00000197408 | CYP2B6 | 2918-2939 | 1555
PCEM.1821_at | Sense (Fully Exonic) | 11 | 1465-1475 | ENSG00000197408 | CYP2B6 | 3215-3225 | 1555
PCEM.1821_x_at | Sense (Fully Exonic) | 11 | 1476-1486 | ENSG00000197408 | CYP2B6 | 3226-3236 | 1555
PCADNP.15001_x_at | Sense (Fully Exonic) | 11 | 1124-1134 | ENSG00000197408 | CYP2B6 | 2874-2884 | 1555
PCADNP.15400_s_at | Sense (includes Intronic) | 11 | 1135-1145 | ENSG00000146648 | EGFR | 2885-2895 | 1956
PC3P.11520.C1_s_at | Sense (Fully Exonic) | 11 | 122-132 | ENSG00000146648 | EGFR | 1872-1882 | 1956
PCHP.1544_s_at | Sense (Fully Exonic) | 11 | 1553-1563 | ENSG00000146648 | EGFR | 3303-3313 | 1956
PCADNP.6698_s_at | Sense (includes Intronic) | 11 | 1333-1343 | ENSG00000146648 | EGFR | 3083-3093 | 1956
PCEM.1412_at | Sense (Fully Exonic) | 11 | 1443-1453 | ENSG00000146648 | EGFR | 3193-3203 | 1956
PCADNP.14609_s_at | Sense (includes Intronic) | 11 | 1102-1112 | ENSG00000146648 | EGFR | 2852-2862 | 1956
PCHP.1559_s_at | Sense (Fully Exonic) | 11 | 1564-1574 | ENSG00000146648 | EGFR | 3314-3324 | 1956
PCADA.4898_s_at | Sense (includes Intronic) | 11 | 893-903 | ENSG00000146648 | EGFR | 2643-2653 | 1956
PCHP.1505_s_at | Sense (Fully Exonic) | 11 | 1542-1552 | ENSG00000146648 | EGFR | 3292-3302 | 1956
PCHP.1490_s_at | Sense (Fully Exonic) | 11 | 1531-1541 | ENSG00000146648 | EGFR | 3281-3291 | 1956
3Snip.7697-26104a_s_at | Sense (includes Intronic) | 11 | 56-66 | ENSG00000146648 | EGFR | 1806-1816 | 1956
PC3P.14538.C1_s_at | Sense (Fully Exonic) | 11 | 221-231 | ENSG00000146648 | EGFR | 1971-1981 | 1956
PCPD.13671.C1_s_at | Sense (Fully Exonic) | 11 | 1641-1651 | ENSG00000146648 | EGFR | 3391-3401 | 1956
3Snip.9132-885a_s_at | Sense (includes Intronic) | 11 | 78-88 | ENSG00000146648 | EGFR | 1828-1838 | 1956
PCADA.9138_s_at | Sense (includes Intronic) | 11 | 1036-1046 | ENSG00000146648 | EGFR | 2786-2796 | 1956
PC3P.964.C1_at | Sense (Fully Exonic) | 11 | 486-496 | ENSG00000146648 | EGFR | 2236-2246 | 1956
PCADNP.18033_at | Sense (Fully Exonic) | 11 | 1256-1266 | ENSG00000146648 | EGFR | 3006-3016 | 1956
PC3P.1443.C1-980a_s_at | Sense (Fully Exonic) | 11 | 188-198 | ENSG00000120738 | EGR1 | 1938-1948 | 1958
3Snip.2719-1678a_s_at | Sense (Fully Exonic) | 11 | 12-22 | ENSG00000120738 | EGR1 | 1762-1772 | 1958
PC3P.1443.C1_s_at | Sense (Fully Exonic) | 11 | 199-209 | ENSG00000120738 | EGR1 | 1949-1959 | 1958
PC3P.1443.C1_x_at | Sense (Fully Exonic) | 11 | 210-220 | ENSG00000120738 | EGR1 | 1960-1970 | 1958
PCEM.1333_x_at | Sense (Fully Exonic) | 11 | 1432-1442 | ENSG00000091831 | ESR1 | 3182-3192 | 2099
PCEM.2344_s_at | Sense (Fully Exonic) | 11 | 1487-1497 | ENSG00000091831 | ESR1 | 3237-3247 | 2099
PC3SNG.770-2499a_s_at | Sense (Fully Exonic) | 11 | 574-584 | ENSG00000091831 | ESR1 | 2324-2334 | 2099
PCHP.473_s_at | Sense (Fully Exonic) | 11 | 1597-1607 | ENSG00000091831 | ESR1 | 3347-3357 | 2099
PCADNP.17944_s_at | Sense (Fully Exonic) | 11 | 1245-1255 | ENSG00000091831 | ESR1 | 2995-3005 | 2099
PCADA.5502_s_at | Sense (includes Intronic) | 11 | 915-925 | ENSG00000091831 | ESR1 | 2665-2675 | 2099
PCEM.1333_at | Sense (Fully Exonic) | 11 | 1421-1431 | ENSG00000091831 | ESR1 | 3171-3181 | 2099
PCADNP.9855_s_at | Sense (Fully Exonic) | 11 | 1377-1387 | ENSG00000091831 | ESR1 | 3127-3137 | 2099
PCADA.6606_s_at | Sense (Fully Exonic) | 11 | 992-1002 | ENSG00000091831 | ESR1 | 2742-2752 | 2099
PCADNP.14980_s_at | Sense (Fully Exonic) | 11 | 1113-1123 | ENSG00000091831 | ESR1 | 2863-2873 | 2099
PCADNP.20272_x_at | Sense (Fully Exonic) | 11 | 1289-1299 | ENSG00000010030 | ETV7 | 3039-3049 | 51513
PC3SNG.1941-45a_s_at | Sense (Fully Exonic) | 11 | 508-518 | ENSG00000010030 | ETV7 | 2258-2268 | 51513
PC3P.16607.C1_s_at | Sense (Fully Exonic) | 11 | 305-315 | ENSG00000219438 | FAM19A5 | 2055-2065 | 25817
PCADA.6490_x_at | Sense (Fully Exonic) | 11 | 981-991 | ENSG00000219438 | FAM19A5 | 2731-2741 | 25817
PCADA.6490_at | Sense (Fully Exonic) | 11 | 970-980 | ENSG00000219438 | FAM19A5 | 2720-2730 | 25817
PC3SNG.760-21a_s_at | Sense (Fully Exonic) | 11 | 563-573 | ENSG00000219438 | FAM19A5 | 2313-2323 | 25817
PCEM.1525_s_at | Sense (Fully Exonic) | 11 | 1454-1464 | ENSG00000125740 | FOSB | 3204-3214 | 2354
PCPD.3244.C1_s_at | Sense (Fully Exonic) | 11 | 1663-1673 | ENSG00000125740 | FOSB | 3413-3423 | 2354
PC3P.1906.C1_s_at | Sense (Fully Exonic) | 11 | 338-348 | ENSG00000125740 | FOSB | 2088-2098 | 2354
PC3P.1906.C1-568a_s_at | Sense (Fully Exonic) | 11 | 327-337 | ENSG00000125740 | FOSB | 2077-2087 | 2354
PC3P.11652.C1_x_at | Sense (Fully Exonic) | 11 | 133-143 | ENSG00000125740 | FOSB | 1883-1893 | 2354
PCADA.3896_s_at | Sense (Fully Exonic) | 11 | 838-848 | ENSG00000082074 | FYB | 2588-2598 | 2533
PCADA.2440_x_at | Sense (includes Intronic) | 11 | 750-760 | ENSG00000082074 | FYB | 2500-2510 | 2533
PCADA.2440_at | Sense (includes Intronic) | 11 | 739-749 | ENSG00000082074 | FYB | 2489-2499 | 2533
PC3SNG.2978-49a_s_at | Sense (Fully Exonic) | 11 | 519-529 | ENSG00000082074 | FYB | 2269-2279 | 2533
PCADNP.18094_s_at | Sense (Fully Exonic) | 11 | 1267-1277 | ENSG00000082074 | FYB | 3017-3027 | 2533
PC3SNG.7312-305a_s_at | Sense (Fully Exonic) | 11 | 552-562 | ENSG00000082074 | FYB | 2302-2312 | 2533
PC3SNGnh.3118_s_at | Sense (Fully Exonic) | 11 | 585-595 | ENSG00000154451 | GBP5 | 2335-2345 | 115362
PCADA.4270_s_at | Sense (Fully Exonic) | 11 | 849-859 | ENSG00000154451 | GBP5 | 2599-2609 | 115362
PCHP.1101_s_at | Sense (Fully Exonic) | 11 | 1520-1530 | ENSG00000115290 | GRB14 | 3270-3280 | 2888
PCHP.1085_x_at | Sense (Fully Exonic) | 11 | 1509-1519 | ENSG00000115290 | GRB14 | 3259-3269 | 2888
PC3SNG.1731-88a_s_at | Sense (Fully Exonic) | 11 | 497-507 | ENSG00000131203 | IDO1 | 2247-2257 | 3620
PC3P.14080.C1_s_at | Sense (Fully Exonic) | 11 | 144-154 | ENSG00000137959 | IFI44L | 1894-1904 | 10964
PC3SNGnh.7276_s_at | Sense (Fully Exonic) | 11 | 684-694 | ENSG00000137959 | IFI44L | 2434-2444 | 10964
PCRS3.1964_s_at | Sense (Fully Exonic) | 11 | 1740-1750 | ENSG00000137959 | IFI44L | 3490-3500 | 10964
PCADA.5842_s_at | Sense (Fully Exonic) | 11 | 948-958 | ENSG00000005844 | ITGAL | 2698-2708 | 3683
PCPD.27349.C1_s_at | Sense (Fully Exonic) | 11 | 1652-1662 | ENSG00000005844 | ITGAL | 3402-3412 | 3683
PCADA.4889_s_at | Sense (Fully Exonic) | 11 | 882-892 | ENSG00000066735 | KIF26A | 2632-2642 | 26153
PCADA.4889_at | Sense (Fully Exonic) | 11 | 871-881 | ENSG00000066735 | KIF26A | 2621-2631 | 26153
PCADA.10547_s_at | Sense (Fully Exonic) | 11 | 695-705 | ENSG00000066735 | KIF26A | 2445-2455 | 26153
PCADA.79_x_at | Sense (Fully Exonic) | 11 | 1014-1024 | ENSG00000130487 | KLHDC7B | 2764-2774 | 113730
PCADA.6149_s_at | Sense (Fully Exonic) | 11 | 959-969 | ENSG00000130487 | KLHDC7B | 2709-2719 | 113730
PCADA.79_at | Sense (Fully Exonic) | 11 | 1003-1013 | ENSG00000130487 | KLHDC7B | 2753-2763 | 113730
PCHP.533_s_at | Sense (Fully Exonic) | 11 | 1608-1618 | ENSG00000150457 | LATS2 | 3358-3368 | 26524
PC3SNG.3272-1416a_s_at | Sense (Fully Exonic) | 11 | 530-540 | ENSG00000150457 | LATS2 | 2280-2290 | 26524
PCADNP.15501_s_at | Sense (Fully Exonic) | 11 | 1146-1156 | ENSG00000150457 | LATS2 | 2896-2906 | 26524
3Snip.292-1275a_s_at | Sense (includes Intronic) | 11 | 34-44 | ENSG00000150457 | LATS2 | 1784-1794 | 26524
PC3P.14196.C1_at | Sense (Fully Exonic) | 11 | 177-187 | ENSG00000134569 | LRP4 | 1927-1937 | 4038
PCADA.10668_x_at | Sense (includes Intronic) | 11 | 717-727 | ENSG00000134569 | LRP4 | 2467-2477 | 4038
PCADA.10668_at | Sense (includes Intronic) | 11 | 706-716 | ENSG00000134569 | LRP4 | 2456-2466 | 4038
PC3SNG.4407-18a_s_at | Sense (Fully Exonic) | 11 | 541-551 | ENSG00000197614 | MFAP5 | 2291-2301 | 8076
3Snip.4760-1950a_s_at | Sense (Fully Exonic) | 11 | 45-55 | ENSG00000197614 | MFAP5 | 1795-1805 | 8076
PC3P.6933.C1_s_at | Sense (Fully Exonic) | 11 | 415-425 | ENSG00000157601 | MX1 | 2165-2175 | 4599
PCHP.1635_s_at | Sense (Fully Exonic) | 11 | 1586-1596 | ENSG00000171428 | NAT1 | 3336-3346 | 9
3Snip.7770-25236a_x_at | Sense (includes Intronic) | 11 | 67-77 | ENSG00000171428 | NAT1 | 1817-1827 | 9
PCHP.878_s_at | Sense (Fully Exonic) | 11 | 1630-1640 | ENSG00000171428 | NAT1 | 3380-3390 | 9
PC3P.16455.C1_s_at | Sense (Fully Exonic) | 11 | 254-264 | ENSG00000140853 | NLRC5 | 2004-2014 | 84166
3Snip.1260-664a_s_at | Sense (Fully Exonic) | 11 | 1-11 | ENSG00000102837 | OLFM4 | 1751-1761 | 10562
PC3P.2569.C1_s_at | Sense (Fully Exonic) | 11 | 349-359 | ENSG00000102837 | OLFM4 | 2099-2109 | 10562
PCEM.111_at | Sense (Fully Exonic) | 11 | 1388-1398 | ENSG00000237988 | OR2I1P | 3138-3148 | N/A
PCADNP.13531_s_at | Sense (Fully Exonic) | 11 | 1091-1101 | ENSG00000237988 | OR2I1P | 2841-2851 | N/A
PCEM.111_x_at | Sense (Fully Exonic) | 11 | 1399-1409 | ENSG00000237988 | OR2I1P | 3149-3159 | N/A
PC3P.8311.C1-482a_s_at | Sense (Fully Exonic) | 11 | 464-474 | ENSG00000137558 | PI15 | 2214-2224 | 51050
PCADNP.17332_s_at | Sense (Fully Exonic) | 11 | 1234-1244 | ENSG00000137558 | PI15 | 2984-2994 | 51050
PC3P.7245.C1_at | Sense (Fully Exonic) | 11 | 426-436 | ENSG00000137558 | PI15 | 2176-2186 | 51050
3Snip.2873-1277a_at | Sense (Fully Exonic) | 11 | 23-33 | ENSG00000137558 | PI15 | 1773-1783 | 51050
PC3P.7245.C1_x_at | Sense (Fully Exonic) | 11 | 437-447 | ENSG00000137558 | PI15 | 2187-2197 | 51050
PC3P.8311.C1_x_at | Sense (Fully Exonic) | 11 | 475-485 | ENSG00000137558 | PI15 | 2225-2235 | 51050
PC3P.17320.C1_at | Sense (Fully Exonic) | 11 | 316-326 | ENSG00000135447 | PPP1R1A | 2066-2076 | 5502
PC3P.16554.C1_at | Sense (Fully Exonic) | 11 | 283-293 | ENSG00000135447 | PPP1R1A | 2033-2043 | 5502
PC3P.16554.C1_x_at | Sense (Fully Exonic) | 11 | 294-304 | ENSG00000135447 | PPP1R1A | 2044-2054 | 5502
PC3P.10017.C1_x_at | Sense (Fully Exonic) | 11 | 100-110 | ENSG00000185686 | PRAME | 1850-1860 | 23532
PC3P.10017.C1_s_at | Sense (Fully Exonic) | 11 | 89-99 | ENSG00000185686 | PRAME | 1839-1849 | 23532
PCRS.1819_s_at | Sense (Fully Exonic) | 11 | 1674-1684 | ENSG00000139174 | PRICKLE1 | 3424-3434 | 144165
PC3SNGnh.5779_s_at | Sense (Fully Exonic) | 11 | 651-661 | ENSG00000139174 | PRICKLE1 | 2401-2411 | 144165
PCADNP.5098_s_at | Sense (Fully Exonic) | 11 | 1300-1310 | ENSG00000139174 | PRICKLE1 | 3050-3060 | 144165
PC3SNGnh.4539_x_at | Sense (includes Intronic) | 11 | 618-628 | ENSG00000139174 | PRICKLE1 | 2368-2378 | 144165
PCADA.3411_s_at | Sense (Fully Exonic) | 11 | 827-837 | ENSG00000081237 | PTPRC | 2577-2587 | 5788
PCADA.5534_s_at | Sense (Fully Exonic) | 22 | 926-947 | ENSG00000081237 | PTPRC | 2676-2697 | 5788
PCADA.42_s_at | Sense (Fully Exonic) | 11 | 860-870 | ENSG00000081237 | PTPRC | 2610-2620 | 5788
PC3SNGnh.6570_at | Sense (includes Intronic) | 11 | 662-672 | ENSG00000081237 | PTPRC | 2412-2422 | 5788
PC3P.16458.C1_s_at | Sense (Fully Exonic) | 18 | 265-282 | ENSG00000081237 | PTPRC | 2015-2032 | 5788
PC3SNGnh.6570_x_at | Sense (includes Intronic) | 11 | 673-683 | ENSG00000081237 | PTPRC | 2423-2433 | 5788
PC3P.8279.C1_s_at | Sense (Fully Exonic) | 16 | 448-463 | ENSG00000081237 | PTPRC | 2198-2213 | 5788
PC3P.14096.C1_x_at | Sense (Fully Exonic) | 11 | 155-176 | ENSG00000128340 | RAC2 | 1905-1926 | 5880
PCADNP.17166_x_at | Sense (Fully Exonic) | 11 | 1201-1233 | ENSG00000128340 | RAC2 | 2951-2983 | 5880
PCADNP.8864_at | Sense (includes Intronic) | 11 | 1344-1354 | ENSG00000134321 | RSAD2 | 3094-3104 | 91543
PCADA.9398_s_at | Sense (Fully Exonic) | 11 | 1047-1057 | ENSG00000134321 | RSAD2 | 2797-2807 | 91543
PCRS.427_x_at | Sense (Fully Exonic) | 11 | 1718-1728 | ENSG00000134321 | RSAD2 | 3468-3478 | 91543
PCRS.427_at | Sense (Fully Exonic) | 11 | 1707-1717 | ENSG00000134321 | RSAD2 | 3457-3467 | 91543
PCADA.8164_x_at | Sense (Fully Exonic) | 11 | 1025-1035 | ENSG00000134321 | RSAD2 | 2775-2785 | 91543
PCADA.2714_s_at | Sense (Fully Exonic) | 11 | 783-804 | ENSG00000185404 | SP140L | 2533-2554 | 93349
PC3P.14603.C1_s_at | Sense (Fully Exonic) | 11 | 232-242 | ENSG00000185404 | SP140L | 1982-1992 | 93349
PCADA.2714_at | Sense (Fully Exonic) | 11 | 761-782 | ENSG00000185404 | SP140L | 2511-2532 | 93349
PCADA.2714_x_at | Sense (Fully Exonic) | 11 | 805-826 | ENSG00000185404 | SP140L | 2555-2576 | 93349
PCADA.5480_x_at | Sense (includes Intronic) | 11 | 904-914 | ENSG00000156298 | TSPAN7 | 2654-2664 | 7102
PC3P.4522.C1_at | Sense (Fully Exonic) | 11 | 382-392 | ENSG00000156298 | TSPAN7 | 2132-2142 | 7102
PCEM.706_at | Sense (includes Intronic) | 11 | 1498-1508 | ENSG00000156298 | TSPAN7 | 3248-3258 | 7102
PC3P.4522.C1_x_at | Sense (Fully Exonic) | 11 | 393-403 | ENSG00000156298 | TSPAN7 | 2143-2153 | 7102
PCADNP.18547_s_at | Sense (Fully Exonic) | 11 | 1278-1288 | ENSG00000161405 | IKZF3 | 3028-3038 | 22806
PC3SNGnh.4739_at | Sense (includes Intronic) | 11 | 629-639 | ENSG00000161405 | IKZF3 | 2379-2389 | 22806
PCRS.218_s_at | Sense (Fully Exonic) | 11 | 1696-1706 | ENSG00000161405 | IKZF3 | 3446-3456 | 22806
PC3SNGnh.4739_x_at | Sense (includes Intronic) | 11 | 640-650 | ENSG00000161405 | IKZF3 | 2390-2400 | 22806

Table 1B - Original list of genes tested in breast cancer and mapped to prostate cancer

AS1 LOC100293679 BREM.2466_s_at 24

Table 1C - Original list of genes tested in breast cancer and mapped to prostate cancer

All or a portion of the biomarkers recited in Tables 1A, 1B and/or 1C may be used in the methods of the invention. For example, biomarker panels selected from the biomarkers in Tables 1A, 1B and 1C can be generated using the methods provided herein and can comprise between one and all of the biomarkers set forth in Tables 1A, 1B and/or 1C and each and every combination in between (e.g., four selected biomarkers, 16 selected biomarkers, 74 selected biomarkers, etc.). In some embodiments, the predictive biomarker set comprises at least 5, 10, 20, 40 or more biomarkers. In other embodiments, the predictive biomarker set comprises no more than 5, 10, 20, 40 or fewer biomarkers. In some embodiments, the predictive biomarker set includes a plurality of biomarkers listed in Tables 1A, 1B and/or 1C. In some embodiments the predictive biomarker set includes at least about 1%, about 5%, about 10%, about 20%, about 30%, about 40%, about 50%, about 60%, about 70%, about 80%, about 90%, about 95%, about 96%, about 97%, about 98%, or about 99% of the biomarkers listed in Tables 1A, 1B and/or 1C. Selected predictive biomarker sets can be assembled from the predictive biomarkers provided using methods described herein and analogous methods known in the art. In one embodiment, the biomarker panel contains all 42 biomarkers in Table 1A. In another embodiment, the biomarker panel contains the 152 different target sequences in Table 1A. In another embodiment, the biomarker panel corresponds to the 40 or 44 gene panel described in Tables 2A and 2B.

Predictive biomarker sets may be defined in combination with corresponding scalar weights of varying magnitude on the real scale. The weighted values are combined, through linear or non-linear, algebraic, trigonometric or correlative means, into a single scalar value via an algebraic, statistical learning, Bayesian, regression, or similar algorithm. Together with a mathematically derived decision function on the scalar value, this provides a predictive model by which expression profiles from samples may be resolved into discrete classes of responder or non-responder, resistant or non-resistant, to a specified drug or drug class. Such predictive models, including biomarker membership, are developed by learning weights and the decision threshold, optimized for sensitivity, specificity, negative and positive predictive values, hazard ratio or any combination thereof, under cross-validation, bootstrapping or similar sampling techniques, from a set of representative expression profiles from historical patient samples with known drug response and/or resistance or with known molecular subtype classification. In one embodiment, the biomarkers are used to form a weighted sum of their signals, where individual weights can be positive or negative. The resulting sum ("decisive function") is compared with a pre-determined reference point or value. The comparison with the reference point or value may be used to diagnose, or predict a clinical condition or outcome. As described above, one of ordinary skill in the art will appreciate that the biomarkers included in the classifier or classifiers provided in Tables 1A, 1B and 1C will carry unequal weights in a classifier for responsiveness or resistance to a therapeutic agent. Therefore, while as few as one sequence may be used to diagnose or predict an outcome such as responsiveness to a therapeutic agent, the specificity and sensitivity of the diagnosis or the prediction accuracy may increase using more sequences.

As used herein, the term "weight" refers to the relative importance of an item in a statistical calculation. The weight of each biomarker in a gene expression classifier may be determined on a data set of patient samples using analytical methods known in the art. Gene specific bias values may also be applied. Gene specific bias may be required to mean centre each gene in the classifier relative to a training data set, as would be understood by one skilled in the art.
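By way of illustration only, the weighted sum with gene-specific mean centring described above may be written as follows. The notation is introduced here as an assumption and is not taken from the source: $x_i$ is the measured expression of biomarker $i$, $w_i$ its weight, $b_i$ its gene-specific bias, $n$ the number of biomarkers in the classifier, and $t$ the pre-determined reference value. Which side of $t$ corresponds to which class is likewise assumed for the purpose of the sketch.

```latex
S = \sum_{i=1}^{n} w_i \,\bigl(x_i - b_i\bigr),
\qquad
\text{class}(S) =
\begin{cases}
\text{positive} & \text{if } S > t,\\[2pt]
\text{negative} & \text{if } S \le t.
\end{cases}
```

Under this form, a biomarker with a positive weight raises the score when its expression exceeds its bias, and a biomarker with a negative weight lowers it.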

In one embodiment the biomarker panel is directed to the 40 biomarkers detailed in Table 2A with corresponding ranks and weights detailed in the table or alternative rankings and weightings. In another embodiment, the biomarker panel is directed to the 44 biomarkers detailed in Table 2B with corresponding ranks and weights detailed in the table or alternative rankings and weightings. Tables 2A and 2B rank the biomarkers in order of decreasing weight in the classifier, defined as the rank of the average weight in the compound decision score function measured under cross-validation.
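The ordering used in Tables 2A and 2B can be sketched in a few lines of Python. This is a minimal illustration that assumes the rank follows the magnitude (absolute value) of the weight, which is consistent with negatively weighted genes such as LRP4 appearing among positively weighted ones in Table 2A. The function name and the five-gene subset are hypothetical, and the listed weights stand in for the cross-validated average weights mentioned above.

```python
# Illustrative subset of Table 2A (gene symbol -> weight); not the full classifier.
TABLE_2A_SUBSET = {
    "SP140L": 0.014646025,
    "LRP4": -0.015129969,
    "GBP5": 0.022389581,
    "ITGAL": 0.016783359,
    "CXCL10": 0.021941734,
}


def rank_by_weight_magnitude(weights):
    """Return (rank, gene, weight) tuples ordered by decreasing |weight|."""
    ordered = sorted(weights.items(), key=lambda item: abs(item[1]), reverse=True)
    return [(rank, gene, weight) for rank, (gene, weight) in enumerate(ordered, start=1)]


if __name__ == "__main__":
    for rank, gene, weight in rank_by_weight_magnitude(TABLE_2A_SUBSET):
        print(rank, gene, weight)
```

The ranks returned are relative to the subset supplied, so they differ from the ranks in the full 40-gene table.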

Table 2A

Gene IDs and EntrezGene IDs for 40-gene DDRD classifier model with associated ranking and weightings

DDRD classifier - 40 gene model

Rank Gene symbol EntrezGene ID Weight

1 GBP5 115362 0.022389581

2 CXCL10 3627 0.021941734

3 IDO1 3620 0.020991115

4 MX1 4599 0.020098675

5 IFI44L 10964 0.018204957

6 CD2 914 0.018080661

7 PRAME 23532 0.016850837

8 ITGAL 3683 0.016783359

9 LRP4 4038 -0.015129969

10 SP140L 93349 0.014646025

11 APOL3 80833 0.014407174

12 FOSB 2354 -0.014310521

13 CDR1 1038 -0.014209848

14 RSAD2 91543 0.014177132

15 TSPAN7 7102 -0.014111562

16 RAC2 5880 0.014093627

17 FYB 2533 0.01400475

18 KLHDC7B 113730 0.013298413

19 GRB14 2888 0.013031204

20 KIF26A 26153 -0.012942351

21 CD274 29126 0.012651964

22 CD109 135228 -0.012239425

23 ETV7 51513 0.011787297

24 MFAP5 8076 -0.011480443

25 OLFM4 10562 -0.011130113

26 PI15 51050 -0.010904326

27 FAM19A5 25817 -0.010500936

28 NLRC5 84166 0.009593449

29 EGR1 1958 -0.008947963

30 ANXA1 301 -0.008373991

31 CLDN10 9071 -0.008165127

32 ADAMTS4 9507 -0.008109892

33 ESR1 2099 0.007524594

34 PTPRC 5788 0.007258669

35 EGFR 1956 -0.007176203

36 NAT1 9 0.006165534

37 LATS2 26524 -0.005951091

38 CYP2B6 1555 0.005838391

39 PPP1R1A 5502 -0.003898835

40 TERF1P1 348567 0.002706847

Table 2B

Gene IDs and EntrezGene IDs for 44-gene DDRD classifier model with associated ranking and weightings

DDRD Classifier - 44 Gene Model (NA: genomic sequence)

Rank Gene symbol EntrezGene ID Weight

1 CXCL10 3627 0.023

2 MX1 4599 0.0226

3 IDO1 3620 0.0221

4 IFI44L 10964 0.0191

5 CD2 914 0.019

6 GBP5 115362 0.0181

7 PRAME 23532 0.0177

8 ITGAL 3683 0.0176

9 LRP4 4038 -0.0159

10 APOL3 80833 0.0151

11 CDR1 1038 -0.0149

12 FYB 2533 -0.0149

13 TSPAN7 7102 0.0148

14 RAC2 5880 -0.0148

15 KLHDC7B 113730 0.014

16 GRB14 2888 0.0137

17 AC138128.1 N/A -0.0136

18 KIF26A 26153 -0.0136

19 CD274 29126 0.0133

20 CD109 135228 -0.0129

21 ETV7 51513 0.0124

22 MFAP5 8076 -0.0121

23 OLFM4 10562 -0.0117

24 PI15 51050 -0.0115

25 FOSB 2354 -0.0111

26 FAM19A5 25817 0.0101

27 NLRC5 84166 -0.011

28 PRICKLE1 144165 -0.0089

29 EGR1 1958 -0.0086

30 CLDN10 9071 -0.0086

31 ADAMTS4 9507 -0.0085

32 SP140L 93349 0.0084

33 ANXA1 301 -0.0082

34 RSAD2 91543 0.0081

35 ESR1 2099 0.0079

36 IKZF3 22806 0.0073

37 OR2I1P 442197 0.007

38 EGFR 1956 -0.0066

39 NAT1 9 0.0065

40 LATS2 26524 -0.0063

41 CYP2B6 1555 0.0061

42 PTPRC 5788 0.0051

43 PPP1R1A 5502 -0.0041

44 AL137218.1 N/A -0.0017

In different embodiments, subsets of the biomarkers listed in Tables 1A, 1B and/or 1C, Table 2A and/or Table 2B may be used in the methods described herein. These subsets include but are not limited to biomarkers ranked 1-2, 1-3, 1-4, 1-5, 1-10, 1-20, 1-30, 1-40, 1-44, 6-10, 11-15, 16-20, 21-25, 26-30, 31-35, 36-40, 36-44, 11-20, 21-30, 31-40, and 31-44 in Table 2A or Table 2B. In one aspect, therapeutic responsiveness is predicted in an individual by conducting an assay on a test (biological) sample from the individual and detecting biomarker values that each correspond to at least one of the biomarkers from Table 2B and at least N additional biomarkers selected from the list of biomarkers in Table 2B, wherein N equals 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42 or 43. In one aspect, the methods of the invention are performed by conducting an assay on a test (biological) sample from the individual and detecting biomarker values that each correspond to at least one of the biomarkers GBP5, CXCL10, IDO1 and MX1 and at least N additional biomarkers selected from the list of biomarkers in Table 2B, wherein N equals 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40. As used herein, the term "biomarker" can refer to a gene, an mRNA, cDNA, an antisense transcript, a miRNA, a polypeptide, a protein, a protein fragment, or any other nucleic acid sequence or polypeptide sequence that indicates either gene expression levels or protein production levels.
In some embodiments, when referring to a biomarker of CXCL10, IDO1, CD2, GBP5, PRAME, ITGAL, LRP4, APOL3, CDR1, FYB, TSPAN7, RAC2, KLHDC7B, GRB14, AC138128.1, KIF26A, CD274, ETV7, MFAP5, OLFM4, PI15, FOSB, FAM19A5, NLRC5, PRICKLE1, EGR1, CLDN10, ADAMTS4, SP140L, ANXA1, RSAD2, ESR1, IKZF3, OR2I1P, EGFR, NAT1, LATS2, CYP2B6, PTPRC, PPP1R1A, or AL137218.1, the biomarker comprises an mRNA of CXCL10, IDO1, CD2, GBP5, PRAME, ITGAL, LRP4, APOL3, CDR1, FYB, TSPAN7, RAC2, KLHDC7B, GRB14, AC138128.1, KIF26A, CD274, ETV7, MFAP5, OLFM4, PI15, FOSB, FAM19A5, NLRC5, PRICKLE1, EGR1, CLDN10, ADAMTS4, SP140L, ANXA1, RSAD2, ESR1, IKZF3, OR2I1P, EGFR, NAT1, LATS2, CYP2B6, PTPRC, PPP1R1A, or AL137218.1, respectively. In further or other embodiments, when referring to a biomarker of MX1, GBP5, IFI44L, BIRC3, IGJ, IQGAP3, LOC100294459, SIX1, SLC9A3R1, STAT1, TOB1, UBD, C1QC, C2orf14, EPSTI, GALNT6, HIST1H4H, HIST2H4B, KIAA1244, LOC100287927, LOC100291682, or LOC100293679, the biomarker comprises an antisense transcript of MX1, GBP5, IFI44L, BIRC3, IGJ, IQGAP3, LOC100294459, SIX1, SLC9A3R1, STAT1, TOB1, UBD, C1QC, C2orf14, EPSTI, GALNT6, HIST1H4H, HIST2H4B, KIAA1244, LOC100287927, LOC100291682, or LOC100293679, respectively.
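
The rank-range subsets listed above (for example, biomarkers ranked 1-5 or 6-10) amount to slicing a ranked list. The following Python sketch shows one way this could be done. The helper name `select_by_rank` is hypothetical, and only the ten highest-ranked gene symbols of Table 2B are included for brevity.

```python
# Gene symbols ranked 1-10 in Table 2B, in rank order (illustrative excerpt only).
TABLE_2B_TOP10 = [
    "CXCL10", "MX1", "IDO1", "IFI44L", "CD2",
    "GBP5", "PRAME", "ITGAL", "LRP4", "APOL3",
]


def select_by_rank(ranked_genes, first, last):
    """Return the genes whose 1-based rank lies in [first, last], inclusive."""
    if not 1 <= first <= last <= len(ranked_genes):
        raise ValueError("rank range must lie within 1..%d" % len(ranked_genes))
    return ranked_genes[first - 1:last]


if __name__ == "__main__":
    print(select_by_rank(TABLE_2B_TOP10, 1, 5))
    print(select_by_rank(TABLE_2B_TOP10, 6, 10))
```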

The methods of the invention may be performed in an individual by conducting an assay on a test (biological) sample from the individual and detecting biomarker values that each correspond to the biomarkers GBP5, CXCL10, IDO1 and MX1 and at least N additional biomarkers selected from the list of biomarkers in Table 2B, wherein N equals 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40. The methods of the invention may be performed in an individual by conducting an assay on a test (biological) sample from the individual and detecting biomarker values that each correspond to the biomarker GBP5 and at least N additional biomarkers selected from the list of biomarkers in Table 2B, wherein N equals 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42 or 43. The methods of the invention may be performed in an individual by conducting an assay on a test (biological) sample from the individual and detecting biomarker values that each correspond to the biomarker CXCL10 and at least N additional biomarkers selected from the list of biomarkers in Table 2B, wherein N equals 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42 or 43. The methods of the invention may be performed in an individual by conducting an assay on a test (biological) sample from the individual and detecting biomarker values that each correspond to the biomarker IDO1 and at least N additional biomarkers selected from the list of biomarkers in Table 2B, wherein N equals 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42 or 43.
The methods of the invention may be performed in an individual by conducting an assay on a test (biological) sample from the individual and detecting biomarker values that each correspond to the biomarker MX1 and at least N additional biomarkers selected from the list of biomarkers in Table 2B, wherein N equals 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42 or 43. The methods of the invention may be performed in an individual by conducting an assay on a test (biological) sample from the individual and detecting biomarker values that each correspond to the biomarker CD274 and at least N additional biomarkers selected from the list of biomarkers in Table 2B, wherein N equals 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42 or 43.

The methods of the invention may be performed in an individual by conducting an assay on a test (biological) sample from the individual and detecting biomarker values that each correspond to at least two of the biomarkers CXCL10, MX1, IDO1 and IFI44L. Such assays may be conducted with at least N additional biomarkers selected from the list of additional biomarkers in Table 2B, wherein N equals 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40. The methods of the invention may be performed in an individual by conducting an assay on a test (biological) sample from the individual and detecting biomarker values that each correspond to the biomarkers CXCL10, MX1, IDO1 and IFI44L and at least N additional biomarkers selected from the list of biomarkers in Table 2B, wherein N equals 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40. The methods of the invention may be performed in an individual by conducting an assay on a test (biological) sample from the individual and detecting biomarker values that each correspond to the biomarker IFI44L and at least N additional biomarkers selected from the list of biomarkers in Table 2B, wherein N equals 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42 or 43.

It should be noted that the complement of each sequence described herein may be employed as appropriate (e.g. for designing hybridizing probes and/or primers, including primer pairs). Additional gene signatures representing selections of the 44 gene signature are described herein and are applicable to all aspects of the invention. The additional gene signatures are set forth in Tables 3-45, together with suitable weight and bias scores that may be adopted when calculating the final signature score (as further described herein). The k value for each signature can be set once the threshold for defining a positive signature score has been determined, as would be readily appreciated by the skilled person. Similarly, the rankings for each gene in the signature can readily be determined by reviewing the weightings attributed to each gene (where a larger weight indicates a higher ranking in the signature - see Tables 2A and 2B for the rank order in respect of the 40 and 44 gene signatures, respectively).

Whilst Tables 3-45 provide an exemplary weight and bias for each gene in each signature, it will be appreciated that the gene signatures provided by these tables are not limited to the particular weights and biases given. Weight values may indicate the directionality of expression that is measured to indicate a positive signature score according to the invention. Thus, a positive weight indicates that an increase in gene expression contributes to a positive signature score/identification of prostate cancer with a deficiency in DNA damage repair and/or displaying immune activation (to abnormal DNA) and vice versa.
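As a non-limiting sketch, the way in which a weight, a bias and a k value might be combined into a signature score can be illustrated with the four gene signature of Table 6. The formula for the score is not reproduced in this passage, so the linear form used below (the sum over genes of weight × (expression − bias), plus k) is an assumption, as are the function names, the example expression values and the threshold of zero.

```python
# Four gene signature of Table 6: gene name -> (weight, bias).
FOUR_GENE_SIGNATURE = {
    "CXCL10": (0.048331, 2.03931),
    "IDO1": (0.046238, 0.725702),
    "IFI44L": (0.0401, 1.17581),
    "MX1": (0.047475, 3.43549),
}


def signature_score(expression, signature, k=0.0):
    """Assumed linear score: sum of weight * (expression - bias), plus k."""
    missing = set(signature) - set(expression)
    if missing:
        raise KeyError(f"no expression value for: {sorted(missing)}")
    return sum(
        weight * (expression[gene] - bias)
        for gene, (weight, bias) in signature.items()
    ) + k


def is_signature_positive(score, threshold=0.0):
    """Assumed decision rule: a score above the threshold is positive."""
    return score > threshold


# Hypothetical expression values for one sample (illustration only).
sample = {"CXCL10": 3.1, "IDO1": 1.4, "IFI44L": 2.0, "MX1": 4.2}
score = signature_score(sample, FOUR_GENE_SIGNATURE)
print(score, is_signature_positive(score))
```

In this form a gene with a positive weight raises the score when its expression exceeds its bias, and a gene with a negative weight lowers it, which is consistent with the directionality described above.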

Table 3 - One gene signature

Table 6 - Four gene signature

Gene Weight Bias
CXCL10 0.048331 2.03931
IDO1 0.046238 0.725702
IFI44L 0.0401 1.17581
MX1 0.047475 3.43549

Table 7 - Five gene signature

Gene Weight Bias
CD2 0.034275 4.09036
CXCL10 0.041595 2.03931
IDO1 0.039792 0.725702
IFI44L 0.034511 1.17581
MX1 0.040858 3.43549

Table 8 - Six gene signature

Gene Weight Bias
CD2 0.030041 4.09036
CXCL10 0.036456 2.03931
GBP5 0.028552 1.39771
IDO1 0.034877 0.725702
IFI44L 0.030247 1.17581
MX1 0.03581 3.43549

Table 9 - Seven gene signature

Gene Weight Bias
CD2 0.025059 4.09036
CXCL10 0.03041 2.03931
GBP5 0.023817 1.39771
IDO1 0.029093 0.725702
IFI44L 0.025231 1.17581
MX1 0.029872 3.43549
PRAME 0.023355 2.2499

Table 10 - Eight gene signature

Gene Weight Bias
CD2 0.02446 4.09036
CXCL10 0.029683 2.03931
GBP5 0.023247 1.39771
IDO1 0.028397 0.725702
IFI44L 0.024628 1.17581
ITGAL 0.022705 3.21615
MX1 0.029157 3.43549
PRAME 0.022796 2.2499

Table 11 - Nine gene signature

Gene Weight Bias
CD2 0.023997 4.09036
CXCL10 0.029122 2.03931
GBP5 0.022807 1.39771
IDO1 0.02786 0.725702
IFI44L 0.024162 1.17581
ITGAL 0.022275 3.21615
LRP4 -0.02008 0.306454
MX1 0.028606 3.43549
PRAME 0.022365 2.2499

Table 12 - Ten gene signature

Gene Weight Bias
APOL3 0.017969 2.20356
CD2 0.02255 4.09036
CXCL10 0.027366 2.03931
GBP5 0.021432 1.39771
IDO1 0.02618 0.725702
IFI44L 0.022705 1.17581
ITGAL 0.020932 3.21615
LRP4 -0.01887 0.306454
MX1 0.026881 3.43549
PRAME 0.021017 2.2499

Table 13 - Eleven gene signature

Table 14 - Twelve gene signature

Table 15 - Thirteen gene signature

Gene Weight Bias
APOL3 0.017102 2.20356
CD2 0.021463 4.09036
CDR1 -0.01687 4.79794
CXCL10 0.026046 2.03931
FYB 0.016819 1.56179
GBP5 0.020399 1.39771
IDO1 0.024918 0.725702
IFI44L 0.02161 1.17581
ITGAL 0.019923 3.21615
LRP4 -0.01796 0.306454
MX1 0.025585 3.43549
PRAME 0.020003 2.2499
TSPAN7 -0.01675 1.65843

Table 16 - Fourteen gene signature

Gene Weight Bias
APOL3 0.016213 2.20356
CD2 0.020347 4.09036
CDR1 -0.01599 4.79794
CXCL10 0.024692 2.03931
FYB 0.015945 1.56179
GBP5 0.019338 1.39771
IDO1 0.023622 0.725702
IFI44L 0.020487 1.17581
ITGAL 0.018887 3.21615
LRP4 -0.01703 0.306454
MX1 0.024255 3.43549
PRAME 0.018963 2.2499
RAC2 0.01586 3.03644
TSPAN7 -0.01588 1.65843

Table 17 - Fifteen gene signature

Gene Weight Bias
APOL3 0.015496 2.20356
CD2 0.019447 4.09036
CDR1 -0.01528 4.79794
CXCL10 0.023599 2.03931
FYB 0.015239 1.56179
GBP5 0.018482 1.39771
IDO1 0.022577 0.725702
IFI44L 0.01958 1.17581
ITGAL 0.018051 3.21615
KLHDC7B 0.014303 1.43954
LRP4 -0.01627 0.306454
MX1 0.023181 3.43549
PRAME 0.018124 2.2499
RAC2 0.015158 3.03644
TSPAN7 -0.01518 1.65843

Table 18 - Sixteen gene signature

Gene Weight Bias
APOL3 0.016001 2.20356
CD2 0.020081 4.09036
CDR1 -0.01578 4.79794
CXCL10 0.024369 2.03931
FYB 0.015736 1.56179
GBP5 0.019085 1.39771
GRB14 0.014473 0.269629
IDO1 0.023313 0.725702
IFI44L 0.020219 1.17581
ITGAL 0.01864 3.21615
KLHDC7B 0.014769 1.43954
LRP4 -0.0168 0.306454
MX1 0.023937 3.43549
PRAME 0.018715 2.2499
RAC2 0.015653 3.03644
TSPAN7 -0.01567 1.65843

Table 19 - Seventeen gene signature

Gene Weight Bias
AC138128.1 -0.01406 1.4071
APOL3 0.015604 2.20356
CD2 0.019583 4.09036
CDR1 -0.01539 4.79794
CXCL10 0.023765 2.03931
FYB 0.015346 1.56179
GBP5 0.018612 1.39771
GRB14 0.014114 0.269629
IDO1 0.022735 0.725702
IFI44L 0.019718 1.17581
ITGAL 0.018178 3.21615
KLHDC7B 0.014403 1.43954
LRP4 -0.01639 0.306454
MX1 0.023344 3.43549
PRAME 0.018251 2.2499
RAC2 0.015265 3.03644
TSPAN7 -0.01528 1.65843

Table 20 - Eighteen gene signature

Gene Weight Bias
AC138128.1 -0.01401 1.4071
APOL3 0.015556 2.20356
CD2 0.019522 4.09036
CDR1 -0.01534 4.79794
CXCL10 0.023691 2.03931
FYB 0.015298 1.56179
GBP5 0.018554 1.39771
GRB14 0.01407 0.269629
IDO1 0.022665 0.725702
IFI44L 0.019656 1.17581
ITGAL 0.018121 3.21615
KIF26A -0.01397 2.05036
KLHDC7B 0.014359 1.43954
LRP4 -0.01634 0.306454
MX1 0.023271 3.43549
PRAME 0.018194 2.2499
RAC2 0.015217 3.03644
TSPAN7 -0.01524 1.65843

Table 21 - Nineteen gene signature

Gene Weight Bias
AC138128.1 -0.01338 1.4071
APOL3 0.014853 2.20356
CD2 0.01864 4.09036
CD274 0.013043 1.37297
CDR1 -0.01465 4.79794
CXCL10 0.02262 2.03931
FYB 0.014607 1.56179
GBP5 0.017716 1.39771
GRB14 0.013434 0.269629
IDO1 0.02164 0.725702
IFI44L 0.018768 1.17581
ITGAL 0.017302 3.21615
KIF26A -0.01334 2.05036
KLHDC7B 0.01371 1.43954
LRP4 -0.0156 0.306454
MX1 0.022219 3.43549
PRAME 0.017372 2.2499
RAC2 0.014529 3.03644
TSPAN7 -0.01455 1.65843

Table 22 - Twenty gene signature

Gene Weight Bias
AC138128.1 -0.0137 1.4071
APOL3 0.015205 2.20356
CD109 -0.01292 0.947671
CD2 0.019081 4.09036
CD274 0.013352 1.37297
CDR1 -0.015 4.79794
CXCL10 0.023156 2.03931
FYB 0.014953 1.56179
GBP5 0.018135 1.39771
GRB14 0.013752 0.269629
IDO1 0.022153 0.725702
IFI44L 0.019212 1.17581
ITGAL 0.017712 3.21615
KIF26A -0.01366 2.05036
KLHDC7B 0.014034 1.43954
LRP4 -0.01597 0.306454
MX1 0.022746 3.43549
PRAME 0.017783 2.2499
RAC2 0.014874 3.03644
TSPAN7 -0.01489 1.65843

Table 23 - Twenty one gene signature

Table 24 - Twenty two gene signature

Gene Weight Bias
AC138128.1 -0.01326 1.4071
APOL3 0.014714 2.20356
CD109 -0.0125 0.947671
CD2 0.018466 4.09036
CD274 0.012921 1.37297
CDR1 -0.01451 4.79794
CXCL10 0.022409 2.03931
ETV7 0.012038 1.46783
FYB 0.014471 1.56179
GBP5 0.01755 1.39771
GRB14 0.013309 0.269629
IDO1 0.021438 0.725702
IFI44L 0.018593 1.17581
ITGAL 0.017141 3.21615
KIF26A -0.01322 2.05036
KLHDC7B 0.013582 1.43954
LRP4 -0.01545 0.306454
MFAP5 -0.01172 2.69918
MX1 0.022012 3.43549
PRAME 0.01721 2.2499
RAC2 0.014394 3.03644
TSPAN7 -0.01441 1.65843

Table 25 - Twenty three gene signature

Gene Weight Bias
AC138128.1 -0.01361 1.4071
APOL3 0.015108 2.20356
CD109 -0.01284 0.947671
CD2 0.018961 4.09036
CD274 0.013268 1.37297
CDR1 -0.0149 4.79794
CXCL10 0.02301 2.03931
ETV7 0.012361 1.46783
FYB 0.014858 1.56179
GBP5 0.018021 1.39771
GRB14 0.013666 0.269629
IDO1 0.022013 0.725702
IFI44L 0.019091 1.17581
ITGAL 0.0176 3.21615
KIF26A -0.01357 2.05036
KLHDC7B 0.013946 1.43954
LRP4 -0.01587 0.306454
MFAP5 -0.01204 2.69918
MX1 0.022602 3.43549
OLFM4 -0.01167 0.636684
PRAME 0.017671 2.2499
RAC2 0.01478 3.03644
TSPAN7 -0.0148 1.65843

Table 26 - Twenty four gene signature

Gene Weight Bias
AC138128.1 -0.01365 1.4071
APOL3 0.015148 2.20356
CD109 -0.01287 0.947671
CD2 0.01901 4.09036
CD274 0.013302 1.37297
CDR1 -0.01494 4.79794
CXCL10 0.023069 2.03931
ETV7 0.012393 1.46783
FYB 0.014897 1.56179
GBP5 0.018068 1.39771
GRB14 0.013701 0.269629
IDO1 0.02207 0.725702
IFI44L 0.019141 1.17581
ITGAL 0.017646 3.21615
KIF26A -0.01361 2.05036
KLHDC7B 0.013982 1.43954
LRP4 -0.01591 0.306454
MFAP5 -0.01207 2.69918
MX1 0.022661 3.43549
OLFM4 -0.0117 0.636684
PI15 -0.01146 0.335476
PRAME 0.017717 2.2499
RAC2 0.014818 3.03644
TSPAN7 -0.01484 1.65843

Table 27 - Twenty five gene signature

Gene Weight Bias
AC138128.1 -0.01342 1.4071
APOL3 0.014899 2.20356
CD109 -0.01266 0.947671
CD2 0.018698 4.09036
CD274 0.013084 1.37297
CDR1 -0.0147 4.79794
CXCL10 0.022691 2.03931
ETV7 0.01219 1.46783
FOSB -0.01093 1.85886
FYB 0.014653 1.56179
GBP5 0.017771 1.39771
GRB14 0.013476 0.269629
IDO1 0.021708 0.725702
IFI44L 0.018827 1.17581
ITGAL 0.017357 3.21615
KIF26A -0.01338 2.05036
KLHDC7B 0.013753 1.43954
LRP4 -0.01565 0.306454
MFAP5 -0.01187 2.69918
MX1 0.022289 3.43549
OLFM4 -0.01151 0.636684
PI15 -0.01128 0.335476
PRAME 0.017426 2.2499
RAC2 0.014575 3.03644
TSPAN7 -0.01459 1.65843

Table 28 - Twenty six gene signature

Gene Weight Bias
AC138128.1 -0.01339 1.4071
APOL3 0.014858 2.20356
CD109 -0.01262 0.947671
CD2 0.018647 4.09036
CD274 0.013048 1.37297
CDR1 -0.01465 4.79794
CXCL10 0.022629 2.03931
ETV7 0.012157 1.46783
FAM19A5 -0.01083 0.413683
FOSB -0.0109 1.85886
FYB 0.014613 1.56179
GBP5 0.017723 1.39771
GRB14 0.013439 0.269629
IDO1 0.021649 0.725702
IFI44L 0.018775 1.17581
ITGAL 0.017309 3.21615
KIF26A -0.01335 2.05036
KLHDC7B 0.013715 1.43954
LRP4 -0.0156 0.306454
MFAP5 -0.01184 2.69918
MX1 0.022228 3.43549
OLFM4 -0.01148 0.636684
PI15 -0.01125 0.335476
PRAME 0.017379 2.2499
RAC2 0.014535 3.03644
TSPAN7 -0.01455 1.65843

Table 29 - Twenty seven gene signature

Gene Weight Bias
AC138128.1 -0.01316 1.4071
APOL3 0.014603 2.20356
CD109 -0.01241 0.947671
CD2 0.018326 4.09036
CD274 0.012824 1.37297
CDR1 -0.0144 4.79794
CXCL10 0.022239 2.03931
ETV7 0.011947 1.46783
FAM19A5 -0.01064 0.413683
FOSB -0.01071 1.85886
FYB 0.014361 1.56179
GBP5 0.017417 1.39771
GRB14 0.013208 0.269629
IDO1 0.021276 0.725702
IFI44L 0.018452 1.17581
ITGAL 0.017011 3.21615
KIF26A -0.01312 2.05036
KLHDC7B 0.013479 1.43954
LRP4 -0.01534 0.306454
MFAP5 -0.01164 2.69918
MX1 0.021845 3.43549
NLRC5 0.009724 2.26863
OLFM4 -0.01128 0.636684
PI15 -0.01105 0.335476
PRAME 0.017079 2.2499
RAC2 0.014285 3.03644
TSPAN7 -0.0143 1.65843

Table 30 - Twenty eight gene signature

Gene Weight Bias
AC138128.1 -0.01326 1.4071
APOL3 0.014712 2.20356
CD109 -0.0125 0.947671
CD2 0.018464 4.09036
CD274 0.01292 1.37297
CDR1 -0.01451 4.79794
CXCL10 0.022407 2.03931
ETV7 0.012037 1.46783
FAM19A5 -0.01072 0.413683
FOSB -0.01079 1.85886
FYB 0.014469 1.56179
GBP5 0.017548 1.39771
GRB14 0.013307 0.269629
IDO1 0.021436 0.725702
IFI44L 0.018591 1.17581
ITGAL 0.017139 3.21615
KIF26A -0.01322 2.05036
KLHDC7B 0.01358 1.43954
LRP4 -0.01545 0.306454
MFAP5 -0.01172 2.69918
MX1 0.02201 3.43549
NLRC5 0.009797 2.26863
OLFM4 -0.01137 0.636684
PI15 -0.01114 0.335476
PRAME 0.017208 2.2499
PRICKLE1 -0.00864 1.77018
RAC2 0.014392 3.03644
TSPAN7 -0.01441 1.65843

Table 31 - Twenty nine gene signature

Gene Weight Bias
AC138128.1 -0.01307 1.4071
APOL3 0.014506 2.20356
CD109 -0.01232 0.947671
CD2 0.018204 4.09036
CD274 0.012739 1.37297
CDR1 -0.01431 4.79794
CXCL10 0.022092 2.03931
EGR1 -0.00827 2.18651
ETV7 0.011868 1.46783
FAM19A5 -0.01057 0.413683
FOSB -0.01064 1.85886
FYB 0.014266 1.56179
GBP5 0.017302 1.39771
GRB14 0.01312 0.269629
IDO1 0.021135 0.725702
IFI44L 0.01833 1.17581
ITGAL 0.016898 3.21615
KIF26A -0.01303 2.05036
KLHDC7B 0.013389 1.43954
LRP4 -0.01523 0.306454
MFAP5 -0.01156 2.69918
MX1 0.021701 3.43549
NLRC5 0.009659 2.26863
OLFM4 -0.01121 0.636684
PI15 -0.01098 0.335476
PRAME 0.016966 2.2499
PRICKLE1 -0.00852 1.77018
RAC2 0.01419 3.03644
TSPAN7 -0.01421 1.65843

Table 32 - Thirty gene signature

Gene Weight Bias
AC138128.1 -0.01326 1.4071
APOL3 0.014722 2.20356
CD109 -0.01251 0.947671
CD2 0.018476 4.09036
CD274 0.012928 1.37297
CDR1 -0.01452 4.79794
CLDN10 -0.00834 -0.34464
CXCL10 0.022421 2.03931
EGR1 -0.00839 2.18651
ETV7 0.012045 1.46783
FAM19A5 -0.01073 0.413683
FOSB -0.0108 1.85886
FYB 0.014478 1.56179
GBP5 0.01756 1.39771
GRB14 0.013316 0.269629
IDO1 0.02145 0.725702
IFI44L 0.018603 1.17581
ITGAL 0.01715 3.21615
KIF26A -0.01323 2.05036
KLHDC7B 0.013589 1.43954
LRP4 -0.01546 0.306454
MFAP5 -0.01173 2.69918
MX1 0.022024 3.43549
NLRC5 0.009803 2.26863
OLFM4 -0.01137 0.636684
PI15 -0.01114 0.335476
PRAME 0.017219 2.2499
PRICKLE1 -0.00864 1.77018
RAC2 0.014402 3.03644
TSPAN7 -0.01442 1.65843

Table 33 - Thirty one gene signature

Gene Weight Bias
AC138128.1 -0.01339 1.4071
ADAMTS4 -0.00837 1.95693
APOL3 0.014864 2.20356
CD109 -0.01263 0.947671
CD2 0.018654 4.09036
CD274 0.013053 1.37297
CDR1 -0.01466 4.79794
CLDN10 -0.00842 -0.34464
CXCL10 0.022638 2.03931
EGR1 -0.00847 2.18651
ETV7 0.012161 1.46783
FAM19A5 -0.01083 0.413683
FOSB -0.0109 1.85886
FYB 0.014618 1.56179
GBP5 0.017729 1.39771
GRB14 0.013444 0.269629
IDO1 0.021657 0.725702
IFI44L 0.018782 1.17581
ITGAL 0.017316 3.21615
KIF26A -0.01335 2.05036
KLHDC7B 0.01372 1.43954
LRP4 -0.01561 0.306454
MFAP5 -0.01184 2.69918
MX1 0.022236 3.43549
NLRC5 0.009898 2.26863
OLFM4 -0.01148 0.636684
PI15 -0.01125 0.335476
PRAME 0.017385 2.2499
PRICKLE1 -0.00873 1.77018
RAC2 0.014541 3.03644
TSPAN7 -0.01456 1.65843

Table 34 - Thirty two gene signature

Gene Weight Bias
AC138128.1 -0.01332 1.4071
ADAMTS4 -0.00832 1.95693
APOL3 0.014789 2.20356
CD109 -0.01256 0.947671
CD2 0.01856 4.09036
CD274 0.012987 1.37297
CDR1 -0.01459 4.79794
CLDN10 -0.00838 -0.34464
CXCL10 0.022523 2.03931
EGR1 -0.00843 2.18651
ETV7 0.0121 1.46783
FAM19A5 -0.01078 0.413683
FOSB -0.01085 1.85886
FYB 0.014544 1.56179
GBP5 0.01764 1.39771
GRB14 0.013377 0.269629
IDO1 0.021548 0.725702
IFI44L 0.018688 1.17581
ITGAL 0.017228 3.21615
KIF26A -0.01329 2.05036
KLHDC7B 0.013651 1.43954
LRP4 -0.01553 0.306454
MFAP5 -0.01178 2.69918
MX1 0.022124 3.43549
NLRC5 0.009848 2.26863
OLFM4 -0.01143 0.636684
PI15 -0.01119 0.335476
PRAME 0.017298 2.2499
PRICKLE1 -0.00868 1.77018
RAC2 0.014467 3.03644
SP140L 0.00825 0.550538
TSPAN7 -0.01449 1.65843

Table 35 - Thirty three gene signature

Gene Weight Bias
AC138128.1 -0.01348 1.4071
ADAMTS4 -0.00842 1.95693
ANXA1 -0.0081 2.00146
APOL3 0.014961 2.20356
CD109 -0.01271 0.947671
CD2 0.018776 4.09036
CD274 0.013138 1.37297
CDR1 -0.01476 4.79794
CLDN10 -0.00848 -0.34464
CXCL10 0.022785 2.03931
EGR1 -0.00853 2.18651
ETV7 0.01224 1.46783
FAM19A5 -0.0109 0.413683
FOSB -0.01097 1.85886
FYB 0.014713 1.56179
GBP5 0.017845 1.39771
GRB14 0.013532 0.269629
IDO1 0.021798 0.725702
IFI44L 0.018905 1.17581
ITGAL 0.017428 3.21615
KIF26A -0.01344 2.05036
KLHDC7B 0.01381 1.43954
LRP4 -0.01571 0.306454
MFAP5 -0.01192 2.69918
MX1 0.022381 3.43549
NLRC5 0.009962 2.26863
OLFM4 -0.01156 0.636684
PI15 -0.01132 0.335476
PRAME 0.017498 2.2499
PRICKLE1 -0.00878 1.77018
RAC2 0.014635 3.03644
SP140L 0.008345 0.550538
TSPAN7 -0.01465 1.65843

Table 36 - Thirty four gene signature

Gene Weight Bias
AC138128.1 -0.01334 1.4071
ADAMTS4 -0.00834 1.95693
ANXA1 -0.00802 2.00146
APOL3 0.014812 2.20356
CD109 -0.01258 0.947671
CD2 0.018589 4.09036
CD274 0.013007 1.37297
CDR1 -0.01461 4.79794
CLDN10 -0.00839 -0.34464
CXCL10 0.022558 2.03931
EGR1 -0.00844 2.18651
ETV7 0.012118 1.46783
FAM19A5 -0.0108 0.413683
FOSB -0.01086 1.85886
FYB 0.014567 1.56179
GBP5 0.017667 1.39771
GRB14 0.013397 0.269629
IDO1 0.021581 0.725702
IFI44L 0.018716 1.17581
ITGAL 0.017255 3.21615
KIF26A -0.01331 2.05036
KLHDC7B 0.013672 1.43954
LRP4 -0.01556 0.306454
MFAP5 -0.0118 2.69918
MX1 0.022159 3.43549
NLRC5 0.009863 2.26863
OLFM4 -0.01144 0.636684
PI15 -0.01121 0.335476
PRAME 0.017324 2.2499
PRICKLE1 -0.0087 1.77018
RAC2 0.01449 3.03644
RSAD2 0.007894 1.44894
SP140L 0.008262 0.550538
TSPAN7 -0.01451 1.65843

Table 37 - Thirty five gene signature

Gene Weight Bias
AC138128.1 -0.0137 1.4071
ADAMTS4 -0.00856 1.95693
ANXA1 -0.00823 2.00146
APOL3 0.015208 2.20356
CD109 -0.01292 0.947671
CD2 0.019085 4.09036
CD274 0.013355 1.37297
CDR1 -0.015 4.79794
CLDN10 -0.00862 -0.34464
CXCL10 0.023161 2.03931
EGR1 -0.00867 2.18651
ESR1 0.007943 0.851213
ETV7 0.012442 1.46783
FAM19A5 -0.01108 0.413683
FOSB -0.01115 1.85886
FYB 0.014956 1.56179
GBP5 0.018139 1.39771
GRB14 0.013755 0.269629
IDO1 0.022157 0.725702
IFI44L 0.019216 1.17581
ITGAL 0.017716 3.21615
KIF26A -0.01366 2.05036
KLHDC7B 0.014037 1.43954
LRP4 -0.01597 0.306454
MFAP5 -0.01212 2.69918
MX1 0.022751 3.43549
NLRC5 0.010127 2.26863
OLFM4 -0.01175 0.636684
PI15 -0.01151 0.335476
PRAME 0.017787 2.2499
PRICKLE1 -0.00893 1.77018
RAC2 0.014877 3.03644
RSAD2 0.008105 1.44894
SP140L 0.008483 0.550538
TSPAN7 -0.0149 1.65843

Table 38 - Thirty six gene signature

Gene Weight Bias
AC138128.1 -0.01359 1.4071
ADAMTS4 -0.00849 1.95693
ANXA1 -0.00816 2.00146
APOL3 0.015081 2.20356
CD109 -0.01281 0.947671
CD2 0.018926 4.09036
CD274 0.013244 1.37297
CDR1 -0.01487 4.79794
CLDN10 -0.00855 -0.34464
CXCL10 0.022968 2.03931
EGR1 -0.0086 2.18651
ESR1 0.007876 0.851213
ETV7 0.012338 1.46783
FAM19A5 -0.01099 0.413683
FOSB -0.01106 1.85886
FYB 0.014831 1.56179
GBP5 0.017988 1.39771
GRB14 0.01364 0.269629
IDO1 0.021973 0.725702
IFI44L 0.019056 1.17581
IKZF3 0.007318 -0.58991
ITGAL 0.017568 3.21615
KIF26A -0.01355 2.05036
KLHDC7B 0.01392 1.43954
LRP4 -0.01584 0.306454
MFAP5 -0.01202 2.69918
MX1 0.022561 3.43549
NLRC5 0.010042 2.26863
OLFM4 -0.01165 0.636684
PI15 -0.01141 0.335476
PRAME 0.017639 2.2499
PRICKLE1 -0.00885 1.77018
RAC2 0.014753 3.03644
RSAD2 0.008038 1.44894
SP140L 0.008412 0.550538
TSPAN7 -0.01477 1.65843

Table 39 - Thirty seven gene signature

Gene Weight Bias
AC138128.1 -0.01342 1.4071
ADAMTS4 -0.00838 1.95693
ANXA1 -0.00806 2.00146
APOL3 0.014896 2.20356
CD109 -0.01265 0.947671
CD2 0.018694 4.09036
CD274 0.013081 1.37297
CDR1 -0.01469 4.79794
CLDN10 -0.00844 -0.34464
CXCL10 0.022686 2.03931
EGR1 -0.00849 2.18651
ESR1 0.00778 0.851213
ETV7 0.012187 1.46783
FAM19A5 -0.01086 0.413683
FOSB -0.01092 1.85886
FYB 0.014649 1.56179
GBP5 0.017767 1.39771
GRB14 0.013473 0.269629
IDO1 0.021703 0.725702
IFI44L 0.018823 1.17581
IKZF3 0.007228 -0.58991
ITGAL 0.017353 3.21615
KIF26A -0.01338 2.05036
KLHDC7B 0.01375 1.43954
LRP4 -0.01564 0.306454
MFAP5 -0.01187 2.69918
MX1 0.022284 3.43549
NLRC5 0.009919 2.26863
OLFM4 -0.01151 0.636684
OR2I1P 0.00685 -1.30235
PI15 -0.01127 0.335476
PRAME 0.017422 2.2499
PRICKLE1 -0.00875 1.77018
RAC2 0.014572 3.03644
RSAD2 0.007939 1.44894
SP140L 0.008309 0.550538
TSPAN7 -0.01459 1.65843

Table 40 - Thirty eight gene signature

Gene Weight Bias
AC138128.1 -0.01345 1.4071
ADAMTS4 -0.0084 1.95693
ANXA1 -0.00808 2.00146
APOL3 0.014924 2.20356
CD109 -0.01268 0.947671
CD2 0.01873 4.09036
CD274 0.013106 1.37297
CDR1 -0.01472 4.79794
CLDN10 -0.00846 -0.34464
CXCL10 0.022729 2.03931
EGFR -0.00649 -0.17669
EGR1 -0.00851 2.18651
ESR1 0.007795 0.851213
ETV7 0.01221 1.46783
FAM19A5 -0.01088 0.413683
FOSB -0.01095 1.85886
FYB 0.014677 1.56179
GBP5 0.017801 1.39771
GRB14 0.013499 0.269629
IDO1 0.021745 0.725702
IFI44L 0.018858 1.17581
IKZF3 0.007242 -0.58991
ITGAL 0.017386 3.21615
KIF26A -0.01341 2.05036
KLHDC7B 0.013776 1.43954
LRP4 -0.01567 0.306454
MFAP5 -0.01189 2.69918
MX1 0.022327 3.43549
NLRC5 0.009938 2.26863
OLFM4 -0.01153 0.636684
OR2I1P 0.006863 -1.30235
PI15 -0.0113 0.335476
PRAME 0.017456 2.2499
PRICKLE1 -0.00876 1.77018
RAC2 0.0146 3.03644
RSAD2 0.007954 1.44894
SP140L 0.008325 0.550538
TSPAN7 -0.01462 1.65843

Table 41 - Thirty nine gene signature

Gene Weight Bias
AC138128.1 -0.01356 1.4071
ADAMTS4 -0.00847 1.95693
ANXA1 -0.00815 2.00146
APOL3 0.015054 2.20356
CD109 -0.01279 0.947671
CD2 0.018892 4.09036
CD274 0.01322 1.37297
CDR1 -0.01485 4.79794
CLDN10 -0.00853 -0.34464
CXCL10 0.022926 2.03931
EGFR -0.00654 -0.17669
EGR1 -0.00858 2.18651
ESR1 0.007862 0.851213
ETV7 0.012316 1.46783
FAM19A5 -0.01097 0.413683
FOSB -0.01104 1.85886
FYB 0.014805 1.56179
GBP5 0.017955 1.39771
GRB14 0.013616 0.269629
IDO1 0.021933 0.725702
IFI44L 0.019022 1.17581
IKZF3 0.007305 -0.58991
ITGAL 0.017536 3.21615
KIF26A -0.01352 2.05036
KLHDC7B 0.013895 1.43954
LRP4 -0.01581 0.306454
MFAP5 -0.012 2.69918
MX1 0.02252 3.43549
NAT1 0.006442 -0.79732
NLRC5 0.010024 2.26863
OLFM4 -0.01163 0.636684
OR2I1P 0.006922 -1.30235
PI15 -0.01139 0.335476
PRAME 0.017607 2.2499
PRICKLE1 -0.00884 1.77018
RAC2 0.014726 3.03644
RSAD2 0.008023 1.44894
SP140L 0.008397 0.550538
TSPAN7 -0.01474 1.65843

Table 42 - Forty gene signature

Gene Weight Bias
AC138128.1 -0.01357 1.4071
ADAMTS4 -0.00848 1.95693
ANXA1 -0.00815 2.00146
APOL3 0.015057 2.20356
CD109 -0.01279 0.947671
CD2 0.018896 4.09036
CD274 0.013223 1.37297
CDR1 -0.01485 4.79794
CLDN10 -0.00853 -0.34464
CXCL10 0.022931 2.03931
EGFR -0.00654 -0.17669
EGR1 -0.00858 2.18651
ESR1 0.007864 0.851213
ETV7 0.012319 1.46783
FAM19A5 -0.01097 0.413683
FOSB -0.01104 1.85886
FYB 0.014808 1.56179
GBP5 0.017959 1.39771
GRB14 0.013619 0.269629
IDO1 0.021938 0.725702
IFI44L 0.019026 1.17581
IKZF3 0.007306 -0.58991
ITGAL 0.01754 3.21615
KIF26A -0.01353 2.05036
KLHDC7B 0.013898 1.43954
LATS2 -0.00622 0.486251
LRP4 -0.01581 0.306454
MFAP5 -0.012 2.69918
MX1 0.022525 3.43549
NAT1 0.006444 -0.79732
NLRC5 0.010026 2.26863
OLFM4 -0.01163 0.636684
OR2I1P 0.006924 -1.30235
PI15 -0.0114 0.335476
PRAME 0.017611 2.2499
PRICKLE1 -0.00884 1.77018
RAC2 0.014729 3.03644
RSAD2 0.008025 1.44894
SP140L 0.008399 0.550538
TSPAN7 -0.01475 1.65843

Table 43 - Forty one gene signature

Gene Weight Bias
AC138128.1 -0.01374 1.4071
ADAMTS4 -0.00859 1.95693
ANXA1 -0.00826 2.00146
APOL3 0.015253 2.20356
CD109 -0.01296 0.947671
CD2 0.019143 4.09036
CD274 0.013395 1.37297
CDR1 -0.01504 4.79794
CLDN10 -0.00864 -0.34464
CXCL10 0.02323 2.03931
CYP2B6 0.006181 0.921835
EGFR -0.00663 -0.17669
EGR1 -0.00869 2.18651
ESR1 0.007966 0.851213
ETV7 0.01248 1.46783
FAM19A5 -0.01112 0.413683
FOSB -0.01119 1.85886
FYB 0.015001 1.56179
GBP5 0.018194 1.39771
GRB14 0.013797 0.269629
IDO1 0.022224 0.725702
IFI44L 0.019274 1.17581
IKZF3 0.007402 -0.58991
ITGAL 0.017769 3.21615
KIF26A -0.0137 2.05036
KLHDC7B 0.014079 1.43954
LATS2 -0.0063 0.486251
LRP4 -0.01602 0.306454
MFAP5 -0.01215 2.69918
MX1 0.022819 3.43549
NAT1 0.006528 -0.79732
NLRC5 0.010157 2.26863
OLFM4 -0.01178 0.636684
OR2I1P 0.007014 -1.30235
PI15 -0.01154 0.335476
PRAME 0.01784 2.2499
PRICKLE1 -0.00896 1.77018
RAC2 0.014921 3.03644
RSAD2 0.00813 1.44894
SP140L 0.008509 0.550538
TSPAN7 -0.01494 1.65843

Table 44 - Forty two gene signature

Gene Weight Bias
AC138128.1 -0.01365 1.4071
ADAMTS4 -0.00853 1.95693
ANXA1 -0.0082 2.00146
APOL3 0.015146 2.20356
CD109 -0.01287 0.947671
CD2 0.019008 4.09036
CD274 0.013301 1.37297
CDR1 -0.01494 4.79794
CLDN10 -0.00858 -0.34464
CXCL10 0.023067 2.03931
CYP2B6 0.006138 0.921835
EGFR -0.00658 -0.17669
EGR1 -0.00863 2.18651
ESR1 0.00791 0.851213
ETV7 0.012392 1.46783
FAM19A5 -0.01104 0.413683
FOSB -0.01111 1.85886
FYB 0.014895 1.56179
GBP5 0.018065 1.39771
GRB14 0.013699 0.269629
IDO1 0.022067 0.725702
IFI44L 0.019138 1.17581
IKZF3 0.00735 -0.58991
ITGAL 0.017644 3.21615
KIF26A -0.01361 2.05036
KLHDC7B 0.01398 1.43954
LATS2 -0.00626 0.486251
LRP4 -0.01591 0.306454
MFAP5 -0.01207 2.69918
MX1 0.022658 3.43549
NAT1 0.006482 -0.79732
NLRC5 0.010085 2.26863
OLFM4 -0.0117 0.636684
OR2I1P 0.006965 -1.30235
PI15 -0.01146 0.335476
PRAME 0.017715 2.2499
PRICKLE1 -0.00889 1.77018
PTPRC 0.005152 -1.11824
RAC2 0.014816 3.03644
RSAD2 0.008072 1.44894
SP140L 0.008449 0.550538
TSPAN7 -0.01484 1.65843

Table 45 - Forty three gene signature

Gene Weight Bias
AC138128.1 -0.01364 1.4071
ADAMTS4 -0.00852 1.95693
ANXA1 -0.0082 2.00146
APOL3 0.015139 2.20356
CD109 -0.01286 0.947671
CD2 0.018999 4.09036
CD274 0.013295 1.37297
CDR1 -0.01493 4.79794
CLDN10 -0.00858 -0.34464
CXCL10 0.023056 2.03931
CYP2B6 0.006135 0.921835
EGFR -0.00658 -0.17669
EGR1 -0.00863 2.18651
ESR1 0.007907 0.851213
ETV7 0.012386 1.46783
FAM19A5 -0.01103 0.413683
FOSB -0.0111 1.85886
FYB 0.014889 1.56179
GBP5 0.018057 1.39771
GRB14 0.013693 0.269629
IDO1 0.022057 0.725702
IFI44L 0.01913 1.17581
IKZF3 0.007346 -0.58991
ITGAL 0.017636 3.21615
KIF26A -0.0136 2.05036
KLHDC7B 0.013974 1.43954
LATS2 -0.00625 0.486251
LRP4 -0.0159 0.306454
MFAP5 -0.01206 2.69918
MX1 0.022648 3.43549
NAT1 0.006479 -0.79732
NLRC5 0.010081 2.26863
OLFM4 -0.0117 0.636684
OR2I1P 0.006962 -1.30235
PI15 -0.01146 0.335476
PPP1R1A -0.0041 1.76371
PRAME 0.017707 2.2499
PRICKLE1 -0.00889 1.77018
PTPRC 0.00515 -1.11824
RAC2 0.01481 3.03644
RSAD2 0.008069 1.44894
SP140L 0.008445 0.550538
TSPAN7 -0.01483 1.65843

In some embodiments, the target sequences/probes listed in Table 1A may be used in the methods described herein. Thus, the methods may utilise probes having the sequences disclosed herein with reference to Table 1A. Table 1A lists the SEQ ID Nos for the individual probes used to measure expression levels of the genes identified in the table. Thus, the methods may employ one or more probes, each of which comprises, consists essentially of or consists of the sequence of any one of SEQ ID NOs 1-1750. Similarly, the methods may utilise probes and/or primers that hybridize specifically (which may be under conditions of high stringency) to the target sequences disclosed herein with reference to Table 1A. Table 1A lists the SEQ ID Nos for the individual target sequences used to measure expression levels of the genes identified in the table. Thus, the methods may employ one or more probes and/or primers, each of which specifically hybridizes with a target sequence comprising, consisting essentially of or consisting of any one of SEQ ID NOs 1751-3500. The known gene sequences may be utilised for the purposes of designing primers and/or probes which hybridize to the target sequences. Design of suitable primers and/or probes is within the capability of one skilled in the art once the target sequence is identified. Various primer design tools are freely available to assist in this process, such as the NCBI Primer-BLAST tool; see Ye et al, BMC Bioinformatics. 13:134 (2012). The primers and/or probes may be designed such that they hybridize to the target sequence under stringent conditions (as defined herein). Primers and/or probes may be at least 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25 (or more) nucleotides in length. It should be understood that each subset can include multiple primers and/or probes directed to the same biomarker. The tables show in some cases multiple target sequences within the same overall gene.
Such primers and/or probes may be included in kits useful for performing the methods of the invention. The kits may be sequencing, array, bDNA/bRNA or PCR based kits for example and may include additional reagents, such as a polymerase and/or dNTPs for example.
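As an illustrative sketch only, deriving the complement of a target sequence for probe or primer design and checking the candidate against a minimum length might look as follows. The example sequence is hypothetical and is not one of the SEQ ID NOs; the function names and the default minimum of 15 nucleotides (the lowest length recited above) are assumptions made for this example. The sketch does not assess specificity or stringency.

```python
# Translation table mapping each DNA base to its complementary base.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    invalid = set(sequence.upper()) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    return sequence.translate(_COMPLEMENT)[::-1]


def has_minimum_length(sequence, minimum=15):
    """Check that a candidate primer or probe has at least `minimum` bases."""
    return len(sequence) >= minimum


# Hypothetical 20-nucleotide target region (illustration only).
target = "ATGCGTACCTGAAGCTTGCA"
candidate_probe = reverse_complement(target)
print(candidate_probe, has_minimum_length(candidate_probe))
```

In practice a dedicated design tool such as Primer-BLAST would also evaluate melting temperature and off-target hybridization, which this sketch leaves out.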

Measuring Gene Expression Using Classifier Models

A variety of methods have been utilized in an attempt to identify biomarkers and diagnose disease. For protein-based markers, these include two-dimensional electrophoresis, mass spectrometry, and immunoassay methods. For nucleic acid markers, these include mRNA expression profiles, microRNA profiles, sequencing, FISH, serial analysis of gene expression (SAGE), methylation profiles, and large-scale gene expression arrays.

When a biomarker indicates or is a sign of an abnormal process, disease or other condition in an individual, that biomarker is generally described as being either over-expressed or under-expressed as compared to an expression level or value of the biomarker that indicates or is a sign of a normal process, an absence of a disease or other condition in an individual. "Up-regulation", "up-regulated", "over-expression", "over-expressed", and any variations thereof are used interchangeably to refer to a value or level of a biomarker in a biological sample that is greater than a value or level (or range of values or levels) of the biomarker that is typically detected in similar biological samples from healthy or normal individuals. The terms may also refer to a value or level of a biomarker in a biological sample that is greater than a value or level (or range of values or levels) of the biomarker that may be detected at a different stage of a particular disease.

"Down-regulation", "down-regulated", "under-expression", "under-expressed", and any variations thereof are used interchangeably to refer to a value or level of a biomarker in a biological sample that is less than a value or level (or range of values or levels) of the biomarker that is typically detected in similar biological samples from healthy or normal individuals. The terms may also refer to a value or level of a biomarker in a biological sample that is less than a value or level (or range of values or levels) of the biomarker that may be detected at a different stage of a particular disease.

Further, a biomarker that is either over-expressed or under-expressed can also be referred to as being "differentially expressed" or as having a "differential level" or "differential value" as compared to a "normal" expression level or value of the biomarker that indicates or is a sign of a normal process or an absence of a disease or other condition in an individual. Thus, "differential expression" of a biomarker can also be referred to as a variation from a "normal" expression level of the biomarker.
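The definitions above may be illustrated with a minimal sketch that labels a measured biomarker value relative to a range of values typical of normal samples. The function name, the labels and the example range are hypothetical and are given for illustration only; how a reference range is established is not addressed here.

```python
def classify_expression(value, normal_low, normal_high):
    """Label a biomarker value relative to a reference ("normal") range.

    Values above the range are treated as over-expressed, values below it
    as under-expressed; either outcome is "differential expression".
    """
    if normal_low > normal_high:
        raise ValueError("normal_low must not exceed normal_high")
    if value > normal_high:
        return "over-expressed"
    if value < normal_low:
        return "under-expressed"
    return "within reference range"


def is_differentially_expressed(value, normal_low, normal_high):
    """True when the value falls outside the reference range."""
    return classify_expression(value, normal_low, normal_high) != "within reference range"


# Hypothetical reference range and measurements (illustration only).
for measured in (1.2, 2.5, 4.8):
    print(measured, classify_expression(measured, normal_low=2.0, normal_high=3.0))
```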

The terms "differential biomarker expression" and "differential expression" are used interchangeably to refer to a biomarker whose expression is activated to a higher or lower level in a subject suffering from a specific disease, relative to its expression in a normal subject, or relative to its expression in a patient that responds differently to a particular therapy or has a different prognosis. The terms also include biomarkers whose expression is activated to a higher or lower level at different stages of the same disease. It is also understood that a differentially expressed biomarker may be either activated or inhibited at the nucleic acid level or protein level, or may be subject to alternative splicing to result in a different polypeptide product. Such differences may be evidenced by a variety of changes including mRNA levels, miRNA levels, antisense transcript levels, or protein surface expression, secretion or other partitioning of a polypeptide. Differential biomarker expression may include a comparison of expression between two or more genes or their gene products; or a comparison of the ratios of the expression between two or more genes or their gene products; or even a comparison of two differently processed products of the same gene, which differ between normal subjects and subjects suffering from a disease; or between various stages of the same disease. Differential expression includes both quantitative, as well as qualitative, differences in the temporal or cellular expression pattern in a biomarker among, for example, normal and diseased cells, or among cells which have undergone different disease events or disease stages.

In certain embodiments, the expression profile obtained is a genomic or nucleic acid expression profile, where the amount or level of one or more nucleic acids in the sample is determined. In these embodiments, the sample that is assayed to generate the expression profile (i.e. to measure the expression levels of the one or more biomarkers in the sample) employed in the diagnostic or prognostic methods comprises a nucleic acid sample. The nucleic acid sample includes a population of nucleic acids that includes the expression information of the phenotype determinative biomarkers of the cell or tissue being analysed. In some embodiments, the nucleic acid may include RNA or DNA nucleic acids, e.g., mRNA, cRNA, cDNA etc., so long as the sample retains the expression information of the host cell or tissue from which it is obtained. The sample may be prepared in a number of different ways, as is known in the art, e.g., by mRNA isolation from a cell, where the isolated mRNA is used as isolated, amplified, or employed to prepare cDNA, cRNA, etc., as is known in the field of differential gene expression. Accordingly, determining the level of mRNA in a sample includes preparing cDNA or cRNA from the mRNA and subsequently measuring the cDNA or cRNA. The sample is typically prepared from a cell or tissue harvested from a subject in need of treatment, e.g., via biopsy of tissue, using standard protocols, where cell types or tissues from which such nucleic acids may be generated include any tissue in which the expression pattern of the to be determined phenotype exists, including, but not limited to, disease cells or tissue, body fluids, etc. The expression profile, representing the measured expression levels of one or more biomarkers in the test sample may be generated from the initial nucleic acid sample using any convenient protocol. 
While a variety of different manners of generating expression profiles are known, such as those employed in the field of differential gene expression/biomarker analysis, one representative and convenient type of protocol for generating expression profiles is the array-based gene expression profile generation protocol. Such applications are hybridization assays in which a surface such as a (glass) chip, on which several probes for each of several thousand genes are immobilized, is employed. On these surfaces there are generally multiple target regions within each gene to be analysed, and multiple (usually from 11 to 100) probes per target region. In this way, expression of each gene is evaluated by hybridization to multiple (tens) of probes on the surface. In these assays, a sample of target nucleic acids is first prepared from the initial nucleic acid sample being assayed, where preparation may include labelling of the target nucleic acids with a label, e.g., a member of a signal producing system. Following target nucleic acid sample preparation, the sample is contacted with the array under hybridization conditions, whereby complexes are formed between target nucleic acids that are complementary to probe sequences attached to the array surface. The presence of hybridized complexes is then detected, either qualitatively or quantitatively. Specific hybridization technology which may be practiced to generate the expression profiles employed in the subject methods includes the technology described in U.S. Pat. Nos. 5,143,854; 5,288,644; 5,324,633; 5,432,049; 5,470,710; 5,492,806; 5,503,980; 5,510,270; 5,525,464; 5,547,839; 5,580,732; 5,661,028; 5,800,992; the disclosures of which are herein incorporated by reference; as well as WO 95/21265; WO 96/31622; WO 97/10365; WO 97/27317; EP 373 203; and EP 785 280. In these methods, an array of "probe" nucleic acids that includes one or several probes for each of the biomarkers whose expression is being assayed is contacted with target nucleic acids as described above. Contact is carried out under hybridization conditions, e.g., stringent hybridization conditions as described above, and unbound nucleic acid is then removed. The resultant pattern of hybridized nucleic acids provides information regarding expression for each of the biomarkers that have been probed, where the expression information is in terms of whether or not the gene is expressed and, typically, at what level, where the expression data, i.e., expression profile, may be both qualitative and quantitative. The methods may include normalizing the hybridization pattern against a subset of or all other probes on the array.
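By way of illustration only, normalizing a hybridization pattern against a subset of, or all, probes on the array could be sketched as follows. The choice of median centring on a log scale, the function name and the probe identifiers are assumptions made for this sketch and are not specified in the text above.

```python
from statistics import median

def normalize_to_probes(log_intensity, reference_probes=None):
    """Subtract the median log-intensity of the reference probes
    (all probes when no subset is named) from every probe."""
    refs = reference_probes or list(log_intensity)
    centre = median(log_intensity[p] for p in refs)
    return {p: v - centre for p, v in log_intensity.items()}

# Hypothetical log-scale intensities for five probes on one array.
array = {"p1": 7.0, "p2": 9.0, "p3": 8.0, "p4": 12.0, "p5": 6.0}

# Normalize against all probes, then against an assumed subset.
normalized_all = normalize_to_probes(array)
normalized_subset = normalize_to_probes(array, ["p1", "p5"])
```

Other normalization schemes known in the art could be substituted without changing the rest of the workflow.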

In certain embodiments an increased expression level of at least one gene selected from Table 1A, 2B or 2A, or any of Tables 3-45, with a positive weight identifies or contributes to the identification of prostate cancer with a deficiency in DNA damage repair and/or displaying immune activation (to abnormal DNA). The reverse also applies.

In further embodiments a decreased expression level of at least one gene selected from Table 1A, 2B or 2A, or any of Tables 3-45, with a negative weight identifies or contributes to the identification of prostate cancer with a deficiency in DNA damage repair and/or displaying immune activation (to abnormal DNA). The reverse also applies.

Expression levels are weighted accordingly, to account for their contribution to gene signature score as discussed herein. A threshold of expression may be set relative to a median level against which "signature positive" and "signature negative" expression values can be set. The median values are set individually for each dataset as would be understood by one skilled in the art.
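As a minimal sketch of a median-based call, assuming that a signature score has already been computed for every sample in a dataset, samples could be labelled as follows. The sample identifiers and scores are hypothetical, and treating a score equal to the median as "signature negative" is an assumption of this sketch.

```python
from statistics import median

def call_signature_status(scores):
    """Label each sample's score against the median of its own dataset."""
    cutoff = median(scores.values())  # set individually for each dataset
    # Assumption: only scores strictly above the median are called positive.
    return {
        sample: ("positive" if score > cutoff else "negative")
        for sample, score in scores.items()
    }

# Hypothetical signature scores for five samples from one dataset.
example_scores = {"S1": 0.42, "S2": -0.10, "S3": 0.05, "S4": 0.77, "S5": -0.31}
calls = call_signature_status(example_scores)
```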

Creating and using a Biomarker Expression Classifier

The relative expression levels of biomarkers may be measured to form a gene expression profile. The gene expression profile of a set of biomarkers from a sample is summarized in the form of a compound decision score (or test score) and compared to a score threshold that may be mathematically derived from a training set of patient data. The score threshold separates a patient group based on different characteristics such as, but not limited to, responsiveness/non-responsiveness to treatment.

In certain embodiments the methods described herein may comprise determining the expression level of at least one of the genes with a negative weight listed in Table 1A, 2B or 2A, or any of Tables 3-45, together with at least one gene with a positive weight listed in Table 1A, 2B or 2A, or any of Tables 3-45. Thus, the methods may rely upon a combination of an up-regulated marker and a down-regulated marker. The combined up and down regulated marker expression levels, as appropriately weighted, may then contribute to, or make up, the final signature score.

In certain embodiments the methods described herein comprise comparing the expression level of one or more genes to a reference value or to the expression level in one or more control samples or to the expression level in one or more control cells in the same sample. The control cells may be normal (i.e. cells characterized by an independent method as non-cancerous) cells. The one or more control samples may consist of non-cancerous cells or may include a mixture of cancer cells (prostate) and non-cancerous cells. The expression level may be compared to the expression level of the same gene in one or more control samples or control cells.

The reference value may be a threshold level of expression of at least one gene set by determining the level or levels in a range of samples from subjects with and without the relevant cancer. Suitable methods for setting a threshold are well known to those skilled in the art. The threshold may be mathematically derived from a training set of patient data. The score threshold thus separates the test samples according to presence or absence of the particular condition. The interpretation of this quantity, i.e. the cut-off threshold may be derived in a development or training phase from a set of patients with known outcome. The threshold may therefore be fixed prior to performance of the claimed methods from training data by methods known to those skilled in the art and as detailed herein in relation to generation of the various gene signatures.

The reference value may also be a threshold level of expression of at least one gene set by determining the level of expression of the at least one gene in a sample from a subject at a first time point. The determined levels of expression at later time points for the same subject are then compared to the threshold level. Thus, the methods of the invention may be used in order to monitor progress of disease in a subject, namely to provide an ongoing characterization and/or prognosis of disease in the subject. For example, the methods may be used to identify (or "diagnose") a prostate cancer with a deficiency in DNA damage repair and/or that displays immune activation (to abnormal DNA). This may be used to guide treatment decisions as discussed in further detail herein. In some embodiments, such active surveillance methods determine whether treatment should be administered or not. If the cancer is identified within the metastatic biology group the cancer should be treated. If the cancer is identified as not being within the relevant subgroup further monitoring can be performed to ensure that the cancer remains stable (i.e. does not evolve into the metastatic form). In such circumstances, alternative treatment may be applied.

For genes whose expression level does not differ between normal cells and cells from a cancer, such as prostate cancer, that does not have a deficiency in DNA damage repair and/or does not display immune activation (to abnormal DNA), the expression level of the same gene in normal cells in the same sample can be used as a control.

A difference in expression level may be a statistically significant difference. By statistically significant is meant unlikely to have occurred by chance alone. A suitable statistical assessment may be performed according to any suitable method.

The methods described herein may further comprise determining the expression level of a reference gene. A reference gene may be required if the target gene expression level differs between normal cells and cells from a cancer, such as prostate cancer that does not have a deficiency in DNA damage repair and/or does not display immune activation (to abnormal DNA).

In certain embodiments the expression level of at least one gene selected from Tables 1-45 is compared to the expression level of a reference gene. The reference gene may be any gene with minimal expression variance across all cancer samples. Thus, the reference gene may be any gene whose expression level does not vary dependent on the biology underlying the prostate cancer. The skilled person is well able to identify a suitable reference gene based upon these criteria. The expression level of the reference gene may be determined in the same sample as the expression level of at least one gene selected from the Tables.

The expression level of the reference gene may be determined in a different sample. The different sample may be a control sample as described above. The expression level of the reference gene may be determined in normal cells and/or cancer, such as prostate cancer, cells in a sample.
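The selection of a reference gene with minimal expression variance, and the comparison of a target gene to it, could be sketched as below. The gene names and values are hypothetical, and expressing the comparison as a per-sample difference on a log scale is an assumption of this sketch and is not a requirement of the methods described.

```python
from statistics import pvariance

def pick_reference_gene(expression_by_gene):
    """Return the candidate gene with the smallest variance across samples."""
    return min(expression_by_gene, key=lambda g: pvariance(expression_by_gene[g]))

def relative_expression(target, reference):
    """Per-sample target level minus reference level (log scale assumed)."""
    return [t - r for t, r in zip(target, reference)]

# Hypothetical log-scale expression of three candidate reference genes
# measured across the same four cancer samples.
candidates = {
    "REF_1": [10.0, 10.5, 9.5, 10.0],
    "REF_2": [8.0, 8.0, 8.5, 8.0],
    "REF_3": [6.0, 9.0, 7.0, 11.0],
}
ref = pick_reference_gene(candidates)

# Hypothetical target gene levels in the same four samples.
target_levels = [9.0, 7.5, 10.5, 8.0]
delta = relative_expression(target_levels, candidates[ref])
```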

The expression level of the at least one gene in the sample from the subject may be analyzed using a statistical model. In specific embodiments where the expression level of at least 2 genes is measured, the genes may be weighted. As used herein, the term "weight" refers to the relative importance of an item in a statistical calculation. The weight of each gene may be determined on a data set of patient samples using analytical methods known in the art. An overall score, termed a "signature score", may be calculated and used to provide a defined outcome when performing the methods of the invention. Typically, the score represents the sum of the weighted gene expression levels. Suitable weights for calculating the gene signature scores are set forth in Tables 2B and 2A and may be employed according to the methods of the invention. Similarly, suitable weights for exemplary smaller signatures are set forth in Tables 3 to 45.

Thus, according to all aspects of the invention, the methods may comprise:

(i) determining the expression level of at least one gene selected from Tables 1-45 in a sample from the subject; and

(ii) assessing from the expression level of the at least one gene whether the sample from the subject is positive or negative for a gene signature comprising the at least one gene.

Thus, at its simplest, an increased level of expression of one or more genes defines a sample as positive for the gene signature. For certain genes, a decreased level of expression of one or more genes defines a sample as positive for the gene signature. However, where the expression level of a plurality of genes is measured, the combination of expression levels is typically aggregated in order to determine whether the sample is positive for the gene signature. Thus, some genes may display increased expression and some genes may display decreased expression. This can be achieved in various ways, as discussed in detail herein.

In specific embodiments, the signature score may be calculated according to the following equation:

SignatureScore = ∑_i w_i × (ge_i − b_i) + k

where w_i is a weight for each gene, b_i is a gene-specific bias, ge_i is the gene expression after pre-processing, and k is a constant offset. Similarly, each gene in the signature may (or may not) be attributed a bias score. Example bias scores are specified in Tables 3-45 and may be adopted according to the performance of the methods of the invention. Of course, where different signatures are utilized the bias values would be recalculated.

As indicated, k is a constant offset. Again, where different signatures are utilized the value of k would be recalculated. The value of k varies dependent upon where the threshold for "signature positive" is set. This threshold may be set dependent upon which considerations are most important, e.g. to maximize sensitivity and/or specificity as against a particular outcome or characterization. Suitable thresholds may be determined as described above.
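A minimal sketch of the signature score calculation is given below. It assumes that the score takes the form of the sum over genes of w_i × (ge_i − b_i), plus k, which is one reading of the definitions of weight, bias, pre-processed expression and offset given above. The gene names, weights, biases and offset are hypothetical and are not taken from any of the Tables. The sketch further assumes that k has been chosen so that a score of zero or above is called "signature positive".

```python
def signature_score(expression, weights, biases, k=0.0):
    """Sum of w_i * (ge_i - b_i) over the signature genes, plus offset k."""
    return sum(weights[g] * (expression[g] - biases[g]) for g in weights) + k

# Hypothetical three-gene signature with one negatively weighted gene.
weights = {"GENE_A": 0.5, "GENE_B": -0.25, "GENE_C": 1.0}
biases = {"GENE_A": 6.0, "GENE_B": 4.0, "GENE_C": 5.0}

# Hypothetical pre-processed expression values for one sample.
sample = {"GENE_A": 8.0, "GENE_B": 6.0, "GENE_C": 4.5}

score = signature_score(sample, weights, biases, k=0.25)

# Assumption: the offset k places the positive/negative cut-off at zero.
is_positive = score >= 0.0
```

In this example the up-regulated, positively weighted GENE_A raises the score, while the up-regulated, negatively weighted GENE_B lowers it, illustrating how up- and down-regulated markers combine.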

By "signature score" is meant a compound decision score that summarizes the expression levels of the genes. This may be compared to a threshold score that is mathematically derived from a training set of patient data. The threshold score is established with the purpose of maximizing the ability to separate cancers into those that are positive for the biomarker signature and those that are negative. The patient training set data is preferably derived from cancer tissue samples having been characterized by sub-type, prognosis, likelihood of recurrence, long term survival, clinical outcome, treatment response, diagnosis, cancer classification, or personalized genomics profile. Expression profiles, and corresponding decision scores from patient samples may be correlated with the characteristics of patient samples in the training set that are on the same side of the mathematically derived score decision threshold. In certain example embodiments, the threshold of the (linear) classifier scalar output is optimized to maximize the sum of sensitivity and specificity under cross-validation as observed within the training dataset.
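The selection of a threshold that maximizes the sum of sensitivity and specificity could be sketched as follows. For brevity the sketch searches a single training set and omits the cross-validation loop referred to above. The scores and labels are hypothetical, and calling a sample positive when its score is equal to or above the cut-off is an assumption of the sketch.

```python
def best_threshold(scores, labels):
    """Return (cut-off, sensitivity + specificity) for the best cut-off.

    labels[i] is True for a signature-positive training sample.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cut, best_sum = None, -1.0
    # Every observed score is tried as a candidate cut-off.
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= cut)
        tn = sum(1 for s, y in zip(scores, labels) if not y and s < cut)
        total = tp / n_pos + tn / n_neg
        if total > best_sum:
            best_cut, best_sum = cut, total
    return best_cut, best_sum

# Hypothetical training scores with known signature status.
train_scores = [0.1, 0.75, 0.35, 0.8, 0.7, 0.2, 0.3, 0.9]
train_labels = [False, False, True, True, True, False, False, True]
cut, sens_plus_spec = best_threshold(train_scores, train_labels)
```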

The overall expression data for a given sample may be normalized using methods known to those skilled in the art in order to correct for differing amounts of starting material, varying efficiencies of the extraction and amplification reactions, etc. In one embodiment, the biomarker expression levels in a sample are evaluated by a (linear) classifier. As used herein, a (linear) classifier refers to a weighted sum of the individual biomarker intensities into a compound decision score ("decision function"). The decision score is then compared to a pre-defined cut-off score threshold, corresponding to a certain set-point in terms of sensitivity and specificity which indicates if a sample is equal to or above the score threshold (decision function positive) or below (decision function negative).

Using a (linear) classifier on the normalized data to make a call (e.g. positive or negative for a biomarker signature) effectively means to split the data space, i.e. all possible combinations of expression values for all genes in the classifier, into two disjoint segments by means of a separating hyperplane. This split is empirically derived on a (large) set of training examples. Without loss of generality, one can assume a certain fixed set of values for all but one biomarker, which would automatically define a threshold value for this remaining biomarker where the decision would change from, for example, positive to negative for the biomarker signature. The precise value of this threshold depends on the actual measured expression profile of all other genes within the classifier, but the general indication of certain genes remains fixed. Therefore, in the context of the overall gene expression classifier, relative expression can indicate if either up- or down-regulation of a certain biomarker is indicative of being positive for the signature or not. In certain example embodiments, a sample expression score above the threshold expression score indicates the sample is positive for the biomarker signature. In certain other example embodiments, a sample expression score above a threshold score indicates the subject has a poor clinical prognosis compared to a subject with a sample expression score below the threshold score.
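The dependence of a single biomarker's threshold on the other genes can be illustrated with a short sketch. It assumes a linear score equal to the weighted sum of expression values plus an offset, with a call of positive at or above a cut-off of zero. The gene names, weights and offset are hypothetical.

```python
def implied_threshold(weights, offset, fixed_expression, gene, score_cutoff=0.0):
    """Expression of `gene` at which the linear score crosses the cut-off,
    holding every other gene at its value in `fixed_expression`."""
    rest = sum(w * fixed_expression[g] for g, w in weights.items() if g != gene)
    return (score_cutoff - offset - rest) / weights[gene]

# Hypothetical two-gene classifier: score = 2.0*A - 1.0*B - 3.0
weights = {"GENE_A": 2.0, "GENE_B": -1.0}
offset = -3.0

# Threshold for GENE_A when GENE_B is fixed at two different levels.
t_a = implied_threshold(weights, offset, {"GENE_B": 1.0}, "GENE_A")
t_a_shifted = implied_threshold(weights, offset, {"GENE_B": 3.0}, "GENE_A")
```

The threshold for GENE_A moves as GENE_B changes, but because the weight of GENE_A is positive, higher GENE_A expression always pushes the call towards positive.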

In certain other example embodiments, the expression signature is derived using a decision tree (Hastie et al. The Elements of Statistical Learning, Springer, New York 2001), a random forest (Breiman, 2001 Random Forests, Machine Learning 45:5), a neural network (Bishop, Neural Networks for Pattern Recognition, Clarendon Press, Oxford 1995), discriminant analysis (Duda et al. Pattern Classification, 2nd ed., John Wiley, New York 2001), including, but not limited to linear, diagonal linear, quadratic and logistic discriminant analysis, a Prediction Analysis for Microarrays (PAM, (Tibshirani et al., 2002, Proc. Natl. Acad. Sci. USA 99:6567-6572)) or a Soft Independent Modeling of Class Analogy analysis (SIMCA, (Wold, 1976, Pattern Recogn. 8:127-139)).

Classification trees (Breiman, Leo; Friedman, J. H.; Olshen, R. A.; Stone, C. J. (1984). Classification and regression trees. Monterey, CA: Wadsworth & Brooks/Cole Advanced Books & Software. ISBN 978-0-412-04841-8) provide a means of predicting outcomes based on logic and rules. A classification tree is built through a process called binary recursive partitioning, which is an iterative procedure of splitting the data into partitions/branches. The goal is to build a tree that distinguishes among pre-defined classes. Each node in the tree corresponds to a variable. To choose the best split at a node, each variable is considered in turn, where every possible split is tried and considered, and the best split is the one which produces the largest decrease in diversity of the classification label within each partition. This is repeated for all variables, and the winner is chosen as the best splitter for that node. The process is continued at the next node and in this manner, a full tree is generated. One of the advantages of classification trees over other supervised learning approaches, such as discriminant analysis, is that the variables that are used to build the tree can be either categorical, or numeric, or a mix of both. In this way it is possible to generate a classification tree for predicting outcomes based on, say, the directionality of gene expression.
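The choice of the best split for a single numeric variable at one node could be sketched as follows. The sketch assumes that "diversity" is measured by Gini impurity, which is one common choice and is not stated in the text above, and it uses hypothetical expression values and response labels.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Try every midpoint between sorted distinct values.

    Returns (threshold, decrease in impurity) for the best split.
    """
    parent = gini(labels)
    n = len(labels)
    distinct = sorted(set(values))
    best = (None, 0.0)
    for lo, hi in zip(distinct, distinct[1:]):
        thr = (lo + hi) / 2
        left = [y for v, y in zip(values, labels) if v <= thr]
        right = [y for v, y in zip(values, labels) if v > thr]
        # Size-weighted impurity of the two partitions.
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        if parent - child > best[1]:
            best = (thr, parent - child)
    return best

# Hypothetical expression of one gene and response labels (R / N).
expr = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
resp = ["R", "R", "R", "N", "N", "N"]
thr, decrease = best_split(expr, resp)
```

Repeating this search over every variable and keeping the variable with the largest decrease gives the splitter for the node.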

Random forest algorithms (Breiman, Leo (2001). "Random Forests". Machine Learning 45 (1): 5-32. doi:10.1023/A:1010933404324) provide a further extension to classification trees, whereby a collection of classification trees is randomly generated to form a "forest" and an average of the predicted outcomes from each tree is used to make an inference with respect to the outcome.

Biomarker expression values may be defined in combination with corresponding scalar weights on the real scale with varying magnitude, which are further combined through linear or non-linear, algebraic, trigonometric or correlative means into a single scalar value via an algebraic, statistical learning, Bayesian, regression, or similar algorithms which together with a mathematically derived decision function on the scalar value provide a predictive model by which expression profiles from samples may be resolved into discrete classes of responder or non-responder, resistant or non-resistant, to a specified drug, drug class, molecular subtype, or treatment regimen. Such predictive models, including biomarker membership, are developed by learning weights and the decision threshold, optimized for sensitivity, specificity, negative and positive predictive values, hazard ratio or any combination thereof, under cross-validation, bootstrapping or similar sampling techniques, from a set of representative expression profiles from historical patient samples with known drug response and/or resistance.

In one embodiment, the genes are used to form a weighted sum of their signals, where individual weights can be positive or negative. The resulting sum ("expression score") is compared with a predetermined reference point or value. The comparison with the reference point or value may be used to diagnose, or predict a clinical condition or outcome.
As described above, one of ordinary skill in the art will appreciate that the genes included in the classifier provided in the various Tables will carry unequal weights in a classifier. Therefore, while as few as one biomarker may be used to diagnose or predict a clinical prognosis or response to a therapeutic agent, the specificity and sensitivity, or the accuracy of the diagnosis or prediction, may increase when more genes are used.

In certain example embodiments, the expression signature is defined by a decision function. A decision function is a set of weighted expression values derived using a (linear) classifier. All linear classifiers define the decision function using the following equation:

f(x) = w' · x + b = ∑ wi · xi + b    (1)

All measurement values, such as the microarray gene expression intensities xi, for a certain sample are collected in a vector x. Each intensity is then multiplied with a corresponding weight wi to obtain the value of the decision function f(x) after adding an offset term b. In deriving the decision function, the linear classifier will further define a threshold value that splits the gene expression data space into two disjoint sections. Example (linear) classifiers include but are not limited to partial least squares (PLS), (Nguyen et al., Bioinformatics 18 (2002) 39-50), support vector machines (SVM) (Scholkopf et al., Learning with Kernels, MIT Press, Cambridge 2002), and shrinkage discriminant analysis (SDA) (Ahdesmaki et al., Annals of applied statistics 4, 503-519 (2010)). In one example embodiment, the (linear) classifier is a PLS linear classifier. The decision function is empirically derived on a large set of training samples, for example from patients showing a good or poor clinical prognosis. The threshold separates a patient group based on different characteristics such as, but not limited to, clinical prognosis before or after a given therapeutic treatment. The interpretation of this quantity, i.e. the cut-off threshold, is derived in the development phase ("training") from a set of patients with known outcome. The corresponding weights and the responsiveness/resistance cut-off threshold for the decision score are fixed a priori from training data by methods known to those skilled in the art. In one example embodiment, Partial Least Squares Discriminant Analysis (PLS-DA) is used for determining the weights. (L. Stahle, S. Wold, J. Chemom. 1 (1987) 185-196; D. V. Nguyen, D.M. Rocke, Bioinformatics 18 (2002) 39-50).

Effectively, this means that the data space, i.e. the set of all possible combinations of biomarker expression values, is split into two mutually exclusive groups corresponding to different clinical classifications or predictions, for example, one corresponding to good clinical prognosis and the other to poor clinical prognosis. In the context of the overall classifier, relative over-expression of a certain biomarker can either increase the decision score (positive weight) or reduce it (negative weight) and thus contribute to an overall decision of, for example, a good clinical prognosis.

In certain example embodiments of the invention, the data is transformed non-linearly before applying a weighted sum as described above. This non-linear transformation might include increasing the dimensionality of the data. The non-linear transformation and weighted summation might also be performed implicitly, for example, through the use of a kernel function (Scholkopf et al. Learning with Kernels, MIT Press, Cambridge 2002).

In certain example embodiments, the patient training set data is derived by isolating RNA from a corresponding cancer tissue sample set and determining expression values by hybridizing the (cDNA amplified from) isolated RNA to a microarray. In certain example embodiments, the microarray used in deriving the expression signature is a transcriptome array. As used herein a "transcriptome array" refers to a microarray containing probe sets that are designed to hybridize to sequences that have been verified as expressed in the diseased tissue of interest. Given alternative splicing and variable poly-A tail processing between tissues and biological contexts, it is possible that probes designed against the same gene sequence derived from another tissue source or biological context will not effectively bind to transcripts expressed in the diseased tissue of interest, leading to a loss of potentially relevant biological information. Accordingly, it is beneficial to verify what sequences are expressed in the disease tissue of interest before deriving a microarray probe set. Verification of expressed sequences in a particular disease context may be done, for example, by isolating and sequencing total RNA from a diseased tissue sample set and cross-referencing the isolated sequences with known nucleic acid sequence databases to verify that the probe set on the transcriptome array is designed against the sequences actually expressed in the diseased tissue of interest.
Methods for making transcriptome arrays are described in United States Patent Application Publication No. 2006/0134663, which is incorporated herein by reference. In certain example embodiments, the probe set of the transcriptome array is designed to bind within 300 nucleotides of the 3' end of a transcript. Methods for designing transcriptome arrays with probe sets that bind within 300 nucleotides of the 3' end of target transcripts are disclosed in United States Patent Application Publication No. 2009/0082218, which is incorporated by reference herein.
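Returning to the implicit non-linear transformation by means of a kernel function mentioned above, the following sketch shows, for a hypothetical two-gene expression vector, that a degree-2 polynomial kernel gives the same value as an explicit transformation into a higher-dimensional space followed by a dot product. The choice of kernel and the values used are assumptions made purely for illustration.

```python
import math

def phi(x):
    """Explicit degree-2 feature map: 2 dimensions become 3."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def poly_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2, computed in 2 dimensions."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

# Hypothetical expression vectors for two samples.
x, z = (1.0, 2.0), (3.0, 0.5)

# Dot product after the explicit transformation...
explicit = sum(a * b for a, b in zip(phi(x), phi(z)))
# ...matches the kernel evaluated on the untransformed data.
implicit = poly_kernel(x, z)
```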

An optimal (linear) classifier can be selected by evaluating a (linear) classifier's performance using such diagnostics as "area under the curve" (AUC). AUC refers to the area under the curve of a receiver operating characteristic (ROC) curve, both of which are well known in the art. AUC measures are useful for comparing the accuracy of a classifier across the complete data range. (Linear) classifiers with a higher AUC have a greater capacity to classify unknowns correctly between two groups of interest (e.g., ovarian cancer samples and normal or control samples). ROC curves are useful for plotting the performance of a particular feature (e.g., any of the genes described herein and/or any item of additional biomedical information) in distinguishing between two populations (e.g., individuals responding and not responding to a therapeutic agent). Typically, the feature data across the entire population (e.g., the cases and controls) are sorted in ascending order based on the value of a single feature. Then, for each value for that feature, the true positive and false positive rates for the data are calculated. The true positive rate is determined by counting the number of cases above the value for that feature and then dividing by the total number of positive cases. The false positive rate is determined by counting the number of controls above the value for that feature and then dividing by the total number of controls. Although this definition refers to scenarios in which a feature is elevated in cases compared to controls, this definition also applies to scenarios in which a feature is lower in cases compared to the controls (in such a scenario, samples below the value for that feature would be counted). ROC curves can be generated for a single feature as well as for other single outputs, for example, a combination of two or more features can be mathematically combined (e.g., added, subtracted, multiplied, etc.) to provide a single sum value, and this single sum value can be plotted in a ROC curve. Additionally, any combination of multiple features, in which the combination derives a single output value, can be plotted in a ROC curve. These combinations of features may comprise a test. The ROC curve is the plot of the true positive rate (sensitivity) of a test against the false positive rate (1 - specificity) of the test.
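The construction of a ROC curve and its AUC for a single feature, as described above, could be sketched as follows. The feature values and case/control labels are hypothetical. Counting samples at or above each value (rather than strictly above) and using trapezoidal integration are assumptions of this sketch.

```python
def roc_points(values, is_case):
    """(false positive rate, true positive rate) at each observed value."""
    n_case = sum(is_case)
    n_ctrl = len(is_case) - n_case
    points = [(0.0, 0.0)]
    # Step the cut-off down through the observed values.
    for cut in sorted(set(values), reverse=True):
        tp = sum(1 for v, c in zip(values, is_case) if c and v >= cut)
        fp = sum(1 for v, c in zip(values, is_case) if not c and v >= cut)
        points.append((fp / n_ctrl, tp / n_case))
    return points

def auc(points):
    """Trapezoidal area under consecutive ROC points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )

# Hypothetical feature elevated in cases relative to controls.
feature = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
case = [True, True, False, True, False, False]
curve = roc_points(feature, case)
area = auc(curve)
```

For a feature that is lower in cases than in controls, the same functions can be applied to the negated feature values.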

Alternatively, an optimal classifier can be selected by evaluating performance against time-to-event endpoints using methods such as Cox proportional hazards (PH) and measures of performance across all possible thresholds assessed via the concordance-index (C-index) (Harrell, Jr. 2010). The C-index is analogous to the "area under the curve" (AUC) metric (used for dichotomized endpoints), and it is used to measure performance with respect to association with survival data. Note that the extension of AUC to time-to-event endpoints is the C-index, with threshold selection optimized to maximize the hazard ratio (HR) under cross-validation. In this instance, the partial Cox regression algorithm (Li and Gui, 2004) was chosen for the biomarker discovery analyses. It is analogous to principal components analysis in that the first few latent components explain most of the information in the data. Implementation is as described in Ahdesmaki et al 2013. C-index values can be generated for a single feature as well as for other single outputs, for example, a combination of two or more features can be mathematically combined (e.g., added, subtracted, multiplied, etc.) to provide a single sum value, and this single sum value can be evaluated for statistical significance. Additionally, any combination of multiple features, in which the combination derives a single output value, can be evaluated as a C-index for assessing utility for time-to-event class separation. These combinations of features may comprise a test. The C-index (Harrell, Jr. 2010, see Equation 4) of the continuous cross-validation test set risk score predictions was evaluated as the main performance measure.
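A simple pairwise calculation of a concordance index for censored time-to-event data could be sketched as below. This is a generic illustration of a Harrell-style C-index and is not the implementation referred to above. The follow-up times, event indicators and risk scores are hypothetical, and giving tied risk scores half credit is an assumption of the sketch.

```python
def concordance_index(times, events, risk):
    """Fraction of comparable pairs in which the subject with the earlier
    observed event has the higher risk score (tied scores count 0.5)."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an event
            # strictly before subject j's event or censoring time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical follow-up times, event flags (False = censored)
# and continuous risk scores for five subjects.
times = [5, 8, 12, 20, 25]
events = [True, True, False, True, False]
risk = [2.1, 1.7, 0.4, 1.9, 0.2]
c = concordance_index(times, events, risk)
```

A value of 0.5 corresponds to random ordering and a value of 1.0 to perfect agreement between risk score and observed event order.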

Therapeutic agents

The methods and kits described herein may predict responsiveness to treatment, and be used to select treatment, with mitotic inhibitors. The mitotic inhibitor may be a vinca alkaloid or a taxane. In specific embodiments, the vinca alkaloid is vinorelbine. In certain embodiments, the taxane is paclitaxel or docetaxel. It is shown herein that prostate cancers with a deficiency in DNA damage repair and/or displaying immune activation (to abnormal DNA) may be refractory to treatment with mitotic inhibitors. Thus, the invention is useful to identify those subjects where treatment with a mitotic inhibitor may be recommended, i.e. where the prostate cancer does not have a deficiency in DNA damage repair and/or does not display immune activation (to abnormal DNA); or where it may be contraindicated. In addition, the methods described herein permit the classification of a patient suffering from prostate cancer as responsive or non-responsive to a therapeutic agent that targets tumours with a deficiency in DNA damage repair and/or displaying immune activation (to abnormal DNA). Such agents are referred to herein as a "DNA-damaging therapeutic agent". As used herein "DNA-damaging therapeutic agent" includes agents known to damage DNA directly, agents that prevent DNA damage repair, agents that inhibit DNA damage signalling, agents that inhibit DNA damage induced cell cycle arrest, and agents that inhibit processes indirectly leading to DNA damage. Some suitable therapeutics within this category include, but are not limited to, the following DNA-damaging therapeutic agents:

1) DNA damaging agents:

a. Alkylating agents (platinum containing agents such as cisplatin, carboplatin, and oxaliplatin; cyclophosphamide; busulphan).

b. Topoisomerase I inhibitors (irinotecan; topotecan)

c. Topoisomerase II inhibitors (etoposide; anthracyclines such as doxorubicin and epirubicin)

d. Ionising radiation

2) DNA repair targeted therapies

a. Inhibitors of non-homologous end-joining (DNA-PK inhibitors, NU7441, NU7026)

b. Inhibitors of homologous recombination

c. Inhibitors of nucleotide excision repair

d. Inhibitors of base excision repair (PARP inhibitors, AG014699, AZD2281, ABT-888, MK4827, BSI-201, INO-1001, TRC-102, APEX 1 inhibitors, APEX 2 inhibitors, Ligase III inhibitors)

e. Inhibitors of the Fanconi anemia pathway

3) Inhibitors of DNA damage signalling

a. ATM inhibitors (CP466722)

b. CHK 1 inhibitors (XL-844, UCN-01, AZD7762, PF00477736)

c. CHK 2 inhibitors (XL-844, AZD7762, PF00477736)

d. ATR inhibitors (AZ20)

4) Inhibitors of DNA damage induced cell cycle arrest

a. Wee1 kinase inhibitors

b. CDC25a, b or c inhibitors

5) Inhibition of processes indirectly leading to DNA damage

a. Histone deacetylase inhibitors

b. Heat shock protein inhibitors (geldanamycin, AUY922)

6) Inhibitors of DNA synthesis:

a. Pyrimidine analogues (5-FU, gemcitabine)

b. Prodrugs (capecitabine)

As discussed above, the therapeutic agents for which responsiveness is predicted may be applied in an adjuvant setting. Additionally or alternatively, they may be utilised in a neoadjuvant setting.

The invention described herein is not limited to any one DNA-damaging therapeutic agent; it can be used to identify responders and non-responders to any of a range of DNA-damaging therapeutic agents, for example those that directly or indirectly affect DNA damage and/or DNA damage repair. In some embodiments, the DNA-damaging therapeutic agent comprises one or more substances selected from the group consisting of: a DNA damaging agent, a DNA repair targeted therapy, an inhibitor of DNA damage signalling, an inhibitor of DNA damage induced cell cycle arrest, a histone deacetylase inhibitor, a heat shock protein inhibitor and an inhibitor of DNA synthesis. More specifically, the DNA-damaging therapeutic agent may be selected from one or more of a platinum-containing agent, a nucleoside analogue such as gemcitabine or 5-fluorouracil or a prodrug thereof such as capecitabine, an anthracycline such as epirubicin or doxorubicin, an alkylating agent such as cyclophosphamide, ionising radiation or a combination of radiation and chemotherapy (chemoradiation). In particular embodiments, the DNA-damaging therapeutic agent comprises a platinum-containing agent, such as a platinum based agent selected from cisplatin, carboplatin and oxaliplatin. PARP inhibitors may be employed.

Diseases and Tissue Sources

The predictive classifiers described herein are useful for determining responsiveness or resistance to a therapeutic agent for treating prostate cancer. The prostate cancer is typically within a metastatic group. In one embodiment, the methods described herein refer to prostate cancers that are treated with chemotherapeutic agents of the mitotic inhibitor class. Alternatively, treatment may be with DNA damaging agents, DNA repair targeted therapies, inhibitors of DNA damage signalling, inhibitors of DNA damage induced cell cycle arrest, inhibitors of processes indirectly leading to DNA damage and inhibitors of DNA synthesis, but is not limited to these classes. Each of these chemotherapeutic agents is considered a "DNA-damaging therapeutic agent" as the term is used herein.

"Biological sample", "sample", and "test sample" are used interchangeably herein to refer to any material, biological fluid, tissue, or cell obtained or otherwise derived from an individual. The sample contains prostate cancer cells and/or genetic material (DNA and/or RNA) derived from prostate cancer cells. The sample may comprise circulating tumour cells and/or cell free DNA. This includes blood (including whole blood, leukocytes, peripheral blood mononuclear cells, buffy coat, plasma, and serum), sputum, tears, mucus, nasal washes, nasal aspirate, breath, urine, semen, saliva, meningeal fluid, amniotic fluid, glandular fluid, lymph fluid, nipple aspirate, bronchial aspirate, synovial fluid, joint aspirate, ascites, cells, a cellular extract, and cerebrospinal fluid. This also includes experimentally separated fractions of all of the preceding. For example, a blood sample can be fractionated into serum or into fractions containing particular types of blood cells, such as red blood cells or white blood cells (leukocytes). If desired, a sample can be a combination of samples from an individual, such as a combination of a tissue and fluid sample. The term "biological sample" also includes materials containing homogenized solid material, such as from a stool sample, a tissue sample, or a tissue biopsy, for example. The term "biological sample" also includes materials derived from a tissue culture or a cell culture. Any suitable methods for obtaining a biological sample can be employed; exemplary methods include, e.g., phlebotomy, swab (e.g., buccal swab), and a fine needle aspirate biopsy procedure. Samples may be obtained by bronchoscopy or by sputum cytology in some embodiments. A "biological sample" obtained or derived from an individual includes any such sample that has been processed in any suitable manner after being obtained from the individual.

In such cases, the target cells may be tumour cells, for example prostate cancer cells. The target cells are derived from any tissue source, including human and animal tissue, such as, but not limited to, a newly obtained sample, a frozen sample, a biopsy sample, a sample of bodily fluid, a blood sample, preserved tissue such as a paraffin-embedded fixed tissue sample (i.e., a tissue block), or cell culture. In some specific embodiments, the samples may or may not comprise vesicles.

Methods and Kits

Kits for Gene Expression Analysis

Reagents, tools, and/or instructions for performing the methods described herein can be provided in a kit. For example, the kit can contain reagents, tools, and instructions for determining an appropriate therapy for a prostate cancer patient. Such a kit can include reagents for collecting a tissue sample from a patient, such as by biopsy, and reagents for processing the tissue. The kit may incorporate reagents for recovering or purifying genetic material (DNA and/or RNA) from the sample. The kit can also include one or more reagents for performing a biomarker expression analysis, such as reagents for performing nucleic acid amplification, including RT-PCR and qPCR, NGS, bDNA/bRNA, northern blot, proteomic analysis, or immunohistochemistry to determine expression levels of biomarkers in a sample of a patient. For example, primers for performing RT-PCR, probes for performing northern blot analyses, and/or antibodies for performing proteomic analysis such as Western blot, immunohistochemistry and ELISA analyses can be included in such kits. Appropriate buffers for the assays can also be included. Detection reagents required for any of these assays can also be included. The appropriate reagents and methods are described in further detail below.

The target sequences of the genes described herein are known and may be utilised for the purposes of designing primers and/or probes which hybridize to the target sequences. Design of suitable primers and/or probes is within the capability of one skilled in the art once the target sequence is identified. Various primer design tools are freely available to assist in this process, such as the NCBI Primer-BLAST tool. The primers and/or probes may be designed such that they hybridize to the target sequence under stringent conditions. Primers and/or probes may be at least 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25 (or more) nucleotides in length. It should be understood that each subset can include multiple primers and/or probes directed to the same biomarker. The tables show in some cases multiple target sequences within the same overall gene. Such primers and/or probes may be included in kits useful for performing the methods of the invention. The invention provides, and the kits of the invention may incorporate, the probe sequences disclosed herein with reference to Table 1A. Table 1A lists the SEQ ID NOs for the individual probes used to measure expression levels of the genes identified in the table. Thus, the invention provides a probe comprising, consisting essentially of or consisting of the sequence of any one of SEQ ID NOs 1-3500. Any one or more, up to all, of the probes may be included in the kits of the invention. The kits may be array or PCR based kits, for example, and may include additional reagents, such as a polymerase and/or dNTPs. The kits featured herein can also include an instruction sheet describing how to perform the assays for measuring biomarker expression. The instruction sheet can also include instructions for how to determine a reference cohort, including how to determine expression levels of biomarkers in the reference cohort and how to assemble the expression data to establish a reference for comparison to a test patient.
The instruction sheet can also include instructions for assaying biomarker expression in a test patient and for comparing the expression level with the expression in the reference cohort to subsequently determine the appropriate chemotherapy for the test patient. Methods for determining the appropriate chemotherapy are described above and can be described in detail in the instruction sheet.

Informational material included in the kits can be descriptive, instructional, marketing or other material that relates to the methods described herein and/or the use of the reagents for the methods described herein. For example, the informational material of the kit can contain contact information, e.g., a physical address, email address, website, or telephone number, where a user of the kit can obtain substantive information about performing a gene expression analysis and interpreting the results, particularly as they apply to a human's likelihood of having a positive response to a specific therapeutic agent.

The kits featured herein can also contain software necessary to infer a patient's likelihood of having a positive response to a specific therapeutic agent from the biomarker expression.

a) Gene expression profiling methods

Measuring mRNA in a biological sample may be used as a surrogate for detection of the level of the corresponding protein in the biological sample. Thus, any of the biomarkers or biomarker panels described herein can also be detected by detecting the appropriate RNA. Methods of gene expression profiling include, but are not limited to, microarray, RT-PCR, qPCR, NGS, northern blots, SAGE and mass spectrometry. mRNA expression levels may be measured by reverse transcription quantitative polymerase chain reaction (RT-PCR followed by qPCR). RT-PCR is used to create a cDNA from the mRNA. The cDNA may be used in a qPCR assay to produce fluorescence as the DNA amplification process progresses. By comparison to a standard curve, qPCR can produce an absolute measurement such as number of copies of mRNA per cell. Northern blots, microarrays, Invader assays, and RT-PCR combined with capillary electrophoresis have all been used to measure expression levels of mRNA in a sample. See Gene Expression Profiling: Methods and Protocols, Richard A. Shimkets, editor, Humana Press, 2004. miRNA molecules are small RNAs that are non-coding but may regulate gene expression. Any of the methods suited to the measurement of mRNA expression levels can also be used for the corresponding miRNA. Recently many laboratories have investigated the use of miRNAs as biomarkers for disease. Many diseases involve widespread transcriptional regulation, and it is not surprising that miRNAs might find a role as biomarkers. The connection between miRNA concentrations and disease is often even less clear than the connections between protein levels and disease, yet the value of miRNA biomarkers might be substantial.
Of course, as with any RNA expressed differentially during disease, the problems facing the development of an in vitro diagnostic product will include the requirement that the miRNAs survive in the diseased cell and are easily extracted for analysis, or that the miRNAs are released into blood or other matrices where they must survive long enough to be measured. Protein biomarkers have similar requirements, although many potential protein biomarkers are secreted intentionally at the site of pathology and function, during disease, in a paracrine fashion. Many potential protein biomarkers are designed to function outside the cells within which those proteins are synthesized.
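The standard-curve comparison mentioned above for qPCR can be sketched as follows. This is a minimal illustration only, not the assay procedure of the invention: it assumes a dilution series of known copy numbers, assumes that the threshold cycle (Ct) is linear in log10(copies), and returns copies per reaction (conversion to copies per cell would need a cell count, which is not modelled here). The function names and all numerical values are hypothetical.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the fitted curve to estimate copies per reaction for a measured Ct."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical ten-fold dilution series (copies per reaction) and measured Ct values.
standard_copies = [1e6, 1e5, 1e4, 1e3, 1e2]
standard_ct = [15.0, 18.3, 21.6, 24.9, 28.2]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
unknown_copies = copies_from_ct(20.0, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, copies={unknown_copies:.0f}")
```

A slope near -3.3 corresponds to roughly a doubling of product per cycle; a real assay would also check the goodness of fit before interpolating unknowns.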

Gene expression may also be evaluated using mass spectrometry methods. A variety of configurations of mass spectrometers can be used to detect biomarker values. Several types of mass spectrometers are available or can be produced with various configurations. In general, a mass spectrometer has the following major components: a sample inlet, an ion source, a mass analyser, a detector, a vacuum system, an instrument-control system, and a data system.

Differences in the sample inlet, ion source, and mass analyser generally define the type of instrument and its capabilities. For example, an inlet can be a capillary-column liquid chromatography source or can be a direct probe or stage such as used in matrix-assisted laser desorption. Common ion sources are, for example, electrospray, including nanospray and microspray, or matrix-assisted laser desorption. Common mass analysers include a quadrupole mass filter, ion trap mass analyser and time-of-flight mass analyser. Additional mass spectrometry methods are well known in the art (see Burlingame et al., Anal. Chem. 70:647R-716R (1998); Kinter and Sherman, New York (2000)).

Protein biomarkers and biomarker values can be detected and measured by any of the following: electrospray ionization mass spectrometry (ESI-MS), ESI-MS/MS, ESI-MS/(MS)n, matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF-MS), surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS), desorption/ionization on silicon (DIOS), secondary ion mass spectrometry (SIMS), quadrupole time-of-flight (Q-TOF), tandem time-of-flight (TOF/TOF) technology, called ultraflex III TOF/TOF, atmospheric pressure chemical ionization mass spectrometry (APCI-MS), APCI-MS/MS, APCI-(MS)n, atmospheric pressure photoionization mass spectrometry (APPI-MS), APPI-MS/MS, APPI-(MS)n, quadrupole mass spectrometry, Fourier transform mass spectrometry (FTMS), quantitative mass spectrometry, and ion trap mass spectrometry.

Sample preparation strategies are used to label and enrich samples before mass spectroscopic characterization of protein biomarkers and determination of biomarker values.

Labelling methods include but are not limited to isobaric tag for relative and absolute quantitation (iTRAQ) and stable isotope labelling with amino acids in cell culture (SILAC). Capture reagents used to selectively enrich samples for candidate biomarker proteins prior to mass spectroscopic analysis include but are not limited to aptamers, antibodies, nucleic acid probes, chimeras, small molecules, an F(ab')2 fragment, a single chain antibody fragment, an Fv fragment, a single chain Fv fragment, a nucleic acid, a lectin, a ligand-binding receptor, affibodies, nanobodies, ankyrins, domain antibodies, alternative antibody scaffolds (e.g. diabodies etc.), imprinted polymers, avimers, peptidomimetics, peptoids, peptide nucleic acids, threose nucleic acid, a hormone receptor, a cytokine receptor, and synthetic receptors, and modifications and fragments of these.

The foregoing assays enable the detection of biomarker values that are useful in methods for predicting responsiveness to a cancer therapeutic agent, where the methods comprise detecting, in a biological sample from an individual suffering from prostate cancer, at least N biomarker values that each correspond to a biomarker selected from the group consisting of the biomarkers provided in Tables 1 to 45, wherein a classification, as described in detail below, using the biomarker values indicates whether the individual will be responsive to a therapeutic agent, or alternatively that the therapeutic agent is contraindicated. While certain of the described predictive biomarkers are useful alone for predicting responsiveness to a therapeutic agent, methods are also described herein for the grouping of multiple subsets of the biomarkers that are each useful as a panel of two or more biomarkers. Thus, various embodiments of the instant application provide combinations comprising N biomarkers, wherein N is at least 2, 3, 4, 5 etc. biomarkers. It will be appreciated that N can be selected to be any number from any of the above-described ranges, as well as similar, but higher order, ranges. In accordance with any of the methods described herein, biomarker values can be detected and classified individually or they can be detected and classified collectively, as for example in a multiplex assay format.

b) Microarray methods

In one embodiment, the present invention makes use of "oligonucleotide arrays" (also called herein "microarrays"). Microarrays can be employed for analysing the expression of biomarkers in a cell, and especially for measuring the expression of biomarkers of cancer tissues.

In one embodiment, biomarker arrays are produced by hybridizing detectably labelled polynucleotides representing the mRNA transcripts present in a cell (e.g., fluorescently-labelled cDNA synthesized from total cell mRNA or labelled cRNA) to a microarray. A microarray is a surface with an ordered array of binding (e.g., hybridization) sites for products of many of the genes in the genome of a cell or organism, preferably most or almost all of the genes. Microarrays can be made in a number of ways known in the art. However produced, microarrays share certain characteristics. The arrays are reproducible, allowing multiple copies of a given array to be produced and easily compared with each other. Preferably the microarrays are small, usually smaller than 5 cm², and they are made from materials that are stable under binding (e.g., nucleic acid hybridization) conditions. A given binding site or unique set of binding sites in the microarray will specifically bind the product of a single gene in the cell. In a specific embodiment, positionally addressable arrays containing affixed nucleic acids of known sequence at each location are used.

It will be appreciated that when cDNA complementary to the RNA of a cell is made and hybridized to a microarray under suitable hybridization conditions, the level of hybridization to the site in the array corresponding to any particular gene will reflect the prevalence in the cell of mRNA transcribed from that gene/biomarker. For example, when detectably labelled (e.g., with a fluorophore) cDNA or cRNA complementary to the total cellular mRNA is hybridized to a microarray, the site on the array corresponding to a gene (i.e., capable of specifically binding the product of the gene) that is not transcribed in the cell will have little or no signal (e.g., fluorescent signal), and a gene for which the encoded mRNA is prevalent will have a relatively strong signal. Nucleic acid hybridization and wash conditions are chosen so that the probe "specifically binds" or "specifically hybridizes" to a specific array site, i.e., the probe hybridizes, duplexes or binds to a sequence array site with a complementary nucleic acid sequence but does not hybridize to a site with a non-complementary nucleic acid sequence. As used herein, one polynucleotide sequence is considered complementary to another when, if the shorter of the polynucleotides is less than or equal to 25 bases, there are no mismatches using standard base-pairing rules or, if the shorter of the polynucleotides is longer than 25 bases, there is no more than a 5% mismatch. Preferably, the polynucleotides are perfectly complementary (no mismatches). It can be demonstrated that specific hybridization conditions result in specific hybridization by carrying out a hybridization assay including negative controls using routine experimentation.
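The definition of complementarity given above (no mismatches when the shorter polynucleotide is 25 bases or fewer; no more than 5% mismatch when it is longer) can be expressed as a simple check. The following is a sketch under stated assumptions: sequences are DNA written 5' to 3' using only A, C, G and T, the shorter sequence is compared without gaps against each equal-length window of the longer sequence, and the function names are hypothetical.

```python
_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(_PAIR[base] for base in reversed(seq.upper()))


def count_mismatches(shorter, window):
    """Count positions at which two antiparallel strands fail to base-pair."""
    # Reverse the second strand so that facing bases are compared position by position.
    return sum(1 for a, b in zip(shorter, reversed(window)) if _PAIR.get(a) != b)


def is_complementary(seq_a, seq_b):
    """Apply the 25-base / 5% mismatch rule to two sequences."""
    shorter, longer = sorted((seq_a.upper(), seq_b.upper()), key=len)
    # Integer arithmetic: mismatches / length <= 5% is the same as mismatches <= length // 20.
    allowed = 0 if len(shorter) <= 25 else len(shorter) // 20
    for start in range(len(longer) - len(shorter) + 1):
        window = longer[start:start + len(shorter)]
        if count_mismatches(shorter, window) <= allowed:
            return True
    return False


probe = "ATGCATGCATGC"
print(is_complementary(probe, reverse_complement(probe)))  # perfectly paired 12-mer
```

The gap-free sliding window is a simplification chosen for illustration; it does not model hybridization thermodynamics or the wash conditions discussed in this section.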

Optimal hybridization conditions will depend on the length (e.g., oligomer vs. polynucleotide greater than 200 bases) and type (e.g., RNA, DNA, PNA) of labelled probe and immobilized polynucleotide or oligonucleotide. General parameters for specific (i.e., stringent) hybridization conditions for nucleic acids are described in Sambrook et al., supra, and in Ausubel et al., "Current Protocols in Molecular Biology", Greene Publishing and Wiley-Interscience, NY (1987), which is incorporated in its entirety for all purposes. When cDNA microarrays are used, typical hybridization conditions are hybridization in 5xSSC plus 0.2% SDS at 65°C for 4 hours followed by washes at 25°C in low stringency wash buffer (1xSSC plus 0.2% SDS) followed by 10 minutes at 25°C in high stringency wash buffer (0.1xSSC plus 0.2% SDS) (see Schena et al., Proc. Natl. Acad. Sci. USA, Vol. 93, p. 10614 (1996)). Useful hybridization conditions are also provided in, e.g., Tijssen, "Hybridization With Nucleic Acid Probes", Elsevier Science Publishers B.V. (1993) and Kricka, "Nonisotopic DNA Probe Techniques", Academic Press, San Diego, Calif. (1992).

Microarray platforms include those manufactured by companies such as Affymetrix, Illumina and Agilent. Examples of microarray platforms manufactured by Affymetrix include the U133 Plus2 array, the Almac proprietary Xcel™ array and the Almac proprietary Cancer DSAs®, including the Breast Cancer DSA® and Prostate Cancer DSA®.

c) Immunoassay methods

Immunoassay methods are based on the reaction of an antibody to its corresponding target or analyte and can detect the analyte in a sample depending on the specific assay format. To improve specificity and sensitivity of an assay method based on immunoreactivity, monoclonal antibodies are often used because of their specific epitope recognition.

Polyclonal antibodies have also been successfully used in various immunoassays because of their increased affinity for the target as compared to monoclonal antibodies. Immunoassays have been designed for use with a wide range of biological sample matrices. Immunoassay formats have been designed to provide qualitative, semi-quantitative, and quantitative results.

Quantitative results may be generated through the use of a standard curve created with known concentrations of the specific analyte to be detected. The response or signal from an unknown sample is plotted onto the standard curve, and a quantity or value corresponding to the target in the unknown sample is established.
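Reading an unknown sample off an immunoassay standard curve, as described above, might be sketched as follows. This is an illustration under assumptions: the curve is approximated by straight-line interpolation between neighbouring standards (a fitted sigmoidal model is another common choice and is not shown), the signal rises monotonically with concentration, and the concentrations, signal values and function name are hypothetical.

```python
def interpolate_concentration(signal, standards):
    """Estimate analyte concentration from a signal by piecewise-linear interpolation.

    `standards` is a list of (known_concentration, measured_signal) pairs.
    """
    points = sorted(standards, key=lambda p: p[1])
    if not points[0][1] <= signal <= points[-1][1]:
        # Extrapolating beyond the standards is not attempted in this sketch.
        raise ValueError("signal outside the range of the standard curve")
    for (c_lo, s_lo), (c_hi, s_hi) in zip(points, points[1:]):
        if s_lo <= signal <= s_hi:
            if s_hi == s_lo:
                return c_lo
            fraction = (signal - s_lo) / (s_hi - s_lo)
            return c_lo + fraction * (c_hi - c_lo)
    return points[-1][0]


# Hypothetical standards: (concentration in ng/mL, absorbance).
standards = [(0.0, 0.05), (10.0, 0.25), (50.0, 0.95), (100.0, 1.60)]
estimate = interpolate_concentration(0.60, standards)
print(f"estimated concentration: {estimate:.1f} ng/mL")
```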

Numerous immunoassay formats have been designed. ELISA or EIA can be quantitative for the detection of an analyte/biomarker. This method relies on attachment of a label to either the analyte or the antibody and the label component includes, either directly or indirectly, an enzyme. ELISA tests may be formatted for direct, indirect, competitive, or sandwich detection of the analyte. Other methods rely on labels such as, for example, radioisotopes (I-125) or fluorescence. Additional techniques include, for example, agglutination, nephelometry, turbidimetry, Western blot, immunoprecipitation, immunocytochemistry, immunohistochemistry, flow cytometry, Luminex assay, and others (see ImmunoAssay: A Practical Guide, edited by Brian Law, published by Taylor & Francis, Ltd., 2005 edition). Exemplary assay formats include enzyme-linked immunosorbent assay (ELISA), radioimmunoassay, fluorescent, chemiluminescence, and fluorescence resonance energy transfer (FRET) or time resolved-FRET (TR-FRET) immunoassays. Examples of procedures for detecting biomarkers include biomarker immunoprecipitation followed by quantitative methods that allow size and peptide level discrimination, such as gel electrophoresis, capillary electrophoresis, planar electrochromatography, and the like.

Methods of detecting and/or quantifying a detectable label or signal generating material depend on the nature of the label. The products of reactions catalysed by appropriate enzymes (where the detectable label is an enzyme; see above) can be, without limitation, fluorescent, luminescent, or radioactive or they may absorb visible or ultraviolet light. Examples of detectors suitable for detecting such detectable labels include, without limitation, x-ray film, radioactivity counters, scintillation counters, spectrophotometers, colorimeters, fluorimeters, luminometers, and densitometers.

Any of the methods for detection can be performed in any format that allows for any suitable preparation, processing, and analysis of the reactions. This can be, for example, in multi-well assay plates (e.g., 96 wells or 384 wells) or using any suitable array or microarray. Stock solutions for various agents can be made manually or robotically, and all subsequent pipetting, diluting, mixing, distribution, washing, incubating, sample readout, data collection and analysis can be done robotically using commercially available analysis software, robotics, and detection instrumentation capable of detecting a detectable label.

d) Sequencing

Gene expression may also be determined using sequencing methods, which include the various next generation sequencing technologies. In specific embodiments RNAseq may be utilized.

Clinical Uses

In some embodiments, methods are provided for identifying and/or selecting a prostate cancer patient who is responsive to a therapeutic regimen, or where a particular class of therapeutic agent is contraindicated. In particular, the methods are directed to identifying or selecting a prostate cancer patient who is responsive to a mitotic inhibitor or who is non-responsive to a mitotic inhibitor. Such methods are based upon identifying whether the patient has a prostate cancer with a deficiency in DNA damage repair and/or displaying immune activation (to abnormal DNA). If the patient has a prostate cancer of this type, a mitotic inhibitor should not be administered. These methods typically include determining the level of expression of a collection of predictive markers in a patient's tumour (primary, metastatic or other derivatives from the tumour such as, but not limited to, blood, or components in blood, urine, saliva and other bodily fluids)(e.g., a patient's cancer cells), comparing the level of expression to a reference expression level, and identifying whether expression in the sample includes a pattern or profile of expression of a selected predictive biomarker or biomarker set which corresponds to response or non-response to the therapeutic agent.

In some embodiments a method of predicting responsiveness of an individual having prostate cancer to treatment with a mitotic inhibitor or DNA-damaging therapeutic agent comprises:

a. measuring expression levels of one or more biomarkers in a test sample obtained from the individual, wherein the one or more biomarkers are selected from Tables 1-45;

b. deriving a test score that captures the expression levels;

c. providing a threshold score comprising information correlating the test score and responsiveness;

d. comparing the test score to the threshold score; wherein non-responsiveness to the mitotic inhibitor is predicted when the test score exceeds the threshold score, and wherein responsiveness to the DNA-damaging therapeutic agent is predicted when the test score exceeds the threshold score.
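The comparison in step d can be sketched as a small decision function. This is illustrative only: the function name and the returned labels are hypothetical, and a test score exactly equal to the threshold is treated here as not exceeding it, which is an assumption. The predictions for a score that does not exceed the threshold follow the converse relationships stated elsewhere in this specification.

```python
def predict_responsiveness(test_score, threshold_score):
    """Map a test score and a threshold score to predicted responsiveness."""
    # A score above the threshold indicates the DNA damage repair deficient /
    # immune activation subtype.
    subtype_positive = test_score > threshold_score
    return {
        "subtype_positive": subtype_positive,
        "mitotic_inhibitor": "non-responsive" if subtype_positive else "responsive",
        "dna_damaging_agent": "responsive" if subtype_positive else "non-responsive",
    }


# Hypothetical scores for two samples, compared with a hypothetical threshold.
for score in (0.52, 0.11):
    print(score, predict_responsiveness(score, threshold_score=0.30))
```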

In specific embodiments, the one or more biomarkers are selected from the group consisting of CXCL10, MX1, IDO1, IFI44L, CD2, GBP5, PRAME, ITGAL, LRP4, and APOL3. One of ordinary skill in the art can determine an appropriate threshold score, and appropriate biomarker weightings, using the teachings provided herein including the teachings of the Example.

In other embodiments, the one or more biomarkers are selected from the group consisting of CXCL10, MX1, IDO1, IFI44L, CD2, GBP5, PRAME, ITGAL, LRP4, APOL3, CDR1, FYB, TSPAN7, RAC2, KLHDC7B, GRB14, AC138128.1, KIF26A, CD274, CD109, ETV7, MFAP5, OLFM4, PI15, FOSB, FAM19A5, NLRC5, PRICKLE1, EGR1, CLDN10, ADAMTS4, SP140L, ANXA1, RSAD2, ESR1, IKZF3, OR2I1P, EGFR, NAT1, LATS2, CYP2B6, PTPRC, PPP1R1A, and AL137218.1.

Tables 2A and 2B provide exemplary gene signatures (or gene classifiers) wherein the biomarkers consist of 40 or 44 of the gene products listed therein, respectively, and wherein a threshold score is derived from the individual gene product weightings listed therein. In one of these embodiments, wherein the biomarkers consist of the 44 gene products listed in Table 2B and the biomarkers are associated with the weightings provided in Table 2B, a test score that exceeds a threshold score, such as a threshold score of 0.3681, is used to indicate a tumour of the relevant subtype.
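One way a test score could be derived from per-gene weightings is a weighted sum of expression values, sketched below. The linear form, the optional bias term and the function name are assumptions made for illustration. The three weights and expression values are placeholders and are not the weightings of Table 2B; only the threshold of 0.3681 is taken from the text above.

```python
def signature_score(expression, weights, bias=0.0):
    """Weighted sum of expression values over the genes in the signature."""
    missing = [gene for gene in weights if gene not in expression]
    if missing:
        raise KeyError(f"expression values missing for: {missing}")
    return bias + sum(weights[gene] * expression[gene] for gene in weights)


THRESHOLD = 0.3681  # exemplary threshold score quoted in the text

# Placeholder weights and expression values (NOT taken from Table 2B).
weights = {"CXCL10": 0.02, "MX1": 0.03, "CD2": 0.01}
expression = {"CXCL10": 8.0, "MX1": 6.0, "CD2": 5.0}

score = signature_score(expression, weights)
print(f"score={score:.4f}, exceeds threshold: {score > THRESHOLD}")
```

In practice the expression values would be pre-processed in the same way as the data used to derive the weightings, since the threshold is only meaningful on that scale.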

A cancer is "responsive" to a therapeutic agent if its rate of growth is inhibited as a result of contact with the therapeutic agent, compared to its growth in the absence of contact with the therapeutic agent. Growth of a cancer can be measured in a variety of ways, for instance, the size of a tumour or the expression of tumour markers appropriate for that tumour type may be measured.

A cancer is "non-responsive" to a therapeutic agent if its rate of growth is not inhibited as a result of contact with the therapeutic agent when compared to its growth in the absence of contact with the therapeutic agent. As stated above, growth of a cancer can be measured in a variety of ways, for instance, the size of a tumour or the expression of tumour markers appropriate for that tumour type may be measured. The quality of being non-responsive to a therapeutic agent is a highly variable one, with different cancers exhibiting different levels of "non-responsiveness" to a given therapeutic agent, under different conditions. Still further, measures of non-responsiveness can be assessed using additional criteria beyond growth size of a tumour, including patient quality of life, degree of metastases, etc.

An application of this test will predict end points including, but not limited to, overall survival, progression free survival, radiological response as defined by RECIST (complete response, partial response, stable disease) and serological markers such as, but not limited to, PSA, CEA, CA125, CA15-3 and CA19-9. In specific embodiments this invention can be used to evaluate standard chest roentgenography, computed tomography (CT), perfusion CT, dynamic contrast material-enhanced magnetic resonance (MR), diffusion-weighted (DW) MR or positron emission tomography (PET) with the glucose analogue fluorine 18 fluorodeoxyglucose (FDG) (FDG-PET) response to therapy.

Array or non-array based methods for detection, quantification and qualification of RNA, DNA or protein within a sample of one or more nucleic acids or their biological derivatives such as encoded proteins may be employed, including quantitative PCR (QPCR), enzyme-linked immunosorbent assay (ELISA) or immunohistochemistry (IHC) and the like.

After obtaining an expression profile from a sample being assayed, the expression profile is compared with a reference or control profile to make a diagnosis regarding the therapy responsive phenotype of the cell or tissue, and therefore host, from which the sample was obtained. The terms "reference" and "control" as used herein in relation to an expression profile mean a standardized pattern of gene or gene product expression or levels of expression of certain biomarkers to be used to interpret the expression classifier of a given patient and assign a prognostic or predictive class. The reference or control expression profile may be a profile that is obtained from a sample known to have the desired phenotype, e.g., responsive phenotype, and therefore may be a positive reference or control profile. In addition, the reference profile may be from a sample known to not have the desired phenotype, and therefore be a negative reference profile.

If quantitative PCR is employed as the method of quantitating the levels of one or more nucleic acids, this method may quantify the PCR product accumulation through measurement of fluorescence released by a dual-labelled fluorogenic probe (e.g. a TaqMan® probe or a molecular beacon or FRET/Light Cycler probes). Some methods may not require a separate probe, such as the Scorpion and Amplifluor systems where the probes are built into the primers. In certain embodiments, the obtained expression profile is compared to a single reference profile to obtain information regarding the phenotype of the sample being assayed. In yet other embodiments, the obtained expression profile is compared to two or more different reference profiles to obtain more in depth information regarding the phenotype of the assayed sample. For example, the obtained expression profile may be compared to a positive and negative reference profile to obtain confirmed information regarding whether the sample has the phenotype of interest.

The comparison of the obtained expression profile and the one or more reference profiles may be performed using any convenient methodology, where a variety of methodologies are known to those of skill in the array art, e.g., by comparing digital images of the expression profiles, by comparing databases of expression data, etc. Patents describing ways of comparing expression profiles include, but are not limited to, U.S. Pat. Nos. 6,308,170 and 6,228,575, the disclosures of which are herein incorporated by reference. Methods of comparing expression profiles are also described above. The comparison step results in information regarding how similar or dissimilar the obtained expression profile is to the one or more reference profiles, which similarity information is employed to determine the phenotype of the sample being assayed. For example, similarity with a positive control indicates that the assayed sample has a responsive phenotype similar to the responsive reference sample. Likewise, similarity with a negative control indicates that the assayed sample has a non-responsive phenotype similar to the non-responsive reference sample.
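By way of illustration only, the comparison against a positive and a negative reference profile might be sketched as follows. The use of Pearson correlation as the similarity measure, the function names and the expression values are all assumptions made for this sketch; the description does not prescribe a particular similarity metric.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def closest_reference(sample, positive_ref, negative_ref):
    """Label the sample by the reference profile it resembles more closely.

    Ties are resolved as "non-responsive"; this is an arbitrary choice
    made for the sketch.
    """
    r_pos = pearson(sample, positive_ref)
    r_neg = pearson(sample, negative_ref)
    return "responsive" if r_pos > r_neg else "non-responsive"


# Hypothetical log-scale expression values for four biomarkers.
positive_ref = [8.0, 7.5, 3.0, 2.5]   # sample known to be responsive
negative_ref = [3.0, 2.5, 8.0, 7.5]   # sample known to be non-responsive
sample = [7.6, 7.9, 3.4, 2.2]

print(closest_reference(sample, positive_ref, negative_ref))
```

In practice any similarity or distance measure suited to the assay platform could be substituted for the correlation used here.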

The level of expression of a biomarker can be further compared to different reference expression levels. For example, a reference expression level can be a predetermined standard reference level of expression in order to evaluate if expression of a biomarker or biomarker set is informative and make an assessment for determining whether the patient is responsive or non-responsive.

Additionally, the level of expression of a biomarker can be compared to the level of expression of an internal reference marker, measured at the same time as the biomarker, in order to determine whether the patient is responsive or non-responsive. For example, expression of a distinct marker panel which is not comprised of biomarkers of the invention, but which is known to demonstrate a constant expression level, can be assessed as an internal reference marker level, and the level of the biomarker expression is determined as compared to the reference. In an alternative example, expression of the selected biomarkers in a tissue sample which is a non-tumour sample can be assessed as an internal reference marker level. The level of expression of a biomarker may be determined as having increased expression in certain aspects. The level of expression of a biomarker may be determined as having decreased expression in other aspects. The level of expression may be determined as showing no informative change in expression as compared to a reference level. In still other aspects, the level of expression is determined against a pre-determined standard expression level as determined by the methods provided herein.
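As a non-limiting sketch of how an internal reference marker might be used with quantitative PCR data, the widely used comparative Ct calculation is shown below. The choice of this calculation, the assumption of 100% amplification efficiency, the two-fold margin and all Ct values are assumptions made for illustration and are not taken from the description.

```python
def delta_ct(target_ct, reference_cts):
    """Normalise a biomarker Ct against the mean Ct of the reference markers."""
    return target_ct - sum(reference_cts) / len(reference_cts)


def fold_change(sample_dct, control_dct):
    """Relative expression of sample versus control (2 ** -ddCt).

    Assumes the amount of product doubles on every cycle.
    """
    return 2 ** -(sample_dct - control_dct)


def call_change(fc, margin=2.0):
    """Classify a fold change; the two-fold margin is a hypothetical cut-off."""
    if fc >= margin:
        return "increased"
    if fc <= 1 / margin:
        return "decreased"
    return "no informative change"


# Hypothetical Ct values: one biomarker and two constant-expression markers.
tumour_dct = delta_ct(24.0, [20.0, 22.0])       # 24.0 - 21.0 = 3.0
non_tumour_dct = delta_ct(27.0, [20.5, 21.5])   # 27.0 - 21.0 = 6.0

fc = fold_change(tumour_dct, non_tumour_dct)    # 2 ** 3 = 8.0
print(fc, call_change(fc))
```

Here the non-tumour sample plays the role of the reference level, and the three outcomes of `call_change` correspond to increased expression, decreased expression and no informative change.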

The invention is also related to guiding conventional treatment of patients. Patients whom the diagnostic test identifies as responders to drugs of the classes that directly or indirectly affect DNA damage and/or DNA damage repair can be administered that therapy, and both patient and oncologist can be confident that the patient will benefit. Patients that are designated non-responders by the diagnostic test can be identified for alternative therapies which are more likely to offer benefit to them.

The invention further relates to selecting patients for clinical trials in which novel drugs for treating prostate cancer, such as mitotic inhibitors and drugs of the classes that directly or indirectly affect DNA damage and/or DNA damage repair, are tested. Enrichment of trial populations with potential responders will facilitate a more thorough evaluation of that drug under relevant criteria.

The invention still further relates to methods of diagnosing patients as having or being susceptible to developing prostate cancer associated with a deficiency in DNA damage repair (DDRD) and/or displaying immune activation (to abnormal DNA). DDRD is defined herein as any condition wherein a cell or cells of the patient have a reduced ability to repair DNA damage, which reduced ability is a causative factor in the development or growth of a tumour. The DDRD diagnosis may be associated with a mutation in the Fanconi anemia/BRCA pathway. The methods of diagnosing an individual having prostate cancer may comprise:

b. deriving a test score that captures the expression levels;

c. providing a threshold score comprising information correlating the test score and diagnosis of prostate cancer; and

d. comparing the test score to the threshold score; wherein the individual is determined to have a prostate cancer with a deficiency in DNA damage repair and/or displaying immune activation (to abnormal DNA) when the test score exceeds the threshold score.

The one or more biomarkers may be selected from the group consisting of CXCL10, MX1, IDO1, IFI44L, CD2, GBP5, PRAME, ITGAL, LRP4, and APOL3. One of ordinary skill in the art can determine an appropriate threshold score, and appropriate biomarker weightings, using the teachings provided herein including the teachings of the Example.

In other embodiments, the one or more biomarkers are selected from the group consisting of CXCL10, MX1, IDO1, IFI44L, CD2, GBP5, PRAME, ITGAL, LRP4, APOL3, CDR1, FYB, TSPAN7, RAC2, KLHDC7B, GRB14, AC138128.1, KIF26A, CD274, CD109, ETV7, MFAP5, OLFM4, PI15, FOSB, FAM19A5, NLRC5, PRICKLE1, EGR1, CLDN10, ADAMTS4, SP140L, ANXA1, RSAD2, ESR1, IKZF3, OR2I1P, EGFR, NAT1, LATS2, CYP2B6, PTPRC, PPP1R1A, and AL137218.1. Tables 2A and 2B provide exemplary gene signatures (or gene classifiers) wherein the biomarkers consist of 40 or 44 of the gene products listed therein, respectively, and wherein a threshold score is derived from the individual gene product weightings listed therein. In one of these embodiments, wherein the biomarkers consist of the 44 gene products listed in Table 2B and the biomarkers are associated with the weightings provided in Table 2B, a test score that exceeds a threshold score, such as a threshold score of 0.3681, indicates a diagnosis of prostate cancer with a deficiency in DNA damage repair and/or displaying immune activation (to abnormal DNA).

The following examples are offered by way of illustration and not by way of limitation.

EXAMPLES

Through analysis of a prostate cancer dataset we found that molecular cluster 2 (Figure 1) was defined by overexpression/activation of immune-related gene processes (Figure 2). Analysis of these gene processes highlighted similarity to those which defined the DDRD group first identified in breast cancer (WO 2012/037378; Mulligan et al 2014). We subsequently applied the DDRD test (WO 2012/037378) and found that molecular cluster 2 demonstrated significantly higher DDRD test scores than the other 3 groups (Figure 3). We defined this group as the DDRD group in prostate cancer. We have previously demonstrated that the DDRD group in breast cancer is associated with tumours that have mutations in BRCA1 or BRCA2 and thereby have loss of the Fanconi Anemia (FA)/BRCA pathway. To verify that the DDRD group in prostate cancer is associated with tumours which have mutations in DNA repair genes, we performed analysis of gene expression and mutational data from prostate cancer samples from The Cancer Genome Atlas (TCGA). DDRD test scores were calculated from gene expression data, and samples with mutations in BRCA1/2 and/or ATM were identified from the mutational information. Higher DDRD scores were associated with samples with a mutation in BRCA1/2 and/or ATM (Figure 4).

It has been recognized that males with inherited mutations in BRCA are not only at increased risk of prostate cancer but may also have worse outcomes than non-carriers (Castro et al 2015). We assessed the prognostic performance of the DDRD test in the Taylor dataset (Taylor et al 2010), a dataset of localized prostate cancer samples from patients treated with radical prostatectomy. The DDRD test score was generated from the gene expression data, and samples were dichotomized as biomarker positive or negative based on the median test score. Kaplan-Meier analysis was performed for the time to disease recurrence (Prostate Specific Antigen (PSA) defined); patients who did not experience relapse were censored. A positive DDRD test was significantly associated with increased risk of disease recurrence (hazard ratio 2.09, p-value 0.037) (Figure 5).
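The dichotomization and Kaplan-Meier steps described above can be sketched as follows. This is a minimal illustration on made-up data, not the analysis that produced Figure 5; treating scores strictly above the median as biomarker positive is an assumption, and the hazard ratio, which would require a separate model such as Cox regression, is not computed here.

```python
from statistics import median


def dichotomise(scores):
    """True (biomarker positive) for scores strictly above the median."""
    cut = median(scores)
    return [s > cut for s in scores]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, recurrence-free fraction).

    events[i] is 1 if recurrence was observed at times[i] and 0 if the
    patient was censored (no relapse by last follow-up).
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surviving = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        recurrences = 0  # events observed at time t
        leaving = 0      # everyone leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            recurrences += data[i][1]
            leaving += 1
            i += 1
        if recurrences:
            surviving *= 1 - recurrences / at_risk
            curve.append((t, surviving))
        at_risk -= leaving
    return curve


# Hypothetical follow-up times in months and recurrence indicators.
times = [5, 8, 12, 12, 20]
events = [1, 0, 1, 1, 0]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

In an analysis of the kind described, one curve would be estimated for the biomarker-positive group and one for the biomarker-negative group, and the two compared.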

We tested the ability of the DDRD assay to stratify patient response to taxanes. To assess whether the DDRD assay had utility for the prediction of non-response to docetaxel in prostate cancer, we generated a gene expression dataset from 52 mCRPC patients who received docetaxel. Patients who were assay positive demonstrated significantly worse survival when treated with docetaxel (HR 1.76, p-value 0.3117) (Figure 6).

Summary

These data demonstrate that we have identified the DDRD molecular subtype through an unbiased analysis in prostate cancer, the DDRD test accurately identifies this group, and the group is associated with tumours with mutations in BRCA1/2 and/or ATM. The DDRD test identifies a group of primary prostate cancers at increased risk of recurrence following surgery.

The present invention is not to be limited in scope by the specific embodiments described herein. Indeed, various modifications of the invention in addition to those described herein will become apparent to those skilled in the art from the foregoing description and accompanying figures. Such modifications are intended to fall within the scope of the appended claims. Moreover, all embodiments described herein are considered to be broadly applicable and combinable with any and all other consistent embodiments, as appropriate.

Various publications are cited herein, the disclosures of which are incorporated by reference in their entireties.

Claims

CLAIMS:

1. A method of predicting responsiveness of a subject having a prostate cancer to a mitotic inhibitor and/or a DNA damaging therapeutic agent comprising:

2. The method of claim 1 wherein the measured expression levels are used by generating a test score derived from the measured expression levels.

3. The method of claim 1 or 2 which comprises steps of:

a. deriving a test score that captures the expression levels;

b. providing a threshold score comprising information correlating the test score and responsiveness; and

c. comparing the test score to the threshold score; wherein:

i. responsiveness to a mitotic inhibitor is predicted when the test score does not exceed the threshold score; or

ii. non-responsiveness to a mitotic inhibitor is predicted when the test score exceeds the threshold score; and/or

iii. responsiveness to a DNA damaging therapeutic agent is predicted when the test score exceeds the threshold score; or

iv. non-responsiveness to a DNA damaging therapeutic agent is predicted when the test score does not exceed the threshold score.

4. A method of predicting outcome of treatment of a subject having a prostate cancer with a mitotic inhibitor and/or a DNA damaging therapeutic agent comprising:

a. measuring expression levels of at least one gene selected from Table 1-45 in a sample from the subject;

b. using the measured expression levels to determine whether the prostate cancer has a deficiency in DNA damage repair and/or displays immune activation (to abnormal DNA); wherein:

i. if the prostate cancer does not have a deficiency in DNA damage repair and/or does not display immune activation (to abnormal DNA) an improved outcome of treatment with a mitotic inhibitor is predicted; or

5. The method of claim 4 wherein the measured expression levels are used by generating a test score derived from the measured expression levels.

6. The method of claim 4 or 5 which comprises steps of:

a. deriving a test score that captures the expression levels;

b. providing a threshold score comprising information correlating the test score and predicted outcome of treatment; and

c. comparing the test score to the threshold score; wherein:

i. an improved outcome of treatment with a mitotic inhibitor is predicted when the test score does not exceed the threshold score; or

ii. a poorer outcome of treatment with a mitotic inhibitor is predicted when the test score exceeds the threshold score; and/or

iii. an improved outcome of treatment with a DNA damaging therapeutic agent is predicted when the test score exceeds the threshold score; or

iv. a poorer outcome of treatment with a DNA damaging therapeutic agent is predicted when the test score does not exceed the threshold score.

7. A method of selecting an appropriate therapy to treat a subject having a prostate cancer comprising:

8. The method of claim 7 wherein the measured expression levels are used by generating a test score derived from the measured expression levels.

9. The method of claim 7 or 8 which comprises steps of:

a. deriving a test score that captures the expression levels;

b. providing a threshold score comprising information correlating the test score and treatment selection; and

c. comparing the test score to the threshold score; wherein:

i. a mitotic inhibitor is selected for treatment when the test score does not exceed the threshold score; or

ii. a mitotic inhibitor is not selected for treatment when the test score exceeds the threshold score; and/or

iii. a DNA damaging therapeutic agent is selected for treatment when the test score exceeds the threshold score; or

iv. a DNA damaging therapeutic agent is not selected for treatment when the test score does not exceed the threshold score.

10. (a) A method of treating a subject having a prostate cancer comprising administering a mitotic inhibitor to the subject, wherein the subject is predicted to be responsive to the mitotic inhibitor, or the therapy is selected, on the basis of measured expression levels of at least one gene selected from Table 1-45 in a sample from the subject; or

(b) A mitotic inhibitor for use in a method of treating a subject having a prostate cancer, wherein the subject is predicted to be responsive to the mitotic inhibitor, or the therapy is selected, on the basis of measured expression levels of at least one gene selected from Table 1-45 in a sample from the subject.

11. (a) A method of treating a subject having a prostate cancer comprising administering a DNA damaging therapeutic agent to the subject, wherein the subject is predicted to be responsive to the DNA damaging therapeutic agent, or the therapy is selected, on the basis of measured expression levels of at least one gene selected from Table 1-45 in a sample from the subject; or

(b) A DNA damaging therapeutic agent for use in a method of treating a subject having a prostate cancer, wherein the subject is predicted to be responsive to the DNA damaging therapeutic agent, or the therapy is selected, on the basis of measured expression levels of at least one gene selected from Table 1-45 in a sample from the subject.

12. The method or use of claim 10 or 11 wherein the subject is selected for treatment according to a method of any one of claims 1 to 9.

13. The method of claim 10 or 11 wherein the measured expression levels are used by generating a test score derived from the measured expression levels.

14. The method of claim 10, 11 or 13 which comprises steps of:

a. deriving a test score that captures the expression levels;

c. comparing the test score to the threshold score; wherein:

i. a mitotic inhibitor is used to treat the subject when the test score does not exceed the threshold score; or

ii. a mitotic inhibitor is not used to treat the subject when the test score exceeds the threshold score; and/or

iii. a DNA damaging therapeutic agent is used to treat the subject when the test score exceeds the threshold score; or

iv. a DNA damaging therapeutic agent is not used to treat the subject when the test score does not exceed the threshold score.

15. The method of any preceding claim wherein the prostate cancer is metastatic prostate cancer.

16. The method of any preceding claim, wherein the at least one gene is selected from the group consisting of

a. CXCL10, MX1, IDO1, IFI44L, CD2, GBP5, PRAME, ITGAL, LRP4, and APOL3; and/or

b. CDR1, FYB, TSPAN7, RAC2, KLHDC7B, GRB14, AC138128.1, KIF26A, CD274, CD109, ETV7, MFAP5, OLFM4, PI15, FOSB, FAM19A5, NLRC5, PRICKLE1, EGR1, CLDN10, ADAMTS4, SP140L, ANXA1, RSAD2, ESR1, IKZF3, OR2I1P, EGFR, NAT1, LATS2, CYP2B6, PTPRC, PPP1R1A and AL137218.1.

17. The method of claim 16, comprising measuring the expression level of all of the genes in group a and/or all of the genes in group b.

18. The method of any one of claims 1 to 17, comprising measuring the expression level of at least 10 of the genes from Table 2B.

19. The method of any preceding claim, wherein the mitotic inhibitor comprises a vinca alkaloid and/or a taxane.

20. The method of claim 19, wherein the vinca alkaloid is vinorelbine.

21. The method of claim 19, wherein the taxane is docetaxel or paclitaxel.

22. The method of any preceding claim, wherein the DNA-damaging therapeutic agent comprises one or more substances selected from the group consisting of: a DNA damaging agent, a DNA repair targeted therapy, an inhibitor of DNA damage signalling, an inhibitor of DNA damage induced cell cycle arrest, a histone deacetylase inhibitor, a heat shock protein inhibitor and an inhibitor of DNA synthesis.

23. The method of any preceding claim, wherein the DNA-damaging therapeutic agent comprises one or more of a platinum-containing agent, a nucleoside analogue such as gemcitabine or 5-fluorouracil or a prodrug thereof such as capecitabine, an anthracycline such as epirubicin or doxorubicin, an alkylating agent such as cyclophosphamide, an ionising radiation or a combination of radiation and chemotherapy (chemoradiation).

24. The method of any preceding claim, wherein the DNA-damaging therapeutic agent comprises a platinum-containing agent.

25. The method of claim 24, wherein the platinum based agent is selected from cisplatin, carboplatin and oxaliplatin.

26. The method of any preceding claim wherein the DNA damaging therapeutic agent comprises a PARP inhibitor.

27. The method of any preceding claim wherein the therapy is adjuvant treatment and/or neoadjuvant treatment.

28. A method of predicting recurrence of prostate cancer in a subject and/or identifying a prostate cancer likely to recur comprising:

a. measuring expression levels of at least one gene selected from Table 1-45 in a sample from the subject;

29. The method of claim 28 wherein the measured expression levels are used by generating a test score derived from the measured expression levels.

30. The method of claim 28 or 29 which comprises steps of:

a. deriving a test score that captures the expression levels;

b. providing a threshold score comprising information correlating the test score and likelihood of recurrence; and

c. comparing the test score to the threshold score; wherein:

i. a high likelihood of recurrence is predicted and/or a prostate cancer likely to recur is identified when the test score exceeds the threshold score; or

ii. a lower likelihood of recurrence is predicted and/or a prostate cancer less likely to recur is identified when the test score does not exceed the threshold score.

31. A method according to any preceding claim wherein measuring the expression level comprises the use of a probe comprising, consisting essentially of or consisting of the sequence of any one of SEQ ID NOs 1-1750.

32. A probe comprising, consisting essentially of or consisting of the sequence of any one of SEQ ID NOs 1-1750.

33. A nucleic acid molecule comprising, consisting essentially of, or consisting of the nucleotide sequence of any one of SEQ ID NOs 1751-3500.

34. A primer or probe that specifically hybridizes with the nucleic acid molecule of claim 33.

35. A kit for performing a method according to any one of claims 1 to 31 comprising a probe as defined in claim 32 or a primer or probe as defined in claim 34.