WO2023146389A1 - Gene signatures for classifying homologous recombination deficiency - Google Patents

Info

Publication number
WO2023146389A1
Authority
WO
WIPO (PCT)
Prior art keywords
hrd
genes
sample
low
subset
Prior art date
Application number
PCT/MY2023/050003
Other languages
French (fr)
Inventor
Jia Wern PAN
Soo Hwang Teo
Original Assignee
Cancer Research Malaysia
Application filed by Cancer Research Malaysia

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers
    • C12Q2600/16 Primer sets for multiplex assays

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Wood Science & Technology (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Hospice & Palliative Care (AREA)
  • Biophysics (AREA)
  • Oncology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention relates to the development of a method and tool to classify non-BRCA patient tumour samples as having high or low homologous recombination deficiency (HRD). The invention may have clinical utility to select for patients with high HRD who may benefit from treatment with therapies that target DNA repair pathways such as PARP inhibitors.

Description

GENE SIGNATURES FOR CLASSIFYING HOMOLOGOUS RECOMBINATION DEFICIENCY
FIELD OF INVENTION
The invention relates to a gene-expression-based method and tool to classify non-BRCA patient tumour samples as having high or low homologous recombination deficiency (HRD) in order to select for patients with high HRD for treatment with therapies that target DNA repair pathways.
BACKGROUND ART
Triple-negative breast cancer (TNBC), which tests negative for estrogen receptors, progesterone receptors, and excess HER2 protein, remains an area of unmet need, with fewer treatment options and higher mortality rates relative to other subtypes of breast cancer. In order to increase the number of therapeutic options for TNBC, a number of new targeted therapies have been investigated for use in the TNBC patient population. These include platinum chemotherapy, ATM, ATR, and DNA-PKcs inhibitors, as well as PARP inhibitor therapy.
Platinum salts are a class of chemotherapy drugs that have been in use for some time to treat various types of cancer. For breast cancer, the main drugs in this category are carboplatin, cisplatin, and oxaliplatin. Platinum salts work by causing crosslinking of DNA strands, leading to a high number of single-stranded and double-stranded breaks. In normal cells, these breaks are usually efficiently repaired by DNA repair mechanisms, but in cells with defective DNA repair mechanisms (such as cancer cells with HRD), these breaks accumulate, leading to cell death. Platinum salts have been shown in clinical studies to work significantly better in breast cancer patients who are germline carriers of deleterious variants in BRCA1 and BRCA2 (gBRCAm carriers), whose tumours tend to have defective homologous recombination repair pathways.
ATM (ataxia-telangiectasia, mutated), ATR (ataxia telangiectasia mutated and Rad3-related) and DNA-PKcs (DNA-dependent protein kinase) are all members of the phosphatidylinositol 3-kinase-related kinase (PIKK) family, and they all play a role in the detection of DNA double-stranded breaks and activation of repair mechanisms. In normal cells, there are three main pathways to repair damaged DNA: classical non-homologous end joining (C-NHEJ), alternative NHEJ (A-NHEJ), and homologous recombination (HR). ATM, ATR, and DNA-PKcs have distinct but somewhat overlapping roles in the initiation of all three pathways, particularly for C-NHEJ and HR. In normal cells, these mechanisms can generally compensate for each other, meaning that DNA repair can still proceed, although less efficiently, even when these protein kinases are inhibited. However, in cancer cells with pre-existing defects in their DNA repair mechanisms, inhibition of these proteins can lead to accumulation of damaged DNA and cell death. Small molecule inhibitors for all three proteins are currently in clinical trials as therapeutic agents for cancer. Given their mechanism of action, HRD may serve as a biomarker for the effective use of ATM, ATR, and DNA-PKcs inhibitors.
PARP inhibitors are a new class of drugs that inhibit the action of poly (ADP-ribose) polymerase (PARP) proteins. PARP inhibitors work by inducing double-stranded DNA breaks during cellular replication. In cells that have defects in the homologous recombination repair pathway, these double-stranded breaks cannot be efficiently repaired, leading to synthetic lethality and the death of the cell. Recently, PARP inhibitors have been approved for TNBC tumours arising in germline carriers of deleterious variants in BRCA1 and BRCA2 (gBRCAm carriers) [1,2], which tend to have defective homologous recombination repair pathways.
Other molecular alterations in the tumour that drive a defective homologous recombination process, giving rise to a “BRCAness” feature, have been proposed as a way to broaden the patient population eligible for PARP inhibitors [3]. However, the patient population and its molecular features remain to be clearly identified. In the breast cancer setting, the BRCAness feature may arise in tumours where the expression of BRCA1 or BRCA2 is repressed by hypermethylation or somatic mutation, or where the homologous recombination pathway is abrogated through mutations in other genes in the pathway (e.g., PALB2 and ATM), and ongoing clinical studies seek to expand the utility of PARP inhibitors in this context. In addition, transcriptional signatures and genomic mutational signatures have been generated that distinguish tumours with HRD, with some association with PARP sensitivity [4,5], and ongoing clinical studies seek to examine the utility of these methods as predictive biomarkers of PARP inhibitor treatment.
PARP inhibitors are a new class of therapeutic drugs that have significant life-saving potential for cancer patients. In triple-negative breast cancer, PARP inhibitors are currently only indicated for patients with germline or somatic mutations in BRCA1 or BRCA2. A laboratory test that is proven to be able to identify patients who do not have BRCA1 or BRCA2 mutations, but who may still benefit from PARP inhibitor therapy, would be highly valuable for patients, doctors, and the pharmaceutical industry.
Prior approaches such as HRDetect derive a mutational signature that may be indicative of defects in the homologous recombination pathway. Their drawback is that they require high-coverage genomic sequencing of tumours in order to detect sufficient numbers of mutations to derive that signature.
An aim of the invention is therefore to develop and validate a gene-expression-based classification method and tool in order to classify non-BRCA patient tumour samples as having high or low HRD in order to select for patients with high HRD who may benefit from treatment with PARP inhibitors.
SUMMARY OF INVENTION
In one aspect of the invention, there is provided a method for classifying homologous recombination deficiency (HRD) comprising the steps of: obtaining a tumour sample from a human subject; measuring the expression levels of at least two genes in the sample selected from a core set; and classifying the sample as HRD high or HRD low based on the expression levels of the genes; wherein a sample classified as HRD high is indicative of an increased likelihood of the tumour responding to therapies that target DNA repair pathways; characterized in that the core set of genes comprises a first subset of DLGAP1, FOXC1, MFSD6L, NCMAP, NDUFB4P11, NPFFR1, PLEKHB1, PNMA8A, SRSF12, TAFA3, and UGT8, a second subset of GABRP, MIA, OXGR1, RLBP1, ROCR, ROPN1, and SOX10, and a third subset of ABCC11, CLCA2, HGD, MUCL1, REEP6, SPINK8 and TFAP2B.
Surprisingly, the genes in the core set have not previously been associated with homologous recombination deficiency (i.e., with responsiveness to PARP inhibitors).
In one embodiment the human subject has cancer, typically triple-negative breast cancer (TNBC).
In one embodiment the expression levels are measured in vitro.
Typically, the expression levels of the core set of genes are measured based on the mRNA transcripts obtained from the tumour sample.
In one embodiment the method further comprises the step of measuring the expression levels of at least three, at least four, at least five, at least six, at least seven, at least eight, at least nine, at least ten, at least eleven, at least twelve, at least thirteen, at least fourteen, at least fifteen, at least sixteen, at least seventeen, at least eighteen, at least nineteen, at least twenty, at least twenty-one, at least twenty-two, at least twenty-three, at least twenty-four, or at least twenty-five genes in the sample selected from the core set. In another embodiment the expression levels of at least one gene selected from the first and second subsets, and at least one gene selected from the third subset, are measured in the sample.
In one embodiment the expression levels of at least one gene selected from the first subset, at least one gene selected from the second subset, and at least one gene selected from the third subset are measured in the sample.
In one embodiment the core set of genes are identified from differential gene expression analysis of training samples that are classified as HRD high or HRD low according to the proportion of somatic mutations in each tumour that correspond to a selected mutational signature.
In one embodiment the selected mutational signature is the COSMIC mutational signature SBS3 known to be driven by defects in the homologous recombination DNA repair pathway.
In one embodiment the core set of genes is utilised to classify a sample as HRD high or HRD low using a nearest shrunken centroid method, whereby the standardized nearest shrunken centroids for each class (HRD high and HRD low) in a training dataset are determined using a curated subset of tumour samples comprising HRD high and HRD low samples.
In one embodiment, classifying the sample as HRD high or HRD low requires finding the sample’s standardized squared distance to the shrunken centroid of each class (HRD high and HRD low), whereby the distance obtained is utilised to calculate the probability of the sample being assigned as HRD high or HRD low using Gaussian Discriminant Analysis.
In one embodiment the HRD high classification is associated with upregulation of genes from the first subset or second subset and downregulation of genes from the third subset. Similarly the HRD low classification is associated with downregulation of genes from the first subset or second subset and upregulation of genes from the third subset.
In one embodiment the sample is considered to be unambiguously classified when the probability of the sample being in one class (HRD high or HRD low) exceeds the other by 0.5.
In one embodiment the expression of the DLGAP1, FOXC1, MFSD6L, NCMAP, NDUFB4P11, NPFFR1, PLEKHB1, PNMA8A, SRSF12, TAFA3, UGT8, GABRP, MIA, OXGR1, RLBP1, ROCR, ROPN1 and SOX10 genes is positively associated with HRD high classification and negatively associated with HRD low classification, while the expression of the ABCC11, CLCA2, HGD, MUCL1, REEP6, SPINK8 and TFAP2B genes is positively associated with HRD low classification and negatively associated with HRD high classification.
In one embodiment the expression levels of one or more supplementary genes are measured in the sample, selected from SLC7A2, CROT, FAM184B, LY75, FSTL3, NDC80, IL12RB2, C1QTNF3, BIRC5, SLC7A8, CRAT, TDRD1, PTK6, TMC5, CPE, LIN7A, FOXM1, LDHB, GCNT2, TTK, IQCG, ROPN1B, MLPH, KYNU, DHCR24, BCL11A, OGFRL1, WIPF3, NRK, SPDEF, DEK, PODXL, PIMREG, FOXA1, ACTR3B, HMGCS2, TTC6, LMO4, NT5DC4, ABCA12, IL17RD, MS4A2, KCNE4, TCF7L1, CHODL, L3MBTL4, MPV17L, TSC22D3, MPP3, MXRA8, VANGL2, NOSTRIN, UCN, CMBL, TMEM63C, HACD1, ABTB2, BMERB1, LDHD, AR, GPRC5C, LRG1, KNDC1, SCUBE2, P2RY2, ARL4D, ZBTB42, OR7E22P, SYNM, EPHB3, GPR19, ADAMTSL5, WNK3, AMACR and ZNF286B.
In one embodiment the therapy is poly (ADP-ribose) polymerase inhibitor (PARPi).
In one embodiment the therapies can be selected from platinum chemotherapy or ATM, ATR, and DNA-PKcs inhibitors.
Typically, the subject does not carry a germline or somatic BRCA1 or BRCA2 mutation. Thus therapies that target DNA repair pathways can be used for patients with other types of HRD mutation.
Advantageously alternative machine learning methods can be utilised to classify the sample as HRD high or HRD low based on the expression levels of the genes from the core and supplementary gene sets. For example, gene expression data from the core and supplementary gene sets can be used to train random forest and support vector machine predictive models.
In a further aspect of the invention there is provided a tool for classifying homologous recombination deficiency (HRD) comprising: a panel of at least two genes selected from a core set of twenty-five genes; wherein the panel allows a tumour sample extracted from a human subject to be measured for the expression levels of the selected genes; wherein the sample is classified as HRD high or HRD low based on the expression levels of the genes; wherein a sample classified as HRD high is indicative of an increased likelihood of the tumour responding to therapies that target DNA repair pathways; characterized in that the core set of genes comprises a first subset of DLGAP1, FOXC1, MFSD6L, NCMAP, NDUFB4P11, NPFFR1, PLEKHB1, PNMA8A, SRSF12, TAFA3, and UGT8, a second subset of GABRP, MIA, OXGR1, RLBP1, ROCR, ROPN1, and SOX10, and a third subset of ABCC11, CLCA2, HGD, MUCL1, REEP6, SPINK8 and TFAP2B.
In one embodiment the panel further comprises one or more of the following genes: SLC7A2, CROT, FAM184B, LY75, FSTL3, NDC80, IL12RB2, C1QTNF3, BIRC5, SLC7A8, CRAT, TDRD1, PTK6, TMC5, CPE, LIN7A, FOXM1, LDHB, GCNT2, TTK, IQCG, ROPN1B, MLPH, KYNU, DHCR24, BCL11A, OGFRL1, WIPF3, NRK, SPDEF, DEK, PODXL, PIMREG, FOXA1, ACTR3B, HMGCS2, TTC6, LMO4, NT5DC4, ABCA12, IL17RD, MS4A2, KCNE4, TCF7L1, CHODL, L3MBTL4, MPV17L, TSC22D3, MPP3, MXRA8, VANGL2, NOSTRIN, UCN, CMBL, TMEM63C, HACD1, ABTB2, BMERB1, LDHD, AR, GPRC5C, LRG1, KNDC1, SCUBE2, P2RY2, ARL4D, ZBTB42, OR7E22P, SYNM, EPHB3, GPR19, ADAMTSL5, WNK3, AMACR and ZNF286B.
BRIEF DESCRIPTION OF ACCOMPANYING DRAWINGS
To further clarify various aspects of some embodiments of the present invention, a more particular description of the invention will be rendered by references to specific embodiments thereof, which are illustrated in the appended drawings. It is appreciated that these drawings depict only typical embodiments of the invention and are therefore not to be considered limiting of its scope. The invention will be described and explained with additional specificity and detail through the accompanying drawings in which:
Figures 1A, 1B and 1C illustrate differential expression analysis for HRD in 94 TNBC tumours arising in Malaysian women.
Figures 2A and 2B illustrate selection and grouping of genes in the core set.
Figures 3A and 3B illustrate performance of the HRD classifier in the training dataset.
Figures 4A, 4B and 4C illustrate validation of the HRD classifier in the Malaysian Breast Cancer (MyBrCa) cohort.
Figures 5A, 5B and 5C illustrate validation of the HRD classifier in The Cancer Genome Atlas (TCGA) cohort.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
Genomic sequencing data was taken primarily from TNBC samples included in the Malaysian Breast Cancer (MyBrCa) cohort tumour sequencing project. This included shallow whole-genome sequencing (sWGS), whole-exome sequencing (WES), and RNA-sequencing data collected from breast tumours of female patients, and analysed together with available clinical data. The cohort data and sequencing methods are as described in [6].
The prevalent mutational drivers in the TNBC setting were examined by conducting an unsupervised clustering analysis of the mutational signatures of TNBC samples. Hierarchical clustering of COSMIC mutational signatures previously generated for the MyBrCa tumour genomics cohort [6] revealed two large clusters of TNBC tumours with high proportions of the COSMIC mutational signature SBS3.
Using COSMIC mutational signature data, 94 TNBC tumours from the MyBrCa cohort (training samples) were classified according to the proportion of somatic mutations in each tumour that correspond to the COSMIC mutational signature SBS3. The SBS3 mutational signature is known to be driven by defects in the homologous recombination DNA repair pathway. Thus, patients whose tumour samples are predicted to have a high proportion of this mutational signature may also be more likely to respond to therapies that target DNA repair pathways, such as platinum chemotherapy, ATM, ATR, and DNA-PKcs inhibitors as well as PARP inhibitor therapy.
Using a cutoff of 0.2 based on the distribution of SBS3 scores in the cohort, tumours with scores above the cutoff were classified as HRD high, while tumours below the cutoff were classified as HRD low (Figure 1A). Then, using differential gene expression analysis of RNA-sequencing data from the TNBC tumour training samples, a set of 100 genes that are differentially expressed between the HRD high vs. HRD low samples was identified (Figure 1B). Gene set variation analysis (GSVA) of these 100 genes confirmed that the expression of these genes was significantly different between HRD high vs. HRD low samples (p<0.001) (Figure 1C).
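The labelling rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the tumour identifiers and the example scores are hypothetical, and the handling of a score exactly equal to the cutoff is an assumption, since the text only specifies "above" and "below".

```python
SBS3_CUTOFF = 0.2  # cutoff stated in the text


def label_hrd(sbs3_proportion, cutoff=SBS3_CUTOFF):
    """Label a tumour from the proportion of its mutations attributed to SBS3."""
    # Assumption: a score exactly equal to the cutoff is treated as HRD low.
    return "HRD high" if sbs3_proportion > cutoff else "HRD low"


# Hypothetical SBS3 proportions for illustration only (not cohort data).
example_scores = {"T01": 0.05, "T02": 0.31, "T03": 0.20, "T04": 0.47}
labels = {tumour: label_hrd(score) for tumour, score in example_scores.items()}
print(labels)
```

The resulting labels would then serve as the two groups compared in the differential expression analysis.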
Table 1 lists the set of 100 genes that are differentially expressed between HRD high and HRD low samples.
[Table 1 appears as images in the original publication (imgf000011_0001, imgf000012_0001, imgf000013_0001) and is not reproduced here.]
* Gene start and gene end locations are based on the GRCh38 human reference genome
Table 1
From the initial set of 100 genes, the nearest shrunken centroid method [7] was utilised to further identify a core set of 25 genes that provided a good balance between low cross-validation misclassification rate and small size of gene set. Figure 2A illustrates the cross-validation misclassification rates as the shrunken centroid threshold increases and size of gene set decreases. The core set of 25 genes can be described as the genes that contribute the most towards an accurate classification of HRD high versus HRD low in the MyBrCa TNBC cohort. None of the 25 genes in the core set were previously known to be associated with homologous recombination deficiency (HRD).
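The way a larger shrinkage threshold yields a smaller gene set can be illustrated with the soft-thresholding step of the nearest shrunken centroid method [7]. The sketch below is not the pipeline used for the invention: the gene names and standardized centroid differences are hypothetical, and the cross-validation used to choose the threshold is omitted.

```python
def soft_threshold(d, delta):
    """Shrink d toward zero by delta; values within delta of zero become 0."""
    magnitude = max(abs(d) - delta, 0.0)
    if magnitude == 0.0:
        return 0.0
    return magnitude if d > 0 else -magnitude


def surviving_genes(d_scores, delta):
    """Genes whose shrunken score is non-zero in at least one class."""
    return [
        gene
        for gene, per_class in d_scores.items()
        if any(soft_threshold(d, delta) != 0.0 for d in per_class)
    ]


# Hypothetical standardized centroid differences per gene,
# ordered as (HRD high, HRD low). Illustration only.
d_scores = {
    "GENE_A": (1.8, -1.8),
    "GENE_B": (0.9, -0.9),
    "GENE_C": (-1.2, 1.2),
    "GENE_D": (0.3, -0.3),
}

for delta in (0.0, 0.5, 1.0, 1.5):
    print(delta, surviving_genes(d_scores, delta))
```

Genes whose scores are shrunk to zero in every class no longer influence classification, which is why the gene set shrinks as the threshold grows.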
Further analysis of the training samples indicated that the core set of 25 genes can be further separated into two main groups: 18 genes that are upregulated in HRD high samples (Group 1) and 7 genes that are downregulated in HRD high samples (Group 2). Additionally, the 18 upregulated genes (Group 1) can be further separated into two groups of 11 genes (Group 1A) and 7 genes (Group 1B), respectively, based on their expression patterns in the tumour samples (Figure 2B). For Figure 2B, the aim was to determine which genes in the core set have similar expression patterns, to better understand how many different “signals” there are in the data. To do this, hierarchical clustering was used. This method uses a distance function to calculate how similar one gene is to another and displays the result in a hierarchical manner: the most similar genes are grouped first, then those groups are compared to the other groups, and so on. The heatmap shows the relative level of expression of each gene in each sample, normalized to the median expression. From the left dendrogram, it can be seen that the top-level separation into two groups separates the genes that are upregulated from those that are downregulated and thus have opposite expression patterns. Furthermore, the upregulated genes (NDUFB4P11 through ROPN1) can be further divided into two groups based on the distance function, even though visually from the heatmap the differences may not be obvious.
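A minimal sketch of agglomerative hierarchical clustering of genes is given below. The text does not state which distance function or linkage was used, so the choice of 1 minus Pearson correlation with average linkage is an assumption, and the gene names and expression values are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def cluster_genes(expr, n_clusters):
    """Average-linkage agglomerative clustering; distance = 1 - Pearson r (assumed)."""
    clusters = [[gene] for gene in expr]

    def dist(c1, c2):
        return mean(1 - pearson(expr[a], expr[b]) for a in c1 for b in c2)

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        # Merge the two closest clusters first.
        i, j = min(pairs, key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters


# Hypothetical median-normalized expression across five samples.
expr = {
    "UP_1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "UP_2": [1.1, 2.0, 3.2, 3.9, 5.1],
    "DOWN_1": [5.0, 4.0, 3.0, 2.0, 1.0],
    "DOWN_2": [5.2, 3.9, 3.1, 2.2, 0.9],
}
groups = cluster_genes(expr, n_clusters=2)
print(groups)
```

With these toy profiles the top-level split separates the two genes that rise across samples from the two that fall, mirroring the upregulated versus downregulated separation described for the dendrogram.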
Table 2 lists the core set of genes used to predict homologous recombination deficiency (HRD) in triple negative breast cancer (TNBC).
[Table 2 appears as images in the original publication (imgf000014_0001, imgf000015_0001) and is not reproduced here.]
* Gene start and gene end locations are based on the GRCh38 human reference genome
Table 2
The core set of 25 genes can be used to classify the HRD status of any given patient tumour sample by using the nearest shrunken centroids methodology as described in [7]. Briefly, the standardized nearest shrunken centroids for each of the two classes (HRD high and HRD low) in a training dataset are determined. For the training dataset, a curated subset of 80 TNBC tumours from the MyBrCa cohort, comprising approximately 40 HRD high and 40 HRD low samples, with the 100 genes listed in Table 1 was used. Using the data from the curated subset, the centroid for each class (HRD high and HRD low) was calculated across 100 dimensions (each gene = 1 dimension). To determine a sample’s HRD status, the sample data is first normalized to ensure it is on the same scale as the training dataset. Then, the median-centered transcript count data is used to find the sample’s standardized squared distance to the shrunken centroid of each class (HRD high and HRD low). Using the distance, the probability of the sample being assigned as HRD high or HRD low is calculated using Gaussian Discriminant Analysis. A sample is considered to be unambiguously classified when the probability of the sample being in one class (either HRD high or HRD low) exceeds the other by 0.5 (i.e., if the probability of high exceeds the probability of low by 0.5 or more, then the sample is classified as HRD high, and vice versa). Samples right at the boundary of both classes (high and low) are assigned a probability based on how common the classes are in the training dataset.
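The classification step can be sketched as follows, using the discriminant score from [7]: the standardized squared distance to each class centroid, corrected by the class prior, converted to class probabilities. The gene names, centroid values, standard deviations and priors below are hypothetical placeholders, not the values of Table 3, and the "ambiguous" label for samples that miss the 0.5 margin is an assumed convention.

```python
import math


def discriminant_scores(sample, centroids, pooled_sd, priors):
    """Standardized squared distance to each centroid, minus 2*log(prior)."""
    scores = {}
    for cls, centroid in centroids.items():
        dist = sum(((sample[g] - centroid[g]) / pooled_sd[g]) ** 2 for g in centroid)
        scores[cls] = dist - 2.0 * math.log(priors[cls])
    return scores


def class_probabilities(scores):
    """Gaussian class probabilities: proportional to exp(-score / 2)."""
    best = min(scores.values())  # subtract the minimum for numerical stability
    weights = {c: math.exp(-0.5 * (s - best)) for c, s in scores.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


def classify(sample, centroids, pooled_sd, priors, margin=0.5):
    """Return a label and the class probabilities; 0.5 margin as in the text."""
    probs = class_probabilities(discriminant_scores(sample, centroids, pooled_sd, priors))
    high, low = probs["HRD high"], probs["HRD low"]
    if high - low >= margin:
        return "HRD high", probs
    if low - high >= margin:
        return "HRD low", probs
    return "ambiguous", probs


# Hypothetical two-gene model for illustration only.
centroids = {
    "HRD high": {"UP_GENE": 0.8, "DOWN_GENE": -0.6},
    "HRD low": {"UP_GENE": -0.8, "DOWN_GENE": 0.6},
}
pooled_sd = {"UP_GENE": 1.0, "DOWN_GENE": 1.0}
priors = {"HRD high": 0.5, "HRD low": 0.5}

clear_label, clear_probs = classify({"UP_GENE": 1.0, "DOWN_GENE": -0.5}, centroids, pooled_sd, priors)
edge_label, edge_probs = classify({"UP_GENE": 0.1, "DOWN_GENE": 0.0}, centroids, pooled_sd, priors)
print(clear_label, edge_label)
```

When a sample is equidistant from both centroids, the distance terms cancel and the probabilities reduce to the class priors, which matches the statement that boundary samples are assigned a probability based on how common the classes are in the training dataset.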
Table 3 lists the standardized centroids for each class used to classify samples as HRD high and HRD low.
[Table 3 appears as images in the original publication (imgf000015_0002, imgf000016_0001) and is not reproduced here.]
Table 3
From the core set of 25 genes listed in Table 3, expression of the DLGAP1, FOXC1, MFSD6L, NCMAP, NDUFB4P11, NPFFR1, PLEKHB1, PNMA8A, SRSF12, TAFA3, UGT8, GABRP, MIA, OXGR1, RLBP1, ROCR, ROPN1 and SOX10 genes is positively associated with HRD high classification and negatively associated with HRD low classification, while the expression of the ABCC11, CLCA2, HGD, MUCL1, REEP6, SPINK8 and TFAP2B genes is positively associated with HRD low classification and negatively associated with HRD high classification.
Using this method, the classifier correctly classified 71 of the 94 Asian TNBC samples as HRD high or HRD low (a baseline accuracy of 75.5%), with an area under the receiver operating characteristic curve (AUROC) of 0.816 (p=1.92e-7) when compared to their SBS3 scores (Figure 3A). The classifier also categorized 9 of the 11 samples with genomic BRCA mutations as HRD high (Figure 3B). Tumours with genomic BRCA mutations are known to have defective homologous recombination repair pathways [8].
The classifier was validated using an additional 35 TNBC samples from the MyBrCa cohort that were sequenced at a later point. Using RNA-seq gene expression data for the 25 genes, the classifier was able to select for TNBC samples with high SBS3 scores with an AUROC of 0.855 (p=3.51e-4) (Figure 4A), correctly classifying 28 out of 35 samples as HRD high or HRD low according to the 0.2 cutoff (accuracy of 80%). There was also a significant difference in SBS3 scores between the validation MyBrCa TNBC samples that were classified as HRD high vs. HRD low (p=0.008) (Figure 4B).
The classifier was further validated using data from TNBC tumours in The Cancer Genome Atlas (TCGA) dataset [9]. Using RNA-seq gene expression data for the 25 genes, the HRD classifier was able to select for TCGA TNBC samples with high SBS3 scores with an AUROC of 0.652 (p=0.021) (Figure 5A), correctly classifying 58 out of 87 samples as HRD high or HRD low according to the 0.2 cutoff (accuracy of 66.7%). There was also a significant difference in SBS3 scores between TCGA TNBC samples that were classified as HRD high vs. HRD low (p=1.15e-4) (Figure 5B). Additionally, the HRD classifier was able to classify 4 out of 5 of the TCGA TNBC samples known to carry gBRCA mutations as being HRD high (Figure 5C).
Machine learning methods other than nearest shrunken centroids may also be used to classify samples as HRD high or HRD low using the core gene set. For example, gene expression data from the set of 100 genes described above (including the core 25-gene set) can be used to train random forest and support vector machine predictive models. One such set of models, trained using 70% of all 129 MyBrCa TNBC samples and tested on the remaining 30% of samples, was able to achieve an average AUROC of 0.94 in predicting HRD high samples in the MyBrCa test data after being optimized using 10-fold cross-validation. This particular set of models was also able to predict HRD high in the 87 TCGA TNBC samples with an average AUROC of 0.66. These results demonstrate that the core gene set in the present invention may be utilized in more than one machine learning predictive framework in order to successfully classify the HRD status of patient samples.
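The reported accuracies follow directly from the stated counts, and AUROC can be computed as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below recomputes the accuracies from the counts in the text; the AUROC example uses hypothetical scores and labels, since the per-sample data are not given here.

```python
def accuracy_percent(correct, total):
    """Accuracy as a percentage, rounded to one decimal place."""
    return round(100.0 * correct / total, 1)


def auroc(scores, labels):
    """Pairwise AUROC: fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Counts taken from the text.
training_acc = accuracy_percent(71, 94)    # MyBrCa training samples
validation_acc = accuracy_percent(28, 35)  # MyBrCa validation samples
tcga_acc = accuracy_percent(58, 87)        # TCGA TNBC samples
print(training_acc, validation_acc, tcga_acc)

# Hypothetical classifier scores and true labels (1 = HRD high), for illustration only.
toy_auroc = auroc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
print(toy_auroc)
```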
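The evaluation protocol described above (a 70/30 hold-out split followed by 10-fold cross-validation on the training portion) can be sketched with the Python standard library. Model fitting is deliberately omitted. The function name, the random seed and the rounding of 70% of 129 samples down to 90 are assumptions, as the text does not state the exact split sizes or whether the split was stratified.

```python
import random


def holdout_and_folds(n_samples, train_fraction=0.7, n_folds=10, seed=0):
    """Shuffle sample indices, hold out a test set, and partition the rest into folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(train_fraction * n_samples)  # assumption: round down
    train, test = indices[:n_train], indices[n_train:]
    # Each fold takes every n_folds-th training index, giving near-equal sizes.
    folds = [train[i::n_folds] for i in range(n_folds)]
    return train, test, folds


train_idx, test_idx, cv_folds = holdout_and_folds(129)
print(len(train_idx), len(test_idx), [len(f) for f in cv_folds])
```

During tuning, each fold would in turn be held out for validation while a model is fitted on the other nine, and the held-out 30% would be used only once for the final AUROC estimate.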
The HRD classifier as envisioned would be prescribed by qualified oncologists for Asian patients with metastatic Stage IV TNBC who do not have germline or somatic BRCA mutations, in order to determine if the patient may still benefit from PARP inhibitor therapy. The oncologist would send a sample of the patient’s archived (FFPE) surgical tumour tissue to a laboratory where RNA will be extracted from the tissue sample and will be used to conduct the HRD test in order to classify the sample as “HRD high” or “HRD low”. The results would then be reported back to the oncologist to be used as part of the clinical decision-making process.
It will be appreciated by persons skilled in the art that the present invention may also include further additional modifications which do not affect the overall functioning.
References
1. Robson M, Im S-A, Senkus E, et al. Olaparib for Metastatic Breast Cancer in Patients with a Germline BRCA Mutation. N Engl J Med. 2017;377(6):523-533. doi:10.1056/NEJMoa1706450.
2. Litton JK, Rugo HS, Ettl J, et al. Talazoparib in Patients with Advanced Breast Cancer and a Germline BRCA Mutation. N Engl J Med. 2018;379(8):753-763. doi:10.1056/NEJMoa1802905.
3. Lord CJ, Ashworth A. BRCAness revisited. Nat Rev Cancer. 2016;16(2):110-120. doi:10.1038/nrc.2015.21.
4. Telli ML, Timms KM, Reid JE, et al. Homologous Recombination Deficiency (HRD) Score Predicts Response to Platinum-Containing Neoadjuvant Chemotherapy in Patients with Triple Negative Breast Cancer. Clin Cancer Res. 2016.
5. Davies H, Glodzik D, Morganella S, et al. HRDetect is a predictor of BRCA1 and BRCA2 deficiency based on mutational signatures. Nat Med. 2017;23(4):517-525. doi:10.1038/nm.4292.
6. Pan JW, Zabidi MMA, Ng PS, et al. The molecular landscape of Asian breast cancers reveals clinically relevant population-specific differences. Nat Commun. 2020;11:6433. doi:10.1038/s41467-020-20173-5.
7. Tibshirani R, Hastie T, Narasimhan B, Chu G. Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc Natl Acad Sci U S A. 2002;99(10):6567-6572. doi:10.1073/pnas.082099299. PMID: 12011421; PMCID: PMC124443.
8. Powell S, Kachnic L. Roles of BRCA1 and BRCA2 in homologous recombination, DNA replication fidelity and the cellular response to ionizing radiation. Oncogene. 2003;22:5784-5791. doi:10.1038/sj.onc.1206678.
9. The Cancer Genome Atlas Network. Comprehensive molecular portraits of human breast tumours. Nature. 2012;490:61-70.

Claims

1. A method for classifying homologous recombination deficiency (HRD) comprising the steps of: obtaining a tumour sample from a human subject; measuring the expression levels of at least two genes in the sample selected from a core set; and classifying the sample as HRD high or HRD low based on the expression levels of the genes; wherein a sample classified as HRD high is indicative of an increased likelihood of the tumour responding to therapies that target DNA repair pathways; characterized in that the core set of genes comprises a first subset of DLGAP1, FOXC1, MFSD6L, NCMAP, NDUFB4P11, NPFFR1, PLEKHB1, PNMA8A, SRSF12, TAFA3, and UGT8, a second subset of GABRP, MIA, OXGR1, RLBP1, ROCR, ROPN1, and SOX10, and a third subset of ABCC11, CLCA2, HGD, MUCL1, REEP6, SPINK8 and TFAP2B.
2. The method of claim 1, further comprising the step of measuring the expression levels of at least three, at least four, at least five, at least six, at least seven, at least eight, at least nine, at least ten, at least eleven, at least twelve, at least thirteen, at least fourteen, at least fifteen, at least sixteen, at least seventeen, at least eighteen, at least nineteen, at least twenty, at least twenty-one, at least twenty-two, at least twenty-three, at least twenty-four, or at least twenty-five genes in the sample selected from the core set.
3. The method of claim 1, wherein the expression levels of at least one gene selected from the first and second subsets, and at least one gene selected from the third subset is measured in the sample.
4. The method of claim 1, wherein the expression levels of at least one gene selected from the first subset, at least one gene selected from the second subset, and at least one gene selected from the third subset is measured in the sample.
5. The method of claim 1, wherein the core set of genes are identified from differential gene expression analysis of training samples that are classified as HRD high or HRD low according to the proportion of somatic mutations in each tumour that correspond to a selected mutational signature.
6. The method of claim 5, wherein the selected mutational signature is COSMIC mutational signature SBS3.
7. The method of claim 5, wherein the core set of genes are utilised to classify a sample as HRD high or HRD low using a nearest shrunken centroid method whereby the standardized nearest shrunken centroids for each class (HRD high and HRD low) in a training dataset is determined using a curated subset of tumour samples comprising HRD high and HRD low samples.
8. The method of claim 7, wherein classifying the sample as HRD high or HRD low requires finding the sample’s standardized squared distance to each shrunken centroid for each class (HRD high and HRD low) whereby the distance obtained is utilised to calculate the probability of the sample being assigned as HRD high or HRD low using Gaussian Discriminant Analysis.
9. The method of claim 8, wherein the sample is considered to be unambiguously classified when the probability of the sample being in one class (HRD high or HRD low) exceeds the other by 0.5.
10. The method of claim 1, wherein the expression of the DLGAP1, FOXC1, MFSD6L, NCMAP, NDUFB4P11, NPFFR1, PLEKHB1, PNMA8A, SRSF12, TAFA3, UGT8, GABRP, MIA, OXGR1, RLBP1, ROCR, ROPN1 and SOX10 genes are positively associated with HRD high classification and negatively associated with HRD low classification while the expression of the ABCC11, CLCA2, HGD, MUCL1, REEP6, SPINK8 and TFAP2B genes are positively associated with HRD low classification and negatively associated with HRD high classification.
11. The method of claim 1, wherein the expression levels of one or more supplementary genes is measured in the sample, selected from SLC7A2, CROT, FAM184B, LY75, FSTL3, NDC80, IL12RB2, C1QTNF3, BIRC5, SLC7A8, CRAT, TDRD1, PTK6, TMC5, CPE, LIN7A, FOXM1, LDHB, GCNT2, TTK, IQCG, ROPN1B, MLPH, KYNU, DHCR24, BCL11A, OGFRL1, WIPF3, NRK, SPDEF, DEK, PODXL, PIMREG, FOXA1, ACTR3B, HMGCS2, TTC6, LMO4, NT5DC4, ABCA12, IL17RD, MS4A2, KCNE4, TCF7L1, CHODL, L3MBTL4, MPV17L, TSC22D3, MPP3, MXRA8, VANGL2, NOSTRIN, UCN, CMBL, TMEM63C, HACD1, ABTB2, BMERB1, LDHD, AR, GPRC5C, LRG1, KNDC1, SCUBE2, P2RY2, ARL4D, ZBTB42, OR7E22P, SYNM, EPHB3, GPR19, ADAMTSL5, WNK3, AMACR and ZNF286B.
12. The method of claim 1, wherein the therapy is poly (ADP-ribose) polymerase inhibitor (PARPi).
13. The method of claim 1, wherein the therapies can be selected from platinum chemotherapy or ATM, ATR, and DNA-PKcs inhibitors.
14. The method of claim 1, wherein the human subject has cancer, typically triple-negative breast cancer (TNBC).
15. The method of claim 1, wherein the subject does not carry germline or somatic BRCA1 or BRCA2 mutation.
16. The method of claim 1, wherein alternative machine learning methods can be utilised to classify the sample as HRD high or HRD low based on the expression levels of the genes from the core and supplementary gene sets.
17. A tool for classifying homologous recombination deficiency (HRD) comprising: a panel of at least two genes selected from a core set of twenty-five genes; wherein the panel allows a tumour sample extracted from a human subject to be measured for the expression levels of the selected genes; wherein the sample is classified as HRD high or HRD low based on the expression levels of the genes; wherein a sample classified as HRD high is indicative of an increased likelihood of the tumour responding to therapies that target DNA repair pathways characterized in that the core set of genes comprises a first subset of DLGAP1, FOXC1, MFSD6L, NCMAP, NDUFB4P11, NPFFR1, PLEKHB1, PNMA8A, SRSF12, TAFA3, and UGT8, a second subset of GABRP, MIA, OXGR1, RLBP1, ROCR, ROPN1, and SOX10, and a third subset of ABCC11, CLCA2, HGD, MUCL1, REEP6, SPINK8 and TFAP2B.
18. The tool of claim 15, wherein the panel further comprises one or more of the following genes: SLC7A2, CROT, FAM184B, LY75, FSTL3, NDC80, IL12RB2, C1QTNF3, BIRC5, SLC7A8, CRAT, TDRD1, PTK6, TMC5, CPE, LIN7A, FOXM1, LDHB, GCNT2, TTK, IQCG, ROPN1B, MLPH, KYNU, DHCR24, BCL11A, OGFRL1, WIPF3, NRK, SPDEF, DEK, PODXL, PIMREG, FOXA1, ACTR3B, HMGCS2, TTC6, LMO4, NT5DC4, ABCA12, IL17RD, MS4A2, KCNE4, TCF7L1, CHODL, L3MBTL4, MPV17L, TSC22D3, MPP3, MXRA8, VANGL2, NOSTRIN, UCN, CMBL, TMEM63C, HACD1, ABTB2, BMERB1, LDHD, AR, GPRC5C, LRG1, KNDC1, SCUBE2, P2RY2, ARL4D, ZBTB42, OR7E22P, SYNM, EPHB3, GPR19, ADAMTSL5, WNK3, AMACR and ZNF286B.
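
Illustrative sketch (not part of the claims): the classification steps recited in the claims on the nearest shrunken centroid method, the standardized squared distance with Gaussian discriminant probabilities, and the 0.5 probability margin can be approximated as below. This is a simplified reading of the shrunken-centroid approach of reference 7, not the applicant's implementation. The function names, the shrinkage threshold `delta`, the factor m_k = sqrt(1/n_k - 1/n), the reading of "exceeds the other by 0.5" as a difference of at least 0.5, and all expression values are assumptions made for illustration; the four gene names are taken from the core set only to label the synthetic columns.

```python
import numpy as np

def fit_shrunken_centroids(X, y, delta=0.5):
    """Fit per-class shrunken centroids.
    X: (n_samples, n_genes) expression matrix; y: class label per sample.
    delta is a hypothetical shrinkage threshold, not a value from the source."""
    classes, idx = np.unique(y, return_inverse=True)
    n = X.shape[0]
    counts = np.bincount(idx)
    overall = X.mean(axis=0)
    centroids = np.vstack([X[idx == k].mean(axis=0) for k in range(len(classes))])
    # Pooled within-class standard deviation per gene.
    resid = X - centroids[idx]
    s = np.sqrt((resid ** 2).sum(axis=0) / (n - len(classes)))
    scale = s + np.median(s)                      # s_j + s_0
    m = np.sqrt(1.0 / counts - 1.0 / n)[:, None]  # assumed form of m_k
    d = (centroids - overall) / (m * scale)
    # Soft-threshold the standardized differences towards zero.
    d_shrunk = np.sign(d) * np.maximum(np.abs(d) - delta, 0.0)
    return {
        "classes": classes.tolist(),
        "centroids": overall + m * scale * d_shrunk,
        "scale": scale,
        "priors": counts / n,
    }

def predict_proba(model, x):
    """Class probabilities from standardized squared distances to each centroid."""
    diff = (x - model["centroids"]) / model["scale"]
    score = (diff ** 2).sum(axis=1) - 2.0 * np.log(model["priors"])
    logit = -0.5 * score
    logit -= logit.max()          # numerical stability
    prob = np.exp(logit)
    prob /= prob.sum()
    return dict(zip(model["classes"], prob.tolist()))

def classify(model, x, margin=0.5):
    """Return the winning class, or 'ambiguous' if the margin is not reached.
    Written for the two-class case (HRD high vs HRD low)."""
    prob = predict_proba(model, x)
    (best, p1), (_, p2) = sorted(prob.items(), key=lambda kv: -kv[1])
    return best if p1 - p2 >= margin else "ambiguous"

# Synthetic training data (hypothetical values): columns stand for
# FOXC1, SOX10 (higher in HRD high) and MUCL1, TFAP2B (higher in HRD low).
rng = np.random.default_rng(0)
X_high = rng.normal(loc=[5, 5, 1, 1], scale=0.5, size=(10, 4))
X_low = rng.normal(loc=[1, 1, 5, 5], scale=0.5, size=(10, 4))
X = np.vstack([X_high, X_low])
y = np.array(["HRD high"] * 10 + ["HRD low"] * 10)

model = fit_shrunken_centroids(X, y)
sample = np.array([5.0, 5.0, 1.0, 1.0])
print(classify(model, sample), predict_proba(model, sample))
```

A sample lying midway between the two shrunken centroids yields near-equal probabilities and is therefore reported as ambiguous under this margin rule.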
PCT/MY2023/050003 2022-01-28 2023-01-20 Gene signatures for classifying homologous recombination deficiency WO2023146389A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
MYPI2022000608 2022-01-28
MYPI2022000608 2022-01-28

Publications (1)

Publication Number Publication Date
WO2023146389A1 (en) 2023-08-03

Family

ID=87472291

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/MY2023/050003 WO2023146389A1 (en) 2022-01-28 2023-01-20 Gene signatures for classifying homologous recombination deficiency

Country Status (1)

Country Link
WO (1) WO2023146389A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220028482A1 (en) * 2019-12-10 2022-01-27 Tempus Labs, Inc. Systems and methods for predicting homologous recombination deficiency status of a specimen

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
LADAN, MARJOLIJN M. ET AL.: "Homologous recombination deficiency testing for BRCA-like tumors: The road to clinical validation", CANCERS, vol. 13, no. 1004, 28 February 2021 (Publication date), pages 1 - 23, XP055886909 *
LI YAWEI, ZHANGXIANG ZHAO, LIQIANG AI, YUQUAN WANG, KAIDONG LIU, BO CHEN, TINGTING CHEN, SHUPING ZHUANG, HUANHUAN XU, MIN ZOU, YU: "Discovering a qualitative transcriptional signature of homologous recombination defectiveness for prostate cancer", ISCIENCE, ELSEVIER, vol. 24, no. 10, 22 October 2022 (2022-10-22), pages 1 - 18, XP093083063, DOI: 10.1016/j.isci.2021.103135 *
SHI ZHIWEN, ZHAO QINGGUO, LV BIN, QU XINYU, HAN XIAO, WANG HONGYAN, QIU JUNJUN, HUA KEQIN: "Identification of biomarkers complementary to homologous recombination deficiency for improving the clinical outcome of ovarian serous cystadenocarcinoma", CLINICAL AND TRANSLATIONAL MEDICINE, INTERNATIONAL SOCIETY FOR TRANSLATIONAL MEDICINE, SE, vol. 11, no. 5, 1 May 2021 (2021-05-01), SE , XP093083061, ISSN: 2001-1326, DOI: 10.1002/ctm2.399 *
TAKAYA HISAMITSU, NAKAI HIDEKATSU, TAKAMATSU SHIRO, MANDAI MASAKI, MATSUMURA NORIOMI: "Homologous recombination deficiency status-based classification of high-grade serous ovarian carcinoma", SCIENTIFIC REPORTS, vol. 10, no. 1, XP093083058, DOI: 10.1038/s41598-020-59671-3 *

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23747427

Country of ref document: EP

Kind code of ref document: A1