WO2023099889A1 - Method of prognosis - Google Patents

Publication number
WO2023099889A1
Authority
WO
WIPO (PCT)
Prior art keywords
genes
gene panel
gene
ppfia2
tshz2
Prior art date
Application number
PCT/GB2022/053034
Inventor
Ryan LUSBY
Vijay TIWARI
Mohammed INAYATULLAH
Original Assignee
The Queen's University Of Belfast
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The Queen's University Of Belfast
Publication of WO2023099889A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/106: Pharmacogenomics, i.e. genetic variability in individual responses to drugs and drug metabolism
    • C12Q2600/158: Expression markers

Definitions

  • the present invention relates to methods of characterizing breast cancer.
  • the invention relates to methods of predicting resistance to chemotherapy in subjects with triple negative breast cancer.
  • TNBC triple-negative breast cancer
  • TNBC is the most aggressive breast cancer subtype. It is associated with a poorer clinical outcome due to a lack of early prognostic techniques, high incidences of relapse, metastasis and a lack of targeted therapeutics.
  • chemotherapy is the standard treatment, which includes a combination of taxanes and anthracyclines. However, approximately 30%-50% of patients develop resistance, and their prognosis worsens to 13-15 months' survival (refs. 3-5).
  • TNBC is a disease associated with extensive intratumor heterogeneity (ITH), which plays a crucial role in driving chemoresistance (ref. 6).
  • the inventors performed an integrated analysis of expression profiles derived from scRNA-seq, from TNBC tumors treated with chemotherapy and identified subpopulations of cells that associate with key aspects of TNBC aggressiveness including chemoresistance and metastasis.
  • the inventors show that the identified signature genes can accurately predict response to NAC in primary as well as advanced stage TNBC (lymph-node positive), with the prediction accuracy for TNBC greatly improved over that of known gene expression signatures.
  • the present invention provides a method of predicting resistance to chemotherapy in a subject having triple negative breast cancer (TNBC), wherein said method comprises: a) providing a biological sample from said subject; and b) determining the expression levels of each member of a gene panel in said biological sample, said gene panel comprising at least 10 genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS; wherein the determined expression levels of said genes of said gene panel are used to determine the likelihood of resistance to said chemotherapy.
  • the gene panel comprises the following 20 genes: SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
  • the gene panel consists of the following 20 genes: SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
  • the inventors have shown that such a gene panel (gene signature) can predict response to chemotherapy in TNBC significantly more accurately than known gene expression signatures for TNBC. Moreover, the inventors have further shown that smaller gene panels comprising fewer of the 20 genes still outperform known gene signatures. For example, as shown in the Examples, even when the ten genes having greatest effect on the performance of the gene panel are not included, the overall performance of the gene signature is still greater than that of known gene signatures.
  • the invention extends to gene panels which do not include all 20 of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
  • said gene panel comprises at least 10 genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, and GDAP2.
  • said gene panel comprises at least the 10 genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, and CLCN3.
  • said gene panel comprises at least 15 genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
  • said gene panel comprises at least the fifteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, and GDAP2.
  • said gene panel comprises one, two, three, four, or five of SLC11A2, TSHZ2, CTDSPL, NDUFA6, and PTPRJ.
  • the likelihood of resistance to chemotherapy may be determined by comparing the expression levels of said genes of the gene panel relative to reference amounts of said genes.
  • Said reference amounts may be reference amounts in control cells.
  • Such control cells may be chemosensitive TNBC cells, for example chemosensitive TNBC cells from patients shown to have pathological complete response (pCR) to said chemotherapy.
  • each gene of the gene panel has significantly higher expression in TNBC cells from patients with residual disease vs TNBC cells from patients with pCR (e.g. the difference having a p value of less than 0.05).
  • the likelihood of resistance to said chemotherapy may be determined by applying the expression levels to a predictive model which relates expression levels of said genes of the gene panel with resistance to chemotherapy against triple negative breast cancer.
  • Applying the expression levels to such a predictive model may comprise weighting the expression levels of said genes of the gene panel according to a predetermined ranking of said genes of the gene panel.
  • the weighting of the expression levels of the genes may be determined by performing LASSO regression.
  • the genes may be ranked in order of greatest to least predictive power, for example as shown in Table 2.
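  • By way of illustration only, the idea of deriving gene weights by LASSO regression can be sketched as an L1-penalised logistic regression fitted by proximal gradient descent. This is a minimal sketch under stated assumptions, not the inventors' glmnet-based model: the function names, the step size, the penalty values and the two-gene toy data are all hypothetical, and glmnet itself uses a different fitting procedure (coordinate descent).

```python
import math


def soft_threshold(value, threshold):
    """Shrink a value towards zero by `threshold` (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(rows, labels, penalty, step=0.1, iterations=2000):
    """Fit an L1-penalised logistic regression by proximal gradient descent.

    rows   : per-patient expression vectors (one value per gene)
    labels : 1 = residual disease, 0 = pathological complete response
    Returns (intercept, weights); the intercept is not penalised.
    """
    n_samples, n_genes = len(rows), len(rows[0])
    intercept, weights = 0.0, [0.0] * n_genes
    for _ in range(iterations):
        grad_b, grad_w = 0.0, [0.0] * n_genes
        for x, y in zip(rows, labels):
            z = intercept + sum(w * v for w, v in zip(weights, x))
            error = 1.0 / (1.0 + math.exp(-z)) - y  # predicted probability minus label
            grad_b += error / n_samples
            for j, v in enumerate(x):
                grad_w[j] += error * v / n_samples
        intercept -= step * grad_b
        # Gradient step followed by soft-thresholding, which can set weights exactly to zero.
        weights = [soft_threshold(w - step * g, step * penalty)
                   for w, g in zip(weights, grad_w)]
    return intercept, weights


# Hypothetical toy data: gene 0 separates the classes, gene 1 is noise.
toy_rows = [[2.0, 0.1], [1.8, -0.2], [2.2, 0.0],
            [-2.0, 0.1], [-1.9, -0.1], [-2.1, 0.0]]
toy_labels = [1, 1, 1, 0, 0, 0]

_, light_penalty_weights = fit_l1_logistic(toy_rows, toy_labels, penalty=0.01)
_, heavy_penalty_weights = fit_l1_logistic(toy_rows, toy_labels, penalty=10.0)
print(light_penalty_weights, heavy_penalty_weights)
```

  • In this sketch a sufficiently heavy penalty drives every weight to exactly zero, which is the property that lets LASSO-type models select a small gene panel from a larger candidate set.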
  • "an active agent" or "a pharmacologically active agent" includes a single active agent as well as two or more different active agents in combination,
  • reference to "a carrier" includes mixtures of two or more carriers as well as a single carrier, and the like.
  • the present invention is based on the identification of a specific gene signature which the inventors have shown can be used to predict with high accuracy resistance against chemotherapy in TNBC patients.
  • scRNA-seq matched longitudinal single-cell RNA-sequencing data
  • the inventors identified 281 markers that define the subpopulations associated with chemoresistance.
  • using Lasso and Elastic-Net Regularized Generalized Linear Models, the inventors developed a predictive model of 20 genes, which showed a high capability in discriminating between residual disease and pathological complete response in a total of 371 TNBC patients from four independent studies.
  • Genes which may be used in the gene signatures of the invention comprise the following 20 genes: MSH3, TSHZ2, SLC11A2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
  • nucleic acid sequences are shown in the sequence listing with sequences for each of the genes as follows: MSH3 (SEQ ID NO: 1), TSHZ2 (SEQ ID NO: 2), SLC11A2 (SEQ ID NO: 3), CTDSPL (SEQ ID NO: 4), NDUFA6 (SEQ ID NO: 5), PTPRJ (SEQ ID NO: 6), PPFIA2 (SEQ ID NO: 7), COL21A1 (SEQ ID NO: 8), EFEMP2 (SEQ ID NO: 9), CLCN3 (SEQ ID NO: 10), ESM1 (SEQ ID NO: 11), EGFR (SEQ ID NO: 12), DTNA (SEQ ID NO: 13), EPHB3 (SEQ ID NO: 14), GDAP2 (SEQ ID NO: 15), HOXA1 (SEQ ID NO: 16), LOXL2 (SEQ ID NO: 17), RAI1 (SEQ ID NO: 18), RNF19B (SEQ ID NO: 19), and MKKS (S
  • genes for use in the gene signatures of the invention are shown in Figure 4.
  • Examples of suitable Affymetrix probe sequences which may be used to identify target sequences of the genes are listed in Table 1.
  • probes recited for each gene of the gene panel should be considered as non-limiting examples.
  • the gene panel may comprise SLC11A2. In some embodiments of the invention, the gene panel may comprise PPFIA2. In some embodiments of the invention, the gene panel may comprise DTNA. In some embodiments of the invention, the gene panel may comprise HOXA1. In some embodiments of the invention, the gene panel may comprise RAI1. In some embodiments of the invention, the gene panel may comprise MKKS.
  • the gene panel comprises SLC11A2 and at least 9 of the genes selected from the group consisting of MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
  • the gene panel comprises PPFIA2 and at least 9, for example 10, of the genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
  • the gene panel comprises DTNA and at least 9, for example, 10, 11, 12, 13, 14, 15, 16, 17, 18, or all 19 of the genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
  • the gene panel comprises HOXA1 and at least 9, for example, 10, 11, 12, 13, 14, 15, 16, 17, 18, or all 19 of the genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, LOXL2, RAI1, RNF19B, and MKKS.
  • the gene panel comprises MKKS and at least 9, for example, 10, 11, 12, 13, 14, 15, 16, 17, 18, or all 19 of the genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, and RNF19B.
  • the gene panel comprises at least two, three, four, five or all 6 genes selected from the group consisting of: SLC11A2, PPFIA2, DTNA, HOXA1, RAI1, and MKKS.
  • the gene panel may comprise SLC11A2 and PPFIA2.
  • the gene panel may comprise SLC11A2 and DTNA.
  • the gene panel may comprise PPFIA2 and DTNA.
  • the gene panel comprises SLC11A2, PPFIA2 and DTNA.
  • the gene panel comprises SLC11A2, PPFIA2, DTNA and HOXA1.
  • the gene panel comprises SLC11A2, PPFIA2, DTNA, HOXA1 and RAI1. In one embodiment of the invention, the gene panel comprises SLC11A2, PPFIA2, DTNA, HOXA1, RAI1, and MKKS.
  • the gene panel comprises two, three, four, five, or six of the genes selected from SLC11A2, PPFIA2, DTNA, HOXA1, RAI1, and MKKS and at least 4, for example, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14, of the genes selected from the group consisting of MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, EPHB3, GDAP2, LOXL2, and RNF19B.
  • the gene panel may comprise one, two, three, four, or five of MSH3, TSHZ2, CTDSPL, NDUFA6, and PTPRJ.
  • the gene panel may comprise one, two, three, four, or five of MSH3, TSHZ2, CTDSPL, NDUFA6, and PTPRJ together with one, two, three, four, five, or six of the genes selected from SLC11A2, PPFIA2, DTNA, HOXA1, RAI1, and MKKS.
  • the gene panel may comprise at least 12, 13, 14, 15, 16, 17, 18, or 19 genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
  • the gene panel comprises (i) at least the twelve genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, and EGFR; and optionally (ii) one or more of the genes of the group consisting of DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
  • the gene panel comprises (i) at least the thirteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, and DTNA; and optionally (ii) one or more of the genes of the group consisting of EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
  • the gene panel comprises (i) at least the fourteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, and EPHB3; and optionally (ii) one or more of the genes of the group consisting of GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
  • the gene panel comprises (i) at least the fifteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, and GDAP2; and optionally (ii) one or more of the genes of the group consisting of HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
  • the gene panel comprises (i) at least the sixteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, and HOXA1; and optionally (ii) one or more of the genes of the group consisting of LOXL2, RAI1, RNF19B, and MKKS.
  • the gene panel comprises (i) at least the seventeen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, and LOXL2; and optionally (ii) one or more of the genes of the group consisting of RAI1, RNF19B, and MKKS.
  • the gene panel comprises (i) at least the eighteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, and RAI1; and optionally (ii) one or more of the genes of the group consisting of RNF19B and MKKS.
  • the gene panel comprises at least the nineteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, and RNF19B, and optionally MKKS.
  • the gene panel comprises the twenty genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
  • the gene panel is limited to genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS, i.e., the genes of which expression is determined are limited to only ten or more of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
  • the gene panel consists of the ten genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, and CLCN3.
  • the gene panel consists of the eleven genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, and ESM1.
  • the gene panel consists of the twelve genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, and EGFR.
  • the gene panel consists of the thirteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, and DTNA.
  • the gene panel consists of the fourteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, and EPHB3.
  • the gene panel consists of the fifteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, and GDAP2.
  • the gene panel comprises (i) at least the sixteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, and HOXA1; and optionally (ii) one or more of the genes of the group consisting of LOXL2, RAI1, RNF19B, and MKKS.
  • the gene panel consists of the seventeen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, and LOXL2.
  • the gene panel consists of at least the eighteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, and RAI1.
  • the gene panel consists of the nineteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, and RNF19B.
  • the gene panel comprises no more than 20 genes.
  • the gene panel consists of the following 20 genes: SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
  • the gene panel may comprise other biomarkers associated with TNBC.
  • the inventors have also identified the following genes as being useful in the TNBC gene signatures of the invention: ITGB1, RBFOX2, DST, RCAN1, and c9orf3/AOPEP.
  • the gene panel may comprise one, two, three, four, or five of ITGB1, RBFOX2, DST, RCAN1, and c9orf3/AOPEP.
  • nucleic acid sequences for these genes are shown in the sequence listing with representative sequences for each of the genes as follows: ITGB1 (SEQ ID NO: 21), RBFOX2 (SEQ ID NO: 22), DST (SEQ ID NO: 23), RCAN1 (SEQ ID NO: 24), and c9orf3/AOPEP (SEQ ID NO: 25).
  • the expression level of genes of the gene panel may be identified using any suitable method. Expression of the genes may be measured directly by measurement of RNA expression via a microarray, Northern blotting, in situ RNA detection, RNA sequencing, PCR methods or any other suitable nucleic acid amplification technique.
  • the methods may involve any suitable primers and/or probes, the design of which is within the routine knowledge of the skilled person. Primer design tools, such as the NCBI Primer-BLAST tool may be used. Typically, primers and/or probes will be approximately 15-25 nucleotides in length.
  • next generation sequencing methods employing, for example, pyrosequencing, Ion Torrent semiconductor sequencing, Illumina sequencing methods, single-molecule real-time sequencing or DNA nanoball sequencing.
  • expression of the genes in the gene samples may be determined by in situ RNA detection using, for example, in situ hybridisation techniques to localise specific RNA sequences in a cell or section of tissue.
  • the expression levels of one or more of the genes of the gene panel may be determined by measurement of protein products of the genes in cells of a tissue sample.
  • Such methods of determining expression of the genes at the protein level may include commonly known techniques such as Western blot, immunocytochemistry, immunoprecipitation, mass spectrometry, ELISA, etc.
  • immunohistochemistry may be used to determine the level of proteins from particular genes of the gene panel in cells of a tissue sample of interest.
  • antibodies or aptamers directed to the protein of interest may be used to determine its level. Methods for generating such antibodies or aptamers are well known to the person skilled in the art.
  • the antibody or aptamers used in such methods may be conjugated to a label, the label forming part of a detection agent. Such methods are well known in the art.
  • the expression level of each member of the gene panel may be determined with reference to reference or control expression levels, e.g. expression levels from chemosensitive TNBC cells.
  • a reference level of a particular gene may be an absolute or relative amount or concentration of the gene, a presence or absence of the gene product, a range or amount of concentration of the gene product, a mean amount of the gene or gene product, and/or a median amount of or concentration of the gene or gene product.
  • a “reference level” can also be a “standard curve reference level” based on a level of one or more of the genes determined from a population and plotted on appropriate axes to produce a reference curve. The reference curve may be tailored to particular populations of subjects, with reference levels varying with, for example, age group.
  • a standard curve reference level may be determined from a group of reference levels from a group of subjects having a particular disease using statistical analysis, such as univariate or multivariate regression analysis, logistic regression analysis, linear regression analysis, etc. of the levels of such genes/biomarkers in samples from the group. Such reference levels may be adjusted to specific techniques used to measure levels of gene expression, where the gene expression levels may differ based on the specific technique that is used.
  • the sample may be defined as positive for the gene signature, i.e. the demonstration that at least 10 genes of the gene panel have increased expression relative to reference levels in control cells for the corresponding genes is indicative that the sample is from TNBC cells which are resistant to chemotherapy.
  • the presence of 11 or more, 12 or more, 13 or more, 14 or more, 15 or more, 16 or more, 17 or more, 18 or more, or 19 or more of the genes of the gene panel being expressed at an increased level compared to the reference levels in control cells is indicative of the sample being a sample of TNBC cells which are chemotherapy resistant.
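  • The counting rule described above (a sample is positive for the signature when at least 10 panel genes show increased expression relative to reference levels in control cells) can be sketched as follows. This is an illustrative sketch only: the function names and the expression values are hypothetical, and how "increased" is judged in practice (for example, with a fold-change or significance threshold) is not specified here, so a simple greater-than comparison is assumed.

```python
# The 20 panel genes named in the text.
PANEL = [
    "SLC11A2", "MSH3", "TSHZ2", "CTDSPL", "NDUFA6", "PTPRJ", "PPFIA2",
    "COL21A1", "EFEMP2", "CLCN3", "ESM1", "EGFR", "DTNA", "EPHB3",
    "GDAP2", "HOXA1", "LOXL2", "RAI1", "RNF19B", "MKKS",
]


def elevated_genes(sample, reference):
    """Return the panel genes whose sample level exceeds the reference level."""
    return [gene for gene in PANEL
            if gene in sample and gene in reference
            and sample[gene] > reference[gene]]


def is_signature_positive(sample, reference, min_elevated=10):
    """Apply the 'at least min_elevated genes increased' rule."""
    return len(elevated_genes(sample, reference)) >= min_elevated


# Hypothetical expression values, for illustration only.
reference_levels = {gene: 1.0 for gene in PANEL}
sample_levels = {gene: (2.0 if index < 10 else 0.5)
                 for index, gene in enumerate(PANEL)}

print(elevated_genes(sample_levels, reference_levels))
print(is_signature_positive(sample_levels, reference_levels))
```

  • Raising `min_elevated` to 11, 12 and so on gives the stricter variants listed above.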
  • the expression of each of the genes of the gene panel is combined to produce a compound gene signature score of total expression of all of the genes of the gene panel relative to a compound score of the total of the reference expression levels of these genes from a control.
  • each of the genes which may be used in the gene panels in the methods of the invention have been assessed for predictive power in relation to the overall gene signature, with the genes ranked in order of influence on the gene signature performance.
  • Each of the genes of the 20 gene panel (SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS) has thus been assigned a Predictive Score (i.e. Ranking Score) indicative of its influence on the performance of the 20 gene signature, with a higher value indicative of greater reduction in performance of the gene signature upon the exclusion of that gene from the gene signature.
  • the rank position and Predictive Score (Ranking Score) associated with each of the genes of the 20 gene panel is shown in Table 2.
  • the ranking of each gene in the gene panel may be taken into account in determining the overall compound expression score for the gene panel to determine the likelihood of resistance to chemotherapy in TNBC cells.
  • the compound expression score is calculated employing a weighting of the expression levels of the genes of the gene panel according to the ranking of the genes of the gene panel.
  • the ranking may be the rank position, in which a ranking of “1” is indicative that the gene has the most negative impact on performance when removed with rankings of higher numbers indicating less impact on performance when removed from the gene signature.
  • the genes of a gene signature of 20 genes may be ranked from 1 to 20, with the gene having the greatest positive effect on the predictive power of the gene signature having rank position 1, the gene with the second highest positive effect on the predictive power of the gene signature having rank position 2, etc.
  • any formula used to determine the compound expression score may take into account the Predictive Score (Ranking Score), such as the ranking scores as listed for the 20 gene signature in Table 2.
  • the higher the ranking score of a gene the greater the impact on performance on the gene signature when that gene is removed from the gene signature.
  • the ranking score and rank position may change depending on the total number of genes in a gene panel.
  • the combined expression level as appropriately weighted according to ranking score, may thus be used to determine the final gene signature score.
  • genes included in the gene signature may thus carry unequal weight in the determination of the likelihood of resistance to chemotherapy in TNBC cells. Accordingly, the weighting/rank of each gene in the score may be taken into account in the calculation of the overall gene signature score.
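  • One possible reading of a weighted compound score is sketched below: each gene's expression is multiplied by a weight derived from its ranking score, and the weighted total for the sample is expressed relative to the weighted total of the reference levels. The formula, the function name and all numeric values are assumptions made for illustration; the actual ranking scores are those of Table 2, which is not reproduced here.

```python
def weighted_compound_score(sample, reference, weights):
    """Weighted total sample expression relative to weighted total reference expression.

    Only genes present in all three mappings contribute to the score.
    """
    genes = [g for g in weights if g in sample and g in reference]
    if not genes:
        raise ValueError("no genes shared between sample, reference and weights")
    sample_total = sum(weights[g] * sample[g] for g in genes)
    reference_total = sum(weights[g] * reference[g] for g in genes)
    if reference_total == 0:
        raise ValueError("weighted reference total is zero")
    return sample_total / reference_total


# Hypothetical weights and expression levels, for illustration only.
example_weights = {"TSHZ2": 3.0, "MKKS": 1.0}
example_reference = {"TSHZ2": 1.0, "MKKS": 1.0}
example_sample = {"TSHZ2": 2.0, "MKKS": 1.0}

print(weighted_compound_score(example_sample, example_reference, example_weights))
```

  • With equal weights this reduces to the unweighted ratio of total expression to total reference expression; unequal weights let higher-ranked genes dominate the score.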
  • the gene signature score may be determined using any suitable weighting formula.
  • a suitable weighting formula may make use of regression analysis.
  • Any suitable predictive package may be used.
  • one such predictive package which may be used is the caret R package. In summary, it takes the absolute value of the final coefficients and ranks them based on their importance in the model.
  • the main function is varImp(), which is described as a generic method for calculating variable importance for objects produced by "train" and method-specific methods (https://www.rdocumentation.org/packages/caret/versions/6.0-90/topics/varImp). It can be applied to multiple models including linear, random forest and glmnet, where it calculates the importance of the outputs and ranks them. In the case of glmnet the outputs are coefficients.
  • the varImp function in the caret package may be used to rank the importance of genes of the gene signature.
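  • The ranking step summarised above (take the absolute value of each final model coefficient and order the genes by it) can be sketched in a few lines. This is not the caret implementation, only an illustration of the idea; the function name and the coefficient values are hypothetical, and any rescaling that a given package may apply to the importances is omitted.

```python
def rank_by_importance(coefficients):
    """Rank genes by absolute coefficient, largest first (ties broken by gene name).

    Returns a list of (rank position, gene, importance) tuples.
    """
    ordered = sorted(coefficients.items(),
                     key=lambda item: (-abs(item[1]), item[0]))
    return [(position, gene, abs(value))
            for position, (gene, value) in enumerate(ordered, start=1)]


# Hypothetical coefficients, for illustration only.
example_coefficients = {"GENE_A": -0.8, "GENE_B": 0.3, "GENE_C": 0.0}

for position, gene, importance in rank_by_importance(example_coefficients):
    print(position, gene, importance)
```

  • Note that the sign of a coefficient is discarded: a strongly negative coefficient ranks as highly as an equally strong positive one.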
  • the compound gene signature score may be compared to a reference value which may be derived from, for example, a training set of patient data in which, for example, the threshold is established to indicate a total score which is indicative of whether or not the cells of the sample are chemotherapy resistant.
  • the performance of a particular gene signature may be determined in any suitable way.
  • the performance is assessed using Area Under the Curve (AUC).
  • AUC refers to the area under the curve of a Receiver Operating Characteristic (ROC) curve.
  • AUC is typically measured as a value between 0 and 1 where an AUC of 0 is indicative that the model’s predictions are 100% wrong and an AUC of 1 is indicative that the model’s predictions are 100% correct.
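  • For illustration, the AUC of a ROC curve can be computed directly as the proportion of (resistant, sensitive) patient pairs in which the resistant patient receives the higher score, counting ties as half. The sketch below assumes label 1 denotes residual disease and label 0 denotes pCR; the function name and the example scores are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present to compute an AUC")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # correctly ordered pair
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical signature scores for four patients (1 = residual disease, 0 = pCR).
example_scores = [0.9, 0.4, 0.6, 0.2]
example_labels = [1, 1, 0, 0]
print(roc_auc(example_scores, example_labels))
```

  • In this example three of the four resistant/sensitive pairs are correctly ordered, so the AUC is 0.75, which sits exactly at the lower threshold mentioned for gene panels of the invention.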
  • the AUC associated with a gene panel of the invention is greater than 0.750.
  • the AUC associated with a gene panel of the invention is greater than 0.800, for example greater than 0.825, such as greater than 0.850, greater than 0.860, such as greater than 0.870, greater than 0.880 or greater than 0.890.
  • methods may be used to characterise whether or not a sample falls within the chemoresistant TNBC category other than by determining AUC.
  • methods which could be used include, for example, classification and regression trees, Random Forests, Multivariate Adaptive Regression Splines and Support vector machines.
  • the methods of the invention enable the determination as to whether a patient’s TNBC is likely to be resistant to a chemotherapy treatment.
  • the chemotherapy comprises one, two, three or more of the group consisting of anthracyclines (for example Adriamycin (doxorubicin)), alkylating agents (for example Cyclophosphamide), taxanes (for example taxol (Docetaxel)), or antimetabolites (such as fluorouracil (5-FU)).
  • the chemotherapy treatment is a combined chemotherapy treatment regimen which comprises or consists of an anthracycline and an alkylating agent.
  • the combined chemotherapy treatment regimen consists of Adriamycin (doxorubicin) and cyclophosphamide.
  • the combined chemotherapy treatment regimen consists of Adriamycin (doxorubicin), cyclophosphamide, and paclitaxel.
  • the chemotherapy treatment is a combined chemotherapy treatment regimen which comprises or consists of an anthracycline, an alkylating agent, and an antimetabolite.
  • the combined chemotherapy treatment regimen comprises or consists of 5-FU, Adriamycin, and Cyclophosphamide.
  • the combined chemotherapy treatment regimen consists of paclitaxel, 5-FU, Adriamycin, and Cyclophosphamide.
  • Figure 2: 20 genes show higher expression in TNBC subtype, have poor survival probability, and have high accuracy in predicting chemotherapy response.
  • G) ROC curve showing the prognostic capability following removal of five genes (AUC 0.678).
  • ROC curves showing AUC values associated with the removal of from 1 to 19 of the genes of the 20 gene panel (TSHZ2, SLC11A2, CTDSPL, NDUFA6, PTPRJ, MSH3, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS), with the ROC curve labelled -1 including the first 19 of the gene panel, i.e. excluding MKKS, the ROC curve labelled -2 including the first 18 of the gene panel, i.e. excluding RNF19B and MKKS, etc., with -19 thus only including TSHZ2.
  • ROC curves showing AUC values associated with the removal of from 6 to 10 of the genes of the 20 gene panel (TSHZ2, SLC11A2, CTDSPL, NDUFA6, PTPRJ, MSH3, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS), with the ROC curve labelled -Top Six for a panel of 14 genes excluding TSHZ2, SLC11A2, CTDSPL, NDUFA6, PTPRJ, and MSH3, the ROC curve labelled -Top Seven for a panel of 13 genes excluding TSHZ2, SLC11A2, CTDSPL, NDUFA6, PTPRJ, MSH3, and PPFIA2, the ROC curve labelled -Top Eight for a panel of 12 genes excluding TSHZ2, SLC11A2, CTDSPL, NDUFA6, PTPRJ, MSH3, PPFIA2,
  • scRNA-seq analysis was performed on the data set obtained from Kim et al. 6, consisting of matched pre- and post-chemotherapy (anthracycline and a taxane) samples from four responsive and four resistant patients, with a total of 6,862 cells.
  • the patient data files from the four responsive and four resistant patients were analysed using the same parameters in the R package "Seurat".
  • cells with feature counts greater than 2,500 or fewer than 200 were removed, as were cells with more than 5% mitochondrial reads.
  • downstream analysis including normalisation, variable feature selection, dimensionality reduction and UMAP clustering was performed.
  • Cluster annotation was performed using the Python program "SCSA" to identify associated cell types and cancer-related processes.
  • For the reproducibility of the gene set identified using scRNA-seq analysis, the inventors used five independent bulk RNA-seq datasets (GSE20271, GSE25055, GSE25065, GSE20194, and GSE50948) comprising 397 TNBC patients whose chemotherapy response (RD or pCR) was available. Patient samples were excluded if the therapeutic outcome (residual disease or pathological complete response) was unknown or if the sample was not classified as TNBC. The raw data were normalised, batch corrected, and log-transformed using the R package "affy" and the Python package "pyComBat". In total, data from 397 patients were selected for reproducibility analysis. The 281 genes identified in the scRNA-seq dataset were extracted from the normalised bulk RNA-seq count files. A custom R script was used to compare each gene's expression in patients with residual disease and pathological complete response.
  • Raw microarray expression (CEL) files of 307 TNBC patients were downloaded from Gene Expression Omnibus (GSE20271, GSE25055, GSE25065, and GSE20194). Gene expression profiles were quantile normalised and log2 transformed using "BART", followed by batch correction using "ComBat" from the R package "sva". To identify the most significant gene set, the GSE20271 and GSE25055 datasets with 177 TNBC patients (57 pathological complete response, 120 residual disease) were used as the testing cohort. To verify the strength of the gene set, the GSE25065 and GSE20194 datasets with 130 TNBC patients (46 pathological complete response, 84 residual disease) were used as the validation cohort.
  • To identify the significant gene set and develop a predictive model to discriminate pCR and RD groups, the inventors first applied Lasso and Elastic-Net Regularized Generalized Linear Models, using the R package "glmnet", to the 87 markers to identify the combination with the greatest predictive power. The inventors then used 10-fold cross-validation to evaluate the discrimination ability and obtain a relatively unbiased estimate. After the LASSO regression analysis, a predictive model based on 20 genes was used to fit a generalised linear model. The predictive capability was measured by the area under the curve (AUC) of the receiver operating characteristic curve (ROC curve) using the R package "pROC". The optimal model was selected by maximising AUC.
  • the R package “WGCNA” was used to perform the weighted correlation network analysis 17.
  • the gene co-expression similarity between genes in RD and pCR was defined.
  • a power function was then applied to correlate adjacency of genes.
  • Scale independence and mean connectivity were then tested using a gradient method (the power value ranging from 1 to 20). When the degree of independence was determined to be above 0.80 an appropriate power value was screened out to obtain a scale-free network.
  • the adjacency matrix was transformed into a topological overlap matrix, and modules were detected by hierarchical average linkage clustering analysis for the gene dendrogram. Additionally, the inventors extracted the corresponding gene information for each module for further analysis, in particular, turquoise (I) and yellow (J) modules.
  • ChIP-seq data for genome-wide binding patterns of STAT3 in TNBC cell lines and RNA-seq data after 96 hr knockdown of STAT3 or non-targeting siRNA were downloaded from GSE85579. Expression values for each gene identified in the turquoise (I) and yellow (J) modules were extracted for STAT3 knockdown and non-targeting control and plotted using the R package “ggplot2”. ChIP-seq bedgraph files of STAT3 KD and control were converted to bigwig format using Kent utils “bedgraph2bigwig” to enable viewing in the UCSC genome browser.
  • Due to the high degree of intra-tumour heterogeneity (ITH) associated with TNBC, scRNA-seq provides a higher level of resolution and enables the identification of distinct cell types and markers, which could be lost in bulk RNA-seq analysis.
  • For each patient, cells were clustered into distinct pre- and post-treatment clusters. Cluster annotation was performed, using SCSA, to identify cell types and revealed that the clusters were dominated by progenitor and luminal epithelial cells pre-treatment and by basal epithelial cells post-treatment. The chemoresistant patients showed a clear separation of clusters into pre- and post-treatment. Basal epithelial cell types, associated with metastasis and other TNBC processes, were present pre-treatment and remained post-treatment, indicating that treatment was not successful in eliminating these cells and that they potentially promote chemoresistance in these patients (Fig. 1B). For all chemoresponsive patients, the clustering showed a clear separation of pre- and post-treatment timepoints.
  • the pre-treatment cluster contained progenitor cells and luminal epithelial cells, which were removed following NAC. Additionally, the basal epithelial cell type, which is associated with metastasis and other TNBC processes, was either not present or was removed in the post-treatment clusters, indicating that these cells may confer chemoresistance (Fig. 1B).
  • Differential gene expression was performed for each patient to identify markers specific to the chemoresistant patients that could be identified across pre and post samples. First, the list of differentially expressed genes for each treatment timepoint for all patients was obtained. Next, a distinct set of genes defining each population was compared between responsive and resistant groups to identify markers specific to chemoresistant patients.
  • ROC Plotter, which links gene expression and therapy response using transcriptome-level data of 3,104 breast cancer patients, identified 16 genes with a high AUC ( ⁇ 0.6) in patients with residual disease compared to pathological complete response following chemotherapy in the clinical trial setting. Based on this score, the 16 genes were identified as having a high capability of being a top biomarker in predicting chemotherapy response (see Fig. 1D, which shows results for two of these, LOXL2 and DTNA). Interestingly, Lysyl Oxidase Like 2 (LOXL2) was identified to be overlapping between pre- and post-treatment in the resistant patients.
  • LOXL2 is a member of the Lysyl Oxidase (LOX) family, which has previously been associated with tumour aggressiveness through regulation of cell adhesion, motility and invasion. Additionally, inhibition of LOX and its paralogues has been shown to overcome chemotherapy resistance in TNBC 18. Dystrobrevin Alpha (DTNA) was also shown to have high prediction capabilities and to date has not been associated with chemotherapy resistance in TNBC or any other breast cancer subtype.
  • the inventors used gene expression of the 87 genes in 307 TNBC patients from four microarray-based breast cancer studies. Out of 307 patients, 183 from two datasets GSE25055 and GSE25065 were used to construct the classifier, and 130 patients from GSE20194 and GSE20271 were used for external validation.
  • Lasso and Elastic-Net Regularized Generalized Linear Models were applied, using the R package "glmnet", to screen the 87 chemoresistant markers and identify the top combination of genes that could accurately predict therapy response in TNBC patients.
  • the inventors used the 10-fold cross-validation method to evaluate the discrimination ability between pCR and RD to obtain a relatively unbiased estimate (Fig. 2A, B).
  • Achieving a pathological complete response to neoadjuvant chemotherapy is a crucial predictor of a TNBC patient's long-term outcomes and can allow an early evaluation of the effectiveness of systemic therapy.
  • Identification of biomarkers that can predict TNBC NAC response provides physicians with an opportunity to offer alternative treatments.
  • For oestrogen receptor or HER2 positive tumours several molecular tests can be used to guide therapeutic options, however, there are currently no tests in clinical use for TNBC patients, and there is an unmet need for the development of accurate NAC response predictors to aid in their clinical management.
  • For TNBC specifically, a number of predictive panels 22-26 have been published, but none achieved the high predictive accuracy obtained by panels used in oestrogen receptor and HER2 positive tumours, due to small sample sizes and a lack of validation data.
  • Figure 4C shows that even when the top 10 genes are removed from the 20 gene panel, the reduced gene panel of 10 genes still performs well with an AUC of 0.76.
  • Figure 4B demonstrates that when the top 5 genes of the gene panel are excluded, the reduced gene panel of 15 genes still performs very well with an AUC of 0.827.
  • WGCNA weighted gene co-expression network analysis
  • RNA-seq and ChIP-seq data were downloaded from GSE85579, a study in which STAT3 knockdown (KD) was performed and which revealed a role of STAT3 in regulating invasion and metastasis.
  • RNA-seq showed that KD of STAT3 altered the expression of the genes of interest, including upregulation of EGFR and TSHZ2, two genes which regulate STAT3 signalling, upregulation of DTNA and downregulation of the remaining five genes (Fig. 3F).
  • DTNA has recently been shown to play a role in HBV-induced hepatocellular carcinoma by activating STAT3 27, and here it was shown to have higher expression following STAT3 KD.
  • Neoadjuvant chemotherapy is used frequently in the treatment of TNBC patients due to the lack of targeted therapeutics and its ability to reduce tumour size, improve surgical outcomes and increase survival in responders.
  • NAC Neoadjuvant chemotherapy
  • Due to the ITH associated with TNBC, patients have differing responses to NAC 35. Achieving pCR is associated with significantly improved survival outcomes in TNBC patients 36. Identifying those patients who will have RD following NAC will enable physicians to determine the best therapeutic option at the beginning of treatment, rather than waiting for NAC treatment results, to increase the chances of achieving pCR.
  • Numerous efforts have been put into developing predictive signatures in TNBC, but currently, there is no clinically recommended predictive biomarker panel for NAC response 7-9.
  • the inventors have developed a predictive model which has outstandingly high accuracy in defining chemotherapy response in TNBC patients.
  • the inventors' predictive model has been shown to outperform existing panels for ER-positive breast cancers, having a higher AUC and a lower number of genes.
  • a low number of genes is desirable in biomarker panel development because it makes the panel more cost-effective both to produce and for the end user.
  • the inventors have revealed a potential role of STAT3 signalling in driving TNBC chemoresistance.
  • the inventors identified two modules where genes were overlapping and significantly co-expressing between RD and pCR patients.
  • the genes involved have all previously been reported to play a role in STAT3 signalling, either as activators or downstream targets.
  • the inventors were able to provide insights into the potential role of these genes in STAT3 signalling and how they are driving chemoresistance in TNBC patients.
  • the inventors have developed a 20 gene predictive model derived from scRNA-seq TNBC data, which accurately predicts chemotherapy response in TNBC patients.
  • the inventors also revealed a potential role of STAT3 signalling driving chemoresistance in TNBC patients.
  • the inventors have presented a framework for identifying predictive biomarkers from the single-cell level and the development of a predictive model which could be applied in other cancer types.
  • Hu, Z.-G. et al. DTNA promotes HBV-induced hepatocellular carcinoma progression by activating STAT3 and regulating TGFβ1 and P53 signaling. Life Sci. 258, 118029 (2020).
  • Peng, L. et al. Secreted LOXL2 is a novel therapeutic target that promotes gastric cancer metastasis via the Src/FAK pathway. Carcinogenesis 30, 1660-1669 (2009).
  • Zhu, H. et al. NPM-ALK up-regulates iNOS expression through a STAT3/microRNA-26a-dependent mechanism. J. Pathol. 230, 82-94 (2013).

Abstract

The present invention relates to a method of predicting resistance to chemotherapy in a subject having triple negative breast cancer (TNBC), wherein said method comprises: (a) providing a biological sample from said subject; and (b) determining the expression levels of each member of a gene panel, said gene panel comprising at least 10 genes selected from the group consisting of SLC11A2, TSHZ2, CTDSPL, NDUFA6, PTPRJ, MSH3, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS in said biological sample; wherein the determined expression levels of said genes of said gene panel is used to determine the likelihood of resistance to said chemotherapy.

Description

METHOD OF PROGNOSIS
Field Of The Invention
The present invention relates to methods of characterizing breast cancer. In particular, the invention relates to methods of predicting resistance to chemotherapy in subjects with triple negative breast cancer.
Background to the Invention
Breast cancer is the most commonly diagnosed cancer and the second leading cause of mortality in women worldwide 1. Breast cancer patients have historically been characterised based on hormone receptor status into one of four subtypes. 20% of breast cancer patients are characterised as triple-negative breast cancer (TNBC), lacking oestrogen, progesterone and HER2 receptors 2. TNBC is the most aggressive breast cancer subtype. It is associated with a poorer clinical outcome due to a lack of early prognostic techniques, high incidences of relapse, metastasis and a lack of targeted therapeutics. In the neoadjuvant setting, chemotherapy is the standard treatment, which includes a combination of taxanes and anthracyclines. However, approximately 30%-50% of patients develop resistance, and their prognosis worsens to 13-15 months survival 3-5.
Currently, there are panels for ER and HER2 positive patients which can effectively guide their treatment outcomes. TNBC is a disease associated with extensive intratumor heterogeneity (ITH), which plays a crucial role in driving chemoresistance 6. However, although there has been some work on chemoresistance in TNBC patients 7-9, the genomic and molecular basis of chemoresistance in TNBC patients remains poorly understood and there are no commercially available biomarker panels that can predict chemotherapy response in TNBC patients. Currently available signature gene panels for predicting chemotherapy response such as Oncotype DX (21 genes, RT-PCR) 2426, Endopredict (12 genes, RT-PCR) 27, and PROSIGNA (50 genes, RT-PCR) 28 are designed for ER-positive breast cancer and are not effective in the prediction of resistance to chemotherapy in the treatment of TNBC. Increasing evidence suggests that there may exist similar gene signatures for TNBC 29-35. For example, while one study showed a relationship of proliferation and immune-related genes, another one showed stroma-related genes to be predictive of neoadjuvant chemotherapy (NAC) response in TNBC patients. Previous efforts to identify chemoresistance genes have failed to capture those expressed in small subpopulations associated with chemoresistance in TNBC. None of these signatures reach a high prediction accuracy, limiting their clinical utility. Unfortunately, no targeted therapies for TNBC are available, except PARP inhibitors in germline BRCA1/2-mutated tumours, and therefore chemotherapy is the only treatment option for most TNBC patients. Thus, there is an unmet need to accurately predict TNBC patients' chemotherapy response to enable clinicians to accurately stratify patients and provide alternative treatment options.
Summary Of The Invention
In the present study, the inventors performed an integrated analysis of expression profiles derived from scRNA-seq of TNBC tumors treated with chemotherapy and identified subpopulations of cells that associate with key aspects of TNBC aggressiveness including chemoresistance and metastasis. The inventors show that the identified signature genes can accurately predict response to NAC in primary as well as advanced stage TNBC (lymph-node positive), with the prediction accuracy for TNBC greatly improved over that of known gene expression signatures.
Accordingly in a first aspect, the present invention provides a method of predicting resistance to chemotherapy in a subject having triple negative breast cancer (TNBC), wherein said method comprises: a) providing a biological sample from said subject; and b) determining the expression levels of each member of a gene panel in said biological sample, said gene panel comprising at least 10 genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS; wherein the determined expression levels of said genes of said gene panel is used to determine the likelihood of resistance to said chemotherapy.
In a particular embodiment of the invention, the gene panel comprises the following 20 genes: SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
In a particular embodiment of the invention, the gene panel consists of the following 20 genes: SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
As described in the examples, the inventors have shown that such a gene panel (gene signature) can predict response to chemotherapy in TNBC significantly more accurately than known gene expression signatures for TNBC. Moreover, the inventors have further shown that smaller gene panels comprising fewer of the 20 genes still outperform known gene signatures. For example, as shown in the Examples, even when the ten genes having greatest effect on the performance of the gene panel are not included, the overall performance of the gene signature is still greater than that of known gene signatures. Thus, the invention extends to gene panels which do not include all 20 of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
Accordingly, in another embodiment of the invention, said gene panel comprises at least 10 genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, and GDAP2. In another embodiment of the invention, said gene panel comprises at least the 10 genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, and CLCN3.
In another embodiment of the invention, said gene panel comprises at least 15 genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
In another embodiment of the invention, said gene panel comprises at least the fifteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, and GDAP2.
In certain embodiments of the invention, said gene panel comprises one, two, three, four, or five of SLC11A2, TSHZ2, CTDSPL, NDUFA6, and PTPRJ.
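The "comprises at least N genes selected from the group" conditions in the embodiments above can be checked mechanically. The following is a minimal Python sketch for illustration only, not part of the described method; the names `PANEL_20`, `panel_genes_present` and `comprises_at_least` are hypothetical, and the gene symbols follow the spelling used in the abstract.

```python
# The 20 genes of the panel, spelled as in the abstract.
PANEL_20 = frozenset({
    "SLC11A2", "MSH3", "TSHZ2", "CTDSPL", "NDUFA6", "PTPRJ", "PPFIA2",
    "COL21A1", "EFEMP2", "CLCN3", "ESM1", "EGFR", "DTNA", "EPHB3",
    "GDAP2", "HOXA1", "LOXL2", "RAI1", "RNF19B", "MKKS",
})


def panel_genes_present(candidate, group=PANEL_20):
    """Return, sorted, the candidate genes that belong to the group."""
    return sorted(set(candidate) & set(group))


def comprises_at_least(candidate, minimum=10, group=PANEL_20):
    """True if the candidate panel contains at least `minimum` group genes.

    "Comprises" is open-ended, so genes outside the group are permitted
    and simply do not count towards the minimum.
    """
    return len(panel_genes_present(candidate, group)) >= minimum
```

For instance, a candidate panel holding ten of the listed genes plus an unrelated gene would satisfy `comprises_at_least(candidate, minimum=10)`, whereas a panel holding only two of the listed genes would not.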
The likelihood of resistance to chemotherapy may be determined by comparing the expression levels of said genes of the gene panel relative to reference amounts of said genes. Said reference amounts may be reference amounts in control cells. Such control cells may be chemosensitive TNBC cells, for example chemosensitive TNBC cells from patients shown to have pathological complete response (pCR) to said chemotherapy.
In a preferred embodiment of the invention, each gene of the gene panel has significantly higher expression in TNBC cells from patients with residual disease vs TNBC cells from patients with pCR (e.g. the difference having a p value of less than 0.05).
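One simple way to compare a sample's expression levels with reference amounts is to express each gene as a z-score against values measured in control cells. The sketch below is an illustrative assumption only: the text does not prescribe a statistic or cut-off for this comparison, so the z-score, the default threshold of two standard deviations, and the function names are all hypothetical.

```python
from statistics import mean, stdev


def z_scores_vs_reference(sample, reference):
    """Z-score of each gene in `sample` against reference (control) values.

    sample:    {gene: expression level in the patient sample}
    reference: {gene: list of expression levels in control cells}
    """
    scores = {}
    for gene, level in sample.items():
        ref = reference[gene]
        spread = stdev(ref)  # sample standard deviation; needs >= 2 values
        if spread == 0:
            raise ValueError(f"reference values for {gene} have no variance")
        scores[gene] = (level - mean(ref)) / spread
    return scores


def elevated_genes(sample, reference, threshold=2.0):
    """Genes whose level exceeds the reference mean by more than
    `threshold` reference standard deviations (hypothetical cut-off)."""
    z = z_scores_vs_reference(sample, reference)
    return sorted(gene for gene, score in z.items() if score > threshold)
```

A group-level comparison such as the p < 0.05 criterion mentioned above would need a proper two-sample test across patients, which this per-sample sketch does not attempt.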
In embodiments of the invention, the likelihood of resistance to said chemotherapy may be determined by applying the expression levels to a predictive model which relates expression levels of said genes of the gene panel with resistance to chemotherapy against triple negative breast cancer.
Applying the expression levels to such a predictive model may comprise weighting the expression levels of said genes of the gene panel according to a predetermined ranking of said genes of the gene panel. In one embodiment, the weighting of the expression levels of the genes may be determined by performing LASSO regression.
In one such embodiment, where the gene panel comprises all 20 genes of the gene panel of the first aspect of the invention, the genes may be ranked in order of greatest to least predictive power, for example as shown in Table 2.
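Applying weighted expression levels to a predictive model can be sketched as a weighted sum passed through a logistic function. This is a hedged illustration under stated assumptions: the logistic form is assumed here (the text says only that a generalised linear model is fitted after LASSO regression), and the function name, the weights and the intercept are hypothetical placeholders, not coefficients from the source.

```python
import math


def resistance_probability(expression, weights, intercept=0.0):
    """Probability-like score from weighted expression levels.

    expression: {gene: normalised expression level}
    weights:    {gene: coefficient}, e.g. as would be obtained from a
                LASSO fit; the values are supplied by the caller
    """
    missing = set(weights) - set(expression)
    if missing:
        raise KeyError(f"no expression value for: {sorted(missing)}")
    linear = intercept + sum(w * expression[g] for g, w in weights.items())
    # Numerically stable logistic function for large |linear|.
    if linear >= 0:
        return 1.0 / (1.0 + math.exp(-linear))
    e = math.exp(linear)
    return e / (1.0 + e)
```

Genes given a weight of zero drop out of the sum, which is how a LASSO-style fit reduces a larger marker set to a smaller panel.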
Detailed Description
Unless otherwise defined, all technical and scientific terms used herein have the meaning commonly understood by a person who is skilled in the art in the field of the present invention.
Throughout the specification, unless the context demands otherwise, the terms “comprise” or “include”, or variations such as “comprises” or “comprising”, “includes” or “including” will be understood to imply the inclusion of a stated integer or group of integers, but not the exclusion of any other integer or group of integers.
As used herein, terms such as "a", "an" and "the" include singular and plural referents unless the context clearly demands otherwise. Thus, for example, reference to "an active agent" or "a pharmacologically active agent" includes a single active agent as well as two or more different active agents in combination, while references to "a carrier" includes mixtures of two or more carriers as well as a single carrier, and the like.
The present invention is based on the identification of a specific gene signature which the inventors have shown can be used to predict with high accuracy resistance against chemotherapy in TNBC patients. Here, by profiling matched longitudinal single-cell RNA-sequencing data (scRNA-seq) of chemoresponsive and chemoresistant TNBC patients, the inventors identified 281 markers that define the subpopulations associated with chemoresistance. Using Lasso and Elastic-Net Regularized Generalized Linear Models, the inventors developed a predictive model of 20 genes, which showed a high capability in discriminating between residual disease and pathological complete response in a total of 371 TNBC patients from four independent studies.
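Discrimination between residual disease and pathological complete response is reported in this document as the area under the ROC curve (AUC). As a minimal sketch of what that number means, the AUC equals the probability that a randomly chosen residual-disease case receives a higher score than a randomly chosen pCR case. The function below is a hypothetical pure-Python illustration of that definition, not the "pROC" implementation used by the inventors.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (RD, pCR) pairs ranked correctly.

    scores: predicted scores (higher = more likely residual disease)
    labels: 1 for residual disease (RD), 0 for pCR
    Tied scores count as half a correctly ranked pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one RD case and one pCR case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.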
Genes which may be used in the gene signatures of the invention comprise the following 20 genes: MSH3, TSHZ2, SLC11A2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS. Examples of nucleic acid sequences are shown in the sequence listing with sequences for each of the genes as follows: MSH3 (SEQ ID NO:1), TSHZ2 (SEQ ID NO:2), SLC11A2 (SEQ ID NO:3), CTDSPL (SEQ ID NO:4), NDUFA6 (SEQ ID NO:5), PTPRJ (SEQ ID NO:6), PPFIA2 (SEQ ID NO:7), COL21A1 (SEQ ID NO:8), EFEMP2 (SEQ ID NO:9), CLCN3 (SEQ ID NO:10), ESM1 (SEQ ID NO:11), EGFR (SEQ ID NO:12), DTNA (SEQ ID NO:13), EPHB3 (SEQ ID NO:14), GDAP2 (SEQ ID NO:15), HOXA1 (SEQ ID NO:16), LOXL2 (SEQ ID NO:17), RAI1 (SEQ ID NO:18), RNF19B (SEQ ID NO:19), and MKKS (SEQ ID NO:20).
In one embodiment, genes for use in the gene signatures of the invention are shown in Figure 4. Examples of suitable Affymetrix probe sequences which may be used to identify target sequences of the genes are listed in Table 1.
Table 1
[Table 1: Affymetrix probe sequences for the genes of the gene panel (reproduced as images in the original publication)]
It should be understood that for any particular gene of the gene signature, a number of target sequences may be used. The skilled person will readily be able to identify suitable target sequences for each gene and likewise will readily be able to design suitable probes/primers based on the sequences of the genes and/or of individual target sequences. Accordingly, probes recited for each gene of the gene panel should be considered as non-limiting examples.
In some embodiments of the invention, the gene panel may comprise SLC11A2. In some embodiments of the invention, the gene panel may comprise PPFIA2. In some embodiments of the invention, the gene panel may comprise DTNA. In some embodiments of the invention, the gene panel may comprise HOXA1. In some embodiments of the invention, the gene panel may comprise RAI1. In some embodiments of the invention, the gene panel may comprise MKKS.
In one embodiment, the gene panel comprises SLC11A2 and at least 9, for example, 10, 11, 12, 13, 14, 15, 16, 17, 18, or all 19 of the genes selected from the group consisting of MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
In one embodiment, the gene panel comprises PPFIA2 and at least 9, for example, 10, 11, 12, 13, 14, 15, 16, 17, 18, or all 19 of the genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
In one embodiment, the gene panel comprises DTNA and at least 9, for example, 10, 11, 12, 13, 14, 15, 16, 17, 18, or all 19 of the genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
In one embodiment, the gene panel comprises HOXA1 and at least 9, for example, 10, 11, 12, 13, 14, 15, 16, 17, 18, or all 19 of the genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, LOXL2, RAI1, RNF19B, and MKKS.
In one embodiment, the gene panel comprises MKKS and at least 9, for example, 10, 11, 12, 13, 14, 15, 16, 17, 18, or all 19 of the genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, and RNF19B.
In one embodiment of the invention, the gene panel comprises at least two, three, four, five or all 6 genes selected from the group consisting of: SLC11A2, PPFIA2, DTNA, HOXA1, RAI1, and MKKS. Thus in one embodiment, the gene panel may comprise SLC11A2 and PPFIA2. In another embodiment, the gene panel may comprise SLC11A2 and DTNA. In another embodiment, the gene panel may comprise PPFIA2 and DTNA. In one embodiment of the invention, the gene panel comprises SLC11A2, PPFIA2 and DTNA. In one embodiment of the invention, the gene panel comprises SLC11A2, PPFIA2, DTNA and HOXA1. In one embodiment of the invention, the gene panel comprises SLC11A2, PPFIA2, DTNA, HOXA1 and RAI1. In one embodiment of the invention, the gene panel comprises SLC11A2, PPFIA2, DTNA, HOXA1, RAI1, and MKKS.
In one embodiment, the gene panel comprises two, three, four, five, or six of the genes selected from SLC11A2, PPFIA2, DTNA, HOXA1, RAI1, and MKKS and at least 4, for example, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14, of the genes selected from the group consisting of MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, EPHB3, GDAP2, LOXL2, and RNF19B.
In some embodiments of the invention, the gene panel may comprise one, two, three, four, or five of MSH3, TSHZ2, CTDSPL, NDUFA6, and PTPRJ. In some such embodiments, the gene panel may comprise one, two, three, four, or five of MSH3, TSHZ2, CTDSPL, NDUFA6, and PTPRJ together with one, two, three, four, five, or six of the genes selected from SLC11A2, PPFIA2, DTNA, HOXA1, RAI1, and MKKS.
In embodiments of the invention, the gene panel may comprise at least 12, 13, 14, 15, 16, 17, 18, or 19 genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
In one embodiment of the invention, the gene panel comprises (i) at least the twelve genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, and EGFR; and optionally (ii) one or more of the genes of the group consisting of DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
In one embodiment of the invention, the gene panel comprises (i) at least the thirteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, and DTNA; and optionally (ii) one or more of the genes of the group consisting of EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
In one embodiment of the invention, the gene panel comprises (i) at least the fourteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, and EPHB3; and optionally (ii) one or more of the genes of the group consisting of GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
In one embodiment of the invention, the gene panel comprises (i) at least the fifteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, and GDAP2; and optionally (ii) one or more of the genes of the group consisting of HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
In one embodiment of the invention, the gene panel comprises (i) at least the sixteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, and HOXA1; and optionally (ii) one or more of the genes of the group consisting of LOXL2, RAI1, RNF19B, and MKKS.
In one embodiment of the invention, the gene panel comprises (i) at least the seventeen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, and LOXL2; and optionally (ii) one or more of the genes of the group consisting of RAI1, RNF19B, and MKKS.
In one embodiment of the invention, the gene panel comprises (i) at least the eighteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, and RAI1; and optionally (ii) one or more of the genes of the group consisting of RNF19B and MKKS.
In one embodiment of the invention, the gene panel comprises at least the nineteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, and RNF19B, and optionally MKKS.
In a particular embodiment of the present invention, the gene panel comprises the twenty genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
In one embodiment of the invention, the gene panel is limited to genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS, i.e., the genes of which expression is determined are limited to only ten or more of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
In one embodiment of the invention, the gene panel consists of the ten genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, and CLCN3.
In one embodiment of the invention, the gene panel consists of the eleven genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, and ESM1.
In one embodiment of the invention, the gene panel consists of the twelve genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, and EGFR.
In one embodiment of the invention, the gene panel consists of the thirteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, and DTNA.
In one embodiment of the invention, the gene panel consists of the fourteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, and EPHB3.
In one embodiment of the invention, the gene panel consists of the fifteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, and GDAP2.
In one embodiment of the invention, the gene panel comprises (i) at least the sixteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, and HOXA1; and optionally (ii) one or more of the genes of the group consisting of LOXL2, RAI1, RNF19B, and MKKS.
In one embodiment of the invention, the gene panel consists of the seventeen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, and LOXL2.
In one embodiment of the invention, the gene panel consists of at least the eighteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, and RAI1.
In one embodiment of the invention, the gene panel consists of the nineteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, and RNF19B.
In one embodiment, the gene panel comprises no more than 20 genes. In a particular embodiment of the invention, the gene panel consists of the following 20 genes: SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
In alternative embodiments, the gene panel may comprise other biomarkers associated with TNBC.
For example, the inventors have also identified the following genes as being useful in the TNBC gene signatures of the invention: ITGB1, RBFOX2, DST, RCAN1, and c9orf3/AOPEP. Accordingly, in embodiments of the invention, the gene panel may comprise one, two, three, four, or five of ITGB1, RBFOX2, DST, RCAN1, and c9orf3/AOPEP. Examples of nucleic acid sequences for these genes are shown in the sequence listing with representative sequences for each of the genes as follows: ITGB1 (SEQ ID NO: 21), RBFOX2 (SEQ ID NO: 22), DST (SEQ ID NO: 23), RCAN1 (SEQ ID NO: 24), and c9orf3/AOPEP (SEQ ID NO: 25).
Determining Expression Levels
In the methods of the invention, the expression level of genes of the gene panel may be identified using any suitable method. Expression of the genes may be measured directly by measurement of RNA expression via a microarray, Northern blotting, in situ RNA detection, RNA sequencing, PCR methods or any other suitable nucleic acid amplification technique. The methods may involve any suitable primers and/or probes, the design of which is within the routine knowledge of the skilled person. Primer design tools, such as the NCBI Primer-BLAST tool may be used. Typically, primers and/or probes will be approximately 15-25 nucleotides in length. Other methods used to determine gene expression include next generation sequencing methods employing, for example, pyrosequencing, Ion Torrent semiconductor sequencing, Illumina sequencing methods, single-molecule real-time sequencing or DNA nanoball sequencing. Alternatively, expression of the genes in the gene samples may be determined by in situ RNA detection using, for example, in situ hybridisation techniques to localise specific RNA sequences in a cell or section of tissue.
In addition to methods of determining expression of the genes of the gene panel which rely on measurement of RNA directly, the expression levels of one or more of the genes of the gene panel may be determined by measurement of protein products of the genes in cells of a tissue sample. Such methods of determining expression of the genes at the protein level may include commonly known techniques such as Western blot, immunocytochemistry, immunoprecipitation, mass spectrometry, ELISA, etc. In one option, immunohistochemistry may be used to determine the level of proteins from particular genes of the gene panel in cells of a tissue sample of interest. In such methods, antibodies or aptamers directed to the protein of interest may be used to determine its level. Methods for generating such antibodies or aptamers are well known to the person skilled in the art. The antibody or aptamers used in such methods may be conjugated to a label, the label forming part of a detection agent. Such methods are well known in the art.
It is to be understood that, while in many cases it may be preferable for ease of operation to determine the expression level of each of the genes in a gene panel using the same technique, it is within the scope of the invention to use a number of different methods to determine expression levels of genes of the gene panel.
Reference Levels
In the method of the invention, the expression level of each member of the gene panel may be determined with reference to reference or control expression levels, e.g. expression levels from chemosensitive TNBC cells.
A reference level of a particular gene may be an absolute or relative amount or concentration of the gene, a presence or absence of the gene product, a range of amount or concentration of the gene product, a mean amount of the gene or gene product, and/or a median amount or concentration of the gene or gene product. A “reference level” can also be a “standard curve reference level” based on a level of one or more of the genes determined from a population and plotted on appropriate axes to produce a reference curve. The reference curve may be tailored to particular populations of subjects, with reference levels varying with, for example, age group. A standard curve reference level may be determined from a group of reference levels from a group of subjects having a particular disease using statistical analysis, such as univariate or multivariate regression analysis, logistic regression analysis, linear regression analysis, etc. of the levels of such genes/biomarkers in samples from the group. Such reference levels may be adjusted to specific techniques used to measure levels of gene expression, where the gene expression levels may differ based on the specific technique that is used.
In one embodiment of the invention, where the expression level of 10 or more genes of the gene panel is increased relative to the predetermined reference level for these genes, the sample may be defined as positive for the gene signature, i.e. the demonstration that at least 10 genes of the gene panel have increased expression relative to reference levels in control cells for the corresponding genes is indicative that the sample is from TNBC cells which are resistant to chemotherapy. In another embodiment, the presence of 11 or more, 12 or more, 13 or more, 14 or more, 15 or more, 16 or more, 17 or more, 18 or more, or 19 or more of the genes of the gene panel being expressed at an increased level compared to the reference levels in control cells is indicative of the sample being a sample of TNBC cells which are chemotherapy resistant.
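The count-based criterion described above can be illustrated with a minimal Python sketch. It is not taken from the specification: the function names, the dictionary layout, and all expression and reference values are hypothetical, and a simple "greater than reference" comparison is assumed to stand in for "increased expression".

```python
# Minimal sketch (assumptions: hypothetical expression values, and
# "increased" is taken to mean strictly greater than the reference level).
PANEL = [
    "SLC11A2", "MSH3", "TSHZ2", "CTDSPL", "NDUFA6", "PTPRJ", "PPFIA2",
    "COL21A1", "EFEMP2", "CLCN3", "ESM1", "EGFR", "DTNA", "EPHB3",
    "GDAP2", "HOXA1", "LOXL2", "RAI1", "RNF19B", "MKKS",
]


def count_increased(sample, reference):
    """Count panel genes whose sample level exceeds the reference level."""
    return sum(
        1 for gene, ref_level in reference.items()
        if gene in sample and sample[gene] > ref_level
    )


def is_signature_positive(sample, reference, min_genes=10):
    """Sample is positive when at least `min_genes` genes are increased."""
    return count_increased(sample, reference) >= min_genes


# Hypothetical data: the first 12 panel genes are raised, the rest are not.
reference = {gene: 1.0 for gene in PANEL}
sample = {gene: 1.5 for gene in PANEL[:12]}
sample.update({gene: 0.8 for gene in PANEL[12:]})

n_increased = count_increased(sample, reference)
positive = is_signature_positive(sample, reference)
```

Raising `min_genes` to 11, 12, and so on corresponds to the stricter embodiments listed in the same paragraph.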
In a simple version of the method of the invention, the expression of each of the genes of the gene panel is combined to produce a compound gene signature score of total expression of all of the genes of the gene panel relative to a compound score of the total of the reference expression levels of these genes from a control. Thus, it is possible that, while some genes of the gene panel do not demonstrate increased expression, sufficient numbers of the genes of the gene panel demonstrate increased expression such that the compound gene signature score reaches a threshold level which is indicative of resistance to chemotherapy.
As described in the examples, each of the genes which may be used in the gene panels in the methods of the invention has been assessed for predictive power in relation to the overall gene signature, with the genes ranked in order of influence on the gene signature performance. Each of the genes of the 20 gene panel (SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS) has thus been assigned a Predictive Score (i.e. Ranking Score) indicative of its influence on the performance of the 20 gene signature, with a higher value indicative of greater reduction in performance of the gene signature upon the exclusion of that gene from the gene signature. The rank position and Predictive Score (Ranking Score) associated with each of the genes of the 20 gene panel are shown in Table 2.
Table 2
Rank Gene Score
1 MSH3 0.55837077
2 TSHZ2 0.45621663
3 SLC11A2 0.40561056
4 CTDSPL 0.36240989
5 NDUFA6 0.35041027
6 PTPRJ 0.33073842
7 PPFIA2 0.27207718
8 COL21A1 0.26561734
9 EFEMP2 0.26305276
10 CLCN3 0.25778845
11 ESM1 0.24978852
12 EGFR 0.24651416
13 DTNA 0.21853149
14 EPHB3 0.20481655
15 GDAP2 0.20161485
16 HOXA1 0.19550178
17 LOXL2 0.18104556
18 RAI1 0.17973036
19 RNF19B 0.03268052
20 MKKS 0.01919981
The ranking of each gene in the gene panel may be taken into account in determining the overall compound expression score for the gene panel to determine the likelihood of resistance to chemotherapy in TNBC cells.
Accordingly, in one embodiment of the invention, in the determining of the likelihood of resistance to chemotherapy against triple negative breast cancer, the compound expression score is calculated employing a weighting of the expression levels of the genes of the gene panel according to the ranking of the genes of the gene panel. The ranking may be the rank position, in which a ranking of “1” is indicative that the gene has the most negative impact on performance when removed with rankings of higher numbers indicating less impact on performance when removed from the gene signature. Thus, the genes of a gene signature of 20 genes may be ranked from 1 to 20, with the gene having the greatest positive effect on the predictive power of the gene signature having rank position 1, the gene with the second highest positive effect on the predictive power of the gene signature having rank position 2, etc. Alternatively, any formula used to determine the compound expression score may take into account the Predictive Score (Ranking Score), such as the ranking scores as listed for the 20 gene signature in Table 2. In such case, the higher the ranking score of a gene, the greater the impact on performance on the gene signature when that gene is removed from the gene signature. The ranking score and rank position may change depending on the total number of genes in a gene panel.
The combined expression level, as appropriately weighted according to ranking score, may thus be used to determine the final gene signature score.
The skilled person will appreciate that the genes included in the gene signature may thus carry unequal weight in the determination of the likelihood of resistance to chemotherapy in TNBC cells. Accordingly, the weighting/rank of each gene in the score may be taken into account in the calculation of the overall gene signature score.
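One way to picture such a weighted compound score is the Python sketch below. The specification does not fix a formula, so the weighted sum used here is an assumption, as are the function names and the comparison against a reference score; only the ranking scores themselves are copied from Table 2.

```python
# Minimal sketch (assumption: the compound score is a weighted sum of
# per-gene expression values, with Table 2 ranking scores as weights).
RANKING_SCORES = {
    "MSH3": 0.55837077, "TSHZ2": 0.45621663, "SLC11A2": 0.40561056,
    "CTDSPL": 0.36240989, "NDUFA6": 0.35041027, "PTPRJ": 0.33073842,
    "PPFIA2": 0.27207718, "COL21A1": 0.26561734, "EFEMP2": 0.26305276,
    "CLCN3": 0.25778845, "ESM1": 0.24978852, "EGFR": 0.24651416,
    "DTNA": 0.21853149, "EPHB3": 0.20481655, "GDAP2": 0.20161485,
    "HOXA1": 0.19550178, "LOXL2": 0.18104556, "RAI1": 0.17973036,
    "RNF19B": 0.03268052, "MKKS": 0.01919981,
}


def weighted_signature_score(expression, weights=RANKING_SCORES):
    """Sum of expression values, each multiplied by the gene's weight."""
    missing = [gene for gene in weights if gene not in expression]
    if missing:
        raise KeyError(f"expression missing for: {missing}")
    return sum(weights[gene] * expression[gene] for gene in weights)


def exceeds_reference(sample_expression, reference_expression):
    """True when the sample's weighted score is above the reference's."""
    return (weighted_signature_score(sample_expression)
            > weighted_signature_score(reference_expression))


# Hypothetical data: a flat reference and a sample with MSH3 raised.
reference_expression = {gene: 1.0 for gene in RANKING_SCORES}
sample_expression = dict(reference_expression, MSH3=2.0)
```

Under this assumed formula, a unit increase in MSH3 moves the score about 29 times further than the same increase in MKKS, which reflects the unequal weighting described above.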
In one embodiment, the gene signature score may be determined using any suitable weighting formula. In one embodiment, a suitable weighting formula may make use of regression analysis. Any suitable predictive package may be used. For example, one such predictive package which may be used is the caret R package. In summary, it takes the absolute value of the final coefficients and ranks them based on their importance in the model. The main function is varImp(), which is described as a generic method for calculating variable importance for objects produced by “train” and method-specific methods (https://www.rdocumentation.org/packages/caret/versions/6.0-90/topics/varImp). It can be applied to multiple models, including linear, random forest and glmnet models, where it calculates the importance of the outputs and ranks them. In the case of glmnet, the outputs are coefficients. Thus, in one embodiment, the varImp function in the caret package may be used to rank the importance of genes of the gene signature.
In one such embodiment, the following code may be used for gene ranking:
varImp <- function(object, lambda = NULL, ...) {
  # Extract the model coefficients at the chosen value of lambda
  beta <- predict(object, s = lambda, type = "coef")
  if (is.list(beta)) {
    # Multi-class fit: one coefficient column per class
    out <- do.call("cbind", lapply(beta, function(x) x[, 1]))
    out <- as.data.frame(out, stringsAsFactors = TRUE)
  } else out <- data.frame(Overall = beta[, 1])
  # Importance is the absolute coefficient, with the intercept excluded
  out <- abs(out[rownames(out) != "(Intercept)", , drop = FALSE])
  out
}
varImp(cv.lassoModel, lambda = cv.lassoModel$lambda.min)
The compound gene signature score may be compared to a reference value which may be derived from, for example, a training set of patient data in which, for example, the threshold is established to indicate a total score which is indicative of whether or not the cells of the sample are chemotherapy resistant.
The performance of a particular gene signature may be determined in any suitable way. In one embodiment, the performance is assessed using Area Under the Curve (AUC). AUC refers to the area under the curve of a Receiver Operating Characteristic (ROC) curve. A higher AUC value is indicative of greater capacity of the gene signature (gene panel) to predict classes correctly, i.e. whether a sample falls in one or another of two groups of interest, e.g. in the present case, the ability to predict whether or not cells of a sample are chemotherapy resistant. AUC is typically measured as a value between 0 and 1 where an AUC of 0 is indicative that the model’s predictions are 100% wrong and an AUC of 1 is indicative that the model’s predictions are 100% correct.
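As an illustration of what the AUC measures, the following Python sketch computes it directly as the proportion of (resistant, sensitive) sample pairs in which the resistant sample receives the higher score, with ties counted as half. This pairwise formulation is a standard equivalent of the area under the ROC curve; it is offered as a sketch and is not the pROC implementation used in the examples. The function name and the toy labels and scores are hypothetical.

```python
# Minimal sketch: AUC as the probability that a positive (label 1) sample
# is scored above a negative (label 0) sample, counting ties as one half.
def roc_auc(labels, scores):
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical signature scores; 1 = chemoresistant, 0 = chemosensitive.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3]
auc = roc_auc(labels, scores)
```

On this scale a value of 0.5 corresponds to a signature that ranks resistant and sensitive samples no better than chance.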
In embodiments of the invention, the AUC associated with a gene panel of the invention is greater than 0.750. Optionally, the AUC associated with a gene panel of the invention is greater than 0.800, for example greater than 0.825, such as greater than 0.850, greater than 0.860, such as greater than 0.870, greater than 0.880 or greater than 0.890.
Other methods may be used to characterise whether or not a sample falls within the chemoresistant TNBC category other than by determining AUC. For example, methods which could be used include, for example, classification and regression trees, Random Forests, Multivariate Adaptive Regression Splines and Support vector machines.
Chemotherapy
The methods of the invention enable the determination as to whether a patient’s TNBC is likely to be resistant to a chemotherapy treatment. In one embodiment, the chemotherapy comprises one, two, three or more of the group consisting of anthracyclines (for example Adriamycin (doxorubicin)), alkylating agents (for example Cyclophosphamide), taxanes (for example taxol (Docetaxel)), or antimetabolites (such as fluorouracil (5-FU)).
In one embodiment, the chemotherapy treatment is a combined chemotherapy treatment regimen which comprises or consists of an anthracycline and an alkylating agent. In one such embodiment, the combined chemotherapy treatment regimen consists of Adriamycin (doxorubicin) and cyclophosphamide. In another embodiment, the combined chemotherapy treatment regimen consists of Adriamycin (doxorubicin), cyclophosphamide, and paclitaxel. In another embodiment, the chemotherapy treatment is a combined chemotherapy treatment regimen which comprises or consists of an anthracycline, an alkylating agent, and an antimetabolite. In one such embodiment, the combined chemotherapy treatment regimen comprises or consists of 5-FU, Adriamycin, and Cyclophosphamide. In another embodiment, the combined chemotherapy treatment regimen consists of paclitaxel, 5-FU, Adriamycin, and Cyclophosphamide.
Brief Description of the Figures
Embodiments of the present invention will now be described, by way of example only, with reference to the accompanying figures:
Figure 1: scRNA-seq Analysis Reveals Subpopulations of Cells and Key Markers Potentially Driving TNBC Chemoresistance.
A) Study Workflow. B) UMAP projections of scRNA-seq profiles from pre and post treatment chemoresponsive and chemoresistant patients showing distinct pre and post treatment clusters and cell type annotations. C) Gene Ontology analysis of markers shared between pre and post treatment resistant patients reveals that the potential markers are significantly involved in multiple epithelial to mesenchymal transition processes. D) ROC plotter results for chemoresistant markers show that many have higher expression in non-responder TNBC patients following NAC. E) Reproducibility analysis reveals that 87 out of 281 markers identified in pre-treatment chemoresistant patients had higher expression in residual disease than pathological complete response across all datasets. F) Survival plot of the 87 genes in TNBC patients from the METABRIC Cohort. G) Gene ontology of the 87 genes reveals they are significantly involved in signalling pathways and cell migration.
Figure 2: 20 Genes show higher expression in TNBC subtype, have poor survival probability, and have high accuracy in predicting chemotherapy response.
A) Selection of the tuning parameter (λ), based on 10-fold cross-validation, using the LASSO model. Vertical lines represent lambda.min and lambda.1se; the red line represents the cross-validation curve and mean binomial deviance against log(λ). B) The coefficients of the 20 genes and 21 probe IDs used to construct the predictive model. C) Gene ontology of the 20 genes reveals they are significantly involved in EGFR and ERBB signalling pathways. D) Average expression of the 20 genes in the TCGA-BRCA cohort. E) Survival plot of the 20 genes in TNBC patients from the METABRIC Cohort. F) ROC curve highlighting the accuracy of the inventors' model compared to previously published TNBC prediction models. G) ROC curve showing the prognostic capability following removal of five genes (AUC=0.678).
Figure 3: Co-Expression analysis reveals potential role of STAT3 signalling in driving TNBC chemoresistance.
A) Scale independence and mean connectivity of various soft-thresholding values. B) The hierarchical cluster tree of 20 gene probe IDs between the RD and pCR. The branches and colour bands represent the assigned module. C) Co-expression network modules: the light colour represents low overlap and the progressively darker red colour represents higher overlap between the genes. D) Survival plot of the co-expressing genes in TNBC patients from the METABRIC Cohort. E) Proposed network for each module highlighting downstream targets (red arrows) and upstream targets (blue arrows) of STAT3 (the arrows pointing to STAT3 originating from regulators and the arrows originating from STAT3 pointing to regulators). F) Expression of DTNA, EGFR and RNF19B following STAT3 knockdown.
Figure 4: ROC curves showing the prognostic capability following removal of one or more genes from the 20 gene panel.
A) ROC curves showing AUC values associated with the removal of from 1 to 19 of the genes of the 20 gene panel (TSHZ2, SLC11A2, CTDSPL, NDUFA6, PTPRJ, MSH3, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS), with the ROC curve labelled -1 including the first 19 of the gene panel, i.e. excluding MKKS, the ROC curve labelled -2 including the first 18 of the gene panel, i.e. excluding RNF19B and MKKS, etc., with -19 thus only including TSHZ2. B) ROC curves showing AUC values associated with the removal of from 1 to 5 of the genes of the 20 gene panel (TSHZ2, SLC11A2, CTDSPL, NDUFA6, PTPRJ, MSH3, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS), with the ROC curve labelled -TSHZ2 for a panel of 19 genes excluding TSHZ2, the ROC curve labelled -Top Two for a panel of 18 genes excluding TSHZ2 and SLC11A2, the ROC curve labelled -Top Three for a panel of 17 genes excluding TSHZ2, SLC11A2, and CTDSPL, the ROC curve labelled -Top Four for a panel of 16 genes excluding TSHZ2, SLC11A2, CTDSPL, and NDUFA6, and the ROC curve labelled -Top Five for a panel of 15 genes excluding TSHZ2, SLC11A2, CTDSPL, NDUFA6, and PTPRJ.
C) ROC curves showing AUC values associated with the removal of from 6 to 10 of the genes of the 20 gene panel (TSHZ2, SLC11A2, CTDSPL, NDUFA6, PTPRJ, MSH3, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS), with the ROC curve labelled -Top Six for a panel of 14 genes excluding TSHZ2, SLC11A2, CTDSPL, NDUFA6, PTPRJ, and MSH3, the ROC curve labelled -Top Seven for a panel of 13 genes excluding TSHZ2, SLC11A2, CTDSPL, NDUFA6, PTPRJ, MSH3, and PPFIA2, the ROC curve labelled -Top Eight for a panel of 12 genes excluding TSHZ2, SLC11A2, CTDSPL, NDUFA6, PTPRJ, MSH3, PPFIA2, and COL21A1, the ROC curve labelled -Top Nine for a panel of 11 genes excluding TSHZ2, SLC11A2, CTDSPL, NDUFA6, PTPRJ, MSH3, PPFIA2, COL21A1, and EFEMP2, and the ROC curve labelled -Top Ten for a panel of 10 genes excluding TSHZ2, SLC11A2, CTDSPL, NDUFA6, PTPRJ, MSH3, PPFIA2, COL21A1, EFEMP2, and CLCN3. D) ROC curves showing AUC values associated with the removal of individual genes (TSHZ2, SLC11A2, CTDSPL, NDUFA6, or PTPRJ) from the 20 gene panel (TSHZ2, SLC11A2, CTDSPL, NDUFA6, PTPRJ, MSH3, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS). E) ROC curves showing AUC values associated with the removal of individual genes (HOXA1, LOXL2, RAI1, RNF19B, or MKKS) from the 20 gene panel (TSHZ2, SLC11A2, CTDSPL, NDUFA6, PTPRJ, MSH3, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS).
EXAMPLES
METHODS
Identification of chemoresistant cell types using single-cell RNA-sequencing analysis
To identify cell types and their markers associated with TNBC chemoresistance, scRNA-seq analysis was performed on the data set obtained from Kim et al. (6), consisting of matched pre and post chemotherapy (anthracycline and a taxane) samples from four responsive and four resistant patients with a total of 6,862 cells.
The patient data files from the four responsive and four resistant patients were analysed using the same parameters in the R package "Seurat". First, cells with feature counts of greater than 2500 or less than 200 were removed, as were cells with mitochondrial reads of greater than 5%. Following the removal of cells, downstream analysis, including normalisation, variable feature selection, dimensionality reduction and UMAP clustering, was performed. Cluster annotation was performed using the python program "SCSA" to identify associated cell types and cancer-related processes. Next, the inventors pooled all significantly expressed markers for each treatment time point and therapy response to identify uniquely expressed markers in pre and post chemoresistant patients that could potentially have a crucial role in driving TNBC chemoresistance.
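The cell-level quality-control thresholds stated above can be written out as a short Python sketch. This is an illustration of the filtering rule only and is not the Seurat code used by the inventors; the record layout, field names, and example cells are hypothetical, and cells exactly at a threshold are assumed to be kept.

```python
# Minimal sketch of the stated QC rule (assumptions: hypothetical record
# layout; values exactly equal to a threshold are retained).
def passes_qc(cell, min_features=200, max_features=2500,
              max_percent_mito=5.0):
    """Keep cells with 200-2500 detected features and <= 5% mito reads."""
    return (min_features <= cell["n_features"] <= max_features
            and cell["percent_mito"] <= max_percent_mito)


def filter_cells(cells, **thresholds):
    """Return only the cells that pass the QC thresholds."""
    return [cell for cell in cells if passes_qc(cell, **thresholds)]


# Hypothetical cells illustrating each reason for removal.
cells = [
    {"id": "cell_1", "n_features": 1200, "percent_mito": 2.1},  # kept
    {"id": "cell_2", "n_features": 150, "percent_mito": 1.0},   # too few
    {"id": "cell_3", "n_features": 3100, "percent_mito": 3.0},  # too many
    {"id": "cell_4", "n_features": 900, "percent_mito": 7.5},   # high mito
]
kept = filter_cells(cells)
```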
Prognostic Capability Analysis using ROC Plotter
To test the prognostic capability of the markers identified, the inventors input the marker list into the online tool ROC plotter. Using the parameters of TNBC, chemotherapy treatment, and clinical trial data (n=223), the inventors identified that many markers had higher expression in patients with residual disease.
Reproducible Signature Marker Identification
To assess the reproducibility of the gene set identified using scRNA-seq analysis, the inventors used five independent bulk RNA-seq datasets, GSE20271, GSE25055, GSE25065, GSE20194, and GSE50948, of 397 TNBC patients for whom chemotherapy response was available (RD, pCR). Patient samples were excluded if the therapeutic outcome (residual disease or pathological complete response) was unknown or the sample was not classified as TNBC. The raw data was normalised, batch corrected, and log-transformed using the R package "affy" and the python package "pyComBat". In total, data from 397 patients were selected for reproducibility analysis. The 281 genes identified in the scRNA-seq dataset were extracted from the normalised bulk RNA-seq count files. A custom R script was used to compare each gene's expression in patients with residual disease and pathological complete response.
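The custom script itself is not reproduced in this document. As a rough illustration of the kind of comparison described, the Python sketch below retains a gene only when its mean expression is higher in residual disease (RD) than in pathological complete response (pCR) in every dataset. The use of a plain comparison of means with no significance test, the data layout, and the gene names GENE_A and GENE_B are all assumptions made for the example.

```python
# Minimal sketch (assumptions: comparison of group means only, no
# statistical test; hypothetical data layout and gene names).
from statistics import mean


def higher_in_rd(gene, dataset):
    """True when mean expression of `gene` is higher in RD than in pCR."""
    return mean(dataset["RD"][gene]) > mean(dataset["pCR"][gene])


def reproducible_markers(genes, datasets):
    """Genes with higher mean expression in RD across all datasets."""
    return [
        gene for gene in genes
        if all(higher_in_rd(gene, dataset) for dataset in datasets)
    ]


# Two hypothetical datasets of normalised, log-transformed expression.
datasets = [
    {"RD": {"GENE_A": [7.1, 6.9, 7.4], "GENE_B": [5.0, 5.2, 4.9]},
     "pCR": {"GENE_A": [6.0, 6.2], "GENE_B": [5.1, 5.3]}},
    {"RD": {"GENE_A": [8.0, 7.8], "GENE_B": [6.1, 6.4]},
     "pCR": {"GENE_A": [7.0, 7.1, 6.8], "GENE_B": [5.9, 6.0]}},
]
markers = reproducible_markers(["GENE_A", "GENE_B"], datasets)
```

Here GENE_B is dropped because it is higher in RD in only one of the two datasets.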
Identification of Significant Gene Set and Construction of the Prognostic Prediction Model Based on Residual Disease Vs Pathological Complete Response.
Raw microarray expression (CEL) files of 307 TNBC patients were downloaded from Gene Expression Omnibus (GSE20271, GSE25055, GSE25065, GSE20194). Gene expression profiles were quantile normalised and log2 transformed using "BART", followed by batch correction using "ComBat" from the R package "sva". To identify the most significant gene set, the GSE20271 and GSE25055 datasets with 177 TNBC patients (57 pathological complete response, 120 residual disease) were used as the testing cohort. To verify the strength of the gene set, the GSE25065 and GSE20194 datasets with 130 TNBC patients (46 pathological complete response, 84 residual disease) were used as the validation cohort. To identify the significant gene set and develop a predictive model to discriminate pCR and RD groups, the inventors first applied Lasso and Elastic-Net Regularized Generalized Linear Models, using the R package "glmnet", to the 87 markers to identify the combination with the greatest predictive power. Then, the inventors used the 10-fold cross-validation method to evaluate the discrimination ability and obtain a relatively unbiased estimate. After the LASSO regression analysis, a predictive model based on 20 genes was used to fit a generalised linear model. The predictive capability was measured by the area under the curve (AUC) of the receiver operating characteristic curve (ROC curve) using the R package "pROC". The optimal model was selected by maximising AUC.
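The 10-fold cross-validation step can be pictured with the following Python sketch, which only generates the train/held-out index splits. It is a generic illustration and does not reproduce how glmnet assigns folds internally; the function name and the fixed random seed are assumptions, and the sample count of 177 is taken from the testing cohort described above.

```python
# Minimal sketch of k-fold splitting (assumption: simple shuffled,
# unstratified folds; glmnet's own fold assignment may differ).
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, held_out) index lists for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices into k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i
                 for idx in fold]
        yield train, held_out


# 177 patients, as in the testing cohort.
splits = list(k_fold_indices(177, k=10))
```

In each round the model would be fitted on `train` and scored on `held_out`, and the ten scores combined to choose the tuning parameter.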
Construction of WGCNA
The R package “WGCNA” was used to perform the weighted correlation network analysis (17). Firstly, the gene co-expression similarity between genes in RD and pCR was defined. A power function was then applied to convert the co-expression similarity into the adjacency of genes. Scale independence and mean connectivity were then tested using a gradient method (the power value ranging from 1 to 20). When the degree of independence was determined to be above 0.80, an appropriate power value was selected to obtain a scale-free network. Finally, the adjacency matrix was transformed into a topological overlap matrix, and modules were detected by hierarchical average linkage clustering analysis for the gene dendrogram. Additionally, the inventors extracted the corresponding gene information for each module for further analysis, in particular, turquoise (I) and yellow (J) modules.
STAT3 knockdown ChIP-seq and RNA-seq Validation
ChIP-seq data for genome-wide binding patterns of STAT3 in TNBC cell lines and RNA-seq data after 96hr knockdown of STAT3 or non-targeting siRNA were downloaded from Gene Expression Omnibus (GSE85579). Expression values for each gene identified in the turquoise (I) and yellow (J) modules were extracted for STAT3 knockdown and non-targeting control and plotted using the R package “ggplot”. ChIP-seq bedgraph files of STAT3 KD and control were converted to bigwig format using Kent utils “bedgraph2bigwig” to enable viewing in the UCSC genome browser.
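The power-function step can be illustrated with a small Python sketch of an unsigned soft-thresholded adjacency, in which the adjacency between two genes is the absolute Pearson correlation of their expression profiles raised to a power. This is a sketch of the general technique and not the WGCNA package code; the function names, the gene names, the expression values, and the choice of power 6 are hypothetical.

```python
# Minimal sketch of soft-thresholding: adjacency = |correlation| ** power
# (assumptions: unsigned network, hypothetical profiles and power value).
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def soft_threshold_adjacency(profiles, power):
    """Adjacency for every ordered pair of distinct genes."""
    genes = list(profiles)
    return {
        (g, h): abs(pearson(profiles[g], profiles[h])) ** power
        for g in genes for h in genes if g != h
    }


# Hypothetical expression profiles across four samples.
profiles = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with GENE_A
    "GENE_C": [1.0, 3.0, 2.0, 4.0],   # correlation 0.8 with GENE_A
}
adjacency = soft_threshold_adjacency(profiles, power=6)
```

Raising the correlation to a power suppresses weak correlations much more strongly than strong ones, which is what allows a power value to be chosen that yields an approximately scale-free network.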
RESULTS
ScRNA-Seq Analysis of Chemotherapy Response Dataset Identifies Key Markers Associated with Chemotherapy Resistance.
To unravel the underlying mechanisms of chemoresistance in TNBC, the inventors analysed the scRNA-seq dataset provided by Kim et al. The dataset provided is the only scRNA-seq data available containing matched longitudinal samples of TNBC patients with known therapeutic response (Fig. 1A).
Due to the high degree of intra-tumour heterogeneity (ITH) associated with TNBC, scRNA-seq provides a higher level of resolution and enables the identification of distinct cell types and markers, which could be lost in bulk RNA-seq analysis. The inventors hypothesised that by profiling TNBC chemoresistant patient data at the single-cell level, the inventors could identify critical markers driving chemotherapy response, and identifying these markers could enable the prediction of chemotherapy response in untreated patients. The inventors analysed a total of 6,862 cells from four chemoresponsive and four chemoresistant patients at pre and post treatment time points.
For each patient, cells were clustered into distinct pre- and post-treatment clusters. Cluster annotation was performed using SCSA to identify cell types, and revealed that the clusters were dominated by progenitor and luminal epithelial cells pre-treatment and by basal epithelial cells post-treatment (Fig. 1B). The chemoresistant patients showed a clear separation of clusters into pre- and post-treatment. Basal epithelial cell types, which are associated with metastasis and other TNBC processes, were present pre-treatment and remained post-treatment, indicating that treatment did not eliminate these cells and that they potentially promote chemoresistance in these patients (Fig. 1B). For all chemoresponsive patients, the clustering also showed a clear separation of pre- and post-treatment timepoints. The pre-treatment cluster contained progenitor cells and luminal epithelial cells, which were removed following NAC. Additionally, the basal epithelial cell type was either absent or removed in the post-treatment clusters, indicating that these cells may confer chemoresistance (Fig. 1B). Differential gene expression analysis was performed for each patient to identify markers specific to the chemoresistant patients that could be identified across pre- and post-treatment samples. First, the list of differentially expressed genes for each treatment timepoint was obtained for all patients. Next, the distinct set of genes defining each population was compared between the responsive and resistant groups to identify markers specific to chemoresistant patients.
For the chemoresistant patients, 241 genes specific to pre-treatment, 760 specific to post-treatment and 40 shared between pre- and post-treatment were identified.
As chemoresistance is thought to arise through the clonal evolution of pre-existing clones, the 40 genes shared across pre- and post-treatment samples in the resistant group were of most interest. GO term analysis was performed using the online tool "EnrichR", and the results showed that the 40 markers were significantly involved in the EMT process, which is thought to play a crucial role in chemoresistance (5) (Fig. 1C).
To investigate the prognostic capabilities of the 40 markers, the online tool "ROC plotter" was used. ROC plotter links gene expression to therapy response using transcriptome-level data from 3,104 breast cancer patients. This analysis identified 16 genes with a high AUC (>0.6) in patients with residual disease compared to those with a pathological complete response following chemotherapy in the clinical trial setting. Based on this score, the 16 genes were identified as having a high capability as biomarkers for predicting chemotherapy response (see Fig. 1D, which shows results for two of these, LOXL2 and DTNA). Interestingly, Lysyl Oxidase Like 2 (LOXL2) was found to overlap between pre- and post-treatment in the resistant patients. LOXL2 is a member of the Lysyl Oxidase (LOX) family, which has previously been associated with tumour aggressiveness through regulation of cell adhesion, motility and invasion. Additionally, inhibition of LOX and its paralogues has been shown to overcome chemotherapy resistance in TNBC (18). Dystrobrevin Alpha (DTNA) was also shown to have high predictive capability and, to date, has not been associated with chemotherapy resistance in TNBC or any other breast cancer subtype.
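The split into pre-treatment-specific, post-treatment-specific and shared markers amounts to a set comparison. The following is a minimal sketch of that step only, assuming the differentially expressed genes are already available as plain lists; the function name and the toy gene lists (including the placeholder genes GENE_A, GENE_B and GENE_R) are hypothetical and are not taken from the study data.

```python
def partition_markers(pre_genes, post_genes, responsive_genes):
    """Split resistant-patient markers into pre-only, post-only and shared sets.

    Genes that also appear in responsive patients are dropped first, so that
    only resistance-specific markers remain (an assumed filtering rule).
    """
    responsive = set(responsive_genes)
    pre = set(pre_genes) - responsive
    post = set(post_genes) - responsive
    return {
        "pre_only": pre - post,
        "post_only": post - pre,
        "shared": pre & post,
    }


# Toy gene lists, invented for illustration only.
result = partition_markers(
    pre_genes=["LOXL2", "DTNA", "GENE_A", "GENE_R"],
    post_genes=["LOXL2", "DTNA", "GENE_B"],
    responsive_genes=["GENE_R"],
)
print(sorted(result["shared"]))
```

In this toy input, GENE_R is discarded because it is also a marker in responsive patients, and the two genes present at both timepoints end up in the shared set.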
Reproducibility Analysis Identifies a Reduced Gene Set Associated with Residual Disease in TNBC Patients.
To assess the reproducibility of the markers in multiple bulk RNA-seq datasets in which the therapeutic response was known, the inventors first expanded the marker list to include all pre-treatment markers (n=241, total=281). A custom script was then used to compare the expression level of each marker between residual disease and pathological complete response. This identified 87 of the 281 markers as showing significantly higher expression across all datasets in patients with residual disease (Fig. 1E). High expression of the 87 genes was associated with significantly worse survival probabilities in TNBC patients in the METABRIC cohort (Fig. 1F). Gene Ontology analysis revealed that these markers were significantly involved in EGFR signalling pathways, which have been shown to be involved in TNBC chemoresistance (19-21) (Fig. 1G). Together, these results highlight the potential role of the 87 markers in promoting and driving chemoresistance in TNBC patients.
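The custom script itself is not reproduced in this document. As an illustration of the kind of comparison described, the sketch below keeps a gene only if its expression is significantly higher in RD than in pCR in every dataset. The choice of a one-sided Mann-Whitney test with a normal approximation (no tie correction), the 0.05 threshold, the data layout and all gene names and values are assumptions made for this example.

```python
import math


def mann_whitney_greater(x, y):
    """One-sided p-value (normal approximation) that x tends to exceed y."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):  # tied values share their average rank
            ranks[k] = (i + j) / 2 + 1
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, grp) in zip(ranks, pooled) if grp == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    z = (u - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return 0.5 * math.erfc(z / math.sqrt(2))


def reproducible_markers(datasets, alpha=0.05):
    """Genes significantly higher in RD than in pCR in every dataset."""
    shared = set.intersection(*(set(d) for d in datasets))
    return [g for g in sorted(shared)
            if all(mann_whitney_greater(d[g]["RD"], d[g]["pCR"]) < alpha
                   for d in datasets)]


# Two toy datasets with invented expression values.
ds1 = {"GENE_UP": {"RD": [5, 6, 7, 8, 9, 10], "pCR": [1, 2, 3, 4, 4.5, 3.5]},
       "GENE_FLAT": {"RD": [1, 3, 5, 7, 9, 11], "pCR": [2, 4, 6, 8, 10, 12]}}
ds2 = {"GENE_UP": {"RD": [4, 6, 8, 9, 11, 12], "pCR": [1, 2, 2.5, 3, 3.5, 1.5]},
       "GENE_FLAT": {"RD": [2, 4, 6, 8, 10, 12], "pCR": [1, 3, 5, 7, 9, 11]}}
markers = reproducible_markers([ds1, ds2])
print(markers)
```

Requiring significance in every dataset, rather than in a pooled analysis, is one simple way to express "across all datasets"; a real analysis would normally also correct for multiple testing.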
Identification of Significant Gene Set by LASSO Regression
The inventors used gene expression of the 87 genes in 307 TNBC patients from four microarray-based breast cancer studies. Out of 307 patients, 183 from two datasets GSE25055 and GSE25065 were used to construct the classifier, and 130 patients from GSE20194 and GSE20271 were used for external validation.
Lasso and Elastic-Net Regularized Generalized Linear Models, as implemented in the R package "glmnet", were applied to screen the 87 chemoresistance markers and identify the top combination of genes that could accurately predict therapy response in TNBC patients. The inventors used 10-fold cross-validation to evaluate the discrimination ability between pCR and RD and to obtain a relatively unbiased estimate (Fig. 2A, B). This analysis revealed a total of 20 genes (CLCN3, NDUFA6, PTPRJ, GDAP2, RNF19B, MKKS, TSHZ2, COL21A1, LOXL2, SLC11A2, ESM1, CTDSPL, RAI1, EFEMP2, DTNA, EPHB3, EGFR, HOXA1, MSH3 and PPFIA2) with the strongest discrimination between RD and pCR.
Gene ontology analysis revealed that the 20 genes were significantly involved in ERBB and EGFR signalling and in the regulation of cell-matrix adhesion (Fig. 2C). To ensure that the genes were TNBC-specific, the inventors next validated their expression in the TCGA-BRCA dataset. In combination, the average expression of the 20 genes was significantly higher in TNBC than in other breast cancer subtypes, but the difference compared to normal tissue was not significant (Fig. 2D). However, survival analysis in TNBC patients from the METABRIC cohort revealed that, in combination, higher expression of the genes is associated with significantly worse survival (Fig. 2E).
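To illustrate how an L1 (lasso) penalty selects a subset of genes and how cross-validation is used to compare penalty strengths, the following is a minimal, dependency-free sketch. It is not the glmnet implementation: glmnet fits by coordinate descent over a path of penalties, whereas this sketch uses plain proximal gradient descent at a fixed penalty. The learning rate, iteration count, penalty values and the contrived toy data are all assumptions for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v towards zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def predict(w, b, x):
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def fit_lasso_logistic(X, y, lam, lr=0.1, n_iter=500):
    """L1-penalised logistic regression by proximal gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        errs = [predict(w, b, xi) - yi for xi, yi in zip(X, y)]
        grad_b = sum(errs) / n
        grad_w = [sum(e * xi[j] for e, xi in zip(errs, X)) / n for j in range(p)]
        b -= lr * grad_b  # the intercept is not penalised
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


def cv_deviance(X, y, lam, k=10, seed=0):
    """Mean held-out log-loss over k cross-validation folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    total = 0.0
    for f in range(k):
        test = idx[f::k]
        held = set(test)
        train = [i for i in idx if i not in held]
        w, b = fit_lasso_logistic([X[i] for i in train], [y[i] for i in train], lam)
        for i in test:
            p = min(max(predict(w, b, X[i]), 1e-12), 1 - 1e-12)
            total -= y[i] * math.log(p) + (1 - y[i]) * math.log(1 - p)
    return total / len(X)


# Contrived toy data: 1 = RD, 0 = pCR. Feature 0 separates the classes;
# features 1 and 2 are balanced within each class and carry no signal.
y = [i % 2 for i in range(40)]
X = [[1.0 if yi else -1.0,
      1.0 if i % 4 < 2 else -1.0,
      1.0 if i % 8 < 4 else -1.0] for i, yi in enumerate(y)]

w, b = fit_lasso_logistic(X, y, lam=0.1)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
cv = {lam: cv_deviance(X, y, lam) for lam in (0.05, 0.6)}
print(selected, cv)
```

The soft-thresholding step is what sets uninformative coefficients exactly to zero, which is why a lasso fit can be read as a gene selection. Comparing held-out deviance across penalties is the same idea that cross-validated penalty selection relies on.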
20 Gene Model Accurately Predicts Therapy Response in TNBC Patients and Outperforms Published TNBC Predictive Panels.
In TNBC patients, achieving a pathological complete response to neoadjuvant chemotherapy is a crucial predictor of long-term outcomes and allows an early evaluation of the effectiveness of systemic therapy. Identification of biomarkers that can predict TNBC NAC response provides physicians with an opportunity to offer alternative treatments. For oestrogen receptor-positive or HER2-positive tumours, several molecular tests can be used to guide therapeutic options. However, there are currently no tests in clinical use for TNBC patients, and there is an unmet need for accurate NAC response predictors to aid in their clinical management. For TNBC specifically, a number of predictive panels (22-26) have been published, but none has achieved the high predictive accuracy obtained by panels used in oestrogen receptor-positive and HER2-positive tumours, owing to small sample sizes and a lack of validation data.
Here, using the 20 genes identified by LASSO regression (TSHZ2, SLC11A2, CTDSPL, NDUFA6, PTPRJ, MSH3, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B and MKKS) and their corresponding Affymetrix Probe IDs (as listed in Table 1), the inventors developed a predictive model using over 300 patients that demonstrated a high accuracy in determining RD. The model (QUB model) had an area under the curve (AUC) of 0.90, suggesting a high prediction accuracy (Fig. 2F). To highlight the predictive strength of the specific combination of the 20 genes, the inventors next removed a set of five genes (GDAP2, MKKS, RAI1, RNF19B and TSHZ2) from the model and tested its predictive power. After removal of this set of genes, the model's predictive power was significantly reduced (AUC: 0.827) (Fig. 2G). Nevertheless, with an AUC of 0.827, the predictive power of the resulting 15-gene panel was still significantly better than that of existing models. The inventors further analysed the effect of removing one or more genes from the 20-gene panel on the predictive power of the resulting reduced gene panels. The results are shown in Figure 4. Figure 4B demonstrates that when the top 5 genes of the gene panel are excluded, the reduced panel of 15 genes still performs very well, with an AUC of 0.827. Figure 4C shows that even when the top 10 genes are removed from the 20-gene panel, the reduced panel of 10 genes still performs well, with an AUC of 0.76.
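The gene-removal comparison can be illustrated with a small AUC calculation. In the sketch below, a patient score is a weighted sum of expression values and the AUC is the fraction of (RD, pCR) patient pairs in which the RD patient scores higher, with ties counted as one half. The weights, expression values and the three-gene panel are invented for this example and do not correspond to the fitted model or to Table 1.

```python
def panel_score(expr, weights, genes):
    """Weighted sum of expression over the genes kept in the panel."""
    return sum(weights[g] * expr[g] for g in genes)


def auc(scores, labels):
    """Probability that a random RD patient (1) outscores a random pCR patient (0)."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical weights and toy patients (1 = RD, 0 = pCR).
weights = {"DTNA": 1.0, "LOXL2": 1.0, "EGFR": 0.5}
patients = [
    ({"DTNA": 3, "LOXL2": 2, "EGFR": 2}, 1),
    ({"DTNA": 2, "LOXL2": 1, "EGFR": 4}, 1),
    ({"DTNA": 3, "LOXL2": 0, "EGFR": 2}, 1),
    ({"DTNA": 0, "LOXL2": 2, "EGFR": 2}, 0),
    ({"DTNA": 1, "LOXL2": 1, "EGFR": 0}, 0),
    ({"DTNA": 0, "LOXL2": 1, "EGFR": 2}, 0),
]
labels = [lab for _, lab in patients]

full_panel = ["DTNA", "LOXL2", "EGFR"]
reduced_panel = ["LOXL2", "EGFR"]  # DTNA removed
auc_full = auc([panel_score(e, weights, full_panel) for e, _ in patients], labels)
auc_reduced = auc([panel_score(e, weights, reduced_panel) for e, _ in patients], labels)
print(auc_full, round(auc_reduced, 3))
```

In this toy data the full panel separates the two groups completely, while dropping one gene lowers the AUC, which mirrors the direction of the effect reported for the reduced panels.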
Construction of Weighted Gene Co-Expression Network Reveals potential role of STAT3 signalling in TNBC chemoresistance
To identify genes that were potentially co-expressed in chemoresistant TNBC patients, a weighted gene co-expression network analysis (WGCNA) was performed. WGCNA enables the identification of genes that are potentially co-expressed and, in this case, potentially driving chemoresistance. Using the dynamic tree cutting algorithm, probe IDs for the 20 significant genes were grouped into ten modules (Fig. 3B). Two modules indicated significant co-expression of several genes (Fig. 3C). The turquoise (I) and yellow (J) modules showed a high similarity score compared to the other modules, so the inventors extracted the genes from each of these and identified the overlapping genes (Fig. 3E). The overlapping genes include DTNA, RNF19B and EFEMP2. Figure 3E represents a module of genes which are co-expressed in RD compared to pCR, and highlights the genes present in each module of interest, with the I module and the J module represented by lighter and darker shading respectively.
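The "weighted" part of a weighted co-expression network can be shown in a few lines. In an unsigned network, the adjacency between two genes is the absolute Pearson correlation of their expression profiles raised to a soft-thresholding power. The sketch below covers only this adjacency step and does not attempt topological overlap or dynamic tree cutting; the power of 6 and the toy expression profiles are assumptions, and the gene names are placeholders.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length expression profiles."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def adjacency(expr, beta=6):
    """Unsigned soft-threshold adjacency |cor|^beta for each gene pair."""
    genes = sorted(expr)
    return {(g, h): abs(pearson(expr[g], expr[h])) ** beta
            for g in genes for h in genes if g < h}


# Toy expression profiles across five samples (invented values).
expr = {
    "GENE_A": [1, 2, 3, 4, 5],
    "GENE_B": [2, 4, 6, 8, 10],  # perfectly correlated with GENE_A
    "GENE_C": [1, 3, 2, 5, 4],   # correlation of 0.8 with GENE_A
}
adj = adjacency(expr)
for pair, value in sorted(adj.items()):
    print(pair, round(value, 4))
```

Raising the correlation to a power suppresses moderate correlations much more than strong ones (a correlation of 0.8 becomes roughly 0.26 at a power of 6), so that module detection is driven mainly by strongly co-expressed genes.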
The average expression of the turquoise (I) and yellow (J) modules was also shown to be significantly associated with worse survival in the METABRIC TNBC cohort (Fig. 3D).
All genes in the yellow and turquoise modules appeared to be involved in STAT3 signalling (27-34), which provided the rationale for the proposed network (Fig. 3E), with EGFR from the yellow module, TSHZ2 from the turquoise module and DTNA from both modules activating STAT3 signalling, and the remaining genes being downstream targets. To test the proposed network's involvement in STAT3 signalling, RNA-seq and ChIP-seq data were downloaded from GSE85579, a study in which STAT3 knockdown (KD) was performed and which revealed a role for STAT3 in regulating invasion and metastasis. The RNA-seq data showed that KD of STAT3 altered the expression of the genes of interest, including upregulation of EGFR and TSHZ2, two genes which regulate STAT3 signalling, upregulation of DTNA, and downregulation of the remaining five genes (Fig. 3F). Interestingly, DTNA has recently been shown to play a role in HBV-induced hepatocellular carcinoma by activating STAT3 (27), and here it was shown to have higher expression following STAT3 KD.
The STAT3 ChIP-seq data suggest that STAT3 binds each gene in TNBC cell lines. After knockdown of STAT3, RNA-seq showed significantly increased expression of the potential regulators of STAT3; the downstream targets also appeared to show higher expression, but the change was not significant. These results indicate that genes within the proposed model are bound by STAT3 and are under its tight regulation, as they are affected upon STAT3 depletion, as highlighted in the RNA-seq data.
DISCUSSION
Neoadjuvant chemotherapy (NAC) is used frequently in the treatment of TNBC patients due to the lack of targeted therapeutics and its ability to reduce tumour size, improve surgical outcomes and increase survival in responders. However, due to the ITH associated with TNBC, patients have differing responses to NAC (35). Achieving pCR is associated with significantly improved survival outcomes in TNBC patients (36). Identifying those patients who will have RD following NAC will enable physicians to determine the best therapeutic option at the beginning of treatment, rather than waiting for NAC treatment results, to increase the chances of achieving pCR. Numerous efforts have been put into developing predictive signatures in TNBC, but currently there is no clinically recommended predictive biomarker panel for NAC response (7-9).
Here, by profiling chemoresponsive and chemoresistant patients at the single-cell level to identify markers associated with chemotherapy response, the inventors have developed a predictive model which has outstandingly high accuracy in defining chemotherapy response in TNBC patients. The inventors' predictive model has been shown to outperform existing panels for ER-positive breast cancers, having a higher AUC and a lower number of genes. A low number of genes is desirable in biomarker panel development because it improves cost-effectiveness for both production and the end user. These observations highlight that the identified markers can not only accurately predict NAC response in TNBC patients, but could also be key drivers of TNBC chemoresistance.
Additionally, the inventors have revealed a potential role of STAT3 signalling in driving TNBC chemoresistance. By performing WGCNA, the inventors identified two modules in which genes were overlapping and significantly co-expressed between RD and pCR patients. The genes involved have all previously been reported to play a role in STAT3 signalling, either as activators or downstream targets. By extracting RNA-seq and ChIP-seq data from TNBC cell lines following STAT3 knockdown, the inventors were able to provide insights into the potential role of these genes in STAT3 signalling and how they are driving chemoresistance in TNBC patients.
In summary, the inventors have developed a 20 gene predictive model derived from scRNA-seq TNBC data, which accurately predicts chemotherapy response in TNBC patients. The inventors also revealed a potential role of STAT3 signalling driving chemoresistance in TNBC patients. Additionally, the inventors have presented a framework for identifying predictive biomarkers from the single-cell level and the development of a predictive model which could be applied in other cancer types.
REFERENCES
1. Bray, F. et al. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA. Cancer J. Clin. (2018) doi: 10.3322/caac.21492.
2. Eliyatkin, N., Yalcin, E., Zengel, B., Aktaş, S. & Vardar, E. Molecular Classification of Breast Carcinoma: From Traditional, Old-Fashioned Way to A New Age, and A New Way. J. Breast Heal. (2015) doi: 10.5152/tjbh.2015.1669.
3. Liedtke, C. et al. Response to neoadjuvant therapy and long-term survival in patients with triple-negative breast cancer. J. Clin. Oncol. (2008) doi: 10.1200/JCO.2007.14.4147.
4. Anampa, J., Makower, D. & Sparano, J. A. Progress in adjuvant chemotherapy for breast cancer: An overview. BMC Medicine (2015) doi: 10.1186/s12916-015-0439-8.
5. Moo, T. A., Sanford, R., Dang, C. & Morrow, M. Overview of Breast Cancer Therapy. PET Clinics (2018) doi: 10.1016/j.cpet.2018.02.006.
6. Kim, C. et al. Chemoresistance Evolution in Triple-Negative Breast Cancer Delineated by Single-Cell Sequencing. Cell (2018) doi: 10.1016/j.cell.2018.03.041.
7. Gass, P. et al. Prediction of pathological complete response and prognosis in patients with neoadjuvant treatment for triple-negative breast cancer. BMC Cancer 18, 1051 (2018).
8. Masuda, H. et al. Differential Response to Neoadjuvant Chemotherapy Among 7 Triple-Negative Breast Cancer Molecular Subtypes. Clin. Cancer Res. 19, 5533-5540 (2013).
9. W Chen, J. et al. RNA Expression Classifiers from a Model of Breast Epithelial Cell Organization to Predict Pathological Complete Response in Triple Negative Breast Cancer. Arch. Clin. Biomed. Res. 05, (2021).
10. Tang, F. et al. mRNA-Seq whole-transcriptome analysis of a single cell. Nat. Methods (2009) doi: 10.1038/nmeth.1315.
11. Stark, R., Grzelak, M. & Hadfield, J. RNA sequencing: the teenage years. Nature Reviews Genetics (2019) doi: 10.1038/s41576-019-0150-2.
12. Wang, Z., Gerstein, M. & Snyder, M. RNA-Seq: A revolutionary tool for transcriptomics. Nature Reviews Genetics (2009) doi: 10.1038/nrg2484.
13. Kulkarni, A., Anderson, A. G., Merullo, D. P. & Konopka, G. Beyond bulk: a review of single cell transcriptomics methodologies and applications. Current Opinion in Biotechnology (2019) doi: 10.1016/j.copbio.2019.03.001.
14. Haque, A., Engel, J., Teichmann, S. A. & Lonnberg, T. A practical guide to single-cell RNA-sequencing for biomedical research and clinical applications. Genome Medicine (2017) doi: 10.1186/s13073-017-0467-4.
15. Kim, K. T. et al. Application of single-cell RNA sequencing in optimizing a combinatorial therapeutic strategy in metastatic renal cell carcinoma. Genome Biol. (2016) doi: 10.1186/s13059-016-0945-9.
16. Karaayvaz, M. et al. Unravelling subclonal heterogeneity and aggressive disease states in TNBC through single-cell RNA-seq. Nat. Commun. (2018) doi: 10.1038/s41467-018-06052-0.
17. Langfelder, P. & Horvath, S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9, 559 (2008).
18. Saatci, O. et al. Targeting lysyl oxidase (LOX) overcomes chemotherapy resistance in triple negative breast cancer. Nat. Commun. 11, 2416 (2020).
19. Lev, S. Targeted therapy and drug resistance in triple-negative breast cancer: the EGFR axis. Biochem. Soc. Trans. 48, 657-665 (2020).
20. Changavi, A. A., Shashikala, A. & Ramji, A. S. Epidermal Growth Factor Receptor Expression in Triple Negative and Nontriple Negative Breast Carcinomas. J. Lab. Physicians 7, 079-083 (2015).
21. Ogden, A. et al. Combined HER3-EGFR score in triple-negative breast cancer provides prognostic and predictive significance superior to individual biomarkers. Sci. Rep. 10, 3009 (2020).
22. Fournier, M. V. et al. A Predictor of Pathological Complete Response to Neoadjuvant Chemotherapy Stratifies Triple Negative Breast Cancer Patients with High Risk of Recurrence. Sci. Rep. 9, 14863 (2019).
23. Hatzis, C. A Genomic Predictor of Response and Survival Following Taxane-Anthracycline Chemotherapy for Invasive Breast Cancer. JAMA 305, 1873 (2011).
24. Lehmann, B. D. et al. Refinement of Triple-Negative Breast Cancer Molecular Subtypes: Implications for Neoadjuvant Chemotherapy Selection. PLoS One 11, e0157368 (2016).
25. Nakashoji, A. et al. Clinical predictors of pathological complete response to neoadjuvant chemotherapy in triple-negative breast cancer. Oncol. Lett. 14, 4135-4141 (2017).
26. Lim, G. B. et al. Prediction of prognostic signatures in triple-negative breast cancer based on the differential expression analysis via NanoString nCounter immune panel. BMC Cancer 20, 1052 (2020).
27. Hu, Z.-G. et al. DTNA promotes HBV-induced hepatocellular carcinoma progression by activating STAT3 and regulating TGFβ1 and P53 signaling. Life Sci. 258, 118029 (2020).
28. Lo, H.-W. et al. Nuclear interaction of EGFR and STAT3 in the activation of the iNOS/NO pathway. Cancer Cell 7, 575-589 (2005).
29. Grillo, M. et al. Stat3 oxidation-dependent regulation of gene expression impacts on developmental processes and involves cooperation with Hif-1α. PLoS One 15, e0244255 (2020).
30. Peng, L. et al. Secreted LOXL2 is a novel therapeutic target that promotes gastric cancer metastasis via the Src/FAK pathway. Carcinogenesis 30, 1660-1669 (2009).
31. Zhang, Y. et al. Suppression of chloride voltage-gated channel 3 expression increases sensitivity of human glioma U251 cells to cisplatin through lysosomal dysfunction. Oncol. Lett. (2018) doi: 10.3892/ol.2018.8736.
32. Zhu, H. et al. NPM-ALK up-regulates iNOS expression through a STAT3/microRNA-26a-dependent mechanism. J. Pathol. 230, 82-94 (2013).
33. Ohanna, M. et al. Secretome from senescent melanoma engages the STAT3 pathway to favor reprogramming of naive melanoma towards a tumor-initiating cell phenotype. Oncotarget 4, 2212-2224 (2013).
34. Zhang, D. et al. RAB5C, SYNJ1, and RNF19B promote male ankylosing spondylitis by regulating immune cell infiltration. Ann. Transl. Med. 9, 1011-1011 (2021).
35. Yin, L., Duan, J.-J., Bian, X.-W. & Yu, S. Triple-negative breast cancer molecular subtyping and treatment progress. Breast Cancer Res. 22, 61 (2020).
36. Symmans, W. F. et al. Measurement of Residual Breast Cancer Burden to Predict Survival After Neoadjuvant Chemotherapy. J. Clin. Oncol. 25, 4414-4422 (2007).

Claims

1. A method of predicting resistance to chemotherapy in a subject having triple negative breast cancer (TNBC), wherein said method comprises: a) providing a biological sample from said subject; and b) determining the expression levels of each member of a gene panel, said gene panel comprising at least 10 genes selected from the group consisting of SLC11A2, TSHZ2, CTDSPL, NDUFA6, PTPRJ, MSH3, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS in said biological sample; wherein the determined expression levels of said genes of said gene panel is used to determine the likelihood of resistance to said chemotherapy.
2. The method according to claim 1, wherein said gene panel comprises one, two, three, four, or five of SLC11A2, TSHZ2, CTDSPL, NDUFA6 and PTPRJ.
3. The method according to claim 1 or claim 2, wherein said gene panel comprises at least 10 genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, and GDAP2.
4. The method according to any one of the preceding claims wherein the gene panel comprises at least the 10 genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, and CLCN3.
5. The method according to any one of the preceding claims, wherein said gene panel comprises at least 15 genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
6. The method according to claim 5, wherein said gene panel comprises at least the fifteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, and GDAP2.
7. The method according to any one of the preceding claims, wherein the gene panel comprises (i) at least the fifteen genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, and GDAP2, and optionally (ii) one or more of the genes of the group consisting of HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
8. The method according to any one of the preceding claims wherein said gene panel comprises at least one, two, or three genes selected from the group consisting of: SLC11A2, PPFIA2, DTNA, HOXA1, RAI1, and MKKS.
9. The method according to any one of the preceding claims wherein said gene panel comprises at least two genes selected from the group consisting of: SLC11A2, PPFIA2, and DTNA.
10. The method according to any one of the preceding claims, wherein said gene panel comprises SLC11A2.
11. The method according to claim 10 wherein said gene panel comprises SLC11A2, PPFIA2, and DTNA.
12. The method according to any one of the preceding claims wherein said gene panel comprises one, two, three or four of MSH3, TSHZ2, CTDSPL, and NDUFA6.
13. The method according to any one of the preceding claims wherein said gene panel comprises one, two, three, four or five of ITGB1, RBFOX2, DST, RCAN1 and c9orf3.
14. The method according to any one of the preceding claims wherein the gene panel comprises no more than 20 genes.
15. The method according to claim 14, wherein the gene panel comprises only genes selected from the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
16. The method according to any one of the preceding claims wherein said gene panel consists of the twenty genes of the group consisting of SLC11A2, MSH3, TSHZ2, CTDSPL, NDUFA6, PTPRJ, PPFIA2, COL21A1, EFEMP2, CLCN3, ESM1, EGFR, DTNA, EPHB3, GDAP2, HOXA1, LOXL2, RAI1, RNF19B, and MKKS.
17. The method according to any one of the preceding claims, wherein said method comprises determining the expression levels of said genes of the gene panel relative to reference amounts of said genes.
18. The method according to any one of the preceding claims wherein the likelihood of resistance to said chemotherapy is determined by applying the expression levels to a predictive model which relates expression levels of said genes of the gene panel with resistance to chemotherapy against triple negative breast cancer.
19. The method according to claim 18, wherein applying the expression levels to a predictive model comprises weighting the expression levels of said genes of the gene panel according to a predetermined ranking of said genes of the gene panel.
20. The method according to any one of the preceding claims wherein the AUC associated with said gene panel is greater than 0.750.
21. The method according to claim 20 wherein the AUC associated with said gene panel is greater than 0.800.
22. The method according to any one of the preceding claims wherein the chemotherapy is combination chemotherapy comprising an anthracycline and an alkylating agent.
23. The method according to claim 22, wherein the chemotherapy is combination chemotherapy which further comprises an antimetabolite.
PCT/GB2022/053034 2021-11-30 2022-11-30 Method of prognosis WO2023099889A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GBGB2117296.0A GB202117296D0 (en) 2021-11-30 2021-11-30 Method of prognosis
GB2117296.0 2021-11-30

Publications (1)

Publication Number Publication Date
WO2023099889A1 (en) 2023-06-08

Family

ID=80038537

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2022/053034 WO2023099889A1 (en) 2021-11-30 2022-11-30 Method of prognosis

Country Status (2)

Country Link
GB (1) GB202117296D0 (en)
WO (1) WO2023099889A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020232033A1 (en) * 2019-05-14 2020-11-19 Tempus Labs, Inc. Systems and methods for multi-label cancer classification

Non-Patent Citations (38)

* Cited by examiner, † Cited by third party
Title
ANAMPA, J.MAKOWER, D.SPARANO, J.: "A. Progress in adjuvant chemotherapy for breast cancer: An overview.", BMC MEDICINE, 2015
BIOCHEM. SOC. TRANS., vol. 48, 2020, pages 657 - 665
BRAY, F. ET AL.: "Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries.", CA. CANCER J. CLIN., 2018
CHANGAVI, A. A.SHASHIKALA, A.RAMJI, A. S.: "Epidermal Growth Factor Receptor Expression in Triple Negative and Nontriple Negative Breast Carcinomas.", J. LAB. PHYSICIANS, vol. 7, 2015, pages 079 - 083
ELIYATKIN, N.YALCIN, E.ZENGEL, B.AKTAS, S.VARDAR, E.: "Molecular Classification of Breast Carcinoma: From Traditional, Old-Fashioned Way to A New Age, and A New Way", J. BREAST HEAL., 2015
FOURNIER, M. V. ET AL.: "A Predictor of Pathological Complete Response to Neoadjuvant Chemotherapy Stratifies Triple Negative Breast Cancer Patients with High Risk of Recurrence.", SCI. REP., vol. 9, 2019, pages 14863
GASS, P. ET AL.: "Prediction of pathological complete response and prognosis in patients with neoadjuvant treatment for triple-negative breast cancer.", BMC CANCER, vol. 18, 2018, pages 1051
GENOME MEDICINE, 2017
HATZIS, C.: "A Genomic Predictor of Response and Survival Following Taxane-Anthracycline Chemotherapy for Invasive Breast Cancer.", JAMA, vol. 305, 2011, pages 1873, XP055076686, DOI: 10.1001/jama.2011.593
KARAAYVAZ, M. ET AL.: "Unravelling subclonal heterogeneity and aggressive disease states in TNBC through single-cell RNA-seq.", NAT. COMMUN., 2018
KIM, C. ET AL.: "Chemoresistance Evolution in Triple-Negative Breast Cancer Delineated by Single-Cell Sequencing", CELL, 2018
KIM, K. T. ET AL.: "Application of single-cell RNA sequencing in optimizing a combinatorial therapeutic strategy in metastatic renal cell carcinoma.", GENOME BIOL., 2016
KULKARNI, A.ANDERSON, A. G.MERULLO, D. P.KONOPKA, G: "Beyond bulk: a review of single cell transcriptomics methodologies and applications.", CURRENT
LANGFELDER, P.HORVATH, S.: "WGCNA: an R package for weighted correlation network analysis.", BMC BIOINFORMATICS, vol. 9, 2008, pages 559, XP021047563, DOI: 10.1186/1471-2105-9-559
LEHMANN, B. D. ET AL.: "Refinement of Triple-Negative Breast Cancer Molecular Subtypes: Implications for Neoadjuvant Chemotherapy Selection.", PLOS ONE, vol. 11, 2016, pages e0157368, XP055325113, DOI: 10.1371/journal.pone.0157368
LIEDTKE, C. ET AL.: "Response to neoadjuvant therapy and long-term survival in patients with triple-negative breast cancer", J. CLIN. ONCOL., 2008
LIFE SCI., vol. 258, 2020, pages 118029
LIM, G. B. ET AL.: "Prediction of prognostic signatures in triple-negative breast cancer based on the differential expression analysis via NanoString nCounter immune panel.", BMC CANCER, vol. 20, 2020, pages 1052
LO, H.-W. ET AL.: "Nuclear interaction of EGFR and STAT3 in the activation of the iNOS/NO pathway.", CANCER CELL, vol. 7, 2005, pages 575 - 589
MASUDA, H. ET AL.: "Differential Response to Neoadjuvant Chemotherapy Among 7 Triple-Negative Breast Cancer Molecular Subtypes", CLIN. CANCER RES., vol. 19
MOO, T. A.SANFORD, R.DANG, C.MORROW, M.: "Overview of Breast Cancer Therapy.", PET CLINICS, 2018
NAKASHOJI, A. ET AL.: "Clinical predictors of pathological complete response to neoadjuvant chemotherapy in triple-negative breast cancer.", ONCOL. LETT., vol. 14, 2017, pages 4135 - 4141
OGDEN, A. ET AL.: "Combined HER3-EGFR score in triple-negative breast cancer provides prognostic and predictive significance superior to individual biomarkers.", SCI. REP., vol. 10, 2020, pages 3009
OHANNA, M. ET AL.: "Secretome from senescent melanoma engages the STAT3 pathway to favor reprogramming of naive melanoma towards a tumor-initiating cell phenotype.", ONCOTARGET, vol. 4, 2013, pages 2212 - 2224
OPINION IN BIOTECHNOLOGY, 2019
PENG, L. ET AL.: "Secreted LOXL2 is a novel therapeutic target that promotes gastric cancer metastasis via the Src/FAK pathway", CARCINOGENESIS, vol. 30, 2009, pages 1660 - 1669, XP002711315, DOI: 10.1093/CARCIN/BGP178
PLOS ONE, vol. 15, 2020, pages e0244255
SAATCI, O. ET AL.: "Targeting lysyl oxidase (LOX) overcomes chemotherapy resistance in triple negative breast cancer.", NAT. COMMUN., vol. 11, 2020, pages 2416
STARK, R.GRZELAK, M.HADFIELD, J.: "RNA sequencing: the teenage years.", NATURE REVIEWS GENETICS, 2019
SYMMANS, W. F. ET AL.: "Measurement of Residual Breast Cancer Burden to Predict Survival After Neoadjuvant Chemotherapy", J. CLIN. ONCOL., vol. 25, 2007, pages 4414 - 4422, XP055079334, DOI: 10.1200/JCO.2007.10.6823
TANG, F. ET AL.: "mRNA-Seq whole-transcriptome analysis of a single cell", NAT. METHODS, 2009
W CHEN, J. ET AL.: "RNA Expression Classifiers from a Model of Breast Epithelial Cell Organization to Predict Pathological Complete Response in Triple Negative Breast Cancer.", ARCH. CLIN. BIOMED. RES., vol. 05, 2021
WANG, Z.GERSTEIN, M.SNYDER, M.: "RNA-Seq: A revolutionary tool for transcriptomics.", NATURE REVIEWS GENETICS, 2009
YIN, L.DUAN, J.-J.BIAN, X.-WYU, S: "Triple-negative breast cancer molecular subtyping and treatment progress.", BREAST CANCER RES., vol. 22, 2020, pages 61
ZHANG, D. ET AL.: "RAB5C, SYNJ1, and RNF19B promote male ankylosing spondylitis by regulating immune cell infiltration.", ANN. TRANSL. MED., vol. 9, 2021, pages 1011 - 1011
ZHANG, Y. ET AL.: "Suppression of chloride voltage-gated channel 3 expression increases sensitivity of human glioma U251 cells to cisplatin through lysosomal dysfunction.", ONCOL. LETT., 2018
ZHAO YANDING ET AL: "Gene signature-based prediction of triple-negative breast cancer patient response to Neoadjuvant chemotherapy", CANCER MEDICINE, vol. 9, no. 17, 21 July 2020 (2020-07-21), GB, pages 6281 - 6295, XP055937571, ISSN: 2045-7634, Retrieved from the Internet <URL:https://onlinelibrary.wiley.com/doi/full-xml/10.1002/cam4.3284> DOI: 10.1002/cam4.3284 *
ZHU, H. ET AL.: "NPM-ALK up-regulates iNOS expression through a STAT3/microRNA-26a-dependent mechanism", J. PATHOL., vol. 230, 2013, pages 82 - 94

Also Published As

Publication number Publication date
GB202117296D0 (en) 2022-01-12

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22821596

Country of ref document: EP

Kind code of ref document: A1