WO2016028141A1 - Method for determining prognosis of colorectal cancer

Publication number
WO2016028141A1
Authority
WO
WIPO (PCT)
Prior art keywords
polynucleotides
clinical outcome
subject
colorectal cancer
risk score
Application number
PCT/MY2015/050086
Inventor
A Rahman BIN A JAMAL
Norfilza BINTI MOHD MOKHTAR
Roslan BIN HARUN
Original Assignee
Universiti Kebangsaan Malaysia
Application filed by Universiti Kebangsaan Malaysia
Publication of WO2016028141A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/574 Immunoassay; Biospecific binding assay; Materials therefor for cancer
    • G01N33/57407 Specifically defined cancers
    • G01N33/57419 Specifically defined cancers of colon
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/118 Prognosis of disease development
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers

Definitions

  • the present invention relates to a method of predicting clinical outcome of colorectal cancer in a subject. More particularly, the present invention relates to a method of determining prognosis of colorectal cancer in a subject based on the expression levels of 19 genes in the subject, including BRCA1, C12orf66, CPXCR1, DUOX2, DUSP21, ELTD1, FRMD6, GFRA4, ITPRIP, LASS6, MRPL52, NOTCH2, OR10H5, OSBPL9, PDC, SLC38A9, SORCS2, SPG7, and TRIP13.
  • Colorectal cancer is a cancer that develops in the colon and rectum. It is the third most prevalent cancer in the world. Colorectal cancer in the early stages is often asymptomatic. Symptoms of colorectal cancer are observed when the cancer progresses and becomes more difficult to treat. Symptoms of colorectal cancer include the presence of blood in the faecal matter and change in bowel habits such as constipation and diarrhoea.
  • Treatment of colorectal cancer depends on the specific location and severity of the cancer. The most common treatment for colorectal cancer is surgical resection, wherein the cancerous cells and the nearby lymph nodes are removed. Patients with Dukes' A colorectal cancer often show excellent prognosis after the curative surgery. Patients with advanced stages of colorectal cancer have poorer prognosis and usually require adjuvant therapies to kill any cancer cells that may remain after surgical removal of the primary cancer. Adjuvant therapies, such as chemotherapy, are used to improve the event-free survival rate in colorectal cancer patients. For instance, the recurrence risk in Dukes' C colorectal cancer patients who undergo a postsurgical course of chemotherapy falls from 60 % to about 40 % to 50 %.
  • not all colorectal cancer patients require adjuvant therapy, particularly chemotherapy.
  • the five-year survival rate for Dukes' B colorectal cancer patients is about 70 % to 80 % with surgical resection only.
  • colorectal cancer patients respond differently to chemotherapy.
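  • only approximately 2 % to 4 % of colorectal cancer patients who receive adjuvant anti-cancer drugs, i.e. 5-fluorouracil (5-FU) and folinic acid, benefit from the chemotherapy.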
  • the probability that a stage II colorectal cancer patient will benefit from adjuvant chemotherapy has been difficult to determine. Nevertheless, this information is important for the patient in determining the need for adjuvant chemotherapy and the suitable type of adjuvant chemotherapy.
  • treatment decisions in colorectal cancer are based primarily on tumour stage identified pathologically.
  • the MSI status of a subject is correlated to the subject's five-year survival rate and probability of benefiting from the adjuvant chemotherapy.
  • gene expression profiling is a promising prognostic tool for colorectal cancer patients. Genetic information can help predict the patient's clinical outcome as well as the patient's probability of benefiting from adjuvant therapy.
  • a patent from the United States [Patent No. 20140094379] discloses a gene signature for colorectal cancer patients which comprises 636 genes while the United States Patent No. 2011257034 discloses a gene signature comprising 176 genes for the prognosis of colorectal cancer patients.
  • the gene signatures disclosed in these prior art documents are identified from patients with Dukes' A to Dukes' D colorectal cancer.
  • the primary objective of the present invention is to provide a method of predicting clinical outcome of a subject with colorectal cancer.
  • the subject's projected clinical outcome is derived from the expression levels of differentially expressed genes in Dukes' B and Dukes' C colorectal cancer patients.
  • Another objective of the present invention is to provide a gene signature for Dukes' B and C colorectal cancer, wherein the gene signature comprises 19 genes whose expression patterns significantly differ in subjects with positive clinical outcome and subjects with negative clinical outcome.
  • At least one of the preceding objects is met, in whole or in part, by the invention, in which the embodiment of the invention describes a method for predicting clinical outcome of colorectal cancer in a subject, comprising detecting an expression level of at least four of the polynucleotides selected from the group of polynucleotides having nucleotide sequences as set forth in SEQ ID NO.
  • the colorectal cancer in the subject is Dukes' B or Dukes' C colorectal cancer.
  • the polynucleotides having nucleotide sequences as set forth in SEQ ID NO. 1-19 are capable of hybridizing to nucleotide sequence of genes selected from the group consisting of BRCA1, C12orf66, CPXCR1, DUOX2, DUSP21, ELTD1, FRMD6, GFRA4, ITPRIP, LASS6, MRPL52, NOTCH2, OR10H5, OSBPL9, PDC, SLC38A9, SORCS2, SPG7, and TRIP13.
  • the biological samples can be tissues obtained at biopsy, surgical resection specimens or biofluids.
  • the biological samples are derived from the colorectal tissues.
  • the expression levels of the polynucleotides are detected via a microarray platform.
  • synthetic oligonucleotides having nucleotide sequences as set forth in SEQ ID NO. 1-19, or the complementary thereof, are put into contact with polynucleotides from biological samples. Hybridization between the oligonucleotides and polynucleotides can be observed if the polynucleotides from the biological sample comprise nucleotide sequences of any of the abovementioned genes.
  • the expression levels of the polynucleotides are normalised by a microarray data analysis program, wherein the microarray data analysis program used is GenomeStudio, GeneSpring GX or R-programming.
  • the microarray data analysis program is capable of performing statistical data analysis, including Significance Analysis of Microarrays, Linear Models for Microarray Data or t-test.
  • normalised expression levels of the polynucleotides are converted into a risk score.
  • a subject's risk score is compared to a reference value, which is a median of the risk scores, converted from expression levels of a sample population of subjects with colorectal cancer, preferably Dukes' B or Dukes' C colorectal cancer.
  • the method disclosed herein may further comprise a step of providing a report indicating the probability of clinical outcome of the subject.
  • the clinical outcome may be expressed based on the risk score of the subject.
  • the clinical outcome may be expressed in terms of five-year survival.
  • the present invention also describes a molecular array for predicting clinical outcome of colorectal cancer in a subject, which array comprises a plurality of polynucleotides having nucleotide sequences as set forth in SEQ ID NO. 1-19 or the complementary thereof, which are capable of hybridising to nucleotide sequences of genes selected from the group consisting of BRCA1, C12orf66, CPXCR1, DUOX2, DUSP21, ELTD1, FRMD6, GFRA4, ITPRIP, LASS6, MRPL52, NOTCH2, OR10H5, OSBPL9, PDC, SLC38A9, SORCS2, SPG7, and TRIP13 under high stringency conditions.
  • the expression levels of these polynucleotides are detected and converted to a risk score.
  • upon comparison to a reference value for the polynucleotides, an increased probability of negative clinical outcome of the subject is indicated if the risk score is higher than the reference value or an increased probability of positive clinical outcome of the subject is indicated if the risk score is lower than the reference value.
  • Figure 1 illustrates hierarchical clustering for a training set, wherein the sample population of subjects with positive clinical outcome (good prognosis) and the sample population of subjects with negative clinical outcome (poor prognosis) are clustered together respectively.
  • Figure 2 illustrates hierarchical clustering for a test set, wherein the sample population of subjects with positive clinical outcome (good prognosis) and the sample population of subjects with negative clinical outcome (poor prognosis) are clustered together respectively.
  • Figure 3 illustrates Principal Component Analysis for a training set, wherein the sample population of subjects with positive clinical outcome (good prognosis) and the sample population of subjects with negative clinical outcome (poor prognosis) are clustered together respectively.
  • Figure 4 illustrates Principal Component Analysis for a test set, wherein the sample population of subjects with positive clinical outcome (good prognosis) and the sample population of subjects with negative clinical outcome (poor prognosis) are clustered together respectively.
  • Figure 5 illustrates a Kaplan-Meier plot for a training set, wherein the survival curve of the sample population of subjects with positive clinical outcome (good prognosis) and that of the sample population of subjects with negative clinical outcome (poor prognosis) are statistically significantly separated.
  • Figure 6 illustrates a Kaplan-Meier plot for a test set, wherein the survival curve of the sample population of subjects with positive clinical outcome (good prognosis) and that of the sample population of subjects with negative clinical outcome (poor prognosis) are statistically significantly separated.
  • Figure 7 illustrates a Kaplan-Meier plot of an Australian cohort, wherein the survival curve of the sample population of subjects with positive clinical outcome (good prognosis) and that of the sample population of subjects with negative clinical outcome (poor prognosis) are statistically significantly separated.
  • Figure 8 illustrates a Kaplan-Meier plot of an American cohort, wherein the survival curve of the sample population of subjects with positive clinical outcome (good prognosis) and that of the sample population of subjects with negative clinical outcome (poor prognosis) are statistically significantly separated.
  • the present invention discloses a method for predicting clinical outcome of colorectal cancer in a subject.
  • the method disclosed herein comprises the steps of (a) detecting an expression level of at least four of the polynucleotides selected from the group of polynucleotides having nucleotide sequences as set forth in SEQ ID NO. 1-19 or the complementary thereof in a biological sample obtained from the subject; (b) converting the expression levels of the polynucleotides to a risk score; and (c) determining the probability of a positive clinical outcome for the subject based on the risk score of the polynucleotides; wherein upon comparison to a reference value for the polynucleotides, an increased probability of negative clinical outcome of the subject is indicated if the risk score is higher than the reference value or an increased probability of positive clinical outcome of the subject is indicated if the risk score is lower than the reference value.
  • RNA ribonucleic acid
  • cDNA complementary deoxyribonucleic acid
  • a moiety that can produce a detectable signal is incorporated at the 5' or 3' end of the cDNA strands.
  • a further embodiment of the present invention comprises a pre-qualification step wherein the quality of RNA extracted from the biological sample is assessed.
  • the quality assessment of the RNA which includes RNA integrity test, is conducted using quantitative real-time polymerase chain reaction (qPCR).
  • qPCR quantitative real-time polymerase chain reaction
  • polynucleotides encoding ribosomal protein L13A (RPL13A) are amplified. SEQ ID NO. 20 and SEQ ID NO.
  • RNA samples with a cycle threshold (Ct) value of 29 or less are subjected to reverse transcription polymerase chain reaction (RT-PCR) to obtain complementary DNA (cDNA) for application in microarray.
  • Ct cycle threshold
  • RT-PCR reverse transcription polymerase chain reaction
  • the expression levels of polynucleotides are measured.
  • expression levels of at least four polynucleotides selected from the group of polynucleotides having nucleotide sequences as set forth in SEQ ID NO. 1-19 or the complementary thereof are detected.
  • polynucleotides are complementary or capable of hybridising to nucleotide sequences of genes selected from the group consisting of BRCA1, C12orf66, CPXCR1, DUOX2, DUSP21, ELTD1, FRMD6, GFRA4, ITPRIP, LASS6, MRPL52, NOTCH2, OR10H5, OSBPL9, PDC, SLC38A9, SORCS2, SPG7, and TRIP13.
  • Table 1 in Example 1 shows the complete or partial nucleotide sequences of the abovementioned polynucleotides and the corresponding genes to which the polynucleotides hybridised.
  • the expression levels of the messenger RNA (mRNA) of these genes are determined.
  • the expression levels of the cDNA of the mRNA of these genes are determined.
  • the expression levels of the polynucleotides are detected via a microarray. Any conventionally known microarray can be used. Whole-genome cDNA-mediated Annealing, Selection, Extension, and Ligation (WG-DASL) array is preferred if the biological samples are formalin-fixed paraffin-embedded. A molecular array adapted for the method disclosed herein will be described later.
  • cDNA synthesised from the RNA of the biological sample is put into contact with synthetic oligonucleotides, having nucleotide sequences as set forth in SEQ ID NO. 1-19, that are immobilised on a surface.
  • if the cDNA comprises nucleotide sequences of any of the abovementioned genes, hybridisation between the cDNA and its complementary oligonucleotides can be observed.
  • the level of hybridisation indicates the expression levels of the cDNA or its corresponding mRNA or gene in the subject.
  • the quantitative determination of expression levels of polynucleotides is performed using a microarray data analysis program.
  • statistical software such as GeneSpring or GenomeStudio, which provide statistical tools for analysis of gene expression, can be used in conjunction with computer programming languages such as R for analysing the expression data obtained from the microarray.
  • The microarray data analysis program can perform statistical data analysis such as Significance Analysis of Microarrays (SAM), Linear Models for Microarray Data (LIMMA) or t-test to identify differentially expressed genes in colorectal cancer cells.
  • SAM Significance Analysis of Microarrays
  • LIMMA Linear Models for Microarray Data
  • genes BRCA1, C12orf66, CPXCR1, DUOX2, DUSP21, ELTD1, FRMD6, GFRA4, ITPRIP, LASS6, MRPL52, NOTCH2, OR10H5, OSBPL9, PDC, SLC38A9, SORCS2, SPG7, and TRIP13 are differentially expressed genes identified by all three tests (SAM, LIMMA, and t-test) in a sample population of subjects with Dukes' B or C colorectal cancer.
  • the expression levels of the polynucleotides or their corresponding genes are normalised by the microarray data analysis program. Normalisation of expression values is performed to compensate for systematic technical differences between microarrays, such as the hybridisation conditions.
  • the expression data may be subjected to quality control measures, such as principal component analysis (PCA), to eliminate outliers and background signal noise.
  • PCA principal component analysis
  • the normalised expression values are converted into a risk score using equation (1).
  • $\mathrm{Expr}_{\mathrm{gene}_i}$ is the normalised expression value of the polynucleotide or its corresponding gene
  • $p_{\mathrm{gene}_i}$ is the regression coefficient obtained from the pensim R package
  • $i$ is the SEQ ID NO. of the polynucleotide.
  • the risk score of a subject is compared to a reference value for determining the subject's probability of a positive outcome.
  • the reference value is a median of the risk scores converted from expression levels of a sample population of subjects with colorectal cancer. Specifically, these subjects are diagnosed with Dukes' B or C colorectal cancer.
  • the present invention further comprises a step of providing a report indicating the probability of clinical outcome of the subject.
  • the report indicates the subject's probability of a positive or negative clinical outcome based on the comparison of the subject's risk score to the reference value. Particularly, if the subject's risk score is higher than the reference value, the subject has an increased probability of negative clinical outcome or poor prognosis. Conversely, the subject has an increased probability of positive clinical outcome or good prognosis if the subject's risk score is lower than the reference value.
  • the report indicates the subject's projected clinical outcome in terms of five-year survival rate.
  • the subject's estimated five-year survival rate can be obtained upon comparison of the subject's projected clinical outcome, obtained as described in preceding description, to a Kaplan-Meier plot showing the five-year survival curves of a sample population of subjects with positive clinical outcome and a sample population of subjects with negative clinical outcome.
  • the present invention also describes a molecular array for predicting clinical outcome of colorectal cancer in a subject.
  • the array comprises a plurality of polynucleotides or synthetic oligonucleotides having nucleotide sequences as set forth in SEQ ID NO. 1-19 or the complementary thereof, which are capable of hybridising to nucleotide sequences of genes selected from the group consisting of BRCA1, C12orf66, CPXCR1, DUOX2, DUSP21, ELTD1, FRMD6, GFRA4, ITPRIP, LASS6, MRPL52, NOTCH2, OR10H5, OSBPL9, PDC, SLC38A9, SORCS2, SPG7, and TRIP13 under high stringency conditions.
  • genes are expression signatures for Dukes' B and C colorectal cancer.
  • the expression levels of these genes or the polynucleotides corresponding to these genes are detected and converted to a risk score as described in preceding description.
  • upon comparison to a reference value for the genes or the polynucleotides corresponding to these genes, an increased probability of negative clinical outcome of the subject is indicated if the risk score is higher than the reference value or an increased probability of positive clinical outcome of the subject is indicated if the risk score is lower than the reference value.
  • the present invention provides a method for estimating the clinical outcome of Dukes' B and C patients based on the gene expression profile of the patients.
  • the predicted clinical outcome or the projected five-year survival rate of a subject obtained via the method described herein may help a physician in deciding the need for adjuvant therapy or the type of adjuvant therapy for Dukes' B and Dukes' C colorectal cancer.
  • a subject with increased probability of positive clinical outcome may avoid adjuvant chemotherapy or may be given a low-dose chemotherapy.
  • a subject with increased probability of negative clinical outcome may benefit from more aggressive treatments such as high-dose chemotherapy.
  • FFPE Formalin-fixed, paraffin-embedded
  • the samples are sectioned and stained using Haematoxylin and Eosin (H&E) staining to observe the morphology of the tissue. Seventy-eight samples with tissue sections containing more than 80% cancer cells are selected for subsequent gene expression analysis, wherein half of the samples are obtained from patients in the good prognosis group while the other half are obtained from patients in the poor prognosis group. In each sample, tissues that are free of inflamed or cancerous cells are used as the control. Prior to RNA purification, the tissue sections are deparaffinized by immersing the tissue slides in xylene and then in absolute alcohol.
  • H&E Haematoxylin and Eosin
  • RNA from the cancerous tissues and normal tissues is extracted using the High Pure RNA Paraffin Kit (Roche Applied Science, Malaysia). A Nanodrop ND-1000 Spectrophotometer is used to quantify the extracted RNA, while the Bioanalyzer 2100 RNA 6000 Nano Kit (Agilent Technologies, Malaysia) is used to assess its quality or integrity.
  • the RNA extracted is then subjected to a pre-qualification analysis, which is performed using quantitative polymerase chain reaction (qPCR) by means of the Corbett Rotor-Gene 6000 series detection system (QIAGEN, Malaysia).
  • the primers used in the qPCR are designed from the transcript of ribosomal protein L13A (RPL13A).
  • the forward primer sequence (SEQ ID NO. 20) and reverse primer sequence (SEQ ID NO. 21) are shown in Table 1.
  • RNA with cycle threshold (Ct) value of 29 or less is used in the gene expression analysis.
  • gene expression analysis of the RNA is performed using the Whole-Genome cDNA-mediated Annealing, Selection, Extension, and Ligation (WG-DASL) assay according to the manufacturer's protocol.
  • the array data is analysed using GenomeStudio Software (Illumina, Malaysia) to check the data quality.
  • the microarray data is normalised and background noise is eliminated.
  • Principal component analysis (PCA) mapping is conducted to ensure that patients from the good prognosis group and the poor prognosis group exhibit different gene expression patterns. Genes that are not expressed in both control and experimental groups are filtered out. The outliers are replaced by the 90 % confidence interval (CI) for the robust mean to improve the normalisation of gene expression values.
  • PCA Principal component analysis
  • the expression data is analysed using the GeneSpring GX 12.0.2 software (Agilent, Malaysia) and R-programming.
  • the normalised gene expression values of patients from both good and poor prognosis groups are randomly divided into one hundred training and test sets.
  • Three different statistical tests, namely Significance Analysis of Microarrays (SAM), Linear Models for Microarray Data (LIMMA), and t-test, are then employed to identify the differentially expressed genes in the training and test sets.
  • a p-value of 0.05 is used as a cut-off for significance.
  • Figures 1 and 2 illustrate the hierarchical clustering for a training set and a test set, respectively.
  • Figures 3 and 4 illustrate the PCA for the training set and the test set, respectively.
  • a prognostic signature comprising 19 genes is obtained; these 19 genes are consistently identified as significant genes in all training and test sets in each statistical test. These genes are BRCA1, C12orf66, CPXCR1, DUOX2, DUSP21, ELTD1, FRMD6, GFRA4, ITPRIP, LASS6, MRPL52, NOTCH2, OR10H5, OSBPL9, PDC, SLC38A9, SORCS2, SPG7, and TRIP13. Table 1 shows the nucleotide sequences of the probes of these genes (SEQ ID NO. 1-19).
  • Chromosome 12 ORF 66 | C12orf66 | CGGCAGGTGTTCATGGGAAGTAGGTGGATCAGTGAAGGCATAATGGGCTG
  • Solute carrier family 38, member 9 | SLC38A9 | AGACCATTTCTGCAGTTTGCCCAAACCTCTACTGTTTGGGACAGTAAGCC
  • Thyroid hormone receptor interactor 13 | TRIP13 | CGGTATGGGCGCCCCTGCATTGCTGGGATGTTTCTGCCCACGGTTTTGTT
  • a median risk score is determined and is validated in the test set.
  • the validated median risk score can be used as a cut-off for categorising Dukes' B and C colorectal cancer patients into the good and poor prognosis groups. Patients with risk score lower than the median risk score are classified as having good prognosis or low risk while patients with risk score higher than the median risk score are classified as having poor prognosis or high risk.
  • a Kaplan-Meier plot showing the five-year survival rates of the patients categorised as having good prognosis and patients categorised as having poor prognosis is generated using Cox proportional hazards test from the expression levels of the genes.
  • Figures 5 and 6 show that the survival curve of patients in the good prognosis group differs significantly from that of patients in the poor prognosis group. This indicates that the prognostic signature provided herein is highly reliable in predicting the clinical outcome of a subject.
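
The Kaplan-Meier survival curves referred to above can be estimated with the standard product-limit formula. The following is a minimal sketch in Python and not the inventors' implementation; the function name `kaplan_meier`, the use of months as the time unit, and the toy follow-up data are assumptions made purely for illustration.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  -- follow-up time for each subject (assumed here to be in months)
    events -- True if the event (e.g. death) was observed, False if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Subjects who experienced the event exactly at time t.
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        # Everyone whose follow-up ends at t (events and censored alike).
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up data for one prognosis group (illustration only).
times = [12, 24, 24, 36, 60]
events = [True, True, False, True, False]
curve = kaplan_meier(times, events)
```

Running the estimator separately on the good prognosis and poor prognosis groups yields the two curves that a Kaplan-Meier plot compares; testing whether the curves differ significantly is a separate step that this sketch does not cover.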

Abstract

A method for predicting clinical outcome of colorectal cancer in a subject comprises detecting an expression level of at least four polynucleotides selected from the group of polynucleotides having nucleotide sequences as set forth in SEQ ID NO. 1-19 or the complementary thereof in a biological sample obtained from the subject; converting the expression levels of the polynucleotides to a risk score; and determining the probability of a positive clinical outcome for the subject based on the risk score of the polynucleotides; wherein upon comparison to a reference value for the polynucleotides, an increased probability of negative clinical outcome of the subject is indicated if the risk score is higher than the reference value or an increased probability of positive clinical outcome of the subject is indicated if the risk score is lower than the reference value. The polynucleotides having nucleotide sequence as set forth in SEQ ID NO. 1-19 are capable of hybridising to nucleotide sequence of genes selected from the group consisting of BRCA1, C12orf66, CPXCR1, DUOX2, DUSP21, ELTD1, FRMD6, GFRA4, ITPRIP, LASS6, MRPL52, NOTCH2, OR10H5, OSBPL9, PDC, SLC38A9, SORCS2, SPG7, and TRIP13.
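
The conversion of expression levels to a risk score and the comparison against a reference value can be sketched as follows. This is a minimal illustration which assumes that the risk score is a weighted sum of normalised expression values, each multiplied by a regression coefficient; the coefficients, expression values, and function names are hypothetical and are not taken from the source, and the handling of a score exactly equal to the reference value is likewise an assumption.

```python
from statistics import median

# Hypothetical coefficients for four of the 19 genes, for illustration only;
# the fitted regression coefficients are not given in the text.
COEFFICIENTS = {"BRCA1": 0.5, "NOTCH2": -0.25, "SPG7": 1.0, "TRIP13": 0.75}


def risk_score(expression, coefficients):
    """Weighted sum of normalised expression values (assumed linear form)."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


def classify(score, reference):
    """Compare a subject's risk score with the reference value."""
    if score > reference:
        return "poor prognosis"
    if score < reference:
        return "good prognosis"
    return "indeterminate"  # assumption: ties are not addressed in the text


# Toy reference cohort with made-up normalised expression values.
cohort = [
    {"BRCA1": 1.0, "NOTCH2": 2.0, "SPG7": 1.0, "TRIP13": 0.0},
    {"BRCA1": 2.0, "NOTCH2": 0.0, "SPG7": 2.0, "TRIP13": 2.0},
    {"BRCA1": 0.0, "NOTCH2": 4.0, "SPG7": 0.0, "TRIP13": 0.0},
]
cohort_scores = [risk_score(s, COEFFICIENTS) for s in cohort]

# The reference value is the median risk score of the sample population.
reference_value = median(cohort_scores)
labels = [classify(s, reference_value) for s in cohort_scores]
```

Using the cohort median as the cut-off means that roughly half of the reference population falls on each side, which matches the description of the reference value as a median of risk scores from a sample population.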

Description

METHOD FOR DETERMINING PROGNOSIS OF COLORECTAL CANCER
Field of Invention
The present invention relates to a method of predicting clinical outcome of colorectal cancer in a subject. More particularly, the present invention relates to a method of determining prognosis of colorectal cancer in a subject based on the expression levels of 19 genes in the subject, including BRCA1, C12orf66, CPXCR1, DUOX2, DUSP21, ELTD1, FRMD6, GFRA4, ITPRIP, LASS6, MRPL52, NOTCH2, OR10H5, OSBPL9, PDC, SLC38A9, SORCS2, SPG7, and TRIP13.
Background of the Invention
Colorectal cancer is a cancer that develops in the colon and rectum. It is the third most prevalent cancer in the world. Colorectal cancer in the early stages is often asymptomatic. Symptoms of colorectal cancer are observed when the cancer progresses and becomes more difficult to treat. Symptoms of colorectal cancer include the presence of blood in the faecal matter and change in bowel habits such as constipation and diarrhoea.
Treatment of colorectal cancer depends on the specific location and severity of the cancer. The most common treatment for colorectal cancer is surgical resection, wherein the cancerous cells and the nearby lymph nodes are removed. Patients with Dukes' A colorectal cancer often show excellent prognosis after the curative surgery. Patients with advanced stages of colorectal cancer have poorer prognosis and usually require adjuvant therapies to kill any cancer cells that may remain after surgical removal of the primary cancer. Adjuvant therapies, such as chemotherapy, are used to improve the event-free survival rate in colorectal cancer patients. For instance, the recurrence risk in Dukes' C colorectal cancer patients who undergo a postsurgical course of chemotherapy falls from 60 % to about 40 % to 50 %. However, not all colorectal cancer patients require adjuvant therapy, particularly chemotherapy. The five-year survival rate for Dukes' B colorectal cancer patients is about 70 % to 80 % with surgical resection only. Furthermore, colorectal cancer patients respond differently to chemotherapy. Only approximately 2 % to 4 % of colorectal cancer patients who receive adjuvant anti-cancer drugs, i.e. 5-fluorouracil (5-FU) and folinic acid, benefit from the chemotherapy. The probability that a stage II colorectal cancer patient will benefit from adjuvant chemotherapy has been difficult to determine. Nevertheless, this information is important for the patient in determining the need for adjuvant chemotherapy and the suitable type of adjuvant chemotherapy. Currently, treatment decisions in colorectal cancer are based primarily on tumour stage identified pathologically.
The need for adjuvant treatments can be determined based on a patient's prognosis after surgery. Nevertheless, it is sometimes difficult to distinguish Dukes' B and Dukes' C colorectal cancer patients based on pathological examination of cancer cells removed surgically. Hence, it is difficult for physicians to determine the treatment choice for Dukes' B and Dukes' C colorectal cancer patients, as the patients' prognosis is uncertain. In view of this problem, a method for prognosis after surgery in stage II colorectal cancer patients which guides the treatment decisions is highly desirable. As disclosed by Dotan and Cohen, 2011, microsatellite instability (MSI) is a promising prognostic and predictive tool for colorectal cancer patients. The MSI status of a subject is correlated to the subject's five-year survival rate and probability of benefiting from the adjuvant chemotherapy. Gene expression profiling is also a promising prognostic tool for colorectal cancer patients. Genetic information can help predict the patient's clinical outcome as well as the patient's probability of benefiting from adjuvant therapy. A patent from the United States [Patent No. 20140094379] discloses a gene signature for colorectal cancer patients which comprises 636 genes while the United States Patent No. 2011257034 discloses a gene signature comprising 176 genes for the prognosis of colorectal cancer patients. However, the gene signatures disclosed in these prior art documents are identified from patients with Dukes' A to Dukes' D colorectal cancer.
Therefore, a method of predicting the clinical outcome of a Dukes' B or Dukes' C colorectal cancer patient based on the expression levels of a gene signature of Dukes' B and Dukes' C colorectal cancer is highly desirable. The present invention provides such a method.
Summary of the Invention
The primary objective of the present invention is to provide a method of predicting clinical outcome of a subject with colorectal cancer. Particularly, the subject's projected clinical outcome is derived from the expression levels of differentially expressed genes in Dukes' B and Dukes' C colorectal cancer patients. Another objective of the present invention is to provide a gene signature for Dukes' B and C colorectal cancer, wherein the gene signature comprises 19 genes whose expression patterns significantly differ in subjects with positive clinical outcome and subjects with negative clinical outcome. At least one of the preceding objects is met, in whole or in part, by the invention, in which the embodiment of the invention describes a method for predicting clinical outcome of colorectal cancer in a subject, comprising detecting an expression level of at least four of the polynucleotides selected from the group of polynucleotides having nucleotide sequences as set forth in SEQ ID NO. 1-19 or the complementary thereof in a biological sample obtained from the subject; converting the expression levels of the polynucleotides to a risk score; and determining the probability of a positive clinical outcome for the subject based on the risk score of the polynucleotides; wherein upon comparison to a reference value for the polynucleotides, an increased probability of negative clinical outcome of the subject is indicated if the risk score is higher than the reference value or an increased probability of positive clinical outcome of the subject is indicated if the risk score is lower than the reference value. Particularly, the colorectal cancer in the subject is Dukes' B or Dukes' C colorectal cancer. In accordance with the preferred embodiment of the invention, the polynucleotides having nucleotide sequences as set forth in SEQ ID NO. 
1-19 are capable of hybridizing to nucleotide sequences of genes selected from the group consisting of BRCA1, C12orf66, CPXCR1, DUOX2, DUSP21, ELTD1, FRMD6, GFRA4, ITPRIP, LASS6, MRPL52, NOTCH2, OR10H5, OSBPL9, PDC, SLC38A9, SORCS2, SPG7, and TRIP13.
The biological samples can be tissues obtained at biopsy, surgical resection specimens or biofluids. Preferably, the biological samples are derived from the colorectal tissues. In the preferred embodiment, the expression levels of the polynucleotides are detected via a microarray platform. In the microarray method, synthetic oligonucleotides having nucleotide sequences as set forth in SEQ ID NO. 1-19, or the complementary thereof, are put into contact with polynucleotides from biological samples. Hybridization between the oligonucleotides and polynucleotides can be observed if the polynucleotides from the biological sample comprise nucleotide sequences of any of the abovementioned genes.
In a further embodiment, the expression levels of the polynucleotides are normalised by a microarray data analysis program, wherein the microarray data analysis program used is GenomeStudio, GeneSpring GX or R-programming. The microarray data analysis program is capable of performing statistical data analysis, including Significance Analysis of Microarrays, Linear Models of Microarray Data or t-test.
In a further embodiment of the invention, normalised expression levels of the polynucleotides are converted into a risk score. A subject's risk score is compared to a reference value, which is a median of the risk scores, converted from expression levels of a sample population of subjects with colorectal cancer, preferably Dukes' B or Dukes' C colorectal cancer. The method disclosed herein may further comprise a step of providing a report indicating the probability of clinical outcome of the subject. In one embodiment, the clinical outcome may be expressed based on the risk score of the subject. In another embodiment, the clinical outcome may be expressed in terms of five-year survival. The present invention also describes a molecular array for predicting clinical outcome of colorectal cancer in a subject, which array comprises a plurality of polynucleotides having nucleotide sequences as set forth in SEQ ID NO. 1-19 or the complementary thereof, which are capable of hybridising to nucleotide sequences of genes selected from the group consisting of BRCA1, C12orf66, CPXCR1, DUOX2, DUSP21, ELTD1, FRMD6, GFRA4, ITPRIP, LASS6, MRPL52, NOTCH2, OR10H5, OSBPL9, PDC, SLC38A9, SORCS2, SPG7, and TRIP13 under high stringency conditions. Specifically, the expression levels of these polynucleotides are detected and converted to a risk score. Upon comparison to a reference value for the polynucleotides, an increased probability of negative clinical outcome of the subject is indicated if the risk score is higher than the reference value or an increased probability of positive clinical outcome of the subject is indicated if the risk score is lower than the reference value.
Brief Description of the Drawings
For the purpose of facilitating an understanding of the invention, the preferred embodiments are illustrated in the accompanying drawings. When these are considered in connection with the following description, the invention, its construction and operation and many of its advantages would be readily understood and appreciated.
Figure 1 illustrates hierarchical clustering for a training set, wherein the sample population of subjects with positive clinical outcome (good prognosis) and the sample population of subjects with negative clinical outcome (poor prognosis) are clustered together respectively.
Figure 2 illustrates hierarchical clustering for a test set, wherein the sample population of subjects with positive clinical outcome (good prognosis) and the sample population of subjects with negative clinical outcome (poor prognosis) are clustered together respectively.
Figure 3 illustrates Principal Component Analysis for a training set, wherein the sample population of subjects with positive clinical outcome (good prognosis) and the sample population of subjects with negative clinical outcome (poor prognosis) are clustered together respectively.
Figure 4 illustrates Principal Component Analysis for a test set, wherein the sample population of subjects with positive clinical outcome (good prognosis) and the sample population of subjects with negative clinical outcome (poor prognosis) are clustered together respectively.
Figure 5 illustrates a Kaplan-Meier plot for a training set, wherein the survival curve of the sample population of subjects with positive clinical outcome (good prognosis) and that of the sample population of subjects with negative clinical outcome (poor prognosis) are statistically significantly separated.
Figure 6 illustrates a Kaplan-Meier plot for a test set, wherein the survival curve of the sample population of subjects with positive clinical outcome (good prognosis) and that of the sample population of subjects with negative clinical outcome (poor prognosis) are statistically significantly separated.
Figure 7 illustrates a Kaplan-Meier plot of an Australian cohort, wherein the survival curve of the sample population of subjects with positive clinical outcome (good prognosis) and that of the sample population of subjects with negative clinical outcome (poor prognosis) are statistically significantly separated.
Figure 8 illustrates a Kaplan-Meier plot of an American cohort, wherein the survival curve of the sample population of subjects with positive clinical outcome (good prognosis) and that of the sample population of subjects with negative clinical outcome (poor prognosis) are statistically significantly separated.
Detailed Description of the Invention
One skilled in the art will readily appreciate that the present invention is well adapted to carry out the objectives as well as to obtain the ends and advantages mentioned, and those inherent therein. The embodiment described herein is not intended as limitations on the scope of the invention. The terms "positive clinical outcome", "good prognosis" and "low risk" are used interchangeably unless otherwise mentioned. Likewise, the terms "negative clinical outcome", "poor prognosis" and "high risk" are used interchangeably unless otherwise mentioned. The present invention discloses a method for predicting clinical outcome of colorectal cancer in a subject. The method disclosed herein comprises the steps of (a) detecting an expression level of at least four of the polynucleotides selected from the group of polynucleotides having nucleotide sequences as set forth in SEQ ID NO. 1-19 or the complementary thereof in a biological sample obtained from the subject; (b) converting the expression levels of the polynucleotides to a risk score; and (c) determining the probability of a positive clinical outcome for the subject based on the risk score of the polynucleotides; wherein upon comparison to a reference value for the polynucleotides, an increased probability of negative clinical outcome of the subject is indicated if the risk score is higher than the reference value or an increased probability of positive clinical outcome of the subject is indicated if the risk score is lower than the reference value.
The method disclosed herein is used to estimate the prognosis of Dukes' B and Dukes' C colorectal cancer patients. Biological samples obtained from the subjects can be in the form of tissue biopsy, surgical resection specimens or biofluids. Preferably, the biological samples are derived from the colorectal tissues. More preferably, the biological samples should contain at least 80 % of cancerous cells. In order to perform the method disclosed by the present invention, polynucleotides, particularly ribonucleic acid (RNA), are extracted from the biological samples and preferably converted to complementary deoxyribonucleic acid (cDNA) prior to the subsequent expression analysis. In one embodiment, a moiety that can produce a detectable signal, such as a fluorophore, dye or radiolabel, is incorporated at the 5' or 3' end of the cDNA strands.
A further embodiment of the present invention comprises a pre-qualification step wherein the quality of RNA extracted from the biological sample is assessed. The quality assessment of the RNA, which includes an RNA integrity test, is conducted using quantitative real-time polymerase chain reaction (qPCR). Preferably, polynucleotides encoding ribosomal protein L13A (RPL13A) are amplified. SEQ ID NO. 20 and SEQ ID NO. 21 respectively represent the nucleotide sequences of the forward and reverse primers flanking the complete nucleotide sequence of the RPL13A transcript. Samples containing RNA with a cycle threshold (Ct) value of 29 or less are subjected to reverse transcription polymerase chain reaction (RT-PCR) to obtain cDNA for application in the microarray.
As described in the preceding description, the expression levels of polynucleotides are measured. Preferably, expression levels of at least four polynucleotides selected from the group of polynucleotides having nucleotide sequences as set forth in SEQ ID NO. 1-19 or the complementary thereof are detected.
These polynucleotides are complementary or capable of hybridising to nucleotide sequences of genes selected from the group consisting of BRCA1, C12orf66, CPXCR1, DUOX2, DUSP21, ELTD1, FRMD6, GFRA4, ITPRIP, LASS6, MRPL52, NOTCH2, OR10H5, OSBPL9, PDC, SLC38A9, SORCS2, SPG7, and TRIP13. Table 1 in Example 1 shows the complete or partial nucleotide sequences of the abovementioned polynucleotides and the corresponding genes to which the polynucleotides hybridise. In the preferred embodiment, the expression levels of the messenger RNA (mRNA) of these genes are determined. In a more preferred embodiment, the expression levels of the cDNA of the mRNA of these genes are determined. Preferably, the expression levels of the polynucleotides are detected via a microarray. Any conventionally known microarray can be used. A Whole-genome cDNA-mediated Annealing, Selection, Extension, and Ligation (WG-DASL) array is preferred if the biological samples are formalin-fixed and paraffin-embedded. A molecular array adapted for the method disclosed herein will be described later. In the preferred embodiment of the invention, cDNA synthesised from the RNA of the biological sample is put into contact with synthetic oligonucleotides that have nucleotide sequences as set forth in SEQ ID NO. 1-19 and are immobilised on a surface. If the cDNA comprises nucleotide sequences of any of the abovementioned genes, hybridisation between the cDNA and its complementary oligonucleotides can be observed. The level of hybridisation indicates the expression levels of the cDNA or its corresponding mRNA or gene in the subject.
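By way of illustration only, the notion of a probe "or the complementary thereof" hybridising to a cDNA can be sketched in Python. The sketch assumes an idealised, perfect-match model of hybridisation; the function names, the short probe and the cDNA fragment are hypothetical and are not sequences from Table 1. Real hybridisation under high stringency conditions depends on temperature, salt concentration and mismatches, none of which are modelled here.

```python
# Watson-Crick base pairing for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def can_hybridise(probe: str, cdna: str) -> bool:
    """Idealised check: the cDNA contains a stretch perfectly complementary to the probe."""
    return reverse_complement(probe) in cdna.upper()


# Hypothetical 6-mer probe and cDNA fragment, for illustration only.
TOY_PROBE = "ATGCGT"
toy_cdna = "GG" + reverse_complement(TOY_PROBE) + "TT"

print(reverse_complement(TOY_PROBE))
print(can_hybridise(TOY_PROBE, toy_cdna))
```

In this simplified model, a probe and its complementary sequence are interchangeable in the sense that each identifies one strand of the same target.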
Preferably, the quantitative determination of expression levels of polynucleotides is performed using a microarray data analysis program. More particularly, statistical software such as GeneSpring or GenomeStudio, which provides statistical tools for the analysis of gene expression, can be used in conjunction with a programming language such as R to analyse the expression data obtained from the microarray.
The microarray data analysis program can perform statistical data analysis such as Significance Analysis of Microarrays (SAM), Linear Models of Microarray Data (LIMMA) or t-test to identify differentially expressed genes in colorectal cancer cells. In the preferred embodiment, genes BRCA1, C12orf66, CPXCR1, DUOX2, DUSP21, ELTD1, FRMD6, GFRA4, ITPRIP, LASS6, MRPL52, NOTCH2, OR10H5, OSBPL9, PDC, SLC38A9, SORCS2, SPG7, and TRIP13 are differentially expressed genes identified by all three of SAM, LIMMA, and t-test in a sample population of subjects with Dukes' B or C colorectal cancer. These genes are all significant at the 5 % level. Particularly, these 19 genes carry equal weight in predicting the subject's clinical outcome. In a further embodiment of the invention, the expression levels of the polynucleotides or their corresponding genes are normalised by the microarray data analysis program. Normalisation of expression values is performed to compensate for systematic technical differences between microarrays, such as differences in hybridisation conditions. The expression data may be subjected to quality control measures, such as principal component analysis (PCA), to eliminate outliers and background signal noise.
In accordance with the preferred embodiment of the invention, the normalised expression values are converted into a risk score using the equation (1).
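$$\text{Risk score} = \sum_{i=1}^{19} \left( \mathrm{Expr}_{\mathrm{gene}\,i} \times \beta_{\mathrm{gene}\,i} \right) \tag{1}$$
wherein $\mathrm{Expr}_{\mathrm{gene}\,i}$ is the normalised expression value of the polynucleotide or its corresponding gene, $\beta_{\mathrm{gene}\,i}$ is the regression coefficient obtained from pensim R package, and $i$ is the SEQ ID NO. of the polynucleotide.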
Particularly, the risk score of a subject is compared to a reference value for determining the subject's probability of a positive outcome. In the preferred embodiment, the reference value is a median of the risk scores converted from expression levels of a sample population of subjects with colorectal cancer. Specifically, these subjects are diagnosed with Dukes' B or C colorectal cancer.
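By way of illustration only, the conversion of normalised expression values into a risk score according to equation (1), and its comparison with a median reference value, can be sketched in Python. The regression coefficients, expression values and reference cohort scores below are hypothetical placeholders and not values from the present disclosure; in practice the coefficients would come from the regression fit. Four genes are used because the method requires at least four of the polynucleotides. How a score exactly equal to the reference value is handled is not stated in the description, so assigning it to the low-risk group is an assumption of this sketch.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Equation (1): sum of (normalised expression x regression coefficient) over the genes."""
    return sum(expression[gene] * coefficients[gene] for gene in coefficients)


def classify(score, reference):
    """Higher than the reference value -> high risk; otherwise low risk (ties: assumption)."""
    return "high risk" if score > reference else "low risk"


# Hypothetical regression coefficients for four of the 19 genes.
coefficients = {"BRCA1": 0.5, "DUOX2": -0.25, "NOTCH2": 0.75, "TRIP13": 1.0}

# Hypothetical normalised expression values for one subject.
patient = {"BRCA1": 2.0, "DUOX2": 4.0, "NOTCH2": 2.0, "TRIP13": 1.0}

# Hypothetical risk scores of a reference population; the reference value is their median.
reference_scores = [1.0, 2.0, 3.0, 1.5, 2.5]
reference = median(reference_scores)

score = risk_score(patient, coefficients)
print(score, reference, classify(score, reference))
```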
The present invention further comprises a step of providing a report indicating the probability of clinical outcome of the subject. In one embodiment, the report indicates the subject's probability of a positive or negative clinical outcome based on the comparison of the subject's risk score to the reference value. Particularly, if the subject's risk score is higher than the reference value, the subject has an increased probability of negative clinical outcome or poor prognosis. Conversely, the subject has an increased probability of positive clinical outcome or good prognosis if the subject's risk score is lower than the reference value. In another preferred embodiment, the report indicates the subject's projected clinical outcome in terms of five-year survival rate. The subject's estimated five-year survival rate can be obtained upon comparison of the subject's projected clinical outcome, obtained as described in the preceding description, to a Kaplan-Meier plot showing the five-year survival curves of a sample population of subjects with positive clinical outcome and a sample population of subjects with negative clinical outcome.
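By way of illustration only, a survival curve of the kind shown in a Kaplan-Meier plot can be estimated with the standard Kaplan-Meier product-limit estimator, sketched below in Python. The follow-up times and event indicators are hypothetical and are not data from the present disclosure; the function names are likewise illustrative. The estimated survival at five years for a given prognosis group can then be read from the curve.

```python
def kaplan_meier(times, events):
    """Product-limit estimate. Returns [(time, survival)] at each time with an observed death.

    times  -- follow-up time of each subject (e.g. in years)
    events -- 1 if death was observed at that time, 0 if the subject was censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at time t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def survival_at(curve, t):
    """Estimated survival probability at time t (step function, 1.0 before the first death)."""
    estimate = 1.0
    for time, surv in curve:
        if time > t:
            break
        estimate = surv
    return estimate


# Hypothetical follow-up data for one prognosis group.
follow_up_years = [1, 2, 3, 4, 6, 6]
death_observed = [1, 0, 1, 1, 0, 0]

curve = kaplan_meier(follow_up_years, death_observed)
print(survival_at(curve, 5))
```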
Furthermore, the present invention also describes a molecular array for predicting clinical outcome of colorectal cancer in a subject. The array comprises a plurality of polynucleotides or synthetic oligonucleotides having nucleotide sequences as set forth in SEQ ID NO. 1-19 or the complementary thereof, which are capable of hybridising to nucleotide sequences of genes selected from the group consisting of BRCA1, C12orf66, CPXCR1, DUOX2, DUSP21, ELTD1, FRMD6, GFRA4, ITPRIP, LASS6, MRPL52, NOTCH2, OR10H5, OSBPL9, PDC, SLC38A9, SORCS2, SPG7, and TRIP13 under high stringency conditions. These genes are expression signatures for Dukes' B and C colorectal cancer. With the use of the molecular array disclosed herein, the expression levels of these genes or the polynucleotides corresponding to these genes are detected and converted to a risk score as described in preceding description. Upon comparison to a reference value for the genes or the polynucleotides corresponding to these genes, an increased probability of negative clinical outcome of the subject is indicated if the risk score is higher than the reference value or an increased probability of positive clinical outcome of the subject is indicated if the risk score is lower than the reference value.
Staging of Dukes' B and C colorectal cancer patients by means of pathological examination is not always accurate. Consequently, prognosis estimation and/or treatment choice for these patients can be difficult. The present invention provides a method for estimating the clinical outcome of Dukes' B and C patients based on the gene expression profile of the patients. The predicted clinical outcome or the projected five-year survival rate of a subject obtained via the method described herein may help a physician in deciding the need for adjuvant therapy or the type of adjuvant therapy for Dukes' B and Dukes' C colorectal cancer. More particularly, a subject with increased probability of positive clinical outcome may avoid adjuvant chemotherapy or may be given a low-dose chemotherapy. In contrast, a subject with increased probability of negative clinical outcome may benefit from more aggressive treatments such as high-dose chemotherapy.
Example
An example is provided below to illustrate different aspects and embodiments of the invention. The example is not intended in any way to limit the disclosed invention, which is limited only by the claims.
Example 1 Establishment of Gene Signature for Colorectal Cancer
Formalin-fixed, paraffin-embedded (FFPE) archived tissue specimens are obtained from the Pathology Department of Universiti Kebangsaan Malaysia Medical Centre (UKMMC). These samples are selected from a cohort of patients with Dukes' B and C colorectal cancer between 2002 and 2007. The patients are classified into two groups: good prognosis group (positive clinical outcome) and poor prognosis group (negative clinical outcome). Patients who survived more than 5 years are classified as having good prognosis whereas patients who survived less than 5 years are classified as having poor prognosis.
The samples are sectioned and stained using Haematoxylin and Eosin (H&E) staining to observe the morphology of the tissue. Seventy-eight samples with tissue sections containing more than 80 % cancer cells are selected for subsequent gene expression analysis, wherein half of the samples are obtained from patients in the good prognosis group while the other half of the samples are obtained from patients in the poor prognosis group. In each sample, tissues that are free of inflamed or cancerous cells are used as the control. Prior to RNA purification, the tissue sections are deparaffinized by immersing the tissue slides in xylene and then in absolute alcohol.
The total RNA from the cancerous tissues and normal tissues is extracted using High Pure RNA Paraffin Kit (Roche Applied Science, Malaysia). Nanodrop ND-1000 Spectrophotometer is used to quantify the RNA extracted, while Bioanalyzer 2100 RNA 6000 Nano Kit (Agilent Technologies, Malaysia) is used to assess the quality or integrity of the RNA extracted. The RNA extracted is then subjected to a pre-qualification analysis, which is performed using quantitative polymerase chain reaction (qPCR) by means of Corbett Rotor-Gene 6000 series detection system (QIAGEN, Malaysia). The primers used in the qPCR are designed from the transcript of ribosomal protein L13A (RPL13A). The forward primer sequence (SEQ ID NO. 20) and reverse primer sequence (SEQ ID NO. 21) are shown in Table 1. RNA with a cycle threshold (Ct) value of 29 or less is used in the gene expression analysis.
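By way of illustration only, the pre-qualification rule described above, in which RNA with a Ct value of 29 or less proceeds to gene expression analysis, can be sketched in Python. The sample identifiers and Ct values are hypothetical, as are the function and variable names.

```python
CT_CUTOFF = 29.0  # RNA with a Ct value of 29 or less is used, as stated in the description


def passes_prequalification(ct_value, cutoff=CT_CUTOFF):
    """Return True if the RPL13A qPCR Ct value meets the pre-qualification criterion."""
    return ct_value <= cutoff


# Hypothetical RPL13A Ct values per sample.
ct_values = {"S01": 24.6, "S02": 29.0, "S03": 31.2}

qualified = sorted(name for name, ct in ct_values.items() if passes_prequalification(ct))
print(qualified)
```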
Microarray analysis of the RNA is performed using Whole-Genome cDNA-mediated Annealing, Selection, Extension, and Ligation (WG-DASL) assay according to the manufacturer's protocol. The array data is analysed using GenomeStudio Software (Illumina, Malaysia) to check the data quality. The microarray data is normalised and background noise is eliminated. Principal component analysis (PCA) mapping is conducted to ensure that patients from the good prognosis group and the poor prognosis group exhibit different gene expression patterns. Genes that are not expressed in both control and experimental groups are filtered out. The outliers are replaced by the 90 % confidence interval (CI) for the robust mean to improve the normalisation of gene expression values.
The expression data is analysed using the GeneSpring GX 12.0.2 software (Agilent, Malaysia) and R-programming. The normalised gene expression values of patients from both good and poor prognosis groups are randomly divided into one hundred training and test sets. Three different statistical tests, namely Significance Analysis of Microarrays (SAM), Linear Models for Microarray Data (LIMMA), and t-test, are then employed to identify the differentially expressed genes in the training and test sets. A p-value of 0.05 is used as a cut-off for significance. Figures 1 and 2 illustrate the hierarchical clustering for a training set and a test set, respectively. Further, Figures 3 and 4 illustrate the PCA for the training set and the test set, respectively.
From the statistical analyses, a prognostic signature comprising 19 genes is obtained; these 19 genes are consistently identified as significant genes in all training and test sets in each statistical test. These genes are BRCA1, C12orf66, CPXCR1, DUOX2, DUSP21, ELTD1, FRMD6, GFRA4, ITPRIP, LASS6, MRPL52, NOTCH2, OR10H5, OSBPL9, PDC, SLC38A9, SORCS2, SPG7, and TRIP13. Table 1 shows the nucleotide sequences of the probes of these genes (SEQ ID NO. 1-19).
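By way of illustration only, the selection of genes that are consistently significant in every data split and in each statistical test can be sketched in Python as a set intersection. The sketch assumes that p-values have already been computed by each test; the p-values shown and the placeholder gene GENE_X are hypothetical, and treating "significant" as a p-value strictly below 0.05 is an assumption.

```python
def consistent_genes(p_values, alpha=0.05):
    """Return genes whose p-value is below alpha in every split of every test.

    p_values -- {test name: [ {gene: p-value} for each data split ]}
    """
    selected = None
    for per_split in p_values.values():
        for split in per_split:
            significant = {gene for gene, p in split.items() if p < alpha}
            selected = significant if selected is None else selected & significant
    return selected or set()


# Hypothetical p-values for two data splits per test.
p_values = {
    "SAM": [
        {"BRCA1": 0.01, "NOTCH2": 0.03, "GENE_X": 0.20},
        {"BRCA1": 0.02, "NOTCH2": 0.04, "GENE_X": 0.01},
    ],
    "LIMMA": [
        {"BRCA1": 0.01, "NOTCH2": 0.06, "GENE_X": 0.01},
        {"BRCA1": 0.03, "NOTCH2": 0.02, "GENE_X": 0.02},
    ],
    "t-test": [
        {"BRCA1": 0.04, "NOTCH2": 0.01, "GENE_X": 0.03},
        {"BRCA1": 0.01, "NOTCH2": 0.01, "GENE_X": 0.04},
    ],
}

signature = consistent_genes(p_values)
print(sorted(signature))
```

A gene that fails the cut-off in even one split of one test is dropped, which is the sense in which the 19 genes of the signature are described as consistently significant.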
Table 1

| Polynucleotide (SEQ ID NO.) | Name of corresponding gene | Symbol of corresponding gene | Nucleotide sequence of polynucleotide (5'→3') |
|---|---|---|---|
| 1 | Breast cancer 1, early onset | BRCA1 | TCCAGGACTGTTTATAGCTGTTGGAAGGACTAGGTCTTCCCTAGCCCCCC |
| 2 | Chromosome 12 ORF 66 | C12orf66 | CGGCAGGTGTTCATGGGAAGTAGGTGGATCAGTGAAGGCATAATGGGCTG |
| 3 | CPX chromosome region, candidate 1 | CPXCR1 | CCTTCGAGGGTGCACTACTACCGTCCCCTCACTGAGAGAATGACATCAGG |
| 4 | Dual oxidase 2 | DUOX2 | ATTTCCATCACCCCAGAAACTCCCCTTGTACCCCCTTCCACTTCGTCTCC |
| 5 | Dual specificity phosphatase 21 | DUSP21 | CAATGTAAGCCATCCCGGCCAGCCCCTGACATCTGCCATCGATCTTGCAC |
| 6 | EGF, latrophilin and seven transmembrane domain containing 1 | ELTD1 | CTTTGGCTATCTAAGCCCAGCCGTGGTAGTTGGATTTTCGGCAGCACTAG |
| 7 | FERM domain containing 6 | FRMD6 | AGATGTAGGGTACAGTGGAACATAAGCAGTGTTACCCCTGGCTGGGAGTC |
| 8 | GDNF family receptor alpha 4 | GFRA4 | GCCGTGGCGAGGGTGGGGACGGGGCCTCTCTCCGGCTCACCGCCCTCCCG |
| 9 | Inositol 1,4,5-trisphosphate receptor interacting protein | ITPRIP | GCCTCCAGAAGCCAAAACCATGCCTGGATCTCCCATAGCTTCTCCTTTGC |
| 10 | Ceramide synthase 6 | LASS6 | CACAGGTCCATGAAAGTTTGGCTTCCTGGTTTGATGTCTGTTGCGTGGCC |
| 11 | Mitochondrial ribosomal protein L52 | MRPL52 | CTCCTTTGCAGGCCATAGGACTAGCCCAACTATGAGAAATAGCTGTTCTG |
| 12 | Notch 2 | NOTCH2 | AGCTGCCCCCTGGGCTACACTGGGAAAAACTGTCAGACCCTGGTGAATCT |
| 13 | Olfactory receptor, family 10, subfamily H, member 5 | OR10H5 | GTCGCCATGAAGAAGACTTGCTTCACCAAACTCTTTCCACAGAACTGCTG |
| 14 | Oxysterol binding protein-like 9 | OSBPL9 | ATCAGTACCACACTTGCTTTTTTCCAGTCTTCTGGTATCTCTCCAGTTCT |
| 15 | Phosducin | PDC | CAGGTGCTGGGGACCGCTTTTCCTTAGATGTACTTCCTACACTGCTCATC |
| 16 | Solute carrier family 38, member 9 | SLC38A9 | AGACCATTTCTGCAGTTTGCCCAAACCTCTACTGTTTGGGACAGTAAGCC |
| 17 | Sortilin-related VPS10 domain containing receptor 2 | SORCS2 | ACATGTTGGGCATGTGGACCCAAGCACCTGGGAAGGAGGTGGCATCTGAG |
| 18 | Spastic paraplegia 7 | SPG7 | GCGAGTTTGTGGATTATCTGAAGAGCCCAGAACGCTTCCTCCAGCTTGGC |
| 19 | Thyroid hormone receptor interactor 13 | TRIP13 | CGGTATGGGCGCCCCTGCATTGCTGGGATGTTTCTGCCCACGGTTTTGTT |
| 20 | Ribosomal protein L13A | RPL13A | GTACGCTGTGAAGGCATCAA |
| 21 | Ribosomal protein L13A | RPL13A | GTTGGTGTTCATCCGCTTG |

The patients, whose samples are obtained for microarray analysis, are randomly divided into the training set and the test set with equal numbers of patients. The risk score of each patient is computed using equation 1.
$$\text{Risk score} = \sum_{i=1}^{19} \left( \mathrm{Expr}_{\mathrm{gene}\,i} \times \beta_{\mathrm{gene}\,i} \right) \tag{1}$$
wherein $\mathrm{Expr}_{\mathrm{gene}\,i}$ is the normalized expression value of the gene, $\beta_{\mathrm{gene}\,i}$ is the regression coefficient obtained from pensim R package using opt1D, and $i$ is the SEQ ID NO. of the polynucleotide corresponding to the gene.
From the training set, a median risk score is determined and is validated in the test set. The validated median risk score can be used as a cut-off for categorising Dukes' B and C colorectal cancer patients into the good and poor prognosis groups. Patients with risk score lower than the median risk score are classified as having good prognosis or low risk while patients with risk score higher than the median risk score are classified as having poor prognosis or high risk.
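By way of illustration only, the random division of patients into a training set and a test set of equal size, and the use of the training-set median risk score as a cut-off in the test set, can be sketched in Python. The patient identifiers, risk scores and random seed are hypothetical; the description does not state how a score exactly equal to the cut-off is classified, so assigning it to the good prognosis group is an assumption of this sketch.

```python
import random
from statistics import median


def split_equally(patient_ids, seed=0):
    """Randomly divide patients into a training set and a test set of equal size."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]


def classify_by_cutoff(scores, cutoff):
    """Score higher than the cut-off -> poor prognosis; otherwise good prognosis (ties: assumption)."""
    return {
        pid: ("poor prognosis" if score > cutoff else "good prognosis")
        for pid, score in scores.items()
    }


# Hypothetical risk scores for 78 patients (the number of samples in Example 1).
risk_scores = {f"P{i:02d}": float(i) for i in range(78)}

training_ids, test_ids = split_equally(risk_scores)
cutoff = median(risk_scores[pid] for pid in training_ids)  # median risk score of the training set
test_labels = classify_by_cutoff({pid: risk_scores[pid] for pid in test_ids}, cutoff)
print(cutoff, sum(label == "poor prognosis" for label in test_labels.values()))
```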
The performance of the prognostic signature is further validated using independent external validation data sets from an Australian cohort (n = 185 patients) and a United States cohort (n = 114 patients).
Further, a Kaplan-Meier plot showing the five-year survival rates of the patients categorised as having good prognosis and patients categorised as having poor prognosis is generated using Cox proportional hazards test from the expression levels of the genes. Figures 5 and 6 show that the survival curve of patients in the good prognosis group significantly differs from that of patients in the poor prognosis group. This indicates that the prognostic signature provided herein is highly reliable in predicting the clinical outcome of a subject.
Reference
Dotan, E.; Cohen, S.J. 2011. Challenges in the management of stage II colon cancer. Seminars in Oncology. 38 (4): 511-520.

Claims

1. A method for predicting clinical outcome of colorectal cancer in a subject, comprising
detecting an expression level of at least four polynucleotides selected from the group of polynucleotides having nucleotide sequences as set forth in SEQ ID NO. 1-19 or the complementary thereof in a biological sample obtained from the subject;
converting the expression levels of the polynucleotides to a risk score; and
determining the probability of a positive clinical outcome for the subject based on the risk score of the polynucleotides;
wherein upon comparison to a reference value for the polynucleotides, an increased probability of negative clinical outcome of the subject is indicated if the risk score is higher than the reference value or an increased probability of positive clinical outcome of the subject is indicated if the risk score is lower than the reference value.
2. A method according to claim 1 further comprising a step of providing a report indicating the probability of clinical outcome of the subject.
3. A method according to claim 1, wherein the colorectal cancer is Dukes' B or C colorectal cancer.
4. A method according to claim 1, wherein the polynucleotides having nucleotide sequence as set forth in SEQ ID NO. 1-19 are capable of hybridising to nucleotide sequence of genes selected from the group consisting of BRCA1,
C12orf66, CPXCR1, DUOX2, DUSP21, ELTD1, FRMD6, GFRA4, ITPRIP, LASS6, MRPL52, NOTCH2, OR10H5, OSBPL9, PDC, SLC38A9, SORCS2, SPG7, and TRIP13.
5. A method according to claim 1, wherein the expression levels of the polynucleotides are detected via a microarray.
6. A method according to claim 1, wherein the reference value is a median of the risk scores converted from the expression levels of a sample population of subjects with colorectal cancer.
7. A method according to claim 6, wherein the colorectal cancer is Dukes' B or C colorectal cancer.
8. A method according to claim 1, wherein the expression levels of the polynucleotides are normalised by a microarray data analysis program.
9. A method according to claim 8, wherein the microarray data analysis program is GenomeStudio, GeneSpring GX or R-programming.
10. A method according to claim 8, wherein the microarray data analysis program performs statistical data analysis.
11. A method according to claim 10, wherein the statistical data analysis is Significance Analysis of Microarrays, Linear Models of Microarray Data or t-test.
12. A method according to claim 1, wherein the clinical outcome is expressed in terms of five-year survival.
13. A method according to claim 1, wherein the biological sample is tissue biopsy, surgical resection specimen or biofluid.
14. A method according to claim 1, wherein the biological sample is derived from colorectal tissues.
15. A molecular array for predicting clinical outcome of colorectal cancer in a subject, which array comprises
a plurality of polynucleotides having nucleotide sequences as set forth in SEQ ID NO. 1-19 or the complementary thereof that are capable of hybridizing to nucleotide sequences of genes selected from the group consisting of BRCA1, C12orf66, CPXCR1, DUOX2, DUSP21, ELTD1, FRMD6, GFRA4, ITPRIP, LASS6, MRPL52, NOTCH2, OR10H5, OSBPL9, PDC, SLC38A9, SORCS2, SPG7, and TRIP13 under high stringency conditions,
wherein expression levels of the polynucleotides are detected and converted to a risk score, and upon comparison to a reference value for the polynucleotides, an increased probability of negative clinical outcome of the subject is indicated if the risk score is higher than the reference value or an increased probability of positive clinical outcome of the subject is indicated if the risk score is lower than the reference value.
PCT/MY2015/050086 2014-08-18 2015-08-17 Method for determining prognosis of colorectal cancer WO2016028141A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
MYPI2014702296 2014-08-18
MYPI2014702296 2014-08-18

Publications (1)

Publication Number Publication Date
WO2016028141A1 true WO2016028141A1 (en) 2016-02-25

Family

ID=55351002

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/MY2015/050086 WO2016028141A1 (en) 2014-08-18 2015-08-17 Method for determining prognosis of colorectal cancer

Country Status (1)

Country Link
WO (1) WO2016028141A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020092969A1 (en) * 2018-11-02 2020-05-07 Oklahoma Medical Research Foundation Monoclonal antibodies to eltd1 and uses thereof
CN113549696A (en) * 2021-09-23 2021-10-26 广州医科大学附属肿瘤医院 Tumor marker SORCS2 and application thereof

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010096929A1 (en) * 2009-02-25 2010-09-02 Diagnocure Inc. Method for detecting metastasis of gi cancer
WO2010127322A1 (en) * 2009-05-01 2010-11-04 Genomic Health Inc. Gene expression profile algorithm and test for likelihood of recurrence of colorectal cancer and response to chemotherapy

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
GRAY RICHARD G. ET AL.: "Validation Study of a Quantitative Multigene Reverse Transcriptase-Polymerase Chain Reaction Assay for Assessment of Recurrence Risk in Patients With Stage II Colon Cancer", JOURNAL OF CLINICAL ONCOLOGY, vol. 29, no. 35, 2011, pages 4611 - 4619 *
HAO JUN-MEI ET AL.: "A five-gene signature as a potential predictor of metastasis and survival in colorectal cancer", JOURNAL OF PATHOLOGY, vol. 220, 2010, pages 475 - 489, XP002679451, DOI: 10.1002/path.2668 *
SALAZAR RAMON ET AL.: "Gene Expression Signature to Improve Prognosis Prediction of Stage II and III Colorectal Cancer", JOURNAL OF CLINICAL ONCOLOGY, vol. 29, no. 1, 2011, pages 17 - 24, XP055032270, DOI: 10.1200/JCO.2010.30.1077 *
VARGAS TEODORO ET AL.: "Genes associated with metabolic syndrome predict disease-free survival in stage II colorectal cancer patients. A novel link between metabolic dysregulation and colorectal cancer", MOLECULAR ONCOLOGY, vol. 8, June 2014 (2014-06-01), pages 1469 - 1481 *

Similar Documents

Publication Publication Date Title
JP6246845B2 (en) Methods for quantifying prostate cancer prognosis using gene expression
JP2020141684A (en) Microrna biomarkers for gastric cancer diagnosis
JP7228896B2 (en) Methods for predicting the prognosis of breast cancer patients
CN111910002B (en) Kit or device for detecting esophagus cancer and detection method
KR101672531B1 (en) Genetic markers for prognosing or predicting early stage breast cancer and uses thereof
JP2009528825A (en) Molecular analysis to predict recurrence of Dukes B colorectal cancer
CN111961726A (en) Evaluation of cellular signaling pathway activity using linear combinations of target gene expression
EP2382331A1 (en) Cancer biomarkers
CA2825218A1 (en) Colon cancer gene expression signatures and methods of use
WO2013086352A1 (en) Prostate cancer associated circulating nucleic acid biomarkers
WO2016020551A1 (en) Thyroid cancer diagnosis by dna methylation analysis
JP6356217B2 (en) Method for producing prognostic model for gastric cancer
US20090192045A1 (en) Molecular staging of stage ii and iii colon cancer and prognosis
KR20170063977A (en) Methods for assessing risk of developing breast cancer
TW201309805A (en) Biomarkers for recurrence prediction of colorectal cancer
JP2019510473A (en) Methods for assessing the risk of developing colorectal cancer
JP2014511677A (en) Distinguishing benign and malignant thyroid lesions that are difficult to distinguish
CN110621788A (en) DNA methylation and mutation analysis methods for bladder cancer monitoring
CN112020566B (en) Kit, device and method for detection of bladder cancer
US10161004B2 (en) Diagnostic miRNA profiles in multiple sclerosis
WO2016028141A1 (en) Method for determining prognosis of colorectal cancer
CA3133294A1 (en) Methods for predicting prostate cancer and uses thereof
KR20210048794A (en) Composition for diagnosing nontuberculous mycobacterial infection or infection disease
JP7024957B2 (en) Methods for predicting the presence or absence of metastatic metastasis of colorectal cancer and kits used for it
CN117012376A (en) Construction method and risk prediction method of breast cancer local recurrence model

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 15834089

Country of ref document: EP

Kind code of ref document: A1

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (PCT application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 15834089

Country of ref document: EP

Kind code of ref document: A1