WO2019237641A1 - Biomarker and detection method for detecting the risk of cancer recurrence - Google Patents

Biomarker and detection method for detecting the risk of cancer recurrence

Info

Publication number
WO2019237641A1
Authority
WO
WIPO (PCT)
Prior art keywords
risk
cancer
prostate cancer
genes
biomarker
Prior art date
Application number
PCT/CN2018/113414
Other languages
English (en)
French (fr)
Inventor
唐大木
何立智
陈争
陈婧
赵坤成
赵凤娟
曾永柯
马靖翔
Original Assignee
深圳市颐康生物科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市颐康生物科技有限公司
Publication of WO2019237641A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/118: Prognosis of disease development
    • C12Q2600/158: Expression markers

Definitions

  • The invention relates to prostate cancer detection technology, in particular to a biomarker for detecting the risk of cancer recurrence and a method for detecting that risk.
  • Prostate cancer is the most common malignant tumor in men in developed countries, and its incidence is also rising rapidly in China. Progression varies widely between prostate cancers. Although most low-score tumors (Gleason score less than 6 / WHO grade (group) I) have a good long-term prognosis, about 30% of cancers recur after radical prostatectomy. The main indicator of this type of recurrence is elevated serum prostate-specific antigen (PSA), also known as biochemical recurrence. Elevated serum PSA indicates a high risk of cancer metastasis.
  • The standard treatment for metastatic prostate cancer is androgen deprivation therapy (ADT), a palliative treatment.
  • Prostate cancer that recurs after castration therapy has failed is called castration-resistant prostate cancer (CRPC).
  • The period of biochemical recurrence, marked by high PSA, is a key window for early targeted treatment. It is therefore necessary to follow up and assess the risk of biochemical recurrence of prostate cancer.
  • Three mRNA-based multigene test kits for diagnosing prostate cancer recurrence are currently on the market: Oncotype DX (Genomic Prostate Score / GPS), Prolaris (cell cycle progression / CCP), and Decipher (Genomic Classifier / GC).
  • After prostate cancer diagnosis and radical surgery, the 17-gene Oncotype DX and 31-gene Prolaris kits help to stratify patients at high risk of prostate cancer recurrence.
  • The 22-gene Decipher can predict the risk of cancer metastasis after radical surgery.
  • The Mucin 1 (MUC1) pathway plays an important role in biochemical recurrence after radical prostatectomy.
  • MUC1 is a well-studied tumor-associated antigen, in part because this cell membrane glycoprotein is expressed on the apical surface of most epithelial tissues. In 70% of cancers, the glycosylation of MUC1 is altered. MUC1 promotes tumor progression in many tumors by activating important oncogenic proteins of multiple pathways, including EGFR, β-Catenin, NF-κB, and PKM2. In prostate cancer, MUC1 expression is up-regulated and abnormal glycosylation occurs. These abnormalities are associated with angiogenesis and adverse clinical symptoms.
  • Up-regulation of MUC1 is weakly associated with shortened disease-free survival (DFS) and overall survival (OS), and is associated with malignant histopathology after radical prostatectomy.
  • A three-gene panel (AZGP1, MUC1, and p53) is associated with poor prognosis in patients with primary prostate cancer.
  • Increased MUC1 mRNA expression can be detected in metastatic prostate cancer.
  • Genomic alterations in the 25-gene MUC1 gene network are weakly associated with prostate cancer recurrence.
  • The technical problem to be solved by the present invention is, based on the 25-gene MUC1 gene network, to provide a biomarker for detecting the risk of cancer recurrence that can effectively predict the recurrence risk of cancers such as prostate cancer, and, based on that biomarker, to provide a method for detecting the risk of cancer recurrence.
  • The technical solution adopted by the present invention is:
  • A biomarker for detecting the risk of cancer recurrence, including at least one of the 696 differentially expressed genes in Table 1.
  • Table 1 here refers to Table 1 in the specification.
  • A gene combination for detecting the risk of cancer recurrence, including at least one of the following genes: SLCO2A1, CGNL1, SUPV3L1, TATDN2, MGAT4B, VAV2, SLC25A33, MCCC1, ASNS, CASKIN1, DNMT3B, AURKA, OIP5, CTHRC1, and GOLGA7B.
  • A gene combination for detecting the risk of cancer recurrence, characterized in that it includes at least one of the following characteristic genomes: SigCut1, SigCut2, SigCut3, and SigMuc1NW1;
  • SigCut1 includes the following genes: MGAT4B, AURKA, and OIP5;
  • SigCut2 includes the following genes: TATDN2, MGAT4B, VAV2, AURKA, and OIP5;
  • SigCut3 includes the following genes: SLCO2A1, CGNL1, SUPV3L1, TATDN2, MGAT4B, VAV2, SLC25A33, MCCC1, ASNS, CASKIN1, DNMT3B, AURKA, OIP5, CTHRC1, and GOLGA7B;
  • SigMuc1NW1 includes the following genes: CGNL1, MGAT4B, VAV2, ASNS, CASKIN1, DNMT3B, AURKA, OIP5, CTHRC1, and GOLGA7B.
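  • The four gene combinations listed above can be written down as plain sets, which makes their nesting easy to check. The sketch below is illustrative only: the gene lists come from the text, while the names `SIGNATURES` and `missing_genes` are hypothetical helpers introduced here, not part of the patent.

```python
# Gene lists are taken from the claims above; helper names are hypothetical.
SIG_CUT1 = frozenset({"MGAT4B", "AURKA", "OIP5"})
SIG_CUT2 = frozenset({"TATDN2", "MGAT4B", "VAV2", "AURKA", "OIP5"})
SIG_CUT3 = frozenset({
    "SLCO2A1", "CGNL1", "SUPV3L1", "TATDN2", "MGAT4B", "VAV2", "SLC25A33",
    "MCCC1", "ASNS", "CASKIN1", "DNMT3B", "AURKA", "OIP5", "CTHRC1", "GOLGA7B",
})
SIG_MUC1NW1 = frozenset({
    "CGNL1", "MGAT4B", "VAV2", "ASNS", "CASKIN1",
    "DNMT3B", "AURKA", "OIP5", "CTHRC1", "GOLGA7B",
})

SIGNATURES = {
    "SigCut1": SIG_CUT1,
    "SigCut2": SIG_CUT2,
    "SigCut3": SIG_CUT3,
    "SigMuc1NW1": SIG_MUC1NW1,
}


def missing_genes(signature_name, measured_genes):
    """Return the genes of a signature that are absent from an assay panel."""
    return sorted(SIGNATURES[signature_name] - set(measured_genes))
```

  • For example, `missing_genes("SigCut2", ["MGAT4B", "AURKA"])` lists the SigCut2 genes an assay would still need to measure.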
  • A method for detecting the risk of cancer recurrence, which diagnoses or estimates a patient's risk of death by examining changes in the expression of the genes in the aforementioned gene combination.
  • PCR, DNA chip, NanoString, or RNA sequencing can be used to detect low and high mRNA expression of the genes in the biomarker.
  • The test subject of this method is a human or other mammal.
  • The biomarker of the present invention makes full use of the potential value of MUC1 as a new type of biomarker to develop an effective characteristic gene combination for predicting the recurrence of cancers such as prostate cancer.
  • Figure 1 shows the strategy used to generate the characteristic genome of the present patent;
  • Figures 2A-B show the covariate selection analysis of the 696 genes using the Elastic-net method;
  • Figure 3 shows the gene expression of the selected 15-gene characteristic genome (SigMuc1NW);
  • Figures 4A-B show that SigMuc1NW is associated with decreased disease-free survival (DFS) and overall survival (OS) in patients with prostate cancer;
  • Figure 5 shows the overlap between our previously reported 9-gene characteristic genome [21] and SigMuc1NW;
  • Figures 6A-C show that the two characteristic genomes of Figure 5 are significantly associated with a reduction in disease-free survival (DFS) and overall survival (OS) in patients with prostate cancer;
  • Figures 7A-D show that the SigMuc1NW score can effectively stratify prostate cancers with a high risk of recurrence;
  • Figure 8 shows the estimated cut-off point for the SigMuc1NW score;
  • Figure 9 shows that all 15 genes of SigMuc1NW are significantly associated with prostate cancer recurrence, together with the three sub-characteristic genomes derived from them;
  • Figures 10A-E show that SigCut1, SigCut2, and SigCut3 are significantly associated with a reduction in disease-free survival (DFS);
  • Figures 11A-C show that the SigMuc1NW score is effective for stratified grouping of prostate cancer with high recurrence risk.
  • The present invention aims to make full use of the potential value of MUC1 as a novel biomarker to develop an effective characteristic gene combination for predicting the recurrence of prostate cancer.
  • The 15-gene characteristic genome (SigMuc1NW), selected from the differentially expressed genes (DEGs), consists of SLCO2A1, CGNL1, SUPV3L1, TATDN2, MGAT4B, VAV2, SLC25A33, MCCC1, ASNS, CASKIN1, DNMT3B, AURKA, OIP5, CTHRC1, and GOLGA7B.
  • From the SigMuc1NW characteristic genome, the inventors further derived four sub-characteristic genomes: SigCut1, SigCut2, SigCut3, and SigMuc1NW1.
  • SigMuc1NW can strongly predict the biochemical recurrence after radical operation, with a sensitivity of 56.4% and a specificity of 72.6%.
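  • As a reminder of how sensitivity and specificity figures like these are defined, here is a minimal sketch. The function name and the counts in the usage note are hypothetical; the patent text does not give the underlying confusion-matrix counts.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Compute (sensitivity, specificity) from confusion-matrix counts.

    tp: recurrent tumors called signature-positive
    fn: recurrent tumors called signature-negative
    tn: non-recurrent tumors called signature-negative
    fp: non-recurrent tumors called signature-positive
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one recurrent and one non-recurrent case")
    sensitivity = tp / (tp + fn)  # fraction of recurrences that were flagged
    specificity = tn / (tn + fp)  # fraction of non-recurrences correctly cleared
    return sensitivity, specificity
```

  • With made-up counts, `sensitivity_specificity(tp=50, fn=50, tn=300, fp=100)` gives a sensitivity of 0.5 and a specificity of 0.75.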
  • The median disease-free period (MMDF) of patients with SigMuc1NW-positive prostate cancer was 63.24 months. That of patients with SigMuc1NW-negative prostate cancer was significantly longer: the median had still not been reached at the end of the 160-month follow-up period (p = 1.12e-12).
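  • A median disease-free period that is "not reached" is what a Kaplan-Meier curve yields when the estimated disease-free fraction never falls to 50% during follow-up. The following is a minimal pure-Python sketch of that idea, not the R survival code used in the source; the function names and the toy data in the usage note are assumptions.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, disease_free_fraction).

    times:  follow-up time of each patient
    events: 1 if recurrence was observed at that time, 0 if censored
    """
    ordered = sorted(zip(times, events))
    at_risk = len(ordered)
    surviving = 1.0
    curve = []
    i = 0
    while i < len(ordered):
        t = ordered[i][0]
        recurrences = 0
        leaving = 0
        # Group all patients that share the same follow-up time.
        while i < len(ordered) and ordered[i][0] == t:
            recurrences += ordered[i][1]
            leaving += 1
            i += 1
        if recurrences:
            surviving *= 1 - recurrences / at_risk
            curve.append((t, surviving))
        at_risk -= leaving
    return curve


def median_disease_free_time(curve):
    """First time the curve drops to 50% or below; None if never reached."""
    for t, fraction in curve:
        if fraction <= 0.5:
            return t
    return None
```

  • With toy data, three patients who all relapse at months 1, 2, and 3 give a median of 2, while a group in which only one of four patients relapses returns `None`, i.e. the median is not reached.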
  • the time-dependent AUC (area under the curve, tAUC) value of SigMuc1NW at 11.5 months was 76.6%, 73.8% at 22.3 months, 78.5% at 32.1 months, and 76.4% at 48.4 months.
  • The tAUC values of SigCut1 (MGAT4B, AURKA, and OIP5) for distinguishing recurrent from non-recurrent prostate cancer were 74.3% at 11.5 months, 73.8% at 22.3 months, 78.5% at 32.1 months, and 76.4% at 48.4 months.
  • The tAUC values of SigCut2 (TATDN2, MGAT4B, VAV2, AURKA, and OIP5) for distinguishing recurrent from non-recurrent prostate cancer were 75.9% at 11.5 months, 73.4% at 22.3 months, 76.5% at 32.1 months, and 75.3% at 48.4 months.
  • SigMuc1NW1 consists of 10 genes CGNL1, MGAT4B, VAV2, ASNS, CASKIN1, DNMT3B, AURKA, OIP5, CTHRC1 and GOLGA7B.
  • The tAUC values of SigMuc1NW1 for distinguishing recurrent from non-recurrent prostate cancer were 82.5% at 18.4 months, 78.5% at 38 months, 76.6% at 51.4 months, and 78.2% at 65 months.
  • After adjustment, SigMuc1NW1 and SigCut3 are independent risk factors for predicting prostate cancer recurrence after surgery.
  • SigMuc1NW and SigMuc1NW1 are associated with shortened overall survival (OS) of multiple cancer types. See details below:
  • SigMuc1NW and SigMuc1NW1 are associated with shortened disease-free survival (DFS) for multiple cancer types. See details below:
  • The present invention diagnoses and assesses the risk of recurrence in patients with prostate cancer after radical prostatectomy by examining changes in the 15 genes of the SigMuc1NW characteristic genome and of its sub-characteristic genomes (SigMuc1NW1, SigCut1, SigCut2, and SigCut3). It can be used to diagnose and assess the risk of death in patients with prostate cancer. It can also be used to diagnose and evaluate the risk of recurrence at the first diagnosis of prostate cancer, and to diagnose and assess the risk of metastasis and of progression to castration-resistant prostate cancer (CRPC) after radical surgery.
  • the 15 genes in the SigMuc1NW characteristic genome can be used in different combinations, that is, combinations other than the aforementioned SigMuc1NW1, SigCut1, SigCut2, and SigCut3 can also be used. This is because all 15 genes can individually predict the biological recurrence of cancer.
  • The TCGA sub-database in cBioPortal contains gene expression data from 492 prostate cancer patients, obtained by RNA sequencing.
  • Cross-validation (CV) curves with the mixing parameter α set to 0.2 (A) and 0.8 (B).
  • The number of non-zero coefficients (covariates) at the current λ value (the tuning parameter that sets the penalty level) is displayed at the top of the graph.
  • The right-most vertical line indicates the minimum of the CV curve, and the vertical line on the left indicates the λ at which the CV error is within one standard deviation of the minimum.
  • The model is built on the value of λ shown by the vertical line on the left.
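  • The two ideas in this caption, the Elastic-net penalty controlled by α and λ and the choice of a λ whose CV error lies within one standard error of the minimum, can be sketched as follows. This is an illustrative pure-Python sketch, not the R glmnet code used in the source. The penalty follows the form documented for glmnet, the rule shown is the common "one standard error" convention (the largest such λ), and all function names and numbers are assumptions.

```python
def elastic_net_penalty(beta, lam, alpha):
    """Elastic-net penalty: lam * (alpha * |beta|_1 + (1 - alpha) / 2 * |beta|_2^2)."""
    l1 = sum(abs(b) for b in beta)
    l2_squared = sum(b * b for b in beta)
    return lam * (alpha * l1 + (1 - alpha) / 2 * l2_squared)


def select_lambdas(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from a cross-validation curve.

    lambda_min minimizes the mean CV error; lambda_1se is the largest
    lambda whose mean CV error is within one standard error of that minimum.
    """
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[i_min] + cv_se[i_min]
    within = [lam for lam, err in zip(lambdas, cv_mean) if err <= threshold]
    return lambdas[i_min], max(within)
```

  • Choosing the more heavily penalized λ trades a small increase in CV error for fewer non-zero coefficients, which is one way a short gene list can be obtained from several hundred candidate genes.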
  • Figure 7: (A) All tumors in the TCGA sub-database were scored with SigMuc1NW. Scores were analyzed using time-dependent ROC (tROC) to identify tumors with a high risk of recurrence. The figure shows the time-dependent AUC (tAUC) and the disease recurrence status over the specified period. DF: disease-free. (B) The cut-off point of the SigMuc1NW score effectively separates prostate cancers with a low risk of recurrence from those with a high risk (see Figure 8 for details). A binary code is then assigned to each tumor based on this cut-off point.
  • The mRNA expression data of the 15 genes were obtained from the TCGA sub-database (cBioPortal), a cut-off point was derived for each gene, and a binary code was assigned to every tumor.
  • A univariate Cox proportional hazards (PH) model was used to determine the hazard ratio (HR) of prostate cancer recurrence for each gene. The Cox proportional hazards assumption was also evaluated and confirmed. These analyses were performed using the R survival package.
  • the graph includes hazard ratios, 95% CI and p-values. Based on the p-value, we also obtained genes contained in the characteristic genomes of SigCut1, SigCut2, and SigCut3.
  • the TCGA subdatabase is used here.
  • (A) All tumors were scored using the Cox coefficients of SigCut1, SigCut2, and SigCut3. This figure shows the time-dependent AUC and corresponding recurrence status of the three characteristic genomes during the follow-up period.
  • (B-D) Association of SigCut1, SigCut2, and SigCut3 with biochemical recurrence.
  • (E) Analysis of the Q1, median, cut-off point, and Q3 scores of SigCut3 for stratified grouping of prostate cancers with a high risk of recurrence. The number of individuals at risk during the specified follow-up period is included. Kaplan-Meier analysis and the log-rank test were performed using the R survival package.
  • the SigMuc1NW scores based on Q1, median, and Q3 values were used to perform a stratified analysis of prostate cancer with high biochemical recurrence risk in the TCGA subdatabase.
  • Gene expression data for all 15 component genes were obtained from the MSKCC sub-database in cBioPortal. Gene expression in this cohort was measured with DNA microarrays. mRNA levels are compared between normal and prostate cancer tissue (A), primary and metastatic prostate cancer (B), and non-relapsed and relapsed prostate cancer (C). The graph also shows the number of cancers in each group. Statistical analysis was performed using Student's t-test (two-tailed). * p < 0.05, ** p < 0.01, and *** p < 0.001.
  • SigMuc1NW1 contains 10 genes. This figure shows the time-dependent AUC obtained (A). The SigMuc1NW1 cut-off point (B), Q1 (C), median (D), and Q3 (E) were used to stratify and group prostate cancers at high risk of recurrence. The number of prostate cancers at risk during follow-up is also shown in the figure.
  • SigMuc1NW1 is significantly associated with a reduction in DFS and OS in prostate cancer patients in the TCGA clinical cohort; changes in SigMuc1NW1 gene expression were defined by SD levels. Kaplan-Meier analysis and log-rank tests were performed using tools provided by cBioPortal.
  • Biochemical recurrence occurs in 30-40% of patients after radical prostatectomy; approximately 40% of these patients will develop metastatic cancer. The assessment of the risk of biochemical relapse will help to develop a personalized treatment plan.
  • We recently constructed a 9-gene characteristic genome derived from the molecular biology network of the MUC1 gene. In the TCGA sub-database, this characteristic genome effectively predicts biochemical recurrence: sensitivity 34.8%, specificity 83.6%, median disease-free period (MMDF) 73.36 months (p = 5.57e-5).
  • Biochemical recurrence is the result of alterations in multiple genes and multiple pathways.
  • the inventors obtained a more efficient characteristic genome by analyzing changes in the transcriptome associated with the characteristic genome of the 9-gene.
  • the inventors used the strategy in Figure 1 to analyze the TCGA subdatabase in the cBioPortal database.
  • the inventors analyzed gene transcription closely related to the characteristic genome of the 9-gene.
  • Of the 492 prostate cancers, 100 were positive for the characteristic genome (Figure 1). Comparing the average gene expression of these 100 positive prostate cancers with that of the other 392 negative cancers, we obtained a total of 696 differentially expressed genes (DEGs) (q < 0.001) (Table 1, which lists the differentially expressed genes associated with the 9-gene characteristic genome in the TCGA sub-database).
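  • A q-value threshold such as q < 0.001 is commonly obtained by adjusting per-gene p-values for multiple testing. The sketch below shows the Benjamini-Hochberg adjustment as one widely used way to do this. The source does not state which statistical test or adjustment produced its q-values, so this is an assumption, and the function names and gene labels in the usage note are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


def select_genes(p_by_gene, q_threshold=0.001):
    """Keep the genes whose adjusted q-value falls below the threshold."""
    genes = list(p_by_gene)
    q_values = benjamini_hochberg([p_by_gene[g] for g in genes])
    return [g for g, q in zip(genes, q_values) if q < q_threshold]
```

  • For instance, `select_genes({"GENE_A": 1e-6, "GENE_B": 0.2})` keeps only the hypothetical `GENE_A`.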
  • The differentially expressed genes comprise 416 down-regulated genes and 280 up-regulated genes (Figure 1; Table 1). Enrichment analysis of these differentially expressed genes using the KEGG (kegg, kegg.sets.hs) data set in the R gage package revealed that the up-regulated genes are mainly involved in regulating the cell cycle, oocyte meiosis, and progesterone-mediated oocyte maturation, while genes involved in other functions are down-regulated. Similarly, analysis with the Gene Ontology (go, go.sets.hs) data set showed that the up-regulated genes are involved in regulating cell cycle progression, DNA metabolism, and other processes related to cell proliferation. The down-regulated genes are involved in mediating cell junctions, extracellular processes, and other cellular processes.
  • Pathway enrichment analysis of the 696 differentially expressed genes using the Reactome package in R revealed that these genes regulate the G1, M, DNA replication, and chromatid pathways of the cell cycle.
  • Together, the above analyses show that the 696 differentially expressed genes are associated with the progression of prostate cancer.
  • The reference population is either all tumors in the dataset or the tumors that are diploid for the gene (http://www.cbioportal.org/faq.jsp). Elastic-net logistic regression in the R glmnet package (Figure 1) was then used to perform regularized covariate selection.
  • This reorganized database contains the down-regulated genes, up-regulated genes, follow-up period, and relapse status of each patient.
  • The mixing parameter α in the Elastic-net analysis was set to 0.2 or 0.8.
  • NW: network
  • a: genes down-regulated at -1.5 SD
  • b: genes up-regulated at 2 SD
  • NA: not available.
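  • The -1.5 SD and 2 SD thresholds in these footnotes can be read as z-score cut-offs relative to a reference population. The sketch below illustrates that reading; the function name, the use of the sample standard deviation, and the toy reference values are assumptions made for the example.

```python
from statistics import mean, stdev


def classify_expression(value, reference, down_sd=-1.5, up_sd=2.0):
    """Label one tumor's expression of a gene against a reference population.

    Returns "down" if the z-score is at or below down_sd, "up" if it is
    at or above up_sd, and "unchanged" otherwise.
    """
    mu = mean(reference)
    sd = stdev(reference)  # sample standard deviation (an assumption)
    z = (value - mu) / sd
    if z <= down_sd:
        return "down"
    if z >= up_sd:
        return "up"
    return "unchanged"
```

  • With the toy reference `[8, 9, 10, 11, 12]`, a value of 14 is labeled "up", 7 is labeled "down", and 10.5 is "unchanged".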
  • VAV2: Vav guanine nucleotide exchange factor 2
  • ASNS: asparagine synthetase
  • DNMT3B: DNA methyltransferase 3 beta
  • AURKA: Aurora kinase A
  • VAV2 is a co-activator of androgen receptor (AR), and maintains androgen receptor signaling after androgen deprivation therapy (ADT). It can also promote angiogenesis and metastasis.
  • AURKA plays an important role in mitosis and promotes the development of neuroendocrine prostate cancer after castration therapy.
  • DNMT3B may regulate epigenetic events to promote the progression of castration-resistant prostate cancer (CRPC).
  • SigMuc1NW SigMuc1NW
  • 2 SigMuc1NW-derived cut-off point
  • 3 diagnosis age
  • 4 radical prostatectomy Gleason score
  • 5 seminal vesicle invasion
  • 6 surgical margin
  • 7 tumor stage (0 for ≤T2; 1 for T3 and T4);
  • HR: hazard ratio;
  • CI: confidence interval; NA: not available.
  • RNA sequencing data for all 15 SigMuc1NW genes were retrieved from the TCGA sub-database, and for each gene a cut-off point in expression that distinguishes recurrent prostate cancer was estimated (Table 5).
  • Every tumor is then given a binary code for each gene based on these cut-off points.
  • Tumors whose expression is below the cut-off point are designated as "1".
  • 1 RNA sequencing data of the SigMuc1NW component genes from the TCGA sub-database (cBioPortal).
  • 2 Cut-off points were estimated using maximally selected rank statistics in R.
  • 3 A univariate Cox proportional hazards (PH) analysis was used to determine the coefficient for biochemical recurrence. #: PH is assumed to be at p < 0.05.
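  • The binary coding and coefficient-based scoring described here can be sketched as below. Everything numeric in the usage note is invented for illustration; the actual cut-off points and Cox coefficients are in Table 5 of the specification and are not reproduced here. The text states that expression below the cut-off is coded "1"; treating up-regulated genes the opposite way is an assumption of this sketch, as are the function names.

```python
def binary_code(expression, cutoff, direction="down"):
    """Code a gene as 1 if its expression is on the high-risk side of the cut-off.

    direction="down": expression below the cut-off is coded 1 (as in the text).
    direction="up":   expression above the cut-off is coded 1 (assumption).
    """
    if direction == "down":
        return 1 if expression < cutoff else 0
    if direction == "up":
        return 1 if expression > cutoff else 0
    raise ValueError("direction must be 'down' or 'up'")


def signature_score(codes, coefficients):
    """Sum of Cox coefficients over the genes whose binary code is 1."""
    return sum(coefficients[gene] * codes[gene] for gene in coefficients)
```

  • For example, with hypothetical genes `GENE_DOWN` (coefficient 0.5, coded 1) and `GENE_UP` (coefficient 1.25, coded 0), the score is 0.5. Tumors can then be grouped by comparing this score with thresholds such as Q1, the median, or Q3.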
  • The Q1 (1.647), median (3.589), and Q3 (6.386) scores are effective in stratifying and grouping the risk of biochemical recurrence of prostate cancer. Their sensitivity / specificity / median disease-free months (MMDF) / p values were: Q1, 93.4% / 31.8% / 81.2 / 6.76e-6; median, 80.2% / 56.9% / 66.9 / 6.73e-11; Q3, 56% / 82% / 40 / 0 (Figure 11).
  • SigCut3 (Figure 10D) is significantly more effective than SigMuc1NW constructed using standard deviation (SD) (Figure 4A).
  • SigMuc1NW and SigMuc1NW1 are associated with reduced disease-free survival (DFS) and overall survival (OS) for multiple cancer types
  • The value of SigMuc1NW and SigMuc1NW1 in predicting disease-free survival (DFS) and overall survival (OS) in other cancer types was analyzed. These two markers are associated with two major breast cancer cohorts, low-grade glioma, squamous cell carcinoma of the head and neck (SigMuc1NW1 only), clear cell renal cell carcinoma (ccRCC), papillary renal cell carcinoma (pRCC), and hepatocellular carcinoma.
  • SigMuc1NW1 is associated with reduced DFS in sarcomas and, compared with SigMuc1NW, is associated with disease recurrence in more cancer types (Table 9, which lists associations with reduced disease-free survival in multiple cancers). Collectively, these data confirm the clinical significance of SigMuc1NW and SigMuc1NW1.
  • a all cancer datasets are from the cBioPortal database.
  • b The number of positive (+) and negative (-) characteristics of the specified characteristic genome. Total / relapses included; MMDF.
  • The present invention develops a new method of analyzing multi-gene-related transcriptomes to obtain characteristic genomes that can be used in the diagnosis of tumor recurrence. This is the first time that such a transcriptome analysis has been based entirely on multiple genes (696 genes) rather than on a single gene. Owing to this novel perspective and a new comprehensive analysis method, we obtained the 15-gene characteristic genome. In this characteristic genome, 73.3% (11/15) of the genes have not previously been reported to be associated with prostate cancer.
  • The 11 new prostate cancer genes include MGAT4B and OIP5. The former may play a role in changing the glycosylation of tumor proteins, and altered glycosylation is a very important change in tumorigenesis. Abnormal glycosylation of MUC1 in tumorigenesis has been fully confirmed.
  • The presence of MGAT4B in the 15-gene characteristic genome is consistent with its derivation from the 9-gene MUC1 characteristic genome.
  • The presence of OIP5 in SigMuc1NW indicates that this protein is a tumor-associated antigen (TAA) in prostate cancer. Tumor-associated antigens have been extensively studied in the diagnosis and treatment of cancer. Therefore, OIP5 has potential clinical applications in the diagnosis and treatment of prostate cancer.
  • Because cancer progression is complex, we chose not to focus on specific aspects of tumorigenesis. Instead, we applied a recent machine learning approach to the ability of the 696 genes to predict the biochemical recurrence of prostate cancer, and thereby constructed a characteristic genome containing 15 genes. Although SigMuc1NW was not constructed to target specific pathways, it may encompass multiple pathways. In addition to the potential effects of MGAT4B on protein glycosylation, it also contains proteins with RNA helicase activity (SUPV3L1, Table 2) and DNA methyltransferase activity (DNMT3B, Table 2). These cellular processes are very important in gene expression and epigenetic changes, and their malignant alteration is an important manifestation of cancer progression. SigMuc1NW also contains genes that regulate cell proliferation.
  • AURKA is increasingly recognized as an important regulator of mitosis and a key player in tumorigenesis.
  • AURKA is considered a very important potential target gene.
  • Four genes in SigMuc1NW have been reported to play a role in prostate cancer, and all four can promote the progression of castration-resistant prostate cancer (CRPC). Because gene expression and epigenetic changes are markedly abnormal in castration-resistant prostate cancer, the 15-gene characteristic genome may also predict progression to castration-resistant prostate cancer.
  • Figures 13B-E: sensitivity, specificity, and positive predictive value (PPV).
  • the method of the present invention includes:
  • The cBioPortal (http://www.cbioportal.org/index.do) database contains highly comprehensive genetic data on various cancer types.
  • The TCGA sub-database covers genetic abnormalities, transcriptional expression as determined by cDNA microarray or RNA sequencing, and detailed clinical characteristics including disease outcomes (relapse and death).
  • The TCGA clinical prostate cancer database contains 492 patients with localized prostate cancer.
  • Maximally selected rank statistics (maxstat package) in R were used to obtain the cut-off points. Each cut-off point is used to distinguish between recurrent and non-recurrent prostate cancer.
  • Cut-off points were derived from RNA expression determined by RNA sequencing in the TCGA sub-database; we also evaluated how effectively the cut-off points distinguish between recurrent and non-recurrent prostate cancer.
  • The gage and Reactome packages in R were used for KEGG (Kyoto Encyclopedia of Genes and Genomes) and GO (Gene Ontology) pathway analysis of the differentially expressed genes.
  • GraphPad Prism 5 software was used for Fisher's exact test. Kaplan-Meier survival analysis and the log-rank test were performed using the R survival package and tools provided by cBioPortal. Univariate and multivariate Cox regression analyses were performed using the R survival package. Time-dependent ROC (time-dependent receiver operating characteristic, tROC) analysis was performed using the R timeROC package. A value of p < 0.05 was considered statistically significant.
  • the gene combination for detecting the risk of cancer recurrence provided by the present invention has the advantage that it can effectively predict the risk of recurrence of cancers such as prostate cancer.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Wood Science & Technology (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Hospice & Palliative Care (AREA)
  • Biophysics (AREA)
  • Oncology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

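The invention discloses a biomarker and a detection method for detecting the risk of cancer recurrence. The biomarker includes at least one of the following genes: SLCO2A1, CGNL1, SUPV3L1, TATDN2, MGAT4B, VAV2, SLC25A33, MCCC1, ASNS, CASKIN1, DNMT3B, AURKA, OIP5, CTHRC1, and GOLGA7B.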

Description

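Biomarker and detection method for detecting the risk of cancer recurrence
Technical Field
The invention relates to prostate cancer detection technology, in particular to a biomarker for detecting the risk of cancer recurrence and a method for detecting that risk.
Background Art
Prostate cancer is the most common malignant tumor in men in developed countries, and its incidence is also rising rapidly in China. Progression varies widely between prostate cancers. Although most low-score tumors (Gleason score less than 6 / WHO grade (group) I) have a good long-term prognosis, about 30% of cancers recur after radical prostatectomy. The main indicator of this type of recurrence is elevated serum prostate-specific antigen (PSA), also known as biochemical recurrence. Elevated serum PSA indicates a high risk of cancer metastasis. The standard treatment for metastatic prostate cancer is androgen deprivation therapy (ADT), a palliative treatment. Prostate cancer that recurs after castration therapy has failed is called castration-resistant prostate cancer (CRPC). The period of biochemical recurrence, marked by high PSA, is a key window for early targeted treatment. It is therefore essential to follow up and assess the risk of biochemical recurrence of prostate cancer.
At present, three mRNA expression-based multigene test kits for diagnosing prostate cancer recurrence are on the market: Oncotype DX (Genomic Prostate Score/GPS), Prolaris (cell cycle progression/CCP), and Decipher (Genomic Classifier/GC). After prostate cancer diagnosis and radical surgery, the 17-gene Oncotype DX and 31-gene Prolaris kits help to stratify patients at high risk of prostate cancer recurrence. The 22-gene Decipher can predict the risk of cancer metastasis after radical surgery. Although these biomarkers help in designing personalized treatment plans for prostate cancer, their clinical value needs further validation. Despite many important advances in prostate cancer biomarker research, tools for assessing recurrence risk after radical prostatectomy, and for stratifying patients accordingly, are still seriously lacking. Part of the reason may be that the molecular network mechanisms driving prostate cancer are highly complex.
The Mucin 1 (MUC1) pathway plays an important role in biochemical recurrence after radical prostatectomy. MUC1 is a well-studied tumor-associated antigen, in part because this cell membrane glycoprotein is expressed on the apical surface of most epithelial tissues. In 70% of cancers, the glycosylation of MUC1 is altered. MUC1 promotes tumor progression in many tumors by activating important oncogenic proteins of multiple pathways, including EGFR, β-Catenin, NF-κB, and PKM2. In prostate cancer, MUC1 expression is up-regulated and abnormal glycosylation occurs. These abnormalities are associated with angiogenesis and adverse clinical symptoms. Up-regulation of MUC1 is weakly associated with shortened disease-free survival (DFS) and overall survival (OS), and is associated with malignant histopathology after radical prostatectomy. A three-gene panel (AZGP1, MUC1, and p53) is associated with poor prognosis in patients with primary prostate cancer. Increased MUC1 mRNA expression can be detected in metastatic prostate cancer. Genomic alterations in the 25-gene MUC1 gene network are weakly associated with prostate cancer recurrence.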
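Summary of the Invention
The technical problem to be solved by the present invention is, based on the 25-gene MUC1 gene network, to provide a biomarker for detecting the risk of cancer recurrence that can effectively predict the recurrence risk of cancers such as prostate cancer, and, based on that biomarker, to provide a method for detecting the risk of cancer recurrence.
To solve the above technical problem, the technical solution adopted by the present invention is:
A biomarker for detecting the risk of cancer recurrence, including at least one of the 696 differentially expressed genes in Table 1. Table 1 here refers to Table 1 in the specification.
A gene combination for detecting the risk of cancer recurrence, including at least one of the following genes: SLCO2A1, CGNL1, SUPV3L1, TATDN2, MGAT4B, VAV2, SLC25A33, MCCC1, ASNS, CASKIN1, DNMT3B, AURKA, OIP5, CTHRC1, and GOLGA7B.
A gene combination for detecting the risk of cancer recurrence, characterized in that it includes at least one of the following characteristic genomes: SigCut1, SigCut2, SigCut3, and SigMuc1NW1;
SigCut1 includes the following genes: MGAT4B, AURKA, and OIP5;
SigCut2 includes the following genes: TATDN2, MGAT4B, VAV2, AURKA, and OIP5;
SigCut3 includes the following genes: SLCO2A1, CGNL1, SUPV3L1, TATDN2, MGAT4B, VAV2, SLC25A33, MCCC1, ASNS, CASKIN1, DNMT3B, AURKA, OIP5, CTHRC1, and GOLGA7B;
SigMuc1NW1 includes the following genes: CGNL1, MGAT4B, VAV2, ASNS, CASKIN1, DNMT3B, AURKA, OIP5, CTHRC1, and GOLGA7B.
A method for detecting the risk of cancer recurrence, which diagnoses or estimates a patient's risk of death by examining changes in the expression of the genes in the above gene combination.
PCR, DNA chip, NanoString, or RNA sequencing can be used to detect low and high mRNA expression of the genes in the biomarker.
The test subject of this method is a human or other mammal.
The beneficial effects of the present invention are:
The biomarker of the present invention makes full use of the potential value of MUC1 as a new type of biomarker to develop an effective characteristic gene combination for predicting the recurrence of cancers such as prostate cancer. The invention constructs a 15-gene characteristic genome and several subsets of it. These characteristic genomes very effectively predict prostate cancer recurrence after radical prostatectomy in two independent prostate cancer databases (n=492 and n=140). In addition, they are closely associated with reduced disease-free survival and overall survival in several other cancer types.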
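Brief Description of the Drawings
Figure 1 shows the strategy used to generate the characteristic genome of the present patent;
Figures 2A-B show the covariate selection analysis of the 696 genes using the Elastic-net method;
Figure 3 shows the gene expression of the selected 15-gene characteristic genome (SigMuc1NW);
Figures 4A-B show that SigMuc1NW is associated with decreased disease-free survival (DFS) and overall survival (OS) in patients with prostate cancer;
Figure 5 shows the overlap between our previously reported 9-gene characteristic genome [21] and SigMuc1NW;
Figures 6A-C show that the two characteristic genomes of Figure 5 are significantly associated with a reduction in disease-free survival (DFS) and overall survival (OS) in patients with prostate cancer;
Figures 7A-D show that the SigMuc1NW score can effectively stratify prostate cancers with a high risk of recurrence;
Figure 8 shows the estimated cut-off point for the SigMuc1NW score;
Figure 9 shows that all 15 genes of SigMuc1NW are significantly associated with prostate cancer recurrence, together with the three sub-characteristic genomes derived from them;
Figures 10A-E show that SigCut1, SigCut2, and SigCut3 are significantly associated with a reduction in disease-free survival (DFS);
Figures 11A-C show that the SigMuc1NW score effectively stratifies prostate cancers with a high risk of recurrence;
Figures 12A-C show changes in the expression of the component genes in another prostate cancer cohort, independent of the TCGA database;
Figures 13A-E show that SigMuc1NW1 very effectively predicts prostate cancer recurrence in the database of Figure 12;
Figures 14A-B show that SigMuc1NW1 is significantly associated with a reduction in disease-free survival (DFS) and overall survival (OS) in prostate cancer patients in the TCGA sub-database.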
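Detailed Description
To explain the technical content, objectives and effects of the invention in detail, embodiments are described below with reference to the drawings.
Genomic alterations in the 25-gene MUC1 network are modestly associated with prostate cancer recurrence. Building on this, the inventors showed that genomic alterations in 9 of these 25 genes markedly strengthen the association. The invention aims to exploit the potential of MUC1 as a novel biomarker to develop effective gene signatures that predict prostate cancer recurrence. Using the TCGA prostate cancer clinical dataset in cBioPortal, the inventors identified 696 differentially expressed genes (DEGs) associated with the 9-gene signature. From these DEGs they constructed a 15-gene signature and several sub-signatures. These signatures very effectively predicted recurrence after radical prostatectomy in two independent prostate cancer datasets (n=492 and n=140). They are also closely associated with reduced disease-free survival and overall survival in several other cancer types.
The invention obtains 696 DEGs associated with the 9-gene MUC1 signature from the TCGA prostate cancer dataset (n=492). Elastic-net logistic regression was used to analyze the effect of all altered genes on prostate cancer recurrence. The analysis used 416 downregulated genes, called at 1.5 SD (standard deviations) below the mean, and 280 upregulated genes, called at 2 SD above the mean. From the 696 genes this yielded a 15-gene signature (SigMuc1NW): SLCO2A1, CGNL1, SUPV3L1, TATDN2, MGAT4B, VAV2, SLC25A33, MCCC1, ASNS, CASKIN1, DNMT3B, AURKA, OIP5, CTHRC1 and GOLGA7B. From SigMuc1NW the inventors further derived four sub-signatures: SigCut1, SigCut2, SigCut3 and SigMuc1NW1.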
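SigMuc1NW strongly predicts biochemical recurrence after radical prostatectomy, with a sensitivity of 56.4% and a specificity of 72.6%. Median months disease-free (MMDF) was 63.24 months for SigMuc1NW-positive patients; for SigMuc1NW-negative patients it was markedly longer and had not been reached by the end of the 160-month follow-up (p=1.12e-12). The time-dependent area under the curve (tAUC) for SigMuc1NW was 76.6% at 11.5 months, 73.8% at 22.3 months, 78.5% at 32.1 months and 76.4% at 48.4 months. SigMuc1NW is associated with malignant features of prostate cancer: high Gleason score (odds ratio, OR, 1.48, p<2e-16) and advanced tumor stage (OR 1.33, p=4.37e-13).
SigCut1 (MGAT4B, AURKA and OIP5) distinguished recurrent from non-recurrent prostate cancers with tAUC values of 74.3% at 11.5 months, 73.8% at 22.3 months, 78.5% at 32.1 months and 76.4% at 48.4 months. MMDF was 69.1 months for SigCut1-positive cancers and had not been reached at the end of follow-up for SigCut1-negative cancers (p=4.8e-10).
SigCut2 (TATDN2, MGAT4B, VAV2, AURKA and OIP5) distinguished recurrent from non-recurrent prostate cancers with tAUC values of 75.9% at 11.5 months, 73.4% at 22.3 months, 76.5% at 32.1 months and 75.3% at 48.4 months. MMDF was 32.5 months for SigCut2-positive cancers and had not been reached at the end of follow-up for SigCut2-negative cancers (p=0).
SigCut3 comprises all 15 component genes of SigMuc1NW. Based on mRNA expression of these genes relative to their cutpoints, SigCut3 predicts prostate cancer recurrence with 67% sensitivity and 75.7% specificity. MMDF was 45.2 months for SigCut3-positive cancers and had not been reached for SigCut3-negative cancers (p=0). SigCut3 distinguished recurrent from non-recurrent prostate cancers with tAUC values of 76.5% at 11.5 months, 73.8% at 22.3 months, 78.5% at 32.1 months and 76.4% at 48.4 months.
SigMuc1NW1 consists of 10 genes: CGNL1, MGAT4B, VAV2, ASNS, CASKIN1, DNMT3B, AURKA, OIP5, CTHRC1 and GOLGA7B. In the TCGA dataset (n=492), MMDF was 60.91 months for SigMuc1NW1-positive cancers and had not been reached at the end of follow-up for negative cancers (p=3.14e-12). In a second dataset (MSKCC, n=140), MMDF was 11.8 months for SigMuc1NW1-positive cancers and had not been reached at the end of follow-up for negative cancers (p=3.11e-15). SigMuc1NW1 distinguished recurrent from non-recurrent prostate cancers with tAUC values of 82.5% at 18.4 months, 78.5% at 38 months, 76.6% at 51.4 months and 78.2% at 65 months.
After adjustment for known clinical high-risk factors, including age, Gleason grade, positive surgical margin and tumor stage, SigMuc1NW1 and SigCut3 each remain independent risk factors for recurrence after surgery.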
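SigMuc1NW and SigMuc1NW1 are associated with shorter overall survival (OS) in multiple cancer types, as follows (where two p values are given, the first is for SigMuc1NW and the second for SigMuc1NW1):
Breast cancer: Curtis dataset (n=1980, p=0.00447 for SigMuc1NW and p=0.000575 for SigMuc1NW1) and TCGA dataset (n=1093, p=0.022 and p=0.0586);
Low-grade glioma (n=516, p=4.92e-5 and p=0.000191);
Head and neck squamous cell carcinoma (n=520, p=0.0368 for SigMuc1NW1);
Clear cell renal cell carcinoma (ccRCC, n=533, p=1.58e-5 and p=2e-7);
Papillary renal cell carcinoma (pRCC, n=290, p=0.00289 and p=1.172e-5);
Hepatocellular carcinoma (n=371, p=0.048 and p=0.0349);
Sarcoma (n=259, p=0.00813 and p=0.000851);
Thyroid cancer (n=501, p=1.01e-5 and p=0.000742);
Uterine corpus endometrial carcinoma (n=177, p=0.0244 and p=0.026).
SigMuc1NW and SigMuc1NW1 are associated with shorter disease-free survival (DFS) in multiple cancer types, as follows:
Low-grade glioma (n=516, p=0.0183 and p=0.00511);
Clear cell renal cell carcinoma (n=533, p=0.00485 and p=0.000264);
Papillary renal cell carcinoma (n=290, p=0.00379 for SigMuc1NW1);
Chromophobe renal cell carcinoma (n=66, p=0.03 and p=0.0235);
Hepatocellular carcinoma (n=371, p=0.0105 and p=0.00813);
Sarcoma (n=259, p=0.000768 and p=1.1e-5);
Cutaneous melanoma (n=469, p=0.00566 for SigMuc1NW1);
Thyroid cancer (n=501, p=0.0506 for SigMuc1NW1).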
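By examining changes in the 15 genes of SigMuc1NW and in its sub-signatures (SigMuc1NW1, SigCut1, SigCut2 and SigCut3), the invention assesses the risk of recurrence in patients after radical prostatectomy. It can also be used to assess the risk of death in prostate cancer patients, the risk of recurrence at initial diagnosis, and the risk of metastasis and progression to castration-resistant prostate cancer (CRPC) after radical prostatectomy.
The 15 genes of SigMuc1NW can be used in different combinations; combinations other than SigMuc1NW1, SigCut1, SigCut2 and SigCut3 may also be used, because each of the 15 genes individually predicts biochemical recurrence.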
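Figures 1-14 of the invention are described in detail below:
Figure 1: The TCGA dataset in cBioPortal holds gene expression data, obtained by RNA sequencing, for 492 prostate cancer patients. Using the 9-gene signature derived from the MUC1-regulated gene network, we first divided the cohort into two groups: positive for the 9-gene signature (n=100) and negative (n=392). Comparing mean mRNA expression between the two groups (q<0.001) gave 696 differentially expressed genes (DEGs), comprising 416 downregulated and 280 upregulated genes. Tumors in which a downregulated gene was expressed more than 1.5 SD (standard deviations) below the mean (-1.5SD) were grouped together, as were tumors in which an upregulated gene was expressed more than 2 SD above the population mean (+2SD). We then built models with the Elastic-net penalty in the R glmnet package, using regularized covariate selection to analyze the effect of the 696 DEGs on biochemical recurrence.
Figure 2: Cross-validation (CV) curves with the mixing parameter α set to 0.2 (A) and 0.8 (B). The number of non-zero coefficients (covariates) at each value of λ, the parameter that sets the penalty level, is shown along the top. The right-most vertical line marks the minimum of the CV curve; the vertical line to its left marks the λ whose CV error is within one standard error of the minimum. The models were built at the λ indicated by the left-hand line.
Figure 3: Using the system described for Figures 1 and 2, we selected these 15 genes from the 696 according to their effect on prostate cancer recurrence. In the TCGA prostate cancers, SLCO2A1 and CGNL1 are downregulated by 1.5 standard deviations (-1.5SD) and the remaining genes are upregulated by 2 standard deviations (+2SD). Expression of these DEGs is displayed as an OncoPrint (top, gray inset) and as a clustered heat map (bottom, color image). The figure also shows disease-free status. It was generated with tools provided by cBioPortal.
Figure 4: The TCGA dataset was used for these analyses. (A) Effect of SigMuc1NW on DFS. MDF: months disease-free; MMDF: median months disease-free; NA: MMDF not reached. The plot includes the number of individuals at risk at the indicated follow-up times. (B) Effect of SigMuc1NW on OS. MMS: median months survival. Kaplan-Meier analysis and log-rank tests were performed with the R survival package.
Figure 5: Generated using the TCGA dataset (n=492, cBioPortal). Only genes altered in both signatures are shown.
Figure 6: Analyses used the TCGA dataset. (A, B) The 9-gene signature used to construct SigMuc1NW was combined with SigMuc1NW, and the effect of the combination on DFS (A) and OS (B) was tested. (C) Effect of our previously identified 9-gene signature on DFS in the TCGA dataset.
Figure 7: (A) All tumors in the TCGA dataset were assigned a SigMuc1NW score. The scores were analyzed by tROC to identify tumors at high risk of recurrence. The AUC at the indicated times (tAUC) and the recurrence status are shown. DF: disease-free. (B) The cutpoint of the SigMuc1NW score effectively separates prostate cancers at low and high risk of recurrence (see Figure 8 for details). Tumors were then assigned a binary code based on this cutpoint, and the effect of the cutpoint on DFS in the TCGA cohort was determined. (C, D) Effect of the mean and Q3 of the SigMuc1NW score on biochemical recurrence in the TCGA dataset. Kaplan-Meier analysis and log-rank tests were performed with the R survival package. Vertical dotted lines mark the MMDF. Colored dotted curves show the 95% confidence interval.
Figure 8: All 492 patients of the TCGA dataset were assigned a SigMuc1NW score. The cutpoint was estimated using maximally selected rank statistics in R (maxstat package). The vertical dashed line marks the cutpoint and its associated p value.
Figure 9: mRNA expression data for the 15 genes were obtained from the TCGA dataset (cBioPortal), a cutpoint was derived for each gene, and every tumor was assigned a binary code. The hazard ratio (HR) for prostate cancer recurrence was determined for each gene using a univariate Cox proportional hazards (PH) model. The PH assumption was also assessed and confirmed. These analyses used the R survival package. The figure includes HRs, 95% CIs and p values. The genes included in SigCut1, SigCut2 and SigCut3 were chosen according to these p values.
Figure 10: The TCGA dataset was used. (A) All tumors were scored using the Cox coefficients of SigCut1, SigCut2 and SigCut3. The panel shows the time-dependent AUC of the three signatures over follow-up and the corresponding recurrence status. (B-D) Association of SigCut1, SigCut2 and SigCut3 with biochemical recurrence. (E) Stratification of prostate cancers at high risk of recurrence using the Q1, median, cutpoint and Q3 of the SigCut3 score. The number of individuals at risk at the indicated follow-up times is included. Kaplan-Meier analysis and log-rank tests were performed with the R survival package.
Figure 11: Stratification of prostate cancers at high risk of biochemical recurrence in the TCGA dataset using the Q1, median and Q3 of the SigMuc1NW score.
Figure 12: Expression data for all 15 component genes were obtained from the MSKCC dataset in cBioPortal; expression in this cohort was measured by DNA microarray. The panels show mRNA levels in normal prostate versus prostate cancer tissue (A), primary versus metastatic prostate cancer (B), and non-recurrent versus recurrent prostate cancer (C). The number of cases in each group is indicated. Statistical analysis used Student's t test (two-tailed). *p<0.05, **p<0.01 and ***p<0.001.
Figure 13: Follow-up data and mRNA expression data for all 15 genes were obtained from the MSKCC dataset. SigMuc1NW1 contains 10 genes. The figure shows the time-dependent AUC (A) and stratification of prostate cancers at high risk of recurrence using the cutpoint (B), Q1 (C), median (D) and Q3 (E) of the SigMuc1NW1 score. The number of cases at risk during follow-up is also shown.
Figure 14: SigMuc1NW1 is significantly associated with reduced DFS and OS in prostate cancer patients of the TCGA clinical cohort; SigMuc1NW1 gene expression was called at the SD-based levels. Kaplan-Meier analysis and log-rank tests used tools provided by cBioPortal.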
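The technical solution of the invention is described in detail below:
1. Genes closely associated with the 9-gene signature
Biochemical recurrence (BCR) occurs in 30-40% of patients after radical prostatectomy, and about 40% of these patients develop metastatic disease. Assessing the risk of BCR helps in devising individualized treatment. We recently constructed a 9-gene signature derived from the molecular network of the MUC1 gene; in the TCGA dataset this signature predicted BCR with a sensitivity of 34.8%, a specificity of 83.6% and a median months disease-free (MMDF) of 73.36 months (p=5.57e-5). BCR results from alterations in multiple genes and pathways. The inventors therefore reasoned that a more effective signature could be obtained by analyzing the transcriptome changes associated with the 9-gene signature. To test this, they analyzed the TCGA dataset in cBioPortal using the strategy in Figure 1, examining transcripts closely associated with the 9-gene signature. Of 492 prostate cancer patients, 100 were positive for the signature (Figure 1). Comparing mean gene expression between these 100 positive cancers and the other 392 negative cancers gave 696 differentially expressed genes (DEGs) (q<0.001) (Table 1, which lists the DEGs of the 9-gene signature in the TCGA dataset). They comprise 416 downregulated and 280 upregulated genes (Figure 1; Table 1). Enrichment analysis of the DEGs with the KEGG gene sets (kegg, kegg.sets.hs) in the R gage package showed that the upregulated genes mainly regulate the cell cycle, oocyte meiosis and progesterone-mediated oocyte maturation, whereas genes mediating cell junctions and other functions are downregulated. Likewise, with the Gene Ontology gene sets (go, go.sets.hs), the upregulated genes are involved in regulating cell cycle progression, DNA metabolism and other processes related to cell proliferation, and the downregulated genes in cell junctions, extracellular processes and other cellular processes. Pathway enrichment analysis of the 696 DEGs with a Reactome package in R showed that these genes regulate the G1 and M phases of the cell cycle, DNA replication and chromatid separation. Taken together, these analyses link the 696 DEGs to prostate cancer progression.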
Table 1
[Table 1 is provided in the original publication as images PCTCN2018113414-appb-000001 to PCTCN2018113414-appb-000025.]
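2. Construction of the 15-gene signature SigMuc1NW to predict biochemical recurrence (BCR) after radical prostatectomy (RP).
We next used the TCGA dataset to analyze the effect of the 696 differentially expressed genes (DEGs) on BCR. Direct covariate selection was applied to this cohort, all of whom had undergone radical prostatectomy (cBioPortal), to derive a gene set from the 696. Given the heterogeneity of prostate cancer, we reasoned that a DEG may affect BCR when its expression passes a threshold. For each downregulated DEG, cancers with expression 1.5 standard deviations below the reference population mean (-1.5SD) were grouped together; for each upregulated DEG, cancers with expression more than 2 standard deviations above it (+2SD) were grouped together (Figure 1). The reference population is either all tumors in the dataset or all tumors that are diploid for the gene in question (http://www.cbioportal.org/faq.jsp). The re-coded dataset, containing the downregulated genes, the upregulated genes, follow-up time and each patient's recurrence status, was then analyzed by regularized covariate selection using Elastic-net logistic regression in the R glmnet package (Figure 1). To favor selection of highly correlated covariates while minimizing their number, we set the Elastic-net mixing parameter α to 0.2 or 0.8. Ten-fold cross-validation was used in all settings. As expected, more covariates were selected at α=0.2 (n=17) than at α=0.8 (n=5) (Figure 2). We also ran covariate selection with a different setting (s=0.5), which yielded more covariates than α=0.2. We then removed DEGs with coefficients <0.01 in the s=0.5 setting and <0.001 in the α=0.2 setting. This gave the 15-gene signature SigMuc1NW (NW stands for network). It includes all 5 genes selected at α=0.8, 14 genes selected at α=0.2 (which include all 5 selected at α=0.8) and the 15 genes selected at s=0.5 (which include all 14 selected at α=0.2) (Table 2).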
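The SD-based calling step described above can be sketched in Python. This is a minimal illustration and not the inventors' code: it assumes expression is already available as a z-score relative to the reference population, and the helper name `call_alteration` and the sample z-scores are hypothetical.

```python
# Minimal sketch (hypothetical helper, not the original pipeline).
# Assumes each expression value is a z-score, i.e. the number of standard
# deviations from the reference population mean.

DOWN_THRESHOLD = -1.5  # downregulated DEGs are called below -1.5 SD
UP_THRESHOLD = 2.0     # upregulated DEGs are called above +2 SD


def call_alteration(z_score, direction):
    """Return 1 if the tumor passes the SD threshold for this gene, else 0.

    direction is "down" for a downregulated DEG and "up" for an upregulated DEG.
    """
    if direction == "down":
        return 1 if z_score < DOWN_THRESHOLD else 0
    if direction == "up":
        return 1 if z_score > UP_THRESHOLD else 0
    raise ValueError("direction must be 'down' or 'up'")


# Made-up z-scores for one tumor, for illustration only.
tumor = {"CGNL1": -1.8, "AURKA": 2.4, "OIP5": 1.1}
directions = {"CGNL1": "down", "AURKA": "up", "OIP5": "up"}
calls = {gene: call_alteration(z, directions[gene]) for gene, z in tumor.items()}
print(calls)  # {'CGNL1': 1, 'AURKA': 1, 'OIP5': 0}
```

The resulting 0/1 matrix (tumors by genes), together with follow-up time and recurrence status, is the kind of re-coded input the covariate selection would operate on.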
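Table 2
Figure PCTCN2018113414-appb-000026
Figure PCTCN2018113414-appb-000027
where a: genes downregulated at -1.5SD; b: genes upregulated at 2SD; NA: not available.
Of the 15 genes, only SLCO2A1 and CGNL1 are downregulated; the rest are upregulated (Table 1). No role in prostate cancer or in tumorigenesis generally has been reported for five genes: CGNL1, SUPV3L1, TATDN2, CASKIN1 and GOLGA7B (Table 2). Six genes (SLCO2A1, MGAT4B, SLC25A33, MCCC1, OIP5 and CTHRC1) have been reported to affect tumorigenesis in other cancer types but not in prostate cancer (Table 2). OIP5 (Opa interacting protein 5) is a cancer-testis antigen that has been reported as a tumor-associated antigen (TAA) in other cancer types; its aberrant expression in prostate cancer suggests that it may also be a TAA in prostate cancer. The remaining four genes, VAV2 (VAV guanine nucleotide exchange factor 2), ASNS (asparagine synthetase), DNMT3B (DNA methyltransferase 3 beta) and AURKA (Aurora kinase A), not only promote prostate cancer development but also contribute to progression to castration-resistant prostate cancer (CRPC). VAV2 is a coactivator of the androgen receptor (AR) that sustains AR signaling after androgen deprivation therapy (ADT); it also promotes angiogenesis and metastasis. AURKA plays an important role in mitosis and promotes the development of neuroendocrine prostate cancer after ADT. DNMT3B may regulate epigenetic events that promote CRPC progression. Taken together, the evidence supports an association between SigMuc1NW and prostate cancer recurrence.
Consistent with this, univariate Cox proportional hazards (PH) analysis showed that every component gene, at the defined expression level (-1.5SD for downregulation and +2SD for upregulation), effectively predicts BCR (Table 2, titled "SigMuc1NW component genes and prostate cancer recurrence (a)"). The PH assumption of the Cox model was confirmed for all genes except TATDN2 and OIP5. Predictions by some genes (MGAT4B, ASNS, DNMT3B and OIP5) are efficient (Table 3), particularly given that each is based on a single gene.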
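Table 3
Figure PCTCN2018113414-appb-000028
where a: univariate Cox analysis in the TCGA dataset (n=492); b: Cox coefficient; c: hazard ratio; d: confidence interval; e: gene expression <-1.5SD relative to the reference population mean; f: gene expression >2SD relative to the reference population mean. *p<0.05; **p<0.01; ***p<0.001.
Supporting our gene selection, alterations in the 15 genes overlap (Figure 3, top) and their expression clusters together (Figure 3, bottom). The SD-based down/upregulation calls match the clustering based on gene expression (Figure 3), which validates our covariate selection. Importantly, patients whose tumors carry these alterations are indeed at risk of recurrence: they fall mainly in the recurrent group (Figure 3, see the "disease-free status" track). SigMuc1NW-positive tumors are strongly associated with reduced disease-free survival (DFS) (Figure 4A, p=1.12e-12). The association has a sensitivity of 56.4% and a specificity of 72.6%; sensitivity is markedly higher than for the originally reported 9-gene signature (sensitivity 34.8%, specificity 83.6%, p=5.57e-5). Of the 10 cancer deaths in the TCGA dataset, 8 occurred in SigMuc1NW-positive patients (Figure 4B, p=0.00212), consistent with the inference that VAV2, ASNS, DNMT3B and AURKA promote CRPC. As expected, SigMuc1NW overlaps with the 9-gene signature used to select the 696 DEGs (Figure 5). Combining the two markedly strengthens the association of the 9-gene signature with BCR (Figure 6A, C), and the combination is significantly associated with reduced overall survival (OS) (Figure 6B).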
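3. SigMuc1NW effectively distinguishes recurrent prostate cancers from those without biochemical recurrence
To test how well SigMuc1NW distinguishes recurrent prostate cancers from those without BCR, we assigned each of the 15 gene alterations its Cox coefficient (Table 3). Each patient's cumulative SigMuc1NW score was then calculated as $\sum_{i=1}^{n} f_i$ ($f_i$: Cox coefficient of gene $i$; $n=15$). We used time-dependent ROC (tROC) analysis to assess sensitivity and specificity in predicting BCR. The scores discriminated recurrent prostate cancers with a tAUC (area under the curve) of 74.9% at 11.5 and 32.1 months and 69.7% at 48.4 months (Figure 7A), indicating that SigMuc1NW is particularly effective for predicting early BCR. Using maximally selected rank statistics in the R maxstat package (Figure 8), we determined the cutpoint of the SigMuc1NW score that separates recurrent from non-recurrent cancers and converted the scores to a binary code: scores ≤1.7833 (the cutpoint, Figure 8) were assigned "0" and scores >1.7833 were assigned "1". Cancers scoring above the cutpoint had markedly more BCR than those at or below it (Figure 7B). Compared with SigMuc1NW-positive cancers (Figure 4A; MMDF 63.2, 95% CI 40-77.3), cutpoint-positive tumors recurred even sooner (Figure 7B; MMDF 33.1, 95% CI 30.9-73.4). The cutpoint therefore not only makes SigMuc1NW easier to apply clinically but also improves its predictive power. In addition, the mean and third-quartile (Q3) scores stratify patients at high risk of BCR with an efficacy comparable to SigMuc1NW (compare Figure 7C, D with Figure 4A). The mean and Q3 scores captured 48 and 46 recurrent cancers respectively (Figure 7C, D), more than the 41 captured by the cutpoint (Figure 7A). The mean (0.918), Q3 (1.019) and cutpoint (1.7883) can thus be combined to predict BCR across a range of risk. After adjustment for age at diagnosis, post-prostatectomy Gleason score, positive surgical margin and TNM stage, SigMuc1NW (p=1.62e-4), the cutpoint (p=2.05e-5), the mean (p=1.19e-4) and Q3 (p=1.67e-4) each remained an independent risk factor for recurrence (Table 4). When the World Health Organization (WHO) prostate cancer grading system was used in place of Gleason grade, after adjustment for the same clinical factors SigMuc1NW (p=0.0532) and Q3 (p=0.0576) had p values close to 0.05, while the cutpoint (p=0.00395) and the mean (p=0.0187) remained independent risk factors for BCR.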
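Table 4
Figure PCTCN2018113414-appb-000029
where 1: SigMuc1NW; 2: SigMuc1NW-derived cutpoint; 3: age at diagnosis; 4: radical prostatectomy Gleason score; 5: seminal vesicle invasion; 6: surgical margin; 7: tumor stage (0 for ≤T2; 1 for T3 and T4); HR: hazard ratio; CI: confidence interval; NA: not available.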
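4. Improving the predictive efficiency of SigMuc1NW
To further demonstrate the efficacy and robustness of SigMuc1NW, we analyzed the signature using actual gene expression values rather than SD-based calls. RNA-sequencing data for all 15 SigMuc1NW genes were retrieved from the TCGA dataset, and for each gene the expression cutpoint that best separates recurrent cancers was estimated (Table 5, note 1). As above, every tumor was given a binary code for each of the 15 genes; for the downregulated genes SLCO2A1 and CGNL1, tumors with expression below the cutpoint were coded "1". Univariate Cox proportional hazards (PH) analysis was performed for every gene, subject to the PH assumption. All 15 genes, as defined by their cutpoints, significantly predicted BCR (Figure 9). After adjustment for age, post-prostatectomy Gleason score, surgical margin and TNM stage, SLCO2A1 (p=0.0369), SUPV3L1 (p=0.000798), TATDN2 (p=0.000835), MGAT4B (p=0.0128), VAV2 (p=0.0024), SLC25A33 (p=0.0297) and OIP5 (p=0.00638) remained independent risk factors for BCR (p=0.00102).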
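Table 5
Figure PCTCN2018113414-appb-000030
where 1: RNA-sequencing data for the SigMuc1NW component genes were obtained from the TCGA dataset (cBioPortal). 2: Cutpoints were estimated using maximally selected rank statistics in R. 3: Coefficients for BCR were determined by univariate Cox proportional hazards analysis. #: PH assumption test p<0.05.
Using the Cox coefficients obtained (Table 5), every cutpoint-positive call was converted to the corresponding coefficient. Based on efficacy as judged by p value (Figure 9), we derived three sub-signatures: SigCut1, SigCut2 and SigCut3 (Figure 9). All tumors were then scored for SigCut1, SigCut2 and SigCut3 using $\sum_{i=1}^{n} f_i$ ($f_i$: Cox coefficient of gene $i$; $n=3$, 6 and 15). All three sub-signatures discriminated recurrent prostate cancers with tAUC >70% (Figure 10A). Their cutpoints were 1.0331 (p=6.166e-8) for SigCut1, 4.0135 (p=1.005e-11) for SigCut2 and 5.4067 (p=7.97e-15) for SigCut3. The corresponding binary codes were assigned to all tumors for survival analysis. All three sub-signatures are significantly associated with reduced disease-free survival (DFS), SigCut2 and SigCut3 more strongly (Figure 10B-D). All three can nevertheless be used to predict BCR, as shown by the number of recurrent tumors captured and the median months disease-free (MMDF). Sensitivity/specificity was 71.4%/63.9% for SigCut1, 41.8%/87.5% for SigCut2 and 67.7%/75.7% for SigCut3 (Figure 10B-D). The three sub-signatures can therefore be used together to predict recurrent prostate cancer.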
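The Q1 (1.647), median (3.589) and Q3 (6.386) scores each stratified BCR risk effectively, with sensitivity/specificity/MMDF/p of 93.4%/31.8%/81.2/6.76e-6 for Q1, 80.2%/56.9%/66.9/6.73e-11 for the median and 56%/82%/40/0 for Q3 (Figure 11). Used together, Q1, the median, Q3 and the SigCut3 cutpoint provide a very effective system for stratifying recurrent and non-recurrent prostate cancers; only a few recurrent cases scored below Q1 (Figure 10E).
Moreover, SigCut3 is clearly more effective (Figure 10D) than the SD-based SigMuc1NW (Figure 4A). After adjustment for age at diagnosis, post-prostatectomy Gleason score, surgical margin and TNM stage, SigCut1 (p=0.00308), SigCut2 (p=1.55e-5) and SigCut3 (p=2.97e-6) each independently predicted BCR. All three sub-signatures are associated with adverse features of prostate cancer: for advanced tumors (T3 and T4) the odds ratio (95% CI) was 1.78 (1.51-2.12; p=2.39e-11) for SigCut1, 1.55 (1.37-1.77; p=1.33e-11) for SigCut2 and 1.33 (1.23-1.44; p=8.47e-13) for SigCut3; for Gleason grade 8-10 it was 2.19 (1.86-2.6; p<2e-16), 1.84 (1.62-2.1; p<2e-16) and 1.48 (1.37-1.61; p<2e-16) respectively. Together these results confirm the efficacy of SigMuc1NW.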
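5. Further validation of SigMuc1NW
Of the 13 prostate cancer datasets in cBioPortal, 4 contain mRNA data. Primary prostate cancers come from the TCGA and Broad/Cornell (Nat Genet 2012) datasets; metastatic prostate cancers come from the Fred Hutchinson (Nat Med 2016) and SU2C/PCF Dream Team cohorts (cBioPortal). These datasets also provide SD-based gene expression calls. In these data the proportion of SigMuc1NW-positive cases is significantly higher in metastatic than in primary prostate cancer (Table 6, titled "Upregulation of SigMuc1NW in metastatic prostate cancer (i)").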
Table 6
Dataset               Prostate cancer   n     SigMuc1NW+
TCGA Prov (ii)        Primary PC        497   33.6% (167/497)
Broad/Cornell         Primary PC        31    32.3% (10/31)
Total                                   528   33.4% (177/528)
Fred Hutchinson       mCRPC             63    58.7% (37/63)
SU2C/PCF Dream Team   mCRPC             118   59.3% (70/118)
Total                                   181   59.1% (107/181)*
where i: data obtained from cBioPortal; ii: TCGA dataset; *p=0.0002 by Fisher's exact test, compared with primary prostate cancer.
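The MSKCC (Cancer Cell 2010) dataset in cBioPortal contains 216 prostate cancer samples whose mRNA expression was measured by DNA microarray; the data were processed by comparing prostate cancers with normal prostate tissue (cBioPortal). This cohort includes follow-up information and so supports survival analysis. To further validate SigMuc1NW, which was built from RNA-sequencing data in the TCGA dataset, we extracted mRNA expression data for all 15 component genes, with the corresponding clinical information, from the MSKCC dataset. Samples are classified as normal prostate (n=29), primary prostate cancer (n=149), recurrent prostate cancer (n=36) and metastatic prostate cancer (n=9) (cBioPortal). CGNL1 expression is significantly lower in primary prostate cancer than in normal prostate tissue, and the two downregulated SigMuc1NW genes, SLCO2A1 and CGNL1, are expressed at significantly lower levels in metastatic than in local prostate cancer and in recurrent than in non-recurrent prostate cancer (Figure 12A-C). In the same comparisons (Figure 12A-C), the upregulated SigMuc1NW genes are expressed at significantly higher levels, which supports the validity of SigMuc1NW.
Using the system described above, we derived cutpoints for all 15 genes and assigned binary codes. We also used Cox proportional hazards regression (Cox PH) to determine the association of each gene with BCR (Table 7, titled "Cutpoints and Cox coefficients of SigMuc1NW in prostate cancer patients of the MSKCC dataset (1)"). MCCC1 was inversely associated with disease-free survival (DFS) and four genes showed no significant association with DFS; the remaining 10 genes significantly predicted BCR risk, CGNL1 and CTHRC1 most strongly (Table 7). These 10 genes therefore form a sub-signature, SigMuc1NW1. As described above, all tumors were scored for SigMuc1NW1 using the coefficients (Table 7). tROC analysis gave tAUC values from 76.6% to 82.5% (Figure 13A). SigMuc1NW1 effectively distinguished recurrent from non-recurrent prostate cancers throughout follow-up between 18.4 and 65 months (Figure 13A), with an efficacy similar to that of SigMuc1NW in the TCGA dataset (Figure 10A). In addition, binary codes derived from the Q1 (0), median (1.805), Q3 (3.727) and cutpoint (6.2136) of the SigMuc1NW1 score stratified recurrent prostate cancers very effectively (Figure 13B-E). Sensitivity/specificity/PPV (positive predictive value) was 36.1%/98.1%/86.7% for the cutpoint, 97.2%/35.6%/34.3% for Q1, 75%/59.6%/39.1% for the median and 52.8%/84.6%/54.3% for Q3 (Figure 13B-E). The PPV of the cutpoint is very high (86.7%). Overall, combining Q1, the median, Q3 and the cutpoint effectively predicts prostate cancer recurrence in the MSKCC dataset. The efficacy of SigMuc1NW was also confirmed in the TCGA dataset. In a reverse validation, we showed that in the TCGA dataset SigMuc1NW1 is strongly associated with BCR and significantly associated with reduced overall survival (OS) (Figure 14A, B). Taken together, these results validate SigMuc1NW and SigMuc1NW1 for predicting prostate cancer recurrence.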
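As a worked check of how sensitivity, specificity and PPV relate, the short Python sketch below recomputes the cutpoint figures from a 2x2 table. The counts are not given in the text; they are back-calculated here as an assumption, taking 36 recurrent cases and 104 non-recurrent cases (n=140), and they are simply the integers that reproduce the reported percentages. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return (sensitivity, specificity, PPV) from 2x2 confusion counts."""
    sensitivity = tp / (tp + fn)   # recurrent cases flagged positive
    specificity = tn / (tn + fp)   # non-recurrent cases flagged negative
    ppv = tp / (tp + fp)           # flagged cases that actually recurred
    return sensitivity, specificity, ppv


# Assumed counts for the SigMuc1NW1 cutpoint in the MSKCC cohort
# (back-calculated for illustration, not taken from the source):
# 13 of 36 recurrent cases and 2 of 104 non-recurrent cases are positive.
sens, spec, ppv = diagnostic_metrics(tp=13, fn=23, fp=2, tn=102)
print(f"{sens:.1%} / {spec:.1%} / {ppv:.1%}")  # 36.1% / 98.1% / 86.7%
```

Under the same assumption, lowering the threshold to Q1 trades specificity for sensitivity (for example 35 of 36 recurrent cases positive gives 97.2% sensitivity), which is why the text combines several thresholds rather than relying on one.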
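Table 7
Figure PCTCN2018113414-appb-000031
where 1: DNA microarray data for the SigMuc1NW component genes were obtained from the MSKCC dataset (cBioPortal). 2: Cutpoints were estimated using maximally selected rank statistics in R. 3: Coefficients for BCR were determined by univariate Cox proportional hazards analysis.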
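6. SigMuc1NW and SigMuc1NW1 are associated with reduced disease-free survival (DFS) and overall survival (OS) in multiple cancer types
We analyzed the value of SigMuc1NW and SigMuc1NW1 for predicting DFS and OS in other cancer types. Both signatures are significantly associated with reduced OS in two large breast cancer cohorts, low-grade glioma, head and neck squamous cell carcinoma (SigMuc1NW1 only), clear cell renal cell carcinoma (ccRCC), papillary renal cell carcinoma (pRCC), hepatocellular carcinoma, sarcoma, thyroid cancer and uterine corpus endometrial carcinoma (Table 8, titled "Association of SigMuc1NW and SigMuc1NW1 with reduced overall survival in multiple cancers (a)"). With p values as low as 10^-5 (e-5) to 10^-7 (e-7), these associations are highly significant (Table 8). Both signatures also predict reduced DFS in low-grade glioma, clear cell renal cell carcinoma, chromophobe renal cell carcinoma, hepatocellular carcinoma and sarcoma (Table 9). SigMuc1NW1 is associated with reduced DFS in sarcoma, and it is associated with disease recurrence in more cancer types than SigMuc1NW (Table 9, titled "Association of SigMuc1NW and SigMuc1NW1 with reduced disease-free survival in multiple cancers (a)"). Overall, these data support the clinical relevance of SigMuc1NW and SigMuc1NW1.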
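Table 8
Figure PCTCN2018113414-appb-000032
where a: all cancer datasets are from cBioPortal. b: numbers positive (+) and negative (-) for the indicated signature, given as total/recurrent; MMDF.
Table 9
Figure PCTCN2018113414-appb-000033
Figure PCTCN2018113414-appb-000034
where a: all cancer datasets are from cBioPortal. b: numbers positive (+) and negative (-) for the indicated signature, given as total/recurrent; MMDF.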
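The invention develops a new way of analyzing multigene-associated transcriptomes to derive signatures for diagnosing tumor recurrence. It is the first approach based entirely on a multigene transcriptome analysis (696 genes) rather than on single-gene analysis. This new angle and integrated analysis yielded the 15-gene signature. Of its genes, 73.3% (11/15) have not previously been reported in connection with prostate cancer. These 11 new prostate cancer genes include MGAT4B and OIP5. MGAT4B may act by altering glycosylation of tumor proteins, and altered glycosylation is an important change in tumorigenesis; aberrant MUC1 glycosylation in tumorigenesis is well established. The presence of MGAT4B in the 15-gene signature is therefore consistent with the signature's origin in the 9-gene MUC1 signature. The presence of OIP5 in SigMuc1NW suggests that this protein is a tumor-associated antigen (TAA) in prostate cancer. TAAs have been studied extensively in cancer diagnosis and therapy, so OIP5 has potential clinical applications in the diagnosis and treatment of prostate cancer.
Because cancer progression is complex, we chose not to focus on any particular aspect of tumorigenesis but to apply current machine-learning methods to assess how well the 696 genes predict biochemical recurrence. This yielded the 15-gene signature. Although SigMuc1NW was not built around specific pathways, it probably spans several. Besides the potential effect of MGAT4B on protein glycosylation, the signature includes proteins with RNA helicase activity (SUPV3L1, Table 2) and DNA methyltransferase activity (DNMT3B, Table 2). These processes are important in gene expression and epigenetic alteration, and their dysregulation is a hallmark of cancer progression. SigMuc1NW also contains genes that regulate cell proliferation. AURKA is increasingly recognized as an important regulator of mitosis and a key player in tumorigenesis, and it is regarded as an important potential target in cancer therapy. Notably, only 4 of the 15 SigMuc1NW genes have a reported role in prostate cancer, and all 4 promote progression to castration-resistant prostate cancer (CRPC). Because gene expression and epigenetic regulation are markedly abnormal in CRPC, the 15-gene signature could also predict progression to CRPC.
Including genes that act in multiple pathways may be the main reason the signatures predict tumor recurrence so effectively. SigMuc1NW and its sub-signatures all stratify prostate cancers by risk of biochemical recurrence (p=0) and identify recurrent prostate cancers with tAUC >75%. Analyzing the sub-signatures of SigMuc1NW in combination gives highly reliable sensitivity, specificity and PPV (positive predictive value), with values as high as 97.2% for sensitivity (at Q1) and 98.1% and 86.7% for specificity and PPV (at the cutpoint) (Figure 13B-E). Overall, the evidence strongly indicates that the signatures constructed in this invention will have important clinical value in predicting prostate cancer recurrence.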
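The methods of the invention include:
cBioPortal
The cBioPortal database (http://www.cbioportal.org/index.do) holds comprehensive genetic data on a wide range of cancer types. The TCGA datasets cover genetic abnormalities, transcript expression determined by cDNA microarray or RNA sequencing, and detailed clinical features including disease outcomes (recurrence and death). The TCGA clinical prostate cancer dataset includes 492 patients with localized prostate cancer.
Building the multigene signature
The largest TCGA dataset (n=499) in cBioPortal (http://www.cbioportal.org/index.do), which includes follow-up data for 492 patients, was used to obtain the 696 differentially expressed genes. These 696 genes are closely associated (q<0.001) with the 9-gene signature derived from the MUC1 network. Follow-up time, recurrence and other clinical data were also extracted. Elastic-net logistic regression in the R glmnet package was used to select variables with a substantial effect on BCR, with 10-fold cross-validation; the Elastic-net mixing parameter α was set to 0.2, 0.5 and 0.8. At α=0, Elastic-net reduces to Ridge regression, which performs no covariate selection but shrinks the coefficients of correlated predictors towards each other; at α=1 it reduces to Lasso regression, which tends to pick one covariate from a group of correlated covariates, making the signature more efficient. To favor selection of highly correlated variables within a group while keeping the number of covariates small, we set α to 0.2 and 0.8. With this system we obtained the 15-gene signature.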
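The role of the mixing parameter α can be illustrated with the penalty term that glmnet documents for Elastic-net, λ[(1−α)/2·‖β‖₂² + α‖β‖₁]. The Python sketch below only evaluates that penalty for a made-up coefficient vector; it does not fit a model, and the function name `elastic_net_penalty` and the example values are hypothetical.

```python
def elastic_net_penalty(beta, lam, alpha):
    """Elastic-net penalty: lam * ((1 - alpha) / 2 * ||beta||_2^2 + alpha * ||beta||_1).

    alpha = 0 gives the Ridge penalty, alpha = 1 gives the Lasso penalty.
    """
    l1_norm = sum(abs(b) for b in beta)
    l2_norm_squared = sum(b * b for b in beta)
    return lam * ((1.0 - alpha) / 2.0 * l2_norm_squared + alpha * l1_norm)


# Made-up coefficient vector, for illustration only.
beta = [1.0, -2.0, 0.0]
for alpha in (0.0, 0.2, 0.8, 1.0):
    print(alpha, elastic_net_penalty(beta, lam=1.0, alpha=alpha))
```

As α moves from 0.2 towards 0.8 the L1 term dominates, which pushes more coefficients to exactly zero; this matches the observation in the text that fewer covariates survive at α=0.8 than at α=0.2.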
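Assigning signature scores to patients/tumors
Univariate Cox proportional hazards (PH) regression was used to test how well each component gene predicts biochemical recurrence and to obtain its Cox coefficient. The PH assumption was also checked. The analysis used the R survival package. Each patient's signature score is given by Sum(coef1+coef2+...+coefn), where coef1...coefn are the coefficients of the individual genes.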
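The scoring rule can be sketched as follows. This is a hedged illustration, not the inventors' implementation: it assumes that only genes called altered in a tumor contribute their Cox coefficient, the coefficient values are invented placeholders (the real ones are in tables published as images), and the helper names are hypothetical. The cutpoint 1.7833 is the SigMuc1NW value quoted in the text and is used here only to show the binarization.

```python
def signature_score(calls, coefficients):
    """Sum the Cox coefficients of the genes called altered (call == 1)."""
    return sum(coefficients[gene] for gene, call in calls.items() if call == 1)


def binarize(score, cutpoint):
    """Return 1 for scores above the cutpoint and 0 for scores at or below it."""
    return 1 if score > cutpoint else 0


# Placeholder coefficients, NOT the published values.
coefficients = {"MGAT4B": 0.9, "AURKA": 0.7, "OIP5": 0.6}
calls = {"MGAT4B": 1, "AURKA": 0, "OIP5": 1}  # binary alteration calls for one tumor

score = signature_score(calls, coefficients)
SIGMUC1NW_CUTPOINT = 1.7833  # cutpoint quoted in the text for the SigMuc1NW score
print(score, binarize(score, SIGMUC1NW_CUTPOINT))
```

A tumor with no altered genes scores 0, and the score grows with both the number of altered genes and the strength of their individual association with recurrence.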
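Cutpoint estimation
Cutpoints were derived from patients' signature scores using maximally selected rank statistics in R (maxstat package) and used to separate recurrent from non-recurrent prostate cancers. We also obtained RNA expression values determined by RNA sequencing from the TCGA dataset and assessed how well the cutpoints separate recurrent from non-recurrent prostate cancers.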
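The idea behind cutpoint estimation can be sketched with a simplified scan: try each candidate threshold, split patients into high and low groups, and keep the threshold with the largest two-group log-rank statistic. This is a stand-in for the maxstat approach, not a reimplementation of it; in particular it omits the p-value adjustment for multiple candidate cutpoints. All names and the toy data are hypothetical.

```python
def logrank_chi2(times, events, groups):
    """Two-group log-rank chi-square statistic (events: 1 = recurrence, 0 = censored)."""
    observed_minus_expected = 0.0
    variance = 0.0
    for t in sorted({x for x, e in zip(times, events) if e == 1}):
        n = n1 = d = d1 = 0  # at risk (all, group 1) and events at t (all, group 1)
        for x, e, g in zip(times, events, groups):
            if x >= t:
                n += 1
                n1 += g
                if x == t and e == 1:
                    d += 1
                    d1 += g
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return observed_minus_expected ** 2 / variance if variance > 0 else 0.0


def best_cutpoint(scores, times, events, min_group=2):
    """Scan candidate cutpoints; return (cutpoint, statistic) with the largest statistic."""
    best = (None, 0.0)
    for c in sorted(set(scores))[:-1]:
        groups = [1 if s > c else 0 for s in scores]
        high = sum(groups)
        if high < min_group or len(groups) - high < min_group:
            continue  # skip splits that leave a group too small
        stat = logrank_chi2(times, events, groups)
        if stat > best[1]:
            best = (c, stat)
    return best


# Toy data: high scores recur early, low scores are censored late.
scores = [0.1, 0.2, 0.3, 0.4, 2.0, 2.1, 2.2, 2.3]
times = [50, 60, 55, 65, 5, 6, 7, 8]
events = [0, 0, 0, 0, 1, 1, 1, 1]
cutpoint, statistic = best_cutpoint(scores, times, events)
print(cutpoint, round(statistic, 2))
```

Scores above the returned cutpoint would be coded "1" and the rest "0", which is the binary code used for the survival analyses.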
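Regression analysis
Logistic regression was performed in R. Cox proportional hazards (Cox PH) regression used the R survival package. The PH assumption was also checked.
Pathway enrichment analysis
The gage and Reactome packages in R were used for KEGG (Kyoto Encyclopedia of Genes and Genomes) and GO (Gene Ontology) pathway analysis of the differentially expressed genes.
Statistical analysis
Fisher's exact test was performed with GraphPad Prism 5. Kaplan-Meier survival analysis and log-rank tests used the R survival package and tools provided by cBioPortal. Univariate and multivariate Cox regression used the R survival package. Time-dependent receiver operating characteristic (tROC) analysis used the R timeROC package. p<0.05 was considered statistically significant.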
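To make the "median disease-free time not reached" wording concrete, the sketch below implements a basic Kaplan-Meier estimator in plain Python. It is an illustration with made-up data, not the R survival code used in the study, and the function names are hypothetical. The median is the first time at which the estimated disease-free probability falls to 0.5 or below; if the curve never gets that low, the median is not reached and the function returns None.

```python
def kaplan_meier(times, events):
    """Return the Kaplan-Meier curve as a list of (time, survival probability).

    events: 1 = recurrence observed, 0 = censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        recurrences = censored = 0
        while i < len(data) and data[i][0] == t:
            if data[i][1] == 1:
                recurrences += 1
            else:
                censored += 1
            i += 1
        if recurrences > 0:
            survival *= 1 - recurrences / at_risk
            curve.append((t, survival))
        at_risk -= recurrences + censored
    return curve


def median_time(curve):
    """First time the curve drops to 0.5 or below, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Made-up follow-up data in months.
positive_group = kaplan_meier([2, 4, 4, 6, 9], [1, 1, 0, 1, 0])
negative_group = kaplan_meier([10, 20, 30, 40], [1, 0, 0, 0])
print(median_time(positive_group))  # 6
print(median_time(negative_group))  # None, i.e. median not reached
```

In the toy negative group only one of four patients recurs, so the curve stays at 0.75 and no median can be reported, which mirrors the "NA" entries for signature-negative patients.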
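In summary, the gene combinations provided by the invention for detecting cancer recurrence risk effectively predict the recurrence risk of prostate cancer and other cancers.
The above are merely embodiments of the invention and do not limit its patent scope; any equivalent transformation made using the specification and drawings, or any direct or indirect application in related technical fields, is likewise covered by the patent protection of the invention.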

Claims (10)

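  1. A biomarker for detecting cancer recurrence risk, characterized in that it comprises at least one of the 696 differentially expressed genes in Table 1.
  2. A biomarker for detecting cancer recurrence risk, characterized in that it comprises at least one of the following genes: SLCO2A1, CGNL1, SUPV3L1, TATDN2, MGAT4B, VAV2, SLC25A33, MCCC1, ASNS, CASKIN1, DNMT3B, AURKA, OIP5, CTHRC1 and GOLGA7B.
  3. A biomarker for detecting cancer recurrence risk, characterized in that it comprises at least one of the following signatures: SigCut1, SigCut2, SigCut3 and SigMuc1NW1;
    SigCut1 comprises the following genes: MGAT4B, AURKA and OIP5;
    SigCut2 comprises the following genes: TATDN2, MGAT4B, VAV2, AURKA and OIP5;
    SigCut3 comprises the following genes: SLCO2A1, CGNL1, SUPV3L1, TATDN2, MGAT4B, VAV2, SLC25A33, MCCC1, ASNS, CASKIN1, DNMT3B, AURKA, OIP5, CTHRC1 and GOLGA7B;
    SigMuc1NW1 comprises the following genes: CGNL1, MGAT4B, VAV2, ASNS, CASKIN1, DNMT3B, AURKA, OIP5, CTHRC1 and GOLGA7B.
  4. The biomarker for detecting cancer recurrence risk according to claim 1, 2 or 3, characterized in that the cancer includes at least one of the following: prostate cancer, breast cancer, acute myeloid leukemia, low-grade glioma, head and neck squamous cell carcinoma, clear cell renal cell carcinoma, papillary renal cell carcinoma, chromophobe renal cell carcinoma, hepatocellular carcinoma, sarcoma, cutaneous melanoma, thyroid cancer and uterine corpus endometrial carcinoma.
  5. The biomarker for detecting cancer recurrence risk according to claim 1, 2 or 3, characterized in that the genes in the biomarker include isoforms of the genes and members of the genes' families.
  6. The biomarker for detecting cancer recurrence risk according to claim 1, 2 or 3, characterized in that the biomarker is used to assess the risk of death of prostate cancer patients.
  7. The biomarker for detecting cancer recurrence risk according to claim 1, 2 or 3, characterized in that the biomarker is used to assess the risk of cancer recurrence at initial diagnosis of prostate cancer.
  8. The biomarker for detecting cancer recurrence risk according to claim 1, 2 or 3, characterized in that the biomarker is used to assess the risk of recurrence after radical prostatectomy.
  9. A method for detecting cancer recurrence risk, characterized in that the patient's risk of death is diagnosed or estimated by examining expression changes of the genes in the biomarker for detecting cancer recurrence risk according to claim 1, 2 or 3.
  10. The method for detecting cancer recurrence risk according to claim 8, characterized in that PCR, DNA microarray, Nanostring or RNA sequencing is used to detect low and high mRNA expression of the genes in the biomarker.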
PCT/CN2018/113414 2018-06-13 2018-11-01 Biomarker and detection method for detecting cancer recurrence risk WO2019237641A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810606265.4 2018-06-13
CN201810606265.4A CN108424970B (zh) 2018-06-13 2018-06-13 Biomarker and detection method for detecting cancer recurrence risk

Publications (1)

Publication Number Publication Date
WO2019237641A1 true WO2019237641A1 (zh) 2019-12-19

Family

ID=63164364

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/113414 WO2019237641A1 (zh) 2018-06-13 2018-11-01 Biomarker and detection method for detecting cancer recurrence risk

Country Status (2)

Country Link
CN (2) CN108424970B (zh)
WO (1) WO2019237641A1 (zh)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108424970B (zh) * 2018-06-13 2021-04-13 深圳市颐康生物科技有限公司 Biomarker and detection method for detecting cancer recurrence risk
EP3666906A1 (en) * 2018-12-11 2020-06-17 Consejo Superior De Investigaciones Científicas Methods and kits for the prognosis of squamous cell carcinomas (scc)
CN109897899B (zh) * 2019-03-01 2023-11-03 中山大学肿瘤防治中心(中山大学附属肿瘤医院、中山大学肿瘤研究所) Marker for prognosis of locally advanced esophageal squamous cell carcinoma and use thereof
CN110760584B (zh) * 2019-11-07 2022-12-09 深圳市华启生物科技有限公司 Biomarker for prostate cancer disease progression and use thereof
CN110760585B (zh) * 2019-11-07 2022-12-09 深圳市华启生物科技有限公司 Prostate cancer biomarker and use thereof
CN115707460A (zh) * 2021-08-19 2023-02-21 南京施江医药科技有限公司 Use of benzanilide drugs in treating tumors
CN115708822A (zh) * 2021-08-23 2023-02-24 南京施江医药科技有限公司 Use of hydrazide drugs in treating tumors
CN114990222B (zh) * 2022-07-06 2023-07-18 山东大学第二医院 Overall survival prediction model for patients with low-grade glioma

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108424970A (zh) * 2018-06-13 2018-08-21 深圳市颐康生物科技有限公司 Biomarker and detection method for detecting cancer recurrence risk

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2006246241A1 (en) * 2005-05-13 2006-11-16 Universite Libre De Bruxelles Gene-based algorithmic cancer prognosis
DK2382331T3 (en) * 2009-01-07 2016-08-22 Myriad Genetics Inc CANCER biomarkers
TW201343920A (zh) * 2012-03-29 2013-11-01 Nat Health Research Institutes Molecular markers, methods and kits for predicting prostate cancer prognosis
WO2013190081A1 (en) * 2012-06-22 2013-12-27 Proyecto De Biomedicina Cima, S.L. Methods and reagents for the prognosis of cancer
CN103602720A (zh) * 2013-06-24 2014-02-26 复旦大学附属肿瘤医院 Use of and method for prostate cancer gene markers in marking prostate cancer recurrence and metastasis
WO2016049276A1 (en) * 2014-09-25 2016-03-31 Moffitt Genetics Corporation Prognostic tumor biomarkers

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108424970A (zh) * 2018-06-13 2018-08-21 深圳市颐康生物科技有限公司 用于检测癌症复发风险的生物标志物及检测方法

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
GAO, J. ET AL.: "Intergrative analysis of complex cancer genomics and clinical profiles using cBioPortal", SCI SIGNAL, vol. 6, no. 269, 2 April 2013 (2013-04-02), XP055297746, DOI: 10.1126/scisignal.2004088 *
LIN, X. ET AL.: "Overexpression of MUC1 and genomic alterations in its network associate with prostate cancer progression", NEOPLASIA, vol. 19, no. 11, 18 September 2017 (2017-09-18), pages 857 - 867, XP055670259 *
YANG, L: "Development and validation of a 28- gene hypoxia-related prognostic signature for localized prostate cancer", EBIOMEDICINE, vol. 31, 23 April 2018 (2018-04-23), pages 182 - 189, XP055670266, DOI: 10.1016/j.ebiom.2018.04.019 *

Also Published As

Publication number Publication date
CN108424970B (zh) 2021-04-13
CN112941184A (zh) 2021-06-11
CN108424970A (zh) 2018-08-21

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18922798

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18922798

Country of ref document: EP

Kind code of ref document: A1