WO2019201186A1 - Device and method for identifying and evaluating tumor progression - Google Patents


Info

Publication number
WO2019201186A1
WO2019201186A1 PCT/CN2019/082574 CN2019082574W WO2019201186A1 WO 2019201186 A1 WO2019201186 A1 WO 2019201186A1 CN 2019082574 W CN2019082574 W CN 2019082574W WO 2019201186 A1 WO2019201186 A1 WO 2019201186A1
Authority
WO
WIPO (PCT)
Prior art keywords
gene
tumor
patient
genes
determining
Application number
PCT/CN2019/082574
Inventor
曾坚阳
周斌
Original Assignee
图灵人工智能研究院(南京)有限公司
Application filed by 图灵人工智能研究院(南京)有限公司
Publication of WO2019201186A1
Priority to US16/725,147 (US20200185054A1)

Classifications

    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G16B40/30 Unsupervised data analysis
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/10 Ploidy or copy number detection
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • C12Q2600/154 Methylation markers
    • C12Q2600/156 Polymorphic or mutational markers
    • C12Q2600/158 Expression markers
    • C12Q2600/178 Oligonucleotides characterized by their use miRNA, siRNA or ncRNA
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Definitions

  • The present application relates to the detection and treatment of diseases, and in particular to devices and methods for identifying biological indicators that can be used to assess tumor progression, as well as devices and methods for determining tumor progression.
  • CNV: copy number variation
  • Somatic mutations are often considered another cause of bladder cancer progression (see Soung YH, et al., Oncogene 2003, 22(39): 8048-8052). Abnormal expression of microRNAs may disturb the intracellular regulatory network in bladder cancer cells (see Jin Y, et al., Tumour Biol 2015, 36(5): 3791-3797).
  • The present application provides a device and method for identifying a biological indicator capable of assessing tumor progression. The device and method compare and correlate the clinical features of a patient having a tumor (e.g., the stage of tumor staging and/or the patient's survival time) with at least one biological indicator of the patient (e.g., gene expression level, copy number variation, DNA methylation, somatic mutation, and microRNA), thereby identifying biological indicators for evaluating tumor progression.
  • The present application also provides a device and method for determining tumor progression in a subject, which use the various identified biological indicators, assign a different weight to each indicator, and judge the progression of the subject's tumor from the weighted indicators.
  • The devices or methods of the present application may also provide a suitable treatment regimen based on the result of the determination.
  • The application provides a device for identifying a biological indicator capable of assessing tumor progression, the device comprising: 1) a clinical feature module capable of providing clinical features of patients having the tumor, the clinical features including the stage of tumor staging of each patient and/or the survival time of the patient; 2) a biological indicator module capable of providing at least one biological indicator derived from the patients; 3) a correlation determination module capable of determining the correlation between the at least one biological indicator of each patient and the clinical features of the corresponding patient; and 4) an identification module capable of identifying the biological indicators determined in module 3) to be correlated with the clinical features as being capable of assessing the progression of the tumor.
  • The present application provides a device for identifying a biological indicator capable of assessing tumor progression, the device comprising a computer for identifying the biological indicator, the computer being programmed to perform the following steps: 1) providing clinical features of patients having the tumor, the clinical features including the stage of tumor staging of each patient and/or the survival time of the patient; 2) providing at least one biological indicator derived from the patients; 3) determining the correlation between the at least one biological indicator of each patient and the clinical features of the corresponding patient; and 4) identifying the biological indicators determined in 3) to be correlated with the clinical features as being capable of assessing the progression of the tumor.
  • The present application provides a method of identifying a biological indicator capable of assessing tumor progression, the method comprising: 1) providing clinical features of patients having the tumor, the clinical features including the stage of tumor staging of each patient and/or the survival time of the patient; 2) providing at least one biological indicator derived from the patients; 3) determining the correlation between the at least one biological indicator of each patient and the clinical features of the corresponding patient; and 4) identifying the biological indicators determined in 3) to be correlated with the clinical features as being capable of assessing the progression of the tumor.
  • In certain embodiments, the tumor comprises bladder cancer.
  • In certain embodiments, the bladder cancer comprises bladder urothelial carcinoma (BLCA).
  • In certain embodiments, the stage of tumor staging is selected from the group consisting of: tumor stage I, tumor stage II, tumor stage III, and tumor stage IV.
  • In certain embodiments, the at least one biological indicator comprises one or more classes of indicators selected from the group consisting of: Class 1: expression level of the patient's genes; Class 2: copy number variation of the patient's genes; Class 3: DNA methylation of the patient's genes; Class 4: somatic mutation of the patient's genes; and Class 5: microRNA in the patient.
  • In certain embodiments, the at least one biological indicator comprises the expression level of the patient's genes, and determining the correlation between the expression level of a gene and the clinical feature comprises: performing a univariate regression analysis of the clinical feature with the expression level of the gene as the single variable, and identifying a gene whose p-value in the regression analysis is less than or equal to a first threshold and whose FDR value is less than or equal to a second threshold as being correlated with the clinical feature.
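The two-threshold filter described above can be sketched as follows. This is a minimal illustration, assuming the per-gene p-values have already been produced by the univariate regression and that the FDR values are Benjamini-Hochberg adjusted p-values; the text does not name the FDR procedure, and the function names and the 0.05 defaults are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (FDR), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_genes(p_by_gene, p_threshold=0.05, fdr_threshold=0.05):
    """Keep genes with p <= first threshold AND FDR <= second threshold.

    Both default thresholds are placeholders, not values from the source.
    """
    genes = list(p_by_gene)
    fdr = benjamini_hochberg([p_by_gene[g] for g in genes])
    return [g for g, q in zip(genes, fdr)
            if p_by_gene[g] <= p_threshold and q <= fdr_threshold]
```

A gene can pass the raw p-value threshold and still be dropped by the FDR threshold, which is why the text applies both.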
  • In certain embodiments, the at least one biological indicator comprises the expression level of the patient's genes, and determining the correlation between the expression level of a gene and the clinical feature comprises: performing a multivariate regression analysis of the clinical feature, and identifying a gene whose FDR value in the regression analysis is less than or equal to a third threshold as being correlated with the clinical feature, wherein the multiple variables include the expression level of the gene in the patient, the age of the patient, the gender of the patient, and/or the stage of tumor staging of the patient.
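When the clinical feature is survival time, one common way to write such a multivariate regression is the Cox proportional hazards model. The passage above does not name the regression model, so the form below is an assumption for illustration only; the symbols $h_0$, $\beta$ and $x$ are not taken from the original.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(
    \beta_{\mathrm{expr}}\,x_{\mathrm{expr}}
  + \beta_{\mathrm{age}}\,x_{\mathrm{age}}
  + \beta_{\mathrm{sex}}\,x_{\mathrm{sex}}
  + \beta_{\mathrm{stage}}\,x_{\mathrm{stage}}
\right)
```

Under this assumed form, $\beta_{\mathrm{expr}}$ is the coefficient of the gene's expression level after adjusting for age, gender and stage: a negative value lowers the hazard and a positive value raises it.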
  • In certain embodiments, the at least one biological indicator comprises the expression level of the patient's genes, and determining the correlation between the expression level of a gene and the clinical feature further comprises: dividing the genes into protective effector genes and risk effector genes according to the correlation coefficient value obtained for each gene in the multivariate regression analysis, wherein the correlation coefficient value of a protective effector gene is negative and that of a risk effector gene is positive.
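The sign-based split can be sketched in a few lines. The gene names in the usage example are placeholders, and the function name is hypothetical; the handling of a coefficient of exactly zero (left unclassified) is an assumption, since the text only covers negative and positive values.

```python
def split_by_effect(coefficients):
    """Split genes by the sign of their regression coefficient.

    coefficients: dict mapping gene name -> coefficient from the
    multivariate regression. Negative -> protective, positive -> risk.
    A coefficient of exactly 0 is left unclassified (an assumption).
    """
    protective = sorted(g for g, c in coefficients.items() if c < 0)
    risk = sorted(g for g, c in coefficients.items() if c > 0)
    return protective, risk


# Usage with made-up values:
protective, risk = split_by_effect({"GENE_A": -0.8, "GENE_B": 0.3, "GENE_C": -0.1})
```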
  • In certain embodiments, the at least one biological indicator comprises the expression level of the patient's genes, and determining the correlation between the expression level of a gene and the clinical feature further comprises: determining the expression level of the genes at each stage of tumor staging of the patients, determining therefrom the tumor-stage-specific gene co-expression, dividing the genes into two or more groups according to the co-expression, and determining separately the correlation between the gene expression level of each group and the clinical feature.
  • In certain embodiments, the device or method uses the WGCNA algorithm to divide the genes into two or more groups according to their co-expression.
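The grouping idea can be illustrated with a simplified sketch. WGCNA is an R package that builds a soft-thresholded correlation network and clusters genes on topological overlap; the code below keeps only the soft-threshold adjacency, |cor|^beta, and swaps in connected components for the clustering step, so it is a stand-in rather than WGCNA itself. The `beta` and `cutoff` values and all names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom else 0.0


def coexpression_groups(expr, beta=6, cutoff=0.5):
    """Group genes whose soft-thresholded co-expression exceeds a cutoff.

    expr: dict mapping gene -> list of expression values across samples.
    Adjacency is |pearson| ** beta; genes linked by adjacency >= cutoff
    end up in the same group (connected components via union-find).
    """
    genes = sorted(expr)
    parent = {g: g for g in genes}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path compression
            g = parent[g]
        return g

    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(expr[g], expr[h])) ** beta >= cutoff:
                parent[find(g)] = find(h)

    groups = {}
    for g in genes:
        groups.setdefault(find(g), []).append(g)
    return sorted(groups.values())
```

Running this separately on the samples of each tumor stage would give stage-specific groups, which is the comparison the passage describes.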
  • In certain embodiments, the at least one biological indicator comprises a copy number variation of the patient's genes, and determining the correlation between the gene copy number variation and the clinical feature comprises: comparing the frequency of copy number variation of the genes at each stage of tumor staging of the patients.
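A minimal sketch of the per-stage frequency comparison follows. The sample layout (a `stage` label plus a per-gene copy-number call, with 0 meaning no change) is an assumption made for illustration and is not described in the source.

```python
from collections import defaultdict


def cnv_frequency_by_stage(samples, gene):
    """Fraction of samples per tumor stage in which `gene` has a CNV.

    samples: list of dicts such as
        {"stage": "II", "cnv": {"GENE_A": 1, "GENE_B": 0}}
    where a non-zero call (+1 gain, -1 loss) counts as a copy number
    variation. This layout is hypothetical.
    """
    altered = defaultdict(int)
    total = defaultdict(int)
    for sample in samples:
        total[sample["stage"]] += 1
        if sample["cnv"].get(gene, 0) != 0:
            altered[sample["stage"]] += 1
    return {stage: altered[stage] / total[stage] for stage in sorted(total)}
```

Comparing the returned frequencies across stages shows whether a gene's copy number is altered more often as the stage advances.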
  • In certain embodiments, the at least one biological indicator comprises DNA methylation of the patient's genes, and determining the correlation between the DNA methylation and the clinical feature comprises: performing a regression analysis of the clinical feature with the degree of DNA methylation as the variable, and identifying the DNA methylation whose p-value in the regression analysis is less than or equal to a fourth threshold as being correlated with the clinical feature.
  • In certain embodiments, determining the correlation between the DNA methylation and the clinical feature in the device or method further comprises determining the risk value of each DNA methylation site identified as being correlated with the clinical feature, the risk value being determined from the correlation coefficient obtained for the methylation site in the regression analysis and the degree of methylation of that site.
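The passage says the risk value depends on the regression coefficient and the degree of methylation but does not give the combining formula. The sketch below assumes the simplest choice, the product of the two; that choice, the site identifiers and the function name are all assumptions.

```python
def methylation_site_risk(coefficients, methylation_degree):
    """Per-site risk value, ASSUMED to be coefficient * degree of methylation.

    coefficients: dict site -> regression coefficient of that site.
    methylation_degree: dict site -> degree of methylation (e.g. a value
    between 0 and 1) measured in one subject.
    The product form is an illustrative assumption, not the source's formula.
    """
    return {site: coef * methylation_degree[site]
            for site, coef in coefficients.items()}


# Usage with made-up site identifiers and values:
risk = methylation_site_risk({"site_1": 2.0, "site_2": -1.0},
                             {"site_1": 0.5, "site_2": 0.25})
```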
  • In certain embodiments, the at least one biological indicator comprises a somatic mutation of the patient's genes, and determining the correlation between the somatic mutation and the clinical feature comprises: determining the signaling pathway to which a gene having the somatic mutation belongs, and/or the correlation between the expression level of the gene having the somatic mutation and the clinical feature.
  • In certain embodiments, the at least one biological indicator comprises a microRNA in the patient, and determining the correlation between the microRNA and the clinical feature comprises: determining the correlation between the expression level of a gene regulated by the microRNA and the clinical feature, and the correlation between the expression level of the microRNA in the patient and the expression level of the gene regulated by the microRNA.
  • In certain embodiments, the at least one biological indicator comprises two or more classes of the biological indicators, and determining the correlation between the biological indicators and the clinical feature comprises determining the weight with which each class of biological indicator affects the clinical feature.
  • In certain embodiments, the device or method determines the weights by performing an ordered logistic regression analysis.
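For reference, the standard proportional-odds form of an ordered logistic regression is shown below, with the tumor stage $Y$ as the ordered outcome and one predictor $x_k$ per class of biological indicator. The parameterization and symbols are the textbook ones and are not taken from the source.

```latex
P(Y \le j \mid x) = \frac{1}{1 + \exp\!\left(-\left(\theta_j - \sum_{k} w_k x_k\right)\right)},
\qquad j = 1, \dots, J - 1
```

Here $\theta_1 < \dots < \theta_{J-1}$ are the cut-points between the $J$ ordered stages and $w_k$ is the weight of indicator $k$. In this parameterization a larger positive $w_k$ shifts probability toward later stages.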
  • In certain embodiments, the at least one biological indicator comprises the expression level of the patient's genes, and determining the correlation between the expression level of a gene and the clinical feature comprises: a) performing a univariate regression analysis of the clinical feature with the expression level of the gene as the single variable, and identifying the genes whose p-value in the regression analysis is less than or equal to a first threshold and whose FDR value is less than or equal to a second threshold as a first gene set correlated with the clinical feature.
  • In certain embodiments, determining the correlation between the expression level of a gene and the clinical feature in the device or method further comprises: b) performing a multivariate regression analysis of the clinical feature, and identifying the genes whose FDR value in the regression analysis is less than or equal to a third threshold as a second gene set correlated with the clinical feature, wherein the multiple variables include the expression level of each gene in the first gene set, the age of the patient, the gender of the patient, and the stage of tumor staging of the patient.
  • In certain embodiments, determining the correlation between the expression level of a gene and the clinical feature in the device or method further comprises: c) dividing the genes into protective effector genes and risk effector genes according to the correlation coefficient value obtained for each gene in the multivariate regression analysis, wherein the correlation coefficient value of a protective effector gene is negative and that of a risk effector gene is positive.
  • In certain embodiments, determining the correlation between the expression level of a gene and the clinical feature in the device or method further comprises: determining the expression level of each gene in the second gene set at each stage of tumor staging, determining therefrom the tumor-stage-specific gene co-expression, dividing the genes in the second gene set into two or more groups according to the co-expression, and determining separately the correlation between the gene expression level of each group and the clinical feature.
  • In certain embodiments, the device or method uses the WGCNA algorithm to divide the genes in the second gene set into two or more groups according to their co-expression.
  • In certain embodiments, the at least one biological indicator further comprises a copy number variation of the patient's genes, and determining the correlation between the gene copy number variation and the clinical feature comprises: comparing the frequency of copy number variation of the genes in the second gene set at each stage of tumor staging.
  • In certain embodiments, the at least one biological indicator further comprises DNA methylation of the patient's genes, and determining the correlation between the DNA methylation and the clinical feature comprises: determining the DNA methylation sites of the genes in the second gene set and the degree of DNA methylation at each site, performing a regression analysis of the clinical feature with the degree of DNA methylation as the variable, and identifying the DNA methylation sites whose p-value in the regression analysis is less than or equal to a fourth threshold as a first DNA methylation set correlated with the clinical feature.
  • In certain embodiments, determining the correlation between the DNA methylation and the clinical feature in the device or method further comprises determining the risk value of each DNA methylation site in the first DNA methylation set, the risk value being determined from the correlation coefficient obtained for the methylation site in the regression analysis and the degree of methylation of that site.
  • In certain embodiments, the at least one biological indicator further comprises a somatic mutation of the patient's genes, and determining the correlation between the somatic mutation and the clinical feature comprises: determining the somatic mutations in the genes of the second gene set, and the signaling pathways to which the genes having the somatic mutations belong.
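One way to decide which signaling pathways matter for a set of mutated genes is an over-representation test. The passage does not state how pathways are assessed, so the hypergeometric test below is an assumption for illustration; the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_pathway, n_mutated, n_overlap):
    """Upper-tail hypergeometric p-value for pathway over-representation.

    n_background: number of genes considered in total
    n_pathway:    number of background genes annotated to the pathway
    n_mutated:    number of background genes carrying a somatic mutation
    n_overlap:    number of mutated genes that are in the pathway
    Returns P(X >= n_overlap) when n_mutated genes are drawn at random.
    """
    upper = min(n_pathway, n_mutated)
    total = comb(n_background, n_mutated)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_mutated - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total
```

A small p-value suggests the pathway contains more mutated genes than chance alone would explain; with many pathways tested, the p-values would normally be corrected for multiple testing.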
  • In certain embodiments, the at least one biological indicator comprises a microRNA in the patient, and determining the correlation between the microRNA and the clinical feature comprises: determining the microRNAs that regulate genes in the second gene set, determining the correlation between the expression level of each microRNA in the patient and the expression level of the genes it regulates, and identifying the microRNAs whose correlation is higher than a fifth threshold as a first microRNA set correlated with the clinical feature.
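The microRNA filter can be sketched as below. It assumes Pearson correlation across patients and compares the absolute value of the correlation with the threshold, because microRNAs often repress their targets and the sign convention is not stated in the text. The threshold value, the microRNA-to-target map and all names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom else 0.0


def select_micrornas(mirna_expr, gene_expr, targets, threshold=0.3):
    """Return (microRNA, gene, r) triples with |r| above the threshold.

    mirna_expr: dict microRNA -> expression values across patients
    gene_expr:  dict gene -> expression values across the same patients
    targets:    dict microRNA -> list of genes it is known to regulate
    Using |r| rather than r is an assumption made for this sketch.
    """
    selected = []
    for mirna, genes in targets.items():
        for gene in genes:
            r = pearson(mirna_expr[mirna], gene_expr[gene])
            if abs(r) > threshold:
                selected.append((mirna, gene, r))
    return selected
```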
  • In certain embodiments, determining the correlation between the biological indicators and the clinical feature in the device or method comprises determining, by an ordered logistic regression analysis, the weight with which each of the following biological indicators affects the clinical feature: the expression level of the genes in the second gene set, the copy number variation of the genes in the second gene set, and the risk values of the DNA methylation sites in the first DNA methylation set.
  • In certain embodiments, the device or method determines separate weights for the expression level of the protective effector genes and the expression level of the risk effector genes in the second gene set.
  • The present application provides a computer-readable storage medium storing a computer program, wherein the computer program causes a computer to perform the identification method described herein.
  • The application provides a device for determining tumor progression in a subject, the device comprising: a) an analysis module capable of determining the expression level of one or more genes shown in Table 1 in the subject or in a biological sample derived from the subject; and b) a determination module capable of determining the progression of the tumor in the subject according to the expression level determined in a).
  • The application provides a device for determining tumor progression in a subject, the device comprising a computer for determining tumor progression in the subject, the computer being programmed to perform the following steps: a) determining the expression level of one or more genes shown in Table 1 in the subject or in a biological sample derived from the subject; and b) determining the progression of the tumor in the subject according to the expression level determined in a).
  • The application provides a method of determining tumor progression in a subject, the method comprising: a) determining the expression level of one or more genes shown in Table 1 in the subject or in a biological sample derived from the subject; and b) determining the progression of the tumor in the subject according to the expression level determined in a).
  • In certain embodiments, the tumor progression comprises the stage of tumor staging and/or the survival rate of the subject.
  • In certain embodiments, the stage of tumor staging is selected from the group consisting of: tumor stage I, tumor stage II, tumor stage III, and tumor stage IV.
  • In certain embodiments, the tumor comprises bladder cancer.
  • In certain embodiments, the bladder cancer comprises bladder urothelial carcinoma (BLCA).
  • In certain embodiments, the one or more genes comprise one or more of the protective effector genes set forth in Table 2.
  • In certain embodiments, the one or more genes comprise one or more of the risk effector genes set forth in Table 3.
  • In certain embodiments, the one or more genes comprise one or more of the genes set forth in Table 4. In certain embodiments, the one or more genes comprise one or more of the genes set forth in Table 5.
  • In certain embodiments, the device or method further comprises a step or module for determining the copy number variation of the one or more genes.
  • In certain embodiments, the device or method further comprises a step or module for determining a DNA methylation risk value for one or more of the genes shown in Table 8.
  • In certain embodiments, the device or method further comprises a step or module for determining the age of the subject.
  • In certain embodiments, determining the expression level of one or more genes shown in Table 1 in the subject or in a biological sample derived from the subject comprises: determining the average expression level of those of the one or more genes that are shown in Table 2; and determining the average expression level of those of the one or more genes that are shown in Table 3.
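Computing the two averages is straightforward. In the sketch below the gene names and set contents are placeholders, since Tables 2 and 3 are not reproduced in this text; the function name is also hypothetical.

```python
def mean_expression(expression, gene_set):
    """Average expression over the genes of `gene_set` that were measured.

    expression: dict gene -> expression level in one subject or sample.
    gene_set:   iterable of gene names (e.g. the Table 2 or Table 3 genes).
    """
    values = [expression[g] for g in gene_set if g in expression]
    if not values:
        raise ValueError("none of the genes in the set were measured")
    return sum(values) / len(values)


# Placeholder gene sets standing in for Table 2 and Table 3.
protective_genes = ["GENE_A", "GENE_B"]
risk_genes = ["GENE_C", "GENE_D"]
sample = {"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 1.0, "GENE_D": 5.0}

avg_protective = mean_expression(sample, protective_genes)
avg_risk = mean_expression(sample, risk_genes)
```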
  • In certain embodiments, the device or method determines the progression of the tumor in the subject according to Formula I:
  • a is the average expression level of those of the one or more genes that are shown in Table 2;
  • b is the average expression level of those of the one or more genes that are shown in Table 3;
  • c is the copy number variation of the one or more genes;
  • The present application provides a computer-readable storage medium storing a computer program, wherein the computer program causes a computer to perform the determination method described herein.
  • The present application provides a method of treating a tumor in a subject, the method comprising: determining the progression of the tumor in the subject according to the determination method described in the present application; and administering to the subject an effective amount of treatment according to the progression.
  • The application provides a device for treating a tumor in a subject, the device comprising: a) an analysis module capable of determining the expression level of one or more genes shown in Table 1 in the subject or in a biological sample derived from the subject; b) a determination module capable of determining the progression of the tumor in the subject according to the expression level determined in a); and c) a treatment module capable of administering to the subject an effective amount of treatment according to the progression determined in b).
  • Figure 1 is a schematic diagram showing the workflow of the identification method and device of the present application.
  • Figures 2A-2D show Kaplan-Meier plots of APOL2, BCL2L14, CSAD and ORMDL1 expression in two different groups of BLCA patients.
  • Figures 3A-3B show gene ontology (GO) enrichment analysis of protective effector genes and risk effector genes in genes important for survival of BLCA patients.
  • Figures 4A-4C show dynamic changes in correlation between key genes in BLCA patients at different stages of cancer staging.
  • Figures 5A-5D show functional modules of the gene co-expression network obtained by the WGCNA algorithm.
  • Figures 6A-6E show the analysis of copy number variation (CNV) in different stages of bladder cancer.
  • Figures 7A-7B show exemplary results of DNA methylation analysis.
  • Figures 8A-8D show cell signaling pathways in the BLCA sample that are significantly enriched for the mutated gene.
  • Figures 9A-9E show somatic mutation analysis in different stages of bladder cancer.
  • Figures 10A-10C show the evolution of microRNA regulatory networks in different stages of bladder cancer.
  • Figure 11 shows a forest plot of ordered logistic regression in an integrated analysis.
  • the term "patient” generally refers to an individual having a characterization of a disease, which may refer to a symptom of a disease, or to a deleterious physiological condition that cannot be altered in a preventive setting.
  • the individual may include males and/or females, and typically includes human or non-human animals including, but not limited to, humans, dogs, cats, horses, sheep, goats, pigs, cows, rabbits, rats, mice, monkeys, and the like.
  • the patient is a human patient.
  • tumor generally refers to an uncontrolled proliferation of cells in the body due to abnormal pathological changes of the cells, and in many cases, agglomeration becomes a mass.
  • Tumors can be divided into benign tumors and malignant tumors. In the malignant tumor, the proliferating cells aggregate into a mass and spread to other sites.
  • the tumor may be selected from the group consisting of nasopharyngeal carcinoma, lip cancer, colorectal cancer, gallbladder cancer, lung cancer, liver cancer, cervical cancer, bone cancer, laryngeal cancer, melanoma, thyroid cancer, oropharyngeal cancer, brain tumor, bladder.
  • the tumor can be a bladder cancer, such as bladder urothelial carcinoma (BLCA).
  • the term "clinical feature module” generally refers to a functional unit sufficient to provide clinical features of a patient having the tumor.
  • the clinical feature module can include an information input and/or extraction unit capable of receiving and/or providing the clinical features including a stage of tumor staging of the patient and/or a survival time of the patient.
  • clinical features generally refers to one or more indicators and/or parameters that reflect the clinical characteristics of a patient's disease, such as the stage of tumor staging of a patient and/or the survival time of the patient, and the like.
  • the clinical feature module can include reagents, instruments, and/or devices capable of obtaining the tumor stage of a patient and/or the survival time of the patient.
  • the clinical feature module can include reagents, instruments, and/or devices that detect tumor size, degree of infiltration, and metastasis (e.g., magnetic resonance imaging, CT, gastroscope).
  • the clinical feature module can include instruments and/or devices that monitor patient survival time (e.g., reagents, instruments, and/or devices that detect tumor markers).
  • the tumor marker can be selected from the group consisting of serum carcinoembryonic antigen (CEA), alpha-fetoprotein (AFP), prostate specific antigen (PSA), and chorionic gonadotropin (HCG).
  • tumor staging generally refers to a histopathological classification method for evaluating the progression of a tumor by the number and location of tumors in the patient.
  • the tumor staging can describe the severity and extent of involvement of the malignant tumor based on the primary tumor within the individual and the degree of dissemination (eg, according to the TNM classification method proposed by the WHO).
  • the tumor staging can help the doctor to develop a corresponding treatment plan and understand the prognosis of the disease while avoiding over-treatment or under-treatment.
  • Tumors are generally staged according to the TNM classification proposed by the World Health Organization (WHO).
  • T: the extent and size of the primary tumor, the degree of infiltration, the presence or absence of metastasis, and the depth of infiltration, graded from T0 to T4 (5 levels); a larger number indicates more advanced cancer, and the grading criteria differ depending on the organ affected. N: lymph node dissemination, graded from N0 to N3 (4 levels); a larger number indicates more advanced cancer. M: whether there is metastasis, where M0 means no metastasis and M1 means distant metastasis. Clinically, the T, N, and M results are combined to determine the stage of the tumor.
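  • As an illustration of the T, N, and M grades described above, the following minimal sketch splits a compact code such as "T2N1M0" into its three grades. The function name `parse_tnm` and the compact string format are assumptions made for this example; clinical TNM codes also carry sub-categories that this simplified parser deliberately rejects, and the mapping from grades to an overall stage is organ-specific and is not attempted here.

```python
import re

# Grade ranges taken from the text: T0-T4, N0-N3, M0-M1.
# The compact "T2N1M0" format is an assumption for this sketch.
_TNM_PATTERN = re.compile(r"^T([0-4])N([0-3])M([01])$")


def parse_tnm(code: str) -> dict:
    """Split a compact TNM string such as 'T2N1M0' into integer grades."""
    match = _TNM_PATTERN.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a recognised TNM code: {code!r}")
    t, n, m = (int(g) for g in match.groups())
    # M1 means distant metastasis, M0 means none.
    return {"T": t, "N": n, "M": m, "distant_metastasis": m == 1}


if __name__ == "__main__":
    print(parse_tnm("T2N1M0"))
```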
  • the tumor stage can include a tumor stage I, a tumor stage II, a tumor stage III, and a tumor stage IV.
  • tumor stage I generally refers to the early stages of a tumor.
  • tumor stage II generally refers to a mild stage of a tumor.
  • tumor stage III generally refers to the intermediate stage of a tumor.
  • tumor stage IV generally refers to the late (advanced) stage of a tumor.
  • the term "survival time" refers to the total survival time of a tumor patient after treatment.
  • the survival time can be related to the tumor stage.
  • the term "bladder cancer” generally refers to a variety of malignant tumors from the bladder.
  • the bladder cancer can include bladder urothelial carcinoma (BLCA).
  • the bladder urothelial carcinoma can be classified into non-muscle-invasive urothelial carcinoma and muscle-invasive urothelial carcinoma.
  • the etiology of bladder cancer is complex, involving both intrinsic genetic factors and external environmental factors.
  • the two more common risk factors are smoking and occupational exposure to aromatic amine chemicals.
  • the initial clinical manifestation of approximately 90% of patients with bladder cancer is hematuria, usually manifested as painless, intermittent, gross hematuria, and sometimes microscopic hematuria.
  • Hematuria may occur only once or last from one day to several days, and may lessen or stop on its own; about 10% of patients with bladder cancer may first present with bladder irritation, which is characterized by frequent urination, urgency, painful urination, and difficulty urinating.
  • bladder irritation is mostly caused by tumor necrosis, ulceration, large or numerous tumors in the bladder, or diffuse infiltration of the bladder wall by the tumor, resulting in decreased bladder capacity or concurrent infection.
  • bladder cancer can be divided into the following staging stages: stage 0 bladder cancer (non-invasive papillary and carcinoma in situ), stage I bladder cancer, stage II and III bladder cancer, and stage IV bladder cancer.
  • the treatments for bladder cancer with different tumor stages include the following methods (see the NIH National Cancer Institute).
  • the main treatments include:
  • Intravesical chemotherapy is given immediately after surgery
  • Intravesical chemotherapy is given immediately after surgery, and then intravesical BCG or intravesical chemotherapy is used regularly;
  • the main treatments include:
  • Intravesical chemotherapy is given immediately after surgery
  • Intravesical chemotherapy is given immediately after surgery, and then intravesical BCG or intravesical chemotherapy is used regularly;
  • the main treatments include:
  • the main treatments include:
  • Urine flow diversion or cystectomy as a palliative treatment.
  • Treatment for stage IV bladder cancer that has spread to other parts of the body may include the following:
  • Chemotherapy or chemotherapy plus topical therapy (surgery or radiotherapy);
  • biological indicator module generally refers to a functional unit capable of providing at least one biological indicator derived from the patient.
  • the biological indicator module can provide indicators and/or characteristics that reflect the stage of tumor staging of the patient and/or the survival time of the patient at the molecular level.
  • the biological indicator module can include a sample unit that obtains a patient sample (eg, peripheral blood).
  • the biological indicator module can include a sample device that obtains a patient sample (eg, a device that takes a sample, such as a blood collection needle; and/or a device that carries a sample, such as a test tube).
  • the biological indicator module can include a sample processing device that obtains DNA of a patient by processing a patient sample (eg, a kit for extracting whole blood DNA, a test tube, and related devices).
  • the biological indicator module can also include a separation unit that is capable of separating a patient sample.
  • the biological indicator module can include an agent that separates cells (eg, proteinase K) and a device that separates cells (eg, a centrifuge).
  • the biological indicator module can include a sample processing unit that obtains the biological indicator.
  • the sample processing unit may include reagents and equipment for detecting the expression level of the patient's gene, reagents and equipment for detecting changes in the copy number of the patient's gene, reagents and equipment for detecting DNA methylation of the patient's gene, and detection.
  • the sample processing unit may include a qRT-PCR kit, an MLPA (Multiplex Ligation-dependent Probe Amplification) kit, a methylation profiling kit, a TruSeq Rapid Exome Library kit, and a microarray analysis kit.
  • the term "biological indicator" generally refers to one or more types of indicators selected from the group consisting of: class 1: expression level of the patient's genes; class 2: copy number variation of the patient's genes; class 3: DNA methylation of the patient's genes; class 4: somatic mutation of the patient's genes; and class 5: microRNA in the patient.
  • the expression level of the patient's gene may be up-regulated, for example, by about 10% or more, 20% or more, 30% or more, 40% or more, 50% or more, or 60% or more compared with the expression level in normal cells.
  • the expression level of the patient's gene may be down-regulated, for example, down to 12% or less, 20% or less, 30% or less, 40% or less, 50% or less, 60% or less, 70% or less, 80% or less, 90% or less, 92% or less, 94% or less, 96% or less, 98% or less, or 99% or less of the expression level in normal cells.
  • the copy number of the patient's gene may be increased, for example, by about 0.1 times or more, about 0.5 times or more, about 1 time or more, about 2 times or more, about 3 times or more, about 4 times or more, about 5 times or more, about 6 times or more, about 7 times or more, about 8 times or more, about 9 times or more, or about 10 times or more compared with the copy number in normal cells; alternatively, the copy number of the patient's gene may be reduced, for example, by about 0.1 times or more, about 0.5 times or more, about 1 time or more, about 2 times or more, about 3 times or more, about 4 times or more, about 5 times or more, about 6 times or more, about 7 times or more, about 8 times or more, about 9 times or more, or about 10 times or more compared with the copy number in normal cells.
  • the level of DNA methylation of the patient's gene may be increased, for example, by about 0.1 times or more, about 0.5 times or more, about 1 time or more, about 2 times or more, about 3 times or more, about 4 times or more, about 5 times or more, about 6 times or more, about 7 times or more, about 8 times or more, about 9 times or more, or about 10 times or more compared with the level of DNA methylation in normal cells; alternatively, the level of DNA methylation of the patient's gene may be reduced, for example, by about 0.1 times or more, about 0.5 times or more, about 1 time or more, about 2 times or more, about 3 times or more, about 4 times or more, about 5 times or more, about 6 times or more, about 7 times or more, about 8 times or more, about 9 times or more, or about 10 times or more compared with the level of DNA methylation in normal cells.
  • the term "expression level of a gene" generally refers to the level at which the information encoded in a gene is converted into a gene product (e.g., RNA, protein).
  • expressed genes include genes that are transcribed into an RNA (for example, mRNA) that is subsequently translated into a protein, and genes that are transcribed into a non-coding functional RNA that is not translated into a protein (for example, tRNA, rRNA, ribozymes, etc.).
  • RNA expression level or “expression level” refers to the level (eg, amount) of one or more products (eg, RNA, protein) encoded by a given gene in a sample or reference standard.
  • the term "copy number change of a gene" generally refers to CNV (Copy Number Variation), that is, the repetition of segments of a genome, where the number of repeats differs among individuals in a population (see McCarroll, S. A. et al. (2007). "Copy-number variation and association studies of human diseases". Nature Genetics. 39: 37-42.).
  • CNV is a repeat or deletion event that affects a significant number of base pairs and is primarily found in the human genome. Copy number variations can usually be divided into two broad categories: short repeats and long repeats.
  • Short repeats primarily include dinucleotide repeats (two repeating nucleotides, such as A-C-A-C-A-C...) and trinucleotide repeats.
  • Long repeats include repeats of an entire gene. Research data on CNVs not only provide additional evidence for evolution and natural selection, but can also be used to develop treatments for a variety of genetic diseases.
  • DNA methylation of a gene generally refers to a process of adding a methyl group to a DNA molecule (mainly cytosine and adenine). Methylation can alter the activity of a DNA fragment without altering the sequence. When located in a gene promoter, DNA methylation usually acts to inhibit gene transcription. DNA methylation is essential for normal development and is associated with many key processes, including genomic imprinting, X-chromosome inactivation, transposable element inhibition, aging and carcinogenesis.
  • Methylation of cytosine to form 5-methylcytosine occurs at the same 5 position of the pyrimidine ring at which the methyl group of the DNA base thymine is located; this same position distinguishes thymine from the similar RNA base uracil, which has no methyl group.
  • Spontaneous deamination of 5-methylcytosine converts it to thymine, which results in a T-G mismatch.
  • the repair mechanism then corrects it back to the original C-G pair; alternatively, it may substitute A for G, turning the original C-G pair into a T-A pair, effectively changing a base and introducing a mutation.
  • DNA methylation of a gene may result in a DNA methylation marker, which is a genomic region that shows a particular methylation pattern in a particular biological state (e.g., tissue, cell type, individual) and is considered to be a possible functional region involved in the regulation of gene transcription.
  • the term "somatic mutation of a gene" generally refers to a mutation occurring in a cell other than a germ line cell, also referred to as an acquired mutation. Somatic mutations are not passed on to offspring, but can change the genetic structure of some cells of the current generation. Most somatic mutations have no phenotypic effect. Sporadic forms of malignant tumors can be caused by somatic mutations. Studies have shown that carcinogenesis of somatic cells does not necessarily involve a change in genetic structure: when substances other than genes, such as proteins, RNA, and biological membranes, are altered, and these alterations cause genes related to growth and differentiation to be abnormally switched off or activated, cells can also be transformed into cancer cells, a view called extra-genetic regulation.
  • microRNA generally refers to a non-coding RNA (microRNA, miRNA for short) which is about 22 nt in length and is widely present in various organisms from viruses to humans. These small RNAs are capable of binding to mRNA to block the expression of protein-encoding genes and prevent their translation into proteins. Mammalian miRNAs can have many unique targets. For example, analysis of highly conserved miRNAs in vertebrates indicates that there are approximately 400 conserved targets on average; as such, individual miRNA species may inhibit the production of hundreds of proteins. Studies have shown that chronic lymphocytic leukemia and B cell malignancies may be associated with miRNAs.
  • correlation determination module generally refers to a functional unit capable of determining the correlation of the at least one biological indicator of each of the patients with the clinical characteristics of the respective patient.
  • the term “relevance” generally means that the at least one biological indicator of a patient in the present application exhibits a statistically significant association with the clinical characteristics of the corresponding patient.
  • a gene can be expressed at a higher or lower level and is associated with the status or outcome of a tumor (eg, bladder cancer).
  • the correlation determination module may include a sample determination unit that may determine a correlation of the at least one biological indicator of each of the patients with the clinical characteristic of the corresponding patient.
  • the correlation determination module can include a unit that performs a univariate regression analysis with respect to the clinical features, with the expression level of the gene as the single variable, to determine a correlation between the expression level of the gene and the clinical features (e.g., the unit may include hardware, programs, and/or software capable of executing the relevant instructions).
  • the correlation determination module can include a unit that performs a multivariate regression analysis with respect to the clinical features, with the expression level of the gene, the age of the patient, the gender of the patient, and/or the tumor stage of the patient as the multiple variables, to determine the correlation between the expression level of the gene and the clinical features (e.g., the unit may include hardware, programs, and/or software capable of executing the relevant instructions).
  • the correlation determination module may further include a unit that determines a correlation between the expression level of the gene and the clinical features according to the correlation coefficient value obtained for each gene in the regression analysis (e.g., the unit may include hardware, programs, and/or software capable of executing the relevant instructions).
  • the correlation determination module may further comprise a unit that determines, according to the expression level of the patient's genes at each tumor stage, the gene co-expression that is specific to tumor stage.
  • the unit may utilize a WGCNA (Weighted Gene Co-Expression Network Analysis) algorithm to implement at least a portion of the described functionality.
  • the correlation determination module may further include a unit that determines a correlation between the change in the copy number of the gene and the clinical features according to the frequency of copy number changes of the patient's gene at each tumor stage (e.g., the unit may include hardware, programs, and/or software capable of executing the relevant instructions).
  • the correlation determination module may further comprise a unit that performs a regression analysis with respect to the clinical features, with the degree of DNA methylation as a variable, and determines the correlation between the DNA methylation and the clinical features accordingly (e.g., the unit may include hardware, programs, and/or software capable of executing the relevant instructions).
  • the correlation determination module may further comprise a unit that determines a risk value for each DNA methylation site associated with the clinical features, based on the correlation coefficient obtained for the methylation site in the regression analysis and the degree of methylation of the methylation site, in order to determine the correlation between the DNA methylation and the clinical features (e.g., the unit may include hardware, programs, and/or software capable of executing the relevant instructions).
  • the correlation determination module may further comprise a unit that determines a correlation between the expression level of the somatically mutated gene and the clinical features according to the signal pathway to which the patient's somatic mutation belongs (e.g., the unit may include hardware, programs, and/or software capable of executing the relevant instructions).
  • the correlation determination module may further include a unit that determines, according to the expression level of the gene regulated by the microRNA, a correlation between the expression level of the gene regulated by the microRNA and the clinical features (e.g., the unit may include hardware, programs, and/or software capable of executing the relevant instructions).
  • the correlation determination module may further comprise a unit that determines the correlation between the biological indicators and the clinical features by determining the weights with which two or more types of the biological indicators affect the clinical features (e.g., the unit may include hardware, programs, and/or software capable of executing the relevant instructions).
  • the unit can determine the weight by performing an ordered logistic regression analysis.
  • the at least one biological indicator may include an expression level of the patient's gene, and determining a correlation between the expression level of the gene and the clinical features may include: performing a univariate regression analysis with respect to the clinical features with the expression level of the gene as the single variable, and identifying genes whose p-value in the regression analysis is less than or equal to a first threshold and whose FDR value is less than or equal to a second threshold as being associated with the clinical features.
  • the at least one biological indicator comprises an expression level of the patient's gene, and determining a correlation between the expression level of the gene and the clinical features comprises: a) performing a univariate regression analysis with respect to the clinical features with the expression level of the gene as the single variable, and identifying genes whose p-value in the regression analysis is less than or equal to a first threshold and whose FDR value is less than or equal to a second threshold as a first gene set associated with the clinical features.
  • the term "first threshold" generally refers to the cut-off value for statistical significance (that is, the cut-off value for the p-value) in the univariate regression analysis performed with respect to the clinical features with the expression level of the gene as the single variable.
  • the first threshold may be 0.09 or less.
  • the first threshold may be 0.08 or less, 0.07 or less, 0.06 or less, 0.05 or less, 0.045 or less, 0.04 or less, 0.03 or less, 0.02 or less, 0.01 or less, or 0.005 or less.
  • the term "second threshold" generally refers to the upper limit on the false discovery rate (FDR) in the univariate regression analysis performed with respect to the clinical features with the expression level of the gene as the single variable.
  • the second threshold may be 0.5 or less.
  • the second threshold may be 0.4 or less, 0.3 or less, 0.2 or less, 0.1 or less, or 0.05 or less.
  • the gene can be identified as a first set of genes associated with the clinical feature.
  • the expression level of the gene may be related to the clinical features, and/or the gene may be used as one of the biological indicators for evaluating tumor progression.
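  • The two-threshold selection of the first gene set can be sketched as follows. The text does not say how the FDR value is computed, so the Benjamini-Hochberg adjustment used here is an assumption, as are the function names, the placeholder gene labels, and the example cut-offs of 0.05 and 0.1 (both of which fall within the ranges given above). The per-gene p-values are taken as already produced by the univariate regression.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (FDR), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_first_gene_set(p_by_gene, p_threshold=0.05, fdr_threshold=0.1):
    """Keep genes with p <= first threshold and FDR <= second threshold."""
    genes = list(p_by_gene)
    fdr = benjamini_hochberg([p_by_gene[g] for g in genes])
    return [
        g for g, q in zip(genes, fdr)
        if p_by_gene[g] <= p_threshold and q <= fdr_threshold
    ]


if __name__ == "__main__":
    # Placeholder gene labels and p-values, for illustration only.
    example = {"GENE_A": 0.001, "GENE_B": 0.04, "GENE_C": 0.03, "GENE_D": 0.8}
    print(select_first_gene_set(example))
```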
  • the at least one biological indicator can include an expression level of the patient's gene, and determining a correlation between the expression level of the gene and the clinical features comprises: performing a multivariate regression analysis with respect to the clinical features, and identifying genes whose FDR value in the regression analysis is less than or equal to a third threshold as being associated with the clinical features, wherein the multiple variables include the gene expression level in the patient, the age of the patient, the gender of the patient, and/or the tumor stage of the patient.
  • determining a correlation between the expression level of the gene and the clinical features further comprises: b) performing a multivariate regression analysis with respect to the clinical features, and identifying genes whose FDR value in the regression analysis is less than or equal to a third threshold as a second gene set associated with the clinical features, wherein the multiple variables include the expression level of each gene in the first gene set, the age of the patient, the gender of the patient, and the tumor stage of the patient.
  • the term "third threshold" generally refers to the upper limit on the false discovery rate (FDR) in the multivariate regression analysis performed with respect to the clinical features.
  • the multivariate may be selected from the group consisting of a gene expression level in the patient, an age of the patient, a gender of the patient, and/or a stage of tumor staging of the patient.
  • the third threshold may be 0.2 or less.
  • the third threshold may be 0.2 or less, 0.15 or less, 0.1 or less, or 0.05 or less.
  • the gene can be identified as a second set of genes associated with the clinical feature.
  • the gene of the second gene set may be selected from the genes shown in Table 1.
  • the number of genes of the second gene set may be 1078.
  • the at least one biological indicator may include an expression level of the patient's gene, and determining a correlation between the expression level of the gene and the clinical features further comprises: dividing the genes into protective effector genes and risk effector genes according to the correlation coefficient value obtained for each gene in the multivariate regression analysis, wherein the correlation coefficient value of a protective effector gene may be negative and the correlation coefficient value of a risk effector gene may be positive.
  • determining the correlation between the expression level of the gene and the clinical features may further comprise: c) dividing the genes into protective effector genes and risk effector genes according to the correlation coefficient value obtained for each gene in the multivariate regression analysis, wherein the correlation coefficient value of a protective effector gene is negative and the correlation coefficient value of a risk effector gene is positive.
  • the term "protective effector gene” generally refers to a gene whose expression level is positively correlated with the survival of a patient, or whose expression level is inversely related to the degree of progression of the tumor (eg, progression of the tumor stage).
  • the correlation coefficient value between the expression level of the protective effector gene and the clinical feature (eg, tumor stage) may be negative.
  • the protective effector gene may be selected from the genes shown in Table 2.
  • the number of protective effector genes may be 356.
  • the expression level of the protective effector gene can be downregulated during progression of the tumor.
  • the protective effector gene may be inversely related to the staging phase of the tumor.
  • the term "risk effector gene" generally refers to a gene whose expression level is inversely related to the survival of the patient, or whose expression level is positively correlated with the degree of progression of the tumor (e.g., progression of the tumor stage).
  • the correlation coefficient value between the expression level of the risk effector gene and the clinical feature (eg, tumor stage) may be positive.
  • the risk effector gene may be selected from the genes shown in Table 3.
  • the number of risk effector genes may be 722.
  • the expression level of the risk effector gene can be upregulated during the progression of the tumor.
  • the risk effector gene may be positively correlated with the staging phase of the tumor.
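  • The sign-based split into protective and risk effector genes can be sketched as below. The function name `split_by_effect` and the placeholder gene labels are assumptions made for illustration; the coefficients are taken as already estimated by the multivariate regression, and genes with a coefficient of exactly zero are left unclassified because the text only covers negative and positive values.

```python
def split_by_effect(coefficients):
    """Split genes by the sign of their regression coefficient.

    Negative coefficient -> protective effector gene.
    Positive coefficient -> risk effector gene.
    """
    protective = sorted(g for g, c in coefficients.items() if c < 0)
    risk = sorted(g for g, c in coefficients.items() if c > 0)
    return protective, risk


if __name__ == "__main__":
    # Placeholder labels and coefficients, for illustration only.
    coefs = {"GENE_A": -0.42, "GENE_B": 0.17, "GENE_C": 0.91, "GENE_D": -0.05}
    protective, risk = split_by_effect(coefs)
    print("protective:", protective)
    print("risk:", risk)
```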
  • the at least one biological indicator may include an expression level of the patient's gene, and determining a correlation between the expression level of the gene and the clinical features further comprises: determining the expression level of the patient's genes at each tumor stage, determining accordingly the gene co-expression that is specific to tumor stage, dividing the genes into two or more groups according to the gene co-expression, and determining separately the correlation between the gene expression level of each group and the clinical features.
  • the genes can be divided into two or more groups by identifying the co-expression relationship of each gene in a stage of tumor staging, and/or identifying changes in the stage of staging of the tumor.
  • the genes in one group can present a stage-specific co-expression profile of the tumor stage.
  • the correlation between the genes in each group and the clinical features can be analyzed (e.g., by univariate and/or multivariate regression analysis as described herein) to identify the gene group with the desired correlation.
  • determining the correlation between the expression level of the gene and the clinical features may further comprise: determining the expression level of each gene in the second gene set at each tumor stage, determining accordingly the gene co-expression that is specific to tumor stage, dividing the genes in the second gene set into two or more groups according to the gene co-expression, and determining separately the correlation between the gene expression level of each group and the clinical features.
  • the genes in the second gene set can be divided into two or more groups by identifying the co-expression relationship of each gene at a given tumor stage, and/or by identifying how that relationship changes across tumor stages, wherein the genes in each group can exhibit a co-expression profile specific to a tumor stage.
  • the correlation between the genes in each group and the clinical features can be analyzed (e.g., by univariate and/or multivariate regression analysis as described herein) to identify the gene group with the desired correlation.
  • the term "gene co-expression" generally means that a plurality of genes in the second gene set exhibit similar expression level trends at a particular tumor stage (e.g., their expression levels change in the same or a similar direction at a certain tumor stage, such as up-regulation in tumor stage I), so that the genes in the second gene set can be divided into two or more groups according to the gene co-expression (for example, 2 or more, 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more groups, or more), such that the gene expression level of each group has the correlation with the clinical features.
  • the gene co-expression can be determined by using the WGCNA algorithm.
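  • To give a feel for co-expression grouping, the sketch below builds a soft-thresholded adjacency of the kind used in unsigned WGCNA networks, a_ij = |cor(x_i, x_j)|^beta, and then groups genes whose adjacency exceeds a cut-off. This is a deliberately simplified stand-in: the full WGCNA procedure derives modules from a topological overlap measure and hierarchical clustering, which are not reproduced here. The function names, the power beta = 6, the cut-off of 0.5, and the connected-components grouping are all assumptions for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def soft_adjacency(expr, beta=6):
    """Unsigned soft-threshold adjacency |cor|^beta for every gene pair."""
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for i, g in enumerate(genes)
        for h in genes[i + 1:]
    }


def group_genes(expr, beta=6, cutoff=0.5):
    """Group genes linked by adjacency >= cutoff (connected components)."""
    genes = list(expr)
    parent = {g: g for g in genes}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (g, h), a in soft_adjacency(expr, beta).items():
        if a >= cutoff:
            parent[find(g)] = find(h)

    groups = {}
    for g in genes:
        groups.setdefault(find(g), []).append(g)
    return sorted(groups.values())


if __name__ == "__main__":
    # Placeholder expression profiles across four samples.
    expr = {"G1": [1, 2, 3, 4], "G2": [2, 4, 6, 8], "G3": [5, 1, 4, 2]}
    print(group_genes(expr))
```

  • Raising the correlation to a power suppresses weak correlations while preserving strong ones, which is why a modest correlation such as the one between G1 and G3 in the example falls well below the cut-off.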
  • the at least one biological indicator may include a copy number change of the patient gene, and determining a correlation between the gene copy number change and the clinical feature comprises: comparing the patient's gene The frequency of copy number changes at each stage of tumor staging.
  • the at least one biological indicator further comprises a copy number change of the patient's gene, and determining a correlation between the gene copy number change and the clinical features comprises: comparing the frequency of copy number changes of genes in the second gene set at each tumor stage.
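  • Comparing copy number change frequencies across tumor stages amounts to computing, for a given gene, the fraction of patients at each stage who carry a change. The sketch below shows that tabulation; the function name and the (stage, has_cnv) record layout are assumptions for this example, and any statistical test on the resulting frequencies is left out because the text does not specify one.

```python
from collections import defaultdict


def cnv_frequency_by_stage(records):
    """Fraction of patients with a copy number change, per tumor stage.

    records: iterable of (stage, has_cnv) pairs for a single gene.
    """
    totals = defaultdict(int)
    altered = defaultdict(int)
    for stage, has_cnv in records:
        totals[stage] += 1
        altered[stage] += int(bool(has_cnv))
    return {stage: altered[stage] / totals[stage] for stage in sorted(totals)}


if __name__ == "__main__":
    # Placeholder records for one gene, for illustration only.
    records = [("I", False), ("I", True), ("II", True), ("II", True), ("IV", False)]
    print(cnv_frequency_by_stage(records))
```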
  • the at least one biological indicator can include DNA methylation of the patient's gene, and determining a correlation between the DNA methylation and the clinical features comprises: performing a regression analysis with respect to the clinical features with the degree of DNA methylation as a variable, and identifying DNA methylation whose p-value in the regression analysis is less than or equal to a fourth threshold as being associated with the clinical features.
  • the at least one biological indicator further comprises DNA methylation of the patient's gene, and determining a correlation between the DNA methylation and the clinical features comprises: determining the DNA methylation sites of the genes in the second gene set and the degree of DNA methylation at each of the sites, performing a regression analysis with respect to the clinical features with the degree of DNA methylation as a variable, and identifying DNA methylation whose p-value in the regression analysis is less than or equal to a fourth threshold as a first DNA methylation set associated with the clinical features.
  • the term "fourth threshold" generally refers to the cut-off value for the p-value (for example, a p-value cut-off that reflects statistical significance) in the regression analysis performed with respect to the clinical features with the degree of DNA methylation of the gene as a variable.
  • the fourth threshold may be 0.2 or less.
  • the fourth threshold may be 0.15 or less, 0.1 or less, 0.05 or less, 0.01 or less, or 0.005 or less.
  • the DNA methylation may be identified as a first DNA methylation set associated with the clinical features.
  • the first DNA methylation set may be selected from the genes shown in Table 8.
  • the first set of DNA methylations can comprise DNA methylation events in 23 genes.
  • determining the correlation between the DNA methylation and the clinical feature may further comprise: determining a risk value of each DNA methylation site identified as being associated with the clinical feature, The risk value is determined based on the correlation coefficient obtained by the methylation site in the regression analysis and the degree of methylation of the methylation site.
  • determining a correlation between the DNA methylation and the clinical feature further comprises: determining a risk value for each DNA methylation site in the first DNA methylation pool, The risk value is determined based on the correlation coefficient obtained by the methylation site in the regression analysis and the degree of methylation of the methylation site.
  • the risk value for a DNA methylation event can be a linear combination of the correlation coefficient obtained by the methylation site in the regression analysis and the methylation degree value of the methylation site.
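One way to write the linear combination described above is given below. The notation is introduced here for illustration only and does not appear in the application: $m_{ij}$ is the methylation degree of site $j$ in patient $i$, $c_j$ is the coefficient obtained for site $j$ in the regression analysis, and $J$ is the number of sites considered.

```latex
r_i \;=\; \sum_{j=1}^{J} c_j \, m_{ij}
```

Under this reading, a site with a positive coefficient raises the risk value as its methylation degree increases, and a site with a negative coefficient lowers it.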
  • the at least one biological indicator can include a somatic mutation of the patient gene, and determining a correlation between the somatic mutation and the clinical feature comprises: determining the signaling pathway to which the gene having the somatic mutation belongs, and/or determining the correlation between the expression level of the gene having the somatic mutation and the clinical feature.
  • the at least one biological indicator further comprises a somatic mutation of the patient gene, and determining a correlation between the somatic mutation and the clinical feature comprises: determining the somatic mutations of the genes in the second gene set, and determining the signaling pathways to which the genes having the somatic mutations belong.
  • the signaling pathway may include the PI3K/AKT pathway, the Ras pathway, the Rap1 pathway, and the MAPK pathway.
  • the signaling pathway may have been shown to be associated with a tumor.
  • the at least one biological indicator may include microRNAs in the patient, and determining a correlation between the microRNA and the clinical feature comprises: determining the correlation between the expression level of a gene regulated by the microRNA and the clinical features, and determining the correlation between the expression level of the microRNA in the patient and the expression level of the gene regulated by the microRNA.
  • the at least one biological indicator can include microRNAs in the patient, and determining a correlation between the microRNA and the clinical feature comprises: determining the microRNAs that regulate genes in the second gene set, determining the correlation between the expression level of each microRNA in the patient and the expression level of the gene regulated by that microRNA, and identifying the microRNAs having a correlation higher than a fifth threshold as a first microRNA set associated with the clinical features.
  • the term "fifth threshold” generally refers to a cut-off value that determines the statistical significance of the correlation.
  • the fifth threshold may be less than -0.1.
  • the fifth threshold may be less than -0.15, less than -0.2, less than -0.25, less than -0.3, less than -0.35, less than -0.4, or less than -0.45.
  • when the correlation coefficient is smaller than the fifth threshold, a significant correlation is considered to exist between the expression level of the gene regulated by the microRNA and the expression level of the microRNA.
  • the microRNA and the gene interacting therewith can be paired regulatory pairs (microRNA-gene regulatory pairs).
  • the fifth threshold can reflect the degree of cooperation between the microRNA and its regulated gene.
  • the fifth threshold may vary as the stage of the tumor stage changes.
  • the first microRNA set may include the microRNAs whose correlation is higher than the fifth threshold.
  • the first microRNA set may be selected from the microRNAs shown in Table 10.
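The selection of microRNA-gene regulatory pairs can be sketched in Python as below. The sketch assumes the reading in which a pair is retained when the Pearson correlation coefficient between microRNA expression and target gene expression is smaller (more negative) than a negative fifth threshold such as -0.1; the choice of Pearson correlation, the microRNA and gene names, and the expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def select_regulatory_pairs(mirna_expr, gene_expr, pairs, threshold=-0.1):
    """Keep (microRNA, gene) pairs whose correlation is below `threshold`."""
    selected = []
    for mirna, gene in pairs:
        r = pearson(mirna_expr[mirna], gene_expr[gene])
        if r < threshold:  # assumed reading: more negative than the cutoff
            selected.append((mirna, gene))
    return selected


# Hypothetical expression values across four patients.
mirna_expr = {"miR-hypothetical-1": [1.0, 2.0, 3.0, 4.0]}
gene_expr = {
    "GENE_A": [8.0, 6.0, 5.0, 1.0],  # falls as the microRNA rises
    "GENE_B": [1.0, 2.0, 3.0, 5.0],  # rises with the microRNA
}
candidate_pairs = [
    ("miR-hypothetical-1", "GENE_A"),
    ("miR-hypothetical-1", "GENE_B"),
]
selected = select_regulatory_pairs(mirna_expr, gene_expr, candidate_pairs)
print(selected)
```

A negative cutoff fits the usual expectation that a microRNA represses its target, so higher microRNA expression accompanies lower target expression.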
  • the at least one biological indicator may comprise two or more types of the biological indicator, and determining a correlation between the biological indicator and the clinical feature comprises determining the weight with which each type of said biological indicator affects the clinical features.
  • the weight with which the biological indicators affect the clinical features can be determined by performing an ordered logistic regression analysis.
  • determining the correlation between the biological indicator and the clinical feature may comprise determining, by performing an ordered logistic regression analysis, the weight with which each of the following biological indicators affects the clinical feature: the expression level of the genes in the second gene set, the copy number of the genes in the second gene set, and the risk value of the DNA methylation sites in the first DNA methylation set.
  • the respective weights of the protective effector gene expression level and the risk effector gene expression level in the second gene set can be determined separately.
  • weight generally refers to the relative importance of a certain indicator (eg, the biological indicator) in overall evaluation (eg, evaluation of tumor progression).
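For orientation, a common textbook form of ordered logistic regression (the proportional-odds model) is shown below; the application does not state which parameterization it uses, so this is an assumption. Here $Y \in \{1, 2, 3, 4\}$ denotes tumor stage I to IV, $x_j$ are the biological indicators, $w_j$ are their weights, and $\theta_1 < \theta_2 < \theta_3$ are cut points; all symbols are introduced for illustration.

```latex
\Pr(Y \le k \mid \mathbf{x}) \;=\; \frac{1}{1 + \exp\!\bigl(-(\theta_k - \sum_{j} w_j x_j)\bigr)}, \qquad k = 1, 2, 3
```

In this form a positive weight $w_j$ shifts probability toward later stages as indicator $x_j$ increases, and the magnitude of $w_j$ reflects the relative importance of that indicator.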
  • the present application also provides a computer readable storage medium storing a computer program, wherein the computer program causes a computer to perform the methods described herein.
  • Computer readable storage medium generally refers to a medium in a computer memory for storing certain parameters or data.
  • Computer storage media can include, for example, semiconductors, magnetic cores, magnetic drums, magnetic tapes, and laser disks.
  • authentication module generally refers to a functional unit capable of identifying a biological index determined to be related to the clinical feature in the correlation determination module as being capable of evaluating the progression of the tumor.
  • the authentication module can include a program, reagent, and/or device capable of identifying the biological indicator as being capable of evaluating the progression of the tumor.
  • the identification of biological indicators capable of evaluating tumor progression can be divided into three phases (as shown in Figure 1): Phase 1: using large-scale Cox regression models (i.e., univariate and multivariate Cox regression models), 1078 key genes are identified according to the influence of genes on the survival status of tumor patients (for example, bladder cancer patients) obtained from TCGA. Next, the relationship between these key genes at the different tumor stages (e.g., of bladder cancer) and patient survival and/or tumor stage is analyzed to determine the protective or deleterious properties of these genes.
  • Stage 2: the stage-specific gene co-expression profiles at different tumor stages (e.g., of bladder cancer) are analyzed, and the 1078 key genes are divided accordingly into multiple subgroups, the genes in each subgroup having the same or a similar stage-specific co-expression pattern; the correlation between the genes in each subgroup and the patient survival rate and/or tumor stage is then determined, thereby identifying the subset of the 1078 key genes most relevant to tumor progression.
  • Stage 3: the correlation between tumor (e.g., bladder cancer) progression (e.g., patient survival and/or tumor stage) and other biological indicators of the patient, such as the copy number variation, DNA methylation, and somatic mutations of the 1078 key genes and the microRNA regulatory network, is analyzed to identify one or more other biological indicators that can reflect this correlation.
  • Stage 4: an integrated analysis is performed of the overall correlation between the various biological indicators identified and the progression of the tumor (e.g., bladder cancer) (e.g., patient survival and/or tumor stage).
  • the application provides a device for determining tumor progression in a subject, the device comprising: a) an analysis module capable of determining the expression level of one or more genes shown in Table 1 in the subject or in a biological sample derived from the subject; and b) a determination module capable of determining the progression of the tumor in the subject according to the expression level determined in a).
  • the present application also provides a device for determining tumor progression in a subject, the device comprising a computer for determining tumor progression in a subject, the computer being programmed to perform the following steps: a) determining the expression level of one or more genes shown in Table 1 in the subject or in a biological sample derived from the subject; and b) determining the progression of the tumor in the subject according to the expression level determined in a).
  • the application provides a method of determining tumor progression in a subject, the method comprising: a) determining the expression level of one or more genes shown in Table 1 in the subject or in a biological sample derived from the subject; and b) determining the progression of the tumor in the subject according to the expression level determined in a).
  • the term "analysis module" generally refers to a functional unit capable of determining the expression level of one or more genes shown in Table 1 in the subject or in a biological sample derived from the subject.
  • the analysis module can include a sample unit that obtains a sample of the subject (eg, peripheral blood).
  • the analysis module can include a sample device that obtains a sample of the subject (eg, a device that takes a sample, such as a blood collection needle; and/or a device that carries the sample, such as a test tube).
  • the analysis module can include a sample processing device that obtains DNA of the subject by processing a patient sample (eg, a kit for extracting whole blood DNA, a test tube, and related devices).
  • the analysis module can also include a separation unit capable of separating the subject sample.
  • the analysis module can include an agent that separates cells (eg, proteinase K) and a device that separates cells (eg, a centrifuge).
  • the analysis module can include reagents and equipment that detect the level of expression of one or more genes shown in Table 1 in the subject or from a biological sample of the subject.
  • the analysis module can include a q-RT PCR kit and a q-RT PCR machine.
  • the term “judging module” generally refers to a functional unit that determines the progression of the tumor in the subject based on the expression level determined in the analysis module.
  • the judging module may include a sample judging unit that can judge the progress of the tumor in the subject according to the expression level determined in the analysis module.
  • the tumor progression can include the tumor stage and/or the survival rate of the subject.
  • the tumor stage can be selected from the group consisting of: tumor stage I, tumor stage II, tumor stage III, and tumor stage IV.
  • the tumor can include bladder cancer.
  • the bladder cancer can include bladder urothelial carcinoma (BLCA).
  • the one or more genes may comprise at least one or more of the protective effector genes shown in Table 2.
  • the one or more genes may include at least one or more of the risk effect genes shown in Table 3.
  • the one or more genes may include at least one or more of the genes shown in Table 4.
  • the correlation coefficient between the expression level of a gene in Table 4 and the tumor stage may be negative.
  • the correlation coefficient between the expression levels of the genes in Table 4 (e.g., 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more, 99% or more, or 100% of the genes in Table 4) and the bladder cancer stage may be negative.
  • the one or more genes may include at least one or more of the genes shown in Table 5.
  • the correlation coefficient between the expression level of a gene in Table 5 and the tumor stage may be positive.
  • the correlation coefficient between the expression level of a gene in Table 5 and the bladder cancer stage may be positive.
  • the device or method may further comprise the step or module of determining a copy number change of the one or more genes.
  • the determination of the change in copy number may include the step of utilizing data of copy number changes in Broad GDAC Firehose for analysis.
  • the data is derived from samples of patients at different stages of bladder cancer.
  • the device or method may further comprise the step or module of determining the DNA methylation risk value of one or more of the genes shown in Table 8.
  • the risk value is typically determined based on the correlation coefficient obtained by the methylation site in the regression analysis and the degree of methylation of the methylation site.
  • the risk value can be determined by a method comprising the following steps: computing a linear combination of the methylation levels (i.e., beta values) and the corresponding coefficients, in regularized Cox regression, of the 23 DNA methylation genes (e.g., the genes in the first DNA methylation set described herein, or the genes shown in Table 8); then dividing all patients into a high-risk group and a low-risk group according to the median of the risk values, and performing Kaplan-Meier analysis and a log-rank test on these two groups of patients.
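The scoring and median split described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the gene names, coefficients, and beta values are hypothetical, the rule that a score strictly above the median counts as high risk is a convention chosen here, and the subsequent Kaplan-Meier analysis and log-rank test are not implemented.

```python
from statistics import median


def risk_score(beta_values, coefficients):
    """Linear combination of methylation beta values and regression coefficients."""
    return sum(coefficients[gene] * beta_values[gene] for gene in coefficients)


def split_by_median(scores):
    """Label each patient 'high' or 'low' risk relative to the median score."""
    cut = median(scores.values())
    # Assumed convention: strictly above the median is high risk.
    return {pid: ("high" if s > cut else "low") for pid, s in scores.items()}


# Hypothetical coefficients and per-patient methylation beta values.
coefficients = {"GENE_M1": 0.8, "GENE_M2": -0.5}
patients = {
    "P1": {"GENE_M1": 0.9, "GENE_M2": 0.1},
    "P2": {"GENE_M1": 0.2, "GENE_M2": 0.6},
    "P3": {"GENE_M1": 0.5, "GENE_M2": 0.5},
    "P4": {"GENE_M1": 0.7, "GENE_M2": 0.2},
}
scores = {pid: risk_score(betas, coefficients) for pid, betas in patients.items()}
risk_groups = split_by_median(scores)
print(risk_groups)
```

The two resulting groups are the inputs that a survival comparison such as Kaplan-Meier estimation with a log-rank test would then take.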
  • the device or method further comprises the step or module of determining or providing the age of the subject.
  • the steps or modules can include or perform the steps of asking the age of the patient, investigating a patient's medical record, or determining bone age.
  • determining, in the device or method, the expression level of one or more genes shown in Table 1 in the subject or in a biological sample derived from the subject may comprise: determining the average expression level of the genes shown in Table 2 among the one or more genes; and determining the average expression level of the genes shown in Table 3 among the one or more genes.
  • the expression level of the one or more genes shown in Table 1 in the subject or in a biological sample derived from the subject can be determined from the separately measured average expression levels of one or more of the genes in Tables 2 and 3 (for example, one or more, two or more, four or more, six or more, eight or more, ten or more, twenty or more, more than 50, more than 100, more than 200, or more than 500 of the genes).
  • the device or method can determine the progression of the tumor in the subject according to Formula I:
  • a is the average expression level of the genes shown in Table 2 of the one or more genes;
  • b is the average expression level of the genes shown in Table 3 in the one or more genes;
  • c is the copy number variation of the one or more genes;
  • the present application provides a computer readable storage medium storing a computer program, wherein the computer program can cause a computer to perform the above-described determination method.
  • the present application provides a method of treating a tumor in a subject, the method comprising: determining the progression of the tumor in the subject according to the determining method described herein; and administering to the subject an effective amount of treatment according to the progression.
  • the tumor can include bladder cancer (eg, bladder urothelial carcinoma (BLCA)).
  • the progression of the tumor can be selected from the group consisting of: stage I tumor, stage II of tumor, stage III of tumor, and stage IV of tumor.
  • the treatment can include: transurethral resection, intravesical chemotherapy, partial cystectomy, and radical cystectomy with electrocautery.
  • the treatment may include: radical cystectomy, combination chemotherapy followed by radical cystectomy, radiotherapy, partial cystectomy, and transurethral resection with electrocautery.
  • the treatment may include chemotherapy, radical cystectomy alone or followed by chemotherapy, external radiation therapy, or external radiation plus chemotherapy, and palliative therapy (e.g., urinary diversion or cystectomy).
  • the application provides a device for treating a tumor in a subject, the device comprising: a) an analysis module capable of determining the expression level of one or more genes shown in Table 1 in the subject or in a biological sample derived from the subject; b) a determination module capable of determining the progression of the tumor in the subject according to the expression level determined in a); and c) a therapeutic module capable of administering to the subject an effective amount of treatment according to the progression as judged in b).
  • the term "therapeutic module” generally refers to a functional unit capable of determining and/or administering an effective amount of treatment to a subject based on the progress of the tumor as determined in the determination module.
  • the treatment module can include reagents, medicaments, instruments, and devices required for a method of treatment selected from the group consisting of surgical procedures for cutting tumors, chemotherapy, radiation therapy, biological targeted therapy, and palliative therapy.
  • the palliative therapy may be for the treatment of symptoms affecting quality of life such as pain, anorexia, constipation, fatigue, dyspnea, vomiting, cough, dry mouth, diarrhea, dysphagia, etc., while paying attention to the treatment of mental and psychological problems.
  • the cancer can be bladder cancer
  • the bio-targeted therapy can include administration of, for example, IL-2 and/or IFN-α2a.
  • the treatment module can include administering to the subject an effective amount of the agent.
  • the "effective amount" can be an amount of a drug that alleviates or eliminates a disease or condition in a subject. Generally, the specific effective amount can be determined based on the subject's weight, age, sex, diet, excretion rate, past medical history, current treatment, time of administration, dosage form, method of administration, route of administration, drug combination, the subject's health status and potential for cross-infection, allergies, hypersensitivity and side effects, and/or the tumor stage, and the like.
  • one skilled in the art (e.g., a physician or veterinarian) may proportionally reduce or increase the effective amount in accordance with these or other conditions or requirements.
  • the term "about" generally means a variation within a range of 0.5% to 10% above or below a specified value, such as a variation within 0.5%, 1%, 1.5%, 2%, 2.5%, 3%, 3.5%, 4%, 4.5%, 5%, 5.5%, 6%, 6.5%, 7%, 7.5%, 8%, 8.5%, 9%, 9.5%, or 10% above or below the specified value.
  • CLAIMS 1.
  • a device for identifying a biological indicator capable of assessing tumor progression comprising:
  • a clinical feature module capable of providing clinical characteristics of a patient having the tumor, the clinical features comprising a stage of tumor staging of the patient and/or a survival time of the patient;
  • a biological indicator module capable of providing at least one biological indicator derived from said patient
  • a correlation determination module capable of determining a correlation of the at least one biological indicator of each of the patients with the clinical characteristics of the corresponding patient
  • An authentication module capable of identifying a biological indicator determined in module 3) to be associated with said clinical feature as being capable of evaluating the progression of said tumor.
  • a device for identifying a biological indicator capable of assessing tumor progression comprising a computer for identifying the biological indicator, the computer being programmed to perform the following steps:
  • the biological index determined to be related to the clinical feature in 3) is identified as being capable of evaluating the progression of the tumor.
  • a method of identifying a biological indicator capable of assessing tumor progression comprising:
  • the biological index determined to be related to the clinical feature in 3) is identified as being capable of evaluating the progression of the tumor.
  • bladder cancer comprises bladder urothelial carcinoma (BLCA).
  • stage of tumor staging is selected from the group consisting of: stage I, stage II, stage III, and stage IV of tumor.
  • the at least one biological indicator comprises one or more types of indicators selected from the group consisting of:
  • Class 1 expression level of the patient gene
  • Class 2 a copy number change of the patient gene
  • Class 3 DNA methylation of the patient's gene
  • Class 4 somatic mutation of the patient's gene
  • Class 5 MicroRNAs in the patient.
  • the at least one biological indicator comprises an expression level of the patient gene, and determining a correlation between the expression level of the gene and the clinical feature comprises: performing a univariate regression analysis with respect to the clinical features with the expression level of the gene as the single variable, and identifying genes whose p-value in the regression analysis is less than or equal to the first threshold and whose FDR value is less than or equal to the second threshold as being associated with the clinical features.
  • the at least one biological indicator comprises an expression level of the patient gene, and determining a correlation between the expression level of the gene and the clinical feature comprises: performing a multivariate regression analysis with respect to the clinical features, and identifying genes whose FDR value in the regression analysis is less than or equal to a third threshold as being associated with the clinical feature, wherein the variables include the gene expression level in the patient, the age of the patient, the gender of the patient, and/or the tumor stage of the patient.
  • the at least one biological indicator comprises an expression level of the patient gene, and determining a correlation between the expression level of the gene and the clinical feature further comprises: dividing the genes into protective effector genes and risk effector genes according to the correlation coefficient value obtained for each gene in the regression analysis, wherein the correlation coefficient value of a protective effector gene is negative and the correlation coefficient value of a risk effector gene is positive.
  • the at least one biological indicator comprises an expression level of the patient gene, and determining a correlation between the expression level of the gene and the clinical feature further comprises: determining the expression level of the patient's genes at each tumor stage, determining therefrom the tumor-stage-specific gene co-expression, and dividing the genes according to the gene co-expression.
  • the at least one biological indicator comprises a copy number change of the patient gene, and determining a correlation between the gene copy number change and the clinical feature comprises: comparing the frequency of copy number changes of the patient's genes at each tumor stage.
  • the at least one biological indicator comprises DNA methylation of the patient gene, and determining a correlation between the DNA methylation and the clinical feature comprises: performing regression analysis with respect to the clinical features with the degree of DNA methylation as a variable, and identifying DNA methylation whose p-value in the regression analysis is less than or equal to a fourth threshold as being associated with the clinical features.
  • determining the correlation between the DNA methylation and the clinical feature further comprises determining a risk value for each DNA methylation site identified as being associated with the clinical feature, the risk value being determined based on the correlation coefficient obtained by the methylation site in the regression analysis and the degree of methylation of the methylation site.
  • the at least one biological indicator comprises a somatic mutation of the patient gene, and determining a correlation between the somatic mutation and the clinical feature comprises: determining the signaling pathway to which the gene having the somatic mutation belongs, and/or determining the correlation between the expression level of the gene having the somatic mutation and the clinical feature.
  • the correlation includes determining a correlation between the expression level of the gene regulated by the microRNA and the clinical feature, and determining the correlation between the expression level of the microRNA in the patient and the expression level of the gene regulated thereby.
  • the correlation between clinical features includes determining the weight of each of the biological indicators affecting the clinical features.
  • the at least one biological indicator comprises an expression level of the patient gene, and determining an expression level of the gene and the clinical characteristic
  • the gene is identified as a first set of genes associated with the clinical features.
  • determining a correlation between the expression level of the gene and the clinical characteristic further comprises:
  • the variables include the expression level of each gene in the first set of genes, the age of the patient, the gender of the patient, and the stage of tumor staging of the patient.
  • determining a correlation between the expression level of the gene and the clinical characteristic further comprises:
  • determining a correlation between the expression level of the gene and the clinical characteristic further comprises:
  • determining the expression level of each gene in the second gene set at each tumor stage, determining therefrom the tumor-stage-specific gene co-expression, dividing the genes in the second gene set into two or more groups according to the gene co-expression, and separately determining the correlation between the gene expression level of each group and the clinical features.
  • the at least one biological indicator further comprises a copy number change of the patient gene, and determining a correlation between the gene copy number change and the clinical feature comprises: comparing the frequency of copy number changes of the genes in the second gene set at each tumor stage.
  • the at least one biological indicator further comprises DNA methylation of the patient gene, and determining a correlation between the DNA methylation and the clinical feature comprises: determining the DNA methylation sites of the genes in the second gene set and the degree of DNA methylation of each of the sites, performing regression analysis with respect to the clinical features using the degree of DNA methylation as a variable, and identifying DNA methylation whose p-value in the regression analysis is less than or equal to a fourth threshold as a first DNA methylation set associated with the clinical features.
  • determining the correlation between the DNA methylation and the clinical feature further comprises determining a risk value for each DNA methylation site in the first DNA methylation set, the risk value being determined based on the correlation coefficient obtained by the methylation site in the regression analysis and the degree of methylation of the methylation site.
  • the at least one biological indicator further comprises a somatic mutation of the patient gene, and determining a correlation between the somatic mutation and the clinical feature comprises: determining the somatic mutations of the genes in the second gene set, and determining the signaling pathways to which the genes having the somatic mutations belong.
  • the at least one biological indicator comprises microRNAs in the patient, and determining a correlation between the microRNA and the clinical feature comprises: determining the microRNAs that regulate genes in the second gene set, determining the correlation between the expression level of each microRNA in the patient and the expression level of the gene regulated thereby, and identifying microRNAs with a correlation above a fifth threshold as a first microRNA set associated with the clinical features.
  • determining a correlation between the biological indicator and the clinical feature comprises determining, by performing an ordered logistic regression analysis, the weight with which each of the following biological indicators affects the clinical features: the expression level of the genes in the second gene set, the copy number of the genes in the second gene set, and the risk value of the DNA methylation sites in the first DNA methylation set.
  • a computer readable storage medium storing a computer program, wherein the computer program causes the computer to perform the method of any of embodiments 3-31.
  • a device for determining tumor progression in a subject comprising:
  • an analysis module capable of determining the expression level of one or more genes shown in Table 1 in the subject or in a biological sample derived from the subject;
  • a determination module capable of determining the progression of said tumor in said subject in accordance with said expression level determined in a).
  • a device for determining tumor progression in a subject comprising a computer for determining tumor progression in a subject, the computer being programmed to perform the following steps:
  • a method of determining tumor progression in a subject comprising:
  • the tumor progression comprises a stage of staging of the tumor and/or a survival rate of the subject.
  • stage of staging of the tumor is selected from the group consisting of: stage I tumor, stage II of tumor, stage III of tumor, and stage IV of tumor.
  • the tumor comprises bladder cancer.
  • bladder cancer comprises bladder urothelial carcinoma (BLCA).
  • determining the expression level in the sample comprises: determining the average expression level of those of the one or more genes that are shown in Table 2; and determining the average expression level of those of the one or more genes that are shown in Table 3.
  • a is the average expression level of those of the one or more genes that are shown in Table 2;
  • b is the average expression level of those of the one or more genes that are shown in Table 3;
  • c is the copy number change of the one or more genes;
  • d is the DNA methylation risk value of those of the one or more genes that are shown in Table 8;
  • e is the age of the subject;
  • f is the sex of the subject, where male is 0 and female is 1.
  • a computer readable storage medium storing a computer program, wherein the computer program causes the computer to perform the method of any one of embodiments 35-48.
  • a method of treating a tumor in a subject comprising:
  • An effective amount of treatment is administered to the subject based on the progression.
  • a device for treating a tumor in a subject comprising:
  • an analysis module capable of determining the expression level of one or more genes shown in Table 1 in the subject or in a biological sample derived from the subject;
  • a determination module capable of determining the progression of said tumor in said subject in accordance with said expression level determined in a);
  • a treatment module capable of administering to the subject an effective amount of treatment according to the progression as judged in b).
  • Example 1 Patient and tumor sample data sources
  • The genomic and clinical data sets for most of the BLCA patients used in this application were downloaded from the "NCI GDC Data Portal Legacy Archive". The clinical information for the BLCA patients came from the TCGA-BLCA clinical files.
  • The BLCA patient RNA-seq data set contained 419 samples: 400 tumor samples and 19 normal samples. All gene expression values were normalized.
  • TCGA Level 3 methylation data were downloaded from "jhu-usc_BLCA.HumanMethylation450".
  • Correlation data between TCGA level 4 mRNA expression and DNA methylation was obtained from Broad GDAC Firehose.
  • TCGA Level 4 copy number variation (CNV) data was downloaded from Broad GDAC Firehose.
  • The "reads per million miRNA mapped (RPM)" value from the TCGA Level 3 microRNA quantification file was used as the microRNA expression value.
  • Survival analysis is used to study the relationship between survival status and different potential influencing factors (e.g., key genes).
  • For the univariate Cox proportional hazards regression, the thresholds were p-value ≤ 0.05 and false discovery rate (FDR) ≤ 0.1; for the multivariate Cox proportional hazards regression, they were p-value ≤ 0.05 and FDR ≤ 0.05.
  • Univariate and multivariate Cox proportional hazards regression models were used to select a set of key genes that may have important implications for the survival of BLCA patients.
  • For the univariate Cox regression, the gene expression value was used as the only predictor. After removing rarely expressed genes (genes expressed in fewer than 20 samples), expression values of 19,472 genes were obtained for all 404 BLCA patients. Then 1307 candidate genes were selected using the thresholds p-value ≤ 0.05 and false discovery rate (FDR) ≤ 0.1. Next, the candidate genes were tested against the proportional hazards (PH) assumption, and 99 genes that did not satisfy it were excluded. Thus, 1208 candidate genes were retained after univariate Cox regression analysis.
  • In the multivariate Cox regression, an FDR threshold of ≤ 0.05 was applied and the candidate genes were again tested against the proportional hazards (PH) assumption to screen them further.
  • 1078 candidate genes were obtained from the multivariate Cox regression (see Table 1, which lists the 1078 identified key genes). The 1078 genes shown in Table 1 were defined as key genes and used for the subsequent analysis.
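The two-threshold screen described above (p-value cutoff plus FDR cutoff) can be illustrated with a minimal Python sketch. It assumes the FDR is estimated with the Benjamini-Hochberg procedure, which the text does not specify, and it starts from per-gene p-values that a Cox regression would already have produced. The gene names and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_candidates(gene_p_values, p_cut=0.05, fdr_cut=0.1):
    """Keep genes that pass both the p-value cutoff and the FDR cutoff."""
    genes = list(gene_p_values)
    q_values = benjamini_hochberg([gene_p_values[g] for g in genes])
    return [g for g, q in zip(genes, q_values)
            if gene_p_values[g] <= p_cut and q <= fdr_cut]


# Hypothetical per-gene p-values from a univariate Cox regression.
p_by_gene = {"GENE_A": 0.001, "GENE_B": 0.02, "GENE_C": 0.04, "GENE_D": 0.5}
candidates = select_candidates(p_by_gene)
```

The same function can be reused for the multivariate step by passing `fdr_cut=0.05`.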
  • The 1078 key genes of Example 2 were divided into two groups: protective effector genes and risk effector genes.
  • To investigate the correlation of gene expression within and between the two gene groups at different tumor stages of bladder cancer, the correlation coefficients between the expression levels of protective-protective, protective-risk, and risk-risk effector gene pairs were analyzed at each tumor stage.
  • Example 4 Construction of a co-expression network of key genes and detection of functional gene modules related to clinical features
  • a gene module is defined as a gene group comprising a number of highly linked genes in a constructed gene co-expression network.
  • The topological overlap matrix (TOM) was obtained from the adjacency matrix with the "TOMsimilarity" function, and the corresponding dissimilarity scores were derived from this matrix.
  • The "hclust" function was used to build the gene dendrogram, and the "cutreeDynamic" function was then used for module identification.
  • The minimum module size was set to 20.
  • A labeled heat map was used to display the module-feature associations.
  • Gene co-expression networks can provide an overall picture of gene-gene associations. Based on the gene expression values of BLCA patients at different stages, a tumor stage-specific gene co-expression network was constructed using the WGCNA algorithm.
  • the genes within the module usually exhibit similar expression patterns.
  • Such network modules are generally considered to have basic network topology features, which can provide favorable clues for understanding the biological functions of related genes in the module.
  • The adjacency matrix is first converted to a topological overlap matrix, which provides a topological similarity score used for downstream module detection.
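The conversion from adjacency to topological overlap can be sketched in plain Python. This is an illustration only, assuming the standard unsigned topological overlap formula used in WGCNA, TOM_ij = (sum_u a_iu a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij); the analysis in the text relied on the R package rather than on code like this, and the 3-gene adjacency matrix is hypothetical.

```python
def topological_overlap(adj):
    """Unsigned topological overlap matrix from a symmetric adjacency matrix."""
    n = len(adj)
    # Connectivity k_i: sum of adjacencies to all other genes (diagonal excluded).
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Strength of the connections that genes i and j share via third genes.
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u != i and u != j)
            value = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = value
    return tom


def dissimilarity(tom):
    """Dissimilarity score (1 - TOM) used as the clustering distance."""
    return [[1.0 - value for value in row] for row in tom]


# Hypothetical adjacency values for three genes.
adjacency = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
tom = topological_overlap(adjacency)
dist = dissimilarity(tom)
```

Genes with a high direct adjacency and many shared neighbors end up with a small dissimilarity, so hierarchical clustering places them in the same module.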
  • A dynamic tree cutting algorithm is then run on the hierarchical clustering tree (i.e., the dendrogram) generated by the WGCNA algorithm to produce seven network modules of different sizes (see Figure 5A and Table 6).
  • FIG. 5A shows the hierarchical clustering tree (i.e., the dendrogram) constructed by WGCNA from the dissimilarity scores of the topological overlap matrix, together with the gene clusters derived by the dynamic tree cutting algorithm.
  • Each gene cluster is named by a different color. In FIG. 5B, the numbers on the left denote the gene clusters of different colors: the first to seventh modules denote, in order, the individual functional modules cyan, black, yellow, brown, red, blue, and green.
  • Figure 5B shows the relationship between the module eigengenes (rows), each defined as the first principal component of the gene expression profile within a single module, and the clinical features (columns) of all BLCA patients. Each box shows the correlation coefficient and the corresponding p-value (in parentheses).
  • PDGFRB has been shown to be closely associated with recurrence of non-muscle invasive bladder cancer (see Feng J et al, PLoS One 2014, 9(5): e96671).
  • MARVELD1 was found to be down-regulated in several cancers including bladder cancer (see Wang S et al, Cancer Lett 2009, 282(1): 77-86).
  • KCNE4, an ion channel gene, has been found to display abnormal expression levels in bladder cancer samples (see Biasiotta A et al, J Transl Med 2016, 14(1): 285).
  • CPT1B has been shown to be down-regulated in bladder cancer tissues along with other genes in the carnitine-acylcarnitine metabolic pathway (see Kim WT et al, Yonsei Med J 2016, 57(4): 865-871).
  • CKD6 has been shown to be involved in several regulatory pathways in bladder cancer (see Lu S et al, Exp Ther Med 2017, 13(6): 3309-3314). Genes with high connectivity in the network modules may therefore also have important biological functions in the staging of bladder cancer. These results indicate that the stage-specific association between the survival of BLCA patients and their tumor stage can be reflected by the expression levels of different groups of key genes.
  • CNV data were obtained from "SNP6 Copy Number Analysis (Gistic2)" in Broad GDAC Firehose (Level 4).
  • CNV data for the 1078 key genes were obtained for 400 BLCA samples: 129 samples from stage I/II, 139 samples from stage III, and 132 samples from stage IV.
  • The frequency of samples carrying a CNV (i.e., an amplification or deletion) was calculated for each stage. To account for the imbalance in sample numbers across the stages of bladder cancer, stage I/II was used as the baseline to normalize the frequency at each stage.
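A minimal Python sketch of this frequency calculation follows. The text does not spell out the normalization, so the sketch assumes the simplest reading: the per-stage frequency is the fraction of samples with a CNV, and each frequency is then divided by the stage I/II value. The stage sizes come from the text; the altered-sample counts are hypothetical.

```python
def cnv_frequency(altered_samples, total_samples):
    """Fraction of samples at one stage carrying an amplification or deletion."""
    return altered_samples / total_samples


def normalize_to_baseline(frequencies, baseline="I/II"):
    """Express each stage's frequency relative to the baseline stage."""
    base = frequencies[baseline]
    return {stage: freq / base for stage, freq in frequencies.items()}


# Stage sizes are taken from the text; the altered-sample counts are hypothetical.
stage_sizes = {"I/II": 129, "III": 139, "IV": 132}
altered = {"I/II": 26, "III": 42, "IV": 53}

frequencies = {s: cnv_frequency(altered[s], stage_sizes[s]) for s in stage_sizes}
relative = normalize_to_baseline(frequencies)
```

Dividing by the number of samples per stage is what removes the effect of unequal group sizes; the division by the baseline only rescales the result so that stage I/II equals 1.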
  • FIG. 6A shows a comparison of CNV ratios in different stages of bladder cancer.
  • Figures 6B-6E show a comparison of CNV ratios for the blue and cyan modules as a whole and for stages I/II, III and IV, where *: p-value ≤ 0.05; **: p-value ≤ 0.01; ***: p-value ≤ 0.001; ****: p-value ≤ 0.0001 (two-sided Wilcoxon rank-sum test).
  • The results indicate that copy number variation is an important factor affecting the different stages (i.e., the progression) of bladder cancer, and that it affects different functional gene modules to different degrees.
  • the DNA methylation status of 1078 key genes screened in Example 2 was analyzed, and some of the DNA methylation features may be used as biomarkers for bladder cancer prognosis.
  • a risk value was then introduced, which was defined as the linear combination of the methylation level (i.e., beta value) and the corresponding coefficient of the 23 DNA methylation genes in the regularized Cox regression.
  • all BLCA patients were scored according to the median risk value and divided into high-risk and low-risk groups. Kaplan-Meier analysis and log-rank test were then performed on these two groups of patients. The results showed that the high-risk group and the low-risk group showed significantly different risk score distributions (see Figure 7A).
  • the Kaplan-Meier curve drawn also has a significant difference, ie the higher the risk score, the worse the prognosis and vice versa (see Figure 7B).
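The risk value and the median split can be sketched as follows. This is an illustration under stated assumptions: the coefficients stand in for those of the regularized Cox regression, the site names and beta values are hypothetical, and patients whose score equals the median are placed in the low-risk group, a tie rule the text does not specify.

```python
from statistics import median


def risk_score(beta_values, coefficients):
    """Linear combination of methylation beta values and regression coefficients."""
    return sum(coefficients[site] * beta_values[site] for site in coefficients)


def split_by_median(scores):
    """Return the median cutoff and the sorted high-risk and low-risk patient IDs."""
    cutoff = median(scores.values())
    high = sorted(p for p, s in scores.items() if s > cutoff)
    low = sorted(p for p, s in scores.items() if s <= cutoff)
    return cutoff, high, low


# Hypothetical coefficients and per-patient beta values for two methylation sites.
coefficients = {"site_1": 1.5, "site_2": -0.8}
patients = {
    "P1": {"site_1": 0.2, "site_2": 0.9},
    "P2": {"site_1": 0.8, "site_2": 0.1},
    "P3": {"site_1": 0.5, "site_2": 0.5},
    "P4": {"site_1": 0.9, "site_2": 0.3},
}
scores = {pid: risk_score(betas, coefficients) for pid, betas in patients.items()}
cutoff, high_risk, low_risk = split_by_median(scores)
```

In the text the sum runs over the 23 selected DNA methylation genes rather than two sites.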
  • Figure 7A shows the distribution of risk scores (based on the 23 selected DNA methylation genes) and the corresponding clinical features of patients in the high-risk and low-risk groups of the DNA methylation analysis; the dotted line shows the cut-off value of the risk score.
  • Figure 7B shows Kaplan-Meier survival curves for the high-risk and low-risk groups, with statistical differences between the two groups by log-rank test. The results indicate that new risk values based on the selected DNA methylation genes can provide a good prognostic indicator for bladder cancer.
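For readers unfamiliar with the survival curves mentioned here, the Kaplan-Meier estimator can be sketched in a few lines of Python. This is a generic product-limit estimator, not the code used for Figure 7B; the follow-up times and event flags are hypothetical, and the log-rank test is not included.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times: follow-up time per patient; events: 1 = death observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Patients who died or were censored at t leave the risk set.
        at_risk -= removed
    return curve


# Hypothetical follow-up data for one risk group (time in months).
curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
```

Computing one such curve for the high-risk group and one for the low-risk group gives the two lines that a log-rank test then compares.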
  • Figures 8A-8D show significant enrichment of mutant genes for the PI3K-AKT pathway, the MAPK pathway, the Ras pathway, and the Rap1 pathway, respectively, in samples from BLCA patients.
  • Rows represent mutated genes, ordered by their mutation frequency across all samples; columns represent the samples involved (columns without any mutation have been removed).
  • a significant portion of these four pathways were mutated in bladder cancer. Specifically, in all samples, 60% of the MAPK pathway, 56% of the PI3K/AKT pathway, 35% of the Rap1 pathway, and 35% of the Ras pathway already have a mutated gene, and mutations occur more frequently than 1%.
  • the R package "igraph” was used to calculate the degree of synergy of different stages of microRNA regulation networks in bladder cancer.
  • the network map was generated by Cytoscape 3.5.0.
  • Table 10 lists the microRNAs that interact with the 1078 key genes screened in Example 2.
  • microRNA-gene pairs with a correlation coefficient below -0.3 were selected as potential regulatory partners, and a microRNA-gene interaction network was constructed from them for each stage of bladder cancer. It was found that, across the stages (progression) of bladder cancer, the microRNA regulatory networks (including the interactions involving microRNAs known to be BLCA-specific) tend to become sparser, with progressively fewer interactions (see Figure 10). To quantify this trend, the degree of synergy of the microRNA regulatory network at each stage was also calculated.
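The pair selection step can be sketched as below. The sketch assumes the coefficient is a Pearson correlation computed across patients of one stage, which the text does not state, and it only covers the -0.3 cutoff, not the network statistics. The microRNA name, gene names and expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def regulatory_pairs(mirna_expr, gene_expr, candidate_pairs, cutoff=-0.3):
    """Keep microRNA-gene pairs whose expression correlation is below the cutoff."""
    selected = []
    for mirna, gene in candidate_pairs:
        r = pearson(mirna_expr[mirna], gene_expr[gene])
        if r < cutoff:
            selected.append((mirna, gene, r))
    return selected


# Hypothetical expression values across four patients of one stage.
mirna_expr = {"miR-x": [1.0, 2.0, 3.0, 4.0]}
gene_expr = {"GENE_A": [4.0, 3.0, 2.0, 1.0], "GENE_B": [1.0, 2.0, 3.0, 4.0]}
pairs = regulatory_pairs(mirna_expr, gene_expr,
                         [("miR-x", "GENE_A"), ("miR-x", "GENE_B")])
```

Running the selection separately on the patients of each stage yields one edge list per stage, from which the stage-specific networks are built.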
  • Figures 10A-10C show visual dynamic changes in the microRNA regulatory networks of stage I/II, III, and IV, respectively.
  • the rectangle represents the selected microRNA, and the known BLCA-specific microRNA is shown in red; the target gene corresponding to the microRNA is represented by a green circle, and the degree of cooperation of each network is also shown.
  • The microRNA regulatory network of the 1078 key genes screened in BLCA patients became increasingly fragmented as bladder cancer progressed, which may be related to the dysregulation of microRNAs in cancer cells. It also reflects the disruption of intracellular regulation and control of gene expression in bladder cancer.
  • the ordinal logistic regression task is performed using the "mnrfit" function in Matlab 2016b.
  • The mean expression values (z-normalized) of the protective effector genes and the risk effector genes, the frequency of copy number variation (z-normalized), the DNA methylation risk score, age and sex were considered in the comprehensive analysis (see Table 11).
  • the mean expression value of the risk gene, the frequency of copy number variation, and the risk score of DNA methylation can significantly affect the stage of bladder cancer.
  • the boxes and lines represent the odds ratio (OR) and the corresponding 95% confidence interval, respectively, and the asterisks represent statistically significant variables.

Abstract

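This application discloses a device and method for identifying and evaluating tumor progression. The device or method comprises: 1) a module or step capable of providing the clinical features of patients having the tumor; 2) a module or step capable of providing at least one biological indicator derived from the patients; 3) a module or step capable of determining the correlation between the at least one biological indicator of each patient and the clinical features of the corresponding patient; and 4) a module or step capable of evaluating the progression of the tumor or identifying relevant evaluation indicators. The device or method of this application can provide guidance for studying the underlying molecular mechanisms of tumor progression and for devising treatment strategies directed at tumor progression.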

Description

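Device and Method for Identifying and Evaluating Tumor Progression
Technical Field
This application relates to the detection and treatment of disease, and in particular to devices and methods for identifying biological indicators that can be used to evaluate tumor progression, and to devices and methods for determining tumor progression.
Background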
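Elucidating the underlying molecular mechanisms of tumorigenesis is one of the most important problems in oncology. High-throughput DNA sequencing makes it possible to obtain the genomic features of patients whose gene expression is dysregulated. For example, copy number variation (CNV) has been found to serve as an important indicator of cancers such as colorectal cancer (see Zhao S, et al., Proc Natl Acad Sci U S A 2013, 110(8): 2916-2921). DNA methylation is an important epigenetic mechanism; in bladder cancer cells, abnormal DNA methylation levels have been shown to be associated with the dysfunction of certain genes, and therefore with the development of bladder cancer (see Rose M, et al., Carcinogenesis 2014, 35(3): 727-736). Somatic mutation is often regarded as another cause of bladder cancer progression (see Soung YH, et al., Oncogene 2003, 22(39): 8048-8052). Aberrant expression of microRNAs may disrupt the intracellular regulatory network in bladder cancer cells (see Jin Y, et al., Tumour Biol 2015, 36(5): 3791-3797).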
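However, the onset and progression of cancer is usually a multistep and highly dynamic process involving changes in the activity levels of many molecules in the cell. Evaluating cancer progression or prognosis from a single indicator is therefore usually difficult. The field also lacks reliable biological indicators that can be linked to clinical features (e.g., disease progression). Accordingly, there is an urgent need to identify potential biological indicators that reveal cancer progression; to evaluate the important biological indicators related to cancer progression from multiple angles, such as gene expression level, copy number variation, DNA methylation, somatic mutation and microRNA regulation; and to study how these indicators can be used together to evaluate cancer progression and/or prognosis.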
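Summary of the Invention
This application provides a device and method for identifying biological indicators capable of evaluating tumor progression. By inventively comparing and associating the clinical features of patients having a tumor (e.g., tumor stage and/or patient survival time) with at least one biological indicator of those patients (e.g., gene expression level, copy number variation, DNA methylation, somatic mutation and microRNA), the device and method identify biological indicators that can be used to evaluate tumor progression. This application further provides a device and method for determining tumor progression in a subject, which inventively combine the identified biological indicators and assign a reasonable weight to each indicator in order to determine the progression of the tumor in the subject. In some cases, the device or method of this application can also provide a suitable treatment plan based on the result of that determination.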
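In one aspect, this application provides a device for identifying a biological indicator capable of evaluating tumor progression, the device comprising: 1) a clinical feature module capable of providing the clinical features of patients having the tumor, the clinical features including the tumor stage of the patient and/or the survival time of the patient; 2) a biological indicator module capable of providing at least one biological indicator derived from the patients; 3) a correlation determination module capable of determining the correlation between the at least one biological indicator of each patient and the clinical features of the corresponding patient; and 4) an identification module capable of identifying a biological indicator determined in module 3) to be correlated with the clinical features as being capable of evaluating the progression of the tumor.
In another aspect, this application provides a device for identifying a biological indicator capable of evaluating tumor progression, the device comprising a computer for identifying the biological indicator, the computer being programmed to perform the following steps: 1) providing the clinical features of patients having the tumor, the clinical features including the tumor stage of the patient and/or the survival time of the patient; 2) providing at least one biological indicator derived from the patients; 3) determining the correlation between the at least one biological indicator of each patient and the clinical features of the corresponding patient; and 4) identifying a biological indicator determined in 3) to be correlated with the clinical features as being capable of evaluating the progression of the tumor.
In another aspect, this application provides a method for identifying a biological indicator capable of evaluating tumor progression, the method comprising: 1) providing the clinical features of patients having the tumor, the clinical features including the tumor stage of the patient and/or the survival time of the patient; 2) providing at least one biological indicator derived from the patients; 3) determining the correlation between the at least one biological indicator of each patient and the clinical features of the corresponding patient; and 4) identifying a biological indicator determined in 3) to be correlated with the clinical features as being capable of evaluating the progression of the tumor.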
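In some embodiments, the tumor comprises bladder cancer. In some embodiments, the bladder cancer comprises bladder urothelial carcinoma (BLCA).
In some embodiments, the tumor stage is selected from: tumor stage I, tumor stage II, tumor stage III and tumor stage IV.
In some embodiments, the at least one biological indicator comprises one or more classes of indicators selected from the following group: class 1: the expression level of the patient's genes; class 2: the copy number change of the patient's genes; class 3: the DNA methylation of the patient's genes; class 4: the somatic mutations of the patient's genes; and class 5: the microRNAs in the patient.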
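In some embodiments, the at least one biological indicator comprises the expression level of the patient's genes, and determining the correlation between the expression level of the genes and the clinical features comprises: performing a univariate regression analysis against the clinical features with the expression level of the gene as the single variable, and identifying genes whose p-value in the regression analysis is less than or equal to a first threshold and whose FDR value is less than or equal to a second threshold as correlated with the clinical features.
In some embodiments, the at least one biological indicator comprises the expression level of the patient's genes, and determining the correlation between the expression level of the genes and the clinical features comprises: performing a multivariate regression analysis against the clinical features, and identifying genes whose FDR value in the regression analysis is less than or equal to a third threshold as correlated with the clinical features, wherein the multiple variables include the gene expression level in the patient, the age of the patient, the sex of the patient, and/or the tumor stage of the patient.
In some embodiments, the at least one biological indicator comprises the expression level of the patient's genes, and determining the correlation between the expression level of the genes and the clinical features further comprises: dividing the genes into protective effector genes and risk effector genes according to the correlation coefficient obtained for each gene in the multivariate regression analysis, wherein the correlation coefficient of a protective effector gene is negative and that of a risk effector gene is positive.
In some embodiments, the at least one biological indicator comprises the expression level of the patient's genes, and determining the correlation between the expression level of the genes and the clinical features further comprises: determining the expression level of the patient's genes at each tumor stage; determining from this the tumor stage-specific gene co-expression; dividing the genes into two or more groups according to the gene co-expression; and determining, for each group, the correlation between its gene expression level and the clinical features.
In some embodiments, the device or method uses the WGCNA algorithm to divide the genes into two or more groups according to the gene co-expression.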
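In some embodiments, the at least one biological indicator comprises the copy number change of the patient's genes, and determining the correlation between the gene copy number change and the clinical features comprises: comparing the frequency of copy number change of the patient's genes at each tumor stage.
In some embodiments, the at least one biological indicator comprises the DNA methylation of the patient's genes, and determining the correlation between the DNA methylation and the clinical features comprises: performing a regression analysis against the clinical features with the degree of DNA methylation as the variable, and identifying DNA methylation whose p-value in the regression analysis is less than or equal to a fourth threshold as correlated with the clinical features.
In some embodiments, determining the correlation between the DNA methylation and the clinical features in the device or method further comprises: determining a risk value for each DNA methylation site identified as correlated with the clinical features, the risk value being determined from the correlation coefficient obtained for that methylation site in the regression analysis and the degree of methylation of that site.
In some embodiments, the at least one biological indicator comprises the somatic mutations of the patient's genes, and determining the correlation between the somatic mutations and the clinical features comprises: determining the signaling pathway to which a gene having the somatic mutation belongs, and/or determining the correlation between the expression level of a gene having the somatic mutation and the clinical features.
In some embodiments, the at least one biological indicator comprises the microRNAs in the patient, and determining the correlation between the microRNAs and the clinical features comprises: determining the correlation between the expression level of the genes regulated by a microRNA and the clinical features, and determining the correlation between the expression level of the microRNA in the patient and the expression level of the genes it regulates.
In some embodiments, the at least one biological indicator comprises two or more classes of the biological indicators, and determining the correlation between the biological indicators and the clinical features comprises determining the weight of the effect of each class of biological indicator on the clinical features.
In some embodiments, the device or method determines the weights by performing an ordinal logistic regression analysis.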
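In some embodiments, the at least one biological indicator comprises the expression level of the patient's genes, and determining the correlation between the expression level of the genes and the clinical features comprises: a) performing a univariate regression analysis against the clinical features with the expression level of the gene as the single variable, and identifying genes whose p-value in the regression analysis is less than or equal to a first threshold and whose FDR value is less than or equal to a second threshold as a first gene set correlated with the clinical features.
In some embodiments, determining the correlation between the expression level of the genes and the clinical features in the device or method further comprises: b) performing a multivariate regression analysis against the clinical features, and identifying genes whose FDR value in the regression analysis is less than or equal to a third threshold as a second gene set correlated with the clinical features, wherein the multiple variables include the expression level of each gene in the first gene set, the age of the patient, the sex of the patient, and the tumor stage of the patient.
In some embodiments, determining the correlation between the expression level of the genes and the clinical features in the device or method further comprises: c) dividing the genes into protective effector genes and risk effector genes according to the correlation coefficient obtained for each gene in the multivariate regression analysis, wherein the correlation coefficient of a protective effector gene is negative and that of a risk effector gene is positive.
In some embodiments, determining the correlation between the expression level of the genes and the clinical features in the device or method further comprises: determining the expression level of each gene in the second gene set at each tumor stage; determining from this the tumor stage-specific gene co-expression; dividing the genes in the second gene set into two or more groups according to the gene co-expression; and determining, for each group, the correlation between its gene expression level and the clinical features.
In some embodiments, the device or method uses the WGCNA algorithm to divide the genes in the second gene set into two or more groups according to the gene co-expression.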
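In some embodiments, the at least one biological indicator further comprises the copy number change of the patient's genes, and determining the correlation between the gene copy number change and the clinical features comprises: comparing the frequency of copy number change of the genes in the second gene set at each tumor stage.
In some embodiments, the at least one biological indicator further comprises the DNA methylation of the patient's genes, and determining the correlation between the DNA methylation and the clinical features comprises: determining the DNA methylation sites of the genes in the second gene set and the degree of DNA methylation at each site; performing a regression analysis against the clinical features with the degree of DNA methylation as the variable; and identifying DNA methylation whose p-value in the regression analysis is less than or equal to a fourth threshold as a first DNA methylation set correlated with the clinical features.
In some embodiments, determining the correlation between the DNA methylation and the clinical features in the device or method further comprises: determining a risk value for each DNA methylation site in the first DNA methylation set, the risk value being determined from the correlation coefficient obtained for that methylation site in the regression analysis and the degree of methylation of that site.
In some embodiments, the at least one biological indicator further comprises the somatic mutations of the patient's genes, and determining the correlation between the somatic mutations and the clinical features comprises: determining the somatic mutations carried by the genes in the second gene set, and determining the signaling pathways to which the genes having those somatic mutations belong.
In some embodiments, the at least one biological indicator comprises the microRNAs in the patient, and determining the correlation between the microRNAs and the clinical features comprises: determining the microRNAs that regulate genes in the second gene set; determining the correlation between the expression level of each microRNA in the patient and the expression level of the genes it regulates; and identifying microRNAs whose correlation is above a fifth threshold as a first microRNA set correlated with the clinical features.
In some embodiments, determining the correlation between the biological indicators and the clinical features in the device or method comprises: performing an ordinal logistic regression analysis to determine the weight of the effect of each of the following biological indicators on the clinical features: the expression level of the genes in the second gene set, the copy number change of the genes in the second gene set, and the risk value of the DNA methylation sites in the first DNA methylation set.
In some embodiments, the device or method separately determines the respective weights of the protective effector gene expression level and the risk effector gene expression level in the second gene set.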
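In another aspect, this application provides a computer-readable storage medium storing a computer program, wherein the computer program causes a computer to execute the identification method described in this application.
In another aspect, this application provides a device for determining tumor progression in a subject, the device comprising: a) an analysis module capable of determining the expression level of one or more genes shown in Table 1 in the subject or in a biological sample derived from the subject; and b) a determination module capable of determining the progression of the tumor in the subject according to the expression level determined in a).
In another aspect, this application provides a device for determining tumor progression in a subject, the device comprising a computer for determining tumor progression in the subject, the computer being programmed to perform the following steps: a) determining the expression level of one or more genes shown in Table 1 in the subject or in a biological sample derived from the subject; and b) determining the progression of the tumor in the subject according to the expression level determined in a).
In another aspect, this application provides a method for determining tumor progression in a subject, the method comprising: a) determining the expression level of one or more genes shown in Table 1 in the subject or in a biological sample derived from the subject; and b) determining the progression of the tumor in the subject according to the expression level determined in a).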
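In some embodiments, the tumor progression comprises the stage of the tumor and/or the survival rate of the subject.
In some embodiments, the stage of the tumor is selected from: tumor stage I, tumor stage II, tumor stage III and tumor stage IV.
In some embodiments, the tumor comprises bladder cancer. In some embodiments, the bladder cancer comprises bladder urothelial carcinoma (BLCA).
In some embodiments, the one or more genes include at least one or more of the protective effector genes shown in Table 2.
In some embodiments, the one or more genes include at least one or more of the risk effector genes shown in Table 3.
In some embodiments, the one or more genes include at least one or more of the genes shown in Table 4. In some embodiments, the one or more genes include at least one or more of the genes shown in Table 5.
In some embodiments, the device or method further comprises a step or module for determining the copy number change of the one or more genes.
In some embodiments, the device or method further comprises a step or module for determining the DNA methylation risk value of one or more of the genes shown in Table 8.
In some embodiments, the device or method further comprises a step or module for determining the age of the subject.
In some embodiments, determining the expression level of one or more genes shown in Table 1 in the subject or in a biological sample derived from the subject comprises, in the device or method: determining the average expression level of those of the one or more genes that are shown in Table 2; and determining the average expression level of those of the one or more genes that are shown in Table 3.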
In some embodiments, the device or method determines the progression of the tumor in the subject according to Formula I:
Figure PCTCN2019082574-appb-000001
Figure PCTCN2019082574-appb-000002
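wherein Intercept = 0.9609 when j = tumor stage III, and Intercept = -0.6617 when j = tumor stage I/II; a is the average expression level of those of the one or more genes that are shown in Table 2; b is the average expression level of those of the one or more genes that are shown in Table 3; c is the copy number change of the one or more genes; d is the DNA methylation risk value of those of the one or more genes that are shown in Table 8; e is the age of the subject; and f is the sex of the subject, where male is 0 and female is 1.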
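Because Formula I itself is only available as an image, the following Python sketch shows how a cumulative-logit (proportional odds) model with these two intercepts would turn the predictors a to f into stage probabilities. It assumes the form logit P(stage ≤ j) = Intercept_j + Σ weight × predictor, which may differ from Formula I in sign convention; the two intercepts are taken from the text, while every weight and predictor value below is hypothetical.

```python
from math import exp

# Intercepts for the two cumulative cut points, as given in the text.
INTERCEPTS = {"I/II": -0.6617, "III": 0.9609}


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + exp(-z))


def stage_probabilities(predictors, weights):
    """Stage probabilities under an assumed cumulative-logit model."""
    eta = sum(weights[name] * predictors[name] for name in weights)
    p_up_to_stage_2 = sigmoid(INTERCEPTS["I/II"] + eta)  # P(stage <= I/II)
    p_up_to_stage_3 = sigmoid(INTERCEPTS["III"] + eta)   # P(stage <= III)
    return {
        "I/II": p_up_to_stage_2,
        "III": p_up_to_stage_3 - p_up_to_stage_2,
        "IV": 1.0 - p_up_to_stage_3,
    }


# Hypothetical weights and predictor values for a to f (not from the patent).
weights = {"a": 0.4, "b": -0.5, "c": -0.3, "d": -0.6, "e": -0.01, "f": 0.1}
subject = {"a": 0.2, "b": 0.5, "c": 0.1, "d": 0.3, "e": 65.0, "f": 1.0}
probabilities = stage_probabilities(subject, weights)

# With all predictors at zero, only the intercepts determine the probabilities.
baseline = stage_probabilities({name: 0.0 for name in weights}, weights)
```

Because the stage III intercept is larger than the stage I/II intercept, the three probabilities are always non-negative and sum to 1.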
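In another aspect, this application provides a computer-readable storage medium storing a computer program, wherein the computer program causes a computer to execute the determination method described in this application.
In another aspect, this application provides a method of treating a tumor in a subject, the method comprising: determining the progression of the tumor in the subject according to the determination method described in this application; and administering an effective amount of treatment to the subject according to that progression.
In another aspect, this application provides a device for treating a tumor in a subject, the device comprising: a) an analysis module capable of determining the expression level of one or more genes shown in Table 1 in the subject or in a biological sample derived from the subject; b) a determination module capable of determining the progression of the tumor in the subject according to the expression level determined in a); and c) a treatment module capable of administering an effective amount of treatment to the subject according to the progression determined in b).
Those skilled in the art will readily perceive other aspects and advantages of the present disclosure from the detailed description below, which shows and describes only exemplary embodiments of the disclosure. As those skilled in the art will recognize, the disclosure enables them to modify the specific embodiments disclosed without departing from the spirit and scope of the invention to which this application relates. Accordingly, the drawings and the description in this specification are merely exemplary and not restrictive.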
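Brief Description of the Drawings
The specific features of the invention to which this application relates are set out in the appended claims. The features and advantages of the invention can be better understood by reference to the exemplary embodiments described in detail below and to the drawings. The drawings are briefly described as follows:
Figure 1 is a schematic workflow of the identification method and device of this application.
Figures 2A-2D show Kaplan-Meier curves for the expression of APOL2, BCL2L14, CSAD and ORMDL1 in two different groups of BLCA patients.
Figures 3A-3B show the gene ontology (GO) enrichment analysis of the protective effector genes and risk effector genes among the genes important for the survival of BLCA patients.
Figures 4A-4C show the dynamic changes in the correlations among key genes in BLCA patients at different cancer stages.
Figures 5A-5D show the functional modules of the gene co-expression network detected by the WGCNA algorithm.
Figures 6A-6E show the analysis of copy number variation (CNV) at different stages of bladder cancer.
Figures 7A-7B show example results of the DNA methylation analysis.
Figures 8A-8D show the cell signaling pathways significantly enriched for mutated genes in BLCA samples.
Figures 9A-9E show the somatic mutation analysis at different stages of bladder cancer.
Figures 10A-10C show the evolution of the microRNA regulatory network across the stages of bladder cancer.
Figure 11 shows the forest plot of the ordinal logistic regression in the integrated analysis.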
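Detailed Description
The embodiments of the invention of this application are described below by way of specific examples; those skilled in the art can readily appreciate other advantages and effects of the invention from the content disclosed in this specification.
In one aspect, this application provides a device for identifying a biological indicator capable of evaluating tumor progression, the device comprising: 1) a clinical feature module capable of providing the clinical features of patients having the tumor, the clinical features including the tumor stage of the patient and/or the survival time of the patient; 2) a biological indicator module capable of providing at least one biological indicator derived from the patients; 3) a correlation determination module capable of determining the correlation between the at least one biological indicator of each patient and the clinical features of the corresponding patient; and 4) an identification module capable of identifying a biological indicator determined in module 3) to be correlated with the clinical features as being capable of evaluating the progression of the tumor.
In another aspect, this application provides a device for identifying a biological indicator capable of evaluating tumor progression, the device comprising a computer for identifying the biological indicator, the computer being programmed to perform the following steps: 1) providing the clinical features of patients having the tumor, the clinical features including the tumor stage of the patient and/or the survival time of the patient; 2) providing at least one biological indicator derived from the patients; 3) determining the correlation between the at least one biological indicator of each patient and the clinical features of the corresponding patient; and 4) identifying a biological indicator determined in 3) to be correlated with the clinical features as being capable of evaluating the progression of the tumor.
In another aspect, this application provides a method for identifying a biological indicator capable of evaluating tumor progression, the method comprising: 1) providing the clinical features of patients having the tumor, the clinical features including the tumor stage of the patient and/or the survival time of the patient; 2) providing at least one biological indicator derived from the patients; 3) determining the correlation between the at least one biological indicator of each patient and the clinical features of the corresponding patient; and 4) identifying a biological indicator determined in 3) to be correlated with the clinical features as being capable of evaluating the progression of the tumor.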
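In this application, the term "patient" generally refers to an individual exhibiting signs of a disease; such signs may be symptoms of the disease, or a harmful physiological state that cannot be altered in a preventive setting. The individual may be male and/or female, and is typically a human or non-human animal, including but not limited to humans, dogs, cats, horses, sheep, goats, pigs, cattle, rabbits, rats, mice, monkeys and the like. In some embodiments, the patient is a human patient.
In this application, the term "tumor" generally refers to uncontrolled proliferation of some of the body's cells resulting from abnormal cellular changes, which often aggregate into a mass. Tumors can be divided into benign tumors and malignant tumors. In a malignant tumor, the proliferating cells aggregate into a mass and spread to other sites. The tumor may be selected from the following group: nasopharyngeal cancer, lip cancer, colorectal cancer, gallbladder cancer, lung cancer, liver cancer, cervical cancer, bone cancer, laryngeal cancer, melanoma, thyroid cancer, oropharyngeal cancer, brain tumor, bladder cancer, skin cancer, prostate cancer, breast cancer, esophageal cancer, glioma, tongue cancer, kidney cancer, adrenocortical carcinoma, gastric cancer, hemangioma, pancreatic cancer, vaginal cancer, uterine cancer and lipoma. For example, the tumor may be bladder cancer, such as bladder urothelial carcinoma (BLCA).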
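Clinical features
In this application, the term "clinical feature module" generally refers to a functional unit capable of providing the clinical features of patients having the tumor. For example, the clinical feature module may include an information input and/or extraction unit capable of receiving and/or providing the clinical features, including the patient's tumor stage and/or the patient's survival time.
In this application, the term "clinical feature" generally refers to one or more indicators and/or parameters reflecting the clinical characteristics of a patient's disease, for example the patient's tumor stage and/or the patient's survival time.
In this application, the clinical feature module may include reagents, equipment and/or devices capable of obtaining the patient's tumor stage and/or survival time. For example, the clinical feature module may include reagents, equipment and/or devices for detecting tumor size, degree of invasion and metastasis (e.g., magnetic resonance imaging, CT, gastrointestinal endoscopy). As a further example, the clinical feature module may include equipment and/or devices for monitoring patient survival time (e.g., reagents, equipment and/or devices for detecting tumor markers). The tumor marker may be selected from the following group: serum carcinoembryonic antigen (CEA), alpha-fetoprotein (AFP), prostate-specific antigen (PSA) and human chorionic gonadotropin (HCG).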
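In this application, the term "tumor staging" generally refers to a histopathological classification method that evaluates the progression of a tumor from the number and location of tumors in the patient's body. Tumor staging describes the severity and extent of involvement of a malignant tumor according to the primary tumor and its degree of spread within the individual (e.g., according to the TNM classification proposed by the WHO). Staging helps the physician draw up an appropriate treatment plan and understand the prognosis, while avoiding overtreatment or undertreatment. Tumors are generally staged according to the TNM classification proposed by the World Health Organization (WHO). The letters of the TNM classification have the following meanings. T: the extent and size of the primary tumor, the extent of invasion, the presence of metastasis, and the depth of invasion, graded from T0 to T4 (five grades), a larger number indicating more advanced cancer; the classification differs according to the organ in which the cancer arises. N: the spread to lymph nodes, graded from N0 to N3 (four grades), a larger number indicating more advanced cancer. M: the presence of metastasis, where M0 indicates no metastasis and M1 indicates distant metastasis. Clinically, the T, N and M results are combined to determine the tumor stage. For example, the tumor stage may include tumor stage I, tumor stage II, tumor stage III and tumor stage IV.
In this application, the term "tumor stage I" generally refers to the early stage of a tumor. The term "tumor stage II" generally refers to the mild stage of a tumor. The term "tumor stage III" generally refers to the intermediate stage of a tumor. The term "tumor stage IV" generally refers to the fully developed stage of a tumor.
In this application, the term "survival time" refers to the overall survival time of a tumor patient after treatment. The survival time may be correlated with the tumor stage.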
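In this application, the term "bladder cancer" generally refers to any malignant tumor arising from the bladder. Bladder cancer may include bladder urothelial carcinoma (BLCA), which can be divided into non-muscle-invasive urothelial carcinoma and muscle-invasive urothelial carcinoma. The etiology of bladder cancer is complex, involving both intrinsic genetic factors and extrinsic environmental factors. The two best-established risk factors are smoking and occupational exposure to aromatic amines. Clinically, hematuria is the initial presentation in more than about 90% of bladder cancer patients, typically as painless, intermittent, gross hematuria throughout urination, and sometimes as microscopic hematuria. Hematuria may occur only once or last from one day to several days, and may lessen or stop on its own. About 10% of bladder cancer patients first present with bladder irritation symptoms: urinary frequency, urgency, painful urination and difficulty urinating. These symptoms are mostly caused by tumor necrosis, ulceration, large or numerous intravesical tumors, or diffuse tumor infiltration of the bladder wall, which reduce bladder capacity, or by concurrent infection.
In this application, bladder cancer can be divided into the following stages: stage 0 bladder cancer (noninvasive papillary carcinoma and carcinoma in situ), stage I bladder cancer, stage II and III bladder cancer, and stage IV bladder cancer. The treatments corresponding to the different stages of bladder cancer include the following (see the description by the NIH National Cancer Institute).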
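For stage 0 bladder cancer, the main treatments include:
● Transurethral resection with fulguration, which may be followed by:
  intravesical chemotherapy given immediately after surgery;
  intravesical chemotherapy given immediately after surgery, then regular intravesical BCG or intravesical chemotherapy;
● Partial cystectomy;
● Radical cystectomy;
● A clinical trial of a new therapy.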
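For stage I bladder cancer, the main treatments include:
● Transurethral resection with fulguration, which may be followed by:
  intravesical chemotherapy given immediately after surgery;
  intravesical chemotherapy given immediately after surgery, then regular intravesical BCG or intravesical chemotherapy;
● Partial cystectomy;
● Radical cystectomy;
● A clinical trial of a new therapy.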
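For stage II and III bladder cancer, the main treatments include:
● Radical cystectomy;
● Combination chemotherapy followed by radical cystectomy; urinary diversion may also be performed;
● External radiation therapy, or external radiation therapy plus chemotherapy;
● Partial cystectomy, or partial cystectomy plus chemotherapy;
● Transurethral resection with fulguration;
● A clinical trial of a new therapy.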
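For stage IV bladder cancer, the main treatments include:
● Chemotherapy;
● Radical cystectomy alone or followed by chemotherapy;
● External radiation therapy, or external radiation therapy plus chemotherapy;
● Urinary diversion or cystectomy as palliative therapy.
Treatment of stage IV bladder cancer that has spread to other parts of the body (such as the lung, bone or liver) may include the following:
● Chemotherapy, or chemotherapy plus local treatment (surgery or radiation therapy);
● Immunotherapy;
● External radiation therapy as palliative therapy;
● Urinary diversion or cystectomy as palliative therapy;
● A clinical trial of new anticancer drugs.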
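Biological indicators
In this application, the term "biological indicator module" generally refers to a functional unit capable of providing at least one biological indicator derived from the patient. For example, the biological indicator module may provide indicators and/or features that reflect, at the molecular level, the patient's tumor stage and/or the patient's survival time.
For example, the biological indicator module may include a sample unit for obtaining a patient sample (e.g., peripheral blood). For example, the biological indicator module may include a sample device for obtaining a patient sample (e.g., a device for collecting the sample, such as a blood lancet, and/or a device for holding the sample, such as a test tube). For example, the biological indicator module may include a sample processing device for obtaining the patient's DNA by processing the patient sample (e.g., a kit for extracting whole-blood DNA, test tubes and related devices). As a further example, the biological indicator module may also include a separation unit capable of separating the patient sample. For example, the biological indicator module may include reagents for separating cells (e.g., proteinase K) and devices for separating cells (e.g., a centrifuge).
For example, the biological indicator module may include a sample processing unit for obtaining the biological indicators. For example, the sample processing unit may include reagents and equipment for detecting the patient's gene expression level, reagents and equipment for detecting the copy number change of the patient's genes, reagents and equipment for detecting the DNA methylation of the patient's genes, reagents and equipment for detecting the somatic mutations of the patient's genes, and reagents and equipment for detecting the microRNAs in the patient. As a further example, the sample processing unit may include a q-RT PCR kit, an MLPA (multiplex ligation-dependent probe amplification) kit, a methylation profiling kit, a TruSeq Rapid Exome Library kit and a microarray analysis kit.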
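In this application, the term "biological indicator" generally refers to one or more classes of indicators selected from the following group: class 1: the expression level of the patient's genes; class 2: the copy number change of the patient's genes; class 3: the DNA methylation of the patient's genes; class 4: the somatic mutations of the patient's genes; and class 5: the microRNAs in the patient.
For example, the expression level of the patient's genes may be up-regulated, e.g., by about 10% or more, 20% or more, 30% or more, 40% or more, 50% or more, 60% or more, 70% or more, 80% or more, 90% or more, 100% or more, 120% or more, 140% or more, 160% or more, 180% or more, or 200% or more relative to the expression level in normal cells; or it may be down-regulated, e.g., to about 10% or less, 20% or less, 30% or less, 40% or less, 50% or less, 60% or less, 70% or less, 80% or less, 90% or less, 92% or less, 94% or less, 96% or less, 98% or less, or 99% or less of the expression level in normal cells. For example, the copy number of the patient's genes may be increased, e.g., by about 0.1-fold or more, about 0.5-fold or more, about 1-fold or more, about 2-fold or more, about 3-fold or more, about 4-fold or more, about 5-fold or more, about 6-fold or more, about 7-fold or more, about 8-fold or more, about 9-fold or more, or about 10-fold or more relative to normal cells; or it may be decreased, e.g., by about 0.1-fold or more, about 0.5-fold or more, about 1-fold or more, about 2-fold or more, about 3-fold or more, about 4-fold or more, about 5-fold or more, about 6-fold or more, about 7-fold or more, about 8-fold or more, about 9-fold or more, or about 10-fold or more relative to normal cells. For example, the DNA methylation level of the patient's genes may be increased, e.g., by about 0.1-fold or more, about 0.5-fold or more, about 1-fold or more, about 2-fold or more, about 3-fold or more, about 4-fold or more, about 5-fold or more, about 6-fold or more, about 7-fold or more, about 8-fold or more, about 9-fold or more, or about 10-fold or more relative to the DNA methylation level in normal cells; or it may be decreased, e.g., by about 0.1-fold or more, about 0.5-fold or more, about 1-fold or more, about 2-fold or more, about 3-fold or more, about 4-fold or more, about 5-fold or more, about 6-fold or more, about 7-fold or more, about 8-fold or more, about 9-fold or more, or about 10-fold or more relative to the DNA methylation level in normal cells.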
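In this application, the term "expression level of a gene" generally refers to the level at which the information encoded in a gene is converted into a gene product (e.g., RNA, protein). Expressed genes include genes transcribed into RNA that is subsequently translated into protein (e.g., mRNA) and genes transcribed into non-coding functional RNA that is not translated into protein (e.g., tRNA, rRNA, ribozymes). As used herein, "gene expression level" or "expression level" refers to the level (e.g., amount) of one or more products (e.g., RNA, protein) encoded by a given gene in a sample or reference standard.
In this application, the term "copy number change of a gene" generally refers to CNV (copy number variation), the phenomenon in which sections of the genome are repeated and the number of repeats in the genome varies between individuals in a population (see McCarroll SA, et al. (2007), "Copy-number variation and association studies of human diseases", Nature Genetics 39: 37-42). A CNV is a duplication or deletion event that affects a considerable number of base pairs, and occurs mainly in the human genome. Copy number variations can generally be divided into two main categories: short repeats and long repeats. Short repeats mainly include dinucleotide repeats (two repeating nucleotides, e.g., A-C-A-C-A-C...) and trinucleotide repeats. Long repeats include repeats of entire genes. Research data on CNVs not only provide additional evidence for evolution and natural selection, but can also be used to develop treatments for various genetic diseases.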
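In this application, the term "DNA methylation of a gene" generally refers to the process by which methyl groups are added to the DNA molecule (mainly to cytosine and adenine). Methylation can change the activity of a DNA segment without changing its sequence. When located in a gene promoter, DNA methylation typically acts to repress gene transcription. DNA methylation is essential for normal development and is associated with many key processes, including genomic imprinting, X-chromosome inactivation, repression of transposable elements, aging and carcinogenesis. Methylation of cytosine to form 5-methylcytosine occurs at the same 5 position of the pyrimidine ring where the methyl group of the DNA base thymine is located; the same position distinguishes thymine from the analogous RNA base uracil, which has no methyl group. Spontaneous deamination of 5-methylcytosine converts it to thymine, which results in a T-G mismatch. Repair mechanisms then either correct it back to the original C-G pair or replace the G with A, turning the original C-G pair into a T-A pair and thereby effectively changing a base and introducing a mutation. In this application, DNA methylation of a gene may give rise to DNA methylation markers: genomic regions with a methylation pattern specific to a particular biological state (e.g., tissue, cell type, individual), which are regarded as possible functional regions involved in the regulation of gene transcription.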
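In this application, the term "somatic mutation of a gene" generally refers to a mutation occurring in cells other than the germline, also called an acquired mutation. Somatic mutations do not cause heritable changes in offspring, but can alter the genetic makeup of some cells in the current generation. The great majority of somatic mutations have no phenotypic effect. Sporadic forms of malignant tumors can be caused by somatic mutations. Studies have shown that malignant transformation of somatic cells does not necessarily involve changes in gene structure: when substances other than genes, such as proteins, RNA or biological membranes, are altered, and these alterations cause genes related to growth and differentiation to be abnormally switched off or on, cells can also be transformed into cancer cells. This view is called the epigenetic regulation theory.
In this application, the term "microRNA" (miRNA for short) generally refers to a non-coding RNA about 22 nt in length, found widely in organisms ranging from viruses to humans. These small RNAs can bind to mRNA and block the expression of protein-coding genes, preventing their translation into protein. A mammalian miRNA can have many distinct targets. For example, analysis of miRNAs that are highly conserved in vertebrates shows that each has, on average, about 400 conserved targets; likewise, a single miRNA species may repress the production of hundreds of proteins. Studies have shown that chronic lymphocytic leukemia and B-cell malignancies may be related to miRNAs.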
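Correlation
In this application, the term "correlation determination module" generally refers to a functional unit capable of determining the correlation between the at least one biological indicator of each patient and the clinical features of the corresponding patient.
In this application, the term "correlation" generally means that the at least one biological indicator of a patient shows a statistically significant association with the clinical features of the corresponding patient. For example, a gene may be expressed at a higher or lower level and be correlated with the state or outcome of a tumor (e.g., bladder cancer).
For example, the correlation determination module may include a sample determination unit, which can determine the correlation between the at least one biological indicator of each patient and the clinical features of the corresponding patient. For example, the correlation determination module may include a unit that determines the correlation between the expression level of a gene and the clinical features by performing a univariate regression analysis against the clinical features with the expression level of the gene as the single variable (e.g., the unit may include hardware, programs and/or software capable of executing the relevant instructions). For example, the correlation determination module may include a unit that determines the correlation between the expression level of a gene and the clinical features by performing a multivariate regression analysis against the clinical features with the expression level of the gene, the age of the patient, the sex of the patient, and/or the tumor stage of the patient as the multiple variables (e.g., the unit may include hardware, programs and/or software capable of executing the relevant instructions). As a further example, the correlation determination module may also include a unit that determines the correlation between the expression level of a gene and the clinical features from the correlation coefficient obtained for each gene in the regression analysis (e.g., the unit may include hardware, programs and/or software capable of executing the relevant instructions).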
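As a further example, the correlation determination module may also include a unit that determines the tumor stage-specific gene co-expression from the expression level of the patient's genes at each tumor stage, divides the genes into two or more groups according to that co-expression, and determines for each group the correlation between its gene expression level and the clinical features (e.g., the unit may include hardware, programs and/or software capable of executing the relevant instructions). For example, the unit may use the WGCNA (Weighted Gene Co-Expression Network Analysis) algorithm to implement at least part of this function.
As a further example, the correlation determination module may also include a unit that determines the correlation between the gene copy number change and the clinical features from the frequency of copy number change of the patient's genes at each tumor stage (e.g., the unit may include hardware, programs and/or software capable of executing the relevant instructions).
As a further example, the correlation determination module may also include a unit that determines the correlation between DNA methylation and the clinical features from the DNA methylation identified by a regression analysis performed against the clinical features with the degree of DNA methylation as the variable (e.g., the unit may include hardware, programs and/or software capable of executing the relevant instructions). As a further example, the correlation determination module may also include a unit that determines the correlation between DNA methylation and the clinical features from the risk value of each DNA methylation site identified as correlated with the clinical features, the risk value being determined from the correlation coefficient obtained for the methylation site in the regression analysis and the degree of methylation of that site (e.g., the unit may include hardware, programs and/or software capable of executing the relevant instructions).
As a further example, the correlation determination module may also include a unit that determines the correlation between the expression level of the genes having the somatic mutations and the clinical features from the signaling pathways to which those genes belong (e.g., the unit may include hardware, programs and/or software capable of executing the relevant instructions).
As a further example, the correlation determination module may also include a unit that determines, from the expression level of the genes regulated by a microRNA, the correlation between that expression level and the clinical features (e.g., the unit may include hardware, programs and/or software capable of executing the relevant instructions).
As a further example, the correlation determination module may also include a unit that determines the correlation between the biological indicators and the clinical features by determining the weight of the effect of each of two or more classes of biological indicators on the clinical features (e.g., the unit may include hardware, programs and/or software capable of executing the relevant instructions). For example, the unit may determine the weights by performing an ordinal logistic regression analysis.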
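In this application, the at least one biological indicator may comprise the expression level of the patient's genes, and determining the correlation between the expression level of the genes and the clinical features may comprise: performing a univariate regression analysis against the clinical features with the expression level of the gene as the single variable, and identifying genes whose p-value in the regression analysis is less than or equal to a first threshold and whose FDR value is less than or equal to a second threshold as correlated with the clinical features.
In some embodiments, the at least one biological indicator comprises the expression level of the patient's genes, and determining the correlation between the expression level of the genes and the clinical features comprises: a) performing a univariate regression analysis against the clinical features with the expression level of the gene as the single variable, and identifying genes whose p-value in the regression analysis is less than or equal to a first threshold and whose FDR value is less than or equal to a second threshold as a first gene set correlated with the clinical features.
In this application, the term "first threshold" generally refers to the cutoff for statistical significance (i.e., the p-value cutoff) in the univariate regression analysis performed against the clinical features with the expression level of the gene as the single variable. For example, the first threshold may be 0.09 or less. For example, the first threshold may be 0.08 or less, 0.07 or less, 0.06 or less, 0.05 or less, 0.045 or less, 0.04 or less, 0.03 or less, 0.02 or less, 0.01 or less, or 0.005 or less.
In this application, the term "second threshold" generally refers to the cutoff that the false discovery rate (FDR) must not exceed in the univariate regression analysis performed against the clinical features with the expression level of the gene as the single variable. In this application, the second threshold may be 0.5 or less. For example, the second threshold may be 0.4 or less, 0.3 or less, 0.2 or less, 0.1 or less, or 0.05 or less.
In this application, if the expression level of a gene satisfies both the first threshold and the second threshold, the gene may be assigned to the first gene set correlated with the clinical features. In that case, the expression level of the gene may be correlated with the clinical features, and/or the gene may serve as one of the biological indicators for evaluating tumor progression.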
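In this application, the at least one biological indicator may comprise the expression level of the patient's genes, and determining the correlation between the expression level of the genes and the clinical features comprises: performing a multivariate regression analysis against the clinical features, and identifying genes whose FDR value in the regression analysis is less than or equal to a third threshold as correlated with the clinical features, wherein the multiple variables include the gene expression level in the patient, the age of the patient, the sex of the patient, and/or the tumor stage of the patient.
In some embodiments, determining the correlation between the expression level of the genes and the clinical features further comprises: b) performing a multivariate regression analysis against the clinical features, and identifying genes whose FDR value in the regression analysis is less than or equal to a third threshold as a second gene set correlated with the clinical features, wherein the multiple variables include the expression level of each gene in the first gene set, the age of the patient, the sex of the patient, and the tumor stage of the patient.
In this application, the term "third threshold" generally refers to the cutoff that the false discovery rate (FDR) must not exceed in the multivariate regression analysis performed against the clinical features. The multiple variables may be selected from the following group: the gene expression level in the patient, the age of the patient, the sex of the patient, and/or the tumor stage of the patient. In this application, the third threshold may be 0.2 or less. For example, the third threshold may be 0.2 or less, 0.15 or less, 0.1 or less, or 0.05 or less.
In this application, if the expression level of a gene satisfies the third threshold, the gene may be assigned to the second gene set correlated with the clinical features. For example, the genes of the second gene set may be selected from the genes shown in Table 1. For example, the second gene set may contain 1078 genes.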
在本申请中,所述至少一种生物学指标可以包括所述患者基因的表达水平,且确定所述基因的表达水平与所述临床特征之间的相关性还包括:根据所述多变量回归分析中获得的针对各基因的相关性系数数值,将所述基因分为保护性效应基因和危险性效应基因,其中所述保护性效应基因的相关性系数数值可以为负,所述危险性效应基因的相关性系数数值可以为正。
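In the present application, determining the correlation between the expression level of the gene and the clinical feature may further include: c) dividing the genes into protective-effect genes and risk-effect genes according to the correlation coefficient value obtained for each gene in the multivariate regression analysis, wherein the correlation coefficient value of a protective-effect gene is negative and that of a risk-effect gene is positive.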
在本申请中,术语“保护性效应基因”通常是指其表达水平与患者的生存期正相关,或者其表达水平与肿瘤的进展程度(例如肿瘤分期阶段的进展)呈负相关的基因。例如,在本申请的所述多变量回归分析中,所述保护性效应基因的表达水平与所述临床特征(例如,肿瘤分期阶段)之间的相关性系数数值可以为负。在本申请中,所述保护性效应基因可以选自表2中所示的基因。在本申请中,所述保护性效应基因的数目可以为356个。所述保护性效应基因的表达水平可以在所述肿瘤的进展过程中被下调。例如,所述保护性效应基因可能与所述肿瘤的分期阶段负相关。
在本申请中,术语“危险性效应基因”通常是指其表达水平与患者的生存期负相关,或者其表达水平与肿瘤的进展程度(例如肿瘤分期阶段的进展)呈正相关的基因。例如,在本申请的所述多变量回归分析中,所述危险性效应基因的表达水平与所述临床特征(例如,肿瘤分期阶段)之间的相关性系数数值可以为正。在本申请中,所述危险性效应基因可以选自表3中所示的基因。在本申请中,所述危险性效应基因的数目可以为722个。所述危险性效应基因的表达水平可以在所述肿瘤的进展过程中被上调。例如,所述危险性效应基因可能与所述肿瘤的分期阶段正相关。
在本申请中,所述至少一种生物学指标可以包括所述患者基因的表达水平,且确定所述基因的表达水平与所述临床特征之间的相关性还包括:确定所述患者的基因在各肿瘤分期阶段的表达水平,据此确定对肿瘤分期具有特异性的基因共表达情况,根据所述基因共表达情况将所述基因分为2组或更多组,以及分别确定每组的基因表达水平与所述临床特征间的相关性。例如,可通过鉴别在某肿瘤分期阶段中各基因的共表达关系,和/或鉴别该共表达关系在各肿瘤分期阶段的变化情况来将所述基因分为2组或更多组,其中每一组中的基因可呈现肿瘤分期阶段特异性共表达谱。随后,可分析每一组中的基因与所述临床特征(例如,患者的生存期和/或肿瘤的分期阶段)之间的相关性(例如,通过本申请所述的单变量和/或多变量回归分析),从而可鉴别出具有所需相关性的基因组。
在本申请中,确定所述基因的表达水平与所述临床特征之间的相关性还可以包括:确定所述第二基因集合中的各基因在各肿瘤分期阶段的表达水平,据此确定对肿瘤分期具有特异性的基因共表达情况,根据所述基因共表达情况将所述第二基因集合中的基因分为2组或更多组,以及分别确定每组的基因表达水平与所述临床特征间的相关性。例如,可通过鉴别在某肿瘤分期阶段中各基因的共表达关系,和/或鉴别该共表达关系在各肿瘤分期阶段的变化情况来将所述第二基因集合中的基因分为2组或更多组,其中每一组中的基因可呈现肿瘤分期 阶段特异性共表达谱。随后,可分析每一组中的基因与所述临床特征(例如,患者的生存期和/或肿瘤的分期阶段)之间的相关性(例如,通过本申请所述的单变量和/或多变量回归分析),从而可鉴别出具有所需相关性的基因组。
在本申请中,术语“基因共表达”通常是指所述第二基因集合中的多个基因在所述肿瘤分期的特定阶段能够体现出类似的表达水平趋势(例如,表达水平均在某一肿瘤分期趋势相同或类似,如在肿瘤I期上调),从而能够根据所述基因共表达的现象,将所述第二基因集合中的基因分为2组或更多组(例如,2组以上、3组以上、4组以上、5组以上、6组以上、7组以上、8组以上、9组以上、10组以上或更多),使得每组的基因表达水平与所述临床特征间具备所述相关性。例如,所述基因共表达可以通过使用WGCNA算法进行确定。
在本申请中,所述至少一种生物学指标可以包括所述患者基因的拷贝数变化,且确定所述基因拷贝数变化与所述临床特征之间的相关性包括:比较所述患者的基因在各肿瘤分期阶段的拷贝数变化频率。
在某些实施方式中,所述至少一种生物学指标还包括所述患者基因的拷贝数变化,且确定所述基因拷贝数变化与所述临床特征之间的相关性包括:比较所述第二基因集合中的基因在各肿瘤分期阶段的拷贝数变化频率。
在本申请中,所述至少一种生物学指标可以包括所述患者基因的DNA甲基化,且确定所述DNA甲基化与所述临床特征之间的相关性包括:以所述DNA甲基化的程度作为变量而相对于所述临床特征进行回归分析,并将所述回归分析中p值小于或等于第四阈值的DNA甲基化鉴别为与所述临床特征相关。
在某些实施方式中,所述至少一种生物学指标还包括所述患者基因的DNA甲基化,且确定所述DNA甲基化与所述临床特征之间的相关性包括:确定所述第二基因集合中基因的DNA甲基化位点及各所述位点的DNA甲基化程度,以所述DNA甲基化程度作为变量而相对于所述临床特征进行回归分析,并将所述回归分析中p值小于或等于第四阈值的DNA甲基化鉴别为与所述临床特征相关的第一DNA甲基化集合。
在本申请中,术语“第四阈值”通常是指以所述基因的DNA甲基化的程度作为变量而相对于所述临床特征进行的回归分析中的p值小于或等于的阈值(例如,体现统计学显著性的p值截断值)。在本申请中,所述第四阈值可以为0.2以下。例如,所述第四阈值可以为0.15以下,0.1以下,0.05以下,0.01以下,或0.005以下。
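In the present application, if the DNA methylation level of a gene in the second gene set has a p-value less than or equal to the fourth threshold in the regression analysis, that DNA methylation may be identified as belonging to the first DNA methylation set associated with the clinical feature. In the present application, the first DNA methylation set may be selected from the genes shown in Table 8. For example, the first DNA methylation set may comprise DNA methylation events in 23 genes.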
在本申请中,确定所述DNA甲基化与所述临床特征之间的相关性还可以包括:确定被鉴别为与所述临床特征相关的各DNA甲基化位点的风险值,所述风险值基于该甲基化位点在所述回归分析中获得的相关性系数及该甲基化位点的甲基化程度而确定。
在某些实施方式中,确定所述DNA甲基化与所述临床特征之间的相关性还包括:确定所述第一DNA甲基化集合中各DNA甲基化位点的风险值,所述风险值基于该甲基化位点在所述回归分析中获得的相关性系数及该甲基化位点的甲基化程度而确定。例如,某DNA甲基化事件的所述风险值可以为该甲基化位点在所述回归分析中获得的相关性系数与该甲基化位点的甲基化程度数值的线性组合。
在本申请中,所述至少一种生物学指标可以包括所述患者基因的体细胞突变,且确定所述体细胞突变与所述临床特征之间的相关性包括:确定具有所述体细胞突变的基因所属的信号通路,和/或确定具有所述体细胞突变的基因的表达水平与所述临床特征之间的相关性。
在某些实施方式中,所述至少一种生物学指标还包括所述患者基因的体细胞突变,且确定所述体细胞突变与所述临床特征之间的相关性包括:确定所述第二基因集合中的基因所具有的体细胞突变,以及确定具有所述体细胞突变的基因所属的信号通路。
在本申请中,所述信号通路可以包括PI3K/AKT途径、Ras途径、Rap1途径和MAPK途径。在本申请中,所述信号通路可以已经被证实与肿瘤有关。
在本申请中,所述至少一种生物学指标可以包括所述患者中的微小RNA,且确定所述微小RNA与所述临床特征之间的相关性包括:确定所述微小RNA所调控的基因的表达水平与所述临床特征之间的相关性,以及确定所述微小RNA在所述患者中的表达水平与其所调控的基因的表达水平之间的相关性。
在某些实施方式中,所述至少一种生物学指标可以包括所述患者中的微小RNA,且确定所述微小RNA与所述临床特征之间的相关性包括:确定调控所述第二基因集合中的基因的微小RNA,以及确定所述微小RNA在所述患者中的表达水平与其所调控的基因的表达水平之间的相关性,将该相关性高于第五阈值的微小RNA鉴别为与所述临床特征相关的第一微小RNA集合。
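In the present application, the term "fifth threshold" generally refers to the cutoff for determining the statistical significance of the correlation. In the present application, the fifth threshold may be less than -0.1. For example, the fifth threshold may be less than -0.15, less than -0.2, less than -0.25, less than -0.3, less than -0.35, less than -0.4 or less than -0.45. In the present application, if the correlation coefficient is below the fifth threshold, a significant correlation may be considered to exist between the expression level of the gene regulated by the microRNA and the expression level of that microRNA. For example, the microRNA and the gene with which it interacts may be referred to as a regulatory pair (microRNA-gene regulatory pair). The fifth threshold can therefore reflect the degree of coordination between a microRNA and the genes it regulates. In the present application, the fifth threshold may vary with the tumor stage.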
在本申请中,术语“第一微小RNA集合”可以包括所述相关性高于所述第五阈值的微小RNA。在本申请中,所述第一微小RNA集合可以选自表10中所示的微小RNA。
在本申请中,所述至少一种生物学指标可以包括两类或更多类所述生物学指标,且确定所述生物学指标与所述临床特征之间的相关性包括确定各类所述生物学指标对所述临床特征影响的权重。例如,可以通过进行有序逻辑回归分析确定所述权重。
在本申请中,确定所述生物学指标与所述临床特征之间的相关性可以包括:通过进行有序逻辑回归分析分别确定下列生物学指标对所述临床特征影响的权重:所述第二基因集合中基因的表达水平,所述第二基因集合中基因的拷贝数变化,所述第一DNA甲基化集合中DNA甲基化位点的风险值。例如,可以分别确定所述第二基因集合中保护性效应基因表达水平和危险性效应基因表达水平各自的权重。
在本申请中,术语“权重”通常是指某一指标(例如,所述生物学指标)在整体评价(例如,评价肿瘤进展)中的相对重要程度。
另一方面,本申请还提供了存储有计算机程序的计算机可读存储介质,其中所述计算机程序使计算机执行本申请所述的方法。
在本申请中,术语“计算机可读存储介质”通常是指计算机存储器中用于存储某种参数或数据的媒体。计算机存储介质可以包括,例如半导体、磁芯、磁鼓、磁带和激光盘等。
在本申请中,术语“鉴别模块”通常是指能够将在所述相关性判断模块中被判定为与所述临床特征相关的生物学指标鉴别为能够评价所述肿瘤的进展的功能单元。
例如,所述鉴别模块可以包括能够将所述生物学指标鉴别为能够评价所述肿瘤的进展的程序、试剂和/或设备。
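In the present application, the identification of biological indicators capable of evaluating tumor progression can be divided into four stages (as shown in Figure 1). Stage one: using large-scale Cox regression models (that is, univariate and multivariate Cox regression models), 1078 key genes were identified according to the effect of genes on the survival status of tumor patients (for example, bladder cancer patients) obtained from TCGA. The protective or harmful nature of these key genes was then analyzed from their relationship with patient survival and/or tumor stage across the different stages of the tumor (for example, bladder cancer). Stage two: the stage-specific gene co-expression profiles in the different stages of the tumor (for example, bladder cancer) were analyzed, and the 1078 key genes were divided accordingly into multiple subgroups, the genes in each subgroup showing the same or a similar stage-specific co-expression pattern; the correlation between the genes in each subgroup and patient survival and/or tumor stage was then determined, thereby identifying the gene subgroups among the 1078 key genes that correlate most strongly with tumor progression. Stage three: the correlations between tumor (for example, bladder cancer) progression (for example, patient survival and/or tumor stage) and other biological indicators of the patient, such as the copy number variation, DNA methylation, somatic mutations and microRNA regulatory network of the 1078 key genes, were analyzed separately, thereby identifying one or more further biological indicators that reflect this correlation. Stage four: an integrated analysis was performed on the combined correlation between the multiple identified biological indicators and tumor (for example, bladder cancer) progression (for example, patient survival and/or tumor stage). Through the above studies, the present application provides a systematic and rational approach to jointly analyzing patients' biological indicator data and clinical feature data, thereby revealing characteristic indicators of cancer (for example, bladder cancer) progression.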
判断肿瘤进展的装置或方法
另一方面,本申请提供了一种判断受试者中肿瘤进展的装置,所述装置包括:a)分析模块,其能够确定表1中所示的一个或多个基因在所述受试者中或源自所述受试者的生物学样品中的表达水平;以及b)判断模块,其能够根据a)中确定的所述表达水平判断所述受试者中所述肿瘤的进展。
本申请还提供了一种判断受试者中肿瘤进展的装置,所述装置包括用于判断受试者中肿瘤进展的计算机,所述计算机被编程以执行如下步骤:a)确定表1中所示的一个或多个基因在所述受试者中或源自所述受试者的生物学样品中的表达水平;以及b)根据a)中确定的所述表达水平判断所述受试者中所述肿瘤的进展。
另一方面,本申请提供了一种判断受试者中肿瘤进展的方法,所述方法包括:a)确定表1中所示的一个或多个基因在所述受试者中或源自所述受试者的生物学样品中的表达水平;以及b)根据a)中确定的所述表达水平判断所述受试者中所述肿瘤的进展。
在本申请中,术语“分析模块”通常是指能够确定表1中所示的一个或多个基因在所述受试者中或源自所述受试者的生物学样品中的表达水平的功能单元。
例如,所述分析模块可以包括获得受试者样品(例如,外周血)的样品单元。例如,所述分析模块可包括获得受试者样品的样品装置(例如,采血针等获取样品的装置;和/或,试管等承载样品的装置)。例如,所述分析模块可以包括通过处理患者样品获得所述受试者的DNA的样品处理装置(例如,抽提全血DNA的试剂盒、试管和相关装置)。又例如,所述分析模块还可以包括能够分离所述受试者样品的分离单元。例如,所述分析模块可包括分离细胞的试剂(例如,蛋白酶K)和分离细胞的装置(例如,离心机)。
例如,所述分析模块可以包括检测表1中所示的一个或多个基因在所述受试者中或源自所述受试者的生物学样品中的表达水平的试剂和器材。例如,所述分析模块可以包括q-RT PCR试剂盒和q-RT PCR仪。
在本申请中,术语“判断模块”通常是指根据所述分析模块中确定的所述表达水平判断所述受试者中所述肿瘤的进展的功能单元。
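For example, the judgment module may include a sample judgment unit, which can judge the progression of the tumor in the subject according to the expression level determined in the analysis module.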
例如,所述肿瘤进展可以包括所述肿瘤的分期阶段和/或所述受试者的生存率。
例如,所述肿瘤的分期阶段可以选自:肿瘤I期,肿瘤II期,肿瘤III期和肿瘤IV期。
例如,所述肿瘤可以包括膀胱癌。又例如,所述膀胱癌可以包括膀胱尿路上皮癌(BLCA)。
在本申请中,所述一个或多个基因可以至少包括表2中所示的一个或多个保护性效应基因。
在本申请中,所述一个或多个基因可以至少包括表3中所示的一个或多个危险性效应基因。
在本申请中,所述一个或多个基因可以至少包括表4中所示的一个或多个基因。例如,表4中的基因的表达水平可以与所述肿瘤分期的相关性系数数值为负。例如,表4中的基因(例如表4中93%以上、94%以上、95%以上、96%以上、97%以上、98%以上、99%以上或100%的基因)的表达水平可以与膀胱癌分期的相关性系数数值为负。
在本申请中,所述一个或多个基因可以至少包括表5中所示的一个或多个基因。例如,表5中的基因的表达水平可以与所述肿瘤分期的相关性系数数值为正。例如,表5中的基因的表达水平可以与膀胱癌分期的相关性系数数值为正。
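In the present application, the device or method may further include a step or module for determining the copy number change of the one or more genes. For example, determining the copy number change may include analyzing the copy number change data in Broad GDAC Firehose, where the data are derived from samples of patients at different bladder cancer stages.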
在本申请中,所述的装置或方法还可以包括:确定表8中所示的一个或多个基因的DNA甲基化风险值的步骤或模块。
在本申请中,所述风险值通常基于该甲基化位点在所述回归分析中获得的相关性系数及该甲基化位点的甲基化程度而确定。例如,所述风险值可以由包括以下步骤的方法来确定:其可被定义为甲基化水平(即β值)与正则化Cox回归中23个DNA甲基化基因(例如本申请所述的第一DNA甲基化集合中的基因,或表8中所示的基因)的相应系数的线性组合;然 后根据该风险值的中位数对所有患者进行风险评分,进而将其分为高风险组和低风险组,并且对这两组患者进行Kaplan-Meier分析和log-rank检验。
在本申请中,所述的装置或方法还包括:确定或提供所述受试者的年龄的步骤或模块。例如,所述步骤或模块可以包括或执行以下的步骤:询问患者的年龄、调查患者的就医记录或测定骨龄等。
在本申请中,所述装置或方法中确定表1中所示的一个或多个基因在所述受试者中或源自所述受试者的生物学样品中的表达水平可以包括:确定所述一个或多个基因中表2所示基因的平均表达水平;以及确定所述一个或多个基因中表3所示基因的平均表达水平。例如,可以根据分别测定的、关于表2和表3中一个或多个(例如,1个以上、2个以上、4个以上、6个以上、8个以上、10个以上、20个以上、50个以上、100个以上、200个以上或500个以上)所述基因的平均表达水平,确定出表1中所示的一个或多个基因在所述受试者中或源自所述受试者的生物学样品中的表达水平。
整合判断
在本申请中,所述装置或方法可以根据式I判断所述受试者中所述肿瘤的进展:
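$$\ln\frac{P(\mathrm{Stage}\le j)}{1-P(\mathrm{Stage}\le j)}=\mathrm{Intercept}+0.0366\,a+0.3386\,b+0.3349\,c+1.2193\,d+0.0084\,e-0.048\,f\qquad(\mathrm{I})$$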
其中,j=肿瘤III期时,Intercept=0.9609;j=肿瘤I期/II期时,Intercept=-0.6617;a为所述一个或多个基因中表2所示基因的平均表达水平;b为所述一个或多个基因中表3所示基因的平均表达水平;c为所述一个或多个基因的拷贝数变化;d为所述一个或多个基因中表8中所示基因的DNA甲基化风险值;e为所述受试者的年龄;且f为所述受试者的性别,其中男性为0,女性为1。
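The evaluation of Formula I can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coefficients and the two intercepts are copied from the text, while the function names and the dictionary keys "III" and "I/II" are hypothetical. The sketch returns the cumulative logit and the cumulative probability P(Stage ≤ j) for a given j; how these map to individual stage probabilities depends on the stage ordering, which this sketch does not assume.

```python
import math

# Coefficients of Formula I as stated in the text (predictors a..f).
COEFFICIENTS = {"a": 0.0366, "b": 0.3386, "c": 0.3349,
                "d": 1.2193, "e": 0.0084, "f": -0.048}

# Intercepts as stated in the text, keyed by the value of j (hypothetical keys).
INTERCEPTS = {"III": 0.9609, "I/II": -0.6617}


def cumulative_logit(j, a, b, c, d, e, f):
    """Right-hand side of Formula I for the given j and predictor values."""
    values = {"a": a, "b": b, "c": c, "d": d, "e": e, "f": f}
    return INTERCEPTS[j] + sum(COEFFICIENTS[k] * values[k] for k in COEFFICIENTS)


def cumulative_probability(j, a, b, c, d, e, f):
    """P(Stage <= j), obtained by inverting the logit on the left-hand side."""
    return 1.0 / (1.0 + math.exp(-cumulative_logit(j, a, b, c, d, e, f)))
```

Here `f` follows the coding given in the text (male = 0, female = 1), and `a` to `e` are the averaged expression levels, copy number change, methylation risk value and age defined above.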
另一方面,本申请提供了一种存储有计算机程序的计算机可读存储介质,其中所述计算机程序可以使计算机执行上述的判断方法。
治疗肿瘤的方法
另一方面,本申请提供了一种治疗受试者中的肿瘤的方法,所述方法可以包括:根据本申请所述的判断方法,判断所述受试者中所述肿瘤的进展;以及根据所述进展向所述受试者施用有效量的治疗。
例如,所述肿瘤可以包括膀胱癌(例如膀胱尿路上皮癌(BLCA))。又例如,所述肿瘤 的进展可以选自:肿瘤I期,肿瘤II期,肿瘤III期和肿瘤IV期。
例如,当受试者患有I期膀胱癌,所述治疗可包括:用电灼术经尿道切除、膀胱内化疗、部分膀胱切除术和根治性膀胱切除术。例如,当受试者患有II和III期膀胱癌,所述治疗可包括:根治性膀胱切除术、联合化疗随后进行根治性膀胱切除术、放疗、部分膀胱切除术和用电灼术经尿道切除。例如,当受试者患有IV期膀胱癌,所述治疗可包括:化疗、单纯根治性膀胱切除术或随后进行化疗、外部放疗,或者外部放疗加化疗和姑息疗法(例如,尿流改道或膀胱切除术)。
另一方面,本申请提供了一种治疗受试者中的肿瘤的装置,所述装置包括:a)分析模块,其能够确定表1中所示的一个或多个基因在所述受试者中或源自所述受试者的生物学样品中的表达水平;b)判断模块,其能够根据a)中确定的所述表达水平判断所述受试者中所述肿瘤的进展;以及c)治疗模块,其能够根据b)中判断的所述进展向所述受试者施用有效量的治疗。
在本申请中,术语“治疗模块”通常是指能够根据所述判断模块中判断的所述肿瘤的进展情况,确定和/或实施向所述受试者施用有效量的治疗的功能单元。
例如,所述治疗模块可以包括选自下组的治疗方法所需的试剂、药剂、仪器和设备:切割肿瘤的外科术、化疗、放疗、生物靶向治疗和姑息疗法。其中,所述姑息疗法可以为对疼痛、厌食、便秘、疲乏、呼吸困难、呕吐、咳嗽、口干、腹泻、吞咽困难等影响生活质量的症状控制,同时注意精神心理问题的治疗方法。例如,所述癌症可以为膀胱癌,所述生物靶向治疗可以包括施用,例如IL2和/或IFN-α2a。
例如,所述治疗模块可以包括向受试者施用有效量的药剂。所述“有效量”可以为缓解或者消除受试者的疾病或症状的药物的量。通常,可根据受试者的体重、年龄、性别、饮食、排泄速率、过往病史、现用治疗、给药时间、剂型、给药方法、给药途径、药物组合、所述受试者的健康状况和交叉感染的潜力、过敏、超敏和副作用、和/或所述肿瘤分期的程度等来确定具体的有效量。本领域技术人员(例如,医生或兽医)可根据这些或其它条件或要求按比例降低或升高所述有效量。
在本申请中,术语“约”通常是指在指定数值以上或以下0.5%-10%的范围内变动,例如在指定数值以上或以下0.5%、1%、1.5%、2%、2.5%、3%、3.5%、4%、4.5%、5%、5.5%、6%、6.5%、7%、7.5%、8%、8.5%、9%、9.5%、或10%的范围内变动。
本申请还涉及以下的实施方案:
1.一种用于鉴别能够评价肿瘤进展的生物学指标的装置,所述装置包括:
1)临床特征模块,其能够提供患有所述肿瘤的患者的临床特征,所述临床特征包括所述患者的肿瘤分期阶段和/或所述患者的生存时间;
2)生物学指标模块,其能够提供源自所述患者的至少一种生物学指标;
3)相关性判断模块,其能够确定各所述患者的所述至少一种生物学指标与相应患者的所述临床特征的相关性;以及
4)鉴别模块,其能够将在模块3)中被判定为与所述临床特征相关的生物学指标鉴别为能够评价所述肿瘤的进展。
2.一种用于鉴别能够评价肿瘤进展的生物学指标的装置,所述装置包括用于鉴别所述生物学指标的计算机,所述计算机被编程以执行如下步骤:
1)提供患有所述肿瘤的患者的临床特征,所述临床特征包括所述患者的肿瘤分期阶段和/或所述患者的生存时间;
2)提供源自所述患者的至少一种生物学指标;
3)确定各所述患者的所述至少一种生物学指标与相应患者的所述临床特征之间的相关性;以及
4)将在3)中被判定为与所述临床特征相关的生物学指标鉴别为能够评价所述肿瘤的进展。
3.一种鉴别能够评价肿瘤进展的生物学指标的方法,所述方法包括:
1)提供患有所述肿瘤的患者的临床特征,所述临床特征包括所述患者的肿瘤分期阶段和/或所述患者的生存时间;
2)提供源自所述患者的至少一种生物学指标;
3)确定各所述患者的所述至少一种生物学指标与相应患者的所述临床特征之间的相关性;以及
4)将在3)中被判定为与所述临床特征相关的生物学指标鉴别为能够评价所述肿瘤的进展。
4.根据实施方案1-3中任一项所述的装置或方法,其中所述肿瘤包括膀胱癌。
5.根据实施方案4所述的装置或方法,其中所述膀胱癌包括膀胱尿路上皮癌(BLCA)。
6.根据实施方案1-5中任一项所述的装置或方法,其中所述肿瘤分期阶段选自:肿瘤I期,肿瘤II期,肿瘤III期和肿瘤IV期。
7.根据实施方案1-6中任一项所述的装置或方法,其中所述至少一种生物学指标包括选自下组的一类或多类指标:
类1:所述患者基因的表达水平;
类2:所述患者基因的拷贝数变化;
类3:所述患者基因的DNA甲基化;
类4:所述患者基因的体细胞突变;和
类5:所述患者中的微小RNA。
8.根据实施方案7所述的装置或方法,其中所述至少一种生物学指标包括所述患者基因的表达水平,且确定所述基因的表达水平与所述临床特征之间的相关性包括:以所述基因的表达水平作为单一变量而相对于所述临床特征进行单变量回归分析,并将所述回归分析中p值小于或等于第一阈值且FDR值小于或等于第二阈值的基因鉴别为与所述临床特征相关。
9.根据实施方案7-8中任一项所述的装置或方法,其中所述至少一种生物学指标包括所述患者基因的表达水平,且确定所述基因的表达水平与所述临床特征之间的相关性包括:相对于所述临床特征进行多变量回归分析,并将所述回归分析中FDR值小于或等于第三阈值的基因鉴别为与所述临床特征相关,且其中所述多变量包括所述患者中的基因表达水平,所述患者的年龄,所述患者的性别,和/或所述患者的肿瘤分期阶段。
10.根据实施方案8-9中任一项所述的装置或方法,其中所述至少一种生物学指标包括所述患者基因的表达水平,且确定所述基因的表达水平与所述临床特征之间的相关性还包括:根据所述回归分析中获得的针对各基因的相关性系数数值,将所述基因分为保护性效应基因和危险性效应基因,其中所述保护性效应基因的相关性系数数值为负,且所述危险性效应基因的相关性系数数值为正。
11.根据实施方案7-10中任一项所述的装置或方法,其中所述至少一种生物学指标包括所述患者基因的表达水平,且确定所述基因的表达水平与所述临床特征之间的相关性还包括:确定所述患者的基因在各肿瘤分期阶段的表达水平,据此确定对肿瘤分期具有特异性的基因共表达情况,根据所述基因共表达情况将所述基因分为2组或更多组,以及分别确定每组的基因表达水平与所述临床特征间的相关性。
12.根据实施方案11所述的装置或方法,其中通过使用WGCNA算法根据所述基因共表达情况将所述基因分为2组或更多组。
13.根据实施方案7-12中任一项所述的装置或方法,其中所述至少一种生物学指标包括所述患者基因的拷贝数变化,且确定所述基因拷贝数变化与所述临床特征之间的相关性包括:比较所述患者的基因在各肿瘤分期阶段的拷贝数变化频率。
14.根据实施方案7-13任一项所述的装置或方法,其中所述至少一种生物学指标包括所 述患者基因的DNA甲基化,且确定所述DNA甲基化与所述临床特征之间的相关性包括:以所述DNA甲基化的程度作为变量而相对于所述临床特征进行回归分析,并将所述回归分析中p值小于或等于第四阈值的DNA甲基化鉴别为与所述临床特征相关。
15.根据实施方案14所述的装置或方法,其中确定所述DNA甲基化与所述临床特征之间的相关性还包括:确定被鉴别为与所述临床特征相关的各DNA甲基化位点的风险值,所述风险值基于该甲基化位点在所述回归分析中获得的相关性系数及该甲基化位点的甲基化程度而确定。
16.根据实施方案7-15中任一项所述的装置或方法,其中所述至少一种生物学指标包括所述患者基因的体细胞突变,且确定所述体细胞突变与所述临床特征之间的相关性包括:确定具有所述体细胞突变的基因所属的信号通路,和/或确定具有所述体细胞突变的基因的表达水平与所述临床特征之间的相关性。
17.根据实施方案7-16中任一项所述的装置或方法,其中所述至少一种生物学指标包括所述患者中的微小RNA,且确定所述微小RNA与所述临床特征之间的相关性包括:确定所述微小RNA所调控的基因的表达水平与所述临床特征之间的相关性,以及确定所述微小RNA在所述患者中的表达水平与其所调控的基因的表达水平之间的相关性。
18.根据实施方案7-17中任一项所述的装置或方法,其中所述至少一种生物学指标包括两类或更多类所述生物学指标,且确定所述生物学指标与所述临床特征之间的相关性包括确定各类所述生物学指标对所述临床特征影响的权重。
19.根据实施方案18所述的装置或方法,其中通过进行有序逻辑回归分析确定所述权重。
20.根据实施方案1-19中任一项所述的装置或方法,其中所述至少一种生物学指标包括所述患者基因的表达水平,且确定所述基因的表达水平与所述临床特征之间的相关性包括:
a)以所述基因的表达水平作为单一变量而相对于所述临床特征进行单变量回归分析,并将所述回归分析中p值小于或等于第一阈值且FDR值小于或等于第二阈值的基因鉴别为与所述临床特征相关的第一基因集合。
21.根据实施方案20所述的装置或方法,其中确定所述基因的表达水平与所述临床特征之间的相关性还包括:
b)相对于所述临床特征进行多变量回归分析,并将所述回归分析中FDR值小于或等于第三阈值的基因鉴别为与所述临床特征相关的第二基因集合,且其中所述多变量包括所述第一基因集合中各基因的表达水平,所述患者的年龄,所述患者的性别,和所述患者的肿瘤分期阶段。
22.根据实施方案21所述的装置或方法,其中确定所述基因的表达水平与所述临床特征之间的相关性还包括:
c)根据所述多变量回归分析中获得的针对各基因的相关性系数数值,将所述基因分为保护性效应基因和危险性效应基因,其中所述保护性效应基因的相关性系数数值为负,且所述危险性效应基因的相关性系数数值为正。
23.根据实施方案21-22中任一项所述的装置或方法,其中确定所述基因的表达水平与所述临床特征之间的相关性还包括:
确定所述第二基因集合中的各基因在各肿瘤分期阶段的表达水平,据此确定对肿瘤分期具有特异性的基因共表达情况,根据所述基因共表达情况将所述第二基因集合中的基因分为2组或更多组,以及分别确定每组的基因表达水平与所述临床特征间的相关性。
24.根据实施方案23所述的装置或方法,其中通过使用WGCNA算法根据所述基因共表达情况将所述第二基因集合中的基因分为2组或更多组。
25.根据实施方案21-24中任一项所述的装置或方法,其中所述至少一种生物学指标还包括所述患者基因的拷贝数变化,且确定所述基因拷贝数变化与所述临床特征之间的相关性包括:比较所述第二基因集合中的基因在各肿瘤分期阶段的拷贝数变化频率。
26.根据实施方案21-25中任一项所述的装置或方法,其中所述至少一种生物学指标还包括所述患者基因的DNA甲基化,且确定所述DNA甲基化与所述临床特征之间的相关性包括:确定所述第二基因集合中基因的DNA甲基化位点及各所述位点的DNA甲基化程度,以所述DNA甲基化程度作为变量而相对于所述临床特征进行回归分析,并将所述回归分析中p值小于或等于第四阈值的DNA甲基化鉴别为与所述临床特征相关的第一DNA甲基化集合。
27.根据实施方案26所述的装置和方法,其中确定所述DNA甲基化与所述临床特征之间的相关性还包括:确定所述第一DNA甲基化集合中各DNA甲基化位点的风险值,所述风险值基于该甲基化位点在所述回归分析中获得的相关性系数及该甲基化位点的甲基化程度而确定。
28.根据实施方案21-27中任一项所述的装置或方法,其中所述至少一种生物学指标还包括所述患者基因的体细胞突变,且确定所述体细胞突变与所述临床特征之间的相关性包括:确定所述第二基因集合中的基因所具有的体细胞突变,以及确定具有所述体细胞突变的基因所属的信号通路。
29.根据实施方案21-28中任一项所述的装置或方法,其中所述至少一种生物学指标包括所述患者中的微小RNA,且确定所述微小RNA与所述临床特征之间的相关性包括:确定调 控所述第二基因集合中的基因的微小RNA,以及确定所述微小RNA在所述患者中的表达水平与其所调控的基因的表达水平之间的相关性,将该相关性高于第五阈值的微小RNA鉴别为与所述临床特征相关的第一微小RNA集合。
30.根据实施方案27-29中任一项所述的装置或方法,其中确定所述生物学指标与所述临床特征之间的相关性包括:通过进行有序逻辑回归分析分别确定下列生物学指标对所述临床特征影响的权重:所述第二基因集合中基因的表达水平,所述第二基因集合中基因的拷贝数变化,所述第一DNA甲基化集合中DNA甲基化位点的风险值。
31.根据实施方案30所述的装置或方法,其中分别确定所述第二基因集合中保护性效应基因表达水平和危险性效应基因表达水平各自的权重。
32.存储有计算机程序的计算机可读存储介质,其中所述计算机程序使计算机执行如实施方案3-31中任一项所述的方法。
33.一种判断受试者中肿瘤进展的装置,所述装置包括:
a)分析模块,其能够确定表1中所示的一个或多个基因在所述受试者中或源自所述受试者的生物学样品中的表达水平;以及
b)判断模块,其能够根据a)中确定的所述表达水平判断所述受试者中所述肿瘤的进展。
34.一种判断受试者中肿瘤进展的装置,所述装置包括用于判断受试者中肿瘤进展的计算机,所述计算机被编程以执行如下步骤:
a)确定表1中所示的一个或多个基因在所述受试者中或源自所述受试者的生物学样品中的表达水平;以及
b)根据a)中确定的所述表达水平判断所述受试者中所述肿瘤的进展。
35.一种判断受试者中肿瘤进展的方法,所述方法包括:
a)确定表1中所示的一个或多个基因在所述受试者中或源自所述受试者的生物学样品中的表达水平;以及
b)根据a)中确定的所述表达水平判断所述受试者中所述肿瘤的进展。
36.根据实施方案33-35中任一项所述的装置或方法,其中所述肿瘤进展包括所述肿瘤的分期阶段和/或所述受试者的生存率。
37.根据实施方案36所述的装置或方法,其中所述肿瘤的分期阶段选自:肿瘤I期,肿瘤II期,肿瘤III期和肿瘤IV期。
38.根据实施方案33-37中任一项所述的装置或方法,其中所述肿瘤包括膀胱癌。
39.根据实施方案38所述的装置或方法,其中所述膀胱癌包括膀胱尿路上皮癌(BLCA)。
40.根据实施方案33-39中任一项所述的装置或方法,其中所述一个或多个基因至少包括表2中所示的一个或多个保护性效应基因。
41.根据实施方案33-40中任一项所述的装置或方法,其中所述一个或多个基因至少包括表3中所示的一个或多个危险性效应基因。
42.根据实施方案33-41中任一项所述的装置或方法,其中所述一个或多个基因至少包括表4中所示的一个或多个基因。
43.根据实施方案33-42中任一项所述的装置或方法,其中所述一个或多个基因至少包括表5中所示的一个或多个基因。
44.根据实施方案33-43中任一项所述的装置或方法,其还包括:确定所述一个或多个基因的拷贝数变化的步骤或模块。
45.根据实施方案33-44中任一项所述的装置或方法,其还包括:确定表8中所示的一个或多个基因的DNA甲基化风险值的步骤或模块。
46.根据实施方案33-45中任一项所述的装置或方法,其还包括:确定所述受试者的年龄的步骤或模块。
47.根据实施方案33-46中任一项所述的装置或方法,其中确定表1中所示的一个或多个基因在所述受试者中或源自所述受试者的生物学样品中的表达水平包括:确定所述一个或多个基因中表2所示基因的平均表达水平;以及确定所述一个或多个基因中表3所示基因的平均表达水平。
48.根据实施方案47中所述的装置或方法,其中根据式I判断所述受试者中所述肿瘤的进展:
ln((P(Stage≤j))/(1-P(Stage≤j)))=Intercept+0.0366*a+0.3386*b+0.3349*c+1.2193*d+0.0084*e-0.048*f  (I)
其中,j=肿瘤III期时,Intercept=0.9609;j=肿瘤I期/II期时,Intercept=-0.6617;
a为所述一个或多个基因中表2所示基因的平均表达水平;
b为所述一个或多个基因中表3所示基因的平均表达水平;
c为所述一个或多个基因的拷贝数变化;
d为所述一个或多个基因中表8中所示基因的DNA甲基化风险值;
e为所述受试者的年龄;且
f为所述受试者的性别,其中男性为0,女性为1。
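49. A computer-readable storage medium storing a computer program, wherein the computer program causes a computer to execute the method of any one of embodiments 35-48.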
50.治疗受试者中的肿瘤的方法,所述方法包括:
根据实施方案35-48中任一项所述的方法,判断所述受试者中所述肿瘤的进展;以及
根据所述进展向所述受试者施用有效量的治疗。
51.一种治疗受试者中的肿瘤的装置,所述装置包括:
a)分析模块,其能够确定表1中所示的一个或多个基因在所述受试者中或源自所述受试者的生物学样品中的表达水平;
b)判断模块,其能够根据a)中确定的所述表达水平判断所述受试者中所述肿瘤的进展;以及
c)治疗模块,其能够根据b)中判断的所述进展向所述受试者施用有效量的治疗。
不欲被任何理论所限,下文中的实施例仅仅是为了阐释本申请的装置、方法和系统的工作方式,而不用于限制本申请发明的范围。
实施例
本申请实施例中的所有的统计分析均由R软件(版本3.3.3)执行。
实施例1 患者和肿瘤样本数据来源
本申请中使用的大部分BLCA患者的基因组和临床数据集均从“NCI GDC Data Portal Legacy Archive”下载。其中,BLCA患者的临床信息来自TCGA-BLCA临床文件。获得的BLCA患者RNA-seq数据集包含419个样本,其中包括400个肿瘤样本和19个正常样本。所有的基因表达值被标准化。
使用突变注释格式(MAF文件)的TCGA 2级的体细胞突变数据。TCGA 3级甲基化数据从“jhu-usc_BLCA.HumanMethylation450”下载。TCGA 4级的mRNA的表达与DNA甲基化之间的相关性数据来自Broad GDAC Firehose。TCGA 4级的拷贝数变异(CNV)数据从Broad GDAC Firehose下载。
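The following discrete values were used to represent the CNV amplification and deletion levels: deep deletion = -2; deletion = -1; no change = 0; amplification = 1; high-level amplification = 2.
The "reads per million miRNA mapped (RPM)" values from the TCGA level 3 microRNA quantification files were used as the microRNA expression values.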
经文献验证的已知miRNA-基因相互作用的列表从miRWalk2.0获得。微小RNA-癌症的关系信息来自miRCancer。
实施例2 基于存活率分析筛选关键基因
利用生存率分析来研究生存状态和不同的潜在影响因素(例如,关键基因)之间的关系。
实验方法:
Cox比例风险回归
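Univariate and multivariate Cox proportional hazards regression models were applied to identify key genes that may affect the survival of BLCA patients. The expression values of each gene across all BLCA samples were first normalized to z-scores, and genes expressed in fewer than 20 samples were removed.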
在单变量Cox比例风险回归中,基因表达值被用作唯一的预测变量;而在多变量Cox比例风险回归中,年龄、性别、肿瘤分期和基因表达值都被用作预测变量。使用“Benjamini & Hochberg”方法调整p值。
对于生存分析统计显著性的阈值,单变量Cox比例风险回归的p值<0.05且错误发现率(FDR)<0.1;多变量Cox比例风险回归的p值<0.05且FDR<0.05。对于所有Cox回归模型,还检查了比例风险假设并去除了不符合这一假设的那些基因。
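The Benjamini & Hochberg adjustment and the combined p-value/FDR filter described above can be sketched as follows. This is a minimal standard-library illustration, not the R code used in the examples; the function names and the gene identifiers in the usage note are hypothetical, and the default cutoffs are the univariate thresholds quoted in the text.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_candidates(p_by_gene, p_cutoff=0.05, fdr_cutoff=0.1):
    """Keep genes whose raw p-value and adjusted p-value are both below the cutoffs."""
    genes = list(p_by_gene)
    fdr = benjamini_hochberg([p_by_gene[g] for g in genes])
    return [g for g, q in zip(genes, fdr)
            if p_by_gene[g] < p_cutoff and q < fdr_cutoff]
```

For example, `select_candidates({"G1": 0.001, "G2": 0.2})` keeps only the hypothetical gene `G1`.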
Kaplan-Meier分析
对于Kaplan-Meier生存分析,首先根据各个选定基因的中值将所有BLCA样本分为高和低组。接着绘制Kaplan-Meier生存曲线,并通过运行对数秩检验来比较两组之间的差异。使用R包“生存”(the R package“survival”)进行生存分析。
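The median split and the Kaplan-Meier (product-limit) estimate can be sketched as below. This is an illustrative standard-library version, not the R "survival" package used in the examples; the function names are hypothetical, assigning samples exactly at the median to the low group is an assumption, and the log-rank comparison is not included.

```python
from statistics import median


def split_by_median(expression):
    """Split sample IDs into (high, low) groups at the median expression value."""
    cut = median(expression.values())
    high = [s for s, v in expression.items() if v > cut]
    low = [s for s, v in expression.items() if v <= cut]  # ties go to "low" (assumption)
    return high, low


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times: follow-up times; events: 1 if the event occurred, 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect all subjects sharing this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve
```

Each group returned by `split_by_median` would be passed separately to `kaplan_meier` to obtain the two curves that are then compared.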
GO分析
所筛选基因的功能注释及其基因本体(GO)富集分析在DAVID v6.8中进行。采用阈值p值<0.05选择GO功能。
实验结果:
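Univariate and multivariate Cox proportional hazards regression models were applied to select a set of key genes that may have an important effect on the survival of BLCA patients. In the univariate Cox regression, the gene expression value was used as the only predictor. Initially, after removing rarely expressed genes (genes expressed in fewer than 20 samples), expression values of 19472 genes were obtained for all 404 BLCA patients. Then 1307 candidate genes were selected using the thresholds p-value < 0.05 and false discovery rate (FDR) < 0.1. Next, the candidate genes were checked against the proportional hazards (PH) assumption, and 99 genes that did not satisfy it were excluded. The univariate Cox regression analysis thus yielded 1208 candidate genes.
In the multivariate Cox regression, in addition to the expression values of the 1208 genes above, the age, sex and tumor stage of the BLCA patients (coded as stage I/II = 3, stage III = 2, stage IV = 1) were included as input predictors. An FDR threshold of < 0.05 was applied, and the candidate genes were again checked against the PH assumption for further screening. Finally, 1078 candidate genes were obtained from the multivariate Cox regression (see Table 1, which lists the 1078 identified key genes); these 1078 genes were defined as key genes and used in the subsequent analyses.
According to the coefficients of gene expression obtained from the multivariate Cox regression model above, the 1078 key genes were divided into two groups: 356 genes with negative coefficients and 722 genes with positive coefficients, defined as protective-effect genes and risk-effect genes, respectively (see Tables 2 and 3). The Kaplan-Meier plots in Figures 2A-2D use four genes as examples to show the effect of the selected key genes on the survival of BLCA patients. Figures 2A-2D show the results for the genes APOL2, BCL2L14, CSAD and ORMDL1, in that order, with the statistical significance of the differences assessed by the log-rank test.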
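The split of key genes by the sign of their regression coefficient can be sketched as follows. The function name and the gene identifiers and coefficient values in the usage note are hypothetical placeholders, not values from Tables 2 or 3.

```python
def classify_by_coefficient(coefficients):
    """Split genes into (protective, risk) lists by the sign of the coefficient.

    Negative coefficient -> protective-effect gene; positive -> risk-effect gene.
    Genes with a coefficient of exactly zero fall into neither group.
    """
    protective = sorted(g for g, c in coefficients.items() if c < 0)
    risk = sorted(g for g, c in coefficients.items() if c > 0)
    return protective, risk
```

For instance, `classify_by_coefficient({"GENE_A": -0.4, "GENE_B": 0.7})` places the hypothetical `GENE_A` in the protective group and `GENE_B` in the risk group.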
为了表征上述筛选的关键基因的潜在生物学功能,对上述保护性效应基因和危险性效应基因进行了基因本体(gene ontology,GO)富集分析。结果发现,保护性效应基因的GO功能主要在于基本的细胞过程或功能,如核酸结合、RNA剪接和tRNA结合(参见图3A)。而危险性效应基因则可能参与膀胱癌的发病机制,如细胞粘附、血管生成、药物反应和细胞迁移的正调控(参见图3B)。根据所涉及的基因的比例对GO功能进行排序,图3显示30个显著的GO功能,且p值<0.05。总体而言,这些功能富集分析的结果表明,所筛选的1078个关键基因,特别是那些有害性的基因,与膀胱癌发生的生物学功能密切相关。
表1 1078个关键基因
Figure PCTCN2019082574-appb-000004
Figure PCTCN2019082574-appb-000005
Figure PCTCN2019082574-appb-000006
Figure PCTCN2019082574-appb-000007
表2 保护性效应基因
Figure PCTCN2019082574-appb-000008
Figure PCTCN2019082574-appb-000009
表3 危险性效应基因
Figure PCTCN2019082574-appb-000010
Figure PCTCN2019082574-appb-000011
Figure PCTCN2019082574-appb-000012
实施例3 膀胱癌病程与关键基因表达动态变化的相关性
实施例2中已将1078个关键基因分为两组,即保护性效应基因和危险性效应基因。为了研究在膀胱癌的不同肿瘤分期中这两个基因组内或之间的基因表达的相关性,分析了在每个肿瘤分期中,保护性效应基因-保护性效应基因、保护性效应基因-危险性效应基因和危险性效应基因-危险性效应基因之间的表达水平的相关系数。这些比较结果表明,随着膀胱癌的肿瘤分期的增加和病情的严重(即,按照I/II期、III期和IV期的顺序),在相同性质的基因之间(即保护性效应基因-保护性效应基因或危险性效应基因-危险性效应基因)或不同性质的基因之间(即保护性效应基因-危险性效应基因)的相关性都会显著减小(参见图4A-4C)。针对I/II期、III期和IV期,图4A-4C分别表示保护性效应基因-保护性效应基因、保护性效应基因-危险性效应基因或危险性效应基因-危险性效应基因的相关系数(所有异常值均未显示)和相应密度曲线。其中*p值<0.05;**:p值<0.01;***:p值<0.001;****:p值<0.0001,经过双面Wilcoxon秩和检验。
这种变化也可以反映在相应密度曲线的变化中,即随着膀胱癌的肿瘤分期的增加和病情的严重,密度曲线变得越来越高和越来越窄。可见对基因表达相关性模式的动态变化的分析表明,所鉴定的关键基因表达水平的变化与膀胱癌的肿瘤分期(即进展)密切相关。
实施例4 构建关键基因的共表达网络并检测与临床特征相关的功能基因模块
实验方法:
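The gene co-expression network of the 1078 key genes obtained in Example 2 was constructed using the weighted correlation network analysis (WGCNA) algorithm (see Langfelder P et al., BMC Bioinformatics 2008, 9:559). Compared with hard threshold filters, the WGCNA algorithm retains all information about the target genes and their relationships through a soft-thresholding approach. To preserve the sign of the correlations between genes, the "signed" type of adjacency matrix was chosen for the correlations among the 1078 key genes. Using the "pickSoftThreshold" function of the package, a suitable soft threshold β=8 was selected to construct the gene co-expression network of the 1078 key genes across all BLCA samples.
In the WGCNA algorithm, a gene module is defined as a group of highly connected genes in the constructed gene co-expression network. The topological overlap matrix (TOM) was obtained from the adjacency matrix with the "TOMsimilarity" function. Based on the corresponding dissimilarity scores derived from this topological overlap matrix, a dendrogram of the genes was obtained with the "hclust" function, and modules were then identified with the "cutreeDynamic" function. The minimum module size was set to 20. The heat map of module-trait associations was generated with the "labeledHeatmap" function.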
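The soft-thresholding step can be illustrated with the usual definition of a "signed" WGCNA adjacency, a_ij = ((1 + cor_ij) / 2)^β. The sketch below is a plain Python illustration with hypothetical function names, not a call into the WGCNA R package; it uses β=8 as stated in the text and assumes the pairwise correlations have already been computed.

```python
def signed_adjacency(correlation, beta=8):
    """Signed soft-threshold adjacency for a single correlation value."""
    return ((1.0 + correlation) / 2.0) ** beta


def adjacency_matrix(cor_matrix, beta=8):
    """Apply the signed adjacency transform to every entry of a correlation matrix."""
    return [[signed_adjacency(r, beta) for r in row] for row in cor_matrix]
```

With this transform, strongly negative correlations map to adjacencies near 0 and strongly positive ones to values near 1, so the sign of the co-expression relationship is retained in the network.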
实验结果:
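A gene co-expression network can provide an overall picture of gene-gene associations. Based on the gene expression values of BLCA patients at different stages, stage-specific gene co-expression networks were constructed using the WGCNA algorithm.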
在基因共表达网络中,其模块内基因通常表现出相似的表达模式,这样的网络模块通常被认为具有基本的网络拓扑特征,能够为理解该模块中相关基因的生物学功能提供有利的线索。为了从先前构建的基因共表达网络中检测功能性基因模块,首先将邻接矩阵转换成拓扑重叠矩阵,并提供了对下游模块检测有用的拓扑相似性分数。然后在由WGCNA算法产生的层级聚类树(即动态树剪切产生的树状图)上运行动态树切割算法,从而产生七个不同尺寸的网络模块(参见图5A和表6)。图5A显示了由WGCNA构建的层级聚类树(即树状图),这是基于由动态树切割算法得出的各个基因簇和拓扑重叠矩阵表示的不相似性得分而导出的。在图5A底部以不同颜色命名各个基因簇;在图5B中,左侧分别用不同的数字编号对应表示不同的颜色的基因簇,即第1-第7模块依次表示青色、黑色、黄色、棕色、红色、蓝色和绿色的单个功能基因模块。
为了鉴定与BLCA患者的临床特征有关的基因模块,计算了模块单基因组(其定义为相应模块的基因表达谱的第一主要组分)与癌症患者的临床特征之间的相关系数(参见图5B)。图5B显示了单个模块内基因表达谱的第一主要组分定义的模块单元格(行)与所有BLCA患者的临床特征(列)之间的关系。每个框显示相关系数和相应的p值(在括号中)。
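Because tumor stage is closely related to patient survival, the gene modules associated with tumor stage were examined in particular. Two gene modules were observed to be negatively and positively correlated, respectively, with bladder cancer stage (labeled cyan and blue, respectively, in Figures 5A-5B). In addition, most (about 93%) of the genes in the cyan module (negatively correlated with bladder cancer stage) were protective-effect genes, whereas all genes in the blue module (positively correlated with bladder cancer stage) were risk-effect genes.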
进一步计算了蓝色和青色模块中整体关联性(即,整个网络中节点的平均度)和模块内的关联性(即,模块内节点的平均度数)(参见表4-表5,其中表4反映了青色模块的相关情况;表5反映了蓝色模块的相关情况)。结果发现,蓝色模块和青色模块在模块内的关联性方面显示出显着差异,但在整体关联性方面却没有显着差异,即青色模块中的基因彼此之间的关系比蓝色模块中的基因更密切(参见图5C-图5D)。图5C表示蓝色模块和青色模块的整体关联性,图5D表示这两个模块内的关联性。****表示p值<0.0001;经双面Wilcoxon秩和检验。
由此,接下来研究了具有前30个模块内的关联性的基因,发现其中许多基因(尤其是蓝色模块内的基因)已有文献报道与膀胱癌相关。例如,已证实PDGFRB与非肌层浸润性膀胱癌的复发密切相关(参见Feng J等,PLoS One 2014,9(5):e96671)。发现MARVELD1在 包括膀胱癌的几种癌症中表达水平下调(参见Wang S等,Cancer Lett 2009,282(1):77-86)。已发现KCNE4(一种离子通道基因)在膀胱癌样本中显示异常表达水平(参见Biasiotta A等J Transl Med 2016,14(1):285)。已经表明CPT1B的表达与肉碱-酰基肉碱代谢途径中的其他基因一起在膀胱癌组织中下调(参见Kim WT等,Yonsei Med J 2016,57(4):865-871)。此外,CKD6已被证明参与膀胱癌的几种调控途径(参见Lu S等,Exp Ther Med 2017,13(6):3309-3314)。可见网络模块中具有高连接度的基因在膀胱癌的分期中也可能具有重要的生物学功能。因此上述结果表明,BLCA患者的存活率与其肿瘤分期之间的阶段特异性的关联性可以通过关键基因的不同群组的表达水平来反映。
表4-表5 青色和蓝色模块中整体关联性和模块内的关联性
Figure PCTCN2019082574-appb-000013
Figure PCTCN2019082574-appb-000014
Figure PCTCN2019082574-appb-000015
Figure PCTCN2019082574-appb-000016
Figure PCTCN2019082574-appb-000017
Figure PCTCN2019082574-appb-000018
Figure PCTCN2019082574-appb-000019
Figure PCTCN2019082574-appb-000020
Figure PCTCN2019082574-appb-000021
Figure PCTCN2019082574-appb-000022
Figure PCTCN2019082574-appb-000023
Figure PCTCN2019082574-appb-000024
表6 7个网络模块
Figure PCTCN2019082574-appb-000025
Figure PCTCN2019082574-appb-000026
Figure PCTCN2019082574-appb-000027
Figure PCTCN2019082574-appb-000028
实施例5 拷贝数变化的分析
实验方法:
利用Broad GDAC Firehose(4级)中来自“SNP6拷贝数分析(Gistic2)”的CNV数据来进行分析。获得了400个BLCA样本中选定的1078个关键基因的CNV数据,其中包括I/II期的129个样本,III期的139个样本和IV期的132个样本。对于每个基因,计算每期中具有CNV的样品的频率(即扩增或缺失)。考虑到膀胱癌不同期的样本数量的不平衡,使用I/II期作为基准来对各期的频率进行归一化。
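The per-stage CNV frequency can be sketched as follows. The text does not spell out how the normalization against stage I/II is performed; this sketch assumes it is a simple ratio to the stage I/II frequency, and the function names and stage keys are hypothetical. Discrete CNV calls are taken as integers from -2 to 2, with 0 meaning no change.

```python
def cnv_frequency(calls):
    """Fraction of samples whose discrete CNV call is non-zero (amplified or deleted)."""
    return sum(1 for c in calls if c != 0) / len(calls)


def normalized_frequencies(calls_by_stage, baseline="I/II"):
    """CNV frequency per stage, divided by the baseline stage frequency (assumed scheme).

    Raises ZeroDivisionError if the baseline stage has no CNV at all.
    """
    freq = {stage: cnv_frequency(calls) for stage, calls in calls_by_stage.items()}
    base = freq[baseline]
    return {stage: f / base for stage, f in freq.items()}
```

The function would be applied to one gene at a time, with `calls_by_stage` holding that gene's calls for the samples of each stage.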
实验结果:
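The results show that the different stages of bladder cancer (stage I/II, stage III and stage IV) have clearly different CNV frequencies, and that CNV increases significantly as bladder cancer progresses (see Figure 6A). This suggests that copy number aberrations may drive bladder cancer progression. The CNV of the genes in the blue and cyan modules of Example 4 (module 6 and module 1 in Figure 5B), which showed the strongest positive and negative correlations, respectively, with bladder cancer stage, was also examined (see Table 7). In all samples and at each stage of BLCA, the blue module (in which all genes are risk-effect genes) showed a higher CNV ratio than the cyan module (in which most genes, that is 93%, are protective-effect genes) (see Figures 6B-6E). Figure 6A compares the CNV ratios of the different bladder cancer stages. Figures 6B-6E compare the CNV ratios of the blue and cyan modules overall and at stage I/II, stage III and stage IV; *: p-value < 0.05; **: p-value < 0.01; ***: p-value < 0.001; ****: p-value < 0.0001, by two-sided Wilcoxon rank-sum test. These results indicate that copy number variation is an important factor affecting the different stages (that is, the progression) of bladder cancer, and that it affects different functional gene modules to different degrees.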
表7 青色模块和蓝色模块中的基因的CNV
Figure PCTCN2019082574-appb-000029
Figure PCTCN2019082574-appb-000030
Figure PCTCN2019082574-appb-000031
Figure PCTCN2019082574-appb-000032
Figure PCTCN2019082574-appb-000033
Figure PCTCN2019082574-appb-000034
Figure PCTCN2019082574-appb-000035
Figure PCTCN2019082574-appb-000036
Figure PCTCN2019082574-appb-000037
Figure PCTCN2019082574-appb-000038
Figure PCTCN2019082574-appb-000039
Figure PCTCN2019082574-appb-000040
Figure PCTCN2019082574-appb-000041
Figure PCTCN2019082574-appb-000042
实施例6 DNA甲基化分析
实验方法:
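From the "correlation between mRNA expression and DNA methylation" data in Broad GDAC Firehose, 933 DNA methylation probes were obtained for the 1078 key genes identified in Example 2, each being the probe most negatively correlated with the expression value of its gene. The β values of these DNA methylation probes were then extracted from the TCGA "jhu-usc.edu_BLCA.Human-Methylation450" file. Next, multivariate regularized Cox regression (a LASSO-based regression method) was applied to select from the 933 DNA methylation probes an optimal set of genes with low multicollinearity. In total, 23 DNA methylation genes were retained as active covariates in this analysis (see Table 8), and each also showed statistical significance in the corresponding univariate Cox regression model (adjusted p-value < 0.05).
In the LASSO-based regression analysis above, 10-fold cross-validation was performed on the DNA methylation data set to determine the optimal value of the regularization parameter. The regression analysis was carried out with the R package "glmnet".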
实验结果:
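The DNA methylation status of the 1078 key genes selected in Example 2 was analyzed; some of the DNA methylation features may serve as biomarkers for bladder cancer prognosis.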
首先为1078个关键基因获得了933个DNA甲基化探针,鉴定与相应基因的表达最相关的DNA甲基化特征。然后,采用基于LASSO回归的多变量正则化Cox回归方法,筛选出最能解释这些输入的存活率数据的23个重要DNA甲基化基因(参见表8)。所有这些筛选出的23个基因在相应的单变量Cox回归模型中显示出统计学上的显著差异,同时调整p值<0.05。在这23个DNA甲基化基因中,已有报道指出与其相关的基因在膀胱癌发生中起重要作用,如JAG1、CLIC3、IRF1和POLB(例如,参见Shi TP等,J Urol 2008,180(1):361-366)。
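A risk value was then introduced, defined as the linear combination of the methylation levels (β values) and the corresponding coefficients of the 23 DNA methylation genes in the regularized Cox regression. Next, all BLCA patients were scored with this new risk value and divided at its median into a high-risk group and a low-risk group. Kaplan-Meier analysis and the log-rank test were then performed on the two groups. The high-risk and low-risk groups showed markedly different risk score distributions (see Figure 7A). The Kaplan-Meier curves also differed significantly: the higher the risk score, the worse the prognosis, and vice versa (see Figure 7B). Figure 7A shows the distribution of risk scores (based on the 23 selected DNA methylation genes) in the high-risk and low-risk groups together with the patients' corresponding clinical features; the dashed line marks the risk score cutoff. Figure 7B shows the Kaplan-Meier survival curves of the high-risk and low-risk groups, with the difference between the two groups assessed by the log-rank test. These results indicate that the new risk value based on the selected DNA methylation genes can serve as a good prognostic indicator for bladder cancer.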
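The risk value and the median split can be sketched as follows. For brevity the sketch uses only four of the 23 coefficients listed in Table 8; a full calculation would include all 23. The function names and the patient identifiers in the usage note are hypothetical, and assigning scores exactly at the median to the low-risk group is an assumption.

```python
from statistics import median

# Four of the 23 Table 8 coefficients, used here only to keep the sketch short.
COEFFICIENTS = {
    "CYTH2": -0.984161972,
    "JAG1": -0.758694541,
    "POLB": 0.703060414,
    "CCDC21": 1.245618158,
}


def risk_score(beta_values, coefficients=COEFFICIENTS):
    """Linear combination of methylation beta values and regression coefficients."""
    return sum(coefficients[g] * beta_values[g] for g in coefficients)


def split_by_risk(scores):
    """Label each patient 'high' or 'low' relative to the median risk score."""
    cut = median(scores.values())
    return {p: ("high" if s > cut else "low") for p, s in scores.items()}
```

For example, `split_by_risk({"p1": -1.0, "p2": 0.0, "p3": 2.0})` labels only the hypothetical patient `p3` as high risk.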
表8 23个甲基化基因
基因名称 相关系数
CYTH2 -0.984161972
PGLYRP4 -0.835135351
JAG1 -0.758694541
LTBP1 -0.358058521
CLIC3 -0.344045267
AKR1B1 -0.21615728
CNN3 -0.174817703
MESTIT1 -0.165094565
BAIAP2 -0.091244951
THBS3 -0.078528329
EIF2AK4 -0.058860853
KCNJ15 0.011163386
MTERFD3 0.066920184
PARP4 0.076173864
IRF1 0.125102152
TEAD4 0.247255028
TIA1 0.293154238
EFHD2 0.542824755
PRRT4 0.641295163
POLB 0.703060414
CRTC2 0.881500449
C3orf19 1.083780825
CCDC21 1.245618158
实施例7 体细胞突变分析
分析实施例2所筛选的1078个关键基因中的体细胞突变的基因组特征。
实验方法:
从TCGA(2级)下载体细胞突变数据之后,总共从397个BLCA样品的1078个基因中获得了908个基因上的6052个体细胞突变。这397个样品包括I/II期的129个样品、III期的135个样品和IV期的133个样品。
实验结果:
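First, the pathways that may be affected by the mutated genes were investigated. KEGG pathway enrichment analysis of the 908 mutated genes among the 1078 key genes, performed with DAVID (see Huang da W et al., Nat Protoc 2009, 4(1):44-57), showed that a relatively large proportion of the enriched pathways are already recognized as tumor-related signaling pathways (see Table 9). In particular, four important pathways have been shown to be associated with bladder cancer: the PI3K/AKT pathway, the Ras pathway, the Rap1 pathway and the MAPK pathway (see, for example, Houede N et al., Pharmacol Ther 2015, 145:1-18). Figures 8A-8D show, respectively, the significantly enriched mutated genes of the PI3K-AKT pathway, the MAPK pathway, the Ras pathway and the Rap1 pathway in the BLCA patient samples. Rows represent mutated genes, ordered by their mutation frequency across all samples; columns represent the samples involved (blank columns without mutations were removed). As Figure 8 shows, a considerable fraction of the genes in these four pathways are mutated in bladder cancer. Specifically, across all samples, 60% of the MAPK pathway, 56% of the PI3K/AKT pathway, 35% of the Rap1 pathway and 35% of the Ras pathway carried mutated genes with a mutation frequency above 1%. These four pathways thus have relatively high somatic mutation frequencies, which is consistent with earlier findings that mutations in genes of important cell signaling pathways tend to drive tumorigenesis (see, for example, Fawdar S et al., Proc Natl Acad Sci U S A 2013, 110(30):12426-12431).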
同时还分析了膀胱癌不同分期时突变基因的分布(参见图9)。结果发现在1078个关键基因中,不同分期的BLCA患者共享大部分体细胞突变的基因(437个基因)(参见图9A)。更重要的是,可以观察到两个模块(即对应于实施例4中的蓝色模块和青色模块,其分别与肿瘤不同分期最为正相关和负相关)之间的突变频率在所有或特定的分期的样本中均有显著的差异。尤其是蓝色模块(其中所有基因都是危险性效应基因)中的基因比青色模块(其中93%的基因为保护性效应基因)中的基因具有更多的体细胞突变(参见图9B-9E)。这一结果表明,虽然体细胞突变存在于大多数关键基因中,但它们对与肿瘤分期特定相关的基因组中的基因表现出显著的偏向性。该结果为了解体细胞突变对膀胱癌不同分期(进展)的影响提供了有用的线索。
表9 KEGG分析的结果
[根据细则91更正 08.05.2019] 
Figure WO-DOC-TABLE-9
[根据细则91更正 08.05.2019] 
Figure WO-DOC-TABLE-9-1
实施例8 微小RNA调控网络在不同癌症分期的动态变化
分析关于实施例2中筛选的膀胱癌不同分期的关键基因的miRNA调控网络的动态变化。
实验方法:
微小RNA调控网络的网络分析
使用R包“igraph”来计算膀胱癌不同分期微小RNA调节网络的协同程度。网络图由Cytoscape 3.5.0生成。
微小RNA-mRNA相互作用数据的处理
首先获得微小RNAs与miRWalk2.0数据库中筛选的1078个关键基因之间的已被实验验证的相互作用(参见Dweep H等Nat Methods 2015,12(8):697)。然后对于膀胱癌的各个分期,计算了1078个关键基因的表达值和相应的相互作用的微小RNA之间的相关系数。如果一对微小RNA和基因之间的相关系数小于-0.3,认为它们是可能的调节对;否则,将从这一对微小RNA和基因从原始的微小RNA-基因相互作用网络中移除。此外,从miRCancer数据库(2016年12月版)检索可知与膀胱癌相关的特定微小RNA(参见Xie B等,Bioinformatics 2013,29(5):638-644)。
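The filtering of microRNA-gene pairs by correlation can be sketched as follows. The text does not state which correlation coefficient was used; this sketch assumes Pearson correlation, and the function names and the identifiers in the usage note are hypothetical. Expression vectors are assumed to be aligned over the same samples and not constant.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def regulatory_pairs(interactions, mirna_expr, gene_expr, cutoff=-0.3):
    """Keep known (microRNA, gene) interactions whose correlation is below the cutoff."""
    return [(m, g) for m, g in interactions
            if pearson(mirna_expr[m], gene_expr[g]) < cutoff]
```

Running `regulatory_pairs` once per tumor stage, on the samples of that stage, would give one retained edge list per stage from which a stage-specific network can be drawn.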
实验结果:
与实施例2中筛选得到的1078个关键基因相互作用的微小RNA参见表10。计算微小RNAs及其相应靶基因的表达值之间的相关系数,只选择系数小于-0.3的微小RNA-基因对作为潜在的调控伴侣,基于此为膀胱癌的每期构建了一个微小RNA-基因相互作用的网络。结果发现,在膀胱癌不同分期(进展)中,微小RNA调控网络的结构(包括涉及已知BLCA特异性的微小RNA的相互作用)往往变得更加稀疏,可见彼此的相互作用逐渐减少(参见图10)。为了定量分析这一趋势,还计算了不同分期的个体微小RNA调控网络的协同程度。并分别观察到I/II期、III期和IV期的显著下降趋势:0.039、-0.27和-0.27。图10A-10C分别显示了I/II期、III期和IV期的微小RNA调控网络的可视动态变化。其中矩形代表选定的微小RNA,已知的BLCA特异性微小RNA显示为红色;微小RNA所对应的靶基因则用绿色圆圈表示,同时还显示了各个网络的配合程度。
由此可见,BLCA患者中筛选得到的1078个关键基因的微小RNA调控网络随着膀胱癌的进展呈现出离散化的增长趋势,这可能与癌细胞中的微小RNA失调有关。也反映了膀胱癌中细胞内调节和控制基因表达的紊乱。
表10 与1078个关键基因相互作用的微小RNA
Figure PCTCN2019082574-appb-000045
Figure PCTCN2019082574-appb-000046
Figure PCTCN2019082574-appb-000047
Figure PCTCN2019082574-appb-000048
Figure PCTCN2019082574-appb-000049
Figure PCTCN2019082574-appb-000050
实施例9 不同因素对膀胱癌分期的综合分析
为了全面了解不同基因组和临床因素对膀胱癌进展的影响,使用有序逻辑回归模型进一步对这些因素进行综合分析。
实验方法
用于综合分析的有序逻辑回归
使用Matlab 2016b中的“mnrfit”函数来执行序数逻辑回归任务。在该综合分析中,响应变量是肿瘤阶段(IV期=1、III期=2、I/II期=3),而预测变量包括保护性效应基因和危险性效应基因的平均表达值(z-标准化)、拷贝数变异的频率(z-标准化)、DNA甲基化风险评分、年龄和性别(男性=0,女性=1)。
实验结果
在综合分析中考虑了保护性效应基因和危险性效应基因的平均表达值(z-标准化)、拷贝数变异的频率(z-标准化)、DNA甲基化风险评分、年龄和性别(参见表11)。如图11中的森林图所示,可以观察到危险性基因的平均表达值、拷贝数变异的频率以及DNA甲基化的风险评分可以显著地影响膀胱癌的分期。图11中,框和线分别表示比值比(OR)和相应的95%置信区间,星号表示统计显着性变量。其中*:p值<0.05;**:p值<0.01。
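The ORs of these factors are all greater than 1, indicating that they can be regarded as risk factors for bladder cancer progression. All of these integrated modeling results are consistent with the single-factor analyses in Examples 2-8. Thus, although the genomic data come from different platforms and are therefore heterogeneous, the integrated analysis of multi-angle, multi-indicator data together with clinical information provides a reliable basis for studying the joint effect of genomic and clinical factors on bladder cancer progression.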
表11 综合分析的结果
Figure PCTCN2019082574-appb-000051
Figure PCTCN2019082574-appb-000052
前述详细说明是以解释和举例的方式提供的,并非要限制所附权利要求的范围。目前本文所列举的实施方式的多种变化对本领域普通技术人员来说是显而易见的,且保留在所附的权利要求和其等同方案的范围内。

Claims (18)

  1. 一种用于鉴别能够评价肿瘤进展的生物学指标的装置,所述装置包括:
    1)临床特征模块,其能够提供患有所述肿瘤的患者的临床特征,所述临床特征包括所述患者的肿瘤分期阶段和/或所述患者的生存时间;
    2)生物学指标模块,其能够提供源自所述患者的至少一种生物学指标;
    3)相关性判断模块,其能够确定各所述患者的所述至少一种生物学指标与相应患者的所述临床特征的相关性;以及
    4)鉴别模块,其能够将在模块3)中被判定为与所述临床特征相关的生物学指标鉴别为能够评价所述肿瘤的进展。
  2. 一种用于鉴别能够评价肿瘤进展的生物学指标的装置,所述装置包括用于鉴别所述生物学指标的计算机,所述计算机被编程以执行如下步骤:
    1)提供患有所述肿瘤的患者的临床特征,所述临床特征包括所述患者的肿瘤分期阶段和/或所述患者的生存时间;
    2)提供源自所述患者的至少一种生物学指标;
    3)确定各所述患者的所述至少一种生物学指标与相应患者的所述临床特征之间的相关性;以及
    4)将在3)中被判定为与所述临床特征相关的生物学指标鉴别为能够评价所述肿瘤的进展。
  3. 一种鉴别能够评价肿瘤进展的生物学指标的方法,所述方法包括:
    1)提供患有所述肿瘤的患者的临床特征,所述临床特征包括所述患者的肿瘤分期阶段和/或所述患者的生存时间;
    2)提供源自所述患者的至少一种生物学指标;
    3)确定各所述患者的所述至少一种生物学指标与相应患者的所述临床特征之间的相关性;以及
    4)将在3)中被判定为与所述临床特征相关的生物学指标鉴别为能够评价所述肿瘤的进展。
  4. 根据权利要求1-3中任一项所述的装置或方法,其中所述肿瘤包括膀胱癌。
  5. 根据权利要求1-4中任一项所述的装置或方法,其中所述至少一种生物学指标包括选自下组的一类或多类指标:
    类1:所述患者基因的表达水平;
    类2:所述患者基因的拷贝数变化;
    类3:所述患者基因的DNA甲基化;
    类4:所述患者基因的体细胞突变;和
    类5:所述患者中的微小RNA。
  6. 根据权利要求5所述的装置或方法,其中所述至少一种生物学指标包括所述患者基因的表达水平,且确定所述基因的表达水平与所述临床特征之间的相关性包括:以所述基因的表达水平作为单一变量而相对于所述临床特征进行单变量回归分析,并将所述回归分析中p值小于或等于第一阈值且FDR值小于或等于第二阈值的基因鉴别为与所述临床特征相关。
  7. 根据权利要求5-6中任一项所述的装置或方法,其中所述至少一种生物学指标包括所述患者基因的表达水平,且确定所述基因的表达水平与所述临床特征之间的相关性包括:相对于所述临床特征进行多变量回归分析,并将所述回归分析中FDR值小于或等于第三阈值的基因鉴别为与所述临床特征相关,且其中所述多变量包括所述患者中的基因表达水平,所述患者的年龄,所述患者的性别,和/或所述患者的肿瘤分期阶段。
  8. 根据权利要求5-7中任一项所述的装置或方法,其中所述至少一种生物学指标包括所述患者基因的表达水平,且确定所述基因的表达水平与所述临床特征之间的相关性还包括:确定所述患者的基因在各肿瘤分期阶段的表达水平,据此确定对肿瘤分期具有特异性的基因共表达情况,根据所述基因共表达情况将所述基因分为2组或更多组,以及分别确定每组的基因表达水平与所述临床特征间的相关性。
  9. 一种判断受试者中肿瘤进展的装置,所述装置包括:
    a)分析模块,其能够确定表1中所示的一个或多个基因在所述受试者中或源自所述受试者的生物学样品中的表达水平;以及
    b)判断模块,其能够根据a)中确定的所述表达水平判断所述受试者中所述肿瘤的进展。
  10. 一种判断受试者中肿瘤进展的装置,所述装置包括用于判断受试者中肿瘤进展的计算机,所述计算机被编程以执行如下步骤:
    a)确定表1中所示的一个或多个基因在所述受试者中或源自所述受试者的生物学样品中的表达水平;以及
    b)根据a)中确定的所述表达水平判断所述受试者中所述肿瘤的进展。
  11. 一种判断受试者中肿瘤进展的方法,所述方法包括:
    a)确定表1中所示的一个或多个基因在所述受试者中或源自所述受试者的生物学样品中的表达水平;以及
    b)根据a)中确定的所述表达水平判断所述受试者中所述肿瘤的进展。
  12. 根据权利要求9-11中任一项所述的装置或方法,其中所述肿瘤进展包括所述肿瘤的分期阶段和/或所述受试者的生存率。
  13. 根据权利要求9-12中任一项所述的装置或方法,其中所述肿瘤包括膀胱癌。
  14. 根据权利要求9-13中任一项所述的装置或方法,其中所述一个或多个基因至少包括表4中所示的一个或多个基因。
  15. 根据权利要求9-14中任一项所述的装置或方法,其中所述一个或多个基因至少包括表5中所示的一个或多个基因。
  16. 根据权利要求9-15中任一项所述的装置或方法,其中确定表1中所示的一个或多个基因在所述受试者中或源自所述受试者的生物学样品中的表达水平包括:确定所述一个或多个基因中表2所示基因的平均表达水平;以及确定所述一个或多个基因中表3所示基因的平均表达水平。
  17. 根据权利要求16中所述的装置或方法,其中根据式I判断所述受试者中所述肿瘤的进展:
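    $$\ln\frac{P(\mathrm{Stage}\le j)}{1-P(\mathrm{Stage}\le j)}=\mathrm{Intercept}+0.0366\,a+0.3386\,b+0.3349\,c+1.2193\,d+0.0084\,e-0.048\,f\qquad(\mathrm{I})$$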
    其中,j=肿瘤III期时,Intercept=0.9609;j=肿瘤I期/II期时,Intercept=-0.6617;
    a为所述一个或多个基因中表2所示基因的平均表达水平;
    b为所述一个或多个基因中表3所示基因的平均表达水平;
    c为所述一个或多个基因的拷贝数变化;
    d为所述一个或多个基因中表8中所示基因的DNA甲基化风险值;
    e为所述受试者的年龄;且
    f为所述受试者的性别,其中男性为0,女性为1。
  18. 一种治疗受试者中的肿瘤的装置,所述装置包括:
    a)分析模块,其能够确定表1中所示的一个或多个基因在所述受试者中或源自所述受试者的生物学样品中的表达水平;
    b)判断模块,其能够根据a)中确定的所述表达水平判断所述受试者中所述肿瘤的进展;以及
    c)治疗模块,其能够根据b)中判断的所述进展向所述受试者施用有效量的治疗。
PCT/CN2019/082574 2018-04-16 2019-04-12 鉴别及评价肿瘤进展的装置和方法 WO2019201186A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/725,147 US20200185054A1 (en) 2018-04-16 2019-12-23 Device and method of identifying and evaluating a tumor progression

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810337789.8 2018-04-16
CN201810337789.8A CN108504555B (zh) 2018-04-16 2018-04-16 鉴别及评价肿瘤进展的装置和方法

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/725,147 Continuation US20200185054A1 (en) 2018-04-16 2019-12-23 Device and method of identifying and evaluating a tumor progression

Publications (1)

Publication Number Publication Date
WO2019201186A1 true WO2019201186A1 (zh) 2019-10-24

Family

ID=63382413

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/082574 WO2019201186A1 (zh) 2018-04-16 2019-04-12 鉴别及评价肿瘤进展的装置和方法

Country Status (3)

Country Link
US (1) US20200185054A1 (zh)
CN (1) CN108504555B (zh)
WO (1) WO2019201186A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111724903A (zh) * 2020-06-29 2020-09-29 北京市肿瘤防治研究所 预测受试者胃癌预后的系统
CN117694839A (zh) * 2024-02-05 2024-03-15 四川省肿瘤医院 基于图像的非肌层浸润性膀胱癌复发率预测方法和系统

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108504555B (zh) * 2018-04-16 2021-08-03 图灵人工智能研究院(南京)有限公司 鉴别及评价肿瘤进展的装置和方法
CN109872776B (zh) * 2019-02-14 2023-06-09 辽宁省肿瘤医院 一种基于加权基因共表达网络分析对胃癌潜在生物标志物的筛选方法及其应用
CN110201148A (zh) * 2019-07-05 2019-09-06 浙江大学 Prrt4细胞因子在制备治疗肝功能衰竭药剂中的应用
CN112185548B (zh) * 2020-09-25 2022-10-28 智慧中医科技(广东)有限公司 一种基于神经网络算法的智能中医诊断方法及装置
CN111932538B (zh) * 2020-10-10 2021-01-15 平安科技(深圳)有限公司 分析甲状腺图谱的方法、装置、计算机设备及存储介质
CN112481218A (zh) * 2020-11-24 2021-03-12 河南牧业经济学院 基于CRISPR/Cas9基因编辑系统敲除猪miR-155基因的细胞系及构建方法
CN114093422B (zh) * 2021-11-23 2024-06-25 湖南大学 一种基于多关系图卷积网络的miRNA和基因相互作用的预测方法及其系统
CN118028468A (zh) * 2024-02-27 2024-05-14 上海仁东医学检验所有限公司 膀胱癌预后预测标志物、预测模型及其构建方法

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010063121A1 (en) * 2008-12-04 2010-06-10 University Health Network Methods for biomarker identification and biomarker for non-small cell lung cancer
CN102482711A (zh) * 2009-01-07 2012-05-30 美瑞德生物工程公司 癌症生物标记
CN105277718A (zh) * 2015-09-29 2016-01-27 上海知先生物科技有限公司 用于恶性肿瘤相关筛查及评估的产品、应用及方法
CN105759052A (zh) * 2015-12-02 2016-07-13 陈炜 用于膀胱癌非侵入式诊断的分子标志
CN108504555A (zh) * 2018-04-16 2018-09-07 清华大学 鉴别及评价肿瘤进展的装置和方法

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2866052A1 (en) * 2012-01-20 2013-07-25 The Ohio State University Breast cancer biomarker signatures for invasiveness and prognosis
KR102297935B1 (ko) * 2013-11-21 2021-09-06 퍼시픽 에지 리미티드 Classification of patients having asymptomatic hematuria using genotypic and phenotypic biomarkers

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
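CN111724903A (zh) * 2020-06-29 2020-09-29 北京市肿瘤防治研究所 System for predicting gastric cancer prognosis in a subject
CN111724903B (zh) * 2020-06-29 2023-09-26 北京市肿瘤防治研究所 System for predicting gastric cancer prognosis in a subject
CN117694839A (zh) * 2024-02-05 2024-03-15 四川省肿瘤医院 Image-based method and system for predicting the recurrence rate of non-muscle-invasive bladder cancer
CN117694839B (zh) * 2024-02-05 2024-04-16 四川省肿瘤医院 Image-based method and system for predicting the recurrence rate of non-muscle-invasive bladder cancer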

Also Published As

Publication number Publication date
US20200185054A1 (en) 2020-06-11
CN108504555B (zh) 2021-08-03
CN108504555A (zh) 2018-09-07

Similar Documents

Publication Publication Date Title
WO2019201186A1 (zh) Device and method for identifying and evaluating tumor progression
JP6854792B2 (ja) Pathway recognition algorithm using data integration on genomic models (PARADIGM)
Na et al. Germline mutations in ATM and BRCA1/2 distinguish risk for lethal and indolent prostate cancer and are associated with early age at death
JP6400746B2 (ja) Pathway recognition algorithm using data integration on genomic models (PARADIGM)
KR20190026837A (ko) Methods for fragmentome profiling of cell-free nucleic acids
US20180330049A1 (en) Methods for classification of glioma
Vossen et al. Comparative genomic analysis of oral versus laryngeal and pharyngeal cancer
Kennedy et al. An integrated-omics analysis of the epigenetic landscape of gene expression in human blood cells
Kalari et al. Deep sequence analysis of non-small cell lung cancer: integrated analysis of gene expression, alternative splicing, and single nucleotide variations in lung adenocarcinomas with and without oncogenic KRAS mutations
Du et al. Next‐generation sequencing unravels extensive genetic alteration in recurrent ovarian cancer and unique genetic changes in drug‐resistant recurrent ovarian cancer
Wang et al. Malignant melanotic Xp11 neoplasms exhibit a clinicopathologic spectrum and gene expression profiling akin to alveolar soft part sarcoma: a proposal for reclassification
Ramos et al. NRF1 motif sequence-enriched genes involved in ER/PR−ve HER2+ve breast cancer signaling pathways
Dai et al. Identification of hub methylated‐CpG sites and associated genes in oral squamous cell carcinoma
Xu et al. Multi-omics analysis at epigenomics and transcriptomics levels reveals prognostic subtypes of lung squamous cell carcinoma
Díez-Villanueva et al. Identifying causal models between genetically regulated methylation patterns and gene expression in healthy colon tissue
Sarver et al. Distinct mechanisms of PTEN inactivation in dogs and humans highlight convergent molecular events that drive cell division in the pathogenesis of osteosarcoma
Wang et al. Genetic intratumor heterogeneity remodels the immune microenvironment and induces immune evasion in brain metastasis of lung cancer
US20200263258A1 (en) Assessing and treating mammals having polyps
Malgulwar et al. Transcriptional co-expression regulatory network analysis for Snail and Slug identifies IL1R1, an inflammatory cytokine receptor, to be preferentially expressed in ST-EPN-RELA and PF-EPN-A molecular subgroups of intracranial ependymomas
Kozakiewicz et al. Spatial variation in gene expression of Tasmanian devil facial tumors despite minimal host transcriptomic response to infection
Landry et al. Multiplatform molecular analysis of vestibular schwannoma reveals two robust subgroups with distinct microenvironment
Hosseinkhan et al. Large contribution of copy number alterations in early stage of papillary thyroid carcinoma
WO2021227950A1 (zh) Cancer prognosis method
Zhao et al. Identification of gene-set signature in early-stage hepatocellular carcinoma and relevant immune characteristics
Zhu et al. Expression patterns and prognostic value of key regulators associated with m7G RNA modification based on all gene expression in colon adenocarcinoma

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 19788248; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase. Ref country code: DE

122 EP: PCT application non-entry in European phase. Ref document number: 19788248; Country of ref document: EP; Kind code of ref document: A1