CN116564421A - Method for constructing prognosis model related to copper death of acute myelogenous leukemia patient - Google Patents
- Publication number
- CN116564421A (application number CN202310672840.1A)
- Authority
- CN
- China
- Prior art keywords
- death
- copper
- model
- prognosis
- risk
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/20—Supervised data analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention discloses a method for constructing a copper death (cuproptosis) related prognosis model of an acute myeloid leukemia patient, comprising the following steps: determining a new copper death-related gene set based on the three data sets GSE37642, GSE12417 and TCGA-LAML and a plurality of known copper death-related genes; obtaining prognosis-related copper death-related genes by univariate Cox regression; reducing the dimensionality of these genes with the spike-and-slab lasso; obtaining the optimal gene combination and its regression coefficients by stepwise regression, and constructing a preliminary model of copper death-related prognosis characteristics; and fitting the preliminary model together with a plurality of gene models under a stacking strategy to obtain an expanded stacking model. By using a stacking strategy and combining two high-quality gene models, the invention expands the constructed copper death-related prognosis characteristics and improves the predictive performance of the model.
Description
Technical Field
The invention relates to the technical field of bioinformatics, and in particular to a method for constructing a copper death-related prognosis model of an acute myeloid leukemia patient.
Background
Acute myeloid leukemia (AML), also called acute myelogenous leukemia, is a molecularly and cytogenetically heterogeneous disease characterized by the clonal expansion of myeloid precursors. Among all leukemia subtypes, AML mortality was 44.3%. To date, chemotherapy remains the routine treatment for AML patients; however, the cure rate of traditional intensive chemotherapy is only 30-50%. As basic medical research progresses, understanding of AML has deepened, particularly regarding its underlying mechanisms, environmental and genetic risk factors, and new therapeutic approaches. New therapies (especially targeted therapy and immunotherapy) and new clinical studies are critical to improving the prognosis of AML patients. Identifying new prognostic markers is important for guiding leukemia-related research and for advancing clinical therapy.
Copper death (cuproptosis) is a novel type of cell death in which intracellular copper accumulation triggers the aggregation of lipoylated mitochondrial proteins and the destabilization of iron-sulfur cluster proteins, resulting in a distinct form of cell death. Only two studies (Zhu, Y., He, J., Li, Z. & Yang, W. Cuproptosis-related lncRNA signature for prognostic prediction in patients with acute myeloid leukemia. BMC Bioinformatics, 37, doi:10.1186/s12859-023-05148-9 (2023); Li, P. et al. A novel cuproptosis-related lncRNA signature: prognostic and therapeutic value for acute myeloid leukemia. Front Oncol 12, 966920) have constructed copper death-related gene models in AML, both using the TCGA database, and both have problems in their methods and results. First, the modeling sample size is insufficient: the data sets used in both studies contain about 150 samples, and even fewer were used for modeling, which may make the model poorly representative and limit its extrapolation; in the study by Li et al., the model performed poorly in the validation data set. Second, the reported models were not verified on an external data set: their validation sets were obtained by randomly splitting the original data set, and validating a model on such a random split often yields good results. Given the current state of research, further work is needed to fill this gap and to improve the quality and standard of future research.
Stacking strategies have strong predictive power for complex problems. In recent years, stacking has been developed in the medical field and applied in clinical practice. Wang et al. established a stacked ensemble model using cfDNA fragmentomics features and achieved high sensitivity in detecting early lung cancer. Carina Albuquerque et al. built a stacking-based artificial intelligence framework for effectively detecting and locating colonic polyps. Stacking strategies therefore have considerable potential for integrating models and important practical significance for advancing their clinical application. However, few studies have explored the potential of stacking strategies in hematology.
Given the current state of research, constructing a new copper death-related feature to predict the prognosis of AML patients and exploring the feasibility of stacking strategies in hematology have become important research topics.
Disclosure of Invention
The invention provides a method for constructing a copper death-related prognosis model of an acute myeloid leukemia patient, which utilizes a stacking strategy and combines two high-quality gene models to expand the constructed copper death-related prognosis characteristics and improve the prediction efficiency of the model.
In order to achieve the above object, the present invention provides the following solutions:
A method for constructing a copper death-related prognosis model of an acute myeloid leukemia patient, comprising the following steps:
constructing a preliminary model of copper death-related prognosis characteristics:
using the GSE37642 data set as the training set and the GSE12417 and TCGA-LAML cohorts as validation sets, and combining a plurality of known copper death-related genes, determining a new copper death-related gene set;
within the new copper death-related gene set, obtaining prognosis-related copper death-related genes by univariate Cox regression, then reducing the dimensionality of these genes with the spike-and-slab lasso, and finally obtaining the optimal gene combination and its regression coefficients by stepwise regression, to construct the preliminary model of copper death-related prognosis characteristics;
expanding the preliminary model of copper death-related prognosis characteristics:
based on a stacking strategy, fitting the preliminary model of copper death-related prognosis characteristics together with a plurality of gene models to obtain an expanded stacking model, which serves as the final copper death-related prognosis model of the acute myeloid leukemia patient.
Further, the plurality of known copper death-related genes includes:
CDKN2A, FDX1, DLD, DLAT, LIAS, GLS, LIPT1, MTF1, PDHA1 and PDHB.
Further, determining the new copper death-related gene set, using the GSE37642 data set as the training set and the GSE12417 and TCGA-LAML cohorts as validation sets in combination with the plurality of known copper death-related genes, comprises:
computing the Spearman rank correlation between the plurality of known copper death-related genes and the genes common to the three data sets GSE37642, GSE12417 and TCGA-LAML, to obtain a correlation coefficient and a P value for each gene;
and taking the genes with an absolute correlation coefficient greater than 0.4 and a P value less than 0.05 as new copper death-related genes, thereby obtaining the new copper death-related gene set.
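The selection rule above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function names are hypothetical, a candidate gene is assumed to qualify if it passes both thresholds against at least one known gene (the text does not say how the 10 known genes are combined), and the P value uses a large-sample normal approximation to the usual t statistic. In practice a library routine such as `scipy.stats.spearmanr` would normally be used instead.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks

def spearman(x, y):
    """Spearman rho plus an approximate two-sided P value (normal approximation)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        return 0.0, 1.0  # a constant vector carries no rank information
    rho = max(-1.0, min(1.0, cov / (sx * sy)))
    if abs(rho) == 1.0:
        return rho, 0.0
    t = rho * math.sqrt((n - 2) / (1 - rho * rho))
    return rho, math.erfc(abs(t) / math.sqrt(2))

def select_related_genes(known, candidates, r_min=0.4, p_max=0.05):
    """known, candidates: dict of gene -> expression values over the same samples."""
    selected = set()
    for gene, expr in candidates.items():
        if gene in known:
            continue
        for ref in known.values():
            rho, p = spearman(ref, expr)
            if abs(rho) > r_min and p < p_max:
                selected.add(gene)
                break
    return sorted(selected)
```

Because the filter uses the absolute value of the coefficient, strongly negatively correlated genes are retained as well as positively correlated ones.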
Further, obtaining the prognosis-related copper death-related genes by univariate Cox regression in the new copper death-related gene set comprises:
selecting, by univariate Cox regression, the genes with a P value less than 0.01 from the new copper death-related gene set as the prognosis-related copper death-related genes.
Further, the preliminary model of the copper death-related prognostic characteristic is formulated as:
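$$\text{Risk score} = \sum_{i=1}^{n} \beta_i \times \mathrm{Exp}_i,$$
wherein $n$ is the number of genes in the final model, $\mathrm{Exp}_i$ is the expression level of gene $i$, and $\beta_i$ is the regression coefficient of gene $i$ obtained by stepwise regression.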
Further, the plurality of gene models includes a 4-mRNA model and a 24-gene model.
Further, fitting the preliminary model of copper death-related prognosis characteristics together with the plurality of gene models based on a stacking strategy, to obtain the expanded stacking model serving as the final copper death-related prognosis model of the acute myeloid leukemia patient, comprises:
first, using the GSE37642 data set as training data and randomly dividing it into 10 groups of equal size, called "folds";
second, fitting the three sub-models (the preliminary model of copper death-related prognosis characteristics, the 4-mRNA model and the 24-gene model) by multivariate Cox regression on 9 of the 10 folds, calculating the risk score of each sub-model on the remaining fold, and repeating this process 10 times so that risk scores of the three sub-models are obtained for all folds;
third, combining the risk scores of each sub-model with the survival outcomes of the training data;
fourth, integrating the risk scores of the sub-models with an additive linear model, wherein a random survival forest is used to fit the integrated risk score to the survival outcomes, and the weight of each sub-model's risk score is obtained by constrained least squares;
fifth, refitting each sub-model by multivariate Cox regression on all of the training data to obtain its risk score, and computing the final integrated risk score using the weights from the fourth step;
and sixth, fitting the relationship between the integrated risk score from the fifth step and the survival outcomes with a random survival forest to obtain the expanded stacking model, which serves as the final copper death-related prognosis model of the acute myeloid leukemia patient.
Further, the method further comprises:
verifying a preliminary model of copper death-related prognosis characteristics:
evaluating the predictive performance of the preliminary model of copper death-related prognosis characteristics using time-dependent ROC curves and calibration plots; meanwhile, dividing patients into a high-risk group and a low-risk group according to the optimal cut-off value of the risk score and comparing survival between the two groups, wherein a log-rank test P value less than 0.05 is considered to indicate a difference in survival between the two groups.
Further, the method further comprises:
verifying the expanded stacking model:
comparing the discrimination of the stacking model with that of each sub-model using ROC curves, and verifying the predictive performance of the stacking-model risk classification and the European LeukemiaNet (ELN) risk classification in the BeatAML cohort;
dividing patients into a low-risk group and a high-risk group according to the optimal cut-off value of the stacking-model risk score, whereas the European LeukemiaNet risk classification divides patients into adverse, intermediate and favorable groups;
combining the two grouping criteria into a new stratification, wherein the low-risk group follows the European LeukemiaNet risk classification, and the remaining patients are assigned to the medium-risk and high-risk groups by the stacking model: patients in the stacking-model low-risk group are reassigned to the medium-risk group, and patients in the stacking-model high-risk group are reassigned to the high-risk group;
comparing the survival curves of the groups with the log-rank test, wherein in pairwise comparisons between groups a log-rank test P value less than 0.017 is considered to indicate a difference between the two groups.
According to the specific embodiments provided herein, the invention achieves the following technical effects: the method for constructing a copper death-related prognosis model of an acute myeloid leukemia patient uses a stacking strategy to expand the constructed preliminary model of copper death-related prognosis characteristics by combining it with two other gene models published in high-quality journals. Stacking can combine the strengths of different algorithms and models, thereby improving predictive performance, and it has strong predictive power for complex problems. The expanded model outperforms other models in this field in predictive performance; specifically, the discrimination of the copper death-related prognosis characteristics is higher than that of other models, the model evaluation metrics are comprehensive, and the model generalizes well.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the drawings that are needed in the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for constructing a copper death-related prognosis model of an acute myeloid leukemia patient according to the present invention;
FIG. 2 is a schematic representation of the construction of the copper death-related prognosis characteristics in the GSE37642 data set according to the present invention, wherein (A) shows the optimal cut-off value of the risk score; (B) is a Kaplan-Meier plot showing the difference in patient overall survival (OS) stratified by risk score; (C) shows the 1-, 2- and 3-year ROC curves; and (D) is a calibration plot;
FIG. 3 is a schematic representation of the validation of the copper death-related prognosis characteristics in the GSE12417 data set according to the present invention, wherein (A) shows the optimal cut-off value of the risk score; (B) is a Kaplan-Meier plot showing the difference in patient OS stratified by risk score; (C) shows the 1-, 2- and 3-year ROC curves; and (D) is a calibration plot;
FIG. 4 is a schematic representation of the validation of the copper death-related prognosis characteristics in the TCGA-LAML cohort according to the present invention, wherein (A) shows the optimal cut-off value of the risk score; (B) is a Kaplan-Meier plot showing the difference in patient OS stratified by risk score; (C) shows the 1-, 2- and 3-year ROC curves; and (D) is a calibration plot;
FIG. 5 is a comparison of the 1-, 2- and 3-year ROC curves of the stacking model and each sub-model, wherein (A)-(D) show the 1-, 2- and 3-year ROC curves of the copper death-related characteristics, the 4-mRNA model, the 24-gene model and the stacking model in the GSE37642 data set, respectively; and (E)-(H) show the 1-, 2- and 3-year ROC curves of the copper death-related characteristics, the 4-mRNA model, the 24-gene model and the stacking model in the GSE12417 data set, respectively;
FIG. 6 shows Kaplan-Meier plots according to an embodiment of the present invention, wherein (A) is a Kaplan-Meier plot showing the difference in patient OS under ELN stratification; and (B) is a Kaplan-Meier plot showing the difference in patient OS under the new stratification.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention aims to provide a method for constructing a copper death-related prognosis model of an acute myelogenous leukemia patient, which utilizes a stacking strategy and combines two high-quality gene models to expand the constructed copper death-related prognosis characteristics and improve the prediction efficiency of the model.
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
As shown in FIG. 1, the method for constructing the copper death-related prognosis model of the acute myelogenous leukemia patient provided by the invention comprises the following steps:
(1) Constructing a preliminary model of copper death-related prognosis characteristics:
taking the GSE37642 data set as the training set and the GSE12417 and TCGA-LAML cohorts as validation sets, 10 known copper death-related genes (CDKN2A, FDX1, DLD, DLAT, LIAS, GLS, LIPT1, MTF1, PDHA1 and PDHB) are combined, and the Spearman rank correlation between these 10 genes and the genes common to the three data sets (GSE37642, GSE12417 and TCGA-LAML) is computed to obtain a correlation coefficient and a P value for each gene; genes with an absolute correlation coefficient greater than 0.4 and a P value less than 0.05 are selected as new copper death-related genes. Within the new copper death-related gene set, prognosis-related copper death-related genes are obtained by univariate Cox regression (P value less than 0.01). These genes are then reduced in dimensionality with the spike-and-slab lasso, and the optimal gene combination is finally obtained by stepwise regression. The preliminary model of copper death-related prognosis characteristics is constructed from the genes obtained by stepwise regression and their regression coefficients:
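$$\text{Risk score} = \sum_{i=1}^{n} \beta_i \times \mathrm{Exp}_i,$$
wherein $n$ is the number of genes in the final model, $\mathrm{Exp}_i$ is the expression level of gene $i$, and $\beta_i$ is the regression coefficient of gene $i$ obtained by stepwise regression.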
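Reading the formula as a coefficient-weighted sum of expression levels over the modeling genes, the risk score of one patient could be computed as in the following sketch. The gene names, coefficients and expression values are hypothetical placeholders, since the patent text does not list the fitted coefficients.

```python
def risk_score(expression, coefficients):
    """Sum beta_i * Exp_i over the genes retained in the final model.

    expression:   dict of gene -> expression level for one patient
    coefficients: dict of gene -> regression coefficient (beta_i)
    """
    missing = [gene for gene in coefficients if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(beta * expression[gene] for gene, beta in coefficients.items())

# Hypothetical coefficients and expression values, for illustration only.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
patient = {"GENE_A": 8.0, "GENE_B": 4.0, "GENE_C": 6.0}
score = risk_score(patient, coefficients)  # 0.5 * 8.0 + (-0.25) * 4.0
```

Genes that are measured but not part of the model (here `GENE_C`) are simply ignored, while a modeling gene that is missing from the patient record raises an error rather than silently contributing zero.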
(2) Verifying a preliminary model of copper death-related prognosis characteristics:
the predictive performance of the preliminary model of copper death-related prognosis characteristics is evaluated using time-dependent ROC curves and calibration plots; meanwhile, patients are divided into a high-risk group and a low-risk group according to the optimal cut-off value of the risk score, and survival is compared between the two groups, wherein a log-rank test P value less than 0.05 is considered to indicate a difference in survival between the two groups.
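The grouping and comparison step can be illustrated with a small standard-library sketch of a two-group log-rank test. It assumes the cut-off value is already known (the text does not describe how the optimal cut-off is found), and the function names and data are hypothetical. The P value comes from the chi-square distribution with one degree of freedom, whose upper tail equals `erfc(sqrt(x / 2))`.

```python
import math

def split_by_cutoff(scores, cutoff):
    """Label each patient 'high' or 'low' relative to a given cut-off."""
    return ["high" if s > cutoff else "low" for s in scores]

def logrank_test(times, events, groups):
    """Two-group log-rank test; returns (chi-square statistic, P value).

    times:  follow-up time per patient
    events: 1 if death was observed, 0 if censored
    groups: 'high' or 'low' per patient
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    observed = expected = variance = 0.0
    for t in event_times:
        at_risk = [(ti, e, g) for ti, e, g in zip(times, events, groups) if ti >= t]
        n = len(at_risk)
        n_high = sum(1 for _, _, g in at_risk if g == "high")
        deaths = [(e, g) for ti, e, g in at_risk if ti == t and e == 1]
        d = len(deaths)
        observed += sum(1 for _, g in deaths if g == "high")
        expected += d * n_high / n
        if n > 1:  # hypergeometric variance of deaths in the high-risk group
            frac = n_high / n
            variance += d * frac * (1 - frac) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed - expected) ** 2 / variance
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# Hypothetical data: high scores die early, low scores die late.
scores = [2.0] * 10 + [0.5] * 10
groups = split_by_cutoff(scores, cutoff=1.0)
chi2, p_value = logrank_test(list(range(1, 21)), [1] * 20, groups)
significant = p_value < 0.05
```

For real analyses an established survival package would be preferable, since it also handles stratification and more than two groups.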
(3) Expanding the preliminary model of copper death-related prognosis characteristics:
based on a stacking strategy, the preliminary model of copper death-related prognosis characteristics is fitted together with a plurality of gene models to obtain an expanded stacking model, which serves as the final copper death-related prognosis model of the acute myeloid leukemia patient.
The plurality of gene models includes a 4-mRNA model and a 24-gene model, which are derived from the following publications, respectively:
Chen, Z. et al. A novel 4-mRNA signature predicts the overall survival in acute myeloid leukemia. Am J Hematol 96, 1385-1395, doi:10.1002/ajh.26309 (2021).
Li, Z. et al. Identification of a 24-gene prognostic signature that improves the European LeukemiaNet risk classification of acute myeloid leukemia: an international collaborative study. J Clin Oncol 31, 1172-1181, doi:10.1200/jco.2012.44.3184 (2013).
The two other gene models and their modeling genes were first collected: the 4-mRNA model (KLF9, ENPP4, TUBA4A and CD247) and the 24-gene model (ALS2CR8, ANGEL1, ARL6IP5, BSPRY, BTBD3, C1RL, CPT1A, DAPK1, ETFB, FGFR1, HEATR6, LAPTM4B, MAP7, NDFIP1, PBX3, PLA2G4A, PLOD3, PTP4A3, SLC25A12, SLC2A5, TMEM159, TRIM44, TRPS1 and VAV3).
Based on the method, the model is expanded under the stacking framework, the technical flow is shown in fig. 1, and the method specifically comprises the following steps:
first, the GSE37642 dataset was used as training data, randomly divided into 10 uniform groups, called "folds";
secondly, fitting three sub-models of a copper death-related prognosis feature primary model, a 4-mRNA model and a 24gene model by multi-factor Cox regression through 9 folds in 10 tradeoffs, calculating the risk score of each sub-model in another trade-off, and repeating the process for 10 times, wherein the risk scores of the three sub-models can be obtained by all folds;
thirdly, integrating the risk score of each sub-model with the survival outcome of the training data;
fourthly, integrating the risk scores of the sub-models by adopting an additive linear model, wherein a random survival forest is used for fitting the integrated risk scores and survival outcomes, and obtaining the weight of the risk score of each sub-model by adopting a restrictive least square method;
fifthly, obtaining risk scores of all sub-models through multi-factor Cox regression in all training data again, and obtaining final integrated risk scores according to weights in the fourth step;
and sixthly, fitting the relation between the risk score and survival ending in the fifth step by using a random survival forest method to obtain an expanded stacking model which is used as a final copper death-related prognosis model of the acute myelogenous leukemia patient.
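The cross-validated part of the steps above (fold assignment, out-of-fold scoring and additive integration) can be sketched as follows. This is only an illustration under stated assumptions: the sub-model is a toy stand-in that centers one made-up expression value on the training mean, whereas the patent fits Cox regressions; the weights passed to `integrate_scores` are placeholders, since the constrained least-squares and random survival forest steps are not reproduced here. All function names are hypothetical.

```python
import random
from typing import Callable, List, Sequence


def make_folds(n_samples: int, k: int = 10, seed: int = 0) -> List[List[int]]:
    """Randomly split sample indices into k folds whose sizes differ by at most one."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def out_of_fold_scores(
    n_samples: int,
    folds: Sequence[Sequence[int]],
    fit: Callable[[List[int]], object],
    predict: Callable[[object, int], float],
) -> List[float]:
    """Fit on k-1 folds, score the held-out fold, and repeat for every fold."""
    scores = [0.0] * n_samples
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(n_samples) if i not in held]
        model = fit(train)
        for i in held_out:
            scores[i] = predict(model, i)
    return scores


def integrate_scores(sub_scores: Sequence[Sequence[float]],
                     weights: Sequence[float]) -> List[float]:
    """Additive linear combination of the sub-model risk scores."""
    n = len(sub_scores[0])
    return [sum(w * s[i] for w, s in zip(weights, sub_scores)) for i in range(n)]


# Toy stand-in for one sub-model (NOT a Cox regression): made-up expression values.
expression = [float(v) for v in range(25)]


def fit_mean(train: List[int]) -> float:
    return sum(expression[i] for i in train) / len(train)


def predict_centered(model: float, i: int) -> float:
    return expression[i] - model


folds = make_folds(len(expression), k=10, seed=42)
oof_scores = out_of_fold_scores(len(expression), folds, fit_mean, predict_centered)

# Placeholder weights; in the described method they come from constrained least squares.
integrated = integrate_scores([oof_scores, oof_scores, oof_scores], [0.5, 0.3, 0.2])
```

Because each sample is scored only by a model that never saw it, the out-of-fold scores can be used to estimate the weights without rewarding the sub-model that overfits the training data most.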
(4) Verifying the expanded stacking model:
comparing the discrimination of the stacking model with that of each sub-model using time-dependent ROC curves, and at the same time verifying, in the BeatAML cohort, the predictive effect of combining the risk classification of the stacking model with the European LeukemiaNet (ELN) risk classification;
patients are divided into low-risk and high-risk groups according to the optimal cut-off value of the stacking model's risk score, whereas in the ELN risk classification patients are divided into adverse, intermediate and favorable groups;
the two grouping standards are then combined into low-risk, intermediate-risk and high-risk groups: the low-risk group follows the ELN risk classification, and the remaining patients are assigned by the stacking model, with those in the stacking model's low-risk group reclassified as intermediate-risk and those in its high-risk group reclassified as high-risk;
the log-rank test is used to compare the survival curves of the groups, and in pairwise comparisons between groups a log-rank test P value of less than 0.017 is considered to indicate a significant difference between the two groups.
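The regrouping rule above can be written as a small function. This is a sketch under two assumptions that the text does not state outright: that the low-risk group corresponds to the ELN favorable group, and that the 0.017 threshold is a Bonferroni-style correction of 0.05 over the three pairwise comparisons among three groups. The function names and label strings are hypothetical.

```python
def combined_risk_group(eln_group: str, stacking_group: str) -> str:
    """Combine the ELN class with the stacking model's dichotomy.

    eln_group: "favorable", "intermediate" or "adverse"
    stacking_group: "low" or "high" (split at the optimal risk-score cut-off)
    """
    if eln_group == "favorable":
        return "low"            # assumption: ELN favorable defines the low-risk group
    if eln_group not in ("intermediate", "adverse"):
        raise ValueError(f"unknown ELN group: {eln_group!r}")
    if stacking_group == "low":
        return "intermediate"   # stacking low-risk becomes intermediate-risk
    if stacking_group == "high":
        return "high"           # stacking high-risk stays high-risk
    raise ValueError(f"unknown stacking group: {stacking_group!r}")


def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold under a Bonferroni correction."""
    return alpha / n_comparisons


pairwise_threshold = bonferroni_threshold(0.05, 3)  # 0.05 / 3, about 0.017
```

Under this reading, a patient classified as adverse by ELN but low-risk by the stacking model moves to the intermediate-risk group.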
Other methods (e.g., machine learning or bioinformatics methods) may be used in the step of determining the final modeling genes.
In constructing the stacking model, the scheme selects two gene models published in high-quality journals. In practice, these sub-models can be replaced: the replacement can be a model from another omics field or a prediction model built from clinical variables. The scheme uses the random survival forest method for the final model fit, which can also be replaced by other machine learning algorithms.
Examples
In constructing the copper death-related prognostic signature, the GSE37642 dataset was used as the training set and the GSE12417 and TCGA-LAML cohorts as validation sets. The 10 first-reported copper death-related genes (CDKN2A, FDX1, DLD, DLAT, LIAS, GLS, LIPT1, MTF1, PDHA1 and PDHB) were collected, and Spearman rank correlation was computed between these 10 genes and the genes common to the three datasets (GSE37642, GSE12417 and TCGA-LAML). Applying the screening criteria of an absolute correlation coefficient greater than 0.4 and a P value less than 0.05 yielded a total of 3170 new copper death-related genes. On this basis, 122 copper death-related genes associated with patient prognosis were obtained by univariate (single-factor) Cox regression. Dimension reduction by spike-and-slab lasso (S1=0.1, S0=0.01) further narrowed these to 24 prognosis-related copper death-related genes. After stepwise regression, the best combination of 14 genes was obtained. The risk score was constructed from these genes and their regression coefficients as follows:
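$$\text{Risk score} = \sum_{i=1}^{n} \beta_i \times \text{Exp}_i,$$
where n is the number of final modeling genes (here 14), $\text{Exp}_i$ is the expression level of gene i, and $\beta_i$ is its stepwise regression coefficient.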
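Two steps of the paragraph above, the correlation screen and the linear risk score, can be sketched in plain Python. The gene names and coefficients in the usage lines are placeholders, not the 14 modeling genes or their fitted coefficients, which are not reproduced in this text. The P value is taken as an input here because computing it needs a t distribution; a statistics library such as SciPy (`scipy.stats.spearmanr`) returns both the coefficient and the P value.

```python
import math
from typing import Dict, List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (sx * sy)


def passes_correlation_screen(rho: float, p_value: float,
                              min_abs_rho: float = 0.4, max_p: float = 0.05) -> bool:
    """Screening criterion from the text: |rho| > 0.4 and P < 0.05."""
    return abs(rho) > min_abs_rho and p_value < max_p


def risk_score(expression: Dict[str, float], coefficients: Dict[str, float]) -> float:
    """Linear risk score: sum of coefficient times expression over the modeling genes."""
    missing = [g for g in coefficients if g not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(beta * expression[g] for g, beta in coefficients.items())


# Placeholder genes and coefficients, for illustration only.
example_score = risk_score({"GENE_A": 2.0, "GENE_B": 1.0},
                           {"GENE_A": 0.5, "GENE_B": -0.25})
```

A negative coefficient, as for the placeholder `GENE_B`, lowers the score as expression rises, which is how a protective gene would enter such a score.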
In verifying the copper death-related prognostic signature, the predictive performance of the model was evaluated using time-dependent ROC curves and calibration plots. At the same time, patients were divided into high-risk and low-risk groups according to the optimal cut-off value of the risk score, and survival was compared between the two groups (a log-rank test P value of less than 0.05 was considered to indicate a significant difference in survival). As shown in Fig. 2, in the training set GSE37642 the 1-, 2- and 3-year AUCs were 0.748, 0.785 and 0.807, respectively; the calibration curves lay close to the diagonal, indicating good calibration; and survival differed between the high-risk and low-risk groups. As shown in Fig. 3, in the validation set GSE12417 the 1-, 2- and 3-year AUCs were 0.757, 0.745 and 0.772, respectively; calibration was good; and survival differed between the high-risk and low-risk groups. As shown in Fig. 4, in the TCGA-LAML validation cohort the 1-, 2- and 3-year AUCs were 0.735, 0.758 and 0.748, respectively; calibration was good; and survival differed between the high-risk and low-risk groups.
In constructing the stacking model, two microarray datasets were used (GSE37642 as the training set and GSE12417 as the validation set), and two additional gene models (the 4-mRNA model and the 24-gene model) were collected as sub-models to construct the final stacking model.
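To make the 1-, 2- and 3-year AUC figures concrete, the following is a deliberately simplified sketch of an AUC at a fixed time horizon. It is an assumption-laden illustration, not the estimator used in the patent: patients censored before the horizon are simply dropped, whereas dedicated time-dependent ROC estimators account for censoring. The function name `auc_at_time` and the toy data are hypothetical.

```python
from typing import Sequence


def auc_at_time(times: Sequence[float], events: Sequence[int],
                scores: Sequence[float], horizon: float) -> float:
    """Naive cumulative/dynamic AUC at `horizon`.

    Cases: patients who died at or before the horizon.
    Controls: patients still under follow-up after the horizon.
    Patients censored before the horizon are excluded (a simplification).
    """
    cases = [s for t, e, s in zip(times, events, scores) if e == 1 and t <= horizon]
    controls = [s for t, s in zip(times, scores) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control at this horizon")
    # Probability that a case has a higher risk score than a control; ties count half.
    concordant = sum(
        1.0 if case > control else 0.5 if case == control else 0.0
        for case in cases for control in controls
    )
    return concordant / (len(cases) * len(controls))


# Made-up follow-up (years), event indicators and risk scores.
toy_auc = auc_at_time(times=[1, 2, 5, 6], events=[1, 1, 0, 1],
                      scores=[0.9, 0.8, 0.2, 0.1], horizon=3)
```

Evaluating the same function at horizons of 1, 2 and 3 years gives one AUC per horizon, which is the shape of the results reported above.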
In validating the stacking model, the discrimination of the stacking model was compared with that of each sub-model using time-dependent ROC curves. As shown in Fig. 5, in the training set GSE37642, the 1-, 2- and 3-year AUCs were 0.748, 0.785 and 0.807 for the copper death-related prognostic signature; 0.634, 0.645 and 0.652 for the 4-mRNA model; 0.704, 0.714 and 0.744 for the 24-gene model; and 0.816, 0.843 and 0.857 for the stacking model. In the validation set GSE12417, the 1-, 2- and 3-year AUCs were 0.757, 0.745 and 0.772 for the copper death-related prognostic signature; 0.678, 0.65 and 0.638 for the 4-mRNA model; 0.65, 0.653 and 0.646 for the 24-gene model; and 0.778, 0.751 and 0.769 for the stacking model. Meanwhile, the BeatAML cohort was used to verify the predictive effect of combining the risk classification of the stacking model with the European LeukemiaNet (ELN) risk classification. As shown in Fig. 6, in the BeatAML cohort the ELN risk classification failed to distinguish the survival of acute myeloid leukemia patients in the intermediate and adverse groups (log-rank test P value of 0.2). After the risk classification of the stacking model was integrated with the ELN risk classification, the log-rank test P value between the two groups was 0.011, i.e. the new risk classification distinguished the two groups of patients better.
The copper death-related prognostic signature model constructed by the invention outperforms other models in this field in predictive performance. Specifically, its discrimination is higher than that of the other models, its evaluation covers multiple metrics, and it generalizes well to the validation cohorts.
The reasons are as follows: (1) The high discrimination stems from the gene screening strategy: compared with the lasso method, the spike-and-slab lasso method has advantages in variable selection and parameter estimation. (2) The predictive performance of the model is evaluated along multiple dimensions: discrimination and calibration are assessed with time-dependent ROC curves and calibration plots, and patients are divided into high-risk and low-risk groups according to the optimal cut-off value of the linear predictor so that survival can be compared between the two groups. (3) The sample size of the modeling dataset is sufficient.
The constructed copper death-related prognostic signature was extended using a stacking strategy in combination with two other gene models published in high-quality journals (Chen et al., 2021; Li et al., 2013). The invention thus provides an integration strategy for reference, on the basis of which the predictive performance of the original model can be improved.
The reason is that: stacking may combine the advantages of different algorithms and models to improve predictive performance. Stacking strategies have a strong predictive power for complex problems.
The invention also provides a system for constructing a copper death-related prognosis model of an acute myelogenous leukemia patient, comprising:
a development module for constructing the preliminary model of the copper death-related prognosis characteristics;
a verification module for verifying the preliminary model of the copper death-related prognosis characteristics and the expanded stacking model; and
an expansion module for expanding the preliminary model of the copper death-related prognosis characteristics.
The invention also discloses an electronic device comprising one or more processors; a memory; and one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications being configured to perform the method for constructing a copper death-related prognosis model of an acute myeloid leukemia patient as described above.
Those of skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The disclosed systems, modules, and methods may be implemented in other ways. The apparatus embodiments described above are merely illustrative: for example, the division into units may be merely a logical functional division, and other divisions are possible in actual implementation; multiple units or components may be combined or integrated into another system, and some features may be omitted or not performed.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash disk, a mobile hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and various other media that can store program code.
Those skilled in the art will appreciate that all or part of the processes in implementing the methods of the above embodiments may be implemented by a computer program for instructing relevant hardware, where the program may be stored in a computer readable storage medium, and the program may include processes of the embodiments of the methods as described above when executed. Wherein the storage medium may be a magnetic disk, an optical disk, a ROM, a RAM, etc.
The above description of the embodiments is intended only to aid understanding of the method of the present invention and its core ideas; modifications made by those of ordinary skill in the art in light of these teachings also fall within the scope of the present invention. Accordingly, this description should not be construed as limiting the invention.
Claims (9)
1. The method for constructing the copper death related prognosis model of the acute myelogenous leukemia patient is characterized by comprising the following steps of:
constructing a preliminary model of copper death-related prognosis characteristics:
using the GSE37642 dataset as a training set and the GSE12417 and TCGA-LAML cohorts as validation sets, and determining a new copper death-related gene set by combining a plurality of known copper death-related genes;
obtaining prognosis-related copper death-related genes from the new copper death-related gene set by single factor Cox regression, then performing dimension reduction on the prognosis-related copper death-related genes by spike-and-slab lasso, and finally obtaining the optimal gene combination and its regression coefficients by stepwise regression, so as to construct the preliminary model of copper death-related prognosis characteristics;
extending the preliminary model of copper death-related prognosis characteristics:
based on a stacking strategy, fitting the preliminary model of the copper death related prognosis characteristics with a plurality of gene models to obtain an expanded stacking model which is used as a final copper death related prognosis model of the acute myeloid leukemia patient.
2. The method of constructing a copper death-related prognosis model for an acute myeloid leukemia patient according to claim 1, wherein the plurality of known copper death-related genes comprises:
CDKN2A, FDX1, DLD, DLAT, LIAS, GLS, LIPT1, MTF1, PDHA1 and PDHB.
3. The method of claim 1, wherein determining a new set of copper death-related genes by combining a plurality of known copper death-related genes using GSE37642 dataset as a training set and GSE12417 and TCGA-LAML cohorts as a validation set, comprises:
carrying out Spearman rank correlation on a plurality of known copper death related genes and common genes of three data sets of GSE37642, GSE12417 and TCGA-LAML to obtain a correlation coefficient and a P value of each gene;
and taking the gene with the absolute value of the correlation coefficient larger than 0.4 and the P value smaller than 0.05 as a new copper death related gene to obtain a new copper death related gene set.
4. The method for constructing a copper death-related prognosis model for acute myeloid leukemia patients according to claim 3, wherein the obtaining the copper death-related gene related to prognosis by single factor Cox regression in the new copper death-related gene set comprises:
genes with P values less than 0.01 were selected from the new copper death-related gene set as prognosis-related copper death-related genes by single factor Cox regression.
5. The method for constructing a copper death-related prognosis model for an acute myeloid leukemia patient according to claim 4, wherein the preliminary model of copper death-related prognosis characteristics is formulated as:
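$$\text{Risk score} = \sum_{i=1}^{n} \beta_i \times \text{Exp}_i,$$
wherein n is the number of final modeling genes, $\text{Exp}_i$ is the expression level of gene i, and $\beta_i$ is the regression coefficient of gene i obtained by stepwise regression.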
6. The method for constructing a copper death-related prognosis model for an acute myelogenous leukemia patient according to claim 1, wherein the plurality of gene models comprises a 4-mRNA model and a 24gene model.
7. The method for constructing a copper death-related prognosis model for an acute myeloid leukemia patient according to claim 6, wherein fitting the preliminary model of copper death-related prognosis characteristics with a plurality of gene models based on a stacking strategy to obtain an expanded stacking model as a final copper death-related prognosis model for the acute myeloid leukemia patient comprises:
first, using the GSE37642 dataset as training data and randomly dividing it into 10 equally sized groups, called "folds";
secondly, fitting three sub-models, namely the preliminary model of copper death-related prognosis characteristics, the 4-mRNA model and the 24-gene model, by multi-factor Cox regression on 9 of the 10 folds, calculating the risk score of each sub-model in the remaining fold, and repeating this process 10 times so that risk scores of the three sub-models are obtained for all folds;
thirdly, combining the risk score of each sub-model with the survival outcomes of the training data;
fourthly, integrating the risk scores of the sub-models by an additive linear model, wherein a random survival forest is used to fit the integrated risk scores to the survival outcomes, and the weight of each sub-model's risk score is obtained by constrained least squares;
fifthly, obtaining the risk scores of all sub-models again by multi-factor Cox regression on the full training data, and obtaining the final integrated risk score using the weights from the fourth step;
and sixthly, fitting the relation between the risk score from the fifth step and the survival outcome by the random survival forest method to obtain an expanded stacking model, which is used as the final copper death-related prognosis model of the acute myelogenous leukemia patient.
8. The method for constructing a copper death-related prognosis model for an acute myeloid leukemia patient according to claim 1, wherein the method further comprises:
verifying a preliminary model of copper death-related prognosis characteristics:
evaluating the predictive performance of the preliminary model of the copper death-related prognosis characteristics using time-dependent ROC curves and calibration plots; meanwhile, dividing patients into high-risk and low-risk groups according to the optimal cut-off value of the risk score and comparing survival between the two groups, wherein a log-rank test P value of less than 0.05 is considered to indicate a significant difference in survival between the two groups.
9. The method for constructing a copper death-related prognosis model for an acute myeloid leukemia patient according to claim 1, wherein the method further comprises:
verifying the expanded stacking model:
comparing the discrimination of the stacking model with that of each sub-model using time-dependent ROC curves, and at the same time verifying, in the BeatAML cohort, the predictive effect of combining the risk classification of the stacking model with the European LeukemiaNet (ELN) risk classification;
wherein patients are divided into low-risk and high-risk groups according to the optimal cut-off value of the stacking model's risk score, and in the ELN risk classification patients are divided into adverse, intermediate and favorable groups;
the two grouping standards are combined into low-risk, intermediate-risk and high-risk groups, wherein the low-risk group follows the ELN risk classification, and the remaining patients are assigned by the stacking model, with those in the stacking model's low-risk group reclassified as intermediate-risk and those in its high-risk group reclassified as high-risk;
the log-rank test is used to compare the survival curves of the groups, and in pairwise comparisons between groups a log-rank test P value of less than 0.017 is considered to indicate a significant difference between the two groups.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310672840.1A CN116564421B (en) | 2023-06-08 | 2023-06-08 | Method for constructing prognosis model related to copper death of acute myelogenous leukemia patient |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116564421A (en) | 2023-08-08 |
CN116564421B (en) | 2024-01-30 |
Family
ID=87503609
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202310672840.1A | Active | CN116564421B (en) | 2023-06-08 | 2023-06-08 | Method for constructing prognosis model related to copper death of acute myelogenous leukemia patient |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116564421B (en) |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6964868B1 (en) * | 1998-01-28 | 2005-11-15 | Nuvelo, Inc. | Human genes and gene expression products II |
CN102836292A (en) * | 2011-06-20 | 2012-12-26 | 苏州卫生职业技术学院 | Research method for effect of extracts of modified five-ingredient toxin-dispersing beverage to carrageenan-induced inflammation in mice |
CN110468203A (en) * | 2019-08-08 | 2019-11-19 | 浙江省人民医院 | A kind of marker, detection primer sequence and its application for predicting Glioblastoma patient prognosis |
CN112992346A (en) * | 2021-04-09 | 2021-06-18 | 中山大学附属第三医院(中山大学肝脏病医院) | Method for establishing prediction model for prognosis of severe spinal cord injury |
CN113160979A (en) * | 2020-12-18 | 2021-07-23 | 北京臻知医学科技有限责任公司 | Machine learning-based liver cancer patient clinical prognosis prediction method |
CN113782090A (en) * | 2021-09-18 | 2021-12-10 | 中南大学湘雅三医院 | Iron death model construction method and application |
CN114317532A (en) * | 2021-12-31 | 2022-04-12 | 广东省人民医院 | Evaluation gene set, kit, system and application for predicting leukemia prognosis |
CN114898874A (en) * | 2022-04-18 | 2022-08-12 | 广东省科学院生物与医学工程研究所 | Prognosis prediction method and system for renal clear cell carcinoma patient |
CN115019965A (en) * | 2022-05-20 | 2022-09-06 | 中山大学附属第一医院 | Method for constructing liver cancer patient survival prediction model based on cell death related gene |
CN115033758A (en) * | 2022-06-30 | 2022-09-09 | 郑州金域临床检验中心有限公司 | Application of kidney clear cell carcinoma prognosis marker gene, screening method and prognosis prediction method |
CN115497562A (en) * | 2022-10-27 | 2022-12-20 | 中国医学科学院北京协和医院 | Pancreatic cancer prognosis prediction model construction method based on copper death-related gene |
CN116004815A (en) * | 2022-08-02 | 2023-04-25 | 山东大学齐鲁医院 | Endometrial cancer prognosis model based on copper death-related lncRNA and application thereof in immunotherapy |
Non-Patent Citations (2)
Title |
---|
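Du Jiahui; Wang Xiaoxiao; Liu Songbai: "Research progress of the RNA methyltransferase METTL3 in leukemia", Chongqing Medicine, vol. 52, no. 11, pages 1732 - 1737 * |
Xu Chengcheng et al.: "Construction of a prognostic risk assessment model for bladder cancer patients based on copper death-related long non-coding RNAs", Journal of Zhejiang University, vol. 52, no. 02, pages 139 - 147 * |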
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117409855A (en) * | 2023-10-25 | 2024-01-16 | 苏州卫生职业技术学院 | Hepatoma patient mismatch repair related prognosis model, and construction and verification methods and application thereof |
CN117409855B (en) * | 2023-10-25 | 2024-04-26 | 苏州卫生职业技术学院 | Hepatoma patient mismatch repair related prognosis model, and construction and verification methods and application thereof |
CN117789819A (en) * | 2024-02-27 | 2024-03-29 | 北京携云启源科技有限公司 | Construction method of VTE risk assessment model |
Also Published As
Publication number | Publication date |
---|---|
CN116564421B (en) | 2024-01-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||