CN112086199A - Liver cancer data processing system based on multi-omics data
- Publication number
- CN112086199A
- Authority
- CN
- China
- Prior art keywords
- data
- liver cancer
- module
- processing module
- classification
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/30—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Medical Informatics (AREA)
- Health & Medical Sciences (AREA)
- Public Health (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Epidemiology (AREA)
- Evolutionary Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Databases & Information Systems (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Pathology (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Primary Health Care (AREA)
- Bioethics (AREA)
- Biophysics (AREA)
- Software Systems (AREA)
- Biotechnology (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Investigating Or Analysing Biological Materials (AREA)
Abstract
The invention provides a liver cancer data processing system based on multi-omics data, comprising a preprocessing module, a data dimension reduction processing module, a classification processing module and a classifier module. The preprocessing module screens the liver cancer multi-omics data and outputs the screened target data to the data dimension reduction processing module; the data dimension reduction processing module receives the target data output by the preprocessing module, performs dimension reduction on it, and outputs the dimension-reduced target data to the classification processing module; the classification processing module receives the dimension-reduced target data, performs classification processing on it, and outputs classification labels; the classifier module receives the classification labels, is trained with them, receives real-time liver cancer multi-omics data, and predicts the survival time of liver cancer patients. The system fuses the liver cancer multi-omics data by exploiting the complementarity of the data, thereby avoiding the loss of feature information during data processing, ensuring the accuracy of data processing, and supporting accurate subsequent prediction of liver cancer survival.
Description
Technical Field
The invention relates to a data processing system, in particular to a liver cancer data processing system based on multi-omics data.
Background
Early liver cancer is mainly treated by surgical resection, but clinical data show that the postoperative recurrence rate of liver cancer is about 70%, which seriously hinders the long-term survival of patients. If a typing standard for hepatocellular carcinoma (HCC) is established, patients at high risk of recurrence can be managed in finer strata, and the patients who are likely to benefit can be screened out before surgery is performed; this is of great significance for improving patient survival and realizing precise treatment of HCC. Establishing a classification standard for liver cancer based on multi-omics data allows more accurate prognosis, treatment and management of different patients and thereby improves their survival rate. Fusing multi-omics data to classify patients at the molecular level and predict their prognosis is therefore important and has clinical significance for the treatment of patients.
In recent years, methods have appeared that type liver cancer and predict prognosis by fusing RNA sequencing data, miRNA data, methylation data and clinical survival data of liver cancer patients. However, few researchers in the prior art have considered the survival status of patients when studying molecular subtypes. Survival has important clinical significance for the study of molecular subtypes, and large differences in survival have a great influence on the molecular subtypes. Fusing multi-omics data for molecular typing and prognosis prediction has the following two characteristics: (1) the fusion stage of multi-omics data is generally divided into early fusion, intermediate fusion and late fusion, and the choice of fusion stage has a great influence on the fusion result; (2) the fusion method also has a great influence. The fusion methods or systems in the prior art have the following defects: on the one hand, an autoencoder is used to integrate the input data, but feature data are easily lost; on the other hand, the prior art simply superimposes the data directly, so that the different data are poorly fused, cannot complement each other, and accurate information cannot be extracted.
Therefore, in order to solve the above technical problems, it is necessary to provide a new technical means.
Disclosure of Invention
In view of this, the present invention provides a liver cancer data processing system based on multi-omics data, which fuses the liver cancer multi-omics data by exploiting the complementarity of the data, thereby avoiding the loss of feature information during data processing, ensuring the accuracy of data processing, and supporting accurate subsequent prediction of liver cancer survival.
The invention provides a liver cancer data processing system based on multi-omics data, comprising a preprocessing module, a data dimension reduction processing module, a classification processing module and a classifier module;
the preprocessing module is used for screening the liver cancer multi-omics data and outputting the screened target data to the data dimension reduction processing module;
the data dimension reduction processing module is used for receiving the target data output by the preprocessing module, performing dimension reduction on the target data, and outputting the dimension-reduced target data to the classification processing module;
the classification processing module is used for receiving the dimension-reduced target data output by the data dimension reduction processing module, performing classification processing on the dimension-reduced target data, and outputting classification labels;
the classifier module is used for receiving the classification labels, being trained with the classification labels, receiving real-time liver cancer multi-omics data, and predicting the survival time of liver cancer patients.
Further, the screening of the liver cancer multi-omics data by the preprocessing module comprises:
the preprocessing module scores each feature of the liver cancer multi-omics data based on a univariate Cox-PH model, compares each feature score Per1 with a set threshold $P_y$, screens out the features with Per1 < $P_y$, and fuses the screened data to form the target data.
Further, the dimension reduction performed on the target data by the data dimension reduction processing module specifically includes:
SA1, constructing a K-layer autoencoder in the data dimension reduction processing module, wherein the output function of the K-layer autoencoder is as follows:
$x' = \mathrm{ReLU}(W_i \cdot \mathrm{ReLU}(W_i x + b_i))$; wherein $W_i$ is the weight matrix between adjacent autoencoder layers, $b_i$ is the bias associated with the weight matrix $W_i$, and $x$ is a feature value of the m-dimensional target data $X = (x_1, x_2, \ldots, x_m)$;
SA2, the data dimension reduction processing module constructs a loss function, wherein the loss function is as follows:
SA3, performing iterative operations on the loss function to update the weight matrix $W_i$ and its bias $b_i$; after the set number of iterations is reached, the data dimension reduction processing module outputs the dimension-reduced target data.
Further, the classification processing performed by the classification processing module specifically includes:
SB1, the classification processing module uses a univariate Cox-PH model to score the features in the dimension-reduced target data again, compares each feature score Per2 with the set threshold $P_y$, screens out the features with Per2 < $P_y$, and fuses the screened data;
SB2, the classification processing module constructs a normalization model and normalizes the data processed in step SB1, wherein the normalization model is as follows:
wherein $p$ is the feature data output in step SB1, $P$ is the normalized feature data, $\mathrm{Var}(p)$ is the variance of the feature data $p$, and $E(p)$ is the empirical mean of the feature data $p$;
SB3, the classification processing module constructs a similarity function:
wherein $W(i, j)$ is the similarity between the i-th sample $z_i$ and the j-th sample $z_j$, and $\theta_{ij}$ is a normalization factor; wherein:
$\lambda_i$ is the set of k nearest neighbors of the i-th sample $z_i$, $\lambda_j$ is the set of k nearest neighbors of the j-th sample $z_j$, and $z_r$ denotes the r-th sample in $\lambda_i$.
SB4, the classification processing module determines the classification labels according to the similarity function and outputs them to the classifier module.
The beneficial effects of the invention are as follows: the liver cancer multi-omics data are fused by exploiting the complementarity of the data, so that the loss of feature information during data processing is avoided, the accuracy of data processing is ensured, and accurate subsequent prediction of liver cancer survival is supported.
Drawings
The invention is further described below with reference to the following figures and examples:
FIG. 1 is a schematic structural diagram of the present invention.
FIG. 2 is a schematic diagram of the present invention.
FIG. 3 is a comparison of an embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings of the specification:
The invention provides a liver cancer data processing system based on multi-omics data, comprising a preprocessing module, a data dimension reduction processing module, a classification processing module and a classifier module;
the preprocessing module is used for screening the liver cancer multi-omics data and outputting the screened target data to the data dimension reduction processing module;
the data dimension reduction processing module is used for receiving the target data output by the preprocessing module, performing dimension reduction on the target data, and outputting the dimension-reduced target data to the classification processing module;
the classification processing module is used for receiving the dimension-reduced target data output by the data dimension reduction processing module, performing classification processing on the dimension-reduced target data, and outputting classification labels;
the classifier module is used for receiving the classification labels, being trained with the classification labels, receiving real-time liver cancer multi-omics data, and predicting the survival time of liver cancer patients. The invention fuses the liver cancer multi-omics data by exploiting the complementarity of the data, so that the loss of feature information during data processing is avoided, the accuracy of data processing is ensured, and accurate subsequent prediction of liver cancer survival is supported.
In this embodiment, the screening of the liver cancer multi-omics data by the preprocessing module includes:
the preprocessing module scores each feature of the liver cancer multi-omics data based on a univariate Cox-PH model, compares each feature score Per1 with a set threshold $P_y$, screens out the features with Per1 < $P_y$, and fuses the screened data to form the target data, wherein the threshold $P_y$ is generally set to 0.5, which effectively prevents information loss during processing and ensures the accuracy of the final result.
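The screening step above can be illustrated with a minimal sketch. It assumes that the score Per1 is a p-value from a univariate Cox-PH model, which the text does not state explicitly, and it uses the score test at beta = 0 in place of a full model fit. The function names, the toy survival data and the feature names `gene_a` and `gene_b` are hypothetical.

```python
import math

def cox_score_pvalue(x, time, event):
    """Score-test p-value of a univariate Cox-PH model, evaluated at beta = 0."""
    n = len(x)
    score = 0.0  # first derivative of the partial log-likelihood
    info = 0.0   # observed information (negative second derivative)
    for i in range(n):
        if not event[i]:
            continue  # censored samples only contribute to risk sets
        risk = [x[j] for j in range(n) if time[j] >= time[i]]
        mean = sum(risk) / len(risk)
        info += sum((v - mean) ** 2 for v in risk) / len(risk)
        score += x[i] - mean
    if info == 0.0:
        return 1.0  # constant feature: no evidence of association
    chi2 = score * score / info
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))

def screen_features(features, time, event, threshold):
    """Keep the names of features whose score (assumed p-value) is below the threshold."""
    return [name for name, values in features.items()
            if cox_score_pvalue(values, time, event) < threshold]

# Hypothetical toy cohort: 8 patients, all with an observed event.
time = [1, 2, 3, 4, 5, 6, 7, 8]
event = [1, 1, 1, 1, 1, 1, 1, 1]
features = {
    "gene_a": [8, 7, 6, 5, 4, 3, 2, 1],  # high value, early event
    "gene_b": [1, 2, 1, 2, 1, 2, 1, 2],  # unrelated to survival
}
selected = screen_features(features, time, event, threshold=0.05)
print(selected)
```

In practice a survival analysis library would normally be used to fit the Cox-PH model; the hand-written score test only keeps the sketch free of dependencies.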
In this embodiment, the dimension reduction performed on the target data by the data dimension reduction processing module specifically includes:
SA1, constructing a K-layer autoencoder in the data dimension reduction processing module, wherein the output function of the K-layer autoencoder is as follows:
$x' = \mathrm{ReLU}(W_i \cdot \mathrm{ReLU}(W_i x + b_i))$; wherein $W_i$ is the weight matrix between adjacent autoencoder layers, $b_i$ is the bias associated with the weight matrix $W_i$, and $x$ is a feature value of the m-dimensional target data $X = (x_1, x_2, \ldots, x_m)$;
SA2, the data dimension reduction processing module constructs a loss function, wherein the loss function is as follows:
SA3, performing iterative operations on the loss function to update the weight matrix $W_i$ and its bias $b_i$; after the set number of iterations is reached, the data dimension reduction processing module outputs the dimension-reduced target data.
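Steps SA1 to SA3 can be sketched as a small ReLU autoencoder trained by gradient descent. This is an illustration under stated assumptions, not the patented implementation: the loss function is not reproduced in the text, so a mean squared reconstruction error is assumed; a single hidden layer is used instead of K layers; separate encoder and decoder weights `W1` and `W2` are assumed; and the layer size, learning rate, epoch count and toy data are hypothetical.

```python
import random

def relu(v):
    return [max(0.0, a) for a in v]

def affine(W, x, b):
    return [sum(w * a for w, a in zip(row, x)) + bi for row, bi in zip(W, b)]

def forward(params, x):
    W1, b1, W2, b2 = params
    h = relu(affine(W1, x, b1))      # encoder output: the low-dimensional code
    x_rec = relu(affine(W2, h, b2))  # decoder output: the reconstruction x'
    return h, x_rec

def mse(params, data):
    total = 0.0
    for x in data:
        _, x_rec = forward(params, x)
        total += sum((r - a) ** 2 for r, a in zip(x_rec, x))
    return total / len(data)

def train(data, hidden, epochs=300, lr=0.02, seed=0):
    rng = random.Random(seed)
    m = len(data[0])
    # Small positive initial weights keep the ReLU units active on this toy data.
    W1 = [[rng.uniform(0.1, 0.5) for _ in range(m)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [[rng.uniform(0.1, 0.5) for _ in range(hidden)] for _ in range(m)]
    b2 = [0.0] * m
    params = (W1, b1, W2, b2)
    history = [mse(params, data)]  # loss before training
    for _ in range(epochs):
        for x in data:
            h, x_rec = forward(params, x)
            # Backpropagate the squared error through both ReLU layers.
            d_out = [2.0 * (r - a) if r > 0 else 0.0 for r, a in zip(x_rec, x)]
            d_h = [sum(W2[i][j] * d_out[i] for i in range(m)) if h[j] > 0 else 0.0
                   for j in range(hidden)]
            for i in range(m):
                for j in range(hidden):
                    W2[i][j] -= lr * d_out[i] * h[j]
                b2[i] -= lr * d_out[i]
            for j in range(hidden):
                for t in range(m):
                    W1[j][t] -= lr * d_h[j] * x[t]
                b1[j] -= lr * d_h[j]
        history.append(mse(params, data))  # loss after each epoch
    return params, history

# Hypothetical 4-dimensional target data reduced to a 2-dimensional code.
data = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0],
        [1.0, 1.0, 1.0, 1.0], [0.5, 0.0, 0.5, 0.0]]
params, history = train(data, hidden=2)
code, _ = forward(params, data[0])
print(history[0], history[-1], code)
```

The hidden activation `code` plays the role of the dimension-reduced target data that is passed on to the classification processing module.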
In this embodiment, the classification processing performed by the classification processing module specifically includes:
SB1, the classification processing module uses a univariate Cox-PH model to score the features in the dimension-reduced target data again, compares each feature score Per2 with the set threshold $P_y$, screens out the features with Per2 < $P_y$, and fuses the screened data, wherein, in the data fusion process, the plurality of features are combined to form a feature matrix;
SB2, the classification processing module constructs a normalization model and normalizes the data processed in step SB1, wherein the normalization model is as follows:
wherein $p$ is the feature data output in step SB1, $P$ is the normalized feature data, $\mathrm{Var}(p)$ is the variance of the feature data $p$, and $E(p)$ is the empirical mean of the feature data $p$;
SB3, the classification processing module constructs a similarity function:
wherein $W(i, j)$ is the similarity between the i-th sample $z_i$ and the j-th sample $z_j$, and $\theta_{ij}$ is a normalization factor; wherein:
$\lambda_i$ is the set of k nearest neighbors of the i-th sample $z_i$, $\lambda_j$ is the set of k nearest neighbors of the j-th sample $z_j$, and $z_r$ denotes the r-th sample in $\lambda_i$.
SB4, the classification processing module determines the classification labels according to the similarity function and outputs them to the classifier module. The classifier module adopts an XGBoost classifier, and the liver cancer multi-omics data comprise RNA sequencing data, miRNA data and DNA methylation data. Taking RNA sequencing data as an example: when the preprocessing module performs screening, the feature data meeting the screening criterion are selected from the RNA sequencing data, and the selected data are then recombined to form new RNA sequencing data.
In step SB1, the features screened from the three types of omics data are fused to form a data matrix of order n × n, and each column of the matrix is taken as a sample, so that the clustering is performed on n samples $z_1, z_2, \ldots, z_n$; the classifier module performs cluster analysis on the samples to obtain the final classification labels, and the number of classification labels is generally set to 2.
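Steps SB2 and SB3 can be sketched as follows. The normalization and similarity formulas are not reproduced in the text, so both are assumptions here: the normalization is taken to be the usual z-score built from the empirical mean and variance named above, and the similarity is taken to be a scaled exponential kernel whose normalization factor averages the distance between the two samples with their mean distances to their k nearest neighbors. The scaling parameter `mu`, the function names and the toy samples are hypothetical, and samples are stored as rows rather than columns.

```python
import math

def zscore(values):
    """Normalize one feature: subtract the empirical mean, divide by the standard deviation."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    if var == 0.0:
        return [0.0] * len(values)  # constant feature carries no information
    sd = math.sqrt(var)
    return [(v - mean) / sd for v in values]

def euclidean(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def similarity_matrix(samples, k, mu=0.5):
    """Assumed form: W(i, j) = exp(-d(i, j)^2 / (mu * theta_ij))."""
    n = len(samples)
    dist = [[euclidean(a, b) for b in samples] for a in samples]
    # Mean distance from each sample to its k nearest neighbors (self excluded).
    knn_mean = []
    for i in range(n):
        nearest = sorted(dist[i][j] for j in range(n) if j != i)[:k]
        knn_mean.append(sum(nearest) / len(nearest))
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            theta = (knn_mean[i] + knn_mean[j] + dist[i][j]) / 3.0
            W[i][j] = math.exp(-dist[i][j] ** 2 / (mu * theta)) if theta > 0 else 1.0
    return W

normalized = zscore([1.0, 2.0, 3.0])

# Hypothetical samples forming two well-separated groups.
samples = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
W = similarity_matrix(samples, k=1)
print(normalized, W[0][1], W[0][2])
```

A clustering method such as the spectral clustering mentioned later in the description would then operate on `W` to produce the classification labels.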
The data sets GSE14520 and GSE31384 obtained from the GEO database serve as validation cohorts for the classifiers trained on RNA-seq and miRNA data, respectively. For both validation cohorts, we first select the features they share with the training set samples and then normalize the data using the same method as for the multi-omics data. In the study, M features have to be selected based on the cluster labels for the training set and both cohorts. The two cohorts are then used as validation data sets to test the model, and a classification result is finally obtained. We varied M between 50 and 100 and found that the trained model gives the best prediction result when M is set to 50.
RNA-seq, miRNA-seq and DNA methylation data of liver cancer are obtained from TCGA as the training data set. The preprocessing module constructs a univariate Cox-PH model to obtain the features with Per1 < 0.05; the screened multi-omics data are input into the data dimension reduction processing module; the dimension-reduced data are then input into the classification processing module, which constructs a univariate Cox-PH model again and screens out the features with Per2 < 0.05. Two subtypes with a significant survival difference are finally obtained through spectral clustering. Based on the resulting cluster labels, the classifier module, an XGBoost classifier, is trained, and real-time liver cancer multi-omics data are then input for survival prediction. To verify the effectiveness of the classifier in predicting survival, we validated the model on two data sets from GEO, GSE14520 and GSE31384, as shown in FIG. 2. For the survival curves of the two survival subtypes, the prediction results of the model are better than those of other published models.
Finally, we also compared our results with those of other models. In terms of both the log-rank P value and the C-index, our results are clearly better than those of the other models, as shown in FIG. 3.
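The C-index used in this comparison can be computed as in the following sketch, which assumes the standard definition of Harrell's concordance index: among comparable patient pairs, the fraction in which the patient with the shorter observed survival has the higher predicted risk, with ties in risk counted as one half. The function name and the toy data are hypothetical.

```python
def concordance_index(time, event, risk):
    """Harrell's C-index for right-censored survival data."""
    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a pair is comparable only if the earlier time is an observed event
        for j in range(n):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")

# Hypothetical cohort: the last patient is censored.
time = [1, 2, 3, 4]
event = [1, 1, 1, 0]
print(concordance_index(time, event, [4, 3, 2, 1]))  # risk ordered with the events
print(concordance_index(time, event, [1, 2, 3, 4]))  # risk ordered against the events
```

A value of 0.5 corresponds to a prediction no better than chance, and values closer to 1 indicate better agreement between predicted risk and observed survival.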
In the differential gene expression analysis, we identified 1465 up-regulated genes and 930 down-regulated genes, including the tumor marker gene BIRC5 (P = 2.07e-41) and the stem cell marker genes CD24 (P = 2.83e-11), KRT19 (P = 2.82e-26) and EPCAM (P = 1.01e-6). Furthermore, we found that 28 genes (SLC2A2, AQP9, RGN, SULT2A1, CRYL1, SERPINC1, PAH, CDO1, PLG, APOC3, CYP27A1, PFKFB3, TM4SF1, ACSL5, RGS2, HN1, SERPINA10, CYB5A, EPHX2, SPHX2, RGS1, ADH1B, LECT2, TBX3, RNASE4, ALDOA, ADH6, SLC38A1) differ between the two survival risk groups we identified and are strongly related to liver cancer survival.
For the differentially expressed genes obtained by the differential analysis, we also performed Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis on both subgroups. Tumor-related pathways such as the PI3K-Akt signaling pathway, the cell cycle signaling pathway and the p53 signaling pathway are enriched in the aggressive subtype (C2); the PI3K-Akt signaling pathway is also related to CD8+ T cell infiltration. The low-risk survival subtype (C1) is enriched in pathways such as drug metabolism, cytochrome P450, metabolic pathways and fatty acid degradation. These pathways are important for studying the prognosis of liver cancer.
Finally, the above embodiments only illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from their spirit and scope, and all such modifications and substitutions should be covered by the claims of the present invention.
Claims (4)
1. A liver cancer data processing system based on multi-omics data, characterized in that: the system comprises a preprocessing module, a data dimension reduction processing module, a classification processing module and a classifier module;
the preprocessing module is used for screening the liver cancer multi-omics data and outputting the screened target data to the data dimension reduction processing module;
the data dimension reduction processing module is used for receiving the target data output by the preprocessing module, performing dimension reduction on the target data, and outputting the dimension-reduced target data to the classification processing module;
the classification processing module is used for receiving the dimension-reduced target data output by the data dimension reduction processing module, performing classification processing on the dimension-reduced target data, and outputting classification labels;
the classifier module is used for receiving the classification labels, being trained with the classification labels, receiving real-time liver cancer multi-omics data, and predicting the survival time of liver cancer patients.
2. The liver cancer data processing system based on multi-omics data according to claim 1, characterized in that: the screening of the liver cancer multi-omics data by the preprocessing module comprises:
the preprocessing module scores each feature of the liver cancer multi-omics data based on a univariate Cox-PH model, compares each feature score Per1 with a set threshold $P_y$, screens out the features with Per1 < $P_y$, and fuses the screened data to form the target data.
3. The liver cancer data processing system based on multi-omics data according to claim 2, characterized in that: the dimension reduction performed on the target data by the data dimension reduction processing module specifically includes:
SA1, constructing a K-layer autoencoder in the data dimension reduction processing module, wherein the output function of the K-layer autoencoder is as follows:
$x' = \mathrm{ReLU}(W_i \cdot \mathrm{ReLU}(W_i x + b_i))$; wherein $W_i$ is the weight matrix between adjacent autoencoder layers, $b_i$ is the bias associated with the weight matrix $W_i$, and $x$ is a feature value of the m-dimensional target data $X = (x_1, x_2, \ldots, x_m)$;
SA2, the data dimension reduction processing module constructs a loss function, wherein the loss function is as follows:
SA3, performing iterative operations on the loss function to update the weight matrix $W_i$ and its bias $b_i$; after the set number of iterations is reached, the data dimension reduction processing module outputs the dimension-reduced target data.
4. The liver cancer data processing system based on multi-omics data according to claim 3, characterized in that: the classification processing performed by the classification processing module specifically includes:
SB1, the classification processing module uses a univariate Cox-PH model to score the features in the dimension-reduced target data again, compares each feature score Per2 with the set threshold $P_y$, screens out the features with Per2 < $P_y$, and fuses the screened data;
SB2, the classification processing module constructs a normalization model and normalizes the data processed in step SB1, wherein the normalization model is as follows:
wherein $p$ is the feature data output in step SB1, $P$ is the normalized feature data, $\mathrm{Var}(p)$ is the variance of the feature data $p$, and $E(p)$ is the empirical mean of the feature data $p$;
SB3, the classification processing module constructs a similarity function:
wherein $W(i, j)$ is the similarity between the i-th sample $z_i$ and the j-th sample $z_j$, and $\theta_{ij}$ is a normalization factor; wherein:
$\lambda_i$ is the set of k nearest neighbors of the i-th sample $z_i$, $\lambda_j$ is the set of k nearest neighbors of the j-th sample $z_j$, and $z_r$ denotes the r-th sample in $\lambda_i$.
SB4, the classification processing module determines the classification labels according to the similarity function and outputs them to the classifier module.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010963978.3A CN112086199B (en) | 2020-09-14 | 2020-09-14 | Liver cancer data processing system based on multi-omics data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010963978.3A CN112086199B (en) | 2020-09-14 | 2020-09-14 | Liver cancer data processing system based on multi-omics data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112086199A true CN112086199A (en) | 2020-12-15 |
CN112086199B CN112086199B (en) | 2023-06-09 |
Family
ID=73738141
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010963978.3A Active CN112086199B (en) | 2020-09-14 | 2020-09-14 | Liver cancer data processing system based on multi-omics data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112086199B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112820403A (en) * | 2021-02-25 | 2021-05-18 | 中山大学 | Deep learning method for predicting prognosis risk of cancer patient based on multiple groups of mathematical data |
CN115497561A (en) * | 2022-09-01 | 2022-12-20 | 北京吉因加医学检验实验室有限公司 | Method and device for layering screening of methylation markers |
CN115982644A (en) * | 2023-01-19 | 2023-04-18 | 中国医学科学院肿瘤医院 | Esophageal squamous cell carcinoma classification model construction and data processing method |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100292303A1 (en) * | 2007-07-20 | 2010-11-18 | Birrer Michael J | Gene expression profile for predicting ovarian cancer patient survival |
CN105512477A (en) * | 2015-12-03 | 2016-04-20 | 万达信息股份有限公司 | Unplanned readmission risk assessment prediction model based on dimension reduction combination classification algorithm |
US20170039345A1 (en) * | 2015-07-13 | 2017-02-09 | Biodesix, Inc. | Predictive test for melanoma patient benefit from antibody drug blocking ligand activation of the T-cell programmed cell death 1 (PD-1) checkpoint protein and classifier development methods |
JP6080184B1 (en) * | 2016-02-29 | 2017-02-15 | 常雄 小林 | Data collection method used to classify cancer life |
CN107066781A (en) * | 2016-11-03 | 2017-08-18 | 西南大学 | Analysis method based on the related colorectal cancer data model of h and E |
CN107132268A (en) * | 2017-06-21 | 2017-09-05 | 佛山科学技术学院 | A kind of data processing equipment and system for being used to recognize cancerous lung tissue |
CN107169535A (en) * | 2017-07-06 | 2017-09-15 | 谈宜勇 | The deep learning sorting technique and device of biological multispectral image |
US20180357377A1 (en) * | 2017-06-13 | 2018-12-13 | Alexander Bagaev | Systems and methods for generating, visualizing and classifying molecular functional profiles |
CN110010250A (en) * | 2019-04-29 | 2019-07-12 | 青岛科技大学 | Cardiovascular patient weakness disease stage division based on data mining technology |
CN110580956A (en) * | 2019-09-19 | 2019-12-17 | 青岛市市立医院 | liver cancer prognosis markers and application thereof |
CN110852291A (en) * | 2019-11-15 | 2020-02-28 | 太原科技大学 | Palate wrinkle identification method adopting Gabor transformation and blocking dimension reduction |
CN111161882A (en) * | 2019-12-04 | 2020-05-15 | 深圳先进技术研究院 | Breast cancer life prediction method based on deep neural network |
US20200211716A1 (en) * | 2018-12-31 | 2020-07-02 | Tempus Labs | Method and process for predicting and analyzing patient cohort response, progression, and survival |
- 2020-09-14: CN application CN202010963978.3A (CN112086199B), status: Active
Patent Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100292303A1 (en) * | 2007-07-20 | 2010-11-18 | Birrer Michael J | Gene expression profile for predicting ovarian cancer patient survival |
US20170039345A1 (en) * | 2015-07-13 | 2017-02-09 | Biodesix, Inc. | Predictive test for melanoma patient benefit from antibody drug blocking ligand activation of the T-cell programmed cell death 1 (PD-1) checkpoint protein and classifier development methods |
CN105512477A (en) * | 2015-12-03 | 2016-04-20 | 万达信息股份有限公司 | Unplanned readmission risk assessment prediction model based on dimension reduction combination classification algorithm |
JP6080184B1 (en) * | 2016-02-29 | 2017-02-15 | 常雄 小林 | Data collection method used to classify cancer life |
CN107066781A (en) * | 2016-11-03 | 2017-08-18 | 西南大学 | Analysis method for a colorectal cancer data model based on heredity and environment |
US20180357377A1 (en) * | 2017-06-13 | 2018-12-13 | Alexander Bagaev | Systems and methods for generating, visualizing and classifying molecular functional profiles |
CN107132268A (en) * | 2017-06-21 | 2017-09-05 | 佛山科学技术学院 | Data processing device and system for identifying lung cancer tissue |
CN107169535A (en) * | 2017-07-06 | 2017-09-15 | 谈宜勇 | Deep learning classification method and device for biological multispectral images |
US20200211716A1 (en) * | 2018-12-31 | 2020-07-02 | Tempus Labs | Method and process for predicting and analyzing patient cohort response, progression, and survival |
CN110010250A (en) * | 2019-04-29 | 2019-07-12 | 青岛科技大学 | Frailty grading method for cardiovascular patients based on data mining technology |
CN110580956A (en) * | 2019-09-19 | 2019-12-17 | 青岛市市立医院 | Liver cancer prognosis markers and application thereof |
CN110852291A (en) * | 2019-11-15 | 2020-02-28 | 太原科技大学 | Palatal rugae identification method using Gabor transform and block-wise dimension reduction |
CN111161882A (en) * | 2019-12-04 | 2020-05-15 | 深圳先进技术研究院 | Breast cancer survival prediction method based on deep neural network |
Non-Patent Citations (5)
Title |
---|
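TONG, DY: "Improving prediction performance of colon cancer prognosis based on the integration of clinical and multi-omics data", BMC Medical Informatics and Decision Making, vol. 20, no. 1 * |
潘浩; 王昭; 姚佳文: "Application of deep learning to survival prediction of lung cancer patients", Computer Engineering and Applications (计算机工程与应用), no. 14 * |
田梓君; 崔新于: "Tumor gene selection system based on data processing", Wireless Internet Technology (无线互联科技), no. 08 * |
陈景安: "Dimension reduction and survival prediction analysis of clinical data of breast cancer patients", Medicine and Health Sciences series (医药卫生科技辑), pages 072 - 1918 * |
齐惠颖: "Constructing a breast cancer survival prediction model based on multi-omics data fusion", Data Analysis and Knowledge Discovery (数据分析与知识发现), no. 8, pages 88 - 93 * |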
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112820403A (en) * | 2021-02-25 | 2021-05-18 | 中山大学 | Deep learning method for predicting prognosis risk of cancer patient based on multi-omics data |
CN112820403B (en) * | 2021-02-25 | 2024-03-29 | 中山大学 | Deep learning method for predicting prognosis risk of cancer patient based on multi-omics data |
CN115497561A (en) * | 2022-09-01 | 2022-12-20 | 北京吉因加医学检验实验室有限公司 | Method and device for layered screening of methylation markers |
CN115497561B (en) * | 2022-09-01 | 2023-08-29 | 北京吉因加医学检验实验室有限公司 | Methylation marker layered screening method and device |
CN115982644A (en) * | 2023-01-19 | 2023-04-18 | 中国医学科学院肿瘤医院 | Esophageal squamous cell carcinoma classification model construction and data processing method |
CN115982644B (en) * | 2023-01-19 | 2024-04-30 | 中国医学科学院肿瘤医院 | Esophageal squamous cell carcinoma classification model construction and data processing method |
Also Published As
Publication number | Publication date |
---|---|
CN112086199B (en) | 2023-06-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Liu et al. | DeepCDR: a hybrid graph convolutional network for predicting cancer drug response | |
Caudai et al. | AI applications in functional genomics | |
CN112086199A (en) | Liver cancer data processing system based on multiple groups of mathematical data | |
Zhang et al. | CircRNA-disease associations prediction based on metapath2vec++ and matrix factorization | |
WO2018136888A1 (en) | Methods for non-invasive assessment of genetic alterations | |
Arslan et al. | Machine learning in epigenomics: Insights into cancer biology and medicine | |
Zeng et al. | coupleCoC+: An information-theoretic co-clustering-based transfer learning framework for the integrative analysis of single-cell genomic data | |
CN115116624A (en) | Drug sensitivity prediction method and device based on semi-supervised transfer learning | |
Dou et al. | Single-nucleotide variant calling in single-cell sequencing data with Monopogen | |
Titus et al. | Unsupervised deep learning with variational autoencoders applied to breast tumor genome-wide DNA methylation data with biologic feature extraction | |
Ming et al. | LPM: a latent probit model to characterize the relationship among complex traits using summary statistics from multiple GWASs and functional annotations | |
Sun et al. | Molecular subtyping of cancer based on distinguishing co-expression modules and machine learning | |
Thibodeau et al. | CoRE-ATAC: A deep learning model for the functional classification of regulatory elements from single cell and bulk ATAC-seq data | |
CN114360642A (en) | Cancer transcriptome data processing method based on gene co-expression network analysis | |
Kalyakulina et al. | Disease classification for whole-blood DNA methylation: meta-analysis, missing values imputation, and XAI | |
Shi et al. | Fundamental and practical approaches for single-cell ATAC-seq analysis | |
CN112037863B (en) | Early NSCLC prognosis prediction system | |
KR20210110241A (en) | Prediction system and method of cancer immunotherapy drug Sensitivity using multiclass classification A.I based on HLA Haplotype | |
CN110211634B (en) | Method for joint analysis of multi-omics data | |
CN117457065A (en) | Method and system for identifying phenotype-associated cell types based on single-cell multi-omics data | |
CN113921084B (en) | Multi-dimensional target prediction method and system for disease-related non-coding RNA (ribonucleic acid) regulation and control axis | |
Gong et al. | Interpretable single-cell transcription factor prediction based on deep learning with attention mechanism | |
Poinsignon et al. | Working with Omics Data: An Interdisciplinary Challenge at the Crossroads of Biology and Computer Science | |
Shanan et al. | Using alignment-free methods as preprocessing stage to classification whole genomes | |
Chowdhury et al. | Predicting High-Risk Individuals for Common Diseases Using Multi-Omics and Epidemiological Data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||