CN114334014A - Cancer subtype identification method and system based on self-attention deep learning - Google Patents

Info

Publication number
CN114334014A
Authority
CN
China
Prior art keywords
data
learning
self
features
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202111677858.8A
Other languages
Chinese (zh)
Inventor
巩萍
孙秋文
程磊
张志远
孟军
葛海涛
陈洁
章龙珍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xuzhou Medical University
Original Assignee
Xuzhou Medical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-12-31
Filing date
2021-12-31
Publication date
2022-04-12
Application filed by Xuzhou Medical University
Priority to CN202111677858.8A
Publication of CN114334014A
Legal status: Withdrawn

Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a cancer subtype identification method and system based on self-attention deep learning. The method comprises the following steps: first, multi-omics cancer data are preprocessed; a deep learning Dense network then learns the low-dimensional features of each omics separately, and the features of the different omics are preliminarily integrated by concatenation; next, a similarity matrix between samples is constructed using self-attention, and feature fusion is performed according to the matrix weights and the concatenated features to obtain the final integrated feature representation. A decoder is used to minimize the error between the fused features and the original omics features, and a discriminator is used for adversarial learning of the integrated feature distribution. Finally, the learned integrated feature distribution is clustered with a Gaussian mixture model to identify cancer subtypes. The invention can effectively integrate multi-omics data, adaptively model the relationships between samples, learn a better feature representation, obtain better clustering results, and achieve accurate identification of cancer subtypes.

Description

Cancer subtype identification method and system based on self-attention deep learning
Technical Field
The invention relates to the technical field of bioinformatics, and in particular to a cancer subtype identification method and system based on self-attention deep learning.
Background
The diagnosis, treatment and prognosis evaluation of cancer are among the most urgent and important research subjects in the life sciences and medicine. Research shows that cancers are highly heterogeneous: tumors with the same clinical stage or tissue morphology can differ greatly in molecular type. Molecular typing plays a crucial role in selecting preoperative treatment schemes and in patient prognosis, and is an important basis for individualized treatment, particularly endocrine therapy and targeted therapy.
Early studies of cancer molecular typing mainly used single-omics data. Such typing depends on the type of data used, and the results obtained from different types of omics data do not agree, which leads to low model accuracy. Cancer heterogeneity is not expressed at a single omics level only, but at the genomic, transcriptomic, epigenomic and other omics levels. Because cancer is a class of highly complex diseases caused by different factors, research based on single-omics data can hardly meet the needs of scientific research. Different omics data are complementary, and combining multi-omics data can better reveal the mechanisms of tumor occurrence and development, which provides a new research direction for tumor molecular typing.
Feature extraction is the basis of multi-omics research. Good features reflect the subtle differences and deeper information of a tumor, and are discriminative, robust and reproducible. Omics data are usually high-dimensional with few samples, and results obtained by applying traditional data mining methods directly to such data often generalize poorly. This is because a high-dimensional feature space combined with a small number of samples causes the curse of dimensionality: as the feature dimension increases, the difficulty of building a model that generalizes grows exponentially, which leads to overfitting.
To overcome the curse of dimensionality in high-dimensional omics data analysis, features must be extracted from the original data to reduce the dimensionality of each omics dataset. In recent years, deep learning has gradually been applied to feature extraction from multi-omics data because of its strong feature learning capability. Deep learning models the learning process of the human brain with a multilayer neural network, drawing on the brain's multilayer abstraction mechanism to obtain abstract representations of data and thereby learn more useful features. Cancer typing based on deep learning is a current research focus.
Most current deep-learning-based multi-omics methods for cancer subtype identification integrate the multi-omics data at the input stage and then learn features with a deep learning model. These methods ignore the feature differences between omics and the relationships between samples. To solve these problems, the invention proposes a new cancer subtype identification method based on self-attention deep learning. The method fully considers the differences between the features of each omics and the relationships between samples over the features of all omics.
Disclosure of Invention
Purpose of the invention: in view of the problems and deficiencies of the prior art described above, the invention aims to provide a cancer subtype identification method and system based on self-attention deep learning.
Technical scheme: to achieve the above purpose, the invention adopts the following technical scheme:
A cancer subtype identification method based on self-attention deep learning comprises the following steps:
learning the low-dimensional features of each omics separately using a deep learning Dense network, and concatenating the learned low-dimensional features of the different omics to obtain concatenated features;
constructing a similarity matrix between samples using a self-attention mechanism, and performing feature fusion according to the matrix weights of the similarity matrix and the concatenated features to obtain the final integrated feature representation;
minimizing the error between the original features and the integrated features through a decoder, performing adversarial learning of the integrated feature distribution through a discriminator, and training to obtain the optimal integrated feature distribution;
and clustering the trained integrated feature distribution using a Gaussian mixture model to obtain the subtypes of the cancer samples.
Preferably, the method further comprises preprocessing the omics data of the cancer samples, comprising the following steps:
preprocessing four different omics data types of the cancer samples, namely mRNA data, miRNA data, DNA copy number variation data and DNA methylation data;
applying a logarithmic transformation to the mRNA and miRNA data to reduce the absolute magnitude of the values;
removing the repeated regions from the DNA copy number variation data, and constructing features according to the correspondence between samples and genomic regions;
for the DNA methylation data, integrating the methylation information and calculating the average for each sample;
and normalizing the omics data.
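The log transformation and normalization steps above can be sketched as follows. This is a minimal illustration only: the text does not state the logarithm base, the pseudo-count, or the normalization scheme, so `log2(x + 1)` and per-feature min-max scaling are assumptions, and the function names and the toy `mrna` matrix are hypothetical.

```python
import numpy as np

def log_transform(expr):
    """Apply log2(x + 1) to non-negative expression values (assumed form)."""
    return np.log2(np.asarray(expr, dtype=float) + 1.0)

def min_max_normalize(features):
    """Scale each feature column to [0, 1]; constant columns map to 0."""
    features = np.asarray(features, dtype=float)
    lo = features.min(axis=0)
    span = features.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero for constant columns
    return (features - lo) / span

# Toy mRNA matrix: rows are samples, columns are genes (hypothetical values).
mrna = np.array([[0.0, 3.0], [7.0, 15.0], [1.0, 3.0]])
mrna_scaled = min_max_normalize(log_transform(mrna))
```

The same `min_max_normalize` call could be applied to each of the four omics matrices after their own type-specific steps.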
Preferably, learning the low-dimensional features of each omics separately using the deep learning Dense network comprises the following steps:
extracting the features of each omics dataset separately using a deep learning Dense network:
let $x_k \in \mathbb{R}^{N \times D}$ denote the input data of the $k$-th omics and $y_k \in \mathbb{R}^{N \times d}$, $k \in \{1, 2, 3, 4\}$, denote the output features of the $k$-th omics after passing through the network, where $N$ is the number of samples, and $D$ and $d$ denote the dimensionality of the input data and of the output features, respectively;
through the Dense network, $y_k$ is expressed as:
$$y_k = W_k x_k + b_k$$
where $W_k$ denotes the weight matrix of the network and $b_k$ denotes the bias;
the $y_k$ are concatenated to obtain the concatenated feature matrix $Y$:
$$Y = \mathrm{Concat}(y_1, \dots, y_4)$$
the concatenated feature matrix $Y$ has size $N \times 4d$; to prevent the network from overfitting, a batch normalization (BN) layer is added after the Dense network, and the GELU function is used as the nonlinear activation function to obtain the feature matrix $Y'$:
$$Y' = \mathrm{GELU}(\mathrm{BN}(Y))$$
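The Dense mapping, concatenation, batch normalization and GELU activation described above can be sketched in NumPy as a forward pass only. The sizes, the random weights and the per-omics input dimensions are hypothetical; samples are stored as rows, so the affine map is written `x @ w + b`; the batch normalization here has no learned scale or shift, and GELU uses the common tanh approximation. None of these details are specified in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, w, b):
    """Affine map for one omics block: (N, D_k) @ (D_k, d) + (d,)."""
    return x @ w + b

def batch_norm(y, eps=1e-5):
    """Standardize each column over the batch (no learned scale or shift)."""
    return (y - y.mean(axis=0)) / np.sqrt(y.var(axis=0) + eps)

def gelu(y):
    """tanh approximation of the GELU activation."""
    return 0.5 * y * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (y + 0.044715 * y**3)))

n_samples, d = 6, 3
input_dims = [10, 8, 12, 9]  # hypothetical input sizes for the four omics
omics = [rng.normal(size=(n_samples, dim)) for dim in input_dims]
weights = [rng.normal(scale=0.1, size=(dim, d)) for dim in input_dims]
biases = [np.zeros(d) for _ in input_dims]

# Y has shape (N, 4d); Y_prime = GELU(BN(Y)) has the same shape.
Y = np.concatenate([dense(x, w, b) for x, w, b in zip(omics, weights, biases)], axis=1)
Y_prime = gelu(batch_norm(Y))
```

In a trained model the weights would be learned jointly with the rest of the network rather than drawn at random.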
Preferably, constructing a similarity matrix between samples using a self-attention mechanism, and performing feature fusion according to the matrix weights of the similarity matrix and the concatenated features to obtain the final integrated feature distribution, comprises the following steps:
regarding the concatenated feature of each sample as a word in a sentence, let:
$$d_k = 4d$$
the query, key and value matrices $Q, K, V \in \mathbb{R}^{N \times d_k}$ are all derived from $Y'$:
$$Q = Y' W_Q$$
$$K = Y' W_K$$
$$V = Y' W_V$$
where $Q$, $K$, $V$ denote the query, key and value matrices, and $W_Q$, $W_K$, $W_V$ denote the linear projection parameters;
the similarity between samples $i$ and $j$ is then expressed as:
$$s_{ij} = \frac{q_i k_j^{\top}}{\sqrt{d_k}}$$
where $q_i$ is the $i$-th row of $Q$, $k_j$ is the $j$-th row of $K$, and $\sqrt{d_k}$ is the scaling factor;
the fused features are $Z = (z_1, \dots, z_N)$, where the $i$-th feature vector $z_i$ is calculated as follows:
let $\alpha_i$ be the similarity weight vector of sample $i$ with respect to all samples; $\alpha_i$ is expressed as:
$$\alpha_i = \mathrm{softmax}(s_{i1}, \dots, s_{iN}), \qquad \alpha_{ij} = \frac{\exp(s_{ij})}{\sum_{l=1}^{N} \exp(s_{il})}$$
let the fused feature vector of sample $i$ be $z_i$; each row vector $v_j$ of $V$ is multiplied by its weight $\alpha_{ij}$ and the results are summed:
$$z_i = \sum_{j=1}^{N} \alpha_{ij} v_j$$
the integrated features of all samples are expressed as:
$$Z = \mathrm{softmax}\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V$$
a batch normalization layer is added after the self-attention model to keep the data distribution unchanged; assuming that $Z$ follows a Gaussian distribution $Z \sim N(u, \sigma^2)$, a fully connected layer is used to learn the mean $u$ and variance $\sigma^2$ of $Z$ directly, which gives the integrated feature distribution $S(z)$.
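The sample-level self-attention fusion described above can be sketched as a forward pass in NumPy. The sketch assumes standard scaled dot-product attention with a row-wise softmax and square projection matrices of size `d_k x d_k`; the text does not state the projection sizes, and all names and values below are hypothetical.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax with max subtraction for numerical stability."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def self_attention_fusion(y_prime, w_q, w_k, w_v):
    """Fuse sample features with scaled dot-product attention over samples."""
    q, k, v = y_prime @ w_q, y_prime @ w_k, y_prime @ w_v
    d_k = k.shape[1]
    alpha = softmax(q @ k.T / np.sqrt(d_k))  # (N, N) sample similarity weights
    return alpha @ v, alpha                  # fused features Z and weights

rng = np.random.default_rng(0)
n_samples, d_k = 5, 8  # hypothetical sizes; d_k corresponds to 4d in the text
y_prime = rng.normal(size=(n_samples, d_k))
w_q, w_k, w_v = (rng.normal(scale=0.3, size=(d_k, d_k)) for _ in range(3))
Z, alpha = self_attention_fusion(y_prime, w_q, w_k, w_v)
```

Each row of `alpha` holds the weights one sample assigns to every sample in the batch, so each row of `Z` is a weighted mixture of the value vectors.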
Preferably, minimizing the error between the original features and the integrated features through the decoder comprises the following steps:
let the input of the network be:
$$X = \{x_1, x_2, x_3, x_4\}$$
and the output of the decoder be:
$$X' = \{x_1', x_2', x_3', x_4'\}$$
the loss function $L_1$ between $X$ and $X'$, based on the Euclidean distance, is expressed as:
$$L_1 = \sum_{k=1}^{4} \lVert x_k - x_k' \rVert_2^2$$
preferably, so that $S(z)$ better fits a Gaussian distribution, adversarial learning is performed using a discriminator: samples are randomly generated from a normal distribution $P(z)$ with mean $u$ and variance $\sigma^2$; the generated normal distribution $P(z)$ and the learned integrated feature distribution $S(z)$ are input into the discriminator for adversarial learning, so that $S(z)$ approaches $P(z)$; the discriminator uses a binary cross-entropy loss function:
$$L_2 = -\mathbb{E}_{z' \sim P(z)}\left[\log D(z')\right] - \mathbb{E}_{z \sim S(z)}\left[\log\left(1 - D(S(z))\right)\right]$$
the final network training loss function combines $L_1$ and $L_2$:
$$L = \lambda_1 L_1 + \lambda_2 L_2$$
where $\lambda_1, \lambda_2 \in [0, 1]$ are the weight parameters of the loss functions.
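The two loss terms and their weighted sum can be sketched numerically as below. The sketch assumes the reconstruction term is the squared Euclidean distance averaged over samples and summed over omics, and that the discriminator outputs probabilities in (0, 1); neither detail is given explicitly in the text, and the function names and toy values are hypothetical.

```python
import numpy as np

def reconstruction_loss(inputs, outputs):
    """Sum over omics of the mean squared Euclidean distance per sample."""
    return sum(
        np.mean(np.sum((x - x_rec) ** 2, axis=1))
        for x, x_rec in zip(inputs, outputs)
    )

def discriminator_loss(d_prior, d_learned, eps=1e-12):
    """Binary cross-entropy: prior samples labelled 1, learned features labelled 0."""
    d_prior = np.clip(d_prior, eps, 1.0 - eps)
    d_learned = np.clip(d_learned, eps, 1.0 - eps)
    return -np.mean(np.log(d_prior)) - np.mean(np.log(1.0 - d_learned))

def total_loss(l1, l2, lam1=1.0, lam2=1.0):
    """Weighted sum L = lam1 * L1 + lam2 * L2."""
    return lam1 * l1 + lam2 * l2

# Toy example with a single omics block and an undecided discriminator.
inputs = [np.array([[1.0, 2.0], [3.0, 4.0]])]
outputs = [np.array([[1.0, 2.0], [3.0, 2.0]])]
l1 = reconstruction_loss(inputs, outputs)
l2 = discriminator_loss(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
loss = total_loss(l1, l2, lam1=0.5, lam2=0.5)
```

A discriminator that outputs 0.5 everywhere cannot tell the two distributions apart, which is the state the adversarial training aims for.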
Preferably, clustering the trained integrated feature distribution comprises the following steps:
given the fused features $Z = \{z_1, \dots, z_N\}$, let $K$ be the number of clusters and $p(z_n)$ the probability density function of the mixture of Gaussian distributions; the clustering process based on the Gaussian mixture model is expressed as:
$$p(z_n) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(z_n \mid \mu_k, \Sigma_k)$$
where $\pi = (\pi_1, \pi_2, \dots, \pi_K)$, $\mu = (\mu_1, \mu_2, \dots, \mu_K)$ and $\Sigma = (\Sigma_1, \Sigma_2, \dots, \Sigma_K)$ denote the weights, means and covariances of the clustering model, respectively;
the Gaussian mixture model updates the parameters $\theta = (\pi, \mu, \Sigma)$ using the EM algorithm;
after training is finished, the most appropriate subtype label of each sample is obtained according to its maximum probability density across the clusters.
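The EM updates for the Gaussian mixture model can be sketched as follows. To keep the example short, the sketch restricts each covariance to a diagonal matrix and initializes the means from samples spread evenly through the input order; both choices are simplifying assumptions, not part of the described method, and the toy data are hypothetical.

```python
import numpy as np

def gmm_em(z, k, n_iter=50, reg=1e-6):
    """Fit a diagonal-covariance Gaussian mixture with EM; return labels and parameters."""
    n, _ = z.shape
    pi = np.full(k, 1.0 / k)
    mu = z[np.linspace(0, n - 1, k).astype(int)].copy()  # deterministic initial means
    var = np.tile(z.var(axis=0) + reg, (k, 1))
    for _ in range(n_iter):
        # E-step: responsibilities from log(pi_k) + log N(z_n | mu_k, diag(var_k)).
        diff = z[:, None, :] - mu[None, :, :]
        log_pdf = -0.5 * np.sum(diff**2 / var + np.log(2.0 * np.pi * var), axis=2)
        log_w = np.log(pi) + log_pdf
        log_w -= log_w.max(axis=1, keepdims=True)
        resp = np.exp(log_w)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and variances from the responsibilities.
        nk = resp.sum(axis=0) + 1e-12
        pi = nk / n
        mu = (resp.T @ z) / nk[:, None]
        diff = z[:, None, :] - mu[None, :, :]
        var = np.einsum("nk,nkd->kd", resp, diff**2) / nk[:, None] + reg
    return resp.argmax(axis=1), (pi, mu, var)

# Two well-separated toy clusters standing in for the fused features.
rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=0.0, scale=0.5, size=(10, 2))
cluster_b = rng.normal(loc=10.0, scale=0.5, size=(10, 2))
fused = np.vstack([cluster_a, cluster_b])
labels, (pi, mu, var) = gmm_em(fused, k=2)
```

In practice a library implementation with full covariances, such as scikit-learn's `GaussianMixture`, would usually replace this hand-written loop.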
A cancer subtype identification system based on self-attention deep learning comprises:
a deep learning Dense network for learning the low-dimensional features of each omics;
a feature concatenation module for concatenating the low-dimensional features of the different omics to obtain concatenated features;
a self-attention module for constructing a similarity matrix between samples;
a feature fusion module for fusing the concatenated features according to the matrix weights of the similarity matrix to obtain the integrated feature representation;
a decoder for minimizing the error between the original features and the integrated features;
a discriminator for adversarial learning of the integrated feature distribution to obtain the optimal integrated feature distribution;
and a clustering module for clustering the optimal integrated feature distribution to obtain the subtypes of the cancer samples.
The beneficial effects of the invention are as follows:
The invention provides a cancer subtype identification method based on self-attention deep learning that combines the characteristics of multi-omics data with the advantages of the self-attention mechanism. The main contributions of the invention are: (1) the features of each omics are learned separately, according to the characteristics of multi-omics data; (2) the self-attention mechanism takes the omics features fully into account and adaptively constructs the relationship weights between samples; (3) the constructed model effectively integrates multi-omics data and obtains a better clustering effect.
Drawings
FIG. 1 is a flow chart of a method of an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Example 1
The invention discloses a cancer subtype identification method and system based on self-attention deep learning, which are shown in figure 1 and specifically comprise the following steps:
Step 1: preprocess the four omics datasets of the cancer samples separately. For the mRNA and miRNA expression data, a log transformation is first applied to reduce the absolute magnitude of the values. For the DNA copy number variation data, the repeated regions are first removed, and features are then constructed according to the correspondence between samples and genomic regions. For the DNA methylation data, since each sample corresponds to many methylation sites, the methylation information is first integrated and the average for each sample is calculated. Multi-omics cancer data contain missing values to varying degrees; the missing values in each omics dataset are filled with the sample mean. Finally, the omics data are normalized.
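The mean-based filling of missing values mentioned in step 1 can be sketched as below. The text does not say whether the mean is taken per feature across samples or per sample across features; the sketch assumes a per-feature mean across samples and marks missing entries as NaN. The function name and the toy matrix are hypothetical.

```python
import numpy as np

def fill_missing_with_mean(data):
    """Replace NaN entries with the mean of the observed values in the same column."""
    data = np.array(data, dtype=float)  # work on a copy
    col_mean = np.nanmean(data, axis=0)
    rows, cols = np.where(np.isnan(data))
    data[rows, cols] = col_mean[cols]
    return data

# Toy omics matrix: rows are samples, columns are features, NaN marks a missing value.
raw = np.array([[1.0, np.nan], [3.0, 4.0], [np.nan, 8.0]])
filled = fill_missing_with_mean(raw)
```

A column with no observed values at all would stay NaN under this sketch and would need separate handling, for example removal of the feature.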
Step 2: extract the features of each omics separately using a deep learning Dense network, and preliminarily integrate the extracted features of the different omics by concatenation.
Let $x_k \in \mathbb{R}^{N \times D}$ denote the input data of the $k$-th omics and $y_k \in \mathbb{R}^{N \times d}$, $k \in \{1, 2, 3, 4\}$, denote the output features of the $k$-th omics after passing through the network, where $N$ is the number of samples, and $D$ and $d$ denote the dimensionality of the input data and of the output features, respectively. Through the Dense network, $y_k$ is expressed as:
$$y_k = W_k x_k + b_k$$
where $W_k$ denotes the weight matrix of the network and $b_k$ denotes the bias. The $y_k$ are concatenated to obtain the concatenated feature matrix $Y$:
$$Y = \mathrm{Concat}(y_1, \dots, y_4)$$
The matrix $Y$ has size $N \times 4d$. To prevent the network from overfitting, a batch normalization (BN) layer is added after the Dense network and the GELU function is used as the nonlinear activation function, i.e.:
$$Y' = \mathrm{GELU}(\mathrm{BN}(Y))$$
Step 3: construct a similarity matrix between samples using a self-attention mechanism, and perform feature fusion according to the matrix weights and the concatenated features to obtain the final integrated feature representation and integrated feature distribution.
Let $d_k = 4d$. The query, key and value matrices $Q, K, V \in \mathbb{R}^{N \times d_k}$ are all derived from $Y'$: $Q = Y' W_Q$, $K = Y' W_K$, $V = Y' W_V$, where $W_Q$, $W_K$, $W_V$ denote the linear projection parameters. The self-attention-based sample fusion process is as follows. The similarity between samples $i$ and $j$ is first calculated:
$$s_{ij} = \frac{q_i k_j^{\top}}{\sqrt{d_k}}$$
where $q_i$ is the $i$-th row of $Q$, $k_j$ is the $j$-th row of $K$, and $\sqrt{d_k}$ is the scaling factor. Let $\alpha_i$ be the similarity weight vector of sample $i$ with respect to all samples; $\alpha_i$ is expressed as:
$$\alpha_i = \mathrm{softmax}(s_{i1}, \dots, s_{iN}), \qquad \alpha_{ij} = \frac{\exp(s_{ij})}{\sum_{l=1}^{N} \exp(s_{il})}$$
Let the fused feature of sample $i$ be $z_i$. Each row vector $v_j$ of $V$ is multiplied by the corresponding weight $\alpha_{ij}$ and the results are summed:
$$z_i = \sum_{j=1}^{N} \alpha_{ij} v_j$$
The integrated features of all samples are then expressed as:
$$Z = \mathrm{softmax}\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V$$
A batch normalization (BN) layer is added after the self-attention model to keep the data distribution unchanged. Assuming that $Z$ follows a Gaussian distribution $Z \sim N(u, \sigma^2)$, a fully connected layer is used to learn the mean $u$ and variance $\sigma^2$ of $Z$ directly, which gives the integrated feature distribution $S(z)$.
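Learning a mean and a variance with fully connected layers, as described at the end of step 3, can be sketched as follows. The sketch assumes two linear heads, one for the mean and one for the log-variance, and draws a sample as `mean + std * noise`; the log-variance parameterization and this sampling step are assumptions borrowed from common variational models, not details given in the text. All names, sizes and weights are hypothetical.

```python
import numpy as np

def gaussian_head(z, w_mu, b_mu, w_logvar, b_logvar, rng):
    """Two fully connected heads predict a mean and a log-variance per sample."""
    mu = z @ w_mu + b_mu
    log_var = z @ w_logvar + b_logvar
    # Draw one sample per row from N(mu, exp(log_var)).
    sample = mu + np.exp(0.5 * log_var) * rng.standard_normal(mu.shape)
    return mu, np.exp(log_var), sample

rng = np.random.default_rng(0)
n_samples, d_in, d_latent = 4, 6, 2  # hypothetical sizes
Z = rng.normal(size=(n_samples, d_in))
w_mu = rng.normal(scale=0.1, size=(d_in, d_latent))
w_logvar = np.zeros((d_in, d_latent))  # zero weights give unit variance
u, var, s_z = gaussian_head(
    Z, w_mu, np.zeros(d_latent), w_logvar, np.zeros(d_latent), rng
)
```

Predicting the log-variance rather than the variance keeps the variance positive without any constraint on the weights.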
Step 4: network training. To obtain a good feature representation, the error between the original features and the integrated features is minimized through a decoder; so that the learned integrated feature distribution better fits a Gaussian distribution, adversarial learning is performed through a discriminator.
Let the input of the network be $X = \{x_1, x_2, x_3, x_4\}$ and the output of the decoder be $X' = \{x_1', x_2', x_3', x_4'\}$. The loss function $L_1$ between $X$ and $X'$ is expressed as:
$$L_1 = \sum_{k=1}^{4} \lVert x_k - x_k' \rVert_2^2$$
So that $S(z)$ better fits a Gaussian distribution, a discriminator is used for adversarial learning: samples are randomly generated from a normal distribution $P(z)$ with mean $u$ and variance $\sigma^2$; the integrated feature distribution $S(z)$ and the normal distribution $P(z)$ are input into the discriminator, and $S(z)$ is driven towards $P(z)$ through learning. The discriminator uses a binary cross-entropy loss function, defined as follows:
$$L_2 = -\mathbb{E}_{z' \sim P(z)}\left[\log D(z')\right] - \mathbb{E}_{z \sim S(z)}\left[\log\left(1 - D(S(z))\right)\right]$$
The final loss function for network training consists of two parts, $L_1$ and $L_2$:
$$L = \lambda_1 L_1 + \lambda_2 L_2$$
where $\lambda_1, \lambda_2 \in [0, 1]$ are weight parameters.
Step 5: cluster the integrated feature distribution obtained by training using a Gaussian mixture model to obtain the subtypes of the cancer samples.
Given the number of clusters $K$, the Gaussian mixture model for the fused feature distribution is expressed as:
$$p(z_n) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(z_n \mid \mu_k, \Sigma_k)$$
where $\pi = (\pi_1, \pi_2, \dots, \pi_K)$, $\mu = (\mu_1, \mu_2, \dots, \mu_K)$ and $\Sigma = (\Sigma_1, \Sigma_2, \dots, \Sigma_K)$ denote the weights, means and covariances of the clustering model, respectively. The parameters $\theta = (\pi, \mu, \Sigma)$ are updated using the EM algorithm. After training is finished, the most appropriate subtype label of each sample is obtained according to its maximum probability density across the clusters.
The present invention is not limited to the above preferred embodiments, and any modifications, equivalent substitutions and improvements made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (8)

1. A cancer subtype identification method based on self-attention deep learning, characterized by comprising the following steps:
learning the low-dimensional features of each omics separately using a deep learning Dense network, and concatenating the learned low-dimensional features of the different omics to obtain concatenated features;
constructing a similarity matrix between samples using a self-attention mechanism, and performing feature fusion according to the matrix weights of the similarity matrix and the concatenated features to obtain the final integrated feature representation;
minimizing the error between the original features and the integrated features through a decoder, and performing adversarial learning of the integrated feature distribution through a discriminator to obtain the optimal integrated feature distribution after training;
and clustering the optimal integrated feature distribution obtained after training to obtain the subtypes of the cancer samples.
2. The cancer subtype identification method based on self-attention deep learning according to claim 1, further comprising preprocessing the multi-omics data of the cancer samples, comprising the following steps:
preprocessing four different omics data types of the cancer samples, namely mRNA data, miRNA data, DNA copy number variation data and DNA methylation data;
applying a logarithmic transformation to the mRNA and miRNA data to reduce the absolute magnitude of the values;
removing the repeated regions from the DNA copy number variation data, and constructing features according to the correspondence between samples and genomic regions;
for the DNA methylation data, integrating the methylation information and calculating the average for each sample;
and normalizing the omics data.
3. The cancer subtype identification method based on self-attention deep learning according to claim 1, characterized in that learning the low-dimensional features of each omics separately using the deep learning Dense network comprises the following steps:
extracting the features of each omics dataset separately using a deep learning Dense network:
let $x_k \in \mathbb{R}^{N \times D}$ denote the input data of the $k$-th omics and $y_k \in \mathbb{R}^{N \times d}$, $k \in \{1, 2, 3, 4\}$, denote the output features of the $k$-th omics after passing through the network, where $N$ is the number of samples, and $D$ and $d$ denote the dimensionality of the input data and of the output features, respectively;
through the Dense network, $y_k$ is expressed as:
$$y_k = W_k x_k + b_k$$
where $W_k$ denotes the weight matrix of the network and $b_k$ denotes the bias;
the $y_k$ are concatenated to obtain the concatenated feature matrix $Y$:
$$Y = \mathrm{Concat}(y_1, \dots, y_4)$$
the concatenated feature matrix $Y$ has size $N \times 4d$; to prevent the network from overfitting, a batch normalization (BN) layer is added after the Dense network, and the GELU function is used as the nonlinear activation function to obtain the feature matrix $Y'$:
$$Y' = \mathrm{GELU}(\mathrm{BN}(Y))$$
4. The cancer subtype identification method based on self-attention deep learning according to claim 3, characterized in that constructing the similarity matrix between samples using the self-attention mechanism, and performing feature fusion according to the matrix weights of the similarity matrix and the concatenated features to obtain the final integrated feature representation, comprises the following steps:
regarding the concatenated feature of each sample as a word in a sentence, let:
$$d_k = 4d$$
the query, key and value matrices $Q, K, V \in \mathbb{R}^{N \times d_k}$ are all derived from $Y'$:
$$Q = Y' W_Q$$
$$K = Y' W_K$$
$$V = Y' W_V$$
where $Q$, $K$, $V$ denote the query, key and value matrices, and $W_Q$, $W_K$, $W_V$ denote the linear projection parameters;
the similarity between samples $i$ and $j$ is then expressed as:
$$s_{ij} = \frac{q_i k_j^{\top}}{\sqrt{d_k}}$$
where $q_i$ is the $i$-th row of $Q$, $k_j$ is the $j$-th row of $K$, and $\sqrt{d_k}$ is the scaling factor;
the fused features are $Z = (z_1, \dots, z_N)$, where the $i$-th feature vector $z_i$ is calculated as follows:
let $\alpha_i$ be the similarity weight vector of sample $i$ with respect to all samples; $\alpha_i$ is expressed as:
$$\alpha_i = \mathrm{softmax}(s_{i1}, \dots, s_{iN}), \qquad \alpha_{ij} = \frac{\exp(s_{ij})}{\sum_{l=1}^{N} \exp(s_{il})}$$
let the fused feature vector of sample $i$ be $z_i$; each row vector $v_j$ of $V$ is multiplied by its weight $\alpha_{ij}$ and the results are summed:
$$z_i = \sum_{j=1}^{N} \alpha_{ij} v_j$$
the integrated features of all samples are expressed as:
$$Z = \mathrm{softmax}\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V$$
a batch normalization layer is added after the self-attention model to keep the data distribution unchanged; assuming that $Z$ follows a Gaussian distribution $Z \sim N(u, \sigma^2)$, a fully connected layer is used to learn the mean $u$ and variance $\sigma^2$ of $Z$ directly, which gives the integrated feature distribution $S(z)$.
5. The cancer subtype identification method based on self-attention deep learning according to claim 1, characterized in that minimizing the error between the original features and the integrated features through the decoder comprises the following steps:
let the input of the network be:
$$X = \{x_1, x_2, x_3, x_4\}$$
and the output of the decoder be:
$$X' = \{x_1', x_2', x_3', x_4'\}$$
the loss function $L_1$ between $X$ and $X'$, based on the Euclidean distance, is expressed as:
$$L_1 = \sum_{k=1}^{4} \lVert x_k - x_k' \rVert_2^2$$
6. The cancer subtype identification method based on self-attention deep learning according to claim 5, characterized in that the adversarial learning of the integrated feature distribution through the discriminator comprises the following steps:
randomly generating samples from a normal distribution $P(z)$ with mean $u$ and variance $\sigma^2$;
inputting the generated normal distribution $P(z)$ and the learned integrated feature distribution $S(z)$ into the discriminator for adversarial learning, the loss function of the discriminator being defined by binary cross-entropy:
$$L_2 = -\mathbb{E}_{z' \sim P(z)}\left[\log D(z')\right] - \mathbb{E}_{z \sim S(z)}\left[\log\left(1 - D(S(z))\right)\right]$$
the final network training loss function combines $L_1$ and $L_2$:
$$L = \lambda_1 L_1 + \lambda_2 L_2$$
where $\lambda_1, \lambda_2 \in [0, 1]$ are the weight parameters of the loss functions.
7. The cancer subtype identification method based on self-attention deep learning according to claim 4, characterized in that clustering the integrated feature representation obtained after training comprises the following steps:
given the integrated features $Z = \{z_1, \dots, z_N\}$, let $K$ be the number of clusters and $p(z_n)$ the probability density function of the mixture of Gaussian distributions; the clustering process based on the Gaussian mixture model is expressed as:
$$p(z_n) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(z_n \mid \mu_k, \Sigma_k)$$
where $\pi = (\pi_1, \pi_2, \dots, \pi_K)$, $\mu = (\mu_1, \mu_2, \dots, \mu_K)$ and $\Sigma = (\Sigma_1, \Sigma_2, \dots, \Sigma_K)$ denote the weights, means and covariances of the clustering model, respectively;
the Gaussian mixture model updates the parameters $\theta = (\pi, \mu, \Sigma)$ using the EM algorithm;
after training is finished, the most appropriate subtype label of each sample is obtained according to its maximum probability density across the clusters.
8. A cancer subtype identification system based on self-attention deep learning, comprising:
a deep learning Dense network for learning the low-dimensional features of each omics;
a feature concatenation module for concatenating the low-dimensional features of the different omics to obtain concatenated features;
a self-attention module for constructing a similarity matrix between samples;
a feature fusion module for fusing the concatenated features according to the matrix weights of the similarity matrix to obtain the integrated feature representation;
a decoder for minimizing the error between the original features and the integrated features;
a discriminator for adversarial learning of the integrated feature distribution to obtain the optimal integrated feature distribution;
and a clustering module for clustering the optimal integrated feature distribution to obtain the subtypes of the cancer samples.
CN202111677858.8A 2021-12-31 2021-12-31 Cancer subtype identification method and system based on self-attention deep learning Withdrawn CN114334014A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111677858.8A CN114334014A (en) 2021-12-31 2021-12-31 Cancer subtype identification method and system based on self-attention deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111677858.8A CN114334014A (en) 2021-12-31 2021-12-31 Cancer subtype identification method and system based on self-attention deep learning

Publications (1)

Publication Number Publication Date
CN114334014A true CN114334014A (en) 2022-04-12

Family

ID=81022947

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111677858.8A Withdrawn CN114334014A (en) 2021-12-31 2021-12-31 Cancer subtype identification method and system based on self-attention deep learning

Country Status (1)

Country Link
CN (1) CN114334014A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115641955A (en) * 2022-10-19 2023-01-24 哈尔滨工业大学 Gastric cancer stage discrimination system based on deep learning and storage medium
CN115985513A (en) * 2023-01-05 2023-04-18 徐州医科大学科技园发展有限公司 Data processing method, device and equipment based on multigroup cancer typing
CN115985513B (en) * 2023-01-05 2023-11-03 徐州医科大学科技园发展有限公司 Data processing method, device and equipment based on multiple groups of chemical cancer typing
CN117393175A (en) * 2023-10-16 2024-01-12 中国矿业大学 Cancer subtype identification method based on multiple sets of chemical data
CN117591953A (en) * 2024-01-19 2024-02-23 数据空间研究院 Cancer classification method and system based on multiple groups of study data and electronic equipment

Similar Documents

Publication Publication Date Title
CN114334014A (en) Cancer subtype identification method and system based on self-attention deep learning
CN111414461B (en) Intelligent question-answering method and system fusing knowledge base and user modeling
CN110110318B (en) Text steganography detection method and system based on cyclic neural network
CN111681718B (en) Medicine relocation method based on deep learning multi-source heterogeneous network
CN111985310A (en) Training method of deep convolutional neural network for face recognition
CN111222318B (en) Trigger word recognition method based on double-channel bidirectional LSTM-CRF network
CN111832287B (en) Entity relationship joint extraction method and device
CN112765370B (en) Entity alignment method and device of knowledge graph, computer equipment and storage medium
CN111564183A (en) Single cell sequencing data dimension reduction method fusing gene ontology and neural network
CN115187610A (en) Neuron morphological analysis method and device based on graph neural network and storage medium
CN111026877A (en) Knowledge verification model construction and analysis method based on probability soft logic
Wang et al. Recognizing handwritten mathematical expressions as LaTex sequences using a multiscale robust neural network
CN116152554A (en) Knowledge-guided small sample image recognition system
CN117457081A (en) Space transcriptome data processing method and system based on hypergraph
CN111898337B (en) Automatic generation method of single sentence abstract defect report title based on deep learning
CN113360643A (en) Electronic medical record data quality evaluation method based on short text classification
Tan et al. Ranking analysis of microarray data: a powerful method for identifying differentially expressed genes
CN116226698A (en) Cell type identification method, system and equipment based on multi-group chemical data integration
CN115481674A (en) Single cell type intelligent identification method based on deep learning
CN112735604B (en) Novel coronavirus classification method based on deep learning algorithm
CN114913921A (en) System and method for identifying marker gene
CN114595336A (en) Multi-relation semantic solution model based on Gaussian mixture model
Wang et al. scBKAP: a clustering model for single-cell RNA-Seq data based on bisecting K-means
CN114416966B (en) Reasonable use and analysis method for medical consumables based on Simhash-BERT network
CN113449810B (en) Image clustering method based on self-supervision and semantic style decoupling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20220412