CN116030025A - Hepatocellular carcinoma prediction method based on modal sensing distillation network - Google Patents
- Publication number
- CN116030025A (application CN202310058590.2A)
- Authority
- CN
- China
- Prior art keywords
- network
- distillation
- data
- mri
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention discloses a hepatocellular carcinoma prediction method based on a modal sensing distillation network, comprising the following steps: S1, acquiring a dataset of hepatocellular carcinoma patients and dividing it into five folds according to a five-fold cross-validation scheme, where in each round of cross-validation one fold serves as the test set and the remaining four folds serve as the training set; S2, preprocessing the data by finding the largest external cube over the tumors of all patients and removing the non-tumor regions outside the cube; S3, establishing and training a modal sensing distillation network, which transfers the knowledge that a teacher network learns jointly from the clinical data modality and the image modality to a student network that has only the image modality; S4, predicting hepatocellular carcinoma with the trained modal sensing distillation network.
Description
Technical Field
The invention relates to the technical field of biology, in particular to a hepatocellular carcinoma prediction method based on a modal sensing distillation network.
Background
Hepatocellular carcinoma (HCC) is a malignant tumor arising from liver cells and is the most common pathological type of primary liver cancer. Existing approaches for predicting microvascular invasion (MVI) of hepatocellular carcinoma include: 1. predicting pre-operative MVI with extreme gradient boosting and deep learning on CT images; 2. fusing features from multiple MR sequences with a 3D CNN prediction model; 3. embedding a long short-term memory (LSTM) network into a CNN to fuse multi-modal MR volumes for predicting MVI in HCC patients. These three methods rely only on MR images to predict MVI status and achieve low accuracy. In addition, knowledge distillation (KD) has been used for prediction: 1. KD was used to effectively segment neuronal structures in 3D optical microscopy images; 2. following the idea of KD, soft labels obtained by expanding mask boundaries were used to segment brain injuries; 3. KD was used for multi-source transfer learning in lung-pattern analysis tasks; 4. a class-guided contrastive distillation module was formulated to pull together positive image pairs from the same class in the teacher and student models while pushing apart negative pairs from different classes. The distillation networks in these four methods consider only image data and transfer information only from the input images, leading to poor classification precision and low prediction accuracy.
Disclosure of Invention
The invention aims to provide a hepatocellular carcinoma prediction method based on a modal sensing distillation network which, by migrating the knowledge of a teacher network having both image modalities and non-image clinical data to a student network having only image modalities, and by providing a modal sensing distillation network (MD-Net) for HCC MVI prediction, can effectively improve classification precision and prediction accuracy.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
a hepatocellular carcinoma prediction method based on a modal sensing distillation network comprises the following steps:
s1, acquiring a data set of a hepatocellular carcinoma patient, dividing the whole data set into five folds according to a five-fold cross-validation scheme, wherein in each round of cross-validation, one fold data is used as a test set, and the other four fold data are used as training sets;
s2, preprocessing the data, finding the largest external cube for the tumors of all patients, and removing other non-tumor areas except the cube;
s3, establishing a modal sensing distillation network, and training the modal sensing distillation network, wherein the modal sensing distillation network is used for transferring the knowledge learned by combining a teacher network through clinical data modalities and image modalities to a student network only provided with the image modalities;
s4, predicting the hepatocellular carcinoma through the trained modal sensing distillation network.
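The five-fold protocol of step S1 can be sketched as follows (pure Python; the patient identifiers and shuffling seed are illustrative assumptions, the patent only specifies the one-fold-test / four-fold-train rotation):

```python
import random

def five_fold_splits(patient_ids, seed=0):
    """Shuffle the patients once, then yield (train, test) id lists for each
    of the five cross-validation rounds: one fold tests, four folds train."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    folds = [ids[i::5] for i in range(5)]  # round-robin assignment into 5 folds
    for k in range(5):
        test = folds[k]
        train = [pid for j, fold in enumerate(folds) if j != k for pid in fold]
        yield train, test

# 270 patients, as in the dataset described below
splits = list(five_fold_splits(range(270)))
```

Each patient appears in exactly one test fold across the five rounds, so every sample is evaluated once.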
Preferably, the dataset in step S1 consists of data from 270 pathologically confirmed HCC patients, comprising 128 M0 patients, 93 M1 patients and 49 M2 patients; where M0 indicates no microvascular invasion, M1 indicates no more than 5 invaded blood vessels, all within 1 cm of the tumor surface, and M2 indicates invasion of more than 5 blood vessels or invasion more than 1 cm from the tumor surface.
Preferably, in step S2, the size of the cube is set to 80×80×20 pixels.
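Step S2 can be sketched as follows, under the assumption that each tumor is available as a list of segmented voxel coordinates (the patent does not specify the segmentation format, and the helper names are hypothetical):

```python
def bounding_box(voxels):
    """Axis-aligned bounding box of a set of (x, y, z) tumor voxel coordinates."""
    xs, ys, zs = zip(*voxels)
    return (min(xs), max(xs)), (min(ys), max(ys)), (min(zs), max(zs))

def max_enclosing_size(tumors):
    """Largest per-axis extent over all patients' tumors; step S2 fixes this
    to an 80 x 80 x 20 crop that is applied to every volume."""
    w = h = d = 0
    for voxels in tumors:
        (x0, x1), (y0, y1), (z0, z1) = bounding_box(voxels)
        w, h, d = max(w, x1 - x0 + 1), max(h, y1 - y0 + 1), max(d, z1 - z0 + 1)
    return w, h, d

size = max_enclosing_size([
    [(0, 0, 0), (9, 4, 1)],   # patient A: 10 x 5 x 2 extent
    [(2, 2, 2), (4, 9, 4)],   # patient B: 3 x 8 x 3 extent
])
```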
Preferably, the training process of the modal sensing distillation network in step S3 is specifically:
S31, the teacher network feeds the HBP image and the clinical data into one MRI-clinical fusion module to extract a 512-dimensional feature vector, and feeds the PRE image and the clinical data into another MRI-clinical fusion module to obtain a second 512-dimensional feature vector; the two 512-dimensional feature vectors are input into an SA module to obtain new features F_hbp^t and F_pre^t, each fused with the other's information; finally, the new features F_hbp^t and F_pre^t are spliced to generate Z_t, and Z_t is passed into two fully connected layers to predict the classification result P_t;
S32, the student network takes the 3D HBP MRI image and the 3D PRE MRI image as input, passing the HBP data through one MRI-only module and the PRE data through another MRI-only module to obtain their features; the two features are input into an SA module to obtain new features F_hbp^s and F_pre^s, each fused with the other's information, where F_hbp^s and F_pre^s are two 512-dimensional feature vectors; finally, the new features F_hbp^s and F_pre^s are concatenated to generate Z_s, and Z_s is input into the fully connected layer to predict the MVI classification result P_s;
S33, a regression task is introduced into the student network: the concatenated feature Z_s of the input HBP image and the input PRE image is delivered to two fully connected layers to predict a 52-dimensional vector P_c that estimates the latent clinical information, and the input clinical data are reused as the ground-truth label for the prediction P_c;
S34, the features fused from the clinical data and the MRI images in the teacher network are distilled onto the features extracted from the MRI images alone by applying the classification-level distillation loss and the feature-level distillation loss, so that the knowledge distillation strategy transfers the clinical information of the teacher network into the student network.
Preferably, in step S31 the MRI-clinical fusion module integrates MRI data and non-imaging clinical data. It takes 3D MRI data and vectorized clinical data as input and applies four fully connected layers to the input clinical data to obtain four feature maps with 64, 128, 256 and 256 channels, respectively; another set of 3D feature maps is obtained by applying four convolution blocks to the input MRI image, each block consisting of two 3 x 3 convolution layers, with the feature channels likewise set to 64, 128, 256 and 256; the four feature maps from the clinical data and the corresponding four feature maps from the MRI data are integrated by channel-wise multiplication, after which a 3 x 3 convolution layer and a fully connected layer are applied to output a feature vector of 512 dimensions.
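The channel-wise multiplication that fuses the clinical and MRI feature maps can be illustrated with a minimal pure-Python sketch (toy shapes; the real module gates 3D feature maps with 64-256 channels):

```python
def channel_wise_multiply(mri_features, clinical_features):
    """Gate each MRI feature channel by the corresponding entry of the
    clinical feature vector: every spatial value in channel c is scaled
    by clinical_features[c]."""
    assert len(mri_features) == len(clinical_features)
    return [[value * clinical_features[c] for value in channel]
            for c, channel in enumerate(mri_features)]

# Toy example: 2 channels with 3 spatial positions each
fused = channel_wise_multiply([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], [0.5, 2.0])
```

This multiplicative gating lets the clinical vector amplify or suppress whole MRI channels, which is how the two modalities are integrated before the final convolution and fully connected layers.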
Preferably, in step S32 the MRI-only module extracts a 512-dimensional feature vector from the 3D MRI image; the MRI-only module consists of nine convolution blocks and a fully connected layer, each convolution block comprising a batch normalization layer, a ReLU activation layer and a 3 x 3 convolution layer, which improves the robustness of the network; the nine convolution blocks output different numbers of channels: the feature channels of the first five layers are 32, 64 and 128, and those of the last four layers are 256, 128, 256 and 256, balancing efficiency against computational load.
Preferably, the SA module in steps S31 and S32 is a symmetric attention module. Let X and Y denote the two input feature maps of the SA module. The SA module applies a linear transformation layer to X to obtain three feature maps, a Query vector Q_x, a Key vector K_x and a Value vector V_x; it applies an adaptive transformation layer to Y to generate a Key feature map K_y and a Value feature map V_y. A score feature vector S_x is generated by multiplying Q_x with the transpose of K_x, and another score feature vector S_y by multiplying Q_x with the transpose of K_y. The score feature vector S_x is then multiplied with the Value feature vector V_x, S_y is multiplied with V_y, and the two resulting feature vectors are added to produce the output refined feature vector:

F_x = S_x · V_x + S_y · V_y, where S_x = Q_x · K_x^T and S_y = Q_x · K_y^T.

The SA module applies another linear transformation layer to Y to obtain a Query vector Q_y; two score feature vectors are computed by multiplying Q_y with the transpose of K_x and Q_y with the transpose of K_y, and the refined feature vector is then computed by the following formula:

F_y = (Q_y · K_x^T) · V_x + (Q_y · K_y^T) · V_y.
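A toy sketch of the symmetric attention computation described above, with the learned linear transformation layers omitted and a softmax normalization of the score vectors assumed (the patent text does not reproduce the exact score formula):

```python
import math

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def sym_attention(q, k_self, v_self, k_other, v_other):
    """Refined feature: attend with q to the branch's own keys/values and to
    the other branch's keys/values, then add the two attention results."""
    s_self = [softmax(row) for row in matmul(q, transpose(k_self))]
    s_other = [softmax(row) for row in matmul(q, transpose(k_other))]
    out_self = matmul(s_self, v_self)
    out_other = matmul(s_other, v_other)
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(out_self, out_other)]

# One 2-dimensional query attending over two keys/values per branch
refined = sym_attention(
    q=[[1.0, 0.0]],
    k_self=[[1.0, 0.0], [0.0, 1.0]], v_self=[[1.0, 0.0], [0.0, 1.0]],
    k_other=[[1.0, 0.0], [0.0, 1.0]], v_other=[[1.0, 0.0], [0.0, 1.0]],
)
```

Because each branch contributes one attention term, the refined feature carries information from both MRI sequences, which is the stated purpose of the symmetric design.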
Preferably, the modal sensing distillation in step S3 comprises classification-level distillation and feature-level distillation;
in classification-level distillation, let p_i^s denote the class probabilities that the student network produces for MRI image data x_i, and let p_i^t denote the class probabilities that the teacher network produces for x_i; a classification-level distillation loss L_cd is defined so that the class probabilities from the teacher network become the target for training the student network, and the difference between the two distributions is measured with the Kullback-Leibler divergence:

L_cd = (1/N) Σ_{i=1..N} Σ_{j=1..M} D_KL( p_ij^t || p_ij^s )

where N and M denote the number of training samples and the number of classes, respectively, D_KL(·) denotes the Kullback-Leibler divergence between the two probabilities, p_ij^s denotes the student-network prediction for sample i on class j, and p_ij^t the corresponding teacher-network prediction;
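The classification-level distillation loss can be sketched as the mean Kullback-Leibler divergence between teacher and student class-probability vectors (toy probabilities; any temperature scaling is unspecified in the text and omitted here):

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) for two discrete probability distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def classification_distillation_loss(teacher_probs, student_probs):
    """Mean over N samples of D_KL(teacher || student) across the classes."""
    n = len(teacher_probs)
    return sum(kl_divergence(t, s) for t, s in zip(teacher_probs, student_probs)) / n

# Identical distributions give zero loss; diverging ones give a positive loss
loss_same = classification_distillation_loss([[0.7, 0.2, 0.1]], [[0.7, 0.2, 0.1]])
loss_diff = classification_distillation_loss([[0.9, 0.1]], [[0.5, 0.5]])
```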
in feature-level distillation, the feature-level distillation loss L_fd is calculated as the combination of the Kullback-Leibler divergence between F_hbp^t and F_hbp^s and the Kullback-Leibler divergence between F_pre^t and F_pre^s:

L_fd = β_1 D_KL( F_hbp^t || F_hbp^s ) + β_2 D_KL( F_pre^t || F_pre^s )

where β_1 and β_2 weight the Kullback-Leibler divergence terms (β_1 = 1), the first term measures the divergence between the teacher and student HBP features, the second the divergence between the teacher and student PRE features, and the superscripts s and t denote the student network and the teacher network, respectively;
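A corresponding sketch of the feature-level term; normalizing the feature vectors with a softmax before taking the Kullback-Leibler divergence, and setting β_2 = 1, are assumptions here, since the text only fixes β_1 = 1:

```python
import math

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def kl(p, q, eps=1e-12):
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def feature_level_loss(f_hbp_t, f_hbp_s, f_pre_t, f_pre_s, beta1=1.0, beta2=1.0):
    """One KL term per MRI sequence between softmax-normalized teacher and
    student feature vectors (512-dimensional in the described network)."""
    term_hbp = kl(softmax(f_hbp_t), softmax(f_hbp_s))
    term_pre = kl(softmax(f_pre_t), softmax(f_pre_s))
    return beta1 * term_hbp + beta2 * term_pre

zero = feature_level_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [0.5, 0.5], [0.5, 0.5])
gap = feature_level_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [0.5, 0.5], [0.5, 0.5])
```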
the final loss function comprises the two supervised losses of the teacher network and the student network, the self-supervised loss of the clinical-data prediction, and the distillation losses between the student and teacher networks; the loss function is defined as follows:

L_total = L_sup^t + L_sup^s + L_clinical + L_cd + L_fd

where L_sup^t and L_sup^s denote the supervised losses of the teacher-network and student-network predictions, respectively, both computed with the focal loss; L_clinical denotes the self-supervised loss of the clinical-data prediction, computed as the cross-entropy between the prediction P_c and the ground-truth clinical data; L_cd denotes the classification-level distillation loss and L_fd the feature-level distillation loss between the teacher and student networks; the modal sensing distillation network for MVI prediction is trained with L_total.
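The supervised terms use the focal loss; below is a minimal sketch of one focal-loss term and of the summation of the five loss components (the focusing parameter γ = 2 is an illustrative assumption, the text does not state it):

```python
import math

def focal_loss(probs, target, gamma=2.0, eps=1e-12):
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t), where p_t is
    the predicted probability of the true class; easy samples (p_t near 1)
    are down-weighted relative to plain cross-entropy."""
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t + eps)

def total_loss(sup_t, sup_s, clinical, cd, fd):
    """L_total = L_sup^t + L_sup^s + L_clinical + L_cd + L_fd."""
    return sup_t + sup_s + clinical + cd + fd

loss_good = focal_loss([0.1, 0.9], target=1)   # confident, correct prediction
loss_bad = focal_loss([0.6, 0.4], target=1)    # uncertain, wrong-leaning prediction
total = total_loss(0.2, 0.3, 0.1, 0.05, 0.05)
```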
After the technical scheme is adopted, the invention has the following beneficial effects:
1. By migrating the knowledge of a teacher network having both image modalities and non-image clinical data to a student network having only image modalities, and by providing a modal sensing distillation network (MD-Net) for HCC MVI prediction, the invention effectively improves classification precision and prediction accuracy.
2. The student network of the modal sensing distillation network (MD-Net) of the invention includes two MRI-only modules for extracting MRI features alone and one symmetric attention (SA) module for refining the features from the two MRI images, while the teacher network includes two MRI-clinical fusion modules for fusing the MRI data with the 52-dimensional clinical data vector and one SA module for refining the two fused features.
3. In addition to the original classification-level distillation, the invention designs a feature-level distillation for the modal sensing distillation network (MD-Net) to better transfer clinical knowledge from the teacher network to the student network. In addition, a new self-supervised task is devised to predict the clinical data from the image data, further enhancing MVI prediction.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is an exemplary view of a tumor in an HCC MRI image of the present invention;
FIG. 3 is a block diagram of a modal aware distillation network according to the present invention;
FIG. 4 is an exemplary diagram of an MRI-clinical fusion module, an MRI-only module, and a Channel-wise multiplication of the present invention, wherein (a) the MRI-clinical fusion module: fusing MRI data with non-imaging clinical data; (b) MRI-only module: using only MRI image data; (c) one example of Channel-wise multiplication;
fig. 5 is a frame flow chart of the SA module of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the following examples in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Examples
As shown in fig. 1 to 5, a hepatocellular carcinoma prediction method based on a modal sensing distillation network includes the following steps:
s1, acquiring a data set of a hepatocellular carcinoma patient, dividing the whole data set into five folds according to a five-fold cross-validation scheme, wherein in each round of cross-validation, one fold data is used as a test set, and the other four fold data are used as training sets;
the dataset in step S1 consists of data from 270 pathologically confirmed HCC patients, comprising 128 M0 patients, 93 M1 patients and 49 M2 patients; where M0 indicates no microvascular invasion, M1 indicates no more than 5 invaded blood vessels, all within 1 cm of the tumor surface, and M2 indicates invasion of more than 5 blood vessels or invasion more than 1 cm from the tumor surface;
s2, preprocessing the data, finding the largest external cube for the tumors of all patients, and removing other non-tumor areas except the cube;
setting the size of the cube to 80×80×20 pixels in step S2;
s3, establishing a modal sensing distillation network, and training the modal sensing distillation network, wherein the modal sensing distillation network is used for transferring the knowledge learned by combining a teacher network through clinical data modalities and image modalities to a student network only provided with the image modalities;
the training process of the modal sensing distillation network in the step S3 specifically comprises the following steps:
S31, the teacher network feeds the HBP image and the clinical data into one MRI-clinical fusion module to extract a 512-dimensional feature vector, and feeds the PRE image and the clinical data into another MRI-clinical fusion module to obtain a second 512-dimensional feature vector; the two 512-dimensional feature vectors are input into an SA module to obtain new features F_hbp^t and F_pre^t, each fused with the other's information; finally, the new features F_hbp^t and F_pre^t are spliced to generate Z_t, and Z_t is passed into two fully connected layers to predict the classification result P_t;
S32, the student network takes the 3D HBP MRI image and the 3D PRE MRI image as input, passing the HBP data through one MRI-only module and the PRE data through another MRI-only module to obtain their features; the two features are input into an SA module to obtain new features F_hbp^s and F_pre^s, each fused with the other's information, where F_hbp^s and F_pre^s are two 512-dimensional feature vectors; finally, the new features F_hbp^s and F_pre^s are concatenated to generate Z_s, and Z_s is input into the fully connected layer to predict the MVI classification result P_s;
s34, distilling the characteristics fused in the teacher network clinical data and the MRI images to the characteristics extracted from the MRI images by adopting the classification level distillation loss and the characteristic level distillation loss, and converting the clinical information of the teacher network into the student network by utilizing a knowledge distillation strategy;
in step S31, the MRI-clinical fusion module integrates MRI data and non-image clinical data, takes 3DMRI data and vectorized clinical data as input, and applies four full connection layers to the input clinical data to obtain four feature graphs, wherein the feature channels are 64, 128, 256 and 256 respectively; another 3D feature map is obtained with four convolution blocks on the input MRI image, each consisting of two 3 x 3 convolution layers, and the feature channels are also set to 64, 128, 256, and 256; multiplying the four feature maps in the clinical data and the corresponding four features in the MRI data by channel-wise to integrate the four feature maps and the corresponding four features together, and then applying a 3X 3 convolution layer and a full connection layer to output feature vectors with 512 dimensions;
in step S32, the MRI-only module extracts 512-dimensional feature vectors from the 3DMRI image; the MRI-only module consists of nine convolution blocks and a full connection layer, wherein each convolution block comprises a batch processing standard layer, a ReLU activation layer and a 3X 3 convolution layer and is used for improving the robustness of a network; setting the channel numbers of the output characteristics of the nine convolution blocks to be different, wherein the characteristic channels of the first five layers are 32, 64 and 128, and the characteristic channels of the last four layers are 256, 128, 256 and 256, so as to balance efficiency and calculation burden;
the SA modules in steps S31 and S32 are symmetric attention modules. Let X and Y denote the two input feature maps of the SA module. The SA module applies a linear transformation layer to X to obtain three feature maps, a Query vector Q_x, a Key vector K_x and a Value vector V_x; it applies an adaptive transformation layer to Y to generate a Key feature map K_y and a Value feature map V_y. A score feature vector S_x is generated by multiplying Q_x with the transpose of K_x, and another score feature vector S_y by multiplying Q_x with the transpose of K_y; the score feature vector S_x is then multiplied with the Value feature vector V_x, S_y is multiplied with V_y, and the two resulting feature vectors are added to produce the output refined feature vector:

F_x = S_x · V_x + S_y · V_y, where S_x = Q_x · K_x^T and S_y = Q_x · K_y^T.

The SA module applies another linear transformation layer to Y to obtain a Query vector Q_y; two score feature vectors are computed by multiplying Q_y with the transpose of K_x and Q_y with the transpose of K_y, and the refined feature vector is then computed by the following formula:

F_y = (Q_y · K_x^T) · V_x + (Q_y · K_y^T) · V_y;
The mode sensing distillation in the step S3 comprises classified distillation and superfine distillation;
in classification-level distillation, let p_i^s denote the class probabilities that the student network produces for MRI image data x_i, and let p_i^t denote the class probabilities that the teacher network produces for x_i; a classification-level distillation loss L_cd is defined so that the class probabilities from the teacher network become the target for training the student network, and the difference between the two distributions is measured with the Kullback-Leibler divergence:

L_cd = (1/N) Σ_{i=1..N} Σ_{j=1..M} D_KL( p_ij^t || p_ij^s )

where N and M denote the number of training samples and the number of classes, respectively, D_KL(·) denotes the Kullback-Leibler divergence between the two probabilities, p_ij^s denotes the student-network prediction for sample i on class j, and p_ij^t the corresponding teacher-network prediction;
in feature-level distillation, the feature-level distillation loss L_fd is calculated as the combination of the Kullback-Leibler divergence between F_hbp^t and F_hbp^s and the Kullback-Leibler divergence between F_pre^t and F_pre^s:

L_fd = β_1 D_KL( F_hbp^t || F_hbp^s ) + β_2 D_KL( F_pre^t || F_pre^s )

where β_1 and β_2 weight the Kullback-Leibler divergence terms (β_1 = 1), the first term measures the divergence between the teacher and student HBP features, the second the divergence between the teacher and student PRE features, and the superscripts s and t denote the student network and the teacher network, respectively;
the final loss function comprises the two supervised losses of the teacher network and the student network, the self-supervised loss of the clinical-data prediction, and the distillation losses between the student and teacher networks; the loss function is defined as follows:

L_total = L_sup^t + L_sup^s + L_clinical + L_cd + L_fd

where L_sup^t and L_sup^s denote the supervised losses of the teacher-network and student-network predictions, respectively, both computed with the focal loss; L_clinical denotes the self-supervised loss of the clinical-data prediction, computed as the cross-entropy between the prediction P_c and the ground-truth clinical data; L_cd denotes the classification-level distillation loss and L_fd the feature-level distillation loss between the teacher and student networks; the modal sensing distillation network for MVI prediction is trained with L_total;
S4, predicting the hepatocellular carcinoma through the trained modal sensing distillation network.
The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention. Therefore, the protection scope of the present invention should be subject to the protection scope of the claims.
Claims (8)
1. The hepatocellular carcinoma prediction method based on the modal sensing distillation network is characterized by comprising the following steps of:
s1, acquiring a dataset of hepatocellular carcinoma patients and dividing the whole dataset into five folds according to a five-fold cross-validation scheme, taking one fold of data as the test set and the other four folds as the training set in each round of cross-validation, and computing the mean and variance of the evaluation indices over the five rounds;
s2, preprocessing the data, finding the largest external cube for the tumors of all patients, and removing other non-tumor areas except the cube;
s3, establishing a modal sensing distillation network, and training the modal sensing distillation network, wherein the modal sensing distillation network is used for transferring the knowledge learned by combining a teacher network through clinical data modalities and image modalities to a student network only provided with the image modalities;
s4, predicting the hepatocellular carcinoma through the trained modal sensing distillation network.
2. The method for predicting hepatocellular carcinoma based on a modal sensing distillation network as recited in claim 1, wherein the dataset in step S1 consists of data from 270 pathologically confirmed HCC patients, comprising 128 M0 patients, 93 M1 patients and 49 M2 patients; where M0 indicates no microvascular invasion, M1 indicates no more than 5 invaded blood vessels, all within 1 cm of the tumor surface, and M2 indicates invasion of more than 5 blood vessels or invasion more than 1 cm from the tumor surface.
3. The method for predicting hepatocellular carcinoma based on a modal sensing distillation network as recited in claim 1, wherein in step S2 the size of the cube is set to 80 x 80 x 20 pixels.
4. The hepatocellular carcinoma prediction method based on a modal sensing distillation network as set forth in claim 1, wherein the training process of the modal sensing distillation network in step S3 is specifically as follows:
S31, the teacher network feeds the HBP image and the clinical data into one MRI-clinical fusion module to extract a 512-dimensional feature vector, and feeds the PRE image and the clinical data into another MRI-clinical fusion module to obtain a second 512-dimensional feature vector; the two 512-dimensional feature vectors are input into an SA module to obtain new features F_hbp^t and F_pre^t, each fused with the other's information; finally, the new features F_hbp^t and F_pre^t are spliced to generate Z_t, and Z_t is passed into two fully connected layers to predict the classification result P_t;
S32, the student network takes the 3D HBP MRI image and the 3D PRE MRI image as input, passing the HBP data through one MRI-only module and the PRE data through another MRI-only module to obtain their features; the two features are input into an SA module to obtain new features F_hbp^s and F_pre^s, each fused with the other's information, where F_hbp^s and F_pre^s are two 512-dimensional feature vectors; finally, the new features F_hbp^s and F_pre^s are concatenated to generate Z_s, and Z_s is input into the fully connected layer to predict the MVI classification result P_s;
S33, a regression task is introduced into the student network: the concatenated feature Z_s of the input HBP image and the input PRE image is delivered to two fully connected layers to predict a 52-dimensional vector P_c that estimates the latent clinical information, and the input clinical data are reused as the ground-truth label for the prediction P_c;
S34, the features fused from the clinical data and the MRI images in the teacher network are distilled onto the features extracted from the MRI images alone by applying the classification-level distillation loss and the feature-level distillation loss, so that the knowledge distillation strategy transfers the clinical information of the teacher network into the student network.
5. The method for predicting hepatocellular carcinoma based on modal-aware distillation network as set forth in claim 4, wherein in step S31, the MRI-clinical fusion module integrates MRI data and non-imaging clinical data, takes 3D MRI data and vectorized clinical data as input, and applies four fully connected layers to the input clinical data to obtain four feature maps, the feature channels being 64, 128, 256 and 256, respectively; another 3D feature map is obtained with four convolution blocks on the input MRI image, each consisting of two 3 x 3 convolution layers, and the feature channels are also set to 64, 128, 256, and 256; the four feature maps in the clinical data and the corresponding four features in the MRI data are channel-wise multiplied to integrate them together, and then a 3 x 3 convolution layer and a full connection layer are applied to output feature vectors having 512 dimensions.
6. The method for predicting hepatocellular carcinoma based on a modal-aware distillation network as set forth in claim 4, wherein in step S32 the MRI-only module extracts a 512-dimensional feature vector from the 3D MRI image; the MRI-only module consists of nine convolution blocks and a fully connected layer, where each convolution block comprises a batch normalization layer, a ReLU activation layer and a 3×3 convolution layer, improving the robustness of the network; the numbers of output feature channels of the nine convolution blocks are set differently, with the feature channels of the first five layers being 32, 64 and 128 and those of the last four layers being 256, 128, 256 and 256, to balance efficiency against computational load.
7. The method for predicting hepatocellular carcinoma based on a modal-aware distillation network as set forth in claim 4, wherein the SA module in steps S31 and S32 is a symmetric attention module; X and Y denote the two input feature maps of the SA module; the SA module applies a linear transformation layer on X to obtain three feature maps, comprising a Query vector Q_x, a Key vector K_x and a Value vector V_x; the SA module applies an adaptive transformation layer on Y to generate a Key feature map K_y and a Value feature map V_y; a score feature vector S_x is generated by multiplying Q_x with the transpose of K_x, and another score feature vector S_y is generated by multiplying Q_x with the transpose of K_y; the obtained score feature vector S_x is multiplied with the Value feature vector V_x, S_y is multiplied with V_y, and the two resulting feature vectors are added to finally generate the output refined feature vector;
the SA module also applies another linear transformation layer on Y to obtain a Query vector Q_y; two further score feature vectors are calculated by multiplying Q_y with the transpose of K_y and Q_y with the transpose of K_x, and the second refined feature vector is then calculated in the same way, by weighting V_y and V_x with the two scores and summing the results.
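One branch of the symmetric attention above can be sketched in pure Python. The softmax normalization of the score vectors, the matrix shapes, and all function names are illustrative assumptions; the claim itself fixes only the Q·Kᵀ products and the weighted sums with the Value maps.

```python
import math

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(M):
    return [list(col) for col in zip(*M)]

def softmax_rows(M):
    """Row-wise softmax (an assumed normalization of the score vectors)."""
    out = []
    for row in M:
        m = max(row)
        e = [math.exp(v - m) for v in row]
        s = sum(e)
        out.append([v / s for v in e])
    return out

def sa_branch(Q, K_self, V_self, K_other, V_other):
    """One SA branch: score the queries against this branch's own keys
    and the other branch's keys, weight both Value maps, and add."""
    S_self = softmax_rows(matmul(Q, transpose(K_self)))    # e.g. S_x from Q_x and K_x
    S_other = softmax_rows(matmul(Q, transpose(K_other)))  # e.g. S_y from Q_x and K_y
    A = matmul(S_self, V_self)
    B = matmul(S_other, V_other)
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# Toy 2-D example: zero queries give uniform attention weights, so the
# refined vector is the row-mean of V_self plus the row-mean of V_other.
Q_x = [[0.0, 0.0]]
K_x, V_x = [[1.0, 0.0], [0.0, 1.0]], [[1.0, 3.0], [3.0, 1.0]]
K_y, V_y = [[2.0, 1.0], [1.0, 2.0]], [[0.0, 2.0], [2.0, 0.0]]
refined_x = sa_branch(Q_x, K_x, V_x, K_y, V_y)
```

The second branch of the module would call `sa_branch(Q_y, K_y, V_y, K_x, V_x)` with the roles of the two inputs swapped, which is what makes the attention symmetric.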
8. The method for predicting hepatocellular carcinoma based on a modal-aware distillation network as recited in claim 4, wherein the modal-aware distillation in step S3 comprises classification-level distillation and feature-level distillation;
in the classification-level distillation, let p_i^s denote the class probability, generated by the student network, that MRI image data x_i belongs to its class, and p_i^t denote the class probability generated by the teacher network for x_i; the classification-level distillation loss L_cls is defined so that the class probabilities from the teacher network become the targets for training the student network, and the difference between the two distributions is measured with the Kullback-Leibler divergence:

L_cls = (1/N) Σ_{i=1}^{N} Σ_{j=1}^{M} p_{i,j}^t · log( p_{i,j}^t / p_{i,j}^s )
where N and M denote the number of training samples and the number of total categories, respectively, D_KL(·) denotes the Kullback-Leibler divergence between two probability distributions, p_i^s denotes the student-network prediction for a sample, and p_i^t the teacher-network prediction;
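The classification-level loss can be sketched in plain Python; averaging over samples and the function names are illustrative assumptions, not wording from the patent.

```python
import math

def kl_div(p, q, eps=1e-12):
    """D_KL(p || q) for two discrete distributions over M classes."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def classification_distill_loss(teacher_probs, student_probs):
    """Average KL(teacher || student) over the N training samples:
    the teacher's class probabilities are the targets for the student."""
    n = len(teacher_probs)
    return sum(kl_div(t, s) for t, s in zip(teacher_probs, student_probs)) / n

# Two samples, two MVI classes (positive / negative).
teacher      = [[0.9, 0.1], [0.2, 0.8]]
student_same = [[0.9, 0.1], [0.2, 0.8]]
student_off  = [[0.6, 0.4], [0.5, 0.5]]

loss_zero = classification_distill_loss(teacher, student_same)  # 0 when matched
loss_pos  = classification_distill_loss(teacher, student_off)   # > 0 otherwise
```

Minimizing this loss pulls the student's MVI class probabilities toward the teacher's, which is the target behavior the claim describes.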
in the feature-level distillation, the feature-level distillation loss L_feat is calculated as a combination of the Kullback-Leibler divergence between the teacher's and the student's HBP features and the Kullback-Leibler divergence between their PRE features:

L_feat = β_1 · D_KL( f_t^HBP ∥ f_s^HBP ) + β_2 · D_KL( f_t^PRE ∥ f_s^PRE )
where β is used to weight the Kullback-Leibler divergence terms, with the weight β_1 = 1; D_KL( f_t^HBP ∥ f_s^HBP ) denotes the Kullback-Leibler divergence between the teacher's fused HBP feature f_t^HBP and the student's refined HBP feature f_s^HBP, and D_KL( f_t^PRE ∥ f_s^PRE ) that between the corresponding PRE features of the teacher and student networks;
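A stdlib-Python sketch of the feature-level loss follows. The patent does not state how 512-dimensional feature vectors are normalized into distributions before the KL divergence; applying a softmax first is an assumption, as are the function names.

```python
import math

def softmax(v):
    """Turn a raw feature vector into a distribution (assumed normalization)."""
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def kl_div(p, q, eps=1e-12):
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def feature_distill_loss(t_hbp, s_hbp, t_pre, s_pre, beta1=1.0, beta2=1.0):
    """Weighted sum of the KL divergence between teacher and student HBP
    features and that between their PRE features (beta1 = 1 per the claim)."""
    return (beta1 * kl_div(softmax(t_hbp), softmax(s_hbp))
            + beta2 * kl_div(softmax(t_pre), softmax(s_pre)))

# Toy 4-dim features standing in for the 512-dim vectors.
t_hbp = [0.2, 1.5, -0.3, 0.9]
t_pre = [1.0, 0.0, 0.5, 0.5]
loss_zero = feature_distill_loss(t_hbp, t_hbp, t_pre, t_pre)
loss_pos  = feature_distill_loss(t_hbp, [0.0] * 4, t_pre, [0.0] * 4)
```

The loss vanishes when the student's features match the teacher's fused clinical-plus-MRI features, which is the distillation objective of step S34.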
the final loss function comprises two supervised losses, on the teacher network and the student network, a self-supervised loss for the clinical-data prediction, and the distillation losses between the student network and the teacher network; the loss function is defined as follows:

L_total = L_t + L_s + L_clinical + L_cls + L_feat
where L_t and L_s denote the supervised loss of the teacher-network prediction and that of the student-network prediction, respectively, both calculated with the cross-entropy loss; L_clinical denotes the self-supervised loss of the clinical-data prediction, namely the error between the predicted P_c and the ground truth of the clinical data; L_cls denotes the classification-level distillation loss and L_feat the feature-level distillation loss between the teacher and student networks; the modal-aware distillation network for MVI prediction is trained with the loss L_total.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310058590.2A CN116030025A (en) | 2023-01-18 | 2023-01-18 | Hepatocellular carcinoma prediction method based on modal sensing distillation network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310058590.2A CN116030025A (en) | 2023-01-18 | 2023-01-18 | Hepatocellular carcinoma prediction method based on modal sensing distillation network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116030025A true CN116030025A (en) | 2023-04-28 |
Family
ID=86075702
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310058590.2A Pending CN116030025A (en) | 2023-01-18 | 2023-01-18 | Hepatocellular carcinoma prediction method based on modal sensing distillation network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116030025A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN117253611A (en) * | 2023-09-25 | 2023-12-19 | 四川大学 | Intelligent early cancer screening method and system based on multi-modal knowledge distillation
CN117253611B (en) * | 2023-09-25 | 2024-04-30 | 四川大学 | Intelligent early cancer screening method and system based on multi-modal knowledge distillation
CN117173165A (en) * | 2023-11-02 | 2023-12-05 | 安徽大学 | Contrast agent-free liver tumor detection method, system and medium based on reinforcement learning
CN118366654A (en) * | 2024-04-09 | 2024-07-19 | 重庆邮电大学 | Cross-modal knowledge distillation-based lung cancer risk prediction method
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wang et al. | Mixed transformer u-net for medical image segmentation | |
Pacal et al. | A robust real-time deep learning based automatic polyp detection system | |
CN112966127B (en) | Cross-modal retrieval method based on multilayer semantic alignment | |
CN116030025A (en) | Hepatocellular carcinoma prediction method based on modal sensing distillation network | |
Duan et al. | Gesture recognition based on multi‐modal feature weight | |
CN107358293B (en) | Neural network training method and device | |
CN111325750B (en) | Medical image segmentation method based on multi-scale fusion U-shaped chain neural network | |
CN111462191B (en) | Non-local filter unsupervised optical flow estimation method based on deep learning | |
CN112651406B (en) | Depth perception and multi-mode automatic fusion RGB-D significance target detection method | |
CN111612100B (en) | Object re-identification method, device, storage medium and computer equipment | |
Cai et al. | A robust interclass and intraclass loss function for deep learning based tongue segmentation | |
CN117422704B (en) | Cancer prediction method, system and equipment based on multi-mode data | |
CN104463819A (en) | Method and apparatus for filtering an image | |
CN116612288B (en) | Multi-scale lightweight real-time semantic segmentation method and system | |
CN116258695A (en) | Semi-supervised medical image segmentation method based on interaction of Transformer and CNN | |
CN115761408A (en) | Knowledge distillation-based federal domain adaptation method and system | |
CN116434010A (en) | Multi-view pedestrian attribute identification method | |
CN114463848A (en) | Progressive learning gait recognition method based on memory enhancement | |
Huang et al. | Class-specific distribution alignment for semi-supervised medical image classification | |
Jia et al. | Bidirectional stereo matching network with double cost volumes | |
Li et al. | A multi-grained unsupervised domain adaptation approach for semantic segmentation | |
CN115409843B (en) | Brain nerve image feature extraction method based on scale equalization coupling convolution architecture | |
Lu et al. | GA-CSPN: generative adversarial monocular depth estimation with second-order convolutional spatial propagation network | |
CN113962846A (en) | Image alignment method and device, computer readable storage medium and electronic device | |
Lu et al. | Medical image segmentation using boundary-enhanced guided packet rotation dual attention decoder network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||