CN116030025A - Hepatocellular carcinoma prediction method based on a modality-aware distillation network - Google Patents


Publication number
CN116030025A
CN116030025A (application CN202310058590.2A)
Authority
CN
China
Prior art keywords: network, distillation, data, MRI, feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310058590.2A
Other languages
Chinese (zh)
Inventor
王连生
张英豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen University
Original Assignee
Xiamen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen University filed Critical Xiamen University
Priority to CN202310058590.2A priority Critical patent/CN116030025A/en
Publication of CN116030025A publication Critical patent/CN116030025A/en
Pending legal-status Critical Current

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A 90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A 90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Magnetic Resonance Imaging Apparatus (AREA)

Abstract

The invention discloses a hepatocellular carcinoma prediction method based on a modality-aware distillation network, comprising the following steps: S1, acquiring a dataset of hepatocellular carcinoma patients and dividing the whole dataset into five folds according to a five-fold cross-validation scheme, where in each round of cross-validation one fold of data serves as the test set and the remaining four folds serve as the training set; S2, preprocessing the data by finding the largest tumor bounding cuboid across all patients and removing the non-tumor regions outside it; S3, establishing a modality-aware distillation network and training it, the network being used to transfer the knowledge that a teacher network learns jointly from the clinical-data modality and the image modality to a student network that has only the image modality; S4, predicting hepatocellular carcinoma with the trained modality-aware distillation network.

Description

Hepatocellular carcinoma prediction method based on a modality-aware distillation network
Technical Field
The invention relates to the technical field of biology, and in particular to a hepatocellular carcinoma prediction method based on a modality-aware distillation network.
Background
Hepatocellular carcinoma (HCC) is a malignant tumor arising from liver cells and is a common pathological type of primary liver cancer. The following methods currently exist for predicting microvascular invasion (MVI) of hepatocellular carcinoma: 1. predicting preoperative MVI from CT images using extreme gradient boosting and deep learning; 2. fusing features from multiple MR sequences with a 3D CNN prediction model; 3. embedding a long short-term memory (LSTM) network into a CNN to fuse multi-modal MR volumes and predict the MVI of HCC patients. These three methods predict MVI status from images alone and achieve low accuracy. In addition, predictions have been made with the following knowledge-distillation methods: 1. using knowledge distillation (KD) to effectively segment neuronal structures in microscopy images from 3D optical images; 2. following the idea of KD, using soft labels produced by expanding mask boundaries to segment brain lesions; 3. using KD for multi-source transfer learning in lung pattern analysis tasks; 4. formulating a category-guided contrastive distillation module that pulls positive image pairs from the same class in the teacher and student models together while pushing negative image pairs from different classes apart. The distillation networks adopted by these four methods consider only different image data and transfer information only from the input image data, so their classification precision is poor and their prediction accuracy is low.
Disclosure of Invention
The invention aims to provide a hepatocellular carcinoma prediction method based on a modality-aware distillation network, which migrates the knowledge of a teacher network that has both image modalities and non-image clinical data to a student network with only image modalities, and provides a modality-aware distillation network (MD-Net) for HCC MVI prediction, thereby effectively improving classification precision and prediction accuracy.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
a hepatocellular carcinoma prediction method based on a modal sensing distillation network comprises the following steps:
s1, acquiring a data set of a hepatocellular carcinoma patient, dividing the whole data set into five folds according to a five-fold cross-validation scheme, wherein in each round of cross-validation, one fold data is used as a test set, and the other four fold data are used as training sets;
s2, preprocessing the data, finding the largest external cube for the tumors of all patients, and removing other non-tumor areas except the cube;
s3, establishing a modal sensing distillation network, and training the modal sensing distillation network, wherein the modal sensing distillation network is used for transferring the knowledge learned by combining a teacher network through clinical data modalities and image modalities to a student network only provided with the image modalities;
s4, predicting the hepatocellular carcinoma through the trained modal sensing distillation network.
Preferably, the dataset in step S1 consists of data from 270 pathologically confirmed HCC patients, comprising 128 M0 patients, 93 M1 patients and 49 M2 patients; where M0 indicates no microvascular invasion, M1 indicates invasion of no more than 5 blood vessels, all located within 1 cm of the tumor surface, and M2 indicates invasion of more than 5 blood vessels or invasion more than 1 cm from the tumor surface.
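The five-fold split of step S1 over the 270 patients can be sketched as below. The class counts (128 M0 / 93 M1 / 49 M2) come from the patent; splitting each class evenly across the folds (a stratified split) is an assumption, since the patent only states that the dataset is divided into five folds.

```python
# Sketch of the five-fold cross-validation split of step S1.
# Stratification by MVI grade is an assumption, not stated in the patent.
import numpy as np

def stratified_five_folds(labels, seed=0):
    """Return five index arrays, each class divided evenly across folds."""
    rng = np.random.default_rng(seed)
    folds = [[] for _ in range(5)]
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        for k, chunk in enumerate(np.array_split(idx, 5)):
            folds[k].extend(chunk.tolist())
    return [np.array(sorted(f)) for f in folds]

labels = np.array([0] * 128 + [1] * 93 + [2] * 49)   # M0 / M1 / M2
folds = stratified_five_folds(labels)
# Round r: fold r is the test set, the other four folds the training set.
test = folds[0]
train = np.concatenate([folds[k] for k in range(1, 5)])
```

Each of the five rounds then trains on four folds and evaluates on the held-out fold.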
Preferably, in step S2, the size of the cuboid is set to 80×80×20 pixels.
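The step-S2 preprocessing can be sketched as a fixed-size crop around the tumor. Centring the 80×80×20 box on the tumor mask and zero-padding scans smaller than the box are assumptions; the patent fixes only the box size.

```python
# Minimal sketch of step S2: crop an 80x80x20 box around the tumour and
# discard everything outside it.  Centring and padding are assumptions.
import numpy as np

CUBE = (80, 80, 20)

def crop_tumor_cube(volume, mask):
    """Crop `volume` to a CUBE-sized box centred on the tumour mask."""
    coords = np.argwhere(mask)                       # tumour voxel coords
    centre = (coords.min(axis=0) + coords.max(axis=0)) // 2
    start = [int(np.clip(c - s // 2, 0, max(d - s, 0)))
             for c, s, d in zip(centre, CUBE, volume.shape)]
    crop = volume[start[0]:start[0] + CUBE[0],
                  start[1]:start[1] + CUBE[1],
                  start[2]:start[2] + CUBE[2]]
    out = np.zeros(CUBE, dtype=volume.dtype)         # zero-pad small scans
    out[:crop.shape[0], :crop.shape[1], :crop.shape[2]] = crop
    return out

volume = np.zeros((160, 160, 40))
volume[55, 55, 12] = 7.0                             # a marked tumour voxel
mask = np.zeros(volume.shape, dtype=bool)
mask[50:60, 50:60, 10:15] = True                     # toy tumour mask
cropped = crop_tumor_cube(volume, mask)
```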
Preferably, the training process of the modality-aware distillation network in step S3 is specifically:
S31, the teacher network passes the HBP image together with the clinical data into an MRI-clinical fusion module to extract a 512-dimensional feature vector, and passes the PRE image together with the clinical data into another MRI-clinical fusion module to obtain a second 512-dimensional feature vector; the two 512-dimensional feature vectors are fed into an SA module to obtain new features $\hat{F}^t_{hbp}$ and $\hat{F}^t_{pre}$ that fuse each other's information; finally, $\hat{F}^t_{hbp}$ and $\hat{F}^t_{pre}$ are concatenated to generate $Z_t$, and $Z_t$ is passed into two fully connected layers to predict the classification result $P_t$;
S32, the student network takes the 3D HBP MRI image and the 3D PRE MRI image as input; the HBP data is passed into an MRI-only module to obtain a feature, and the PRE data is passed into another MRI-only module to obtain a feature; the two features are fed into an SA module to obtain new features $\hat{F}^s_{hbp}$ and $\hat{F}^s_{pre}$ that fuse each other's information, each a 512-dimensional feature vector; finally, $\hat{F}^s_{hbp}$ and $\hat{F}^s_{pre}$ are concatenated to generate $Z_s$, and $Z_s$ is input into the fully connected layer to predict the MVI classification result $P_s$;
S33, a regression task is introduced into the student network: the concatenated feature $Z_s$ of the input HBP image and the input PRE image is passed into two fully connected layers to predict a 52-dimensional vector $P_c$ that estimates the latent clinical information, and the input clinical data is reused as the ground-truth label for the prediction $P_c$;
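The self-supervision head of step S33 can be sketched as two fully connected layers mapping the student feature to a 52-dimensional clinical estimate. The hidden size (256), the 1024-dimensional input (the concatenation of two 512-dimensional features), and the mean-squared-error surrogate for the regression loss are illustrative assumptions.

```python
# Hypothetical sketch of the step-S33 head: two FC layers map Z_s to a
# 52-dim estimate P_c; the true clinical vector serves as its label.
# Layer sizes and the MSE surrogate are assumptions.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(0, 0.01, (1024, 256)), np.zeros(256)
W2, b2 = rng.normal(0, 0.01, (256, 52)), np.zeros(52)

def predict_clinical(z_s):
    h = np.maximum(z_s @ W1 + b1, 0.0)   # first FC layer + ReLU
    return h @ W2 + b2                   # second FC layer -> 52-dim P_c

z_s = rng.normal(size=1024)              # concat of two 512-dim features
p_c = predict_clinical(z_s)
clinical_true = rng.normal(size=52)      # stand-in for the real record
l_clinical = np.mean((p_c - clinical_true) ** 2)
```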
s34, distilling the fused features in the teacher network clinical data and the MRI images to the features extracted from the MRI images by using the classification level distillation loss and the feature level distillation loss, and converting the clinical information of the teacher network into the student network by using a knowledge distillation strategy.
Preferably, in step S31, the MRI-clinical fusion module integrates MRI data and non-imaging clinical data; it takes 3D MRI data and the vectorized clinical data as input and applies four fully connected layers to the input clinical data to obtain four feature maps whose feature channels are 64, 128, 256 and 256 respectively; four convolution blocks, each consisting of two 3×3 convolution layers, are applied to the input MRI image to obtain another set of 3D feature maps whose feature channels are likewise set to 64, 128, 256 and 256; the four feature maps of the clinical data and the corresponding four feature maps of the MRI data are multiplied channel-wise to integrate them, after which a 3×3 convolution layer and a fully connected layer are applied to output a feature vector with 512 dimensions.
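The channel-wise multiplication at the heart of the fusion module can be sketched as a clinical feature vector gating the channels of a 3D MRI feature map. The spatial dimensions below are illustrative; the patent fixes only the channel counts (64, 128, 256, 256).

```python
# Sketch of the "channel-wise multiplication" in the MRI-clinical fusion
# module: a C-dim clinical feature gates the C channels of a 3D feature map.
import numpy as np

def channel_wise_multiply(mri_feat, clin_feat):
    """mri_feat: (C, D, H, W) feature map; clin_feat: (C,) vector."""
    return mri_feat * clin_feat[:, None, None, None]

mri_feat = np.ones((64, 10, 10, 5))          # toy 64-channel 3D feature map
clin_feat = np.arange(64, dtype=float)       # toy 64-dim clinical feature
fused = channel_wise_multiply(mri_feat, clin_feat)
```

Each MRI channel is scaled by the corresponding clinical-feature entry, so the clinical modality modulates the imaging features before the final convolution and fully connected layers.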
Preferably, in step S32 the MRI-only module extracts a 512-dimensional feature vector from the 3D MRI image; the MRI-only module consists of nine convolution blocks and a fully connected layer, where each convolution block comprises a batch normalization layer, a ReLU activation layer and a 3×3 convolution layer, which improves the robustness of the network; the numbers of output channels of the nine convolution blocks differ: the feature channels of the first five layers are 32, 64 and 128, and those of the last four layers are 256, 128, 256 and 256, to balance efficiency against computational load.
Preferably, the SA module in steps S31 and S32 is a symmetric attention module. Let X and Y denote the two input feature maps of the SA module. The SA module applies a linear transformation layer to X to obtain three feature maps: a Query vector $Q_x$, a Key vector $K_x$ and a Value vector $V_x$; it applies an adaptive transformation layer to Y to generate a Key feature map $K_y$ and a Value feature map $V_y$. Multiplying $Q_x$ by the transpose of $K_x$ generates the score feature vector $S_x$, and multiplying $Q_x$ by the transpose of $K_y$ generates another score feature vector $S_y$. The score vector $S_x$ is multiplied by the Value vector $V_x$, $S_y$ is multiplied by $V_y$, and the two resulting feature vectors are added to produce the output refined feature:

$$\hat{X} = S_x V_x + S_y V_y = \left(Q_x K_x^{\top}\right) V_x + \left(Q_x K_y^{\top}\right) V_y$$

The SA module applies another linear transformation layer to Y to obtain a Query vector $Q_y$; multiplying $Q_y$ by the transpose of $K_y$ and $Q_y$ by the transpose of $K_x$ yields two further score feature vectors, from which the second refined feature is computed as:

$$\hat{Y} = \left(Q_y K_y^{\top}\right) V_y + \left(Q_y K_x^{\top}\right) V_x$$
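The symmetric attention computation described above can be sketched in numpy: each input attends both to itself and to the other input. Two simplifications are assumptions: a single shared projection per role (Query/Key/Value) stands in for the separate linear and adaptive transformation layers of the patent, and the score vectors are softmax-normalised, which the patent text does not state explicitly.

```python
# Numpy sketch of a symmetric attention (SA) module: self-attention plus
# cross-attention for each input.  Shared projections and the softmax over
# scores are simplifying assumptions.
import numpy as np

def softmax(s):
    e = np.exp(s - s.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sa_module(X, Y, d=64, seed=0):
    rng = np.random.default_rng(seed)
    n_in = X.shape[-1]
    # one random projection per role, shared by both inputs (illustrative)
    Wq, Wk, Wv = (rng.normal(0.0, 0.1, (n_in, d)) for _ in range(3))
    Qx, Kx, Vx = X @ Wq, X @ Wk, X @ Wv
    Qy, Ky, Vy = Y @ Wq, Y @ Wk, Y @ Wv
    X_ref = softmax(Qx @ Kx.T) @ Vx + softmax(Qx @ Ky.T) @ Vy
    Y_ref = softmax(Qy @ Ky.T) @ Vy + softmax(Qy @ Kx.T) @ Vx
    return X_ref, Y_ref

X = np.random.default_rng(1).normal(size=(8, 512))   # e.g. HBP features
Y = np.random.default_rng(2).normal(size=(8, 512))   # e.g. PRE features
X_ref, Y_ref = sa_module(X, Y)
```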
Preferably, the modality-aware distillation in step S3 comprises classification-level distillation and feature-level distillation;
in the classification-level distillation, let $p^s_i$ denote the class probabilities of the class to which the MRI image data $x_i$ belongs as produced by the student network, and $p^t_i$ the class probabilities for $x_i$ produced by the teacher network; the classification-level distillation loss $\mathcal{L}_{cd}$ is defined so that the class probabilities from the teacher network become the targets for training the student network, and the difference between the two distributions is measured with the Kullback-Leibler divergence:

$$\mathcal{L}_{cd} = \frac{1}{N}\sum_{i=1}^{N} D_{KL}\left(p^t_i \,\big\|\, p^s_i\right)$$

where N and M denote the number of training samples and the number of total categories respectively, $D_{KL}(\cdot)$ denotes the Kullback-Leibler divergence between the two probability distributions (summed over the M classes), $p^s_i$ denotes the student network's prediction for a sample, and $p^t_i$ denotes the teacher network's prediction;
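The classification-level distillation loss described in this step, the mean KL divergence from the teacher's class distribution to the student's, can be sketched in numpy; the epsilon clipping is an implementation detail added for numerical safety.

```python
# Sketch of the classification-level distillation loss: mean KL divergence
# from teacher class probabilities to student class probabilities.
import numpy as np

def kl_div(p, q, eps=1e-12):
    """Row-wise D_KL(p || q) for matrices of probabilities."""
    p, q = np.clip(p, eps, 1.0), np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q), axis=-1)

def classification_distillation_loss(p_teacher, p_student):
    """p_*: (N, M) class-probability matrices; returns the mean KL."""
    return float(np.mean(kl_div(p_teacher, p_student)))

p_t = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])   # teacher soft targets
p_s = np.array([[0.6, 0.3, 0.1], [0.2, 0.7, 0.1]])   # student predictions
loss = classification_distillation_loss(p_t, p_s)
```

The loss is zero exactly when the student reproduces the teacher's distribution, which is what makes the teacher's soft predictions usable as training targets.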
in the feature-level distillation, the feature-level distillation loss $\mathcal{L}_{fd}$ is calculated as the combination of the Kullback-Leibler divergence between $\hat{F}^t_{hbp}$ and $\hat{F}^s_{hbp}$ and the Kullback-Leibler divergence between $\hat{F}^t_{pre}$ and $\hat{F}^s_{pre}$:

$$\mathcal{L}_{fd} = \beta_1 D_{KL}\left(\hat{F}^t_{hbp} \,\big\|\, \hat{F}^s_{hbp}\right) + \beta_2 D_{KL}\left(\hat{F}^t_{pre} \,\big\|\, \hat{F}^s_{pre}\right)$$

where the $\beta$ coefficients weight the Kullback-Leibler divergence terms, with $\beta_1 = 1$; the first term is the divergence between the two HBP features and the second that between the two PRE features, with superscript s marking the student network's features and superscript t the teacher network's;
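The feature-level distillation loss described in this step can be sketched as beta-weighted KL terms over the refined HBP and PRE features. Converting the 512-dimensional feature vectors into probability distributions with a softmax before the KL is an assumption; the patent does not state how the features are normalised.

```python
# Sketch of the feature-level distillation loss: beta-weighted KL terms
# between teacher and student refined features.  The softmax normalisation
# of the raw features is an assumption.
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def feature_kl(f_t, f_s, eps=1e-12):
    p = np.clip(softmax(f_t), eps, 1.0)
    q = np.clip(softmax(f_s), eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def feature_distillation_loss(f_hbp_t, f_hbp_s, f_pre_t, f_pre_s,
                              beta1=1.0, beta2=1.0):
    return (beta1 * feature_kl(f_hbp_t, f_hbp_s)
            + beta2 * feature_kl(f_pre_t, f_pre_s))

rng = np.random.default_rng(0)
f_hbp_t, f_hbp_s = rng.normal(size=512), rng.normal(size=512)
f_pre_t, f_pre_s = rng.normal(size=512), rng.normal(size=512)
loss = feature_distillation_loss(f_hbp_t, f_hbp_s, f_pre_t, f_pre_s)
```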
the final loss function comprises the two supervised losses on the teacher network and the student network, the self-supervised loss of the clinical-data prediction, and the distillation losses between the student network and the teacher network; it is defined as follows:

$$L_{total} = \mathcal{L}^t_{sup} + \mathcal{L}^s_{sup} + L_{clinical} + \mathcal{L}_{cd} + \mathcal{L}_{fd}$$

where $\mathcal{L}^t_{sup}$ and $\mathcal{L}^s_{sup}$ denote the supervised loss of the teacher network's prediction and that of the student network's prediction respectively, computed with the focal loss on the predictions $P_t$ and $P_s$; $L_{clinical}$ denotes the self-supervised loss of the clinical-data prediction, computed as a cross-entropy loss between the prediction $P_C$ and the ground truth given by the clinical data; $\mathcal{L}_{cd}$ denotes the classification-level distillation loss and $\mathcal{L}_{fd}$ the feature-level distillation loss between the teacher and student networks; the modality-aware distillation network for MVI prediction is trained with $L_{total}$.
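The total objective described in this step can be sketched as the sum of a focal-loss supervised term per network, the clinical self-supervision term, and the two distillation terms. The focal-loss focusing parameter (gamma = 2) and the equal weighting of the five terms are assumptions not fixed by the patent.

```python
# Sketch of the total training objective.  Gamma and the equal term
# weighting are assumptions.
import numpy as np

def focal_loss(probs, labels, gamma=2.0, eps=1e-12):
    """Mean focal loss; probs: (N, M) probabilities, labels: (N,) ints."""
    pt = np.clip(probs[np.arange(len(labels)), labels], eps, 1.0)
    return float(np.mean(-((1.0 - pt) ** gamma) * np.log(pt)))

def total_loss(l_sup_t, l_sup_s, l_clinical, l_cd, l_fd):
    # equal weighting of the five terms is an assumption
    return l_sup_t + l_sup_s + l_clinical + l_cd + l_fd

probs = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])  # toy predictions
labels = np.array([0, 1])                             # toy MVI grades
l_sup = focal_loss(probs, labels)
l_tot = total_loss(l_sup, l_sup, 0.1, 0.05, 0.02)     # placeholder terms
```

The focal loss down-weights well-classified samples, which suits the imbalanced M0/M1/M2 distribution of the dataset.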
After the technical scheme is adopted, the invention has the following beneficial effects:
1. The invention migrates the knowledge of a teacher network that has both image modalities and non-image clinical data to a student network with only image modalities, and provides a modality-aware distillation network (MD-Net) for HCC MVI prediction, which effectively improves classification precision and prediction accuracy.
2. The student network of the modality-aware distillation network (MD-Net) includes two MRI-only modules for extracting MRI features alone and one symmetric attention (SA) module for refining the features from the two MRI images, while the teacher network includes two MRI-clinical fusion modules for fusing the MRI data with the 52-dimensional clinical data vector and one SA module for refining the two fused features.
3. In addition to the original classification-level distillation of the outputs, the invention designs a feature-level distillation for the modality-aware distillation network (MD-Net) to better transfer the clinical data from the teacher network to the student network. Moreover, a new self-supervised task is devised to predict the clinical data from the image data, further enhancing MVI prediction.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is an exemplary view of a tumor in an HCC MRI image of the present invention;
FIG. 3 is a block diagram of the modality-aware distillation network according to the present invention;
FIG. 4 is an exemplary diagram of the MRI-clinical fusion module, the MRI-only module, and the channel-wise multiplication of the present invention, wherein (a) the MRI-clinical fusion module fuses MRI data with non-imaging clinical data; (b) the MRI-only module uses only MRI image data; (c) shows one example of channel-wise multiplication;
FIG. 5 is a framework flowchart of the SA module of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the following examples in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Examples
As shown in figs. 1 to 5, a hepatocellular carcinoma prediction method based on a modality-aware distillation network includes the following steps:
S1, acquiring a dataset of hepatocellular carcinoma patients and dividing the whole dataset into five folds according to a five-fold cross-validation scheme, where in each round of cross-validation one fold of data serves as the test set and the remaining four folds serve as the training set;
the dataset in step S1 consists of data from 270 pathologically confirmed HCC patients, comprising 128 M0 patients, 93 M1 patients and 49 M2 patients; where M0 indicates no microvascular invasion, M1 indicates invasion of no more than 5 blood vessels, all located within 1 cm of the tumor surface, and M2 indicates invasion of more than 5 blood vessels or invasion more than 1 cm from the tumor surface;
S2, preprocessing the data: find the largest tumor bounding cuboid across all patients and remove the non-tumor regions outside this cuboid;
in step S2 the size of the cuboid is set to 80×80×20 pixels;
S3, establishing a modality-aware distillation network and training it; the network transfers the knowledge learned jointly by a teacher network from the clinical-data modality and the image modality to a student network that has only the image modality;
the training process of the modal sensing distillation network in the step S3 specifically comprises the following steps:
s31, transmitting the HBP image and clinical data to an MRI-clinical fusion module by a teacher network, and extracting 512-dimensional vector features; inputting the PRE image and clinical data into another MRI-clinical fusion module to obtain another 512-dimensional vector feature; inputting the obtained two 512-dimensional vector features into an SA module to obtain new features fused with each other information
Figure BDA0004060886490000071
and />
Figure BDA0004060886490000072
Finally, new features are added>
Figure BDA0004060886490000073
and />
Figure BDA0004060886490000074
Spliced to generate Z t And Z is to t Passed into two fully connected layers to predict classification result P t
S32, the student network takes the 3DHBPMRI image and the 3DPREMRI image as input, transmits HBP data to the MRI-only module to obtain the characteristics, and transmits PRE data to the other MRI-only module to obtain the characteristics; the two obtainedThe features are input into an SA module to obtain new features fused with each other information
Figure BDA0004060886490000075
and />
Figure BDA0004060886490000076
Wherein the new feature->
Figure BDA0004060886490000077
and />
Figure BDA0004060886490000078
Is two feature vectors containing 512 dimensions; finally, new features are added>
Figure BDA0004060886490000079
and />
Figure BDA00040608864900000710
Are connected to generate Z s And Z is to s Input into the full connection layer to predict MVI classification result P s
S33, introducing a regression task into the student network, and inputting the connection characteristic Z of the HBP image and the PRE image s Delivered to two fully connected layers, the 52-dimensional vector P is predicted c For estimating potential clinical information, reusing input clinical data as prediction P c Is a real tag of (1);
s34, distilling the characteristics fused in the teacher network clinical data and the MRI images to the characteristics extracted from the MRI images by adopting the classification level distillation loss and the characteristic level distillation loss, and converting the clinical information of the teacher network into the student network by utilizing a knowledge distillation strategy;
in step S31, the MRI-clinical fusion module integrates MRI data and non-image clinical data, takes 3DMRI data and vectorized clinical data as input, and applies four full connection layers to the input clinical data to obtain four feature graphs, wherein the feature channels are 64, 128, 256 and 256 respectively; another 3D feature map is obtained with four convolution blocks on the input MRI image, each consisting of two 3 x 3 convolution layers, and the feature channels are also set to 64, 128, 256, and 256; multiplying the four feature maps in the clinical data and the corresponding four features in the MRI data by channel-wise to integrate the four feature maps and the corresponding four features together, and then applying a 3X 3 convolution layer and a full connection layer to output feature vectors with 512 dimensions;
in step S32, the MRI-only module extracts 512-dimensional feature vectors from the 3DMRI image; the MRI-only module consists of nine convolution blocks and a full connection layer, wherein each convolution block comprises a batch processing standard layer, a ReLU activation layer and a 3X 3 convolution layer and is used for improving the robustness of a network; setting the channel numbers of the output characteristics of the nine convolution blocks to be different, wherein the characteristic channels of the first five layers are 32, 64 and 128, and the characteristic channels of the last four layers are 256, 128, 256 and 256, so as to balance efficiency and calculation burden;
the SA modules in steps S31 and S32 are symmetric attention modules, X and Y represent two input feature maps of the SA module, and the SA module applies a linear transformation layer on X to obtain three feature maps including Query vector Q x Key vector K x And Value vector V x The method comprises the steps of carrying out a first treatment on the surface of the SA module applies an adaptive transformation layer on Y to generate Key feature map K y And Value feature map V y The method comprises the steps of carrying out a first treatment on the surface of the By multiplying by Q x and Kx Generates score feature vector S by transpose of (1) x By multiplying by Q y and Ky To generate another score feature vector S y The method comprises the steps of carrying out a first treatment on the surface of the The obtained score feature vector S x And Value feature vector V x Multiply and add S y And V is equal to y Multiplying to generate two result eigenvectors, adding them, and finally generating output refined eigenvector
Figure BDA0004060886490000081
Figure BDA0004060886490000082
SA module application on YAnother linear transformation layer, obtaining a characteristic vector Query vector Q y By combining Q y and Ky Is multiplied by the transpose of (2) and Q y and Ky To calculate two score eigenvectors by transposed multiplication, and then calculate a refinement eigenvector by the following formula
Figure BDA0004060886490000083
Figure BDA0004060886490000084
/>
the modality-aware distillation in step S3 comprises classification-level distillation and feature-level distillation;
in the classification-level distillation, let $p^s_i$ denote the class probabilities of the class to which the MRI image data $x_i$ belongs as produced by the student network, and $p^t_i$ the class probabilities for $x_i$ produced by the teacher network; the classification-level distillation loss $\mathcal{L}_{cd}$ is defined so that the class probabilities from the teacher network become the targets for training the student network, and the difference between the two distributions is measured with the Kullback-Leibler divergence:

$$\mathcal{L}_{cd} = \frac{1}{N}\sum_{i=1}^{N} D_{KL}\left(p^t_i \,\big\|\, p^s_i\right)$$

where N and M denote the number of training samples and the number of total categories respectively, $D_{KL}(\cdot)$ denotes the Kullback-Leibler divergence between the two probability distributions (summed over the M classes), $p^s_i$ denotes the student network's prediction for a sample, and $p^t_i$ denotes the teacher network's prediction;
in the feature-level distillation, the feature-level distillation loss $\mathcal{L}_{fd}$ is calculated as the combination of the Kullback-Leibler divergence between $\hat{F}^t_{hbp}$ and $\hat{F}^s_{hbp}$ and the Kullback-Leibler divergence between $\hat{F}^t_{pre}$ and $\hat{F}^s_{pre}$:

$$\mathcal{L}_{fd} = \beta_1 D_{KL}\left(\hat{F}^t_{hbp} \,\big\|\, \hat{F}^s_{hbp}\right) + \beta_2 D_{KL}\left(\hat{F}^t_{pre} \,\big\|\, \hat{F}^s_{pre}\right)$$

where the $\beta$ coefficients weight the Kullback-Leibler divergence terms, with $\beta_1 = 1$; the first term is the divergence between the two HBP features and the second that between the two PRE features, with superscript s marking the student network's features and superscript t the teacher network's;
the final loss function comprises the two supervised losses on the teacher network and the student network, the self-supervised loss of the clinical-data prediction, and the distillation losses between the student network and the teacher network, and is defined as follows:

$$L_{total} = \mathcal{L}^t_{sup} + \mathcal{L}^s_{sup} + L_{clinical} + \mathcal{L}_{cd} + \mathcal{L}_{fd}$$

where $\mathcal{L}^t_{sup}$ and $\mathcal{L}^s_{sup}$ denote the supervised loss of the teacher network's prediction and that of the student network's prediction respectively, computed with the focal loss on the predictions $P_t$ and $P_s$; $L_{clinical}$ denotes the self-supervised loss of the clinical-data prediction, computed as a cross-entropy loss between the prediction $P_C$ and the ground truth given by the clinical data; $\mathcal{L}_{cd}$ denotes the classification-level distillation loss and $\mathcal{L}_{fd}$ the feature-level distillation loss between the teacher and student networks; the modality-aware distillation network for MVI prediction is trained with $L_{total}$;
s4, predicting the hepatocellular carcinoma through the trained modal sensing distillation network.
The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention. Therefore, the protection scope of the present invention should be subject to the protection scope of the claims.

Claims (8)

1. A hepatocellular carcinoma prediction method based on a modality-aware distillation network, characterized by comprising the following steps:
S1, acquiring a dataset of hepatocellular carcinoma patients, dividing the whole dataset into five folds according to a five-fold cross-validation scheme, taking one fold of data as the test set and the other four folds as the training set in each round of cross-validation, and computing the mean and variance of the evaluation indices over the five rounds;
S2, preprocessing the data: find the largest tumor bounding cuboid across all patients and remove the non-tumor regions outside this cuboid;
S3, establishing a modality-aware distillation network and training it, the network being used to transfer the knowledge learned jointly by a teacher network from the clinical-data modality and the image modality to a student network that has only the image modality;
S4, predicting hepatocellular carcinoma through the trained modality-aware distillation network.
2. The hepatocellular carcinoma prediction method based on a modality-aware distillation network according to claim 1, characterized in that the dataset in step S1 consists of data from 270 pathologically confirmed HCC patients, comprising 128 M0 patients, 93 M1 patients and 49 M2 patients; where M0 indicates no microvascular invasion, M1 indicates invasion of no more than 5 blood vessels, all located within 1 cm of the tumor surface, and M2 indicates invasion of more than 5 blood vessels or invasion more than 1 cm from the tumor surface.
3. The hepatocellular carcinoma prediction method based on a modality-aware distillation network according to claim 1, characterized in that in step S2 the size of the cuboid is set to 80×80×20 pixels.
4. The hepatocellular carcinoma prediction method based on a modal sensing distillation network as set forth in claim 1, wherein the training process of the modal sensing distillation network in step S3 is specifically as follows:
s31, transmitting the HBP image and clinical data to an MRI-clinical fusion module by a teacher network, and extracting 512-dimensional vector features; inputting the PRE image and clinical data into another MRI-clinical fusion module to obtain another 512-dimensional vector feature; inputting the obtained two 512-dimensional vector features into an SA module to obtain new features fused with each other information
Figure FDA0004060886480000011
and />
Figure FDA0004060886480000012
Finally, new features are added>
Figure FDA0004060886480000013
and />
Figure FDA0004060886480000014
Spliced to generate Z t And Z is to t Passed into two fully connected layers to predict classification result P t
S32, student network uses 3D HBP MRI image and 3D PRE MRI image is taken as input, HBP data is transmitted to an MRI-only module to obtain characteristics, PRE data is transmitted to another MRI-only module to obtain characteristics; inputting the obtained two features into an SA module to obtain new features fused with each other information
Figure FDA0004060886480000021
and />
Figure FDA0004060886480000022
Wherein the new feature->
Figure FDA0004060886480000023
and />
Figure FDA0004060886480000024
Is two feature vectors containing 512 dimensions; finally, new features are added>
Figure FDA0004060886480000025
and />
Figure FDA0004060886480000026
Are connected to generate Z s And Z is to s Input into the full connection layer to predict MVI classification result P s
S33, introducing a regression task into the student network, and inputting the connection characteristic Z of the HBP image and the PRE image s Delivered to two fully connected layers, the 52-dimensional vector P is predicted c For estimating potential clinical information, reusing input clinical data as prediction P c Is a real tag of (1);
S34. The features fused from clinical data and MRI images in the teacher network are distilled to the features extracted from the MRI images alone using a classification-level distillation loss and a feature-level distillation loss, transferring the clinical knowledge of the teacher network to the student network through a knowledge distillation strategy.
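The teacher/student forward passes of steps S31-S34 can be sketched at the shape level as follows. This is an illustrative sketch only: the `fc` stand-in uses random weights, the hidden width 256 and the 2-way MVI output are assumptions, and the real fusion, MRI-only and SA modules are replaced by placeholder 512-dimensional features.

```python
import numpy as np

rng = np.random.default_rng(0)

def fc(x, out_dim):
    """Stand-in for a fully connected layer (random weights, illustration only)."""
    w = rng.standard_normal((x.shape[-1], out_dim)) * 0.01
    return x @ w

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical 512-d features produced by the fusion / MRI-only modules (S31/S32)
f_hbp_t = rng.standard_normal((1, 512))   # teacher, HBP branch
f_pre_t = rng.standard_normal((1, 512))   # teacher, PRE branch
f_hbp_s = rng.standard_normal((1, 512))   # student, HBP branch
f_pre_s = rng.standard_normal((1, 512))   # student, PRE branch

# Concatenate the refined features and predict through two FC layers
z_t = np.concatenate([f_hbp_t, f_pre_t], axis=-1)   # (1, 1024)
p_t = softmax(fc(fc(z_t, 256), 2))                  # teacher prediction P_t
z_s = np.concatenate([f_hbp_s, f_pre_s], axis=-1)
p_s = softmax(fc(fc(z_s, 256), 2))                  # student prediction P_s

# S33: regression head on Z_s predicting the 52-d clinical vector P_c
p_c = fc(fc(z_s, 256), 52)
```

Concatenating the two 512-dimensional branch features yields the 1024-dimensional Z_t/Z_s that the classification and regression heads consume.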
5. The method for predicting hepatocellular carcinoma based on a modal-aware distillation network as set forth in claim 4, wherein in step S31 the MRI-clinical fusion module integrates MRI data and non-imaging clinical data, taking 3D MRI data and vectorized clinical data as input; four fully connected layers are applied to the input clinical data to obtain four feature maps with 64, 128, 256 and 256 feature channels, respectively; another set of 3D feature maps is obtained by applying four convolution blocks to the input MRI image, each consisting of two 3 × 3 convolution layers, with the feature channels also set to 64, 128, 256 and 256; the four feature maps from the clinical data and the four corresponding features from the MRI data are multiplied channel-wise to integrate them, and a 3 × 3 convolution layer and a fully connected layer are then applied to output a feature vector with 512 dimensions.
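The channel-wise integration in claim 5 can be sketched as follows: each scalar of a clinical feature vector scales one channel of the matching MRI feature map. The spatial size 8 × 8 × 8 and the choice of the 64-channel stage are hypothetical; only the channel-wise multiplication itself comes from the claim.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: a clinical feature vector from one FC layer and a 3D MRI
# feature map from the corresponding conv block (64-channel stage of claim 5).
clinical_feat = rng.standard_normal(64)           # (C,)
mri_feat = rng.standard_normal((64, 8, 8, 8))     # (C, D, H, W)

# Channel-wise multiplication: each clinical scalar scales one MRI channel,
# integrating the two modalities stage by stage.
fused = mri_feat * clinical_feat[:, None, None, None]
```

Broadcasting the (C,) vector against the (C, D, H, W) map applies one clinical weight per channel without copying data.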
6. The method for predicting hepatocellular carcinoma based on a modal-aware distillation network as set forth in claim 4, wherein in step S32 the MRI-only module extracts a 512-dimensional feature vector from the 3D MRI image; the MRI-only module consists of nine convolution blocks and a fully connected layer, each convolution block comprising a batch normalization layer, a ReLU activation layer and a 3 × 3 convolution layer, to improve the robustness of the network; the numbers of output channels of the nine convolution blocks are set differently: the feature channels of the first five layers are 32, 64 and 128, and those of the last four layers are 256, 128, 256 and 256, to balance efficiency and computational load.
7. The method for predicting hepatocellular carcinoma based on a modal-aware distillation network as set forth in claim 4, wherein the SA module in steps S31 and S32 is a symmetric attention module; let X and Y denote the two input feature maps of the SA module; the SA module applies a linear transformation layer on X to obtain three feature maps, comprising a Query vector Q_x, a Key vector K_x and a Value vector V_x; the SA module applies an adaptive transformation layer on Y to generate a Key feature map K_y and a Value feature map V_y; a score feature vector S_x is generated by multiplying Q_x with the transpose of K_x, and another score feature vector S_y is generated by multiplying Q_x with the transpose of K_y; the obtained score feature vector S_x is multiplied with the Value feature vector V_x, S_y is multiplied with V_y, and the two resulting feature vectors are added to produce the output refined feature:

X_hat = S_x · V_x + S_y · V_y

The SA module applies another linear transformation layer on Y to obtain a Query vector Q_y; two score feature vectors are calculated by multiplying Q_y with the transpose of K_y and Q_y with the transpose of K_x, and the refined feature is then calculated by the following formula:

Y_hat = (Q_y · K_y^T) · V_y + (Q_y · K_x^T) · V_x
8. The method for predicting hepatocellular carcinoma based on a modal-aware distillation network as set forth in claim 4, wherein the modal-aware distillation in step S3 comprises classification-level distillation and feature-level distillation;
In classification-level distillation, let p_i^s denote the class probability, generated by the student network, of the class to which MRI image data x_i belongs, and p_i^t denote the class probability generated by the teacher network; the classification-level distillation loss L_cls is defined so that the class probabilities from the teacher network serve as the targets for training the student network, with the difference between the two distributions measured by the Kullback-Leibler divergence:

L_cls = (1/N) Σ_{i=1}^{N} D_KL(p_i^t ‖ p_i^s) = (1/N) Σ_{i=1}^{N} Σ_{j=1}^{M} p_{i,j}^t · log(p_{i,j}^t / p_{i,j}^s)

wherein N and M represent the number of training samples and the number of total categories, respectively, D_KL(·) denotes the Kullback-Leibler divergence between two probability distributions, p_i^s denotes the student network prediction for a sample, and p_i^t denotes the teacher network prediction;
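The classification-level distillation loss above can be written directly in numpy. This is a sketch of the stated formula: mean over samples of the KL divergence from teacher to student class probabilities (the small epsilon clip is added for numerical safety and is not part of the claim).

```python
import numpy as np

def kl_div(p, q, eps=1e-8):
    """D_KL(p || q) per sample, summed over the M classes."""
    p, q = np.clip(p, eps, 1.0), np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q), axis=-1)

def cls_distill_loss(p_teacher, p_student):
    """L_cls: mean KL divergence from teacher class probabilities to
    student class probabilities over the N training samples."""
    return float(np.mean(kl_div(p_teacher, p_student)))

p_t = np.array([[0.9, 0.1], [0.2, 0.8]])  # teacher probs, N=2 samples, M=2 classes
p_s = np.array([[0.7, 0.3], [0.4, 0.6]])  # student probs
loss = cls_distill_loss(p_t, p_s)
```

The loss is zero exactly when the student reproduces the teacher's distribution and grows as the two diverge, which is what makes the teacher probabilities a usable training target.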
In feature-level distillation, the feature-level distillation loss L_feat is calculated as the combination of the Kullback-Leibler divergence between F^s_HBP and F^t_HBP and the Kullback-Leibler divergence between F^s_PRE and F^t_PRE:

L_feat = β_1 · D_KL(F^t_HBP ‖ F^s_HBP) + β_2 · D_KL(F^t_PRE ‖ F^s_PRE)

wherein β is used to weight the Kullback-Leibler divergence terms, with the weight β_1 = 1; D_KL(F^t_HBP ‖ F^s_HBP) denotes the Kullback-Leibler divergence between the two features F^t_HBP and F^s_HBP, and D_KL(F^t_PRE ‖ F^s_PRE) denotes the Kullback-Leibler divergence between the two features F^t_PRE and F^s_PRE; the superscript s marks student network features and the superscript t marks teacher network features;
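A sketch of the feature-level distillation loss. The claim only names KL divergence between teacher and student features; converting the raw 512-dimensional feature vectors into probability distributions with a softmax before taking KL is an assumption of this sketch, as is β_2 = 1.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def kl_div(p, q, eps=1e-8):
    p, q = np.clip(p, eps, 1.0), np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q), axis=-1)

def feat_distill_loss(f_hbp_t, f_hbp_s, f_pre_t, f_pre_s, beta1=1.0, beta2=1.0):
    """L_feat: weighted sum of KL divergences between teacher and student
    features for the HBP and PRE branches (softmax normalization assumed)."""
    return float(beta1 * np.mean(kl_div(softmax(f_hbp_t), softmax(f_hbp_s)))
                 + beta2 * np.mean(kl_div(softmax(f_pre_t), softmax(f_pre_s))))

rng = np.random.default_rng(0)
ft_h, fs_h = rng.standard_normal((1, 512)), rng.standard_normal((1, 512))
ft_p, fs_p = rng.standard_normal((1, 512)), rng.standard_normal((1, 512))
loss = feat_distill_loss(ft_h, fs_h, ft_p, fs_p)
```

Pulling the student's branch features toward the teacher's is what transfers the clinically informed representation into the image-only student.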
The final loss function comprises the two supervised losses on the teacher network and the student network, the self-supervised loss of clinical-data prediction, and the distillation losses between the student network and the teacher network; the loss function is defined as follows:

L_total = L_t + L_s + L_clinical + L_cls + L_feat

wherein L_t and L_s represent the supervised losses of the teacher network prediction and the student network prediction, respectively, calculated with the cross-entropy loss on the predictions P_t and P_s; L_clinical represents the self-supervised loss of clinical-data prediction, the error between the prediction P_c and the ground truth of the clinical data; L_cls represents the classification-level distillation loss, and L_feat represents the feature-level distillation loss between the teacher and student networks; the modal-aware distillation network for MVI prediction is trained with the total loss L_total.
CN202310058590.2A 2023-01-18 2023-01-18 Hepatocellular carcinoma prediction method based on modal sensing distillation network Pending CN116030025A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117253611A (en) * 2023-09-25 2023-12-19 Sichuan University Intelligent early cancer screening method and system based on multi-modal knowledge distillation
CN117253611B (en) * 2023-09-25 2024-04-30 Sichuan University Intelligent early cancer screening method and system based on multi-modal knowledge distillation
CN117173165A (en) * 2023-11-02 2023-12-05 Anhui University Contrast agent-free liver tumor detection method, system and medium based on reinforcement learning
CN118366654A (en) * 2024-04-09 2024-07-19 Chongqing University of Posts and Telecommunications Cross-modal knowledge distillation-based lung cancer risk prediction method


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination