CN116091412A - Method for segmenting tumor from PET/CT image - Google Patents


Info

Publication number: CN116091412A
Application number: CN202211575130.9A
Authority: CN (China)
Prior art keywords: teacher, network, PET, networks, data set
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 姜慧研, 王思阳
Assignee (current and original): Northeastern University China
Application filed by: Northeastern University China
Priority to: CN202211575130.9A
Filing date: 2022-12-08
Publication date: 2023-05-09

Classifications

    • G06T 7/0012 (image analysis; inspection of images): Biomedical image inspection
    • G06N 3/08 (computing arrangements based on biological models; neural networks): Learning methods
    • G06T 7/11 (image analysis; segmentation, edge detection): Region-based segmentation
    • G06V 10/774 (image or video recognition using machine learning): Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06T 2207/10081 (image acquisition modality; tomographic images): Computed x-ray tomography [CT]
    • G06T 2207/10088 (image acquisition modality; tomographic images): Magnetic resonance imaging [MRI]
    • G06T 2207/20081 (special algorithmic details): Training; learning
    • G06T 2207/20084 (special algorithmic details): Artificial neural networks [ANN]

Abstract

The invention relates to a method for segmenting tumors from PET/CT images, which comprises the following steps. Step 1: acquire a tumor data set; the tumor data set is a PET/CT multi-modality data set or a CT or MRI single-modality data set. Step 2: preprocess the tumor data set to obtain a preprocessed tumor data set. Step 3: train two heavyweight neural networks to serve as two teacher networks and save them. Step 4: construct a knowledge distillation architecture based on the two teacher networks and a pre-built lightweight student network. Step 5: train the student network based on the knowledge distillation architecture to obtain a trained student network. Step 6: input the preprocessed tumor data set into the trained student network to obtain a tumor segmentation result.

Description

Method for segmenting tumor from PET/CT image
Technical Field
The invention relates to the technical field of medical image tumor segmentation, in particular to a method for segmenting tumors from PET/CT images.
Background
Tumor segmentation is a fundamental task in medical image analysis, and accurately delineating the tumor regions of cancer patients plays a key role in differential diagnosis and treatment. Manual tumor segmentation is time-consuming and labor-intensive, its results depend to a certain extent on the experience of the physician, and misjudgments sometimes occur, so introducing an automatic and efficient tumor segmentation method has strong clinical significance.
Existing automatic tumor segmentation methods fall mainly into four types: region growing, graph cut, level set and deep learning. The first three are traditional methods whose common disadvantages are poor segmentation accuracy and instability. In recent years, with the development of deep learning theory and the continuous improvement of numerical computing hardware, tumor segmentation by deep learning has become the most advanced approach. The convolutional neural network, an important deep learning technique widely applied in computer vision, natural language processing and other fields, has greatly improved the accuracy and stability of tumor segmentation in medical image segmentation problems. Existing deep-learning-based tumor segmentation methods achieve good segmentation accuracy, but one main reason is that they depend on huge computing resources and storage space, which makes them increasingly difficult to deploy in the real world and to some extent inhibits the popularization of related methods. Although some lightweight networks for semantic segmentation have been proposed, simplifying a model tends to compromise its performance too much to provide accurate and efficient tumor segmentation results.
Disclosure of Invention
First, the technical problem to be solved
In view of the above-mentioned drawbacks and deficiencies of the prior art, the present invention provides a method for segmenting a tumor from a PET/CT image, which solves the technical problem that the prior art cannot provide accurate and efficient tumor segmentation results.
(II) technical scheme
In order to achieve the above purpose, the main technical scheme adopted by the invention comprises the following steps:
the embodiment of the invention provides a method for segmenting tumors from PET/CT images, which comprises the following steps:
step 1: acquiring a tumor data set;
the tumor data set is a PET/CT multi-mode data set or a CT or MRI single-mode data set;
step 2: preprocessing the tumor data set to obtain a preprocessed tumor data set;
step 3: training two heavyweight neural networks to serve as two teacher networks respectively and saving them;
step 4: constructing a knowledge distillation framework based on two teacher networks and a pre-built lightweight student network;
step 5: training the student network based on a knowledge distillation architecture to obtain a trained student network;
step 6: inputting the preprocessed tumor data set into the trained student network to obtain a tumor segmentation result.
Preferably,
the tumor data set is the published soft tissue sarcoma STS data set;
the published STS data set comprises three-dimensional PET/CT imaging data of 51 patients with soft tissue sarcoma of the extremities, and each three-dimensional PET/CT volume consists of 91-311 slices; the CT slices are 512 x 512 pixels in size and the PET slices are 128 x 128 pixels in size.
Preferably, the step 2 includes:
step 2.1: converting voxel values of CT slices in the soft tissue sarcoma STS dataset into HU value units;
converting voxel values of PET slices in the soft tissue sarcoma STS dataset into SUV value units;
step 2.2: dividing PET/CT imaging data in the soft tissue sarcoma STS data set into two-dimensional CT slices and two-dimensional PET slices, and keeping the corresponding relation between each two-dimensional CT slice and each two-dimensional PET slice and a mask marked in advance on each slice;
step 2.3: resampling the CT slices to the same 128 x 128 pixel size as the PET slices, or resampling the PET slices to the same 512 x 512 pixel size as the CT slices;
step 2.4: adjusting window width and window level of the CT slice, and normalizing the CT slice and the PET slice to obtain a normalized CT slice and a normalized PET slice;
step 2.5: performing data expansion on the normalized CT slices and the normalized PET slices by horizontal flipping and rotation to obtain a preprocessed tumor data set.
Preferably, the step 2.4 specifically includes:
adjusting the window width and window level of the CT slices and the PET slices, and normalizing using min-max normalization or Z-score normalization, to obtain the normalized CT slices and normalized PET slices.
Preferably,
the heavyweight neural networks comprise U-Net++ and mU-Net.
Preferably, the step 4 includes:
step 4.1: constructing a dual-teacher adaptive architecture using two teacher networks, Teacher I and Teacher II;
constructing an initial knowledge distillation architecture based on the dual-teacher adaptive architecture and a pre-built lightweight student network;
when the dual-teacher adaptive architecture is used to train the student network, the weight coefficients of Teacher I and Teacher II are automatically adjusted according to their respective DICE coefficients;
the weight coefficients α and β of Teacher I and Teacher II are computed as follows, where DICE_i denotes the tumor segmentation DICE coefficient of the i-th teacher network:

α = exp(DICE_1) / (exp(DICE_1) + exp(DICE_2))   (1)

β = exp(DICE_2) / (exp(DICE_1) + exp(DICE_2))   (2)
step 4.2: constructing an LMD module, wherein the LMD module takes the Logits maps of the two teacher networks as soft targets and improves the performance of the student network by minimizing the Kullback-Leibler divergence between the Logits-map distribution of the student network and those of the two teacher networks;
the calculation steps of the LMD module comprise:
S1: at temperature T, the probability value p_i output on the i-th class by a Logits map of the two teacher networks or of the student network is computed as:

p_i = exp(z_i / T) / Σ_{j=1}^{N} exp(z_j / T)   (3)

where z is the logits vector, z_i and z_j are its elements, and N is the total number of classes, N = 2;
S2: based on the softmax function with temperature T, computing the probability p_i^t of the teacher network's Logits map on the i-th class and the probability p_i^s of the student network's Logits map on the i-th class, respectively; the calculation formula of the LMD module is:

L_LMD = T^2 · Σ_{i=1}^{N} p_i^t · log(p_i^t / p_i^s)   (4)
step 4.3: constructing an AMD module, and applying an AMD module to the intermediate-layer feature maps at each corresponding scale of the teacher and student networks, so that the teacher networks teach knowledge of different layers to the student network, wherein the calculation steps of the AMD module comprise:
(1) computing the spatial attention maps corresponding to the intermediate-layer feature maps of the student and teacher networks during training, using formula (5):

F(ε) = Σ_{i=1}^{c} |ε_i|^p   (5)

where F(ε) denotes the spatial attention map obtained by computing the p-norm along the channel dimension of an intermediate-layer feature map ε of size c × w × h, and p is taken as 4;
c is the number of channels of the intermediate-layer feature map;
w is the width of the intermediate-layer feature map;
h is the height of the intermediate-layer feature map;
(2) based on the spatial attention maps Q_S^i and Q_T^i, the AMD loss is computed as:

Q^i = vec(F(ε^i)) / ||vec(F(ε^i))||_2   (6)

L_AMD = Σ_{i∈P} || Q_S^i − Q_T^i ||_1   (7)

where P is the set of layer indices at which the student and teacher networks have feature maps of the same dimensions, and ||·||_1 and ||·||_2 are the l_1 and l_2 norms;
Q_S^i and Q_T^i are the spatial attention maps of the i-th layer feature maps ε_S^i and ε_T^i extracted from the student and teacher networks, respectively;
step 4.4: integrating the initial knowledge distillation architecture, the LMD module and the AMD module to obtain the final knowledge distillation loss L_KD:

L_KD = α · (L_LMD^{T1} + L_AMD^{T1}) + β · (L_LMD^{T2} + L_AMD^{T2})   (8)

where L_LMD^{T1} and L_AMD^{T1} denote the LMD and AMD losses computed with Teacher I, and L_LMD^{T2} and L_AMD^{T2} denote the LMD and AMD losses computed with Teacher II.
Preferably, the step 5 specifically includes:
training the student network based on the knowledge distillation architecture to obtain a trained student network, where the number of epochs in training the student network is greater than 100 and the total LOSS function is:

LOSS = L_SEG + L_KD   (9)

where L_SEG is the binary cross-entropy loss.
(III) beneficial effects
The beneficial effects of the invention are as follows: the method for segmenting tumors from PET/CT images adaptively distills information from two trained teacher networks into a lightweight student network, so that the student network can approach or even exceed the heavyweight teacher networks in tumor segmentation capability while maintaining its own running efficiency. Two teacher networks are used so that the student network can learn more tumor segmentation information and achieve better segmentation performance; the weights of the two teacher networks are not identical but are automatically adjusted according to their respective segmentation performance. The knowledge distillation architecture of the method considers the Logits map and the intermediate-layer feature maps at the same time, so the student network learns from the teacher networks more comprehensively.
Drawings
FIG. 1 is a flow chart of a method of segmenting a tumor from a PET/CT image in accordance with the present invention;
FIG. 2 is a schematic diagram of a knowledge distillation structure according to the present invention.
Detailed Description
The invention will be better explained by the following detailed description of the embodiments with reference to the drawings.
In order that the above-described aspects may be better understood, exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
Knowledge distillation is a common neural network model compression method, often explained with a teacher-student analogy: a large, heavyweight network is called the teacher network, and the small, lightweight network to which its knowledge is transferred is called the student network. Knowledge distillation extracts information from a trained teacher network into a student network, improving the student's performance while keeping it lightweight and easy to deploy. The knowledge learned by a heavyweight network can thus be transferred to a lightweight network that retains performance close to the heavyweight one, and knowledge learned by several networks can be transferred into a single network to give it better performance. In knowledge distillation, most pioneering studies use only one teacher network, ignoring the potential of a student network to learn from multiple teacher networks simultaneously. In existing methods that do learn from multiple teacher networks, each teacher is simply treated as equally important, which cannot reflect the different importance of each teacher for specific examples and makes the student network's learning monotonous and inefficient.
Example 1
Referring to fig. 1, the present embodiment provides a method for segmenting a tumor from a PET/CT image, comprising the steps of:
step 1: a tumor dataset is acquired.
The tumor dataset is a PET/CT multi-modality dataset or a CT or MRI single-modality dataset.
Step 2: and preprocessing the tumor data set to obtain a preprocessed tumor data set.
Step 3: two heavyweight neural networks are trained as two teacher networks and saved.
Step 4: a knowledge distillation architecture is constructed based on two teacher networks and a pre-built lightweight student network.
Step 5: training the student network based on a knowledge distillation architecture to obtain a trained student network.
Step 6: and inputting the preprocessed tumor data set into a trained student network to obtain a tumor segmentation result.
Preferably, the tumor dataset is a published soft tissue sarcoma STS dataset.
The published soft tissue sarcoma STS dataset contains three-dimensional PET/CT imaging data of 51 patients with soft tissue sarcoma of the extremities, and each three-dimensional PET/CT volume consists of 91-311 slices. The CT slices are 512 x 512 pixels in size and the PET slices are 128 x 128 pixels in size.
In practical application of this embodiment, the step 2 includes:
Step 2.1: voxel values of CT slices in the soft tissue sarcoma STS dataset are converted into HU units.
Voxel values of PET slices in the soft tissue sarcoma STS dataset are converted into SUV units.
Step 2.2: the PET/CT imaging data in the soft tissue sarcoma STS dataset are cut into two-dimensional CT slices and two-dimensional PET slices, maintaining the correspondence between each two-dimensional CT slice, each two-dimensional PET slice and the mask annotated in advance for each slice.
Step 2.3: the CT slices are resampled to the same 128 x 128 pixel size as the PET slices, or the PET slices are resampled to the same 512 x 512 pixel size as the CT slices.
Step 2.4: the window width and window level of the CT slices are adjusted, and the CT and PET slices are normalized to obtain normalized CT slices and normalized PET slices.
Step 2.5: data expansion is performed on the normalized CT slices and normalized PET slices using horizontal flipping and rotation, to obtain the preprocessed tumor data set.
In practical application of this embodiment, the step 2.4 specifically includes:
adjusting the window width and window level of the CT slices and the PET slices, and normalizing using min-max normalization or Z-score normalization, to obtain the normalized CT slices and normalized PET slices.
In practical application of this embodiment, the heavyweight neural networks comprise U-Net++ and mU-Net.
In practical application of this embodiment, the step 4 includes:
Step 4.1: two teacher networks, Teacher I and Teacher II, are used to construct the dual-teacher adaptive architecture.
An initial knowledge distillation architecture is constructed based on the dual-teacher adaptive architecture and a pre-built lightweight student network.
When the dual-teacher adaptive architecture is used to train the student network, the weight coefficients of Teacher I and Teacher II are automatically adjusted according to their respective DICE coefficients.
The weight coefficients α and β of Teacher I and Teacher II are computed as follows, where DICE_i denotes the tumor segmentation DICE coefficient of the i-th teacher network:

α = exp(DICE_1) / (exp(DICE_1) + exp(DICE_2))   (1)

β = exp(DICE_2) / (exp(DICE_1) + exp(DICE_2))   (2)

Step 4.2: an LMD module is constructed; the LMD module takes the Logits maps of the two teacher networks as soft targets and improves the performance of the student network by minimizing the Kullback-Leibler divergence between the Logits-map distribution of the student network and those of the two teacher networks.
The calculation steps of the LMD module comprise:
S1: at temperature T, the probability value p_i output on the i-th class by a Logits map of the two teacher networks or of the student network is computed as:

p_i = exp(z_i / T) / Σ_{j=1}^{N} exp(z_j / T)   (3)

where z is the logits vector, z_i and z_j are its elements, and N is the total number of classes.
For the tumor segmentation problem, N = 2, where class 1 represents tumor and class 2 represents non-tumor.
S2: based on the softmax function with temperature T, the probability p_i^t of a teacher network's Logits map on the i-th class and the probability p_i^s of the student network's Logits map on the i-th class are computed; the LMD loss is then:

L_LMD = T^2 · Σ_{i=1}^{N} p_i^t · log(p_i^t / p_i^s)   (4)

Step 4.3: an AMD module is constructed, and an AMD module is applied to the intermediate-layer feature maps at each corresponding scale of the teacher and student networks, so that the teacher networks teach knowledge of different layers to the student network. The calculation steps of the AMD module comprise:
(1) The spatial attention maps corresponding to the intermediate-layer feature maps of the student and teacher networks are computed during training using formula (5):

F(ε) = Σ_{i=1}^{c} |ε_i|^p   (5)

where F(ε) denotes the spatial attention map obtained by computing the p-norm along the channel dimension of an intermediate-layer feature map ε of size c × w × h, and p is taken as 4.
c is the number of channels of the intermediate-layer feature map.
w is the width of the intermediate-layer feature map.
h is the height of the intermediate-layer feature map.
(2) Based on the spatial attention maps Q_S^i and Q_T^i, the AMD loss is computed as:

Q^i = vec(F(ε^i)) / ||vec(F(ε^i))||_2   (6)

L_AMD = Σ_{i∈P} || Q_S^i − Q_T^i ||_1   (7)

where P is the set of layer indices at which the student and teacher networks have feature maps of the same dimensions, and ||·||_1 and ||·||_2 are the l_1 and l_2 norms; the l_1 norm is the absolute-value norm and the l_2 norm is the Euclidean norm.
Q_S^i and Q_T^i are the spatial attention maps of the i-th layer feature maps ε_S^i and ε_T^i extracted from the student and teacher networks, respectively.
Step 4.4: the initial knowledge distillation architecture, the LMD module and the AMD module are integrated to obtain the final knowledge distillation loss L_KD:

L_KD = α · (L_LMD^{T1} + L_AMD^{T1}) + β · (L_LMD^{T2} + L_AMD^{T2})   (8)

where L_LMD^{T1} and L_AMD^{T1} denote the LMD and AMD losses computed with Teacher I, and L_LMD^{T2} and L_AMD^{T2} denote the LMD and AMD losses computed with Teacher II.
In practical application of this embodiment, the step 5 specifically includes:
training the student network based on the knowledge distillation architecture to obtain a trained student network, where the number of epochs in training the student network is greater than 100 and the total LOSS function is:

LOSS = L_SEG + L_KD   (9)

where L_SEG is the binary cross-entropy loss.
According to the method for segmenting tumors from PET/CT images, information is adaptively distilled from two trained teacher networks into a lightweight student network, so that the student network can approach or even exceed the heavyweight teacher networks in tumor segmentation capability while maintaining its own running efficiency. Two teacher networks are used so that the student network can learn more tumor segmentation information and achieve better segmentation performance; the weights of the two teacher networks are not identical but are automatically adjusted according to their respective segmentation performance. The knowledge distillation architecture of the method considers the Logits map and the intermediate-layer feature maps at the same time, so the student network learns from the teacher networks more comprehensively.
Example 2
Step 1: a tumor dataset is acquired.
Medical image data are very valuable and costly to collect and annotate. Medical image data sets can be divided into public data sets, which can be used in experiments after obtaining permission from the relevant institution, and private data sets, which must be annotated by professional physicians and used under the premise of guaranteed privacy. At present, PET/CT, an integrated imaging modality combining positron emission tomography (PET) and computed tomography (CT), has become an increasingly important means in tumor treatment: PET provides detailed functional and metabolic information about a lesion, while CT provides its precise anatomical localization, and the two combine organically to exploit both the detection of physiological and pathological states and fine localization. With the continued advancement of PET/CT technology, tumor segmentation based on PET/CT multi-modality data sets performs better than segmentation on single-modality data sets using only computed tomography (CT) or magnetic resonance imaging (MRI).
In this embodiment, tumor segmentation may be performed using either PET/CT multi-modality data or single-modality data such as CT or MRI. The following steps use the public Soft Tissue Sarcoma (STS) dataset as an example, a PET/CT multi-modality dataset provided by The Cancer Imaging Archive (TCIA); the tumor masks can be generated using code published by the dataset provider on GitHub.
Step 2: the data set is preprocessed.
The public Soft Tissue Sarcoma (STS) dataset contains three-dimensional PET/CT imaging data for 51 patients with soft tissue sarcoma of the extremities; each case consists of 91-311 slices. The CT slices are 512 x 512 pixels in size and the PET slices are 128 x 128 pixels in size. Preprocessing of the dataset is divided mainly into the following steps:
step 2.1: medical image unit conversion. For original PET/CT bimodal tumor data, the CT and PET images are firstly required to be subjected to unit conversion respectively, so that subsequent processing is convenient. For CT, it is necessary to convert the CT voxel values into HU value units, which reflect the extent of X-ray absorption by the tissue or lesion; for PET, it is necessary to convert the PET voxel values into units of SUV values reflecting the ratio of 18F-FDG activity to the average systemic activity, the magnitude of the SUV values being affected by the activity of tumor cells, and the SUV values may decrease as the activity decreases in the tumor undergoing radiation or chemotherapy.
Step 2.2: image slicing. Since the raw dataset consists of three-dimensional PET/CT images, it must be cut into two-dimensional slices that can be used as input to the convolutional neural network. After slicing, the correspondence between each two-dimensional CT slice, PET slice and mask must be maintained, to prevent mismatched inputs to the neural network from corrupting the experimental results.
Step 2.3: image resizing. Since the CT slices in the original dataset are 512 x 512 pixels and the PET slices are 128 x 128 pixels, the sizes of the two modalities must be unified by bilinear interpolation or a similar method. In practical experiments, the CT slices may be resampled to the same 128 x 128 pixel size as the PET slices, or the PET slices may be resampled to the same 512 x 512 pixel size as the CT slices. The specific resizing scheme can be set according to the actual experimental environment.
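For example, with PyTorch (an assumed tooling choice; the patent does not prescribe a framework), downsampling CT to the PET resolution is a single call:

```python
import torch
import torch.nn.functional as F

ct = torch.randn(4, 1, 512, 512)  # stand-in batch of CT slices (B, C, H, W)
# Bilinear resampling to the 128 x 128 PET size; the reverse direction
# (PET up to 512 x 512) uses the same call with size=(512, 512).
ct_small = F.interpolate(ct, size=(128, 128), mode="bilinear",
                         align_corners=False)
```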
Step 2.4: image normalization. In medical imaging, the gray-level information of the same tissue can be inconsistent across acquisition devices and imaging conditions. Image normalization is a transformation that reduces or even eliminates such gray-level inconsistency while preserving the diagnostically valuable gray-level differences, making the images more amenable to automatic computer analysis and processing; it converts the original image into a corresponding standard form through a series of transformations. Specifically, the HU value matrix of CT and the SUV value matrix of PET can first be windowed with the window width and window level appropriate for the organ concerned, and then normalized using min-max normalization, Z-score normalization or a similar scheme.
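A sketch of windowing followed by min-max normalization (the window center and width below are illustrative soft-tissue settings, not values fixed by the patent):

```python
import numpy as np

def window_and_normalize(hu: np.ndarray, center: float, width: float) -> np.ndarray:
    """Clip a HU slice to [center - width/2, center + width/2] and
    min-max normalize the result to [0, 1]."""
    lo, hi = center - width / 2.0, center + width / 2.0
    clipped = np.clip(hu, lo, hi)
    return (clipped - lo) / (hi - lo)

hu_slice = np.random.uniform(-1000.0, 1000.0, size=(512, 512))  # stand-in CT
slice_norm = window_and_normalize(hu_slice, center=40.0, width=400.0)
```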
Step 2.5: data augmentation. When using deep learning for tumor segmentation, the positive and negative sample volumes are often imbalanced or the data volume is insufficient, and sufficient data is the basis for good segmentation results, so data augmentation is required in experiments. Common augmentation methods include translation, rotation, cropping, stretching, scaling, horizontal flipping, vertical flipping, combined flips and noise injection. For tumor image data, however, stretching, cropping and similar operations change the shape information of the image, so data expansion is generally performed using horizontal flipping and rotation.
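A minimal augmentation sketch that applies the same random flip and rotation to the CT slice, the PET slice and the mask so their correspondence is preserved (restricting rotation to 90-degree steps is an assumption made here for simplicity; arbitrary angles would require interpolation):

```python
import random
import numpy as np

def augment(ct: np.ndarray, pet: np.ndarray, mask: np.ndarray):
    """Random horizontal flip plus a random multiple of a 90-degree
    rotation, applied identically to CT, PET and mask."""
    if random.random() < 0.5:
        ct, pet, mask = ct[:, ::-1], pet[:, ::-1], mask[:, ::-1]
    k = random.randint(0, 3)  # number of counter-clockwise 90-degree turns
    ct, pet, mask = (np.rot90(a, k) for a in (ct, pet, mask))
    return ct.copy(), pet.copy(), mask.copy()  # copies restore contiguous strides
```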
Step 3: two teacher networks were trained and models were saved.
The proposed method extracts information from two trained teacher networks into a lightweight student network, so two heavyweight networks with superior segmentation capability must be trained as teacher networks. Among existing tumor segmentation networks, U-Net obtains good segmentation performance thanks to its simple U-shaped encoder-decoder structure and skip connections: the encoding path extracts target features through four downsampling stages, the decoding path restores the high-level semantic feature maps produced by the encoder to the original image size through upsampling, and the skip connections ensure that the restored feature maps fuse more low-level features; fusing feature maps of different sizes enables multi-scale prediction and deep supervision. With the advent and popularity of U-Net, many variants such as U-Net++ and mU-Net have been proposed and have achieved excellent results on different segmentation tasks.
When selecting the teacher networks, several different network structures can be trained; the training results of all networks are then analyzed comprehensively, and the two best-performing models are selected as the teacher networks. Because knowledge-distillation training of the student network requires the trained teacher networks to provide intermediate-layer and Logits-layer feature maps, the teacher network models must be saved for subsequent use.
Step 4: constructing the knowledge distillation architecture.
Referring to fig. 2, the key point of this embodiment is to construct an efficient knowledge distillation architecture that extracts information from two trained teacher networks into a lightweight student network, so that the student network can approach or even exceed the heavyweight teacher networks in tumor segmentation capability while maintaining its own running efficiency. The method considers knowledge distillation of both the Logits map and the intermediate-layer feature maps, so the student network learns from the teacher networks more comprehensively. The specific construction steps are as follows:
step 4.1: and constructing a double-teacher self-adaptive architecture. The two Teacher networks Teacher I and Teacher II are trained and stored in the step 3, and the purpose of the two Teacher networks Teacher I and Teacher II is to enable a Student network Student to learn more tumor segmentation information and have better segmentation performance. When training Student networks Student, two teacher networks are used only for reasoning, and in the constructed knowledge distillation architecture, thisThe weights of the two teacher networks are not the same, but can be automatically adjusted according to the respective segmentation capabilities. Since the DICE coefficient is the most important evaluation index in medical image segmentation, the segmentation performance of two Teacher networks is quantized by the DICE coefficient in the method, and then normalized by softmax according to the DICE coefficients of the two networks as the weight values thereof, the coefficients alpha and beta of the Teacher networks Teacher I and Teacher II are shown as follows, wherein the DICE is shown as follows i Tumor segmentation dic coefficients representing the ith teacher network:
Figure BDA0003989095400000151
Figure BDA0003989095400000152
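A sketch of the adaptive weighting, implementing the softmax over the two teachers' Dice scores of equations (1) and (2) (the Dice values shown are made-up examples):

```python
import torch

def teacher_weights(dice_1: float, dice_2: float) -> tuple[float, float]:
    """Softmax-normalize the teachers' Dice coefficients into the
    adaptive weights alpha and beta of equations (1) and (2)."""
    w = torch.softmax(torch.tensor([dice_1, dice_2]), dim=0)
    return w[0].item(), w[1].item()

alpha, beta = teacher_weights(0.86, 0.82)  # illustrative Dice coefficients
```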
step 4.2: an LMD (Logits graph distillation, logits Map Distillation) module was constructed. This module is used for knowledge distillation of Logits graphs. The module takes the Logits graph of the teacher network as a soft target, and minimizes the distribution difference of the Logits graph of the student network and the Kullback-Leibler divergence of the Logits graph of the teacher network by calculating the difference of the Logits graph of the student network and the Kullback-Leibler divergence of the Logits graph of the teacher network, so that the performance of the student network is improved. The calculation steps of the module are as follows:
(1) After obtaining the model output, a convolutional neural network uses the softmax function to obtain the probability of each class. Besides the positive label, the negative labels also carry a large amount of information; in a traditional hard target all negative labels are treated uniformly, discarding that information and reducing the generalization ability of the model. To pay more attention to the negative labels, a softmax function with a temperature T can be used: when T is greater than 1, the output probability distribution is smoothed, amplifying the information carried by the negative labels. At temperature T, the probability value output by softmax on the i-th class of a Logits map is computed as follows, where z is the logits vector, z_i and z_j are its elements, and N is the total number of classes (N = 2 in the tumor segmentation problem):

p_i = exp(z_i / T) / Σ_{j=1}^{N} exp(z_j / T)   (3)

(2) Based on the softmax function with temperature T above, the probabilities p_i^t and p_i^s of the teacher and student networks' Logits maps on the i-th class can be computed; the LMD loss is then:

L_LMD = T^2 · Σ_{i=1}^{N} p_i^t · log(p_i^t / p_i^s)   (4)
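A hedged PyTorch sketch of the LMD loss of equations (3) and (4); the temperature value is an assumption, and the T² scaling follows standard knowledge-distillation practice:

```python
import torch
import torch.nn.functional as F

def lmd_loss(student_logits: torch.Tensor, teacher_logits: torch.Tensor,
             T: float = 4.0) -> torch.Tensor:
    """KL divergence between temperature-softened teacher and student
    Logits maps of shape (B, N, H, W), with N = 2 classes (eqs. (3)-(4))."""
    log_p_s = F.log_softmax(student_logits / T, dim=1)  # student log-probs
    p_t = F.softmax(teacher_logits / T, dim=1)          # teacher soft targets
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * T * T
```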
step 4.3: AMD (attention seeking distillation, attention Map Distillation) modules were constructed. This module is used for knowledge distillation of intermediate layer feature patterns. The module transfers knowledge from the teacher network to the student network by calculating spatial attention of the middle layer feature map, thereby improving performance of the student network. In the method, AMD modules are used in middle layers corresponding to all scales of a teacher network and a student network, so that the teacher network can teach knowledge of different layers to the student network, and for a small target such as tumor, only few features contribute to the modules, thus l is used 1 The norms can be better used for feature selection. The calculation steps of the module are as follows:
(1) First, compute the spatial attention maps corresponding to the intermediate-layer feature maps of the student and teacher networks. The formula below converts a three-dimensional feature map into a two-dimensional spatial attention map: F(ε) denotes the spatial attention map obtained by computing the p-norm along the channel dimension of an intermediate-layer feature map ε of size c × w × h, with p taken as 4;

F(ε) = Σ_{i=1}^{c} |ε_i|^p   (5)

(2) Based on the spatial attention maps Q_S^i and Q_T^i, the AMD loss is computed as:

Q^i = vec(F(ε^i)) / ||vec(F(ε^i))||_2   (6)

L_AMD = Σ_{i∈P} || Q_S^i − Q_T^i ||_1   (7)

where P is the set of layer indices at which the student and teacher networks have feature maps of the same dimensions, and ||·||_1 and ||·||_2 are the l_1 and l_2 norms;
Q_S^i and Q_T^i are the spatial attention maps of the i-th layer feature maps ε_S^i and ε_T^i extracted from the student and teacher networks, respectively.
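A sketch of the attention computation and the AMD loss of equations (5) to (7), assuming the matched student and teacher feature maps are collected in two equal-length lists:

```python
import torch

def attention_map(feat: torch.Tensor, p: int = 4) -> torch.Tensor:
    """Collapse a (B, C, H, W) feature map into an l2-normalized (B, H*W)
    spatial attention vector: sum of |.|**p over channels (eqs. (5)-(6))."""
    att = feat.abs().pow(p).sum(dim=1).flatten(1)
    return att / (att.norm(p=2, dim=1, keepdim=True) + 1e-8)

def amd_loss(student_feats, teacher_feats) -> torch.Tensor:
    """l1 distance between normalized attention maps, summed over the
    matched scales in the set P (eq. (7))."""
    return sum((attention_map(s) - attention_map(t)).abs().sum(dim=1).mean()
               for s, t in zip(student_feats, teacher_feats))
```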
step 4.4: integrating the initial knowledge distillation architecture, the LMD module and the AMD module to obtain a final knowledge distillation architecture expression L KD The expression L KD The method comprises the following steps:
Figure BDA0003989095400000175
Figure BDA0003989095400000176
and->
Figure BDA0003989095400000177
Respectively representing an LMD module and an AMD module which are calculated based on a Teacher network Teacher I;
Figure BDA0003989095400000178
and->
Figure BDA0003989095400000179
The LMD module and the AMD module are respectively represented based on the Teacher network Teacher II calculation.
Step 5: training a student network based on knowledge distillation.
In step 4 the knowledge distillation architecture was built, and based on it the lightweight student network can be trained. To quantitatively evaluate model complexity, two metrics can be used: the number of floating-point operations (FLOPs) and the model size (number of parameters). FLOPs are the most commonly used metric for measuring neural network complexity; code for computing them is available at https://github.com/Lyken17/pytorch-OpCounter, and the model size can be calculated using the torchsummary library.
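For example, with the pytorch-OpCounter (thop) package (the tiny stand-in model and the 2-channel 128 x 128 input, i.e. concatenated PET+CT, are assumptions made for illustration):

```python
import torch
from thop import profile

model = torch.nn.Conv2d(2, 2, kernel_size=3, padding=1)  # stand-in network
# thop counts multiply-accumulate operations, commonly reported as FLOPs.
flops, params = profile(model, inputs=(torch.randn(1, 2, 128, 128),))
print(f"FLOPs: {flops / 1e9:.3f} G, params: {params / 1e6:.3f} M")
# torchsummary.summary(model, (2, 128, 128)) would print a per-layer table.
```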
In training the student network, a segmentation loss function L_SEG is also needed to measure the difference between the network output and the gold standard; this method uses binary cross-entropy. The total LOSS function is therefore as follows:

LOSS = L_SEG + L_KD   (9)
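One hedged sketch of a training step combining the pieces above; it reuses the lmd_loss and amd_loss sketches and assumes each network returns a (logits, feature-list) pair, none of which interfaces are specified by the patent:

```python
import torch
import torch.nn.functional as F

def train_step(student, teacher1, teacher2, x, mask, alpha, beta, optimizer):
    """One optimization step of LOSS = L_SEG + L_KD (eqs. (8)-(9))."""
    with torch.no_grad():  # the teachers are used for inference only
        t1_logits, t1_feats = teacher1(x)
        t2_logits, t2_feats = teacher2(x)
    s_logits, s_feats = student(x)
    # Binary cross-entropy on the tumor channel (class 1 = tumor).
    l_seg = F.binary_cross_entropy_with_logits(s_logits[:, 1], mask.float())
    l_kd = alpha * (lmd_loss(s_logits, t1_logits) + amd_loss(s_feats, t1_feats)) \
         + beta * (lmd_loss(s_logits, t2_logits) + amd_loss(s_feats, t2_feats))
    loss = l_seg + l_kd
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```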
step 6: tumor segmentation is performed through a trained lightweight student network.
In step 5, a lightweight student network model was trained and saved using the constructed knowledge distillation architecture. To quantitatively evaluate the segmentation performance of the student network, several common medical image segmentation metrics can be used, including the Dice coefficient, Volumetric Overlap Error (VOE), Relative Volume Difference (RVD), Precision and Recall.
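Common definitions of the first three metrics, sketched for binary masks (the patent names the metrics but does not spell out formulas, so these follow the usual conventions):

```python
import numpy as np

def dice_voe_rvd(pred: np.ndarray, gt: np.ndarray):
    """Dice, Volumetric Overlap Error and Relative Volume Difference
    for binary segmentation masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    dice = 2.0 * inter / (pred.sum() + gt.sum() + 1e-8)
    voe = 1.0 - inter / (union + 1e-8)
    rvd = (pred.sum() - gt.sum()) / (gt.sum() + 1e-8)
    return dice, voe, rvd
```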
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions.
It should be noted that in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the terms first, second, third, etc. are for convenience of description only and do not denote any order. These terms may be understood as part of the component name.
Furthermore, it should be noted that in the description of the present specification, the terms "one embodiment," "some embodiments," "example," "specific example," or "some examples," etc., refer to a specific feature, structure, material, or characteristic described in connection with the embodiment or example being included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art upon learning the basic inventive concepts. Therefore, the appended claims should be construed to include preferred embodiments and all such variations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, the present invention should also include such modifications and variations provided that they come within the scope of the following claims and their equivalents.

Claims (7)

1. A method of segmenting a tumor from a PET/CT image, comprising the steps of:
step 1: acquiring a tumor data set;
the tumor data set is a PET/CT multi-mode data set;
step 2: preprocessing the tumor data set to obtain a preprocessed tumor data set;
step 3: training two heavyweight neural networks to serve as two teacher networks respectively and saving them;
step 4: constructing a knowledge distillation framework based on two teacher networks and a pre-built lightweight student network;
step 5: training the student network based on a knowledge distillation architecture to obtain a trained student network;
step 6: and inputting the preprocessed tumor data set into a trained student network to obtain a tumor segmentation result.
2. The method according to claim 1, wherein
the tumor data set is a published soft tissue sarcoma STS data set;
the disclosed soft tissue sarcoma STS data set comprises three-dimensional PET/CT imaging data of 51 patients with soft tissue sarcoma of four limbs, and each three-dimensional PET/CT imaging data consists of 91-311 Zhang Qiepian; wherein the size of the CT slice is 512 x 512 pixels, and the size of the PET slice is 128 x 128 pixels.
3. The method according to claim 2, wherein the step 2 comprises:
step 2.1: converting voxel values of CT slices in the soft tissue sarcoma STS dataset into HU value units;
converting voxel values of PET slices in the soft tissue sarcoma STS dataset into SUV value units;
step 2.2: dividing PET/CT imaging data in the soft tissue sarcoma STS data set into two-dimensional CT slices and two-dimensional PET slices, and keeping the corresponding relation between each two-dimensional CT slice and each two-dimensional PET slice and a mask marked in advance on each slice;
step 2.3: resampling the CT slices to the same 128 x 128 pixel size as the PET slices, or resampling the PET slices to the same 512 x 512 pixel size as the CT slices;
step 2.4: adjusting window width and window level of the CT slice, and normalizing the CT slice and the PET slice to obtain a normalized CT slice and a normalized PET slice;
step 2.5: performing data expansion on the normalized CT slices and the normalized PET slices by horizontal flipping and rotation to obtain a preprocessed tumor data set.
4. A method according to claim 3, wherein said step 2.4 comprises:
adjusting the window width and window level of the CT slices and the PET slices, and normalizing using min-max normalization or Z-score normalization, to obtain the normalized CT slices and normalized PET slices.
5. The method according to claim 4, wherein
the heavyweight neural networks comprise U-Net++ and mU-Net.
6. The method according to claim 5, wherein the step 4 comprises:
step 4.1: constructing a dual-teacher adaptive architecture using two teacher networks, Teacher I and Teacher II;
constructing an initial knowledge distillation architecture based on the dual-teacher adaptive architecture and a pre-built lightweight student network;
when the dual-teacher adaptive architecture is used to train the student network, the weight coefficients of Teacher I and Teacher II are automatically adjusted according to their respective DICE coefficients;
the weight coefficients α and β of Teacher I and Teacher II are computed as follows, wherein DICE_i denotes the tumor segmentation DICE coefficient of the i-th teacher network:

α = exp(DICE_1) / (exp(DICE_1) + exp(DICE_2))   (1)

β = exp(DICE_2) / (exp(DICE_1) + exp(DICE_2))   (2)
step 4.2: constructing an LMD module, wherein the LMD module takes the Logits maps of the two teacher networks as soft targets and improves the performance of the student network by minimizing the Kullback-Leibler divergence between the Logits-map distribution of the student network and those of the two teacher networks;
the calculation steps of the LMD module comprise:
S1: at temperature T, the probability value p_i output on the i-th class by a Logits map of the two teacher networks or of the student network is computed as:

p_i = exp(z_i / T) / Σ_{j=1}^{N} exp(z_j / T)   (3)

wherein z is the logits vector, z_i and z_j are its elements, and N is the total number of classes, N = 2;
S2: based on the softmax function with temperature T, computing the probability p_i^t of the teacher network's Logits map on the i-th class and the probability p_i^s of the student network's Logits map on the i-th class, respectively; the calculation formula of the LMD module is:

L_LMD = T^2 · Σ_{i=1}^{N} p_i^t · log(p_i^t / p_i^s)   (4)
step 4.3: constructing an AMD module, and applying an AMD module to the intermediate-layer feature maps at each corresponding scale of the teacher and student networks, so that the teacher networks teach knowledge of different layers to the student network, wherein the calculation steps of the AMD module comprise:
(1) computing the spatial attention maps corresponding to the intermediate-layer feature maps of the student and teacher networks during training, using formula (5):

F(ε) = Σ_{i=1}^{c} |ε_i|^p   (5)

wherein F(ε) denotes the spatial attention map obtained by computing the p-norm along the channel dimension of an intermediate-layer feature map ε of size c × w × h, and p is taken as 4;
c is the number of channels of the intermediate-layer feature map;
w is the width of the intermediate-layer feature map;
h is the height of the intermediate-layer feature map;
(2) based on the spatial attention maps Q_S^i and Q_T^i, the AMD loss is computed as:

Q^i = vec(F(ε^i)) / ||vec(F(ε^i))||_2   (6)

L_AMD = Σ_{i∈P} || Q_S^i − Q_T^i ||_1   (7)

wherein P is the set of layer indices at which the student and teacher networks have feature maps of the same dimensions, and ||·||_1 and ||·||_2 are the l_1 and l_2 norms;
Q_S^i and Q_T^i are the spatial attention maps of the i-th layer feature maps ε_S^i and ε_T^i extracted from the student and teacher networks, respectively;
step 4.4: integrating the initial knowledge distillation architecture, the LMD module and the AMD module to obtain the final knowledge distillation loss L_KD:

L_KD = α · (L_LMD^{T1} + L_AMD^{T1}) + β · (L_LMD^{T2} + L_AMD^{T2})   (8)

wherein L_LMD^{T1} and L_AMD^{T1} denote the LMD and AMD losses computed with Teacher I, and L_LMD^{T2} and L_AMD^{T2} denote the LMD and AMD losses computed with Teacher II.
7. The method according to claim 6, wherein the step 5 specifically comprises:
training the student network based on the knowledge distillation architecture to obtain a trained student network, wherein the number of epochs in training the student network is greater than 100 and the total LOSS function is:

LOSS = L_SEG + L_KD   (9)

wherein L_SEG is the binary cross-entropy loss.
Priority Applications (1)

CN202211575130.9A, priority date 2022-12-08, filing date 2022-12-08: Method for segmenting tumor from PET/CT image (pending)

Publications (1)

CN116091412A, published 2023-05-09

Family

ID=86198296

Country Status (1)

CN: CN116091412A (en)

Cited By (2)

* Cited by examiner, † Cited by third party

CN117253611A *, priority date 2023-09-25, published 2023-12-19, 四川大学 (Sichuan University): Intelligent early cancer screening method and system based on multi-modal knowledge distillation
CN117253611B *, priority date 2023-09-25, published 2024-04-30, 四川大学 (Sichuan University): Intelligent early cancer screening method and system based on multi-modal knowledge distillation


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination