CN117422871A - Lightweight brain tumor segmentation method and system based on V-Net - Google Patents
Lightweight brain tumor segmentation method and system based on V-Net
Info
- Publication number
- CN117422871A (application CN202311301482.XA)
- Authority
- CN
- China
- Prior art keywords
- network
- net
- segmentation
- image
- loss
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10088—Magnetic resonance imaging [MRI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30016—Brain
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30096—Tumor; Lesion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/03—Recognition of patterns in medical or anatomical images
- G06V2201/032—Recognition of patterns in medical or anatomical images of protuberances, polyps nodules, etc.
Abstract
The invention provides a lightweight brain tumor segmentation method based on V-Net, which comprises the following steps. Step 1: preprocess the images to obtain images conforming to the network structure. Step 2: construct a V-Net network. Step 3: improve the V-Net network by changing the batch normalization in the network to group normalization, replacing the ordinary convolutions with depthwise separable convolutions, and adding a Squeeze-and-Excitation (SE) attention mechanism to the encoder part of the network. Step 4: train the improved network with the hybrid loss function BCEDice Loss and select the optimal network model according to the training results. Step 5: input the image data preprocessed in step 1 into the optimal network model to obtain a segmentation result. Step 6: post-process the segmentation result. The method maintains high training accuracy while shortening training time, has good segmentation performance, and is of positive significance for clinicians' diagnosis and patients' treatment.
Description
Technical Field
The invention belongs to the technical fields of deep learning, computer vision, and medical image processing, and particularly relates to a V-Net-based lightweight brain tumor segmentation method and system.
Background
Brain tumors are among the diseases that most seriously endanger patients' lives. A brain tumor is an abnormal cell mass growing inside the cranium; it has an irregular shape and uncertain volume, can appear anywhere in the brain, and can cause serious dysfunction of the human nervous system. Among brain tumors, glioma is the most common craniocerebral tumor, characterized by high morbidity, high recurrence, high mortality, and a low cure rate. Gliomas are classified into high-grade gliomas (HGG) and low-grade gliomas (LGG) according to the extent of invasion and the patient's prognosis: high-grade gliomas have a higher mortality rate, while low-grade gliomas develop more slowly. Early diagnosis and timely treatment of low-grade tumors are therefore of great significance for increasing patients' chances of survival, improving their quality of life, and prolonging their lives.
Magnetic resonance imaging (MRI) acquires and reconstructs information about the human body through the magnetic resonance phenomenon and can obtain brain images with higher contrast than CT imaging. MRI is a non-invasive technique that is harmless to the human body: it provides complete images without craniotomy, offers good soft-tissue contrast, shows tissue structure at higher resolution, and is widely applied in clinical diagnosis and treatment.
With the development of deep learning and the advancement of related hardware, deep learning methods have been successfully applied to medical imaging, and segmentation with deep learning models compensates for the time-consuming, labor-intensive character of manual segmentation. At present, deep-learning brain tumor image segmentation mainly uses two kinds of networks, 2D convolutional networks and 3D convolutional networks; the 2D convolutional networks mainly include FCN, U-Net, U-Net++, and the like. In the 2D method, the 3D MRI volume is decomposed into many 2D slices, each slice is passed into a segmentation model that generates a segmentation per slice, and the two-dimensional slices are then recombined into a segmented three-dimensional volume. The disadvantage of this approach is that significant context information in the 3D image is lost.
In summary, providing an MRI image segmentation method based on a 3D convolutional network is a problem to be solved urgently.
Disclosure of Invention
In view of the above, the invention discloses a lightweight brain tumor segmentation method and system based on V-Net, so as to obtain a network model with higher training accuracy and faster training speed.
The technical scheme of the invention is as follows: a lightweight brain tumor segmentation method based on V-Net comprises the following steps:
step 1: preprocessing an image to obtain an image conforming to a network structure;
step 2: constructing a V-Net network; the V-Net network consists of an encoder and a decoder;
the encoder is used for extracting features from the original input image, and the decoder is used for restoring the extracted features into segmentation results;
wherein the encoder is composed of a plurality of DownTransition modules and the decoder is composed of a plurality of UpTransition modules; the encoder and the decoder are connected through skip connections;
the DownTransition module is used for gradually reducing the size of the feature map and gradually increasing the number of feature channels, so that the encoder can capture information at different levels;
the UpTransition module is used for gradually restoring the feature map to the same size as the input image while reducing the number of feature channels;
the skip connections are used for connecting the feature maps of the encoder with the feature maps of the corresponding decoder layers, so that the network can better transmit information;
step 3: improving the V-Net network;
changing batch normalization in the V-Net network into group normalization; replacing the ordinary convolutions in the network with depthwise separable convolutions;
adding a Squeeze-and-Excitation (SE) attention mechanism into the encoder part of the network;
step 4: training the improved network by adopting a mixed Loss function BCEDice Loss; selecting an optimal network model according to the training result;
step 5: inputting the image data preprocessed in the step 1 into an optimal network model to obtain a segmentation result;
step 6: and carrying out post-processing on the segmentation result.
Specifically, the image preprocessing of step 1 includes: sequentially standardizing, cropping, and blocking the brain tumor MRI data set.
Specifically, the hybrid loss function BCEDice Loss combines the binary cross-entropy loss and the Dice coefficient loss, and the final loss is calculated as their linear combination;
the binary cross-entropy loss is calculated as follows:
$$L_{BCE} = -\frac{1}{N}\sum_{i=1}^{N}\Big[y_{true}(i)\log y_{pred}(i) + \big(1-y_{true}(i)\big)\log\big(1-y_{pred}(i)\big)\Big]$$
where N represents the number of samples, y_pred(i) is the probability predicted by the model that sample i is a positive example, y_true(i) is the actual label of the sample, taking the value 0 or 1, and log denotes the natural logarithm;
the Dice coefficient loss function is calculated as follows:
$$L_{Dice} = 1 - \frac{2\sum_{i=1}^{N} y_{pred}(i)\,y_{true}(i) + \epsilon}{\sum_{i=1}^{N} y_{pred}(i) + \sum_{i=1}^{N} y_{true}(i) + \epsilon}$$
where N, y_pred(i), and y_true(i) are as above, and ε is a small smoothing constant that prevents division by zero;
thus, the hybrid loss function BCEDice Loss is calculated as:
$$L_{BCEDice} = \alpha L_{BCE} + \beta L_{Dice}$$
where α and β both take the value 0.5.
Specifically, the post-processing of step 6 includes removing noise, filling holes, and smoothing the segmentation boundaries.
The invention also provides a lightweight brain tumor segmentation system based on V-Net, which comprises:
an image preprocessing module: the method comprises the steps of preprocessing an image data set to be segmented to obtain an image conforming to a network structure;
and a network construction module: for constructing a V-Net network;
V-Net network improvement module: the method is used for reducing network parameters, improving the network training speed and simultaneously keeping higher precision;
and the network training module: training the improved network by adopting a mixed Loss function BCEDice Loss;
and a post-processing module: for further processing of the segmentation results.
The invention provides a V-Net-based lightweight brain tumor segmentation method and system that greatly reduce the amount of computation while accelerating training; on the basis of a shortened training time they maintain high training accuracy, deliver good segmentation performance, and are of positive significance for clinicians' diagnosis and patients' treatment.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure of the invention as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the description of the embodiments or the prior art will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a diagram of the four BraTS modalities provided by an embodiment of the present disclosure;
FIG. 2 is a block diagram of preprocessed data provided by an embodiment of the present disclosure;
FIG. 3 is a block diagram of batch normalization and group normalization provided by an embodiment of the present disclosure;
FIG. 4 is a diagram of a generic convolution process provided by an embodiment of the present disclosure;
FIG. 5 is a diagram of a depth separable convolution process provided by an embodiment of the present disclosure;
FIG. 6 is a flowchart of an SE attention mechanism algorithm provided by an embodiment of the present disclosure;
FIG. 7 is a network block diagram provided by an embodiment of the present disclosure;
fig. 8 is a graph showing comparison of segmentation effects provided by the disclosed embodiments of the present invention.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the invention. Rather, they are merely examples of systems consistent with aspects of the invention as detailed in the accompanying claims.
The 3D convolutional network methods mainly include 3D U-Net, V-Net, and the like. In the 3D method, the whole MRI volume is ideally input into the segmentation model to obtain the 3D segmentation result of the entire MRI. Due to limitations of hardware and other resources, however, the MRI image is typically cut into several blocks that are fed into the segmentation model separately; the outputs are finally aggregated to form a segmentation map of the entire volume, which preserves a certain amount of context information.
The embodiment firstly provides a light-weight brain tumor segmentation method based on V-Net, which comprises the following steps:
step 1: preprocessing an image to obtain an image conforming to a network structure;
specifically, brain tumor MRI data were preprocessed, and the data set, the BraTS2018 data set, was normalized, cut and blocked. So that the processed image meets the requirement of the network structure.
In this embodiment: the training set and validation set of the experiments of the present invention were from the BraTs2018 dataset, the training set contained 210 advanced glioma patient samples, 75 lower glioma patient samples, and the validation set contained 66 unlabeled patient samples, wherein each sample in the training set in turn contained brain MRI images of 4 different modalities (T1, T2, T1ce, flair) and corresponding true label images. However, the BraTs only disclose training set data, and have no test set data, and if a part of the data in the training set is split to be used as a test set, the fitting phenomenon can occur due to too little training data, so that the network generalization capability is poor. To solve the problem of less data, we selected as the test set the increased fraction of the BraTs2019 training set over the BraTs2018 training set. Which contained 49 higher glioma patient samples and 1 lower glioma patient sample.
The preprocessing steps mainly comprise:
(1) Manually add 5 black slices. For the four modality images (155,240,240) and the corresponding masks (155,240,240), 3 black slices are added in front and 2 behind, uniformly modifying the image size to (160,240,240).
(2) Standardization. BraTS provides MR images of four sequences — T1, T2, FLAIR, and T1ce — which are images of different modalities with different contrasts, so each modality image is normalized with the z-score: the mean is subtracted from the image, which is then divided by the standard deviation.
(3) Cropping. The grey part of a BraTS MR image is the brain region and the black part is the background. The background occupies a large proportion of the whole image and does not help segmentation; a doctor viewing an MR image automatically filters out this background and focuses on the brain region. It is therefore necessary to remove the background around the brain region; after cropping, the network parameters become smaller and network performance improves. Here the image of original size (160,240,240) is cropped to (160,160,160).
(4) Blocking. Because limited GPU memory prevents the whole image from being input into the network, the image and the corresponding mask need to be divided into blocks. The cropped image and mask size is (160,160,160); in this experiment, 5 blocks of size (32,160,160) are divided along the axial direction.
(5) Merge and save the data. The four standardized and blocked modalities are merged into four channels; the saved shape is (32,160,160,4) with dtype float64. After the corresponding mask is likewise divided into blocks, the three labels are combined into three nested subregions — WT, TC, and ET — which are merged into three channels with values 0 or 1; the saved shape is (32,160,160,3) with dtype uint8.
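As a concrete illustration, the following is a minimal sketch of this preprocessing pipeline in Python with NumPy. The function names, the fixed central crop window (40:200), and the BraTS label values {1, 2, 4} used to build the nested subregions are assumptions for illustration — the patent does not specify them:

```python
import numpy as np

def zscore(volume):
    # z-score standardization: subtract the mean, divide by the standard deviation
    return (volume - volume.mean()) / (volume.std() + 1e-8)

def preprocess_case(modalities, mask):
    """modalities: 4 arrays of shape (155, 240, 240); mask: (155, 240, 240)."""
    channels = []
    for vol in modalities:
        vol = np.pad(vol, ((3, 2), (0, 0), (0, 0)))   # 3 black slices in front, 2 behind
        vol = zscore(vol)
        vol = vol[:, 40:200, 40:200]                  # crop background -> (160, 160, 160)
        channels.append(vol)
    image = np.stack(channels, axis=-1)               # (160, 160, 160, 4)
    mask = np.pad(mask, ((3, 2), (0, 0), (0, 0)))[:, 40:200, 40:200]
    # nested subregions from the BraTS labels: WT, TC, ET channels with values 0 or 1
    wt, tc, et = mask > 0, np.isin(mask, (1, 4)), mask == 4
    target = np.stack([wt, tc, et], axis=-1).astype(np.uint8)
    # 5 axial blocks of shape (32, 160, 160, C)
    return [(image[z:z + 32].astype(np.float64), target[z:z + 32])
            for z in range(0, 160, 32)]
```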
Step 2: construct the V-Net network. V-Net consists of two parts, an encoder and a decoder, which respectively extract features from the original input image and map the extracted features back to a segmentation result. V-Net uses 3D convolution to process three-dimensional volume data. It also introduces skip connections: by connecting the encoder's feature maps with the feature maps of the corresponding decoder layers, the network can better retain and transfer detail information. The encoder of V-Net consists of multiple DownTransition modules, each comprising a downsampling operation (typically a 3D convolution with stride 2), batch normalization, a ReLU activation, one or more convolution operations (typically with 5×5×5 kernels), and an optional Dropout layer. DownTransition progressively reduces the feature-map size and increases the number of feature channels so that the encoder captures information at different levels. The decoder of V-Net consists of multiple UpTransition modules, each comprising an upsampling operation (typically a transposed convolution with stride 2 or bilinear interpolation), normalization, a ReLU activation, and one or more convolution operations; the decoder progressively restores the feature map to the size of the input image while reducing the number of feature channels. The skip connections between encoder and decoder let the network access the encoder's low-level feature information during decoding, thereby helping the network better restore details and boundaries.
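A minimal PyTorch sketch of the two module types described above. The channel counts, the residual addition, and the use of batch normalization here follow the standard V-Net design as assumptions; step 3 below replaces the normalization and convolutions:

```python
import torch
import torch.nn as nn

def conv_block(ch):
    # one 5x5x5 convolution stage: Conv3d -> BatchNorm -> ReLU
    return nn.Sequential(nn.Conv3d(ch, ch, 5, padding=2),
                         nn.BatchNorm3d(ch), nn.ReLU(inplace=True))

class DownTransition(nn.Module):
    """Halves the spatial size, doubles the channels, then stacks convolutions."""
    def __init__(self, in_ch, n_convs=2):
        super().__init__()
        out_ch = in_ch * 2
        self.down = nn.Sequential(nn.Conv3d(in_ch, out_ch, 2, stride=2),
                                  nn.BatchNorm3d(out_ch), nn.ReLU(inplace=True))
        self.convs = nn.Sequential(*[conv_block(out_ch) for _ in range(n_convs)])

    def forward(self, x):
        x = self.down(x)
        return self.convs(x) + x              # residual connection within the stage

class UpTransition(nn.Module):
    """Doubles the spatial size, halves the channels, concatenates the skip features."""
    def __init__(self, in_ch, n_convs=2):
        super().__init__()
        out_ch = in_ch // 2
        self.up = nn.Sequential(nn.ConvTranspose3d(in_ch, out_ch, 2, stride=2),
                                nn.BatchNorm3d(out_ch), nn.ReLU(inplace=True))
        self.convs = nn.Sequential(*[conv_block(out_ch * 2) for _ in range(n_convs)])

    def forward(self, x, skip):
        x = torch.cat([self.up(x), skip], dim=1)   # skip connection from the encoder
        return self.convs(x)
```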
Step 3: improve the V-Net network. First, the batch normalization (BN) in the network is replaced with group normalization (GN). The traditional V-Net network uses BN, which normalizes each channel over N, H, and W independently; this speeds up training and strengthens the network's generalization ability. However, BN's accuracy is strongly affected by changes in the batch size N: the smaller the batch, the less the computed mean and variance represent the global statistics and the higher the error rate, while enlarging the batch raises the computer's memory requirements. Because medical image volumes are large, the batch size is generally set quite small. In this situation the computation of GN is not affected by the batch size, and its accuracy remains stable for any batch value, so GN is better suited to medical image segmentation experiments; therefore, in the experiments of the invention, the batch normalization in the V-Net network is modified to group normalization. The basic idea of GN is as follows: divide the features extracted by a certain layer into G groups, normalize the features within each group, and finally merge the G normalized groups.
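In PyTorch this substitution is a one-line change per normalization layer; a sketch (the group count G=8 is an assumption — the patent does not state G):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 32, 16, 64, 64)                 # (batch, channels, D, H, W)
gn = nn.GroupNorm(num_groups=8, num_channels=32)   # 32 channels split into 8 groups of 4
print(gn(x).shape)                                 # torch.Size([1, 32, 16, 64, 64])
# nn.BatchNorm3d(32) would instead compute statistics across the batch, so its
# accuracy degrades for the small batch sizes typical of 3D medical volumes.
```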
Second, the ordinary convolutions in the network are replaced with depthwise separable convolutions, and a Squeeze-and-Excitation (SE) attention mechanism is added to the encoder part of the network. The network thereby sheds a large number of parameters and trains faster while maintaining high accuracy.
A depthwise separable convolution splits one convolution into two independent convolutions and mainly comprises two steps: a depthwise convolution and a pointwise convolution. The depthwise convolution convolves each channel of the input feature map separately; the pointwise convolution then convolves across all channels using a 1×1 kernel. As shown in the figure, let the input feature map have size $D_K \times D_K \times M$ and the ordinary convolution kernel have size $D_F \times D_F$ with $N$ output channels; the computation of the ordinary convolution over the feature map is $D_F \times D_F \times M \times N \times D_K \times D_K$. The depthwise kernel has size $D_F \times D_F \times 1 \times M$, and its convolution with the feature map costs $D_F \times D_F \times M \times D_K \times D_K$. The pointwise kernel has size $1 \times 1 \times M \times N$, and its convolution costs $M \times N \times D_K \times D_K$. Through these two steps the total computation is $D_F \times D_F \times M \times D_K \times D_K + M \times N \times D_K \times D_K$, whereas the ordinary convolution costs $D_F \times D_F \times M \times N \times D_K \times D_K$; compared with the latter, the computation of the depthwise separable convolution is clearly reduced, so changing the ordinary convolutions in the convolution units into depthwise separable convolutions can greatly increase the computation speed of the network model.
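A sketch of a 3D depthwise separable convolution in PyTorch, with a parameter-count comparison against an ordinary 5×5×5 convolution (the module is a generic illustration under the assumptions above, not the patent's exact convolution unit):

```python
import torch.nn as nn

class DepthwiseSeparableConv3d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=5, padding=2):
        super().__init__()
        # depthwise step: one filter per input channel (groups=in_ch)
        self.depthwise = nn.Conv3d(in_ch, in_ch, kernel_size,
                                   padding=padding, groups=in_ch)
        # pointwise step: 1x1x1 convolution mixes information across channels
        self.pointwise = nn.Conv3d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(nn.Conv3d(32, 64, 5, padding=2)))   # 256064 parameters
print(count(DepthwiseSeparableConv3d(32, 64)))  # 6144 parameters
```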
A Squeeze-and-Excitation (SE) attention mechanism is added after the convolutions of each layer of the V-Net encoder.
Depthwise separable convolution reduces the number of parameters and the computation to some extent, but it can sometimes limit the feature representation ability of the network and may not adequately capture complex features of the data, thereby affecting model performance. The SE attention mechanism dynamically adjusts the importance of features across channels so that the network pays more attention to the important features, strengthening feature expression; SE attention can also help the network better capture key information in the image, improving model performance. The basic idea of SE attention is to learn a weight for each channel from globally pooled information, in two main steps: Squeeze and Excitation. Given an input feature map X, the operation Ftr produces a feature map U. In the Squeeze step (Fsq(·)), U undergoes global average pooling to produce a 1×1×C vector in which each channel is represented by a single number. The Excitation step (Fex(·)) is completed by two fully connected layers whose learned weights W model the correlations among the features and generate the required weight information. Finally, in Fscale(·), the generated weight vector S reweights the feature map U channel by channel to obtain the output feature map X', whose size is identical to U — the SE module does not change the feature-map size. Combining Squeeze and Excitation, SE attention rescales each channel of the feature map by the learned weights: channels with higher weights receive more attention while channels with lower weights are suppressed. SE attention thus helps the network focus on features that are meaningful to the task, improving network performance. The SE module's parameter count is small, so the lightweight character of the network is not affected.
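A sketch of an SE block for 3D feature maps (the reduction ratio r=16 follows the original Squeeze-and-Excitation paper and is an assumption here):

```python
import torch
import torch.nn as nn

class SEBlock3d(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool3d(1)    # Fsq: global average pooling
        self.excite = nn.Sequential(              # Fex: two fully connected layers
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c = x.shape[:2]
        s = self.squeeze(x).view(b, c)            # one number per channel
        w = self.excite(s).view(b, c, 1, 1, 1)    # learned channel weights in (0, 1)
        return x * w                              # Fscale: rescale; size of x unchanged
```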
Step 4: training the improved network by adopting a mixed Loss function BCEDice Loss;
since brain tumor segmentation is a very challenging task, it is mainly expressed in that brain tumor segmentation usually involves multiple classes, the distribution of different classes in the image is very different, resulting in a class imbalance problem, which makes the model more prone to predict larger classes and smaller classes may be ignored, affecting the segmentation effect. Furthermore, the shape and size of brain tumors vary greatly from patient to patient and from case to case, making it difficult for the model to capture all shape details and boundaries, especially for small or blurred-edge tumors. And the binary cross entropy Loss and the Dice coefficient Loss are combined together by using the BCEDice Loss, the final Loss is calculated through linear combination, the classification accuracy of the pixel level (through the binary cross entropy) and the overlapping degree of the segmentation result (through the Dice coefficient) can be comprehensively considered, and in the brain tumor segmentation task with unbalanced categories, the BCEDice Loss can better balance the two losses, so that the performance of the model is improved.
The binary cross-entropy loss function (Binary Cross Entropy Loss) is a loss function commonly used for binary classification tasks to measure the difference between the predicted probability and the true label. In brain tumor segmentation, pixel-level segmentation is a classification problem — each pixel belongs either to the tumor region or to the non-tumor region — so the binary cross-entropy loss helps the network learn correct pixel classification. Its calculation formula is:
$$L_{BCE} = -\frac{1}{N}\sum_{i=1}^{N}\Big[y_{true}(i)\log y_{pred}(i) + \big(1-y_{true}(i)\big)\log\big(1-y_{pred}(i)\big)\Big]$$
where N represents the number of samples, y_pred(i) is the probability predicted by the model that sample i is a positive example, y_true(i) is the actual label of the sample, taking the value 0 or 1, and log denotes the natural logarithm.
Dice coefficient loss function: the Dice coefficient is an index for evaluating the similarity of two sets and measures the overlap between the predicted segmentation result and the true segmentation. In brain tumor segmentation tasks, the Dice coefficient is widely used to measure segmentation accuracy and is particularly effective for class-imbalanced tasks. The formula is:
$$L_{Dice} = 1 - \frac{2\sum_{i=1}^{N} y_{pred}(i)\,y_{true}(i) + \epsilon}{\sum_{i=1}^{N} y_{pred}(i) + \sum_{i=1}^{N} y_{true}(i) + \epsilon}$$
where N, y_pred(i), and y_true(i) are as above, and ε is a small smoothing constant that prevents division by zero.
The final loss function is:
$$L_{BCEDice} = \alpha L_{BCE} + \beta L_{Dice}$$
In the invention, α and β both take the value 0.5.
Post-processing the segmentation result:
In brain tumor segmentation tasks the V-Net network typically applies post-processing to further optimize the segmentation result. The goal of post-processing is to perform a number of operations on the segmentation mask output by the model, including removing noise, filling holes, and smoothing the segmentation boundaries, thereby obtaining a more accurate segmentation result.
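A sketch of such post-processing with SciPy and scikit-image (the minimum component size and the closing radius are illustrative assumptions):

```python
import numpy as np
from scipy import ndimage
from skimage import morphology

def postprocess(mask, min_size=500):
    """mask: binary 3D array obtained by thresholding the network output."""
    mask = mask.astype(bool)
    mask = morphology.remove_small_objects(mask, min_size=min_size)  # remove noise
    mask = ndimage.binary_fill_holes(mask)                           # fill holes
    mask = morphology.binary_closing(mask, morphology.ball(2))       # smooth boundaries
    return mask.astype(np.uint8)
```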
An optimal network model is selected according to the training results; Table 1 compares the segmentation performance of various algorithms on the BraTS data set:
the training set is randomly split into five equal parts, and five-fold cross validation is carried out, as shown in a table I, the average Dice value of the algorithm in the whole area, the enhancement area and the core area respectively reaches 90.10%, 90.10% and 89.42%. NVDLMED in the table is the algorithm for the first name of the BraTS2018 race. As can be seen from the table, the algorithm of the invention is respectively reduced by 12.5 times and 32.1 times compared with V-Net in terms of calculated amount and parameter amount, and is respectively reduced by 21.8 times and 28.6 times compared with NVDLMED, and the calculated amount and parameter amount are obviously reduced. However, in terms of computational accuracy, the algorithm of the invention is slightly lower than V-Net in the whole region (WT) by 1.89%, but is respectively higher than V-Net in the enhanced tumor region (ET) and the tumor core region (TC) by 1.71% and 0.52%, respectively, and is respectively higher than NVDLMED by 8.72% and 3.46%. In comprehensive view, the algorithm of the invention can greatly reduce the calculated amount, quicken the training speed, and can keep higher training precision on the basis of shortening the training time, has good segmentation performance, and has positive significance for diagnosis of clinicians and treatment of patients.
The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.
Claims (5)
1. A lightweight brain tumor segmentation method based on V-Net comprises the following steps:
step 1: preprocessing an image to obtain an image conforming to a network structure;
step 2: constructing a V-Net network; the V-Net network consists of an encoder and a decoder;
the encoder is used for extracting features from the original input image, and the decoder is used for restoring the extracted features into segmentation results;
wherein the encoder is composed of a plurality of DownTransition modules and the decoder is composed of a plurality of UpTransition modules; the feature maps are spliced between the encoder and the decoder by means of skip connections;
the DownTransition module is used for gradually reducing the size of the feature map and gradually increasing the number of feature channels, so that the encoder can capture information at different levels;
the UpTransition module is used for gradually restoring the feature map to the same size as the input image while reducing the number of feature channels;
the skip connections are used for connecting the feature maps of the encoder with the feature maps of the corresponding decoder layers, so that the network can better transmit information;
step 3: improving the V-Net network;
changing batch normalization in the V-Net network into group normalization; replacing the ordinary convolutions in the network with depthwise separable convolutions;
adding a Squeeze-and-Excitation (SE) attention mechanism into the encoder part of the network;
step 4: training the improved network by adopting a mixed Loss function BCEDice Loss; selecting an optimal network model according to the training result;
step 5: inputting the image data preprocessed in the step 1 into an optimal network model to obtain a segmentation result;
step 6: and carrying out post-processing on the segmentation result.
2. The V-Net-based lightweight brain tumor segmentation method according to claim 1, wherein
the image preprocessing of step 1 comprises: sequentially standardizing, cropping, and blocking the brain tumor MRI data set.
3. The V-Net-based lightweight brain tumor segmentation method according to claim 1, wherein the hybrid loss function BCEDice Loss combines the binary cross-entropy loss and the Dice coefficient loss, and the final loss is calculated as their linear combination;
the binary cross-entropy loss is calculated as follows:
$$L_{BCE} = -\frac{1}{N}\sum_{i=1}^{N}\Big[y_{true}(i)\log y_{pred}(i) + \big(1-y_{true}(i)\big)\log\big(1-y_{pred}(i)\big)\Big]$$
where N represents the number of samples, y_pred(i) is the probability predicted by the model that sample i is a positive example, y_true(i) is the actual label of the sample, taking the value 0 or 1, and log denotes the natural logarithm;
the Dice coefficient loss function is calculated as follows:
$$L_{Dice} = 1 - \frac{2\sum_{i=1}^{N} y_{pred}(i)\,y_{true}(i) + \epsilon}{\sum_{i=1}^{N} y_{pred}(i) + \sum_{i=1}^{N} y_{true}(i) + \epsilon}$$
where N, y_pred(i), and y_true(i) are as above, and ε is a small smoothing constant;
thus, the hybrid loss function BCEDice Loss is calculated as:
$$L_{BCEDice} = \alpha L_{BCE} + \beta L_{Dice}$$
where α and β both take the value 0.5.
4. The V-Net-based lightweight brain tumor segmentation method according to claim 1, wherein the post-processing of step 6 comprises removing noise, filling holes, and smoothing the segmentation boundaries.
5. A lightweight brain tumor segmentation system based on V-Net, characterized by comprising:
An image preprocessing module: the method comprises the steps of preprocessing an image data set to be segmented to obtain an image conforming to a network structure;
and a network construction module: for constructing a V-Net network;
V-Net network improvement module: the method is used for reducing network parameters, improving the network training speed and simultaneously keeping higher precision;
and the network training module: training the improved network by adopting a mixed Loss function BCEDice Loss;
and a post-processing module: for further processing of the segmentation results.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311301482.XA CN117422871A (en) | 2023-10-10 | 2023-10-10 | Lightweight brain tumor segmentation method and system based on V-Net |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311301482.XA CN117422871A (en) | 2023-10-10 | 2023-10-10 | Lightweight brain tumor segmentation method and system based on V-Net |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117422871A (en) | 2024-01-19
Family
ID=89527528
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311301482.XA Pending CN117422871A (en) | 2023-10-10 | 2023-10-10 | Lightweight brain tumor segmentation method and system based on V-Net |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117422871A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117911797A (en) * | 2024-03-19 | 2024-04-19 | 武汉理工大学 | Crop CT image semiautomatic labeling method and system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112116605B (en) | Pancreas CT image segmentation method based on integrated depth convolution neural network | |
CN113808146B (en) | Multi-organ segmentation method and system for medical image | |
CN109584244B (en) | Hippocampus segmentation method based on sequence learning | |
Klibisz et al. | Fast, simple calcium imaging segmentation with fully convolutional networks | |
Aranguren et al. | Improving the segmentation of magnetic resonance brain images using the LSHADE optimization algorithm | |
CN110889853A (en) | Tumor segmentation method based on residual error-attention deep neural network | |
CN110706214B (en) | Three-dimensional U-Net brain tumor segmentation method fusing condition randomness and residual error | |
CN111696126B (en) | Multi-view-angle-based multi-task liver tumor image segmentation method | |
CN114782350A (en) | Multi-modal feature fusion MRI brain tumor image segmentation method based on attention mechanism | |
CN110619641A (en) | Automatic segmentation method of three-dimensional breast cancer nuclear magnetic resonance image tumor region based on deep learning | |
Shahsavari et al. | Proposing a novel Cascade Ensemble Super Resolution Generative Adversarial Network (CESR-GAN) method for the reconstruction of super-resolution skin lesion images | |
CN117422871A (en) | Lightweight brain tumor segmentation method and system based on V-Net | |
CN117058307A (en) | Method, system, equipment and storage medium for generating heart three-dimensional nuclear magnetic resonance image | |
CN116091412A (en) | Method for segmenting tumor from PET/CT image | |
CN114926396A (en) | Mental disorder magnetic resonance image preliminary screening model construction method | |
Kumaraswamy et al. | Automatic prostate segmentation of magnetic resonance imaging using Res-Net | |
CN117746042A (en) | Liver tumor CT image segmentation method based on APA-UNet | |
CN111798455B (en) | Thyroid nodule real-time segmentation method based on full convolution dense cavity network | |
CN116934721A (en) | Kidney tumor segmentation method based on multi-scale feature extraction | |
CN116030043A (en) | Multi-mode medical image segmentation method | |
Rao et al. | Weight pruning-UNet: Weight pruning UNet with depth-wise separable convolutions for semantic segmentation of kidney tumors | |
CN112766333B (en) | Medical image processing model training method, medical image processing method and device | |
CN116128890A (en) | Pathological cell image segmentation method and system based on self-adaptive fusion module and cross-stage AU-Net network | |
CN113327221A (en) | Image synthesis method and device fusing ROI (region of interest), electronic equipment and medium | |
CN111932486A (en) | Brain glioma segmentation method based on 3D convolutional neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||