CN117422871A - Lightweight brain tumor segmentation method and system based on V-Net - Google Patents

Lightweight brain tumor segmentation method and system based on V-Net

Info

Publication number
CN117422871A
CN117422871A
Authority
CN
China
Prior art keywords
network
net
segmentation
image
loss
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311301482.XA
Other languages
Chinese (zh)
Inventor
Wang Linlin
Zhang Tong
Wang Chuanyun
Shao Jing
Li Zhongyi
Gao Qian
Zhang Xin
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenyang Aerospace University
Original Assignee
Shenyang Aerospace University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenyang Aerospace University filed Critical Shenyang Aerospace University
Priority to CN202311301482.XA priority Critical patent/CN117422871A/en
Publication of CN117422871A publication Critical patent/CN117422871A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10088Magnetic resonance imaging [MRI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30016Brain
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30096Tumor; Lesion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/03Recognition of patterns in medical or anatomical images
    • G06V2201/032Recognition of patterns in medical or anatomical images of protuberances, polyps nodules, etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Databases & Information Systems (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a lightweight brain tumor segmentation method based on V-Net, which comprises the following steps: step 1: preprocessing an image to obtain an image conforming to the network structure; step 2: constructing a V-Net network; step 3: improving the V-Net network by changing the batch normalization in the V-Net network to group normalization, replacing the ordinary convolutions in the network with depth separable convolutions, and adding a Squeeze-and-Excitation (SE) attention mechanism to the encoder part of the network; step 4: training the improved network with the mixed loss function BCEDice Loss, and selecting the optimal network model according to the training results; step 5: inputting the image data preprocessed in step 1 into the optimal network model to obtain a segmentation result; step 6: post-processing the segmentation result. The method maintains high training accuracy while shortening the training time, has good segmentation performance, and is of positive significance for the diagnosis of clinicians and the treatment of patients.

Description

Lightweight brain tumor segmentation method and system based on V-Net
Technical Field
The invention belongs to the technical field of deep learning computer vision and medical image processing, and particularly relates to a V-Net-based lightweight brain tumor segmentation method and system.
Background
A brain tumor, one of the diseases most seriously endangering patients' lives, is an abnormal mass of cells growing inside the cranium. It has an irregular shape and uncertain volume, can appear anywhere in the brain, and can cause severe dysfunction of the human nervous system. Among brain tumors, glioma is the most common craniocerebral tumor, characterized by high morbidity, high recurrence rate, high mortality, and a low cure rate. Gliomas are classified into high-grade gliomas (HGG) and low-grade gliomas (LGG) according to the extent of invasion and the patient's prognosis: high-grade gliomas have a higher mortality rate, while low-grade gliomas progress more slowly. Early diagnosis and timely treatment of low-grade brain tumors are therefore of great significance for increasing patients' chances of survival, improving their quality of life, and prolonging life.
Magnetic resonance imaging (MRI) acquires and reconstructs information about the human body through the magnetic resonance phenomenon, and MRI can obtain brain images with higher contrast than CT imaging. MRI is a non-invasive technique that is harmless to the human body; it provides complete images without craniotomy, has good soft tissue contrast, shows tissue structures at higher resolution, and is widely used in clinical diagnosis and treatment.
With the development of deep learning and the advancement of related hardware, deep learning methods have been successfully applied to the field of medical images; segmentation with a deep learning model compensates for the time-consuming and labor-intensive nature of manual segmentation. At present, deep learning brain tumor image segmentation mainly uses two kinds of methods, 2D convolution networks and 3D convolution networks, where 2D convolution networks mainly include FCN, U-Net, U-Net++, and the like. In a 2D method, the 3D MRI volume is decomposed into a number of 2D slices, each slice is passed into a segmentation model that generates a segmentation for it, and the two-dimensional slices are then recombined into a segmented three-dimensional volume; the disadvantage of this approach is that important context information in the 3D image is lost.
In summary, there is an urgent need for an MRI image segmentation method based on a 3D convolution network.
Disclosure of Invention
In view of the above, the invention discloses a lightweight brain tumor segmentation method and system based on V-Net, so as to obtain a network model with high training accuracy and fast training speed.
The technical scheme of the invention is as follows: a lightweight brain tumor segmentation method based on V-Net comprises the following steps:
step 1: preprocessing an image to obtain an image conforming to a network structure;
step 2: constructing a V-Net network; the V-Net network consists of an encoder and a decoder;
the encoder is used for extracting features from the original input image, and the decoder is used for restoring the extracted features into segmentation results;
wherein the encoder is composed of a plurality of DownTransition modules, and the decoder part is composed of a plurality of UpTransition modules; the encoder and the decoder are connected through skip connections;
the DownTransition module is used for gradually reducing the size of the feature map and gradually increasing the number of feature channels, so that the encoder can capture information at different levels;
the UpTransition module is used for gradually restoring the size of the feature map to that of the input image while reducing the number of feature channels;
the skip connection is used for connecting the feature map of the encoder with the feature map of the corresponding layer of the decoder, so that the network can better transmit information;
step 3: improving the V-Net network;
changing batch normalization in the V-Net network into group normalization; replacing the normal convolution in the network with a depth separable convolution;
adding a Squeeze-and-Excitation (SE) attention mechanism to the encoder part of the network;
step 4: training the improved network by adopting a mixed Loss function BCEDice Loss; selecting an optimal network model according to the training result;
step 5: inputting the image data preprocessed in the step 1 into an optimal network model to obtain a segmentation result;
step 6: and carrying out post-processing on the segmentation result.
Specifically, the image preprocessing of step 1 comprises: normalizing, cropping, and blocking the brain tumor MRI data set in sequence.
Specifically, the mixed Loss function BCEDice Loss combines the binary cross entropy Loss and the Dice coefficient Loss, the final Loss being calculated through a linear combination;
the binary cross entropy loss is calculated as follows:
where N represents the number of samples, y pred (i) The probability that sample i is a positive example of model prediction, y true (i) Is the actual label of the sample, and takes a value of 0 or 1; log represents natural logarithm;
the calculation formula of the Dice coefficient loss function is as follows:
where N represents the number of samples, y pred (i) The probability that sample i is a positive example of model prediction, y true (i) Is the actual label of the sample, and takes a value of 0 or 1;
thus, the mixing Loss function bcEDice Loss calculation formula is:
the values of alpha and beta are 0.5.
Specifically, the post-processing in step 6 includes removing noise, filling holes, and smoothing the segmentation boundaries.
The invention also provides a lightweight brain tumor segmentation system based on V-Net, which comprises:
an image preprocessing module: the method comprises the steps of preprocessing an image data set to be segmented to obtain an image conforming to a network structure;
and a network construction module: for constructing a V-Net network;
V-Net network improvement module: for reducing network parameters and improving the network training speed while keeping high accuracy;
and the network training module: training the improved network by adopting a mixed Loss function BCEDice Loss;
and a post-processing module: for further processing of the segmentation results.
The V-Net-based lightweight brain tumor segmentation method and system provided by the invention greatly reduce the amount of computation and accelerate training; they maintain high training accuracy while shortening the training time, have good segmentation performance, and are of positive significance for the diagnosis of clinicians and the treatment of patients.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure of the invention as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the description of the embodiments or the prior art will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a diagram of the four modalities of the BraTS dataset provided by an embodiment of the present disclosure;
FIG. 2 is a block diagram of preprocessed data provided by an embodiment of the present disclosure;
FIG. 3 is a block diagram of batch normalization and group normalization provided by an embodiment of the present disclosure;
FIG. 4 is a diagram of a generic convolution process provided by an embodiment of the present disclosure;
FIG. 5 is a diagram of a depth separable convolution process provided by an embodiment of the present disclosure;
FIG. 6 is a flowchart of an SE attention mechanism algorithm provided by an embodiment of the present disclosure;
FIG. 7 is a network block diagram provided by an embodiment of the present disclosure;
FIG. 8 is a comparison diagram of segmentation effects provided by an embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the invention. Rather, they are merely examples of systems consistent with aspects of the invention as detailed in the accompanying claims.
Methods based on 3D convolution networks mainly include 3D U-Net, V-Net, and the like. In a 3D method, ideally the whole MRI volume is input into the segmentation model and a 3D segmentation result for the whole MRI is obtained. Due to limitations of hardware and other resources, the MRI image is typically cut into several blocks, each block is fed into the segmentation model, and the results are finally aggregated into a segmentation map of the entire volume, which preserves a certain amount of context information.
The embodiment firstly provides a light-weight brain tumor segmentation method based on V-Net, which comprises the following steps:
step 1: preprocessing an image to obtain an image conforming to a network structure;
specifically, brain tumor MRI data were preprocessed, and the data set, the BraTS2018 data set, was normalized, cut and blocked. So that the processed image meets the requirement of the network structure.
In this embodiment: the training set and validation set of the experiments of the present invention were from the BraTs2018 dataset, the training set contained 210 advanced glioma patient samples, 75 lower glioma patient samples, and the validation set contained 66 unlabeled patient samples, wherein each sample in the training set in turn contained brain MRI images of 4 different modalities (T1, T2, T1ce, flair) and corresponding true label images. However, the BraTs only disclose training set data, and have no test set data, and if a part of the data in the training set is split to be used as a test set, the fitting phenomenon can occur due to too little training data, so that the network generalization capability is poor. To solve the problem of less data, we selected as the test set the increased fraction of the BraTs2019 training set over the BraTs2018 training set. Which contained 49 higher glioma patient samples and 1 lower glioma patient sample.
The preprocessing steps mainly comprise:
(1) Manually adding 5 black slices. Three black slices are added in front of and two behind each of the four modality images (155,240,240) and the corresponding masks (155,240,240), uniformly changing the image size to (160,240,240).
(2) Normalization. BraTS provides MR images of four sequences, T1, T2, Flair, and T1ce. Because the four sequences are images of different modalities, their contrast differs, so each modality image is normalized with the z-score: the mean is subtracted from the image, which is then divided by the standard deviation.
(3) Cropping. In a BraTS MR image, the grey part is the brain region and the black part is the background; background information occupies a large proportion of the whole image and does not help segmentation. A doctor viewing the MR image automatically filters out this background information and focuses on the brain region, so the background around the brain region should be removed. After cropping, the network parameters become smaller and network performance improves; here the original (160,240,240) image is cropped to (160,160,160).
(4) Blocking. Because GPU memory is limited, the whole image cannot be fed into the network at once, so the image and the corresponding mask must be divided into blocks. The cropped image and mask have size (160,160,160); in this experiment they are divided along the axial direction into 5 blocks of size (32,160,160).
(5) Merging and saving the data. The four normalized and blocked modalities are merged into four channels; the saved shape is (32,160,160,4) with dtype float64. The corresponding mask is blocked in the same way, after which the three labels are merged into three nested subregions, WT, TC, and ET, stored as three channels with values 0 or 1; the saved shape is (32,160,160,3) with dtype uint8. A sketch of this preprocessing pipeline is given below.
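The following is a minimal NumPy sketch of steps (1)–(4); the function names and the center-crop offsets (40:200) are illustrative assumptions, not the patented implementation:

```python
import numpy as np

def zscore(volume: np.ndarray) -> np.ndarray:
    # z-score normalization: subtract the mean and divide by the standard deviation
    return (volume - volume.mean()) / (volume.std() + 1e-8)

def preprocess(modalities) -> np.ndarray:
    # modalities: four (155, 240, 240) arrays (T1, T2, T1ce, Flair)
    channels = []
    for vol in modalities:
        vol = np.pad(vol, ((3, 2), (0, 0), (0, 0)))  # 3 black slices in front, 2 behind
        vol = zscore(vol)                            # per-modality normalization
        vol = vol[:, 40:200, 40:200]                 # (160,240,240) -> (160,160,160), assumed center crop
        channels.append(vol)
    combined = np.stack(channels, axis=-1)           # (160, 160, 160, 4), float64
    return combined.reshape(5, 32, 160, 160, 4)      # five (32,160,160,4) axial blocks
```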
Step 2: constructing a V-Net network. The V-Net consists of two parts, an encoder and a decoder, which extract features from the original input image and map the extracted features back to a segmentation result, respectively. V-Net uses 3D convolutions to process three-dimensional volume data and introduces skip connections: by connecting the feature map of the encoder with the feature map of the corresponding decoder layer, the network better retains and transfers detail information. The encoder portion of V-Net consists of multiple DownTransition modules, each comprising a downsampling operation (typically a 3D convolution with stride 2), batch normalization, a ReLU activation, one or more convolution operations (typically with a 5×5×5 kernel), and an optional Dropout layer. DownTransition progressively reduces the size of the feature map and increases the number of feature channels so that the encoder captures information at different levels. The decoder portion consists of multiple UpTransition modules, each comprising an upsampling operation (typically a transposed convolution with stride 2 or bilinear interpolation), a normalization step, a ReLU activation, and one or more convolution operations; the decoder progressively restores the feature map to the size of the input image while reducing the number of feature channels. The skip connections between the encoder and the decoder allow the network to access the low-level feature information of the encoder during decoding, helping the network to better restore details and boundaries.
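As a reference, the following PyTorch sketch shows one possible realization of the DownTransition and UpTransition modules described above; the channel counts and layer ordering of the patented network are not specified here, so they are assumptions:

```python
import torch
import torch.nn as nn

class DownTransition(nn.Module):
    """Halve the spatial size and grow the channel count (encoder side)."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.down = nn.Conv3d(in_ch, out_ch, kernel_size=2, stride=2)   # stride-2 downsampling
        self.conv = nn.Conv3d(out_ch, out_ch, kernel_size=5, padding=2)
        self.norm1, self.norm2 = nn.BatchNorm3d(out_ch), nn.BatchNorm3d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.act(self.norm1(self.down(x)))
        return self.act(self.norm2(self.conv(x)))

class UpTransition(nn.Module):
    """Double the spatial size and fuse the encoder skip connection (decoder side)."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.up = nn.ConvTranspose3d(in_ch, out_ch // 2, kernel_size=2, stride=2)
        self.conv = nn.Conv3d(out_ch, out_ch, kernel_size=5, padding=2)
        self.norm1 = nn.BatchNorm3d(out_ch // 2)
        self.norm2 = nn.BatchNorm3d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x, skip):
        x = self.act(self.norm1(self.up(x)))
        x = torch.cat([x, skip], dim=1)      # skip connection from the encoder
        return self.act(self.norm2(self.conv(x)))
```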
Step 3: improving the V-Net network. The batch normalization (BN) in the network is replaced with group normalization (GN). The traditional V-Net processes images with BN, which normalizes each channel over the N, H, W dimensions independently; this accelerates training and strengthens the generalization ability of the network. However, BN's accuracy is strongly affected by the batch size N: the smaller the batch, the less the computed mean and variance represent the global statistics and the higher the error rate, while increasing the batch size raises the memory requirements of the computer. Since medical image volumes are large, the batch size is generally set relatively small. GN, by contrast, is unaffected by the batch size: its accuracy remains stable for any batch size. GN is therefore better suited to medical image segmentation experiments, so in the experiments of the invention the batch normalization in the V-Net network is changed to group normalization. The basic idea of GN is to divide the features extracted by a layer into G groups, normalize the features within each group, and finally merge the normalized groups.
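In PyTorch the change from batch normalization to group normalization is a drop-in substitution; the group count G = 8 below is an illustrative assumption:

```python
import torch.nn as nn

# Before: normalization statistics depend on the batch size N
bn = nn.BatchNorm3d(num_features=32)

# After: the 32 channels are split into G = 8 groups and normalized within
# each group, independently of the batch size (stable even for batch size 1)
gn = nn.GroupNorm(num_groups=8, num_channels=32)
```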
The ordinary convolutions in the network are replaced with depth separable convolutions (Depthwise Separable Convolution), and an SE (Squeeze-and-Excitation) Attention mechanism is added to the encoder portion of the network. The network can thereby shed a large number of parameters and train faster while keeping high accuracy.
The depth separable convolution splits one convolution into two independent convolutions and mainly comprises two steps: a depthwise convolution and a point-wise convolution. The depthwise convolution convolves each channel of the input feature map separately; the point-wise convolution applies a $1 \times 1$ convolution kernel over all channels. As shown in FIG. 4 and FIG. 5: let the size of the input feature map be $D_K \times D_K \times M$ and the kernel size be $D_F \times D_F$; the computational cost of the ordinary convolution of the feature map with $N$ such kernels is $D_F \times D_F \times M \times N \times D_K \times D_K$. The kernel of the depthwise convolution has size $D_F \times D_F \times 1 \times M$, and the cost of convolving the feature map with it is $D_F \times D_F \times M \times D_K \times D_K$. The kernel of the point-wise convolution has size $1 \times 1 \times M \times N$, and the cost of convolving the feature map with it is $M \times N \times D_K \times D_K$. The total cost of the two steps is therefore $D_F \times D_F \times M \times D_K \times D_K + M \times N \times D_K \times D_K$, whereas the cost of the ordinary convolution is $D_F \times D_F \times M \times N \times D_K \times D_K$. Comparing the two, the computational cost of the depth separable convolution is significantly lower, so replacing the ordinary convolution in the convolution unit with a depth separable convolution greatly improves the computation speed of the network model.
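The following PyTorch sketch builds a 3D depth separable convolution from a depthwise convolution (groups=in_ch) followed by a 1×1×1 point-wise convolution; the channel counts in the parameter comparison are illustrative:

```python
import torch.nn as nn

class DepthwiseSeparableConv3d(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 5):
        super().__init__()
        # depthwise: one filter per input channel (groups=in_ch)
        self.depthwise = nn.Conv3d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch)
        # point-wise: 1x1x1 convolution mixes information across channels
        self.pointwise = nn.Conv3d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

# Parameter comparison for M = 64 input channels, N = 128 output channels, 5x5x5 kernels:
plain = nn.Conv3d(64, 128, 5, padding=2)
separable = DepthwiseSeparableConv3d(64, 128)
print(sum(p.numel() for p in plain.parameters()))      # 1,024,128
print(sum(p.numel() for p in separable.parameters()))  # 16,384
```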
An SE (Squeeze-and-Excitation) Attention mechanism is added after the convolution of each layer of the V-Net network encoder section.
Depth separable convolution reduces the number of parameters and the computational cost to some extent, but it may limit the feature representation capability of the network and fail to adequately capture complex features in the data, affecting model performance. The SE Attention mechanism dynamically adjusts the importance of inter-channel features so that the network pays more attention to important features, enhancing the feature expression capability. In addition, SE Attention helps the network better capture key information in the image, improving model performance. The basic idea of SE Attention is to learn a weight for each channel from globally pooled information, in two main steps: Squeeze and Excitation. Given an input feature map X, the transformation Ftr produces a feature map U. In the Squeeze step (Fsq), global average pooling turns U into a 1×1×C vector, representing each channel by a single value. The Excitation step (Fex) is implemented with two fully connected layers and produces the required weight information through weights W, which are obtained by learning and model the correlations among the channels. In the Fscale step, the generated weight vector S reweights the feature map U channel by channel, yielding the required feature map X', whose size is identical to that of U; the SE module does not change the size of the feature map. Combining the Squeeze and Excitation steps, SE Attention rescales each channel of the feature map by the learned weights: channels with higher weights receive more attention, while channels with lower weights are suppressed. SE Attention thus helps the network focus on features that are meaningful to the task, improving network performance. The number of SE parameters is small and does not affect the lightweight nature of the network.
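A PyTorch sketch of the SE block for 3D feature maps follows; the reduction ratio r = 16 is a common default and an assumption here, not a value stated above:

```python
import torch.nn as nn

class SEBlock3d(nn.Module):
    def __init__(self, channels: int, r: int = 16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool3d(1)      # Fsq: global average pooling
        self.excitation = nn.Sequential(            # Fex: two fully connected layers
            nn.Linear(channels, channels // r),
            nn.ReLU(inplace=True),
            nn.Linear(channels // r, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c = x.shape[:2]
        s = self.squeeze(x).view(b, c)              # one descriptor value per channel
        w = self.excitation(s).view(b, c, 1, 1, 1)  # learned channel weights S
        return x * w                                # Fscale: reweight the feature map U
```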
Step 4: training the improved network by adopting a mixed Loss function BCEDice Loss;
since brain tumor segmentation is a very challenging task, it is mainly expressed in that brain tumor segmentation usually involves multiple classes, the distribution of different classes in the image is very different, resulting in a class imbalance problem, which makes the model more prone to predict larger classes and smaller classes may be ignored, affecting the segmentation effect. Furthermore, the shape and size of brain tumors vary greatly from patient to patient and from case to case, making it difficult for the model to capture all shape details and boundaries, especially for small or blurred-edge tumors. And the binary cross entropy Loss and the Dice coefficient Loss are combined together by using the BCEDice Loss, the final Loss is calculated through linear combination, the classification accuracy of the pixel level (through the binary cross entropy) and the overlapping degree of the segmentation result (through the Dice coefficient) can be comprehensively considered, and in the brain tumor segmentation task with unbalanced categories, the BCEDice Loss can better balance the two losses, so that the performance of the model is improved.
The binary cross entropy loss function (Binary Cross Entropy Loss) is a loss function commonly used for binary classification tasks to measure the difference between the predicted probability and the true label. In brain tumor segmentation, pixel-level segmentation is a classification problem, i.e. each pixel belongs either to a tumor region or to a non-tumor region, so the binary cross entropy loss helps the network learn correct pixel classification. The calculation formula is as follows:

$$L_{\mathrm{BCE}} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_{\mathrm{true}}(i)\log y_{\mathrm{pred}}(i) + \left(1 - y_{\mathrm{true}}(i)\right)\log\left(1 - y_{\mathrm{pred}}(i)\right)\right]$$

where $N$ represents the number of samples, $y_{\mathrm{pred}}(i)$ is the model's predicted probability that sample $i$ is a positive example, $y_{\mathrm{true}}(i)$ is the actual label of the sample, taking the value 0 or 1, and $\log$ denotes the natural logarithm.
Dice coefficient loss function: the Dice coefficient is an index for evaluating the similarity of two sets, used here to measure the overlap between the predicted segmentation and the true segmentation. In brain tumor segmentation tasks, the Dice coefficient is widely used to measure segmentation accuracy and is particularly effective for class-imbalanced tasks. The formula is as follows:

$$L_{\mathrm{Dice}} = 1 - \frac{2\sum_{i=1}^{N} y_{\mathrm{pred}}(i)\, y_{\mathrm{true}}(i)}{\sum_{i=1}^{N} y_{\mathrm{pred}}(i) + \sum_{i=1}^{N} y_{\mathrm{true}}(i)}$$

where $N$ represents the number of samples, $y_{\mathrm{pred}}(i)$ is the model's predicted probability that sample $i$ is a positive example, and $y_{\mathrm{true}}(i)$ is the actual label of the sample, taking the value 0 or 1.
The final loss function is:

$$L_{\mathrm{BCEDice}} = \alpha L_{\mathrm{BCE}} + \beta L_{\mathrm{Dice}}$$

In the invention, the values of $\alpha$ and $\beta$ are both 0.5.
Post-processing the segmentation result:
V-Net networks typically perform post-processing in brain tumor segmentation tasks to further optimize the segmentation result. Post-processing applies a number of operations to the segmentation mask output by the model, including removing noise, filling holes, and smoothing the segmentation boundaries, thereby obtaining a more accurate segmentation result.
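A sketch of the noise-removal and hole-filling operations using scipy.ndimage on a binary mask; the minimum component size and the morphological closing used for boundary smoothing are illustrative assumptions:

```python
import numpy as np
from scipy import ndimage

def postprocess(mask: np.ndarray, min_size: int = 100) -> np.ndarray:
    # mask: binary (D, H, W) segmentation output of the network
    labeled, n = ndimage.label(mask)                      # connected components
    sizes = ndimage.sum(mask, labeled, range(1, n + 1))   # voxel count per component
    keep_ids = np.where(sizes >= min_size)[0] + 1         # drop small components (noise)
    cleaned = np.isin(labeled, keep_ids)
    cleaned = ndimage.binary_fill_holes(cleaned)          # fill internal holes
    return ndimage.binary_closing(cleaned)                # smooth the segmentation boundary
```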
An optimal network model is selected according to the training results; Table 1 compares the segmentation effect of various algorithms on the BraTS data set:
The training set is randomly split into five equal parts for five-fold cross validation. As shown in Table 1, the average Dice values of the algorithm in the whole tumor region, the enhancing tumor region, and the tumor core region reach 90.10%, 90.10%, and 89.42%, respectively. NVDLMED in the table is the first-place algorithm of the BraTS2018 challenge. As can be seen from the table, the algorithm of the invention reduces the computational cost and the parameter count by factors of 12.5 and 32.1 relative to V-Net, and by factors of 21.8 and 28.6 relative to NVDLMED, a significant reduction in both. In terms of accuracy, the algorithm of the invention is 1.89% lower than V-Net in the whole tumor region (WT), but 1.71% and 0.52% higher than V-Net in the enhancing tumor region (ET) and the tumor core region (TC), respectively, and 8.72% and 3.46% higher than NVDLMED in those regions. Overall, the algorithm of the invention greatly reduces the amount of computation and speeds up training while maintaining high training accuracy on a shortened training time; it has good segmentation performance and is of positive significance for the diagnosis of clinicians and the treatment of patients.
The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.

Claims (5)

1. A lightweight brain tumor segmentation method based on V-Net comprises the following steps:
step 1: preprocessing an image to obtain an image conforming to a network structure;
step 2: constructing a V-Net network; the V-Net network consists of an encoder and a decoder;
the encoder is used for extracting features from the original input image, and the decoder is used for restoring the extracted features into segmentation results;
wherein the encoder is composed of a plurality of DownTransition modules, and the decoder part is composed of a plurality of UpTransition modules; the feature maps are spliced between the encoder and the decoder by means of skip connections;
the DownTransition module is used for gradually reducing the size of the feature map and gradually increasing the number of feature channels, so that the encoder can capture information at different levels;
the UpTransition module is used for gradually restoring the size of the feature map to that of the input image while reducing the number of feature channels;
the skip connection is used for connecting the feature map of the encoder with the feature map of the corresponding layer of the decoder, so that the network can better transmit information;
step 3: improving the V-Net network;
changing batch normalization in the V-Net network into group normalization; replacing the normal convolution in the network with a depth separable convolution;
adding a Squeeze-and-Excitation (SE) attention mechanism to the encoder part of the network;
step 4: training the improved network by adopting a mixed Loss function BCEDice Loss; selecting an optimal network model according to the training result;
step 5: inputting the image data preprocessed in the step 1 into an optimal network model to obtain a segmentation result;
step 6: and carrying out post-processing on the segmentation result.
2. The V-Net based lightweight brain tumor segmentation method according to claim 1, wherein
the image preprocessing of step 1 comprises: normalizing, cropping, and blocking the brain tumor MRI data set in sequence.
3. The V-Net based lightweight brain tumor segmentation method according to claim 1, wherein the mixed Loss function BCEDice Loss combines the binary cross entropy Loss and the Dice coefficient Loss, the final Loss being calculated through a linear combination;
the binary cross entropy loss is calculated as follows:
where N represents the number of samples, y pred (i) The probability that sample i is a positive example of model prediction, y true (i) Is the actual label of the sample, and takes a value of 0 or 1; log represents natural logarithm;
the calculation formula of the Dice coefficient loss function is as follows:
where N represents the number of samples, y pred (i) The probability that sample i is a positive example of model prediction, y true (i) Is the actual label of the sample, and takes a value of 0 or 1;
thus, the mixing Loss function bcEDice Loss calculation formula is:
the values of alpha and beta are 0.5.
4. The V-Net based lightweight brain tumor segmentation method according to claim 1, wherein the post-processing of step 6 comprises removing noise, filling holes, and smoothing the segmentation boundaries.
5. A lightweight brain tumor segmentation system based on V-Net, characterized by comprising:
An image preprocessing module: the method comprises the steps of preprocessing an image data set to be segmented to obtain an image conforming to a network structure;
and a network construction module: for constructing a V-Net network;
V-Net network improvement module: for reducing network parameters and improving the network training speed while keeping high accuracy;
and the network training module: training the improved network by adopting a mixed Loss function BCEDice Loss;
and a post-processing module: for further processing of the segmentation results.
CN202311301482.XA 2023-10-10 2023-10-10 Lightweight brain tumor segmentation method and system based on V-Net Pending CN117422871A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311301482.XA CN117422871A (en) 2023-10-10 2023-10-10 Lightweight brain tumor segmentation method and system based on V-Net

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311301482.XA CN117422871A (en) 2023-10-10 2023-10-10 Lightweight brain tumor segmentation method and system based on V-Net

Publications (1)

Publication Number Publication Date
CN117422871A true CN117422871A (en) 2024-01-19

Family

ID=89527528

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311301482.XA Pending CN117422871A (en) 2023-10-10 2023-10-10 Lightweight brain tumor segmentation method and system based on V-Net

Country Status (1)

Country Link
CN (1) CN117422871A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117911797A (en) * 2024-03-19 2024-04-19 武汉理工大学 Crop CT image semiautomatic labeling method and system


Similar Documents

Publication Publication Date Title
CN112116605B (en) Pancreas CT image segmentation method based on integrated depth convolution neural network
CN113808146B (en) Multi-organ segmentation method and system for medical image
CN109584244B (en) Hippocampus segmentation method based on sequence learning
Klibisz et al. Fast, simple calcium imaging segmentation with fully convolutional networks
Aranguren et al. Improving the segmentation of magnetic resonance brain images using the LSHADE optimization algorithm
CN110889853A (en) Tumor segmentation method based on residual error-attention deep neural network
CN110706214B (en) Three-dimensional U-Net brain tumor segmentation method fusing condition randomness and residual error
CN111696126B (en) Multi-view-angle-based multi-task liver tumor image segmentation method
CN114782350A (en) Multi-modal feature fusion MRI brain tumor image segmentation method based on attention mechanism
CN110619641A (en) Automatic segmentation method of three-dimensional breast cancer nuclear magnetic resonance image tumor region based on deep learning
Shahsavari et al. Proposing a novel Cascade Ensemble Super Resolution Generative Adversarial Network (CESR-GAN) method for the reconstruction of super-resolution skin lesion images
CN117422871A (en) Lightweight brain tumor segmentation method and system based on V-Net
CN117058307A (en) Method, system, equipment and storage medium for generating heart three-dimensional nuclear magnetic resonance image
CN116091412A (en) Method for segmenting tumor from PET/CT image
CN114926396A (en) Mental disorder magnetic resonance image preliminary screening model construction method
Kumaraswamy et al. Automatic prostate segmentation of magnetic resonance imaging using Res-Net
CN117746042A (en) Liver tumor CT image segmentation method based on APA-UNet
CN111798455B (en) Thyroid nodule real-time segmentation method based on full convolution dense cavity network
CN116934721A (en) Kidney tumor segmentation method based on multi-scale feature extraction
CN116030043A (en) Multi-mode medical image segmentation method
Rao et al. Weight pruning-UNet: Weight pruning UNet with depth-wise separable convolutions for semantic segmentation of kidney tumors
CN112766333B (en) Medical image processing model training method, medical image processing method and device
CN116128890A (en) Pathological cell image segmentation method and system based on self-adaptive fusion module and cross-stage AU-Net network
CN113327221A (en) Image synthesis method and device fusing ROI (region of interest), electronic equipment and medium
CN111932486A (en) Brain glioma segmentation method based on 3D convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination