CN114693933A - Medical image segmentation device based on a generative adversarial network and multi-scale feature fusion - Google Patents

Medical image segmentation device based on a generative adversarial network and multi-scale feature fusion Download PDF

Info

Publication number
CN114693933A
Authority
CN
China
Prior art keywords
liver
network
segmentation
data
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210361443.8A
Other languages
Chinese (zh)
Inventor
孙美君
杨淑清
王征
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN202210361443.8A
Publication of CN114693933A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10072 Tomographic images
    • G06T2207/10081 Computed x-ray tomography [CT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30056 Liver; Hepatic
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30096 Tumor; Lesion

Abstract

The invention discloses a medical image segmentation device based on a generative adversarial network and multi-scale feature fusion, comprising a coarse-to-fine liver tumor segmentation framework. The input of the framework is an original 3D CT image, which is preprocessed to obtain a normalized image. The liver is segmented by the discriminator of a trained generative adversarial network based on feature-map synthesis, which outputs a prediction probability map; the value of each voxel in the probability map represents the probability that the voxel belongs to the liver, and the network learns more information through adversarial learning between the generator and the discriminator. The liver ROI is then extracted automatically: the 3D liver segmentation result is multiplied element-wise with the normalized image to mask out unrelated organs, the minimum bounding cuboid of the liver region is computed and cropped, and livers of different sizes are resampled to the same size. Finally, taking the liver ROI as input, a trained three-channel cascade network based on an improved V-Net fuses multi-scale features, enlarges the receptive field, handles the differences in position, shape and size of target regions across different data as well as the blurred boundaries of lesion regions, and outputs the tumor segmentation result.

Description

Medical image segmentation device based on a generative adversarial network and multi-scale feature fusion
Technical Field
The invention relates to the field of computer vision and machine learning, in particular to a medical image segmentation device based on a generative adversarial network and multi-scale feature fusion.
Background
With the development of computer technology and biomedicine, medical imaging devices have become widespread, and various medical imaging technologies are widely applied in the clinic. For example, magnetic resonance imaging (MRI), computed tomography (CT), ultrasound, and X-ray imaging are all used to visualize organs, tissues, and diseased areas inside the human body in a non-invasive manner. Medical image segmentation plays an indispensable auxiliary role in disease diagnosis, case analysis, surgical planning and prognosis evaluation; it can provide doctors with extremely valuable information such as the location of organ lesions, the size of the lesion area and the severity of the lesion, and can support real-time imaging during surgery. Deep learning has developed rapidly in the field of medical image segmentation: by combining low-level features into abstract high-level features through nonlinear combinations, it addresses the low resolution and high complexity that make medical images difficult to analyze, so the construction, improvement and reasonable interpretation of deep learning networks has become one of the hot topics of current research at the intersection of artificial intelligence and medicine.
At present, high-precision automatic segmentation of liver tumors remains one of the most challenging tasks in medical image processing. Owing to the cost of labor and the professional knowledge required, it is difficult to produce voxel-level labels of the liver and liver tumors for a large liver image dataset, and this lack of labeled data is undoubtedly a pressing problem for data-driven deep learning models. Moreover, compared with the liver, liver tumors are small in volume and uneven in gray-level distribution; their shape, number and position vary from person to person; and the boundaries between tumors and organs are quite blurred, which further increases the difficulty of fine-grained liver tumor segmentation.
Currently, medical image segmentation methods at home and abroad can be divided into conventional techniques, techniques based on shallow machine learning, and techniques based on deep learning. Traditional image segmentation techniques use features of an image such as gray scale, texture and edges, and segment the target area through manually set feature values; the quality of the segmentation result is closely tied to the manually set features, so prediction performance in complex scenes is usually limited and a large amount of available raw image information is ignored. Facing the limitations of traditional segmentation techniques, the development of machine learning provided a new solution for medical image segmentation; shallow machine learning techniques such as clustering and support vector machines all rely on manual feature extraction, so labor and time costs are high, and the quality of feature selection directly affects the segmentation result. Research on deep learning has undergone a long development: from the introduction of the McCulloch-Pitts (MP) neuron model in 1943, successive researchers proposed network models such as the perceptron, the back-propagation algorithm, convolutional neural networks, generative adversarial networks and residual networks, each injecting fresh blood into machine learning. With the application and popularization of artificial intelligence in the medical field, deep learning algorithms are used to segment the liver and its tumors from medical images, and segmentation accuracy has greatly improved compared with traditional image segmentation methods and methods based on shallow machine learning; nevertheless, some problems remain as barriers to further development.
For example, for liver tumors with uneven density distribution or at different scales, existing segmentation techniques still leave considerable room for improvement; in addition, no large-scale labeled dataset currently exists that can fully meet the training requirements of a deep network, so it is necessary to study how to learn from limited sample data.
For medical images, especially three-dimensional images, acquiring annotation data requires a great deal of labor and time, and this small-sample problem hinders the development of AI (artificial intelligence) in medical imaging to a certain extent. In recent years researchers have tried a variety of approaches to fully exploit incomplete datasets, proposing many high-performance models to reduce the need for labeled data in medical image segmentation. Data determines the upper limit of model performance; when the amount of labeled data is very limited, making a small amount of data play a greater role is an urgent problem in medical image segmentation research.
At present, deep learning achieves excellent results in liver tumor segmentation, but owing to the characteristics of medical images, the segmentation task still faces the following problems:
1) to reduce computation, most 3D segmentation models splice 2D segmentation results into a 3D result; in the model's optimization target, the ROI (region of interest) boundary curve in a 2D image slice then replaces the ROI surface of the 3D image, which undoubtedly reduces segmentation accuracy;
2) in abdominal CT images of different cases, the size, position and texture of liver tumors differ greatly, and accurately locating the tumor region directly with an end-to-end network is very difficult;
3) training deep neural networks relies on a large number of labeled medical images, and completing accurate voxel-level labeling manually is very time-consuming, labor-intensive and somewhat subjective. In addition, because medical images follow different imaging protocols, a training set labeled for one study is difficult to reuse for another and often requires re-labeling. How to make full use of the existing small amount of label information when labeled data is incomplete is therefore worth further exploration;
4) most segmentation algorithms perform well only on images with clear, sharp boundaries and strong contrast. Affected by the partial volume effect, tissue motion, noise, artifacts and the like, liver tumor boundaries in CT images are blurred, contrast with the liver is low, and the data are complex. A neural network model can delineate edges well by extracting a large number of features; how to use data at different scales to improve the model's representational capability, and thereby achieve refined segmentation of tumor edges, is also a problem to be solved.
Disclosure of Invention
Aiming at the small-sample problem of medical images, the invention segments the liver and its tumors from abdominal CT images through a semi-supervised learning mode that combines the multi-scale semantic information of medical images, using a generative adversarial network based on feature-map synthesis together with improved multi-scale V-Nets, as described in detail below.
A medical image segmentation apparatus based on a generative adversarial network and multi-scale feature fusion, the apparatus comprising a coarse-to-fine liver tumor segmentation framework:
the input of the segmentation framework is an original 3D CT image, which is preprocessed to obtain a normalized image; the liver is segmented by the discriminator of a trained generative adversarial network based on feature-map synthesis, which outputs a prediction probability map;
the value of each voxel in the probability map represents the probability that the voxel belongs to the liver, and the generative adversarial network learns more information through adversarial learning between the generator and the discriminator;
automatic extraction of the liver ROI: the 3D liver segmentation result is multiplied element-wise with the normalized image to mask out unrelated organs, the minimum bounding cuboid of the liver region is computed and cropped, and livers of different sizes are resampled to the same size;
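As a minimal sketch of this ROI-extraction step, the masking, bounding-cuboid cropping and resampling can be written in a few lines of numpy. Nearest-neighbor resampling and the 64 × 64 × 64 output size are illustrative assumptions made for the example, not choices taken from the patent:

```python
import numpy as np

def resize_nn(vol, out_shape):
    """Nearest-neighbor resampling of a 3D volume to a fixed shape."""
    idx = [np.round(np.linspace(0, s - 1, o)).astype(int)
           for s, o in zip(vol.shape, out_shape)]
    return vol[np.ix_(*idx)]

def extract_liver_roi(normalized_image, liver_mask, out_shape=(64, 64, 64)):
    """Mask out unrelated organs, crop the minimum bounding cuboid of
    the liver, and resample the crop to a fixed size."""
    # Element-wise (dot) multiplication with the 3D liver segmentation.
    masked = normalized_image * (liver_mask > 0)

    # Minimum bounding cuboid of the liver region.
    coords = np.argwhere(liver_mask > 0)
    lo, hi = coords.min(axis=0), coords.max(axis=0) + 1
    crop = masked[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]

    # Resample livers of different sizes to the same size.
    return resize_nn(crop, out_shape)
```

In the full framework the liver mask would come from the discriminator's prediction; trilinear interpolation would normally replace the nearest-neighbor resampling used here for brevity.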
the method comprises the steps of taking a liver ROI as input, fusing multi-scale features by utilizing a trained three-channel cascade network based on improved V-Net, expanding a receptive field, processing the problems of position, shape and size difference of target regions and fuzzy boundaries of lesion regions in different data, and finally obtaining a tumor segmentation result.
The discriminator is a V-Net network integrated with a pyramid pooling module, the generator is a neural network synthesized based on a feature map, the generator is trained by using the feature map output by the spatial pyramid pooling, and distribution of CT images is learned from label-free data to generate pseudo image data.
The discriminator uses the divided labeled data, non-labeled data and pseudo data generated by the generator to carry out semi-supervised learning; the network of the discriminator and the network of the generator mutually resist and learn until the discriminant and the generator reach dynamic balance, and the training is finished.
Further, the V-Net network with the pyramid pooling module is as follows:
pyramid pooling is improved with a spatial shape awareness module that captures the long-distance dependencies between different lesion areas in an image as well as local context dependencies;
for the input tensor, three mutually perpendicular slice-shaped sub-modules process the input to obtain three outputs, which are expanded to the same size as the input tensor and fused to obtain a new feature vector.
Furthermore, the fusion process is a dot-product operation followed by a Softmax activation function; the fused tensor and the original input tensor then undergo the same fusion operation to obtain the final output tensor;
three-dimensional pooling layers at three scales pool the input feature maps to 1 × 1 × 1, 2 × 2 × 2 and 3 × 3 × 3 respectively; the number of channels of the three pooling results is reduced by convolution; the results are then up-sampled to the size of the original feature map and fused with the original feature map and the output of the spatial shape awareness module; and the number of channels is reduced again by convolution to obtain a feature map containing multi-scale information.
The three-channel cascade network based on the improved V-Net comprises a multi-scale segmentation network:
the input of the first branch is the original input data with each of its three dimensions reduced to 0.5 times; after the corresponding segmentation result is output, it is up-sampled by a factor of 2 and fused with the results of the other two branches to output the segmentation map of the liver tumor.
The technical scheme provided by the invention has the following beneficial effects:
1. the invention uses the idea of a cascade network to extract the liver region directly from 3D images and then segment the tumor region more finely from the liver region, realizing end-to-end detection in medical images;
2. the invention uses semi-supervised learning, which makes full use of the information in unlabeled data, alleviates the over-dependence of deep learning models on labeled data, and achieves higher segmentation accuracy with less labeled data;
3. the invention improves the original pyramid pooling module with a spatial shape awareness module; through multiple receptive fields of different sizes it can match appropriate target sizes from both global and local perspectives, effectively exploits features at different scales, uses the semantic information of high-level features to make the feature maps of low-level features more complete, uses the details of low-level features to refine the edges of high-level features, and avoids interference from background noise.
Drawings
FIG. 1 is a schematic structural diagram of the medical image segmentation apparatus based on a generative adversarial network and multi-scale feature fusion;
FIG. 2 is a schematic diagram of the liver segmentation model;
FIG. 3 is a schematic diagram of the training process of the liver segmentation model;
FIG. 4 is a schematic diagram of the pyramid pooling module improved with spatial shape awareness;
FIG. 5 is a schematic diagram of the spatial shape awareness module;
FIG. 6 is a schematic diagram of the tumor segmentation model;
FIG. 7 is a schematic diagram visualizing the liver tumor segmentation results of each model.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention are described in further detail below.
Example 1
The embodiment of the invention provides a medical image segmentation device based on a generative adversarial network and multi-scale feature fusion, and describes a coarse-to-fine liver tumor segmentation framework. As shown in fig. 1, the input of the framework is an original 3D CT image, which is preprocessed to obtain a normalized image. The liver is then segmented by the discriminator of a trained generative adversarial network based on feature-map synthesis (FRGAN), which outputs a prediction probability map; the value of each voxel in the probability map represents the probability that the voxel belongs to the liver, and FRGAN learns more information from less labeled data through adversarial learning between the generator and the discriminator. The liver ROI is then extracted automatically: the 3D liver segmentation result is multiplied element-wise with the normalized image to mask out unrelated organs, the minimum bounding cuboid of the liver region is computed and cropped, and livers of different sizes are resampled to the same size. Finally, taking the liver ROI as input, a trained three-channel cascade network based on an improved V-Net (a fully convolutional neural network for segmenting volumetric medical images) fuses multi-scale features, enlarging the receptive field and better handling the differences in position, shape and size of target regions across different data as well as the blurred boundaries of lesion regions, and the tumor segmentation result is obtained.
In summary, tumor segmentation in the above framework is performed within a smaller and more accurate liver region rather than in the original image, which effectively reduces false tumor segmentations and improves tumor segmentation accuracy.
Example 2
The scheme of Example 1 is further described below in conjunction with figs. 2-5:
1. Liver region extraction
Embodiments of the present invention use a generative adversarial network based on feature-map synthesis (FRGAN) as the liver region segmentation module of the segmentation framework in fig. 1. Fig. 2 shows the network structure of FRGAN, which mainly comprises a generator and a discriminator: a V-Net network with an integrated pyramid pooling module serves as the discriminator and outputs the segmentation result, and a neural network based on the feature-map synthesis method serves as the generator. The generator is trained on the feature maps output by the S-PPM (spatial pyramid pooling) module; it learns the distribution of CT images from unlabeled data and generates pseudo image data. To verify the performance of the model, the experimental data are divided into labeled and unlabeled data at a ratio of 4:6, and the discriminator performs semi-supervised learning with the partitioned labeled data, the unlabeled data and the pseudo data produced by the generator. The discriminator and generator networks learn adversarially against each other until they reach a dynamic balance, at which point training ends. The three colored arrows in fig. 3 represent the training process of the network.
The left half of fig. 2 shows the network structure of the generator, which produces a pseudo image using the feature-map synthesis method. The S-PPM module in the discriminator network aggregates multi-level, multi-scale feature information; its output feature map serves as the input of the generator, and a pseudo image of the same size as the real data is generated through 4 upsampling operations and 4 stages of convolution. The real and fake images then each pass through the encoder part of the discriminator to extract a feature map of the unlabeled image (Unlabeled Feature-map) and a feature map of the fake image (Fake Feature-map); the difference between their means is used as the loss, and after a number of iterations the generator can synthesize, through the feature maps, fake images that are closer to the real images.
Fig. 4 shows the network structure of the improved pyramid pooling module S-PPM. Organ tumors in medical images are irregular in shape and unevenly distributed; the original pyramid pooling relies entirely on stacking pooling layers of different sizes and has difficulty learning target-region features and key position information from medical images. Embodiments of the present invention therefore improve the pyramid pooling module with a Spatial Shape Awareness Module (SSAM). The network structure of SSAM is shown in fig. 5; it can flexibly capture the long-distance dependencies between different lesion areas in an image, effectively capture local context correlations, and suppress interference from irrelevant regions. For the input tensor, three mutually perpendicular slice-shaped sub-modules process the input to obtain three outputs, which are expanded to the same size as the input tensor and fused to obtain a new feature vector; the fusion process is a dot-product operation followed by a Softmax activation function. Finally, the fused tensor and the original input tensor undergo the same fusion operation to obtain the final output tensor. On this basis, three-dimensional pooling layers at three scales pool the input feature maps to 1 × 1 × 1, 2 × 2 × 2 and 3 × 3 × 3 respectively; the number of channels of the three pooling results is then reduced by convolution; the results are up-sampled to the size of the original feature map and fused with the original feature map and the output of SSAM; and the number of channels is reduced again by convolution to obtain a feature map containing multi-scale information.
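The data flow of the S-PPM module can be approximated in plain numpy. This is a loose, hedged interpretation: the slab poolings, the dot-product-then-Softmax fusion, and the multi-scale pooling branches follow the text, while all learned convolutions (including the 1 × 1 × 1 channel-reduction layers) are omitted, so only tensor shapes and fusion order are illustrated:

```python
import numpy as np

def softmax(v, axis=-1):
    e = np.exp(v - v.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def ssam(x):
    """Spatial shape awareness sketch: three mutually perpendicular slab
    poolings, broadcast back to the input size, dot-product fused,
    passed through Softmax, and fused again with the input tensor."""
    slabs = [x.mean(axis=a, keepdims=True) for a in (1, 2, 3)]
    fused = np.ones_like(x)
    for s in slabs:
        fused = fused * np.broadcast_to(s, x.shape)   # dot-product fusion
    att = softmax(fused.reshape(x.shape[0], -1)).reshape(x.shape)
    return x * att                                    # fuse with the input

def adaptive_avg_pool3d(x, o):
    """Average-pool a (C, D, H, W) tensor to (C, o, o, o)."""
    splits = [np.array_split(np.arange(n), o) for n in x.shape[1:]]
    out = np.zeros((x.shape[0], o, o, o), dtype=x.dtype)
    for i, di in enumerate(splits[0]):
        for j, hj in enumerate(splits[1]):
            for k, wk in enumerate(splits[2]):
                out[:, i, j, k] = x[:, di][:, :, hj][:, :, :, wk].mean(axis=(1, 2, 3))
    return out

def upsample_nn(x, out_spatial):
    """Nearest-neighbor upsampling of (C, D, H, W) to a spatial size."""
    idx = [np.round(np.linspace(0, s - 1, o)).astype(int)
           for s, o in zip(x.shape[1:], out_spatial)]
    return x[:, idx[0]][:, :, idx[1]][:, :, :, idx[2]]

def s_ppm(x):
    """Concatenate the input, the SSAM output, and the three pooled
    branches (1, 2, 3) upsampled back to the input size; the real
    module would then reduce channels with a convolution."""
    branches = [x, ssam(x)]
    for o in (1, 2, 3):
        branches.append(upsample_nn(adaptive_avg_pool3d(x, o), x.shape[1:]))
    return np.concatenate(branches, axis=0)
```

With a 4-channel input, the concatenation yields 4 × 5 = 20 channels before the final channel-reducing convolution that this sketch omits.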
The loss function l_G of the generator is as follows:
l_G = ‖E_{x~P_data(x)} f(x) − E_{x′~G(x)} f(x′)‖₂² (1)
where x represents the input unlabeled data, x′ represents the pseudo data generated by the generator from the unlabeled data, f(x) represents the unlabeled feature map extracted after the unlabeled data passes through the S-PPM module in the discriminator, f(x′) represents the pseudo feature map extracted after the pseudo data passes through the S-PPM module, E denotes the expectation over the unlabeled and pseudo data, and G(x) represents the distribution of images produced by the generator.
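Eq. (1) is a feature-matching loss: the generator is penalized by the distance between the mean feature maps of unlabeled and pseudo images. A minimal numpy sketch, assuming the features are given as (batch, feature) arrays:

```python
import numpy as np

def generator_loss(f_unlabeled, f_fake):
    """Feature-matching form of the generator loss: squared L2 distance
    between the batch-mean unlabeled feature map f(x) and the
    batch-mean pseudo feature map f(x') from the S-PPM module."""
    diff = f_unlabeled.mean(axis=0) - f_fake.mean(axis=0)
    return float(np.sum(diff ** 2))
```

As the two mean feature maps converge, the loss goes to zero, which is the dynamic balance the training aims for.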
The discriminator computes the supervised loss between the labels of the real labeled samples and the predicted target-region segmentation of the labeled samples, and at the same time computes the unsupervised losses for the predicted target-region segmentation of the unlabeled samples and of the pseudo samples. The discriminator loss l_D consists of three parts: the labeled-image loss l_Labeled, the unlabeled-image loss l_Unlabeled, and the pseudo-image loss l_Fake:
l_D = l_Labeled + l_Unlabeled + l_Fake (2)
For the input data, x_G represents the label corresponding to a labeled image x; x, x_G ~ P_data(x, x_G) indicates that the input is labeled data, x ~ P_data(x) indicates that the input is unlabeled data, and x′ ~ G(f) indicates that the input is pseudo data, where G(f) represents the distribution of the pseudo data. Using the cross-entropy loss function (well known to those skilled in the art), the losses of the three kinds of input image can be computed as:
l_Labeled = −E_{x,x_G~P_data(x,x_G)} Σ_i log p_D(y_i = x_G,i | x, y_i < N+1)
l_Unlabeled = −E_{x~P_data(x)} Σ_i log(1 − p_D(y_i = N+1 | x)) (3)
l_Fake = −E_{x′~G(f)} Σ_i log p_D(y_i = N+1 | x′)
where p_D(y_i | x, y_i < N+1) denotes the probability that voxel i in image x is predicted to belong to class y_i, p_D(y_i = N+1 | x) denotes the probability that a voxel in the unlabeled data is predicted to be a fake image, p_D(y_i = N+1 | x′) denotes the probability that a voxel in the pseudo data is predicted to be a fake image, N = 1, and p_D denotes the discriminator network's predicted probability for each kind of data.
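The three-part discriminator loss of Eq. (2) can be sketched as follows. The cross-entropy terms are a standard semi-supervised-GAN reading of the probabilities described above (labeled voxels should match their class, unlabeled voxels should not look fake, generated voxels should look fake); the array shapes are assumptions made for the example:

```python
import numpy as np

def discriminator_loss(p_labeled, y_true, p_unlabeled_fake, p_fake_fake, eps=1e-8):
    """Semi-supervised discriminator loss: N real classes plus an
    extra class N+1 for "fake".
    p_labeled:        (V, N+1) per-voxel class probabilities for labeled voxels
    y_true:           (V,)    ground-truth class index of each labeled voxel
    p_unlabeled_fake: (V,)    probability that an unlabeled voxel is fake
    p_fake_fake:      (V,)    probability that a generated voxel is fake
    """
    v = np.arange(len(y_true))
    l_labeled = -np.mean(np.log(p_labeled[v, y_true] + eps))
    l_unlabeled = -np.mean(np.log(1.0 - p_unlabeled_fake + eps))  # real data: not fake
    l_fake = -np.mean(np.log(p_fake_fake + eps))                  # pseudo data: fake
    return l_labeled + l_unlabeled + l_fake
```

A confident, correct discriminator drives all three terms toward zero; the generator's feature-matching loss pulls in the opposite direction until the two reach dynamic balance.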
The loss function computes, in each training iteration, the difference between the network's forward-pass result and the ground truth, guiding the next training step in the correct direction and updating the parameters of the neural network during training; after training, the network has learned the distribution of the data images and has the ability to extract liver regions from them.
2. Tumor segmentation
After liver region extraction is completed, the liver tumor is segmented. To make better use of multi-scale information and improve the segmentation performance of the network model, a three-channel parallel liver tumor segmentation network (MP V-Nets) is designed based on the S-PPM V-Net; its structure is shown in fig. 6. The multi-scale segmentation network contains 3 similar segmentation branches, which differ only in the scale of their inputs and in the sampling operations performed when the final segmentation results are merged. Taking the first branch in fig. 6 as an example, its input is the original input data with each of its three dimensions reduced to 0.5 times; after the corresponding segmentation result is output, the result is up-sampled by a factor of 2 and fused with the results of the other two branches, and finally the segmentation map of the liver tumor is output.
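The three-branch forward pass can be sketched as follows; the branch networks are passed in as callables, nearest-neighbor resampling stands in for the network's sampling layers, and averaging is used as a simple stand-in for the fusion step, which the text does not specify in detail:

```python
import numpy as np

def resize_nn(vol, out_shape):
    """Nearest-neighbor resampling of a 3D volume to a fixed shape."""
    idx = [np.round(np.linspace(0, s - 1, o)).astype(int)
           for s, o in zip(vol.shape, out_shape)]
    return vol[np.ix_(*idx)]

def mp_vnets_forward(roi, branch_nets, scales=(0.5, 1.0, 2.0)):
    """Run three parallel branches at 0.5x / 1x / 2x input scale,
    resample each probability map back to the input size (e.g. 2x
    upsampling for the 0.5x branch), and fuse the three maps."""
    probs = []
    for net, s in zip(branch_nets, scales):
        shp = tuple(max(1, int(round(d * s))) for d in roi.shape)
        out = net(resize_nn(roi, shp))           # per-voxel tumor probability
        probs.append(resize_nn(out, roi.shape))  # back to the input size
    return np.mean(probs, axis=0)                # fusion by averaging (assumed)
```

Each `net` here would be one S-PPM V-Net branch in the real model; any callable mapping a volume to a same-shaped probability map fits the sketch.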
The loss function L_MPV of the MP V-Nets network is defined as the sum of the three parallel branch loss functions, each of which is a cross-entropy loss. Let x_ij denote the input data of the i-th branch, N the total number of voxels in the data, x_G,ij the ground-truth label of the j-th voxel, and p(x_ij) the predicted probability vector of that voxel. The loss function is computed as:
L_MPV = L_0.5 + L_1 + L_2 (4)
L_i = −(1/N) Σ_{j=1..N} x_G,ij log p(x_ij) (5)
The loss function L_MPV computes the difference between the tumor segmentation network's predictions and the real labels; the parameters of each network layer are updated during back-propagation, and after many iterations the network gains the ability to distinguish whether a given voxel belongs to the foreground or the background, segmenting the tumor accurately from the image.
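Under these definitions, the total loss of Eq. (4) is just the sum of three per-branch cross-entropy terms. A numpy sketch, using the binary (two-term) form of cross-entropy as an illustrative reading:

```python
import numpy as np

def branch_cross_entropy(p, y, eps=1e-8):
    """Per-branch loss: mean voxel-wise binary cross-entropy between the
    predicted foreground probability p and the 0/1 label y."""
    return float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))

def l_mpv(branch_preds, branch_labels):
    """Total loss: the sum of the three parallel branch losses."""
    return sum(branch_cross_entropy(p, y)
               for p, y in zip(branch_preds, branch_labels))
```

Since each branch sees the labels resampled to its own scale, `branch_labels` would hold three differently sized arrays in practice; the sketch only assumes each matches its prediction's shape.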
Example 3
The schemes of Examples 1 and 2 are further described below in conjunction with fig. 7 and Table 1:
the algorithm experiment code in the embodiment of the invention uses Python programming language, the algorithm frame is built in an efficient and flexible Pytrch deep learning frame, and because the scheme of the embodiment of the invention is based on a deep neural network, a GPU (graphics processing unit) processor needs to be configured in the experiment link, and the complex neural network calculation is completed by matching with a Cuda acceleration operation platform promoted by NVIDIA company and a GPU database cuDNN for the deep neural network.
Since the FRGAN liver segmentation network uses a semi-supervised training mode, the 100 preprocessed training-set volumes are randomly divided into 40 and 60 volumes, used respectively as the labeled and unlabeled data for training the discriminator. The Adam optimization algorithm is used during training; the initial learning rate of the generator is set to 1×10^-3, the initial learning rate of the discriminator to 1×10^-5, the batch size to 2, and the maximum number of training iterations to 20000. The segmented liver image region is used as the training data for the tumor segmentation stage; at the same time, the liver in the label is set to 0 and the tumor to 1.
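The 40/60 split and the label remapping for the tumor stage can be sketched as follows; the input label convention 0 = background, 1 = liver, 2 = tumor is an assumption inferred from context, not stated explicitly in the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

# Randomly split the 100 preprocessed training volumes into 40 labeled
# and 60 unlabeled subsets for semi-supervised discriminator training.
indices = rng.permutation(100)
labeled_idx, unlabeled_idx = indices[:40], indices[40:]

def remap_for_tumor_stage(label_vol):
    # Assumed input labels: 0 = background, 1 = liver, 2 = tumor.
    # For the tumor stage, liver becomes background (0) and tumor becomes 1.
    return (label_vol == 2).astype(np.uint8)

lab = np.array([[0, 1, 2], [1, 2, 0]])
tumor_lab = remap_for_tumor_stage(lab)
```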
The three-channel parallel tumor segmentation model MP V-Nets is then trained. The Adam optimization algorithm is used during training, with the batch size set to 2, the initial learning rate to 1×10^-3, the gradient coefficient to 0.9, the squared-gradient coefficient to 0.999, and the weight decay coefficient to 1×10^-8. The maximum number of training epochs is set to 200; training is stopped when obvious overfitting occurs, and all weight parameters of the network are saved.
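For reference, one Adam update with the hyper-parameters listed above (learning rate 1×10^-3, gradient coefficient β1 = 0.9, squared-gradient coefficient β2 = 0.999, weight decay 1×10^-8) looks as follows; this is a generic textbook Adam step in NumPy, not code from the patent:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              weight_decay=1e-8, eps=1e-8):
    # L2-style weight decay folded into the gradient (as in classic Adam).
    grad = grad + weight_decay * w
    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction for the early steps (t counts from 1).
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Parameter update.
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

w = np.array([1.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
w, m, v = adam_step(w, np.array([0.5]), m, v, t=1)
```

On the very first step the bias-corrected update is approximately lr times the sign of the gradient, so the parameter moves by about 1×10^-3.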
The LiTS2017 dataset is used, containing 131 samples in total: 100 for the training set and 31 for the test set. The evaluation indices are DICE (Dice coefficient), ACC (accuracy), VOE (volumetric overlap error), RVD (relative volume difference), SP (specificity), and SE (sensitivity). The Dice coefficient is a similarity measure commonly used in medical image segmentation; the larger its value, the closer the output result is to the ground truth.
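The six evaluation indices can be computed from binary masks using their standard definitions, sketched here in NumPy (not code from the patent):

```python
import numpy as np

def segmentation_metrics(pred, gt):
    # pred, gt: binary masks (1 = tumor voxel, 0 = background).
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)
    tn = np.sum(~pred & ~gt)
    fp = np.sum(pred & ~gt)
    fn = np.sum(~pred & gt)
    dice = 2 * tp / (2 * tp + fp + fn)        # Dice coefficient
    acc = (tp + tn) / (tp + tn + fp + fn)     # accuracy
    voe = 1 - tp / (tp + fp + fn)             # volumetric overlap error
    rvd = (pred.sum() - gt.sum()) / gt.sum()  # relative volume difference
    sp = tn / (tn + fp)                       # specificity
    se = tp / (tp + fn)                       # sensitivity
    return dict(DICE=dice, ACC=acc, VOE=voe, RVD=rvd, SP=sp, SE=se)

gt = np.array([0, 0, 1, 1, 1, 0])
m = segmentation_metrics(gt, gt)  # perfect prediction
```

A perfect prediction gives DICE, ACC, SP, and SE of 1 and VOE and RVD of 0, which is why a larger Dice (and smaller VOE/RVD) indicates a result closer to the ground truth.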
Five different advanced 3D segmentation algorithms are selected for comparison experiments; the test results of each method are shown in Table 1. The five models are DenseVoxNet, Rubik's cube++, 3D U-Net, V-Net, and Vox2Vox, corresponding to [a] to [e] in the table.
TABLE 1 liver tumor segmentation index comparison
[Table 1 — comparison of liver tumor segmentation indices (DICE, ACC, VOE, RVD, SP, SE) for MP V-Nets and the five comparison models; the table is rendered as an image in the original document]
As can be seen from the above table:
(1) Considering all indices comprehensively, the three-channel parallel liver tumor segmentation model MP V-Nets proposed in the embodiment of the invention performs better and is generally superior to the other methods, which proves that the coarse-to-fine segmentation framework is an excellent choice for accurate liver tumor segmentation.
(2) Compared with Rubik's cube++, MP V-Nets is slightly inferior in relative volume difference but performs better on the DICE coefficient, ACC, VOE, and SE, with an improvement of 3.79% on the most important index, DICE, indicating that the model's segmentation results are closer to the ground truth.
(3) Compared with V-Net, after the spatial shape-aware module and the pyramid pooling module are integrated, the DICE index improves and VOE and RVD decrease markedly. This shows that the spatial shape-aware module and the improved pyramid pooling module activate more spatial information in the feature maps, fully mine the high-level semantic information of the image, and retain more image edge information, proving the accuracy and superiority of the scheme's liver tumor segmentation.
To observe and compare the performance of the models more intuitively, their segmentation effects on several hard-to-segment cases are compared and the segmentation results of the different models are visualized, as shown in fig. 7. Five specific cases of high segmentation difficulty from the experimental tests are selected for display. It can clearly be seen that, compared with the liver, tumors are widely distributed, numerous, and varied in shape and size, which increases the segmentation difficulty. The first four columns in the figure are the original data, the liver region, the gold standard, and the segmentation result of this device; the next five columns are the segmentation results of the five comparison models. The comparison shows that when tumors are small and densely distributed, the other models easily merge several small tumors into a single region, whereas this model accurately excludes the irrelevant areas between tumors; when the grey-level contrast between a tumor and the surrounding tissue is small, this model obtains a tumor boundary very close to the gold standard, while the other models struggle to segment the tumor edge accurately, resulting in under-segmentation.
In the embodiment of the present invention, except where a device model is specifically described, the models of the other devices are not limited, as long as a device can perform the above functions.
Those skilled in the art will appreciate that the drawings are only schematic illustrations of preferred embodiments, and the above-mentioned serial numbers of the embodiments of the present invention are only for description and do not represent the merits of the embodiments.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (6)

1. An apparatus for segmenting medical images based on a generative adversarial network and multi-scale feature fusion, the apparatus comprising: a coarse-to-fine liver tumor segmentation framework, wherein
the input to the segmentation framework is an original 3D CT image, which is preprocessed to obtain a standardized image; the liver is segmented using the discriminator of a trained feature-map-synthesis-based generative adversarial network, and a prediction probability map is output;
the value of each pixel in the probability map represents the probability that the pixel belongs to the liver, and the generative adversarial network learns more information through adversarial learning between the generator and the discriminator;
automatic extraction of the liver ROI: the liver 3D segmentation result is point-wise multiplied with the standardized image to mask out other, irrelevant organs; the minimum bounding cuboid of the liver region is computed and cropped; and livers of different sizes are resampled to the same size;
the liver ROI is taken as input, and a trained three-channel cascade network based on an improved V-Net is used to fuse multi-scale features, enlarge the receptive field, and handle the differences in position, shape, and size of the target regions and the blurred boundaries of lesion regions in different data, finally yielding the tumor segmentation result.
2. The medical image segmentation apparatus based on a generative adversarial network and multi-scale feature fusion as claimed in claim 1, wherein
the discriminator is a V-Net network integrated with a pyramid pooling module, and the generator is a neural network based on feature-map synthesis; the generator is trained using the feature maps output by the spatial pyramid pooling and learns the distribution of CT images from unlabeled data to generate pseudo image data.
3. The medical image segmentation apparatus based on a generative adversarial network and multi-scale feature fusion according to claim 1 or 2, wherein
the discriminator performs semi-supervised learning using the divided labeled data, the unlabeled data, and the pseudo data generated by the generator; the discriminator network and the generator network learn adversarially against each other until the discriminator and the generator reach a dynamic balance, at which point training is finished.
4. The medical image segmentation apparatus based on a generative adversarial network and multi-scale feature fusion as claimed in claim 2, wherein the V-Net network based on the pyramid pooling module is as follows:
an improved pyramid pooling module, together with a spatial shape-aware module, is attached in the skip connections of the V-Net network to capture long-distance dependencies between different lesion areas in the image and to capture local context dependencies;
for the input tensor, three mutually perpendicular slab-shaped kernels are used for processing to obtain three outputs, which are expanded to the same size as the input tensor and fused to obtain a new feature vector.
5. The medical image segmentation apparatus based on a generative adversarial network and multi-scale feature fusion as claimed in claim 4, wherein the fusion process is a dot-product operation followed by a Softmax activation function, and the fused tensor and the original input tensor undergo the same fusion operation to obtain the final output tensor;
the input feature maps are pooled to three scales, 1×1×1, 2×2×2, and 3×3×3, using three-dimensional pooling layers of three scales; the number of channels of the three pooled results is reduced by convolution; the results are then up-sampled to the size of the original feature map and fused with the original feature map and the output of the spatial shape-aware module; and the number of channels is reduced by convolution again to obtain a feature map containing multi-scale information.
6. The medical image segmentation apparatus based on a generative adversarial network and multi-scale feature fusion as claimed in claim 4, wherein the three-channel cascade network based on the improved V-Net is a multi-scale segmentation network, in which
the input of the first branch is the original input data with each of its three dimensions reduced to 0.5 times; after the corresponding segmentation result is output, it is up-sampled by a factor of 2 and fused with the results of the other two branches to output the segmentation map of the liver tumor.
CN202210361443.8A 2022-04-07 2022-04-07 Medical image segmentation device based on generation of confrontation network and multi-scale feature fusion Pending CN114693933A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210361443.8A CN114693933A (en) 2022-04-07 2022-04-07 Medical image segmentation device based on generation of confrontation network and multi-scale feature fusion

Publications (1)

Publication Number Publication Date
CN114693933A true CN114693933A (en) 2022-07-01

Family

ID=82142910

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210361443.8A Pending CN114693933A (en) 2022-04-07 2022-04-07 Medical image segmentation device based on generation of confrontation network and multi-scale feature fusion

Country Status (1)

Country Link
CN (1) CN114693933A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115393301A (en) * 2022-08-16 2022-11-25 中山大学附属第一医院 Method and device for image omics analysis of liver two-dimensional shear wave elastic image
CN115393301B (en) * 2022-08-16 2024-03-12 中山大学附属第一医院 Image histology analysis method and device for liver two-dimensional shear wave elastic image
CN115359881A (en) * 2022-10-19 2022-11-18 成都理工大学 Nasopharyngeal carcinoma tumor automatic delineation method based on deep learning
CN115953418A (en) * 2023-02-01 2023-04-11 公安部第一研究所 Method, storage medium and equipment for stripping notebook region in security check CT three-dimensional image
CN115953418B (en) * 2023-02-01 2023-11-07 公安部第一研究所 Notebook area stripping method, storage medium and device in security inspection CT three-dimensional image
CN116206109A (en) * 2023-02-21 2023-06-02 桂林电子科技大学 Liver tumor segmentation method based on cascade network
CN116206109B (en) * 2023-02-21 2023-11-07 桂林电子科技大学 Liver tumor segmentation method based on cascade network
CN116580133A (en) * 2023-07-14 2023-08-11 北京大学 Image synthesis method, device, electronic equipment and storage medium
CN116580133B (en) * 2023-07-14 2023-09-22 北京大学 Image synthesis method, device, electronic equipment and storage medium
CN117636076A (en) * 2024-01-25 2024-03-01 北京航空航天大学 Prostate MRI image classification method based on deep learning image model
CN117636076B (en) * 2024-01-25 2024-04-12 北京航空航天大学 Prostate MRI image classification method based on deep learning image model


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination