CN112419327B - Image segmentation method, system and device based on a generative adversarial network - Google Patents
Image segmentation method, system and device based on a generative adversarial network
- Publication number
- CN112419327B (application CN202011438792.2A)
- Authority
- CN
- China
- Prior art keywords
- network
- segmentation
- map
- label
- probability map
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30096—Tumor; Lesion
Abstract
The invention discloses an image segmentation method, system and device based on a generative adversarial network. The semantic segmentation model consists of a segmentation network S and a generative adversarial model. The segmentation network S predicts a label probability map S(x) for each pixel of the input data x; the generator G generates a label probability map G(z) from noise z; the discriminator D separates the false label probability maps from the true label probability map y by predicting a pixel-level confidence map p. Through the game between the generator and the discriminator, the algorithm synthesizes labeled three-dimensional medical image data and can alleviate the shortage of labeled medical image data. The generated data does not involve user privacy, which facilitates the sharing of medical data. In the SEG-GAN segmentation model, the discriminator distinguishes the label maps produced from the medical image from the real labels, thereby obtaining the segmentation of the medical image; unlabeled medical images are used to assist model training, which effectively improves the segmentation effect of the model.
Description
Technical Field
The present invention relates to the field of image processing technologies, and in particular to an image segmentation method, system, and apparatus based on a generative adversarial network.
Background
Labeled medical image data is severely scarce. Because labeling medical images requires considerable medical expertise, annotating a complete medical image segmentation dataset demands significant time and expense.
GANs are receiving increasing attention from the computer vision and medical communities and have been used in many fields. Through a minimax two-player game, the generator in a GAN learns to mimic the real data distribution under the guidance of the discriminator, enabling applications such as image translation, image synthesis, data augmentation, and image completion. Although GANs have succeeded on many problems, instability during training is their most critical drawback, and it is exposed even more clearly when synthesizing high-resolution images or three-dimensional voxel data.
In data augmentation for segmentation tasks, most existing work, limited by the cost of hardware and training, treats the synthesis of three-dimensional voxel data (e.g., MR images) as a sequence of two-dimensional slices along the z-axis. However, this approach can produce discontinuities along the z-axis of the synthesized volumes, which is detrimental to a three-dimensional segmentation network trained on such data. Moreover, whether correlated or uncorrelated augmentation is used, the transformations are imposed subjectively by the user, who must predefine transformation rules for the original data. The simplest and most objective form of data augmentation is to obtain more data from the real data distribution. However, this is impractical because acquiring data with medical devices is costly: taking magnetic resonance imaging as an example, a GE 1.5T HDXT scanner is priced between $150,000 and $250,000, and a single scan costs at least 500 RMB. Although hospitals accumulate a large amount of data in clinical practice, such data cannot be exploited because of patient privacy protection.
In view of the foregoing, a medical image segmentation method and apparatus based on a generative adversarial network (SEG-GAN) is particularly needed to address the deficiencies of the prior art.
Disclosure of Invention
Aiming at the heavy workload and high cost of annotating brain tumor medical image data, and at the low segmentation accuracy of existing supervised methods for segmenting medical images, the invention provides an image segmentation method, system and device based on a generative adversarial network.
In order to solve the technical problems existing in the prior art, the technical scheme of the invention comprises the following steps:
An image segmentation method based on a generative adversarial network comprises the following steps: S1, using a discriminator network pre-trained with labeled data, predict a confidence map as a supervision signal and guide the cross-entropy loss in a self-learning manner. The confidence map indicates which regions of the predicted distribution are close to the true label map distribution, so that these predictions can be used to train the segmentation network while the cross-entropy loss of the other, untrusted regions is masked out.
S2, as in the supervised setting, apply an adversarial loss on the unlabeled data, which encourages the model's predictions for unlabeled data to be close to the true label map distribution.
S3, the segmentation network S adopts the generator structure of 3D-MedGAN. Given an input MR image x of size H×W×D×1, the segmentation network outputs a semantic label probability map S(x) of size H×W×D×C, where C is the number of semantic classes.
S4, the generator network G is responsible for generating, from a random vector z of fixed dimension, a semantic label probability map G(z) of the same size H×W×D×C as the input image.
S5, the discriminator network D depends on the segmentation network and the generator network: it takes the probability map S(x) predicted by the segmentation network, the probability map G(z) generated by the generator network, and the one-hot map of the true label map y as inputs, and outputs a confidence map p of size H×W×D. Each pixel on the confidence map p indicates whether the label of the input image x at the corresponding position is a sample from the true label map y (p = 1) or from a false label map (p = 0), the latter including S(x) and G(z).
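The tensor interfaces implied by S3 to S5 can be summarized in a minimal sketch. The module bodies below are placeholders (a single convolution or linear projection), not the 3D-MedGAN architectures of the invention, and all class and parameter names are illustrative:

```python
import torch
import torch.nn as nn

class SegmentationNet(nn.Module):
    """S: (N, 1, H, W, D) MR image -> (N, C, H, W, D) label probability map."""
    def __init__(self, num_classes):
        super().__init__()
        self.body = nn.Conv3d(1, num_classes, kernel_size=3, padding=1)  # stand-in for the 3D-MedGAN generator
    def forward(self, x):
        return torch.softmax(self.body(x), dim=1)           # S(x)

class Generator(nn.Module):
    """G: (N, z_dim) noise vector -> (N, C, H, W, D) label probability map."""
    def __init__(self, z_dim, num_classes, out_size):
        super().__init__()
        h, w, d = out_size
        self.proj = nn.Linear(z_dim, num_classes * h * w * d)  # stand-in for the real upsampling stack
        self.shape = (num_classes, h, w, d)
    def forward(self, z):
        out = self.proj(z).view(z.size(0), *self.shape)
        return torch.softmax(out, dim=1)                     # G(z)

class Discriminator(nn.Module):
    """D: (N, C, H, W, D) label map -> (N, 1, H, W, D) pixel-level confidence map p."""
    def __init__(self, num_classes):
        super().__init__()
        self.body = nn.Conv3d(num_classes, 1, kernel_size=3, padding=1)
    def forward(self, y):
        return torch.sigmoid(self.body(y))
```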
Preferably, the method uses a generative adversarial model having two "generators" and one discriminator. The two "generators" are the segmentation network S, which predicts a label probability map for the input MR image, and the generator G, which converts random noise into a label probability map. The label maps predicted by the segmentation network S and the generator G are used as false samples, and the labels of the labeled data are used as true samples, so that the discriminator is trained to separate true labels from false labels. Once the discriminator has this capability, an indicator matrix can be derived from its predictions. The indicator matrix keeps the relatively reliable predictions of the segmentation network S on unlabeled samples as supervision signals for self-training. The better the discriminator performs, the more useful the retained supervision signals are, and the better the finally trained model segments brain tumors.
Preferably, the input data x on which the segmentation network S operates comprises labeled data x_l and unlabeled data x_u. Each labeled datum x_l has a corresponding label map y_l; the label map y_l of size H×W×D is one-hot encoded into a probability map y_l with C channels, in which, for each position, the channel representing the label's class is set to 1 and the same position on all other channels is set to 0. During training, when the labeled data x_l is used, the segmentation network S is updated under the guidance of the standard cross-entropy loss between the label probability map S(x_l) and the true label map y_l, together with the adversarial loss L_adv obtained when the discriminator network D discriminates y_pred. For unlabeled data, the segmentation network is trained with a self-supervised learning method: after the segmentation network S predicts the label map S(x_u) of an unlabeled image x_u, the confidence of each position of the label probability map S(x_u) is evaluated to obtain a confidence map p; the confidence map indicates the quality of the predicted segmentation regions, i.e., which results of the segmentation network during training can be trusted. Then the high-confidence regions are retained by thresholding the confidence map p; within these regions, the channel with the highest probability among all channels of the predicted label probability map S(x_u) is taken as the label of the region to obtain a pseudo label map, which is one-hot encoded to obtain y_u. The segmentation network is then trained with the cross-entropy loss between the prediction S(x_u) in the high-confidence regions and y_u, together with the adversarial loss L_adv from the discriminator network D. The generator network generates a false label probability map G(z) from random noise z and obtains y_g by taking, at each position, the channel with the largest value as the label; at the same time the discriminator is used to calculate the generator loss L_G. The discriminator network D is guided by distinguishing the label probability map y_l of the real labeled data from the false label probability map y_pred predicted by the segmentation network and y_u, and by distinguishing the true label probability map y_l from the false label probability map y_g generated by the generator network, which leads to minimizing the discriminator loss L_D. Since the training process optimizes the segmentation network S and the discriminator network D in turn, the adversarial loss L_adv and the discriminator loss L_D are never used at the same time.
Preferably, the input data x comprises labeled data x_l and unlabeled data x_u. Each labeled datum x_l has a corresponding label map y_l; the label map y_l of size H×W×D is one-hot encoded into a probability map y_l with C channels, in which, for each position, the channel representing the label's class is set to 1 and the same position on all other channels is set to 0. For ease of expression, the label probability map obtained by one-hot encoding a label map is denoted hereinafter by a single symbol.
Preferably, all data is used during the training process. When the labeled data x_l is used, the segmentation network S is updated under the guidance of the standard cross-entropy loss between the label probability map S(x_l) and the true label map y_l, and of the adversarial loss obtained when the discriminator network D discriminates y_pred.
Preferably, the proposed self-supervised learning method is used to train the segmentation network on unlabeled data. After the segmentation network S predicts the label map S(x_u) of an unlabeled image x_u, the confidence of each position of the label probability map S(x_u) is evaluated to obtain a confidence map p. The confidence map indicates the quality of the predicted segmentation regions, so that the trustworthy results of the segmentation network during training can be identified.
Preferably, the high-confidence regions are then preserved by thresholding the confidence map p; within these regions, the channel with the highest probability among all channels of the predicted label probability map S(x_u) is taken as the label of the region to obtain a pseudo label map, which is one-hot encoded to obtain y_u.
Preferably, the cross-entropy loss between the segmentation network's predicted label probability map S(x_u) and y_u, together with the adversarial loss from the discriminator network, is used to train the segmentation network. Whether the segmentation network and the discriminator network are trained with labeled data x_l or with unlabeled data x_u, the generator network generates a false label probability map G(z) from random noise z, obtains y_g by taking, at each position, the channel with the largest value as the label, and at the same time the discriminator is used to calculate the generator loss L_G.
Preferably, the discriminator network D is guided by distinguishing the label probability map y_l of the real labeled data from the false label probability map y_pred predicted by the segmentation network and y_u, and by distinguishing the true label probability map y_l from the false label probability map y_g generated by the generator network, which yields the discriminator loss L_D. Since the training process optimizes the segmentation network S and the discriminator network D in turn, the adversarial loss L_adv and the discriminator loss L_D are never used at the same time.
The loss L_D that the discriminator network seeks to minimize is:

$$\mathcal{L}_D = -\sum_{n}\sum_{h,w,d}\Big[(1-Y_n)\log\big(1-D(\bar{y}_n)_{(h,w,d)}\big) + Y_n\log D(y_{l,n})_{(h,w,d)}\Big]$$

where Y_n indicates whether the label at position (h, w, d) is real and \bar{y}_n denotes a false label probability map. The data fed into the discriminator network are unified into the one-hot format: every label probability map is converted into H×W×D×C data by taking, at each position, the channel with the highest probability as the label and then one-hot encoding it, and the loss L_D is accordingly computed on these one-hot maps in place of the raw probability maps.
As the discriminator is continually optimized, its loss L_D keeps decreasing, so the generator must also be optimized with its own loss L_G in order to generate a label probability map y_g that is good enough to fool the discriminator and drive L_D back up. L_G is:

$$\mathcal{L}_G = -\sum_{n}\sum_{h,w,d}\log D\big(\hat{y}_{g,n}\big)_{(h,w,d)}$$

where ŷ_g denotes the one-hot encoding of G(z). Both L_D and L_G use the loss function of the standard GAN.
Preferably, for the segmentation network S, the gap between the predicted tag probability map and the true tag samples needs to be reduced, this distance for the tagged samples x l Is a predictive probability map S (x l ) One-hot coding diagram y of each pixel point and true mark l Accumulation of cross entropy for each pixel point, for unlabeled exemplar x u Then it is the predictive probability map S (x u ) Pixel point on the image and reserved pseudo-marked one-hot coding diagram y u Accumulation of cross entropy for pixel points on the display. Thus dividing lossThe method comprises the following steps:
Preferably, when training with labeled data x_l, the indicator matrix I is set to a matrix whose elements are all 1, meaning that the cross entropy computed at every point between S(x_l) and y_l is used as a supervision signal to train the segmentation network S. When training with unlabeled data x_u, the confidence map p is binarized with a threshold T, and the entry of the indicator matrix at position (h, w, d), extended along the c-axis, is set to 0 or 1 accordingly. The indicator matrix I is therefore:

$$I_{(h,w,d)} = \begin{cases} 1, & x \in x_l \\ \mathbb{1}\big[\,p_{(h,w,d)} > T\,\big], & x \in x_u \end{cases}$$
Whether the segmentation network S is trained with labeled data x_l or with unlabeled data x_u, a fully convolutional discriminator D is applied to the segmentation network's predictions y_pred and y_u to compute the adversarial loss L_adv, which guides the segmentation network to produce results closer to the distribution of the true label probability map y_l. The adversarial loss L_adv is:

$$\mathcal{L}_{adv} = -\sum_n \sum_{h,w,d} \log D\big(\hat{y}_n\big)_{(h,w,d)}$$

where ŷ_n is the segmentation network's one-hot encoded prediction.
The overall objective for training the segmentation network S is the weighted combination of the segmentation loss L_seg and the adversarial loss L_adv defined above.
Another embodiment of the present invention provides an image segmentation system based on a generative adversarial network (SEG-GAN), the system comprising:
The data set acquisition module is used for acquiring a target image set, a reference image set and a pre-annotated reference label set corresponding to the reference image set; the target image set comprises a target image training set and a target image test set.
The network construction module is used for constructing a segmentation network and a discrimination network, wherein the first target loss function of the segmentation network comprises the cross-entropy loss of the target image set and the reference label set, the adversarial loss of the target image set, and the semi-supervised loss between the target image set and the reference image set.
The training module is used for inputting the target image training set and the reference image set into the segmentation network, correspondingly obtaining a target probability score graph and a reference probability score graph, and inputting the target probability score graph and the reference probability score graph into the discrimination network so as to perform joint training of the segmentation network and the discrimination network.
The judging module is used for ending training when the first target loss function of the segmentation network and the second target loss function of the discrimination network have converged.
The test module is used for inputting the target image test set into the trained segmentation network to obtain a target segmentation image.
A further embodiment of the invention correspondingly provides a device using the image segmentation method based on the generative adversarial network SEG-GAN, the device comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor; when executing the computer program, the processor implements any one of the above image segmentation methods based on the generative adversarial network SEG-GAN.
The semantic segmentation model consists of a segmentation network S and a basic generative adversarial model (G and D). The segmentation network S predicts a label probability map S(x) for each pixel of the input data x. The generator G generates a label probability map G(z) from the noise z. The discriminator D attempts to separate the false label probability maps (i.e., S(x) and G(z)) from the true label probability map y by predicting a pixel-level confidence map p.
Compared with the prior art, the image segmentation method, system and device based on the generative adversarial network SEG-GAN can alleviate the lack of labeled medical image data. The generated data does not involve user privacy, which facilitates the sharing of medical data. The 3D-MedGAN model is applied to semi-supervised medical image segmentation, and an SEG-GAN segmentation model is provided in which the discriminator distinguishes the labels produced from the medical image from the real labels, thereby obtaining the segmentation of the medical image; unlabeled medical images are used to assist model training, which effectively improves the segmentation effect of the model and greatly reduces manual annotation cost.
Drawings
The invention is described in detail below with reference to the attached drawing figures and the detailed description:
FIG. 1 is a schematic flow chart of the present invention;
FIG. 2 is a schematic diagram of a network architecture of a discriminator of the invention;
FIG. 3 is a schematic diagram of the generative adversarial model for semi-supervised learning of the present invention;
FIG. 4 is a schematic diagram of the prediction results of the present invention on a sample.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, a flowchart of an image segmentation method based on a generative adversarial network according to an embodiment of the present invention is shown, where the method includes steps S1 to S5 as follows.
S1, using a discriminator network pre-trained with labeled data, predict a confidence map as a supervision signal and guide the cross-entropy loss in a self-learning manner. The confidence map indicates which regions of the predicted distribution are close to the true label map distribution, so that these predictions can be used to train the segmentation network while the cross-entropy loss of the other, untrusted regions is masked out.
S2, as in the supervised setting, apply an adversarial loss on the unlabeled data, which encourages the model's predictions for unlabeled data to be close to the true label map distribution.
S3, the segmentation network S adopts the generator structure of 3D-MedGAN. Given an input MR image x of size H×W×D×1, the segmentation network outputs a semantic label probability map S(x) of size H×W×D×C, where C is the number of semantic classes.
S4, the generator network G is responsible for generating, from a random vector z of fixed dimension, a semantic label probability map G(z) of the same size H×W×D×C as the input image.
S5, the discriminator network D depends on the segmentation network and the generator network: it takes the probability map S(x) predicted by the segmentation network, the probability map G(z) generated by the generator network, and the one-hot map of the true label map y as inputs, and outputs a confidence map p of size H×W×D. Each pixel on the confidence map p indicates whether the label of the input image x at the corresponding position is a sample from the true label map y (p = 1) or from a false label map (p = 0), the latter including S(x) and G(z).
The method uses a generative adversarial model that has two "generators" and one discriminator. The two "generators" are the segmentation network S, which predicts a label probability map for the input MR image, and the generator G, which converts random noise into a label probability map. The label maps predicted by the segmentation network S and the generator G are used as false samples, and the labels of the labeled data are used as true samples, so that the discriminator is trained to separate true labels from false labels. Once the discriminator has this capability, an indicator matrix can be derived from its predictions. The indicator matrix keeps the relatively reliable predictions of the segmentation network S on unlabeled samples as supervision signals for self-training. The better the discriminator performs, the more useful the retained supervision signals are, and the better the finally trained model segments brain tumors.
Referring to fig. 2, which shows the discriminator network structure in the generative adversarial network of the invention. The input data is divided into two groups, fake and real, with the same number of samples in each. In the fake group, images generated by the current generator are randomly replaced by samples previously cached in the image pool; the samples in the real group are real MR images. In its first two layers the discriminator reduces the spatial size of the input to one quarter with 3×3 convolutions of stride 2, while the number of channels becomes four times the original; it then outputs a three-dimensional probability map of size w×h×d through a 3×3 convolution of stride 1 followed by a 1×1 convolution of stride 1.
The discriminator network is extended to three dimensions so that it is suitable for three-dimensional input data. The last layer of the discriminator D outputs a probability map of size w×h×d, and the value at each position of the probability map represents the probability that the corresponding image block is real.
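A possible PyTorch rendering of the layer description above is sketched below. The channel widths (2× then 4× the input channels), the LeakyReLU activations, and the trilinear upsampling that brings the score map back to the stated w×h×d size are assumptions, since the text does not specify them:

```python
import torch.nn as nn
import torch.nn.functional as F

class Discriminator3D(nn.Module):
    """Maps a 3D input volume to a w×h×d score map, one score per position."""
    def __init__(self, in_channels):
        super().__init__()
        self.conv1 = nn.Conv3d(in_channels, in_channels * 2, kernel_size=3, stride=2, padding=1)
        self.conv2 = nn.Conv3d(in_channels * 2, in_channels * 4, kernel_size=3, stride=2, padding=1)  # size /4, channels x4
        self.conv3 = nn.Conv3d(in_channels * 4, in_channels * 4, kernel_size=3, stride=1, padding=1)
        self.conv4 = nn.Conv3d(in_channels * 4, 1, kernel_size=1, stride=1)   # per-voxel real/fake score
    def forward(self, x):
        h = F.leaky_relu(self.conv1(x), 0.2)
        h = F.leaky_relu(self.conv2(h), 0.2)
        h = F.leaky_relu(self.conv3(h), 0.2)
        h = self.conv4(h)
        # resize the score map to the stated w×h×d output resolution (assumed interpolation step)
        return F.interpolate(h, size=x.shape[2:], mode='trilinear', align_corners=False)
```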
An image cache pool is introduced to improve the stability of adversarial training. The image cache pool stores samples generated by earlier states of the generator. During training, some samples in the image cache pool are exchanged with the data currently synthesized by the generator before being passed to the discriminator for real/fake judgment. This approach stabilizes the training of the discriminator and prevents it from "forgetting" previously learned knowledge. Specifically, four pairs of data are cached in the image cache pool, each pair consisting of an MR image x and its label y, denoted (x_i, y_i), where the subscript indicates the position in the image cache pool. At each training iteration, it is first decided randomly whether the data in the image cache pool will be exchanged with the currently generated sample (x_*, y_*). If so, an index i drawn from a uniform distribution is used to exchange the cached sample (x_i, y_i) with (x_*, y_*): (x_i, y_i) is sent to the discriminator network D for judgment, and (x_*, y_*) is cached at position i.
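A sketch of the four-slot image cache pool described above; the class and method names are illustrative:

```python
import random

class ImagePool:
    """Caches (image, label) pairs produced by earlier generator states and randomly
    swaps them with the current sample before it reaches the discriminator."""
    def __init__(self, size=4):                     # four pairs, as described above
        self.size = size
        self.buffer = []                            # list of (x, y) pairs

    def query(self, x, y):
        if len(self.buffer) < self.size:            # fill the pool first
            self.buffer.append((x, y))
            return x, y
        if random.random() < 0.5:                   # randomly decide whether to swap
            i = random.randint(0, self.size - 1)    # index drawn from a uniform distribution
            cached_x, cached_y = self.buffer[i]
            self.buffer[i] = (x, y)                 # cache the current sample at position i
            return cached_x, cached_y               # hand the cached sample to the discriminator
        return x, y
```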
The least-squares loss proposed for the least-squares generative adversarial network is used in place of the sigmoid cross-entropy loss of the standard GAN. This improvement raises the quality of the generated images and stabilizes the training process:

$$\min_D \; \tfrac{1}{2}\,\mathbb{E}_{x}\big[(D(x)-b)^2\big] + \tfrac{1}{2}\,\mathbb{E}_{z}\big[(D(G(z))-a)^2\big], \qquad \min_G \; \tfrac{1}{2}\,\mathbb{E}_{z}\big[(D(G(z))-c)^2\big]$$

where a and b are used to mark fake and real data, respectively, and c is the target matrix with which G tries to make D believe that the fake data is real. a and c are initialized to matrices of size w×h×d with all elements equal to 1, and b is initialized to a zero matrix of the same size.
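A sketch of the least-squares objectives with a, b and c kept as full w×h×d target matrices as stated above; the function names and example sizes are assumptions:

```python
import torch

def lsgan_d_loss(d_real, d_fake, a, b):
    # the discriminator pushes its output on real data toward b and on fake data toward a
    return 0.5 * ((d_real - b) ** 2).mean() + 0.5 * ((d_fake - a) ** 2).mean()

def lsgan_g_loss(d_fake, c):
    # the generator pushes the discriminator's output on fake data toward c
    return 0.5 * ((d_fake - c) ** 2).mean()

# per the text: a and c are all-ones w×h×d matrices, b is a zero matrix of the same size
w, h, d = 24, 24, 24                      # illustrative sizes
a = torch.ones(1, 1, w, h, d)
c = torch.ones(1, 1, w, h, d)
b = torch.zeros(1, 1, w, h, d)
```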
Referring to fig. 3, in SEG-GAN the generator of the 3D-MedGAN framework is used as the segmentation network. Unlike a typical generator trained to produce images from noise vectors, the segmentation network of this method outputs, for a given input image, a probability map of the semantic label of every pixel. Under this setting, and based on the consistency (smoothness) constraint assumption, the output of the segmentation network is forced to be as close as possible to the true label map in its spatial structure. To this end an adversarial learning scheme is adopted: a discriminator built from a convolutional neural network learns to distinguish real label maps from the label maps predicted by the segmentation network, while an additional generator synthesizes label maps from noise vectors and the discriminator is also required to correctly separate real label maps from these generated ones. When trained with labeled data, SEG-GAN combines the cross-entropy loss of the segmentation task with the adversarial loss, which encourages the segmentation network to produce predicted probability maps whose higher-order structure approaches the true label maps. When trained with unlabeled data, the same adversarial learning scheme is further exploited, and the unlabeled data is used through two semi-supervised loss terms.
First, SEG-GAN uses a discriminator network pre-trained with labeled data to predict a confidence map as a supervision signal and guides the cross-entropy loss in a self-learning manner. The confidence map indicates which regions of the predicted distribution are close to the true label map distribution, so that these predictions can be used to train the segmentation network while the cross-entropy loss of the other, untrusted regions is masked out. Second, as in the supervised setting, an adversarial loss is applied on the unlabeled data, which encourages the model's predictions for unlabeled data to be close to the true label map distribution.
The generative adversarial model of the method consists of three modules: the segmentation network S, the generator G and the discriminator D. The segmentation network S adopts the generator structure of 3D-MedGAN. Given an input MR image x of size H×W×D×1, the segmentation network outputs a semantic label probability map S(x) of size H×W×D×C, where C is the number of semantic classes. The generator network G is responsible for generating, from a random vector z of fixed dimension, a semantic label probability map G(z) of the same size H×W×D×C as the input image.
The input data x comprises labeled data x_l and unlabeled data x_u. Each labeled datum x_l has a corresponding label map y_l; the label map y_l of size H×W×D is one-hot encoded into a probability map y_l with C channels, in which, for each position, the channel representing the label's class is set to 1 and the same position on all other channels is set to 0. For ease of expression, the label probability map obtained by one-hot encoding a label map is denoted hereinafter by a single symbol.
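A sketch of this one-hot conversion, from a discrete H×W×D label map to a C-channel probability map; the helper names are illustrative:

```python
import torch
import torch.nn.functional as F

def label_map_to_one_hot(label_map, num_classes):
    """label_map: LongTensor (N, H, W, D) with class indices in [0, C-1].
    Returns (N, C, H, W, D): 1 on the channel of the class at each position,
    0 on every other channel."""
    one_hot = F.one_hot(label_map, num_classes)            # (N, H, W, D, C)
    return one_hot.permute(0, 4, 1, 2, 3).float()          # (N, C, H, W, D)

def prob_map_to_one_hot(prob_map):
    """Takes the channel with the highest probability at each position as the label,
    then one-hot encodes it (used before feeding label probability maps to D)."""
    return label_map_to_one_hot(prob_map.argmax(dim=1), prob_map.size(1))
```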
All data is used during the training process. When the labeled data x_l is used, the segmentation network S is updated under the guidance of the standard cross-entropy loss between the label probability map S(x_l) and the true label map y_l, and of the adversarial loss obtained when the discriminator network D discriminates y_pred.
The proposed self-supervised learning method is used to train the segmentation network on unlabeled data. After the segmentation network S predicts the label map S(x_u) of an unlabeled image x_u, the confidence of each position of the label probability map S(x_u) is evaluated to obtain a confidence map p. The confidence map indicates the quality of the predicted segmentation regions, so that the trustworthy results of the segmentation network during training can be identified.
Then the high-confidence regions are retained by thresholding the confidence map p; within these regions, the channel with the highest probability among all channels of the predicted label probability map S(x_u) is taken as the label of the region to obtain a pseudo label map, which is one-hot encoded to obtain y_u.
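A sketch of this confidence-thresholding step: keep only positions whose confidence exceeds T, take the arg-max channel of S(x_u) there as the pseudo label, and one-hot encode it. The default threshold value and the function name are assumptions:

```python
import torch
import torch.nn.functional as F

def make_pseudo_labels(prob_map, confidence_map, threshold=0.2):
    """prob_map: S(x_u), shape (N, C, H, W, D); confidence_map: p, shape (N, 1, H, W, D).
    Returns the one-hot pseudo label map y_u and the binary mask of retained positions."""
    mask = (confidence_map > threshold).float()                     # keep high-confidence regions
    pseudo = prob_map.argmax(dim=1)                                 # channel with highest probability
    y_u = F.one_hot(pseudo, prob_map.size(1)).permute(0, 4, 1, 2, 3).float()
    return y_u, mask
```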
The segmentation network is then trained with the cross-entropy loss between the predicted label probability map S(x_u) and y_u, together with the adversarial loss from the discriminator network. Whether the segmentation network and the discriminator network are trained with labeled data x_l or with unlabeled data x_u, the generator network generates a false label probability map G(z) from random noise z, obtains y_g by taking, at each position, the channel with the largest value as the label, and at the same time the discriminator is used to calculate the generator loss L_G.
The discriminator network D is guided by distinguishing the label probability map y_l of the real labeled data from the false label probability map y_pred predicted by the segmentation network and y_u, and by distinguishing the true label probability map y_l from the false label probability map y_g generated by the generator network, which yields the discriminator loss L_D. Since the training process optimizes the segmentation network S and the discriminator network D in turn, the adversarial loss L_adv and the discriminator loss L_D are never used at the same time.
The loss L_D that the discriminator network seeks to minimize can be expressed as:

$$\mathcal{L}_D = -\sum_{n}\sum_{h,w,d}\Big[(1-Y_n)\log\big(1-D(\bar{y}_n)_{(h,w,d)}\big) + Y_n\log D(y_{l,n})_{(h,w,d)}\Big]$$

where Y_n indicates whether the label at position (h, w, d) of an H×W×D label probability map is real. When Y_n = 0, i.e. fake, the probability values of all channels at that point come from the label probability map S(x) predicted by the segmentation network or from the label probability map G(z) synthesized by the generator network, denoted \bar{y}_n above; when Y_n = 1, i.e. real, the probability values of all channels at that point come from the label y_l of the labeled sample x_l. The subscript n denotes the n-th sample of the mini-batch used in training.
The data fed into the discriminator network are unified into the one-hot format: every label probability map is converted into H×W×D×C data by taking, at each position, the channel with the highest probability as the label and then one-hot encoding it. The loss L_D is accordingly computed with these one-hot maps in place of the raw probability maps.
As the discriminator is continually optimized, its loss L_D keeps decreasing; the generator must therefore also be optimized with its own loss L_G in order to generate a label probability map y_g good enough to fool the discriminator and drive L_D back up. L_G is:

$$\mathcal{L}_G = -\sum_{n}\sum_{h,w,d}\log D\big(\hat{y}_{g,n}\big)_{(h,w,d)}$$

where ŷ_g denotes the one-hot encoding of G(z).
Here the losses L_D and L_G use the loss function of the standard GAN.
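A sketch of L_D and L_G written as per-voxel binary cross-entropy on the confidence map, matching the standard-GAN form above. The tensor and function names are assumptions, and the built-in mean reduction is used where the text writes a sum:

```python
import torch
import torch.nn.functional as F

def discriminator_loss(d_fake, d_real):
    """d_fake: D's confidence map on a one-hot fake label map (y_pred, y_u or y_g);
    d_real: D's confidence map on the true one-hot label map y_l. Both must lie in (0, 1)."""
    fake_target = torch.zeros_like(d_fake)          # Y_n = 0 for fake samples
    real_target = torch.ones_like(d_real)           # Y_n = 1 for real samples
    return (F.binary_cross_entropy(d_fake, fake_target)
            + F.binary_cross_entropy(d_real, real_target))

def generator_loss(d_fake_g):
    """d_fake_g: D's confidence map on the generated one-hot label map y_g.
    G is optimized so that D scores its output as real."""
    return F.binary_cross_entropy(d_fake_g, torch.ones_like(d_fake_g))
```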
For the segmentation network S, the gap between the predicted label probability map and the true label samples must be reduced. For a labeled sample x_l, this gap is the accumulated cross entropy, over all pixels, between the predicted probability map S(x_l) and the one-hot encoded true label map y_l; for an unlabeled sample x_u, the low-confidence positions are masked out and the gap is the accumulated cross entropy, over the retained pixels, between the predicted probability map S(x_u) and the retained pseudo-label one-hot map y_u. The segmentation loss L_seg is therefore:

$$\mathcal{L}_{seg} = -\sum_n \sum_{h,w,d} \sum_{c} I_{(h,w,d)}\, \hat{y}_{n,(h,w,d,c)}\, \log S(x_n)_{(h,w,d,c)}$$

where ŷ is y_l for labeled samples and y_u for unlabeled samples, and I is the indicator matrix defined below.
indicating matrix I is trained with marked data x l Is set to a matrix with all elements 1, representing S (x l ) And y is l The cross entropy calculated at each point is used as a supervisory signal to train the segmentation network S. While training the unlabeled data x u When the position (h, w, d) is binarized by taking a threshold value T according to the confidence map p, the point indicating the position (h, w, d) in the c-axis direction of the matrix is set to 0 or 1. According to the above description, the indication matrix I is:
whether with marked data x l Or non-marking data x u Training the segmentation network S, and predicting the result y of the segmentation network by using a discriminator D of the full convolution network pred And y u Calculating the countermeasures against lossTo guide the segmentation network to optimize the segmentation network, and predict the probability graph y which is closer to the true label l And (3) a distributed result. Countering losses->The calculation formula of (2) is as follows:
The overall objective for training the segmentation network S combines these terms: the segmentation loss L_seg and the adversarial loss L_adv computed as above, plus the segmentation loss L_seg^semi and adversarial loss L_adv^semi computed on the unlabeled data x_u during semi-supervised learning, where λ_adv and λ_semi are the two weights used to minimize the proposed multi-task loss function.
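Putting the pieces together, one alternating optimization step might look like the sketch below. It reuses the helper functions from the earlier sketches, assumes D ends in a sigmoid so its confidence maps lie in (0, 1), and the λ values and the exact grouping of the semi-supervised terms are assumptions, since the text only states that λ_adv and λ_semi weight the multi-task loss; S and D are updated in turn, so L_adv and L_D are never applied at the same time:

```python
import torch

def train_step(S, G, D, opt_S, opt_D, opt_G, x_l, y_l, x_u, z,
               lambda_adv=0.01, lambda_semi=0.1, threshold=0.2):
    """One alternating update of D, S and G. y_l is the one-hot true label map.
    Uses prob_map_to_one_hot, make_pseudo_labels, masked_cross_entropy,
    discriminator_loss and generator_loss from the sketches above."""
    # 1) discriminator: real label map y_l vs. one-hot encoded fake label maps
    opt_D.zero_grad()
    with torch.no_grad():
        fakes = [prob_map_to_one_hot(S(x_l)), prob_map_to_one_hot(S(x_u)),
                 prob_map_to_one_hot(G(z))]
    loss_D = sum(discriminator_loss(D(f), D(y_l)) for f in fakes)
    loss_D.backward()
    opt_D.step()

    # 2) segmentation network: supervised masked CE + adversarial term, plus the
    #    semi-supervised terms on unlabeled data (the grouping below is an assumption)
    opt_S.zero_grad()
    s_l, s_u = S(x_l), S(x_u)
    y_u, mask = make_pseudo_labels(s_u.detach(), D(s_u.detach()), threshold)
    loss_S = (masked_cross_entropy(s_l, y_l, torch.ones_like(s_l[:, :1]))
              + lambda_adv * generator_loss(D(s_l))      # soft maps fed to D so gradients reach S
              + lambda_semi * (masked_cross_entropy(s_u, y_u, mask)
                               + lambda_adv * generator_loss(D(s_u))))
    loss_S.backward()
    opt_S.step()

    # 3) generator: fool the discriminator with G(z)
    opt_G.zero_grad()
    loss_G = generator_loss(D(G(z)))
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_S.item(), loss_G.item()
```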
For verification of the method of the present invention, FIG. 4 shows the predictions of the model on sample Brats_TCIA_pat483_0001. For the labels occupying smaller areas, i.e. the enhancing tumor (green) and the necrotic and non-enhancing tumor (red), the results predicted by the model described here are significantly better than those of AdvSemiSeg under the same semi-supervised learning setup. The regions covered by the tumor edema label are larger, and the predictions of the enhancing tumor label and of the necrosis and non-enhancing tumor labels are more accurate.
The foregoing has shown and described the basic principles, principal features and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, and that the above embodiments and descriptions are merely illustrative of the principles of the present invention, and various changes and modifications may be made without departing from the spirit and scope of the invention, which is defined in the appended claims. The scope of the invention is defined by the appended claims and equivalents thereof.
Claims (8)
1. An image segmentation method based on a generative adversarial network SEG-GAN, comprising the steps of:
S1, predicting a confidence map with a discriminator network of the generative adversarial network SEG-GAN that has been pre-trained with labeled data, taking the confidence map as a supervision signal, and guiding the cross-entropy loss through self-learning; wherein the confidence map indicates which regions of the predicted distribution are close to the true label map distribution, such that these predictions can be used to train the segmentation network while the cross-entropy loss of the other, untrusted regions is masked;
S2: applying the adversarial loss on the unlabeled data, so that the predictions of the generative adversarial model for the unlabeled data are encouraged to be close to the true label map distribution;
S3: the segmentation network S adopts the generator structure of 3D-MedGAN; given an input MR image x of size H×W×D×1, the segmentation network outputs a semantic label probability map S(x) of size H×W×D×C, where C is the number of semantic classes;
S4: the generator network G is responsible for generating, from a random vector z of fixed dimension, a semantic label probability map G(z) of the same size H×W×D×C as the input image;
S5: the discriminator network D depends on the segmentation network and the generator network; it takes the probability map S(x) predicted by the segmentation network, the probability map G(z) generated by the generator network and the one-hot map of the true label map y as inputs, and then outputs a confidence map p of size H×W×D; each pixel on the confidence map p indicates whether the label of the input image x at the corresponding position is a sample from the true label map y, p = 1, or from a false label map, p = 0, the latter including S(x) and G(z);
wherein the generative adversarial model comprises two "generators" and one discriminator; the two "generators" are the segmentation network S, which predicts a label probability map for the input MR image, and the generator G, which converts random noise into a label probability map; the label probability maps predicted by the segmentation network S and the generator G are used as false samples and the labels of the labeled data are used as true samples, so that the discriminator is trained to have the capability of separating true labels from false labels; the discriminator derives an indicator matrix from its predictions; the indicator matrix is used for keeping the relatively reliable predictions of the segmentation network S on unlabeled samples as supervision signals for self-training;
the input data x on which the segmentation network S operates comprises labeled data x_l and unlabeled data x_u, each labeled datum x_l having a corresponding label map y_l; the label map y_l of size H×W×D is one-hot encoded into a probability map y_l with C channels, in which, for each position, the channel representing the label's class is set to 1 and the same position on all other channels is set to 0;
during training, when the labeled data x_l is used, the segmentation network S is updated under the guidance of the standard cross-entropy loss between the label probability map S(x_l) and the true label map y_l, and of the adversarial loss obtained when the discriminator network D discriminates y_pred;
for unlabeled data, the segmentation network is trained with a self-supervised learning method: after the segmentation network S predicts the label map S(x_u) of an unlabeled image x_u, the confidence of each position of the label probability map S(x_u) is evaluated to obtain a confidence map p, and the confidence map indicates the quality of the predicted segmentation regions, i.e. which results of the segmentation network during training can be trusted;
then the high-confidence regions are retained by thresholding the confidence map p; within these regions, the channel with the highest probability among all channels of the predicted label probability map S(x_u) is taken as the label of the region to obtain a pseudo label map, and the pseudo label map is one-hot encoded to obtain y_u;
then the segmentation network is trained with the cross-entropy loss between the segmentation network's prediction S(x_u) in the high-confidence regions and y_u, together with the adversarial loss from the discriminator network D; the generator network generates a false label probability map G(z) from random noise z, obtains y_g by taking, at each position, the channel with the largest value as the label, and at the same time the discriminator is used to calculate the generator loss L_G;
the discriminator network D is guided by distinguishing the label probability map y_l of the real labeled data from the false label probability map y_pred predicted by the segmentation network and y_u, and by distinguishing the true label probability map y_l from the false label probability map y_g generated by the generator network, which leads to minimizing the discriminator loss L_D; since the training process optimizes the segmentation network S and the discriminator network D in turn, the adversarial loss L_adv and the discriminator loss L_D are never used at the same time.
2. The image segmentation method based on a generative adversarial network according to claim 1, wherein the minimized discriminator loss L_D of the discriminator network is:

$$\mathcal{L}_D = -\sum_{n}\sum_{h,w,d}\Big[(1-Y_n)\log\big(1-D(\bar{y}_n)_{(h,w,d)}\big) + Y_n\log D(y_{l,n})_{(h,w,d)}\Big]$$

where n denotes the n-th sample of the mini-batch used in training, and Y_n indicates whether the label at position (h, w, d) of an H×W×D label probability map is real; when Y_n = 0, i.e. fake, the probability values of all channels at that point come from the label probability map S(x) predicted by the segmentation network or from the label probability map G(z) synthesized by the generator network, denoted \bar{y}_n; when Y_n = 1, i.e. real, the probability values of all channels at that point come from the label y_l of the labeled sample x_l.
3. The image segmentation method based on a generative adversarial network according to claim 2, wherein the data input to the discriminator network D are unified into the one-hot format: every label probability map is converted into H×W×D×C data by taking, at each position, the channel with the highest probability as the label and then one-hot encoding it, and the loss L_D is modified to be computed on these one-hot maps in place of the raw probability maps.
4. The image segmentation method based on a generative adversarial network according to claim 3, wherein, as the discriminator is continually optimized, the loss L_D keeps decreasing, and the generator loss L_G is optimized in order to generate a label probability map y_g that fools the discriminator and drives L_D back up, the generator loss L_G being:

$$\mathcal{L}_G = -\sum_{n}\sum_{h,w,d}\log D\big(\hat{y}_{g,n}\big)_{(h,w,d)}$$

where ŷ_g denotes the one-hot encoding of G(z); the losses L_D and L_G use the loss function of the standard GAN.
5. The image segmentation method based on a generative adversarial network according to claim 4, wherein for the segmentation network S the gap between the predicted label probability map and the true label samples needs to be narrowed; for a labeled sample x_l this gap is the accumulated cross entropy, over all pixels, between the predicted probability map S(x_l) and the one-hot encoded true label map y_l, and for an unlabeled sample x_u it is the accumulated cross entropy, over the retained pixels, between the predicted probability map S(x_u) and the retained pseudo-label one-hot map y_u; the segmentation loss L_seg is therefore:

$$\mathcal{L}_{seg} = -\sum_n \sum_{h,w,d} \sum_{c} I_{(h,w,d)}\, \hat{y}_{n,(h,w,d,c)}\, \log S(x_n)_{(h,w,d,c)}$$

where ŷ is y_l for labeled samples and y_u for unlabeled samples, and I is the indicator matrix.
6. The image segmentation method based on a generative adversarial network according to claim 5, wherein, when training with labeled data x_l, the indicator matrix I is set to a matrix whose elements are all 1, meaning that the cross entropy computed at every point between S(x_l) and y_l is used as a supervision signal to train the segmentation network S; when training with unlabeled data x_u, the confidence map p is binarized with a threshold T, and the entry of the indicator matrix at position (h, w, d), extended along the c-axis, is set to 0 or 1 accordingly; the overall indicator matrix I is:

$$I_{(h,w,d)} = \begin{cases} 1, & x \in x_l \\ \mathbb{1}\big[\,p_{(h,w,d)} > T\,\big], & x \in x_u \end{cases}$$

whether the segmentation network S is trained with labeled data x_l or with unlabeled data x_u, a fully convolutional discriminator D is applied to the segmentation network's predictions y_pred and y_u to calculate the adversarial loss L_adv, which guides the segmentation network to produce results closer to the distribution of the true label probability map y_l;

wherein the adversarial loss L_adv is calculated as:

$$\mathcal{L}_{adv} = -\sum_n \sum_{h,w,d} \log D\big(\hat{y}_n\big)_{(h,w,d)}$$

where ŷ_n is the segmentation network's one-hot encoded prediction; the segmentation network S is trained by minimizing the segmentation loss L_seg and the adversarial loss L_adv, weighted by λ_adv, together with the corresponding semi-supervised losses on the unlabeled data x_u, weighted by λ_semi.
7. A system using the image segmentation method based on the generative adversarial network SEG-GAN, the system comprising:
the data set acquisition module, used for acquiring a target image set, a reference image set and a pre-annotated reference label set corresponding to the reference image set; the target image set comprises a target image training set and a target image test set;
the network construction module, used for constructing a segmentation network and a discrimination network, wherein the first target loss function of the segmentation network comprises the cross-entropy loss of the target image set and the reference label set, the adversarial loss of the target image set, and the semi-supervised loss between the target image set and the reference image set;
the training module is used for inputting the target image training set and the reference image set into the segmentation network, correspondingly obtaining a target probability score graph and a reference probability score graph, and inputting the target probability score graph and the reference probability score graph into the discrimination network so as to perform joint training of the segmentation network and the discrimination network;
the judging module is used for finishing training when the first target loss function of the segmentation network and the second target loss function of the judging network are converged;
the test module is used for inputting the target image test set into the trained segmentation network to obtain a target segmentation image;
wherein the generative adversarial model comprises two "generators" and one discriminator; the two "generators" are the segmentation network S, which predicts a label probability map for the input MR image, and the generator G, which converts random noise into a label probability map; the label probability maps predicted by the segmentation network S and the generator G are used as false samples and the labels of the labeled data are used as true samples, so that the discriminator is trained to have the capability of separating true labels from false labels; the discriminator derives an indicator matrix from its predictions; the indicator matrix is used for keeping the relatively reliable predictions of the segmentation network S on unlabeled samples as supervision signals for self-training;
the input data x on which the segmentation network S operates comprises labeled data x_l and unlabeled data x_u, each labeled datum x_l having a corresponding label map y_l; the label map y_l of size H×W×D is one-hot encoded into a probability map y_l with C channels, in which, for each position, the channel representing the label's class is set to 1 and the same position on all other channels is set to 0.
8. An apparatus using the image segmentation method based on the generative adversarial network SEG-GAN, characterized in that: the apparatus comprises a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the image segmentation method based on a generative adversarial network according to any one of claims 1 to 6 when executing the computer program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011438792.2A CN112419327B (en) | 2020-12-10 | 2020-12-10 | Image segmentation method, system and device based on generation countermeasure network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011438792.2A CN112419327B (en) | 2020-12-10 | 2020-12-10 | Image segmentation method, system and device based on generation countermeasure network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112419327A CN112419327A (en) | 2021-02-26 |
CN112419327B true CN112419327B (en) | 2023-08-04 |
Family
ID=74776274
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011438792.2A Active CN112419327B (en) | 2020-12-10 | 2020-12-10 | Image segmentation method, system and device based on generation countermeasure network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112419327B (en) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11580646B2 (en) * | 2021-03-26 | 2023-02-14 | Nanjing University Of Posts And Telecommunications | Medical image segmentation method based on U-Net |
CN113012171A (en) * | 2021-04-01 | 2021-06-22 | 东北林业大学 | Lung nodule segmentation method based on collaborative optimization network |
CN113486899B (en) * | 2021-05-26 | 2023-01-24 | 南开大学 | Saliency target detection method based on complementary branch network |
CN113313697B (en) * | 2021-06-08 | 2023-04-07 | 青岛商汤科技有限公司 | Image segmentation and classification method, model training method thereof, related device and medium |
CN114565586B (en) * | 2022-03-02 | 2023-05-30 | 小荷医疗器械(海南)有限公司 | Polyp segmentation model training method, polyp segmentation method and related device |
CN114897914B (en) * | 2022-03-16 | 2023-07-07 | 华东师范大学 | Semi-supervised CT image segmentation method based on countermeasure training |
CN114419321B (en) * | 2022-03-30 | 2022-07-08 | 珠海市人民医院 | CT image heart segmentation method and system based on artificial intelligence |
CN115393378B (en) * | 2022-10-27 | 2023-01-10 | 深圳市大数据研究院 | Low-cost and efficient cell nucleus image segmentation method |
CN116403074B (en) * | 2023-04-03 | 2024-05-14 | 上海锡鼎智能科技有限公司 | Semi-automatic image labeling method and device based on active labeling |
CN117079142B (en) * | 2023-10-13 | 2024-01-26 | 昆明理工大学 | Anti-attention generation countermeasure road center line extraction method for automatic inspection of unmanned aerial vehicle |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110837836A (en) * | 2019-11-05 | 2020-02-25 | 中国科学技术大学 | Semi-supervised semantic segmentation method based on maximized confidence |
CN111507993A (en) * | 2020-03-18 | 2020-08-07 | 南方电网科学研究院有限责任公司 | Image segmentation method and device based on generation countermeasure network and storage medium |
- 2020-12-10: application CN202011438792.2A filed in China; granted as patent CN112419327B (Active)
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110837836A (en) * | 2019-11-05 | 2020-02-25 | 中国科学技术大学 | Semi-supervised semantic segmentation method based on maximized confidence |
CN111507993A (en) * | 2020-03-18 | 2020-08-07 | 南方电网科学研究院有限责任公司 | Image segmentation method and device based on generation countermeasure network and storage medium |
Non-Patent Citations (1)
Title |
---|
Hybrid Segmentation Algorithm for Medical Image Segmentation Based on Generating Adversarial Networks, Mutual Information and Multi-Scale Information; Yi Sun, et al.; IEEE; full text *
Also Published As
Publication number | Publication date |
---|---|
CN112419327A (en) | 2021-02-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112419327B (en) | Image segmentation method, system and device based on generation countermeasure network | |
CN109191476A (en) | The automatic segmentation of Biomedical Image based on U-net network structure | |
CN112070781A (en) | Processing method and device of craniocerebral tomography image, storage medium and electronic equipment | |
CN106650670A (en) | Method and device for detection of living body face video | |
CN114663426B (en) | Bone age assessment method based on key bone region positioning | |
CN113762327B (en) | Machine learning method, machine learning system and non-transitory computer readable medium | |
CN114550169A (en) | Training method, device, equipment and medium for cell classification model | |
CN112149689A (en) | Unsupervised domain adaptation method and system based on target domain self-supervised learning | |
CN112287884A (en) | Examination abnormal behavior detection method and device and computer readable storage medium | |
CN115331007A (en) | Video disc and cup segmentation method based on unsupervised field self-adaptation and imaging method thereof | |
CN111598144A (en) | Training method and device of image recognition model | |
CN110533184A (en) | A kind of training method and device of network model | |
CN116824333B (en) | Nasopharyngeal carcinoma detecting system based on deep learning model | |
CN115205956B (en) | Left and right eye detection model training method, method and device for identifying left and right eyes | |
CN115512428A (en) | Human face living body distinguishing method, system, device and storage medium | |
CN112597842B (en) | Motion detection facial paralysis degree evaluation system based on artificial intelligence | |
CN114972335A (en) | Image classification method and device for industrial detection and computer equipment | |
CN113706450A (en) | Image registration method, device, equipment and readable storage medium | |
Xie et al. | Pulmonary nodules detection via 3D multi-scale dual path network | |
CN106023120A (en) | Face figure synthetic method based on coupling neighbor indexes | |
CN113222887A (en) | Deep learning-based nano-iron labeled neural stem cell tracing method | |
CN112199984A (en) | Target rapid detection method of large-scale remote sensing image | |
Rocamora-García et al. | A Deep Approach for Volumetric Tractography Segmentation | |
CN118015261B (en) | Remote sensing image target detection method based on multi-scale feature multiplexing | |
CN118485919B (en) | Plant canopy leaf segmentation and complement model training method, leaf parameter extraction method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |