CN115439486A - Semi-supervised organ tissue image segmentation method and system based on dual-countermeasure network - Google Patents


Info

Publication number
CN115439486A
Authority
CN
China
Legal status
Pending
Application number
CN202210629202.7A
Other languages
Chinese (zh)
Inventor
雷涛
张栋
尚佳童
杜晓刚
王营博
丁丽萍
Current Assignee
Shaanxi University of Science and Technology
Original Assignee
Shaanxi University of Science and Technology
Application filed by Shaanxi University of Science and Technology
Priority to CN202210629202.7A
Publication of CN115439486A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/20 Image enhancement or restoration by the use of local operators
    • G06T5/30 Erosion or dilatation, e.g. thinning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10072 Tomographic images
    • G06T2207/10081 Computed x-ray tomography [CT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30056 Liver; Hepatic
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30096 Tumor; Lesion

Abstract

The invention discloses a semi-supervised organ tissue image segmentation method and system based on a dual adversarial network, and belongs to the field of medical organ tissue image processing. Adversarial training with two discriminators promotes mutual learning between the segmentation network and the discriminator networks and improves the segmentation network's ability to transfer knowledge from labeled data to unlabeled data. A bidirectional attention component based on dynamic convolution helps prevent overfitting, reduces error accumulation in the Mean Teacher framework, and improves the generalization ability of the segmentation network, thereby improving segmentation performance. The method has a smaller memory footprint and faster inference speed. By jointly learning from a small amount of accurately labeled data and a large amount of unlabeled data, it reduces the network's dependence on pixel-level labeled data, so that organs and other tissues of the human body can be accurately located and segmented.

Description

Semi-supervised organ tissue image segmentation method and system based on dual-countermeasure network
Technical Field
The invention belongs to the field of medical organ tissue image processing and relates to a semi-supervised organ tissue image segmentation method based on a dual adversarial network. It is particularly suitable for semi-supervised medical image segmentation scenarios with a small amount of accurately annotated data and a large amount of unlabeled data.
Background
Encoder-decoder convolutional neural networks based on supervised learning, such as U-Net, U-Net++, and DenseUNet, have achieved remarkable results in semantic segmentation of medical images and have effectively advanced the field. However, the success of these techniques relies heavily on large amounts of pixel-level labeled data. In abdominal CT segmentation tasks, accurately labeled data is scarce, and CT scans suffer from high noise and low contrast, which makes labeling difficult. In addition, labeling requires medical expertise, so its cost is high. Semi-supervised learning is a paradigm within weakly supervised learning that addresses incomplete data labeling: it trains jointly on a small amount of labeled data and a large amount of unlabeled data, which better matches real clinical scenarios and is of significant research value. To make better use of unlabeled data, mainstream semi-supervised image segmentation methods based on perturbation consistency regularization have been very successful. Other approaches, such as self-training, generative adversarial networks, contrastive learning, and co-training, have also been applied to semi-supervised image segmentation.
Among consistency regularization methods, the Mean Teacher (MT) framework is one of the mainstream approaches. The MT framework first performs supervised learning on the labeled data; then the teacher model provides pseudo labels for the unlabeled data, and different regularization schemes make the teacher and student models output similar predictions for unlabeled data under different perturbations; finally, the student model is updated by back-propagating the supervised loss and the consistency loss. The teacher model is an exponential moving average (EMA) of the student model's weights, so it continuously accumulates the network's historical prediction information on the unlabeled data. Methods based on Mean Teacher are very effective, but two problems remain. (1) Mainstream semi-supervised medical image segmentation networks directly reuse segmentation networks designed for supervised learning. Because the different models in the segmentation network share the same weight parameters, training on a small data set easily leads to overfitting and to poor-quality pseudo labels for the unlabeled data; the parameters of the teacher and student models become highly coupled, and erroneous information accumulates. (2) Mainstream semi-supervised medical image segmentation methods do not make good use of the prior data distribution relationship between labeled and unlabeled data, so the network learns the features of unlabeled data inefficiently and the model generalizes poorly.
Disclosure of Invention
Organ tissue image segmentation models based on deep learning depend heavily on large amounts of pixel-level labeled data, but labeling organ tissue images is extremely difficult, requires professional knowledge, and is time-consuming and labor-intensive. The main purpose of the invention is therefore to provide a semi-supervised organ tissue image segmentation method and system based on a dual adversarial network. Adversarial training with two discriminator networks promotes mutual learning between the segmentation network and the discriminator networks: one discriminator learns the consistency of the segmentation network's predictions on labeled and unlabeled data, and the other learns the consistency of predictions for the same data under different perturbations. This improves the segmentation network's ability to transfer knowledge from labeled data to unlabeled data. In addition, to prevent overfitting and use unlabeled data more efficiently, the invention provides a bidirectional attention component based on dynamic convolution, which dynamically adjusts network parameters according to the structural prior information of each sample. The component helps prevent overfitting, reduces error accumulation in the Mean Teacher framework, and improves the generalization ability of the segmentation network, thereby improving segmentation performance.
Compared with existing semi-supervised segmentation networks, the method obtains finer image segmentation results with a smaller memory footprint and faster inference speed. By jointly learning from a small amount of accurately labeled data and a large amount of unlabeled data, it reduces the network's dependence on pixel-level labeled data and improves the transfer of knowledge from labeled data to unlabeled data, so that organs and other tissues of the human body are accurately located and segmented.
In addition, the invention designs a deep-learning-based segmentation system for human organs and skin lesion tissue, which supports AI-assisted segmentation of human abdominal organ and tissue images, tissue marker localization, image reconstruction, risk assessment, lesion detection, assisted training, and the like.
AI-assisted medical training for the liver includes supporting the routine training of medical staff with liver image segmentation data, which helps improve the accuracy and efficiency of analyzing abdominal organ images and tissue lesions.
To achieve the above purpose, the invention adopts the following technical solution:
The invention discloses a semi-supervised organ tissue image segmentation method based on a dual adversarial network, which comprises the following steps:
Step one: the original data is preprocessed according to the imaging principle of the image, and the contrast of the organ and tissue images is enhanced to obtain a clearer organ and tissue image data set.
The image preprocessing comprises window level and window width adjustment, image resampling, noise addition, random rotation, and random flipping.
Window level and window width adjustment in step one is implemented as follows:
Step 1.1: for the organ data, suitable window level and window width parameters are selected according to the segmentation task to enhance the contrast of the target organ.
Medical image data has low contrast, is heavily affected by noise, and varies in image size. To obtain clear images of organs and tissues, values outside the chosen range are clipped to reduce interference from irrelevant structures. Setting a suitable window level and window width enhances the contrast of the organ region. Resampling resolves the inconsistent image resolution and prepares the data for training the segmentation model.
The principle of the window level and window width adjustment in step 1.1 is that different ranges of pixel values in the image correspond to different abdominal organ tissues. To enhance the target organ region, the data is normalized to the range [m, n], giving the preprocessed image F. The image is first converted to Hounsfield unit (HU) values using:
HU = Pixel * Rs + Ri
where HU represents the density of the tissue, Pixel is the pixel value, Rs is the conversion coefficient, and Ri is the conversion offset. After the image is converted to HU values, the maximum tissue density H_max and the minimum tissue density H_min of the window are calculated from the window level and window width:
H_max = (2 * wc + ww) / 2.0 + 0.5
H_min = (2 * wc - ww) / 2.0 + 0.5
where ww is the window width and wc (window center) is the window level. The HU values are then normalized to a fixed range [m, n] to obtain the preprocessed image F, using the formula:
Figure BDA0003667247880000031
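The windowing step can be illustrated with a minimal Python sketch. It assumes a linear min-max rescaling of the clipped HU values to [m, n], since the normalization formula itself is not reproduced in this text, and it takes the upper window bound as wc + ww/2 and the lower bound as wc - ww/2 (each with the 0.5 offset). The function name `window_normalize` and the sample values are illustrative assumptions, not part of the patent.

```python
def window_normalize(pixels, rs, ri, wc, ww, m=0.0, n=1.0):
    """Convert raw pixel values to HU, clip to the window, and rescale to [m, n]."""
    h_max = (2 * wc + ww) / 2.0 + 0.5  # upper window bound
    h_min = (2 * wc - ww) / 2.0 + 0.5  # lower window bound
    out = []
    for p in pixels:
        hu = p * rs + ri                 # HU = Pixel * Rs + Ri
        hu = min(max(hu, h_min), h_max)  # cut off values outside the window
        # Assumed min-max rescaling of the clipped value to [m, n].
        out.append((hu - h_min) / (h_max - h_min) * (n - m) + m)
    return out


# Hypothetical values: Rs = 1, Ri = -1024, window level 60, window width 200.
print(window_normalize([0, 1084, 3000], rs=1.0, ri=-1024.0, wc=60, ww=200))
```

Values below the window map to m and values above it map to n, so only the tissue range of interest keeps its contrast.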
Step 1.2: data augmentation is performed on the data of the different tissues.
Data augmentation in step 1.2 increases the diversity of the data and the generalization performance of the model, and it also helps prevent overfitting and aids training and convergence.
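As a rough illustration of the augmentations named in step one (random rotation, random flipping, noise addition), the following Python sketch operates on a 2D image stored as a list of rows. The restriction to 90-degree rotations, the flip probability of 0.5, and the noise level are assumptions made for the example; the text does not specify them.

```python
import random


def augment(image, rng, noise_std=0.01):
    """Apply a random 90-degree rotation, a random flip, and additive Gaussian noise."""
    for _ in range(rng.randrange(4)):  # rotate by 0, 90, 180, or 270 degrees
        image = [list(row) for row in zip(*image[::-1])]
    if rng.random() < 0.5:             # random horizontal flip
        image = [row[::-1] for row in image]
    # Additive Gaussian noise; returns a new image and leaves the input untouched.
    return [[v + rng.gauss(0.0, noise_std) for v in row] for row in image]


rng = random.Random(0)  # fixed seed so the example is repeatable
sample = [[0.0, 0.1, 0.2], [0.3, 0.4, 0.5]]
print(augment(sample, rng))
```

For segmentation, the same rotation and flip would also have to be applied to the label mask (without noise) so that image and label stay aligned.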
Step two: using the data set preprocessed in step one, features are extracted with the designed convolutional neural network based on dual adversarial learning, the extracted features are decoded into a low-dimensional feature representation, and a semantic-level segmentation result is output. The invention provides an image segmentation model with dual adversarial learning, DA-Net. DA-Net is composed of two segmentation networks and two discriminator networks. The two segmentation networks are a student model and a teacher model. They have the same encoder-decoder structure; the difference is that the student is trained with a loss function, while the teacher is an exponential moving average of the student's weights. Each discriminator network consists of convolutional layers, dynamic bidirectional attention components, and global average pooling. DA-Net still uses Mean Teacher as its basic framework. For the discriminators, an adversarial consistency training strategy is adopted, in which two discriminators of the same structure serve different goals. The first discriminator learns the consistency of the segmentation network's prediction quality between unlabeled and labeled data. The second discriminator learns the consistency between the teacher and student predictions for the same data under different perturbations. Unlike other adversarial networks, the input to each discriminator is the output of the segmentation network together with the original image. Taking the original image as a reference, the discriminator learns how well the segmentation result matches the original image and thereby measures the quality of the segmentation result.
In the segmentation network, the bidirectional attention component replaces the conventional standard convolution in all layers except the first. The dynamic bidirectional attention component adaptively adjusts the parameters of the convolution kernel according to the input, which improves the feature representation ability of the network and reduces the risk of overfitting. It also fully decouples the spatial and channel dimensions, so it reduces the number of network parameters while improving feature representation, speeds up inference, and makes the network easier to deploy on edge devices. In addition, the segmentation network and the discriminators are trained alternately, and the discriminators are not needed at inference time, so they add no computational overhead.
In step two, the organ tissue segmentation network based on the dual adversarial network is implemented in two parts: the construction of the dynamic bidirectional attention component and the adversarial consistency training strategy.
Step 2.1: construct the bidirectional attention component. The dynamic bidirectional attention component adaptively adjusts the parameters of the convolution kernel according to the input, which improves the feature representation ability of the network and reduces the risk of overfitting. It also fully decouples the spatial and channel dimensions, so it reduces the number of network parameters while improving feature representation, speeds up inference, and makes the network easier to deploy on edge devices.
Given input data x ∈ R^(C×H×W), C denotes the number of input channels and H × W denotes the resolution of the input feature map. To emphasize important spatial positions, the input feature map first passes through a simple spatial attention: a 1 × 1 convolution reduces the channel dimension, a sigmoid activation normalizes the result, and the resulting spatial attention weights are multiplied pixel by pixel with the input feature map to obtain the output feature map x_1 ∈ R^(C×H×W). Next, x_1 is globally average pooled to obtain a feature map of size C × 1 × 1, which is reduced in dimension by a 1 × 1 convolution and passed through a softmax activation to obtain the coefficients p ∈ R^N. N is the predefined number of convolution kernels, a hyperparameter set according to the specific task; it is set to N = 4 based on experimental verification. The coefficients p are multiplied with the N convolution kernels respectively and the weighted kernels are summed, so that only one convolution kernel is generated and used for the convolution operation. The final convolution kernel weight w is expressed as:
w = Σ_{i=1}^{N} p_i * w_i
where p_i denotes the i-th coefficient of p, 0 ≤ p_i ≤ 1, Σ_{i=1}^{N} p_i = 1, and w_i is the weight of the i-th convolution kernel.
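The kernel aggregation described here, a softmax over N scores followed by a weighted sum of N candidate kernels, can be sketched in plain Python. Kernels are represented as flat lists of weights, and the function names and sample values are hypothetical; the sketch covers only the aggregation step, not the spatial attention or the convolution itself.

```python
import math


def softmax(scores):
    """Numerically stable softmax; outputs lie in [0, 1] and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_kernels(scores, kernels):
    """Blend N candidate kernels (flat weight lists) into one: sum_i p_i * w_i."""
    p = softmax(scores)
    size = len(kernels[0])
    return [sum(p[i] * kernels[i][j] for i in range(len(kernels)))
            for j in range(size)]


# N = 4 candidate kernels with hypothetical weights and equal attention scores.
kernels = [[1.0], [2.0], [3.0], [4.0]]
print(aggregate_kernels([0.0, 0.0, 0.0, 0.0], kernels))
```

Because the coefficients sum to 1, the blended kernel is a convex combination of the candidates, and only a single convolution has to be executed per input.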
The number of parameters Q of a standard dynamic convolution is calculated as:
Q = C_in × N + N × C_in × C_out × k × k
where k × k is the size of the convolution kernel, and C_in and C_out are the numbers of channels of the input and output feature maps. This is more than N times the parameter count of a standard convolution (C_in × C_out × k × k). To reduce the number of parameters, the original parameter count is reduced to
Figure BDA0003667247880000048
Figure BDA0003667247880000049
Compared with both standard convolution and standard dynamic convolution, the component therefore greatly reduces the number of parameters while improving model performance.
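To make the "N times or more" comparison concrete, the sketch below evaluates the parameter count Q of a standard dynamic convolution against a standard convolution, taken here as C_in × C_out × k × k with biases ignored. The reduced parameter count of the proposed component is not reproduced in this text, so it is deliberately left out; the layer sizes are hypothetical.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights of a standard k x k convolution (bias terms ignored)."""
    return c_in * c_out * k * k


def dynamic_conv_params(c_in, c_out, k, n):
    """Q = C_in * N + N * C_in * C_out * k * k for a standard dynamic convolution."""
    return c_in * n + n * c_in * c_out * k * k


# Hypothetical layer: 64 input channels, 64 output channels, 3 x 3 kernels, N = 4.
std = standard_conv_params(64, 64, 3)
dyn = dynamic_conv_params(64, 64, 3, 4)
print(std, dyn, dyn / std)
```

The first term of Q comes from the layer that produces the N attention coefficients, which is why the ratio is slightly above N.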
Step 2.2: adversarial consistency training strategy. Two discriminators of the same structure serve different goals. The first discriminator learns the consistency of the segmentation network's prediction quality between unlabeled and labeled data. The second discriminator learns the consistency between the teacher and student predictions for the same data under different perturbations. Unlike other adversarial networks, the input to each discriminator is the output of the segmentation network together with the original image. Taking the original image as a reference, the discriminator learns how well the segmentation result matches the original image and thereby measures the quality of the segmentation result. In the segmentation network, the bidirectional attention component replaces the conventional standard convolution in all layers except the first.
The adversarial consistency training framework adds two discriminator networks to the Mean Teacher framework. The two discriminators have the same structure but different functions. Discriminator D_1 learns the difference in the quality of the network's outputs for labeled and unlabeled data. Discriminator D_2 learns the difference between predictions for perturbed and unperturbed unlabeled data. Finally, the supervised loss L_s, the consistency loss L_semi, and the adversarial losses L_adv1 and L_adv2 together encourage the student network to produce high-quality predictions on unlabeled data. The adversarial consistency training strategy with the DA-Net model structure is implemented as follows:
Adversarial learning is realized by alternating training. The segmentation network takes a medical image as input and outputs a segmentation prediction map. The output of the segmentation network is concatenated with the input image and fed into the discriminator network, whose output is a class score: 0 indicates a poor-quality segmentation result and 1 indicates a good one. During training, the segmentation network is encouraged to produce, for unlabeled data x_u, high-quality segmentation results whose score is close to 1. In summary, training drives the segmentation network to output high-quality segmentation results such that the discriminator cannot tell whether its input comes from the labels or from the segmentation network. The objective function L(θ_s) of the segmentation network is defined as:
Figure BDA0003667247880000051
The discriminator networks aim to distinguish the outputs of the segmentation network as well as possible. The objective functions of discriminators D_1 and D_2 are defined as:
Figure BDA0003667247880000052
Figure BDA0003667247880000053
where L_s(·) combines the multi-class cross-entropy loss and the Dice loss, L_semi is the mean squared error, and L_adv1 and L_adv2 are multi-class cross-entropy losses; x_i and y_i are the input data and the corresponding labels; x_u and x_ema are the unlabeled input data and the unlabeled input data with noise perturbation;
Figure BDA0003667247880000054
and
Figure BDA0003667247880000055
are the segmentation results for labeled data and unlabeled data, respectively;
Figure BDA0003667247880000056
is the prediction of the teacher network; λ is a weighting coefficient that follows a Gaussian ramp-up curve,
Figure BDA0003667247880000057
where i is the training iteration number.
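The exact expression for the weighting coefficient λ is not reproduced in this text. As an illustration only, the sketch below uses the Gaussian ramp-up exp(-5(1 - t)^2) that is common in Mean Teacher implementations, with t the fraction of the ramp-up period completed; the constant 5, the maximum weight `lam_max`, and the ramp-up length are all assumptions rather than values from the patent.

```python
import math


def gaussian_rampup(i, rampup_length, lam_max=1.0):
    """Assumed consistency weight at iteration i: lam_max * exp(-5 * (1 - t)^2)."""
    if rampup_length <= 0:
        return lam_max
    t = min(max(i / rampup_length, 0.0), 1.0)  # progress through the ramp-up, in [0, 1]
    return lam_max * math.exp(-5.0 * (1.0 - t) ** 2)


# The weight starts near zero and rises smoothly to lam_max.
for step in (0, 2500, 5000, 7500, 10000):
    print(step, round(gaussian_rampup(step, rampup_length=10000), 4))
```

A schedule of this shape keeps the consistency and adversarial terms small early in training, when the pseudo labels from the teacher are still unreliable.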
The parameters of the teacher model are an EMA of the student model parameters, an approach whose effectiveness has been demonstrated in many methods. The update is defined as:
θ'_t = α θ'_{t-1} + (1 - α) θ_t
where θ'_t denotes the updated parameters of the teacher model, θ_t denotes the parameters of the student model, and α is the smoothing coefficient, a hyperparameter that determines how strongly the teacher model depends on the student model. Empirically, the best performance is obtained with α = 0.999.
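The EMA update can be written in a few lines. This sketch stores parameters as a dictionary of floats for simplicity, which is an assumption made for illustration; in a real network each entry would be a weight tensor updated element-wise.

```python
def ema_update(teacher, student, alpha=0.999):
    """Return new teacher parameters: alpha * teacher + (1 - alpha) * student."""
    return {name: alpha * teacher[name] + (1.0 - alpha) * student[name]
            for name in teacher}


# Hypothetical single-parameter models: the teacher drifts slowly toward the student.
teacher = {"w": 0.0}
student = {"w": 1.0}
for _ in range(3):
    teacher = ema_update(teacher, student, alpha=0.999)
print(teacher)
```

With α = 0.999 each student step moves the teacher by only 0.1% of the gap, which is what lets the teacher average over many past student states.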
The processed data is input into the trained model to obtain a preliminary segmentation result p.
Step three: building on steps one and two, joint learning from a small amount of accurately labeled data and a large amount of unlabeled data reduces the network's dependence on pixel-level labeled data and improves the transfer of knowledge from labeled to unlabeled data, so that organs and other tissues of the human body are accurately located and segmented. The segmentation result p is then post-processed: holes are filled by morphological reconstruction to obtain a refined organ and tissue image segmentation result.
Step four: the organ and tissue image segmentation results are evaluated using the Dice coefficient and the average symmetric surface distance (ASD).
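Hole filling on a binary mask can be sketched as a flood fill of the background from the image border: any background pixel the fill cannot reach is enclosed by foreground and is set to foreground. This gives the same result as the usual reconstruction-based hole fill on a 2D mask, but the 4-connectivity and the pure-Python list representation are assumptions for the example; in practice a library routine such as `scipy.ndimage.binary_fill_holes` would normally be used.

```python
from collections import deque


def fill_holes(mask):
    """Fill background regions of a binary mask that are not connected to the border."""
    h, w = len(mask), len(mask[0])
    outside = [[False] * w for _ in range(h)]
    queue = deque()
    # Seed the flood fill with every background pixel on the image border.
    for r in range(h):
        for c in range(w):
            on_border = r in (0, h - 1) or c in (0, w - 1)
            if on_border and mask[r][c] == 0:
                outside[r][c] = True
                queue.append((r, c))
    # Grow the outside region through 4-connected background pixels.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and not outside[nr][nc] and mask[nr][nc] == 0:
                outside[nr][nc] = True
                queue.append((nr, nc))
    # Everything that is not outside background becomes foreground.
    return [[0 if outside[r][c] else 1 for c in range(w)] for r in range(h)]


ring = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
for row in fill_holes(ring):
    print(row)
```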
The segmentation results are evaluated using the Dice coefficient (DI), the Jaccard similarity index (JA), pixel accuracy (AC), sensitivity (SE), and specificity (SP), with the following formulas:
DI = 2TP / (2TP + FP + FN)
JA = TP / (TP + FP + FN)
AC = (TP + TN) / (TP + TN + FP + FN)
SE = TP / (TP + FN)
SP = TN / (TN + FP)
where TP, TN, FP, and FN denote correctly predicted positives (true positives), correctly predicted negatives (true negatives), incorrectly predicted positives (false positives), and incorrectly predicted negatives (false negatives), respectively.
ASD(A, B) = ( Σ_{a∈S(A)} min_{b∈S(B)} d(a, b) + Σ_{b∈S(B)} min_{a∈S(A)} d(b, a) ) / ( |S(A)| + |S(B)| )
where A and B represent the ground truth and the segmentation result, respectively, S(A) and S(B) denote the sets of surface voxels of A and B, and d(·) denotes the Euclidean distance. The closer the Dice coefficient, Jaccard similarity index, pixel accuracy, sensitivity, and specificity are to 1, the closer the result is to the ground truth and the better the segmentation; the smaller the ASD, the closer the segmented surface is to the ground-truth surface. Notably, the final organ segmentation results are evaluated on the synthesized 3D volumes.
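The overlap metrics can be computed directly from the confusion counts. The sketch below uses the standard definitions DI = 2TP/(2TP+FP+FN), JA = TP/(TP+FP+FN), AC = (TP+TN)/(TP+TN+FP+FN), SE = TP/(TP+FN), and SP = TN/(TN+FP) on flattened binary masks; the function names are hypothetical, and the code assumes both classes occur so that no denominator is zero. ASD is omitted because it requires surface extraction.

```python
def confusion_counts(pred, truth):
    """Count TP, TN, FP, FN for two flat binary masks of equal length."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    return tp, tn, fp, fn


def overlap_metrics(pred, truth):
    """Dice, Jaccard, accuracy, sensitivity, and specificity (nonzero denominators assumed)."""
    tp, tn, fp, fn = confusion_counts(pred, truth)
    return {
        "DI": 2 * tp / (2 * tp + fp + fn),
        "JA": tp / (tp + fp + fn),
        "AC": (tp + tn) / (tp + tn + fp + fn),
        "SE": tp / (tp + fn),
        "SP": tn / (tn + fp),
    }


# Hypothetical flattened prediction and ground-truth masks.
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]
print(overlap_metrics(pred, truth))
```

Dice and Jaccard ignore true negatives, which is why they are preferred over pixel accuracy when the target organ occupies only a small part of the image.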
The method further comprises the following step: based on the organ and tissue image segmentation results evaluated in step four, a deep-learning-based segmentation system for human organs and skin lesion tissue is designed, which supports AI-assisted image segmentation of human abdominal organs and tissues, tissue marker localization, image reconstruction, risk assessment, lesion detection, and assisted training.
The assisted training includes supporting the routine training of medical staff with liver image segmentation data, which helps improve the accuracy and efficiency of analyzing abdominal organ images and tissue lesions.
The invention also discloses a semi-supervised organ tissue image segmentation system based on a dual adversarial network, which implements the above method. The system comprises an image preprocessing module, a network training module, a network test module, and a visualization module.
The image preprocessing module uniformly preprocesses the input images, mainly by data augmentation and resampling, which helps accelerate model convergence and reduce the risk of overfitting.
The network training module trains on the preprocessed data by combining the Mean Teacher training strategy with the adversarial consistency training strategy, with selected hyperparameters such as the number of iterations, the learning rate, and the batch size.
The network test module runs prediction with the trained model and outputs the image segmentation prediction results.
The visualization module visualizes the loss curve on the training set, the accuracy curve on the test set, and the image segmentation results.
The organ tissue includes liver, pancreas, kidney, lung tissue, and skin tissue.
The invention involves convolutional neural networks, computer vision, pattern recognition, and related technologies, and has the following beneficial effects:
1. The invention discloses a method and a system for semi-supervised organ tissue image segmentation based on a dual adversarial network. To prevent overfitting and use unlabeled data more efficiently, the invention provides a bidirectional attention component based on dynamic convolution, which dynamically adjusts network parameters according to the structural prior information of each sample. The component helps prevent overfitting, reduces error accumulation in the Mean Teacher framework, and improves the generalization ability of the segmentation network, thereby improving segmentation performance. Compared with existing semi-supervised segmentation networks, the method obtains finer image segmentation results with a smaller memory footprint and faster inference speed. By jointly learning from a small amount of accurately labeled data and a large amount of unlabeled data, it reduces the network's dependence on pixel-level labeled data and improves the transfer of knowledge from labeled data to unlabeled data, so that organs and other tissues of the human body are accurately located and segmented.
2. The invention discloses a method and a system for semi-supervised organ tissue image segmentation based on a dual adversarial network.
3. The invention discloses a method and a system for semi-supervised organ tissue image segmentation based on a dual adversarial network, which adopt an adversarial consistency training strategy. Two discriminators D_1 and D_2 respectively learn the prior relationship between unlabeled and labeled data and the prediction consistency of the same data under different perturbations, which makes effective use of unlabeled data and reduces the network's dependence on labeled data.
4. The invention discloses a method and a system for semi-supervised organ tissue image segmentation based on a dual adversarial network.
5. The invention discloses a method and a system for semi-supervised organ tissue image segmentation based on a dual adversarial network.
6. The invention discloses a method and a system for semi-supervised organ tissue image segmentation based on a dual adversarial network, which support AI-assisted segmentation of human abdominal organ and tissue images, tissue marker localization, image reconstruction, risk assessment, lesion detection, and assisted training.
Drawings
FIG. 1 is a flow chart of the semi-supervised organ tissue image segmentation method based on a dual adversarial network.
FIG. 2 is a detailed structure diagram of the semi-supervised image segmentation network based on dual adversarial learning according to the invention.
FIG. 3 is a detailed structure diagram of the bidirectional attention component based on dynamic convolution according to the invention.
FIG. 4 is a visualization of the segmentation results on the LiTS liver segmentation data set according to the invention.
FIG. 5 is a visualization of the segmentation results on a skin lesion data set according to the invention.
Detailed Description
For better illustrating the objects and advantages of the present invention, the following description is provided in conjunction with the accompanying drawings and examples.
Example 1:
As shown in FIG. 1, the semi-supervised organ tissue image segmentation method based on the dual-countermeasure network disclosed in this embodiment includes the following specific steps:
Step one: the original data is preprocessed according to the imaging principle of the image, and the contrast of the organ and tissue image is enhanced to obtain a clearer organ and tissue image data set.
The image preprocessing method comprises window level and window width adjustment, image resampling, noise addition, random rotation, and random flipping.
The specific implementation method for adjusting the data window level and the window width in the first step comprises the following steps:
Step 1.1: for the organ data, select window level and window width parameters suited to the segmentation task at hand, and enhance the contrast of the target organ.
Medical image data has low contrast, is seriously affected by noise, and comes in inconsistent image sizes. To obtain clear images of organs and tissues, the part of the data that exceeds the chosen range is clipped, which reduces interference from clutter. The contrast of the organ region is enhanced by setting a suitable window level and window width. The problem of inconsistent image resolution is solved by a resampling operation, which prepares the data for the subsequent training of the segmentation model.
The main principle of the window level and window width adjustment in step 1.1 is that different pixel value ranges in the image correspond to different abdominal organ tissues. To enhance the target organ region, the data is normalized to the range of [-200, 250] to obtain a preprocessed image F. The image is first converted to Hounsfield unit (HU) values, with the calculation formula:
$HU = Pixel \times 1 - 1024$
After the image is converted to Hounsfield values, the maximum value $H_{max}$ and the minimum value $H_{min}$ of the tissue density are calculated to adjust the window level and the window width, with the specific formulas:
$H_{max} = (2 \times 100 - 400) / 2.0 + 0.5$
$H_{min} = (2 \times 100 + 400) / 2.0 + 0.5$
The obtained HU values are normalized to the fixed value range [0, 255] to obtain the preprocessed image F, with the formula:
Figure BDA0003667247880000081
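The windowing described in step 1.1 can be sketched in a few lines of Python. This is only a minimal illustration, not the implementation of the invention: the function names and the sample pixel values are hypothetical, the linear rescaling to [0, 255] is an assumed min-max form (the normalization formula itself is given only as a figure), and the two window bounds are simply ordered with `min`/`max` before use.

```python
def window_bounds(wc, ww):
    """Return the two window bounds computed from window level wc and width ww, ordered low to high."""
    a = (2 * wc - ww) / 2.0 + 0.5
    b = (2 * wc + ww) / 2.0 + 0.5
    return min(a, b), max(a, b)


def to_hu(pixel, slope=1.0, intercept=-1024.0):
    """Convert a stored pixel value to Hounsfield units: HU = pixel * slope + intercept."""
    return pixel * slope + intercept


def apply_window(hu, lower, upper, out_max=255.0):
    """Clip an HU value to [lower, upper], then rescale linearly to [0, out_max] (assumed form)."""
    clipped = max(lower, min(upper, hu))
    return (clipped - lower) / (upper - lower) * out_max


lower, upper = window_bounds(wc=100, ww=400)  # window level and width used in the formulas above
row = [0, 924, 1124, 1324, 4000]              # hypothetical stored pixel values
hu_row = [to_hu(p) for p in row]
windowed = [apply_window(h, lower, upper) for h in hu_row]
```

Values outside the window saturate at 0 or 255, which is how clipping removes the out-of-range clutter mentioned above.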
Step 1.2: data augmentation is performed on the data of different tissues.
The purpose of augmenting the images in step 1.2 is, on the one hand, to increase the diversity of the data and the generalization performance of the model and, on the other hand, to prevent overfitting of the network and to ease the training and convergence of the model.
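The augmentation of step 1.2 might look like the following sketch. It is an illustration under stated assumptions rather than the procedure of the invention: rotations are restricted to multiples of 90 degrees for simplicity, the noise is assumed Gaussian with a hypothetical `sigma`, and all function names are made up for the example. Geometric transforms are applied identically to the image and its label mask, while noise is added to the image only.

```python
import random


def flip_horizontal(img):
    """Mirror a 2D list left to right."""
    return [row[::-1] for row in img]


def rotate90(img):
    """Rotate a 2D list clockwise by 90 degrees."""
    return [list(col) for col in zip(*img[::-1])]


def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise to every pixel."""
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in img]


def augment(img, mask, rng, sigma=0.01):
    """Randomly flip and rotate image and mask together, then add noise to the image only."""
    if rng.random() < 0.5:
        img, mask = flip_horizontal(img), flip_horizontal(mask)
    for _ in range(rng.randrange(4)):  # 0 to 3 quarter turns
        img, mask = rotate90(img), rotate90(mask)
    return add_gaussian_noise(img, sigma, rng), mask
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible for a given seed.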
Step two: using the data set preprocessed in step one, features are extracted with the designed dual-adversarial convolutional neural network, the extracted features are decoded into a low-dimensional feature representation, and a semantic-level segmentation result is finally output. The invention provides a dual-adversarial image segmentation model, DA-Net. The DA-Net network consists of two segmentation networks and two discriminator networks. The segmentation networks are a student model and a teacher model. The two have the same encoder-decoder structure; the difference is that the student model is trained with a loss function, whereas the teacher model is an exponential moving average of the weights of the student model. Each discriminator network consists of convolutional layers, dynamic bidirectional attention components, and global average pooling. DA-Net still uses the mean teacher as its basic framework. For the discriminator networks, an adversarial consistency training strategy is adopted, in which two discriminators of the same structure pursue different goals. The first discriminator learns the consistency of the segmentation network's prediction quality between unlabeled data and labeled data. The second discriminator learns the prediction consistency of the teacher and student networks for the same data under different perturbations. Unlike other adversarial networks, the input to each discriminator network is the output of the segmentation network together with the original image. Taking the original image as a reference, the discriminator network learns the matching relationship between the segmentation result and the original image, and thereby measures the quality of the segmentation result.
In the segmentation network, the bidirectional attention component replaces the conventional standard convolution in all layers except the first. The dynamic bidirectional attention component adaptively adjusts the parameters of the convolution kernel according to different inputs, which improves the feature representation capability of the network and reduces the risk of overfitting. It fully decouples the relationship between space and channels, reduces the parameter amount of the network while improving its feature representation capability, improves the inference speed of the network, and makes the network easier to deploy on edge devices. In addition, the segmentation network and the discriminators are trained alternately, and the discriminators are not needed in the inference stage, so no additional computational overhead is incurred.
In step two, the specific implementation of the organ tissue segmentation network based on the dual-countermeasure network is divided into two parts: the generation of the dynamic bidirectional attention component and the adversarial consistency training strategy.
Step 2.1: generate the bidirectional attention component. The dynamic bidirectional attention component adaptively adjusts the parameters of the convolution kernel according to different inputs, which improves the feature representation capability of the network and reduces the risk of overfitting. It fully decouples the relationship between space and channels, reduces the parameter amount of the network while improving its feature representation capability, improves the inference speed of the network, and makes the network easier to deploy on edge devices.
As shown in FIG. 3, the input data is given as
$x \in \mathbb{R}^{C \times H \times W}$,
where C is the number of input channels and H × W is the resolution of the input feature map. To enhance the saliency of important spatial positions, the input feature map is first passed through a simple spatial attention. The specific operation is shown in FIG. 3(a): the input feature map is reduced in dimension by a 1 × 1 convolution and normalized by a sigmoid activation function, and the resulting spatial attention weights are multiplied pixel by pixel with the input feature map to obtain the output feature map
$x_1 \in \mathbb{R}^{C \times H \times W}$.
Then $x_1$ is passed through global average pooling to obtain a feature map of size C × 1 × 1, which is reduced in dimension by a 1 × 1 convolution and passed through a softmax activation function to obtain the coefficients
$p \in \mathbb{R}^{N \times 1 \times 1}$.
Here N is the number of predefined convolution kernels, a hyper-parameter set according to the specific task; N = 4 was chosen through experimental verification. The coefficients p are multiplied with the N convolution kernels respectively and the weighted kernels are summed, so that only one convolution kernel is generated for the convolution operation. The weight $\tilde{w}$ of this convolution kernel is expressed as:
$\tilde{w} = \sum_{i=1}^{N} p_i w_i$
where $p_i$ denotes the i-th coefficient of p, with $0 \le p_i \le 1$ and $\sum_{i=1}^{N} p_i = 1$, and $w_i$ is the weight of the i-th convolution kernel.
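The kernel aggregation described above, in which softmax coefficients weight N candidate kernels that are then summed into a single kernel, can be sketched as follows. The sketch is illustrative only: the logits and the flattened 3 × 3 kernels are hypothetical values, the function names are made up for the example, and the spatial attention and pooling stages that would produce the logits are omitted.

```python
import math


def softmax(z):
    """Numerically stable softmax; the outputs lie in [0, 1] and sum to 1."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def aggregate_kernels(p, kernels):
    """Weighted sum of N same-shaped, flattened kernels: w = sum_i p_i * w_i."""
    size = len(kernels[0])
    return [sum(p_i * k[j] for p_i, k in zip(p, kernels)) for j in range(size)]


logits = [0.2, -1.0, 0.5, 0.0]                    # hypothetical 1x1-convolution output, N = 4
p = softmax(logits)
kernels = [[float(i + 1)] * 9 for i in range(4)]  # hypothetical flattened 3x3 kernels
w = aggregate_kernels(p, kernels)                 # the single kernel used for the convolution
```

Because the aggregation happens on the weights rather than on the outputs, only one convolution is executed per input regardless of N.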
Meanwhile, the parameter amount Q of the standard dynamic convolution is calculated as:
$Q = C_{in} \times 4 + 4 \times C_{in} \times C_{out} \times 3 \times 3$
where $C_{in}$ and $C_{out}$ are the numbers of channels of the input and output feature maps. This is more than 4 times the parameter amount of an ordinary convolution. To reduce the parameter amount, it is reduced to
Figure BDA0003667247880000098
Figure BDA0003667247880000099
Obviously, compared with the standard convolution and the standard dynamic convolution, the method can greatly reduce the parameter quantity and effectively improve the model performance.
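The parameter count Q of the standard dynamic convolution can be checked numerically against an ordinary convolution. The sketch below covers only these two counts; the channel numbers are hypothetical, bias terms are ignored, and the function names are made up for the example.

```python
def standard_conv_params(c_in, c_out, k=3):
    """Weights of an ordinary k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def dynamic_conv_params(c_in, c_out, n=4, k=3):
    """Q = C_in * N + N * C_in * C_out * k * k for a standard dynamic convolution."""
    return c_in * n + n * c_in * c_out * k * k


c_in, c_out = 64, 128  # hypothetical channel counts
std = standard_conv_params(c_in, c_out)
dyn = dynamic_conv_params(c_in, c_out)
ratio = dyn / std      # slightly above N = 4 because of the C_in * N attention term
```

The ratio exceeds N only by the small attention term, which is why the standard dynamic convolution costs a little more than 4 times an ordinary convolution for N = 4.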
Step 2.2: adopt the adversarial consistency training strategy, in which two discriminators of the same structure pursue different goals. The first discriminator learns the consistency of the segmentation network's prediction quality between unlabeled data and labeled data. The second discriminator learns the prediction consistency of the teacher and student networks for the same data under different perturbations. Unlike other adversarial networks, the input to each discriminator network is the output of the segmentation network together with the original image. Taking the original image as a reference, the discriminator network learns the matching relationship between the segmentation result and the original image, and thereby measures the quality of the segmentation result. In the segmentation network, the bidirectional attention component replaces the conventional standard convolution in all layers except the first.
The adversarial consistency training framework adds two discriminator networks to the mean teacher. The two discriminator networks have the same structure but different functions: discriminator D_1 learns the difference in output quality of the network between labeled data and unlabeled data, and discriminator D_2 learns the difference between perturbed and unperturbed unlabeled data. Finally, the supervised loss L_s, the consistency loss L_semi, and the adversarial losses L_adv1 and L_adv2 together encourage the student network to generate high-quality predictions for the unlabeled data. The structure of the DA-Net model is shown in FIG. 2. The adversarial consistency training strategy is implemented as follows:
The adversarial learning is realized by alternate training. The segmentation network takes a medical image as input and outputs a segmentation prediction map; the output of the segmentation network and the input image are concatenated and fed into the discriminator network, whose output is a class label, where 0 indicates a poor-quality segmentation result and 1 indicates a good-quality one. During training, the segmentation network is encouraged to generate, for the unlabeled data x_u, high-quality segmentation results whose score is close to 1. In summary, training aims to make the segmentation network output high-quality segmentation results and to make the discriminator network unable to tell whether its input comes from a label or from the segmentation network. The objective function L(θ_s) of the segmentation network is defined as:
Figure BDA0003667247880000101
The discriminator networks, in turn, should distinguish the outputs of the segmentation network as well as possible. The objective functions of discriminators D_1 and D_2 are defined as:
Figure BDA0003667247880000102
Figure BDA0003667247880000103
where L_s(·) consists of a multi-class cross-entropy loss and a Dice loss, L_semi is the mean squared error, and L_adv1 and L_adv2 are multi-class cross-entropy losses; x_i and y_i are the input data and the corresponding labels; x_u and x_ema are the input unlabeled data with noise perturbation;
Figure BDA0003667247880000104
and
Figure BDA0003667247880000105
are the segmentation results of the labeled data and the unlabeled data, respectively, and
Figure BDA0003667247880000106
is the prediction result of the teacher network; λ is a weighting coefficient that follows a Gaussian ramp-up curve,
Figure BDA0003667247880000107
i is the number of iterations of the training.
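A Gaussian ramp-up weight of the kind described for λ could be sketched as below. The exact curve used by the invention is given only as a figure, so the form exp(-5(1 - i/I)^2), which is common in mean teacher implementations, is an assumption here; `ramp_len` (the ramp-up length I) and `lam_max` are hypothetical parameters.

```python
import math


def gaussian_rampup(i, ramp_len, lam_max=1.0):
    """Assumed ramp-up: lam_max * exp(-5 * (1 - i / ramp_len) ** 2), clamped to [0, ramp_len]."""
    if ramp_len <= 0:
        return lam_max
    t = min(max(i / ramp_len, 0.0), 1.0)
    return lam_max * math.exp(-5.0 * (1.0 - t) ** 2)


# Weight of the unsupervised terms at a few hypothetical iterations.
schedule = [gaussian_rampup(i, ramp_len=100) for i in (0, 25, 50, 75, 100, 150)]
```

A schedule of this shape keeps the unsupervised terms small while the early predictions on unlabeled data are still unreliable.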
The parameters of the teacher model are an exponential moving average (EMA) of the student model parameters, an approach that has proven effective in most methods. The update is defined as:
$\theta'_t = 0.999 \, \theta'_{t-1} + (1 - 0.999) \, \theta_t$
where $\theta'_t$ are the updated parameters of the teacher model and $\theta_t$ are the weight parameters of the student model. The processed data are input into the trained model to obtain a preliminary segmentation result p.
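The EMA update of the teacher parameters can be written in a few lines. The sketch below uses plain dictionaries of scalar parameters as a stand-in for real network weights; the parameter names and values are hypothetical.

```python
def ema_update(teacher, student, alpha=0.999):
    """Return new teacher parameters: alpha * teacher + (1 - alpha) * student."""
    return {name: alpha * teacher[name] + (1.0 - alpha) * student[name] for name in teacher}


teacher = {"conv1.weight": 1.0, "conv1.bias": 0.0}  # hypothetical scalar parameters
student = {"conv1.weight": 0.0, "conv1.bias": 2.0}
teacher = ema_update(teacher, student)
```

With a decay of 0.999 the teacher changes slowly, so it averages the student over roughly the last thousand updates.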
Step three: the segmentation result p is post-processed, with holes filled through morphological reconstruction, to obtain a fine organ and tissue image segmentation result.
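Hole filling on a binary mask can be illustrated with a border flood fill: background pixels that cannot be reached from the image border are enclosed by foreground and are therefore filled. This is a stand-in sketch for the morphological reconstruction named above, assuming a 2D binary mask and 4-connectivity; the function name and the example mask are hypothetical.

```python
from collections import deque


def fill_holes(mask):
    """Fill background regions of a 2D binary mask that are not connected to the border."""
    h, w = len(mask), len(mask[0])
    reached = [[False] * w for _ in range(h)]
    queue = deque()
    # Seed the flood fill with every background pixel on the image border.
    for r in range(h):
        for c in range(w):
            on_border = r in (0, h - 1) or c in (0, w - 1)
            if on_border and mask[r][c] == 0:
                reached[r][c] = True
                queue.append((r, c))
    # Spread through 4-connected background pixels.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and mask[nr][nc] == 0 and not reached[nr][nc]:
                reached[nr][nc] = True
                queue.append((nr, nc))
    # Everything not reached is either foreground or an enclosed hole.
    return [[0 if reached[r][c] else 1 for c in range(w)] for r in range(h)]


mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
filled = fill_holes(mask)
```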
Step four: the image segmentation results of organs and tissues are evaluated by using the Dice coefficient and Average Symmetric Surface Distance (ASD).
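The Dice coefficient used in step four measures the overlap between a predicted mask and the ground truth as 2|A ∩ B| / (|A| + |B|). The sketch below works on flattened binary masks; returning 1.0 when both masks are empty is a convention assumed here, and the ASD metric is not covered.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for flat binary masks."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


pred = [1, 1, 0, 0]    # hypothetical flattened prediction
target = [1, 0, 1, 0]  # hypothetical flattened ground truth
score = dice_coefficient(pred, target)
```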
In the experiment of the invention, two data sets, namely a liver image and a skin tissue lesion image, are adopted to verify the effectiveness of the invention.
Liver image: the invention selects the Liver Tumor Segmentation Challenge (LiTS) data as an experimental data set. LiTS comprises 131 labeled CT scans and 70 scans whose labels are not publicly released. To enhance the liver contrast and remove interference, the intensity values of all CT images are clipped to the range of [-200, 250] HU, and the size of each image is 512 × 512. For semi-supervised learning, 121 cases are randomly selected as the training set and the remaining 10 cases are used as the test set; random data augmentation such as flipping, mirroring, and rotation is applied to the training set. For better comparison, 10% and 20% of the cases in the training set are randomly selected as labeled data, respectively, and the rest are used as unlabeled data.
Skin tissue lesion image: the skin tissue image data set comes from the 2018 International Skin Imaging Collaboration (ISIC) skin lesion segmentation challenge. The training set contains 2594 images and the validation set contains 100 images. The data set covers different types of skin lesions at different resolutions. To improve the computational efficiency of the different models, all images are resized to 256 × 192. Similarly, for semi-supervised learning, 10% (259 images) and 20% (519 images) of the training set are randomly selected as labeled data and the rest are used as unlabeled data. In the training phase, online random data augmentation is performed.
All experiments were performed on one server with the following configuration: Intel(R) Xeon(R) Gold 6226R CPU @ 2.90 GHz, 40 GB RAM, NVIDIA GeForce RTX 3090 GPU, Ubuntu 18.04, and PyTorch 1.7. Adam is used to optimize the segmentation model, with an initial learning rate of 1 × 10^-3. The discriminator networks are optimized with stochastic gradient descent with a momentum of 0.9, an initial learning rate of 0.01, and a weight decay of 0.0001.
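The random split into labeled and unlabeled subsets can be sketched as follows. The rounding rule and the seed are assumptions made for the example; with ordinary rounding, 10% and 20% of 2594 images give the 259 and 519 labeled images stated above.

```python
import random


def split_labeled(ids, fraction, seed=0):
    """Shuffle sample ids and return (labeled, unlabeled) with round(len * fraction) labeled."""
    ids = list(ids)
    random.Random(seed).shuffle(ids)
    n_labeled = round(len(ids) * fraction)
    return ids[:n_labeled], ids[n_labeled:]


image_ids = range(2594)  # size of the skin lesion training set
labeled_10, unlabeled_10 = split_labeled(image_ids, 0.10)
labeled_20, unlabeled_20 = split_labeled(image_ids, 0.20)
```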
Two contributions are emphasized in the invention: the bidirectional attention component TAC-Dy and adversarial consistency learning. The invention takes the semi-supervised mean teacher (MT) method as the baseline and U-Net as the backbone network. Ablation experiments were performed on the LiTS liver test set, with the training set divided into 10% labeled and 90% unlabeled data. The results in Table 1 demonstrate the effectiveness of the invention. The U-Net-based mean teacher method segments the liver with a Dice score of 92.39%; adding the designed discriminator D_1, discriminator D_2, and DyTAC improves on MT by 0.72%, 0.75%, and 0.97%, respectively. The bidirectional attention component effectively improves the feature representation capability of the network, discriminator D_1 effectively exploits the relationship between the unlabeled data and the labeled data, and discriminator D_2 effectively improves the generalization capability of the network.
TABLE 1 ablation experiments on the LiTS test set
Figure BDA0003667247880000111
To further verify the effectiveness of the proposed framework, the invention is compared with mainstream semi-supervised image segmentation methods. Tables 2 and 3 show the performance of the supervised method U-Net, the semi-supervised methods DAN, Mean Teacher, UA-MT, TCSM_v2, and CPS, and the proposed DA-Net on the test data set. With 10% and 20% labeled training data, the proposed method achieves Dice scores of 94.12% and 95.07% and average symmetric surface distances (ASD) of 3.51 mm and 3.04 mm, respectively, outperforming the other methods.
TABLE 2 comparative experimental results of different methods on LiTS liver test set, 10% of labeled data
Figure BDA0003667247880000121
TABLE 3 comparative experimental results of different methods on LiTS liver test set, 20% of labeled data
Figure BDA0003667247880000122
Table 4 shows the results of the comparative experiments on the skin tissue data; the method of the invention generalizes better. In addition, FIG. 4 shows the liver segmentation results of the different methods with 10% labeled data, where green marks the ground truth, red marks the segmentation result, and yellow marks their overlap, so fewer green and red regions and more yellow regions indicate a better segmentation. FIG. 5 shows the segmentation results for skin tissue lesions under the 20% training data condition; the visualizations show that the segmentation results of the invention are of higher quality than those of the other methods.
Table 4 Comparative experiments on the skin lesion data set
Method                       Labeled/unlabeled   DI      JA      SE      AC      SP
UNet++                       2594/0              87.67   80.06   90.65   93.29   96.78
UNet++                       259/0               82.57   73.55   88.31   91.00   93.76
DAN (MICCAI 2017)            259/2335            84.26   75.15   87.23   91.97   95.75
Mean teacher (NIPS 2017)     259/2335            84.58   76.54   87.25   92.02   95.69
UA-MT (MICCAI 2019)          259/2335            84.80   78.02   88.63   91.94   95.82
TCSM_v2 (TNNLS 2020)         259/2335            84.71   75.55   90.22   91.92   95.77
CPS (CVPR 2021)              259/2335            84.72   76.81   86.87   91.87   95.42
DA-Net (ours)                259/2335            85.19   77.80   89.38   91.78   96.80
UNet++                       519/0               84.36   75.64   88.83   92.15   94.95
DAN (MICCAI 2017)            519/2075            85.41   77.16   89.69   92.16   95.01
Mean teacher (NIPS 2017)     519/2075            85.83   77.48   89.97   92.57   94.46
UA-MT (MICCAI 2019)          519/2075            86.19   78.06   91.15   92.64   94.49
TCSM_v2 (TNNLS 2020)         519/2075            86.16   77.98   91.07   92.56   94.26
CPS (CVPR 2021)              519/2075            86.34   78.17   90.57   92.72   94.78
DA-Net (ours)                519/2075            86.63   78.37   90.72   93.09   94.52
In addition, Table 5 shows the efficiency analysis of the different networks at the inference stage. Since the proposed discriminator networks are used only in the training phase, only the efficiency of the segmentation network is tested. Specifically, the invention replaces the standard convolutions of the segmentation network, except in the first layer, with the bidirectional attention component based on dynamic convolution. The input size used for the computation estimate is 1 × 256 × 256. The invention clearly improves the inference speed of the segmentation network and reduces the parameter amount.
TABLE 5 Comparison of the computational efficiency of different network models
Model     Operations (GFLOPs)   Parameters (M)   Storage usage (MB)
U-Net     65.39                 34.52            131.82
DA-Net    9.26                  5.18             21.11
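The reduction factors implied by Table 5 follow from simple division. The short sketch below only restates the table values and computes the ratios; the dictionary layout and key names are hypothetical.

```python
# Values copied from Table 5: (GFLOPs, parameters in M, storage in MB).
table5 = {
    "U-Net": (65.39, 34.52, 131.82),
    "DA-Net": (9.26, 5.18, 21.11),
}

# Ratio of U-Net to DA-Net for each column; larger means a bigger saving for DA-Net.
flops_ratio, params_ratio, storage_ratio = (
    u / d for u, d in zip(table5["U-Net"], table5["DA-Net"])
)
```

Each ratio is above 6, that is, DA-Net needs less than one sixth of the operations, parameters, and storage of U-Net for the stated input size.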
The method further comprises: according to the organ and tissue image segmentation results evaluated in step four, designing a deep-learning-based segmentation system for human organs and skin lesion tissues, which assists artificial intelligence in image segmentation, tissue mark localization, image reconstruction, risk level evaluation, lesion detection, and auxiliary training for the organs and tissues of the human abdomen.
The auxiliary training comprises assisting the daily training of medical staff based on the liver image segmentation data, which helps to improve the precision and efficiency of analyzing abdominal organ images and tissue lesions.
The invention also discloses a semi-supervised organ tissue image segmentation system based on a dual-countermeasure network, which is used to implement the semi-supervised organ tissue image segmentation method based on a dual-countermeasure network.
The image preprocessing module is used for uniformly preprocessing an input image, mainly comprises data enhancement and resampling of the image, and is beneficial to accelerating the convergence speed of the model and reducing the overfitting risk.
The network training module is used for training on the preprocessed data; it combines the mean teacher training strategy with the adversarial consistency training strategy and selects parameters such as the number of iterations, the learning rate, and the batch size.
The network test module is used for running prediction with the trained model and outputting the image segmentation prediction result.
The visualization module visualizes the loss curve of the training set, the accuracy curve of the test set, and the segmentation results of the images.
The organ tissue includes liver, pancreas, kidney, lung tissue, and skin tissue.
The above detailed description is further intended to illustrate the objects, technical solutions and advantages of the present invention, and it should be understood that the above detailed description is only an example of the present invention and is not intended to limit the scope of the present invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (7)

1. A semi-supervised organ and tissue image segmentation method based on a dual-countermeasure network, characterized by comprising the following steps:
step one: preprocessing original data according to an imaging principle of an image, and enhancing the contrast of an organ and tissue image to obtain a clearer organ and tissue image data set;
step two: using the data set preprocessed in step one, performing feature extraction with a designed dual-adversarial convolutional neural network, decoding the extracted features into a low-dimensional feature representation, and outputting a semantic-level segmentation result; a dual-adversarial image segmentation model, DA-Net, is adopted; the DA-Net network consists of two segmentation networks and two discriminator networks; the segmentation networks are a student model and a teacher model; the student model and the teacher model have the same encoder-decoder structure, the difference being that the student model is trained with a loss function, whereas the teacher model is an exponential moving average of the weights of the student model; each discriminator network consists of convolutional layers, dynamic bidirectional attention components, and global average pooling; DA-Net still uses the mean teacher as its basic framework; for the discriminator networks, an adversarial consistency training strategy is adopted, in which two discriminators of the same structure pursue different goals; the first discriminator learns the consistency of the segmentation network's prediction quality between unlabeled data and labeled data; the second discriminator learns the prediction consistency of the teacher and student networks for the same data under different perturbations; the input of each discriminator network is the output of the segmentation network together with the original image; taking the original image as a reference, the discriminator network learns the matching relationship between the segmentation result and the original image, and thereby measures the quality of the segmentation result; in the segmentation network, the bidirectional attention component replaces the conventional standard convolution; the dynamic bidirectional attention component adaptively adjusts the parameters of the convolution kernel according to different inputs, improving the feature representation capability of the network and reducing the risk of overfitting; the dynamic bidirectional attention component fully decouples the relationship between space and channels, reduces the parameter amount of the network while improving its feature representation capability, improves the inference speed of the network, and makes the network easier to deploy on edge devices; in addition, the segmentation network and the discriminators are trained alternately, and the discriminators are not needed in the inference stage, so no additional computational overhead is incurred;
step three: based on step one and step two, jointly learning from a small amount of accurately labeled data and a large amount of unlabeled data, so as to reduce the dependence of the network on pixel-level labeled data and improve the ability of the segmentation network to transfer knowledge from labeled data to unlabeled data, thereby accurately locating and segmenting organs and other tissues of the human body and obtaining a more refined organ and tissue image segmentation result; post-processing the segmentation result p, with holes filled through morphological reconstruction, to obtain a fine organ and tissue image segmentation result;
step four: and evaluating the image segmentation result of the organ and the tissue by using the Dice coefficient and the average symmetric surface distance ASD, and realizing semi-supervised organ and tissue image segmentation based on a dual-countermeasure network.
2. The method for semi-supervised organ and tissue image segmentation based on the dual-countermeasure network as claimed in claim 1, wherein: in step one, the image preprocessing method comprises window level and window width adjustment, image resampling, noise addition, random rotation, and random flipping;
the specific implementation method for adjusting the data window level and the window width in the first step comprises the following steps:
step 1.1: selecting appropriate window level and window width parameters according to different segmentation tasks for organ data, and enhancing the contrast of a target organ;
because the contrast in the medical image data is low, the image is seriously interfered by noise and the sizes of the images are inconsistent, in order to acquire clear organ and tissue images, other clutter interference is reduced by cutting off the part which exceeds the range in the data; the contrast of the organ area is enhanced by setting a proper window level and window width; solving the problem of inconsistent image resolution through resampling operation, and preparing for next training of the segmentation model;
the main principle of the window level and window width adjustment in step 1.1 is that different pixel value ranges in the image correspond to different abdominal organ tissues; to enhance the target organ region, the data is normalized to the range [m, n] to obtain a preprocessed image F; the image is first converted to Hounsfield unit (HU) values, with the calculation formula:
$HU = Pixel \times Rs + Ri$
where HU represents the density of different tissues, Pixel is the pixel value, Rs is the conversion coefficient, and Ri is the conversion offset; after the image is converted to Hounsfield values, the maximum value $H_{max}$ and the minimum value $H_{min}$ of the tissue density are calculated to adjust the window level and the window width, with the specific formulas:
$H_{max} = (2 \times wc - ww) / 2.0 + 0.5$
$H_{min} = (2 \times wc + ww) / 2.0 + 0.5$
where ww (window width) is the window width and wc (window center) is the window level; the obtained HU values are normalized to the fixed value range [m, n] to obtain the preprocessed image F, with the formula:
Figure FDA0003667247870000021
step 1.2: performing data augmentation on the data of different tissues;
the purpose of data enhancement of the image in step 1.2 is to facilitate increasing the diversity of data and the generalization performance of the model on the one hand, and to prevent overfitting of the network and facilitate the training and convergence of the model on the other hand.
3. The method for semi-supervised organ and tissue image segmentation based on the dual-countermeasure network as claimed in claim 2, wherein: in step two, the specific implementation of the organ tissue segmentation network based on the dual-countermeasure network is divided into two parts, namely the generation of the dynamic bidirectional attention component and the adversarial consistency training strategy;
step 2.1: generating a bidirectional attention component, wherein the dynamic bidirectional attention component adaptively adjusts parameters of a convolution kernel according to different inputs, improves the characteristic representation capability of the network, and reduces the risk of overfitting; the dynamic bidirectional attention component fully decouples the relation between the space and the channel, reduces the parameter quantity of the network while improving the characteristic expression capability of the network, improves the reasoning speed of the network, and is easier to deploy in edge equipment;
the input data is given as
$x \in \mathbb{R}^{C \times H \times W}$,
where C is the number of input channels and H × W is the resolution of the input feature map; to enhance the saliency of important spatial positions, the input feature map is first passed through a simple spatial attention; the input feature map is reduced in dimension by a 1 × 1 convolution and normalized by a sigmoid activation function, and the resulting spatial attention weights are multiplied pixel by pixel with the input feature map to obtain the output feature map
$x_1 \in \mathbb{R}^{C \times H \times W}$;
then $x_1$ is passed through global average pooling to obtain a feature map of size C × 1 × 1, which is reduced in dimension by a 1 × 1 convolution and passed through a softmax activation function to obtain the coefficients
$p \in \mathbb{R}^{N \times 1 \times 1}$,
where N is the number of predefined convolution kernels, a hyper-parameter set according to the specific task, and N = 4 is chosen through experimental verification; the coefficients p are multiplied with the N convolution kernels respectively and the weighted kernels are summed, so that only one convolution kernel is generated for the convolution operation; the weight $\tilde{w}$ of this convolution kernel is expressed as:
$\tilde{w} = \sum_{i=1}^{N} p_i w_i$
where $p_i$ denotes the i-th coefficient of p, with $0 \le p_i \le 1$ and $\sum_{i=1}^{N} p_i = 1$, and $w_i$ is the weight of the i-th convolution kernel;
meanwhile, the parameter amount Q of the standard dynamic convolution is calculated as:
$Q = C_{in} \times N + N \times C_{in} \times C_{out} \times k \times k$
where k × k is the size of the convolution kernel, and $C_{in}$ and $C_{out}$ are the numbers of channels of the input and output feature maps; this is more than N times the parameter amount of an ordinary convolution; to reduce the parameter amount, the original parameter amount is reduced to
Figure FDA0003667247870000037
Figure FDA0003667247870000038
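As a worked instance of the parameter count Q of the standard dynamic convolution, the short sketch below compares it with an ordinary convolution. The channel counts and kernel size are hypothetical values chosen only for illustration; N = 4 follows the setting stated above. The reduced parameter count of the proposed component is not computed here.

```python
def standard_conv_params(c_in, c_out, k):
    # Weights of an ordinary k x k convolution (bias ignored).
    return c_in * c_out * k * k

def dynamic_conv_params(c_in, c_out, k, n):
    # Q = C_in * N + N * C_in * C_out * k * k
    return c_in * n + n * c_in * c_out * k * k

# Hypothetical layer: 64 input channels, 64 output channels, 3 x 3 kernel, N = 4.
std = standard_conv_params(64, 64, 3)    # 36864
dyn = dynamic_conv_params(64, 64, 3, 4)  # 256 + 147456 = 147712
ratio = dyn / std                        # slightly above N = 4
```

The ratio exceeds N because of the additional C_in × N attention parameters, which matches the statement that Q is at least N times the ordinary count.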
Step 2.2: adopting an adversarial consistency training strategy, in which two discriminators with the same structure serve different objectives; the first discriminator learns the consistency of the prediction quality of the segmentation network between unlabeled data and labeled data; the second discriminator learns the prediction consistency of the teacher and student networks for the same data under different perturbations; unlike other adversarial networks, the input of the discriminator network is the output of the segmentation network together with the original image; taking the original image as a reference, the discriminator network learns the matching relation between the segmentation result and the original image, and thereby measures the quality of the segmentation result; in the segmentation network, the bidirectional attention component is used to replace the conventional standard convolution;
the adversarial consistency training framework adds two discriminator networks on top of the mean teacher; the two discriminator networks have the same structure but different functions: discriminator $D_1$ learns the difference in quality between the network outputs for labeled data and for unlabeled data; discriminator $D_2$ learns the difference between the outputs for unlabeled data with and without perturbation; finally, the supervised loss $L_s$, the consistency loss $L_{semi}$, and the adversarial losses $L_{adv1}$ and $L_{adv2}$ are used to encourage the student network to generate high-quality predictions for unlabeled data; the adversarial consistency training strategy adopting the DA-Net model structure is specifically implemented as follows:
the adversarial learning is realized by alternate training: the segmentation network takes a medical image as input and outputs a segmentation prediction map; the output of the segmentation network and the input image are concatenated and fed into the discrimination network, whose output is a class label, where 0 indicates a poor-quality segmentation result and 1 indicates a good-quality segmentation result; during training, the segmentation network is encouraged to generate, for unlabeled data $x_u$, high-quality segmentation results with scores close to 1; in summary, the training process aims to make the segmentation network output high-quality segmentation results such that the discrimination network cannot distinguish whether its input comes from a label or from the segmentation network; the objective function $L(\theta_s)$ of the segmentation network is defined as:
Figure FDA0003667247870000039
the discrimination network aims to distinguish the outputs of the segmentation network as well as possible; the objective functions of discriminators $D_1$ and $D_2$ are defined as:
Figure FDA00036672478700000310
Figure FDA0003667247870000041
where $L_s(\cdot)$ combines the multi-class cross-entropy loss and the Dice loss, $L_{semi}$ is the mean square error, and $L_{adv1}$ and $L_{adv2}$ are multi-class cross-entropy losses; $x_i$ and $y_i$ are the input data and the corresponding labels; $x_u$ and $x_{ema}$ are the input unlabeled data with noise perturbation;
Figure FDA0003667247870000042
and
Figure FDA0003667247870000043
are the segmentation results for the labeled data and the unlabeled data, respectively;
Figure FDA0003667247870000044
is the prediction of the teacher network; λ is a weighting coefficient that follows a Gaussian ramp-up curve,
Figure FDA0003667247870000045
where i is the training iteration number;
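Three building blocks named above can be sketched in NumPy: the concatenation that forms the discriminator input, the mean-square-error consistency loss, and a Gaussian ramp-up weight. This is a hedged illustration only. All function names are hypothetical, and the ramp-up form `lam_max * exp(-5 * (1 - t)^2)` is an assumption borrowed from common mean-teacher implementations; the source does not state its constants here.

```python
import numpy as np

def discriminator_input(image, seg_prob):
    """Concatenate a segmentation map (K, H, W) with its image (C_img, H, W) along channels."""
    return np.concatenate([seg_prob, image], axis=0)

def consistency_loss(student_prob, teacher_prob):
    """Mean square error between student and teacher predictions (the role of L_semi)."""
    return float(np.mean((student_prob - teacher_prob) ** 2))

def gaussian_rampup(i, rampup_iters, lam_max=1.0):
    """Assumed ramp-up form, not taken from the source: lam_max * exp(-5 * (1 - t)^2)."""
    t = np.clip(i / rampup_iters, 0.0, 1.0)
    return float(lam_max * np.exp(-5.0 * (1.0 - t) ** 2))

# Hypothetical shapes: a single-channel 32 x 32 image and a 2-class prediction map.
image = np.zeros((1, 32, 32))
seg_prob = np.full((2, 32, 32), 0.5)
d_in = discriminator_input(image, seg_prob)
loss_same = consistency_loss(seg_prob, seg_prob)
lam_start = gaussian_rampup(0, 100)
lam_end = gaussian_rampup(100, 100)
```

Under this assumed form the weight starts near zero, so the consistency term contributes little while the teacher predictions are still unreliable, and it reaches `lam_max` once the ramp-up period ends.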
the parameters of the teacher model are the exponential moving average (EMA) of the student model parameters, which has proven effective in most methods, and are defined as:
$\theta'_t = \alpha \theta'_{t-1} + (1 - \alpha) \theta_t$
where $\theta'_t$ denotes the updated parameters of the teacher model, $\theta_t$ denotes the weight parameters of the student model, and α is the smoothing coefficient, a hyper-parameter that determines how strongly the teacher model depends on the student model;
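The EMA update of the teacher parameters can be sketched as below. The dictionary-of-arrays parameter layout, the function name `ema_update`, and the default α = 0.99 are illustrative assumptions; the source does not specify a value for α.

```python
import numpy as np

def ema_update(teacher, student, alpha=0.99):
    """Return theta'_t = alpha * theta'_{t-1} + (1 - alpha) * theta_t for every parameter."""
    return {
        name: alpha * teacher[name] + (1.0 - alpha) * student[name]
        for name in teacher
    }

# Hypothetical single-parameter models, updated with alpha = 0.9.
teacher = {"w": np.zeros(3)}
student = {"w": np.ones(3)}
teacher = ema_update(teacher, student, alpha=0.9)
```

A larger α makes the teacher change more slowly and depend less on the most recent student weights.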
and inputting the processed data into the trained model to obtain a primary segmentation result p.
4. The semi-supervised organ tissue image segmentation method based on a dual-countermeasure network according to claim 3, characterized in that the fourth step is implemented as follows:
the segmentation results are evaluated using the Dice coefficient (DI), the Jaccard similarity index (JA), pixel accuracy (AC), sensitivity (SE), and specificity (SP), defined as follows:
$DI = \frac{2TP}{2TP + FP + FN}$
$JA = \frac{TP}{TP + FP + FN}$
$AC = \frac{TP + TN}{TP + TN + FP + FN}$
$SE = \frac{TP}{TP + FN}$
$SP = \frac{TN}{TN + FP}$
where TP, TN, FP, and FN denote correctly predicted positives, correctly predicted negatives, incorrectly predicted positives, and incorrectly predicted negatives, respectively;
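The five overlap metrics can be computed from the confusion counts of a binary mask, as in the sketch below. It uses the standard definitions of these metrics; the function name and the toy masks are hypothetical, and the sketch does not guard against empty masks, where a denominator can be zero.

```python
import numpy as np

def segmentation_metrics(pred, truth):
    """Compute DI, JA, AC, SE, SP for binary masks of identical shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.sum(pred & truth))    # correctly predicted positives
    tn = int(np.sum(~pred & ~truth))  # correctly predicted negatives
    fp = int(np.sum(pred & ~truth))   # incorrectly predicted positives
    fn = int(np.sum(~pred & truth))   # incorrectly predicted negatives
    return {
        "DI": 2 * tp / (2 * tp + fp + fn),
        "JA": tp / (tp + fp + fn),
        "AC": (tp + tn) / (tp + tn + fp + fn),
        "SE": tp / (tp + fn),
        "SP": tn / (tn + fp),
    }

# Toy masks: TP = 2, TN = 2, FP = 1, FN = 1.
metrics = segmentation_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1])
```

For these toy masks the Jaccard index is 0.5 and the remaining four metrics all equal 2/3.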
$ASD = \frac{1}{|S(A)| + |S(B)|} \left( \sum_{a \in S(A)} \min_{b \in S(B)} d(a, b) + \sum_{b \in S(B)} \min_{a \in S(A)} d(b, a) \right)$
where A and B denote the ground truth and the segmentation result, respectively; S(A) and S(B) denote the sets of surface voxels of A and B, and d(·) denotes the Euclidean distance; the closer the Dice coefficient, Jaccard similarity index, pixel accuracy, sensitivity, and specificity are to 1, the closer the segmentation result is to the ground truth and the better the segmentation; ASD denotes the average surface distance, and a smaller ASD indicates that the segmentation result is closer to the ground truth; notably, the final organ segmentation results are evaluated on the synthesized 3D volume.
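The average surface distance can be sketched with a brute-force pairwise distance matrix, assuming the surface voxel sets S(A) and S(B) have already been extracted as coordinate lists. The function name and the toy point sets are hypothetical, and the symmetric form used here is the common definition of ASD. The quadratic memory cost makes this suitable only for small surfaces.

```python
import numpy as np

def average_surface_distance(surf_a, surf_b):
    """Symmetric average surface distance between two sets of surface points."""
    a = np.asarray(surf_a, dtype=float)  # (|S(A)|, 3)
    b = np.asarray(surf_b, dtype=float)  # (|S(B)|, 3)
    # Pairwise Euclidean distances, shape (|S(A)|, |S(B)|).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Nearest-neighbour distances in both directions, averaged over all surface points.
    total = d.min(axis=1).sum() + d.min(axis=0).sum()
    return float(total / (len(a) + len(b)))

# Two parallel two-point "surfaces" one voxel apart along the z axis.
asd = average_surface_distance([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 1)])
```

Every point in this toy example lies exactly one voxel from its nearest neighbour on the other surface, so the average surface distance is 1.0, and identical surfaces give 0.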
5. The semi-supervised organ tissue image segmentation method based on a dual-countermeasure network according to claim 4, characterized in that: in the fifth step, a deep-learning-based segmentation system for human organs and skin lesion tissues is designed according to the organ and tissue image segmentation results evaluated in the fourth step, providing artificial-intelligence assistance for human abdominal organ and tissue segmentation, tissue labeling and localization, image reconstruction, risk assessment, lesion detection, and training.
6. The semi-supervised organ tissue image segmentation method based on a dual-countermeasure network according to claim 5, characterized in that: the auxiliary training comprises assisting the daily training of medical staff based on liver image segmentation data, thereby helping to improve the accuracy and efficiency of analyzing abdominal organ images and tissue lesions.
7. A semi-supervised organ tissue image segmentation system based on a dual-countermeasure network, for implementing the semi-supervised organ tissue image segmentation method based on a dual-countermeasure network according to claim 1, 2, 4, 5 or 6, characterized in that: the system comprises an image preprocessing module, a network training module, a testing module, and a visualization module;
the image preprocessing module is used for uniformly preprocessing the input images, mainly through data augmentation and image resampling, which helps accelerate model convergence and reduce the risk of overfitting;
the network training module is used for training on the preprocessed data by combining the mean teacher with the adversarial consistency training strategy, using the selected number of iterations, learning rate, and batch size;
the testing module is used for running prediction with the trained model and outputting the image segmentation prediction result;
the visualization module is used for visualizing the loss curve of the training set, the accuracy curve of the test set, and the image segmentation results.
CN202210629202.7A 2022-05-27 2022-05-27 Semi-supervised organ tissue image segmentation method and system based on dual-countermeasure network Pending CN115439486A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210629202.7A CN115439486A (en) 2022-05-27 2022-05-27 Semi-supervised organ tissue image segmentation method and system based on dual-countermeasure network

Publications (1)

Publication Number Publication Date
CN115439486A true CN115439486A (en) 2022-12-06

Family

ID=84241006

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210629202.7A Pending CN115439486A (en) 2022-05-27 2022-05-27 Semi-supervised organ tissue image segmentation method and system based on dual-countermeasure network

Country Status (1)

Country Link
CN (1) CN115439486A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115861252A (en) * 2022-12-14 2023-03-28 深圳技术大学 Semi-supervised medical image organ segmentation method based on counterstudy strategy
CN115861252B (en) * 2022-12-14 2023-09-22 深圳技术大学 Semi-supervised medical image organ segmentation method based on countermeasure learning strategy
CN116205289A (en) * 2023-05-05 2023-06-02 海杰亚(北京)医疗器械有限公司 Animal organ segmentation model training method, segmentation method and related products
CN116344004A (en) * 2023-05-31 2023-06-27 苏州恒瑞宏远医疗科技有限公司 Image sample data amplification method and device
CN116344004B (en) * 2023-05-31 2023-08-08 苏州恒瑞宏远医疗科技有限公司 Image sample data amplification method and device
CN116664580A (en) * 2023-08-02 2023-08-29 经智信息科技(山东)有限公司 Multi-image hierarchical joint imaging method and device for CT images
CN116664580B (en) * 2023-08-02 2023-11-28 经智信息科技(山东)有限公司 Multi-image hierarchical joint imaging method and device for CT images
CN116681717A (en) * 2023-08-04 2023-09-01 经智信息科技(山东)有限公司 CT image segmentation processing method and device
CN116681717B (en) * 2023-08-04 2023-11-28 经智信息科技(山东)有限公司 CT image segmentation processing method and device

Similar Documents

Publication Publication Date Title
CN115439486A (en) Semi-supervised organ tissue image segmentation method and system based on dual-countermeasure network
Shah et al. A robust approach for brain tumor detection in magnetic resonance images using finetuned efficientnet
CN109493308B (en) Medical image synthesis and classification method for generating confrontation network based on condition multi-discrimination
Yu et al. SAR sea-ice image analysis based on iterative region growing using semantics
Sharma et al. Brain tumor segmentation using genetic algorithm and artificial neural network fuzzy inference system (ANFIS)
CN110930416B (en) MRI image prostate segmentation method based on U-shaped network
CN112102266B (en) Attention mechanism-based cerebral infarction medical image classification model training method
Wang et al. Adversarial multimodal fusion with attention mechanism for skin lesion classification using clinical and dermoscopic images
Sharma et al. Brain tumor segmentation using hybrid genetic algorithm and artificial neural network fuzzy inference system (anfis)
CN111784628A (en) End-to-end colorectal polyp image segmentation method based on effective learning
CN110853009A (en) Retina pathology image analysis system based on machine learning
CN116884623B (en) Medical rehabilitation prediction system based on laser scanning imaging
CN114511502A (en) Gastrointestinal endoscope image polyp detection system based on artificial intelligence, terminal and storage medium
Lin et al. Interventional multi-instance learning with deconfounded instance-level prediction
CN114663426A (en) Bone age assessment method based on key bone area positioning
CN114821052A (en) Three-dimensional brain tumor nuclear magnetic resonance image segmentation method based on self-adjustment strategy
CN115082493A (en) 3D (three-dimensional) atrial image segmentation method and system based on shape-guided dual consistency
Reddy Effective CNN-MSO method for brain tumor detection and segmentation
Yang et al. RADCU-Net: Residual attention and dual-supervision cascaded U-Net for retinal blood vessel segmentation
Liu et al. Cosst: Multi-organ segmentation with partially labeled datasets using comprehensive supervisions and self-training
CN111798463B (en) Method for automatically segmenting multiple organs in head and neck CT image
CN116311387B (en) Cross-modal pedestrian re-identification method based on feature intersection
Qiu et al. Self-training with dual uncertainty for semi-supervised medical image segmentation
CN115830401B (en) Small sample image classification method
Pei et al. Real-time multi-focus biomedical microscopic image fusion based on m-SegNet

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination