CN114897914A - Semi-supervised CT image segmentation method based on adversarial training - Google Patents

Semi-supervised CT image segmentation method based on adversarial training

Info

Publication number
CN114897914A
CN114897914A (application CN202210259206.0A)
Authority
CN
China
Prior art keywords
image
generator
discriminator
loss
mask
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210259206.0A
Other languages
Chinese (zh)
Other versions
CN114897914B (en)
Inventor
孙仕亮
丁超越
赵静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Normal University
Original Assignee
East China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China Normal University
Priority to CN202210259206.0A
Publication of CN114897914A
Application granted
Publication of CN114897914B
Legal status: Active

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10072 Tomographic images
    • G06T 2207/10081 Computed x-ray tomography [CT]
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30004 Biomedical image processing
    • G06T 2207/30061 Lung
    • G06T 2207/30096 Tumor; Lesion
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 Road transport of goods or passengers
    • Y02T 10/10 Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 Engine management systems

Abstract

The invention relates to the technical field of image processing, and in particular to a semi-supervised CT image segmentation method based on adversarial training, comprising the following steps. First, three-dimensional lung CT images are acquired and three data sets are established: a voxel-level labeled diseased data set, an unlabeled diseased data set, and a healthy data set. The CT images from the three data sets are then fed to a generator and a segmenter to obtain, respectively, a synthesized healthy image and a segmentation mask. The masked region of the synthesized healthy image is spliced with the inverse-masked region of the input image to obtain a recovered healthy image, and setting the masked region of the synthesized healthy image to 0 yields an inverse-mask image. Finally, two discriminators supervise the recovered healthy image and the inverse-mask image through adversarial training, improving the segmenter and the generator. The invention obtains an accurately performing segmentation model from only a small number of voxel-level labeled samples, and the specially designed segmenter effectively improves its feature representation capability.

Description

Semi-supervised CT image segmentation method based on adversarial training
Technical Field
The invention relates to the field of computer technology, and in particular to a semi-supervised CT image segmentation method based on adversarial training.
Background
The background art involves two topics: lung CT image segmentation and the attention mechanism.
1) Lung CT image segmentation
Lung CT offers high accuracy in diagnosing lung disease, and deep learning techniques have been applied to lung medical images in many settings. However, much previous work targets lung image classification, which does not reflect the location and size of a lesion as well as segmentation does. Segmenting the infected region in CT images can help radiologists better quantify the lesion area, and quantitative analysis of the lesion segmentation mask can yield a series of meaningful lung-related diagnostic results.
At present, deep learning has been applied to lung CT segmentation, but such methods require large amounts of training data. Clinically, annotating three-dimensional lung CT data is very laborious: a radiologist needs more than 3 hours to annotate one lung CT volume. The scarcity of voxel-level labeled lung CT images is therefore an important issue. Image synthesis and data augmentation can alleviate the shortage of voxel-level labels, and active learning and self-training can provide pseudo-labels for unlabeled data to optimize the segmentation model. Generative adversarial networks (GANs) and class activation maps (CAMs) address the absence of pixel-level or voxel-level labels by training the segmentation model on weakly labeled data, such as slice-level or volume-level CT labels. However, training with pseudo-labels may introduce noise that degrades model performance, and a model trained only on weakly labeled data often fails to produce a satisfactory segmentation mask.
2) Attention mechanism
The basic idea of the attention mechanism in computer vision is to let the system learn to attend: to ignore irrelevant information and concentrate on the critical information. In recent years attention models have been widely used in image processing, speech recognition, natural language processing, and other fields, and building neural networks with attention mechanisms has become increasingly important. On the one hand, such a network can learn the attention mechanism autonomously; on the other hand, the attention mechanism helps explain what the network attends to. Much recent work combining deep learning with visual attention focuses on using masks to form the attention: a further layer of new weights identifies the key features in the image data, and through learning and training the deep neural network learns the regions that need attention in each new image.
Disclosure of Invention
The invention aims to solve the label shortage of voxel-level CT image data and provides a semi-supervised CT image segmentation method based on adversarial training.
The specific technical scheme for realizing the purpose of the invention is as follows:
A semi-supervised CT image segmentation method based on adversarial training comprises the following steps:
Step 1) acquiring three-dimensional lung CT images, establishing a voxel-level labeled diseased data set V, an unlabeled diseased data set D, and a healthy data set H, and denoting a labeled diseased image by $I_v \in V$, an unlabeled diseased image by $I_d \in D$, and a healthy image by $I_h \in H$;

Step 2) for any input image in the data sets, performing image preprocessing operations including cropping, resampling, and normalization to obtain an image $I_i$; inputting $I_i$ to a generator to obtain the generator's reconstruction $I_g$, the generator's goal being to reconstruct a healthy image $I_g$ from $I_i$; inputting $I_i$ to a segmenter to obtain a segmentation mask $\hat{Y}$; for the voxel-level labeled diseased data set V, supervising the segmentation mask $\hat{Y}$ with the true label Y in the data set V;

Step 3) negating the segmentation mask $\hat{Y}$ to obtain the inverse mask $1-\hat{Y}$;

Step 4) taking the masked region of the generator result $I_g$, i.e. the part where the mask $\hat{Y}$ equals 1, $\hat{Y} \odot I_g$, and the inverse-masked region of the input image $I_i$, i.e. the part where the inverse mask $1-\hat{Y}$ equals 1, $(1-\hat{Y}) \odot I_i$, and adding the two images element-wise to obtain the composite image $I_p$;

Step 5) taking the generator result $I_g$ within the inverse mask, i.e. $(1-\hat{Y}) \odot I_g$, and the input image $I_i$ within the inverse mask, i.e. $(1-\hat{Y}) \odot I_i$, to obtain the images $I_{gh}$ and $I_{ih}$, respectively;
Step 6) constructing a semi-supervised CT image segmentation model comprising a generator G, a segmenter S, a discriminator $D_1$, and a discriminator $D_2$. The model input is fed simultaneously to the generator G and the segmenter S. The segmenter S produces the segmentation mask of the CT image, where a voxel of value 1 denotes the lesion region and a voxel of value 0 denotes the healthy region. The generator G takes the CT image as input and generates a restored-to-health CT image; the generated image is then combined with the segmentation mask and the model input to produce a recovered healthy image and an inverse-mask image that approximate real images so as to fool discriminators $D_1$ and $D_2$. The discriminators $D_1$ and $D_2$ judge whether an input image comes from a real image. Through mutual competition, the generator G and the discriminators $D_1$ and $D_2$ optimize their weights iteratively to improve performance;

The semi-supervised CT image segmentation model comprises four losses: a supervision loss, a reconstruction loss, discrimination loss 1, and discrimination loss 2. The supervision loss is active when the input CT image belongs to the voxel-level labeled diseased data set. The reconstruction loss is the mean squared error (MSE) between the model input and the generator G output, supervising G to produce more realistic healthy CT images. Discrimination loss 1 is an adversarial loss that judges whether the input image is a real or a composite image, supervising the quality of the recovered healthy image. Discrimination loss 2 is an adversarial loss that judges whether the input image comes from a real model input or from an image generated by G, supervising the quality of the inverse-mask image. The losses are continually optimized with an adversarial-training-based model optimization method until they converge;
Step 7) after model training is finished, a given three-dimensional CT image to be segmented is input to the segmenter S to obtain its voxel-level three-dimensional segmentation mask, as sketched below.
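As an illustration of step 7, a minimal PyTorch inference sketch follows; the single-layer stand-in network, the input size, and the 0.5 binarization threshold are assumptions for the sketch only, not details fixed by the patent:

```python
import torch
import torch.nn as nn

# Stand-in for the trained segmenter S: the patent's segmenter is a 3D U-Net
# with a feature enhancement module; one conv + sigmoid is used here only so
# the sketch runs end to end.
segmenter = nn.Sequential(nn.Conv3d(1, 1, kernel_size=3, padding=1), nn.Sigmoid())
segmenter.eval()

volume = torch.rand(1, 1, 64, 128, 128)    # a preprocessed CT volume (step 2)
with torch.no_grad():
    prob_mask = segmenter(volume)          # voxel-wise lesion probabilities
binary_mask = (prob_mask > 0.5).float()    # 1 = lesion voxel, 0 = healthy voxel
```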
Step 1) specifically comprises:
Step a1: acquiring three-dimensional CT image samples with professional equipment to obtain an original data set;
Step a2: dividing the original data set into a diseased data set and a healthy data set, randomly sampling one fifth of the diseased data and having human experts label it to form the voxel-level labeled diseased data set V; the remaining diseased images serve as the unlabeled diseased data set D, and all healthy images constitute the healthy data set H;
Step a3: inputting images from the voxel-level labeled diseased data set V, the unlabeled diseased data set D, and the healthy data set H in turn to the segmenter and the generator, denoting the model's input CT image by $I_i$.
The image preprocessing operation in step 2) specifically includes:
(1) Image cropping
The three-dimensional medical image is cropped to its non-zero area: the minimal three-dimensional bounding box outside of which all values are 0 is found in the image, and the image is cropped with this bounding box;
(2) Resampling
Resampling resolves the inconsistency in the physical space represented by voxels across a three-dimensional CT data set. To unify the resolution of all CT images, the resolutions of the different CT images are rescaled by resampling to 0.5 mm × 0.5 mm;
(3) Normalization
So that each image has the same gray-value distribution, the minimum and maximum CT gray values are set to 300 and 3000: gray values below 300 are raised to 300 and gray values above 3000 are lowered to 3000. The voxel values of the CT image are then normalized to [0, 1] and finally scaled to the [0, 255] interval. A minimal sketch of these three steps follows.
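A minimal NumPy/SciPy sketch of the three preprocessing steps, assuming the 0.5 mm target spacing applies to every axis and that the scanner's voxel spacing is known; function and parameter names are illustrative:

```python
import numpy as np
from scipy.ndimage import zoom

def preprocess_ct(volume, spacing):
    """Sketch of the preprocessing of step 2).

    volume: 3D numpy array of raw CT gray values.
    spacing: per-axis voxel spacing in mm (assumed known from the scanner).
    """
    # (1) Image cropping: minimal bounding box of the non-zero region.
    nz = np.argwhere(volume != 0)
    lo, hi = nz.min(axis=0), nz.max(axis=0) + 1
    volume = volume[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]

    # (2) Resampling to 0.5 mm spacing (applied here to every axis, an
    # assumption; the patent states the 0.5 mm x 0.5 mm target).
    volume = zoom(volume, [s / 0.5 for s in spacing], order=1)

    # (3) Normalization: clip gray values to [300, 3000], map to [0, 1],
    # then scale to the [0, 255] interval.
    volume = np.clip(volume, 300, 3000).astype(np.float32)
    volume = (volume - 300.0) / (3000.0 - 300.0)
    return volume * 255.0
```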
The generator G and the discriminators $D_1$ and $D_2$ in step 6) follow the structure of CycleGAN; in addition, a fully connected layer is used as the classification network in the last layer to produce the final discrimination result, 1 or 0, where 1 means the discriminator considers the input image a real image and 0 means it considers the input a generated or composite image.
The segmenter S in step 6) is based on 3D U-Net, with a feature enhancement module designed to strengthen the encoder's feature representation; the module comprises channel attention and spatial attention. To balance memory usage and segmentation accuracy, four 2× downsampling stages are used in S. The segmenter adopts a dense atrous spatial pyramid pooling structure, which combines features of different scales through dilation rates of different sizes and realizes feature reuse well; the dilated convolutions in S use dilation rates of 3, 6, and 12. To improve the quality of the learned features, features of the same scale are fused through shortcut connections. Considering the memory usage of the 3D segmentation model, the number of channels is set to 16.
The adversarial-training-based model optimization method of step 6) adopts an Adam-optimizer-based gradient descent algorithm to iteratively optimize the segmenter S, the generator G, and the discriminators $D_1$ and $D_2$, specifically:
(1) Initialize the weight parameters of discriminators $D_1$ and $D_2$; set the iteration counter iter to 1 and set the maximum iteration $iter_{max}$;
(2) Optimize the segmenter S and generator G: freeze the parameters of $D_1$ and $D_2$, unfreeze S and G, compute the loss and optimize the model, and increment iter;
(3) Optimize discriminator $D_1$: freeze the parameters of S, G, and $D_2$, unfreeze $D_1$, compute the loss and optimize the model, and increment iter;
(4) Optimize discriminator $D_2$: freeze the parameters of S, G, and $D_1$, unfreeze $D_2$, compute the loss and optimize the model, and increment iter;
Repeat (2)-(4) until iter exceeds $iter_{max}$ or the losses converge. A minimal sketch of this alternating schedule is given below.
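A minimal PyTorch sketch of the alternating freeze/unfreeze schedule, with tiny stand-in networks and placeholder loss closures in place of the real modules and the four losses of step 6; the learning rate and iteration budget are assumptions:

```python
import itertools
import torch
import torch.nn as nn

# Tiny stand-in networks and data so the sketch runs; the real S, G, D1, D2
# are the patent's segmenter, generator, and two discriminators.
S, G = nn.Conv3d(1, 1, 3, padding=1), nn.Conv3d(1, 1, 3, padding=1)
D1, D2 = nn.Conv3d(1, 1, 3, padding=1), nn.Conv3d(1, 1, 3, padding=1)
x = torch.rand(2, 1, 8, 16, 16)

def loss_sg():
    # Placeholder: the real loss combines supervision, reconstruction,
    # and adversarial terms.
    return ((G(x) - x) ** 2).mean() + S(x).mean()

def loss_d1():
    return D1(x).mean()  # placeholder for discrimination loss 1

def loss_d2():
    return D2(x).mean()  # placeholder for discrimination loss 2

def set_trainable(module, flag):
    # Freeze or unfreeze a sub-network by toggling requires_grad.
    for p in module.parameters():
        p.requires_grad = flag

opt_sg = torch.optim.Adam(itertools.chain(S.parameters(), G.parameters()), lr=1e-4)
opt_d1 = torch.optim.Adam(D1.parameters(), lr=1e-4)
opt_d2 = torch.optim.Adam(D2.parameters(), lr=1e-4)

it, it_max = 1, 9  # it_max is illustrative; the patent leaves it open
while it <= it_max:
    # (2) optimize S and G with both discriminators frozen
    set_trainable(D1, False); set_trainable(D2, False)
    set_trainable(S, True);   set_trainable(G, True)
    opt_sg.zero_grad(); loss_sg().backward(); opt_sg.step(); it += 1

    # (3) optimize D1 with S, G, D2 frozen
    set_trainable(S, False); set_trainable(G, False); set_trainable(D1, True)
    opt_d1.zero_grad(); loss_d1().backward(); opt_d1.step(); it += 1

    # (4) optimize D2 with S, G, D1 frozen
    set_trainable(D1, False); set_trainable(D2, True)
    opt_d2.zero_grad(); loss_d2().backward(); opt_d2.step(); it += 1
```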
The design of the generator and the discriminators of the invention follows the structure of CycleGAN; in addition, a fully connected layer is attached as the classifier in the last layer to obtain the final discrimination result, and a spectral normalization operation is added to the discriminators. Considering the memory usage of 3D images, the number of base channels is reduced from 64 to 16. A minimal discriminator sketch in this style follows.
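A minimal sketch of a discriminator in this style: a convolutional trunk with spectral normalization and 16 base channels, ending in a fully connected layer that outputs a single real/fake score. The depth, kernel sizes, and LeakyReLU slope are assumptions, not taken from the patent:

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

class Discriminator3D(nn.Module):
    """CycleGAN-style conv trunk + fully connected classification head (sketch)."""
    def __init__(self, base_channels=16):
        super().__init__()
        c = base_channels
        self.trunk = nn.Sequential(
            spectral_norm(nn.Conv3d(1, c, 4, stride=2, padding=1)),
            nn.LeakyReLU(0.2, inplace=True),
            spectral_norm(nn.Conv3d(c, 2 * c, 4, stride=2, padding=1)),
            nn.LeakyReLU(0.2, inplace=True),
            spectral_norm(nn.Conv3d(2 * c, 4 * c, 4, stride=2, padding=1)),
            nn.LeakyReLU(0.2, inplace=True),
            nn.AdaptiveAvgPool3d(1),
        )
        self.fc = nn.Linear(4 * c, 1)  # final fully connected layer: 1 = real, 0 = fake

    def forward(self, x):
        h = self.trunk(x).flatten(1)
        return torch.sigmoid(self.fc(h))

score = Discriminator3D()(torch.rand(1, 1, 32, 64, 64))  # scalar in (0, 1)
```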
Compared with the prior art, the invention has the following beneficial effects:
1. The invention provides a semi-supervised CT image segmentation method based on adversarial training for segmenting lesion regions in three-dimensional lung CT images. Compared with a two-dimensional segmenter, the three-dimensional segmenter can combine information across image slices, ensuring the continuity of the segmentation mask between slices. The invention can be trained with only a small amount of voxel-level annotated data plus the remaining unlabeled data.
2. The invention optimizes the generator, segmenter, and discriminators through adversarial training, letting the segmenter learn feature information of healthy lungs and lesions from the voxel-level annotated data so as to gradually improve segmentation performance, thereby greatly reducing the number of voxel-level labels required to train the model.
3. To better handle the low contrast between infected lung areas and normal tissue, the invention designs a feature enhancement module to deal with ambiguous boundaries. Specifically, channel attention is used to implicitly enhance contrast between features, highlighting the boundary information of lesion regions, and spatial attention is used to highlight important regions. By fusing these features, the feature enhancement module effectively enhances the feature representation of the lesion area.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a diagram of an antagonistic semi-supervised model of the present invention;
FIG. 3 is a schematic diagram of a segmenter according to the present invention;
FIG. 4 is a schematic view of a feature enhancement module of the present invention that combines channel and spatial attention.
Detailed Description
The present invention will be described in further detail with reference to the following detailed description and the accompanying drawings. Except for the contents specifically mentioned below, the procedures, conditions, and experimental methods for carrying out the invention are common general knowledge in the art, and the invention is not particularly limited thereto.
The whole process of the present invention comprises steps 1)-7); please refer to FIG. 1 and FIG. 2. FIG. 1 is a flow chart of the semi-supervised CT image segmentation method based on adversarial training, and FIG. 2 is a schematic diagram of the adversarial semi-supervised model.
The process of the invention:
step 1) obtaining lungsThree-dimensional CT image of the patient, establishing a diseased data set V marked at voxel level, a diseased data set D not marked, a healthy data set S, and setting the marked diseased image as I v Belongs to V, and the unmarked diseased image is I d E.g. D, health image I h ∈S。
Step 2) inputting all input data into a generator to obtain a healthy reconstructed version I g Input to a segmenter to obtain a segmentation mask
Figure BDA0003550095730000051
Diseased data set I for voxel level labeling v Supervision of segmentation masks using a true label Y in the data set V
Figure BDA0003550095730000052
Step 3) Negate the segmentation mask $\hat{Y}$ to obtain the inverse mask $1-\hat{Y}$.
Step 4) Take the masked prediction of the generator result $I_g$ (i.e. the part where the mask $\hat{Y}$ equals 1, $\hat{Y} \odot I_g$) and the inverse-masked prediction of the input image $I_i$ (i.e. the part where the inverse mask $1-\hat{Y}$ equals 1, $(1-\hat{Y}) \odot I_i$), and add the two images element-wise to obtain the composite image $I_p$.
Specifically, let $I_d \in D$ be an unlabeled diseased image, $I_v \in V$ a voxel-level labeled diseased image, and $I_h \in H$ a healthy image. The lesion region predicted by the segmenter is replaced with the corresponding region reconstructed by the generator while the uninfected region is preserved; the result is called the pseudo-healthy image $I_p$, computed by the following formula:

$$I_p = \phi(I_i, I_g, \hat{Y}) = \hat{Y} \odot I_g + (1-\hat{Y}) \odot I_i \qquad (1)$$

where $\phi$ is the function that generates the pseudo-healthy image; $\hat{Y} = S(I_i; \theta_S)$ is the probabilistic segmentation mask predicted by the segmenter S; $I_g = G(I_d; \theta_G)$ is the image generated by the generator; and $\theta_S$ and $\theta_G$ are the learnable parameters of the segmenter S and the generator G, respectively.
Step 5) Take the generator result $I_g$ within the inverse mask, $(1-\hat{Y}) \odot I_g$, and the input image $I_i$ within the inverse mask, $(1-\hat{Y}) \odot I_i$, obtaining $I_{gh}$ and $I_{ih}$, respectively.
Because no supervisory signal constrains $I_g$ outside the predicted lesion region, the quality of the generated image there would otherwise be uncontrolled. Although these regions are not used to form the composite image, they affect the performance of the generator, which is a bottleneck for improving the final performance of the segmenter. The invention therefore uses the generator's reconstruction within the healthy region predicted by the segmenter, obtaining the generated healthy area $I_{gh}$ of $I_g$ and the healthy area $I_{ih}$ of $I_i$:

$$I_{gh} = (1-\hat{Y}) \odot I_g \qquad (2)$$
$$I_{ih} = (1-\hat{Y}) \odot I_i \qquad (3)$$

$I_{gh}$ and $I_{ih}$ are then input to the discriminator $D_2$; $I_{gh}$ is called the synthetic healthy-area image and $I_{ih}$ the real healthy-area image. A short sketch of this composition follows.
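In PyTorch, formulas (1)-(3) reduce to three element-wise expressions; a minimal sketch with random stand-in tensors:

```python
import torch

def compose(i_input, i_gen, mask):
    """Formulas (1)-(3): element-wise splicing of generator output and input.

    mask is the segmenter's probabilistic segmentation mask Y-hat;
    (1 - mask) is the inverse mask of step 3.
    """
    i_p  = mask * i_gen + (1.0 - mask) * i_input   # (1) pseudo-healthy image
    i_gh = (1.0 - mask) * i_gen                    # (2) synthetic healthy-area image
    i_ih = (1.0 - mask) * i_input                  # (3) real healthy-area image
    return i_p, i_gh, i_ih

# Example with stand-in tensors of shape (batch, channel, D, H, W).
x, g = torch.rand(1, 1, 8, 16, 16), torch.rand(1, 1, 8, 16, 16)
m = torch.rand(1, 1, 8, 16, 16)
i_p, i_gh, i_ih = compose(x, g, m)
```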
Step 6) Construct the three-dimensional adversarial semi-supervised model and iteratively optimize the generator, the segmenter, and the discriminators $D_1$ and $D_2$ until the total loss converges. Adversarial training lets the model be trained in a semi-supervised way: the network learns information beneficial to segmentation accuracy from the unlabeled data.
To completely fool the discriminators $D_1$ and $D_2$, the segmenter must segment all infected regions, and the generator must generate pseudo-healthy content in both the predicted diseased regions and the healthy regions. Conversely, discriminator $D_1$ distinguishes the composite image $I_p$ from real healthy images, and discriminator $D_2$ distinguishes the synthetic healthy-area image $I_{gh}$ from the real healthy-area image $I_{ih}$. The model is trained adversarially with the following min-max objective:

$$\min_{\theta_G, \theta_S} \max_{\theta_{D_1}, \theta_{D_2}} \; \mathcal{L}_{adv_1}(\theta_G, \theta_S, \theta_{D_1}) + \mathcal{L}_{adv_2}(\theta_G, \theta_S, \theta_{D_2}) \qquad (4)$$

where $\theta_G$ and $\theta_S$ are the learnable parameters of the generator G and the segmenter S; $\theta_{D_1}$ and $\theta_{D_2}$ are the learnable parameters of discriminators $D_1$ and $D_2$; and $\mathcal{L}_{adv_1}$ and $\mathcal{L}_{adv_2}$ are the loss functions.
Marking the composite image as 0 and the real healthy image as 1, the objective function $\mathcal{L}_{adv_1}$ is calculated as:

$$\mathcal{L}_{adv_1} = \mathbb{E}_{I_h \sim H}[\log D_1(I_h; \theta_{D_1})] + \mathbb{E}_{I_d \sim D}[\log(1 - D_1(I_p; \theta_{D_1}))] \qquad (5)$$

where $D_1(\cdot)$ is the prediction of discriminator $D_1$; $I_p$ is calculated by formula (1); $\theta_{D_1}$ denotes the learnable parameters of $D_1$; and $\mathbb{E}_{I_h \sim H}$ and $\mathbb{E}_{I_d \sim D}$ denote mathematical expectations.
Marking the synthetic healthy area as 0 and the real healthy area as 1, the objective function $\mathcal{L}_{adv_2}$ is calculated as follows:

$$\mathcal{L}_{adv_2} = \mathbb{E}[\log D_2(I_{ih}; \theta_{D_2})] + \mathbb{E}[\log(1 - D_2(I_{gh}; \theta_{D_2}))] \qquad (7)$$

where $D_2(\cdot)$ is the prediction of discriminator $D_2$; the images $I_{gh}$ and $I_{ih}$ are calculated by formulas (2) and (3), respectively; and $\theta_{D_2}$ denotes the learnable parameters of $D_2$.
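The log-likelihood terms of formulas (5) and (7) are commonly realized as binary cross-entropy when training the discriminators; a minimal sketch under that assumption (tensor shapes are illustrative):

```python
import torch
import torch.nn.functional as F

def d1_loss(d1_real, d1_fake):
    # Formula (5) as a minimization: real healthy image labeled 1,
    # composite (pseudo-healthy) image I_p labeled 0.
    return (F.binary_cross_entropy(d1_real, torch.ones_like(d1_real)) +
            F.binary_cross_entropy(d1_fake, torch.zeros_like(d1_fake)))

def d2_loss(d2_real, d2_fake):
    # Formula (7) as a minimization: real healthy-area image I_ih labeled 1,
    # synthetic healthy-area image I_gh labeled 0.
    return (F.binary_cross_entropy(d2_real, torch.ones_like(d2_real)) +
            F.binary_cross_entropy(d2_fake, torch.zeros_like(d2_fake)))

# Discriminator outputs lie in (0, 1); random stand-ins are used here.
print(d1_loss(torch.rand(4, 1), torch.rand(4, 1)))
print(d2_loss(torch.rand(4, 1), torch.rand(4, 1)))
```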
The segmenter of the present invention is described with reference to FIG. 3, which is a schematic diagram of the segmenter.
The invention proposes a feature enhancement module that uses channel and spatial attention to adaptively optimize features for the segmenter; FIG. 4 is a schematic diagram of the module combining channel and spatial attention. Specifically, the channel attention module learns global parameters to highlight useful boundary information, and the spatial attention module calculates an attention weight map of the lung region. Together, channel and spatial attention improve the feature representation of the region of interest, emphasizing salient features and suppressing unnecessary ones. Given an intermediate feature map $F \in \mathbb{R}^{C \times D \times H \times W}$ as input, where C is the number of channels, D the feature depth, H the image height, and W the image width, the feature enhancement module infers in turn a channel attention map $M_c \in \mathbb{R}^{C \times 1 \times 1 \times 1}$ and a spatial attention map $M_s \in \mathbb{R}^{1 \times D \times H \times W}$. The whole attention process is summarized as follows.
Channel attention: to enhance the contrast of features, spatial information of the feature map is first aggregated by average pooling and max pooling, generating two different spatial context descriptors, the average-pooled feature and the max-pooled feature. The two descriptors are forwarded to a multi-layer perceptron (MLP) with one hidden layer and shared parameters, and the summed result is passed through a sigmoid function to produce the channel attention $M_c(F)$. Channel attention thus consists of average pooling, max pooling, a shared-parameter MLP, and a sigmoid function. The intermediate optimized feature produced by channel attention is $F'$, calculated as:

$$F' = M_c(F) \otimes F, \quad M_c(F) = \sigma(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))) \qquad (8)$$

where F is the input feature of the channel attention; $\sigma$ denotes the sigmoid activation function; AvgPool is the average pooling function; MaxPool is the max pooling function; and $\otimes$ denotes element-wise multiplication.
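A minimal PyTorch sketch of this channel attention, following formula (8); the reduction ratio of the hidden layer is an assumption:

```python
import torch
import torch.nn as nn

class ChannelAttention3D(nn.Module):
    """Channel attention of the feature enhancement module (formula 8, sketch)."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        # Shared MLP with one hidden layer, applied to both pooled descriptors.
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, f):                      # f: (B, C, D, H, W)
        avg = self.mlp(f.mean(dim=(2, 3, 4)))  # average-pooled descriptor
        mx = self.mlp(f.amax(dim=(2, 3, 4)))   # max-pooled descriptor
        m_c = torch.sigmoid(avg + mx).view(*f.shape[:2], 1, 1, 1)
        return m_c * f                         # F' = M_c(F) (x) F

f = torch.rand(1, 16, 8, 16, 16)
f_prime = ChannelAttention3D(16)(f)            # same shape as f
```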
Spatial attention: the purpose of spatial attention is to discard unimportant features and highlight the features of interest that are useful for segmenting COVID-19 infections. Spatial attention $M_s(F')$ exploits the spatial relationships between features and is complementary to channel attention. To compute it, average pooling and max pooling are first applied along the channel axis and the results are concatenated to generate an effective feature descriptor; pooling along the channel axis effectively highlights the feature regions of interest. On the concatenated feature descriptor, a convolutional layer generates the spatial attention map, which encodes where to emphasize or suppress. The final optimized feature produced by spatial attention is $F''$:

$$F'' = M_s(F') \otimes F', \quad M_s(F') = \sigma(\mathrm{Conv}^{7 \times 7 \times 7}([\mathrm{AvgPool}(F'); \mathrm{MaxPool}(F')]))$$

where $\sigma$ denotes the sigmoid activation function; Conv is a convolution of size 7×7×7; and $\otimes$ denotes element-wise multiplication. $F''$ is obtained by element-wise multiplication between $F'$ and the attention map.
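A minimal PyTorch sketch of the spatial attention; chaining it after the channel attention above reproduces the feature enhancement module of FIG. 4:

```python
import torch
import torch.nn as nn

class SpatialAttention3D(nn.Module):
    """Spatial attention of the feature enhancement module (sketch)."""
    def __init__(self):
        super().__init__()
        # 7x7x7 convolution over the two channel-pooled maps, as in the text.
        self.conv = nn.Conv3d(2, 1, kernel_size=7, padding=3)

    def forward(self, f_prime):                    # f_prime: (B, C, D, H, W)
        avg = f_prime.mean(dim=1, keepdim=True)    # average pooling along channels
        mx = f_prime.amax(dim=1, keepdim=True)     # max pooling along channels
        m_s = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return m_s * f_prime                       # F'' = M_s(F') (x) F'

x = torch.rand(1, 16, 8, 16, 16)
out = SpatialAttention3D()(x)                      # same shape as x
```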
Due to the synthesized image I p With images I generated by a prediction mask sum generator g In connection with, therefore, the
Figure BDA0003550095730000078
Can be fed back to the segmentor S and the generator G in order to optimize the parameters of both modules. In addition, also addA basic segmentation loss is obtained
Figure BDA00035500957300000713
The difference between the output Y of S and the true mask GT for measuring a small number of voxel level label samples:
Figure BDA00035500957300000710
wherein CEL is the cross-entropy loss;
Figure BDA00035500957300000711
is voxel level label data I v The prediction mask of (3); y is v Is voxel level label data I v The real tag of (1). Prediction mask
Figure BDA00035500957300000712
Wherein S (-) is a segmenter θ S Learnable parameters representing a segmenter.
To improve the stability of the adversarial training in step 6), the invention further sets auxiliary constraints. First, a healthy image $I_h$ is input to the segmenter and the generator, and the cross-entropy loss between the prediction mask $S(I_h; \theta_S)$ of the healthy image data $I_h$ and its true label $Y_h$ is added to $\mathcal{L}_{seg}$:

$$\mathcal{L}_{seg_h} = \mathrm{CEL}(S(I_h; \theta_S), Y_h)$$
Second, to further improve the performance of the generator, the method uses a reconstruction loss $\mathcal{L}_{rec}$ to constrain the output of the generator:

$$\mathcal{L}_{rec} = \mathrm{MSE}(I_h, G(I_h; \theta_G))$$
where MSE(·) is the mean squared error function; $G(I_h; \theta_G)$ denotes the reconstruction obtained after $I_h$ is input to the generator G; and $\theta_G$ denotes the learnable parameters of G. Furthermore, an additional loss function $\mathcal{L}_{g}$ is introduced on top of $\mathcal{L}_{adv_1}$:

$$\mathcal{L}_{g} = \mathbb{E}_{I_d \sim D}[\log(1 - D_1(G(I_d; \theta_G); \theta_{D_1}))]$$

where $G(I_d; \theta_G)$ denotes the reconstruction obtained after $I_d$ is input to the generator G, and $\theta_{D_1}$ denotes the learnable parameters of discriminator $D_1$.
In this way the image $I_g$ generated from $I_d$ is itself input to $D_1$, further improving the generation effect. When the input of the segmenter S is a healthy image, the forward propagation of synthesis and discrimination is unnecessary, so $\mathcal{L}_{adv_1}$ need not be computed. In addition, the original diseased image $I_d$ is input to discriminator $D_1$ to keep $D_1$ sensitive to and able to distinguish lesion signals during training, and the dropout rate is fixed at 0.5. The corresponding constraint loss $\mathcal{L}_{d}$ is added to $\mathcal{L}_{adv_1}$; the goal of discriminator $D_1$ is to distinguish images of patients from images of healthy people:

$$\mathcal{L}_{d} = \mathbb{E}_{I_d \sim D}[\log(1 - D_1(I_d; \theta_{D_1}))]$$
Summarizing the above, the adversarial loss is extended by adding four new losses as auxiliary constraints: $\mathcal{L}_{seg_h}$, $\mathcal{L}_{rec}$, $\mathcal{L}_{g}$, and $\mathcal{L}_{d}$. The final objective function is defined as follows:

$$\mathcal{L} = \mathcal{L}_{adv_1} + \mathcal{L}_{adv_2} + \lambda_S(\mathcal{L}_{seg} + \mathcal{L}_{seg_h}) + \mathcal{L}_{rec} + \mathcal{L}_{g} + \mathcal{L}_{d}$$

where $\lambda_S$ is a weight that balances the supervised segmentation losses $\mathcal{L}_{seg}$ and $\mathcal{L}_{seg_h}$ against the remaining losses.
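A minimal sketch of assembling the final objective; the default weight value is an assumption, since the patent only states that $\lambda_S$ balances the terms:

```python
import torch

def total_loss(l_adv1, l_adv2, l_seg, l_seg_h, l_rec, l_g, l_d, lambda_s=1.0):
    # lambda_s weights the supervised segmentation terms; its default value
    # here is an assumption, not a value fixed by the patent.
    return l_adv1 + l_adv2 + lambda_s * (l_seg + l_seg_h) + l_rec + l_g + l_d

# Example with scalar stand-ins for the seven loss terms.
loss = total_loss(*(torch.rand(()) for _ in range(7)))
```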
And 7) realizing the CT image segmentation based on the trained segmenter.
The protection of the present invention is not limited to the above embodiments. Variations and advantages that may occur to those skilled in the art are included in the invention without departing from the spirit and scope of the inventive concept, and the scope of protection is determined by the appended claims.

Claims (6)

1. A semi-supervised CT image segmentation method based on adversarial training, characterized by comprising the following steps:
step 1) acquiring three-dimensional lung CT images, establishing a voxel-level labeled diseased data set V, an unlabeled diseased data set D, and a healthy data set H, and denoting a labeled diseased image by $I_v \in V$, an unlabeled diseased image by $I_d \in D$, and a healthy image by $I_h \in H$;

step 2) for any input image in the data sets, performing image preprocessing operations including cropping, resampling, and normalization to obtain an image $I_i$; inputting $I_i$ to a generator to obtain the generator's reconstruction $I_g$, the generator's goal being to reconstruct a healthy image $I_g$ from $I_i$; inputting $I_i$ to a segmenter to obtain a segmentation mask $\hat{Y}$; for the voxel-level labeled diseased data set V, supervising the segmentation mask $\hat{Y}$ with the true label Y in the data set V;

step 3) negating the segmentation mask $\hat{Y}$ to obtain the inverse mask $1-\hat{Y}$;

step 4) taking the masked region of the generator result $I_g$, i.e. the part where the mask $\hat{Y}$ equals 1, $\hat{Y} \odot I_g$, and the inverse-masked region of the input image $I_i$, i.e. the part where the inverse mask $1-\hat{Y}$ equals 1, $(1-\hat{Y}) \odot I_i$, and adding the two images element-wise to obtain a composite image $I_p$;

step 5) taking the generator result $I_g$ within the inverse mask, i.e. $(1-\hat{Y}) \odot I_g$, and the input image $I_i$ within the inverse mask, i.e. $(1-\hat{Y}) \odot I_i$, to obtain images $I_{gh}$ and $I_{ih}$, respectively;
step 6) constructing a semi-supervised CT image segmentation model comprising a generator G, a segmenter S, a discriminator $D_1$, and a discriminator $D_2$; the model input is fed simultaneously to the generator G and the segmenter S; the segmenter S produces the segmentation mask of the CT image, wherein a voxel of value 1 denotes a lesion region and a voxel of value 0 denotes a healthy region; the generator G takes the CT image as input and generates a restored-to-health CT image, and the generated image is then combined with the segmentation mask and the model input to produce a recovered healthy image and an inverse-mask image that approximate real images so as to fool discriminators $D_1$ and $D_2$; the discriminators $D_1$ and $D_2$ judge whether an input image comes from a real image; through mutual competition, the generator G and the discriminators $D_1$ and $D_2$ optimize their weights iteratively to improve performance;

the semi-supervised CT image segmentation model comprises four losses: a supervision loss, a reconstruction loss, discrimination loss 1, and discrimination loss 2; the supervision loss is active when the input CT image belongs to the voxel-level labeled diseased data set; the reconstruction loss is the mean squared error (MSE) between the model input and the generator G output, supervising G to produce more realistic healthy CT images; discrimination loss 1 is an adversarial loss judging whether the input image is a real or a composite image, supervising the quality of the recovered healthy image; discrimination loss 2 is an adversarial loss judging whether the input image comes from a real model input or from an image generated by G, supervising the quality of the inverse-mask image; the losses are continually optimized with an adversarial-training-based model optimization method until they converge;

step 7) after model training is finished, inputting a given three-dimensional CT image to be segmented to the segmenter S to obtain its voxel-level three-dimensional segmentation mask.
2. The semi-supervised CT image segmentation method based on adversarial training of claim 1, wherein step 1) specifically comprises:
Step a1: acquiring three-dimensional CT image samples with professional equipment to obtain an original data set;
Step a2: dividing the original data set into a diseased data set and a healthy data set, randomly sampling one fifth of the diseased data and having human experts label it to form the voxel-level labeled diseased data set V; the remaining image data serve as the unlabeled diseased data set D, and all healthy image data constitute the healthy data set H.
3. The semi-supervised CT image segmentation method based on adversarial training of claim 1, wherein the image preprocessing operation in step 2) specifically comprises:
(1) Image cropping
The three-dimensional medical image is cropped to its non-zero area: the minimal three-dimensional bounding box outside of which all values are 0 is found in the image, and the image is cropped with this bounding box;
(2) Resampling
The resolutions of the different CT images are rescaled by resampling to unify them to 0.5 mm × 0.5 mm;
(3) Normalization
The minimum and maximum CT gray values are set to 300 and 3000: gray values below 300 are raised to 300 and gray values above 3000 are lowered to 3000; the voxel values of the CT image are then normalized to [0, 1] and finally scaled to the [0, 255] interval.
4. The semi-supervised CT image segmentation method based on adversarial training of claim 1, wherein the generator G and the discriminators $D_1$ and $D_2$ in step 6) follow the structure of CycleGAN, and a fully connected layer is used as the classification network in the last layer to produce the final discrimination result, 1 or 0, wherein 1 means the discriminator considers the input image a real image and 0 means it considers the input a generated or composite image.
5. The semi-supervised CT image segmentation method based on adversarial training of claim 1, wherein the segmenter S in step 6) is based on 3D U-Net with a feature enhancement module designed to strengthen the feature representation of the encoder, the feature enhancement module comprising channel attention and spatial attention; to balance memory usage and segmentation accuracy, four 2× downsampling stages are used in the segmenter S; the segmenter S adopts a dense atrous spatial pyramid pooling structure, combining features of different scales through dilation rates of different sizes and realizing feature reuse; the dilated convolutions in the segmenter S use dilation rates of 3, 6, and 12; features of the same scale are fused through shortcut connections; the number of channels is set to 16.
6. The semi-supervised CT image segmentation method based on adversarial training of claim 1, wherein the adversarial-training-based model optimization method of step 6) adopts an Adam-optimizer-based gradient descent algorithm to iteratively optimize the segmenter S, the generator G, and the discriminators $D_1$ and $D_2$, specifically comprising:
(1) initializing the weight parameters of discriminators $D_1$ and $D_2$; setting the iteration counter iter to 1 and setting the maximum iteration $iter_{max}$;
(2) optimizing the segmenter S and generator G: freezing the parameters of $D_1$ and $D_2$, unfreezing S and G, computing the loss and optimizing the model, and incrementing iter;
(3) optimizing discriminator $D_1$: freezing the parameters of S, G, and $D_2$, unfreezing $D_1$, computing the loss and optimizing the model, and incrementing iter;
(4) optimizing discriminator $D_2$: freezing the parameters of S, G, and $D_1$, unfreezing $D_2$, computing the loss and optimizing the model, and incrementing iter;
repeating (2)-(4) until iter exceeds $iter_{max}$ or the losses converge.
CN202210259206.0A 2022-03-16 2022-03-16 Semi-supervised CT image segmentation method based on adversarial training Active CN114897914B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210259206.0A CN114897914B (en) 2022-03-16 2022-03-16 Semi-supervised CT image segmentation method based on adversarial training

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210259206.0A CN114897914B (en) 2022-03-16 2022-03-16 Semi-supervised CT image segmentation method based on adversarial training

Publications (2)

Publication Number Publication Date
CN114897914A 2022-08-12
CN114897914B 2023-07-07

Family

ID=82715871

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210259206.0A Active CN114897914B (en) Semi-supervised CT image segmentation method based on adversarial training

Country Status (1)

Country Link
CN (1) CN114897914B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115187819A (en) * 2022-08-23 2022-10-14 北京医准智能科技有限公司 Training method and device for image classification model, electronic equipment and storage medium
CN115861252A (en) * 2022-12-14 2023-03-28 深圳技术大学 Semi-supervised medical image organ segmentation method based on counterstudy strategy
CN116739890A (en) * 2023-06-26 2023-09-12 强联智创(北京)科技有限公司 Method and equipment for training generation model for generating healthy blood vessel image
CN117078703A (en) * 2023-08-30 2023-11-17 深圳扬奇医芯智能科技有限公司 CT image segmentation method and system based on MRI guidance
CN117095395A (en) * 2023-10-19 2023-11-21 北京智源人工智能研究院 Model training method and device for heart ultrasonic image segmentation and segmentation method
CN117726642A (en) * 2024-02-07 2024-03-19 中国科学院宁波材料技术与工程研究所 High reflection focus segmentation method and device for optical coherence tomography image

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109949317A (en) * 2019-03-06 2019-06-28 东南大学 Based on the semi-supervised image instance dividing method for gradually fighting study
CN112419327A (en) * 2020-12-10 2021-02-26 复旦大学附属肿瘤医院 Image segmentation method, system and device based on generation countermeasure network
US20210089845A1 (en) * 2019-09-20 2021-03-25 Samsung Electronics Co., Ltd. Teaching gan (generative adversarial networks) to generate per-pixel annotation
CN112734764A (en) * 2021-03-31 2021-04-30 电子科技大学 Unsupervised medical image segmentation method based on countermeasure network
WO2022041307A1 (en) * 2020-08-31 2022-03-03 温州医科大学 Method and system for constructing semi-supervised image segmentation framework

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109949317A (en) * 2019-03-06 2019-06-28 东南大学 Based on the semi-supervised image instance dividing method for gradually fighting study
US20210089845A1 (en) * 2019-09-20 2021-03-25 Samsung Electronics Co., Ltd. Teaching gan (generative adversarial networks) to generate per-pixel annotation
WO2022041307A1 (en) * 2020-08-31 2022-03-03 温州医科大学 Method and system for constructing semi-supervised image segmentation framework
CN112419327A (en) * 2020-12-10 2021-02-26 复旦大学附属肿瘤医院 Image segmentation method, system and device based on generation countermeasure network
CN112734764A (en) * 2021-03-31 2021-04-30 电子科技大学 Unsupervised medical image segmentation method based on countermeasure network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DAIQING LI et al.: "Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization", https://arxiv.org/abs/2104.05833, pages 1-12 *
张粲 et al.: "Semi-supervised medical image segmentation method based on adversarial learning network" (in Chinese), Industrial Control Computer, no. 9, pages 57-59 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115187819A (en) * 2022-08-23 2022-10-14 北京医准智能科技有限公司 Training method and device for image classification model, electronic equipment and storage medium
CN115861252A (en) * 2022-12-14 2023-03-28 深圳技术大学 Semi-supervised medical image organ segmentation method based on counterstudy strategy
CN115861252B (en) * 2022-12-14 2023-09-22 深圳技术大学 Semi-supervised medical image organ segmentation method based on countermeasure learning strategy
CN116739890A (en) * 2023-06-26 2023-09-12 强联智创(北京)科技有限公司 Method and equipment for training generation model for generating healthy blood vessel image
CN117078703A (en) * 2023-08-30 2023-11-17 深圳扬奇医芯智能科技有限公司 CT image segmentation method and system based on MRI guidance
CN117095395A (en) * 2023-10-19 2023-11-21 北京智源人工智能研究院 Model training method and device for heart ultrasonic image segmentation and segmentation method
CN117095395B (en) * 2023-10-19 2024-02-09 北京智源人工智能研究院 Model training method and device for heart ultrasonic image segmentation and segmentation method
CN117726642A (en) * 2024-02-07 2024-03-19 中国科学院宁波材料技术与工程研究所 High reflection focus segmentation method and device for optical coherence tomography image

Also Published As

Publication number Publication date
CN114897914B (en) 2023-07-07

Similar Documents

Publication Publication Date Title
CN111476292B (en) Small sample element learning training method for medical image classification processing artificial intelligence
CN114897914B (en) Semi-supervised CT image segmentation method based on adversarial training
WO2019200747A1 (en) Method and device for segmenting proximal femur, computer apparatus, and storage medium
WO2020133636A1 (en) Method and system for intelligent envelope detection and warning in prostate surgery
Tataru et al. Deep Learning for abnormality detection in Chest X-Ray images
Yao et al. Pneumonia detection using an improved algorithm based on faster r-cnn
JP7312510B1 (en) Whole-slide pathological image classification system and construction method considering tumor microenvironment
CN112926696A (en) Interpretable local migration mutual learning method based on attention diagram
CN116129141A (en) Medical data processing method, apparatus, device, medium and computer program product
Wu et al. Cascaded fully convolutional DenseNet for automatic kidney segmentation in ultrasound images
CN117036288A (en) Tumor subtype diagnosis method for full-slice pathological image
Duvieusart et al. Multimodal cardiomegaly classification with image-derived digital biomarkers
CN111753736A (en) Human body posture recognition method, device, equipment and medium based on packet convolution
CN116452793A (en) Multi-view and multi-level-based green coding and decoding significant target detection method
CN115409812A (en) CT image automatic classification method based on fusion time attention mechanism
CN114419061A (en) Method and system for segmenting pulmonary artery and vein blood vessels
Zhang et al. Nucleus image segmentation method based on GAN network and FCN model
Hao et al. Bone age estimation with x-ray images based on efficientnet pre-training model
Bhandari et al. Chest abnormality detection from x-ray using deep learning
Shen et al. HarDNet and Dual-Code Attention Mechanism based model for medical images segmentation
CN113379735B (en) Labeling method and system for CT colon image content
CN117393100B (en) Diagnostic report generation method, model training method, system, equipment and medium
Zhang et al. MSAA-Net: a multi-scale attention-aware U-Net is used to segment the liver
CN116740041B (en) CTA scanning image analysis system and method based on machine vision
Baskar DeepNet model empowered cuckoo search algorithm for the effective identification of lung cancer nodules

Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant