CN116703850A - Medical image segmentation method based on domain adaptation

Medical image segmentation method based on domain adaptation

Info

Publication number: CN116703850A
Application number: CN202310632806.1A
Authority: CN (China)
Prior art keywords: image, entropy, domain, features, source domain
Legal status: Pending
Original language: Chinese (zh)
Inventors: 郑恺潇, 许金山, 汪梦婷
Applicant and current assignee: Zhejiang University of Technology (ZJUT)
Application filed by Zhejiang University of Technology (ZJUT), with priority to CN202310632806.1A

Classifications

    • G06T 7/0012 Biomedical image inspection
    • G06T 7/11 Region-based segmentation
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G06N 3/048 Activation functions
    • G06N 3/08 Learning methods
    • G06V 10/44 Local feature extraction
    • G06V 10/7715 Feature extraction by transforming the feature space
    • G06V 10/774 Generating sets of training patterns
    • G06V 10/82 Recognition using neural networks
    • G06T 2207/10081 Computed x-ray tomography [CT]
    • G06T 2207/10088 Magnetic resonance imaging [MRI]
    • G06T 2207/20021 Dividing image into blocks, subimages or windows
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • Y02T 10/40 Engine management systems


Abstract

A medical image segmentation method based on domain adaptation comprises the following steps: constructing the overall model architecture and, according to the overall network layout, building an encoder network framework that converts image features from low resolution to high resolution. After the image features are extracted, noise in the image is removed by a reparameterization method. A regularization method measures the data distributions of the source and target domains, thereby optimizing the training objective of the network. A decoder network framework is constructed that reconstructs the image from the labels and the reparameterized features, improving the model's ability to extract image structure information. The three reparameterized results are fed into a segmenter, which predicts segmentation results at different scales. An entropy discriminator network framework is constructed; it takes the entropy maps of the soft segmentation results produced by the segmenter and makes the entropy distributions of the source and target domains similar, thereby indirectly minimizing entropy. The invention exploits the similarity between labeled and unlabeled data to transfer existing knowledge and achieves domain-adaptive medical image segmentation.

Description

Medical image segmentation method based on domain adaptation
Technical Field
The invention relates to the field of artificial-intelligence-assisted medical image diagnosis, and in particular to a medical image segmentation method based on domain adaptation.
Background
Medical image analysis is an important component of computer vision and is widely used in the diagnosis of various diseases. Medical images are acquired by various medical imaging devices and present the internal tissues of the body non-invasively, allowing doctors to make clinical diagnoses from the image information. Medical image segmentation is a key step of medical image analysis: it supports downstream tasks such as disease diagnosis, treatment, and condition monitoring, and the quality of the segmentation directly affects the subsequent treatment outcome.
With the development of artificial intelligence, medical image segmentation methods based on deep learning have gradually emerged and can greatly improve diagnostic accuracy compared with traditional methods. However, many problems remain in this area. On the one hand, medical images come in multiple imaging modalities, organ appearance differs across individuals, and the images tend to have low contrast and unclear overall structure. On the other hand, building a medical image dataset with a large number of samples is difficult: medical images contain many irregularly shaped tissues, so annotation is very costly and requires a great deal of time and effort. These problems pose a great challenge for medical image segmentation, and existing methods struggle to solve them fully.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a medical image segmentation method based on domain adaptation.
The invention is built on a UNet model, combines discrepancy-based and adversarial domain adaptation, explicitly measures the inter-domain distance, and constrains the model in the output space with adversarial entropy to optimize the prediction results.
A medical image segmentation method based on domain adaptation comprises the following specific steps:
Step S1: the cardiac CT and MR medical images in the MMWHS dataset are used as source domain data. For each 3D image, 16 slices are sampled from the long-axis view around the center of the left ventricular cavity and cropped to 240×220.
Step S2: first, the overall model architecture is constructed, and an encoder network framework is built according to the overall network layout. Four downsampling operations are applied to the source-domain and target-domain images to extract image features, followed by upsampling operations that convert the image features from low resolution to high resolution.
Step S3: after the image features are extracted, the features are mapped to a low-dimensional space by three reparameterization steps, removing noise in the image and generating new data that is similar to the input data but noise-free.
Step S4: a regularization method measures the data distributions of the source and target domains; this measure is used as a loss term and combined with the other optimization objectives to train the network.
Step S5: a decoder network framework is constructed. It concatenates the label with the reparameterized encoder features and feeds them into several consecutive convolution blocks to reconstruct the image, improving the model's ability to extract image structure information.
Step S6: the three reparameterized results are fed into a segmenter, which predicts segmentation results at different scales.
Step S7: an entropy discriminator network framework is constructed. The model adopts a PatchGan architecture and takes as input the entropy maps of the soft segmentation results produced by the segmenter; the entropy discriminator aligns the entropy-map distributions of the source and target domains, making their entropy distributions similar so as to indirectly minimize entropy.
The invention provides a medical image segmentation method based on domain adaptation. It exploits the similarity between labeled and unlabeled data to transfer existing knowledge across domains, preserves the transferable common features of the source and target domains through an explicit distance measure and entropy minimization, and reduces the inter-domain distribution discrepancy. In addition, based on adversarial entropy minimization, a discriminator loss is used to optimize the model and adjust the weighted self-information distributions of the source and target domains, indirectly minimizing the prediction entropy of the target domain and producing sharper semantic segmentation output and finer object edges. The method forces the soft segmentations of the source and target domains to be similar, encouraging the segmentation network to produce similar predictions for both domains.
The invention has the following advantages. The source-domain and target-domain data have a strong similarity in semantic layout, but explicit distance-metric approaches, which optimize the model by minimizing a feature loss between the source and target domains, ignore this similarity and produce uncertain predictions for the target domain. To suppress this uncertainty, the invention further uses an entropy-driven adversarial learning model: by forcing the entropy maps of the source and target domains to be similar, the inter-domain discrepancy is reduced, and entropy minimization prevents the model from overfitting, improving generalization and robustness.
Drawings
FIG. 1 is the overall model architecture of the invention;
FIG. 2 is the encoder network architecture of the invention;
FIG. 3 is the reparameterization module of the invention;
FIG. 4 is the entropy discriminator of the invention;
FIG. 5 is a flow chart of the method of the invention.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments are described below clearly and completely. The described embodiments are some, but not all, embodiments of the present invention; all other embodiments obtained by those skilled in the art from these embodiments without inventive effort fall within the scope of protection of the present invention.
A medical image segmentation method based on domain adaptation comprises the following specific steps:
Step S1: the cardiac CT and MR medical images in the MMWHS dataset are used as source domain data. For each 3D image, 16 slices are sampled from the long-axis view around the center of the left ventricular cavity and cropped to 240×220.
The dataset processing of step S1 is detailed as follows:
In the experiments, each image is cropped to 240×220, normalized, and fed into the network. The CT images and the MR images are each used in turn as the source domain, and segmentation performance is evaluated on the corresponding target domain.
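The crop-and-normalize preprocessing described above can be sketched as follows. This is a minimal illustration, not the patent's code: the function names are invented, and min-max normalization is an assumption, since the text only says the images are normalized.

```python
import numpy as np

def center_crop(img, out_h=240, out_w=220):
    """Center-crop a 2D slice to out_h x out_w (sizes from step S1)."""
    h, w = img.shape
    top = (h - out_h) // 2
    left = (w - out_w) // 2
    return img[top:top + out_h, left:left + out_w]

def min_max_normalize(img, eps=1e-8):
    """Scale intensities to [0, 1]; the exact normalization used in the
    patent is not stated, so min-max scaling here is an assumption."""
    img = img.astype(np.float64)
    return (img - img.min()) / (img.max() - img.min() + eps)

# Example: one 256x256 slice cropped to 240x220 and normalized.
slice_2d = np.random.default_rng(0).normal(size=(256, 256))
prepared = min_max_normalize(center_crop(slice_2d))
```

The prepared 240×220 slices would then be batched and fed to the network.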
Step S2: first, the overall model architecture is constructed, and an encoder network framework is built according to the overall network layout. Four downsampling operations are applied to the source-domain and target-domain images to extract image features, followed by upsampling operations that convert the image features from low resolution to high resolution.
The overall model architecture and encoder network framework of step S2 are detailed as follows:
First, the overall model architecture is constructed. The model consists mainly of three modules: a generator module, a data-distribution measurement module, and an entropy discriminator module; the generator module comprises an encoder, a decoder, and a segmenter. The encoder network framework is then built according to the overall network layout.
Step S21: four downsampling operations are applied to the source-domain and target-domain images to extract image features, followed by upsampling operations that convert the image features from low resolution to high resolution. The encoder has a U-shaped structure: it downsamples the image four times and extracts image features.
Step S22: after the image features are extracted, four upsampling operations convert the low-resolution features into high-resolution features, and skip connections provide the contextual information of the image, reducing the semantic gap between features at different levels. Each upsampling yields image information at a different level: the feature map from the second upsampling is one quarter of the original image size and contains the global information of the image; the feature map from the third upsampling is one half of the original image size and contains local information, from which the local structure and texture details of the image can be analyzed; the fourth upsampling restores the image to its original size, and its feature map contains the smaller structures and fine details of the image. The encoder outputs the feature maps from these three upsampling levels, so that the segmenter and decoder receive multi-level image features, strengthening the constraints on the model.
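The repeated 2x downsampling and upsampling of steps S21 and S22 can be sketched at the shape level as follows. Average pooling and nearest-neighbour upsampling are stand-ins here (the patent does not specify the operators), and a 256×256 input is used so the four halvings divide evenly; the 240×220 crops would need padding in practice.

```python
import numpy as np

def downsample2x(x):
    """2x downsampling by average pooling (one of several options; the
    patent does not specify the pooling type)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample2x(x):
    """2x nearest-neighbour upsampling."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

# Four downsamplings then four upsamplings, as in steps S21/S22.
x = np.arange(256 * 256, dtype=float).reshape(256, 256)
feat = x
for _ in range(4):
    feat = downsample2x(feat)        # 256 -> 128 -> 64 -> 32 -> 16
restored = feat
for _ in range(4):
    restored = upsample2x(restored)  # 16 -> 32 -> 64 -> 128 -> 256
```

In the actual network each level is a learned convolution block rather than a fixed pooling/interpolation, and the skip connections concatenate encoder features into the upsampling path.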
Step S3: after the image features are extracted, the features are mapped to a low-dimensional space by three reparameterization steps, removing noise in the image and generating new data that is similar to the input data but noise-free.
The reparameterization of step S3 is detailed as follows:
Step S31: after the image features are extracted, a neural network computes the mean and variance of the input features, from which data similar to the original features is generated.
Step S32: to prevent the overfitting that would result from sampling directly with the mean and variance, noise is drawn from a standard normal distribution during sampling, scaled by the standard deviation, and added to the mean. This yields diverse samples and keeps the model stable.
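The sampling scheme of steps S31 and S32 matches the standard reparameterization trick, which can be sketched as follows; the log-variance parameterization is an assumption made here for numerical convenience.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Reparameterization trick of step S3: draw eps ~ N(0, I), scale it
    by the standard deviation, and add the mean, so sampling stays
    differentiable with respect to mu and log_var."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

rng = np.random.default_rng(42)
mu = np.zeros((4, 8))        # toy mean predicted by the encoder
log_var = np.zeros((4, 8))   # log-variance (exp(0) = unit variance)
z = reparameterize(mu, log_var, rng)
```

With a vanishing variance the sample collapses to the mean, which is why the injected noise is what provides the diversity mentioned in step S32.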
Step S4: a regularization method measures the data distributions of the source and target domains; this measure is used as a loss term and combined with the other optimization objectives to train the network.
The explicit measurement method of step S4 is detailed as follows:
The latent features of the source and target domains obtained through the encoder are z_s and z_t, respectively. Their probability density functions are written p(z_s; θ_s) and p(z_t; θ_t), where θ_s and θ_t are the parameters to be learned, and the posterior probabilities are written q(z_s | x_s) and q(z_t | x_t). Approximations of the posterior distributions are obtained through reparameterized modeling. Owing to the domain gap between the source and target domains, the extracted features z_s and z_t follow different distributions, so the two posteriors are not equal; by computing the distance between them, the two are constrained toward the same distribution. M samples are drawn independently and at random from the source domain and from the target domain, and the regularized distance is computed as shown in equation (1):

D(z_s, z_t) = (1/M²) Σ_{i=1..M} Σ_{j=1..M} [ k(z_s^i, z_s^j) + k(z_t^i, z_t^j) − 2·k(z_s^i, z_t^j) ]   (1)

where z_s^i denotes the i-th sample from the source domain and z_t^j denotes the j-th sample from the target domain. Since the latent-space variables follow a normal distribution with mean μ and variance σ², the kernel is defined as shown in equation (2):

k(z, z′) = exp( −‖z − z′‖² / (2σ²) )   (2)
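A regularized distance of the form of equation (1) with a Gaussian kernel as in equation (2) can be sketched as follows. The equations in the source are partly illegible, so this is an illustrative maximum-mean-discrepancy estimate under those assumptions, not a verbatim implementation of the patent's loss.

```python
import numpy as np

def gaussian_kernel(a, b, sigma2=1.0):
    """Gaussian kernel k(a, b) = exp(-||a - b||^2 / (2 * sigma2)); the
    exact kernel and bandwidth of equation (2) are assumptions here."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma2))

def mmd2(zs, zt, sigma2=1.0):
    """Biased squared-MMD estimate between M source and M target latent
    samples, matching the structure of equation (1)."""
    k_ss = gaussian_kernel(zs, zs, sigma2).mean()
    k_tt = gaussian_kernel(zt, zt, sigma2).mean()
    k_st = gaussian_kernel(zs, zt, sigma2).mean()
    return k_ss + k_tt - 2.0 * k_st

rng = np.random.default_rng(0)
zs = rng.normal(0.0, 1.0, size=(32, 16))   # source-domain latents
zt = rng.normal(0.5, 1.0, size=(32, 16))   # shifted target-domain latents
loss = mmd2(zs, zt)
```

The estimate is zero when both sample sets coincide and grows as the two latent distributions drift apart, which is what makes it usable as the loss term of step S4.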
Step S5: a decoder network framework is constructed. It concatenates the label with the reparameterized encoder features and feeds them into several consecutive convolution blocks to reconstruct the image, improving the model's ability to extract image structure information.
The decoder network framework of step S5 is detailed as follows:
The decoder concatenates the label with the reparameterized encoder features and feeds them into several consecutive convolution blocks to reconstruct the image, improving the model's ability to extract image structure information. Each block consists of a convolution layer, an instance normalization layer, and an activation layer. The source domain and the target domain share the same generator structure, each comprising an encoder, a decoder, and a segmenter, and the encoders share weights. The difference is that the source domain reconstructs the image from the reparameterized features and the segmentation labels, whereas the target domain, which has no labels, reconstructs the image from the reparameterized features and the predicted segmentation results.
Step S6: the three reparameterized results are fed into a segmenter, which predicts segmentation results at different scales.
The segmenter prediction of step S6 is detailed as follows:
After reparameterization removes the noise from the features, the reparameterized results are fed into the segmenter, which predicts segmentation results at different scales. The three segmentation results respectively carry the overall structural information and the texture details of the target region and are complementary; a single convolution layer fuses the predictions from the different scales into the final segmentation result.
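The multi-scale fusion of step S6 can be sketched as follows. The patent fuses the three scales with a single convolution layer; here a per-scale weighted sum after nearest-neighbour resizing plays that role (which is what a 1x1 convolution over the concatenated maps reduces to). All shapes and weights are illustrative.

```python
import numpy as np

def upsample_to(prob, h, w):
    """Nearest-neighbour resize of a (C, h0, w0) map to (C, h, w)."""
    c, h0, w0 = prob.shape
    rows = (np.arange(h) * h0) // h
    cols = (np.arange(w) * w0) // w
    return prob[:, rows][:, :, cols]

def fuse_predictions(preds, weights=None):
    """Fuse segmenter outputs from different scales into one map."""
    h = max(p.shape[1] for p in preds)
    w = max(p.shape[2] for p in preds)
    if weights is None:
        weights = [1.0 / len(preds)] * len(preds)
    out = np.zeros((preds[0].shape[0], h, w))
    for p, wgt in zip(preds, weights):
        out += wgt * upsample_to(p, h, w)   # resize, then weighted sum
    return out

rng = np.random.default_rng(1)
c = 4  # illustrative number of cardiac substructure classes
preds = [rng.random((c, 60, 55)), rng.random((c, 120, 110)),
         rng.random((c, 240, 220))]
fused = fuse_predictions(preds)
```

The quarter-, half-, and full-resolution maps correspond to the three upsampling levels output by the encoder in step S22.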
Step S7: an entropy discriminator network framework is constructed. The model adopts a PatchGan architecture and takes as input the entropy maps of the soft segmentation results produced by the segmenter; the entropy discriminator aligns the entropy-map distributions of the source and target domains, making their entropy distributions similar so as to indirectly minimize entropy.
The above-mentioned step S7 is described in detail as follows:
step S71, a discriminatory loss optimization model is utilized based on antagonism entropy minimization, weighted self-information distribution of a source domain and a target domain is adjusted, prediction entropy of the target domain is minimized indirectly, and clearer semantic segmentation output and finer object edges are generated. Knowing the input source domain image x s Pixel-level probability prediction of (2)Entropy mapping is obtained by means of shannon entropy calculation, as shown in a formula (3):
where n is the number of images and c is the number of channels.
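The per-pixel Shannon entropy that forms these entropy maps can be sketched as follows; the class count and map size are illustrative.

```python
import numpy as np

def entropy_map(prob, eps=1e-12):
    """Pixel-wise Shannon entropy of a (C, H, W) softmax output, as used
    for the entropy maps fed to the discriminator."""
    return -(prob * np.log(prob + eps)).sum(axis=0)

c, h, w = 4, 8, 8
uniform = np.full((c, h, w), 1.0 / c)          # maximally uncertain prediction
onehot = np.zeros((c, h, w)); onehot[0] = 1.0  # fully confident prediction

e_uniform = entropy_map(uniform)
e_onehot = entropy_map(onehot)
```

Uncertain predictions give high entropy everywhere (log C per pixel for the uniform case) and confident predictions give entropy near zero, which is why pushing target-domain entropy maps toward source-domain ones sharpens the target predictions.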
Step S72: an adversarial training framework is introduced to build an entropy discriminator D_E that aligns the entropy maps E(x_s) and E(x_t), making the entropy distributions of the source and target domains similar so as to indirectly minimize entropy. The purpose of the entropy discriminator is to determine whether an entropy map comes from the source domain or the target domain. The framework uses a GAN network to match the entropy distributions of the source-domain and target-domain data. The discriminator uses the PatchGan architecture to discriminate between 8×8 patches. It contains 5 convolution layers with feature-map counts [64, 128, 256, 512, 1] and 4×4 convolution kernels; the first convolution layer has stride 1, the middle convolution layers have stride 2 and use a LeakyReLU activation with slope 0.2, and the last layer uses a sigmoid activation. The discriminator's inputs are the entropy maps of the source and target domains, and its output is 0 or 1 as the domain classification result: 0 indicates an entropy map from the source domain and 1 an entropy map from the target domain.
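The spatial size of the 5-layer PatchGan output can be checked with a small helper. Kernel size 4×4 and the stride pattern (first layer 1, middle layers 2, and an assumed final stride of 1) follow the description above; padding 1 is an assumption, since the patent does not state it.

```python
def conv_out(size, kernel=4, stride=1, pad=1):
    """Output spatial size of one convolution layer."""
    return (size + 2 * pad - kernel) // stride + 1

def patchgan_out(size, strides=(1, 2, 2, 2, 1), kernel=4, pad=1):
    """Spatial size of the 5-layer discriminator output for a square
    input; each output cell classifies one patch of the entropy map."""
    for s in strides:
        size = conv_out(size, kernel, s, pad)
    return size
```

For example, a 64×64 entropy map yields a 6×6 grid of patch scores and a 128×128 map a 14×14 grid under these assumptions; each score is the sigmoid source/target decision for its patch.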
The invention is a medical image segmentation method based on domain adaptation. The specific operating steps are described below, and effect diagrams are provided to demonstrate the effectiveness of the method.
The cardiac CT and MR medical images in the MMWHS dataset are used in turn as source domain data; for each 3D image, 16 slices are sampled from the long-axis view around the center of the left ventricular cavity and cropped to 240×220.
In the experiments, the overall model architecture is constructed first, and the encoder network framework is built according to the overall network layout: four downsampling operations are applied to the source-domain and target-domain images to extract image features, followed by upsampling operations that convert the image features from low resolution to high resolution. The overall model architecture is shown in FIG. 1, and the encoder network architecture in FIG. 2.
After the image features are extracted, the features are mapped to a low-dimensional space by three reparameterization steps, removing noise in the image and generating new data that is similar to the input data but noise-free. A regularization method then measures the data distributions of the source and target domains; this measure is used as a loss term and combined with the other optimization objectives to train the network. The reparameterization module of the encoder is shown in FIG. 3.
Next, the decoder network framework, the segmenter network framework, and the entropy discriminator network framework are constructed. The decoder concatenates the label with the reparameterized encoder features and feeds them into several consecutive convolution blocks to reconstruct the image, improving the model's ability to extract image structure information. The segmenter predicts segmentation results at different scales from the three reparameterized results. The entropy discriminator adopts a PatchGan architecture, takes as input the entropy maps of the soft segmentation results produced by the segmenter, aligns the entropy-map distributions of the source and target domains, and makes their entropy distributions similar so as to indirectly minimize entropy. The entropy discriminator network framework is shown in FIG. 4.

Claims (8)

1. A medical image segmentation method based on domain adaptation, comprising the following steps:
step S1: using the cardiac CT and MR medical images in the MMWHS dataset in turn as source domain data, sampling, for each 3D image, 16 slices from the long-axis view around the center of the left ventricular cavity, and cropping them to 240×220;
step S2: first constructing the overall model architecture and building an encoder network framework according to the overall network layout, applying four downsampling operations to the source-domain and target-domain images to extract image features, then applying upsampling operations to convert the image features from low resolution to high resolution;
step S3: after the image features are extracted, mapping the features to a low-dimensional space by three reparameterization steps, removing noise in the image, and generating new data that is similar to the input data but noise-free;
step S4: measuring the data distributions of the source and target domains with a regularization method, the measure being used as a loss term and combined with the other optimization objectives to train the network;
step S5: constructing a decoder network framework that concatenates the label with the reparameterized encoder features and feeds them into several consecutive convolution blocks to reconstruct the image, improving the model's ability to extract image structure information;
step S6: feeding the three reparameterized results into a segmenter and predicting segmentation results at different scales;
step S7: constructing an entropy discriminator network framework, the model adopting a PatchGan architecture and taking as input the entropy maps of the soft segmentation results produced by the segmenter, so that the entropy discriminator aligns the entropy-map distributions of the source and target domains, making their entropy distributions similar so as to indirectly minimize entropy.
2. The domain-adaptation-based medical image segmentation method according to claim 1, wherein step S1 specifically comprises:
cropping each image to 240×220, normalizing it, and feeding it into the network; the CT images and the MR images are each used in turn as the source domain, and segmentation performance is evaluated on the corresponding target domain.
3. A field-adaptive based medical image segmentation method according to claim 1, wherein: and S2, constructing a model overall architecture, wherein the model mainly comprises three modules, namely a generator module, a data distribution measurement module and an entropy discriminator module, and the generator module comprises an encoder, a decoder and a divider. According to the overall network layout, constructing an encoder network framework, which comprises the following specific steps:
step S21: the encoder, arranged in a U-shaped structure, performs four down-sampling operations on the source-domain and target-domain images to extract image features;
step S22: after the image features are extracted, four up-sampling operations convert the low-resolution features back to high resolution, and skip connections supply the contextual information of the image while reducing the semantic gap between features of different levels. Each up-sampling step recovers image information at a different level: the feature map obtained by the second up-sampling is one quarter of the original image size and carries the global information of the image; the feature map obtained by the third up-sampling is one half of the original size and carries local information, from which local structures and texture details can be analyzed; the fourth up-sampling restores the image to the original size, and its feature map carries the finer structures and details of the image. The encoder outputs the feature maps from the last three up-sampling levels, so that the segmenter and the decoder receive multi-level image features, strengthening the constraints on the model.
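The down/up-sampling bookkeeping of steps S21–S22 can be sketched by tracing feature-map shapes through the U-shaped network. Here base_ch=64 and the power-of-two input size are illustrative assumptions (240×220 does not divide evenly through four halvings, so a real implementation would pad or crop):

```python
def unet_trace(h: int, w: int, base_ch: int = 64, depth: int = 4):
    """Trace (channels, height, width) through a U-shaped network:
    four 2x down-samplings, then four 2x up-samplings with skip
    concatenation (steps S21/S22). Returns all stage shapes plus the
    last three up-sampled resolutions (1/4, 1/2 and full), which the
    encoder exposes to the segmenter and decoder."""
    skips, ch = [], base_ch
    stages = [("input", 1, h, w)]
    for d in range(depth):                       # down path
        skips.append((ch, h, w))
        h, w, ch = h // 2, w // 2, ch * 2
        stages.append((f"down{d + 1}", ch, h, w))
    outputs = []
    for d in range(depth):                       # up path with skip concat
        h, w, ch = h * 2, w * 2, ch // 2
        sc, sh, sw = skips.pop()
        assert (sh, sw) == (h, w), "skip/up resolution mismatch"
        stages.append((f"up{d + 1}", ch + sc, h, w))
        if d >= depth - 3:                       # last three scales are output
            outputs.append((h, w))
    return stages, outputs
```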
4. The domain-adaptation-based medical image segmentation method according to claim 1, wherein step S3 specifically comprises:
step S31: after the image features are extracted, a neural network computes the mean and variance of the input features, from which data similar to the original features are generated;
step S32: to prevent the overfitting caused by sampling directly from the mean and variance, noise is drawn from a standard normal distribution during sampling; the noise is scaled by the standard deviation and added to the mean, generating diverse samples while keeping the model stable (the re-parameterization trick).
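The sampling of step S32 is the standard re-parameterization trick; a NumPy sketch follows (the log-variance parameterization is an assumption, commonly used in practice):

```python
import numpy as np

def reparameterize(mu: np.ndarray, log_var: np.ndarray,
                   rng: np.random.Generator) -> np.ndarray:
    """z = mu + sigma * eps with eps ~ N(0, I) (step S32).

    Drawing the noise from a standard normal, then scaling it by the
    standard deviation and adding the mean, yields diverse samples
    while keeping the sampling step differentiable w.r.t. mu/log_var.
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps
```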
5. The domain-adaptation-based medical image segmentation method according to claim 1, wherein step S4 specifically comprises:
the latent features of the source domain and the target domain obtained through the encoder are denoted z_s and z_t, respectively. Their probability density functions are written p_{\theta_s}(z_s) and p_{\theta_t}(z_t), where \theta_s and \theta_t are the parameters to be learned, and the posterior probabilities are written q_{\theta_s}(z_s \mid x_s) and q_{\theta_t}(z_t \mid x_t). Approximations of the posterior distributions, \hat{q}_{\theta_s}(z_s \mid x_s) and \hat{q}_{\theta_t}(z_t \mid x_t), are obtained through re-parameterized modeling. Owing to the domain gap between the source domain and the target domain, the extracted features z_s and z_t follow different distributions, so \hat{q}_{\theta_s}(z_s \mid x_s) \neq \hat{q}_{\theta_t}(z_t \mid x_t); they are constrained toward the same distribution by minimizing the distance between \hat{q}_{\theta_s}(z_s \mid x_s) and \hat{q}_{\theta_t}(z_t \mid x_t). m samples are drawn independently and at random from the source domain and the target domain, and the regularization distance is computed as in formula (1), reconstructed here as the standard squared maximum mean discrepancy (MMD) estimate:

D(z_s, z_t) = \frac{1}{m^2} \sum_{i=1}^{m} \sum_{i'=1}^{m} k\left(z_s^{(i)}, z_s^{(i')}\right) - \frac{2}{m^2} \sum_{i=1}^{m} \sum_{j=1}^{m} k\left(z_s^{(i)}, z_t^{(j)}\right) + \frac{1}{m^2} \sum_{j=1}^{m} \sum_{j'=1}^{m} k\left(z_t^{(j)}, z_t^{(j')}\right)   (1)

where z_s^{(i)} denotes the i-th sample from the source domain and z_t^{(j)} the j-th sample from the target domain. Since the latent-space variables follow a normal distribution, the kernel is defined as in formula (2), reconstructed here as a Gaussian kernel over the latent dimensions:

k(z, z') = \exp\left( -\sum_{l} \frac{(z_l - z'_l)^2}{2\sigma_l^2} \right)   (2)

where z_l denotes the l-th element of z, \sigma^2 the variance and u the mean of the latent distribution.
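The regularization distance of formula (1) can be sketched as a biased squared-MMD estimator with a Gaussian kernel; the exact kernel and bandwidth handling in the claim are only partially legible, so this version is an assumption:

```python
import numpy as np

def gaussian_kernel(a: np.ndarray, b: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Pairwise Gaussian kernel k(z, z') = exp(-||z - z'||^2 / (2 sigma^2))."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2(zs: np.ndarray, zt: np.ndarray, sigma: float = 1.0) -> float:
    """Biased squared MMD between m source and m target latent samples,
    matching the three-term structure of formula (1)."""
    return float(
        gaussian_kernel(zs, zs, sigma).mean()
        - 2.0 * gaussian_kernel(zs, zt, sigma).mean()
        + gaussian_kernel(zt, zt, sigma).mean()
    )
```

Minimizing this quantity pulls the two latent distributions together, which is the constraint the claim describes.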
6. The domain-adaptation-based medical image segmentation method according to claim 1, wherein step S5 specifically comprises:
the decoder concatenates the label with the re-parameterized encoder features and feeds them into several consecutive convolution blocks to reconstruct the image, improving the model's ability to extract image structure information; each block consists of a convolution layer, an instance-normalization layer and an activation layer. The source domain and the target domain have identical generators, each comprising an encoder, a decoder and a segmenter, with the encoders sharing weights. The difference is that the source domain reconstructs the image from the re-parameterized features and the segmentation labels, whereas the target domain, having no labels, reconstructs the image from the re-parameterized features and the predicted segmentation results.
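Each decoder block uses instance normalization; a NumPy sketch of that layer (the learnable affine scale/shift parameters are omitted for brevity):

```python
import numpy as np

def instance_norm(x: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Instance normalization as used in each decoder block (step S5):
    every channel of every sample is normalized over its own spatial
    dimensions, unlike batch norm, which pools statistics over the batch.
    x has shape (batch, channels, height, width)."""
    mean = x.mean(axis=(-2, -1), keepdims=True)
    var = x.var(axis=(-2, -1), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)
```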
7. The domain-adaptation-based medical image segmentation method according to claim 1, wherein step S6 specifically comprises:
after re-parameterization has removed the noise from the features, the re-parameterized results are input into the segmenter, which predicts segmentation results at different scales; the three segmentation results capture, respectively, the overall structural information and the texture details of the target region and are complementary, and a single convolution layer fuses the predictions from the different scales into the final segmentation result.
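A sketch of the multi-scale fusion in step S6 for a single foreground channel. Nearest-neighbour upsampling and the sigmoid are assumptions; for one output channel, the 1×1 fusion convolution reduces to a weighted sum:

```python
import numpy as np

def fuse_predictions(preds, weights, bias: float = 0.0) -> np.ndarray:
    """Fuse soft segmentation logits predicted at different scales.

    Each coarser map is nearest-neighbour upsampled to the finest
    resolution; a single 1x1 convolution over the stacked maps (for
    one output channel, a weighted sum plus bias) gives the final
    soft segmentation result.
    """
    h, w = preds[-1].shape
    upsampled = []
    for p in preds:
        fy, fx = h // p.shape[0], w // p.shape[1]
        upsampled.append(np.repeat(np.repeat(p, fy, axis=0), fx, axis=1))
    fused = sum(wgt * u for wgt, u in zip(weights, upsampled)) + bias
    return 1.0 / (1.0 + np.exp(-fused))   # sigmoid keeps a soft mask
```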
8. The domain-adaptation-based medical image segmentation method according to claim 1, wherein step S7 specifically comprises:
step S71: the model is optimized with the discriminator loss, adjusting the weighted self-information distributions of the source domain and the target domain through adversarial entropy minimization; this indirectly minimizes the prediction entropy of the target domain and yields sharper semantic segmentation output and finer object edges. Given the pixel-level probability prediction P_{x_s}^{(h,w,k)} for an input source-domain image x_s, the entropy map is obtained via the Shannon entropy, as in formula (3), reconstructed here as the normalized per-pixel Shannon entropy:

E_x^{(h,w)} = -\frac{1}{\log c} \sum_{k=1}^{c} P_x^{(h,w,k)} \log P_x^{(h,w,k)}   (3)

wherein n is the number of images, over which the entropy loss is averaged, and c is the number of channels;
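The entropy map of formula (3) can be sketched directly; the 1/log c normalization, which maps the entropy into [0, 1], is an assumption consistent with adversarial entropy-minimization practice:

```python
import numpy as np

def entropy_map(probs: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Normalized per-pixel Shannon entropy of a soft prediction.

    probs has shape (c, h, w) with the c channel probabilities at each
    pixel summing to 1; the output lies in [0, 1], reaching 1 for a
    uniform prediction and 0 for a one-hot (confident) prediction.
    """
    c = probs.shape[0]
    ent = -(probs * np.log(probs + eps)).sum(axis=0)
    return ent / np.log(c)
```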
step S72: an adversarial training framework is introduced to construct an entropy discriminator D_E that aligns the entropy maps E(x_s) and E(x_t), making the entropy distributions of the source domain and the target domain similar so as to indirectly minimize entropy. The purpose of the entropy discriminator is to determine whether an entropy map comes from the source domain or the target domain; the framework uses a GAN to match the entropy distributions of the source-domain and target-domain data. The discriminator adopts a PatchGAN architecture that discriminates 8×8 patches. It contains 5 convolution layers with feature-map counts of [64, 128, 256, 512, 1] per layer, a 4×4 convolution kernel and a stride of 1 in the first layer; each intermediate convolution layer has a stride of 2 and is followed by a LeakyReLU activation function with slope parameter 0.2; the last layer uses a sigmoid activation function. The input to the discriminator is the entropy map of a source-domain or target-domain image, and the output is the domain classification result, 0 or 1, where 0 indicates an entropy map from the source domain and 1 an entropy map from the target domain.
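The adversarial objective of step S72 can be sketched as binary cross-entropy over the discriminator's patch outputs, using the label convention stated in the claim (0 = source, 1 = target). The discriminator itself is treated as a black box here, and these loss functions are illustrative assumptions about how it is trained:

```python
import numpy as np

def bce(pred: np.ndarray, target: np.ndarray, eps: float = 1e-12) -> float:
    """Binary cross-entropy averaged over all patch outputs."""
    return float(-(target * np.log(pred + eps)
                   + (1.0 - target) * np.log(1.0 - pred + eps)).mean())

def discriminator_loss(d_src: np.ndarray, d_tgt: np.ndarray) -> float:
    """Train D_E to output 0 on source entropy maps and 1 on target ones."""
    return bce(d_src, np.zeros_like(d_src)) + bce(d_tgt, np.ones_like(d_tgt))

def adversarial_loss(d_tgt: np.ndarray) -> float:
    """Train the segmenter so target entropy maps look like source (label 0),
    pushing the two entropy distributions together and thereby indirectly
    minimizing target-domain entropy."""
    return bce(d_tgt, np.zeros_like(d_tgt))
```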
CN202310632806.1A 2023-05-30 2023-05-30 Medical image segmentation method based on field self-adaption Pending CN116703850A (en)

Publications (1)

Publication Number Publication Date
CN116703850A true CN116703850A (en) 2023-09-05

Family

ID=87823209


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117786823A (en) * 2024-02-26 2024-03-29 陕西天润科技股份有限公司 Light weight processing method based on building monomer model
CN117786823B (en) * 2024-02-26 2024-05-03 陕西天润科技股份有限公司 Light weight processing method based on building monomer model


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination