CN115661144B - Adaptive medical image segmentation method based on deformable U-Net - Google Patents

Info

Publication number: CN115661144B
Application number: CN202211611945.8A
Authority: CN (China)
Prior art keywords: deformable, image, medical image, convolution, decoder
Legal status: Active (assumed by Google Patents; not a legal conclusion)
Other languages: Chinese (zh)
Other versions: CN115661144A
Inventors: 余绍黔, 周易鹏, 周涛, 刘承志
Current Assignee: Hunan University of Technology
Original Assignee: Hunan University of Technology
Application filed by Hunan University of Technology; priority to CN202211611945.8A


Landscapes

  • Measuring And Recording Apparatus For Diagnosis (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention provides an adaptive medical image segmentation method based on a deformable U-Net, belonging to the technical field of image processing. The method specifically comprises the following steps: acquiring a sample medical image data set and performing image enhancement operations to obtain a training image set, a verification image set, and their corresponding label sets; constructing an adaptive medical image segmentation model based on the deformable U-Net and training it with the training image set and its corresponding label set; verifying the model with the verification image set and its corresponding label set, and tuning and optimizing the model parameters with random number seeds to obtain the best-performing target adaptive medical image segmentation model based on the deformable U-Net; and inputting the medical image to be segmented into the target adaptive medical image segmentation model to obtain a lesion region segmentation result. The scheme of the invention improves the adaptability and accuracy of medical image segmentation.

Description

Adaptive medical image segmentation method based on deformable U-Net
Technical Field
The embodiment of the invention relates to the technical field of image processing, in particular to an adaptive medical image segmentation method based on a deformable U-Net.
Background
At present, feature extraction from a medical image is more difficult than from an ordinary RGB image because of the influence of image acquisition equipment: medical images often suffer from blurring, noise, and low contrast, and segmentation results are easily affected by factors such as the partial volume effect, gray-level non-uniformity, artifacts, and the similarity of gray levels between different soft tissues. As the volume of medical image data grows, the workload of doctors increases greatly; an automatic image segmentation method is urgently needed in clinical practice to assist doctors in rapid diagnosis, thereby improving working efficiency and reducing their workload. Automatic segmentation of medical images is therefore of great significance.
The U-Net convolutional network is a popular approach to such problems. Its encoder works in a similar way to a traditional classification CNN, continuously aggregating semantic information at the cost of reduced spatial information, while the decoder receives the semantic information from the bottom of the "U" and concatenates it with the low-level feature maps obtained from the encoder through skip connections. Medical images concern human health, and segmenting lesions or abnormal regions often requires greater precision than segmenting natural images. To segment medical images accurately, existing models improve considerably on the basic U-Net network. For example, Chinese patent publication No. CN115187621A discloses an automatic U-Net medical image contour extraction network integrating attention mechanisms, which suppresses irrelevant information and emphasizes useful information through attention to improve segmentation accuracy. However, the feature extraction module of a convolutional network is fixed, so its capacity for modeling geometric transformations is inherently limited; moreover, such network structures neglect multi-scale feature extraction, employ an overly deep convolution process, and exhibit a large semantic gap between encoder and decoder, so feature information cannot be fully utilized, easily causing rough segmentation, under-segmentation, and similar problems. In summary, a medical image segmentation network that can adaptively extract features from the lesion area is still lacking; image features are therefore extracted insufficiently, which harms medical image segmentation accuracy.
It can be seen that a deformable U-Net based adaptive medical image segmentation method with high adaptability and accuracy is needed.
Disclosure of Invention
In view of the above, the embodiment of the invention provides a deformable U-Net-based adaptive medical image segmentation method, which at least partially solves the problems of poor accuracy and adaptability in the prior art.
The embodiment of the invention provides a deformable U-Net-based adaptive medical image segmentation method, which comprises the following steps:
step 1, acquiring a sample medical image data set and performing image enhancement operation to obtain a training image set, a verification image set and a label set corresponding to the training image set and the verification image set;
step 2, constructing an initial self-adaptive medical image segmentation model based on the deformable U-Net, and carrying out model training by utilizing a training image set and a corresponding label set thereof;
step 3, verifying the self-adaptive medical image segmentation model based on the deformable U-Net by using the verification image set and the label set corresponding to the verification image set, and adjusting and optimizing parameters of the model by using random number seeds to obtain the target self-adaptive medical image segmentation model based on the deformable U-Net with the best performance;
and 4, inputting the medical image to be segmented into the target adaptive medical image segmentation model to obtain a lesion region segmentation result.
According to a specific implementation manner of the embodiment of the present invention, the step 1 specifically includes:
enhancing foreground-background contrast of a sample medical image dataset using normalization and contrast-limited adaptive histogram equalization;
introducing gamma correction to improve image quality of the sample medical image dataset;
and carrying out data enhancement on the sample medical image data set by using random rotation, random scaling and random elastic deformation modes to obtain a training image set, a verification image set and a label set corresponding to the training image set and the verification image set.
According to a specific implementation of an embodiment of the present invention, the adaptive medical image segmentation model includes an encoder, a decoder, and a residual attention convolution module and a multi-scale depth supervision module between the encoder and the decoder.
According to a specific implementation manner of the embodiment of the present invention, the step 2 specifically includes:
step 2.1, the multi-scale deformable convolutional encoding module extracts features at multiple scales in the encoder, and the deformable residual interval-skip decoding module promotes the upward transfer and utilization of features in the decoder;
step 2.2, the residual attention convolution module narrows the semantic gap between the encoder and the decoder, uses channel and spatial attention to highlight effective features and suppress irrelevant ones, and adaptively promotes feature transfer between the encoder and the decoder;
step 2.3, predicting and outputting feature maps at different positions of the decoder through the multi-scale depth supervision module, constraining each stage to learn discriminative and robust features and improving network training efficiency;
step 2.4, calculating the Dice loss function and the Focal loss function from the predicted segmentation result and the label set;
and 2.5, setting the batch size and learning rate of the experiment on each data set, adopting an Adam optimizer, taking accuracy, Dice, and IoU as evaluation metrics, and iterating steps 2.1 to 2.4 a preset number of times.
According to a specific implementation of an embodiment of the invention, the encoder comprises 4 deformable convolution modules, 4 downsampling operations and 3 interval skip connections;
the decoder includes 4 deformable convolution residual modules and 4 upsampling operations.
According to a specific implementation manner of the embodiment of the present invention, the step 2.1 specifically includes:
carrying out a 4×4 pooling operation on the features before the two deformable convolution modules in the encoder, then carrying out a 1×1 convolution operation to reduce the number of channels, and splicing the obtained features with the features extracted by the two deformable modules and the downsampling operation;
in the decoder, carrying out a 4×4 upsampling operation before the two deformable convolution modules and a 1×1 convolution operation that keeps the same number of channels, and adding the obtained feature map pixel by pixel to the feature map that has passed through the two deformable modules and the two upsampling operations.
According to a specific implementation manner of the embodiment of the present invention, the step 2.3 specifically includes:
adding multi-scale depth supervision to the 4 deformable convolution residual modules from top to bottom, upsampling each output back to the original image size, and then applying one ordinary convolution operation and one sigmoid operation to obtain 4 predicted image labels;
splicing the obtained 4 predicted image labels and applying an ordinary convolution operation and a sigmoid operation once more to obtain 1 further predicted image label; computing the loss of this fused prediction and of the original 4 predictions against the original image label; and cascading the maps from different positions by adding multi-scale depth supervision in the decoder to obtain the predicted segmentation result.
According to a specific implementation of the embodiment of the present invention, the residual attention convolution module includes a residual connection and a CBAM convolution attention mechanism module.
According to a specific implementation of an embodiment of the present invention, the loss function is herein
Figure 971675DEST_PATH_IMAGE001
wherein ,
Figure 701733DEST_PATH_IMAGE002
the loss of the mth side output is the sum of Focal loss and Dice loss, < >>
Figure 901770DEST_PATH_IMAGE003
Weights for the mth penalty term;
the expression of the Focal loss function is
Figure 478157DEST_PATH_IMAGE004
wherein ,
Figure DEST_PATH_IMAGE005
for controlling parameters of class imbalance +.>
Figure 498066DEST_PATH_IMAGE006
To control the parameters of the sample's difficulty level +.>
Figure 766236DEST_PATH_IMAGE007
The method comprises the steps of predicting a proximity degree parameter of a tag and a real tag;
the expression of the Dice loss function is that
Figure 571512DEST_PATH_IMAGE008
wherein ,
Figure 329253DEST_PATH_IMAGE009
for the aggregate similarity measure function, +.>
Figure 305299DEST_PATH_IMAGE010
Represents the number of intersection elements between sample X and sample Y, < >>
Figure 892007DEST_PATH_IMAGE011
The number of elements in samples X and Y is represented.
The adaptive medical image segmentation scheme based on the deformable U-Net in the embodiment of the invention comprises the following steps: step 1, acquiring a sample medical image data set and performing image enhancement operations to obtain a training image set, a verification image set, and their corresponding label sets; step 2, constructing an initial adaptive medical image segmentation model based on the deformable U-Net and training it with the training image set and its corresponding label set; step 3, verifying the model with the verification image set and its corresponding label set, and tuning and optimizing the model parameters with random number seeds to obtain the best-performing target adaptive medical image segmentation model based on the deformable U-Net; and step 4, inputting the medical image to be segmented into the target adaptive medical image segmentation model to obtain a lesion region segmentation result.
The embodiment of the invention has the following beneficial effects: by constructing the adaptive medical image segmentation model based on the deformable U-Net, features are extracted at multiple scales in the encoder, fully mining the context information of the lesion area, and residual connections in the decoder effectively alleviate the gradient vanishing problem of the U-Net network; the residual attention convolution module connects the encoder and the decoder, reducing the semantic difference between them through residual connections and adaptively identifying key image features with channel and spatial attention, which increases feature utilization; multi-scale depth supervision and the mixed loss function further optimize the segmentation of lesion area boundaries.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a method for adaptive medical image segmentation based on deformable U-Net according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the overall algorithm structure of a deformable U-Net-based adaptive medical image segmentation method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a deformable convolution implementation process according to an embodiment of the present disclosure;
fig. 4 is a schematic diagram of a deformable RoI pooling implementation process according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a channel and spatial attention implementation process according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a residual attention convolution module between a bottommost encoder-decoder according to an embodiment of the present invention;
fig. 7 is a schematic diagram of comparison between a common residual module, a multi-scale deformable convolutional encoding module and a deformable residual interval skip decoding module according to an embodiment of the present invention, where (a) is a common residual module, (b) is a multi-scale deformable convolutional encoding module, and (c) is a deformable residual interval skip decoding module;
FIG. 8 is a schematic diagram of a multi-scale depth supervision provided by an embodiment of the present invention;
FIG. 9 is a schematic diagram comparing a common deformable U-Net network model with the proposed model on a skin cancer dataset, wherein (a) is the original medical image, (b) is the image label, (c) is the segmentation result of the common deformable U-Net network model, and (d) is the segmentation result of the model of the invention;
fig. 10 is a schematic diagram comparing a common deformable U-Net network model with the proposed model on a brain tumor dataset according to an embodiment of the present invention, wherein (a) is the original medical image, (b) is the image label, (c) is the segmentation result of the common deformable U-Net network model, and (d) is the segmentation result of the model of the invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
Other advantages and effects of the present invention will become readily apparent to those skilled in the art from the following disclosure, which describes embodiments of the present invention with reference to specific examples. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. The invention may also be practiced or applied through other different embodiments, and the details of the present description may be modified or varied in various respects without departing from the spirit and scope of the present invention. It should be noted that the following embodiments and the features in the embodiments may be combined with each other without conflict. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the invention without inventive effort fall within the scope of protection of the invention.
It is noted that various aspects of the embodiments are described below within the scope of the following claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the present disclosure, one skilled in the art will appreciate that one aspect described herein may be implemented independently of any other aspect, and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. In addition, such apparatus may be implemented and/or such methods practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.
It should also be noted that the illustrations provided in the following embodiments merely illustrate the basic concept of the present invention by way of illustration, and only the components related to the present invention are shown in the drawings and are not drawn according to the number, shape and size of the components in actual implementation, and the form, number and proportion of the components in actual implementation may be arbitrarily changed, and the layout of the components may be more complicated.
In addition, in the following description, specific details are provided in order to provide a thorough understanding of the examples. However, it will be understood by those skilled in the art that the aspects may be practiced without these specific details.
The embodiment of the invention provides a self-adaptive medical image segmentation method based on deformable U-Net, which can be applied to a medical image analysis process of a medical scene.
Referring to fig. 1, a flow chart of a deformable U-Net-based adaptive medical image segmentation method is provided in an embodiment of the present invention. As shown in fig. 1, the method mainly comprises the following steps:
step 1, acquiring a sample medical image data set and performing image enhancement operation to obtain a training image set, a verification image set and a label set corresponding to the training image set and the verification image set;
further, the step 1 specifically includes:
enhancing foreground-background contrast of a sample medical image dataset using normalization and contrast-limited adaptive histogram equalization;
introducing gamma correction to improve image quality of the sample medical image dataset;
and carrying out data enhancement on the sample medical image data set by using random rotation, random scaling and random elastic deformation modes to obtain a training image set, a verification image set and a label set corresponding to the training image set and the verification image set.
In implementation, as shown in fig. 2, fig. 2 is the overall framework of the adaptive medical image segmentation method based on deformable U-Net. Considering that medical image datasets are mostly small in scale, training a large neural network on a limited dataset requires special attention to the overfitting problem. The invention enhances foreground-background contrast by applying normalization and contrast-limited adaptive histogram equalization (CLAHE) to the dataset, introduces gamma correction to further improve image quality, and then uses random rotation, random scaling, random elastic deformation, etc., for data enhancement.
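As a concrete illustration, the gamma-correction step described above can be sketched in numpy as follows. This is a minimal sketch under assumptions: the function name and the default gamma value are illustrative choices, not taken from the patent, and the CLAHE and elastic-deformation steps would typically rely on a library such as OpenCV rather than hand-written code.

```python
import numpy as np

def gamma_correct(img, gamma=0.8):
    """Gamma-correct an 8-bit grayscale image.

    gamma < 1 brightens dark regions, gamma > 1 darkens them;
    the default 0.8 is an illustrative assumption, not from the patent.
    """
    norm = img.astype(np.float64) / 255.0   # normalize intensities to [0, 1]
    corrected = np.power(norm, gamma)       # apply the power-law transform
    return (corrected * 255.0).astype(np.uint8)
```

On a small dataset, this kind of intensity correction combined with CLAHE and random geometric augmentation is what makes the foreground-background contrast enhancement described above concrete.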
Step 2, constructing an initial self-adaptive medical image segmentation model based on the deformable U-Net, and carrying out model training by utilizing a training image set and a corresponding label set thereof;
further, the adaptive medical image segmentation model includes an encoder, a decoder, and a residual attention convolution module and a multi-scale depth supervision module between the encoder and the decoder.
Further, the step 2 specifically includes:
step 2.1, the multi-scale deformable convolutional encoding module extracts features at multiple scales in the encoder, and the deformable residual interval-skip decoding module promotes the upward transfer and utilization of features in the decoder;
step 2.2, the residual attention convolution module narrows the semantic gap between the encoder and the decoder, uses channel and spatial attention to highlight effective features and suppress irrelevant ones, and adaptively promotes feature transfer between the encoder and the decoder;
step 2.3, predicting and outputting feature maps at different positions of the decoder through the multi-scale depth supervision module, constraining each stage to learn discriminative and robust features and improving network training efficiency;
step 2.4, calculating the Dice loss function and the Focal loss function from the predicted segmentation result and the label set;
and 2.5, setting the batch size and learning rate of the experiment on each data set, adopting an Adam optimizer, taking accuracy, Dice, and IoU as evaluation metrics, and iterating steps 2.1 to 2.4 a preset number of times.
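For concreteness, the Dice and IoU evaluation metrics named in step 2.5 can be sketched in numpy for binary masks. This is a minimal sketch; the function names and the smoothing epsilon are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def dice_coeff(pred, target, eps=1e-7):
    """Dice similarity 2|X∩Y| / (|X|+|Y|) for binary masks."""
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def iou_score(pred, target, eps=1e-7):
    """Intersection over Union |X∩Y| / |X∪Y| for binary masks."""
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (inter + eps) / (union + eps)
```

Dice counts the overlap twice in the numerator, so Dice is always at least as large as IoU for the same pair of masks, which is one reason the two are usually reported together.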
Optionally, the encoder comprises 4 deformable convolution modules, 4 downsampling operations, and 3 interval skip connections;
the decoder includes 4 deformable convolution residual modules and 4 upsampling operations.
Further, the step 2.1 specifically includes:
carrying out a 4×4 pooling operation on the features before the two deformable convolution modules in the encoder, then carrying out a 1×1 convolution operation to reduce the number of channels, and splicing the obtained features with the features extracted by the two deformable modules and the downsampling operation;
in the decoder, carrying out a 4×4 upsampling operation before the two deformable convolution modules and a 1×1 convolution operation that keeps the same number of channels, and adding the obtained feature map pixel by pixel to the feature map that has passed through the two deformable modules and the two upsampling operations.
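The shape bookkeeping of the encoder-side fusion described above can be sketched as follows. This is a numpy sketch under stated assumptions: average pooling stands in for the 4×4 pooling, and a random matrix stands in for the learned 1×1 channel-reducing convolution; all shapes and the random-seed values are illustrative.

```python
import numpy as np

def avg_pool2d(x, k):
    """Non-overlapping k x k average pooling on a (C, H, W) feature map."""
    c, h, w = x.shape
    return x.reshape(c, h // k, k, w // k, k).mean(axis=(2, 4))

def conv1x1(x, weight):
    """A 1x1 convolution is per-pixel channel mixing: weight is (C_out, C_in)."""
    return np.einsum('oc,chw->ohw', weight, x)

# Skip branch: pool the input features 4x4, halve the channels with a 1x1 conv,
# then splice (concatenate) with the main branch, which after two deformable
# modules and the downsampling also sits at 1/4 resolution.
rng = np.random.default_rng(0)
x = rng.standard_normal((16, 32, 32))           # input feature map (C, H, W)
skip = conv1x1(avg_pool2d(x, 4), rng.standard_normal((8, 16)))
main = rng.standard_normal((16, 8, 8))          # stand-in for the deformable branch
fused = np.concatenate([skip, main], axis=0)    # channel-wise splicing
```

The point of the skip branch is that features lost to the intervening convolution and pooling survive along the cheap pooled path and are re-injected at the splice.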
Further, the step 2.3 specifically includes:
adding multi-scale depth supervision to the 4 deformable convolution residual modules from top to bottom, upsampling each output back to the original image size, and then applying one ordinary convolution operation and one sigmoid operation to obtain 4 predicted image labels;
splicing the obtained 4 predicted image labels and applying an ordinary convolution operation and a sigmoid operation once more to obtain 1 further predicted image label; computing the loss of this fused prediction and of the original 4 predictions against the original image label; and cascading the maps from different positions by adding multi-scale depth supervision in the decoder to obtain the predicted segmentation result.
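The multi-scale depth supervision above can be sketched as follows. This is a numpy sketch under stated assumptions: nearest-neighbour upsampling stands in for the learned upsampling, the per-map 1×1 convolution collapsing channels to one is modelled as a channel mean, and the learned fusion convolution over the spliced predictions is modelled as a simple mean; none of these simplifications are from the patent.

```python
import numpy as np

def upsample_nn(x, s):
    """Nearest-neighbour upsampling of a (C, H, W) map by integer factor s."""
    return np.repeat(np.repeat(x, s, axis=1), s, axis=2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
size = 32
# Four decoder stages at 1/8, 1/4, 1/2 and full resolution.
stages = [rng.standard_normal((8, size // f, size // f)) for f in (8, 4, 2, 1)]
# Upsample each back to image size, collapse channels, and apply sigmoid:
# these are the 4 side-output predicted labels.
side_outputs = [sigmoid(upsample_nn(x, f).mean(axis=0))
                for x, f in zip(stages, (8, 4, 2, 1))]
# Splice the four predictions and fuse them into one final label map.
fused = sigmoid(np.stack(side_outputs).mean(axis=0))
```

Each of the four side outputs and the fused map is compared against the ground-truth label during training, which is what forces every decoder stage to produce a usable prediction on its own.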
Optionally, the residual attention convolution module includes a residual connection and a CBAM convolution attention mechanism module.
Further, the loss function is

L = \sum_{m=1}^{M} w_m L^{(m)}

where L^{(m)}, the loss of the m-th side output, is the sum of the Focal loss and the Dice loss, and w_m is the weight of the m-th loss term;

the expression of the Focal loss function is

L_{Focal} = -\alpha (1 - p_t)^{\gamma} \log(p_t)

where \alpha is the parameter controlling class imbalance, \gamma is the parameter controlling the weighting of easy and hard samples, and p_t is the parameter measuring how close the predicted label is to the real label;

the expression of the Dice loss function is

L_{Dice} = 1 - s = 1 - \frac{2 |X \cap Y|}{|X| + |Y|}

where s is the set similarity measure function, |X \cap Y| denotes the number of intersection elements between sample X and sample Y, and |X| + |Y| denotes the total number of elements in samples X and Y.
In specific implementation, the adaptive medical image segmentation model based on the deformable U-Net is trained using the enhanced training sample image set and label set.
Further, regarding the parameters related to model training: because the lesion area of a medical image occupies a small proportion relative to the normal area, the loss function adopts a mixture of the Dice loss and the Focal loss.
The Focal loss improves on the binary cross-entropy loss function by adding two parameters, α and γ, which control class imbalance and the weighting of easy versus hard samples respectively. It reduces the contribution of simple examples and enables the model to focus on learning hard examples, such as lesion area boundaries, in highly imbalanced class scenarios. It is expressed as follows:

L_{Focal} = -\alpha (1 - p_t)^{\gamma} \log(p_t)
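Under the assumption of a binary foreground/background segmentation, the Focal loss can be sketched in numpy as follows. The function name and the default α and γ values are the common choices from the focal-loss literature, not values taken from the patent.

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss averaged over pixels.

    p: predicted foreground probabilities in (0, 1); y: binary ground truth.
    alpha weighs the rare (foreground) class; gamma down-weights easy examples.
    Defaults are conventional literature values, an assumption here.
    """
    p = np.clip(p, eps, 1.0 - eps)
    pt = np.where(y == 1, p, 1.0 - p)          # probability of the true class
    at = np.where(y == 1, alpha, 1.0 - alpha)  # class-balance weight
    return float(np.mean(-at * (1.0 - pt) ** gamma * np.log(pt)))
```

With gamma = 0 and alpha = 0.5 this reduces to half the binary cross-entropy; increasing gamma shrinks the contribution of well-classified pixels, which is what lets training concentrate on hard lesion-boundary pixels.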
the Dice Loss can relieve the negative influence caused by imbalance of foreground and background (area) in a sample, training is more concerned about mining a foreground area, namely a lower false counter example FN is ensured, and the expression is as follows:
Figure 476942DEST_PATH_IMAGE025
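The Dice loss can likewise be sketched in numpy. This is a minimal soft-Dice sketch on predicted probabilities; the smoothing term is a common stabilising assumption and is not specified in the patent.

```python
import numpy as np

def dice_loss(p, y, smooth=1.0):
    """Soft Dice loss 1 - 2|X∩Y| / (|X|+|Y|).

    p: predicted foreground probabilities; y: binary ground truth.
    The intersection is relaxed to an elementwise product so the loss
    is differentiable with respect to p.
    """
    inter = np.sum(p * y)
    return float(1.0 - (2.0 * inter + smooth) / (np.sum(p) + np.sum(y) + smooth))
```

Because both numerator and denominator scale with the foreground size, the loss is largely insensitive to how small the lesion is relative to the background, which is exactly the foreground-background imbalance argument made above.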
the mixed composition of the Dice and the Focal Dice can improve the attention degree of few categories and segmentation of complex samples, optimize the problems of unbalanced categories and complex shapes of medical image data sets in the text, and improve the segmentation accuracy of focus boundaries.
As shown in fig. 3 and fig. 4: fig. 3 shows the deformable convolution implementation process of the present scheme, in which the deformable convolution kernel can deform freely and sample within the current grid instead of being limited to regular grid-point sampling; fig. 4 shows the deformable RoI pooling implementation process, which can adaptively localize objects of different shapes. The deformable convolution module formed by combining figs. 3 and 4 captures various shapes and scales with deformable receptive fields that adaptively extract feature information; although this increases the complexity and computation of a convolution operation, it effectively addresses the complexity and diversity of medical image regions of interest.
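The core of the deformable convolution in fig. 3 is sampling each kernel tap at a learned fractional offset from its regular grid position via bilinear interpolation. This can be sketched as a single-channel, single-output-point numpy illustration; a real layer predicts the offsets with an extra convolution rather than taking them as arguments, so the function signatures here are illustrative assumptions.

```python
import numpy as np

def bilinear_sample(img, y, x):
    """Bilinearly interpolate img (H, W) at fractional coordinates (y, x)."""
    h, w = img.shape
    y = float(np.clip(y, 0, h - 1)); x = float(np.clip(x, 0, w - 1))
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    return ((1 - wy) * (1 - wx) * img[y0, x0] + (1 - wy) * wx * img[y0, x1]
            + wy * (1 - wx) * img[y1, x0] + wy * wx * img[y1, x1])

def deformable_conv_at(img, kernel, cy, cx, offsets):
    """One output value of a 3x3 deformable convolution centred at (cy, cx).

    Each of the 9 taps samples at its regular grid position plus a
    (dy, dx) offset; with all-zero offsets this is a plain 3x3 convolution.
    """
    grid = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    return sum(wgt * bilinear_sample(img, cy + gy + oy, cx + gx + ox)
               for (gy, gx), wgt, (oy, ox) in zip(grid, kernel.ravel(), offsets))
```

With zero offsets the result matches ordinary convolution, so the deformable layer strictly generalises the fixed-grid one; the learned offsets are what let the receptive field follow an irregular lesion boundary.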
In step 2, in the adaptive medical image segmentation model based on the deformable U-Net, the ordinary convolution module is replaced by the deformable convolution module. The deformable convolution operation first uses a traditional convolution kernel to extract an intermediate feature map from the input feature map, then applies a convolution layer with 2N output channels to the intermediate feature map to obtain the deformation offsets of the deformable convolution in the x and y directions; combining the traditional convolution kernel with these offsets yields the deformable convolution kernel. The deformable RoI pooling operation first obtains the feature map of the region of interest through ordinary RoI pooling, then generates the offset of each part of the region of interest from this feature map through a fully connected layer; combining ordinary RoI pooling with these offsets yields deformable RoI pooling.
As shown in fig. 5, fig. 5 is the channel and spatial attention implementation process of the present scheme, which lets the model pay more attention to important areas of the feature map and discard irrelevant features so as to improve segmentation performance. The adaptive medical image segmentation model based on deformable U-Net in step 2 consists of an encoder module and a decoder module. The encoder module consists of 4 deformable convolution modules, 4 downsampling operations, and 3 interval skip connections; the decoder module consists of 4 deformable convolution residual modules and 4 upsampling operations. In the convolution operations, the feature map size and the number of channels are unchanged; in each downsampling operation, the feature map size is halved and the number of channels is doubled; in each upsampling operation, the feature map size is doubled and the number of channels is halved.
In the encoder module, the feature map undergoes feature extraction by two deformable modules, and features are gradually lost in the convolution and pooling operations. Therefore, a 4X4 pooling operation is applied to the features before the two deformable modules, followed by a 1X1 convolution operation that reduces the number of channels and thus the number of model parameters; the resulting features are concatenated with the features extracted by the two deformable modules and the downsampling operation, thereby compensating for the lost features.
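The shape arithmetic of this interval skip connection can be sketched as below. All concrete sizes and channel counts here (a 32-channel 64X64 input, reduction to 8 channels, a 56-channel main path) are illustrative assumptions; the point is that the 4X4-pooled, 1X1-reduced shortcut lands at the same spatial size as the twice-downsampled main path and can be concatenated with it.

```python
import numpy as np

def avg_pool(x, k):
    """x: (C, H, W) -> (C, H//k, W//k), average pooling with stride k."""
    C, H, W = x.shape
    return x.reshape(C, H // k, k, W // k, k).mean(axis=(2, 4))

def conv1x1(x, w):
    """Pointwise (1x1) convolution: w (C_out, C_in) mixes channels only."""
    return np.tensordot(w, x, axes=([1], [0]))

x = np.random.rand(32, 64, 64)                 # features before the two deformable modules
shortcut = conv1x1(avg_pool(x, 4),             # 4x4 pooling ...
                   np.random.rand(8, 32))      # ... then 1x1 conv to fewer channels
main = np.random.rand(56, 16, 16)              # stand-in for the deformable + downsampled path
fused = np.concatenate([shortcut, main], axis=0)
```

The single 4X4 pooling matches the spatial reduction of the two 2X downsampling operations on the main path, which is what makes the concatenation well-defined.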
In the decoder module, the features arriving from the encoder have been heavily convolved. To avoid gradient vanishing and network degradation during feature propagation, a 4X4 upsampling operation is applied to the input features, followed by a 1X1 convolution operation that keeps the channel number and feature map size unchanged; the resulting features are added pixel by pixel to the features extracted by the two deformable modules and the two upsampling operations, which alleviates the gradient vanishing and network degradation caused by network depth.
However, semantic gaps still exist between corresponding levels of the encoder and decoder, which can affect segmentation accuracy. In the first splicing operation, the features before the first pooling of the encoder are concatenated with the features after the last upsampling of the decoder: the features from the encoder are low-level features with a low degree of convolution, while those from the decoder are high-level features with a high degree of convolution, so a large semantic gap exists between them. Meanwhile, as encoder pooling deepens and decoder upsampling shallows, the difference between the two gradually shrinks.
In step 2, several convolution layers are added in the splicing operation between the encoder and decoder of the adaptive medical image segmentation model based on the deformable U-Net to narrow the semantic gap between them, and residual connections are introduced to promote feature transfer. To suppress the noise and irrelevant information present in the connections between the encoder and decoder, attend more to important areas of the feature map and discard irrelevant features, thereby improving the segmentation performance of the model, a CBAM convolutional attention mechanism module is introduced on top of the residual connections to emphasize effective features and suppress irrelevant ones. The residual connections and the attention mechanism module combine to form a residual attention convolution module, as shown in fig. 6. Fig. 7 compares a common residual module, the multi-scale deformable convolutional encoding module and the deformable residual interval skip decoding module, where (a) is the common residual module, (b) is the multi-scale deformable convolutional encoding module, and (c) is the deformable residual interval skip decoding module.
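A minimal sketch of the CBAM-style channel-then-spatial gating described above, in NumPy. This is a simplification for illustration: the channel branch uses a shared two-layer MLP over average- and max-pooled descriptors as in CBAM, but the spatial branch here merely averages the channel-wise avg and max maps, whereas the real CBAM applies a 7X7 convolution to them; weight shapes are arbitrary assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """x: (C, H, W). Shared MLP (w1: reduce, w2: expand) applied to both
    the average-pooled and max-pooled channel descriptors."""
    avg = x.mean(axis=(1, 2))                  # (C,)
    mx = x.max(axis=(1, 2))                    # (C,)
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)
    return sigmoid(mlp(avg) + mlp(mx))         # (C,) gate in (0, 1)

def spatial_attention(x):
    """Channel-wise avg and max maps fused into one spatial gate.
    (A faithful CBAM would run a 7x7 conv over the stacked maps.)"""
    return sigmoid((x.mean(axis=0) + x.max(axis=0)) / 2.0)  # (H, W)

def cbam(x, w1, w2):
    """Apply channel attention, then spatial attention, as in CBAM."""
    x = x * channel_attention(x, w1, w2)[:, None, None]
    return x * spatial_attention(x)[None, :, :]
```

Because both gates lie in (0, 1), the module can only attenuate features: salient channels and locations are kept near their original magnitude while irrelevant ones are suppressed, matching the stated purpose of the residual attention convolution module.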
In the decoder module, the invention adds multi-scale depth supervision to the 4 modules from top to bottom. In each of the 4 modules, the feature map is upsampled back to the original image size, and then a common convolution operation and a sigmoid operation are applied to obtain 4 predicted image labels; the specific processing flow is shown in fig. 8.
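One deep-supervision head, as described above, can be sketched as: upsample the decoder feature to the image size, project to a single channel with a 1X1 convolution, and apply a sigmoid. Nearest-neighbour upsampling, the 32X32 image size, and the per-scale channel counts are illustrative assumptions.

```python
import numpy as np

def upsample_nearest(x, factor):
    """x: (C, H, W) -> (C, H*factor, W*factor) by pixel repetition."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def side_output(feat, w, target_size):
    """One deep-supervision head: upsample to image size, 1x1 conv
    (weights w, one output channel), then sigmoid -> predicted label map."""
    f = upsample_nearest(feat, target_size // feat.shape[1])
    logits = np.tensordot(w, f, axes=([0], [0]))  # (H, W)
    return 1.0 / (1.0 + np.exp(-logits))

# hypothetical decoder features at 4 scales for a 32x32 image
feats = [np.random.rand(c, 32 // 2**i, 32 // 2**i)
         for i, c in enumerate([4, 8, 16, 32])]
preds = [side_output(f, np.random.rand(f.shape[0]), 32) for f in feats]
```

Each of the 4 resulting maps is a full-resolution probability map, ready for the splicing and loss computation described next.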
The obtained 4 image labels are concatenated, and a common convolution operation and a sigmoid operation are applied again to obtain 1 fused predicted image label. The loss is then computed between the original image label and each of the fused predicted label and the original 4 predicted labels. By adding multi-scale depth supervision in the decoder and cascading the maps at different scales, the final segmentation result reaches its optimum, yielding the predicted segmentation result.
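The per-output loss used here (and spelled out in claim 6) is the sum of a Focal term and a Dice term, summed with weights over the fused prediction and the side outputs. A NumPy sketch, with standard default hyperparameters (alpha=0.25, gamma=2.0) assumed rather than taken from the patent:

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """p: predicted probabilities, y: binary labels. alpha balances the
    classes; gamma down-weights easy samples; pt is the closeness of the
    prediction to the true label."""
    p = np.clip(p, eps, 1 - eps)
    pt = np.where(y == 1, p, 1 - p)
    at = np.where(y == 1, alpha, 1 - alpha)
    return float(np.mean(-at * (1 - pt) ** gamma * np.log(pt)))

def dice_loss(p, y, eps=1e-7):
    """1 - 2|X∩Y| / (|X| + |Y|), the soft set-overlap loss."""
    inter = np.sum(p * y)
    return 1.0 - (2.0 * inter + eps) / (np.sum(p) + np.sum(y) + eps)

def mixed_loss(preds, y, weights=None):
    """Weighted sum of (Focal + Dice) over all predicted label maps."""
    weights = weights or [1.0] * len(preds)
    return sum(w * (focal_loss(p, y) + dice_loss(p, y))
               for w, p in zip(weights, preds))
```

The Dice term handles the foreground/background imbalance of medical images at the region level, while the Focal term re-weights individual hard pixels.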
Furthermore, since the Adam optimizer converges quickly, Adam is adopted as the optimizer, with the Dice coefficient and IoU as evaluation metrics; the batch size for the experiments on each data set is 4, the learning rate is set to 1e-4, and training terminates after 300 iterations in total.
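For reference, a single Adam update with the standard bias-corrected moment estimates looks as follows; lr=1e-4 matches the setting in the text, while beta1/beta2/eps are the usual defaults (assumed, not stated in the patent).

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-4, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update (t is 1-based). Returns the new parameter and the
    updated first/second moment estimates."""
    m = b1 * m + (1 - b1) * grad          # first moment (mean of gradients)
    v = b2 * v + (1 - b2) * grad**2       # second moment (uncentered variance)
    m_hat = m / (1 - b1**t)               # bias correction
    v_hat = v / (1 - b2**t)
    return param - lr * m_hat / (np.sqrt(v_hat) + eps), m, v
```

The adaptive per-parameter step size is what gives Adam its fast early convergence relative to plain SGD, which is the stated reason for choosing it here.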
Step 3, verifying the adaptive medical image segmentation model based on the deformable U-Net by using the verification image set and its corresponding label set, and adjusting and optimizing the parameters of the model by using random number seeds, to obtain the best-performing target adaptive medical image segmentation model based on the deformable U-Net;
in a specific implementation, after the iterative training is completed, the enhanced verification image set and label set are used to verify the adaptive medical image segmentation model based on the deformable U-Net, and the parameters of the model are adjusted and optimized using random number seeds to obtain the best-performing model as the target adaptive medical image segmentation model. When medical images are segmented later, the target model can be called directly for image segmentation without retraining.
And 4, inputting the medical image to be segmented into a target self-adaptive medical image segmentation model to obtain a focus region segmentation result.
In a specific implementation, a medical image to be segmented is input into the target adaptive medical image segmentation model to obtain a lesion region segmentation result. As shown in figs. 9 and 10, the segmentation results of the model of the invention are compared with those of a common deformable U-Net network model. Fig. 9 compares the common deformable U-Net network model and the model of the invention on a skin cancer data set, wherein (a) is the original medical image, (b) is the image label, (c) is the segmentation result of the common deformable U-Net network model, and (d) is the segmentation result of the model of the invention; fig. 10 compares the common deformable U-Net network model and the model of the invention on a brain tumor data set, wherein (a) is the original medical image, (b) is the image label, (c) is the segmentation result of the common deformable U-Net network model, and (d) is the segmentation result of the model of the invention. The comparison shows that the segmentation accuracy and adaptability of the adaptive medical image segmentation model of the invention are stronger.
According to the adaptive medical image segmentation method based on the deformable U-Net, multi-scale connections are introduced on the basis of deformable convolution to form a multi-scale deformable convolutional encoding module, and feature information is extracted at multiple scales using the deformable receptive field of the deformable convolution module. To address the gradient vanishing and network degradation caused by deformable convolution, a residual connection scheme is designed and combined with deformable convolution to form a deformable residual interval skip decoding module, improving the transfer and utilization of encoder features. To address the semantic gap and noise interference between the encoder and decoder, several convolution layers are added to the traditional splicing operation to narrow the semantic gap, and residual connections are introduced to promote feature transfer; on top of the residual connections, a CBAM convolutional attention mechanism module is introduced to emphasize effective features and suppress irrelevant ones. The two combine to form a residual attention convolution module, which reduces the semantic gap, the noise and irrelevant information in the skip connections, and the computational cost of the model. Multi-scale depth supervision and a mixed loss function consisting of the Focal loss and the Dice loss are introduced into the decoder to mitigate the class imbalance of medical images and generate more accurate position-aware and boundary-aware segmentation maps, thereby improving segmentation accuracy and adaptability.
The units involved in the embodiments of the present invention may be implemented in software or in hardware.
It is to be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any changes or substitutions easily contemplated by those skilled in the art within the scope of the present invention should be included in the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (6)

1. A deformable U-Net based adaptive medical image segmentation method, comprising:
step 1, acquiring a sample medical image data set and performing image enhancement operation to obtain a training image set, a verification image set and a label set corresponding to the training image set and the verification image set;
step 2, constructing an initial self-adaptive medical image segmentation model based on deformable U-Net and performing model training by utilizing a training image set and a corresponding label set, wherein the self-adaptive medical image segmentation model comprises an encoder, a decoder, a residual attention convolution module between the encoder and the decoder and a multi-scale depth supervision module;
the step 2 specifically includes:
step 2.1, the multi-scale deformable convolutional encoding module extracts features at multiple scales in the encoder, and the deformable residual interval skip decoding module promotes the upward transfer and utilization of features in the decoder;
step 2.2, the residual attention convolution module narrows the semantic gap between the encoder and the decoder, highlights effective features and suppresses irrelevant features through channel and spatial attention, and adaptively promotes feature transfer between the encoder and the decoder;
step 2.3, predicting and outputting feature maps at different positions of the decoder through the multi-scale depth supervision module, constraining the discriminability and robustness of the features learned at each stage, and improving network training efficiency;
step 2.4, calculating a Dice loss function and a Focal loss function according to the prediction segmentation result and the tag set;
step 2.5, adopting an Adam optimizer, setting batch processing size and learning rate of experiments on each data set by taking accuracy Dice and IoU as evaluation targets, and iterating the steps 2.1 to 2.4 for preset times;
the encoder comprises 4 deformable convolution modules, 4 downsampling operations and 3 interval skip connections;
the decoder includes 4 deformable convolution residual modules and 4 upsampling operations;
step 3, verifying the self-adaptive medical image segmentation model based on the deformable U-Net by using the verification image set and the label set corresponding to the verification image set, and adjusting and optimizing parameters of the model by using random number seeds to obtain the target self-adaptive medical image segmentation model based on the deformable U-Net with the best performance;
and 4, inputting the medical image to be segmented into a target self-adaptive medical image segmentation model to obtain a focus region segmentation result.
2. The method according to claim 1, wherein the step 1 specifically comprises:
enhancing foreground-background contrast of a sample medical image dataset using normalization and contrast-limited adaptive histogram equalization;
introducing gamma correction to improve image quality of the sample medical image dataset;
and carrying out data enhancement on the sample medical image data set by using random rotation, random scaling and random elastic deformation modes to obtain a training image set, a verification image set and a label set corresponding to the training image set and the verification image set.
3. The method according to claim 2, wherein the step 2.1 specifically comprises:
performing a 4X4 pooling operation on the features before the two deformable convolution modules in the encoder, then performing a 1X1 convolution operation to reduce the number of channels, and concatenating the obtained features with the features extracted by the two deformable modules and the downsampling operation;
performing a 4X4 upsampling operation on the features before the two deformable convolution modules in the decoder, then performing a 1X1 convolution operation that keeps the number of channels unchanged, and adding the obtained feature map pixel by pixel to the feature map that has passed through the two deformable modules and the two upsampling operations.
4. A method according to claim 3, wherein said step 2.3 comprises:
adding multi-scale depth supervision to 4 deformable convolution residual modules from top to bottom respectively, respectively carrying out up-sampling and restoring to the image size, and then carrying out common convolution operation and sigmoid operation once to obtain 4 predicted image labels;
and performing a splicing operation on the obtained 4 image labels, then performing a 1X1 common convolution operation and a sigmoid operation again to obtain 1 predicted image label, performing loss calculation between the original image labels and each of this predicted image label and the original 4 predicted image labels, and cascading the maps at different positions by adding multi-scale depth supervision in the decoder to obtain a predicted segmentation result.
5. The method of claim 4, wherein the residual attention convolution module comprises a residual module and a CBAM convolution attention mechanism module.
6. The method of claim 5, wherein the total loss function is

L_total = \sum_{m=1}^{M} w_m L^{(m)}

wherein L^{(m)}, the loss of the m-th side output, is the sum of the Focal loss and the Dice loss, and w_m is the weight of the m-th loss term;

the expression of the Focal loss function is

L_{Focal} = -\alpha (1 - p_t)^{\gamma} \log(p_t)

wherein \alpha is the parameter controlling class imbalance, \gamma is the parameter controlling the weighting of easy and hard samples, and p_t is the parameter measuring the closeness of the predicted label to the real label;

the expression of the Dice loss function is

L_{Dice} = 1 - \frac{2|X \cap Y|}{|X| + |Y|}

wherein \frac{2|X \cap Y|}{|X| + |Y|} is the set similarity measure function, |X \cap Y| represents the number of elements in the intersection of sample X and sample Y, and |X| + |Y| represents the number of elements in samples X and Y.
CN202211611945.8A 2022-12-15 2022-12-15 Adaptive medical image segmentation method based on deformable U-Net Active CN115661144B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211611945.8A CN115661144B (en) 2022-12-15 2022-12-15 Adaptive medical image segmentation method based on deformable U-Net


Publications (2)

Publication Number Publication Date
CN115661144A CN115661144A (en) 2023-01-31
CN115661144B true CN115661144B (en) 2023-06-13

Family

ID=85022707


Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116152211A (en) * 2023-02-28 2023-05-23 哈尔滨市科佳通用机电股份有限公司 Identification method for brake shoe abrasion overrun fault
CN116205967A (en) * 2023-04-27 2023-06-02 中国科学院长春光学精密机械与物理研究所 Medical image semantic segmentation method, device, equipment and medium
CN116309651B (en) * 2023-05-26 2023-08-11 电子科技大学 Endoscopic image segmentation method based on single-image deep learning
CN116912489A (en) * 2023-06-26 2023-10-20 天津师范大学 Medical image segmentation method and system based on Fourier priori knowledge
CN116883669A (en) * 2023-08-03 2023-10-13 深圳扬奇医芯智能科技有限公司 Multi-target segmentation technology based on dynamic attention federal framework
CN117036832B (en) * 2023-10-09 2024-01-05 之江实验室 Image classification method, device and medium based on random multi-scale blocking
CN117058160B (en) * 2023-10-11 2024-01-16 湖南大学 Three-dimensional medical image segmentation method and system based on self-adaptive feature fusion network

Family Cites Families (11)

Publication number Priority date Publication date Assignee Title
US8121379B2 (en) * 2006-08-09 2012-02-21 Siemens Medical Solutions Usa, Inc. Intensity-based image registration using Earth Mover's Distance
EP3355270B1 (en) * 2017-01-27 2020-08-26 AGFA Healthcare Multi-class image segmentation method
CN110570431A (en) * 2019-09-18 2019-12-13 东北大学 Medical image segmentation method based on improved convolutional neural network
CN110910413A (en) * 2019-11-28 2020-03-24 中国人民解放军战略支援部队航天工程大学 ISAR image segmentation method based on U-Net
CN111402268B (en) * 2020-03-16 2023-05-23 苏州科技大学 Liver in medical image and focus segmentation method thereof
CN111598799A (en) * 2020-04-30 2020-08-28 中国科学院深圳先进技术研究院 Image toning enhancement method and image toning enhancement neural network training method
CN113159040B (en) * 2021-03-11 2024-01-23 福建自贸试验区厦门片区Manteia数据科技有限公司 Method, device and system for generating medical image segmentation model
CN113076886A (en) * 2021-04-09 2021-07-06 深圳市悦保科技有限公司 Face individual identification device and method for cat
CN113222975B (en) * 2021-05-31 2023-04-07 湖北工业大学 High-precision retinal vessel segmentation method based on improved U-net
CN113538458A (en) * 2021-06-29 2021-10-22 杭州电子科技大学 U-Net image segmentation method based on FTL loss function and attention
CN115424017B (en) * 2022-08-23 2023-10-17 河海大学 Building inner and outer contour segmentation method, device and storage medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant