CN113205528A

CN113205528A - Medical image segmentation model training method, segmentation method and device

Info

Publication number: CN113205528A
Application number: CN202110362039.8A
Authority: CN
Inventors: 李书芳; 张鹏皓; 潘聚东
Original assignee: Shanghai Huihu Information Technology Co ltd
Current assignee: Shanghai Huihu Information Technology Co ltd
Priority date: 2021-04-02
Filing date: 2021-04-02
Publication date: 2021-08-03
Anticipated expiration: 2041-04-02
Also published as: CN113205528B

Abstract

The invention provides a training method for a medical image segmentation model, a segmentation method and a device. The training method applies meta-learning in the source domain: in an inner loop, model parameters are adjusted successively using a support set and a query set; in an outer loop, the original base model is adjusted using the query-set-based model parameters as the update direction; and model parameters sensitive to changes in the target domain task are obtained by continuous training on multiple batches of source domain training data. When the source domain segmentation model is transferred to the target domain, it adapts better to the new target domain task and generalizes better. Compared with advanced meta-learning methods, the target domain segmentation model obtained by this training method is significantly improved: it mitigates the domain shift problem, improves segmentation accuracy, and reduces overfitting to some extent.

Description

Medical image segmentation model training method, segmentation method and device

Technical Field

The invention relates to the technical field of image processing, in particular to a medical image segmentation model training method, a segmentation method and a device.

Background

Semantic segmentation is usually a fundamental task in the medical image analysis stage of computer-aided diagnosis and treatment. Due to the special nature of medical images, their segmentation task is somewhat challenging and complex. Manual segmentation has gradually been replaced by automatic segmentation due to high labor costs and time consumption.

Deep learning methods are now widely applied to medical image segmentation and have achieved considerable success in segmentation stability and accuracy. However, deep learning places very high demands on training data: a large amount of data is usually needed for training before good segmentation results can be obtained, yet labeling that much data manually is very costly, and in the medical field, where patient privacy is often involved, acquiring large amounts of clinical data easily leads to legal disputes. Consequently, the training data available for a medical image segmentation model is often extremely limited, so the trained model cannot meet the segmentation requirements or complete the image segmentation task.

Disclosure of Invention

The embodiment of the invention provides a medical image segmentation model training method, a segmentation method and a device, which are used for solving the problems of low segmentation precision and stability caused by small samples in the medical image segmentation processing process.

The technical scheme of the invention is as follows:

in one aspect, the present invention provides a medical image segmentation model training method, including:

acquiring a plurality of batches of source domain training data, wherein each batch of source domain training data comprises a support set and a query set, each support set and each query set comprises a plurality of medical images, and the regions of a specified human organ annotated in the medical images serve as labels; wherein each batch of source domain training data contains labels for only one human organ;

acquiring a preset reference model; in an inner loop, performing k steps of gradient descent on the reference model using the support set of a single batch of source domain training data to obtain a second model parameter set from a first model parameter set, and then performing one step of gradient descent on the query set to obtain a third model parameter set; in an outer loop, taking the first model parameter set of the original reference model as the starting point and the third model parameter set as the update direction, and updating according to a set step size, learning rate and number of epochs to obtain a fourth model parameter set; training the reference model successively with each batch of source domain training data according to the inner-loop and outer-loop steps to obtain a source domain segmentation model; wherein k is a natural number;

acquiring target domain training data, wherein the target domain training data comprises a plurality of medical images in which the regions of the target human organ are annotated as labels;

and training the source domain segmentation model on the target domain training data using a layer-freezing transfer method to obtain a target domain segmentation model.

In some embodiments, the reference model is a U-Net network.

In some embodiments, the learning rate of the U-Net network in the outer loop is set to 1e-3, the step size is set to 0.4, the number of epochs is 300, and k in the inner loop is set to 3.

In some embodiments, the U-Net network employs a cross entropy function as a loss function.

In some embodiments, the U-Net network comprises three parts: an encoder, a decoder, and skip connections. The encoder comprises four down-sampling modules; each down-sampling module comprises two 3 × 3 convolutional layers, each followed by a batch normalization layer and a rectified linear unit, and ends with a 2 × 2 max pooling layer with a stride of 2. The decoder comprises four up-sampling modules and an activation function layer; each up-sampling module comprises a 2 × 2 transposed convolutional layer and two 3 × 3 convolutional layers, each convolutional layer being followed by a batch normalization layer and a rectified linear unit. The skip connections connect the feature map before the max pooling layer of the down-sampling module at a given depth with the feature map output by the transposed convolutional layer of the up-sampling module at the same depth.

In some embodiments, training the source domain segmentation model on the target domain training data using the layer-freezing transfer method to obtain the target domain segmentation model comprises:

freezing the first two down-sampling modules in the encoder, and fine-tuning on the target domain training data at a learning rate of 1e-3;

unfreezing the second down-sampling module in the encoder, and fine-tuning again on the target domain training data with an initial learning rate of 1e-3 and a decay rate of 0.0077;

unfreezing the first two down-sampling modules in the encoder, and fine-tuning on the target domain training data at a learning rate of 1e-3.

In some embodiments, the medical images of the support set in each batch of source domain training data undergo data augmentation, including random-angle rotation from 0 to 180 degrees, image translation, shear transformation, and/or image stretching.

In another aspect, the present invention further provides a medical image segmentation method, including:

acquiring a medical image to be segmented, and cropping the image to a preset size;

and inputting the cropped medical image to be segmented into the target domain segmentation model obtained by the above medical image segmentation model training method, so as to output a segmentation result.

In another aspect, the present invention also provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the steps of the method.

In another aspect, the present invention also provides a computer-readable storage medium, on which a computer program is stored, characterized in that the program, when executed by a processor, implements the steps of the above-mentioned method.

The invention has the beneficial effects that:

In the medical image segmentation model training method, segmentation method and device of the present invention, the training method applies meta-learning in the source domain: in an inner loop, model parameters are adjusted successively using a support set and a query set; in an outer loop, the original base model is adjusted using the query-set-based model parameters as the update direction; and model parameters sensitive to changes in the target domain task are obtained by continuous training on multiple batches of source domain training data. When the source domain segmentation model is transferred to the target domain, it adapts better to the new target domain task and generalizes better.

Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.

It will be appreciated by those skilled in the art that the objects and advantages that can be achieved with the present invention are not limited to the specific details set forth above, and that these and other objects that can be achieved with the present invention will be more clearly understood from the detailed description that follows.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention. In the drawings:

FIG. 1 is a logic diagram of a medical image segmentation model training method according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a model training logic of a medical image segmentation model training method according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a U-Net structure adopted by the medical image segmentation model training method according to an embodiment of the present invention;

FIG. 4 shows the loss function convergence curves of multi-source-domain pre-training, Reptile, MAML and the target domain segmentation model of the present application in the cecum few-sample scenario;

FIG. 5 is a graph showing the loss and gradient propagation of Reptile, MAML and the target domain segmentation model of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the following embodiments and accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention.

It should be noted that, in order to avoid obscuring the present invention with unnecessary details, only the structures and/or processing steps closely related to the scheme according to the present invention are shown in the drawings, and other details not so relevant to the present invention are omitted.

It should be emphasized that the term "comprises/comprising" when used herein, is taken to specify the presence of stated features, elements, steps or components, but does not preclude the presence or addition of one or more other features, elements, steps or components.

It is also noted herein that the term "coupled," if not specifically stated, may refer herein to not only a direct connection, but also an indirect connection in which an intermediate is present.

Semantic segmentation is usually a fundamental task in the medical image analysis stage of computer-aided diagnosis and treatment. Owing to the special properties of medical images, the segmentation task is challenging and complex, and in particular places high demands on accuracy and stability. However, in the early stage of deep learning on medical images, relatively few medical image samples are available, whereas traditional deep learning methods need a very large amount of data to achieve stability and strong generalization. A feasible solution is to learn the model initialization from other similar tasks (source domains) and then fine-tune on a limited training set for the target task (target domain), i.e., traditional transfer learning. However, when the model is transferred from the source domain to the target domain, a domain shift problem arises: because the source domain and the target domain differ, transfer learning adapts poorly and yields suboptimal results. Domain adaptation and domain generalization address this problem. Domain adaptation algorithms focus on using unlabeled data or a small amount of labeled data in the target domain so that models initialized on different source domains can be fitted quickly. The goal of domain generalization algorithms is to train a model on multiple source domains so that it can be transferred directly to the target domain. The present application performs the segmentation task on a target domain and source domains with similar data distributions; to optimize segmentation accuracy, the model parameters learned from multiple batches of source domain data are fine-tuned on the target domain, and the domain shift problem is addressed with a domain adaptation method.

In the present method, model parameters are adjusted successively using the support set and the query set in the inner loop; in the outer loop, the original base model is adjusted using the query-set-based model parameters as the update direction; a source domain segmentation model is obtained by continuous training on multiple batches of source domain training data and is then transferred to the target domain for fine-tuning, finally yielding a target domain segmentation model adapted to the target domain segmentation task.

Specifically, the present application provides a medical image segmentation model training method, referring to fig. 1 and 2, including steps S101 to S104:

Step S101: acquiring a plurality of batches of source domain training data, wherein each batch of source domain training data comprises a support set and a query set, each support set and each query set comprises a plurality of medical images, and the regions of a specified human organ annotated in the medical images serve as labels; wherein each batch of source domain training data contains labels for only one human organ.

Step S102: acquiring a preset reference model; in an inner loop, performing k steps of gradient descent on the reference model using the support set of a single batch of source domain training data to obtain a second model parameter set from a first model parameter set, and then performing one step of gradient descent on the query set to obtain a third model parameter set; in an outer loop, taking the first model parameter set of the original reference model as the starting point and the third model parameter set as the update direction, and updating according to a set step size, learning rate and number of epochs to obtain a fourth model parameter set; training the reference model successively with each batch of source domain training data according to the inner-loop and outer-loop steps to obtain a source domain segmentation model; wherein k is a natural number.

Step S103: acquiring target domain training data, wherein the target domain training data comprises a plurality of medical images in which the regions of the target human organ are annotated as labels.

Step S104: training the source domain segmentation model on the target domain training data using a layer-freezing transfer method to obtain a target domain segmentation model.

In step S101, the source domain training data may be constructed from the public Medical Segmentation Decathlon data set. The medical images may uniformly be magnetic resonance or CT scan images. The source domain training data comprises a plurality of batches, and each batch is labeled for a single organ. For the image segmentation task, the label marks the region of the image in which the designated organ is located; the labels are annotated by professional physicians, and their accuracy meets clinical standards. Further, the training images may be cropped or resized to a uniform size that meets the input requirements of the reference model, for example to a resolution of 256 × 256.
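Illustratively, the size unification described above might be sketched as follows. The source states only the target resolution; the policy used here (center cropping larger images and zero-padding smaller ones), the function name `center_crop_or_pad` and the list-of-rows image representation are assumptions made for this sketch.

```python
def center_crop_or_pad(image, size=256, fill=0):
    """Return a size x size copy of a 2D image given as a list of rows.

    Dimensions larger than `size` are center-cropped; smaller ones are
    padded with `fill`. This policy is an assumption, not taken from the
    source, which only names the target resolution.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    top = (height - size) // 2    # negative when padding is needed
    left = (width - size) // 2
    out = []
    for r in range(size):
        src_r = r + top
        row = []
        for c in range(size):
            src_c = c + left
            inside = 0 <= src_r < height and 0 <= src_c < width
            row.append(image[src_r][src_c] if inside else fill)
        out.append(row)
    return out
```

The same function would have to be applied to an image and its label mask so that the two stay aligned.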

Illustratively, 6 batches of source domain training data are created using the cardiac images released by King's College London, the liver images released by IRCAD, the prostate images released by Nijmegen Medical Centre, and the pancreas, spleen and cecal images provided by Memorial Sloan Kettering Cancer Center; each batch of source domain training data is further divided into a support set and a query set. The numbers of medical images in the support set and the query set can be configured according to the actual application scenario and the amount of data, and under certain conditions can be configured according to a set ratio.
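As a minimal sketch of dividing one batch into a support set and a query set according to a set ratio, the following may be considered. The function name `split_support_query`, the default ratio of 0.5 and the random shuffling are assumptions; the source does not specify how the split is drawn.

```python
import random


def split_support_query(samples, support_fraction=0.5, seed=0):
    """Shuffle one batch (one organ) and split it into support and query sets.

    `support_fraction` is a hypothetical parameter standing in for the
    "set ratio" mentioned in the text.
    """
    if not 0.0 < support_fraction < 1.0:
        raise ValueError("support_fraction must lie strictly between 0 and 1")
    shuffled = list(samples)
    if len(shuffled) < 2:
        raise ValueError("need at least two samples to form both sets")
    random.Random(seed).shuffle(shuffled)
    # Keep at least one sample in each set.
    n_support = round(len(shuffled) * support_fraction)
    n_support = max(1, min(len(shuffled) - 1, n_support))
    return shuffled[:n_support], shuffled[n_support:]
```

Each sample would typically be an (image, label) pair, so that the label follows its image into the same set.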

In step S102, the reference model is a network model for image segmentation and may be FCN, DeepMask, U-Net, or the like. In the initial state, the parameters of the reference model may be randomly generated. In the present application, the preferred reference model is a U-Net network. The key feature of U-Net is that each down-sampling stage is linked by a skip connection to the corresponding up-sampling stage; fusing features of different scales helps recover pixel detail during up-sampling. Specifically, the shallow (high-resolution) layers have a small down-sampling factor, and their feature maps retain more fine detail; the deep (low-resolution) layers have a large down-sampling factor, so information is highly condensed and much spatial detail is lost, but this helps determine the target region (classification). Fusing the shallow-layer and deep-layer features yields good segmentation results.

In the process of training the source domain segmentation model in step S102 of the present application, meta-learning is introduced so that the source domain segmentation model generalizes better across various segmentation tasks. Specifically, in the source domain training process, as shown in FIG. 2, the n constructed batches of source domain training data are used to train the reference model successively, one batch at a time. Each batch of source domain training data undergoes two parts of training, an inner loop and an outer loop. After a preset reference model is obtained, in an inner loop, the reference model performs k steps of gradient descent on the support set of a single batch of source domain training data to obtain model parameters θ'_k from model parameters θ'₀, and then performs one step of gradient descent on the query set to obtain model parameters θ'; in an outer loop, taking the model parameters θ'₀ of the original reference model as the starting point and θ'₀ → θ' as the update direction, the model parameters θ are obtained by updating according to the set step size, learning rate and number of epochs; and the reference model is trained successively with each batch of source domain training data according to the inner-loop and outer-loop steps to obtain the source domain segmentation model. In the inner loop, the reference model is trained iteratively by several gradient descent steps on the support set of a batch and is then trained on the query set. In the outer loop, the model is updated from the original model parameters toward the parameters obtained by training on the query set in the inner loop. Iterative training of the reference model continues over the n batches of source domain training data to obtain the source domain segmentation model.

In FIG. 2, the original U-Net network model is first trained with the 1st batch of source domain training data: in the inner loop, k steps of gradient descent on the support set yield model parameters θ'_k from model parameters θ'₀, and one step of gradient descent on the query set then yields model parameters θ'. The starting point of the outer loop is the model θ'₀, θ'₀ → θ' is the model update direction, and β is the step size, giving the resulting model parameters θ of this outer loop. The 2nd batch of source domain training data is then used for training: in the inner loop, k steps of gradient descent on the support set yield model parameters θ″_k from the model parameters θ, and one step of gradient descent on the query set then yields model parameters θ″. The starting point of the outer loop is the model θ, θ → θ″ is the model update direction, and β is the step size, giving the resulting model parameters θ₁ of this outer loop. By analogy, training continues over the n batches of data to obtain the final source domain segmentation model.
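The inner and outer loops described above can be sketched on a toy problem as follows. This is an illustration under stated assumptions rather than the implementation of the present application: the parameters are a plain list of floats instead of U-Net weights, the loss is a simple quadratic, the outer update is read as moving from the starting parameters toward the query-updated parameters by the step size β, and the order in which epochs and batches are iterated is a guess. The names `quadratic_grad`, `sgd_step` and `meta_train` are hypothetical.

```python
def quadratic_grad(theta, points):
    """Gradient of the mean of 0.5 * ||theta - p||^2 over the given points."""
    n = len(points)
    return [sum(t - p[i] for p in points) / n for i, t in enumerate(theta)]


def sgd_step(theta, grad_fn, data, lr):
    """One plain gradient descent step with learning rate lr."""
    grad = grad_fn(theta, data)
    return [t - lr * g for t, g in zip(theta, grad)]


def meta_train(theta, batches, grad_fn, k=3, alpha=1e-3, beta=0.4, epochs=300):
    """Train on (support, query) batches with an inner and an outer loop."""
    for _ in range(epochs):
        for support, query in batches:
            fast = list(theta)                  # starting point of this batch
            for _ in range(k):                  # inner loop: k steps on the support set
                fast = sgd_step(fast, grad_fn, support, alpha)
            fast = sgd_step(fast, grad_fn, query, alpha)  # one step on the query set
            # Outer loop: move from the starting point toward the
            # query-updated parameters by step size beta (assumed form).
            theta = [t + beta * (f - t) for t, f in zip(theta, fast)]
    return theta


# Two toy "organs" whose optima lie at 1.0 and 3.0.
toy_batches = [([[1.0]], [[1.2]]), ([[3.0]], [[2.8]])]
print(meta_train([0.0], toy_batches, quadratic_grad))
```

The defaults k = 3, alpha = 1e-3, beta = 0.4 and 300 epochs mirror the values given for the U-Net embodiment.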

Through this adaptive learning approach, the source domain segmentation model finally obtained is not the optimal solution for any single batch of source domain training data; rather, it is optimized globally over all batches of source domain training data, adapts better to various target tasks, and can avoid domain shift, overfitting and poor segmentation results during transfer.

The method differs markedly from the two existing meta-learning methods MAML and Reptile. In step S102, training on the support set and training on the query set are consecutive. In MAML, by contrast, the model is trained on the support set and the original reference model is then updated using the loss computed on the query set; compared with that process, the present method effectively transfers the model parameter features obtained by training on both the support set and the query set to the original base model. Compared with Reptile, which does not distinguish between a support set and a query set, the present method trains more stably and adapts better to various tasks.

In some embodiments, in step S102, the medical images of the support set in each batch of source domain training data undergo data augmentation, including random-angle rotation from 0 to 180 degrees, image translation, shear transformation, and/or image stretching.

The training of the neural network generally needs a large amount of data to obtain a relatively ideal result, and under the condition of limited data quantity, the diversity of training samples can be increased through data enhancement, so that the robustness of the model is improved, and overfitting is avoided. Meanwhile, the characteristics of the training samples are changed randomly, so that the dependence of the model on certain attributes can be reduced, and the generalization capability of the model is improved. Medical images in each batch of source domain training data are input into the reference model after random transformation, and training is carried out. A larger number of training sets may also be formed by the transformation.
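As a rough sketch of such augmentation, the following applies the same random transform to an image and its label mask. Only two transforms are shown: the 180-degree end point of the angular transform, which needs no interpolation, and an integer translation; arbitrary angles, shear and stretching require interpolation and are omitted. The function names, the 0.5 probability and `max_shift` are assumptions made for this sketch.

```python
import random


def rotate_180(image):
    """Rotate a 2D image (list of rows) by 180 degrees."""
    return [row[::-1] for row in image[::-1]]


def translate(image, dy, dx, fill=0):
    """Shift a 2D image down by dy and right by dx, filling exposed pixels."""
    h, w = len(image), len(image[0])
    return [
        [image[r - dy][c - dx] if 0 <= r - dy < h and 0 <= c - dx < w else fill
         for c in range(w)]
        for r in range(h)
    ]


def augment_pair(image, mask, rng, max_shift=2):
    """Apply one random transform identically to an image and its mask."""
    if rng.random() < 0.5:
        image, mask = rotate_180(image), rotate_180(mask)
    dy = rng.randint(-max_shift, max_shift)
    dx = rng.randint(-max_shift, max_shift)
    return translate(image, dy, dx), translate(mask, dy, dx)
```

Transforming the mask together with the image keeps the labeled organ region aligned with the pixels it describes.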

In step S103, the target domain training data is constructed for the target organ to be identified and comprises a plurality of medical images in which the region of the target organ is annotated. In practical applications, a certain number of target domain medical images can be set aside as a test set.

Specifically, as shown in FIG. 3, in some embodiments, the U-Net network comprises three parts: an encoder, a decoder, and skip connections. The encoder comprises four down-sampling modules; each down-sampling module comprises two 3 × 3 convolutional layers, each followed by a batch normalization layer and a rectified linear unit, and ends with a 2 × 2 max pooling layer with a stride of 2. The decoder comprises four up-sampling modules and an activation function layer; each up-sampling module comprises a 2 × 2 transposed convolutional layer and two 3 × 3 convolutional layers, each convolutional layer being followed by a batch normalization layer and a rectified linear unit. The skip connections connect the feature map before the max pooling layer of the down-sampling module at a given depth with the feature map output by the transposed convolutional layer of the up-sampling module at the same depth. In some embodiments, the U-Net network uses a cross-entropy function as the loss function, the learning rate α of the U-Net network is set to 1e-3, the step size β is set to 0.4, the number of epochs is 300, and k in the inner loop is set to 3.
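To make the geometry of this structure concrete, the spatial size of the feature maps can be traced through the four down-sampling and four up-sampling modules. The sketch below assumes that the 3 × 3 convolutions preserve spatial size (zero padding) and that the transposed convolution has a stride of 2; neither is stated in the source. The function name `unet_feature_sizes` is hypothetical.

```python
def unet_feature_sizes(input_size=256, depth=4):
    """Trace feature-map side lengths through the encoder and decoder.

    Returns (encoder, bottleneck, decoder), where `encoder` lists the size
    kept for each skip connection (before pooling) and `decoder` lists the
    size after each transposed convolution.
    """
    encoder = []
    size = input_size
    for _ in range(depth):
        if size % 2:
            raise ValueError("size must be even before 2 x 2 pooling")
        encoder.append(size)   # feature map handed to the skip connection
        size //= 2             # 2 x 2 max pooling, stride 2
    bottleneck = size
    decoder = []
    for skip in reversed(encoder):
        size *= 2              # 2 x 2 transposed convolution, assumed stride 2
        if size != skip:
            raise ValueError("skip connection sizes do not match")
        decoder.append(size)
    return encoder, bottleneck, decoder


print(unet_feature_sizes(256))
```

Under these assumptions a 256 × 256 input is reduced to 16 × 16 at the deepest point, and every up-sampling module receives a skip feature map of exactly its own size.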

Specifically, in the implementation, the basic semantic segmentation model U-Net is built with the deep learning framework Keras; the experimental system environment is Ubuntu, and the experiments run on an NVIDIA GeForce 1080Ti graphics card with 11 GB of video memory.

In step S104, the source domain segmentation model is trained on the target domain training data using the layer-freezing transfer method. The difference between the source domain and target domain data distributions is magnified as the neural network learns multi-level hidden representations across layers. The source domain data acts as a positive activation, while the target domain data results in a negative activation. Large differences may corrupt the knowledge and experience learned from the source domain, i.e., the catastrophic forgetting problem. To solve this problem, it is important to keep the differences relatively small throughout the network, which step S104 of the present application achieves by gradually unfreezing the shallow layers. Layer-wise freezing is applied to the shallow layers because medical image segmentation relies more on low-level features than scene segmentation does. These low-level features resemble general features and are typically extracted in the shallow layers.

Specifically, in some embodiments, step S104, training the source domain segmentation model on the target domain training data using the layer-freezing transfer method to obtain the target domain segmentation model, comprises steps S1041 to S1043:

Step S1041: freezing the first two down-sampling modules in the encoder, and fine-tuning on the target domain training data at a learning rate of 1e-3.

Step S1042: unfreezing the second down-sampling module in the encoder, and fine-tuning again on the target domain training data with an initial learning rate of 1e-3 and a decay rate of 0.0077.

Step S1043: unfreezing the first two down-sampling modules in the encoder, and fine-tuning on the target domain training data at a learning rate of 1e-3.
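The three fine-tuning stages can be written down as a small schedule, sketched below. The data structure and the names `FineTuneStage`, `SCHEDULE`, `trainable_modules` and `decayed_lr` are hypothetical. The decay formula lr0 / (1 + decay × step) is an assumption, since the source gives only the decay rate of 0.0077 for the second stage, not the decay law.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneStage:
    name: str
    frozen_modules: frozenset   # 1-based indices of frozen down-sampling modules
    learning_rate: float
    decay: float = 0.0


SCHEDULE = (
    FineTuneStage("S1041", frozenset({1, 2}), 1e-3),
    FineTuneStage("S1042", frozenset({1}), 1e-3, decay=0.0077),
    FineTuneStage("S1043", frozenset(), 1e-3),
)


def trainable_modules(stage, n_modules=4):
    """Encoder down-sampling modules whose weights are updated in a stage."""
    return [m for m in range(1, n_modules + 1) if m not in stage.frozen_modules]


def decayed_lr(stage, step):
    """Time-based decay lr0 / (1 + decay * step); the formula is an assumption."""
    return stage.learning_rate / (1.0 + stage.decay * step)


for stage in SCHEDULE:
    print(stage.name, "trains encoder modules", trainable_modules(stage))
```

The schedule makes the deep-to-shallow order explicit: the second module is released before the first, so the shallowest features are disturbed last.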

In this embodiment, based on the U-Net model constructed in step S103, the shallow layers of the encoder are frozen and then unfrozen layer by layer for fine-tuning. In the U-Net network structure, the encoder captures semantic information during down-sampling, and the decoder enables precise localization during up-sampling. Because medical image segmentation relies more on low-level features, which are generally extracted in the shallow layers, and in order to reduce the negative activation effect, this embodiment freezes the first two down-sampling modules of the encoder and unfreezes them gradually from deep to shallow. Specifically, the target domain training data may be divided into a plurality of batches for fine-tuning over a set number of epochs; for example, 8 batches of data may be trained for 300 epochs.

On the other hand, the invention also provides a medical image segmentation method, which comprises the following steps of S201 to S202:

Step S201: acquiring a medical image to be segmented, and cropping the image to a preset size.

Step S202: inputting the cropped medical image to be segmented into the target domain segmentation model obtained by the medical image segmentation model training method of steps S101 to S104, so as to output the segmentation result.

In this embodiment, steps S103 to S104 obtain, by fine-tuning for the designated segmentation task, a target domain segmentation model for segmenting the target organ in the target domain. In step S201, after the medical image to be segmented is acquired, it is cropped to the input size required by the target domain segmentation model. In step S202, the target domain segmentation model obtained in step S104 performs the computation and outputs the segmentation result.
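Steps S201 and S202 might be sketched as follows. The trained network is replaced by a stand-in callable, `intensity_model`, which is purely hypothetical; the center crop and the 0.5 probability threshold are likewise assumptions, since the source states only that the image is cropped to a preset size and that a segmentation result is output.

```python
def center_crop(image, size):
    """Crop a 2D image (list of rows) to size x size around its center."""
    h, w = len(image), len(image[0])
    if h < size or w < size:
        raise ValueError("image is smaller than the model input size")
    top, left = (h - size) // 2, (w - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def segment(image, model, size=256, threshold=0.5):
    """Step S201 (crop) followed by step S202 (model inference)."""
    patch = center_crop(image, size)
    probabilities = model(patch)    # per-pixel foreground probability
    return [[1 if p >= threshold else 0 for p in row] for row in probabilities]


def intensity_model(patch):
    """Stand-in for the trained network: scales 8-bit intensity to [0, 1]."""
    return [[v / 255.0 for v in row] for row in patch]
```

In practice `model` would wrap the prediction call of the target domain segmentation model, with the same input size as was used during training.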

The invention is illustrated below with reference to specific examples:

The public Medical Segmentation Decathlon data set was used as the data set for the experiments herein. The data set includes magnetic resonance or CT scan images of ten different human organs. All images are annotated by professional physicians, and the accuracy meets clinical standards. The image resolution is adjusted to 256 × 256, and the multi-valued labels are simplified to a binary segmentation task. Images of six organs were selected for the validation experiments: the heart images released by King's College London, the liver images released by IRCAD, the prostate images released by Nijmegen Medical Centre, and the pancreas, spleen and cecum images provided by Memorial Sloan Kettering Cancer Center.
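The simplification of multi-valued labels to a binary task could look like the sketch below. How the original label values are merged is not stated in the source; treating every non-zero value as foreground by default, and the optional `foreground` argument, are assumptions made for this example.

```python
def binarize_labels(mask, foreground=None):
    """Map a multi-valued label mask (list of rows) to values in {0, 1}.

    With foreground=None every non-zero label counts as the organ;
    otherwise only the listed label values do.
    """
    if foreground is None:
        return [[1 if v != 0 else 0 for v in row] for row in mask]
    keep = set(foreground)
    return [[1 if v in keep else 0 for v in row] for row in mask]


mask = [[0, 1, 2], [2, 0, 1]]
print(binarize_labels(mask))
print(binarize_labels(mask, foreground=[1]))
```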

Cecum and liver were selected as the target domain data. To construct few-sample scenarios based on these two tasks, the six image sets above were divided into two groups. In the first group, the target domain training set comprises 214 images randomly sampled from the cecal data, and the target domain test set comprises the remaining 1070 cecal images. The source domain training set of the first group comprises three batches of data, consisting of prostate, pancreas and spleen images respectively, 2611 images in total. In the second group, the target domain training set comprises 191 images randomly sampled from the liver data, and the target domain test set consists of the remaining 18791 liver images. The source domain training set of the second group comprises three batches of data, consisting of prostate, heart and pancreas images respectively, 2877 images in total.
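The few-sample character of the two target domains follows directly from the counts given above, as the short calculation below shows. Only the image counts come from the text; the helper name `train_fraction` is hypothetical.

```python
def train_fraction(n_train, n_test):
    """Share of the target domain images that is used for training."""
    return n_train / (n_train + n_test)


first_group = train_fraction(214, 1070)     # first group: 214 training, 1070 test
second_group = train_fraction(191, 18791)   # second group: 191 training, 18791 test

print(f"first group: {214 + 1070} images, {first_group:.1%} used for training")
print(f"second group: {191 + 18791} images, {second_group:.1%} used for training")
```

In the first group exactly one sixth of the target images is available for fine-tuning, and in the second group only about one percent.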

In this embodiment, a method for training a medical image segmentation model is provided, which includes:

First, a basic semantic segmentation model U-Net is constructed with the deep learning framework Keras; the network structure is shown in FIG. 3. The U-Net network consists of an encoder, a decoder and skip connections. The encoder comprises four down-sampling modules, each containing two 3 × 3 convolutional layers (Conv2D + BN + ReLU), each followed by a batch normalization layer (BN) and a rectified linear unit (ReLU), and each module ends with a 2 × 2 max pooling layer with a stride of 2. The decoder structure is similar to the encoder, except that the max pooling layer is replaced by a 2 × 2 transposed convolutional layer. Each skip connection joins the feature map before the max pooling layer at a given depth with the feature map output by the transposed convolutional layer of the up-sampling module at the same depth. Since the up-sampling module at a given depth uses the feature map generated by the down-sampling module at the corresponding depth, freezing the shallow layers of the encoder also protects the corresponding decoder layers from being activated.

The model parameters of the U-Net are randomly initialized, and the model is trained on the source domain training set. Specifically, training comprises an inner loop and an outer loop, as described in step S102. The inner loop is based on the support set and the query set: starting from the model parameters θ'_0, k successive gradient descent steps on the support set yield the model parameters θ'_k, and one further gradient descent step on the query set yields the model parameters θ'. The outer loop starts from the model parameters θ'_0, takes θ'_0 → θ' as the model update direction with β as the step size, and yields the resulting model parameters θ of the outer loop. The U-Net model uses a cross-entropy function as the loss function, the source-domain meta-learning batch size is 6, and each batch of data is trained successively according to the inner-loop and outer-loop steps for 300 epochs. Further, the learning rate α of the U-Net model is set to 1e-3, the step size β is set to 0.4, the number of learning cycles in the outer loop is also 300, and k in the inner loop is set to 3. Finally, training on the source domain training set yields the source domain segmentation model.

Let the function $f_\theta$ denote the U-Net. When $f_\theta$ is trained on a batch $\tau$, the model parameters $\theta$ are updated to $\theta'_i$ through $i$ successive gradient descent steps, and the updated model is denoted $f_{\theta'_i}$.

The gradient update at the $i$-th iteration can be expressed as:

$$\theta'_i = \theta'_{i-1} - \alpha \nabla_{\theta'_{i-1}} \mathcal{L}_{\tau}\left(f_{\theta'_{i-1}}\right)$$

where $\alpha$ is a fixed hyper-parameter, $\mathcal{L}_{\tau}\left(f_{\theta'_{i-1}}\right)$ denotes the loss function of the model $f_{\theta'_{i-1}}$ on batch $\tau$, and $\theta'_i$ is obtained by continuing to optimize the model $f_{\theta'_{i-1}}$ on the same batch $\tau$.

Further, the meta-optimization is performed over the model parameters $\theta$: the parameters are further updated on another batch $\tau'$ to obtain $\theta'$, and the optimization process can be described as:

$$\theta \leftarrow \theta + \beta\left(\theta' - \theta\right)$$

where $\beta$ is the meta-learning step size, representing the update ratio toward the final parameters.
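The inner and outer loops can be sketched on a toy problem as follows. This is a minimal NumPy illustration under stated assumptions, not the training code of the embodiment: a two-parameter linear model with a mean-squared-error loss stands in for the U-Net and its cross-entropy loss, the names `mse_grad` and `meta_step` are hypothetical, and the learning rate is raised to 0.1 so that the toy problem converges quickly. Only k = 3, β = 0.4, the batch size of 6 and the 300 cycles come from the text.

```python
import numpy as np

def mse_grad(theta, x, y):
    """Gradient of the mean squared error of the toy linear model y_hat = x @ theta."""
    return 2.0 * x.T @ (x @ theta - y) / len(y)

def meta_step(theta, support, query, alpha=1e-3, beta=0.4, k=3):
    """One cycle: k support-set steps, one query-set step, then the outer update."""
    theta_i = theta.copy()                                   # theta'_0 = theta
    for _ in range(k):                                       # inner loop on the support set
        theta_i = theta_i - alpha * mse_grad(theta_i, *support)
    theta_q = theta_i - alpha * mse_grad(theta_i, *query)    # one step on the query set
    return theta + beta * (theta_q - theta)                  # move theta toward theta'

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])                               # toy ground truth (assumption)
theta = np.zeros(2)
for _ in range(300):                                         # 300 outer cycles
    xs, xq = rng.normal(size=(6, 2)), rng.normal(size=(6, 2))
    theta = meta_step(theta, (xs, xs @ true_w), (xq, xq @ true_w), alpha=0.1)
```

The outer update never applies a gradient to θ directly; it only interpolates between the starting parameters and the parameters reached after the query-set step, with β controlling how far to move.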

Further, the source domain segmentation model is transferred to the target domain and fine-tuned according to a layer-freezing migration method. In view of the structure of the U-Net network, the first two downsampling modules are frozen first, with the learning rate set to 1e-3. The deeper second downsampling module is then unfrozen, and the model is fine-tuned again on the target domain with an initial learning rate of 1e-3 and a decay rate of 0.0077. Finally, all layers are unfrozen and a last round of fine-tuning is performed on the target domain with a learning rate of 1e-4. The target-domain fine-tuning batch size is 8, and training for 300 epochs yields the target domain segmentation model. Lastly, the target domain segmentation model is tested and evaluated on the target domain test set.
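The three fine-tuning stages can be written down as a small schedule. The sketch below is illustrative only: the module names (`down1` to `down4`, `decoder`) are hypothetical, the decay is applied only in the second stage because that is the only stage for which the text gives one, and the time-based form of the decay, lr / (1 + decay × step), is an assumption since the text does not specify the schedule.

```python
# Hypothetical names for the four downsampling modules and the rest of the network.
MODULES = ["down1", "down2", "down3", "down4", "decoder"]

# Three fine-tuning stages as read from the text.
STAGES = [
    {"frozen": ["down1", "down2"], "lr": 1e-3, "decay": 0.0},     # freeze first two modules
    {"frozen": ["down1"],          "lr": 1e-3, "decay": 0.0077},  # unfreeze the deeper one
    {"frozen": [],                 "lr": 1e-4, "decay": 0.0},     # unfreeze everything
]

def trainable_flags(stage):
    """Map each module name to True if it is trained in this stage."""
    return {name: name not in stage["frozen"] for name in MODULES}

def decayed_lr(stage, step):
    """Time-based decay lr / (1 + decay * step); the form of the schedule is assumed."""
    return stage["lr"] / (1.0 + stage["decay"] * step)

flags_per_stage = [trainable_flags(stage) for stage in STAGES]
```

In a Keras model, flags of this kind would typically be applied by setting the `trainable` attribute of the corresponding layers and recompiling before each stage.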

Further, the present embodiment adopts the Dice score as an objective evaluation index. The Dice score measures the degree of overlap between two images and is widely used to evaluate medical image segmentation. For the binary medical image segmentation task, the ratio of the region of interest is set to 0.9 and that of the background region to 0.1. In this embodiment, three modes are tested and verified. Mode one trains a model directly on the target domain from randomly initialized model parameters. Mode two pre-trains on a source domain and fine-tunes the resulting model parameters on the target domain. Mode three applies a meta-learning method in the source domain training stage, covering three meta-learning structures: Reptile, MAML and the network structure of the present application (Ours). A comparative analysis was performed using the Dice score, and the results are shown in Table 1.
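For reference, the Dice score of two binary masks A and B is 2|A ∩ B| / (|A| + |B|). A minimal NumPy sketch follows; the function name `dice_score`, the smoothing term `eps` and the two small example masks are illustrative assumptions, not part of the embodiment.

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice overlap of two binary masks: 2 * |A and B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    # eps avoids division by zero when both masks are empty
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

pred = np.array([[1, 1, 0], [0, 1, 0]])      # predicted mask, 3 foreground pixels
target = np.array([[1, 0, 0], [0, 1, 1]])    # reference mask, 3 foreground pixels
score = dice_score(pred, target)             # 2 shared pixels, so about 4 / 6
```

A score of 1 means the two masks coincide exactly, and a score of 0 means they do not overlap at all.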

TABLE 1

In Table 1, in the few-sample scenario of the cecum, mode one was trained on the target domain using a standard supervised learning method, resulting in a Dice score of 0.537. In mode two, single-source pre-training was performed on three different source domains (prostate, pancreas and spleen), and the resulting model parameters were used as initialization parameters for target domain training, resulting in Dice scores of 0.591, 0.590 and 0.591, respectively. A mode-two experiment was also carried out on multi-source domain I, formed by combining the three tasks, giving a Dice score of 0.611. In mode three, Reptile, MAML and the algorithm proposed in the present application achieve Dice scores of 0.608, 0.615 and 0.628, respectively. Further, the experiments examine the gain brought by the layer-freezing migration strategy: for the various methods of modes two and three, this migration method brings a Dice score improvement of about 4%. In the few-sample scenario of the liver, mode one achieves a Dice score of 0.904. Single-source mode two achieves scores of 0.903, 0.902 and 0.905, respectively, and mode two on multi-source domain II achieves a score of 0.905. Reptile, MAML and the algorithm proposed in the present application achieve Dice scores of 0.904, 0.905 and 0.912, respectively. The best result, a Dice score of 0.926, was achieved when migration was performed using the layer-freezing method.

Further, FIG. 4 shows the convergence of the loss function in the cecum few-sample scenario. Curves A1, A2, A3 and A4 show mode two on multi-source domain I, the solution of the present application, Reptile, and MAML, respectively. The other three methods clearly converge to smaller loss values than the method of the present application, but the method of the present application obtains a better Dice score on the test data, which indicates that it better avoids the over-fitting problem of few-sample training.

The innovation of the present application is the proposed meta-learning algorithm, which differs from both MAML and Reptile. As shown in FIG. 5(a), the inner loop of MAML obtains intermediate model parameters θ^s on the support set through gradient descent, computes the loss and gradient on the query set based on these parameters, and applies that gradient directly to the original model parameters θ. The support set only supplies the model parameters at which the gradient is evaluated, so the final gradient descent direction of each MAML outer loop is determined entirely by the loss on the query set. This emphasizes the ability to learn the query set based on support-set experience, and the imbalance weakens the role of the support set to some extent.

As shown in FIG. 5(b), Reptile does not divide the data into a support set and a query set. Its inner loop obtains the intermediate model parameters θ' through successive gradient descent steps on the same batch, and the original model parameters θ are updated with θ → θ' as the outer-loop gradient direction and a certain step size. This learning manner emphasizes the learning ability of the model on the same batch and uses the step size to enhance generalization, but ignores the possibility that different batches may bring stronger generalization ability.

The algorithm of this embodiment is shown in FIG. 5(c). The outer loop adopts the same parameter update strategy as Reptile, while the inner loop borrows the idea of MAML: it divides the data into a support set and a query set, and the last gradient descent step is performed on the query set. This approach not only retains the "learning to learn" ability of split-set training, but also better balances the contributions of the support set and the query set.
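The difference between the three update rules can be made concrete on a scalar toy loss. The sketch below is an illustration under stated assumptions: each batch has the loss 0.5 · (θ − c)², where c is a hypothetical batch optimum; the MAML variant shown is the first-order approximation (second-order terms are ignored); and the same β is reused as the MAML outer step size purely for comparison. All function names are illustrative.

```python
def grad(theta, center):
    """Gradient of the toy batch loss 0.5 * (theta - center) ** 2."""
    return theta - center

def sgd(theta, center, alpha, steps):
    """Plain gradient descent on one batch for a number of steps."""
    for _ in range(steps):
        theta = theta - alpha * grad(theta, center)
    return theta

def maml_first_order(theta, support, query, alpha, beta):
    """Query gradient evaluated at the support-adapted point, applied directly to theta."""
    theta_s = sgd(theta, support, alpha, 1)
    return theta - beta * grad(theta_s, query)

def reptile(theta, batch, alpha, beta, k):
    """k steps on a single batch, then move theta toward the result."""
    theta_k = sgd(theta, batch, alpha, k)
    return theta + beta * (theta_k - theta)

def proposed(theta, support, query, alpha, beta, k):
    """k support steps plus one query step, then a Reptile-style outer update."""
    theta_q = sgd(sgd(theta, support, alpha, k), query, alpha, 1)
    return theta + beta * (theta_q - theta)

# Toy setting: support optimum at 1.0, query optimum at 3.0 (assumed values).
results = {
    "maml": maml_first_order(0.0, 1.0, 3.0, alpha=0.5, beta=0.4),
    "reptile": reptile(0.0, 1.0, alpha=0.5, beta=0.4, k=3),
    "proposed": proposed(0.0, 1.0, 3.0, alpha=0.5, beta=0.4, k=3),
}
```

In this toy setting the MAML-style update depends on the support batch only through the point at which the query gradient is evaluated, the Reptile update never sees the query batch, and the proposed update is pulled by both batches.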

Taken together with the above analysis of the results, the method of the present embodiment is significantly improved compared with advanced meta-learning methods, and can reduce the over-fitting problem to some extent while mitigating the domain shift problem and improving segmentation accuracy.

In summary, in the medical image segmentation model training method, the segmentation method and the device of the present invention, the training method works in a meta-learning manner in the source domain: it adjusts the model parameters in an inner loop using a support set and a query set, adjusts the original base model in an outer loop using the query-set-based model parameters as the update direction, and trains successively on a plurality of batches of source domain training data to obtain model parameters that are sensitive to task changes in the target domain. When the source domain segmentation model is migrated to the target domain, it can better adapt to the new task of the target domain, which improves the generalization effect.

Those of ordinary skill in the art will appreciate that the various illustrative components, systems, and methods described in connection with the embodiments disclosed herein may be implemented as hardware, software, or combinations of both. Whether this is done in hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention. When implemented in hardware, it may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, plug-in, function card, or the like. When implemented in software, the elements of the invention are the programs or code segments used to perform the required tasks. The program or code segments may be stored in a machine-readable medium or transmitted by a data signal carried in a carrier wave over a transmission medium or a communication link. A "machine-readable medium" may include any medium that can store or transfer information. Examples of a machine-readable medium include electronic circuits, semiconductor memory devices, ROM, flash memory, Erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, Radio Frequency (RF) links, and so forth. The code segments may be downloaded via computer networks such as the internet, intranet, etc.

It should also be noted that the exemplary embodiments mentioned in this patent describe some methods or systems based on a series of steps or devices. However, the present invention is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, may be performed in an order different from the order in the embodiments, or may be performed simultaneously.

Features that are described and/or illustrated with respect to one embodiment may be used in the same way or in a similar way in one or more other embodiments and/or in combination with or instead of the features of the other embodiments in the present invention.

The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes may be made to the embodiment of the present invention by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A medical image segmentation model training method is characterized by comprising the following steps:

2. The method of claim 1, wherein the reference model is a U-Net network.

3. The method according to claim 2, wherein the learning rate of the U-Net network in the outer loop is set to 1e-3, the step size is set to 0.4, the number of cycles is 300, and k in the inner loop is taken to be 3.

4. The method according to claim 3, wherein the U-Net network employs a cross entropy function as a loss function.

5. The method according to claim 2, wherein the U-Net network comprises three parts, namely an encoder, a decoder and a skip connection, wherein the encoder comprises four down-sampling modules, each down-sampling module comprises two 3 x 3 convolutional layers, each convolutional layer is followed by a batch normalization layer and a rectified linear unit, and a 2 x 2 max pooling layer with a stride of 2 is arranged at the end of the down-sampling module; the decoder comprises four up-sampling modules and an activation function layer, wherein each up-sampling module comprises a 2 x 2 transposed convolutional layer and two 3 x 3 convolutional layers, and each convolutional layer is followed by a batch normalization layer and a rectified linear unit; the skip connection is used for connecting the feature map before the max pooling layer of the down-sampling module with the feature map output by the transposed convolutional layer in the up-sampling module of the same depth.

6. The method for training a segmentation model of a medical image according to claim 5, wherein the method of layer freeze migration is used for training the source domain segmentation model by using target domain training data to obtain a target domain segmentation model, and comprises:

7. The method for training a segmentation model of medical images as claimed in claim 1, wherein the medical images of the support set in each batch of source domain training data are subjected to data enhancement processing including random angle flip of 0 to 180 degrees, image translation, transverse transformation and/or image stretching.

8. A method of medical image segmentation, comprising:

inputting the cut medical image to be segmented into the target domain segmentation model obtained by the medical image segmentation model training method according to any one of claims 1 to 7, so as to output the segmentation result.

9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method according to any of claims 1 to 8 are implemented when the processor executes the program.

10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.