CN117409201A - MR medical image colorectal cancer segmentation method and system based on semi-supervised learning - Google Patents

MR medical image colorectal cancer segmentation method and system based on semi-supervised learning


Publication number
CN117409201A
Authority
CN
China
Prior art keywords: semi, model, segmentation, supervised, image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311382615.0A
Other languages
Chinese (zh)
Inventor
曹月鹏
鲍军
高阳
杨柳
司呈帅
吴志平
邵鹏
李姝萌
Current Assignee
Jiangsu Cancer Hospital
Original Assignee
Jiangsu Cancer Hospital
Priority date
Filing date
Publication date
Application filed by Jiangsu Cancer Hospital filed Critical Jiangsu Cancer Hospital
Priority to CN202311382615.0A
Publication of CN117409201A


Classifications

    • G06V 10/267: Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G06N 3/045: Combinations of networks
    • G06N 3/0895: Weakly supervised learning, e.g. semi-supervised or self-supervised learning
    • G06N 3/09: Supervised learning
    • G06V 10/764: Image or video recognition or understanding using classification, e.g. of video objects
    • G06V 10/7753: Incorporation of unlabelled data, e.g. multiple instance learning [MIL]
    • G06V 10/82: Image or video recognition or understanding using neural networks
    • G06V 2201/03: Recognition of patterns in medical or anatomical images

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a method and a system for colorectal cancer segmentation in MR medical images based on semi-supervised learning, which mix information acquired from the 2D and 3D dimensions while suppressing unreliable knowledge and retaining useful information, so as to utilize unlabeled data more efficiently. The method comprises the following steps: constructing a 2D semi-supervised segmentation network model and a 3D semi-supervised segmentation network model, training them with MR image 2D slices and 3D volumes respectively, and applying multi-scale consistency regularization; performing 2D and 3D hybrid prediction according to model uncertainty estimates, applying uncertainty-weighted regularization through the hybrid prediction, and introducing a hybrid consistency loss; performing joint optimization training of the 2D and 3D semi-supervised segmentation network models; and testing new colorectal cancer MR image data to obtain segmentation results. The method effectively alleviates the problem of poor model performance in a single dimension and can be applied to the field of colorectal cancer MR image segmentation.

Description

MR medical image colorectal cancer segmentation method and system based on semi-supervised learning
Technical Field
The invention belongs to the field of medical images, and particularly relates to an MR medical image colorectal cancer segmentation method and system based on semi-supervised learning.
Background
Colorectal cancer is a common malignant tumor in the clinic; in recent years its incidence and mortality have continued to rise, and it ranks among the leading malignant tumors of the digestive system in global morbidity and mortality. Image segmentation can assist in accurately determining the clinical stage of colorectal cancer patients, so that the most appropriate treatment can be selected, improving treatment accuracy and avoiding over-treatment. Currently, magnetic resonance (MR) imaging shows good results in the preoperative analysis of colorectal cancer and has been proven to provide an important reference for clinical surgical treatment. However, staging diagnosis and tumor segmentation from the large number of 3D MR images generated in clinical routine is a laborious and time-consuming task.
Artificial intelligence methods are widely applied to medical image analysis tasks, and models such as deep convolutional neural networks and vision Transformers have made great progress in image recognition. Since medical image annotation is costly and requires experienced biomedical specialists to perform large amounts of tedious labeling, semi-supervised learning can perform efficient label learning when only part of the labels are available, and semi-supervised methods based on data-, model- or task-level consistency have achieved good results. However, existing approaches mainly utilize information acquired from a single dimension (i.e., 2D or 3D), resulting in poor performance on challenging data, with network performance degrading sharply as labeled data decreases. Therefore, there is a need for new semi-supervised learning methods that mix information acquired from both the 2D and 3D dimensions while suppressing unreliable knowledge and retaining useful information, so as to utilize unlabeled data more efficiently.
Disclosure of Invention
The invention provides an MR medical image colorectal cancer segmentation method based on semi-supervised learning, which fuses information from the 2D and 3D dimensions to realize consistency-based semi-supervised learning. The hybrid learning mechanism allows the model to draw on the strengths of both dimensions, combining features extracted from 2D, 3D or both to produce the output it deems most appropriate, using separate 2D and 3D teacher-student networks to capture mixed information from the two dimensions. The outputs of the two networks are uncertainty-weighted, and a hybrid regularization module encourages the two student models to generate segmentation results close to the uncertainty-weighted hybrid prediction, realizing semi-supervised colorectal cancer MR image segmentation under two-dimensional hybrid learning. To achieve the above purpose, the technical scheme of the invention is as follows:
MR medical image colorectal cancer segmentation method based on semi-supervised learning comprises the following steps:
step 1, acquiring a colorectal cancer MR image data set, preprocessing an acquired medical image to obtain marked MRI data and unmarked MRI data, and dividing a training set, a verification set and a test set; the MR is a magnetic resonance imaging scan;
step 2, constructing a 2D semi-supervised segmentation network model, training and regularizing by utilizing an MR image data 2D slice, and encouraging predictions of different scales to keep consistent for given input through multi-scale consistency regular constraint, so as to generate a pseudo tag of an unlabeled 2D image;
step 3, constructing a 3D semi-supervised segmentation network model, training and regularizing by utilizing MR image data 3D volumes, and encouraging predictions of different scales to keep consistent for a given input through multi-scale consistency regular constraint, so as to generate pseudo tags for unlabeled 3D images;
step 4, carrying out 2D and 3D mixed prediction according to the model uncertainty estimation, merging the outputs of the 2D semi-supervised segmentation network model and the 3D semi-supervised segmentation network model into the mixed prediction through respective uncertainty scores, estimating the mixed uncertainty, and providing uncertainty weighted regularization;
step 5, combining the 2D semi-supervised segmentation network model and the 3D semi-supervised segmentation network model for fine tuning and weight optimization, and storing model parameters;
and 6, testing the colorectal cancer MR image by using the medical image segmentation model which is well trained in an optimization mode, and obtaining a final prediction result.
Further, the preprocessing and original data set dividing process in step 1 includes:
selecting three-dimensional colorectal MR data; performing voxel-spacing adjustment, colorectal region extraction, resampling and data normalization on the images; performing random cropping and random flipping augmentation; and dividing the data into a training set, a validation set and a test set, wherein each set comprises colorectal-region MR images and the corresponding ground-truth segmentation results.
Further, the 2D semi-supervised segmentation network in step 2 adds an auxiliary layer after each block of the decoders of the student model and the teacher model to form multi-scale prediction components. The teacher model weights are defined as the EMA of the student model weights, and the multi-scale features are used to encourage predictions at different layers to remain consistent, with a consistency loss minimizing the difference between predictions at different scales. The multi-scale supervised loss evaluates the quality of the network output for labeled inputs; the multi-scale consistency loss measures the consistency of the teacher and student model predictions at different scales for the same input under different perturbations, defined as the expected distance between the student model predictions and the teacher model predictions:

L_sup^{2D} = Σ_{k=0}^{K−1} α_k [ L_ce(p̂_k^{s2d}, y_{2D}) + L_dice(p̂_k^{s2d}, y_{2D}) ]

L_con^{2D} = Σ_{k=0}^{K−1} α_k E[ ‖ p̂_k^{s2d} − p̂_k^{t2d} ‖² ]

where L_ce and L_dice are the cross-entropy loss and the Dice loss, p̂_k^{s2d} and p̂_k^{t2d} are the segmentation predictions at scale k of the multi-scale prediction components of the 2D student and teacher models, y_{2D} is the ground-truth annotation, and α_k is the weight of scale k.
Further, the 3D semi-supervised segmentation network in step 3 has the same architecture as the 2D network: auxiliary layers are added to the student and teacher models to form multi-scale prediction components, the teacher model weights are the EMA of the student model weights, and the multi-scale features encourage predictions at different layers to remain consistent. The multi-scale supervised loss and multi-scale consistency loss are defined as:

L_sup^{3D} = Σ_{k=0}^{K−1} α_k [ L_ce(p̂_k^{s3d}, y_{3D}) + L_dice(p̂_k^{s3d}, y_{3D}) ]

L_con^{3D} = Σ_{k=0}^{K−1} α_k E[ ‖ p̂_k^{s3d} − p̂_k^{t3d} ‖² ]

where p̂_k^{s3d} and p̂_k^{t3d} are the segmentation predictions at scale k of the multi-scale prediction components of the 3D student and teacher models, y_{3D} is the ground-truth annotation, and α_k is the weight of scale k.
Further, in step 4, uncertainty estimation and hybrid prediction are performed for the 2D and 3D outputs. Monte Carlo Dropout is used to estimate the uncertainty of the 2D and 3D teacher outputs separately, yielding sets of softmax probability vectors {p̂^{t2d}} and {p̂^{t3d}} for each voxel of the input, and the predictive entropy is used as the metric to approximate uncertainty. The final hybrid segmentation prediction is an uncertainty-weighted combination of the 2D and 3D segmentation outputs:

p̂_h = {w} · Con( C^{−1}(p̂^{t2d}), p̂^{t3d} )

where Con denotes the concatenation (splicing) operation, C^{−1} converts the 2D output to 3D shape, {w} is a set of entropy-based weight maps reflecting the per-voxel confidence of each segmentation map, and u_h denotes the hybrid segmentation uncertainty estimate.
Further, in step 5, model optimization training is performed by combining the 2D and 3D networks. The overall training goal is to minimize the 2D loss L_{2D} and the 3D loss L_{3D}, each comprising a supervised loss, a multi-scale consistency loss and a hybrid consistency loss with corresponding weight coefficients. The optimization method and hyper-parameters used during training are determined, including the optimizer, initial learning rate and number of training iterations; the model is trained, with the model and visualization results saved at fixed iteration intervals, and the hyper-parameter settings are adjusted according to the results to iteratively optimize the model.
Further, in step 6, the MR images of the test set are tested with the optimized and trained model to obtain the final segmentation results, and evaluation metrics such as the Dice score and the average surface distance (ASD) are output.
The MR medical image colorectal cancer segmentation system based on semi-supervised learning is used to realize the above segmentation method and comprises the following modules:
the method comprises the steps of a first module, acquiring a colorectal cancer MR image data set, preprocessing an acquired medical image to obtain marked MRI data and unmarked MRI data, and dividing a training set, a verification set and a test set; the MR is a magnetic resonance imaging scan;
the second module is used for constructing a 2D semi-supervised segmentation network model, training and regularization are carried out by utilizing the MR image data 2D slices, prediction of different scales is encouraged to be consistent for given input through multi-scale consistency regular constraint, and a pseudo tag of an unlabeled 2D image is generated;
the third module is used for constructing a 3D semi-supervised segmentation network model, training and regularizing by utilizing MR image data 3D volumes, encouraging predictions of different scales to keep consistent for a given input through multi-scale consistency regular constraint, and generating pseudo tags for unlabeled 3D images;
a fourth module, performing 2D and 3D hybrid prediction according to the model uncertainty estimation, merging the outputs of the 2D semi-supervised segmentation network model and the 3D semi-supervised segmentation network model into the hybrid prediction through respective uncertainty scores, estimating the hybrid uncertainty, and providing uncertainty weighted regularization;
the fifth module is used for carrying out fine adjustment and weight optimization on the combination of the 2D semi-supervised segmentation network model and the 3D semi-supervised segmentation network model, and saving model parameters;
and a sixth module for testing the colorectal cancer MR image by using the optimized and trained medical image segmentation model to obtain a final prediction result.
The beneficial effects of the invention are as follows: the invention provides a method and system for colorectal cancer segmentation in MR medical images based on semi-supervised learning, which mix information acquired from the 2D and 3D dimensions while suppressing unreliable knowledge and retaining useful information, so as to utilize unlabeled data more efficiently. The invention effectively alleviates the problem of poor model performance in a single dimension, can be applied to the field of colorectal cancer MR image segmentation, and can also be extended to other medical image segmentation fields.
Drawings
FIG. 1 is a schematic flow chart of the present invention;
FIG. 2 is a schematic diagram of a 2D semi-supervised segmentation network model according to the present invention;
FIG. 3 is a schematic diagram of a 3D semi-supervised segmentation network model according to the present invention;
FIG. 4 is a schematic diagram of the joint training and hybrid prediction of the present invention.
Detailed Description
The following describes the embodiments of the present invention further with reference to the drawings.
Example 1
The embodiment discloses an MR medical image colorectal cancer segmentation method based on semi-supervised learning, which specifically comprises the following steps as shown in fig. 1:
step (1): colorectal cancer MR image data preprocessing and construction of data sets
Three-dimensional colorectal cancer MR image data are selected to build the original data set, and the voxel spacing of the images is unified to facilitate restoration of the real images and more efficient learning of the important image information. The colorectal target region is cropped and resampled to 96 × 96, and data normalization then scales the pixel values of the image to the range [0, 1].
The training set, validation set and test set are partitioned in a 3:1:1 ratio, and each subset includes colorectal-region MR images and the corresponding segmentation results. To improve the generalization capability of the model and avoid over-fitting, the training images are zero-padded and then augmented with random cropping and random flipping.
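The normalization and flipping operations above can be sketched as follows (a minimal NumPy sketch; the function names and augmentation details are illustrative assumptions, not the patented implementation):

```python
import numpy as np

def normalize_volume(vol):
    """Min-max normalization: scale voxel intensities to [0, 1]."""
    vol = vol.astype(np.float32)
    lo, hi = vol.min(), vol.max()
    if hi - lo < 1e-8:                      # constant volume: map to zeros
        return np.zeros_like(vol)
    return (vol - lo) / (hi - lo)

def random_flip(vol, rng):
    """Randomly flip a volume along each spatial axis with probability 0.5."""
    for axis in range(vol.ndim):
        if rng.random() < 0.5:
            vol = np.flip(vol, axis=axis)
    return vol.copy()
```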
The data set is formalized as follows: the training set contains labeled images D_L = {(x_i, y_i)}_{i=1}^{L} and unlabeled images D_U = {x_i}_{i=L+1}^{N}, where x_i ∈ R^{W×H×D} is an input 3D image and y_i ∈ {0,1}^{W×H×D} is its ground-truth annotation. L and N denote the number of annotated images and the total number of training images, respectively. To reduce annotation cost and make the method more practical and challenging, limited source-domain annotation is used, i.e., L ≪ N.
Step (2): building 2D semi-supervised segmentation network and training
As shown in fig. 2, a 2D semi-supervised segmentation network is constructed. A batch of 2D input images is x_{2D} ∈ R^{b×c×w×h}, where b, c, w and h denote the batch size, number of channels, and the width and height of the image, respectively. For labeled images, the ground truth is y_{2D} ∈ {0,1}^{b×c×w×h}. The segmentation network uses 2D U-Net as the backbone, with a multi-scale consistency regularization constraint encouraging predictions at different scales to remain consistent for a given input. To hierarchically extract hidden feature representations, an auxiliary layer is added after each block of the decoder to form a multi-scale prediction component. Specifically, each auxiliary layer consists of an upsampling layer, a 1 × 1 convolution and a softmax layer to obtain the multi-scale predictions. The 2D student model generates multi-scale predictions, expressed as:

{p̂_k^{s2d}}_{k=0}^{K−1} = f_s^{2d}(x_{2D} + ξ_{s2d}; θ_s^{2d})

where p̂_k^{s2d} denotes the prediction probability activated by the softmax function at scale k, which is fused with the corresponding 3D prediction in step (4) to generate the final segmentation result; f_s^{2d} and θ_s^{2d} denote the student model and its weight parameters, and ξ_{s2d} is random Gaussian noise added to x_{2D} to enhance network robustness.
The multi-scale supervised loss evaluates the quality of the network output for labeled inputs:

L_sup^{2D} = Σ_{k=0}^{K−1} α_k [ L_ce(p̂_k^{s2d}, y_{2D}) + L_dice(p̂_k^{s2d}, y_{2D}) ]

where L_ce and L_dice are the cross-entropy loss and the Dice loss, p̂_k^{s2d} denotes the segmentation prediction of the multi-scale prediction component at scale k, and α_k is the weight of scale k.
The teacher model is identical to the student model except that an additional dropout layer is inserted after the last layer of its encoder. During training, averaged model weights are often more accurate than the final weights used directly; this can be exploited to build better targets, using the teacher model to regularize the student model. The teacher model is the average of successive student models, i.e., the EMA of the student model weights. More accurate target labels produce a faster feedback loop between the student and teacher models, improving test accuracy. At training step t, the teacher model weights θ_t^{t2d} are defined as the EMA of successive θ^{s2d} weights, where α is a smoothing-coefficient hyper-parameter that controls the update rate:

θ_t^{t2d} = α θ_{t−1}^{t2d} + (1 − α) θ_t^{s2d}
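The EMA update above can be illustrated with a minimal sketch (parameter names are illustrative assumptions; real implementations update framework tensors in place):

```python
import numpy as np

def ema_update(teacher_weights, student_weights, alpha=0.99):
    """One EMA step per parameter:
    theta_t' = alpha * theta_{t-1}' + (1 - alpha) * theta_t."""
    return {name: alpha * teacher_weights[name]
                  + (1.0 - alpha) * student_weights[name]
            for name in teacher_weights}
```

A larger smoothing coefficient alpha makes the teacher change more slowly, averaging over more student checkpoints.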
the multi-scale consistency loss is used to measure the consistency of predictions of the teacher model and the student model for different scales of the same input under different perturbations, defined as the predictions of the student model (weight θ s2d And noise xi s2d ) Prediction with teacher model (weight θ t2d And noise xi t2d ) Expected distance between:
the goal of the 2D semi-supervised segmentation framework is to minimize the following objective functions:
a multi-scale hierarchical classifier mechanism is employed in the partitioning network, with K set to 4. Scale weight { alpha } k K=0, …,3} is used for supervised and unsupervised consistency losses, designated {0.5,0.4,0.05,0.05}.
Step (3): constructing a 3D semi-supervised segmentation network and training
As shown in fig. 3, a 3D semi-supervised segmentation network is built. The 3D network uses 3D U-Net as the backbone, structurally identical to the 2D network but with all layers operating in 3D space. A batch of 3D input images is x_{3D} ∈ R^{b×c×w×h×d}, where b, c, w, h and d denote the batch size, number of channels, and the width, height and depth of the image, respectively. For labeled images, the ground truth is y_{3D} ∈ {0,1}^{b×c×w×h×d}. The 3D student model generates multi-scale predictions, expressed as:

{p̂_k^{s3d}}_{k=0}^{K−1} = f_s^{3d}(x_{3D} + ξ_{s3d}; θ_s^{3d})

where p̂_k^{s3d} denotes the prediction probability activated by the softmax function at scale k, which is fused with the corresponding 2D prediction in step (4) to generate the final segmentation result; f_s^{3d} and θ_s^{3d} denote the student model and its weight parameters, and ξ_{s3d} is random Gaussian noise added to x_{3D} to enhance network robustness.
The multi-scale supervised loss evaluates the quality of the network output for labeled inputs:

L_sup^{3D} = Σ_{k=0}^{K−1} α_k [ L_ce(p̂_k^{s3d}, y_{3D}) + L_dice(p̂_k^{s3d}, y_{3D}) ]
training the teacher model weight θ at step t t3d Defined as continuous θ s3d Weight EMA:
the multi-scale consistency penalty is defined as the expected distance between the prediction of the student model and the prediction of the teacher model:
the goal of the 3D semi-supervised segmentation framework is to minimize the following objective functions:
in a dividing netA multi-scale hierarchical classifier mechanism is adopted in the complex, and K is set to be 4. Scale weight { alpha } k K=0, …,3} is used for supervised and unsupervised consistency losses, designated {0.5,0.4,0.05,0.05}.
Step (4): uncertainty estimation and hybrid prediction
To combine the 2D and 3D streams, a function C is first defined that converts a batch of 3D volumes x_{3D} ∈ R^{b×c×w×h×d} into 2D images by stacking the batch-size dimension b and the depth dimension d; the inverse transform of C is denoted C^{−1}. In each training step, C(x_{3D}) replaces x_{2D} in step (2), and the 2D outputs and feature maps are acquired from the trained 2D student model. C^{−1} then converts the obtained 2D prediction output back into volumetric shape, which is fed to the 3D model to extract context information and provide additional guidance.
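The dimension transform C and its inverse can be sketched directly with array reshapes (shape convention (b, c, w, h, d) as in the text; the function names are illustrative):

```python
import numpy as np

def to_2d(x3d):
    """C: stack batch b and depth d of a (b, c, w, h, d) volume into a
    (b*d, c, w, h) batch of 2D slices."""
    b, c, w, h, d = x3d.shape
    return np.transpose(x3d, (0, 4, 1, 2, 3)).reshape(b * d, c, w, h)

def to_3d(x2d, b):
    """C^{-1}: invert to_2d, given the original batch size b."""
    bd, c, w, h = x2d.shape
    d = bd // b
    return np.transpose(x2d.reshape(b, d, c, w, h), (0, 2, 3, 4, 1))
```

Because `to_3d` undoes exactly the transpose and reshape of `to_2d`, the round trip reproduces the original volume.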
Hybrid prediction is performed based on the uncertainty estimates, the outputs of the 2D and 3D models are combined into the hybrid prediction by respective uncertainty scores, and the hybrid uncertainty is estimated, providing uncertainty weighted regularization.
To ensure a reliable regularization process, the teacher models measure the uncertainty of their predictions so that unreliable predictions can be filtered out. Monte Carlo Dropout is used to estimate the uncertainty of the 2D and 3D teacher outputs separately: T stochastic forward passes are performed through each teacher model with random dropout and input Gaussian noise. For each voxel of the input, this yields sets of softmax probability vectors {p̂^{t2d}} and {p̂^{t3d}}, and the predictive entropy is used as the metric to approximate uncertainty. The final hybrid segmentation prediction is an uncertainty-weighted combination of the 2D and 3D segmentation outputs:

p̂_h = {w} · Con( C^{−1}(p̂^{t2d}), p̂^{t3d} )

where Con denotes the concatenation (splicing) operation, C^{−1} converts the 2D output to 3D shape, and {w} is a set of entropy-based weight maps reflecting the per-voxel confidence of each segmentation map.
The hybrid segmentation uncertainty u_h is estimated as the entropy of p̂_h:

u_h = −Σ_c p̂_h^c log p̂_h^c
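The entropy-based uncertainty and the uncertainty-weighted fusion can be illustrated as follows (a NumPy sketch; the exact weighting scheme, here exponential weights on negative entropies, is an assumption since the translation does not fully specify it):

```python
import numpy as np

def predictive_entropy(prob_samples):
    """prob_samples: (T, C, ...) softmax outputs from T stochastic forward
    passes with dropout. Returns the entropy of the mean prediction."""
    mean_p = prob_samples.mean(axis=0)                   # (C, ...)
    return -np.sum(mean_p * np.log(mean_p + 1e-12), axis=0)

def hybrid_prediction(p2d, p3d, u2d, u3d):
    """Per-voxel uncertainty-weighted fusion of 2D and 3D teacher
    predictions: lower entropy gives a larger weight."""
    w2d, w3d = np.exp(-u2d), np.exp(-u3d)
    s = w2d + w3d
    p_h = (w2d / s) * p2d + (w3d / s) * p3d
    u_h = -np.sum(p_h * np.log(p_h + 1e-12), axis=0)     # hybrid uncertainty
    return p_h, u_h
```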
step (5): optimization training by combining 2D and 3D networks
Model optimization training is performed by combining the 2D and 3D networks, and the model parameters are saved. An SGD optimizer is used during training, with the initial learning rate set to 2 × 10⁻⁴. The network is trained for 2000 iterations.
The 2D and 3D student models are jointly trained on the reference labels and regularized through the hybrid prediction and its uncertainty, realizing cross-dimension consistency. With uncertainty-weighted regularization by the hybrid prediction, the hybrid consistency loss is expressed as the distance between each student prediction and the hybrid prediction over reliable voxels, i.e., voxels whose hybrid uncertainty u_h falls below a threshold H:

L_h = ( Σ_v 1(u_{h,v} < H) ‖ C(p̂_h)_v − p̂_v^{s2d} ‖² ) / Σ_v 1(u_{h,v} < H) + ( Σ_v 1(u_{h,v} < H) ‖ p̂_{h,v} − p̂_v^{s3d} ‖² ) / Σ_v 1(u_{h,v} < H)

The overall training loss is defined as:

L = L_{2D} + λ_3 L_{3D}

where λ_2 is a hyper-parameter balancing the contribution of the hybrid consistency loss and λ_3 is a hyper-parameter balancing the contribution of the 3D loss L_{3D}.
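The uncertainty-filtered hybrid consistency term can be sketched as follows (a NumPy sketch; the thresholding scheme follows common uncertainty-aware mean-teacher practice and is an assumption about the exact form used here):

```python
import numpy as np

def hybrid_consistency_loss(student_pred, hybrid_pred, u_h, threshold):
    """Mean squared distance between a student prediction (C, ...) and the
    hybrid prediction, averaged only over voxels whose hybrid uncertainty
    is below the threshold (unreliable voxels are filtered out)."""
    mask = (u_h < threshold).astype(np.float64)
    se = np.sum((student_pred - hybrid_pred) ** 2, axis=0)  # per-voxel error
    return float(np.sum(mask * se) / (np.sum(mask) + 1e-12))
```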
Step (6): MR image segmentation test for colorectal cancer
The MR images of the test set are tested with the optimized and trained model to obtain the final segmentation results, and evaluation metrics such as the Dice score and the average surface distance (ASD) are output. Dice measures the overlap ratio between the ground-truth annotation and the predicted segmentation; ASD is the average distance between all points on the predicted segmentation boundary and all points on the ground-truth boundary. A higher Dice and a lower ASD indicate higher segmentation accuracy.
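The two evaluation metrics can be sketched for small binary masks as follows (a brute-force NumPy sketch; production code would use optimized distance transforms):

```python
import numpy as np

def dice_score(pred, gt):
    """Dice = 2|P ∩ G| / (|P| + |G|) for binary masks."""
    inter = np.sum(pred * gt)
    denom = np.sum(pred) + np.sum(gt)
    return 2.0 * inter / denom if denom > 0 else 1.0

def _surface_points(mask):
    """Coordinates of foreground voxels with a background (or out-of-bounds)
    neighbor along some axis."""
    pts = []
    for p in np.argwhere(mask):
        for ax in range(mask.ndim):
            for off in (-1, 1):
                q = p.copy(); q[ax] += off
                if (q < 0).any() or (q >= np.array(mask.shape)).any() \
                        or not mask[tuple(q)]:
                    pts.append(p); break
            else:
                continue
            break
    return np.array(pts, dtype=float)

def asd(pred, gt):
    """Average symmetric surface distance between two binary masks."""
    sp, sg = _surface_points(pred), _surface_points(gt)
    d1 = np.mean([np.min(np.linalg.norm(sg - p, axis=1)) for p in sp])
    d2 = np.mean([np.min(np.linalg.norm(sp - g, axis=1)) for g in sg])
    return (d1 + d2) / 2.0
```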
Example 2
The embodiment discloses an MR medical image colorectal cancer segmentation system based on semi-supervised learning, which is used to realize the segmentation method described in embodiment 1 and comprises the following modules:
the method comprises the steps of a first module, acquiring a colorectal cancer MR image data set, preprocessing an acquired medical image to obtain marked MRI data and unmarked MRI data, and dividing a training set, a verification set and a test set; the MR is a magnetic resonance imaging scan;
the second module is used for constructing a 2D semi-supervised segmentation network model, training and regularization are carried out by utilizing the MR image data 2D slices, prediction of different scales is encouraged to be consistent for given input through multi-scale consistency regular constraint, and a pseudo tag of an unlabeled 2D image is generated;
the third module is used for constructing a 3D semi-supervised segmentation network model, training and regularizing by utilizing MR image data 3D volumes, encouraging predictions of different scales to keep consistent for a given input through multi-scale consistency regular constraint, and generating pseudo tags for unlabeled 3D images;
a fourth module, performing 2D and 3D hybrid prediction according to the model uncertainty estimation, merging the outputs of the 2D semi-supervised segmentation network model and the 3D semi-supervised segmentation network model into the hybrid prediction through respective uncertainty scores, estimating the hybrid uncertainty, and providing uncertainty weighted regularization;
the fifth module is used for carrying out fine adjustment and weight optimization on the combination of the 2D semi-supervised segmentation network model and the 3D semi-supervised segmentation network model, and saving model parameters;
and a sixth module for testing the colorectal cancer MR image by using the optimized and trained medical image segmentation model to obtain a final prediction result.
The MR medical image colorectal cancer segmentation method and system based on semi-supervised learning provided by the invention have been described in detail above. It should be noted that the above description covers only preferred embodiments of the invention, and the invention is not limited thereto; those skilled in the art may modify or substitute equivalents for some of the technical features described in the above embodiments. Any equivalent replacement, modification or the like made within the core idea and principle of the invention shall be included in the protection scope of the invention.

Claims (8)

1. An MR medical image colorectal cancer segmentation method based on semi-supervised learning, characterized by comprising the following steps:
step 1, acquiring a colorectal cancer MR image data set, preprocessing the acquired medical images to obtain labeled MRI data and unlabeled MRI data, and dividing them into a training set, a validation set and a test set; MR refers to magnetic resonance imaging;
step 2, constructing a 2D semi-supervised segmentation network model, training and regularizing it with 2D slices of the MR image data, encouraging predictions at different scales to remain consistent for a given input through a multi-scale consistency regularization constraint, and generating pseudo-labels for unlabeled 2D images;
step 3, constructing a 3D semi-supervised segmentation network model, training and regularizing it with 3D volumes of the MR image data, encouraging predictions at different scales to remain consistent for a given input through a multi-scale consistency regularization constraint, and generating pseudo-labels for unlabeled 3D images;
step 4, performing 2D and 3D hybrid prediction according to model uncertainty estimation, merging the outputs of the 2D semi-supervised segmentation network model and the 3D semi-supervised segmentation network model into a hybrid prediction via their respective uncertainty scores, estimating the hybrid uncertainty, and providing uncertainty-weighted regularization;
step 5, jointly fine-tuning and weight-optimizing the 2D semi-supervised segmentation network model and the 3D semi-supervised segmentation network model, and saving the model parameters;
and step 6, testing colorectal cancer MR images with the optimized, trained medical image segmentation model to obtain the final prediction result.
2. The MR medical image colorectal cancer segmentation method based on semi-supervised learning according to claim 1, wherein the preprocessing of step 1 is implemented as follows:
selecting three-dimensional colorectal MR data; performing image voxel spacing adjustment, colorectal region extraction, resampling and data normalization on the images; applying random cropping and random flipping as image augmentation operations; and dividing the data into a training set, a validation set and a test set, wherein each data set comprises MR images of the colorectal region and the corresponding ground-truth segmentation results.
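The preprocessing of claim 2 can be sketched as follows. This is an illustrative numpy implementation, not part of the patent; the function name, the crop shape, and the use of z-score normalization are assumptions:

```python
import numpy as np

def preprocess_volume(vol, crop_shape=(32, 64, 64), rng=None):
    """Normalize a 3D MR volume, then apply random crop and random flip
    augmentation (a hypothetical sketch of the claimed preprocessing)."""
    rng = rng or np.random.default_rng(0)
    # z-score intensity normalization
    vol = (vol - vol.mean()) / (vol.std() + 1e-8)
    # random crop to the target patch shape
    starts = [int(rng.integers(0, s - c + 1))
              for s, c in zip(vol.shape, crop_shape)]
    patch = vol[tuple(slice(st, st + c)
                      for st, c in zip(starts, crop_shape))]
    # random flip along each spatial axis with probability 0.5
    for axis in range(3):
        if rng.random() < 0.5:
            patch = np.flip(patch, axis=axis)
    return patch
```

Voxel-spacing adjustment and colorectal-region extraction would precede this step and typically require a resampling library, so they are omitted here.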
3. The MR medical image colorectal cancer segmentation method based on semi-supervised learning according to claim 1, wherein the 2D semi-supervised segmentation network model of step 2 adds an auxiliary layer after each block of the decoders of the student model and the teacher model to form a multi-scale prediction component, the teacher model weights being defined as the exponential moving average (EMA) of the student model weights; multi-scale features are used to encourage the predictions of different layers to remain consistent; the multi-scale supervised loss, which evaluates the quality of the network outputs for labeled inputs, minimizes the difference between predictions at different scales; the multi-scale consistency loss measures the consistency of the teacher model and student model predictions for the same input under different perturbations and is defined as the expected distance between the student model predictions and the teacher model predictions:
wherein L_ce and L_dice are the cross-entropy loss and the Dice loss respectively, ŷ_2D^(s,k) and ŷ_2D^(t,k) denote the segmentation predictions at scale k of the multi-scale prediction components of the 2D student model and teacher model respectively, y_2D is the ground-truth annotation, and α_k is the weight of scale k.
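The EMA teacher update and the multi-scale consistency term described in claim 3 can be illustrated with the following numpy sketch. It is not taken from the patent; the squared-distance form of the consistency loss and the dictionary representation of weights are assumptions:

```python
import numpy as np

def ema_update(teacher_w, student_w, alpha=0.99):
    """Teacher weights as an exponential moving average of student weights."""
    return {k: alpha * teacher_w[k] + (1 - alpha) * student_w[k]
            for k in teacher_w}

def multiscale_consistency(student_preds, teacher_preds, scale_weights):
    """Weighted expected squared distance between student and teacher
    predictions at each decoder scale k (one plausible distance choice)."""
    return sum(w * np.mean((s - t) ** 2)
               for s, t, w in zip(student_preds, teacher_preds, scale_weights))
```

When the student and teacher agree at every scale, the consistency term vanishes, which is the behavior the regularization constraint targets.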
4. The MR medical image colorectal cancer segmentation method based on semi-supervised learning according to claim 1, wherein the 3D semi-supervised segmentation network model of step 3 is identical in architecture to the 2D semi-supervised segmentation network model; auxiliary layers are added to the student model and the teacher model to form a multi-scale prediction component, the teacher model weights being the EMA of the student model weights; multi-scale features are used to encourage the predictions of different layers to remain consistent; the multi-scale supervised loss and the multi-scale consistency loss are defined as:
wherein ŷ_3D^(s,k) and ŷ_3D^(t,k) denote the segmentation predictions at scale k of the multi-scale prediction components of the 3D student model and teacher model respectively, y_3D is the ground-truth annotation, and α_k is the weight of scale k.
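The multi-scale supervised loss named in claims 3 and 4 combines cross-entropy and Dice losses over the decoder scales. A minimal numpy sketch (an illustration under assumed binary segmentation, not the patent's exact formulation):

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss for a binary probability map."""
    inter = 2.0 * np.sum(pred * target)
    return 1.0 - (inter + eps) / (np.sum(pred) + np.sum(target) + eps)

def cross_entropy_loss(pred, target, eps=1e-8):
    """Binary cross-entropy averaged over all voxels."""
    return -np.mean(target * np.log(pred + eps)
                    + (1 - target) * np.log(1 - pred + eps))

def multiscale_supervised_loss(preds, target, scale_weights):
    """Weighted sum of (CE + Dice) over the decoder's multi-scale outputs,
    with alpha_k playing the role of scale_weights."""
    return sum(w * (cross_entropy_loss(p, target) + dice_loss(p, target))
               for p, w in zip(preds, scale_weights))
```

A perfect prediction drives both terms to (numerically) zero, so the loss rewards agreement with the ground-truth annotation at every scale.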
5. The MR medical image colorectal cancer segmentation method based on semi-supervised learning according to claim 1, wherein step 4 performs uncertainty estimation and hybrid prediction for the 2D and 3D images: the uncertainties in the 2D teacher output and the 3D teacher output are estimated separately using Monte Carlo dropout, a set of softmax (normalized exponential function) probability vectors is obtained for each voxel of the input, and the predictive entropy is used as a metric to approximate the uncertainty; the final hybrid segmentation prediction is an uncertainty-weighted combination of the 2D and 3D segmentation outputs:
wherein Con denotes the concatenation operation, C^(-1) converts the 2D predictions back into 3D shape, {w} is an entropy-based weight map reflecting the confidence of each segmentation map at every voxel position, and u_h denotes the hybrid segmentation uncertainty estimate.
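The entropy-based uncertainty and the weighted fusion of claim 5 can be sketched as follows. The exp(-entropy) weighting is one plausible realization of the weight map {w}; the patent text does not specify its exact form, so it is an assumption here:

```python
import numpy as np

def predictive_entropy(prob_samples):
    """Approximate uncertainty as the entropy of the mean softmax over
    T stochastic (Monte Carlo dropout) forward passes.
    prob_samples: array of shape (T, C, ...) of softmax probabilities."""
    mean_p = prob_samples.mean(axis=0)
    return -np.sum(mean_p * np.log(mean_p + 1e-8), axis=0)

def fuse_predictions(p2d, u2d, p3d, u3d):
    """Uncertainty-weighted combination of 2D and 3D segmentation outputs:
    lower entropy yields a higher weight (illustrative choice of weights)."""
    w2d = np.exp(-u2d)
    w3d = np.exp(-u3d)
    return (w2d * p2d + w3d * p3d) / (w2d + w3d + 1e-8)
```

When both branches are equally uncertain the fusion reduces to a plain average; elsewhere the more confident branch dominates, which is the intended behavior of the hybrid prediction.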
6. The MR medical image colorectal cancer segmentation method based on semi-supervised learning according to claim 1, wherein in step 5 the 2D semi-supervised segmentation network model and the 3D semi-supervised segmentation network model are jointly optimized for training; the overall training goal of the model is to minimize the 2D loss L_2D and the 3D loss L_3D, each of which comprises a supervised loss, a multi-scale consistency loss and a hybrid consistency loss with corresponding weight coefficients; the optimization method and hyperparameters used in the training process are determined, including the optimizer, the initial learning rate and the number of training iterations; the model is trained, the model and visualization results are saved every fixed number of iterations, the hyperparameter settings are adjusted appropriately according to the results, and the model is optimized iteratively.
7. The MR medical image colorectal cancer segmentation method based on semi-supervised learning according to claim 1, wherein in step 6 the MR images of the test set are tested using the optimized, trained model to obtain the final segmentation result, and evaluation indices such as the Dice score (sample similarity score) and the average surface distance (ASD) are output.
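The two evaluation indices named in claim 7 can be computed as follows. This is an illustrative numpy sketch (a brute-force ASD over 6-connected surface voxels, suitable only for small masks; production code would use a distance-transform library):

```python
import numpy as np

def dice_score(pred, gt):
    """Dice similarity coefficient between two binary masks."""
    inter = 2.0 * np.logical_and(pred, gt).sum()
    return inter / (pred.sum() + gt.sum() + 1e-8)

def surface_voxels(mask):
    """Boundary voxels: foreground voxels with at least one background
    6-neighbor (computed via zero-padded axis shifts)."""
    padded = np.pad(mask, 1)
    core = padded[1:-1, 1:-1, 1:-1]
    nb_min = np.minimum.reduce([
        padded[:-2, 1:-1, 1:-1], padded[2:, 1:-1, 1:-1],
        padded[1:-1, :-2, 1:-1], padded[1:-1, 2:, 1:-1],
        padded[1:-1, 1:-1, :-2], padded[1:-1, 1:-1, 2:]])
    return np.argwhere(core & (nb_min == 0))

def average_surface_distance(pred, gt):
    """Symmetric average surface distance (voxel units, brute force)."""
    sp = surface_voxels(pred.astype(np.uint8))
    sg = surface_voxels(gt.astype(np.uint8))
    d = np.linalg.norm(sp[:, None, :] - sg[None, :, :], axis=-1)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())
```

For identical masks the Dice score is 1 and the ASD is 0; lower ASD and higher Dice indicate a better segmentation.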
8. An MR medical image colorectal cancer segmentation system based on semi-supervised learning for implementing the segmentation method according to any one of claims 1 to 7, characterized by comprising:
a first module for acquiring a colorectal cancer MR image data set, preprocessing the acquired medical images to obtain labeled MRI data and unlabeled MRI data, and dividing them into a training set, a validation set and a test set; MR refers to magnetic resonance imaging;
a second module for constructing a 2D semi-supervised segmentation network model, training and regularizing it with 2D slices of the MR image data, encouraging predictions at different scales to remain consistent for a given input through a multi-scale consistency regularization constraint, and generating pseudo-labels for unlabeled 2D images;
a third module for constructing a 3D semi-supervised segmentation network model, training and regularizing it with 3D volumes of the MR image data, encouraging predictions at different scales to remain consistent for a given input through a multi-scale consistency regularization constraint, and generating pseudo-labels for unlabeled 3D images;
a fourth module for performing 2D and 3D hybrid prediction according to model uncertainty estimation, merging the outputs of the 2D semi-supervised segmentation network model and the 3D semi-supervised segmentation network model into a hybrid prediction via their respective uncertainty scores, estimating the hybrid uncertainty, and providing uncertainty-weighted regularization;
a fifth module for jointly fine-tuning and weight-optimizing the 2D semi-supervised segmentation network model and the 3D semi-supervised segmentation network model, and saving the model parameters;
and a sixth module for testing colorectal cancer MR images with the optimized, trained medical image segmentation model to obtain the final prediction result.
CN202311382615.0A 2023-10-24 2023-10-24 MR medical image colorectal cancer segmentation method and system based on semi-supervised learning Pending CN117409201A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311382615.0A CN117409201A (en) 2023-10-24 2023-10-24 MR medical image colorectal cancer segmentation method and system based on semi-supervised learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311382615.0A CN117409201A (en) 2023-10-24 2023-10-24 MR medical image colorectal cancer segmentation method and system based on semi-supervised learning

Publications (1)

Publication Number Publication Date
CN117409201A true CN117409201A (en) 2024-01-16

Family

ID=89486584

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311382615.0A Pending CN117409201A (en) 2023-10-24 2023-10-24 MR medical image colorectal cancer segmentation method and system based on semi-supervised learning

Country Status (1)

Country Link
CN (1) CN117409201A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117649528A (en) * 2024-01-29 2024-03-05 山东建筑大学 Semi-supervised image segmentation method, system, electronic equipment and storage medium
CN117649528B (en) * 2024-01-29 2024-05-31 山东建筑大学 Semi-supervised image segmentation method, system, electronic equipment and storage medium


Similar Documents

Publication Publication Date Title
CN113077471B (en) Medical image segmentation method based on U-shaped network
CN111798462B (en) Automatic delineation method of nasopharyngeal carcinoma radiotherapy target area based on CT image
CN110598029B (en) Fine-grained image classification method based on attention transfer mechanism
WO2021203795A1 (en) Pancreas ct automatic segmentation method based on saliency dense connection expansion convolutional network
CN114693933A (en) Medical image segmentation device based on generation of confrontation network and multi-scale feature fusion
CN111784653A (en) Multi-scale network MRI pancreas contour positioning method based on shape constraint
Bitarafan et al. 3D image segmentation with sparse annotation by self-training and internal registration
CN114359642A (en) Multi-modal medical image multi-organ positioning method based on one-to-one target query Transformer
CN115496720A (en) Gastrointestinal cancer pathological image segmentation method based on ViT mechanism model and related equipment
CN111261296A (en) Tumor clinical target area automatic delineation method and system based on conditional random vector field
Yang et al. A multiorgan segmentation model for CT volumes via full convolution-deconvolution network
CN113989551A (en) Alzheimer disease classification method based on improved ResNet network
Kan et al. Itunet: Integration of transformers and unet for organs-at-risk segmentation
Li et al. Pancreas segmentation via spatial context based u-net and bidirectional lstm
CN113436127A (en) Method and device for constructing automatic liver segmentation model based on deep learning, computer equipment and storage medium
Takrouni et al. Improving geometric P-norm-based glioma segmentation through deep convolutional autoencoder encapsulation
CN116759076A (en) Unsupervised disease diagnosis method and system based on medical image
Yang A novel brain image segmentation method using an improved 3D U-net model
Yin et al. A generative Adversarial network fused with Dual-attention mechanism and its Application in Multitarget image fine segmentation
CN117409201A (en) MR medical image colorectal cancer segmentation method and system based on semi-supervised learning
Samudrala et al. Semantic Segmentation in Medical Image Based on Hybrid Dlinknet and Unet
CN115546089A (en) Medical image segmentation method, pathological image processing method, device and equipment
Xia et al. Expanded Mask R-CNN's Retinal Edema Detection Network
Yang et al. AAE-Dpeak-SC: A novel unsupervised clustering method for space target ISAR images based on adversarial autoencoder and density peak-spectral clustering
CN116563524B (en) Glance path prediction method based on multi-vision memory unit

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination