CN112686912B - Acute stroke lesion segmentation method based on gradual learning and mixed samples

Info

Publication number: CN112686912B
Application number: CN202110006989.7A
Authority: CN
Inventors: 刘之洋; 赵彬; 吴虹; 刘国华; 丁数学
Original assignee: Nankai University
Current assignee: Nankai University
Priority date: 2021-01-05
Filing date: 2021-01-05
Publication date: 2022-06-10
Anticipated expiration: 2041-01-05
Also published as: CN112686912A

Abstract

The invention discloses an acute stroke lesion segmentation method based on gradual learning and mixed samples. The method comprises the following steps: designing a segmentation network capable of end-to-end training, and sequentially connecting a global average pooling layer and a classification layer after the output layer of the down-sampling part; initializing this segmentation network with labeled data samples; performing gradual learning with the initialized segmentation network, in which the labeled data samples are used to assign image-level pseudo-labels to the unlabeled data samples and to impose a semantic constraint in each iteration; designing a decoding network based on multi-feature-map fusion, and connecting the down-sampling part trained in the preceding steps to the decoding network; training the new network end to end with the labeled data samples; and evaluating the network on a test data set and outputting the corresponding test results. The invention makes effective use of unlabeled data samples and greatly reduces the cost of acquiring data samples.

Description

Acute stroke lesion segmentation method based on gradual learning and mixed samples

Technical Field

The invention relates to the technical field of acute stroke nuclear magnetic resonance image segmentation, in particular to an acute stroke lesion segmentation method based on gradual learning and mixed samples.

Background

Ischemic stroke is a common disease that occurs more frequently in the elderly population; if it is not discovered and treated in a timely manner, it can lead to disability and even death. The therapeutic window in the acute stage of onset is short, so rapidly locating the lesion and quantifying the lesion volume are critical for the treatment of acute stroke.

With the rapid development of artificial intelligence, many deep-learning-based medical image processing methods have emerged, and some researchers have applied them to lesion segmentation on medical images of acute stroke. Chen proposed a two-stage neural network structure, with one stage for preliminary segmentation and one for post-processing, which achieves a segmentation accuracy of 0.67 on single-modality diffusion-weighted images (DWI). To make full use of the contextual information in magnetic resonance imaging (MRI), Zhang proposed a 3D neural network structure. To reduce the over-detection or missed detection that single-modality MRI may cause, Liu proposed a residual convolutional neural network structure that segments acute stroke lesions in multi-modality MRI. However, these segmentation algorithms require a large number of pixel-level labeled data samples, which must be annotated pixel by pixel by professional clinicians, and this greatly increases the cost of data collection. Using less labeled data, or data that is cheaper to label, is therefore a worthwhile direction. Recently, Zhao proposed a two-branch neural network structure that performs acute stroke lesion segmentation using data labeled at both the pixel level and the image level. Although it achieves better segmentation accuracy, the method still relies on a large amount of image-level labeled data for pre-training. A more effective approach is to generalize the algorithm using a small number of pixel-level labeled samples together with some unlabeled samples, and then segment the acute stroke lesion.

Some researchers currently apply the idea of small-sample (few-shot) learning to semantic segmentation of natural images. Many of these methods are fine-tuned from weights pre-trained on large-scale public datasets. Because medical images differ markedly from natural images, the pre-trained parameters of such networks do not generalize well to medical images in the small-sample setting. For medical image lesion segmentation, current methods focus on using a small amount of pixel-level labeled data to generate pixel-level pseudo-labels for unlabeled data, and then mixing the two kinds of pixel-level labeled samples for end-to-end segmentation training. Because the generated pixel-level pseudo-labels contain errors, these errors accumulate during end-to-end training and hinder effective segmentation of the lesion region.

To address the excessive dependence of current deep-learning-based acute stroke lesion segmentation methods on labeled data, a method is needed that segments acute stroke lesions using easily obtained unlabeled data under the guidance of a small amount of labeled data.

Disclosure of Invention

The invention provides an acute stroke lesion segmentation method based on gradual learning and mixed samples, which can effectively segment acute stroke lesions using only a small number of pixel-level labeled data samples and some unlabeled data samples. Mixing samples in this way greatly reduces the cost of acquiring data samples and makes better use of the large amount of otherwise unused unlabeled data. The technical scheme adopted by the method comprises the following steps:

Step 1: designing a segmentation network capable of end-to-end training, and sequentially connecting a global average pooling layer and a classification layer after the output layer of the down-sampling part;

Step 2: initializing the segmentation network of step 1 with pixel-level labeled data samples;

Step 3: performing gradual learning with the segmentation network initialized in step 2. Specifically, labeled data samples and unlabeled data samples are mixed and input into the segmentation network, and a feature vector of a certain dimension is output at the global average pooling layer. The features of all positive samples and of all negative samples are then averaged separately. The Euclidean distances between the feature vector of each unlabeled data sample and the mean feature vectors of the positive and negative samples are computed. According to these distances, the several unlabeled samples closest to the positive mean and the several unlabeled samples closest to the negative mean are selected and given the corresponding image-level pseudo-labels, and the next round of classification training is performed. This repeats until a certain proportion of the unlabeled data samples has been processed, at which point the training process stops. In each iteration, the labeled data samples are used to impose a semantic constraint;

Step 4: designing a decoding network based on multi-feature-map fusion, and connecting the down-sampling part trained in step 3 to the decoding network;

Step 5: training the network of step 4 end to end with the labeled data samples;

Step 6: evaluating the network obtained in step 5 on a test data set, and outputting the corresponding test results.

Further, during the gradual learning in step 3, if an unlabeled data sample falls within the selectable range of both the positive and the negative samples, it is discarded in the current iteration and left to be selected in a subsequent iteration.

Compared with the traditional acute stroke lesion segmentation method, the acute stroke lesion segmentation method based on the gradual learning and the mixed samples has the following advantages:

(1) For acute stroke lesion segmentation, the invention provides a gradual learning method that mixes a small number of labeled data samples with some unlabeled data samples, which greatly reduces the cost of acquiring data samples.

(2) The invention assigns image-level pseudo-labels to unlabeled data samples in a high-dimensional feature space and uses the labeled data samples to impose a semantic constraint during gradual learning, which reduces the error accumulation that pseudo-labels cause in pixel-level segmentation.

Drawings

Fig. 1 is a schematic diagram of a training process based on gradual learning and mixed samples according to the present invention.

Fig. 2 is a schematic diagram of acute stroke lesion segmentation.

Detailed Description

The method of the present invention is described in detail with reference to the accompanying drawings and examples.

A schematic diagram of the training process based on gradual learning and mixed samples is shown in Fig. 1. The general flow is as follows. First, the mixed labeled and unlabeled data samples are input into the network; each sample comprises two modalities, a DWI image and an ADC image. A feature vector of a certain dimension is output from the global average pooling layer, and the labeled data samples are used to assign image-level pseudo-labels to the unlabeled data samples. In each iteration, a small and equal number of unlabeled samples is selected for the positive and negative classes, and the training process stops once all unlabeled data samples have been selected. Then, as illustrated by the acute stroke lesion segmentation schematic in Fig. 2, end-to-end training is performed using the labeled data sample pairs, and the trained segmentation network is used to segment acute stroke lesions on the test data set.

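Step 1: designing a segmentation network capable of end-to-end training, and sequentially connecting a global average pooling layer and a classification layer after the output layer of the down-sampling part;

The end-to-end segmentation network designed by the invention follows the structure of a classical fully convolutional segmentation network, with a global average pooling layer and a classification layer added in sequence after the output layer of the down-sampling part.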
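The classification head described above can be illustrated with a minimal sketch in plain Python: global average (mean) pooling collapses each channel of the down-sampling output to one value, and a classification layer maps the pooled vector to class scores. The function names, the softmax output, and the toy weights are assumptions made for illustration; the patent does not specify them, and a practical implementation would use a deep learning framework.

```python
import math

def global_average_pool(feature_maps):
    """Reduce C feature maps (each a 2D list) to a length-C feature vector."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_maps]

def classify(features, weights, biases):
    """Linear classification layer followed by a softmax (softmax is an assumption)."""
    logits = [sum(w * f for w, f in zip(w_row, features)) + b
              for w_row, b in zip(weights, biases)]
    peak = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy down-sampling output with two channels of size 2 x 2 (hypothetical values).
feature_maps = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
features = global_average_pool(feature_maps)
# Hypothetical classifier parameters: identity weights, zero biases.
probs = classify(features, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

The pooled vector `features` is also the representation in which the later gradual learning step compares samples.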

Step 2: initializing the segmentation network of step 1 with the labeled data samples;

The labeled data samples are annotated by professional physicians, which ensures their accuracy. The data samples input into the network comprise DWI images and ADC images. To facilitate training of the neural network, the DWI images and the ADC images are standardized separately and then fused along the channel dimension. The standardization can be written as:

$$\hat{x}_i = \frac{x_i - \mu}{\sigma}$$

where x_i is the i-th sample, x̂_i is its standardized value, and μ and σ are the mean and standard deviation, respectively.
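The standardization and channel fusion described above might look as follows in plain Python. This is a minimal sketch under stated assumptions: the mean and standard deviation are computed per image (the text does not say whether they are per image or over the data set), the guard for constant images is added for safety, and the function names `standardize` and `fuse_channels` are hypothetical.

```python
from statistics import fmean, pstdev

def standardize(image):
    """Apply (x - mean) / std to a 2D image given as a list of rows."""
    flat = [v for row in image for v in row]
    mu = fmean(flat)
    sigma = pstdev(flat)
    if sigma == 0:
        sigma = 1.0  # assumption: leave constant images at zero instead of dividing by 0
    return [[(v - mu) / sigma for v in row] for row in image]

def fuse_channels(dwi, adc):
    """Standardize each modality separately, then stack them as two channels."""
    return [standardize(dwi), standardize(adc)]

# Toy 2 x 2 slices (hypothetical intensities).
dwi = [[0.0, 2.0], [4.0, 6.0]]
adc = [[10.0, 10.0], [30.0, 30.0]]
fused = fuse_channels(dwi, adc)  # shape: 2 channels x 2 rows x 2 columns
```

Standardizing each modality before fusion keeps the DWI and ADC intensity ranges comparable when they enter the network as separate channels.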

Step 3: performing gradual learning with the segmentation network initialized in step 2. Specifically, labeled data samples and unlabeled data samples are mixed and input into the segmentation network, and a feature vector of a certain dimension is output at the global average pooling layer. The features of all positive samples and of all negative samples are then averaged separately. The Euclidean distances between the feature vector of each unlabeled data sample and the mean feature vectors of the positive and negative samples are computed. According to these distances, the several unlabeled samples closest to the positive mean and the several unlabeled samples closest to the negative mean are selected and given the corresponding image-level pseudo-labels, and the next round of classification training is performed. This repeats until a certain proportion of the unlabeled data samples has been processed, at which point the training process stops. In each iteration, the labeled data samples are used to impose a semantic constraint;

The invention mixes the labeled data samples and the unlabeled data samples and inputs them into the segmentation network initialized in step 2 for training. Specifically, the feature vectors output by the global average pooling layer are taken; the mean vectors of the positive and negative samples are computed first, and then the Euclidean distances between the feature vector of each unlabeled data sample and the two mean vectors are computed. Image-level pseudo-labels are assigned to the unlabeled data samples according to these distances: the several unlabeled samples nearest to the positive mean and the several nearest to the negative mean are selected and given the corresponding image-level pseudo-labels. Classification training is then performed using the pseudo-labeled data samples together with the labeled data samples, and training stops once a certain proportion of the unlabeled data samples has been processed. In each iteration, the labeled data samples are used to impose a semantic constraint. The specific gradual learning process is shown in Fig. 1.
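One round of the pseudo-label assignment described above can be sketched in plain Python. The sketch assumes that "several" is a fixed count `k` per class, and it reads the rule about samples within the selectable range of both classes as: a sample that appears in both candidate sets is skipped in this round. The function names and toy feature vectors are hypothetical.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def select_pseudo_labels(pos_feats, neg_feats, unlabeled_feats, k):
    """Return {index: pseudo_label} for up to k samples nearest each class mean."""
    pos_mean = mean_vector(pos_feats)
    neg_mean = mean_vector(neg_feats)
    d_pos = [math.dist(f, pos_mean) for f in unlabeled_feats]
    d_neg = [math.dist(f, neg_mean) for f in unlabeled_feats]
    indices = range(len(unlabeled_feats))
    near_pos = set(sorted(indices, key=lambda i: d_pos[i])[:k])
    near_neg = set(sorted(indices, key=lambda i: d_neg[i])[:k])
    ambiguous = near_pos & near_neg  # candidates for both classes: defer to a later round
    labels = {i: 1 for i in near_pos - ambiguous}
    labels.update({i: 0 for i in near_neg - ambiguous})
    return labels

# Toy 2-D features from the pooling layer (hypothetical values).
pos_feats = [[1.0, 1.0], [1.2, 0.8]]
neg_feats = [[-1.0, -1.0], [-0.8, -1.2]]
unlabeled = [[1.0, 0.9], [-1.0, -1.0], [0.1, -0.1], [0.9, 1.1]]
round_labels = select_pseudo_labels(pos_feats, neg_feats, unlabeled, k=1)
# A single sample is nearest to both means, so it is deferred.
deferred = select_pseudo_labels(pos_feats, neg_feats, [[0.0, 0.0]], k=1)
```

In a full training loop, the samples in `round_labels` would be moved from the unlabeled pool into the classification training set before the next round, and the class means would be recomputed from the updated network.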

Step 4: designing a decoding network based on multi-feature-map fusion, and connecting the down-sampling part trained in step 3 to the decoding network;

Because the number of labeled data samples is very small, a decoding network with few parameters is designed to decode the feature maps of the down-sampling part from step 3 into the segmentation output. To make more effective use of the feature maps that contribute most to the segmentation result, a channel attention module is embedded in the decoding network.
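The text does not give the form of the channel attention module. One common form, assumed here purely for illustration, is a squeeze-and-excitation style gate: each channel is summarized by its mean, a small learned layer turns the summaries into per-channel weights between 0 and 1, and each feature map is rescaled by its weight. The single linear layer and the parameter values below are hypothetical simplifications.

```python
import math

def channel_attention(feature_maps, weights, biases):
    """Rescale each channel by a sigmoid gate computed from the channel means.

    feature_maps: list of C channels, each a 2D list.
    weights (C x C) and biases (length C) stand in for learned parameters.
    """
    # Squeeze: one mean value per channel.
    squeezed = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_maps]
    # Excite: linear layer plus sigmoid gives one gate per channel.
    gates = []
    for w_row, b in zip(weights, biases):
        z = sum(w * s for w, s in zip(w_row, squeezed)) + b
        gates.append(1.0 / (1.0 + math.exp(-z)))
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[v * g for v in row] for row in ch]
              for ch, g in zip(feature_maps, gates)]
    return scaled, gates

# Two toy channels: one active, one empty (hypothetical values).
maps = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
scaled, gates = channel_attention(maps, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

With these toy parameters the active channel receives a larger gate than the empty one, which is the behavior the module is meant to provide for feature maps that contribute more to the segmentation.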

Step 5: training the network of step 4 end to end with the labeled data samples;

The end-to-end segmentation network obtained in step 4 is trained using the labeled data samples. The training process is monitored with an early-stopping method, and the optimal network parameters are saved.
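The early-stopping logic can be sketched as follows. The monitored quantity (a validation loss), the patience of three epochs, and the function name are assumptions; the text only states that early stopping is used and that the best parameters are kept. A list of precomputed loss values stands in for real training epochs.

```python
def early_stopping(val_losses, patience=3):
    """Return (best_epoch, best_loss, stopped_epoch) for a sequence of validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    stale = 0  # number of consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # In real training, the network parameters would be saved here.
            best_loss, best_epoch, stale = loss, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                return best_epoch, best_loss, epoch
    return best_epoch, best_loss, len(val_losses) - 1

# Hypothetical validation losses: improvement stalls after epoch 2.
result = early_stopping([0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.5], patience=3)
```

Here training stops at epoch 5 and the parameters from epoch 2 are the ones retained, even though a later epoch in the list would have been better; a larger patience trades training time for robustness to such plateaus.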

Step 6: evaluating the network obtained in step 5 on a test data set, and outputting the corresponding test results;

To test the generalization ability of the network obtained in step 5, the network is evaluated on a test set. The segmentation accuracy is measured by the Dice coefficient:

$$\mathrm{Dice} = \frac{2\,|G \cap P|}{|G| + |P|}$$

where G and P denote the labeled acute stroke lesion and the predicted lesion, respectively, and |·| denotes the size of the corresponding lesion region.
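As a small worked example, the sketch below computes the segmentation accuracy in its usual Dice form, 2|G∩P| / (|G| + |P|), on flattened binary masks. The flattening to 0/1 lists, the function name, and the convention of returning 1.0 when both masks are empty are assumptions for illustration.

```python
def dice_coefficient(label, prediction):
    """Dice overlap between two equal-length binary (0/1) sequences."""
    g = sum(label)                     # |G|: number of labeled lesion voxels
    p = sum(prediction)                # |P|: number of predicted lesion voxels
    overlap = sum(1 for a, b in zip(label, prediction) if a and b)  # |G ∩ P|
    if g + p == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * overlap / (g + p)

# Toy flattened masks (hypothetical): 3 labeled voxels, 3 predicted, 2 shared.
label_mask = [0, 1, 1, 1, 0, 0]
predicted_mask = [0, 0, 1, 1, 1, 0]
score = dice_coefficient(label_mask, predicted_mask)  # 2 * 2 / (3 + 3)
```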

To further analyze the lesions, the lesion-level precision P_L, the lesion-level recall R_L, and the lesion-level F₁ score are used as evaluation indexes, defined as follows:

$$P_L = \frac{\mathrm{m\#TP}}{\mathrm{m\#TP} + \mathrm{m\#FP}}, \qquad R_L = \frac{\mathrm{m\#TP}}{\mathrm{m\#TP} + \mathrm{m\#FN}}, \qquad F_1 = \frac{2\,P_L\,R_L}{P_L + R_L}$$

where m#TP, m#FP, and m#FN are the mean numbers of true positives, false positives, and false negatives, respectively. Specifically, three-dimensional connectivity analysis is applied to the segmented lesions and to the test-set labels to form their respective connected regions. A region that exists in both the segmented lesion and the label is defined as a true positive (TP); a region that exists in the segmented lesion but not in the label is a false positive (FP). False negatives (FN) and true negatives (TN) are defined analogously, and averaging these counts gives m#TP, m#FP, and m#FN.
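The lesion-level counting can be sketched for a single subject as follows. Several details are assumptions, since the text does not fix them: voxels are given as sets of (z, y, x) coordinates, regions are formed with 26-connectivity, and a labeled region counts as detected if it shares at least one voxel with the prediction. The scores use the usual precision, recall, and F₁ definitions; in the evaluation described above, the counts would be averaged over the test set before the scores are computed.

```python
from collections import deque
from itertools import product

def connected_regions(voxels):
    """Split a set of (z, y, x) voxels into 26-connected regions."""
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    remaining = set(voxels)
    regions = []
    while remaining:
        seed = remaining.pop()
        region, queue = {seed}, deque([seed])
        while queue:  # breadth-first flood fill from the seed voxel
            z, y, x = queue.popleft()
            for dz, dy, dx in offsets:
                nb = (z + dz, y + dy, x + dx)
                if nb in remaining:
                    remaining.remove(nb)
                    region.add(nb)
                    queue.append(nb)
        regions.append(region)
    return regions

def lesion_counts(label_voxels, pred_voxels):
    """Count lesion-level TP, FP, FN using a one-voxel overlap criterion."""
    label_set, pred_set = set(label_voxels), set(pred_voxels)
    label_regions = connected_regions(label_set)
    pred_regions = connected_regions(pred_set)
    tp = sum(1 for r in label_regions if r & pred_set)
    fn = len(label_regions) - tp
    fp = sum(1 for r in pred_regions if not (r & label_set))
    return tp, fp, fn

def lesion_scores(tp, fp, fn):
    """Precision, recall, and F1 from (possibly averaged) lesion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Toy subject (hypothetical): two labeled lesions, one found, one spurious prediction.
label = {(0, 0, 0), (0, 0, 1), (0, 5, 5)}
pred = {(0, 0, 1), (0, 9, 9)}
counts = lesion_counts(label, pred)
scores = lesion_scores(*counts)
```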

Claims

1. An acute stroke lesion segmentation method based on gradual learning and mixed samples, characterized in that labeled data samples are used to gradually assign image-level pseudo-labels to unlabeled data samples, the labeled data samples are used to impose a semantic constraint during the gradual learning, the down-sampling part of the network is then connected to a newly designed decoding network, and end-to-end training is performed using the labeled data samples, the method specifically comprising the following steps:

1) designing a segmentation network capable of end-to-end training, and sequentially connecting a global average pooling layer and a classification layer after the output layer of the down-sampling part;

2) initializing the segmentation network of 1) with the labeled data samples;

3) performing gradual learning with the segmentation network initialized in 2), specifically: mixing labeled data samples and unlabeled data samples and inputting them into the segmentation network; outputting features of a certain dimension at the global average pooling layer; averaging the features of all positive samples and of all negative samples separately; computing the Euclidean distances between the feature vectors of the unlabeled data samples and the mean positive and negative feature vectors; selecting, according to these distances, the several unlabeled samples closest to the positive samples and the several unlabeled samples closest to the negative samples, and giving them the corresponding image-level pseudo-labels; performing the next round of classification training; stopping the training process once a certain proportion of the unlabeled data samples has been processed; and using the labeled data samples to impose a semantic constraint in each iteration;

4) designing a decoding network based on multi-feature-map fusion, and connecting the down-sampling part trained in 3) to the decoding network;

5) training the network of 4) end to end with the labeled data samples;

6) evaluating the network obtained in 5) on the test data set, and outputting the corresponding test results.

2. The acute stroke lesion segmentation method based on gradual learning and mixed samples of claim 1, wherein a new network is designed based on a classical fully convolutional network, and a global average pooling layer and a classification layer are connected in sequence after the output layer of the down-sampling part.

3. The acute stroke lesion segmentation method based on gradual learning and mixed samples of claim 1, wherein the network designed in claim 2 is initialized with labeled data samples, the labeled data samples being set to 5 pixel-level labeled data samples.

4. The acute stroke lesion segmentation method based on gradual learning and mixed samples of claim 1, wherein the feature dimension of a single data sample output at the global average pooling layer is n x 512, where n is the number of slices of the sample, and samples are selected in an amount of 0.005 times the total number of remaining samples in the classification training.

5. The acute stroke lesion segmentation method based on gradual learning and mixed samples of claim 1, wherein a decoding network is designed, the down-sampling part of the network trained by gradual learning is connected to the decoding network, and a channel attention module is embedded in the decoding network to better focus on the feature maps that contribute more to lesion segmentation.

6. The acute stroke lesion segmentation method based on gradual learning and mixed samples of claim 1, wherein the newly derived network is trained using labeled data samples, and an early-stopping method is used to monitor the training process and save the optimal converged parameters.

7. The acute stroke lesion segmentation method based on gradual learning and mixed samples of claim 1, wherein the finally trained network is tested on a test set, and the evaluation indexes adopted comprise the Dice coefficient for segmentation accuracy, the lesion-level precision P_L, the lesion-level recall R_L, and the lesion-level F₁ score.