CN112308833A - One-shot brain image segmentation method based on cycle-consistent correlation - Google Patents

One-shot brain image segmentation method based on cycle-consistent correlation

Info

Publication number
CN112308833A
Authority
CN
China
Prior art keywords
image
generator
module
mapping
network model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011182378.XA
Other languages
Chinese (zh)
Other versions
CN112308833B (en)
Inventor
Wang Liansheng (王连生)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen University
Original Assignee
Xiamen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen University filed Critical Xiamen University
Priority to CN202011182378.XA
Publication of CN112308833A
Application granted
Publication of CN112308833B
Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/0002: Inspection of images, e.g. flaw detection
    • G06T 7/0012: Biomedical image inspection
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/084: Backpropagation, e.g. using gradient descent
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/20: Image preprocessing
    • G06V 10/26: Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
    • G06V 10/267: Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/30: Subject of image; Context of image processing
    • G06T 2207/30004: Biomedical image processing
    • G06T 2207/30016: Brain

Abstract

The invention discloses a one-shot brain image segmentation method based on cycle-consistent correlation, which comprises the following steps: S1, obtaining and classifying brain anatomical structure images to obtain an unlabeled image y and an atlas x, wherein the atlas x has a label x_s; S2, constructing an LT-NET network model, wherein the LT-NET network model comprises a generator G_F, a generator G_B and two discriminators D; S3, inputting the atlas x and the unlabeled image y into the generator G_F to obtain the forward mapping Δp_F; S4, applying the forward mapping Δp_F to the atlas x and the label x_s respectively, in cooperation with the supervision losses, to obtain the reconstructed image ỹ and the reconstructed label ỹ_s; S5, inputting the reconstructed image ỹ and the atlas x into the generator G_B to obtain the backward mapping Δp_B; S6, applying the backward mapping Δp_B to the reconstructed image ỹ and the label ỹ_s respectively, in cooperation with the supervision losses, to obtain the reconstructed image x̃ and the reconstructed label x̃_s. By constructing the LT-NET network model and coordinating it with the supervision losses, the method effectively improves the efficiency of image segmentation and the performance of unidirectional correlation learning.

Description

One-shot brain image segmentation method based on cycle-consistent correlation
Technical Field
The invention relates to the technical field of medical image segmentation, in particular to a one-shot brain image segmentation method based on cycle-consistent correlation.
Background
Common methods for brain anatomical structure segmentation are based on traditional machine learning, which relies on manually extracted features with limited representation and generalization ability. Convolutional neural network (CNN) learning was developed because it is completely data-driven and can automatically learn hierarchical features, eliminating the limitations of hand-crafted features in traditional machine learning methods; with sufficient labeled data, convolutional neural networks perform well in fully supervised segmentation tasks. Segmentation algorithms with forward and backward correlation improve the segmentation network by learning the forward mapping from the atlas x to the unlabeled image y, obtaining the reconstructed image ỹ through the warp operation, and then learning the backward mapping from the reconstructed image ỹ back to the atlas x, which plays a positive role in forward correlation learning and in the final segmentation result. However, this learning mode has the defects that its efficiency hardly meets practical requirements and that the performance of unidirectional correlation learning is low.
Disclosure of Invention
The invention aims to provide a one-shot brain image segmentation method based on cycle-consistent correlation, which effectively improves the efficiency of image segmentation and the performance of unidirectional correlation learning.
In order to achieve the purpose, the invention adopts the following technical scheme:
A one-shot brain image segmentation method based on cycle-consistent correlation comprises the following steps:
S1, obtaining and classifying brain anatomical structure images to obtain labeled images and an unlabeled image y, and taking a labeled image as the atlas x with its label x_s;
S2, constructing an LT-NET network model, wherein the LT-NET network model comprises a generator G_F, a generator G_B and two discriminators D, and the generator G_F and the generator G_B both comprise a twin encoder and a decoder;
S3, inputting the atlas x and the unlabeled image y into the generator G_F, and processing them through the twin encoder and decoder of the generator G_F to obtain the forward mapping Δp_F;
S4, applying the forward mapping Δp_F to the atlas x and the label x_s respectively, performing cyclic adversarial training between the discriminator D and the generator G_F in cooperation with the supervision losses, and obtaining the reconstructed image ỹ and the reconstructed label ỹ_s through the warp operation;
S5, inputting the reconstructed image ỹ and the atlas x into the generator G_B, and processing them through the twin encoder and decoder of the generator G_B to obtain the backward mapping Δp_B;
S6, applying the backward mapping Δp_B to the reconstructed image ỹ and the label ỹ_s respectively, performing cyclic adversarial training between the discriminator D and the generator G_B in cooperation with the supervision losses, and obtaining the reconstructed image x̃ and the reconstructed label x̃_s through the warp operation.
Further, the supervision losses include an image consistency loss L_img_cyc(x, x̃), a transformation consistency loss L_tran_cyc(Δp_F, Δp_B), an anatomy consistency loss L_anat_cyc(x_s, x̃_s) and an anatomical difference consistency loss L_diff_cyc. The image and transformation consistency terms are expressed with a generalized Charbonnier penalty as:

L_img_cyc(x, x̃) = Σ_{t∈Ω} ρ(x(t) − x̃(t))

L_tran_cyc(Δp_F, Δp_B) = Σ_{t∈Ω} ρ(Δp_F(t) + Δp_B(t + Δp_F(t)))

wherein L_img_cyc(x, x̃) supervises the reconstructed image x̃ to be consistent with the atlas x, L_tran_cyc(Δp_F, Δp_B) supervises the forward mapping Δp_F and the backward mapping Δp_B to be inverses of each other, ρ(x) = (x² + ε²)^γ denotes the generalized Charbonnier penalty, t denotes a voxel position, ε = 0.001 and γ = 0.45; L_anat_cyc(x_s, x̃_s) and L_diff_cyc respectively supervise, in anatomical space, the consistency between the label x_s and the label x̃_s and the consistency of the anatomical differences among the labels x_s, ỹ_s and x̃_s (the published formulas of these two terms are reproduced only as images).
Further, the supervision losses also include an adversarial loss L_adv, a similarity loss L_sim(y, ỹ) and a smoothing loss L_smooth(Δp_F, Δp_B). The similarity term is the negative local normalized cross-correlation

L_sim(y, ỹ) = −Σ_{t∈Ω} [ Σ_{t_i} (y(t_i) − f_y(t)) (ỹ(t_i) − f_ỹ(t)) ]² / ( Σ_{t_i} (y(t_i) − f_y(t))² · Σ_{t_i} (ỹ(t_i) − f_ỹ(t))² )

and the smoothing term is

L_smooth(Δp_F, Δp_B) = L_smooth(Δp_F) + L_smooth(Δp_B), with L_smooth(Δp) = Σ_{t∈Ω} ‖∇Δp(t)‖²

wherein f_y(t) and f_ỹ(t) respectively denote the local average intensities of the unlabeled image y and the reconstructed image ỹ, f_y(t) = (1/l³) Σ_{t_i} y(t_i); t_i denotes the coordinates within an l³ volume around t, with l = 3; t ∈ Ω denotes all positions in Δp; and L_smooth(Δp_F, Δp_B) is expressed with the spatial gradients ‖∇Δp(t)‖² between adjacent voxels in the x, y and z directions. The adversarial loss L_adv drives the discriminator D to distinguish real images from reconstructed images while the generators learn to fool it (its published formula is reproduced only as an image).
Further, the total supervision loss of the LT-NET network model combines the adversarial, consistency, similarity and smoothing losses in a weighted sum with coefficients λ1, λ2 and λ3, where λ1 = 1, λ2 = 3 and λ3 = 10 (the published formula of the weighted sum is reproduced only as an image).
Further, the generator G_F and the generator G_B each also comprise a dual-attention module; the two streams of the twin encoder extract the correlated features of the atlas x and of the unlabeled image y (or of the reconstructed image ỹ), the correlated features are input into the dual-attention module, the dual-attention module learns the spatial information and channel information of the correlated features and transmits them to the decoder, and the decoder decodes them to obtain the forward mapping Δp_F or the backward mapping Δp_B.
Furthermore, the dual-attention module comprises a spatial attention module and a channel attention module; the spatial attention module captures spatial information in the spatial dimension, the channel attention module captures channel information in the channel dimension, and the spatial information and the channel information are added to obtain a new feature map, which is transmitted to the decoder.
Furthermore, the twin encoder is provided with 5 encoding sub-modules, namely a first encoding sub-module, a second encoding sub-module, a third encoding sub-module, a fourth encoding sub-module and a fifth encoding sub-module; the first to fourth encoding sub-modules form 1 processing stream, and the twin encoder has 2 such processing streams, which are simultaneously connected to 1 fifth encoding sub-module;
the decoder is provided with 5 decoding sub-modules, namely a first decoding sub-module, a second decoding sub-module, a third decoding sub-module, a fourth decoding sub-module and a fifth decoding sub-module; the first decoding sub-module is connected with the dual-attention module; the second decoding sub-module receives the first decoding sub-module and is long-connected with the fourth encoding sub-module of each of the 2 processing streams; the third decoding sub-module receives the second decoding sub-module and is long-connected with the third encoding sub-module of each of the 2 processing streams; the fourth decoding sub-module receives the third decoding sub-module and is long-connected with the second encoding sub-module of each of the 2 processing streams; and the fifth decoding sub-module receives the fourth decoding sub-module and outputs the forward mapping Δp_F from the atlas x to the unlabeled image y or the backward mapping Δp_B from the reconstructed image ỹ to the atlas x.
Further, each encoding sub-module is composed of stacked basic residual blocks, as in ResNet-34.
Further, the discriminator D adopts a PatchGAN discriminator.
By adopting the above technical scheme, compared with the background art, the invention has the following advantages:
1. The invention constructs an LT-NET network model in which the generator G_F and the generator G_B are each set against a discriminator D. The LT-NET network model carries supervision losses, and the supervision losses cooperate with the generator G_F and the generator G_B so that they respectively obtain the most accurate forward mapping Δp_F and backward mapping Δp_B; applying the forward mapping Δp_F and the backward mapping Δp_B through the warp operation yields the reconstructed image ỹ with its label ỹ_s and the reconstructed image x̃ with its label x̃_s. The LT-NET network model effectively improves the efficiency of image segmentation, improves the performance of unidirectional correlation learning, and obtains reconstructed images and labels with higher accuracy through learning.
2. The discriminator D adopts the PatchGAN discriminator, which can better discriminate local parts of an image: the image is divided into a number of patches, each patch is judged to be true or false, and the image-level judgment is finally obtained from these patch judgments, so its accuracy and performance are superior to those of an ordinary discriminator.
3. The supervision losses of the invention include the image consistency loss L_img_cyc(x, x̃), the transformation consistency loss L_tran_cyc(Δp_F, Δp_B), the anatomy consistency loss L_anat_cyc(x_s, x̃_s), the anatomical difference consistency loss L_diff_cyc, the adversarial loss L_adv, the similarity loss L_sim(y, ỹ) and the smoothing loss L_smooth(Δp_F, Δp_B). The LT-NET network model is supervised in different spaces through these different losses, so that it effectively improves the efficiency of image segmentation and the performance of unidirectional correlation learning.
Drawings
FIG. 1 is a schematic flow chart of the present invention;
FIG. 2 is a schematic diagram of the LT-NET network model structure of the present invention;
FIG. 3 is a schematic diagram of the operation of a twin encoder and decoder according to the present invention;
FIG. 4 is a segmentation comparison diagram of the LT-NET network model and the MABMIS network model according to the present invention;
FIG. 5 is a segmentation comparison diagram of the LT-NET network model and the PICSL-MALF network model according to the present invention;
FIG. 6 is a segmentation comparison diagram of the LT-NET network model, the VoxelMorph network model and the DataAug network model according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In the present invention, it should be noted that the terms "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc. are all based on the orientation or positional relationship shown in the drawings, and are only for convenience of describing the present invention and simplifying the description, but do not indicate or imply that the apparatus or element of the present invention must have a specific orientation, and thus, should not be construed as limiting the present invention.
Examples
As shown in FIGS. 1 to 3, the present invention discloses a one-shot brain image segmentation method based on cycle-consistent correlation, which comprises the following steps:
S1, obtaining and classifying brain anatomical structure images to obtain labeled images and an unlabeled image y, and taking a labeled image as the atlas x with its label x_s.
S2, constructing an LT-NET network model, wherein the LT-NET network model comprises a generator G_F, a generator G_B and two discriminators D, and the generator G_F and the generator G_B both comprise a twin encoder and a decoder.
S3, inputting the atlas x and the unlabeled image y into the generator G_F, and processing them through the twin encoder and decoder of the generator G_F to obtain the forward mapping Δp_F.
S4, applying the forward mapping Δp_F to the atlas x and the label x_s respectively, performing cyclic adversarial training between the discriminator D and the generator G_F in cooperation with the supervision losses, and obtaining the reconstructed image ỹ and the reconstructed label ỹ_s through the warp operation.
S5, inputting the reconstructed image ỹ and the atlas x into the generator G_B, and processing them through the twin encoder and decoder of the generator G_B to obtain the backward mapping Δp_B.
S6, applying the backward mapping Δp_B to the reconstructed image ỹ and the label ỹ_s respectively, performing cyclic adversarial training between the discriminator D and the generator G_B in cooperation with the supervision losses, and obtaining the reconstructed image x̃ and the reconstructed label x̃_s through the warp operation.
The LT-NET network model of this embodiment is constructed on the basis of a GAN network, to which a cycle structure and an adversarial idea are added, so that the generated reconstructed image ỹ is consistent with the unlabeled image y, the reconstructed image x̃ is consistent with the atlas x, and the label x̃_s is the same as the label x_s of the atlas x.
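For illustration, the warp operation used in steps S4 and S6 can be realized as spatial-transformer-style grid resampling. The following PyTorch sketch is an assumption of this rewrite (the patent publishes no code); the tensor layout (N, C, D, H, W) and the helper name warp are hypothetical:

    import torch
    import torch.nn.functional as F

    def warp(volume, flow):
        """Apply a dense displacement field (e.g. Δp_F or Δp_B) to a 3-D volume.

        volume: (N, C, D, H, W) image or one-hot label volume
        flow:   (N, 3, D, H, W) per-voxel displacements in voxel units
        """
        n, _, d, h, w = volume.shape
        # Identity sampling grid: one (z, y, x) coordinate per voxel.
        zz, yy, xx = torch.meshgrid(
            torch.arange(d, device=volume.device),
            torch.arange(h, device=volume.device),
            torch.arange(w, device=volume.device), indexing="ij")
        grid = torch.stack((zz, yy, xx)).float().unsqueeze(0)  # (1, 3, D, H, W)
        coords = grid + flow  # displaced positions t + Δp(t)
        # Normalize each axis to [-1, 1] as grid_sample expects.
        for i, size in enumerate((d, h, w)):
            coords[:, i] = 2.0 * coords[:, i] / (size - 1) - 1.0
        # Channels last, reordered from (z, y, x) to (x, y, z).
        coords = coords.permute(0, 2, 3, 4, 1)[..., [2, 1, 0]]
        return F.grid_sample(volume, coords, align_corners=True)

Applying warp to the atlas x with Δp_F yields the reconstructed image ỹ; applying it to the label x_s yields ỹ_s.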
The supervision losses include the image consistency loss L_img_cyc(x, x̃), for constraining the atlas x and the reconstructed image x̃ to be consistent; the transformation consistency loss L_tran_cyc(Δp_F, Δp_B), which requires the forward and backward correlations to be inverse functions of each other, so that applying the forward mapping Δp_F and then the backward mapping Δp_B returns to the original state; and the anatomy consistency loss L_anat_cyc(x_s, x̃_s) and anatomical difference consistency loss L_diff_cyc, for constraining, in anatomical space, the true segmentation label x_s of the atlas x to be consistent with the reconstructed label x̃_s and the anatomical differences among the labels to be consistent. The image and transformation consistency terms are expressed with a generalized Charbonnier penalty as:

L_img_cyc(x, x̃) = Σ_{t∈Ω} ρ(x(t) − x̃(t))

L_tran_cyc(Δp_F, Δp_B) = Σ_{t∈Ω} ρ(Δp_F(t) + Δp_B(t + Δp_F(t)))

wherein L_img_cyc(x, x̃) supervises the reconstructed image x̃ to be consistent with the atlas x, L_tran_cyc(Δp_F, Δp_B) supervises the forward mapping Δp_F and the backward mapping Δp_B to be inverses of each other, ρ(x) = (x² + ε²)^γ denotes the generalized Charbonnier penalty term, t denotes a voxel position, t ∈ Ω denotes all positions in Δp, ε = 0.001 and γ = 0.45; L_anat_cyc(x_s, x̃_s) and L_diff_cyc respectively supervise, in anatomical space, the consistency between the label x_s and the label x̃_s and the consistency of the anatomical differences among the labels x_s, ỹ_s and x̃_s (the published formulas of these two terms are reproduced only as images).
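As a sketch of the two consistency terms whose closed form is recoverable above (it reuses the warp helper sketched earlier; the composition used in the transformation term is an assumption consistent with the inverse-function description, not a formula confirmed by the patent):

    import torch

    def charbonnier(x, eps=1e-3, gamma=0.45):
        # Generalized Charbonnier penalty rho(x) = (x^2 + eps^2)^gamma.
        return (x ** 2 + eps ** 2) ** gamma

    def image_cycle_loss(x, x_rec):
        # L_img_cyc: the doubly-warped atlas x~ should match the atlas x.
        return charbonnier(x - x_rec).sum()

    def transformation_cycle_loss(flow_f, flow_b):
        # L_tran_cyc: composing the forward and backward mappings should give
        # the identity, i.e. Δp_F(t) + Δp_B(t + Δp_F(t)) ≈ 0.
        flow_b_at_fwd = warp(flow_b, flow_f)  # sample Δp_B at t + Δp_F(t)
        return charbonnier(flow_f + flow_b_at_fwd).sum()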
The supervision losses also include the adversarial loss L_adv, the similarity loss L_sim(y, ỹ) and the smoothing loss L_smooth(Δp_F, Δp_B). The similarity term is the negative local normalized cross-correlation

L_sim(y, ỹ) = −Σ_{t∈Ω} [ Σ_{t_i} (y(t_i) − f_y(t)) (ỹ(t_i) − f_ỹ(t)) ]² / ( Σ_{t_i} (y(t_i) − f_y(t))² · Σ_{t_i} (ỹ(t_i) − f_ỹ(t))² )

and the smoothing term is

L_smooth(Δp_F, Δp_B) = L_smooth(Δp_F) + L_smooth(Δp_B), with L_smooth(Δp) = Σ_{t∈Ω} ‖∇Δp(t)‖²

wherein f_y(t) and f_ỹ(t) respectively denote the local average intensities of the unlabeled image y and the reconstructed image ỹ, f_y(t) = (1/l³) Σ_{t_i} y(t_i); t_i denotes the coordinates within an l³ volume around t, with l = 3; t ∈ Ω denotes all positions in Δp; and L_smooth(Δp_F, Δp_B) is expressed with the spatial gradients ‖∇Δp(t)‖² between adjacent voxels in the x, y and z directions. The adversarial loss L_adv drives the discriminator D to distinguish real images from reconstructed images while the generators learn to fool it (its published formula is reproduced only as an image).
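The similarity and smoothing terms translate directly into code; the sketch below follows the local-window average (l = 3) and adjacent-voxel gradient definitions given above, assuming single-channel (N, 1, D, H, W) volumes:

    import torch
    import torch.nn.functional as F

    def local_ncc_loss(y, y_rec, win=3, eps=1e-5):
        """L_sim: negative local normalized cross-correlation over win^3 windows."""
        kernel = torch.ones(1, 1, win, win, win, device=y.device) / win ** 3
        mean = lambda v: F.conv3d(v, kernel, padding=win // 2)  # local average f(t)
        fy, fr = mean(y), mean(y_rec)
        cov = mean(y * y_rec) - fy * fr          # local covariance
        var_y = mean(y * y) - fy ** 2            # local variances
        var_r = mean(y_rec * y_rec) - fr ** 2
        cc = cov ** 2 / (var_y * var_r + eps)    # squared local correlation
        return -cc.mean()

    def smoothness_loss(flow):
        """L_smooth: squared spatial gradients of Δp along the z, y, x axes."""
        dz = flow[:, :, 1:] - flow[:, :, :-1]
        dy = flow[:, :, :, 1:] - flow[:, :, :, :-1]
        dx = flow[:, :, :, :, 1:] - flow[:, :, :, :, :-1]
        return (dz ** 2).mean() + (dy ** 2).mean() + (dx ** 2).mean()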
The total supervision loss of the LT-NET network model combines the adversarial, consistency, similarity and smoothing losses in a weighted sum with coefficients λ1, λ2 and λ3, where λ1 = 1, λ2 = 3 and λ3 = 10 (the published formula of the weighted sum is reproduced only as an image).
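For orientation, a sketch of assembling the total objective follows; the grouping of terms under λ1, λ2 and λ3 is hypothetical, since the published formula is reproduced only as an image:

    def total_supervision_loss(l_adv, l_img_cyc, l_tran_cyc, l_sim,
                               l_anat_cyc, l_diff_cyc, l_smooth,
                               lam1=1.0, lam2=3.0, lam3=10.0):
        # Hypothetical grouping: lam1 on the appearance/correspondence terms,
        # lam2 on the anatomy terms, lam3 on the smoothness regularizer.
        return (l_adv
                + lam1 * (l_img_cyc + l_tran_cyc + l_sim)
                + lam2 * (l_anat_cyc + l_diff_cyc)
                + lam3 * l_smooth)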
The purpose of this embodiment is to learn a correlation mapping that can transfer the label x_s of the atlas x to an unlabeled image y. By analogy with the occlusion problem in optical-flow estimation tasks, where forward and backward flow estimates exhibit a symmetric occlusion/non-occlusion consistency, this embodiment proposes an anatomical difference consistency loss to simply regularize the quality of the synthesized segmentation map. This loss is based on the observation that the structural differences between the atlas x and the unlabeled image y along the forward and backward paths should follow a cyclic consistency in the anatomical space.
The generator G_F and the generator G_B each also comprise a dual-attention module; the two streams of the twin encoder extract the correlated features of the atlas x and of the unlabeled image y (or of the reconstructed image ỹ), the correlated features are input into the dual-attention module, the dual-attention module learns the spatial information and channel information of the correlated features and transmits them to the decoder, and the decoder decodes them to obtain the forward mapping Δp_F or the backward mapping Δp_B.
The dual-attention module comprises a spatial attention module and a channel attention module; the spatial attention module captures spatial information in the spatial dimension, the channel attention module captures channel information in the channel dimension, and the spatial information and the channel information are added to obtain a new feature map, which is transmitted to the decoder.
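A minimal sketch of such a dual-attention module, with parallel spatial (position) and channel attention branches whose outputs are summed; the DANet-style internals below are an assumption, since the patent specifies only the two branches and their additive fusion:

    import torch
    import torch.nn as nn

    class DualAttention3d(nn.Module):
        """Parallel spatial and channel attention; channels must be >= 8.
        Full DHW x DHW attention is shown for clarity; real 3-D use would
        apply it on strongly downsampled feature maps."""
        def __init__(self, channels):
            super().__init__()
            self.query = nn.Conv3d(channels, channels // 8, 1)
            self.key = nn.Conv3d(channels, channels // 8, 1)
            self.value = nn.Conv3d(channels, channels, 1)
            self.g_pos = nn.Parameter(torch.zeros(1))  # learnable branch weights
            self.g_chn = nn.Parameter(torch.zeros(1))

        def forward(self, x):
            n, c, d, h, w = x.shape
            v = x.reshape(n, c, -1)                                # (N, C, DHW)
            # Spatial attention: every voxel attends to every other voxel.
            q = self.query(x).reshape(n, -1, d * h * w).transpose(1, 2)
            k = self.key(x).reshape(n, -1, d * h * w)
            a_pos = torch.softmax(q @ k, dim=-1)                   # (N, DHW, DHW)
            o_pos = self.value(x).reshape(n, c, -1) @ a_pos.transpose(1, 2)
            # Channel attention: every channel attends to every other channel.
            a_chn = torch.softmax(v @ v.transpose(1, 2), dim=-1)   # (N, C, C)
            o_chn = a_chn @ v
            # Sum the two branches to form the new feature map for the decoder.
            return x + (self.g_pos * o_pos + self.g_chn * o_chn).reshape(n, c, d, h, w)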
The twin encoder is provided with 5 encoding sub-modules, namely a first encoding sub-module, a second encoding sub-module, a third encoding sub-module, a fourth encoding sub-module and a fifth encoding sub-module; the first to fourth encoding sub-modules form 1 processing stream, and the twin encoder has 2 such processing streams, which are simultaneously connected to 1 fifth encoding sub-module.
The decoder is provided with 5 decoding sub-modules, namely a first decoding sub-module, a second decoding sub-module, a third decoding sub-module, a fourth decoding sub-module and a fifth decoding sub-module. The first decoding sub-module is connected with the dual-attention module; the second decoding sub-module receives the first decoding sub-module and is long-connected with the fourth encoding sub-module of each of the 2 processing streams; the third decoding sub-module receives the second decoding sub-module and is long-connected with the third encoding sub-module of each of the 2 processing streams; the fourth decoding sub-module receives the third decoding sub-module and is long-connected with the second encoding sub-module of each of the 2 processing streams; and the fifth decoding sub-module receives the fourth decoding sub-module and outputs the forward mapping Δp_F from the atlas x to the unlabeled image y or the backward mapping Δp_B from the reconstructed image ỹ to the atlas x.
Each encoding sub-module is composed of stacked basic residual blocks, as in ResNet-34.
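A basic residual block of the ResNet-34 type that the encoding sub-modules stack could look as follows; the 3-D convolutions are an assumption for volumetric brain data:

    import torch.nn as nn

    class BasicResidualBlock3d(nn.Module):
        def __init__(self, in_ch, out_ch, stride=1):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv3d(in_ch, out_ch, 3, stride, 1, bias=False),
                nn.BatchNorm3d(out_ch),
                nn.ReLU(inplace=True),
                nn.Conv3d(out_ch, out_ch, 3, 1, 1, bias=False),
                nn.BatchNorm3d(out_ch))
            # Projection shortcut only when the shape changes, as in ResNet-34.
            self.skip = (nn.Identity() if stride == 1 and in_ch == out_ch else
                         nn.Sequential(nn.Conv3d(in_ch, out_ch, 1, stride, bias=False),
                                       nn.BatchNorm3d(out_ch)))
            self.act = nn.ReLU(inplace=True)

        def forward(self, x):
            return self.act(self.body(x) + self.skip(x))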
The discriminator D adopts a PatchGAN discriminator.
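A sketch of a PatchGAN-style discriminator consistent with the description (a fully convolutional network whose output is a grid of per-patch real/fake scores rather than one image-level scalar); the layer widths and the 3-D variant are assumptions:

    import torch.nn as nn

    class PatchDiscriminator3d(nn.Module):
        def __init__(self, in_ch=1, base=64):
            super().__init__()
            layers, ch = [], in_ch
            for i, out in enumerate((base, base * 2, base * 4)):
                layers += [nn.Conv3d(ch, out, 4, stride=2, padding=1),
                           nn.InstanceNorm3d(out) if i > 0 else nn.Identity(),
                           nn.LeakyReLU(0.2, inplace=True)]
                ch = out
            # 1-channel output map: one real/fake score per receptive-field patch.
            layers += [nn.Conv3d(ch, 1, 4, stride=1, padding=1)]
            self.net = nn.Sequential(*layers)

        def forward(self, x):
            return self.net(x)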
Evaluation of experiments
As shown in FIGS. 4 to 6, the brain anatomical structure images evaluated in this experiment were collected from the Child and Adolescent NeuroDevelopment Initiative (CANDI) of the University of Massachusetts, which discloses a series of brain structure images used as the images of this experimental example, together with the MRBrainS18 data published with the MICCAI 2018 challenge.
The evaluation index adopts the Dice similarity coefficient to evaluate the segmentation accuracy of the model, measuring the similarity between the manual labeling and the prediction result:

Dice(y_s, ŷ_s) = 2|y_s ∩ ŷ_s| / (|y_s| + |ŷ_s|)

wherein y_s represents the manual annotation of the test data and ŷ_s represents the predicted segmentation.
the experimental evaluation takes the average Dice coefficient and the standard deviation of Dice as an evaluation standard, reflects the discrete degree of the prediction result of the measured data and is defined as:
Figure BDA0002750540490000093
where n denotes the number of test data, diceiThe Dice value representing the ith test datum,
Figure BDA0002750540490000094
the average Dice of all test data is shown, and the smaller the standard deviation is, the more stable the performance of the model is.
The effectiveness of the supervision losses was verified by ablation experiments, with the results shown in Table 1 below.

Table 1. Ablation comparison of the supervision losses (the table is reproduced only as an image in the original publication).
The supervision losses of the LT-NET network model introduce supervision from other spaces, so the network performance can be further improved. Using the forward correlation mapping, the segmentation map of the atlas x can be synthesized into a segmentation map of the unlabeled image y; conversely, the synthesized segmentation map can be restored into the segmentation map of the atlas x using the backward correlation mapping. The supervision losses of the LT-NET network model ensure the integrity and internal consistency of the anatomical structure and effectively improve network performance.
The LT-NET network model was compared with the MABMIS network model and the PICSL-MALF network model. The MABMIS network model consists of a tree-based groupwise registration method and an iterative groupwise segmentation method; the PICSL-MALF network model proposes a multi-atlas joint label fusion technique and a corrective learning technique to solve the problems of the traditional voting-based multi-atlas label fusion strategy. In this experimental example, 2 to 5 atlases are used to verify the effect of the MABMIS network model and the PICSL-MALF network model, and the results are compared with the LT-NET network model trained in the one-shot mode; the comparison results are shown in Table 2 below.

Table 2. Comparison of the LT-NET network model with the MABMIS and PICSL-MALF network models (the table is reproduced only as an image in the original publication).
Using only 1 atlas, the LT-NET network model is far superior to the MABMIS network model and the PICSL-MALF network model using 5 atlases. Scaling the two baselines further was abandoned because of their excessive runtime: MABMIS needs about 14 minutes to segment one case in its test environment, and PICSL-MALF needs 3 minutes on a single Tesla P40 GPU, while the LT-NET network model needs only 4 seconds on a single Tesla P40 GPU.
The segmentation results of the LT-NET network model and the MABMIS network model are visualized in FIG. 4, in which the last four columns represent the segmentation results of the MABMIS network model with 2 to 5 atlases. It can be seen that the segmentation of the MABMIS network model improves as the number of atlases increases, but compared with the LT-NET network model its segmentation is still poorer, with serious over-segmentation and omission.
The segmentation results of the LT-NET network model and the PICSL-MALF network model are visualized in FIG. 5, in which the last four columns represent the segmentation results of the PICSL-MALF network model with 2 to 5 atlases. The edge segmentation of the PICSL-MALF network model is superior to that of the MABMIS network model, but over-segmentation still exists. In conclusion, the one-shot LT-NET network model learns a more accurate correlation in the brain anatomical structure segmentation task and therefore obtains accurate segmentation results.
The LT-NET network model was compared with the VoxelMorph network model and the DataAug network model; the comparative segmentation results are shown in Table 3 below.

Table 3. Comparison of the LT-NET network model with the VoxelMorph and DataAug network models (the table is reproduced only as an image in the original publication).
Using only one labeled datum, the LT-NET network model achieves an average Dice of 82.3%, which is 6.3% and 1.9% higher than the VoxelMorph network model and the DataAug network model, respectively. In addition, compared with the DataAug network model, the LT-NET network model uses an end-to-end training mode in which only one network needs to be trained, whereas the DataAug network model needs to train several networks, making its procedure complex. In summary, the LT-NET network model is both simple and effective in the brain anatomical structure segmentation task.
The segmentation results of the LT-NET network model, the VoxelMorph network model and the DataAug network model are visualized in FIG. 6, which shows that the LT-NET network model better segments detail and edge information.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (9)

1. A one-shot brain image segmentation method based on cycle-consistent correlation, characterized by comprising the following steps:
S1, obtaining and classifying brain anatomical structure images to obtain labeled images and an unlabeled image y, and taking a labeled image as the atlas x with its label x_s;
S2, constructing an LT-NET network model, wherein the LT-NET network model comprises a generator G_F, a generator G_B and two discriminators D, and the generator G_F and the generator G_B both comprise a twin encoder and a decoder;
S3, inputting the atlas x and the unlabeled image y into the generator G_F, and processing them through the twin encoder and decoder of the generator G_F to obtain the forward mapping Δp_F;
S4, applying the forward mapping Δp_F to the atlas x and the label x_s respectively, performing cyclic adversarial training between the discriminator D and the generator G_F in cooperation with the supervision losses, and obtaining the reconstructed image ỹ and the reconstructed label ỹ_s through the warp operation;
S5, inputting the reconstructed image ỹ and the atlas x into the generator G_B, and processing them through the twin encoder and decoder of the generator G_B to obtain the backward mapping Δp_B;
S6, applying the backward mapping Δp_B to the reconstructed image ỹ and the label ỹ_s respectively, performing cyclic adversarial training between the discriminator D and the generator G_B in cooperation with the supervision losses, and obtaining the reconstructed image x̃ and the reconstructed label x̃_s through the warp operation.
2. The one-shot brain image segmentation method based on cycle-consistent correlation as claimed in claim 1, wherein: the supervision losses include an image consistency loss L_img_cyc(x, x̃), a transformation consistency loss L_tran_cyc(Δp_F, Δp_B), an anatomy consistency loss L_anat_cyc(x_s, x̃_s) and an anatomical difference consistency loss L_diff_cyc; the image and transformation consistency terms are expressed with a generalized Charbonnier penalty as:

L_img_cyc(x, x̃) = Σ_{t∈Ω} ρ(x(t) − x̃(t))

L_tran_cyc(Δp_F, Δp_B) = Σ_{t∈Ω} ρ(Δp_F(t) + Δp_B(t + Δp_F(t)))

wherein L_img_cyc(x, x̃) supervises the reconstructed image x̃ to be consistent with the atlas x, L_tran_cyc(Δp_F, Δp_B) supervises the forward mapping Δp_F and the backward mapping Δp_B to be inverses of each other, ρ(x) = (x² + ε²)^γ represents the generalized Charbonnier penalty, t represents a voxel position, ε = 0.001 and γ = 0.45; and L_anat_cyc(x_s, x̃_s) and L_diff_cyc respectively supervise, in anatomical space, the consistency between the label x_s and the label x̃_s and the consistency of the anatomical differences among the labels x_s, ỹ_s and x̃_s (the published formulas of these two terms are reproduced only as images).
3. The one-shot brain image segmentation method based on cycle-consistent correlation as claimed in claim 2, wherein: the supervision losses also include an adversarial loss L_adv, a similarity loss L_sim(y, ỹ) and a smoothing loss L_smooth(Δp_F, Δp_B); the similarity term is the negative local normalized cross-correlation

L_sim(y, ỹ) = −Σ_{t∈Ω} [ Σ_{t_i} (y(t_i) − f_y(t)) (ỹ(t_i) − f_ỹ(t)) ]² / ( Σ_{t_i} (y(t_i) − f_y(t))² · Σ_{t_i} (ỹ(t_i) − f_ỹ(t))² )

and the smoothing term is

L_smooth(Δp_F, Δp_B) = L_smooth(Δp_F) + L_smooth(Δp_B), with L_smooth(Δp) = Σ_{t∈Ω} ‖∇Δp(t)‖²

wherein f_y(t) and f_ỹ(t) respectively represent the local average intensities of the unlabeled image y and the reconstructed image ỹ, f_y(t) = (1/l³) Σ_{t_i} y(t_i); t_i denotes the coordinates within an l³ volume around t, with l = 3; t ∈ Ω represents all positions in Δp; and L_smooth(Δp_F, Δp_B) is expressed with the spatial gradients ‖∇Δp(t)‖² between adjacent voxels in the x, y and z directions (the published formula of the adversarial loss is reproduced only as an image).
4. The one-shot brain image segmentation method based on cycle-consistent correlation as claimed in claim 2 or 3, wherein: the total supervision loss of the LT-NET network model combines the adversarial, consistency, similarity and smoothing losses in a weighted sum with coefficients λ1, λ2 and λ3, where λ1 = 1, λ2 = 3 and λ3 = 10 (the published formula of the weighted sum is reproduced only as an image).
5. The one-shot brain image segmentation method based on cycle-consistent correlation as claimed in claim 1, wherein: the generator G_F and the generator G_B each also comprise a dual-attention module; the two streams of the twin encoder extract the correlated features of the atlas x and of the unlabeled image y (or of the reconstructed image ỹ), the correlated features are input into the dual-attention module, the dual-attention module learns the spatial information and channel information of the correlated features and transmits them to the decoder, and the decoder decodes them to obtain the forward mapping Δp_F or the backward mapping Δp_B.
6. The one-shot brain image segmentation method based on cycle-consistent correlation as claimed in claim 5, wherein: the dual-attention module comprises a spatial attention module and a channel attention module; the spatial attention module captures spatial information in the spatial dimension, the channel attention module captures channel information in the channel dimension, and the spatial information and the channel information are added to obtain a new feature map, which is transmitted to the decoder.
7. The one-shot brain image segmentation method based on cycle-consistent correlation as claimed in claim 5, wherein: the twin encoder is provided with 5 encoding sub-modules, namely a first encoding sub-module, a second encoding sub-module, a third encoding sub-module, a fourth encoding sub-module and a fifth encoding sub-module; the first to fourth encoding sub-modules form 1 processing stream, and the twin encoder has 2 such processing streams, which are simultaneously connected to 1 fifth encoding sub-module;
the decoder is provided with 5 decoding sub-modules, namely a first decoding sub-module, a second decoding sub-module, a third decoding sub-module, a fourth decoding sub-module and a fifth decoding sub-module; the first decoding sub-module is connected with the dual-attention module; the second decoding sub-module receives the first decoding sub-module and is long-connected with the fourth encoding sub-module of each of the 2 processing streams; the third decoding sub-module receives the second decoding sub-module and is long-connected with the third encoding sub-module of each of the 2 processing streams; the fourth decoding sub-module receives the third decoding sub-module and is long-connected with the second encoding sub-module of each of the 2 processing streams; and the fifth decoding sub-module receives the fourth decoding sub-module and outputs the forward mapping Δp_F from the atlas x to the unlabeled image y or the backward mapping Δp_B from the reconstructed image ỹ to the atlas x.
8. The one-shot brain image segmentation method based on cycle-consistent correlation as claimed in claim 7, wherein: each encoding sub-module is composed of stacked basic residual blocks, as in ResNet-34.
9. The one-shot brain image segmentation method based on cycle-consistent correlation as claimed in claim 1, wherein: the discriminator D adopts a PatchGAN discriminator.
CN202011182378.XA 2020-10-29 2020-10-29 One-shot brain image segmentation method based on cycle-consistent correlation Active CN112308833B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011182378.XA CN112308833B (en) 2020-10-29 2020-10-29 One-shot brain image segmentation method based on cycle-consistent correlation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011182378.XA CN112308833B (en) 2020-10-29 2020-10-29 One-shot brain image segmentation method based on cycle-consistent correlation

Publications (2)

Publication Number Publication Date
CN112308833A 2021-02-02
CN112308833B 2022-09-13

Family

ID: 74331740

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011182378.XA Active CN112308833B (en) 2020-10-29 2020-10-29 One-shot brain image segmentation method based on cycle-consistent correlation

Country Status (1)

Country Link
CN (1) CN112308833B (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108062753A (en) * 2017-12-29 2018-05-22 重庆理工大学 Unsupervised domain-adaptive brain tumor semantic segmentation method based on deep adversarial learning
CN109559332A (en) * 2018-10-31 2019-04-02 浙江工业大学 Gaze tracking method combining bidirectional LSTM and Itracker
CN110288609A (en) * 2019-05-30 2019-09-27 南京师范大学 Attention-mechanism-guided multi-modal whole-heart image segmentation method
CN111047594A (en) * 2019-11-06 2020-04-21 安徽医科大学 Tumor MRI weakly supervised learning analysis modeling method and model thereof
CN111402259A (en) * 2020-03-23 2020-07-10 杭州健培科技有限公司 Brain tumor segmentation method based on multi-level structure relation learning network

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
GUHA BALAKRISHNAN ET AL.: "VoxelMorph: A Learning Framework for Deformable Medical Image Registration", 《IEEE TRANSACTIONS ON MEDICAL IMAGING》 *
SHUXIN WANG ET AL.: "Conquering Data Variations in Resolution: A Slice-Aware Multi-Branch Decoder Network", 《IEEE TRANSACTIONS ON MEDICAL IMAGING》 *
SHUXIN WANG ET AL.: "LT-Net: Label Transfer by Learning Reversible Voxel-Wise Correspondence for One-Shot Medical Image Segmentation", 《2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)》 *
DU ZHENLONG ET AL.: "Image Style Transfer Based on Improved CycleGAN", 《OPTICS AND PRECISION ENGINEERING》 *
GONG RONGLIN ET AL.: "Hybrid-Supervised Dual-Channel Feedback U-Net for Breast Ultrasound Image Segmentation", 《JOURNAL OF IMAGE AND GRAPHICS》 *

Also Published As

Publication number Publication date
CN112308833B (en) 2022-09-13

Similar Documents

Publication Publication Date Title
Yang et al. Unsupervised learning of geometry with edge-aware depth-normal consistency
Jiang et al. Disentangled human body embedding based on deep hierarchical neural network
CN111814238A (en) BIM real-time imaging method for breeding house based on artificial intelligence and mixed cloud reasoning
CN111062326A (en) Self-supervision human body 3D posture estimation network training method based on geometric drive
CN106649886A (en) Method for searching for images by utilizing depth monitoring hash of triple label
CN112598053A (en) Active significance target detection method based on semi-supervised learning
CN112489092A (en) Fine-grained industrial motion mode classification method, storage medium, equipment and device
CN111914618A (en) Three-dimensional human body posture estimation method based on countermeasure type relative depth constraint network
CN112308833B (en) One-shot brain image segmentation method based on circular consistent correlation
Zhou et al. Modal evaluation network via knowledge distillation for no-service rail surface defect detection
CN116977714A (en) Image classification method, apparatus, device, storage medium, and program product
CN113192186B (en) 3D human body posture estimation model establishing method based on single-frame image and application thereof
Zhang et al. Multi-modal attention guided real-time lane detection
Guo Analysis of artificial intelligence technology and its application in improving the effectiveness of physical education teaching
CN110738099A (en) low-resolution pedestrian re-identification method based on self-adaptive double-branch network
CN112348786B (en) One-shot brain image segmentation method based on bidirectional correlation
CN113255701B (en) Small sample learning method and system based on absolute-relative learning framework
CN113961737A (en) Search model training method and device, search method and device, electronic device and storage medium
Xia et al. GCENet: Global contextual exploration network for RGB-D salient object detection
Ding et al. Improving the generalization of network based relative pose regression: dimension reduction as a regularizer
CN117649422B (en) Training method of multi-modal image segmentation model and multi-modal image segmentation method
CN112308834A (en) One-shot brain image segmentation method based on forward correlation
Paudel et al. Landscape image season transfer using Generative Adversarial Networks
Li An Application Comparison of GAN-based Image Translation Methods
CN116311365A (en) Human skeleton generation method oriented to behavior recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant