CN114529562A - Medical image segmentation method based on auxiliary learning task and re-segmentation constraint - Google Patents

Medical image segmentation method based on auxiliary learning task and re-segmentation constraint

Info

Publication number
CN114529562A
Authority
CN
China
Prior art keywords
layer
convolution
segmentation
block
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210162154.5A
Other languages
Chinese (zh)
Inventor
屈磊
周文琼
吴军
欧阳磊
陶在洋
尚宏伟
赵婧雨
洪思成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University
Original Assignee
Anhui University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University
Priority to CN202210162154.5A
Publication of CN114529562A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • G06T2207/10012Stereo images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10088Magnetic resonance imaging [MRI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20112Image segmentation details
    • G06T2207/20132Image cropping
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30016Brain
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30096Tumor; Lesion

Abstract

The invention relates to a medical image segmentation method based on an auxiliary learning task and re-segmentation constraint, which comprises the following steps in sequence: (1) preprocessing three-dimensional human brain nuclear magnetic resonance data to obtain a training set and a test set; (2) constructing a segmentation network based on an auxiliary learning task and re-segmentation constraint; (3) inputting the training set into the segmentation network for training to obtain a trained segmentation network; (4) inputting the test set into the trained segmentation network and outputting the segmentation result. By introducing an additional image reconstruction task branch, the method helps the segmentation network learn complementary medical image features, which helps the model better understand the internal structure of medical images; the reconstruction result is fed into the segmentation network again, the resulting re-segmentation result is compared with the ground-truth segmentation map, and an additional supervision signal is provided at the semantic level for training the segmentation network, further improving the accuracy of the image segmentation results.

Description

Medical image segmentation method based on auxiliary learning task and re-segmentation constraint
Technical Field
The invention relates to the technical field of medical image segmentation, in particular to a medical image segmentation method based on an auxiliary learning task and re-segmentation constraint.
Background
In recent years, with the rapid development of artificial intelligence technology, computer vision has achieved very high recognition performance on natural images and has also received wide attention in the field of medical image segmentation. Generally, the purpose of segmenting a medical image is to make a human tissue structure or a pathological structure clearer and more intuitive, or to model the relevant tissues from the segmentation result for subsequent computer-aided diagnosis. However, medical image data differ somewhat from natural images: in addition to two-dimensional data, image data based on MRI or CT are generally three-dimensional volumes covering the entire scanned organ. In terms of image content, the boundaries of objects in natural images are relatively distinct, whereas medical images depict human tissue structures acquired by specialized imaging instruments, in which tissue edge contours may be insufficiently clear and intensity variations may be complex.
Currently, with the rapid iteration of deep learning algorithms, researchers have made a series of improvements to natural image segmentation models and applied them to the field of medical image segmentation; compared with traditional medical image segmentation methods, the segmentation accuracy is significantly improved. As a result, traditional medical image segmentation approaches are gradually being replaced by deep learning approaches. Deep learning does not require features to be extracted manually as in traditional methods and avoids the variability introduced by hand-crafted prior knowledge, and it has therefore shown excellent performance in the field of medical image segmentation. Against the background of growing demand for intelligent medical tasks, existing deep-learning-based medical image segmentation methods usually require training on large-scale labeled data, while medical image datasets are much smaller in scale than general-purpose datasets, so existing medical image segmentation models often struggle to fully extract discriminative features for segmentation. These limitations mean that existing deep-learning-based medical image segmentation work still has room for improvement in segmentation accuracy.
Disclosure of Invention
The invention aims to provide a medical image segmentation method based on an auxiliary learning task and re-segmentation constraint, which improves the segmentation accuracy of the main segmentation task by constructing an auxiliary image reconstruction task, and further constrains the network during the model training stage by segmenting the reconstructed image a second time, so as to further improve the accuracy of the segmentation results.
In order to achieve the purpose, the invention adopts the following technical scheme: a medical image segmentation method based on auxiliary learning task and re-segmentation constraint comprises the following steps:
(1) preprocessing three-dimensional human brain nuclear magnetic resonance data to obtain a training set and a test set;
(2) constructing a segmentation network based on an auxiliary learning task and re-segmentation constraints;
(3) inputting the training set into a segmentation network for training to obtain a trained segmentation network;
(4) and inputting the test set into the trained segmentation network, and outputting the segmentation network to obtain a segmentation result.
The step (1) specifically comprises the following steps:
(2a) the three-dimensional human brain nuclear magnetic resonance data comprise four modalities: T1, T1c, T2 and FLAIR; the three-dimensional human brain nuclear magnetic resonance data of the four modalities are combined, the original size of each modality being 240 × 240 × 155, to generate four-channel three-dimensional data of size 4 × 240 × 240 × 155, where 4 represents the number of modalities, 155 represents the number of two-dimensional slices contained in each three-dimensional human brain nuclear magnetic resonance volume, and 240 × 240 represents the height and width of the image, respectively;
(2b) converting the merged three-dimensional human brain image data from nii format to numpy format;
(2c) carrying out normalization processing on the converted data by adopting a zero-mean normalization method;
(2d) randomly dividing the normalized images into a training set and a test set at a ratio of 7:3;
(2e) randomly cropping the training set, resulting in training set data of size 4 × 128 × 128 × 128.
In step (2), the segmentation network comprises a first coding module, a second coding module, a first decoding module, a second decoding module and a third decoding module;
the first coding module and the second coding module each consist of four convolution blocks and three maximum pooling downsampling layers, the four convolution blocks being a first convolution block, a second convolution block, a third convolution block and a fourth convolution block; the first convolution block comprises a first convolution layer, a first batch normalization layer, a first rectified linear unit (ReLU) activation layer, a second convolution layer, a second batch normalization layer and a second ReLU activation layer; the second convolution block comprises a third convolution layer, a first batch normalization layer, a first ReLU activation layer, a fourth convolution layer, a second batch normalization layer and a second ReLU activation layer; the third convolution block comprises a fifth convolution layer, a first batch normalization layer, a first ReLU activation layer, a sixth convolution layer, a second batch normalization layer and a second ReLU activation layer; the fourth convolution block comprises a seventh convolution layer, a first batch normalization layer, a first ReLU activation layer, an eighth convolution layer, a second batch normalization layer and a second ReLU activation layer; the three maximum pooling downsampling layers are a first maximum pooling downsampling layer, a second maximum pooling downsampling layer and a third maximum pooling downsampling layer;
the first decoding module, the second decoding module and the third decoding module each consist of three deconvolution blocks and three upsampling layers, the three deconvolution blocks being a first deconvolution block, a second deconvolution block and a third deconvolution block; the first deconvolution block comprises a ninth convolution layer, a third batch normalization layer, a third ReLU activation layer, a tenth convolution layer, a fourth batch normalization layer and a fourth ReLU activation layer; the second deconvolution block comprises an eleventh convolution layer, a third batch normalization layer, a third ReLU activation layer, a twelfth convolution layer, a fourth batch normalization layer and a fourth ReLU activation layer; the third deconvolution block comprises a thirteenth convolution layer, a third batch normalization layer, a third ReLU activation layer, a fourteenth convolution layer, a fourth batch normalization layer, a fourth ReLU activation layer and a fifteenth convolution layer; the three upsampling layers are a first upsampling layer, a second upsampling layer and a third upsampling layer.
The step (3) specifically comprises the following steps:
(3a) the training set is input into the first coding module sequentially in batches, and the first coding module encodes the input data to obtain a first feature map;
(3b) inputting the first feature map into the first decoding module and the second decoding module in parallel to carry out forward propagation of the segmentation network, wherein the first decoding module outputs a reconstruction result and the second decoding module outputs a segmentation result;
(3c) inputting the reconstruction result into the second coding module to obtain a second feature map;
(3d) inputting the second feature map into the third decoding module to carry out forward propagation of the network and obtain a re-segmentation result;
(3e) comparing the segmentation result with the corresponding ground-truth segmentation map and calculating the segmentation loss with the Dice loss function; comparing the re-segmentation result with the corresponding ground-truth segmentation map and calculating the re-segmentation loss with the Dice loss function, wherein the Dice loss function is calculated as:

L_Dice = 1 - 2|X ∩ Y| / (|X| + |Y|)

where X is the ground-truth segmentation map, and Y is the segmentation result when calculating the segmentation loss and the re-segmentation result when calculating the re-segmentation loss; comparing the reconstruction result obtained in step (3b) with the training set data to be segmented that was input into the segmentation network, and calculating the reconstruction loss with a cross-entropy loss function;
(3f) weighting and summing the segmentation loss, the re-segmentation loss and the reconstruction loss obtained in the step (3e) to obtain a total loss result, and performing back propagation to train the segmentation network by using a gradient descent algorithm;
(3g) and obtaining the trained segmentation network after the number of training iterations of the segmentation network reaches the preset number of training iterations.
The convolution kernel size of the first convolution layer is 3 × 3 × 3, and the number of convolution kernels is 32; the convolution kernel size of the second convolution layer is 3 × 3 × 3, and the number of convolution kernels is 64; the convolution kernel size of the third convolution layer is 3 × 3 × 3, and the number of convolution kernels is 64; the convolution kernel size of the fourth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 128; the convolution kernel size of the fifth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 128; the convolution kernel size of the sixth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 256; the convolution kernel size of the seventh convolution layer is 3 × 3 × 3, and the number of convolution kernels is 256; the convolution kernel size of the eighth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 512;
the first maximum pooling downsampling layer, the second maximum pooling downsampling layer and the third maximum pooling downsampling layer all have a size of 2 × 2 × 2;
the convolution kernel size of the ninth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 256; the convolution kernel size of the tenth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 256; the convolution kernel size of the eleventh convolution layer is 3 × 3 × 3, and the number of convolution kernels is 128; the convolution kernel size of the twelfth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 128; the convolution kernel size of the thirteenth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 64; the convolution kernel size of the fourteenth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 64; the convolution kernel size of the fifteenth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 4.
The first convolution block of the first coding module serves as the input of the segmentation network. Within the first coding module, the output of the first convolution block is fed to the first maximum pooling downsampling layer, whose output is fed to the second convolution block; the output of the second convolution block is fed to the second maximum pooling downsampling layer, whose output is fed to the third convolution block; the output of the third convolution block is fed to the third maximum pooling downsampling layer, whose output is fed to the fourth convolution block. The output of the fourth convolution block of the first coding module is fed in parallel to the first upsampling layer of the first decoding module and the first upsampling layer of the second decoding module. The output of the first upsampling layer of the first decoding module is concatenated with the output of the third convolution block of the first coding module to obtain a first concatenation result, and the output of the first upsampling layer of the second decoding module is concatenated with the output of the third convolution block of the first coding module to obtain a second concatenation result; the first concatenation result is fed to the first deconvolution block of the first decoding module, and the second concatenation result is fed to the first deconvolution block of the second decoding module. The output of the first deconvolution block of the first decoding module is fed to the second upsampling layer of the first decoding module, and the output of the first deconvolution block of the second decoding module is fed to the second upsampling layer of the second decoding module. The output of the second upsampling layer of the first decoding module is concatenated with the output of the second convolution block of the first coding module to obtain a third concatenation result, and the output of the second upsampling layer of the second decoding module is concatenated with the output of the second convolution block of the first coding module to obtain a fourth concatenation result; the third concatenation result is fed to the second deconvolution block of the first decoding module, and the fourth concatenation result is fed to the second deconvolution block of the second decoding module. The output of the second deconvolution block of the first decoding module is fed to the third upsampling layer of the first decoding module, and the output of the second deconvolution block of the second decoding module is fed to the third upsampling layer of the second decoding module. The output of the third upsampling layer of the first decoding module is concatenated with the output of the first convolution block of the first coding module to obtain a fifth concatenation result, and the output of the third upsampling layer of the second decoding module is concatenated with the output of the first convolution block of the first coding module to obtain a sixth concatenation result; the fifth concatenation result is fed to the third deconvolution block of the first decoding module, and the sixth concatenation result is fed to the third deconvolution block of the second decoding module. The first decoding module outputs the reconstruction result, and the second decoding module outputs the segmentation result.
The reconstruction result is fed to the first convolution block of the second coding module. Within the second coding module, the output of the first convolution block is fed to the first maximum pooling downsampling layer, whose output is fed to the second convolution block; the output of the second convolution block is fed to the second maximum pooling downsampling layer, whose output is fed to the third convolution block; the output of the third convolution block is fed to the third maximum pooling downsampling layer, whose output is fed to the fourth convolution block. The output of the fourth convolution block of the second coding module is fed to the first upsampling layer of the third decoding module. The output of the first upsampling layer of the third decoding module is concatenated with the output of the third convolution block of the second coding module, and the concatenation result is fed to the first deconvolution block of the third decoding module. The output of the first deconvolution block of the third decoding module is fed to the second upsampling layer of the third decoding module, whose output is concatenated with the output of the second convolution block of the second coding module; this concatenation result is fed to the second deconvolution block of the third decoding module. The output of the second deconvolution block of the third decoding module is fed to the third upsampling layer of the third decoding module, whose output is concatenated with the output of the first convolution block of the second coding module; this concatenation result is fed to the third deconvolution block of the third decoding module, which outputs the re-segmentation result.
According to the above technical scheme, the beneficial effects of the invention are as follows: first, by introducing an additional image reconstruction task branch, the method helps the segmentation network learn complementary medical image features, thereby helping the model better understand the internal structure of medical images; second, the reconstruction result is fed into the segmentation network again, the resulting re-segmentation result is compared with the ground-truth segmentation map, and an additional supervision signal is provided at the semantic level for training the segmentation network, thereby further improving the accuracy of the image segmentation results.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
fig. 2 is a schematic structural diagram of a segmentation network according to the present invention.
Detailed Description
As shown in fig. 1, a medical image segmentation method based on an auxiliary learning task and re-segmentation constraint includes the following steps:
(1) preprocessing three-dimensional human brain nuclear magnetic resonance data to obtain a training set and a test set;
(2) constructing a segmentation network based on an auxiliary learning task and re-segmentation constraints;
(3) inputting the training set into a segmentation network for training to obtain a trained segmentation network;
(4) and inputting the test set into the trained segmentation network, and outputting the segmentation network to obtain a segmentation result.
The step (1) specifically comprises the following steps:
(2a) the three-dimensional human brain nuclear magnetic resonance data comprise four modalities: T1, T1c, T2 and FLAIR; the three-dimensional human brain nuclear magnetic resonance data of the four modalities are combined, the original size of each modality being 240 × 240 × 155, to generate four-channel three-dimensional data of size 4 × 240 × 240 × 155, where 4 represents the number of modalities, 155 represents the number of two-dimensional slices contained in each three-dimensional human brain nuclear magnetic resonance volume, and 240 × 240 represents the height and width of the image, respectively;
(2b) converting the merged three-dimensional human brain image data from nii format to numpy format;
(2c) carrying out normalization processing on the converted data by adopting a zero-mean normalization method;
(2d) randomly dividing the normalized images into a training set and a test set at a ratio of 7:3;
(2e) randomly cropping the training set, resulting in training set data of size 4 × 128 × 128 × 128.
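The preprocessing steps (2a)-(2e) above can be illustrated with the following minimal sketch. The nibabel and numpy libraries, the BraTS-style file suffixes and directory layout, the per-modality non-zero-voxel statistics, and the cubic 128-voxel crop are assumptions rather than details given in this description.

```python
# Illustrative sketch of steps (2a)-(2e); not the patent's implementation.
import glob
import os

import nibabel as nib
import numpy as np


def load_case(case_dir):
    """(2a)+(2b): stack the four modalities into a 4 x 240 x 240 x 155 numpy array."""
    suffixes = ["t1", "t1ce", "t2", "flair"]  # assumed BraTS file suffixes
    volumes = []
    for s in suffixes:
        path = glob.glob(os.path.join(case_dir, f"*_{s}.nii.gz"))[0]
        volumes.append(nib.load(path).get_fdata().astype(np.float32))
    return np.stack(volumes, axis=0)


def z_score_normalize(volume):
    """(2c): zero-mean normalization, here applied per modality over non-zero voxels."""
    out = np.zeros_like(volume)
    for c in range(volume.shape[0]):
        channel = volume[c]
        mask = channel > 0
        mean, std = channel[mask].mean(), channel[mask].std() + 1e-8
        out[c][mask] = (channel[mask] - mean) / std
    return out


def random_crop(volume, label, size=(128, 128, 128)):
    """(2e): random crop of image and label to 4 x 128 x 128 x 128."""
    _, h, w, d = volume.shape
    y = np.random.randint(0, h - size[0] + 1)
    x = np.random.randint(0, w - size[1] + 1)
    z = np.random.randint(0, d - size[2] + 1)
    sl = (slice(y, y + size[0]), slice(x, x + size[1]), slice(z, z + size[2]))
    return volume[(slice(None),) + sl], label[sl]


# (2d): random 7:3 split of the cases into a training set and a test set
cases = sorted(glob.glob("BraTS2018/*/"))  # assumed dataset location
np.random.shuffle(cases)
split = int(0.7 * len(cases))
train_cases, test_cases = cases[:split], cases[split:]
```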
In step (2), as shown in fig. 2, the segmentation network comprises a first coding module, a second coding module, a first decoding module, a second decoding module and a third decoding module;
the first coding module and the second coding module each consist of four convolution blocks and three maximum pooling downsampling layers, the four convolution blocks being a first convolution block, a second convolution block, a third convolution block and a fourth convolution block; the first convolution block comprises a first convolution layer, a first batch normalization layer, a first rectified linear unit (ReLU) activation layer, a second convolution layer, a second batch normalization layer and a second ReLU activation layer; the second convolution block comprises a third convolution layer, a first batch normalization layer, a first ReLU activation layer, a fourth convolution layer, a second batch normalization layer and a second ReLU activation layer; the third convolution block comprises a fifth convolution layer, a first batch normalization layer, a first ReLU activation layer, a sixth convolution layer, a second batch normalization layer and a second ReLU activation layer; the fourth convolution block comprises a seventh convolution layer, a first batch normalization layer, a first ReLU activation layer, an eighth convolution layer, a second batch normalization layer and a second ReLU activation layer; the three maximum pooling downsampling layers are a first maximum pooling downsampling layer, a second maximum pooling downsampling layer and a third maximum pooling downsampling layer;
the first decoding module, the second decoding module and the third decoding module each consist of three deconvolution blocks and three upsampling layers, the three deconvolution blocks being a first deconvolution block, a second deconvolution block and a third deconvolution block; the first deconvolution block comprises a ninth convolution layer, a third batch normalization layer, a third ReLU activation layer, a tenth convolution layer, a fourth batch normalization layer and a fourth ReLU activation layer; the second deconvolution block comprises an eleventh convolution layer, a third batch normalization layer, a third ReLU activation layer, a twelfth convolution layer, a fourth batch normalization layer and a fourth ReLU activation layer; the third deconvolution block comprises a thirteenth convolution layer, a third batch normalization layer, a third ReLU activation layer, a fourteenth convolution layer, a fourth batch normalization layer, a fourth ReLU activation layer and a fifteenth convolution layer; the three upsampling layers are a first upsampling layer, a second upsampling layer and a third upsampling layer.
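The coding modules described above can be sketched in PyTorch as follows. Only a minimal sketch: the class names, the padding choice, and the grouping into a reusable ConvBlock are illustrative assumptions; the layer types, the 3 × 3 × 3 kernels, the 2 × 2 × 2 max pooling and the channel widths (32/64, 64/128, 128/256, 256/512) follow the description.

```python
# Hedged PyTorch sketch of one coding module (encoder); not the patent's code.
import torch
import torch.nn as nn


class ConvBlock(nn.Module):
    """Two (Conv3d -> BatchNorm3d -> ReLU) stages, as in each convolution block."""

    def __init__(self, in_ch, mid_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv3d(in_ch, mid_ch, kernel_size=3, padding=1),
            nn.BatchNorm3d(mid_ch),
            nn.ReLU(inplace=True),
            nn.Conv3d(mid_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm3d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)


class Encoder(nn.Module):
    """Four convolution blocks separated by three 2x2x2 max-pooling layers."""

    def __init__(self, in_ch=4):
        super().__init__()
        self.block1 = ConvBlock(in_ch, 32, 64)
        self.block2 = ConvBlock(64, 64, 128)
        self.block3 = ConvBlock(128, 128, 256)
        self.block4 = ConvBlock(256, 256, 512)
        self.pool = nn.MaxPool3d(kernel_size=2)

    def forward(self, x):
        f1 = self.block1(x)
        f2 = self.block2(self.pool(f1))
        f3 = self.block3(self.pool(f2))
        f4 = self.block4(self.pool(f3))
        return f1, f2, f3, f4  # skip features plus the bottleneck feature map
```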
The step (3) specifically comprises the following steps:
(3a) the training set is input into the first coding module sequentially in batches, and the first coding module encodes the input data to obtain a first feature map;
(3b) inputting the first feature map into the first decoding module and the second decoding module in parallel to carry out forward propagation of the segmentation network, wherein the first decoding module outputs a reconstruction result and the second decoding module outputs a segmentation result;
(3c) inputting the reconstruction result into the second coding module to obtain a second feature map;
(3d) inputting the second feature map into the third decoding module to carry out forward propagation of the network and obtain a re-segmentation result;
(3e) comparing the segmentation result with the corresponding ground-truth segmentation map and calculating the segmentation loss with the Dice loss function; comparing the re-segmentation result with the corresponding ground-truth segmentation map and calculating the re-segmentation loss with the Dice loss function, wherein the Dice loss function is calculated as:

L_Dice = 1 - 2|X ∩ Y| / (|X| + |Y|)

where X is the ground-truth segmentation map, and Y is the segmentation result when calculating the segmentation loss and the re-segmentation result when calculating the re-segmentation loss; comparing the reconstruction result obtained in step (3b) with the training set data to be segmented that was input into the segmentation network, and calculating the reconstruction loss with a cross-entropy loss function;
(3f) weighting and summing the segmentation loss, the re-segmentation loss and the reconstruction loss obtained in the step (3e) to obtain a total loss result, and performing back propagation to train the segmentation network by using a gradient descent algorithm;
(3g) and obtaining the trained segmentation network after the number of training iterations of the segmentation network reaches the preset number of training iterations.
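A hedged sketch of one training iteration covering steps (3a)-(3g) is given below. The Dice smoothing term, the loss weights, the optimizer, and the reading of the cross-entropy reconstruction loss (computed here against the input rescaled to [0, 1]) are assumptions, not details specified in the description; the module arguments follow the Encoder/Decoder sketches in this section.

```python
# Hedged sketch of one training step; the loss weighting and reconstruction
# target are plausible interpretations, not the patent's stated settings.
import torch
import torch.nn.functional as F


def dice_loss(logits, target_onehot, eps=1e-5):
    """Dice loss: 1 - 2|X ∩ Y| / (|X| + |Y|), with a small smoothing term."""
    probs = torch.softmax(logits, dim=1)
    intersection = (probs * target_onehot).sum()
    return 1.0 - (2.0 * intersection + eps) / (probs.sum() + target_onehot.sum() + eps)


def training_step(encoder1, decoder_rec, decoder_seg, encoder2, decoder_reseg,
                  images, labels_onehot, optimizer,
                  w_seg=1.0, w_reseg=1.0, w_rec=1.0):
    optimizer.zero_grad()
    feats1 = encoder1(images)                     # (3a) first feature map(s)
    reconstruction = decoder_rec(feats1)          # (3b) auxiliary reconstruction branch
    segmentation = decoder_seg(feats1)            # (3b) segmentation branch
    feats2 = encoder2(reconstruction)             # (3c) second feature map(s)
    resegmentation = decoder_reseg(feats2)        # (3d) re-segmentation result

    loss_seg = dice_loss(segmentation, labels_onehot)      # (3e) segmentation loss
    loss_reseg = dice_loss(resegmentation, labels_onehot)  # (3e) re-segmentation loss
    # (3e) reconstruction loss: a cross-entropy loss against the network input;
    # min-max rescaling to [0, 1] is one plausible interpretation, assumed here.
    target = (images - images.amin()) / (images.amax() - images.amin() + 1e-8)
    loss_rec = F.binary_cross_entropy_with_logits(reconstruction, target)

    total = w_seg * loss_seg + w_reseg * loss_reseg + w_rec * loss_rec  # (3f) weighted sum
    total.backward()                                                    # (3f) back-propagation
    optimizer.step()
    return total.item()
```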
The convolution kernel size of the first convolution layer is 3 × 3 × 3, and the number of convolution kernels is 32; the convolution kernel size of the second convolution layer is 3 × 3 × 3, and the number of convolution kernels is 64; the convolution kernel size of the third convolution layer is 3 × 3 × 3, and the number of convolution kernels is 64; the convolution kernel size of the fourth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 128; the convolution kernel size of the fifth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 128; the convolution kernel size of the sixth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 256; the convolution kernel size of the seventh convolution layer is 3 × 3 × 3, and the number of convolution kernels is 256; the convolution kernel size of the eighth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 512;
the first maximum pooling downsampling layer, the second maximum pooling downsampling layer and the third maximum pooling downsampling layer all have a size of 2 × 2 × 2;
the convolution kernel size of the ninth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 256; the convolution kernel size of the tenth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 256; the convolution kernel size of the eleventh convolution layer is 3 × 3 × 3, and the number of convolution kernels is 128; the convolution kernel size of the twelfth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 128; the convolution kernel size of the thirteenth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 64; the convolution kernel size of the fourteenth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 64; the convolution kernel size of the fifteenth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 4.
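A companion sketch of one decoding module, reusing the ConvBlock class from the encoder sketch above, is shown below. The trilinear upsampling mode and the class names are assumptions; the channel widths follow the ninth to fifteenth convolution layers listed above, with the fifteenth 3 × 3 × 3 convolution represented by a separate 4-channel output head.

```python
# Hedged sketch of one decoding module; ConvBlock is defined in the encoder
# sketch above. Upsampling mode and naming are assumptions.
import torch
import torch.nn as nn


class Decoder(nn.Module):
    """Three deconvolution blocks interleaved with three upsampling layers."""

    def __init__(self, out_ch=4):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="trilinear", align_corners=False)
        # channel counts after concatenation with the skip feature maps
        self.deconv1 = ConvBlock(512 + 256, 256, 256)   # ninth, tenth convolution layers
        self.deconv2 = ConvBlock(256 + 128, 128, 128)   # eleventh, twelfth convolution layers
        self.deconv3 = ConvBlock(128 + 64, 64, 64)      # thirteenth, fourteenth convolution layers
        self.head = nn.Conv3d(64, out_ch, kernel_size=3, padding=1)  # fifteenth convolution layer

    def forward(self, feats):
        f1, f2, f3, f4 = feats
        x = self.deconv1(torch.cat([self.up(f4), f3], dim=1))
        x = self.deconv2(torch.cat([self.up(x), f2], dim=1))
        x = self.deconv3(torch.cat([self.up(x), f1], dim=1))
        return self.head(x)
```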
The first convolution block of the first coding module serves as the input of the segmentation network. Within the first coding module, the output of the first convolution block is fed to the first maximum pooling downsampling layer, whose output is fed to the second convolution block; the output of the second convolution block is fed to the second maximum pooling downsampling layer, whose output is fed to the third convolution block; the output of the third convolution block is fed to the third maximum pooling downsampling layer, whose output is fed to the fourth convolution block. The output of the fourth convolution block of the first coding module is fed in parallel to the first upsampling layer of the first decoding module and the first upsampling layer of the second decoding module. The output of the first upsampling layer of the first decoding module is concatenated with the output of the third convolution block of the first coding module to obtain a first concatenation result, and the output of the first upsampling layer of the second decoding module is concatenated with the output of the third convolution block of the first coding module to obtain a second concatenation result; the first concatenation result is fed to the first deconvolution block of the first decoding module, and the second concatenation result is fed to the first deconvolution block of the second decoding module. The output of the first deconvolution block of the first decoding module is fed to the second upsampling layer of the first decoding module, and the output of the first deconvolution block of the second decoding module is fed to the second upsampling layer of the second decoding module. The output of the second upsampling layer of the first decoding module is concatenated with the output of the second convolution block of the first coding module to obtain a third concatenation result, and the output of the second upsampling layer of the second decoding module is concatenated with the output of the second convolution block of the first coding module to obtain a fourth concatenation result; the third concatenation result is fed to the second deconvolution block of the first decoding module, and the fourth concatenation result is fed to the second deconvolution block of the second decoding module. The output of the second deconvolution block of the first decoding module is fed to the third upsampling layer of the first decoding module, and the output of the second deconvolution block of the second decoding module is fed to the third upsampling layer of the second decoding module. The output of the third upsampling layer of the first decoding module is concatenated with the output of the first convolution block of the first coding module to obtain a fifth concatenation result, and the output of the third upsampling layer of the second decoding module is concatenated with the output of the first convolution block of the first coding module to obtain a sixth concatenation result; the fifth concatenation result is fed to the third deconvolution block of the first decoding module, and the sixth concatenation result is fed to the third deconvolution block of the second decoding module. The first decoding module outputs the reconstruction result, and the second decoding module outputs the segmentation result.
The reconstruction result is fed to the first convolution block of the second coding module. Within the second coding module, the output of the first convolution block is fed to the first maximum pooling downsampling layer, whose output is fed to the second convolution block; the output of the second convolution block is fed to the second maximum pooling downsampling layer, whose output is fed to the third convolution block; the output of the third convolution block is fed to the third maximum pooling downsampling layer, whose output is fed to the fourth convolution block. The output of the fourth convolution block of the second coding module is fed to the first upsampling layer of the third decoding module. The output of the first upsampling layer of the third decoding module is concatenated with the output of the third convolution block of the second coding module, and the concatenation result is fed to the first deconvolution block of the third decoding module. The output of the first deconvolution block of the third decoding module is fed to the second upsampling layer of the third decoding module, whose output is concatenated with the output of the second convolution block of the second coding module; this concatenation result is fed to the second deconvolution block of the third decoding module. The output of the second deconvolution block of the third decoding module is fed to the third upsampling layer of the third decoding module, whose output is concatenated with the output of the first convolution block of the second coding module; this concatenation result is fed to the third deconvolution block of the third decoding module, which outputs the re-segmentation result.
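The overall connection pattern described above (one shared coding module feeding two parallel decoding modules, with the reconstruction result passed through a second coding module and a third decoding module) can be composed from the Encoder and Decoder sketches given earlier; the class and attribute names below are illustrative assumptions.

```python
# End-to-end wiring sketch built from the Encoder/Decoder sketches above.
import torch.nn as nn


class AuxReSegNet(nn.Module):
    def __init__(self, in_ch=4, n_classes=4):
        super().__init__()
        self.encoder1 = Encoder(in_ch)
        self.decoder_rec = Decoder(out_ch=in_ch)       # first decoding module: reconstruction
        self.decoder_seg = Decoder(out_ch=n_classes)   # second decoding module: segmentation
        self.encoder2 = Encoder(in_ch)
        self.decoder_reseg = Decoder(out_ch=n_classes) # third decoding module: re-segmentation

    def forward(self, x):
        feats1 = self.encoder1(x)
        reconstruction = self.decoder_rec(feats1)
        segmentation = self.decoder_seg(feats1)
        feats2 = self.encoder2(reconstruction)
        resegmentation = self.decoder_reseg(feats2)
        return reconstruction, segmentation, resegmentation
```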
Example one
The present invention uses the 285 cases of 3D MRI data provided by the Brain Tumor Segmentation (BraTS) 2018 challenge for the study of medical image segmentation. The dataset consists of four MR sequences, each patient having a 3D brain tumor image of 240 × 240 × 155 voxels. The tumor segmentation labels are background (label 0), necrotic and non-enhancing tumor (label 1), peritumoral edema (label 2) and GD-enhancing tumor (label 4). The dataset is randomly divided into a training set and a test set at a ratio of 7:3, and the effectiveness of the segmentation algorithm is evaluated by the segmentation accuracy on the test set. Segmentation accuracy is measured by the Dice score, where ET, WT and TC refer to the enhancing tumor region (label 4), the whole tumor (labels 1, 2 and 4) and the tumor core (labels 1 and 4), respectively. After the image reconstruction task is added, the multi-task learning model promotes feature sharing among the tasks and improves the overall learning performance of the network, so that the segmentation performance on the WT, ET and TC regions is improved by 1.06%, 0.11% and 0.17%, respectively. When the reconstruction result is fed back into the segmentation branch, the brain tumor segmentation results of the model are further improved by 1.44%, 0.58% and 1.89%, respectively. This shows that constraining the two segmentation results to be sufficiently similar during training generates an additional supervision signal at the semantic level that guides the model to learn more segmentation-relevant feature information, thereby further improving the segmentation performance of the network.
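The ET/WT/TC Dice evaluation described in this example can be sketched as follows, using the label convention quoted above (0 background, 1 necrosis/non-enhancing tumor, 2 edema, 4 GD-enhancing tumor). The function names and the smoothing constant are illustrative assumptions.

```python
# Sketch of per-region Dice score evaluation on BraTS-style label volumes.
import numpy as np


def region_masks(label_volume):
    """Group labels into the evaluated regions: ET, TC and WT."""
    return {
        "ET": np.isin(label_volume, [4]),        # enhancing tumor
        "TC": np.isin(label_volume, [1, 4]),     # tumor core
        "WT": np.isin(label_volume, [1, 2, 4]),  # whole tumor
    }


def dice_score(pred_mask, gt_mask, eps=1e-8):
    intersection = np.logical_and(pred_mask, gt_mask).sum()
    return (2.0 * intersection + eps) / (pred_mask.sum() + gt_mask.sum() + eps)


def evaluate_case(pred_labels, gt_labels):
    pred_regions, gt_regions = region_masks(pred_labels), region_masks(gt_labels)
    return {k: dice_score(pred_regions[k], gt_regions[k]) for k in pred_regions}
```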
Table 1 shows the effect of the image reconstruction task branch on brain tumor segmentation performance:
TABLE 1
Table 2 shows the effect of the present invention on brain tumor segmentation performance:
TABLE 2
The comparative experimental analysis further demonstrates the high segmentation accuracy of the invention.
In conclusion, by introducing an additional image reconstruction task branch, the method helps the segmentation network learn complementary medical image features, which in turn helps the model better understand the internal structure of medical images; by feeding the reconstruction result into the segmentation network again and comparing the resulting re-segmentation result with the ground-truth segmentation map, an additional supervision signal is provided at the semantic level for training the segmentation network, further improving the accuracy of the image segmentation results.

Claims (6)

1. A medical image segmentation method based on auxiliary learning task and re-segmentation constraint is characterized in that: the method comprises the following steps in sequence:
(1) preprocessing three-dimensional human brain nuclear magnetic resonance data to obtain a training set and a test set;
(2) constructing a segmentation network based on an auxiliary learning task and re-segmentation constraints;
(3) inputting the training set into a segmentation network for training to obtain a trained segmentation network;
(4) and inputting the test set into the trained segmentation network, and outputting the segmentation network to obtain a segmentation result.
2. The medical image segmentation method based on the auxiliary learning task and the re-segmentation constraint of claim 1, wherein: the step (1) specifically comprises the following steps:
(2a) the three-dimensional human brain nuclear magnetic resonance data comprise four modalities: T1, T1c, T2 and FLAIR; the three-dimensional human brain nuclear magnetic resonance data of the four modalities are combined, the original size of each modality being 240 × 240 × 155, to generate four-channel three-dimensional data of size 4 × 240 × 240 × 155, where 4 represents the number of modalities, 155 represents the number of two-dimensional slices contained in each three-dimensional human brain nuclear magnetic resonance volume, and 240 × 240 represents the height and width of the image, respectively;
(2b) converting the merged three-dimensional human brain image data from nii format to numpy format;
(2c) carrying out normalization processing on the converted data by adopting a zero-mean normalization method;
(2d) randomly dividing the normalized images into a training set and a test set at a ratio of 7:3;
(2e) randomly cropping the training set, resulting in training set data of size 4 × 128 × 128 × 128.
3. The medical image segmentation method based on the auxiliary learning task and the re-segmentation constraint of claim 1, wherein: in step (2), the segmentation network comprises a first coding module, a second coding module, a first decoding module, a second decoding module and a third decoding module;
the first coding module and the second coding module each consist of four convolution blocks and three maximum pooling downsampling layers, the four convolution blocks being a first convolution block, a second convolution block, a third convolution block and a fourth convolution block; the first convolution block comprises a first convolution layer, a first batch normalization layer, a first rectified linear unit (ReLU) activation layer, a second convolution layer, a second batch normalization layer and a second ReLU activation layer; the second convolution block comprises a third convolution layer, a first batch normalization layer, a first ReLU activation layer, a fourth convolution layer, a second batch normalization layer and a second ReLU activation layer; the third convolution block comprises a fifth convolution layer, a first batch normalization layer, a first ReLU activation layer, a sixth convolution layer, a second batch normalization layer and a second ReLU activation layer; the fourth convolution block comprises a seventh convolution layer, a first batch normalization layer, a first ReLU activation layer, an eighth convolution layer, a second batch normalization layer and a second ReLU activation layer; the three maximum pooling downsampling layers are a first maximum pooling downsampling layer, a second maximum pooling downsampling layer and a third maximum pooling downsampling layer;
the first decoding module, the second decoding module and the third decoding module each consist of three deconvolution blocks and three upsampling layers, the three deconvolution blocks being a first deconvolution block, a second deconvolution block and a third deconvolution block; the first deconvolution block comprises a ninth convolution layer, a third batch normalization layer, a third ReLU activation layer, a tenth convolution layer, a fourth batch normalization layer and a fourth ReLU activation layer; the second deconvolution block comprises an eleventh convolution layer, a third batch normalization layer, a third ReLU activation layer, a twelfth convolution layer, a fourth batch normalization layer and a fourth ReLU activation layer; the third deconvolution block comprises a thirteenth convolution layer, a third batch normalization layer, a third ReLU activation layer, a fourteenth convolution layer, a fourth batch normalization layer, a fourth ReLU activation layer and a fifteenth convolution layer; the three upsampling layers are a first upsampling layer, a second upsampling layer and a third upsampling layer.
4. The medical image segmentation method based on the auxiliary learning task and the re-segmentation constraint of claim 1, wherein: the step (3) specifically comprises the following steps:
(3a) the training set is input into the first coding module sequentially in batches, and the first coding module encodes the input data to obtain a first feature map;
(3b) inputting the first feature map into the first decoding module and the second decoding module in parallel to carry out forward propagation of the segmentation network, wherein the first decoding module outputs a reconstruction result and the second decoding module outputs a segmentation result;
(3c) inputting the reconstruction result into the second coding module to obtain a second feature map;
(3d) inputting the second feature map into the third decoding module to carry out forward propagation of the network and obtain a re-segmentation result;
(3e) comparing the segmentation result with the corresponding ground-truth segmentation map and calculating the segmentation loss with the Dice loss function; comparing the re-segmentation result with the corresponding ground-truth segmentation map and calculating the re-segmentation loss with the Dice loss function, wherein the Dice loss function is calculated as:

L_Dice = 1 - 2|X ∩ Y| / (|X| + |Y|)

where X is the ground-truth segmentation map, and Y is the segmentation result when calculating the segmentation loss and the re-segmentation result when calculating the re-segmentation loss; comparing the reconstruction result obtained in step (3b) with the training set data to be segmented that was input into the segmentation network, and calculating the reconstruction loss with a cross-entropy loss function;
(3f) weighting and summing the segmentation loss, the re-segmentation loss and the reconstruction loss obtained in the step (3e) to obtain a total loss result, and performing back propagation to train the segmentation network by using a gradient descent algorithm;
(3g) and obtaining the trained segmentation network after the number of training iterations of the segmentation network reaches the preset number of training iterations.
5. The medical image segmentation method based on the auxiliary learning task and the re-segmentation constraint of claim 3, wherein: the convolution kernel size of the first convolution layer is 3 × 3 × 3, and the number of convolution kernels is 32; the convolution kernel size of the second convolution layer is 3 × 3 × 3, and the number of convolution kernels is 64; the convolution kernel size of the third convolution layer is 3 × 3 × 3, and the number of convolution kernels is 64; the convolution kernel size of the fourth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 128; the convolution kernel size of the fifth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 128; the convolution kernel size of the sixth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 256; the convolution kernel size of the seventh convolution layer is 3 × 3 × 3, and the number of convolution kernels is 256; the convolution kernel size of the eighth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 512;
the first maximum pooling downsampling layer, the second maximum pooling downsampling layer and the third maximum pooling downsampling layer all have a size of 2 × 2 × 2;
the convolution kernel size of the ninth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 256; the convolution kernel size of the tenth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 256; the convolution kernel size of the eleventh convolution layer is 3 × 3 × 3, and the number of convolution kernels is 128; the convolution kernel size of the twelfth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 128; the convolution kernel size of the thirteenth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 64; the convolution kernel size of the fourteenth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 64; the convolution kernel size of the fifteenth convolution layer is 3 × 3 × 3, and the number of convolution kernels is 4.
6. The medical image segmentation method based on the auxiliary learning task and the re-segmentation constraint of claim 3, wherein: the first convolution block of the first coding module serves as the input of the segmentation network; within the first coding module, the output of the first convolution block is fed to the first maximum pooling downsampling layer, whose output is fed to the second convolution block; the output of the second convolution block is fed to the second maximum pooling downsampling layer, whose output is fed to the third convolution block; the output of the third convolution block is fed to the third maximum pooling downsampling layer, whose output is fed to the fourth convolution block; the output of the fourth convolution block of the first coding module is fed in parallel to the first upsampling layer of the first decoding module and the first upsampling layer of the second decoding module; the output of the first upsampling layer of the first decoding module is concatenated with the output of the third convolution block of the first coding module to obtain a first concatenation result, and the output of the first upsampling layer of the second decoding module is concatenated with the output of the third convolution block of the first coding module to obtain a second concatenation result; the first concatenation result is fed to the first deconvolution block of the first decoding module, and the second concatenation result is fed to the first deconvolution block of the second decoding module; the output of the first deconvolution block of the first decoding module is fed to the second upsampling layer of the first decoding module, and the output of the first deconvolution block of the second decoding module is fed to the second upsampling layer of the second decoding module; the output of the second upsampling layer of the first decoding module is concatenated with the output of the second convolution block of the first coding module to obtain a third concatenation result, and the output of the second upsampling layer of the second decoding module is concatenated with the output of the second convolution block of the first coding module to obtain a fourth concatenation result; the third concatenation result is fed to the second deconvolution block of the first decoding module, and the fourth concatenation result is fed to the second deconvolution block of the second decoding module; the output of the second deconvolution block of the first decoding module is fed to the third upsampling layer of the first decoding module, and the output of the second deconvolution block of the second decoding module is fed to the third upsampling layer of the second decoding module; the output of the third upsampling layer of the first decoding module is concatenated with the output of the first convolution block of the first coding module to obtain a fifth concatenation result, and the output of the third upsampling layer of the second decoding module is concatenated with the output of the first convolution block of the first coding module to obtain a sixth concatenation result; the fifth concatenation result is fed to the third deconvolution block of the first decoding module, and the sixth concatenation result is fed to the third deconvolution block of the second decoding module; the first decoding module outputs the reconstruction result, and the second decoding module outputs the segmentation result; the reconstruction result is fed to the first convolution block of the second coding module; within the second coding module, the output of the first convolution block is fed to the first maximum pooling downsampling layer, whose output is fed to the second convolution block; the output of the second convolution block is fed to the second maximum pooling downsampling layer, whose output is fed to the third convolution block; the output of the third convolution block is fed to the third maximum pooling downsampling layer, whose output is fed to the fourth convolution block; the output of the fourth convolution block of the second coding module is fed to the first upsampling layer of the third decoding module; the output of the first upsampling layer of the third decoding module is concatenated with the output of the third convolution block of the second coding module, and the concatenation result is fed to the first deconvolution block of the third decoding module; the output of the first deconvolution block of the third decoding module is fed to the second upsampling layer of the third decoding module, whose output is concatenated with the output of the second convolution block of the second coding module, and this concatenation result is fed to the second deconvolution block of the third decoding module; the output of the second deconvolution block of the third decoding module is fed to the third upsampling layer of the third decoding module, whose output is concatenated with the output of the first convolution block of the second coding module, and this concatenation result is fed to the third deconvolution block of the third decoding module, which outputs the re-segmentation result.
CN202210162154.5A 2022-02-22 2022-02-22 Medical image segmentation method based on auxiliary learning task and re-segmentation constraint Pending CN114529562A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210162154.5A CN114529562A (en) 2022-02-22 2022-02-22 Medical image segmentation method based on auxiliary learning task and re-segmentation constraint

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210162154.5A CN114529562A (en) 2022-02-22 2022-02-22 Medical image segmentation method based on auxiliary learning task and re-segmentation constraint

Publications (1)

Publication Number Publication Date
CN114529562A true CN114529562A (en) 2022-05-24

Family

ID=81625095

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210162154.5A Pending CN114529562A (en) 2022-02-22 2022-02-22 Medical image segmentation method based on auxiliary learning task and re-segmentation constraint

Country Status (1)

Country Link
CN (1) CN114529562A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115115648A (en) * 2022-06-20 2022-09-27 北京理工大学 Brain tissue segmentation method combining UNet and volume rendering prior knowledge
CN116823842A (en) * 2023-06-25 2023-09-29 山东省人工智能研究院 Vessel segmentation method of double decoder network fused with geodesic model
CN116823842B (en) * 2023-06-25 2024-02-02 山东省人工智能研究院 Vessel segmentation method of double decoder network fused with geodesic model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination