CN113052856A - Hippocampus three-dimensional semantic network segmentation method based on multi-scale feature multi-path attention fusion mechanism - Google Patents

Hippocampus three-dimensional semantic network segmentation method based on multi-scale feature multi-path attention fusion mechanism

Info

Publication number
CN113052856A
Authority
CN
China
Prior art keywords
layer
hippocampus
data
dimensional
segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110269960.8A
Other languages
Chinese (zh)
Inventor
林岚
吴玉超
吴水才
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN202110269960.8A priority Critical patent/CN113052856A/en
Publication of CN113052856A publication Critical patent/CN113052856A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10088Magnetic resonance imaging [MRI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30016Brain

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Magnetic Resonance Imaging Apparatus (AREA)

Abstract

A hippocampus three-dimensional semantic network segmentation method based on a multi-scale feature, multi-path attention fusion mechanism, belonging to the field of medical image processing. The invention comprises the following steps: preprocessing publicly available, labeled hippocampus image data and cropping the original volumes into image blocks containing the hippocampus; constructing a new three-dimensional hippocampus semantic segmentation network structure based on multi-scale feature extraction, a multi-path attention fusion mechanism and a branch-classifier ensemble learning strategy; partitioning the data set; training the model offline to obtain weight parameters for the three-dimensional hippocampal structure; and segmenting the test-set images with the saved model file and evaluating the segmentation results. By designing a semantic segmentation network structure that matches the characteristics of three-dimensional hippocampus images, the invention improves the network's utilization of multi-dimensional image information, thereby strengthening dense pixel prediction and improving hippocampus segmentation performance.

Description

Hippocampus three-dimensional semantic network segmentation method based on multi-scale feature multi-path attention fusion mechanism
Technical Field
The invention relates to the field of medical image processing, in particular to a hippocampus three-dimensional semantic network segmentation method based on a multi-scale feature multi-path attention fusion mechanism.
Background
The hippocampus is an important brain structure located between the thalamus and the medial temporal lobe. It belongs to the limbic system and is mainly responsible for functions such as the storage and consolidation of long-term memory and spatial orientation. Its neurons are fragile and easily damaged; once hippocampal neurons die, memory is lost. Morphological change of the hippocampus is an important biomarker for studying long-term memory; for example, judging whether the hippocampal volume has atrophied on Magnetic Resonance Imaging (MRI) is one of the key techniques for diagnosing Alzheimer's disease. However, the hippocampus is poorly distinguished from the surrounding brain tissue in magnetic resonance images, and its size, position and structural details vary from person to person. This morphological complexity makes manual delineation of the hippocampus very difficult. With the development of deep learning, semantic segmentation methods based on convolutional neural networks open new possibilities for automatic hippocampus segmentation.
Chinese patent application No. 201810775957.1, entitled "Brain MRI three-dimensional segmentation method based on deep learning," introduces a typical encoding-decoding semantic segmentation network comprising 3 one-dimensional convolutional layers, 15 three-dimensional convolutional layers and 4 max-pooling layers to segment the hippocampus. The original image data are first cropped into images with a resolution of 32 x 32 and then fed into the network for training. Although a certain segmentation accuracy is obtained, the network structure is relatively simple and leaves room for improvement. With the introduction of attention mechanisms, researchers have begun to explore how key features can be used to identify and segment regions of interest more efficiently. Application No. 201911179566.4, entitled "Method for extracting hippocampus based on 3D neural network human brain nuclear magnetic resonance image," introduces a semantic segmentation method that applies a channel attention mechanism in the decoding path: channel features are aggregated and recombined by the attention mechanism, which effectively highlights the salient features of the region of interest, suppresses irrelevant features, and improves the network's hippocampus segmentation accuracy. However, that method weights only the channel-dimension features with attention coefficients, whereas substantial feature information also exists in the spatial-dimension voxels. In a three-dimensional segmentation network, exploiting these voxel features can bring further performance gains, so segmentation networks that apply attention mechanisms also have room for improvement.
To address the shortcomings of the prior art, the invention provides a brand-new three-dimensional semantic segmentation network for segmenting the hippocampus in brain magnetic resonance images. To cope with the complex shape and variable size of the hippocampus, the network uses a multi-scale feature extraction and fusion structure in the encoding path and optimizes the feature extractor by connecting the fused output to a residual shortcut path. In the decoding path, the network fuses a multi-path attention mechanism that weights channel-dimension and spatial-dimension features, making full use of voxel feature information in every dimension. At the output of each decoder layer, a three-dimensional convolution with kernel size 1 builds a group of classifiers for ensemble learning; at the network output, the results of the two classifiers are combined to obtain the best segmentation performance in an ensemble learning manner.
Disclosure of Invention
The invention aims to provide a three-dimensional semantic segmentation network structure with multi-scale feature extraction and a multi-path attention mechanism. Multi-scale residual modules extract features of hippocampal regions of different sizes; fused multi-path attention modules aggregate and weight information across different dimensions, so that hippocampus-related features are highlighted and irrelevant features are suppressed; and the output of the ensemble-learning branch classifier is combined with the output of the decoder path to improve the accuracy of this three-dimensional automatic segmentation method.
The technical scheme of the invention comprises the following steps:
Step 1: acquiring MRI image data from a database and preprocessing it;
Step 2: designing a three-dimensional hippocampus semantic segmentation network structure based on a multi-scale feature, multi-path attention fusion mechanism;
Step 3: partitioning the training, validation and test data sets;
Step 4: training the model and saving the model file with the best performance;
Step 5: testing the model and evaluating the segmentation results.
Further, the step 1 comprises:
A. acquiring image data containing hippocampus annotations from the ADNI database (https://ida.loni.usc.edu/) and merging the left and right hippocampus labels;
B. cropping the hippocampal region from the images and labels in the data set, selecting slices containing the hippocampus in the transverse, sagittal and coronal directions to crop the image and label data and remove background information;
C. normalizing the cropped image data.
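For illustration only, a minimal Python sketch of the cropping and normalization described above (the file paths, the margin size and the assumption that any positive label value marks hippocampus are not specified by the patent):

```python
import numpy as np
import nibabel as nib  # assumed I/O library for NIfTI volumes

def preprocess_case(image_path, label_path, margin=8):
    """Crop an MRI volume and its label to a box around the hippocampus, then z-score normalize."""
    image = nib.load(image_path).get_fdata().astype(np.float32)
    label = nib.load(label_path).get_fdata()

    # Merge left and right hippocampus labels into one foreground class (assumed label coding).
    mask = (label > 0).astype(np.uint8)

    # Bounding box of the hippocampus in the transverse, sagittal and coronal directions.
    coords = np.argwhere(mask)
    lo = np.maximum(coords.min(axis=0) - margin, 0)
    hi = np.minimum(coords.max(axis=0) + margin + 1, mask.shape)

    img_crop = image[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]
    msk_crop = mask[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]

    # Z-score normalization of the cropped image block.
    img_crop = (img_crop - img_crop.mean()) / (img_crop.std() + 1e-8)
    return img_crop, msk_crop
```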
Further, the three-dimensional hippocampus semantic segmentation network based on the fusion of multi-scale features and a multi-path attention mechanism in step 2 comprises, in units of the functional modules and layers that realize its different functions and in order:
A. an input layer, comprising one data input layer;
B. nine multi-scale residual modules, each comprising three-dimensional convolutional layers with kernel size 1, one three-dimensional convolutional layer with kernel size 3, one three-dimensional convolutional layer with kernel size 5, five batch normalization layers, four ReLU activation layers and one additive fusion layer;
C. four pooling layers, each comprising a three-dimensional max-pooling layer with pooling kernel size 2;
D. four dimension-adjustment modules, each comprising a three-dimensional convolutional layer with kernel size 1, a batch normalization layer and a ReLU activation layer;
E. four channel attention modules, each comprising a three-dimensional global average pooling layer, a three-dimensional global max pooling layer, four batch normalization layers, four fully connected layers, two ReLU activation layers, an additive fusion layer, a Sigmoid activation layer, a reshape (reconstruction) layer and an element-wise multiplication layer;
F. four spatial attention modules, each comprising a three-dimensional convolutional layer with kernel size 2, three-dimensional convolutional layers with kernel size 1, a three-dimensional deconvolution layer with kernel size 3, an additive fusion layer, a ReLU activation layer, a Sigmoid activation layer, an upsampling layer, an element-wise multiplication layer and a batch normalization layer;
G. an ensemble-learning branch structure, comprising four three-dimensional convolutional layers with kernel size 1, three upsampling layers and three element-wise additive fusion layers;
H. four skip-connection structures, each comprising an upsampling layer and a channel concatenation (fusion) layer;
I. an output layer, comprising a three-dimensional convolutional layer with kernel size 1.
Further, the step 3 comprises:
A. dividing the acquired ADNI data set by disease state into three groups: Alzheimer's disease, mild cognitive impairment and normal controls;
B. randomly selecting an equal amount of data from each of the three groups as the training data set; in the data preprocessing of step 1, a random number is set as the parameter that randomly shifts the cropping slices (while ensuring that no part of the hippocampus is cut off), so the original data can be cropped multiple times, which expands the training data set and serves as data augmentation;
C. selecting an equal amount of data from the remaining data of the three groups as the validation set;
D. using the remaining data of the three groups as the test set.
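A minimal sketch of such a diagnosis-stratified split (the split ratios, random seed and group keys are illustrative assumptions, not values given by the patent):

```python
import random

def split_by_group(cases_by_group, train_frac=0.7, val_frac=0.15, seed=42):
    """Split each diagnostic group (e.g. AD, MCI, NC) into train/val/test with the same proportions."""
    random.seed(seed)
    train, val, test = [], [], []
    for group, cases in cases_by_group.items():  # e.g. {"AD": [...], "MCI": [...], "NC": [...]}
        cases = cases[:]
        random.shuffle(cases)
        n_train = int(len(cases) * train_frac)
        n_val = int(len(cases) * val_frac)
        train += cases[:n_train]
        val += cases[n_train:n_train + n_val]
        test += cases[n_train + n_val:]
    return train, val, test
```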
Further, the step 4 comprises:
A. inputting the training set and validation set into the network for training;
B. if a learning-rate callback is set, the learning rate is decayed according to how the validation-set loss decreases during training;
C. if early stopping is set in the callbacks, training is stopped according to how the validation-set loss decreases during training, and the model with the lowest validation-set loss is saved;
D. the Dice coefficient used to evaluate the model is defined as:
Dice = 2|G ∩ P| / (|G| + |P|)
where G represents the label pixel value and P represents the predicted pixel value;
E. the resulting trained model achieves a Dice coefficient of 0.8379 on the validation set.
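As an illustration of the callback-controlled training in items B and C, a minimal Keras-style sketch (the monitored quantity, patience values and file path are assumptions, not prescribed by the patent):

```python
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping, ModelCheckpoint

callbacks = [
    # Decay the learning rate when the validation loss stops decreasing.
    ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=5, verbose=1),
    # Stop training early when the validation loss no longer improves.
    EarlyStopping(monitor="val_loss", patience=15, verbose=1),
    # Keep only the weights with the lowest validation loss.
    ModelCheckpoint("best_hippocampus_model.h5", monitor="val_loss",
                    save_best_only=True, verbose=1),
]

# model.fit(train_images, train_labels,
#           validation_data=(val_images, val_labels),
#           epochs=200, batch_size=2, callbacks=callbacks)
```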
Further, the step 5 comprises:
A. performing a three-dimensional hippocampus segmentation test on the model with the test-set data; the Dice coefficient on the test set is 0.8269.
The invention has the beneficial effects that:
(1) the multi-scale residual module performs multi-scale feature extraction and fusion on the complexly shaped hippocampal region, improving segmentation accuracy;
(2) the invention combines multiple attention modules to perform attention-weighted aggregation of multi-dimensional information, screening and weighting the more strongly activated features from the channel dimension to the spatial dimension and improving the network's ability to identify and segment the hippocampal target;
(3) the invention constructs a branch-path classifier that is combined with the output of the decoder-path classifier, improving the classification and segmentation ability of the whole network through ensemble learning.
Drawings
FIG. 1 is a flow chart of a method for segmenting a three-dimensional semantic network of a hippocampus based on a multi-scale feature multi-path attention fusion mechanism according to the present invention;
FIG. 2 is a schematic diagram of the multi-scale residual module structure of the present invention;
FIG. 3 is a schematic view of a channel attention module configuration of the present invention;
FIG. 4 is a schematic diagram of a spatial attention module configuration of the present invention;
FIG. 5 is a schematic diagram of the overall network architecture of the present invention;
FIG. 6 is a diagram comparing the segmentation result of the present invention with that of a classical three-dimensional U-shaped network.
Detailed Description
The invention automatically processes brain magnetic resonance images and realizes three-dimensional automatic segmentation of the hippocampus. To cope with the variable position and complex shape of the hippocampus, a semantic segmentation network is constructed from more effective image-feature processing modules to improve the segmentation accuracy of the trained model, providing more reliable information support for the diagnosis of Alzheimer's disease.
As shown in fig. 1, a three-dimensional hippocampus semantic network segmentation method based on a multi-scale feature multi-path attention fusion mechanism includes the following 5 steps:
1. acquiring hippocampus image and label data from the ADNI database and preprocessing them;
2. designing a three-dimensional hippocampus semantic segmentation network structure based on multi-scale feature extraction and a multi-path attention fusion mechanism;
3. partitioning the training, validation and test data sets;
4. training the model and saving the model file with the best performance;
5. testing the model and evaluating the segmentation results.
Further, the step 1 comprises:
1) acquiring brain magnetic resonance images and label data from the ADNI data set and merging the left and right hippocampal labels;
2) cropping the hippocampal region from the original images and labels, selecting slices containing the hippocampus in the transverse, sagittal and coronal directions to crop the image and label data;
3) normalizing the cropped data set.
Further, the network structure in step 2 comprises, in units of layers and in order from input to output, the following structures:
1) construct the input layer, which comprises an Input layer that feeds the data set into the network; the data format is a five-dimensional structure whose dimensions are the voxel block, pixel length, pixel width, pixel height and image channel;
2) construct the encoder: the output of the input layer is fed into the first encoder layer; the encoder structure consists of four multi-scale residual modules and four pooling layers connected end to end, followed by one final multi-scale residual module;
3) construct the multi-scale residual modules: one multi-scale residual module comprises a conv3_3 layer, a conv5_5 layer, an up layer, a shortcut layer and a res_path layer;
4) the conv3_3 layer is a stack of two Conv3D layers, a BatchNormalization layer and an Activation layer, and applies three-dimensional convolution with kernel size 3, batch normalization and ReLU activation to the input of the multi-scale residual module;
5) the conv5_5 layer is a stack of two Conv3D layers, a BatchNormalization layer and an Activation layer, and applies three-dimensional convolution with kernel size 5, batch normalization and ReLU activation to the input of the multi-scale residual module;
6) the up layer consists of a concatenate layer and concatenates the outputs of the conv3_3 and conv5_5 layers along the channel dimension;
7) the shortcut layer is a stack of a Conv3D layer and a BatchNormalization layer, and applies three-dimensional convolution with kernel size 1 and batch normalization to the input of the multi-scale residual module;
8) the res_path layer consists of an Add layer and performs additive fusion of the outputs of the shortcut layer and the up layer;
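For illustration, a minimal Keras-style sketch of the multi-scale residual module described in items 3) to 8) (the function name, filter counts and the placement of batch normalization after each convolution pair are assumptions; the patent does not fix them here):

```python
from tensorflow.keras import layers

def multiscale_residual_block(x, filters):
    """Multi-scale residual module sketch: a kernel-3 branch and a kernel-5 branch are
    concatenated, and a kernel-1 shortcut is fused with them by element-wise addition."""
    # conv3_3: two kernel-3 convolutions followed by batch normalization and ReLU.
    c3 = layers.Conv3D(filters, 3, padding="same")(x)
    c3 = layers.Conv3D(filters, 3, padding="same")(c3)
    c3 = layers.BatchNormalization()(c3)
    c3 = layers.Activation("relu")(c3)

    # conv5_5: two kernel-5 convolutions followed by batch normalization and ReLU.
    c5 = layers.Conv3D(filters, 5, padding="same")(x)
    c5 = layers.Conv3D(filters, 5, padding="same")(c5)
    c5 = layers.BatchNormalization()(c5)
    c5 = layers.Activation("relu")(c5)

    # up: channel-dimension concatenation of the two branches.
    up = layers.concatenate([c3, c5])

    # shortcut: kernel-1 convolution and batch normalization on the module input.
    sc = layers.Conv3D(2 * filters, 1, padding="same")(x)
    sc = layers.BatchNormalization()(sc)

    # res_path: additive (Add) fusion of the shortcut with the fused multi-scale features.
    return layers.add([up, sc])
```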
9) construct the pooling layers, combining a pool_64 layer, a pool_32 layer, a pool_16 layer and a pool_8 layer; each consists of a MaxPooling3D layer and applies, layer by layer, three-dimensional max pooling with pooling kernel size 2 to the output of the corresponding multi-scale residual module;
10) construct the decoder: the output of each encoder layer is fed into the corresponding decoder layer; the decoder structure combines four dimension-adjustment modules, four channel attention modules, four spatial attention modules, four skip-connection structures and four multi-scale residual modules;
11) construct the dimension-adjustment modules, each comprising an x layer;
12) the x layer is a stack of a Conv3D layer, a BatchNormalization layer and an Activation layer, and applies three-dimensional convolution with kernel size 1, batch normalization and ReLU activation to the input of the dimension-adjustment module;
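A minimal sketch of the dimension-adjustment module in items 11) and 12) (the function name and filter argument are illustrative):

```python
from tensorflow.keras import layers

def dimension_adjust(x, filters):
    """Dimension-adjustment module: kernel-1 convolution, batch normalization, ReLU."""
    x = layers.Conv3D(filters, 1, padding="same")(x)
    x = layers.BatchNormalization()(x)
    return layers.Activation("relu")(x)
```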
13) construct the channel attention modules: one channel attention module comprises an x_s_avg layer, an x_e_avg layer, an x_s_max layer, an x_e_max layer, an x_e layer and a result layer; its structure is shown in FIG. 3, where C, L, W, H and r denote the channel number, spatial voxel length, width and height of the feature map, and the dimension compression ratio;
14) the x_s_avg layer is a stack of a GlobalAveragePooling3D layer and a BatchNormalization layer, and applies three-dimensional global average pooling and batch normalization to the input of the channel attention module;
15) the x_e_avg layer is a stack of a Dense layer, an Activation layer, a BatchNormalization layer and a Dense layer, and performs dimensionality scaling on the output of the x_s_avg layer;
16) the x_s_max layer is a stack of a GlobalMaxPooling3D layer and a BatchNormalization layer, and applies three-dimensional global max pooling and batch normalization to the input of the channel attention module;
17) the x_e_max layer is a stack of a Dense layer, an Activation layer, a BatchNormalization layer and a Dense layer, and performs dimensionality scaling on the output of the x_s_max layer;
18) the x_e layer is a stack of an add layer, an Activation layer and a Reshape layer, and performs additive fusion, sigmoid activation and reshaping on the outputs of the x_e_avg and x_e_max layers;
19) the result layer consists of a multiply layer and performs element-wise multiplication of the input of the channel attention module with the output of the x_e layer;
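A minimal Keras-style sketch of this channel attention module (the compression ratio r, the exact Dense widths and the function name are assumptions consistent with the layer counts above):

```python
from tensorflow.keras import layers

def channel_attention(x, channels, r=8):
    """Channel attention sketch: squeeze with global average and global max pooling,
    scale with Dense layers, fuse additively, and gate the input channels."""
    # x_s_avg / x_s_max: global pooling over the spatial dimensions, then batch norm.
    s_avg = layers.BatchNormalization()(layers.GlobalAveragePooling3D()(x))
    s_max = layers.BatchNormalization()(layers.GlobalMaxPooling3D()(x))

    def scale(s):
        # x_e_*: Dense -> ReLU -> BatchNorm -> Dense dimensionality scaling (C -> C/r -> C).
        e = layers.Dense(channels // r)(s)
        e = layers.Activation("relu")(e)
        e = layers.BatchNormalization()(e)
        return layers.Dense(channels)(e)

    e_avg = scale(s_avg)
    e_max = scale(s_max)

    # x_e: additive fusion, sigmoid activation, reshape so the weights broadcast over voxels.
    att = layers.add([e_avg, e_max])
    att = layers.Activation("sigmoid")(att)
    att = layers.Reshape((1, 1, 1, channels))(att)

    # result: element-wise multiplication with the module input.
    return layers.multiply([x, att])
```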
20) construct the spatial attention modules: one spatial attention module comprises a theta_x layer, a phi_g layer, an upsample_g layer, a concat_xg layer, an act_xg layer, a psi layer, a sigmoid_xg layer, an upsample_psi layer, a y layer, a result layer and a result_bn layer;
21) the theta_x layer consists of a Conv3D layer and applies a convolution with kernel size 2 and stride 2 to the input of the spatial attention module;
22) the phi_g layer consists of a Conv3D layer and applies a three-dimensional convolution with kernel size 1 to the input of the spatial attention module;
23) the upsample_g layer consists of a Conv3DTranspose layer and applies a three-dimensional deconvolution with kernel size 3 to the output of the phi_g layer;
24) the concat_xg layer consists of an add layer and performs additive fusion of the outputs of the upsample_g and theta_x layers;
25) the act_xg layer consists of an Activation layer and applies ReLU activation to the output of the concat_xg layer;
26) the psi layer consists of a Conv3D layer and applies a three-dimensional convolution with kernel size 1 to the output of the act_xg layer;
27) the sigmoid_xg layer consists of an Activation layer and applies sigmoid activation to the output of the psi layer;
28) the upsample_psi layer consists of an UpSampling3D layer and upsamples the output of the sigmoid_xg layer;
29) the y layer consists of a multiply layer and performs element-wise multiplication of the input of the spatial attention module with the output of the upsample_psi layer;
30) the result layer consists of a Conv3D layer and applies a three-dimensional convolution with kernel size 1 to the output of the y layer;
31) the result_bn layer consists of a BatchNormalization layer and applies batch normalization to the output of the result layer;
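A minimal Keras-style sketch of this spatial attention module. It assumes, as the layer names theta_x and phi_g suggest, that the module takes a skip feature x and a gating feature g at half of x's resolution; the function name, channel arguments and that resolution relationship are assumptions, not stated explicitly by the patent:

```python
from tensorflow.keras import layers

def spatial_attention(x, g, inter_channels, out_channels):
    """Spatial attention gate sketch: project x and g, fuse additively, form a voxel-wise
    sigmoid attention map, and use it to weight the skip feature x."""
    # theta_x: kernel-2, stride-2 convolution on the skip feature.
    theta_x = layers.Conv3D(inter_channels, 2, strides=2, padding="same")(x)
    # phi_g: kernel-1 convolution on the gating feature.
    phi_g = layers.Conv3D(inter_channels, 1, padding="same")(g)
    # upsample_g: kernel-3 transposed convolution on phi_g.
    up_g = layers.Conv3DTranspose(inter_channels, 3, padding="same")(phi_g)

    # concat_xg / act_xg: additive fusion followed by ReLU.
    xg = layers.add([up_g, theta_x])
    xg = layers.Activation("relu")(xg)

    # psi / sigmoid_xg: kernel-1 convolution to one channel, then sigmoid attention map.
    psi = layers.Conv3D(1, 1, padding="same")(xg)
    psi = layers.Activation("sigmoid")(psi)

    # upsample_psi / y: bring the map back to x's resolution and gate x voxel-wise.
    psi = layers.UpSampling3D(size=2)(psi)
    y = layers.multiply([x, psi])

    # result / result_bn: kernel-1 convolution and batch normalization.
    out = layers.Conv3D(out_channels, 1, padding="same")(y)
    return layers.BatchNormalization()(out)
```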
32) construct the skip-connection structures, comprising an up_16 layer, an up_32 layer, an up_64 layer and an up_128 layer; each is a stack of an UpSampling3D layer and a concatenate layer, and performs upsampling, concatenation and fusion on the feature maps in the decoder;
33) the decoder multi-scale residual modules have the same structure as the encoder multi-scale residual modules;
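A minimal sketch of one skip-connection structure in item 32), assuming the decoder feature map sits at half the resolution of the corresponding encoder feature map:

```python
from tensorflow.keras import layers

def skip_connection(decoder_feat, encoder_feat):
    """Skip connection: upsample the decoder feature map and concatenate it with the
    encoder feature map along the channel dimension."""
    up = layers.UpSampling3D(size=2)(decoder_feat)
    return layers.concatenate([up, encoder_feat])
```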
34) construct the ensemble-learning branch: the output of each decoder layer is fed into the corresponding layer of the ensemble-learning branch structure, which combines an up_conv_16_11 layer, an up_conv_32_11 layer, an up_16_11 layer, an add_01 layer, an up_conv_64_11 layer, an up_add_01 layer, an add_02 layer, an up_conv_128_11 layer, an up_add_02 layer and an add_03 layer;
35) the up_conv_16_11 layer consists of a Conv3D layer and applies a three-dimensional convolution with kernel size 1 to the output of a multi-scale residual module in the decoder;
36) the up_conv_32_11 layer consists of a Conv3D layer and applies a three-dimensional convolution with kernel size 1 to the output of a multi-scale residual module in the decoder;
37) the up_16_11 layer consists of an UpSampling3D layer and upsamples the output of the up_conv_16_11 layer;
38) the add_01 layer consists of an add layer and performs additive fusion of the outputs of the up_16_11 and up_conv_32_11 layers;
39) the up_conv_64_11 layer consists of a Conv3D layer and applies a three-dimensional convolution with kernel size 1 to the output of a multi-scale residual module in the decoder;
40) the up_add_01 layer consists of an UpSampling3D layer and upsamples the output of the add_01 layer;
41) the add_02 layer consists of an add layer and performs additive fusion of the outputs of the up_add_01 and up_conv_64_11 layers;
42) the up_conv_128_11 layer consists of a Conv3D layer and applies a three-dimensional convolution with kernel size 1 to the output of a multi-scale residual module in the decoder;
43) the up_add_02 layer consists of an UpSampling3D layer and upsamples the output of the add_02 layer;
44) the add_03 layer consists of an add layer and performs additive fusion of the outputs of the up_add_02 and up_conv_128_11 layers;
45) construct the output layer: the output of the last layer of the ensemble-learning branch structure is fed into the output layer, which comprises a conv10 layer;
46) the conv10 layer consists of a Conv3D layer, applies a three-dimensional convolution with kernel size 1 and sigmoid activation to the output of the ensemble-learning branch structure, and outputs the segmentation result;
47) the two-class Dice coefficient is used as the evaluation function of the model, and the Dice coefficient between the output result and the label is calculated;
48) the Dice coefficient is defined as:
Dice = 2|G ∩ P| / (|G| + |P|)
g represents a label pixel value, P represents a prediction pixel value, the value range is a closed interval from 0 to 1, 1 is completely overlapped, and 0 is completely not overlapped;
49) designing a loss function using a Dice loss function as a model;
50) the Dice loss function is defined as: loss 1-Dice.
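A minimal sketch of the Dice coefficient and Dice loss as Keras-compatible functions (the smoothing constant is an assumption added for numerical stability and is not part of the patent's definition):

```python
from tensorflow.keras import backend as K

def dice_coefficient(y_true, y_pred, smooth=1e-6):
    """Two-class Dice coefficient: Dice = 2|G ∩ P| / (|G| + |P|)."""
    g = K.flatten(y_true)
    p = K.flatten(y_pred)
    intersection = K.sum(g * p)
    return (2.0 * intersection + smooth) / (K.sum(g) + K.sum(p) + smooth)

def dice_loss(y_true, y_pred):
    """Dice loss: Loss = 1 - Dice."""
    return 1.0 - dice_coefficient(y_true, y_pred)

# model.compile(optimizer="adam", loss=dice_loss, metrics=[dice_coefficient])
```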
Further, the step 3 comprises:
1) dividing the data in the acquired ADNI data set by disease state into three groups: Alzheimer's disease, mild cognitive impairment and normal controls;
2) randomly selecting an equal amount of data from each of the three groups as the training data set; a random number is set as the parameter that randomly shifts the cropping slices while ensuring that no part of the hippocampus is missed, and cropping multiple times expands the training data set;
3) selecting an equal amount of data from the remaining data of the three groups as the validation set;
4) using the remaining data of the three groups as the test set.
Further, the step 4 comprises:
1) feeding the images and labels of the training and validation sets into the network for offline training;
2) if the callback functions are configured for learning-rate decay and early stopping, the model decays the learning rate and stops training according to how the validation-set loss decreases during training, and the model with the lowest validation-set loss is saved.
Further, the step 5 comprises:
1) the trained model achieves a Dice coefficient of 0.8379 on the validation set;
2) a three-dimensional hippocampus segmentation test is performed on the model with the test-set data, giving a Dice coefficient of 0.8269 on the test set;
3) the segmentation results obtained by the model on the test set are shown in FIG. 6.
In summary, the experimental results of the invention show that, by designing a semantic segmentation network structure tailored to the structural characteristics of the hippocampus, the network's utilization of multi-dimensional feature information can be improved, thereby strengthening dense pixel prediction and improving hippocampus segmentation performance.

Claims (5)

1. A hippocampus three-dimensional semantic network segmentation method based on a multi-scale feature multi-path attention fusion mechanism comprises the following steps:
step 1: acquiring magnetic resonance image data and preprocessing the magnetic resonance image data;
step 2: designing a three-dimensional hippocampus semantic segmentation network structure based on a multi-scale feature multi-path attention fusion mechanism;
step 3: partitioning the training, validation and test data sets;
step 4: training a model and saving the model file with the best performance;
step 5: testing the model and evaluating the segmentation result;
in step 2, a three-dimensional semantic segmentation network structure based on multi-scale feature extraction and a multi-path attention fusion mechanism is constructed, the network comprising an encoder structure for extracting image feature information, a decoder structure for generating dense pixel predictions, and a branch structure forming the ensemble learning;
constructing an input layer, wherein the input layer inputs the data of the data set; constructing an encoder, wherein the encoder structure performs feature extraction on the output of the input layer using multi-scale residual modules and pooling layers; constructing a decoder, wherein the decoder structure applies a channel attention module and a spatial attention module to the encoder output to weight the channel-dimension and spatial-dimension features respectively, so as to highlight hippocampus-related features in the feature maps, and the feature maps in the encoder and decoder are concatenated and fused through skip-connection structures; constructing an ensemble-learning branch, wherein the ensemble-learning branch structure uses three-dimensional convolutional layers with kernel size 1 to produce a two-class output from each decoder layer, fuses these outputs layer by layer to generate a weak classification result, and combines it with the classification result of the decoder; constructing an output layer, wherein the output layer outputs the two-class segmentation result through an activation function; the two-class Dice coefficient is used as the evaluation function of the model and is defined as follows:
Dice = 2|G ∩ P| / (|G| + |P|) (1)
wherein G represents the label pixel value and P represents the predicted pixel value; the Dice loss function is used as the loss function of the model and is defined as follows:
Loss=1-Dice (2)。
2. The hippocampus three-dimensional semantic network segmentation method based on the multi-scale feature multi-path attention fusion mechanism according to claim 1, characterized in that in step 1, image data containing hippocampus labels are first acquired from a database and the left and right hippocampi are merged; the hippocampal region is cropped from the original images and labels by selecting slices containing the hippocampus in the transverse, sagittal and coronal directions, removing background information from the data; and the cropped image data are normalized.
3. The hippocampus three-dimensional semantic network segmentation method based on the multi-scale feature multi-path attention fusion mechanism according to claim 1, characterized in that in step 3, the data in the acquired data set are divided by disease state into three groups: Alzheimer's disease, mild cognitive impairment and normal controls; an equal amount of data is randomly selected from the three groups as the training data set, a random number is set as the parameter that randomly shifts the cropping slices while ensuring that no part of the hippocampus is missed, and cropping multiple times expands the training data set and serves as data augmentation; an equal amount of data is selected from the remaining data of the three groups as the validation set, and the remaining data of the three groups are used as the test set.
4. The hippocampus three-dimensional semantic network segmentation method based on the multi-scale feature multi-path attention fusion mechanism according to claim 1, characterized in that in step 4, if "learning rate decay" and "early stopping" callback functions are used to control the hyper-parameters during model training, the model reduces the learning rate and stops training according to the change of the validation-set loss value during training, and the model with the lowest loss value on the validation set is saved.
5. The hippocampus three-dimensional semantic network segmentation method based on the multi-scale feature multi-path attention fusion mechanism according to claim 1, characterized in that in step 5, the obtained model is used to perform a segmentation test on the test-set data, and the segmentation result is compared with that of a classical network structure.
CN202110269960.8A 2021-03-12 2021-03-12 Hippocampus three-dimensional semantic network segmentation method based on multi-scale feature multi-path attention fusion mechanism Pending CN113052856A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110269960.8A CN113052856A (en) 2021-03-12 2021-03-12 Hippocampus three-dimensional semantic network segmentation method based on multi-scale feature multi-path attention fusion mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110269960.8A CN113052856A (en) 2021-03-12 2021-03-12 Hippocampus three-dimensional semantic network segmentation method based on multi-scale feature multi-path attention fusion mechanism

Publications (1)

Publication Number Publication Date
CN113052856A true CN113052856A (en) 2021-06-29

Family

ID=76512131

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110269960.8A Pending CN113052856A (en) 2021-03-12 2021-03-12 Hippocampus three-dimensional semantic network segmentation method based on multi-scale feature multi-path attention fusion mechanism

Country Status (1)

Country Link
CN (1) CN113052856A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113496496A (en) * 2021-07-07 2021-10-12 中南大学 MRI image hippocampus region segmentation method based on multiple losses and multiple scale characteristics
CN113706570A (en) * 2021-08-02 2021-11-26 中山大学 Segmentation method and device for zebra fish fluorescence image
CN113706570B (en) * 2021-08-02 2023-09-15 中山大学 Segmentation method and device for zebra fish fluorescence image
CN113628223A (en) * 2021-08-05 2021-11-09 杭州隐捷适生物科技有限公司 Dental CBCT three-dimensional tooth segmentation method based on deep learning
CN113870243A (en) * 2021-10-11 2021-12-31 烟台大学 Magnetic resonance brain image hippocampus segmentation method
CN116563285A (en) * 2023-07-10 2023-08-08 邦世科技(南京)有限公司 Focus characteristic identifying and dividing method and system based on full neural network
CN116563285B (en) * 2023-07-10 2023-09-19 邦世科技(南京)有限公司 Focus characteristic identifying and dividing method and system based on full neural network
CN116681705A (en) * 2023-08-04 2023-09-01 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Surface morphology measurement method and processing equipment based on longitudinal structure of human brain hippocampus
CN116681705B (en) * 2023-08-04 2023-09-29 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Surface morphology measurement method and processing equipment based on longitudinal structure of human brain hippocampus

Similar Documents

Publication Publication Date Title
CN113052856A (en) Hippocampus three-dimensional semantic network segmentation method based on multi-scale feature multi-path attention fusion mechanism
CN112116605B (en) Pancreas CT image segmentation method based on integrated depth convolution neural network
CN111369565B (en) Digital pathological image segmentation and classification method based on graph convolution network
CN115482241A (en) Cross-modal double-branch complementary fusion image segmentation method and device
CN111080657A (en) CT image organ segmentation method based on convolutional neural network multi-dimensional fusion
CN112465754B (en) 3D medical image segmentation method and device based on layered perception fusion and storage medium
CN113706542A (en) Eyeball segmentation method and device based on convolutional neural network and mixed loss function
CN115620010A (en) Semantic segmentation method for RGB-T bimodal feature fusion
CN112288749A (en) Skull image segmentation method based on depth iterative fusion depth learning model
CN111862261B (en) FLAIR modal magnetic resonance image generation method and system
CN113421240A (en) Mammary gland classification method and device based on ultrasonic automatic mammary gland full-volume imaging
CN110490843A (en) A kind of eye fundus image blood vessel segmentation method
CN112949707A (en) Cross-mode face image generation method based on multi-scale semantic information supervision
CN114742802B (en) Pancreas CT image segmentation method based on 3D transform mixed convolution neural network
CN115661165A (en) Glioma fusion segmentation system and method based on attention enhancement coding and decoding network
CN112150470A (en) Image segmentation method, image segmentation device, image segmentation medium, and electronic device
CN111667488B (en) Medical image segmentation method based on multi-angle U-Net
CN113538359A (en) System and method for finger vein image segmentation
CN116433654A (en) Improved U-Net network spine integral segmentation method
KR102561214B1 (en) A method and apparatus for image segmentation using global attention
CN113744284B (en) Brain tumor image region segmentation method and device, neural network and electronic equipment
CN114331996A (en) Medical image classification method and system based on self-coding decoder
CN114581459A (en) Improved 3D U-Net model-based segmentation method for image region of interest of preschool child lung
CN114418949A (en) Pulmonary nodule detection method based on three-dimensional U-shaped network and channel attention
CN114782532A (en) Spatial attention method and device for PET-CT (positron emission tomography-computed tomography) multi-modal tumor segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination