CN111311592B - Three-dimensional medical image automatic segmentation method based on deep learning - Google Patents

Three-dimensional medical image automatic segmentation method based on deep learning

Info

Publication number
CN111311592B
Authority
CN
China
Prior art keywords
segmentation
layer
output
convolutional layer
convolutional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010172837.XA
Other languages
Chinese (zh)
Other versions
CN111311592A (en)
Inventor
赵于前
潘宇
杨振
张帆
陈武阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Tiao Medical Technology Co.,Ltd.
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University filed Critical Central South University
Priority to CN202010172837.XA priority Critical patent/CN111311592B/en
Publication of CN111311592A publication Critical patent/CN111311592A/en
Application granted granted Critical
Publication of CN111311592B publication Critical patent/CN111311592B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T7/0012: Biomedical image inspection (under G06T7/00 Image analysis; G06T7/0002 Inspection of images, e.g. flaw detection)
    • G06T7/10: Segmentation; Edge detection
    • G06T2207/10081: Computed x-ray tomography [CT]
    • G06T2207/10088: Magnetic resonance imaging [MRI]
    • G06T2207/10104: Positron emission tomography [PET]
    • G06T2207/20081: Training; Learning
    • G06T2207/20084: Artificial neural networks [ANN]
    • G06T2207/30016: Brain
    • G06T2207/30056: Liver; Hepatic
    • G06T2207/30096: Tumor; Lesion
    All of the above fall under G (Physics), G06 (Computing; Calculating or Counting), G06T (Image Data Processing or Generation, in General); the imaging-modality entries fall under G06T2207/10072 (Tomographic images) and the anatomical entries under G06T2207/30004 (Biomedical image processing).

Abstract

The invention discloses a deep-learning-based automatic segmentation method for three-dimensional medical images, which mainly addresses the coarse segmentation results, unstable training, and low small-target segmentation accuracy of the prior art. The implementation scheme is as follows: 1) acquiring a three-dimensional medical image; 2) expanding the sample data set; 3) constructing a new feature extraction network; 4) constructing a region-of-interest adaptive attention network; 5) obtaining a segmentation model; 6) training the segmentation model; 7) segmenting the three-dimensional medical image. The segmentation model constructed by the method effectively alleviates instability and overfitting during deep convolutional neural network training, reduces the class-imbalance problem caused by unequal numbers of positive and negative samples, markedly improves the segmentation accuracy of small and medium targets in three-dimensional medical images, has good robustness, and can be used in computer-aided diagnosis systems.

Description

Three-dimensional medical image automatic segmentation method based on deep learning
Technical Field
The invention belongs to the technical field of image processing, relates to automatic segmentation of three-dimensional medical images, and can be used for computer-aided diagnosis and treatment.
Background
Medical image segmentation refers to delineating a region of interest, such as a lesion region, an organ at risk, or a radiotherapy target volume, from an original medical image such as CT, MRI, or PET. It is an important component of the medical image processing field and plays an important role in disease diagnosis, treatment planning, and efficacy evaluation. Manual segmentation requires a doctor to determine the position and boundary of the target region from a patient's various medical image data and to draw the result on each slice by hand; this is labor-intensive, time-consuming, and subjective, and different experts may produce large discrepancies when segmenting the same object. An automatic medical image segmentation method is therefore of great significance for improving doctors' working efficiency, strengthening the role of medical imaging in disease diagnosis, and improving the accuracy and efficiency of computer-aided diagnosis.

Traditional automatic segmentation methods, such as thresholding, region growing, and clustering, are gradually being replaced by deep learning methods because they are slow, inaccurate, and unable to segment multiple targets simultaneously. Most existing deep-learning segmentation models are built on two-dimensional images: each slice of a three-dimensional image is segmented by a two-dimensional model, and the per-slice results are stitched together as the final three-dimensional result. Because a two-dimensional model cannot capture the inter-slice information of a three-dimensional medical image, the results of adjacent slices cannot be connected accurately, so the final three-dimensional segmentation is coarse or even wrong, which hinders accurate diagnosis and treatment. In addition, limited GPU memory prevents training on whole three-dimensional volumes; the original image must be cut into three-dimensional blocks used as model input, but this dicing leaves too few samples containing small targets (such as tumors), causing a severe class-imbalance problem that greatly degrades small-target segmentation accuracy and efficiency.
Disclosure of Invention
The invention fully considers the defects of the prior art and aims to provide a deep-learning-based automatic segmentation method for three-dimensional medical images that alleviates instability and overfitting during deep convolutional neural network training, improves the segmentation accuracy of small and medium targets in three-dimensional medical images, and offers good robustness.
First, technical principle
Most current convolutional-neural-network segmentation models consist of an encoder and a decoder. The encoder identifies the target by trading spatial information for high-dimensional semantic information through cascaded convolutional and pooling layers, but the pooling operations shrink the feature maps. To obtain the final voxel-level segmentation result, the identified target must be restored to the original size, so the decoder must recover it from the high-dimensional semantic information and the spatial information. In the prior art, however, a large amount of spatial information is lost in the stage of obtaining semantic information, and the spatial information of small and medium targets cannot be effectively recovered. To better retain both spatial and semantic information, the invention proposes a multi-scale layer cross-connection structure: the multi-level feature maps of the encoder stage are connected directly to the decoder stage through shortcut connections, so that the model automatically learns to select the hierarchical features it needs when recovering spatial information, realizing reuse of the multi-level feature maps.
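By way of illustration, the following PyTorch sketch shows the idea behind the multi-scale layer cross-connection; the module name, the trilinear resizing, and the example channel counts are illustrative assumptions, not details fixed by the patent:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossConnection(nn.Module):
    """Fuse encoder feature maps from several depths for one decoder stage.

    The feature maps are resized to a common spatial size, concatenated
    along the channel axis, and reduced with a 1x1x1 convolution, letting
    the decoder select whichever hierarchy levels it needs.
    """
    def __init__(self, in_channels_list, out_channels):
        super().__init__()
        self.reduce = nn.Conv3d(sum(in_channels_list), out_channels, kernel_size=1)

    def forward(self, encoder_feats, target_size):
        resized = [F.interpolate(f, size=target_size, mode="trilinear",
                                 align_corners=False) for f in encoder_feats]
        fused = torch.cat(resized, dim=1)   # concatenate multi-level features
        return self.reduce(fused)           # 1x1x1 dimension reduction

# e.g. fuse encoder levels with 32, 48, 64 and 96 channels into 32 channels:
# cross = CrossConnection([32, 48, 64, 96], 32)
```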
In order to solve the class-imbalance problem caused by too few samples containing small targets, the invention introduces a region-of-interest adaptive attention mechanism based on an auxiliary loss function. By predicting the bounding rectangle of the large target, the encoder of the model is guided to attend to the position of the large target that contains the small target, which narrows the search region for the small target, realizes the model's adaptive attention mechanism, and improves small-target segmentation accuracy. Meanwhile, to prevent instability and overfitting during network training, a residual convolution module, a channel-adaptive attention (SE) module, and an anti-aliasing pooling operation are introduced.
Secondly, according to the principle, the invention is realized by the following scheme:
a three-dimensional medical image automatic segmentation method based on deep learning comprises the following steps:
(1) acquiring an original training data set from a public three-dimensional medical image segmentation database, extracting bounding-rectangle information of the region of interest by reading the annotation data in the original training data set, and forming a sample data set from the case images, their segmentation annotations, and the region-of-interest boundary information;
(2) randomly dicing the three-dimensional medical image, and expanding a sample data set:
due to the limitation of GPU memory, the whole three-dimensional medical image cannot be fed directly into the segmentation model, so the original sample data set is scaled and randomly cut into three-dimensional blocks multiple times to form an expanded sample data set;
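A minimal sketch of this scaling-and-dicing step follows; the patch size and the number of blocks per volume are illustrative assumptions, since the patent does not fix them:

```python
import numpy as np

def random_dice(volume, label, patch_size=(64, 128, 128), n_patches=8):
    """Expand one sample by cutting aligned random 3D blocks from it.

    volume, label: arrays of shape (D, H, W), assumed at least patch_size
    in every dimension; image and annotation are cropped at the same
    random position so they stay aligned.
    """
    d, h, w = volume.shape
    pd, ph, pw = patch_size
    patches = []
    for _ in range(n_patches):
        z = np.random.randint(0, d - pd + 1)
        y = np.random.randint(0, h - ph + 1)
        x = np.random.randint(0, w - pw + 1)
        patches.append((volume[z:z+pd, y:y+ph, x:x+pw],
                        label[z:z+pd, y:y+ph, x:x+pw]))
    return patches
```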
(3) constructing a new feature extraction network, which specifically comprises the following steps:
(3-a) taking a 3D U-Net network as a basic network, wherein the basic network comprises ten convolutional layers and four splicing layers, numbered consecutively in a single sequence; the output of the first convolutional layer is connected with the output of the eleventh convolutional layer to form the twelfth splicing layer, the output of the second convolutional layer is connected with the output of the ninth convolutional layer to form the tenth splicing layer, the output of the third convolutional layer is connected with the output of the seventh convolutional layer to form the eighth splicing layer, and the output of the fourth convolutional layer is connected with the output of the fifth convolutional layer to form the sixth splicing layer;
(3-b) adding cross-connections among multiple layers to the basic network described in step (3-a) to construct a new feature extraction network: the outputs of the first, second, third and fourth convolutional layers are concatenated, reduced in dimension by convolution, and then connected with the output of the eleventh convolutional layer to form the twelfth splicing layer; the outputs of the second, third and fourth convolutional layers are concatenated, reduced in dimension by convolution, and then connected with the output of the ninth convolutional layer to form the tenth splicing layer; and the outputs of the third and fourth convolutional layers are connected with the output of the seventh convolutional layer to form the eighth splicing layer;
(3-c) reconstructing all convolutional layers in the base network described in step (3-a) as follows:
I. replacing the convolution module in the original convolutional layer with a residual convolution module;
II. replacing the pooling module in the original convolutional layer with an anti-aliasing pooling module, where anti-aliasing pooling adds a smoothing convolution operation before the maximum pooling operation;
III. adding a channel-adaptive attention (SE) module after the residual convolution module and before the anti-aliasing pooling module (see the sketch below);
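A hedged PyTorch sketch of one reconstructed convolutional layer following steps I-III; the kernel sizes, the SE reduction ratio, and the fixed box smoothing filter are assumptions, since the patent does not specify the exact smoothing convolution:

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Channel-adaptive attention (squeeze-and-excitation) for 3D features."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool3d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        b, c = x.shape[:2]
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1, 1)
        return x * w                                # re-weight each channel

class ReconstructedConvLayer(nn.Module):
    """Residual conv block (I) -> SE module (III) -> anti-aliased pooling (II)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(out_ch, out_ch, 3, padding=1))
        self.skip = nn.Conv3d(in_ch, out_ch, 1)     # shortcut path
        self.se = SEBlock(out_ch)
        # anti-aliasing: a smoothing convolution before max pooling;
        # a fixed 3x3x3 box filter is assumed here
        self.blur = nn.Conv3d(out_ch, out_ch, 3, padding=1,
                              groups=out_ch, bias=False)
        nn.init.constant_(self.blur.weight, 1.0 / 27)
        self.blur.weight.requires_grad_(False)
        self.pool = nn.MaxPool3d(2)

    def forward(self, x):
        y = torch.relu(self.conv(x) + self.skip(x)) # residual connection
        y = self.se(y)                              # channel attention
        return self.pool(self.blur(y))              # smoothed, then max-pooled
```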
(3-d) adding a thirteenth convolutional layer and a fourteenth convolutional layer to the basic network described in step (3-a), wherein the output of the twelfth splicing layer is connected with the input of the thirteenth convolutional layer, the output of the thirteenth convolutional layer is connected with the input of the fourteenth convolutional layer, and the output of the fourteenth convolutional layer and the segmentation gold standard labeled in the data set are used to construct a segmentation loss function based on the Dice coefficient loss function, shown as the following formula:
L_seg = 1 − (1/K) Σ_{i=1}^{K} 2|y_i ∩ ŷ_i| / (|y_i| + |ŷ_i|)
where K is the number of segmentation classes, y_i is the segmentation gold standard of class i in the data set, ŷ_i is the output of the fourteenth convolutional layer, i.e., the class-i segmentation result produced by the network, and ∩ denotes the voxel-wise intersection of the two regions.
(4) The method for constructing the region-of-interest adaptive attention network specifically comprises the following steps:
(4-a) adding a region-of-interest adaptive attention network to the feature extraction network obtained in step (3), the network comprising two new convolutional layers: a first new convolutional layer and a second new convolutional layer, wherein the output of the fifth convolutional layer is connected with the input of the first new convolutional layer, and the output of the first new convolutional layer is connected with the input of the second new convolutional layer;
(4-b) constructing a supplementary loss function using a mean square error function from the output of the second new convolution layer and the data set labeled region of interest bounding rectangle information, as shown in the following equation:
L_roi = (t − t_p)²
wherein t is the region-of-interest bounding-rectangle information obtained in step (1), and t_p, the output of the second new convolutional layer, is the network's predicted value of the bounding rectangle.
(5) Obtaining a segmentation model, specifically comprising the following steps:
(5-a) constructing a new total loss function by using the segmentation loss function obtained in the step (3-d) and the auxiliary loss function obtained in the step (4-b), as shown in the following formula:
L_total = αL_roi + (1 − α)L_seg
wherein α balances the segmentation loss function L_seg of step (3-d) and the auxiliary loss function L_roi of step (4-b), and is a constant greater than 0 and less than 1.
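Both loss terms and their combination are then a few lines; a sketch under the same assumptions as above:

```python
import torch

def roi_loss(t_pred, t_true):
    """Auxiliary loss: mean squared error on the bounding-rectangle vector."""
    return torch.mean((t_true - t_pred) ** 2)

def total_loss(l_seg, l_roi, alpha=0.18):
    """Weighted sum of the auxiliary and segmentation losses, 0 < alpha < 1;
    0.18 is the preferred value given later in the description."""
    return alpha * l_roi + (1 - alpha) * l_seg
```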
(5-b) combining the feature extraction network constructed in the step (3) and the region-of-interest adaptive attention network constructed in the step (4) to obtain a final segmentation model;
(6) training a segmentation model:
training the segmentation model constructed in the step (5-b) by using the total loss function obtained in the step (5-a) through the extended sample data set obtained in the step (2), and optimizing the weight parameter of each layer through gradient back propagation to obtain the trained segmentation model;
(7) three-dimensional medical image segmentation:
and (4) segmenting each three-dimensional medical image in the test data set by using the trained segmentation model to obtain a final segmentation result of each medical image.
In step (5-a), the hyperparameter α is preferably 0.18.
The invention has the following advantages:
firstly, aiming at the problem that existing image segmentation models lose too much spatial information at the encoder stage and therefore cannot effectively recover the segmentation result at the decoder stage, the invention provides a layer cross-connection method, so that multi-scale feature maps can be reused at the decoder stage and the network can select the hierarchical features it needs by itself, improving the segmentation accuracy and robustness of the model.
Secondly, aiming at the class-imbalance problem that can arise in three-dimensional model training, the invention designs an attention mechanism based on an auxiliary loss function, so that the model automatically learns to attend to the position of the large target containing the small target, thereby improving the segmentation accuracy of small and medium targets.
Thirdly, the method introduces a residual convolution module, a channel-adaptive attention (SE) module, and an anti-aliasing pooling operation, which effectively stabilize the training process and reduce the network's overfitting risk and training difficulty.
Drawings
FIG. 1 is a flow chart of a method for automatically segmenting three-dimensional medical images based on deep learning according to an embodiment of the present invention;
FIG. 2 illustrates a convolutional neural network-based segmentation model constructed in an embodiment of the present invention;
FIG. 3 is a diagram of a method for constructing convolutional layers in a feature extraction network in accordance with an embodiment of the present invention;
FIG. 4 compares the liver and tumor segmentation results of an embodiment of the present invention with those of other methods;
FIG. 5 compares the brain tumor segmentation results of an embodiment of the present invention with those of other methods.
Detailed Description
The following describes specific embodiments of the present invention:
example 1
Fig. 1 is a flowchart of an embodiment of an automatic segmentation method for a three-dimensional medical image based on deep learning, which includes the following specific steps:
step 1, acquiring a three-dimensional medical image.
An original training data set is obtained from a public three-dimensional medical image segmentation database; bounding-rectangle information of the region of interest is extracted by reading the annotation data in the original training data set; and a sample data set is formed from the case images, their segmentation annotations, and the region-of-interest boundary information.
Step 2, randomly dicing the three-dimensional medical image and expanding the sample data set.
Due to the limitation of GPU memory, the whole three-dimensional medical image cannot be fed directly into the segmentation model, so the original sample data set is scaled and randomly diced in three dimensions multiple times to form an expanded sample data set.
Step 3, constructing a new feature extraction network.
Fig. 2 shows a segmentation model based on a convolutional neural network constructed in the embodiment of the present invention, which includes the following specific steps:
(3-a) taking a 3D U-Net network as a basic network, wherein the basic network comprises ten convolutional layers and four splicing layers, numbered consecutively in a single sequence; the output of the first convolutional layer is connected with the output of the eleventh convolutional layer to form the twelfth splicing layer, the output of the second convolutional layer is connected with the output of the ninth convolutional layer to form the tenth splicing layer, the output of the third convolutional layer is connected with the output of the seventh convolutional layer to form the eighth splicing layer, and the output of the fourth convolutional layer is connected with the output of the fifth convolutional layer to form the sixth splicing layer;
(3-b) adding cross-connections among multiple layers to the basic network described in step (3-a) to construct a new feature extraction network: the outputs of the first, second, third and fourth convolutional layers are concatenated, reduced in dimension by convolution, and then connected with the output of the eleventh convolutional layer to form the twelfth splicing layer; the outputs of the second, third and fourth convolutional layers are concatenated, reduced in dimension by convolution, and then connected with the output of the ninth convolutional layer to form the tenth splicing layer; and the outputs of the third and fourth convolutional layers are connected with the output of the seventh convolutional layer to form the eighth splicing layer;
(3-c) reconstructing all convolutional layers in the base network described in the step (3-a).
Fig. 3 shows a method for constructing a convolutional layer according to an embodiment of the present invention, which includes the following steps:
I. replacing the convolution module in the original convolutional layer with a residual convolution module;
II. replacing the pooling module in the original convolutional layer with an anti-aliasing pooling module, where anti-aliasing pooling adds a smoothing convolution operation before the maximum pooling operation;
III. adding a channel-adaptive attention (SE) module after the residual convolution module and before the anti-aliasing pooling module;
(3-d) adding a thirteenth convolutional layer and a fourteenth convolutional layer to the basic network described in step (3-a), wherein the output of the twelfth splicing layer is connected with the input of the thirteenth convolutional layer, the output of the thirteenth convolutional layer is connected with the input of the fourteenth convolutional layer, and the output of the fourteenth convolutional layer and the segmentation gold standard labeled in the data set are used to construct a segmentation loss function based on the Dice coefficient loss function, shown as the following formula:
L_seg = 1 − (1/K) Σ_{i=1}^{K} 2|y_i ∩ ŷ_i| / (|y_i| + |ŷ_i|)
where K is the number of segmentation classes, y_i is the segmentation gold standard of class i in the data set, ŷ_i is the output of the fourteenth convolutional layer, i.e., the class-i segmentation result produced by the network, and ∩ denotes the voxel-wise intersection of the two regions.
The structure of each layer of this part of the network is constructed as follows:
the first convolutional layer uses two groups of 32 residual-connected convolution kernels of size 3 × 3 × 3 to output 32 feature maps, followed by a channel-adaptive attention (SE) module and finally an anti-aliasing pooling operation with a stride of 2;
the second convolutional layer uses two groups of 48 residual-connected convolution kernels of size 3 × 3 × 3 to output 48 feature maps, followed by an SE module and an anti-aliasing pooling operation with a stride of 2;
the third convolutional layer uses two groups of 64 residual-connected convolution kernels of size 3 × 3 × 3 to output 64 feature maps, followed by an SE module and an anti-aliasing pooling operation with a stride of 2;
the fourth convolutional layer uses two groups of 96 residual-connected convolution kernels of size 3 × 3 × 3 to output 96 feature maps, followed by an SE module and an anti-aliasing pooling operation with a stride of 2;
the fifth convolutional layer uses two groups of 128 residual-connected convolution kernels of size 3 × 3 × 3 to output 128 feature maps, followed by an SE module and finally a bilinear interpolation operation with a scaling factor of 2;
the sixth splicing layer splices the outputs of the fourth and fifth convolutional layers and uses 128 convolution kernels of size 1 × 1 × 1 to fuse the feature maps, outputting 128 feature maps;
the seventh convolutional layer uses two groups of 96 residual-connected convolution kernels of size 3 × 3 × 3 to output 96 feature maps, followed by an SE module and a bilinear interpolation operation with a scaling factor of 2;
the eighth splicing layer splices the outputs of the third and fourth convolutional layers with the output of the seventh convolutional layer and uses 96 convolution kernels of size 1 × 1 × 1 to fuse the feature maps, outputting 96 feature maps;
the ninth convolutional layer uses two groups of 64 residual-connected convolution kernels of size 3 × 3 × 3 to output 64 feature maps, followed by an SE module and a bilinear interpolation operation with a scaling factor of 2;
the tenth splicing layer splices the outputs of the second, third and fourth convolutional layers with the output of the ninth convolutional layer and uses 64 convolution kernels of size 1 × 1 × 1 to fuse the feature maps, outputting 64 feature maps;
the eleventh convolutional layer uses two groups of 48 residual-connected convolution kernels of size 3 × 3 × 3 to output 48 feature maps, followed by an SE module and a bilinear interpolation operation with a scaling factor of 2;
the twelfth splicing layer splices the outputs of the first, second, third and fourth convolutional layers with the output of the eleventh convolutional layer and uses 32 convolution kernels of size 1 × 1 × 1 to fuse the feature maps, outputting 32 feature maps;
the thirteenth convolutional layer uses two groups of 32 residual-connected convolution kernels of size 3 × 3 × 3 to output 32 feature maps, followed by an SE module;
the fourteenth convolutional layer uses n convolution kernels of size 1 × 1 × 1 to output the n classes of segmentation results, where n is the number of segmentation classes.
Step 4, constructing the region-of-interest adaptive attention network.
(4-a) adding a region-of-interest adaptive attention network to the feature extraction network obtained in step (3), the network comprising two new convolutional layers: a first new convolutional layer and a second new convolutional layer, wherein the output of the fifth convolutional layer is connected with the input of the first new convolutional layer, and the output of the first new convolutional layer is connected with the input of the second new convolutional layer;
(4-b) constructing a supplementary loss function using a mean square error function from the output of the second new convolution layer and the data set labeled region of interest bounding rectangle information, as shown in the following equation:
L_roi = (t − t_p)²
wherein t is the region-of-interest bounding-rectangle information obtained in step (1), and t_p, the output of the second new convolutional layer, is the network's predicted value of the bounding rectangle.
The structure of each layer of this part of the region-of-interest adaptive attention network is constructed as follows:
the first new convolutional layer uses two groups of 32 residual-connected convolution kernels of size 3 × 3 × 3 to output 32 feature maps, followed by a channel-adaptive attention (SE) module and finally a global average pooling operation;
the second new convolutional layer uses 6 convolution kernels of size 1 × 1 × 1 to obtain the predicted region-of-interest bounding box.
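A simplified PyTorch sketch of this two-layer head; the 128-channel input matches the fifth convolutional layer described above, while the residual grouping and SE details are elided and the coordinate ordering of the six outputs is an assumption:

```python
import torch.nn as nn

class ROIAttentionHead(nn.Module):
    """Predict the six bounding-rectangle values of the large target
    (e.g. z1, y1, x1, z2, y2, x2) from the bottleneck features."""
    def __init__(self, in_ch=128):
        super().__init__()
        self.conv1 = nn.Sequential(            # first new convolutional layer
            nn.Conv3d(in_ch, 32, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1))           # global average pooling
        self.conv2 = nn.Conv3d(32, 6, 1)       # second new layer: 6 outputs

    def forward(self, bottleneck_feat):
        return self.conv2(self.conv1(bottleneck_feat)).flatten(1)  # (N, 6)
```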
Step 5, obtaining the segmentation model.
(5-a) constructing a new total loss function from the loss functions obtained in steps (3-d) and (4-b), as shown in the following formula:
L_total = αL_roi + (1 − α)L_seg
wherein α balances the segmentation loss function L_seg of step (3-d) and the auxiliary loss function L_roi of step (4-b). In this embodiment, α is preferably 0.18.
And (5-b) combining the feature extraction network constructed in the step (3) and the region-of-interest adaptive attention network constructed in the step (4) to obtain a final segmentation model.
Step 6, training the segmentation model.
The segmentation model constructed in step (5-b) is trained on the expanded sample data set obtained in step (2) using the total loss function obtained in step (5-a), and the weight parameters of each layer are optimized through gradient back-propagation to obtain the trained segmentation model.
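A hedged sketch of this training step; the optimizer, learning rate, epoch count, and the model/loader interfaces are assumptions, and dice_loss is the function sketched in step (3-d):

```python
import torch

def train(model, loader, epochs=100, lr=1e-4, alpha=0.18, device="cuda"):
    """Forward pass, total loss, and gradient back-propagation."""
    model.to(device).train()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for patch, label, bbox in loader:        # expanded sample data set
            patch, label = patch.to(device), label.to(device)
            bbox = bbox.to(device)
            seg_out, bbox_out = model(patch)     # segmentation + ROI branches
            loss = alpha * torch.mean((bbox - bbox_out) ** 2) \
                 + (1 - alpha) * dice_loss(seg_out, label)
            opt.zero_grad()
            loss.backward()                      # gradient back-propagation
            opt.step()                           # update layer weights
    return model
```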
Step 7, segmenting the three-dimensional medical image.
Each three-dimensional medical image in the test data set is segmented with the trained segmentation model to obtain the final segmentation result of each medical image.
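Because a whole volume does not fit in GPU memory (the reason for dicing in step 2), inference is typically done block-wise; the sketch below assumes non-overlapping blocks and volume dimensions that are multiples of the patch size:

```python
import torch

@torch.no_grad()
def segment_volume(model, volume, patch=(64, 128, 128), device="cuda"):
    """Slide a 3D window over the volume and stitch per-block predictions.

    volume: float tensor of shape (D, H, W); returns a label map (D, H, W).
    Overlapping windows with averaging are a common refinement.
    """
    model.to(device).eval()
    d, h, w = volume.shape
    pd, ph, pw = patch
    out = torch.zeros((d, h, w), dtype=torch.long)
    for z in range(0, d, pd):
        for y in range(0, h, ph):
            for x in range(0, w, pw):
                block = volume[z:z+pd, y:y+ph, x:x+pw]
                inp = block[None, None].to(device)   # (1, 1, d, h, w)
                seg_out, _ = model(inp)              # ignore the ROI branch
                out[z:z+pd, y:y+ph, x:x+pw] = seg_out.argmax(dim=1)[0].cpu()
    return out
```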
Example 2
Liver tumor segmentation experiments were performed on the public data set LiTS (Liver Tumor Segmentation challenge) using the method in Example 1. Three classes are segmented: background, liver, and tumor. Experimental environment: Linux Ubuntu 16.06; two NVIDIA 1080Ti 11 GB GPUs; software platform: Python and PyTorch.
FIG. 4 compares the liver and tumor segmentation results of an embodiment of the present invention with those of other methods. Each medical image in the test data set was segmented with the trained segmentation model; sample results are shown in FIG. 4, where FIGS. 4(a)-(e) show the results of 3D U-Net, FCN + SegSE, CE-Net, Attention U-Net, and the method of the invention, respectively. The small-tumor segmentation in FIGS. 4(a) and 4(d) is poor, with missed detections; FIG. 4(c) contains mis-segmentations; and the liver segmentation edge in FIG. 4(b) is rough.
TABLE 1
[Table 1: quantitative comparison of Dice coefficient, Sensitivity, and ASD for liver and tumor segmentation; published as an image in the original document.]
The average segmentation accuracy of each of the above methods on the test sample set was compared using the Dice coefficient, Sensitivity, and Average Symmetric Surface Distance (ASD); the results are shown in Table 1.
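For reference, the Dice coefficient and Sensitivity used in Table 1 can be computed from binary masks as follows (ASD requires surface extraction and is omitted):

```python
import numpy as np

def dice_coefficient(pred, gt, eps=1e-8):
    """Dice = 2|A ∩ B| / (|A| + |B|) for boolean masks."""
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + eps)

def sensitivity(pred, gt, eps=1e-8):
    """Sensitivity (recall) = TP / (TP + FN)."""
    tp = np.logical_and(pred, gt).sum()
    return tp / (gt.sum() + eps)
```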
Example 3
Brain tumor segmentation experiments were performed on the public data set BraTS (Brain Tumor Segmentation challenge) using the method in Example 1. Five classes are segmented: background, necrotic tissue (Necrosis), edema (Edema), non-enhancing tumor (Non-enhancing Tumor), and enhancing tumor (Enhancing Tumor). Experimental environment: Linux Ubuntu 16.06; two NVIDIA 1080Ti 11 GB GPUs; software platform: Python and PyTorch.
FIG. 5 compares the brain tumor segmentation results of an embodiment of the present invention with those of other methods. Each medical image in the test data set was segmented with the trained segmentation model; sample results are shown in FIG. 5, where FIGS. 5(a)-(e) show the results of 3D U-Net, FCN + SegSE, CE-Net, Attention U-Net, and the method of the invention, respectively. FIGS. 5(a)-(c) show obvious over-segmentation, with non-tumor regions wrongly segmented as tumor, and FIG. 5(d) misses part of the enhancing tumor.
The average segmentation accuracy of each of the above methods on the test sample set was compared using the Dice coefficient, Sensitivity, and Hausdorff distance (HD); the results are shown in Table 2, where the whole tumor (Whole Tumor) includes necrotic tissue, edema, non-enhancing tumor, and enhancing tumor, and the tumor core (Tumor Core) includes necrotic tissue, non-enhancing tumor, and enhancing tumor.
TABLE 2
[Table 2: quantitative comparison of Dice coefficient, Sensitivity, and HD for brain tumor segmentation; published as an image in the original document.]

Claims (5)

1. A three-dimensional medical image automatic segmentation method based on deep learning is characterized by comprising the following steps:
(1) acquiring an original training data set from a public three-dimensional medical image segmentation database, extracting bounding-rectangle information of the region of interest by reading the annotation data in the original training data set, and forming a sample data set from the case images, their segmentation annotations, and the region-of-interest boundary information;
(2) randomly dicing the three-dimensional medical image, and expanding a sample data set:
due to the limitation of GPU memory, the whole three-dimensional medical image cannot be fed directly into the segmentation model, so the original sample data set is scaled and randomly cut into three-dimensional blocks multiple times to form an expanded sample data set;
(3) constructing a new feature extraction network, which specifically comprises the following steps:
(3-a) taking a 3D U-Net network as a basic network, wherein the basic network comprises ten convolutional layers and four splicing layers, the output of the first convolutional layer is connected with the output of the eleventh convolutional layer to form a twelfth splicing layer, the output of the second convolutional layer is connected with the output of the ninth convolutional layer to form a tenth splicing layer, the output of the third convolutional layer is connected with the output of the seventh convolutional layer to form an eighth splicing layer, and the output of the fourth convolutional layer is connected with the output of the fifth convolutional layer to form a sixth splicing layer;
(3-b) adding cross-connections among multiple layers to the basic network described in step (3-a) to construct a new feature extraction network: the outputs of the first, second, third and fourth convolutional layers are concatenated, reduced in dimension by convolution, and then connected with the output of the eleventh convolutional layer to form the twelfth splicing layer; the outputs of the second, third and fourth convolutional layers are concatenated, reduced in dimension by convolution, and then connected with the output of the ninth convolutional layer to form the tenth splicing layer; and the outputs of the third and fourth convolutional layers are connected with the output of the seventh convolutional layer to form the eighth splicing layer;
(3-c) reconstructing all convolutional layers in the base network described in step (3-a) as follows:
I. replacing a convolution module in the original convolution layer with a residual convolution module;
II. replacing a pooling module in the original convolution layer with an anti-aliasing pooling module, wherein the anti-aliasing pooling adds a smooth convolution operation before the maximum pooling operation;
III. adding a channel self-adaptive attention module behind the residual convolution module and in front of the anti-aliasing pooling module;
(3-d) adding a thirteenth convolutional layer and a fourteenth convolutional layer on the basis network described in the step (3-a), wherein the output of the twelfth splicing layer is connected with the input of the thirteenth convolutional layer, the output of the thirteenth convolutional layer is connected with the input of the fourteenth convolutional layer, and the output of the fourteenth convolutional layer and the segmentation gold standard labeled by the data set construct a segmentation loss function by using a Dice coefficient loss function;
(4) the method for constructing the area-of-interest adaptive attention network specifically comprises the following steps:
(4-a) adding a region-of-interest adaptive attention network to the feature extraction network obtained in step (3), the network comprising two new convolutional layers: a first new convolutional layer and a second new convolutional layer, wherein the output of the fifth convolutional layer is connected with the input of the first new convolutional layer, and the output of the first new convolutional layer is connected with the input of the second new convolutional layer;
(4-b) constructing an auxiliary loss function by using a mean square error function on the output of the second new convolution layer and the information of the region of interest boundary rectangular frame marked by the data set;
(5) obtaining a segmentation model, specifically comprising the following steps:
(5-a) constructing a new total loss function by using the segmentation loss function obtained in the step (3-d) and the auxiliary loss function obtained in the step (4-b);
(5-b) combining the feature extraction network constructed in the step (3) and the region-of-interest adaptive attention network constructed in the step (4) to obtain a final segmentation model;
(6) training a segmentation model:
training the segmentation model constructed in the step (5-b) by using the total loss function obtained in the step (5-a) through the extended sample data set obtained in the step (2), and optimizing the weight parameter of each layer through gradient back propagation to obtain the trained segmentation model;
(7) three-dimensional medical image segmentation:
segmenting each three-dimensional medical image in the test data set by using the trained segmentation model to obtain the final segmentation result of each medical image.
2. The method as claimed in claim 1, wherein the residual convolution module in step (3-c) directly connects the input and output of the convolution by using a shortcut to implement identity mapping.
3. The method for automatic segmentation of three-dimensional medical images based on deep learning as claimed in claim 1, wherein the segmentation loss function in the step (3-d) is constructed as follows:
L_seg = 1 − (1/K) Σ_{i=1}^{K} 2|y_i ∩ ŷ_i| / (|y_i| + |ŷ_i|)
where K is the number of segmentation classes, y_i is the segmentation gold standard of class i in the data set, ŷ_i is the output of the fourteenth convolutional layer, i.e., the class-i segmentation result produced by the network, and ∩ denotes the voxel-wise intersection of the two regions.
4. The method for automatic segmentation of three-dimensional medical image based on deep learning as claimed in claim 1, wherein the auxiliary loss function in the step (4-b) is constructed as follows:
L_roi = (t − t_p)²
wherein t is the region-of-interest bounding-rectangle information obtained in step (1), and t_p, the output of the second new convolutional layer, is the network's predicted value of the bounding rectangle.
5. The method for automatic segmentation of three-dimensional medical images based on deep learning as claimed in claim 1, wherein the total loss function for training in the step (5-a) is constructed as follows:
L_total = αL_roi + (1 − α)L_seg
wherein α balances the segmentation loss function L_seg of step (3-d) and the auxiliary loss function L_roi of step (4-b), and is a constant greater than 0 and less than 1.
CN202010172837.XA 2020-03-13 2020-03-13 Three-dimensional medical image automatic segmentation method based on deep learning Active CN111311592B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010172837.XA CN111311592B (en) 2020-03-13 2020-03-13 Three-dimensional medical image automatic segmentation method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010172837.XA CN111311592B (en) 2020-03-13 2020-03-13 Three-dimensional medical image automatic segmentation method based on deep learning

Publications (2)

Publication Number Publication Date
CN111311592A CN111311592A (en) 2020-06-19
CN111311592B (en) 2021-10-08

Family

ID=71155318

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010172837.XA Active CN111311592B (en) 2020-03-13 2020-03-13 Three-dimensional medical image automatic segmentation method based on deep learning

Country Status (1)

Country Link
CN (1) CN111311592B (en)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111915623B (en) * 2020-07-22 2022-06-21 山东大学 Image segmentation method and device using gating and adaptive attention
CN112348769A (en) * 2020-08-20 2021-02-09 盐城工学院 Intelligent kidney tumor segmentation method and device in CT (computed tomography) image based on U-Net depth network model
CN114170128B (en) * 2020-08-21 2023-05-30 张逸凌 Bone segmentation method and system based on deep learning
CN112150428B (en) * 2020-09-18 2022-12-02 青岛大学 Medical image segmentation method based on deep learning
CN116261743A (en) * 2020-09-27 2023-06-13 上海联影医疗科技股份有限公司 System and method for generating radiation treatment plans
CN112419348B (en) * 2020-11-18 2024-02-09 西安电子科技大学 Male pelvic cavity CT segmentation method based on multitask learning edge correction network
CN112529042B (en) * 2020-11-18 2024-04-05 南京航空航天大学 Medical image classification method based on dual-attention multi-example deep learning
CN112465779B (en) * 2020-11-26 2024-02-27 中国科学院苏州生物医学工程技术研究所 Full-automatic detection and segmentation method and system for choledocholithiasis focus in abdomen CT
CN112862089B (en) * 2021-01-20 2023-05-23 清华大学深圳国际研究生院 Medical image deep learning method with interpretability
CN112767407B (en) * 2021-02-02 2023-07-07 南京信息工程大学 CT image kidney tumor segmentation method based on cascade gating 3DUnet model
CN113160232B (en) * 2021-03-29 2022-01-28 吉林大学 Intracranial hemorrhage focus segmentation algorithm applied to CT image based on MU-Net
CN113192089B (en) * 2021-04-12 2022-07-19 温州医科大学附属眼视光医院 Bidirectional cross-connection convolutional neural network for image segmentation
CN113822845A (en) * 2021-05-31 2021-12-21 腾讯科技(深圳)有限公司 Method, apparatus, device and medium for hierarchical segmentation of tissue structure in medical image
CN113298826B (en) * 2021-06-09 2023-11-14 东北大学 Image segmentation method based on LA-Net network
CN113344950A (en) * 2021-07-28 2021-09-03 北京朗视仪器股份有限公司 CBCT image tooth segmentation method combining deep learning with point cloud semantics
CN113436211B (en) * 2021-08-03 2022-07-15 天津大学 Medical image active contour segmentation method based on deep learning
CN113808753B (en) * 2021-09-11 2023-09-26 中南大学 Method for predicting auxiliary radiotherapy and chemotherapy curative effect based on decomposition expression learning of multiple losses
CN113781465A (en) * 2021-09-18 2021-12-10 长春理工大学 Grad-CAM-based medical image segmentation model visualization method
CN113744288B (en) * 2021-11-04 2022-01-25 北京欧应信息技术有限公司 Method, apparatus, and medium for generating annotated sample images
CN114693689B (en) * 2022-03-02 2024-03-15 西北工业大学 Self-adaptive neural network segmentation model construction method for medical image
CN117437365B (en) * 2023-12-20 2024-04-12 中国科学院深圳先进技术研究院 Medical three-dimensional model generation method and device, electronic equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9965863B2 (en) * 2016-08-26 2018-05-08 Elekta, Inc. System and methods for image segmentation using convolutional neural network
US10762398B2 (en) * 2018-04-30 2020-09-01 Elekta Ab Modality-agnostic method for medical image representation
CN109685813B (en) * 2018-12-27 2020-10-13 江西理工大学 U-shaped retinal vessel segmentation method capable of adapting to scale information

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105139377A (en) * 2015-07-24 2015-12-09 中南大学 Rapid robustness auto-partitioning method for abdomen computed tomography (CT) sequence image of liver
WO2018106783A1 (en) * 2016-12-06 2018-06-14 Siemens Energy, Inc. Weakly supervised anomaly detection and segmentation in images
CN109903292A (en) * 2019-01-24 2019-06-18 西安交通大学 A kind of three-dimensional image segmentation method and system based on full convolutional neural networks
CN110136133A (en) * 2019-03-11 2019-08-16 嘉兴深拓科技有限公司 A kind of brain tumor dividing method based on convolutional neural networks
CN110348515A (en) * 2019-07-10 2019-10-18 腾讯科技(深圳)有限公司 Image classification method, image classification model training method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
nnU-Net: Self-adapting Framework for U-Net-Based Medical Image Segmentation; Fabian Isensee et al.; arXiv:1809.10486v1; 2018-09-27; pp. 1-11 *
Automatic liver tumor segmentation in CT sequences based on nonlinear enhancement and graph cut; Liao Miao et al.; Journal of Computer-Aided Design & Computer Graphics; 2019-06-15; Vol. 31, No. 6; pp. 1030-1038 *

Also Published As

Publication number Publication date
CN111311592A (en) 2020-06-19

Similar Documents

Publication Publication Date Title
CN111311592B (en) Three-dimensional medical image automatic segmentation method based on deep learning
CN109410219B (en) Image segmentation method and device based on pyramid fusion learning and computer readable storage medium
CN109584252B (en) Lung lobe segment segmentation method and device of CT image based on deep learning
CN111429460B (en) Image segmentation method, image segmentation model training method, device and storage medium
US20240078722A1 (en) System and method for forming a super-resolution biomarker map image
CN111798462B (en) Automatic delineation method of nasopharyngeal carcinoma radiotherapy target area based on CT image
CN109829918B (en) Liver image segmentation method based on dense feature pyramid network
WO2021203795A1 (en) Pancreas ct automatic segmentation method based on saliency dense connection expansion convolutional network
US10853409B2 (en) Systems and methods for image search
JP2023550844A (en) Liver CT automatic segmentation method based on deep shape learning
CN111489357A (en) Image segmentation method, device, equipment and storage medium
CN112446892A (en) Cell nucleus segmentation method based on attention learning
CN113706487A (en) Multi-organ segmentation method based on self-supervision characteristic small sample learning
Rehman et al. Liver lesion segmentation using deep learning models
CN115619797A (en) Lung image segmentation method of parallel U-Net network based on attention mechanism
CN111798424A (en) Medical image-based nodule detection method and device and electronic equipment
CN110992310A (en) Method and device for determining partition where mediastinal lymph node is located
CN116664590B (en) Automatic segmentation method and device based on dynamic contrast enhancement magnetic resonance image
Pollastri et al. Long-range 3d self-attention for mri prostate segmentation
Chen et al. Liver segmentation in CT images using a non-local fully convolutional neural network
Han et al. Three dimensional nuclei segmentation and classification of fluorescence microscopy images
Mansour et al. Kidney segmentations using cnn models
Chen et al. Pulmonary nodule segmentation in computed tomography with an encoder-decoder architecture
Li et al. An overview of abdominal multi-organ segmentation
Lu et al. A novel u-net based deep learning method for 3d cardiovascular MRI segmentation

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20220531

Address after: 410000 room 105, building 5, R & D headquarters, Central South University Science Park, changzuo Road, Yuelu street, Yuelu District, Changsha City, Hunan Province

Patentee after: Hunan Theo Technology Co.,Ltd.

Address before: 226, geoscience building, 932 Lushan South Road, Yuelu District, Changsha City, Hunan Province, 410083

Patentee before: CENTRAL SOUTH University

CP01 Change in the name or title of a patent holder
CP01 Change in the name or title of a patent holder

Address after: 410000 room 105, building 5, R & D headquarters, Central South University Science Park, changzuo Road, Yuelu street, Yuelu District, Changsha City, Hunan Province

Patentee after: Hunan Tiao Medical Technology Co.,Ltd.

Address before: 410000 room 105, building 5, R & D headquarters, Central South University Science Park, changzuo Road, Yuelu street, Yuelu District, Changsha City, Hunan Province

Patentee before: Hunan Theo Technology Co.,Ltd.