CN114842025A - CT image liver tumor region automatic segmentation method based on multi-branch network - Google Patents
- Publication number: CN114842025A (application CN202210389599.7A)
- Authority: CN (China)
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T7/11: Region-based segmentation
- G06N3/045: Combinations of networks
- G06T7/0012: Biomedical image inspection
- G06T7/73: Determining position or orientation of objects or cameras using feature-based methods
- G06V10/774: Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
- G06T2207/10081: Computed X-ray tomography [CT]
- G06T2207/20081: Training; learning
- G06T2207/20084: Artificial neural networks [ANN]
- G06T2207/30056: Liver; hepatic
- G06T2207/30096: Tumor; lesion
Abstract
The invention discloses an automatic segmentation method for liver tumor regions in CT images based on a multi-branch network, comprising the following steps: (1) establishing a training data set A containing original CT images, manual liver tumor segmentation results, and the corresponding direction information; (2) constructing a deep convolutional multi-branch network model fusing a self-attention mechanism and direction information; (3) constructing the network loss function; (4) training the network on data set A; (5) segmenting the test image with the trained network model to obtain the final liver tumor segmentation result. The invention is a fully automatic liver tumor segmentation method: by introducing a self-attention module and a direction correction module based on direction information into the convolutional network, it addresses the difficulties convolutional networks have in establishing long-distance target dependencies in liver tumor segmentation and in accurately identifying tumor boundaries, and effectively improves the segmentation precision of liver tumors.
Description
Technical Field
The invention relates to the technical field of medical image processing, and in particular to an automatic segmentation method for liver tumor regions in CT images based on a multi-branch network.
Background
Liver tumor segmentation is an indispensable means of extracting quantitative information (such as volume, shape and size) about liver lesion tissue from medical images, and plays an important role in liver tumor radiotherapy, surgical planning, efficacy evaluation, and so on. Currently, liver tumors are mainly segmented by physicians manually delineating them from the anatomical structures in the patient's CT image data. Since each patient has hundreds of CT slices, manual segmentation is time-consuming and labor-intensive, is influenced by the physician's experience and knowledge, and yields results that differ significantly between physicians. An automatic liver tumor segmentation method for CT images is therefore of great significance for improving the efficiency and precision of tumor delineation.
Existing automatic liver tumor segmentation methods fall mainly into traditional methods and deep-learning-based methods. Traditional automatic segmentation methods mainly include thresholding, region growing, clustering, and the like; they usually require manually specified regions of interest or seed points, so segmentation efficiency is low, and segmentation accuracy suffers from factors such as blurred boundaries in CT images and the similar gray levels of adjacent tissues. With the development of deep learning, many automatic liver tumor segmentation methods based on deep learning have been proposed in recent years, most of which use a deep convolutional network. Because the convolution kernel has a fixed size, the receptive field is small, feature relationships between distant pixels in the image are hard to extract, and segmentation accuracy is limited. In addition, because liver tumor boundaries are blurred, existing deep-learning-based methods readily under-segment or over-segment near the tumor boundary.
Disclosure of Invention
Aiming at the defects and shortcomings of the prior art, the invention builds a deep convolutional network that integrates a self-attention module and a direction correction module based on direction information. Its purpose is to provide an automatic segmentation method for liver tumor regions in CT images based on a multi-branch network, to solve the problems that convolutional networks struggle to establish long-distance target dependencies in liver tumor segmentation and identify tumor boundaries inaccurately, and to improve the precision and efficiency of computer-aided diagnosis of liver diseases.
The CT image liver tumor region automatic segmentation method based on the multi-branch network comprises the following steps:
(1) Obtain, from a public liver tumor segmentation database, an original training data set containing original CT images and their manual liver tumor segmentation results, and derive direction information pointing to the liver tumor boundary from the manual segmentation results, as follows:
(1-a) For each pixel i in the CT image, judge from the manual liver tumor segmentation result whether i belongs to the liver tumor region; if so, find the pixel j with the smallest Euclidean distance to i in the non-liver-tumor region, and otherwise find the pixel j with the smallest Euclidean distance to i in the liver tumor region;
(1-b) According to the relative positions of pixel i and pixel j, compute the direction D(i) from pixel i toward the liver tumor boundary:

D(i) = (D(i)_x, D(i)_y)

where D(i)_x and D(i)_y respectively denote the values of the direction information of pixel i in the row and column directions, i_x and i_y denote the row and column coordinates of pixel i, and j_x and j_y denote the row and column coordinates of pixel j;
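Steps (1-a) and (1-b) amount to a nearest-opposite-class-pixel search, which can be sketched with SciPy's Euclidean distance transform. The function name and the offset form D(i) = j − i are illustrative assumptions, not fixed by the text:

```python
import numpy as np
from scipy import ndimage

def direction_field(mask):
    """For every pixel i, return the offset (row, col) to the pixel j of the
    opposite class nearest in Euclidean distance, i.e. the direction that
    points at the liver tumor boundary. mask: binary liver tumor mask."""
    mask = mask.astype(bool)
    rows, cols = np.indices(mask.shape)
    # distance_transform_edt(mask) indexes, for each tumor pixel, the nearest
    # zero (non-tumor) pixel; distance_transform_edt(~mask) indexes, for each
    # non-tumor pixel, the nearest tumor pixel.
    _, (to_bg_r, to_bg_c) = ndimage.distance_transform_edt(mask, return_indices=True)
    _, (to_fg_r, to_fg_c) = ndimage.distance_transform_edt(~mask, return_indices=True)
    j_r = np.where(mask, to_bg_r, to_fg_r)
    j_c = np.where(mask, to_bg_c, to_fg_c)
    return np.stack([j_r - rows, j_c - cols])  # shape (2, H, W): D(i) = j - i
```

For ties (several equidistant boundary pixels) the distance transform picks one deterministically, which is sufficient for building the training target.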
(2) establishing a training data set A containing an original CT image, a liver tumor manual segmentation result and direction information thereof;
(3) Construct a deep convolutional multi-branch network combining a self-attention mechanism and direction information, called MlpBran-Net. The network comprises an initial segmentation module, a direction information extraction module and a direction correction module; the specific construction process is as follows:
(3-a) Construct an initial segmentation module comprising five convolution modules, four deconvolution modules, a self-attention module and an output layer. The output of the first convolution module serves simultaneously as the input of the second convolution module and the fourth deconvolution module; the output of the second convolution module serves simultaneously as the input of the third convolution module and the third deconvolution module; the output of the third convolution module serves simultaneously as the input of the fourth convolution module and the second deconvolution module; the output of the fourth convolution module serves simultaneously as the input of the fifth convolution module and the self-attention module; the outputs of the self-attention module and the fifth convolution module serve as the input of the first deconvolution module; in addition, the output of each preceding deconvolution module serves as the input of the next deconvolution module. The output layer included in the initial segmentation module consists of one 1 × 1 convolution layer and is recorded as the first output layer; the output of the fourth deconvolution module is the input of the first output layer, and the output of the first output layer is the initial liver tumor segmentation result;
(3-b) in the initial segmentation module of the step (3-a), the first to fourth convolution modules are respectively formed by sequentially connecting 2 convolution layers with the size of 3 × 3 and 1 maximum pooling layer with the size of 2 × 2, and the fifth convolution module is formed by sequentially connecting 2 convolution layers with the size of 3 × 3;
(3-c) in the initial segmentation module of the step (3-a), each deconvolution module is formed by sequentially connecting 1 deconvolution layer with the size of 3 × 3, 1 splicing layer, and 2 convolution layers with the size of 3 × 3, wherein: the input of the deconvolution layer in the first deconvolution module is the output of the fifth convolution module, and the input of the deconvolution layer in the next deconvolution module is the output of the previous deconvolution module; the input of the splicing layer in the first deconvolution module comprises the output of the deconvolution layer in the current deconvolution module and the output of the self-attention module, and the input of the splicing layer in the second to fourth deconvolution modules comprises the output of the deconvolution layer in the current deconvolution module and the output of the convolution module connected with the deconvolution module; the 2 convolutional layers in each deconvolution module are respectively marked as a first convolutional layer and a second convolutional layer, the input of the first convolutional layer is the output of a splicing layer in the current deconvolution module, and the input of the second convolutional layer is the output of the first convolutional layer in the current deconvolution module;
(3-d) in the initial segmentation module of the step (3-a), the self-attention module adopts a residual structure design including an image blocking layer, a linear mapping layer, n self-attention layers and an image resizing layer, which are sequentially connected, and a jump connection, wherein: the image blocking layer divides the feature map output by the fourth convolution module into a plurality of sub-blocks; after flattening the sub-blocks, the linear mapping layer linearly maps the sub-blocks to an M-dimensional sub-space; each self-attention layer consists of a plurality of attention layers and a plurality of layers of perceptrons; the image resizing layer converts the output of the last self-attention layer into a feature map with the same size as the output of the fourth convolution module; adding the feature map obtained by the image resizing layer and the feature map output by the fourth convolution module by introducing jump connection, and taking the addition result as one input of the first deconvolution module; n is preferably a constant of 4-8, and M is preferably a constant of 400-900;
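The pipeline of step (3-d) — blocking, flattening to tokens, self-attention, resizing back, and the residual jump connection — can be illustrated with a minimal NumPy sketch. The learned linear mapping to M dimensions, the multiple attention heads and the multi-layer perceptrons are omitted here (a single unparameterized attention head stands in for them), and all function names are illustrative:

```python
import numpy as np

def patchify(fmap, p):
    """Image blocking layer: split a (c, h, w) feature map into p x p
    sub-blocks and flatten each into a token of length c*p*p."""
    c, h, w = fmap.shape
    blocks = fmap.reshape(c, h // p, p, w // p, p).transpose(1, 3, 0, 2, 4)
    return blocks.reshape(-1, c * p * p)

def unpatchify(tokens, shape, p):
    """Image resizing layer: tokens back to the original (c, h, w) map."""
    c, h, w = shape
    blocks = tokens.reshape(h // p, w // p, c, p, p).transpose(2, 0, 3, 1, 4)
    return blocks.reshape(c, h, w)

def self_attention(x):
    """One unparameterized attention head: softmax(x x^T / sqrt(d)) x."""
    d = x.shape[-1]
    s = x @ x.T / np.sqrt(d)
    s = np.exp(s - s.max(axis=-1, keepdims=True))  # numerically stable softmax
    return (s / s.sum(axis=-1, keepdims=True)) @ x

def attention_module(fmap, p=2):
    """Residual structure: attention over sub-block tokens, resized back to
    a feature map and added to the input via the jump connection."""
    out = unpatchify(self_attention(patchify(fmap, p)), fmap.shape, p)
    return fmap + out
```

Because every token attends to every other token, distant sub-blocks of the feature map influence each other in one step, which is exactly the long-distance dependency a stack of small convolutions struggles to provide.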
(3-e) Construct a direction information extraction module, specifically as follows: first, add a deconvolution module after each of the first, second and third deconvolution modules in the initial segmentation module, recorded as the fifth, sixth and seventh deconvolution modules respectively, each newly added deconvolution module being formed by sequentially connecting one 3 × 3 deconvolution layer and two 3 × 3 convolution layers; then, add a splicing layer after the fifth, sixth and seventh deconvolution modules, which splices their outputs with the output of the fourth deconvolution module in the initial segmentation module; finally, input the spliced result into an output layer consisting of one 1 × 1 convolution layer, recorded as the second output layer, whose output is the direction information;
(3-f) Construct a direction correction module, specifically as follows: first, taking as input the direction information output by the direction information extraction module and the feature map output by the fourth deconvolution module in the initial segmentation module, construct a direction correction layer, which uses the direction information to iteratively update the feature value of each pixel in the feature map F output by the fourth deconvolution module, so that the feature values of pixels far from the liver tumor boundary gradually move toward the boundary, according to:

F_t(p_x, p_y) = F_{t-1}(p_x - D(p)_x, p_y - D(p)_y)

where p_x and p_y are the row and column coordinates of pixel p, F_t(p_x, p_y) is the feature value of pixel p after t iterative updates of the feature map F, D(p)_x and D(p)_y are the values of the direction information of pixel p in the row and column directions, and F_{t-1}(p_x - D(p)_x, p_y - D(p)_y) is the feature value at coordinate (p_x - D(p)_x, p_y - D(p)_y) after t-1 iterative updates of F. When the coordinate (p_x - D(p)_x, p_y - D(p)_y) is non-integer but satisfies 1 ≤ p_x - D(p)_x ≤ row and 1 ≤ p_y - D(p)_y ≤ col, where row and col are the numbers of rows and columns of the feature map F, the feature value F_{t-1}(p_x - D(p)_x, p_y - D(p)_y) is obtained by bilinear interpolation on the feature map F_{t-1}; when the coordinate falls outside the feature map, i.e. when p_x - D(p)_x < 1, p_x - D(p)_x > row, p_y - D(p)_y < 1 or p_y - D(p)_y > col, the feature value of pixel p is not updated, i.e. F_t(p_x, p_y) = F_{t-1}(p_x, p_y). Then, an output layer consisting of one 1 × 1 convolution layer is added after the direction correction layer and recorded as the third output layer, which outputs the final liver tumor segmentation result; t is preferably a constant between 4 and 8;
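The iterative update of step (3-f) can be sketched in NumPy, using 0-based coordinates instead of the patent's 1-based ones. The dense Python loops and function names are illustrative only; in practice the layer would run batched over GPU feature maps:

```python
import numpy as np

def bilinear(F, x, y):
    """Bilinear interpolation of a 2-D feature map F at real-valued (row, col)."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, F.shape[0] - 1), min(y0 + 1, F.shape[1] - 1)
    ax, ay = x - x0, y - y0
    return ((1 - ax) * (1 - ay) * F[x0, y0] + (1 - ax) * ay * F[x0, y1]
            + ax * (1 - ay) * F[x1, y0] + ax * ay * F[x1, y1])

def direction_correction(F, D, t=5):
    """t iterations of F_t(p) = F_{t-1}(p - D(p)); samples that fall outside
    the feature map leave the pixel's value unchanged.
    F: (H, W) feature map; D: (2, H, W) direction field."""
    rows, cols = F.shape
    for _ in range(t):
        G = F.copy()
        for px in range(rows):
            for py in range(cols):
                qx, qy = px - D[0, px, py], py - D[1, px, py]
                if 0 <= qx <= rows - 1 and 0 <= qy <= cols - 1:
                    G[px, py] = bilinear(F, qx, qy)  # pull the feature value
        F = G                                        # from toward-boundary side
    return F
```

Each iteration effectively advects feature values along the direction field, so after a few steps the response of pixels just outside the tumor is replaced by values sampled from nearer the boundary.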
(4) constructing a loss function of the MlpBran-Net network, which comprises the following specific processes:
(4-a) Construct the loss function L_1 of the initial segmentation module using the Dice coefficient:

L_1 = 1 - 2|P ∩ Y| / (|P| + |Y|)

where Y is the segmentation gold standard, in which liver tumor pixels are labeled "1" and non-liver-tumor pixels "0"; P is the output of the first output layer, in which pixels predicted as liver tumor are labeled "1" and pixels predicted as non-liver-tumor "0"; the symbol ∩ denotes the intersection of two regions, and | · | denotes the number of non-zero pixels in a given region;
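On binary masks the Dice loss of step (4-a) is a one-liner. This sketch uses hard 0/1 labels; the trainable version would substitute the network's soft probabilities to keep the loss differentiable:

```python
import numpy as np

def dice_loss(P, Y):
    """Dice loss 1 - 2|P ∩ Y| / (|P| + |Y|) on binary masks.
    Assumes at least one positive pixel in P or Y (no smoothing term)."""
    inter = np.logical_and(P == 1, Y == 1).sum()   # |P ∩ Y|
    return 1.0 - 2.0 * inter / (P.sum() + Y.sum())
```

A perfect prediction gives a loss of 0; completely disjoint masks give a loss of 1.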
(4-b) Construct the loss function L_2 of the direction information extraction module using the 2-norm and the vector included angle:

L_2 = (1/N) Σ_k λ(k) [ ||D_T(k) - D_G(k)||_2 + μ · cos⁻¹(D_G(k) · D_T(k)) ]

where D_T(k) is the direction information of the k-th pixel output by the second output layer, D_G(k) is the direction-information gold standard of the k-th pixel computed from the manual segmentation result, || · ||_2 denotes the 2-norm of a vector, D_G(k) · D_T(k) denotes the inner product of D_G(k) and D_T(k), cos⁻¹(D_G(k) · D_T(k)) is the included angle between D_G(k) and D_T(k), μ is a scale factor balancing the 2-norm term against the angle term, N is the number of pixels in the CT image, and λ(k) is the adaptive weight of pixel k, computed from G_k, the number of pixels in the CT image of the same class as pixel k: if k belongs to the liver tumor, G_k = N_T, where N_T is the number of liver tumor pixels in the CT image; otherwise G_k = N_B, where N_B is the number of non-liver-tumor pixels in the CT image. μ is preferably a constant between 0.8 and 1.2;
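A NumPy sketch of the loss in step (4-b), assuming unit direction vectors (so the inner product is the cosine of the included angle). The adaptive-weight form λ(k) = 1 − G_k/N, which up-weights the rarer class, and the placement of μ on the angle term are assumptions, since the source does not show these formulas explicitly:

```python
import numpy as np

def direction_loss(D_T, D_G, mask, mu=1.0):
    """Per-pixel 2-norm of the prediction error plus mu times the included
    angle between predicted and gold directions, adaptively weighted.
    D_T, D_G: (2, H, W) unit direction fields; mask: boolean tumor mask."""
    N = mask.size
    N_T = mask.sum()
    # lambda(k) = 1 - G_k / N: pixels of the rarer class get larger weight
    # (assumed form; the exact formula is not given in the text)
    lam = np.where(mask, 1.0 - N_T / N, 1.0 - (N - N_T) / N)
    err = np.linalg.norm(D_T - D_G, axis=0)               # ||D_T - D_G||_2
    cosang = np.clip((D_T * D_G).sum(axis=0), -1.0, 1.0)  # inner product
    angle = np.arccos(cosang)                             # included angle
    return float((lam * (err + mu * angle)).mean())
```

The angle term penalizes directional error even where the norm of the error vector is small, which matches the stated purpose of balancing the two criteria.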
(4-c) Construct the loss function L_3 of the direction correction module using the Dice coefficient:

L_3 = 1 - 2|T ∩ Y| / (|T| + |Y|)

where T is the output of the third output layer, in which pixels predicted as liver tumor are labeled "1" and pixels predicted as non-liver-tumor "0", and Y is the segmentation gold standard, in which liver tumor pixels are labeled "1" and non-liver-tumor pixels "0";
(4-d) Construct the loss function L of the MlpBran-Net network by combining the loss functions of the initial segmentation module, the direction information extraction module and the direction correction module:

L = L_1 + L_2 + L_3
(5) Train the MlpBran-Net network on training data set A until L converges;
(6) Segment the CT image to be tested with the trained network model to obtain the liver tumor segmentation result in the image.
Compared with the prior art, the main advantages and innovations of the invention are:
(I) Aiming at the problem that convolutional networks struggle to extract feature relationships between distant pixels in an image, a self-attention module is introduced into the convolutional network to establish long-distance dependencies between tumor and non-tumor tissues, which enhances the convolutional network's ability to extract global information and improves its segmentation accuracy;
(II) The invention introduces a direction correction module based on direction information, which uses the predicted direction information to guide the pixel feature values in the feature map obtained by the segmentation network to move toward the tumor boundary, so that the boundary in the feature map tends to coincide with the boundary of the target region, improving the segmentation network's ability to identify tumor boundaries.
Drawings
FIG. 1 is a schematic diagram of an MlpBran-Net network structure according to an embodiment of the present invention
FIG. 2 is a schematic diagram of a self-attention module according to an embodiment of the present invention
FIG. 3 is a schematic diagram of the results obtained by the direction correction module modifying a feature map according to an embodiment of the present invention, where FIG. 3(a) is the original feature map output by the fourth deconvolution module in the initial segmentation module, and FIGS. 3(b) to 3(f) are the results of the direction correction module iteratively updating the feature map 1 to 5 times according to the direction information, where the white dotted line represents the expert's manual segmentation result
FIG. 4 is a liver tumor segmentation result example according to an embodiment of the present invention
Detailed Description
The CT image liver tumor region automatic segmentation method based on the multi-branch network comprises the following specific implementation steps:
(1) Randomly select 100 original abdominal CT sequence images and the corresponding manual liver tumor segmentation results from the LiTS public database, and obtain direction information pointing to the liver tumor boundary from the manual segmentation results, as follows:
(1-a) For each pixel i in the CT image, judge from the manual liver tumor segmentation result whether i belongs to the liver tumor region; if so, find the pixel j with the smallest Euclidean distance to i in the non-liver-tumor region, and otherwise find the pixel j with the smallest Euclidean distance to i in the liver tumor region;
(1-b) According to the relative positions of pixel i and pixel j, compute the direction D(i) from pixel i toward the liver tumor boundary:

D(i) = (D(i)_x, D(i)_y)

where D(i)_x and D(i)_y respectively denote the values of the direction information of pixel i in the row and column directions, i_x and i_y denote the row and column coordinates of pixel i, and j_x and j_y denote the row and column coordinates of pixel j;
(2) establishing a training data set A containing an original CT image, a liver tumor manual segmentation result and direction information thereof;
(3) Construct a deep convolutional multi-branch network integrating a self-attention mechanism and direction information, called MlpBran-Net; FIG. 1 is a schematic diagram of the MlpBran-Net structure according to an embodiment of the present invention. The network comprises an initial segmentation module, a direction information extraction module and a direction correction module; the specific construction process is as follows:
(3-a) Construct an initial segmentation module comprising five convolution modules, four deconvolution modules, a self-attention module and an output layer. The output of the first convolution module serves simultaneously as the input of the second convolution module and the fourth deconvolution module; the output of the second convolution module serves simultaneously as the input of the third convolution module and the third deconvolution module; the output of the third convolution module serves simultaneously as the input of the fourth convolution module and the second deconvolution module; the output of the fourth convolution module serves simultaneously as the input of the fifth convolution module and the self-attention module; the outputs of the self-attention module and the fifth convolution module serve as the input of the first deconvolution module; in addition, the output of each preceding deconvolution module serves as the input of the next deconvolution module. The output layer included in the initial segmentation module consists of one 1 × 1 convolution layer and is recorded as the first output layer; the output of the fourth deconvolution module is the input of the first output layer, and the output of the first output layer is the initial liver tumor segmentation result;
(3-b) in the initial segmentation module of the step (3-a), the first to fourth convolution modules are respectively formed by sequentially connecting 2 convolution layers with the size of 3 × 3 and 1 maximum pooling layer with the size of 2 × 2, and the fifth convolution module is formed by sequentially connecting 2 convolution layers with the size of 3 × 3;
(3-c) in the initial segmentation module of the step (3-a), each deconvolution module is formed by sequentially connecting 1 deconvolution layer with the size of 3 × 3, 1 splicing layer, and 2 convolution layers with the size of 3 × 3, wherein: the input of the deconvolution layer in the first deconvolution module is the output of the fifth convolution module, and the input of the deconvolution layer in the next deconvolution module is the output of the previous deconvolution module; the input of the splicing layer in the first deconvolution module comprises the output of the deconvolution layer in the current deconvolution module and the output of the self-attention module, and the input of the splicing layer in the second to fourth deconvolution modules comprises the output of the deconvolution layer in the current deconvolution module and the output of the convolution module connected with the deconvolution module; the 2 convolutional layers in each deconvolution module are respectively marked as a first convolutional layer and a second convolutional layer, the input of the first convolutional layer is the output of a splicing layer in the current deconvolution module, and the input of the second convolutional layer is the output of the first convolutional layer in the current deconvolution module;
(3-d) in the initial segmentation module in the step (3-a), the self-attention module adopts a residual structure design, and comprises an image blocking layer, a linear mapping layer, n self-attention layers and an image resizing layer which are connected in sequence, and a jump connection, wherein the image blocking layer divides the feature map output by the fourth convolution module into a plurality of sub-blocks, and the specific structure is shown in fig. 2; after flattening the sub-blocks, the linear mapping layer linearly maps the sub-blocks to an M-dimensional sub-space; each self-attention layer consists of a plurality of attention layers and a plurality of layers of perceptrons; the image resizing layer converts the output of the last self-attention layer into a feature map with the same size as the output of the fourth convolution module; adding the feature map obtained by the image resizing layer and the feature map output by the fourth convolution module by introducing jump connection, and taking the addition result as one input of the first deconvolution module; in this embodiment, n is preferably 6, and M is preferably 768;
(3-e) constructing a direction information extraction module, which specifically comprises: firstly, adding a deconvolution module after the first, second and third deconvolution modules in the initial segmentation module, respectively recording as the fifth, sixth and seventh deconvolution modules, wherein each newly added deconvolution module is formed by sequentially connecting 1 deconvolution layer with the size of 3 × 3 and 2 convolution layers with the size of 3 × 3; then, a splicing layer is added after the fifth, sixth and seventh deconvolution modules, and the outputs of the fifth, sixth and seventh deconvolution modules are spliced with the output of the fourth deconvolution module in the initial segmentation module; finally, inputting the spliced result into an output layer formed by convolution layers with the size of 1 multiplied by 1, wherein the output layer is marked as a second output layer, and the output of the second output layer is direction information;
(3-f) constructing a direction correction module, which specifically comprises the following steps: firstly, the direction information output by the direction information extraction module and the feature map output by the fourth deconvolution module in the initial segmentation module are used as input to construct a direction correction layer, and the direction correction layer is characterized in that the direction correction layer can use the direction information to iteratively update the feature value of each pixel in the feature map F output by the fourth deconvolution module, so that the feature value of the pixel far away from the liver tumor boundary gradually moves towards the boundary, and the specific calculation formula is as follows:
F_t(p_x, p_y) = F_{t-1}(p_x - D(p)_x, p_y - D(p)_y)
wherein p_x and p_y are respectively the row and column coordinates of a pixel p, F_t(p_x, p_y) is the feature value of pixel p after t iterative updates of the feature map F, D(p)_x and D(p)_y are respectively the values of the direction information of pixel p in the row and column directions, and F_{t-1}(p_x - D(p)_x, p_y - D(p)_y) represents the feature value at coordinate (p_x - D(p)_x, p_y - D(p)_y) after the feature map F has been iteratively updated t-1 times; when the coordinate (p_x - D(p)_x, p_y - D(p)_y) is non-integer and satisfies 1 ≤ p_x - D(p)_x ≤ row and 1 ≤ p_y - D(p)_y ≤ col, where row and col represent the number of rows and columns, respectively, of the feature map F, the feature value F_{t-1}(p_x - D(p)_x, p_y - D(p)_y) is obtained by bilinear interpolation on the feature map F_{t-1}; when the coordinate (p_x - D(p)_x, p_y - D(p)_y) falls outside the feature map boundary, i.e. when p_x - D(p)_x < 1 or p_x - D(p)_x > row or p_y - D(p)_y < 1 or p_y - D(p)_y > col, the feature value of pixel p is not updated, i.e. F_t(p_x, p_y) = F_{t-1}(p_x, p_y); then, an output layer formed by a convolution layer with the size of 1 × 1 is added after the direction correction layer, recorded as a third output layer, whose output is the final liver tumor segmentation result; t = 5 is preferred in this embodiment;
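The iterative update above can be sketched in NumPy. This is a minimal sketch using 0-based array indexing (the patent's formula is written with 1-based coordinates); the direction field D is assumed to carry the row component in channel 0 and the column component in channel 1.

```python
import numpy as np

def bilinear(F, x, y):
    # Bilinear interpolation of feature map F at non-integer (row, col) = (x, y).
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, F.shape[0] - 1), min(y0 + 1, F.shape[1] - 1)
    dx, dy = x - x0, y - y0
    return (F[x0, y0] * (1 - dx) * (1 - dy) + F[x1, y0] * dx * (1 - dy)
            + F[x0, y1] * (1 - dx) * dy + F[x1, y1] * dx * dy)

def direction_correct(F, D, t=5):
    # Iteratively pull each pixel's feature value from the source location
    # (p_x - D(p)_x, p_y - D(p)_y); out-of-bounds sources leave p unchanged.
    rows, cols = F.shape
    for _ in range(t):
        F_prev = F.copy()
        F_new = F_prev.copy()
        for px in range(rows):
            for py in range(cols):
                sx = px - D[px, py, 0]   # source row coordinate
                sy = py - D[px, py, 1]   # source column coordinate
                if 0 <= sx <= rows - 1 and 0 <= sy <= cols - 1:
                    F_new[px, py] = bilinear(F_prev, sx, sy)
        F = F_new
    return F
```

In a real network this per-pixel loop would be vectorized (e.g. as a grid-sampling operation), but the sketch mirrors the formula term by term.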
fig. 3 is a schematic diagram of the result obtained by the direction correction module modifying the feature map in this embodiment, where fig. 3(a) is the original feature map output by the fourth deconvolution module in the initial segmentation module, figs. 3(b) to 3(f) are the results of the direction correction module iteratively updating the feature map 1 to 5 times according to the direction information, and the white dotted line represents the expert's manual segmentation result; it can be seen that, by iteratively updating the feature map, the direction correction module effectively brings the feature map boundary into agreement with the boundary of the target region;
(4) constructing a loss function of the MlpBran-Net network, which comprises the following specific processes:
(4-a) constructing a loss function L_1 of the initial segmentation module by adopting the Dice coefficient, as shown in the following formula:
L_1 = 1 - 2|Y ∩ P| / (|Y| + |P|)
wherein Y is the segmentation gold standard, in which liver tumor region pixels are labeled "1" and non-liver tumor region pixels are labeled "0"; P is the output result of the first output layer, in which pixels predicted as liver tumor are labeled "1" and pixels predicted as non-liver tumor are labeled "0"; the symbol ∩ represents the intersection, and |·| represents the number of non-zero pixels in the given region;
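A Dice loss on binary masks, as described above, can be sketched in a few lines of NumPy. This assumes the common 1 − Dice form of the loss; variable names follow the patent's symbols.

```python
import numpy as np

def dice_loss(Y, P):
    # Y: binary gold-standard mask; P: binary prediction from the first
    # output layer. Returns L_1 = 1 - 2|Y ∩ P| / (|Y| + |P|).
    inter = np.count_nonzero(np.logical_and(Y, P))   # |Y ∩ P|
    return 1.0 - 2.0 * inter / (np.count_nonzero(Y) + np.count_nonzero(P))
```

The same expression serves for L_3 with the third output layer's prediction T in place of P.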
(4-b) constructing a loss function L_2 of the direction information extraction module by adopting a 2-norm and a vector included angle, as shown in the following formula:
wherein D_T(k) and D_G(k) are respectively the direction information of the k-th pixel output by the second output layer and the direction information gold standard of the k-th pixel calculated from the manual segmentation result; ‖·‖_2 represents the 2-norm of a given vector; D_G(k)·D_T(k) represents the inner product between the vectors D_G(k) and D_T(k); cos⁻¹(D_G(k)·D_T(k)) represents the included angle between D_G(k) and D_T(k); μ is a scale factor used to balance the 2-norm and the vector included angle; N is the number of pixels in the CT image; and λ(k) is the adaptive weight of pixel k, whose calculation formula is as follows:
wherein G_k represents the number of pixels in the CT image belonging to the same class as pixel k: if k belongs to the liver tumor, G_k = N_T, where N_T is the number of pixels in the liver tumor region of the CT image; if k does not belong to the liver tumor, G_k = N_B, where N_B is the number of pixels in the non-liver tumor region of the CT image; μ = 1 is preferred in this embodiment;
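A sketch of such a direction loss follows. Note the hedges: the patent's exact formulas for L_2 and λ(k) are not reproduced in this text, so this sketch assumes a sum of the 2-norm term and the μ-weighted angle term, averaged with a hypothetical class-balancing weight λ(k) = N / (2·G_k); the angle term also assumes unit direction vectors, since cos⁻¹ of a raw inner product is only an angle in that case.

```python
import numpy as np

def direction_loss(D_T, D_G, tumor_mask, mu=1.0):
    # D_T, D_G: (H, W, 2) predicted / gold-standard unit direction fields.
    # tumor_mask: (H, W) boolean liver tumor mask, used for G_k.
    N = tumor_mask.size
    N_T = np.count_nonzero(tumor_mask)
    G = np.where(tumor_mask, N_T, N - N_T).astype(float)   # G_k per pixel
    lam = N / (2.0 * G)                  # hypothetical lambda(k), assumed form
    dist = np.linalg.norm(D_T - D_G, axis=-1)              # ||D_T(k) - D_G(k)||_2
    dot = np.clip(np.sum(D_T * D_G, axis=-1), -1.0, 1.0)   # inner product
    angle = np.arccos(dot)               # included angle, valid for unit vectors
    return float(np.mean(lam * (dist + mu * angle)))
```

The clipping before arccos guards against floating-point dot products slightly outside [-1, 1].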
(4-c) constructing a loss function L_3 of the direction correction module by adopting the Dice coefficient, as shown in the following formula:
L_3 = 1 - 2|Y ∩ T| / (|Y| + |T|)
wherein T is the output result of the third output layer, in which pixels predicted as liver tumor are labeled "1" and pixels predicted as non-liver tumor are labeled "0"; Y is the segmentation gold standard, in which liver tumor region pixels are labeled "1" and non-liver tumor region pixels are labeled "0";
(4-d) combining the loss functions of the initial segmentation module, the direction information extraction module and the direction correction module to construct a loss function L of the MlpBran-Net network:
L = L_1 + L_2 + L_3
(5) training the MlpBran-Net network by adopting a training data set A until L converges;
(6) segmenting the CT image to be tested with the trained network model to obtain the liver tumor segmentation result in the image.
The areas indicated by the white curves in figs. 4(a) to 4(d) are examples of the final liver tumor segmentation results obtained by this embodiment; it can be seen that the method of the present invention effectively segments liver tumors of different positions and shapes in the CT image, and also accurately segments tumors with blurred boundaries.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (2)
1. The CT image liver tumor region automatic segmentation method based on the multi-branch network is characterized by comprising the following steps of:
(1) the method comprises the following steps of obtaining an original training data set containing an original CT image and a liver tumor manual segmentation result thereof from a liver tumor segmentation public database, and obtaining direction information pointing to a liver tumor boundary according to the liver tumor manual segmentation result, wherein the specific process comprises the following steps:
(1-a) for each pixel i in the CT image, judging whether the pixel i belongs to the liver tumor region according to the liver tumor manual segmentation result; if so, acquiring the pixel j in the non-liver tumor region closest to pixel i in Euclidean distance; if not, acquiring the pixel j in the liver tumor region closest to pixel i in Euclidean distance;
(1-b) calculating the direction D (i) of the pixel i pointing to the boundary of the liver tumor by using the following formula according to the relative position relationship of the pixel i and the pixel j:
D(i) = (D(i)_x, D(i)_y)
wherein D(i)_x and D(i)_y respectively represent the values of the direction information of pixel i in the row and column directions, i_x and i_y respectively represent the row and column coordinates of pixel i, and j_x and j_y respectively represent the row and column coordinates of pixel j;
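The direction-information gold standard described in steps (1-a) and (1-b) can be sketched as follows. The component formula for D(i) is not reproduced in this text, so the sketch assumes D(i) is the row/column offset (j_x − i_x, j_y − i_y) from pixel i to its Euclidean-nearest opposite-class pixel j; brute force and 0-based indexing, for illustration only.

```python
import numpy as np

def direction_gold(mask):
    # mask: (H, W) boolean manual segmentation (True = liver tumor).
    # For each pixel i, find the Euclidean-nearest pixel j of the
    # opposite class and record the offset toward the tumor boundary.
    rows, cols = mask.shape
    xs, ys = np.indices(mask.shape)
    D = np.zeros((rows, cols, 2), dtype=float)
    for ix in range(rows):
        for iy in range(cols):
            opp = mask != mask[ix, iy]          # pixels of the opposite class
            if not opp.any():
                continue                        # single-class image: D(i) = 0
            d2 = np.where(opp, (xs - ix) ** 2 + (ys - iy) ** 2, np.inf)
            jx, jy = np.unravel_index(np.argmin(d2), mask.shape)
            D[ix, iy] = (jx - ix, jy - iy)
    return D
```

In practice the nearest-opposite-pixel lookup would use a distance transform rather than this O(N²) scan.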
(2) establishing a training data set A containing an original CT image, a liver tumor manual segmentation result and direction information thereof;
(3) a deep convolution multi-branch network combining a self-attention mechanism and direction information is constructed and called MlpBran-Net; the network comprises an initial segmentation module, a direction information extraction module and a direction correction module, and the specific construction process comprises the following steps:
(3-a) constructing an initial segmentation module, wherein the network module comprises five convolution modules, four deconvolution modules, a self-attention module and an output layer, and the output of the first convolution module is simultaneously used as the input of the second convolution module and the fourth deconvolution module; the output of the second convolution module is simultaneously used as the input of a third convolution module and a third deconvolution module; the output of the third convolution module is simultaneously used as the input of the fourth convolution module and the second deconvolution module; the output of the fourth convolution module is simultaneously used as the input of the fifth convolution module and the self-attention module; the output of the self-attention module and the output of the fifth convolution module are used as the input of the first deconvolution module; in addition, the output of the last deconvolution module is used as the input of the next deconvolution module; the output layer included in the initial partitioning module is composed of 1 convolution layer with the size of 1 × 1, and the output layer is recorded as a first output layer; the output of the fourth deconvolution module is used as the input of the first output layer, and the output of the first output layer is the initial segmentation result of the liver tumor;
(3-b) in the initial segmentation module of the step (3-a), the first to fourth convolution modules are respectively formed by sequentially connecting 2 convolution layers with the size of 3 × 3 and 1 maximum pooling layer with the size of 2 × 2, and the fifth convolution module is formed by sequentially connecting 2 convolution layers with the size of 3 × 3;
(3-c) in the initial segmentation module of the step (3-a), each deconvolution module is formed by sequentially connecting 1 deconvolution layer with the size of 3 × 3, 1 splicing layer, and 2 convolution layers with the size of 3 × 3, wherein: the input of the deconvolution layer in the first deconvolution module is the output of the fifth convolution module, and the input of the deconvolution layer in the next deconvolution module is the output of the previous deconvolution module; the input of the splicing layer in the first deconvolution module comprises the output of the deconvolution layer in the current deconvolution module and the output of the self-attention module, and the input of the splicing layer in the second to fourth deconvolution modules comprises the output of the deconvolution layer in the current deconvolution module and the output of the convolution module connected with the deconvolution module;
(3-d) in the initial segmentation module in the step (3-a), the self-attention module adopts a residual structure design and comprises an image partitioning layer, a linear mapping layer, n self-attention layers and an image resizing layer which are connected in sequence, and a jump connection, wherein the image partitioning layer partitions the feature map output by the fourth convolution module into a plurality of sub-blocks; after flattening the sub-blocks, the linear mapping layer linearly maps them to an M-dimensional sub-space; each self-attention layer consists of a multi-head attention layer and a multi-layer perceptron; the image resizing layer converts the output of the last self-attention layer into a feature map with the same size as the output of the fourth convolution module; the jump connection adds the feature map obtained by the image resizing layer to the feature map output by the fourth convolution module, and the addition result serves as one input of the first deconvolution module;
(3-e) constructing a direction information extraction module, which specifically comprises: firstly, adding a deconvolution module after the first, second and third deconvolution modules in the initial segmentation module, respectively recording as the fifth, sixth and seventh deconvolution modules, wherein each newly added deconvolution module is formed by sequentially connecting 1 deconvolution layer with the size of 3 × 3 and 2 convolution layers with the size of 3 × 3; then, a splicing layer is added after the fifth, sixth and seventh deconvolution modules, and the outputs of the fifth, sixth and seventh deconvolution modules are spliced with the output of the fourth deconvolution module in the initial segmentation module; finally, inputting the spliced result into an output layer formed by convolution layers with the size of 1 multiplied by 1, wherein the output layer is marked as a second output layer, and the output of the second output layer is direction information;
(3-f) constructing a direction correction module, which specifically comprises the following steps: firstly, the direction information output by the direction information extraction module and the feature map output by the fourth deconvolution module in the initial segmentation module are used as input to construct a direction correction layer, and the direction correction layer is characterized in that the direction correction layer can iteratively update the feature value of each pixel in the feature map F output by the fourth deconvolution module by using the direction information, so that the feature value of the pixel far away from the liver tumor boundary gradually moves towards the boundary, and the specific calculation formula is as follows:
F_t(p_x, p_y) = F_{t-1}(p_x - D(p)_x, p_y - D(p)_y)
wherein p_x and p_y are respectively the row and column coordinates of a pixel p, F_t(p_x, p_y) is the feature value of pixel p after t iterative updates of the feature map F, D(p)_x and D(p)_y are respectively the values of the direction information of pixel p in the row and column directions, and F_{t-1}(p_x - D(p)_x, p_y - D(p)_y) represents the feature value at coordinate (p_x - D(p)_x, p_y - D(p)_y) after the feature map F has been iteratively updated t-1 times; when the coordinate (p_x - D(p)_x, p_y - D(p)_y) is non-integer and satisfies 1 ≤ p_x - D(p)_x ≤ row and 1 ≤ p_y - D(p)_y ≤ col, where row and col represent the number of rows and columns, respectively, of the feature map F, the feature value F_{t-1}(p_x - D(p)_x, p_y - D(p)_y) is obtained by bilinear interpolation on the feature map F_{t-1}; when the coordinate (p_x - D(p)_x, p_y - D(p)_y) falls outside the feature map boundary, i.e. when p_x - D(p)_x < 1 or p_x - D(p)_x > row or p_y - D(p)_y < 1 or p_y - D(p)_y > col, the feature value of pixel p is not updated, i.e. F_t(p_x, p_y) = F_{t-1}(p_x, p_y); then, an output layer formed by a convolution layer with the size of 1 × 1 is added after the direction correction layer, recorded as a third output layer, whose output is the final liver tumor segmentation result;
(4) constructing a loss function of the MlpBran-Net network, which comprises the following specific processes:
(4-a) constructing a loss function L_1 of the initial segmentation module by adopting the Dice coefficient, as shown in the following formula:
L_1 = 1 - 2|Y ∩ P| / (|Y| + |P|)
wherein Y is the segmentation gold standard, in which liver tumor region pixels are labeled "1" and non-liver tumor region pixels are labeled "0"; P is the output result of the first output layer, in which pixels predicted as liver tumor are labeled "1" and pixels predicted as non-liver tumor are labeled "0"; the symbol ∩ represents the intersection, and |·| represents the number of non-zero pixels in the given region;
(4-b) constructing a loss function L_2 of the direction information extraction module by adopting a 2-norm and a vector included angle, as shown in the following formula:
wherein D_T(k) and D_G(k) are respectively the direction information of the k-th pixel output by the second output layer and the direction information gold standard of the k-th pixel calculated from the manual segmentation result; ‖·‖_2 represents the 2-norm of a given vector; D_G(k)·D_T(k) represents the inner product between the vectors D_G(k) and D_T(k); cos⁻¹(D_G(k)·D_T(k)) represents the included angle between D_G(k) and D_T(k); μ is a scale factor used to balance the 2-norm and the vector included angle; N is the number of pixels in the CT image; and λ(k) is the adaptive weight of pixel k, whose calculation formula is as follows:
wherein G_k represents the number of pixels in the CT image belonging to the same class as pixel k: if k belongs to the liver tumor, G_k = N_T, where N_T is the number of pixels in the liver tumor region of the CT image; if k does not belong to the liver tumor, G_k = N_B, where N_B is the number of pixels in the non-liver tumor region of the CT image;
(4-c) constructing a loss function L_3 of the direction correction module by adopting the Dice coefficient, as shown in the following formula:
L_3 = 1 - 2|Y ∩ T| / (|Y| + |T|)
wherein T is the output result of the third output layer, in which pixels predicted as liver tumor are labeled "1" and pixels predicted as non-liver tumor are labeled "0"; Y is the segmentation gold standard, in which liver tumor region pixels are labeled "1" and non-liver tumor region pixels are labeled "0";
(4-d) combining the loss functions of the initial segmentation module, the direction information extraction module and the direction correction module to construct a loss function L of the MlpBran-Net network:
L = L_1 + L_2 + L_3
(5) training the MlpBran-Net network by adopting a training data set A until L converges;
(6) segmenting the CT image to be tested with the trained network model to obtain the liver tumor segmentation result in the image.
2. The method for automatically segmenting the liver tumor region of a CT image based on a multi-branch network as claimed in claim 1, characterized in that: n is preferably a constant from 4 to 8, M is preferably a constant from 400 to 900, μ is preferably a constant from 0.8 to 1.2, and t is preferably a constant from 4 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210389599.7A CN114842025B (en) | 2022-04-14 | 2022-04-14 | CT image liver tumor region automatic segmentation method based on multi-branch network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210389599.7A CN114842025B (en) | 2022-04-14 | 2022-04-14 | CT image liver tumor region automatic segmentation method based on multi-branch network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114842025A true CN114842025A (en) | 2022-08-02 |
CN114842025B CN114842025B (en) | 2024-04-05 |
Family
ID=82563828
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210389599.7A Active CN114842025B (en) | 2022-04-14 | 2022-04-14 | CT image liver tumor region automatic segmentation method based on multi-branch network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114842025B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210089807A1 (en) * | 2019-09-25 | 2021-03-25 | Samsung Electronics Co., Ltd. | System and method for boundary aware semantic segmentation |
CN112734762A (en) * | 2020-12-31 | 2021-04-30 | 西华师范大学 | Dual-path UNet network tumor segmentation method based on covariance self-attention mechanism |
CN113129310A (en) * | 2021-03-04 | 2021-07-16 | 同济大学 | Medical image segmentation system based on attention routing |
CN114240962A (en) * | 2021-11-23 | 2022-03-25 | 湖南科技大学 | CT image liver tumor region automatic segmentation method based on deep learning |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210089807A1 (en) * | 2019-09-25 | 2021-03-25 | Samsung Electronics Co., Ltd. | System and method for boundary aware semantic segmentation |
CN112734762A (en) * | 2020-12-31 | 2021-04-30 | 西华师范大学 | Dual-path UNet network tumor segmentation method based on covariance self-attention mechanism |
CN113129310A (en) * | 2021-03-04 | 2021-07-16 | 同济大学 | Medical image segmentation system based on attention routing |
CN114240962A (en) * | 2021-11-23 | 2022-03-25 | 湖南科技大学 | CT image liver tumor region automatic segmentation method based on deep learning |
Non-Patent Citations (2)
Title |
---|
ASHISH SINHA ET AL: "Multi-Scale Self-Guided Attention for Medical Image Segmentation", IEEE, 14 April 2020 (2020-04-14) *
HE LAN; WU QIAN: "Automatic liver segmentation method based on 3D convolutional neural network", Chinese Journal of Medical Physics, no. 06, 25 June 2018 (2018-06-25) *
Also Published As
Publication number | Publication date |
---|---|
CN114842025B (en) | 2024-04-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111798462B (en) | Automatic delineation method of nasopharyngeal carcinoma radiotherapy target area based on CT image | |
CN113674253B (en) | Automatic segmentation method for rectal cancer CT image based on U-transducer | |
CN113012172B (en) | AS-UNet-based medical image segmentation method and system | |
CN114240962B (en) | CT image liver tumor region automatic segmentation method based on deep learning | |
CN110363802B (en) | Prostate image registration system and method based on automatic segmentation and pelvis alignment | |
CN109166133A (en) | Soft tissue organs image partition method based on critical point detection and deep learning | |
CN111127482A (en) | CT image lung trachea segmentation method and system based on deep learning | |
CN114066866B (en) | Medical image automatic segmentation method based on deep learning | |
CN107680107B (en) | Automatic segmentation method of diffusion tensor magnetic resonance image based on multiple maps | |
CN113706486B (en) | Pancreatic tumor image segmentation method based on dense connection network migration learning | |
CN107680110B (en) | Inner ear three-dimensional level set segmentation method based on statistical shape model | |
CN110648331B (en) | Detection method for medical image segmentation, medical image segmentation method and device | |
CN112529909A (en) | Tumor image brain region segmentation method and system based on image completion | |
CN110008992B (en) | Deep learning method for prostate cancer auxiliary diagnosis | |
CN112750137B (en) | Liver tumor segmentation method and system based on deep learning | |
CN111127487B (en) | Real-time multi-tissue medical image segmentation method | |
CN113436173A (en) | Abdomen multi-organ segmentation modeling and segmentation method and system based on edge perception | |
CN112950611A (en) | Liver blood vessel segmentation method based on CT image | |
CN116258933A (en) | Medical image segmentation device based on global information perception | |
CN109919216B (en) | Counterlearning method for computer-aided diagnosis of prostate cancer | |
Zuo et al. | A method of crop seedling plant segmentation on edge information fusion model | |
CN112489062B (en) | Medical image segmentation method and system based on boundary and neighborhood guidance | |
CN117911432A (en) | Image segmentation method, device and storage medium | |
CN112750131A (en) | Pelvis nuclear magnetic resonance image musculoskeletal segmentation method based on scale and sequence relation | |
CN113344940A (en) | Liver blood vessel image segmentation method based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||