CN114842025A - CT image liver tumor region automatic segmentation method based on multi-branch network - Google Patents

CT image liver tumor region automatic segmentation method based on multi-branch network

Info

Publication number
CN114842025A
CN114842025A
Authority
CN
China
Prior art keywords
module
output
layer
deconvolution
liver tumor
Prior art date
Legal status
Granted
Application number
CN202210389599.7A
Other languages
Chinese (zh)
Other versions
CN114842025B (en)
Inventor
邸拴虎
赵于前
廖苗
Current Assignee
Central South University
Original Assignee
Central South University
Priority date
Filing date
Publication date
Application filed by Central South University filed Critical Central South University
Priority to CN202210389599.7A
Publication of CN114842025A
Application granted
Publication of CN114842025B
Legal status: Active

Classifications

    • G06T 7/11 — Region-based segmentation
    • G06N 3/045 — Combinations of networks
    • G06T 7/0012 — Biomedical image inspection
    • G06T 7/73 — Determining position or orientation of objects or cameras using feature-based methods
    • G06V 10/774 — Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06T 2207/10081 — Computed x-ray tomography [CT]
    • G06T 2207/20081 — Training; learning
    • G06T 2207/20084 — Artificial neural networks [ANN]
    • G06T 2207/30056 — Liver; hepatic
    • G06T 2207/30096 — Tumor; lesion

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Apparatus For Radiation Diagnosis (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an automatic segmentation method for liver tumor regions in CT images based on a multi-branch network, comprising the following steps: (1) establishing a training data set A containing original CT images, manual liver tumor segmentation results and the corresponding direction information; (2) constructing a deep convolutional multi-branch network model that fuses a self-attention mechanism with direction information; (3) constructing the network loss function; (4) training the network with training data set A; (5) segmenting the test image with the trained network model to obtain the final liver tumor segmentation result. The invention is a fully automatic liver tumor segmentation method. By introducing a self-attention module and a direction correction module based on direction information into the convolutional network, it addresses the difficulty convolutional networks have in establishing long-range target dependencies during liver tumor segmentation and the inaccurate identification of tumor boundaries, and effectively improves liver tumor segmentation accuracy.

Description

CT image liver tumor region automatic segmentation method based on multi-branch network
Technical Field
The invention relates to the technical field of medical image processing, in particular to a CT image liver tumor region automatic segmentation method based on a multi-branch network.
Background
Liver tumor segmentation is an indispensable means of extracting quantitative information (such as volume, shape and size) about liver lesion tissue from medical images, and plays an important role in liver tumor radiotherapy, surgical planning, efficacy evaluation and the like. At present, liver tumor segmentation is mainly performed by physicians manually delineating the tumor from the anatomical structures in a patient's CT image data. Since each patient has hundreds of CT slices, manual segmentation is time-consuming and labor-intensive, is influenced by the physician's experience and knowledge, and produces results that differ significantly between physicians. Research on automatic liver tumor segmentation methods for CT images is therefore of great significance for improving the efficiency and accuracy of tumor delineation.
Existing automatic liver tumor segmentation methods fall mainly into traditional methods and deep-learning-based methods. Traditional automatic segmentation methods mainly include thresholding, region growing, clustering and the like; they usually require manually specified regions of interest or seed points, their segmentation efficiency is low, and their accuracy suffers from factors such as blurred boundaries and similar gray levels of adjacent tissues in CT images. With the development of deep learning, many automatic liver tumor segmentation methods based on deep learning have been proposed in recent years, most of which use deep convolutional networks. Because the convolution kernel in a convolutional network has a fixed size and a small receptive field, it is difficult to extract feature relationships between distant pixels in the image, and the segmentation accuracy is limited. In addition, because liver tumor boundaries are blurred, existing deep-learning-based methods are prone to under-segmentation or over-segmentation around the tumor boundary.
Disclosure of Invention
Aiming at the defects and shortcomings of the prior art, the invention integrates a self-attention module and a direction correction module based on direction information into a deep convolutional network, and aims to provide an automatic segmentation method for liver tumor regions in CT images based on a multi-branch network, to solve the difficulty convolutional networks have in establishing long-range target dependencies in liver tumor segmentation and the inaccurate identification of tumor boundaries, and to improve the accuracy and efficiency of computer-aided diagnosis of liver diseases.
The CT image liver tumor region automatic segmentation method based on the multi-branch network comprises the following steps:
(1) obtaining an original training data set containing original CT images and their manual liver tumor segmentation results from a public liver tumor segmentation database, and deriving direction information pointing to the liver tumor boundary from the manual segmentation results, the specific process being as follows:
(1-a) for each pixel i in the CT image, judging whether pixel i belongs to the liver tumor region according to the manual liver tumor segmentation result; if it does, acquiring the pixel j in the non-liver-tumor region that is closest to pixel i in Euclidean distance, and if it does not, acquiring the pixel j in the liver tumor region that is closest to pixel i in Euclidean distance;
(1-b) calculating the direction D(i) in which pixel i points to the liver tumor boundary from the relative positions of pixel i and pixel j, using the following formulas:
D(i) = (D(i)_x, D(i)_y)

[the formulas defining D(i)_x and D(i)_y appear as images in the original document]
where D(i)_x and D(i)_y respectively denote the components of the direction information of pixel i in the row and column directions, i_x and i_y denote the row and column coordinates of pixel i, and j_x and j_y denote the row and column coordinates of pixel j;
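As an illustration of step (1), the following sketch computes such a direction field with NumPy/SciPy, under the assumption that D(i) is the unit vector from pixel i toward its nearest opposite-class pixel j; the function name and the normalization are illustrative assumptions, since the component formulas appear only as images in the original document.

import numpy as np
from scipy import ndimage

def direction_field(tumor_mask):
    # Sketch of the direction field D from step (1). Assumption: D(i) is the
    # unit vector pointing from pixel i to its nearest pixel j of the
    # opposite class; the exact component formulas are not reproduced here.
    mask = tumor_mask.astype(bool)               # 1 = liver tumor pixel
    rows, cols = np.indices(mask.shape)
    D = np.zeros(mask.shape + (2,), dtype=np.float32)
    for sel in (mask, ~mask):
        # distance_transform_edt returns, for every non-zero pixel of `sel`,
        # the index of the nearest zero pixel, i.e. the nearest pixel j of
        # the opposite class.
        _, idx = ndimage.distance_transform_edt(sel, return_indices=True)
        dr, dc = idx[0] - rows, idx[1] - cols    # offset from i to j
        norm = np.sqrt(dr ** 2 + dc ** 2)
        norm[norm == 0] = 1.0                    # avoid division by zero
        D[..., 0][sel] = (dr / norm)[sel]        # row component D(i)_x
        D[..., 1][sel] = (dc / norm)[sel]        # column component D(i)_y
    return D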
(2) establishing a training data set A containing an original CT image, a liver tumor manual segmentation result and direction information thereof;
(3) constructing a deep convolutional multi-branch network that fuses a self-attention mechanism with direction information, called MlpBran-Net; the network comprises an initial segmentation module, a direction information extraction module and a direction correction module, and the specific construction process is as follows:
(3-a) constructing an initial segmentation module, wherein this network module comprises five convolution modules, four deconvolution modules, a self-attention module and an output layer; the output of the first convolution module serves simultaneously as the input of the second convolution module and of the fourth deconvolution module; the output of the second convolution module serves simultaneously as the input of the third convolution module and of the third deconvolution module; the output of the third convolution module serves simultaneously as the input of the fourth convolution module and of the second deconvolution module; the output of the fourth convolution module serves simultaneously as the input of the fifth convolution module and of the self-attention module; the output of the self-attention module and the output of the fifth convolution module serve as the input of the first deconvolution module; in addition, the output of each deconvolution module serves as the input of the next deconvolution module; the output layer included in the initial segmentation module consists of one convolution layer of size 1 × 1 and is denoted the first output layer; the output of the fourth deconvolution module serves as the input of the first output layer, and the output of the first output layer is the initial liver tumor segmentation result;
(3-b) in the initial segmentation module of the step (3-a), the first to fourth convolution modules are respectively formed by sequentially connecting 2 convolution layers with the size of 3 × 3 and 1 maximum pooling layer with the size of 2 × 2, and the fifth convolution module is formed by sequentially connecting 2 convolution layers with the size of 3 × 3;
(3-c) in the initial segmentation module of the step (3-a), each deconvolution module is formed by sequentially connecting 1 deconvolution layer with the size of 3 × 3, 1 splicing layer, and 2 convolution layers with the size of 3 × 3, wherein: the input of the deconvolution layer in the first deconvolution module is the output of the fifth convolution module, and the input of the deconvolution layer in the next deconvolution module is the output of the previous deconvolution module; the input of the splicing layer in the first deconvolution module comprises the output of the deconvolution layer in the current deconvolution module and the output of the self-attention module, and the input of the splicing layer in the second to fourth deconvolution modules comprises the output of the deconvolution layer in the current deconvolution module and the output of the convolution module connected with the deconvolution module; the 2 convolutional layers in each deconvolution module are respectively marked as a first convolutional layer and a second convolutional layer, the input of the first convolutional layer is the output of a splicing layer in the current deconvolution module, and the input of the second convolutional layer is the output of the first convolutional layer in the current deconvolution module;
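For illustration, one decoder block of the form described in (3-c) (a 3 × 3 deconvolution layer, a splicing layer, then two 3 × 3 convolution layers) could be sketched in PyTorch as follows; the class name, channel arguments and ReLU activations are assumptions, not details given in the text.

import torch
import torch.nn as nn

class DeconvModule(nn.Module):
    # One decoder block: 3x3 transposed conv -> concat -> two 3x3 convs.
    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        # 3x3 transposed convolution with stride 2 doubles the spatial size.
        self.up = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=3,
                                     stride=2, padding=1, output_padding=1)
        self.conv1 = nn.Conv2d(out_ch + skip_ch, out_ch, 3, padding=1)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x, skip):
        x = self.up(x)                       # deconvolution layer
        x = torch.cat([x, skip], dim=1)      # splicing (concatenation) layer
        x = self.act(self.conv1(x))          # first convolution layer
        x = self.act(self.conv2(x))          # second convolution layer
        return x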
(3-d) in the initial segmentation module of step (3-a), the self-attention module adopts a residual structure comprising an image blocking layer, a linear mapping layer, n self-attention layers and an image resizing layer connected in sequence, plus a skip connection, wherein: the image blocking layer divides the feature map output by the fourth convolution module into a number of sub-blocks; the linear mapping layer flattens the sub-blocks and linearly maps them to an M-dimensional subspace; each self-attention layer consists of a multi-head attention layer and a multilayer perceptron; the image resizing layer converts the output of the last self-attention layer into a feature map of the same size as the output of the fourth convolution module; through the skip connection, the feature map obtained by the image resizing layer is added to the feature map output by the fourth convolution module, and the sum serves as one input of the first deconvolution module; n is preferably a constant of 4-8, and M is preferably a constant of 400-900;
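A compact PyTorch sketch of a self-attention module of the kind described in (3-d) is given below, assuming the standard Transformer-encoder form (multi-head attention plus a multilayer perceptron in each self-attention layer); the patch size, head count and the use of nn.TransformerEncoder are illustrative assumptions, with n = 6 and M = 768 chosen from the preferred values mentioned in the embodiment.

import torch
import torch.nn as nn

class SelfAttentionModule(nn.Module):
    # Residual ViT-style bottleneck: image blocking -> linear mapping ->
    # n self-attention layers -> image resizing -> add the input (skip).
    def __init__(self, channels, patch=2, dim=768, n_layers=6, heads=8):
        super().__init__()
        self.patch = patch
        patch_dim = channels * patch * patch
        self.to_tokens = nn.Linear(patch_dim, dim)   # linear mapping layer
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           dim_feedforward=4 * dim,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.to_map = nn.Linear(dim, patch_dim)      # back to patch pixels

    def forward(self, x):
        b, c, h, w = x.shape                         # h, w divisible by patch
        p = self.patch
        # Image blocking layer: split the map into p x p sub-blocks and
        # flatten each sub-block into a token.
        tokens = x.unfold(2, p, p).unfold(3, p, p)           # b,c,h/p,w/p,p,p
        tokens = tokens.permute(0, 2, 3, 1, 4, 5).reshape(b, -1, c * p * p)
        tokens = self.to_tokens(tokens)
        tokens = self.encoder(tokens)                        # n self-attention layers
        # Image resizing layer: fold the tokens back into a (b, c, h, w) map.
        blocks = self.to_map(tokens).reshape(b, h // p, w // p, c, p, p)
        out = blocks.permute(0, 3, 1, 4, 2, 5).reshape(b, c, h, w)
        return out + x                                       # skip (residual) connection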
(3-e) constructing a direction information extraction module, specifically: first, adding a deconvolution module after each of the first, second and third deconvolution modules in the initial segmentation module, denoted the fifth, sixth and seventh deconvolution modules respectively, each newly added deconvolution module being formed by sequentially connecting 1 deconvolution layer of size 3 × 3 and 2 convolution layers of size 3 × 3; then, adding a splicing layer after the fifth, sixth and seventh deconvolution modules, which splices the outputs of the fifth, sixth and seventh deconvolution modules with the output of the fourth deconvolution module in the initial segmentation module; finally, inputting the spliced result into an output layer formed by a convolution layer of size 1 × 1, denoted the second output layer, whose output is the direction information;
(3-f) constructing a direction correction module, which specifically comprises the following steps: firstly, the direction information output by the direction information extraction module and the feature map output by the fourth deconvolution module in the initial segmentation module are used as input to construct a direction correction layer, and the direction correction layer is characterized in that the direction correction layer can use the direction information to iteratively update the feature value of each pixel in the feature map F output by the fourth deconvolution module, so that the feature value of the pixel far away from the liver tumor boundary gradually moves towards the boundary, and the specific calculation formula is as follows:
F_t(p_x, p_y) = F_{t-1}(p_x - D(p)_x, p_y - D(p)_y)
where p_x and p_y are the row and column coordinates of pixel p, F_t(p_x, p_y) is the feature value of pixel p after the feature map F has been iteratively updated t times, D(p)_x and D(p)_y are the components of the direction information of pixel p in the row and column directions, and F_{t-1}(p_x - D(p)_x, p_y - D(p)_y) is the feature value at coordinate (p_x - D(p)_x, p_y - D(p)_y) after the feature map F has been iteratively updated t-1 times; when the coordinate (p_x - D(p)_x, p_y - D(p)_y) is non-integer and satisfies 1 ≤ p_x - D(p)_x ≤ row and 1 ≤ p_y - D(p)_y ≤ col, where row and col denote the numbers of rows and columns of the feature map F, the feature value F_{t-1}(p_x - D(p)_x, p_y - D(p)_y) is obtained by bilinear interpolation on the feature map F_{t-1}; when the coordinate (p_x - D(p)_x, p_y - D(p)_y) falls outside the feature map boundary, i.e. when p_x - D(p)_x < 1, p_x - D(p)_x > row, p_y - D(p)_y < 1, or p_y - D(p)_y > col, the feature value of pixel p is not updated, i.e. F_t(p_x, p_y) = F_{t-1}(p_x, p_y); then, an output layer consisting of a convolution layer of size 1 × 1 is added after the direction correction layer, denoted the third output layer, whose output is the final liver tumor segmentation result; t is preferably a constant of 4-8;
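The update rule above amounts to repeatedly resampling the feature map one step against the direction field. A minimal PyTorch sketch of such a direction correction layer is shown below; the channel layout of the direction field and the use of grid_sample for the bilinear interpolation are implementation assumptions.

import torch
import torch.nn.functional as F

def direction_correction(feat, direction, steps=5):
    # feat:      (B, C, H, W) feature map F from the fourth deconvolution module.
    # direction: (B, 2, H, W) predicted direction field; channel 0 = row offset
    #            D(p)_x, channel 1 = column offset D(p)_y (assumed layout).
    # Each iteration samples F_{t-1} at (p_x - D(p)_x, p_y - D(p)_y) with
    # bilinear interpolation; out-of-bounds pixels keep their previous value.
    B, C, H, W = feat.shape
    rows = torch.arange(H, device=feat.device).view(1, H, 1).expand(B, H, W)
    cols = torch.arange(W, device=feat.device).view(1, 1, W).expand(B, H, W)
    src_r = rows - direction[:, 0]               # p_x - D(p)_x
    src_c = cols - direction[:, 1]               # p_y - D(p)_y
    inside = (src_r >= 0) & (src_r <= H - 1) & (src_c >= 0) & (src_c <= W - 1)
    inside = inside.unsqueeze(1).to(feat.dtype)
    # grid_sample expects normalized (x, y) coordinates in [-1, 1].
    grid = torch.stack([src_c / (W - 1) * 2 - 1,
                        src_r / (H - 1) * 2 - 1], dim=-1)    # (B, H, W, 2)
    out = feat
    for _ in range(steps):
        warped = F.grid_sample(out, grid, mode='bilinear', align_corners=True)
        out = inside * warped + (1 - inside) * out           # no update outside
    return out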
(4) constructing a loss function of the MlpBran-Net network, which comprises the following specific processes:
(4-a) constructing the loss function L_1 of the initial segmentation module using the Dice coefficient, as shown in the following formula:
L_1 = 1 - 2|Y ∩ P| / (|Y| + |P|)
where Y is the segmentation gold standard, in which liver tumor region pixels are labeled "1" and non-liver-tumor region pixels are labeled "0"; P is the output of the first output layer, in which pixels predicted as liver tumor are labeled "1" and pixels predicted as non-liver-tumor are labeled "0"; the symbol ∩ denotes the intersection, and |·| denotes the number of non-zero pixels in a given region;
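For reference, a Dice loss of this form is commonly implemented as in the short PyTorch sketch below, here on soft probability maps with a small smoothing constant; both choices are implementation assumptions rather than details given in the text.

import torch

def dice_loss(pred, target, eps=1e-6):
    # Soft Dice loss: 1 - 2|Y ∩ P| / (|Y| + |P|).
    # pred:   predicted tumor probabilities in [0, 1].
    # target: gold-standard mask with tumor pixels = 1, others = 0.
    pred, target = pred.reshape(-1), target.reshape(-1)
    intersection = (pred * target).sum()
    return 1 - (2 * intersection + eps) / (pred.sum() + target.sum() + eps)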
(4-b) constructing the loss function L_2 of the direction information extraction module using the 2-norm and the vector angle, as shown in the following formula:
[the formula defining L_2 appears as an image in the original document]
wherein D is T (k) And D G (k) The direction information of the kth pixel output by the second output layer and the direction information golden standard of the kth pixel calculated according to the manual segmentation result respectively, | · | | luminous flux 2 Representing the 2-norm of a given vector, D G (k)·D T (k) Expression vector D G (k) And D T (k) Inner product of between, cos -1 (D G (k)·D T (k) Represents a calculation vector D G (k) And D T (k) The included angle between the two pixels is mu is a scale factor used for balancing 2-norm and vector included angle, N is the number of pixels in the CT image, and lambda (k) is the self-adaptive weight of the pixel k, and the calculation formula is as follows:
[the formula defining λ(k) appears as an image in the original document]
where G_k denotes the number of pixels in the CT image belonging to the same class as pixel k: if k belongs to the liver tumor, G_k = N_T, where N_T is the number of liver tumor region pixels in the CT image; if k does not belong to the liver tumor, G_k = N_B, where N_B is the number of non-liver-tumor region pixels in the CT image; μ is preferably a constant of 0.8-1.2;
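Because the exact expressions for L_2 and λ(k) appear only as images in the original document, the sketch below shows just one plausible arrangement of the components described above (a class-balancing weight, a 2-norm term, and an angle term scaled by μ); the weight form λ(k) = N / (2·G_k) and the overall averaging are assumptions, not the patent's definition.

import torch

def direction_loss(d_pred, d_gold, tumor_mask, mu=1.0):
    # Plausible sketch of L_2: per-pixel 2-norm + mu * angle, weighted by an
    # assumed class-balancing weight lambda(k) = N / (2 * G_k).
    # d_pred, d_gold: (B, 2, H, W) unit direction fields (predicted / gold).
    # tumor_mask:     (B, H, W) binary mask, 1 = liver tumor pixel.
    n = tumor_mask[0].numel()                    # pixels per image, N
    n_t = tumor_mask.flatten(1).sum(dim=1)       # tumor pixels, N_T
    n_b = n - n_t                                # non-tumor pixels, N_B
    g_k = torch.where(tumor_mask.bool(),
                      n_t.view(-1, 1, 1), n_b.view(-1, 1, 1)).clamp(min=1)
    lam = n / (2.0 * g_k)                        # assumed lambda(k)

    norm_term = torch.linalg.norm(d_pred - d_gold, dim=1)     # ||D_T - D_G||_2
    cos_sim = (d_pred * d_gold).sum(dim=1).clamp(-1 + 1e-6, 1 - 1e-6)
    angle_term = torch.arccos(cos_sim)                        # vector angle
    return (lam * (norm_term + mu * angle_term)).sum() / (n * tumor_mask.shape[0])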
(4-c) constructing the loss function L_3 of the direction correction module using the Dice coefficient, as shown in the following formula:
L_3 = 1 - 2|Y ∩ T| / (|Y| + |T|)
wherein, T is an output result of the third output layer, where a pixel predicted as a liver tumor is marked as "1", a pixel predicted as a non-liver tumor is marked as "0", and Y is a segmentation gold standard, where a liver tumor region pixel is marked as "1", and a non-liver tumor region pixel is marked as "0";
(4-d) constructing a loss function L of the MlpBran-Net network by combining the loss functions of the initial segmentation module, the direction information extraction module and the direction correction module:
L = L_1 + L_2 + L_3
(5) training the MlpBran-Net network by adopting a training data set A until L converges;
(6) segmenting the CT image to be tested using the trained network model to obtain the liver tumor segmentation result in the image.
Compared with the prior art, the method has the main advantages and innovations that:
(I) aiming at the difficulty convolutional networks have in extracting feature relationships between distant pixels in an image, a self-attention module is introduced into the convolutional network to establish long-range dependencies between tumor and non-tumor tissues, which enhances the network's ability to extract global information and improves its segmentation accuracy;
(II) the invention introduces a direction correction module based on direction information, which uses the predicted direction information to guide the pixel feature values in the feature map produced by the segmentation network to move toward the tumor boundary, so that the boundary of the feature map tends to coincide with the boundary of the target region, improving the network's ability to identify tumor boundaries.
Drawings
FIG. 1 is a schematic diagram of an MlpBran-Net network structure according to an embodiment of the present invention
FIG. 2 is a schematic diagram of a self-attention module according to an embodiment of the present invention
FIG. 3 is a schematic diagram of the results obtained when the direction correction module modifies a feature map according to an embodiment of the present invention, where FIG. 3(a) is the original feature map output by the fourth deconvolution module in the initial segmentation module, and FIGS. 3(b) to 3(f) are the results of the direction correction module iteratively updating the feature map 1 to 5 times according to the direction information, with the white dotted line representing the expert's manual segmentation result
FIG. 4 is a liver tumor segmentation result example according to an embodiment of the present invention
Detailed Description
The CT image liver tumor region automatic segmentation method based on the multi-branch network comprises the following specific implementation steps:
(1) randomly selecting 100 abdominal CT original sequence images and corresponding liver tumor manual segmentation results from a LiTS public database, and acquiring direction information pointing to the liver tumor boundary according to the liver tumor manual segmentation results, wherein the specific process comprises the following steps:
(1-a) for each pixel i in the CT image, judging whether pixel i belongs to the liver tumor region according to the manual liver tumor segmentation result; if it does, acquiring the pixel j in the non-liver-tumor region that is closest to pixel i in Euclidean distance, and if it does not, acquiring the pixel j in the liver tumor region that is closest to pixel i in Euclidean distance;
(1-b) calculating the direction D(i) in which pixel i points to the liver tumor boundary from the relative positions of pixel i and pixel j, using the following formulas:
D(i) = (D(i)_x, D(i)_y)

[the formulas defining D(i)_x and D(i)_y appear as images in the original document]
where D(i)_x and D(i)_y respectively denote the components of the direction information of pixel i in the row and column directions, i_x and i_y denote the row and column coordinates of pixel i, and j_x and j_y denote the row and column coordinates of pixel j;
(2) establishing a training data set A containing an original CT image, a liver tumor manual segmentation result and direction information thereof;
(3) constructing a deep convolutional multi-branch network that integrates a self-attention mechanism and direction information, namely MlpBran-Net; FIG. 1 is a schematic diagram of the MlpBran-Net structure according to this embodiment of the invention; the network comprises an initial segmentation module, a direction information extraction module and a direction correction module, and the specific construction process is as follows:
(3-a) constructing an initial segmentation module, wherein this network module comprises five convolution modules, four deconvolution modules, a self-attention module and an output layer; the output of the first convolution module serves simultaneously as the input of the second convolution module and of the fourth deconvolution module; the output of the second convolution module serves simultaneously as the input of the third convolution module and of the third deconvolution module; the output of the third convolution module serves simultaneously as the input of the fourth convolution module and of the second deconvolution module; the output of the fourth convolution module serves simultaneously as the input of the fifth convolution module and of the self-attention module; the output of the self-attention module and the output of the fifth convolution module serve as the input of the first deconvolution module; in addition, the output of each deconvolution module serves as the input of the next deconvolution module; the output layer included in the initial segmentation module consists of one convolution layer of size 1 × 1 and is denoted the first output layer; the output of the fourth deconvolution module serves as the input of the first output layer, and the output of the first output layer is the initial liver tumor segmentation result;
(3-b) in the initial segmentation module of the step (3-a), the first to fourth convolution modules are respectively formed by sequentially connecting 2 convolution layers with the size of 3 × 3 and 1 maximum pooling layer with the size of 2 × 2, and the fifth convolution module is formed by sequentially connecting 2 convolution layers with the size of 3 × 3;
(3-c) in the initial segmentation module of the step (3-a), each deconvolution module is formed by sequentially connecting 1 deconvolution layer with the size of 3 × 3, 1 splicing layer, and 2 convolution layers with the size of 3 × 3, wherein: the input of the deconvolution layer in the first deconvolution module is the output of the fifth convolution module, and the input of the deconvolution layer in the next deconvolution module is the output of the previous deconvolution module; the input of the splicing layer in the first deconvolution module comprises the output of the deconvolution layer in the current deconvolution module and the output of the self-attention module, and the input of the splicing layer in the second to fourth deconvolution modules comprises the output of the deconvolution layer in the current deconvolution module and the output of the convolution module connected with the deconvolution module; the 2 convolutional layers in each deconvolution module are respectively marked as a first convolutional layer and a second convolutional layer, the input of the first convolutional layer is the output of a splicing layer in the current deconvolution module, and the input of the second convolutional layer is the output of the first convolutional layer in the current deconvolution module;
(3-d) in the initial segmentation module of step (3-a), the self-attention module adopts a residual structure comprising an image blocking layer, a linear mapping layer, n self-attention layers and an image resizing layer connected in sequence, plus a skip connection; the image blocking layer divides the feature map output by the fourth convolution module into a number of sub-blocks, the specific structure being shown in FIG. 2; the linear mapping layer flattens the sub-blocks and linearly maps them to an M-dimensional subspace; each self-attention layer consists of a multi-head attention layer and a multilayer perceptron; the image resizing layer converts the output of the last self-attention layer into a feature map of the same size as the output of the fourth convolution module; through the skip connection, the feature map obtained by the image resizing layer is added to the feature map output by the fourth convolution module, and the sum serves as one input of the first deconvolution module; in this embodiment, n = 6 and M = 768 are preferred;
(3-e) constructing a direction information extraction module, specifically: first, adding a deconvolution module after each of the first, second and third deconvolution modules in the initial segmentation module, denoted the fifth, sixth and seventh deconvolution modules respectively, each newly added deconvolution module being formed by sequentially connecting 1 deconvolution layer of size 3 × 3 and 2 convolution layers of size 3 × 3; then, adding a splicing layer after the fifth, sixth and seventh deconvolution modules, which splices the outputs of the fifth, sixth and seventh deconvolution modules with the output of the fourth deconvolution module in the initial segmentation module; finally, inputting the spliced result into an output layer formed by a convolution layer of size 1 × 1, denoted the second output layer, whose output is the direction information;
(3-f) constructing a direction correction module, which specifically comprises the following steps: firstly, the direction information output by the direction information extraction module and the feature map output by the fourth deconvolution module in the initial segmentation module are used as input to construct a direction correction layer, and the direction correction layer is characterized in that the direction correction layer can use the direction information to iteratively update the feature value of each pixel in the feature map F output by the fourth deconvolution module, so that the feature value of the pixel far away from the liver tumor boundary gradually moves towards the boundary, and the specific calculation formula is as follows:
F_t(p_x, p_y) = F_{t-1}(p_x - D(p)_x, p_y - D(p)_y)
where p_x and p_y are the row and column coordinates of pixel p, F_t(p_x, p_y) is the feature value of pixel p after the feature map F has been iteratively updated t times, D(p)_x and D(p)_y are the components of the direction information of pixel p in the row and column directions, and F_{t-1}(p_x - D(p)_x, p_y - D(p)_y) is the feature value at coordinate (p_x - D(p)_x, p_y - D(p)_y) after the feature map F has been iteratively updated t-1 times; when the coordinate (p_x - D(p)_x, p_y - D(p)_y) is non-integer and satisfies 1 ≤ p_x - D(p)_x ≤ row and 1 ≤ p_y - D(p)_y ≤ col, where row and col denote the numbers of rows and columns of the feature map F, the feature value F_{t-1}(p_x - D(p)_x, p_y - D(p)_y) is obtained by bilinear interpolation on the feature map F_{t-1}; when the coordinate (p_x - D(p)_x, p_y - D(p)_y) falls outside the feature map boundary, i.e. when p_x - D(p)_x < 1, p_x - D(p)_x > row, p_y - D(p)_y < 1, or p_y - D(p)_y > col, the feature value of pixel p is not updated, i.e. F_t(p_x, p_y) = F_{t-1}(p_x, p_y); then, an output layer consisting of a convolution layer of size 1 × 1 is added after the direction correction layer, denoted the third output layer, whose output is the final liver tumor segmentation result; in this embodiment, t = 5 is preferred;
FIG. 3 is a schematic diagram of the results obtained when the direction correction module modifies a feature map in this embodiment, where FIG. 3(a) is the original feature map output by the fourth deconvolution module in the initial segmentation module, FIGS. 3(b) to 3(f) are the results of the direction correction module iteratively updating the feature map 1 to 5 times according to the direction information, and the white dotted line represents the expert's manual segmentation result; it can be seen that, by iteratively updating the feature map, the direction correction module effectively makes the feature map boundary tend to coincide with the boundary of the target region;
(4) constructing a loss function of the MlpBran-Net network, the specific process being as follows:
(4-a) constructing the loss function L_1 of the initial segmentation module using the Dice coefficient, as shown in the following formula:
L_1 = 1 - 2|Y ∩ P| / (|Y| + |P|)
where Y is the segmentation gold standard, in which liver tumor region pixels are labeled "1" and non-liver-tumor region pixels are labeled "0"; P is the output of the first output layer, in which pixels predicted as liver tumor are labeled "1" and pixels predicted as non-liver-tumor are labeled "0"; the symbol ∩ denotes the intersection, and |·| denotes the number of non-zero pixels in a given region;
(4-b) constructing the loss function L_2 of the direction information extraction module using the 2-norm and the vector angle, as shown in the following formula:
[the formula defining L_2 appears as an image in the original document]
where D_T(k) and D_G(k) are, respectively, the direction information of the k-th pixel output by the second output layer and the direction-information gold standard of the k-th pixel computed from the manual segmentation result; ||·||_2 denotes the 2-norm of a given vector; D_G(k)·D_T(k) denotes the inner product of the vectors D_G(k) and D_T(k); cos⁻¹(D_G(k)·D_T(k)) denotes the angle between the vectors D_G(k) and D_T(k); μ is a scale factor used to balance the 2-norm and the vector angle; N is the number of pixels in the CT image; and λ(k) is the adaptive weight of pixel k, calculated by the following formula:
[the formula defining λ(k) appears as an image in the original document]
where G_k denotes the number of pixels in the CT image belonging to the same class as pixel k: if k belongs to the liver tumor, G_k = N_T, where N_T is the number of liver tumor region pixels in the CT image; if k does not belong to the liver tumor, G_k = N_B, where N_B is the number of non-liver-tumor region pixels in the CT image; in this embodiment, μ = 1 is preferred;
(4-c) constructing the loss function L_3 of the direction correction module using the Dice coefficient, as shown in the following formula:
L_3 = 1 - 2|Y ∩ T| / (|Y| + |T|)
wherein, T is an output result of the third output layer, where a pixel predicted as a liver tumor is marked as "1", a pixel predicted as a non-liver tumor is marked as "0", and Y is a segmentation gold standard, where a liver tumor region pixel is marked as "1", and a non-liver tumor region pixel is marked as "0";
(4-d) combining the loss functions of the initial segmentation module, the direction information extraction module and the direction correction module to construct a loss function L of the MlpBran-Net network:
L = L_1 + L_2 + L_3
(5) training the MlpBran-Net network by adopting a training data set A until L converges;
(6) segmenting the CT image to be tested using the trained network model to obtain the liver tumor segmentation result in the image.
The regions indicated by the white curves in FIG. 4(a) to FIG. 4(d) are examples of the final liver tumor segmentation results obtained in this embodiment; it can be seen that the method of the present invention can effectively segment liver tumors of different positions and shapes in CT images, and can also accurately segment tumors with blurred boundaries.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (2)

1. The CT image liver tumor region automatic segmentation method based on the multi-branch network is characterized by comprising the following steps of:
(1) obtaining an original training data set containing original CT images and their manual liver tumor segmentation results from a public liver tumor segmentation database, and deriving direction information pointing to the liver tumor boundary from the manual segmentation results, the specific process being as follows:
(1-a) for each pixel i in the CT image, judging whether pixel i belongs to the liver tumor region according to the manual liver tumor segmentation result; if it does, acquiring the pixel j in the non-liver-tumor region that is closest to pixel i in Euclidean distance, and if it does not, acquiring the pixel j in the liver tumor region that is closest to pixel i in Euclidean distance;
(1-b) calculating the direction D(i) in which pixel i points to the liver tumor boundary from the relative positions of pixel i and pixel j, using the following formulas:
D(i) = (D(i)_x, D(i)_y)

[the formulas defining D(i)_x and D(i)_y appear as images in the original document]
where D(i)_x and D(i)_y respectively denote the components of the direction information of pixel i in the row and column directions, i_x and i_y denote the row and column coordinates of pixel i, and j_x and j_y denote the row and column coordinates of pixel j;
(2) establishing a training data set A containing an original CT image, a liver tumor manual segmentation result and direction information thereof;
(3) constructing a deep convolutional multi-branch network combining a self-attention mechanism and direction information, called MlpBran-Net; the network comprises an initial segmentation module, a direction information extraction module and a direction correction module, and the specific construction process is as follows:
(3-a) constructing an initial segmentation module, wherein this network module comprises five convolution modules, four deconvolution modules, a self-attention module and an output layer; the output of the first convolution module serves simultaneously as the input of the second convolution module and of the fourth deconvolution module; the output of the second convolution module serves simultaneously as the input of the third convolution module and of the third deconvolution module; the output of the third convolution module serves simultaneously as the input of the fourth convolution module and of the second deconvolution module; the output of the fourth convolution module serves simultaneously as the input of the fifth convolution module and of the self-attention module; the output of the self-attention module and the output of the fifth convolution module serve as the input of the first deconvolution module; in addition, the output of each deconvolution module serves as the input of the next deconvolution module; the output layer included in the initial segmentation module consists of one convolution layer of size 1 × 1 and is denoted the first output layer; the output of the fourth deconvolution module serves as the input of the first output layer, and the output of the first output layer is the initial liver tumor segmentation result;
(3-b) in the initial segmentation module of the step (3-a), the first to fourth convolution modules are respectively formed by sequentially connecting 2 convolution layers with the size of 3 × 3 and 1 maximum pooling layer with the size of 2 × 2, and the fifth convolution module is formed by sequentially connecting 2 convolution layers with the size of 3 × 3;
(3-c) in the initial segmentation module of the step (3-a), each deconvolution module is formed by sequentially connecting 1 deconvolution layer with the size of 3 × 3, 1 splicing layer, and 2 convolution layers with the size of 3 × 3, wherein: the input of the deconvolution layer in the first deconvolution module is the output of the fifth convolution module, and the input of the deconvolution layer in the next deconvolution module is the output of the previous deconvolution module; the input of the splicing layer in the first deconvolution module comprises the output of the deconvolution layer in the current deconvolution module and the output of the self-attention module, and the input of the splicing layer in the second to fourth deconvolution modules comprises the output of the deconvolution layer in the current deconvolution module and the output of the convolution module connected with the deconvolution module;
(3-d) in the initial segmentation module of step (3-a), the self-attention module adopts a residual structure comprising an image blocking layer, a linear mapping layer, n self-attention layers and an image resizing layer connected in sequence, plus a skip connection, wherein the image blocking layer divides the feature map output by the fourth convolution module into a number of sub-blocks; the linear mapping layer flattens the sub-blocks and linearly maps them to an M-dimensional subspace; each self-attention layer consists of a multi-head attention layer and a multilayer perceptron; the image resizing layer converts the output of the last self-attention layer into a feature map of the same size as the output of the fourth convolution module; through the skip connection, the feature map obtained by the image resizing layer is added to the feature map output by the fourth convolution module, and the sum serves as one input of the first deconvolution module;
(3-e) constructing a direction information extraction module, specifically: first, adding a deconvolution module after each of the first, second and third deconvolution modules in the initial segmentation module, denoted the fifth, sixth and seventh deconvolution modules respectively, each newly added deconvolution module being formed by sequentially connecting 1 deconvolution layer of size 3 × 3 and 2 convolution layers of size 3 × 3; then, adding a splicing layer after the fifth, sixth and seventh deconvolution modules, which splices the outputs of the fifth, sixth and seventh deconvolution modules with the output of the fourth deconvolution module in the initial segmentation module; finally, inputting the spliced result into an output layer formed by a convolution layer of size 1 × 1, denoted the second output layer, whose output is the direction information;
(3-f) constructing a direction correction module, which specifically comprises the following steps: firstly, the direction information output by the direction information extraction module and the feature map output by the fourth deconvolution module in the initial segmentation module are used as input to construct a direction correction layer, and the direction correction layer is characterized in that the direction correction layer can iteratively update the feature value of each pixel in the feature map F output by the fourth deconvolution module by using the direction information, so that the feature value of the pixel far away from the liver tumor boundary gradually moves towards the boundary, and the specific calculation formula is as follows:
F_t(p_x, p_y) = F_{t-1}(p_x - D(p)_x, p_y - D(p)_y)
where p_x and p_y are the row and column coordinates of pixel p, F_t(p_x, p_y) is the feature value of pixel p after the feature map F has been iteratively updated t times, D(p)_x and D(p)_y are the components of the direction information of pixel p in the row and column directions, and F_{t-1}(p_x - D(p)_x, p_y - D(p)_y) is the feature value at coordinate (p_x - D(p)_x, p_y - D(p)_y) after the feature map F has been iteratively updated t-1 times; when the coordinate (p_x - D(p)_x, p_y - D(p)_y) is non-integer and satisfies 1 ≤ p_x - D(p)_x ≤ row and 1 ≤ p_y - D(p)_y ≤ col, where row and col denote the numbers of rows and columns of the feature map F, the feature value F_{t-1}(p_x - D(p)_x, p_y - D(p)_y) is obtained by bilinear interpolation on the feature map F_{t-1}; when the coordinate (p_x - D(p)_x, p_y - D(p)_y) falls outside the feature map boundary, i.e. when p_x - D(p)_x < 1, p_x - D(p)_x > row, p_y - D(p)_y < 1, or p_y - D(p)_y > col, the feature value of pixel p is not updated, i.e. F_t(p_x, p_y) = F_{t-1}(p_x, p_y); then, an output layer consisting of a convolution layer of size 1 × 1 is added after the direction correction layer, denoted the third output layer, whose output is the final liver tumor segmentation result;
(4) constructing a loss function of the MlpBran-Net network, which comprises the following specific processes:
(4-a) constructing the loss function L_1 of the initial segmentation module using the Dice coefficient, as shown in the following formula:
L_1 = 1 - 2|Y ∩ P| / (|Y| + |P|)
where Y is the segmentation gold standard, in which liver tumor region pixels are labeled "1" and non-liver-tumor region pixels are labeled "0"; P is the output of the first output layer, in which pixels predicted as liver tumor are labeled "1" and pixels predicted as non-liver-tumor are labeled "0"; the symbol ∩ denotes the intersection, and |·| denotes the number of non-zero pixels in a given region;
(4-b) constructing the loss function L_2 of the direction information extraction module using the 2-norm and the vector angle, as shown in the following formula:
[the formula defining L_2 appears as an image in the original document]
where D_T(k) and D_G(k) are, respectively, the direction information of the k-th pixel output by the second output layer and the direction-information gold standard of the k-th pixel computed from the manual segmentation result; ||·||_2 denotes the 2-norm of a given vector; D_G(k)·D_T(k) denotes the inner product of the vectors D_G(k) and D_T(k); cos⁻¹(D_G(k)·D_T(k)) denotes the angle between the vectors D_G(k) and D_T(k); μ is a scale factor used to balance the 2-norm and the vector angle; N is the number of pixels in the CT image; and λ(k) is the adaptive weight of pixel k, calculated by the following formula:
[the formula defining λ(k) appears as an image in the original document]
where G_k denotes the number of pixels in the CT image belonging to the same class as pixel k: if k belongs to the liver tumor, G_k = N_T, where N_T is the number of liver tumor region pixels in the CT image; if k does not belong to the liver tumor, G_k = N_B, where N_B is the number of non-liver-tumor region pixels in the CT image;
(4-c) constructing the loss function L_3 of the direction correction module using the Dice coefficient, as shown in the following formula:
L_3 = 1 - 2|Y ∩ T| / (|Y| + |T|)
wherein, T is an output result of the third output layer, where a pixel predicted as a liver tumor is marked as "1", a pixel predicted as a non-liver tumor is marked as "0", and Y is a segmentation gold standard, where a liver tumor region pixel is marked as "1", and a non-liver tumor region pixel is marked as "0";
(4-d) combining the loss functions of the initial segmentation module, the direction information extraction module and the direction correction module to construct a loss function L of the MlpBran-Net network:
L = L_1 + L_2 + L_3
(5) training the MlpBran-Net network by adopting a training data set A until L converges;
(6) segmenting the CT image to be tested using the trained network model to obtain the liver tumor segmentation result in the image.
2. The method for automatically segmenting liver tumor regions in CT images based on a multi-branch network as claimed in claim 1, characterized in that: n is preferably a constant of 4-8, M is preferably a constant of 400-900, μ is preferably a constant of 0.8-1.2, and t is preferably a constant of 4-8.
CN202210389599.7A 2022-04-14 2022-04-14 CT image liver tumor region automatic segmentation method based on multi-branch network Active CN114842025B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210389599.7A CN114842025B (en) 2022-04-14 2022-04-14 CT image liver tumor region automatic segmentation method based on multi-branch network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210389599.7A CN114842025B (en) 2022-04-14 2022-04-14 CT image liver tumor region automatic segmentation method based on multi-branch network

Publications (2)

Publication Number Publication Date
CN114842025A (en) 2022-08-02
CN114842025B CN114842025B (en) 2024-04-05

Family

ID=82563828

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210389599.7A Active CN114842025B (en) 2022-04-14 2022-04-14 CT image liver tumor region automatic segmentation method based on multi-branch network

Country Status (1)

Country Link
CN (1) CN114842025B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210089807A1 (en) * 2019-09-25 2021-03-25 Samsung Electronics Co., Ltd. System and method for boundary aware semantic segmentation
CN112734762A (en) * 2020-12-31 2021-04-30 西华师范大学 Dual-path UNet network tumor segmentation method based on covariance self-attention mechanism
CN113129310A (en) * 2021-03-04 2021-07-16 同济大学 Medical image segmentation system based on attention routing
CN114240962A (en) * 2021-11-23 2022-03-25 湖南科技大学 CT image liver tumor region automatic segmentation method based on deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ASHISH SINHA, ET AL.: "Multi-Scale Self-Guided Attention for Medical Image Segmentation", IEEE, 14 April 2020 (2020-04-14) *
HE LAN; WU QIAN: "Automatic liver segmentation method based on 3D convolutional neural networks", CHINESE JOURNAL OF MEDICAL PHYSICS, no. 06, 25 June 2018 (2018-06-25) *

Also Published As

Publication number Publication date
CN114842025B (en) 2024-04-05

Similar Documents

Publication Publication Date Title
CN111798462B (en) Automatic delineation method of nasopharyngeal carcinoma radiotherapy target area based on CT image
CN113674253B (en) Automatic segmentation method for rectal cancer CT image based on U-transducer
CN113012172B (en) AS-UNet-based medical image segmentation method and system
CN114240962B (en) CT image liver tumor region automatic segmentation method based on deep learning
CN110363802B (en) Prostate image registration system and method based on automatic segmentation and pelvis alignment
CN109166133A (en) Soft tissue organs image partition method based on critical point detection and deep learning
CN111127482A (en) CT image lung trachea segmentation method and system based on deep learning
CN114066866B (en) Medical image automatic segmentation method based on deep learning
CN107680107B (en) Automatic segmentation method of diffusion tensor magnetic resonance image based on multiple maps
CN113706486B (en) Pancreatic tumor image segmentation method based on dense connection network migration learning
CN107680110B (en) Inner ear three-dimensional level set segmentation method based on statistical shape model
CN110648331B (en) Detection method for medical image segmentation, medical image segmentation method and device
CN112529909A (en) Tumor image brain region segmentation method and system based on image completion
CN110008992B (en) Deep learning method for prostate cancer auxiliary diagnosis
CN112750137B (en) Liver tumor segmentation method and system based on deep learning
CN111127487B (en) Real-time multi-tissue medical image segmentation method
CN113436173A (en) Abdomen multi-organ segmentation modeling and segmentation method and system based on edge perception
CN112950611A (en) Liver blood vessel segmentation method based on CT image
CN116258933A (en) Medical image segmentation device based on global information perception
CN109919216B (en) Counterlearning method for computer-aided diagnosis of prostate cancer
Zuo et al. A method of crop seedling plant segmentation on edge information fusion model
CN112489062B (en) Medical image segmentation method and system based on boundary and neighborhood guidance
CN117911432A (en) Image segmentation method, device and storage medium
CN112750131A (en) Pelvis nuclear magnetic resonance image musculoskeletal segmentation method based on scale and sequence relation
CN113344940A (en) Liver blood vessel image segmentation method based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant