CN116205936A - Image segmentation method introducing spatial information and attention mechanism - Google Patents

Image segmentation method introducing spatial information and attention mechanism Download PDF

Info

Publication number
CN116205936A
CN116205936A
Authority
CN
China
Prior art keywords
image
feature
stage
attention mechanism
image segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310350733.7A
Other languages
Chinese (zh)
Inventor
栾晓
薛加望
刘玲慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202310350733.7A priority Critical patent/CN116205936A/en
Publication of CN116205936A publication Critical patent/CN116205936A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Magnetic Resonance Imaging Apparatus (AREA)

Abstract

The invention relates to an image segmentation method introducing spatial information and an attention mechanism, belonging to the field of image segmentation. The method comprises the following steps: S1: data preparation stage: preprocessing a brain medical image and cropping image patches from the preprocessed image; S2: feature encoding stage: extracting image features by pre-activated 3D convolution; S3: feature decoding stage: restoring the feature map obtained in the encoding stage to the original image size through deconvolution and an attention mechanism with position encoding, completing the image segmentation process. By using the attention mechanism to attend to spatial information, the invention improves the segmentation performance of the network.

Description

Image segmentation method introducing spatial information and attention mechanism
Technical Field
The invention belongs to the field of image segmentation, and relates to an image segmentation method introducing spatial information and an attention mechanism.
Background
With the rapid development of medical imaging technologies such as computed tomography (Computed Tomography, CT) and magnetic resonance imaging (Magnetic Resonance Imaging, MRI), medical images play an increasingly important role in clinical diagnosis. Medical image segmentation provides a scientific reference when medical staff judge and diagnose the etiology of a disease, greatly reducing the misdiagnosis rate caused by the limited visual resolution of human observers or by insufficient clinical experience, and thereby improving the utilization of medical images. U-Net has proven to be very effective in medical image processing tasks; however, for 3D images with complex anatomical structures, a purely convolution-based encoder-decoder cannot fully exploit the spatial information of the 3D image.
Disclosure of Invention
In view of the above, an object of the present invention is to provide an image segmentation method introducing spatial information and an attention mechanism. By using 3D relative position coding and an attention mechanism to fully mine the spatial information of a 3D image, the semantic information of the image is recovered more accurately on the decoding path, thereby improving the segmentation accuracy of the model.
In order to achieve the above purpose, the present invention provides the following technical solutions:
an image segmentation method introducing spatial information and an attention mechanism. The method designs a self-attention segmentation network assisted by relative position coding: image features are extracted by pre-activated 3D convolution in the encoding stage, the image size is gradually restored by deconvolution in the decoding stage, and the image features are restored by a non-local self-attention module embedded with relative position coding. The method comprises the following steps:
S1: data preparation stage: preprocessing a brain medical image and cropping image patches from the preprocessed image;
S2: feature encoding stage: extracting image features by pre-activated 3D convolution;
S3: feature decoding stage: restoring the feature map obtained in the encoding stage to the original image size through deconvolution and an attention mechanism with position encoding, completing the image segmentation process.
Further, the step S1 includes the following steps:
S11: cropping the three-dimensional medical image, cutting off background regions with a gray value of 0 along the plane formed by any two axes;
S12: normalizing the cropped image by Z-score so that the gray-value distribution of the image has a mean of 0 and a standard deviation of 1, i.e. follows a standard normal distribution;
S13: cutting the cropped image into image patches of size 32×32×32 and randomly selecting one patch as the input of the feature encoding stage in step S2; if the data are multi-modal, the data of all modalities are concatenated along the channel dimension to form a multi-channel image, which is used as the input of the feature encoding stage in step S2.
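The preprocessing in S12-S13 can be sketched as follows with NumPy. The function names, the reading of the patch size as 32×32×32, and the modality stacking at the end are illustrative assumptions, not taken verbatim from the patent:

```python
import numpy as np

def zscore_normalize(volume):
    """Z-score normalization: gray values get mean 0 and standard deviation 1."""
    return (volume - volume.mean()) / volume.std()

def random_patch(volume, size=32, rng=None):
    """Randomly crop a cubic patch of side `size` from a 3D volume."""
    if rng is None:
        rng = np.random.default_rng(0)
    d, h, w = volume.shape
    z, y, x = (rng.integers(0, s - size + 1) for s in (d, h, w))
    return volume[z:z + size, y:y + size, x:x + size]

vol = np.random.default_rng(1).random((64, 64, 64))
patch = random_patch(zscore_normalize(vol), size=32)

# Multi-modal case: stack per-modality patches along a new channel axis.
multi_channel = np.stack([patch, patch], axis=0)   # shape (2, 32, 32, 32)
```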
Further, the step S2 includes the following steps:
S21: performing feature extraction on the cropped three-dimensional image using standard 3D convolution to obtain a 32×32×32 feature map;
S22: downsampling the feature map of S21 using a 3D convolution with a stride of 2;
S23: repeating operations S21 and S22 until a 4×4×4 feature map x4 is finally obtained.
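Assuming "same" padding, each stride-2 convolution halves the spatial size, so three downsampling stages take a 32×32×32 patch to a 4×4×4 map. A quick sanity check of this size progression (illustrative, not from the patent):

```python
def downsampled_size(size, stride=2):
    """Spatial side length after a stride-2 convolution with 'same' padding."""
    return (size + stride - 1) // stride

sizes = [32]
while sizes[-1] > 4:
    sizes.append(downsampled_size(sizes[-1]))
# sizes records the per-stage side lengths: 32 -> 16 -> 8 -> 4
```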
Further, the step S3 specifically includes the following steps:
S31: for the 3D feature map obtained in the encoding stage, taking each pixel in turn as the origin and calculating the relative positions of all other pixels;
S32: embedding the position coding generated in step S31 into a non-local self-attention mechanism, and performing feature fusion with the feature map of the encoding stage;
S33: upsampling the feature map using deconvolution and then repeating S32;
S34: repeating step S33 twice to complete the image segmentation.
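The relative-position computation of S31 can be sketched in NumPy as follows. The grid size and the (dz, dy, dx) layout are illustrative assumptions:

```python
import numpy as np

def relative_positions(d, h, w):
    """For each voxel of a d*h*w grid taken as origin, compute the relative
    (dz, dy, dx) offsets of every other voxel."""
    axes = np.meshgrid(np.arange(d), np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack(axes, axis=-1).reshape(-1, 3)   # (N, 3), N = d*h*w
    # rel[i, j] is the offset of voxel j relative to voxel i
    return coords[None, :, :] - coords[:, None, :]    # (N, N, 3)

rel = relative_positions(2, 2, 2)  # tiny 2x2x2 grid for illustration
```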
The invention has the beneficial effects that spatial information is introduced by relative position coding and embedded into the self-attention mechanism, so that the learning of weights in the attention mechanism depends not only on gray information but also on position information. Using the attention mechanism to attend to spatial information improves the segmentation performance of the network.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objects and other advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the specification.
Drawings
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in detail below with reference to the accompanying drawings, in which:
FIG. 1 is a network structure diagram of an image segmentation model in the present invention;
FIG. 2 is a schematic diagram of a relative position encoding structure according to the present invention;
fig. 3 is an upsampling module of the network according to the present invention.
Detailed Description
Other advantages and effects of the present invention will become apparent to those skilled in the art from the following disclosure, which describes embodiments of the invention with reference to specific examples. The invention may also be practiced or carried out in other embodiments, and the details of the present description may be modified or varied without departing from the spirit and scope of the invention. It should be noted that the illustrations provided in the following embodiments merely illustrate the basic idea of the invention, and the following embodiments and the features in the embodiments may be combined with each other without conflict.
The drawings are for illustrative purposes only, are schematic rather than physical representations, and are not intended to limit the invention; for the purpose of better illustrating the embodiments, certain elements of the drawings may be omitted, enlarged or reduced, and do not represent the size of the actual product; it will be appreciated by those skilled in the art that certain well-known structures and their descriptions may be omitted from the drawings.
The same or similar reference numbers in the drawings of the embodiments correspond to the same or similar components. In the description of the invention, terms such as "upper", "lower", "left", "right", "front" and "rear" indicate an orientation or positional relationship based on that shown in the drawings; they are used only for convenience and simplification of description, do not indicate or imply that the referenced device or element must have a specific orientation or be constructed and operated in a specific orientation, and should not be construed as limiting the invention. The specific meaning of these terms can be understood by those of ordinary skill in the art according to the circumstances.
Referring to fig. 1 to 3, the present invention provides an image segmentation method introducing spatial information and an attention mechanism. In this embodiment, the method is assumed to be used for segmenting brain tissue images, and a self-attention segmentation network assisted by relative position coding is designed; the network structure diagram is shown in fig. 1. In the encoding stage, image features are extracted by pre-activated 3D convolution; in the decoding stage, the image size is gradually restored by deconvolution, and the image features are restored by a non-local self-attention module (shown in fig. 3) embedded with relative position coding. The method comprises the following steps:
Step 1: data preparation stage: the brain medical image is preprocessed, and 32×32×32 image patches are cropped from the preprocessed image;
Step 101: the three-dimensional medical image is cropped, cutting off background regions with a gray value of 0 along the plane formed by any two axes;
Step 102: the cropped image is normalized by Z-score so that the gray-value distribution of the image has a mean of 0 and a standard deviation of 1, i.e. follows a standard normal distribution;
Step 103: the cropped image is cut into image patches of size 32×32×32, and one patch is randomly selected as the input of the model; if the data are multi-modal, the data of all modalities are concatenated along the channel dimension to form a multi-channel image used as the network input.
Step 2: feature encoding stage: image features are extracted by pre-activated 3D convolution;
Step 201: feature extraction is performed on the three-dimensional image using standard 3D convolution to obtain a 32×32×32 feature map;
Step 202: the feature map of step 201 is downsampled using a 3D convolution with a stride of 2;
Step 203: steps 201 and 202 are repeated until a 4×4×4 feature map x4 is finally obtained.
Step 3: feature decoding stage: the feature map obtained in the encoding stage is restored to the original image size through deconvolution and an attention mechanism with position encoding, completing the image segmentation process.
Step 301: for the 3D feature map obtained in the encoding stage, each pixel is taken in turn as the origin and the relative positions of all other pixels are calculated, as shown in FIG. 2;
Step 302: as shown in fig. 3, the position coding generated in step 301 is embedded into a non-local self-attention mechanism, and feature fusion is performed with the feature map of the encoding stage.
Step 303: the feature map is upsampled using deconvolution, and then step 302 is repeated.
Step 304: step 303 is repeated twice to complete the image segmentation.
Spatial information is introduced by relative position coding and embedded into the self-attention mechanism, so that the learning of weights in the attention mechanism depends not only on gray information but also on position information. Using the attention mechanism to attend to spatial information improves the segmentation performance of the network.
In order to verify the effect of the present invention, the following experiments were performed:
based on this image segmentation method, which introduces spatial information and attention mechanisms, tests were performed on the IBSR18 dataset. The IBSR18 dataset contained 18 training samples, the test objective was to segment brain tissue nmr images into Grey Matter (GM), white Matter (WM), cerebrospinal fluid (CSF) and background. The 14 data samples are used as training set, the remaining one as validation set. At the same time, method 1 using spatial attention and channel attention at the same time, method 2 using self-attention mechanism, method 3 using axial attention, and method 4 of the present invention are compared. The Dice coefficient is used as an evaluation index, and the formula is as follows:
Dice(A, B) = 2|A ∩ B| / (|A| + |B|)
where A represents the segmentation result of the neural network and B represents the gold standard provided by the dataset.
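For binary masks, the Dice coefficient above can be computed as follows (NumPy sketch; the function name is illustrative):

```python
import numpy as np

def dice_coefficient(a, b):
    """Dice(A, B) = 2|A ∩ B| / (|A| + |B|) for binary masks."""
    a = a.astype(bool)
    b = b.astype(bool)
    intersection = np.logical_and(a, b).sum()
    return 2.0 * intersection / (a.sum() + b.sum())

pred = np.array([1, 1, 0, 0])
gold = np.array([1, 0, 0, 0])
# intersection = 1, |A| = 2, |B| = 1, so Dice = 2/3
```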
Table 1 gives the test results on the dataset. It can be seen that, in terms of the Dice coefficient, the neural network of the present invention performs better on each segmented tissue.
TABLE 1
           CSF      GM       WM       AVG
Method 1   85.88    95.30    95.05    92.08
Method 2   86.12    95.38    95.04    92.17
Method 3   86.20    95.37    95.01    92.19
Method 4   86.77    95.48    95.06    92.44
Finally, it is noted that the above embodiments are only intended to illustrate the technical solution of the present invention and not to limit it. Although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the invention, all of which are intended to be covered by the claims of the present invention.

Claims (4)

1. An image segmentation method introducing spatial information and an attention mechanism, characterized in that it comprises the following steps:
S1: data preparation stage: preprocessing a brain medical image and cropping image patches from the preprocessed image;
S2: feature encoding stage: extracting image features by pre-activated 3D convolution;
S3: feature decoding stage: restoring the feature map obtained in the encoding stage to the original image size through deconvolution and an attention mechanism with position encoding, completing the image segmentation process.
2. The image segmentation method introducing spatial information and an attention mechanism according to claim 1, characterized in that the step S1 comprises the following steps:
S11: cropping the three-dimensional medical image, cutting off background regions with a gray value of 0 along the plane formed by any two axes;
S12: normalizing the cropped image by Z-score so that the gray-value distribution of the image has a mean of 0 and a standard deviation of 1, i.e. follows a standard normal distribution;
S13: cutting the cropped image into a plurality of image patches of size 32×32×32 and randomly selecting one patch as the input of the feature encoding stage in step S2; if the data are multi-modal, the data of all modalities are concatenated along the channel dimension to form a multi-channel image, which is used as the input of the feature encoding stage in step S2.
3. The image segmentation method introducing spatial information and an attention mechanism according to claim 1, characterized in that the step S2 comprises the following steps:
S21: performing feature extraction on the cropped three-dimensional image using standard 3D convolution to obtain a 32×32×32 feature map;
S22: downsampling the feature map of S21 using a 3D convolution with a stride of 2;
S23: repeating operations S21 and S22 until a 4×4×4 feature map x4 is finally obtained.
4. The image segmentation method introducing spatial information and an attention mechanism according to claim 1, characterized in that the step S3 specifically comprises the following steps:
S31: for the 3D feature map obtained in the encoding stage, taking each pixel in turn as the origin and calculating the relative positions of all other pixels;
S32: embedding the position coding generated in step S31 into a non-local self-attention mechanism, and performing feature fusion with the feature map of the encoding stage;
S33: upsampling the feature map using deconvolution and then repeating S32;
S34: repeating step S33 twice to complete the image segmentation.
CN202310350733.7A 2023-04-04 2023-04-04 Image segmentation method introducing spatial information and attention mechanism Pending CN116205936A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310350733.7A CN116205936A (en) 2023-04-04 2023-04-04 Image segmentation method introducing spatial information and attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310350733.7A CN116205936A (en) 2023-04-04 2023-04-04 Image segmentation method introducing spatial information and attention mechanism

Publications (1)

Publication Number Publication Date
CN116205936A true CN116205936A (en) 2023-06-02

Family

ID=86514874

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310350733.7A Pending CN116205936A (en) 2023-04-04 2023-04-04 Image segmentation method introducing spatial information and attention mechanism

Country Status (1)

Country Link
CN (1) CN116205936A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118297941A (en) * 2024-06-03 2024-07-05 中国科学院自动化研究所 Three-dimensional abdominal aortic aneurysm and visceral vessel lumen extraction method and device
CN118297941B (en) * 2024-06-03 2024-10-25 中国科学院自动化研究所 Three-dimensional abdominal aortic aneurysm and visceral vessel lumen extraction method and device


Similar Documents

Publication Publication Date Title
CN112150428B (en) Medical image segmentation method based on deep learning
CN114926477B (en) Brain tumor multi-mode MRI image segmentation method based on deep learning
JP2023540910A (en) Connected Machine Learning Model with Collaborative Training for Lesion Detection
CN117218453B (en) Incomplete multi-mode medical image learning method
CN113393469A (en) Medical image segmentation method and device based on cyclic residual convolutional neural network
CN112396605B (en) Network training method and device, image recognition method and electronic equipment
CN114494296A (en) Brain glioma segmentation method and system based on fusion of Unet and Transformer
CN110910335B (en) Image processing method, image processing device and computer readable storage medium
CN115809998A (en) Based on E 2 Glioma MRI data segmentation method based on C-Transformer network
CN116433586A (en) Mammary gland ultrasonic tomography image segmentation model establishment method and segmentation method
CN116309615A (en) Multi-mode MRI brain tumor image segmentation method
Wu et al. Continuous refinement-based digital pathology image assistance scheme in medical decision-making systems
CN115311193A (en) Abnormal brain image segmentation method and system based on double attention mechanism
CN110992309A (en) Fundus image segmentation method based on deep information transfer network
Kumaraswamy et al. Automatic prostate segmentation of magnetic resonance imaging using Res-Net
CN116862930B (en) Cerebral vessel segmentation method, device, equipment and storage medium suitable for multiple modes
CN115690409A (en) SEResu-Net model-based MRI brain tumor image segmentation method and system
CN116228732A (en) Breast cancer molecular typing prediction method, system, medium, equipment and terminal
CN113379770B (en) Construction method of nasopharyngeal carcinoma MR image segmentation network, image segmentation method and device
CN116205936A (en) Image segmentation method introducing spatial information and attention mechanism
CN113177938B (en) Method and device for segmenting brain glioma based on circular convolution kernel and related components
CN113409324B (en) Brain segmentation method fusing differential geometric information
Fei et al. Deep Learning-Based Auto-Segmentation of Spinal Cord Internal Structure of Diffusion Tensor Imaging in Cervical Spondylotic Myelopathy
CN112634279A (en) Medical image semantic segmentation method based on attention Unet model
CN116524285A (en) Brain tissue image segmentation method introducing prior information and feature fusion

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination