CN116152492A - Medical image segmentation method based on multi-attention fusion

Medical image segmentation method based on multi-attention fusion

Info

Publication number
CN116152492A
CN116152492A (application CN202310064474.1A)
Authority
CN
China
Prior art keywords
image segmentation
fusion
attention
medical image
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310064474.1A
Other languages
Chinese (zh)
Inventor
章勇勤
米继宗
刘钰
叶易凤
常明则
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NORTHWEST UNIVERSITY
Original Assignee
NORTHWEST UNIVERSITY
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NORTHWEST UNIVERSITY
Priority to CN202310064474.1A
Publication of CN116152492A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/52 Scale-space analysis, e.g. wavelet analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/03 Recognition of patterns in medical or anatomical images

Abstract

The application relates to a medical image segmentation method based on multi-attention fusion. By improving and optimizing attention modules, the model is made to focus more on the target area, which improves its segmentation precision for the region of interest and its boundaries. The method addresses a shortcoming of existing target detection methods, which tend to segment and classify the target area but rarely achieve high-precision edge segmentation. Experiments show that the method preserves the segmentation quality of the backbone network and is clearly superior to existing methods in segmenting lesion areas.

Description

Medical image segmentation method based on multi-attention fusion
Technical Field
The application relates to the technical field of image segmentation, in particular to a medical image segmentation method based on multi-attention fusion.
Background
The brain is the core of the human nervous system, and brain lesions may lead to permanent impairment of brain function, disability or death. Accurate detection and segmentation of brain lesions can help quantify various pathological indicators (e.g., total lesion volume, lesion location and the number of lesion masses). These quantitative indicators are closely related to aging and pathological changes of the brain, may provide useful clues for patient prognosis, and can further be used to analyze the effects of pharmaceutical interventions and to guide the design of surgical intervention protocols. Hemorrhagic transformation after cerebral infarction refers to bleeding caused by the restoration of blood perfusion in the ischemic area following acute cerebral infarction. It is part of the natural course of cerebral infarction and a major adverse reaction of therapies such as thrombolysis; it is not only associated with poor prognosis of cerebral infarction but is also an important reason why various blood-flow-improving therapies are underused. Rapid analysis of CT images of post-infarction hemorrhagic transformation determines whether a doctor can quickly and accurately diagnose and treat the patient's condition, so a scientific method for rapidly segmenting the hemorrhagic transformation area is necessary.
In the prior art, deep learning automatic segmentation is used to detect and segment brain lesions such as post-infarction hemorrhagic transformation, which is significant for further analysis of brain tissue and for the diagnosis and accurate localization of brain diseases. However, for tasks requiring high edge accuracy, such methods depend on the accuracy of the bounding box: segmentation of non-rectangular objects tends to be less effective, and for targets without fixed boundaries, such as the irregularly shaped lesions that make up most cases, segmentation performance is often poor.
Disclosure of Invention
In order to overcome at least one of the shortcomings in the prior art, embodiments of the present application provide a medical image segmentation method based on multi-attention fusion.
In a first aspect, a medical image segmentation model based on multi-attention fusion is provided, comprising: a deep residual network, a dilated convolution spatial attention module, a pyramid expansion module and a dual-branch fusion module;
the deep residual network is used for extracting features from the image to be segmented to obtain a plurality of first feature maps of different scales;
the dilated convolution spatial attention module is used for performing dilated convolution on the first feature maps of different scales to obtain second feature maps of different scales;
the pyramid expansion module is used for performing convolution and fusion operations on the second feature maps of different scales, from large scale to small, to obtain third feature maps of different scales, and then convolving and fusing the third feature maps of different scales, from small scale to large, to obtain fourth feature maps of different scales;
the dual-branch fusion module is used for performing category mask prediction and foreground/background mask prediction on the fourth feature maps of different scales, and fusing the category mask prediction result and the foreground/background mask prediction result to obtain the image segmentation result.
In one embodiment, the dilated convolution spatial attention module includes an input layer, a max-pooling and average-pooling layer, and a dilated convolution layer connected in sequence.
In one embodiment, the dual-branch fusion module includes a category mask prediction branch and a foreground/background mask prediction branch;
the category mask prediction branch comprises 4 convolution layers, a deconvolution layer and a convolution layer connected in sequence;
the foreground/background mask prediction branch comprises 2 convolution layers and a fully connected layer connected in sequence;
the last convolution layer of the category mask prediction branch outputs the category mask prediction result, and the fully connected layer of the foreground/background mask prediction branch outputs the foreground/background mask prediction result.
In a second aspect, a medical image segmentation method based on multi-attention fusion is provided, including:
inputting an image to be segmented into a medical image segmentation model based on multi-attention fusion to obtain an image segmentation result;
the medical image segmentation model based on multi-attention fusion is the model provided in the first aspect above.
In one embodiment, the method further comprises training the medical image segmentation model based on multi-attention fusion to obtain a trained medical image segmentation model.
In one embodiment, training a medical image segmentation model based on multi-attention fusion includes:
determining at least one target area for each fourth feature map obtained by the pyramid expansion module, and inputting the at least one target area into the dual-branch fusion module.
In one embodiment, determining at least one target region for each fourth feature map comprises:
determining a plurality of predicted target region boxes in each fourth feature map;
calculating the corresponding intersection-over-union (IoU) of each predicted target region box from the real target region box and the plurality of predicted target region boxes;
and sorting the IoU values of all predicted target region boxes from large to small, and selecting at least one top-ranked predicted target region box as the at least one target area.
In one embodiment, the corresponding intersection-over-union IoU_new of each predicted target region box is calculated from the real target region box and the plurality of predicted target region boxes using the following formula:
[Formula image in the original: IoU_new expressed in terms of the predicted box S1, the real box S2, and a penalty factor λ; the exact expression is not recoverable from the text.]
where S1 is the predicted target region box, S2 is the real target region box, and λ is a penalty factor.
In one embodiment, training a medical image segmentation model based on multi-attention fusion includes:
performing HU-value truncation and data augmentation on the image slices containing lesion areas to obtain training data.
In one embodiment, during model training, the deep residual network in the multi-attention fusion based medical image segmentation model is a pre-trained deep residual network.
Compared with the prior art, the application has the following beneficial effects:
according to the method, the model is focused on the target region by improving the modules such as the optimized attention, and the segmentation accuracy of the model on the region of interest and the boundary is improved; the method solves the problems that the existing target detection method tends to segment and classify the target area, but tends to not have high edge precision segmentation, and experiments show that the method better maintains the segmentation result of the backbone network and is obviously superior to the existing method in the segmentation result of the focus area.
Drawings
The present application may be better understood by reference to the following description taken in conjunction with the accompanying drawings, which are incorporated in and form a part of this specification, together with the following detailed description. In the drawings:
FIG. 1 shows a schematic diagram of a multi-attention fusion based medical image segmentation model according to an embodiment of the present application;
FIG. 2 illustrates a schematic diagram of a hole convolution spatial attention module according to an embodiment of the present application;
FIG. 3 illustrates a schematic diagram of a pyramid expansion module according to an embodiment of the present application;
FIG. 4 shows a schematic diagram of a dual branch fusion module according to an embodiment of the present application;
FIG. 5 shows a comparison between the image segmentation experimental results of the present application and the prior art.
Detailed Description
Exemplary embodiments of the present application will be described hereinafter with reference to the accompanying drawings. In the interest of clarity and conciseness, not all features of an actual embodiment are described in the specification. It will of course be appreciated that in the development of any such actual embodiment, numerous implementation-specific decisions may be made to achieve the developers' specific goals, and that these decisions may vary from one implementation to another.
It should be noted that, in order to avoid obscuring the present application with unnecessary details, only the device structures closely related to the solution according to the present application are shown in the drawings, and other details not greatly related to the present application are omitted.
It is to be understood that the present application is not limited to the embodiments described below with reference to the drawings. Where possible, embodiments may be combined with each other, features may be replaced or borrowed between different embodiments, and one or more features may be omitted from an embodiment.
The embodiment of the application provides a medical image segmentation method based on multi-attention fusion, which comprises the following steps: and inputting the image to be segmented into a medical image segmentation model based on multi-attention fusion to obtain an image segmentation result.
Turning to the specific structure of the medical image segmentation model based on multi-attention fusion, FIG. 1 shows a schematic diagram of the model according to an embodiment of the present application. The model includes: a deep residual network, a dilated convolution spatial attention module (Dilated Convolutional Spatial Attention, DCSA), a pyramid expansion module, and a dual-branch fusion module (Dual Branch Fusion Module, DBF). The functions of each module are described in detail below.
The deep residual network is used for extracting features from the image to be segmented to obtain a plurality of first feature maps of different scales;
the dilated convolution spatial attention module is used for performing dilated convolution on the first feature maps of different scales to obtain second feature maps of different scales;
the pyramid expansion module is used for performing convolution and fusion operations on the second feature maps of different scales, from large scale to small, to obtain third feature maps of different scales, and then convolving and fusing the third feature maps, from small scale to large, to obtain fourth feature maps of different scales;
the dual-branch fusion module is used for performing category mask prediction and foreground/background mask prediction on the fourth feature maps of different scales, and fusing the two prediction results to obtain the image segmentation result.
In this embodiment, the deep residual network is a 50-layer ResNet. The image to be segmented is input into the network, and feature extraction yields four first feature maps [C2, C3, C4, C5] of different scales (128×128×256, 64×64×256, 32×32×256 and 16×16×256). Experiments show that as the depth of the ResNet increases further, performance tends to saturate, so 50 layers are sufficient.
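To make the feature-extraction step concrete, the following is a minimal sketch (in PyTorch, an assumed framework choice) of pulling four-scale features out of a ResNet-50. The 1×1 lateral convolutions that bring every scale to 256 channels are our assumption, since raw ResNet stages emit 256/512/1024/2048 channels while the text lists 256 channels at every scale:

```python
# Minimal sketch: multi-scale features [C2, C3, C4, C5] from a ResNet-50 backbone.
import torch
import torch.nn as nn
from torchvision.models import resnet50
from torchvision.models._utils import IntermediateLayerGetter

backbone = resnet50(weights="IMAGENET1K_V1")  # pre-trained, as the text suggests
body = IntermediateLayerGetter(
    backbone, return_layers={"layer1": "C2", "layer2": "C3",
                             "layer3": "C4", "layer4": "C5"})
laterals = nn.ModuleDict({
    "C2": nn.Conv2d(256, 256, 1), "C3": nn.Conv2d(512, 256, 1),
    "C4": nn.Conv2d(1024, 256, 1), "C5": nn.Conv2d(2048, 256, 1)})

x = torch.randn(1, 3, 512, 512)  # a 512x512 CT slice (assumed input size)
feats = {k: laterals[k](v) for k, v in body(x).items()}
# -> C2: 128x128x256, C3: 64x64x256, C4: 32x32x256, C5: 16x16x256
```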
In one embodiment, the dilated convolution spatial attention module includes an input layer, a max-pooling and average-pooling layer, and a dilated convolution layer connected in sequence.
In order to make the network automatically attend to pixel-rich regions of the image during learning while adding only a small amount of computation, a simple and effective dilated convolution spatial attention module for feed-forward convolutional neural networks is added to the feature extraction part of the model. FIG. 2 shows a schematic diagram of the dilated convolution spatial attention module according to an embodiment of the present application. DCSA is an improved SAM spatial attention module in which the ordinary convolution of SAM is replaced by dilated convolution.
In this embodiment, dilated convolution brings two benefits. First, it helps enlarge the receptive field while reducing computation: a large receptive field is needed to detect large segmentation targets, and a high resolution is needed to locate them accurately. Second, it can capture multi-scale context information: dilated convolution has a parameter, the dilation rate, and different dilation rates yield different receptive fields, i.e. multi-scale information, which is important in visual tasks. Combining the conventional SAM spatial attention with dilated convolution, the experiments here show that a dilation rate of 2 greatly improves the model's ability to detect and segment brain lesion areas. After the first feature maps pass through the dilated convolution spatial attention module, second feature maps [D2, D3, D4, D5] of different scales are obtained.
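As an illustration of the DCSA idea, here is a minimal sketch assuming the usual SAM/CBAM spatial-attention layout (channel-wise max and average pooling, a convolution over their concatenation, and a sigmoid gate). The 7×7 kernel and the sigmoid are assumptions; the dilation rate of 2 is the value the text reports works best:

```python
import torch
import torch.nn as nn

class DCSA(nn.Module):
    def __init__(self, kernel_size: int = 7, dilation: int = 2):
        super().__init__()
        pad = dilation * (kernel_size - 1) // 2  # keep spatial size unchanged
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=pad, dilation=dilation)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        max_map, _ = x.max(dim=1, keepdim=True)  # channel-wise max pooling
        avg_map = x.mean(dim=1, keepdim=True)    # channel-wise average pooling
        attn = torch.sigmoid(self.conv(torch.cat([max_map, avg_map], dim=1)))
        return x * attn                          # re-weight the input feature map

# D_i = DCSA()(C_i) for each scale i in {2, 3, 4, 5}
```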
Specifically, FIG. 3 shows a schematic diagram of the pyramid expansion module according to an embodiment of the present application. The module performs convolution and fusion operations on the second feature maps [D2, D3, D4, D5], from large scale to small, to obtain third feature maps [P2, P3, P4, P5]. That is, a 1×1 convolution is first applied to D5 to obtain P5; then a 1×1 convolution is applied to D4 and the result is fused with P5 to obtain P4; then a 1×1 convolution is applied to D3 and the result is fused with P4 to obtain P3; finally a 1×1 convolution is applied to D2 and the result is fused with P3 to obtain P2.
The third feature maps [P2, P3, P4, P5] are then convolved and fused from small scale to large to obtain fourth feature maps [N2, N3, N4, N5]. That is, a 3×3 convolution is first applied to P2 to obtain N2; then a 3×3 convolution is applied to P3 and the result is fused with N2 to obtain N3; then a 3×3 convolution is applied to P4 and the result is fused with N3 to obtain N4; finally a 3×3 convolution is applied to P5 and the result is fused with N4 to obtain N5.
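A minimal sketch of the pyramid expansion module as described, under stated assumptions: the resizing needed to fuse maps of different scales (nearest-neighbour interpolation below) and the additive form of the fusion are not spelled out in the text and are assumed:

```python
import torch.nn as nn
import torch.nn.functional as F

class PyramidExpansion(nn.Module):
    def __init__(self, channels: int = 256):
        super().__init__()
        self.lat = nn.ModuleList([nn.Conv2d(channels, channels, 1) for _ in range(4)])
        self.out = nn.ModuleList([nn.Conv2d(channels, channels, 3, padding=1) for _ in range(4)])

    def forward(self, d2, d3, d4, d5):
        # top-down pass with 1x1 convolutions -> [P2..P5]
        p5 = self.lat[3](d5)
        p4 = self.lat[2](d4) + F.interpolate(p5, size=d4.shape[-2:], mode="nearest")
        p3 = self.lat[1](d3) + F.interpolate(p4, size=d3.shape[-2:], mode="nearest")
        p2 = self.lat[0](d2) + F.interpolate(p3, size=d2.shape[-2:], mode="nearest")
        # bottom-up pass with 3x3 convolutions -> [N2..N5]
        n2 = self.out[0](p2)
        n3 = self.out[1](p3) + F.interpolate(n2, size=p3.shape[-2:], mode="nearest")
        n4 = self.out[2](p4) + F.interpolate(n3, size=p4.shape[-2:], mode="nearest")
        n5 = self.out[3](p5) + F.interpolate(n4, size=p5.shape[-2:], mode="nearest")
        return n2, n3, n4, n5
```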
In one embodiment, FIG. 4 shows a schematic diagram of the dual-branch fusion module according to an embodiment of the present application, which includes a category mask prediction branch and a foreground/background mask prediction branch. The category mask prediction branch comprises 4 convolution layers, a deconvolution layer and a convolution layer connected in sequence; the foreground/background mask prediction branch comprises 2 convolution layers and a fully connected layer connected in sequence. The last convolution layer of the category mask prediction branch outputs the category mask prediction result, and the fully connected layer of the foreground/background mask prediction branch outputs the foreground/background mask prediction result.
In this embodiment, the category mask prediction branch is the main path: a small FCN composed of 4 consecutive convolution layers and 1 deconvolution layer. It is an end-to-end network whose main building blocks are convolution and deconvolution. The image features are first convolved several times to extract deep information; a deconvolution (i.e. interpolation) operation then progressively enlarges the feature map, and finally each pixel value is classified, achieving accurate segmentation of the input image. Each convolution layer of the main path uses a 3×3 kernel, the deconvolution layer upsamples by a factor of 2, and a binary pixel mask is predicted independently for each class to obtain the category mask prediction result, realizing segmentation and classification.
A short path, i.e. the foreground/background mask prediction branch, is added after the 3rd convolution layer (conv3) of the main path. It comprises 2 3×3 convolution layers and 1 fully connected layer and predicts a category-independent foreground/background mask. It is not only efficient, but also lets the parameters of the fully connected layer be trained with more samples, yielding better generality. (After conv3 the main path additionally applies a deconvolution and a convolution to adjust the feature-map dimensions.) To obtain the final mask prediction, the per-class features of the main path and the foreground/background prediction from the short path's fully connected layer are fused. Only one fully connected layer is used in the short path instead of several, which avoids collapsing the hidden spatial feature map into a short feature vector and thereby losing spatial information. Ablation experiments show that branching the short path from conv3 and fusing at the end gives the best results.
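A minimal sketch of the dual-branch fusion head under stated assumptions: the 256-channel width, the 14×14 RoI size, the number of classes, and the element-wise additive fusion are illustrative choices, not taken from the patent:

```python
import torch
import torch.nn as nn

class DualBranchFusion(nn.Module):
    def __init__(self, in_ch: int = 256, num_classes: int = 2, roi: int = 14):
        super().__init__()
        self.conv1_4 = nn.ModuleList([                      # main path: 4 convs
            nn.Sequential(nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.ReLU())
            for _ in range(4)])
        self.deconv = nn.ConvTranspose2d(in_ch, in_ch, 2, 2)  # x2 upsampling
        self.mask = nn.Conv2d(in_ch, num_classes, 1)          # per-class masks
        # short path from conv3: 2 convs + 1 fully connected layer
        self.short = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(in_ch, in_ch // 2, 3, padding=1), nn.ReLU(), nn.Flatten(),
            nn.Linear(in_ch // 2 * roi * roi, (2 * roi) * (2 * roi)))

    def forward(self, x):
        for i, conv in enumerate(self.conv1_4):
            x = conv(x)
            if i == 2:                   # tap the features right after conv3
                fg_bg = self.short(x)    # class-agnostic fg/bg mask (flattened)
        cls_masks = self.mask(self.deconv(x))
        side = cls_masks.shape[-1]
        return cls_masks + fg_bg.view(-1, 1, side, side)  # fuse, broadcast over classes
```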
Further, the medical image segmentation method based on multi-attention fusion according to the embodiment of the present application also includes: training the medical image segmentation model based on multi-attention fusion to obtain the trained model.
Specifically, training a medical image segmentation model based on multi-attention fusion includes:
determining at least one target area for each fourth feature map obtained by the pyramid expansion module, and inputting the at least one target area into the dual-branch fusion module.
In this embodiment, candidate regions (regions of interest, RoIs) with low classification scores can be filtered out when identifying the target regions, alleviating the class-imbalance problem while reducing subsequent computation on unnecessary information.
Specifically, determining at least one target region for each fourth feature map includes:
determining a plurality of predicted target region boxes in each fourth feature map; here, the predicted target region boxes may be determined using a Pyramid RoI Align module from the related art.
Then, the corresponding intersection-over-union of each predicted target region box is calculated from the real target region box and the plurality of predicted target region boxes; specifically, the following formula may be used:
[Formula image in the original: IoU_new expressed in terms of the predicted box S1, the real box S2, and a penalty factor λ; the exact expression is not recoverable from the text.]
where S1 is the predicted target region box, S2 is the real target region box, and λ is a penalty factor.
Finally, the IoU values of all predicted target region boxes are sorted from large to small, and at least one top-ranked predicted target region box is selected as the at least one target area.
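A minimal sketch of this RoI filtering step follows. Because the exact formula for IoU_new appears only as an image in the original, the penalty term below (λ times the non-overlapping fraction of the predicted box) is purely an assumption used to illustrate how a penalized IoU could rank candidate boxes:

```python
import numpy as np

def iou_new(pred, gt, lam=0.1):
    """pred, gt: boxes as (x1, y1, x2, y2); lam: penalty factor (assumed role)."""
    ix1, iy1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    ix2, iy2 = min(pred[2], gt[2]), min(pred[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    union = area_p + area_g - inter
    penalty = lam * (area_p - inter) / max(area_p, 1e-9)  # assumed penalty term
    return inter / max(union, 1e-9) - penalty

def select_targets(pred_boxes, gt_box, k=1):
    scores = [iou_new(np.asarray(b, float), np.asarray(gt_box, float))
              for b in pred_boxes]
    order = np.argsort(scores)[::-1]           # sort IoU_new from large to small
    return [pred_boxes[i] for i in order[:k]]  # keep the top-k boxes as targets
```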
Further, training the medical image segmentation model based on multi-attention fusion includes performing HU-value truncation and data augmentation on the image slices containing lesion areas to obtain training data. Here, data augmentation can use cropping/padding, horizontal flipping, vertical flipping, affine transformation and similar methods that do not change the original pixel values of the CT image, preserving the characteristics of the original images to the greatest extent. Medical image data are scarce, and label data manually annotated by doctors are scarcer still, so data augmentation increases the number and diversity of training samples and extracts as much value as possible from limited data. HU-value truncation is a routine operation in medical image processing; it helps visualize post-infarction hemorrhagic-transformation data and effectively improves the lesion detection rate.
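A minimal sketch of this preprocessing, assuming NumPy and a 0-100 HU brain window (the window bounds are illustrative and not given in the patent):

```python
import numpy as np

def hu_truncate(slice_hu: np.ndarray, lo: float = 0.0, hi: float = 100.0):
    clipped = np.clip(slice_hu, lo, hi)  # HU-value truncation to the window
    return (clipped - lo) / (hi - lo)    # normalize to [0, 1]

def augment(img: np.ndarray, rng: np.random.Generator):
    if rng.random() < 0.5:
        img = img[:, ::-1]               # horizontal flip (pixel values unchanged)
    if rng.random() < 0.5:
        img = img[::-1, :]               # vertical flip
    return np.ascontiguousarray(img)

rng = np.random.default_rng(0)
ct = rng.normal(40, 20, size=(512, 512))  # stand-in for a CT slice in HU
sample = augment(hu_truncate(ct), rng)
```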
Further, during model training, the deep residual network in the medical image segmentation model based on multi-attention fusion is a pre-trained deep residual network. Using a pre-trained network accelerates model convergence and quickly yields a better-performing model. Experiments prove that this effectively improves the lesion segmentation level.
The present application innovatively introduces a dilated convolution spatial attention module and a dual-branch fusion module on top of a backbone network. The evaluation index used is the AP (Average Precision) value, a common metric for target detection. For a given class, a fixed confidence threshold is set (0.7 in the experiments), and TP (true positives, positive samples detected as positive), FP (false positives, negative samples detected as positive) and FN (false negatives, positive samples detected as negative) are counted; precision = TP/(TP+FP) and recall = TP/(TP+FN) are then computed at each confidence threshold. From the resulting series of precision-recall points a PR curve is drawn, and the AP is computed from it; a higher AP value indicates a better detection result (a minimal sketch of this computation follows Table 1). The present application makes comparative experiments against several recent models, with the following results:
table 1 comparative results
Figure BDA0004073673560000121
In Table 1, AP50 represents the AP measurement at a IoU (cross-over) threshold of 0.5, AP75 represents the AP measurement at a IoU threshold of 0.75, and AP S Representative pixel area is less than 32 2 AP measurement at target frame time, AP M Representative pixel area is at 32 2 -96 2 Measurement of target frame in between, AP L Representative pixel area is greater than 96 2 AP measurements for the target frame of (a).
In Table 1, the first row (Mask R-CNN) is the segmentation result of the backbone network model; the second, third and fourth rows are the segmentation results of the existing methods Mask Transfiner, DCT-Mask and RefineMask, respectively; and the fifth row is the segmentation result of the method of the present application. As can be seen from Table 1, the present application achieves a good image segmentation effect.
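As noted above, here is a minimal sketch of the AP computation: sweep detections by confidence, accumulate TP/FP at a fixed IoU threshold, trace the precision-recall curve, and integrate it. The matching of detections to ground truth is simplified to a precomputed flag:

```python
import numpy as np

def average_precision(scores, is_tp, num_gt):
    """scores: confidence per prediction; is_tp: whether it matched a GT box
    at the chosen IoU threshold; num_gt: number of ground-truth boxes."""
    order = np.argsort(scores)[::-1]
    tp = np.cumsum(np.asarray(is_tp, float)[order])
    fp = np.cumsum(1.0 - np.asarray(is_tp, float)[order])
    precision = tp / np.maximum(tp + fp, 1e-9)
    recall = tp / max(num_gt, 1)
    return float(np.trapz(precision, recall))  # area under the PR curve

# e.g. three detections, two of which matched a lesion box at IoU >= 0.5
print(average_precision([0.9, 0.8, 0.3], [1, 0, 1], num_gt=2))
```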
FIG. 5 shows a comparison between the image segmentation experimental results of the present application and the prior art: the first column is the original CT slice, the second column the label data manually annotated by doctors, the third column the segmentation result of the prior-art backbone network, and the fourth column the segmentation result of the present application. As FIG. 5 shows, compared with the backbone network, the present application enhances the propagation of semantic information through several simple and effective components, achieves better segmentation and detection precision, and produces results closer to the ground-truth labels manually annotated by doctors, demonstrating the feasibility of the method.
The foregoing is merely various embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes and substitutions are intended to be covered in the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A medical image segmentation model based on multi-attention fusion, comprising: a deep residual network, a dilated convolution spatial attention module, a pyramid expansion module and a dual-branch fusion module;
the deep residual network is used for extracting features from the image to be segmented to obtain a plurality of first feature maps of different scales;
the dilated convolution spatial attention module is used for performing dilated convolution on the first feature maps of different scales to obtain second feature maps of different scales;
the pyramid expansion module is used for performing convolution and fusion operations on the second feature maps of different scales, from large scale to small, to obtain third feature maps of different scales, and convolving and fusing the third feature maps of different scales, from small scale to large, to obtain fourth feature maps of different scales;
the dual-branch fusion module is used for performing category mask prediction and foreground/background mask prediction on the fourth feature maps of different scales, and fusing the category mask prediction result and the foreground/background mask prediction result to obtain an image segmentation result.
2. The model of claim 1, wherein the dilated convolution spatial attention module comprises an input layer, a max-pooling and average-pooling layer, and a dilated convolution layer connected in sequence.
3. The model of claim 1, wherein the dual-branch fusion module includes a category mask prediction branch and a foreground/background mask prediction branch;
the category mask prediction branch comprises 4 convolution layers, a deconvolution layer and a convolution layer connected in sequence;
the foreground/background mask prediction branch comprises 2 convolution layers and a fully connected layer connected in sequence;
the last convolution layer of the category mask prediction branch outputs the category mask prediction result, and the fully connected layer of the foreground/background mask prediction branch outputs the foreground/background mask prediction result.
4. A medical image segmentation method based on multi-attention fusion, comprising:
inputting an image to be segmented into a medical image segmentation model based on multi-attention fusion to obtain an image segmentation result;
the multi-attention fusion based medical image segmentation model is the multi-attention fusion based medical image segmentation model according to any one of claims 1-3.
5. The method of claim 4, further comprising training the multi-attention fusion-based medical image segmentation model to obtain a trained multi-attention fusion-based medical image segmentation model.
6. The method of claim 5, wherein the training the multi-attention fusion-based medical image segmentation model comprises:
determining at least one target area for each fourth feature map obtained by the pyramid expansion module, and inputting the at least one target area into the dual-branch fusion module.
7. The method of claim 6, wherein said determining at least one target region for each of said fourth feature maps comprises:
determining a plurality of predicted target region boxes in each fourth feature map;
calculating the corresponding intersection-over-union of each predicted target region box from the real target region box and the plurality of predicted target region boxes;
and sorting the IoU values of all predicted target region boxes from large to small, and selecting at least one top-ranked predicted target region box as the at least one target area.
8. The method of claim 7, wherein the corresponding intersection-over-union IoU_new of each predicted target region box is calculated from the real target region box and the plurality of predicted target region boxes using the following formula:
[Formula image in the original: IoU_new expressed in terms of the predicted box S1, the real box S2, and a penalty factor λ; the exact expression is not recoverable from the text.]
where S1 is the predicted target region box, S2 is the real target region box, and λ is a penalty factor.
9. The method of claim 5, wherein the training the multi-attention fusion-based medical image segmentation model comprises:
performing HU-value truncation and data augmentation on the image slices containing lesion areas to obtain training data.
10. The method of claim 5, wherein, during model training, the deep residual network in the multi-attention fusion based medical image segmentation model is a pre-trained deep residual network.
CN202310064474.1A 2023-01-12 2023-01-12 Medical image segmentation method based on multi-attention fusion Pending CN116152492A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310064474.1A CN116152492A (en) 2023-01-12 2023-01-12 Medical image segmentation method based on multi-attention fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310064474.1A CN116152492A (en) 2023-01-12 2023-01-12 Medical image segmentation method based on multi-attention fusion

Publications (1)

Publication Number Publication Date
CN116152492A (publication date 2023-05-23)

Family

ID=86353839

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310064474.1A Pending CN116152492A (en) 2023-01-12 2023-01-12 Medical image segmentation method based on multi-attention fusion

Country Status (1)

Country Link
CN (1) CN116152492A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117809122A (en) * 2024-02-29 2024-04-02 北京航空航天大学 Processing method, system, electronic equipment and medium for intracranial large blood vessel image

Similar Documents

Publication Publication Date Title
CN106056595B (en) Based on the pernicious assistant diagnosis system of depth convolutional neural networks automatic identification Benign Thyroid Nodules
CN108464840B (en) Automatic detection method and system for breast lumps
Tian et al. Multi-path convolutional neural network in fundus segmentation of blood vessels
CN108133476B (en) Method and system for automatically detecting pulmonary nodules
CN110197493A (en) Eye fundus image blood vessel segmentation method
CN111127466A (en) Medical image detection method, device, equipment and storage medium
CN108257135A (en) The assistant diagnosis system of medical image features is understood based on deep learning method
Liu et al. A framework of wound segmentation based on deep convolutional networks
CN111640120B (en) Pancreas CT automatic segmentation method based on significance dense connection expansion convolution network
CN113781439B (en) Ultrasonic video focus segmentation method and device
CN107169998A (en) A kind of real-time tracking and quantitative analysis method based on hepatic ultrasound contrast enhancement image
CN111598853B (en) CT image scoring method, device and equipment for pneumonia
CN110705403A (en) Cell sorting method, cell sorting device, cell sorting medium, and electronic apparatus
CN110751636A (en) Fundus image retinal arteriosclerosis detection method based on improved coding and decoding network
Jo et al. Segmentation of the main vessel of the left anterior descending artery using selective feature mapping in coronary angiography
CN112215217B (en) Digital image recognition method and device for simulating doctor to read film
CN104545792A (en) Arteriovenous retinal vessel optic disk positioning method of eye fundus image
CN115546605A (en) Training method and device based on image labeling and segmentation model
CN116152492A (en) Medical image segmentation method based on multi-attention fusion
CN112489088A (en) Twin network visual tracking method based on memory unit
CN111429457B (en) Intelligent evaluation method, device, equipment and medium for brightness of local area of image
CN112634291A (en) Automatic burn wound area segmentation method based on neural network
CN116883341A (en) Liver tumor CT image automatic segmentation method based on deep learning
CN111062909A (en) Method and equipment for judging benign and malignant breast tumor
CN114862885A (en) Liver MRI image domain-shrinkage segmentation and three-dimensional focus reconstruction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination