CN117078941B - Cardiac MRI segmentation method based on context cascade attention - Google Patents
Cardiac MRI segmentation method based on context cascade attention
- Publication number
- CN117078941B (application CN202311231297.8A / CN202311231297A)
- Authority
- CN
- China
- Prior art keywords
- feature map
- layer
- convolution
- convolution layer
- branch
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/768—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using context analysis, e.g. recognition aided by known co-occurring patterns
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/03—Recognition of patterns in medical or anatomical images
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Biodiversity & Conservation Biology (AREA)
- Image Processing (AREA)
Abstract
A cardiac MRI segmentation method based on context cascade attention relates to the technical field of medical image segmentation. A residual initial module is introduced in the encoder to better learn effective feature representations and to improve the performance and generalization ability of the model. In the decoder, cascade operations exploit the complementary information of different layers to refine the features of each layer, while a contextual attention module explores the contextual information of each layer by retaining local information and compressing global information, and suppresses unimportant regions to highlight salient features, thereby improving segmentation accuracy.
Description
Technical Field
The invention relates to the technical field of medical image segmentation, and in particular to a cardiac MRI segmentation method based on context cascade attention.
Background Art
Accurate segmentation of cardiac MRI is of great significance in the field of medical image processing. It aims to accurately extract cardiac structures such as the ventricles, atria and myocardium from cardiac MR images, so as to assist doctors in the accurate diagnosis and treatment of heart disease. Current research on cardiac MR image segmentation is mainly divided into traditional methods and deep-learning-based methods. Traditional methods struggle with the complex cardiac structures encountered in cardiac MRI segmentation, leading to poor segmentation results, whereas deep learning methods are better able to capture the complex features in cardiac MR images. As a representative deep learning architecture, U-Net adopts an encoder-decoder structure and introduces skip connections, which alleviates the information loss of traditional convolutional neural networks in segmentation tasks and brought an important breakthrough to the research and application of cardiac MRI segmentation. However, insufficient feature abstraction capability and the lack of learned contextual relationships between pixels still complicate segmentation.
Summary of the Invention
In order to overcome the shortcomings of the above technologies, the present invention provides a cardiac MRI segmentation method based on context cascade attention that better captures and represents complex data features.
The technical solution adopted by the present invention to overcome its technical problems is:
A cardiac MRI segmentation method based on context cascade attention, comprising the following steps:
a) Collect cardiac MRI data of n subjects to obtain an MRI data set s, s = {s_1, s_2, ..., s_i, ..., s_n}, where s_i is the cardiac MRI data of the i-th subject, i ∈ {1, 2, ..., n};
b) Perform preprocessing operations on the MRI data set s to obtain a preprocessed data set F, F = {F_1, F_2, ..., F_i, ..., F_n}, where F_i is the i-th preprocessed two-dimensional image;
c) Divide the preprocessed data set F into a training set, a test set and a validation set;
d) Establish a segmentation network model consisting of an encoder and a decoder;
e) Input the preprocessed two-dimensional image data F_i into the encoder of the segmentation network model and output four feature maps, one from each encoder stage;
f) Input the four encoder feature maps into the decoder of the segmentation network model and output the predicted segmentation image;
g) Train the segmentation network model to obtain the optimized segmentation network model.
Further, in step a), cardiac MRI data of n subjects containing the three structures LV, RV and MYO are collected from the public ACDC 2017 data set to obtain the MRI data set s.
Preferably, n = 100 in step a).
Further, step b) includes the following steps:
b-1) Resample the cardiac MRI data s_i of the i-th subject slice by slice along the z-axis so that the pixel spacing is 1.5 in the x-axis direction and 1.5 in the y-axis direction;
b-2) Perform a 2D center-cropping operation with a crop size of 224×224 on the resampled cardiac MRI data s_i to obtain cropped data F_i′, and normalize the cropped data F_i′ to obtain the preprocessed two-dimensional image data F_i.
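For illustration, a minimal preprocessing sketch in Python (NumPy/SciPy), assuming each slice is available as a 2D array with known in-plane spacing and that "normalize" means z-score normalization; the function name `preprocess_slice` and these choices are illustrative rather than taken from the patent:

```python
import numpy as np
from scipy.ndimage import zoom

def preprocess_slice(slice_2d, spacing_xy, target_spacing=1.5, crop=224):
    """Resample one short-axis slice to ~1.5 pixel spacing, center-crop to 224x224, z-score normalize."""
    # Resample so the in-plane pixel spacing becomes target_spacing (bilinear interpolation assumed).
    zoom_factors = (spacing_xy[0] / target_spacing, spacing_xy[1] / target_spacing)
    resampled = zoom(slice_2d.astype(np.float32), zoom_factors, order=1)

    # Pad (if the resampled slice is smaller than the crop) and take a 2D center crop.
    pad_y = max(crop - resampled.shape[0], 0)
    pad_x = max(crop - resampled.shape[1], 0)
    padded = np.pad(resampled, ((pad_y // 2, pad_y - pad_y // 2),
                                (pad_x // 2, pad_x - pad_x // 2)), mode="constant")
    y0 = (padded.shape[0] - crop) // 2
    x0 = (padded.shape[1] - crop) // 2
    cropped = padded[y0:y0 + crop, x0:x0 + crop]

    # Normalize to zero mean and unit variance (the patent only says "normalize"; z-score is one common choice).
    return (cropped - cropped.mean()) / (cropped.std() + 1e-8)
```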
Preferably, in step c), the preprocessed data set F is divided into a training set, a test set and a validation set in a ratio of 7:2:1.
Further, step e) includes the following steps:
e-1) The encoder of the segmentation network model consists of a first residual initial module, a first max-pooling layer, a second residual initial module, a second max-pooling layer, a third residual initial module, a third max-pooling layer and a fourth residual initial module;
e-2) The first residual initial module consists of a first branch, a second branch, a third branch, a fourth branch and a fifth branch. The first branch consists of a convolution layer; the second branch consists of an average-pooling layer followed by a convolution layer; the third branch consists of a convolution layer followed by a BN layer; the fourth branch consists of a first convolution layer, a second convolution layer and a BN layer in sequence; the fifth branch consists of a first convolution layer, a second convolution layer, a first BN layer, a third convolution layer and a second BN layer in sequence. The preprocessed two-dimensional image data F_i is input into each of the five branches, and each branch outputs a feature map. Four of the branch feature maps are concatenated, and the concatenated feature map is added element-wise to the remaining branch feature map to obtain the output feature map of the first residual initial module.
e-3) The output feature map of the first residual initial module is input into the first max-pooling layer to obtain a down-sampled feature map.
e-4) The second residual initial module has the same five-branch structure as the first residual initial module; it processes the down-sampled feature map in the same way (five parallel branches, concatenation of four branch outputs, element-wise addition with the remaining branch output) to obtain the output feature map of the second residual initial module.
e-5) The output feature map of the second residual initial module is input into the second max-pooling layer to obtain a down-sampled feature map.
e-6) The third residual initial module has the same five-branch structure and processes the down-sampled feature map in the same way to obtain the output feature map of the third residual initial module.
e-7) The output feature map of the third residual initial module is input into the third max-pooling layer to obtain a down-sampled feature map.
e-8) The fourth residual initial module has the same five-branch structure and processes the down-sampled feature map in the same way to obtain the output feature map of the fourth residual initial module.
Preferably, in each of steps e-2), e-4), e-6) and e-8), the convolution kernel size of the convolution layer of the first branch is 1×1, that of the convolution layer of the second branch is 1×1, and that of the convolution layer of the third branch is 1×1; in the fourth branch, the kernel size of the first convolution layer is 1×1 and that of the second convolution layer is 5×5 with padding 2; in the fifth branch, the kernel size of the first convolution layer is 1×1 and those of the second and third convolution layers are 3×3. The kernel sizes of the first max-pooling layer in step e-3), the second max-pooling layer in step e-5) and the third max-pooling layer in step e-7) are all 2×2.
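A minimal PyTorch sketch of one residual initial module as described above. The kernel sizes follow the preferred values; the class name, the branch channel widths, the padding of the 3×3 convolutions, the average-pooling kernel, the final ReLU, and the choice of the first (1×1) branch as the residual path added to the concatenation of the other four branches are assumptions, since the patent does not fix them:

```python
import torch
import torch.nn as nn

class ResidualInceptionBlock(nn.Module):
    """Sketch of one 'residual initial' (residual Inception-style) encoder module with five parallel branches."""
    def __init__(self, in_ch, branch_ch):
        super().__init__()
        self.branch1 = nn.Conv2d(in_ch, 4 * branch_ch, kernel_size=1)   # assumed residual/shortcut path
        self.branch2 = nn.Sequential(nn.AvgPool2d(3, stride=1, padding=1),
                                     nn.Conv2d(in_ch, branch_ch, kernel_size=1))
        self.branch3 = nn.Sequential(nn.Conv2d(in_ch, branch_ch, kernel_size=1),
                                     nn.BatchNorm2d(branch_ch))
        self.branch4 = nn.Sequential(nn.Conv2d(in_ch, branch_ch, kernel_size=1),
                                     nn.Conv2d(branch_ch, branch_ch, kernel_size=5, padding=2),
                                     nn.BatchNorm2d(branch_ch))
        self.branch5 = nn.Sequential(nn.Conv2d(in_ch, branch_ch, kernel_size=1),
                                     nn.Conv2d(branch_ch, branch_ch, kernel_size=3, padding=1),
                                     nn.BatchNorm2d(branch_ch),
                                     nn.Conv2d(branch_ch, branch_ch, kernel_size=3, padding=1),
                                     nn.BatchNorm2d(branch_ch))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        # Concatenate four branch outputs along the channel axis, then add the remaining branch as a residual.
        concat = torch.cat([self.branch2(x), self.branch3(x), self.branch4(x), self.branch5(x)], dim=1)
        return self.act(concat + self.branch1(x))
```

Here the shortcut branch is widened to 4 × branch_ch channels so that the element-wise addition with the four concatenated branches is shape-compatible; this is one possible channel arrangement, not the patent's stated one.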
Further, step f) includes the following steps:
f-1) The decoder of the segmentation network model consists of Conv-block1, a first attention gating module, a first upsampling layer, a first contextual attention module, a first convolution layer, Conv-block2, a second attention gating module, a second upsampling layer, a second contextual attention module, a second convolution layer, Conv-block3, a third attention gating module, a third upsampling layer, a third contextual attention module, a third convolution layer, Conv-block4, a fourth attention gating module, a fourth contextual attention module and a fourth convolution layer;
f-2) Conv-block4 consists of a convolution layer, a BatchNorm layer and a ReLU activation function in sequence; the corresponding encoder feature map is input into Conv-block4 to obtain a feature map. The fourth attention gating module consists of a first convolution layer, a first BN layer, a second convolution layer, a second BN layer, a ReLU activation function, a third convolution layer, a third BN layer and a sigmoid function. Its two input feature maps are passed through the first convolution layer plus first BN layer and the second convolution layer plus second BN layer respectively, the two projected feature maps are added, and the sum is passed through the ReLU activation function, the third convolution layer, the third BN layer and the sigmoid function to obtain attention coefficients; multiplying the attention coefficients with the input feature map yields the gated feature map. The fourth contextual attention module consists of a first convolution layer, a second convolution layer, a global max-pooling layer, a third convolution layer, a fourth convolution layer and a sigmoid function. The gated feature map is passed through the first and second convolution layers to retain local information, and in parallel through the global max-pooling layer followed by the third and fourth convolution layers to compress global information; the local and global feature maps are multiplied, the product is added back to the local feature map, and the sum is passed through the sigmoid function to suppress unimportant regions. The result is input into the fourth convolution layer of the decoder to obtain the level-4 decoder feature map.
f-3) Conv-block3 consists of a convolution layer, a BatchNorm layer and a ReLU activation function in sequence; the corresponding encoder feature map is input into Conv-block3 to obtain a feature map, and the feature map from the deeper decoder level is input into the third upsampling layer to obtain an upsampled feature map. The third attention gating module has the same structure as the fourth attention gating module (first convolution layer, first BN layer, second convolution layer, second BN layer, ReLU activation function, third convolution layer, third BN layer, sigmoid function) and gates its input feature maps in the same way. The gated feature map is concatenated (cascaded) with the upsampled feature map to obtain a cascaded feature map. The third contextual attention module has the same structure as the fourth contextual attention module (first convolution layer, second convolution layer, global max-pooling layer, third convolution layer, fourth convolution layer, sigmoid function) and processes the cascaded feature map in the same way; its output is input into the third convolution layer of the decoder to obtain the level-3 decoder feature map.
f-4) Conv-block2 consists of a convolution layer, a BatchNorm layer and a ReLU activation function in sequence. Level 2 is processed in the same way as level 3: Conv-block2, the second upsampling layer, the second attention gating module, concatenation (cascading) with the upsampled feature map, the second contextual attention module and the second convolution layer of the decoder are applied in turn to obtain the level-2 decoder feature map.
f-5) Conv-block1 consists of a convolution layer, a BatchNorm layer and a ReLU activation function in sequence. Level 1 is processed in the same way: Conv-block1, the first upsampling layer, the first attention gating module, concatenation (cascading) with the upsampled feature map, the first contextual attention module and the first convolution layer of the decoder are applied in turn to obtain the level-1 decoder feature map.
f-6) The four decoder feature maps are combined by residual addition to obtain the output feature map, i.e. the predicted segmentation image.
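A minimal PyTorch sketch of the two decoder building blocks described in steps f-2) to f-5). The layer order follows the text; the class names, the single-channel gate output, which of the two gate inputs the coefficients are multiplied with, the exact wiring of the contextual-attention addition, and whether the sigmoid output is reused as a mask are assumptions where the patent leaves them open:

```python
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    """Sketch of one attention gating module: two projected inputs are summed and turned into a gate."""
    def __init__(self, in_ch_x, in_ch_g, mid_ch):
        super().__init__()
        self.proj_x = nn.Sequential(nn.Conv2d(in_ch_x, mid_ch, 1), nn.BatchNorm2d(mid_ch))
        self.proj_g = nn.Sequential(nn.Conv2d(in_ch_g, mid_ch, 1), nn.BatchNorm2d(mid_ch))
        self.gate = nn.Sequential(nn.ReLU(inplace=True),
                                  nn.Conv2d(mid_ch, 1, 1), nn.BatchNorm2d(1), nn.Sigmoid())

    def forward(self, x, g):
        # x and g are assumed to share the same spatial size; the gate is applied to x.
        return x * self.gate(self.proj_x(x) + self.proj_g(g))


class ContextAttention(nn.Module):
    """Sketch of one contextual attention module: a local convolutional path plus a globally pooled path."""
    def __init__(self, ch):
        super().__init__()
        self.local = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.Conv2d(ch, ch, 3, padding=1))
        self.pool = nn.AdaptiveMaxPool2d(1)                      # global max pooling
        self.glob = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.Conv2d(ch, ch, 1))

    def forward(self, x):
        local = self.local(x)                                    # retain local information
        glob = self.glob(self.pool(x))                           # compress global information (1x1 spatial map)
        # Multiply local and global paths, add back the local path, squash with sigmoid;
        # the result is then fed to the decoder's 1x1 convolution in the patent's flow.
        return torch.sigmoid(local * glob + local)
```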
Preferably, in step f-2), the convolution kernel size of the convolution layer of Conv-block4 is 3×3 with padding 1; the kernel sizes of the first, second and third convolution layers of the fourth attention gating module are all 1×1; the kernel sizes of the first, second and third convolution layers of the fourth contextual attention module are all 3×3 with padding 1, and that of its fourth convolution layer is 1×1; the kernel size of the fourth convolution layer of the decoder is 1×1. In step f-3), the convolution kernel size of the convolution layer of Conv-block3 is 3×3 with padding 1; the kernel sizes of the first, second and third convolution layers of the third attention gating module are all 1×1; the kernel size of the third upsampling layer is 2×2; the kernel sizes of the first, second and third convolution layers of the third contextual attention module are all 3×3 with padding 1, and that of its fourth convolution layer is 1×1; the kernel size of the third convolution layer of the decoder is 1×1. In step f-4), the convolution kernel size of the convolution layer of Conv-block2 is 3×3 with padding 1; the kernel sizes of the first, second and third convolution layers of the second attention gating module are all 1×1; the kernel size of the second upsampling layer is 2×2; the kernel sizes of the first, second and third convolution layers of the second contextual attention module are all 3×3 with padding 1, and that of its fourth convolution layer is 1×1; the kernel size of the second convolution layer of the decoder is 1×1. In step f-5), the convolution kernel size of the convolution layer of Conv-block1 is 3×3 with padding 1; the kernel sizes of the first, second and third convolution layers of the first attention gating module are all 1×1; the kernel size of the first upsampling layer is 2×2; the kernel sizes of the first, second and third convolution layers of the first contextual attention module are all 3×3 with padding 1, and that of its fourth convolution layer is 1×1; the kernel size of the first convolution layer of the decoder is 1×1.
Further, in step g), the Dice loss and the cross-entropy loss are summed to obtain the total loss, and the Adam optimizer is used to train the segmentation network model with the total loss to obtain the optimized segmentation network model; during training the batch size is set to 32, the number of epochs to 200 and the learning rate to 0.001.
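A sketch of the step g) training objective and optimizer settings, assuming a PyTorch multi-class setup; `model` and `train_loader` are placeholders, and the soft-Dice formulation shown is one common choice rather than the patent's exact definition:

```python
import torch
import torch.nn as nn

def dice_loss(logits, target_onehot, eps=1e-6):
    """Soft Dice loss averaged over classes; logits: (B, C, H, W), target_onehot: same shape."""
    probs = torch.softmax(logits, dim=1)
    dims = (0, 2, 3)
    intersection = (probs * target_onehot).sum(dims)
    union = probs.sum(dims) + target_onehot.sum(dims)
    return 1.0 - ((2.0 * intersection + eps) / (union + eps)).mean()

# Total loss = Dice loss + cross-entropy loss, optimized with Adam
# (batch size 32, 200 epochs, learning rate 0.001, per step g).
ce_loss = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(200):
    for images, labels in train_loader:            # labels: (B, H, W) integer (long) class maps
        logits = model(images)
        onehot = nn.functional.one_hot(labels, num_classes=logits.shape[1]).permute(0, 3, 1, 2).float()
        loss = ce_loss(logits, labels) + dice_loss(logits, onehot)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```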
The beneficial effects of the present invention are as follows: a residual initial module is introduced in the encoder to better learn effective feature representations and to improve the performance and generalization ability of the model. In the decoder, cascade operations exploit the complementary information of different layers to refine the features of each layer, while a contextual attention module explores the contextual information of each layer by retaining local information and compressing global information, and suppresses unimportant regions to highlight salient features, thereby improving segmentation accuracy.
Brief Description of the Drawings
Figure 1 is a structural diagram of the segmentation network model of the present invention;
Figure 2 is a structural diagram of the residual initial module of the present invention;
Figure 3 is a structural diagram of the attention gating module of the present invention;
Figure 4 is a structural diagram of the contextual attention module of the present invention.
Detailed Description of Embodiments
The present invention is further described below with reference to Figures 1 to 4.
A cardiac MRI segmentation method based on context cascade attention, comprising the following steps:
a) Collect cardiac MRI data of n subjects to obtain an MRI data set s, s = {s_1, s_2, ..., s_i, ..., s_n}, where s_i is the cardiac MRI data of the i-th subject, i ∈ {1, 2, ..., n}.
b) Perform preprocessing operations on the MRI data set s to obtain a preprocessed data set F, F = {F_1, F_2, ..., F_i, ..., F_n}, where F_i is the i-th preprocessed two-dimensional image.
c) Divide the preprocessed data set F into a training set, a test set and a validation set.
d) Establish a segmentation network model consisting of an encoder and a decoder.
e) Input the preprocessed two-dimensional image data F_i into the encoder of the segmentation network model and output four feature maps, one from each encoder stage.
f) Input the four encoder feature maps into the decoder of the segmentation network model and output the predicted segmentation image.
g) Train the segmentation network model to obtain the optimized segmentation network model.
The residual initial module is used in the encoder to better capture and represent complex data features, and multi-layer context cascade operations are used in the decoder to learn multi-scale and multi-resolution spatial feature representations.
In one embodiment of the present invention, in step a), cardiac MRI data of n subjects containing the three structures LV, RV and MYO are collected from the public ACDC 2017 data set to obtain the MRI data set s. In this embodiment, preferably, n = 100 in step a).
In one embodiment of the present invention, step b) includes the following steps:
b-1) Resample the cardiac MRI data s_i of the i-th subject slice by slice along the z-axis so that the pixel spacing is 1.5 in the x-axis direction and 1.5 in the y-axis direction.
b-2) Perform a 2D center-cropping operation with a crop size of 224×224 on the resampled cardiac MRI data s_i to obtain cropped data F_i′; to ensure data consistency, normalize the cropped data F_i′ to obtain the preprocessed two-dimensional image data F_i.
In one embodiment of the present invention, in step c), the preprocessed data set F is divided into a training set, a test set and a validation set in a ratio of 7:2:1.
In one embodiment of the present invention, step e) includes the following steps:
e-1) The encoder of the segmentation network model consists of a first residual initial module, a first max-pooling layer, a second residual initial module, a second max-pooling layer, a third residual initial module, a third max-pooling layer and a fourth residual initial module, as sketched below.
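For orientation, a sketch of how the four encoder stages could be chained, reusing the `ResidualInceptionBlock` sketched earlier; the input and stage channel widths are illustrative assumptions:

```python
import torch.nn as nn

class Encoder(nn.Module):
    """Sketch of the encoder: four residual initial stages separated by 2x2 max pooling."""
    def __init__(self, in_ch=1, widths=(16, 32, 64, 128)):   # channel widths are illustrative
        super().__init__()
        self.stage1 = ResidualInceptionBlock(in_ch, widths[0])
        self.stage2 = ResidualInceptionBlock(4 * widths[0], widths[1])
        self.stage3 = ResidualInceptionBlock(4 * widths[1], widths[2])
        self.stage4 = ResidualInceptionBlock(4 * widths[2], widths[3])
        self.pool = nn.MaxPool2d(kernel_size=2)

    def forward(self, x):
        f1 = self.stage1(x)                 # full resolution
        f2 = self.stage2(self.pool(f1))     # 1/2 resolution
        f3 = self.stage3(self.pool(f2))     # 1/4 resolution
        f4 = self.stage4(self.pool(f3))     # 1/8 resolution
        return f1, f2, f3, f4               # the four feature maps passed to the decoder
```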
e-2) The first residual initial module consists of a first branch, a second branch, a third branch, a fourth branch and a fifth branch. The first branch consists of a convolution layer; the second branch consists of an average-pooling layer followed by a convolution layer; the third branch consists of a convolution layer followed by a BN layer; the fourth branch consists of a first convolution layer, a second convolution layer and a BN layer in sequence; the fifth branch consists of a first convolution layer, a second convolution layer, a first BN layer, a third convolution layer and a second BN layer in sequence. The preprocessed two-dimensional image data F_i is input into each of the five branches, each branch outputs a feature map, four of the branch feature maps are concatenated, and the concatenated feature map is added element-wise to the remaining branch feature map to obtain the output feature map of the first residual initial module. e-3) This output is input into the first max-pooling layer to obtain a down-sampled feature map. e-4) The second residual initial module has the same five-branch structure and processes the down-sampled feature map in the same way to obtain its output feature map. e-5) That output is input into the second max-pooling layer to obtain a down-sampled feature map. e-6) The third residual initial module has the same five-branch structure and processes the down-sampled feature map in the same way. e-7) Its output is input into the third max-pooling layer to obtain a down-sampled feature map. e-8) The fourth residual initial module has the same five-branch structure and processes the down-sampled feature map in the same way to obtain the fourth encoder feature map. In this embodiment, preferably, in each of steps e-2), e-4), e-6) and e-8), the convolution kernel size of the convolution layer of the first branch is 1×1, that of the convolution layer of the second branch is 1×1, and that of the convolution layer of the third branch is 1×1; in the fourth branch, the kernel size of the first convolution layer is 1×1 and that of the second convolution layer is 5×5 with padding 2; in the fifth branch, the kernel size of the first convolution layer is 1×1 and those of the second and third convolution layers are 3×3. The kernel sizes of the first max-pooling layer in step e-3), the second max-pooling layer in step e-5) and the third max-pooling layer in step e-7) are all 2×2. In one embodiment of the present invention, step f) includes the following steps:
In one embodiment of the present invention, step f) includes the following steps:
f-1) The decoder of the segmentation network model consists of Conv-block1, the first attention gating module, the first upsampling layer, the first contextual attention module, the first convolution layer, Conv-block2, the second attention gating module, the second upsampling layer, the second contextual attention module, the second convolution layer, Conv-block3, the third attention gating module, the third upsampling layer, the third contextual attention module, the third convolution layer, Conv-block4, the fourth attention gating module, the fourth contextual attention module, and the fourth convolution layer.
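The Conv-block units listed above are described in steps f-2) through f-5) below as a plain convolution–BatchNorm–ReLU sequence (3×3 kernel with padding 1 in the preferred embodiment). A minimal sketch, with channel counts left as parameters since they are not stated:

```python
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Conv-block used in the decoder: convolution -> BatchNorm -> ReLU
    (3x3 kernel, padding 1 per the preferred embodiment)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )
```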
f-2) Conv-block4 consists, in sequence, of a convolution layer, a BatchNorm layer, and a ReLU activation function; the input feature map is passed through Conv-block4 to obtain a decoded feature map. The fourth attention gating module consists of a first convolution layer, a first BN layer, a second convolution layer, a second BN layer, a ReLU activation function, a third convolution layer, a third BN layer, and a sigmoid function. One input feature map is passed, in sequence, through the first convolution layer and first BN layer of the fourth attention gating module, and the other input feature map is passed, in sequence, through its second convolution layer and second BN layer; the two results are added, and the sum is passed, in sequence, through the ReLU activation function, the third convolution layer, the third BN layer, and the sigmoid function to obtain an attention map; the attention map is multiplied with the input feature map to obtain a gated feature map. The fourth contextual attention module consists of a first convolution layer, a second convolution layer, a global maximum pooling layer, a third convolution layer, a fourth convolution layer, and a sigmoid function. The gated feature map is passed, in sequence, through the first and second convolution layers of the fourth contextual attention module to obtain a local feature map; it is also passed through the global maximum pooling layer and then, in sequence, through the third and fourth convolution layers to obtain a global feature map. The local and global feature maps are multiplied to obtain a product feature map, the product feature map is added to the local feature map to obtain a fused feature map, and the fused feature map is passed through the sigmoid function to obtain the module output, which is input to the fourth convolution layer of the decoder to obtain the fourth-stage decoding feature map.
f-3) Conv-block3 consists, in sequence, of a convolution layer, a BatchNorm layer, and a ReLU activation function; the corresponding input feature map is passed through Conv-block3, and the result is input to the third upsampling layer to obtain an upsampled feature map. The third attention gating module consists of the same layers as the fourth attention gating module (first convolution layer, first BN layer, second convolution layer, second BN layer, ReLU activation function, third convolution layer, third BN layer, sigmoid function) and processes its two input feature maps in the same manner to obtain an attention map, which is multiplied with the input feature map to obtain a gated feature map. The gated feature map is cascaded (concatenated) with the upsampled feature map to obtain a cascaded feature map. The third contextual attention module consists of the same layers as the fourth contextual attention module and processes the cascaded feature map in the same manner (a local path through the first and second convolution layers, a global path through the global maximum pooling layer and the third and fourth convolution layers, multiplication, addition, and the sigmoid function); the module output is input to the third convolution layer of the decoder to obtain the third-stage decoding feature map.
f-4) Conv-block2 consists, in sequence, of a convolution layer, a BatchNorm layer, and a ReLU activation function; the corresponding input feature map is passed through Conv-block2, and the result is input to the second upsampling layer to obtain an upsampled feature map. The second attention gating module and the second contextual attention module have the same structures as their counterparts in step f-3) and are applied in the same way: the attention gate produces an attention map that is multiplied with its input feature map, the gated feature map is cascaded with the upsampled feature map, the cascaded feature map is processed by the second contextual attention module, and the module output is input to the second convolution layer of the decoder to obtain the second-stage decoding feature map.
f-5) Conv-block1 consists, in sequence, of a convolution layer, a BatchNorm layer, and a ReLU activation function; the corresponding input feature map is passed through Conv-block1, and the result is input to the first upsampling layer to obtain an upsampled feature map. The first attention gating module and the first contextual attention module have the same structures as their counterparts in step f-3) and are applied in the same way; the output of the first contextual attention module is input to the first convolution layer of the decoder to obtain the first-stage decoding feature map.
f-6) The decoding feature maps obtained in steps f-2) through f-5) are added as residuals to obtain the output feature map of the decoder.
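The attention gating modules in steps f-2) through f-5) share one structure: two convolution+BN branches, addition, ReLU, a third convolution+BN, a sigmoid, and a final multiplication. A minimal PyTorch sketch, assuming the gating signal and the gated feature map already share spatial size, that all three convolutions are 1×1 (as in the preferred embodiment below), and that the attention map has a single channel; these choices are not stated explicitly in the text.

```python
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    """Attention gating module sketch for steps f-2) to f-5).

    Assumptions: `g` is the gating feature map, `x` is the feature map to be
    gated, both have the same spatial size, and the sigmoid attention map
    (single channel) multiplies `x`.
    """
    def __init__(self, g_ch: int, x_ch: int, inter_ch: int):
        super().__init__()
        self.conv_g = nn.Sequential(nn.Conv2d(g_ch, inter_ch, kernel_size=1),
                                    nn.BatchNorm2d(inter_ch))    # first conv + first BN
        self.conv_x = nn.Sequential(nn.Conv2d(x_ch, inter_ch, kernel_size=1),
                                    nn.BatchNorm2d(inter_ch))    # second conv + second BN
        self.psi = nn.Sequential(nn.ReLU(inplace=True),
                                 nn.Conv2d(inter_ch, 1, kernel_size=1),  # third conv
                                 nn.BatchNorm2d(1),                      # third BN
                                 nn.Sigmoid())

    def forward(self, g: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        attn = self.psi(self.conv_g(g) + self.conv_x(x))  # additive attention map
        return x * attn                                   # gated feature map
```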
In this embodiment, preferably, in step f-2) the convolution kernel size of the convolution layer of Conv-block4 is 3×3 with padding 1; the convolution kernel sizes of the first, second, and third convolution layers of the fourth attention gating module are all 1×1; the convolution kernel sizes of the first, second, and third convolution layers of the fourth contextual attention module are all 3×3 with padding 1, and the convolution kernel size of its fourth convolution layer is 1×1; the convolution kernel size of the fourth convolution layer of the decoder is 1×1. In step f-3), the convolution layer of Conv-block3 is 3×3 with padding 1; the first, second, and third convolution layers of the third attention gating module are all 1×1; the kernel size of the third upsampling layer is 2×2; the first, second, and third convolution layers of the third contextual attention module are all 3×3 with padding 1 and its fourth convolution layer is 1×1; the third convolution layer of the decoder is 1×1. In step f-4), the convolution layer of Conv-block2 is 3×3 with padding 1; the first, second, and third convolution layers of the second attention gating module are all 1×1; the kernel size of the second upsampling layer is 2×2; the first, second, and third convolution layers of the second contextual attention module are all 3×3 with padding 1 and its fourth convolution layer is 1×1; the second convolution layer of the decoder is 1×1. In step f-5), the convolution layer of Conv-block1 is 3×3 with padding 1; the first, second, and third convolution layers of the first attention gating module are all 1×1; the kernel size of the first upsampling layer is 2×2; the first, second, and third convolution layers of the first contextual attention module are all 3×3 with padding 1 and its fourth convolution layer is 1×1; the first convolution layer of the decoder is 1×1.
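Under the kernel sizes just listed, the contextual attention module can be sketched as follows. The channel count, the use of adaptive max pooling for the global maximum pooling layer, and adding the local/global product back to the local path are assumptions made for illustration; the text does not spell out these details.

```python
import torch
import torch.nn as nn

class ContextualAttention(nn.Module):
    """Contextual attention module sketch: three 3x3 convolutions (padding 1),
    one 1x1 convolution, a global maximum pooling layer, and a sigmoid.

    Assumption: the local/global product is added back to the local path
    before the sigmoid (the addition operand is not explicit in the text).
    """
    def __init__(self, ch: int):
        super().__init__()
        # local path: first and second 3x3 convolutions
        self.local = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1),
                                   nn.Conv2d(ch, ch, 3, padding=1))
        # global path: global max pooling, then third (3x3) and fourth (1x1) convolutions
        self.pool = nn.AdaptiveMaxPool2d(1)
        self.global_path = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1),
                                         nn.Conv2d(ch, ch, 1))
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        local = self.local(x)                  # local feature map (B, C, H, W)
        glob = self.global_path(self.pool(x))  # global descriptor (B, C, 1, 1)
        fused = local * glob + local           # broadcast multiply, residual add (assumed)
        return self.sigmoid(fused)
```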
In one embodiment of the present invention, in step g) the Dice loss and the cross-entropy loss are summed to obtain the total loss, and the Adam optimizer is used to train the segmentation network model with the total loss to obtain the optimized segmentation network model. During training, the batch size is set to 32, the number of training epochs to 200, and the learning rate to 0.001.
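A minimal training-loop sketch using the stated hyperparameters (Adam, learning rate 0.001, 200 epochs, Dice plus cross-entropy summed into a total loss). The `model` and `loader` arguments and the exact soft-Dice formulation are placeholders/assumptions, not part of the published method.

```python
import torch
import torch.nn as nn

def dice_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Soft Dice loss over softmax probabilities and one-hot targets (assumed formulation)."""
    probs = torch.softmax(pred, dim=1)
    dims = (0, 2, 3)
    inter = (probs * target).sum(dims)
    union = probs.sum(dims) + target.sum(dims)
    return 1.0 - ((2.0 * inter + eps) / (union + eps)).mean()

def train(model: nn.Module, loader, num_classes: int, device: str = "cuda") -> None:
    """Train with the stated settings: Adam, lr 0.001, 200 epochs,
    Dice + cross-entropy total loss (batch size 32 is set in the DataLoader)."""
    model = model.to(device)
    ce = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    for epoch in range(200):
        for images, labels in loader:            # labels: integer class maps (B, H, W)
            images, labels = images.to(device), labels.to(device)
            logits = model(images)               # (B, num_classes, H, W)
            one_hot = nn.functional.one_hot(labels, num_classes).permute(0, 3, 1, 2).float()
            loss = ce(logits, labels) + dice_loss(logits, one_hot)  # total loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```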
Finally, it should be noted that the above are only preferred embodiments of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or replace some of their technical features with equivalents. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
Claims (9)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311231297.8A CN117078941B (en) | 2023-09-22 | 2023-09-22 | Cardiac MRI segmentation method based on context cascade attention |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311231297.8A CN117078941B (en) | 2023-09-22 | 2023-09-22 | Cardiac MRI segmentation method based on context cascade attention |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117078941A CN117078941A (en) | 2023-11-17 |
CN117078941B true CN117078941B (en) | 2024-03-01 |
Family
ID=88719650
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311231297.8A Active CN117078941B (en) | 2023-09-22 | 2023-09-22 | Cardiac MRI segmentation method based on context cascade attention |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117078941B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117649523B (en) * | 2023-11-28 | 2024-07-09 | 齐鲁工业大学(山东省科学院) | Variable heart MRI segmentation method based on LUnetr model |
CN117593274B (en) * | 2023-11-30 | 2024-06-04 | 齐鲁工业大学(山东省科学院) | A cardiac MRI segmentation method based on shared channel attention mechanism |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021104056A1 (en) * | 2019-11-27 | 2021-06-03 | 中国科学院深圳先进技术研究院 | Automatic tumor segmentation system and method, and electronic device |
CN116563265A (en) * | 2023-05-23 | 2023-08-08 | 山东省人工智能研究院 | Cardiac MRI Segmentation Method Based on Multiscale Attention and Adaptive Feature Fusion |
CN116612131A (en) * | 2023-05-22 | 2023-08-18 | 山东省人工智能研究院 | Cardiac MRI structure segmentation method based on ADC-UNet model |
Also Published As
Publication number | Publication date |
---|---|
CN117078941A (en) | 2023-11-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||