CN115272170A - Prostate MRI (magnetic resonance imaging) image segmentation method and system based on adaptive multi-scale Transformer optimization - Google Patents

Publication number
CN115272170A
Authority: CN (China)
Prior art keywords: feature map, prostate, module, image, sampling
Legal status: Pending
Application number: CN202210610244.6A
Other languages: Chinese (zh)
Inventors: 耿道颖, 朱静逸, 于泽宽, 李海庆, 金倞, 余荔恒
Current Assignee: Fudan University
Original Assignee: Fudan University
Application filed by Fudan University; priority to CN202210610244.6A

Classifications

    • G06T7/0012: Biomedical image inspection (G06T7/00 Image analysis)
    • G06N3/08: Learning methods (G06N3/02 Neural networks)
    • G06T5/40: Image enhancement or restoration by the use of histogram techniques
    • G06T7/11: Region-based segmentation (G06T7/10 Segmentation; Edge detection)
    • G06T2207/10088: Magnetic resonance imaging [MRI]
    • G06T2207/20081: Training; Learning
    • G06T2207/20084: Artificial neural networks [ANN]


Abstract

The invention provides a prostate MRI image segmentation method and system based on adaptive multi-scale Transformer optimization, comprising the following steps: step S1: acquiring a prostate MRI image dataset; step S2: preprocessing the MRI images in the dataset to obtain a preprocessed training set and test set; step S3: initializing the parameters in the network, training the network containing the Transformer module on the prostate organ regions annotated in the prostate images, and updating the parameters to obtain an automatic prostate segmentation model; step S4: inputting the images of the test dataset into the trained network to obtain the prostate segmentation image of each MRI image. The invention introduces adaptive convolution to adaptively modulate the global complementary information of the convolution kernel, adopts an adaptive Transformer module to enhance the global semantic extraction capability, and improves the segmentation of prostate MRI images.

Description

Prostate MRI image segmentation method and system based on adaptive multi-scale Transformer optimization
Technical Field
The invention relates to the field of medical image processing, and in particular to a prostate MRI image segmentation method and system based on adaptive multi-scale Transformer optimization.
Background
Early-stage prostate cancer lesions are difficult to detect on imaging, and more than half of prostate cancer patients are not diagnosed until a late stage. To improve the cure rate, accurate early screening is of great significance for early intervention and treatment. Magnetic Resonance Imaging (MRI) has high soft-tissue resolution and can distinguish normal from pathological tissue signals well. At present, the differential diagnosis of prostate cancer is performed from MRI image features, which places high demands on the professional level of physicians and still has problems: the boundary between a prostate cancer lesion and the surrounding normal tissue is often indistinct, making it difficult for a radiologist to identify the lesion with the naked eye alone and to judge its malignancy accurately, which increases the risk of misdiagnosis and missed diagnosis. Research on prostate segmentation technology is therefore of great significance.
In recent years, with the development of Artificial Intelligence (AI) and of AI-based deep learning for image processing, deep learning algorithms have been widely used in the development and application of medical image aided diagnosis systems. Deep-learning-based prostate segmentation can help radiologists improve reading efficiency and reduce the rates of misdiagnosis and missed diagnosis. In general, deep-learning-based prostate MRI image segmentation techniques can be divided into five categories: techniques based on Convolutional Neural Networks (CNN), techniques based on U-shaped networks, techniques based on resolution enhancement, techniques based on Generative Adversarial Networks (GAN), and techniques based on Transformers. CNNs overcome the limitation of traditional machine learning that features must be extracted manually, and can automatically learn effective features from input images. Compared with image-level classification by a convolutional neural network, a U-shaped network adopts a symmetric encoder-decoder with skip connections and gradually restores the down-sampled feature map to the original size, thereby achieving pixel-level segmentation of medical images. Although U-shaped network models perform well in medical image segmentation, the following problems remain when they are applied to prostate segmentation in MRI images: (1) the U-shaped network improves segmentation precision by adding network layers, but this often causes gradient dispersion during back-propagation; (2) the pooled down-sampling operation in the U-shaped network loses detail information at the target edges, which affects the reconstruction of the feature map after up-sampling and reduces segmentation precision; (3) the glands to be segmented vary in size and shape, and the gland tissue has low contrast against the background, making it difficult for the network to focus on the gland structure. Further improvements are therefore desirable. The GAN-based segmentation technique mainly consists of a segmentation network that generates a segmentation prediction map and a discrimination network that determines whether an input comes from the true label or from the segmentation prediction. A Receptive Field Block (RFB) can also be integrated into the segmentation network to acquire and fuse multi-scale information from the deep features, improving the discriminability and robustness of the features and thus the segmentation performance of the network. Recently, to better capture the global features of images, Transformer-based frameworks have increasingly been introduced into medical image segmentation tasks and have shown excellent performance.
Patent document CN110188792A (application number: CN201910312296.3) discloses an image feature acquisition method for three-dimensional prostate MRI images. That invention obtains the prostate organ region by automatically segmenting the prostate T2WI image, maps the region onto the registered ADC and DWI images based on the segmentation result, takes the resulting multi-parameter prostate organ region as the input of a discrimination model, and combines multi-parameter MRI images with a deep learning algorithm to obtain image features. However, that invention does not introduce adaptive convolution to adaptively modulate the global complementary information of the convolution kernel.
Disclosure of Invention
In view of the defects in the prior art, the object of the invention is to provide a prostate MRI image segmentation method and system based on adaptive multi-scale Transformer optimization.
The prostate MRI image segmentation method based on adaptive multi-scale Transformer optimization provided by the invention comprises the following steps:
step S1: acquiring a prostate MRI image dataset;
step S2: preprocessing the MRI images in the dataset to obtain a preprocessed training set and test set;
step S3: initializing the parameters in the network, training the network containing the Transformer module on the prostate organ region annotated in the prostate images, and updating the parameters to obtain an automatic prostate segmentation model;
step S4: inputting the images of the test dataset into the trained network to obtain the prostate segmentation image of each MRI image.
Preferably, in step S1:
the prostate MRI image dataset comprises a T1WI scanning sequence, and the annotated region on each image is the extent of the prostate organ, obtained by manual annotation;
in step S2:
the preprocessing comprises performing a histogram equalization operation on all MRI images in the training set and the test set, and performing data augmentation on all MRI images in the training set and the test set.
Preferably, in step S3:
the parameters in the network mainly comprise the weights, the biases, the number of convolution kernels, the size of the convolution kernels, the number of Transformer modules and the number of Dropout layers;
the network containing the Transformer module is trained on the prostate organ region annotated in the prostate images by a physician, and the parameters in the network are updated to obtain an automatic prostate segmentation model, whose structure comprises a down-sampling module, an up-sampling module and a Transformer-based adaptive multi-scale attention module.
Preferably, step S3.1: the Transformer-based network comprises N down-sampling layers, N up-sampling layers, 1 Leaky ReLU layer and M Transformer modules;
step S3.2: performing K convolutions on the input prostate image to obtain a feature map of the same size as the input prostate image, and subtracting the convolved feature map from the input image to obtain the 1st residual feature map, where K is a positive integer;
step S3.3: down-sampling the 1st residual feature map to obtain the 1st down-sampled feature map; performing K convolutions on the 1st down-sampled feature map to obtain a feature map of the same size; subtracting the feature map obtained after the K convolutions from the 1st down-sampled feature map to obtain the 2nd residual feature map;
step S3.4: in the same manner as step S3.3, performing one convolutional down-sampling on the nth residual feature map to obtain the nth down-sampled feature map; performing M convolutions on the nth down-sampled feature map to obtain a feature map of the same size; subtracting the feature map obtained after the M convolutions from the nth down-sampled feature map to obtain the (n+1)th residual feature map, where the initial value of n is 2;
step S3.5: adding 1 to n and repeating step S3.4; when n = N+1, going to step S3.6;
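As an illustration of the encoder loop in steps S3.2 to S3.5, the following minimal PyTorch-style sketch shows one residual down-sampling stage; the channel width, the kernel sizes and the default K = 2 are illustrative assumptions, not values taken from the patent:

```python
import torch
import torch.nn as nn

class ResidualDownBlock(nn.Module):
    """One encoder stage: K convolutions, a residual map formed by
    subtracting the convolved features from the stage input, then a
    convolutional down-sampling of the residual map."""
    def __init__(self, channels: int, k_convs: int = 2):
        super().__init__()
        self.convs = nn.Sequential(*[
            nn.Sequential(
                nn.Conv2d(channels, channels, kernel_size=3, padding=1),
                nn.LeakyReLU(0.1),
            ) for _ in range(k_convs)
        ])
        self.down = nn.Conv2d(channels, channels, kernel_size=3,
                              stride=2, padding=1)  # one convolutional down-sampling

    def forward(self, x: torch.Tensor):
        residual = x - self.convs(x)  # "subtract the convolved feature map"
        return residual, self.down(residual)

# usage: chain N such blocks, keeping every residual map for the decoder skip path
block = ResidualDownBlock(16)
res_map, down = block(torch.randn(1, 16, 256, 256))
print(res_map.shape, down.shape)  # (1, 16, 256, 256) (1, 16, 128, 128)
```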
step S3.6: after down-sampling the image, the stack of l sections of the input image is represented as a four-dimensional input:
x ∈ R^(l×h×w×c)
where h is the height of the feature map, w is the width, c is the number of channels, x is the input four-dimensional image, R is the set of positive real numbers, and l is the number of sections (the tomographic thickness);
exploiting the high correlation of nearby pixels in the feature map, x is down-sampled by average pooling with a kernel size of k to generate a compressed feature map:
x_pool = AvgPool_k(x) ∈ R^(l×(h/k)×(w/k)×c)
where x_pool is the pooled four-dimensional input;
step S3.7: computing the linear projection Q′ of the compressed feature map as query and the linear projection K′ as key, and denoting the channel-wise linear projection of x as V; the calculation formulas are:
Q′ = x_pool W_Q
K′ = x_pool W_K
V = x W_V
where W_Q is the linear projection weight of the query, W_K is the linear projection weight of the key, and W_V is the weight of the linear projection V;
step S3.8: computing the attention matrix A ∈ R^(l×l), which represents the degree of attention to be paid to the other sections when segmenting the current section; the calculation formula is:
A = softmax(Q′K′^T / √d)
where Q′ is the query projection, K′ is the key projection, and d is the projection dimension;
the output of the cross-section attention block is y ∈ R^(l×h×w×c), y = AV.
Step S3.9: embedding the cross-section attention module into a Transformer module, whose output is z, calculated as:
z = Layer_Norm(GELU(z_int W + b) + z_int)
where z_int is the intermediate result produced inside the Transformer module, W is the corresponding weight matrix, and b is the bias;
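Steps S3.6 to S3.9 can be read as the following PyTorch sketch; this is an illustrative rendering, not the patent's implementation: the scaled dot-product form of A, the pooling kernel k = 4, the projection width d = 64 and all class and parameter names are assumptions introduced here:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossSectionAttention(nn.Module):
    """Attention across the l sections of a stack (steps S3.6-S3.8).
    Q' and K' come from an average-pooled copy of the feature map;
    V is a channel-wise linear projection of the full-resolution map."""
    def __init__(self, h: int, w: int, c: int, k: int = 4, d: int = 64):
        super().__init__()
        self.k, self.d = k, d
        pooled_dim = (h // k) * (w // k) * c   # flattened size of one pooled section
        self.w_q = nn.Linear(pooled_dim, d, bias=False)
        self.w_k = nn.Linear(pooled_dim, d, bias=False)
        self.w_v = nn.Linear(c, c, bias=False)  # channel-wise projection of x

    def forward(self, x: torch.Tensor):
        # x: (l, h, w, c), one stack of l sections
        l, h, w, c = x.shape
        x_pool = F.avg_pool2d(x.permute(0, 3, 1, 2), self.k)  # (l, c, h/k, w/k)
        x_pool = x_pool.reshape(l, -1)                        # one vector per section
        q, k_ = self.w_q(x_pool), self.w_k(x_pool)            # (l, d)
        a = torch.softmax(q @ k_.T / self.d ** 0.5, dim=-1)   # (l, l) attention matrix A
        v = self.w_v(x).reshape(l, -1)                        # (l, h*w*c)
        return (a @ v).reshape(l, h, w, c)                    # y = A V

class SectionTransformerBlock(nn.Module):
    """Step S3.9: z = Layer_Norm(GELU(z_int W + b) + z_int)."""
    def __init__(self, h: int, w: int, c: int, k: int = 4):
        super().__init__()
        self.attn = CrossSectionAttention(h, w, c, k)
        self.proj = nn.Linear(c, c)   # the (W, b) pair of the formula
        self.norm = nn.LayerNorm(c)

    def forward(self, x: torch.Tensor):
        z_int = self.attn(x)
        return self.norm(F.gelu(self.proj(z_int)) + z_int)

# usage on a stack of 8 sections of 32x32 feature maps with 16 channels
x = torch.randn(8, 32, 32, 16)
print(SectionTransformerBlock(32, 32, 16)(x).shape)  # torch.Size([8, 32, 32, 16])
```

Because Q′ and K′ are computed from the pooled copy while V keeps full resolution, the l×l attention matrix stays small even for large sections, which is the point of the average-pooling compression in step S3.6.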
step S3.10: performing one up-sampling on the (N+1)th residual feature map to obtain the 1st up-sampled feature map; adding to it the residual feature map of the same size among the 1st to (N+1)th residual feature maps to obtain the 1st feature map;
step S3.11: performing K convolutions on the 1st feature map to obtain a feature map of the same size; subtracting the convolved feature map from the 1st feature map to obtain the 1st up-sampling residual feature map;
step S3.12: performing one up-sampling on the mth up-sampling residual feature map to obtain the (m+1)th up-sampled feature map; adding to it the residual feature map of the same size among the 1st to (N+1)th residual feature maps to obtain the (m+1)th feature map, where the initial value of m is 1;
step S3.13: performing K convolutions on the (m+1)th feature map to obtain a feature map of the same size; subtracting the convolved feature map from the (m+1)th feature map to obtain the (m+1)th up-sampling residual feature map;
step S3.14: adding 1 to m and repeating steps S3.12 and S3.13 until m = N, then going to step S3.15;
step S3.15: applying convolution, a Leaky ReLU activation function and unpooling to the N feature maps representing high-level feature information generated during up-sampling, so as to increase the size of the feature maps and obtain the network parameters;
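The decoder loop of steps S3.10 to S3.14 mirrors the encoder; below is a minimal sketch of one stage under the same illustrative assumptions (using a transposed convolution as the up-sampling operator is an assumption, as the patent does not fix one):

```python
import torch
import torch.nn as nn

class ResidualUpBlock(nn.Module):
    """One decoder stage: up-sample, add the size-matched residual map
    from the encoder (the skip connection), run K convolutions and
    subtract to form the up-sampling residual map."""
    def __init__(self, channels: int, k_convs: int = 2):
        super().__init__()
        self.up = nn.ConvTranspose2d(channels, channels, kernel_size=2, stride=2)
        self.convs = nn.Sequential(*[
            nn.Sequential(
                nn.Conv2d(channels, channels, kernel_size=3, padding=1),
                nn.LeakyReLU(0.1),
            ) for _ in range(k_convs)
        ])

    def forward(self, x: torch.Tensor, skip_residual: torch.Tensor):
        feat = self.up(x) + skip_residual  # add the same-size encoder residual map
        return feat - self.convs(feat)     # the up-sampling residual feature map

# usage with one stored encoder residual map
up = ResidualUpBlock(16)
out = up(torch.randn(1, 16, 64, 64), torch.randn(1, 16, 128, 128))
print(out.shape)  # torch.Size([1, 16, 128, 128])
```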
the calculation formula of the loss function D is as follows:
D = 1 - (2 Σ_{i=1..X} Σ_{j=1..Y} p_ij g_ij) / (Σ_{i=1..X} Σ_{j=1..Y} p_ij + Σ_{i=1..X} Σ_{j=1..Y} g_ij)
where i is the abscissa of a pixel, j is the ordinate of a pixel, p_ij is the pixel value at coordinate (i, j) in the binary image obtained by segmenting the input prostate image with the network, g_ij is the pixel value at coordinate (i, j) in the manually annotated standard segmentation image corresponding to the input image, and X and Y are respectively the length and width of the target region of the manually annotated standard segmentation image.
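The variables just defined (a network prediction p_ij compared against the annotated standard g_ij over an X×Y region) are those of a Dice-style overlap loss; the following sketch assumes that form:

```python
import torch

def dice_loss(pred: torch.Tensor, target: torch.Tensor,
              eps: float = 1e-6) -> torch.Tensor:
    """Dice-style overlap loss between the network's segmentation p_ij
    and the manually annotated standard g_ij over an X x Y region."""
    intersection = (pred * target).sum()
    return 1.0 - (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# usage on a 256 x 256 prediction/annotation pair
p = torch.rand(256, 256)                   # network output in [0, 1]
g = (torch.rand(256, 256) > 0.5).float()   # binary annotated standard
print(dice_loss(p, g))
```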
Preferably, in step S4:
semi-supervised learning is performed using the distribution information of the prostate region in the MRI images of all the data; the training is divided into two stages:
first stage: training with the data of the manually annotated standard prostate segmentation images to generate a network model, predicting the unannotated prostate MRI images, and putting the prediction results into the initial training set;
second stage: performing integrated training with all the data to generate the network model;
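The two training stages can be sketched as follows; StubModel and its fit and predict methods are hypothetical stand-ins for the segmentation network and its training and inference routines, not names from the patent:

```python
class StubModel:
    """Hypothetical stand-in for the prostate segmentation network."""
    def fit(self, dataset):
        print(f"training on {len(dataset)} samples")
    def predict(self, image):
        return "pseudo-mask"  # placeholder for a predicted segmentation

def two_stage_training(model, labeled_set, unlabeled_images):
    model.fit(labeled_set)                  # stage 1: manually annotated data only
    pseudo = [(img, model.predict(img))     # predictions join the training set
              for img in unlabeled_images]
    model.fit(labeled_set + pseudo)         # stage 2: integrated training on all data
    return model

two_stage_training(StubModel(), [("img_1", "mask_1")], ["img_2", "img_3"])
```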
adaptive convolution is introduced to adaptively modulate the global complementary information of the convolution kernel, and an adaptive Transformer module is adopted to enhance the global semantic extraction capability and further model long-range dependencies, so that the module performs better on skip-connection networks of different scales.
The invention also provides a prostate MRI image segmentation system based on adaptive multi-scale Transformer optimization, comprising:
module M1: acquiring a prostate MRI image dataset;
module M2: preprocessing the MRI images in the dataset to obtain a preprocessed training set and test set;
module M3: initializing the parameters in the network, training the network containing the Transformer module on the prostate organ region annotated in the prostate images, and updating the parameters to obtain an automatic prostate segmentation model;
module M4: inputting the images of the test dataset into the trained network to obtain the prostate segmentation image of each MRI image.
Preferably, in module M1:
the prostate MRI image dataset comprises a T1WI scanning sequence, and the annotated region on each image is the extent of the prostate organ, obtained by manual annotation;
in module M2:
the preprocessing comprises performing a histogram equalization operation on all MRI images in the training set and the test set, and performing data augmentation on all MRI images in the training set and the test set.
Preferably, in module M3:
the parameters in the network mainly comprise the weights, the biases, the number of convolution kernels, the size of the convolution kernels, the number of Transformer modules and the number of Dropout layers;
the network containing the Transformer module is trained on the prostate organ region annotated in the prostate images by a physician, and the parameters in the network are updated to obtain an automatic prostate segmentation model, whose structure comprises a down-sampling module, an up-sampling module and a Transformer-based adaptive multi-scale attention module.
Preferably, module M3.1: the Transformer-based network comprises N down-sampling layers, N up-sampling layers, 1 Leaky ReLU layer and M Transformer modules;
module M3.2: performing K convolutions on the input prostate image to obtain a feature map of the same size as the input prostate image, and subtracting the convolved feature map from the input image to obtain the 1st residual feature map, where K is a positive integer;
module M3.3: down-sampling the 1st residual feature map to obtain the 1st down-sampled feature map; performing K convolutions on the 1st down-sampled feature map to obtain a feature map of the same size; subtracting the feature map obtained after the K convolutions from the 1st down-sampled feature map to obtain the 2nd residual feature map;
module M3.4: in the same manner as module M3.3, performing one convolutional down-sampling on the nth residual feature map to obtain the nth down-sampled feature map; performing M convolutions on the nth down-sampled feature map to obtain a feature map of the same size; subtracting the feature map obtained after the M convolutions from the nth down-sampled feature map to obtain the (n+1)th residual feature map, where the initial value of n is 2;
module M3.5: adding 1 to n and repeating module M3.4; when n = N+1, going to module M3.6;
module M3.6: after down-sampling the image, the stack of l sections of the input image is represented as a four-dimensional input:
x ∈ R^(l×h×w×c)
where h is the height of the feature map, w is the width, c is the number of channels, x is the input four-dimensional image, R is the set of positive real numbers, and l is the number of sections (the tomographic thickness);
exploiting the high correlation of nearby pixels in the feature map, x is down-sampled by average pooling with a kernel size of k to generate a compressed feature map:
x_pool = AvgPool_k(x) ∈ R^(l×(h/k)×(w/k)×c)
where x_pool is the pooled four-dimensional input;
module M3.7: computing the linear projection Q′ of the compressed feature map as query and the linear projection K′ as key, and denoting the channel-wise linear projection of x as V; the calculation formulas are:
Q′ = x_pool W_Q
K′ = x_pool W_K
V = x W_V
where W_Q is the linear projection weight of the query, W_K is the linear projection weight of the key, and W_V is the weight of the linear projection V;
module M3.8: computing the attention matrix A ∈ R^(l×l), which represents the degree of attention to be paid to the other sections when segmenting the current section; the calculation formula is:
A = softmax(Q′K′^T / √d)
where Q′ is the query projection, K′ is the key projection, and d is the projection dimension;
the output of the cross-section attention block is y ∈ R^(l×h×w×c), y = AV.
Module M3.9: embedding the cross-section attention module into a Transformer module, whose output is z, calculated as:
z = Layer_Norm(GELU(z_int W + b) + z_int)
where z_int is the intermediate result produced inside the Transformer module, W is the corresponding weight matrix, and b is the bias;
module M3.10: performing one up-sampling on the (N+1)th residual feature map to obtain the 1st up-sampled feature map; adding to it the residual feature map of the same size among the 1st to (N+1)th residual feature maps to obtain the 1st feature map;
module M3.11: performing K convolutions on the 1st feature map to obtain a feature map of the same size; subtracting the convolved feature map from the 1st feature map to obtain the 1st up-sampling residual feature map;
module M3.12: performing one up-sampling on the mth up-sampling residual feature map to obtain the (m+1)th up-sampled feature map; adding to it the residual feature map of the same size among the 1st to (N+1)th residual feature maps to obtain the (m+1)th feature map, where the initial value of m is 1;
module M3.13: performing K convolutions on the (m+1)th feature map to obtain a feature map of the same size; subtracting the convolved feature map from the (m+1)th feature map to obtain the (m+1)th up-sampling residual feature map;
module M3.14: adding 1 to m and repeating modules M3.12 and M3.13 until m = N, then going to module M3.15;
module M3.15: applying convolution, a Leaky ReLU activation function and unpooling to the N feature maps representing high-level feature information generated during up-sampling, so as to increase the size of the feature maps and obtain the network parameters;
the calculation formula of the loss function D is as follows:
D = 1 - (2 Σ_{i=1..X} Σ_{j=1..Y} p_ij g_ij) / (Σ_{i=1..X} Σ_{j=1..Y} p_ij + Σ_{i=1..X} Σ_{j=1..Y} g_ij)
where i is the abscissa of a pixel, j is the ordinate of a pixel, p_ij is the pixel value at coordinate (i, j) in the binary image obtained by segmenting the input prostate image with the network, g_ij is the pixel value at coordinate (i, j) in the manually annotated standard segmentation image corresponding to the input image, and X and Y are respectively the length and width of the target region of the manually annotated standard segmentation image.
Preferably, in module M4:
semi-supervised learning is performed using the distribution information of the prostate region in the MRI images of all the data; the training is divided into two stages:
first stage: training with the data of the manually annotated standard prostate segmentation images to generate a network model, predicting the unannotated prostate MRI images, and putting the prediction results into the initial training set;
second stage: performing integrated training with all the data to generate the network model;
adaptive convolution is introduced to adaptively modulate the global complementary information of the convolution kernel, and an adaptive Transformer module is adopted to enhance the global semantic extraction capability and further model long-range dependencies, so that the module performs better on skip-connection networks of different scales.
Compared with the prior art, the invention has the following beneficial effects:
1. Because two-dimensional networks cannot obtain useful information from adjacent sections, and because MRI data are anisotropic (the through-plane resolution is much lower than the in-plane resolution), three-dimensional networks do not perform well with convolution alone. The invention introduces adaptive convolution to adaptively modulate the global complementary information of the convolution kernel;
2. The invention adopts an adaptive Transformer module to enhance the global semantic extraction capability and further model long-range dependencies; the module performs better on skip-connection networks of different scales, so the adaptive Transformer module can improve the skip-connection-based U-shaped network and improve the segmentation of prostate MRI images;
3. The invention establishes an automatic prostate organ segmentation model based on a large amount of prostate MRI image data, reduces the interference of irrelevant background information on the discrimination model, and provides a basis for improving the efficiency and accuracy of prostate cancer diagnosis.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a flow chart of prostate segmentation based on adaptive multi-scale Transformer optimization according to the invention;
FIG. 3 is a flow chart of the adaptive multi-scale attention module of the invention;
FIG. 4 is a flow chart of the Transformer module of the invention.
Detailed Description
The invention will be described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the invention, but do not limit it in any way. It should be noted that a person skilled in the art can make several changes and improvements without departing from the concept of the invention, all of which fall within the scope of the invention.
Example 1:
The prostate MRI image segmentation method based on adaptive multi-scale Transformer optimization provided by the invention, as shown in FIGS. 1-4, comprises the following steps:
step S1: acquiring a prostate MRI image dataset;
specifically, in step S1:
the prostate MRI image dataset comprises a T1WI scanning sequence, and the annotated region on each image is the extent of the prostate organ, obtained by manual annotation;
step S2: preprocessing the MRI images in the dataset to obtain a preprocessed training set and test set;
in step S2:
the preprocessing comprises performing a histogram equalization operation on all MRI images in the training set and the test set, and performing data augmentation on all MRI images in the training set and the test set.
Step S3: initializing the parameters in the network, training the network containing the Transformer module on the prostate organ region annotated in the prostate images, and updating the parameters to obtain an automatic prostate segmentation model;
specifically, in step S3:
the parameters in the network mainly comprise the weights, the biases, the number of convolution kernels, the size of the convolution kernels, the number of Transformer modules and the number of Dropout layers;
the network containing the Transformer module is trained on the prostate organ region annotated in the prostate images by a physician, and the parameters in the network are updated to obtain an automatic prostate segmentation model, whose structure comprises a down-sampling module, an up-sampling module and a Transformer-based adaptive multi-scale attention module.
Specifically, step S3.1: the Transformer-based network comprises N down-sampling layers, N up-sampling layers, 1 Leaky ReLU layer and M Transformer modules;
step S3.2: performing K convolutions on the input prostate image to obtain a feature map of the same size as the input prostate image, and subtracting the convolved feature map from the input image to obtain the 1st residual feature map, where K is a positive integer;
step S3.3: down-sampling the 1st residual feature map to obtain the 1st down-sampled feature map; performing K convolutions on the 1st down-sampled feature map to obtain a feature map of the same size; subtracting the feature map obtained after the K convolutions from the 1st down-sampled feature map to obtain the 2nd residual feature map;
step S3.4: in the same manner as step S3.3, performing one convolutional down-sampling on the nth residual feature map to obtain the nth down-sampled feature map; performing M convolutions on the nth down-sampled feature map to obtain a feature map of the same size; subtracting the feature map obtained after the M convolutions from the nth down-sampled feature map to obtain the (n+1)th residual feature map, where the initial value of n is 2;
step S3.5: adding 1 to n and repeating step S3.4; when n = N+1, going to step S3.6;
step S3.6: after down-sampling the image, the stack of l sections of the input image is represented as a four-dimensional input:
x ∈ R^(l×h×w×c)
where h is the height of the feature map, w is the width, c is the number of channels, x is the input four-dimensional image, R is the set of positive real numbers, and l is the number of sections (the tomographic thickness);
exploiting the high correlation of nearby pixels in the feature map, x is down-sampled by average pooling with a kernel size of k to generate a compressed feature map:
x_pool = AvgPool_k(x) ∈ R^(l×(h/k)×(w/k)×c)
where x_pool is the pooled four-dimensional input;
step S3.7: computing the linear projection Q′ of the compressed feature map as query and the linear projection K′ as key, and denoting the channel-wise linear projection of x as V; the calculation formulas are:
Q′ = x_pool W_Q
K′ = x_pool W_K
V = x W_V
where W_Q is the linear projection weight of the query, W_K is the linear projection weight of the key, and W_V is the weight of the linear projection V;
step S3.8: computing the attention matrix A ∈ R^(l×l), which represents the degree of attention to be paid to the other sections when segmenting the current section; the calculation formula is:
A = softmax(Q′K′^T / √d)
where Q′ is the query projection, K′ is the key projection, and d is the projection dimension;
the output of the cross-section attention block is y ∈ R^(l×h×w×c), y = AV.
Step S3.9: embedding the cross-section attention module into a Transformer module, whose output is z, calculated as:
z = Layer_Norm(GELU(z_int W + b) + z_int)
where z_int is the intermediate result produced inside the Transformer module, W is the corresponding weight matrix, and b is the bias;
step S3.10: performing one up-sampling on the (N+1)th residual feature map to obtain the 1st up-sampled feature map; adding to it the residual feature map of the same size among the 1st to (N+1)th residual feature maps to obtain the 1st feature map;
step S3.11: performing K convolutions on the 1st feature map to obtain a feature map of the same size; subtracting the convolved feature map from the 1st feature map to obtain the 1st up-sampling residual feature map;
step S3.12: performing one up-sampling on the mth up-sampling residual feature map to obtain the (m+1)th up-sampled feature map; adding to it the residual feature map of the same size among the 1st to (N+1)th residual feature maps to obtain the (m+1)th feature map, where the initial value of m is 1;
step S3.13: performing K convolutions on the (m+1)th feature map to obtain a feature map of the same size; subtracting the convolved feature map from the (m+1)th feature map to obtain the (m+1)th up-sampling residual feature map;
step S3.14: adding 1 to m and repeating steps S3.12 and S3.13 until m = N, then going to step S3.15;
step S3.15: applying convolution, a Leaky ReLU activation function and unpooling to the N feature maps representing high-level feature information generated during up-sampling, so as to increase the size of the feature maps and obtain the network parameters;
the calculation formula of the loss function D is as follows:
D = 1 - (2 Σ_{i=1..X} Σ_{j=1..Y} p_ij g_ij) / (Σ_{i=1..X} Σ_{j=1..Y} p_ij + Σ_{i=1..X} Σ_{j=1..Y} g_ij)
where i is the abscissa of a pixel, j is the ordinate of a pixel, p_ij is the pixel value at coordinate (i, j) in the binary image obtained by segmenting the input prostate image with the network, g_ij is the pixel value at coordinate (i, j) in the manually annotated standard segmentation image corresponding to the input image, and X and Y are respectively the length and width of the target region of the manually annotated standard segmentation image.
Step S4: inputting the images of the test dataset into the trained network to obtain the prostate segmentation image of each MRI image.
Specifically, in step S4:
semi-supervised learning is performed using the distribution information of the prostate region in the MRI images of all the data; the training is divided into two stages:
first stage: training with the data of the manually annotated standard prostate segmentation images to generate a network model, predicting the unannotated prostate MRI images, and putting the prediction results into the initial training set;
second stage: performing integrated training with all the data to generate the network model;
adaptive convolution is introduced to adaptively modulate the global complementary information of the convolution kernel, and an adaptive Transformer module is adopted to enhance the global semantic extraction capability and further model long-range dependencies, so that the module performs better on skip-connection networks of different scales.
Example 2:
Example 2 is a preferred example of Example 1, and the invention will be described in more detail through it.
Those skilled in the art can understand the prostate MRI image segmentation method based on adaptive multi-scale Transformer optimization provided by the invention as a specific embodiment of the prostate MRI image segmentation system based on adaptive multi-scale Transformer optimization; that is, the system can be implemented by executing the step flow of the method.
The invention provides a prostate MRI image segmentation system based on adaptive multi-scale Transformer optimization, comprising:
module M1: acquiring a prostate MRI image dataset;
specifically, in module M1:
the prostate MRI image dataset comprises a T1WI scanning sequence, and the annotated region on each image is the extent of the prostate organ, obtained by manual annotation;
module M2: preprocessing the MRI images in the dataset to obtain a preprocessed training set and test set;
in module M2:
the preprocessing comprises performing a histogram equalization operation on all MRI images in the training set and the test set, and performing data augmentation on all MRI images in the training set and the test set.
Module M3: initializing the parameters in the network, training the network containing the Transformer module on the prostate organ region annotated in the prostate images, and updating the parameters to obtain an automatic prostate segmentation model;
specifically, in module M3:
the parameters in the network mainly comprise the weights, the biases, the number of convolution kernels, the size of the convolution kernels, the number of Transformer modules and the number of Dropout layers;
the network containing the Transformer module is trained on the prostate organ region annotated in the prostate images by a physician, and the parameters in the network are updated to obtain an automatic prostate segmentation model, whose structure comprises a down-sampling module, an up-sampling module and a Transformer-based adaptive multi-scale attention module.
Specifically, module M3.1: the Transformer-based network comprises N down-sampling layers, N up-sampling layers, 1 Leaky ReLU layer and M Transformer modules;
module M3.2: performing K convolutions on the input prostate image to obtain a feature map of the same size as the input prostate image, and subtracting the convolved feature map from the input image to obtain the 1st residual feature map, where K is a positive integer;
module M3.3: down-sampling the 1st residual feature map to obtain the 1st down-sampled feature map; performing K convolutions on the 1st down-sampled feature map to obtain a feature map of the same size; subtracting the feature map obtained after the K convolutions from the 1st down-sampled feature map to obtain the 2nd residual feature map;
module M3.4: in the same manner as module M3.3, performing one convolutional down-sampling on the nth residual feature map to obtain the nth down-sampled feature map; performing M convolutions on the nth down-sampled feature map to obtain a feature map of the same size; subtracting the feature map obtained after the M convolutions from the nth down-sampled feature map to obtain the (n+1)th residual feature map, where the initial value of n is 2;
module M3.5: adding 1 to n and repeating module M3.4; when n = N+1, going to module M3.6;
module M3.6: after down-sampling the image, the stack of l sections of the input image is represented as a four-dimensional input:
x ∈ R^(l×h×w×c)
where h is the height of the feature map, w is the width, c is the number of channels, x is the input four-dimensional image, R is the set of positive real numbers, and l is the number of sections (the tomographic thickness);
exploiting the high correlation of nearby pixels in the feature map, x is down-sampled by average pooling with a kernel size of k to generate a compressed feature map:
x_pool = AvgPool_k(x) ∈ R^(l×(h/k)×(w/k)×c)
where x_pool is the pooled four-dimensional input;
module M3.7: computing the linear projection Q′ of the compressed feature map as query and the linear projection K′ as key, and denoting the channel-wise linear projection of x as V; the calculation formulas are:
Q′ = x_pool W_Q
K′ = x_pool W_K
V = x W_V
where W_Q is the linear projection weight of the query, W_K is the linear projection weight of the key, and W_V is the weight of the linear projection V;
module M3.8: computing the attention matrix A ∈ R^(l×l), which represents the degree of attention to be paid to the other sections when segmenting the current section; the calculation formula is:
A = softmax(Q′K′^T / √d)
where Q′ is the query projection, K′ is the key projection, and d is the projection dimension;
the output of the cross-section attention block is y ∈ R^(l×h×w×c), y = AV.
Module M3.9: embedding the cross-section attention module into a Transformer module, whose output is z, calculated as:
z = Layer_Norm(GELU(z_int W + b) + z_int)
where z_int is the intermediate result produced inside the Transformer module, W is the corresponding weight matrix, and b is the bias;
module M3.10: performing one up-sampling on the (N+1)th residual feature map to obtain the 1st up-sampled feature map; adding to it the residual feature map of the same size among the 1st to (N+1)th residual feature maps to obtain the 1st feature map;
module M3.11: performing K convolutions on the 1st feature map to obtain a feature map of the same size; subtracting the convolved feature map from the 1st feature map to obtain the 1st up-sampling residual feature map;
module M3.12: performing one up-sampling on the mth up-sampling residual feature map to obtain the (m+1)th up-sampled feature map; adding to it the residual feature map of the same size among the 1st to (N+1)th residual feature maps to obtain the (m+1)th feature map, where the initial value of m is 1;
module M3.13: performing K convolutions on the (m+1)th feature map to obtain a feature map of the same size; subtracting the convolved feature map from the (m+1)th feature map to obtain the (m+1)th up-sampling residual feature map;
module M3.14: adding 1 to m and repeating modules M3.12 and M3.13 until m = N, then going to module M3.15;
module M3.15: applying convolution, a Leaky ReLU activation function and unpooling to the N feature maps representing high-level feature information generated during up-sampling, so as to increase the size of the feature maps and obtain the network parameters;
the calculation formula of the loss function D is as follows:
D = 1 - (2 Σ_{i=1..X} Σ_{j=1..Y} p_ij g_ij) / (Σ_{i=1..X} Σ_{j=1..Y} p_ij + Σ_{i=1..X} Σ_{j=1..Y} g_ij)
where i is the abscissa of a pixel, j is the ordinate of a pixel, p_ij is the pixel value at coordinate (i, j) in the binary image obtained by segmenting the input prostate image with the network, g_ij is the pixel value at coordinate (i, j) in the manually annotated standard segmentation image corresponding to the input image, and X and Y are respectively the length and width of the target region of the manually annotated standard segmentation image.
Module M4: inputting the images of the test dataset into the trained network to obtain the prostate segmentation image of each MRI image.
Specifically, in module M4:
semi-supervised learning is performed using the distribution information of the prostate region in the MRI images of all the data; the training is divided into two stages:
first stage: training with the data of the manually annotated standard prostate segmentation images to generate a network model, predicting the unannotated prostate MRI images, and putting the prediction results into the initial training set;
second stage: performing integrated training with all the data to generate the network model;
adaptive convolution is introduced to adaptively modulate the global complementary information of the convolution kernel, and an adaptive Transformer module is adopted to enhance the global semantic extraction capability and further model long-range dependencies, so that the module performs better on skip-connection networks of different scales.
Example 3:
Example 3 is a preferred example of Example 1, and the invention will be described in more detail through it.
The invention discloses a prostate MRI image segmentation method based on adaptive multi-scale Transformer optimization and provides a Transformer-based 3D cross-section attention module that extracts information between different sections to segment the prostate region. The cross-section attention module improves the segmentation efficiency and precision of the deep learning network and can also effectively alleviate the over-fitting problem. The invention obtains the corresponding prostate organ region by automatic organ segmentation of the prostate image. Based on a large amount of prostate MRI image data, the invention establishes an automatic prostate organ segmentation model, reduces the interference of irrelevant background information on the discrimination model, and provides a basis for improving the efficiency and accuracy of prostate cancer diagnosis.
To meet the need of using computers to assist radiologists in diagnosing prostate cancer, the invention provides a prostate MRI image segmentation method based on adaptive multi-scale Transformer optimization whose segmentation results at edges and in details are more accurate, improving the prostate segmentation effect.
The technical solution adopted by the invention to solve this technical problem is as follows: a prostate MRI image segmentation method based on adaptive multi-scale Transformer optimization, as shown in FIG. 1, comprising the following steps:
Step 1, acquiring a prostate MRI image dataset, which includes but is not limited to a T1WI scanning sequence; the annotated region on each image is the extent of the prostate organ, obtained by manual annotation;
Step 2, preprocessing all MRI images in the dataset to obtain a preprocessed training set and test set;
to address problems such as indistinct tissue features in the original prostate images, the preprocessing in step 2 comprises performing a histogram equalization operation on all MRI images in the training set and the test set; this enhances the detail of the images and improves their overall brightness.
To counter the over-fitting of network training caused by the small number of training samples, the preprocessing in step 2 also performs data augmentation on all MRI images in the training set and the test set through operations such as horizontal and vertical translation, expansion and shrinking, scale transformation, and angle rotation.
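As an illustration of this preprocessing, the following OpenCV sketch performs histogram equalization followed by a random affine augmentation; the translation, scaling and rotation ranges are assumptions, not parameters given in the patent:

```python
import cv2
import numpy as np

def preprocess(image: np.ndarray) -> np.ndarray:
    """Histogram equalization to enhance detail and overall brightness."""
    return cv2.equalizeHist(image.astype(np.uint8))

def augment(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Random translation, scaling and rotation for data enhancement."""
    h, w = image.shape
    angle = rng.uniform(-15, 15)                  # assumed rotation range (degrees)
    scale = rng.uniform(0.9, 1.1)                 # assumed scaling range
    tx, ty = rng.uniform(-0.1, 0.1, 2) * (w, h)   # assumed translation range
    m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, scale)
    m[:, 2] += (tx, ty)
    return cv2.warpAffine(image, m, (w, h))

rng = np.random.default_rng(0)
slice_img = rng.integers(0, 256, (256, 256), dtype=np.uint8)
augmented = augment(preprocess(slice_img).astype(np.float32), rng)
```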
Step 3, initializing the parameters in the network, which mainly comprise the weights, the biases, the number of convolution kernels, the size of the convolution kernels, the number of Transformer modules, the number of Dropout layers, and so on. The network containing the Transformer module is trained on the prostate organ region annotated in the prostate images by a physician, and the parameters in the network are continuously updated to obtain an automatic prostate segmentation model, whose structure comprises a down-sampling module, an up-sampling module and a Transformer-based adaptive multi-scale attention module;
Step 3-1: in this embodiment, the Transformer-based network comprises N down-sampling layers, N up-sampling layers, 1 Leaky ReLU layer and M Transformer modules, where N and M are positive integers determined by experiment. Since the number of network layers N strongly influences the feature-extraction effect, a suitable network depth is found by presetting an N value, gradually increasing the depth, training network models of different depths, and evaluating them with several indexes, namely the Dice Similarity Coefficient (DSC), the Precision and the Recall, so as to find the most suitable values of N and M; when N = 5 and M = 2, the network achieves the best segmentation of the prostate region;
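For binary segmentation masks, the three evaluation indexes named here can be computed as in this minimal NumPy sketch:

```python
import numpy as np

def seg_metrics(pred: np.ndarray, truth: np.ndarray) -> dict:
    """Dice Similarity Coefficient, Precision and Recall for binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()   # true-positive pixels
    eps = 1e-6                               # guards against empty masks
    return {
        "DSC": 2 * tp / (pred.sum() + truth.sum() + eps),
        "Precision": tp / (pred.sum() + eps),
        "Recall": tp / (truth.sum() + eps),
    }

# usage: a 2x2 prediction inside a 3x3 annotated region
pred = np.zeros((4, 4)); pred[1:3, 1:3] = 1
truth = np.zeros((4, 4)); truth[1:4, 1:4] = 1
print(seg_metrics(pred, truth))  # DSC ~ 0.615, Precision = 1.0, Recall ~ 0.444
```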
Step 3-2, performing K convolutions on the input prostate image to obtain a feature map of the same size as the input prostate image, and subtracting the convolved feature map from the input image to obtain the 1st residual feature map, where K is a positive integer;
Step 3-3, down-sampling the 1st residual feature map to obtain the 1st down-sampled feature map; performing K convolutions on the 1st down-sampled feature map to obtain a feature map of the same size; subtracting the feature map obtained after the K convolutions from the 1st down-sampled feature map to obtain the 2nd residual feature map;
Step 3-4, in the same manner as step 3-3, performing one convolutional down-sampling on the nth residual feature map to obtain the nth down-sampled feature map; performing M convolutions on the nth down-sampled feature map to obtain a feature map of the same size; subtracting the feature map obtained after the M convolutions from the nth down-sampled feature map to obtain the (n+1)th residual feature map, where the initial value of n is 2;
Step 3-5, adding 1 to n and repeating step 3-4; when n = N+1, going to step 3-6;
Step 3-6, after down-sampling the image, the stack of l sections of the input image is represented as a four-dimensional input x ∈ R^(l×h×w×c), where h is the height of the feature map, w is the width, and c is the number of channels. Exploiting the high correlation of nearby pixels in the feature map, x is down-sampled by average pooling (kernel size k) to generate a compressed feature map:
x_pool = AvgPool_k(x) ∈ R^(l×(h/k)×(w/k)×c)
Step 3-7, computing the linear projections Q′ and K′ of the compressed feature map, and denoting the channel-wise linear projection of x as V; the calculation formulas are:
Q′ = x_pool W_Q
K′ = x_pool W_K
V = x W_V
Step 3-8, computing the attention matrix A ∈ R^(l×l), which represents the degree of attention to be paid to the other sections when segmenting the current section; the calculation formula is:
A = softmax(Q′K′^T / √d)
The output of the cross-section attention block is y ∈ R^(l×h×w×c), where y = AV;
Step 3-9, embedding the cross-section attention module into a Transformer module, whose output is z, calculated as: z = Layer_Norm(GELU(z_int W + b) + z_int);
Step 3-10, performing one up-sampling on the (N+1)th residual feature map to obtain the 1st up-sampled feature map; adding to it the residual feature map of the same size among the 1st to (N+1)th residual feature maps to obtain the 1st feature map;
Step 3-11, performing K convolutions on the 1st feature map to obtain a feature map of the same size; subtracting the convolved feature map from the 1st feature map to obtain the 1st up-sampling residual feature map;
Step 3-12, in the same manner as step 3-10, performing one up-sampling on the mth up-sampling residual feature map to obtain the (m+1)th up-sampled feature map; adding to it the residual feature map of the same size among the 1st to (N+1)th residual feature maps to obtain the (m+1)th feature map, where the initial value of m is 1;
Step 3-13, in the same manner as step 3-11, performing K convolutions on the (m+1)th feature map to obtain a feature map of the same size; subtracting the convolved feature map from the (m+1)th feature map to obtain the (m+1)th up-sampling residual feature map;
Step 3-14, adding 1 to m and repeating steps 3-12 and 3-13 until m = N, then going to step 3-15;
Step 3-15, applying convolution, a Leaky ReLU activation function and unpooling to the N feature maps representing high-level feature information generated during up-sampling, so as to increase the size of the feature maps and obtain the final network parameters;
The loss function D is computed as:
D = 1 − (2·Σ_{i=1..X} Σ_{j=1..Y} p_ij·g_ij) / (Σ_{i=1..X} Σ_{j=1..Y} p_ij + Σ_{i=1..X} Σ_{j=1..Y} g_ij)
where p_ij is the pixel value at coordinate (i, j) in the binary image produced by the network from the input prostate image, g_ij is the pixel value at coordinate (i, j) in the corresponding manually annotated standard segmentation image, and X and Y are respectively the length and width of the target region of the manually annotated standard segmentation image.
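Under the Dice reading of the formula reconstructed above (the smoothing term eps is an added assumption to avoid division by zero), the loss can be sketched as:

```python
import torch

def dice_loss(p: torch.Tensor, g: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Loss D: one minus the Dice overlap between the network's binary
    segmentation p and the manual annotation g, both of shape (X, Y)."""
    inter = (p * g).sum()
    return 1.0 - (2.0 * inter + eps) / (p.sum() + g.sum() + eps)
```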
Step 4: input the images of the test dataset into the trained network to obtain the prostate segmentation image for each MRI image.
Considering the limited number of images in the original dataset, semi-supervised learning can exploit the distribution information of the prostate region in the MRI images across all the data. Training is divided into two stages: in the first stage, a network model is generated by training mainly on the data with manually annotated standard prostate segmentation images and is used to predict the unannotated prostate MRI images, whose prediction results are then put into the initial training set; in the second stage, all the data are used for integrated training to generate the final network model.
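A schematic of the two-stage scheme (all names here — train_fn, predict_fn, and the list-based datasets — are placeholders standing in for an ordinary supervised training loop and a forward pass, not an API taken from the patent):

```python
def two_stage_training(model, labeled_pairs, unlabeled_images, train_fn, predict_fn):
    """Stage 1: train on manually annotated (image, mask) pairs, then
    pseudo-label the unannotated images. Stage 2: retrain on annotated
    plus pseudo-labeled data together to obtain the final model."""
    train_fn(model, labeled_pairs)                                    # stage 1
    pseudo_pairs = [(img, predict_fn(model, img)) for img in unlabeled_images]
    train_fn(model, labeled_pairs + pseudo_pairs)                     # stage 2
    return model
```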
Meanwhile, a two-dimensional network cannot acquire useful information from adjacent slices, while a three-dimensional network suffers from the anisotropy of MRI data (the through-plane resolution is much lower than the in-plane resolution), so convolution alone does not work well. Adaptive convolution is therefore introduced to adaptively modulate the convolution kernels with global complementary information, and an adaptive Transformer module is adopted to strengthen global semantic extraction and model long-range dependencies. Because this module performs well on skip connections of different scales, it can improve skip-connection-based U-shaped networks and thereby improve segmentation of prostate MRI images.
Those skilled in the art will appreciate that, in addition to being realized as pure computer-readable program code, the system, apparatus, and modules provided by the present invention can be realized entirely by logically programming the method steps so that they take the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. The system, apparatus, and modules provided by the present invention may therefore be regarded as hardware components; the modules they contain for implementing various programs may be regarded as structures within those hardware components, and modules for performing various functions may likewise be regarded simultaneously as software programs implementing the method and as structures within hardware components.
The foregoing has described specific embodiments of the present invention. It is to be understood that the invention is not limited to the specific embodiments described above; those skilled in the art may make various changes or modifications within the scope of the appended claims without departing from the spirit of the invention. Where no conflict arises, the embodiments of the present application and the features within the embodiments may be combined with one another arbitrarily.

Claims (10)

1. A prostate MRI image segmentation method based on self-adaptive multi-scale transform optimization is characterized by comprising the following steps:
step S1: acquiring a prostate MRI image dataset;
step S2: preprocessing the MRI images in the dataset to obtain preprocessed training and test sets;
step S3: initializing the parameters in the network, training the network equipped with the Transformer module on prostate images in which the prostate organ region has been annotated, and updating the parameters to obtain an automatic prostate segmentation model;
step S4: inputting the images of the test dataset into the trained network to obtain the prostate segmentation image of each MRI image.
2. The prostate MRI image segmentation method based on adaptive multi-scale transform optimization according to claim 1, wherein:
in the step S1:
the prostate MRI image dataset comprises a T1WI scanning sequence, and the annotated region on each image is the extent of the prostate organ, obtained by manual annotation;
in the step S2:
the preprocessing comprises performing histogram equalization on all MRI images in the training and test sets, and performing data enhancement on all MRI images in the training and test sets.
3. The method for prostate MRI image segmentation based on adaptive multi-scale transform optimization according to claim 1, wherein in the step S3:
the parameters in the network mainly comprise weights, biases, the number of convolution kernels, the size of the convolution kernels, the number of Transformer modules, and the number of Dropout layers;
the network with the Transformer module is trained on the prostate organ regions annotated by a physician in the prostate images, the parameters in the network are updated, and an automatic prostate segmentation model is obtained, the model structure comprising a downsampling module, an upsampling module, and a Transformer-based adaptive multi-scale attention module.
4. The prostate MRI image segmentation method based on adaptive multi-scale transform optimization according to claim 3, wherein:
step S3.1: the Transformer-based network comprises N downsampling layers, N upsampling layers, 1 Leaky ReLU layer, and M Transformer modules;
step S3.2: convolving the input prostate image K times to obtain a feature map of the same size as the input prostate image, and subtracting the convolved feature map from the input image to obtain the 1st residual feature map, where K is a positive integer;
step S3.3: downsampling the 1st residual feature map to obtain the 1st downsampled feature map; convolving the 1st downsampled feature map K times to obtain a feature map of the same size; subtracting the K-fold convolution result from the 1st downsampled feature map to obtain the 2nd residual feature map;
step S3.4: repeating step S3.3, performing one convolutional downsampling of the nth residual feature map to obtain the nth downsampled feature map; convolving the nth downsampled feature map M times to obtain a feature map of the same size; subtracting the M-fold convolution result from the nth downsampled feature map to obtain the (n+1)th residual feature map, where n has initial value 2;
step S3.5: incrementing n by 1 and repeating step S3.4; when n = N+1, going to step S3.6;
step S3.6: after the image has been downsampled, the l-slice stack of the input image is represented as a four-dimensional input:
x ∈ R^(l×h×w×c)
where h is the feature-map height, w the width, c the number of channels, x the four-dimensional input, R the set of real numbers, and l the number of slices in the stack;
exploiting the high correlation of nearby pixels in the feature map, x is downsampled by average pooling with kernel size k to generate a compressed feature map:
x_pool = AvgPool_k(x), x_pool ∈ R^(l×(h/k)×(w/k)×c)
where x_pool is the pooled four-dimensional input;
step S3.7: calculating the linear projection Q′ of the compressed feature map for the query and the linear projection K′ for the key, and denoting the channel-wise linear projection of x by V, with the calculation formulas:
Q′ = x_pool·W_Q
K′ = x_pool·W_K
V = x·W_V
where W_Q is the linear projection weight of the query, W_K the linear projection weight of the key, and W_V the weight of the V linear projection;
step S3.8: calculating the attention matrix A ∈ R^(l×l), which represents how much attention should be paid to the other slices when the current slice is segmented, with the calculation formula:
A = softmax(Q′K′^T / √d)
where Q′ is the query projection, K′ the key projection, and d their dimension;
the output of the cross-slice attention block is y ∈ R^(l×h×w×c), y = AV;
step S3.9: embedding the cross-slice attention module into a Transformer module, the output z of which is calculated as:
z = Layer_Norm(GELU(z_int·W + b) + z_int)
where z_int is the intermediate result output by the Transformer module, W the corresponding weight matrix, and b a bias;
step S3.10: upsampling the (N+1)th residual feature map once to obtain the 1st upsampled feature map; from the 1st through (N+1)th residual feature maps, adding the residual feature map whose size matches the 1st upsampled feature map to it, obtaining the 1st feature map;
step S3.11: convolving the 1st feature map K times to obtain a feature map of the same size; subtracting the K-fold convolution result from the 1st feature map to obtain the 1st upsampling residual feature map;
step S3.12: upsampling the mth upsampling residual feature map once to obtain the (m+1)th upsampled feature map; from the 1st through (N+1)th residual feature maps, adding the residual feature map whose size matches the (m+1)th upsampled feature map to it, obtaining the (m+1)th feature map, where m has initial value 1;
step S3.13: convolving the (m+1)th feature map K times to obtain a feature map of the same size; subtracting the K-fold convolution result from the (m+1)th feature map to obtain the (m+1)th upsampling residual feature map;
step S3.14: incrementing m by 1 and repeating steps S3.12 and S3.13 until m = N, then going to step S3.15;
step S3.15: applying convolution, a Leaky ReLU activation function, and unpooling to the N feature maps representing high-level feature information generated during upsampling, enlarging the feature maps and obtaining the network parameters;
the loss function D is calculated as:
D = 1 − (2·Σ_{i=1..X} Σ_{j=1..Y} p_ij·g_ij) / (Σ_{i=1..X} Σ_{j=1..Y} p_ij + Σ_{i=1..X} Σ_{j=1..Y} g_ij)
where i is the abscissa of a pixel, j is the ordinate of a pixel, p_ij is the pixel value at coordinate (i, j) in the binary image produced by the network from the input prostate image, g_ij is the pixel value at coordinate (i, j) in the corresponding manually annotated standard segmentation image, and X and Y are respectively the length and width of the target region of the manually annotated standard segmentation image.
5. The method for prostate MRI image segmentation based on adaptive multi-scale transform optimization according to claim 1, wherein in the step S4:
performing semi-supervised learning by using the distribution information of the prostate region in the MRI images across all the data; the training is divided into two stages:
the first stage: training on the data with manually annotated standard prostate segmentation images to generate a network model, predicting the unannotated prostate MRI images, and putting the prediction results into the initial training set;
the second stage: performing integrated training with all the data to generate the network model;
adaptive convolution is introduced to adaptively modulate the convolution kernels with global complementary information, and an adaptive Transformer module is adopted to strengthen global semantic extraction and model long-range dependencies, so that the module performs better on skip-connection networks of different scales.
6. A prostate MRI image segmentation system based on adaptive multi-scale transform optimization, comprising:
a module M1: acquiring a prostate MRI image dataset;
a module M2: preprocessing the MRI images in the dataset to obtain preprocessed training and test sets;
a module M3: initializing the parameters in the network, training the network equipped with the Transformer module on prostate images in which the prostate organ region has been annotated, and updating the parameters to obtain an automatic prostate segmentation model;
a module M4: inputting the images of the test dataset into the trained network to obtain the prostate segmentation image of each MRI image.
7. The adaptive multi-scale transform optimization-based prostate MRI image segmentation system of claim 6, wherein:
in the module M1:
the prostate MRI image dataset comprises a T1WI scanning sequence, and the annotated region on each image is the extent of the prostate organ, obtained by manual annotation;
in said module M2:
the preprocessing comprises performing histogram equalization on all MRI images in the training and test sets, and performing data enhancement on all MRI images in the training and test sets.
8. The system for prostate MRI image segmentation based on adaptive multi-scale transform optimization according to claim 6, wherein in the module M3:
the parameters in the network mainly comprise weights, biases, the number of convolution kernels, the size of the convolution kernels, the number of Transformer modules, and the number of Dropout layers;
the network with the Transformer module is trained on the prostate organ regions annotated by a physician in the prostate images, the parameters in the network are updated, and an automatic prostate segmentation model is obtained, the model structure comprising a downsampling module, an upsampling module, and a Transformer-based adaptive multi-scale attention module.
9. The adaptive multi-scale transform optimization-based prostate MRI image segmentation system of claim 8, wherein:
module M3.1: the Transformer-based network comprises N downsampling layers, N upsampling layers, 1 Leaky ReLU layer, and M Transformer modules;
module M3.2: convolving the input prostate image K times to obtain a feature map of the same size as the input prostate image, and subtracting the convolved feature map from the input image to obtain the 1st residual feature map, where K is a positive integer;
module M3.3: downsampling the 1st residual feature map to obtain the 1st downsampled feature map; convolving the 1st downsampled feature map K times to obtain a feature map of the same size; subtracting the K-fold convolution result from the 1st downsampled feature map to obtain the 2nd residual feature map;
module M3.4: repeating module M3.3, performing one convolutional downsampling of the nth residual feature map to obtain the nth downsampled feature map; convolving the nth downsampled feature map M times to obtain a feature map of the same size; subtracting the M-fold convolution result from the nth downsampled feature map to obtain the (n+1)th residual feature map, where n has initial value 2;
module M3.5: incrementing n by 1 and repeating module M3.4; when n = N+1, going to module M3.6;
module M3.6: after the image has been downsampled, the l-slice stack of the input image is represented as a four-dimensional input:
x ∈ R^(l×h×w×c)
where h is the feature-map height, w the width, c the number of channels, x the four-dimensional input, R the set of real numbers, and l the number of slices in the stack;
exploiting the high correlation of nearby pixels in the feature map, x is downsampled by average pooling with kernel size k to generate a compressed feature map:
x_pool = AvgPool_k(x), x_pool ∈ R^(l×(h/k)×(w/k)×c)
where x_pool is the pooled four-dimensional input;
module M3.7: calculating the linear projection Q′ of the compressed feature map for the query and the linear projection K′ for the key, and denoting the channel-wise linear projection of x by V, with the calculation formulas:
Q′ = x_pool·W_Q
K′ = x_pool·W_K
V = x·W_V
where W_Q is the linear projection weight of the query, W_K the linear projection weight of the key, and W_V the weight of the V linear projection;
module M3.8: calculating the attention matrix A ∈ R^(l×l), which represents how much attention should be paid to the other slices when the current slice is segmented, with the calculation formula:
A = softmax(Q′K′^T / √d)
where Q′ is the query projection, K′ the key projection, and d their dimension;
the output of the cross-slice attention block is y ∈ R^(l×h×w×c), y = AV;
module M3.9: embedding the cross-slice attention module into a Transformer module, the output z of which is calculated as:
z = Layer_Norm(GELU(z_int·W + b) + z_int)
where z_int is the intermediate result output by the Transformer module, W the corresponding weight matrix, and b a bias;
module M3.10: upsampling the (N+1)th residual feature map once to obtain the 1st upsampled feature map; from the 1st through (N+1)th residual feature maps, adding the residual feature map whose size matches the 1st upsampled feature map to it, obtaining the 1st feature map;
module M3.11: convolving the 1st feature map K times to obtain a feature map of the same size; subtracting the K-fold convolution result from the 1st feature map to obtain the 1st upsampling residual feature map;
module M3.12: upsampling the mth upsampling residual feature map once to obtain the (m+1)th upsampled feature map; from the 1st through (N+1)th residual feature maps, adding the residual feature map whose size matches the (m+1)th upsampled feature map to it, obtaining the (m+1)th feature map, where m has initial value 1;
module M3.13: convolving the (m+1)th feature map K times to obtain a feature map of the same size; subtracting the K-fold convolution result from the (m+1)th feature map to obtain the (m+1)th upsampling residual feature map;
module M3.14: incrementing m by 1 and repeating modules M3.12 and M3.13 until m = N, then going to module M3.15;
module M3.15: applying convolution, a Leaky ReLU activation function, and unpooling to the N feature maps representing high-level feature information generated during upsampling, enlarging the feature maps and obtaining the network parameters;
the loss function D is calculated as:
D = 1 − (2·Σ_{i=1..X} Σ_{j=1..Y} p_ij·g_ij) / (Σ_{i=1..X} Σ_{j=1..Y} p_ij + Σ_{i=1..X} Σ_{j=1..Y} g_ij)
where i is the abscissa of a pixel, j is the ordinate of a pixel, p_ij is the pixel value at coordinate (i, j) in the binary image produced by the network from the input prostate image, g_ij is the pixel value at coordinate (i, j) in the corresponding manually annotated standard segmentation image, and X and Y are respectively the length and width of the target region of the manually annotated standard segmentation image.
10. The system for prostate MRI image segmentation based on adaptive multi-scale transform optimization according to claim 6, wherein in the module M4:
performing semi-supervised learning by using the distribution information of the prostate region in the MRI images across all the data; the training is divided into two stages:
the first stage: training on the data with manually annotated standard prostate segmentation images to generate a network model, predicting the unannotated prostate MRI images, and putting the prediction results into the initial training set;
the second stage: performing integrated training with all the data to generate the network model;
adaptive convolution is introduced to adaptively modulate the convolution kernels with global complementary information, and an adaptive Transformer module is adopted to strengthen global semantic extraction and model long-range dependencies, so that the module performs better on skip-connection networks of different scales.
CN202210610244.6A 2022-05-31 2022-05-31 Prostate MRI (magnetic resonance imaging) image segmentation method and system based on self-adaptive multi-scale transform optimization Pending CN115272170A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210610244.6A CN115272170A (en) 2022-05-31 2022-05-31 Prostate MRI (magnetic resonance imaging) image segmentation method and system based on self-adaptive multi-scale transform optimization

Publications (1)

Publication Number Publication Date
CN115272170A true CN115272170A (en) 2022-11-01

Family

ID=83760195

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210610244.6A Pending CN115272170A (en) 2022-05-31 2022-05-31 Prostate MRI (magnetic resonance imaging) image segmentation method and system based on self-adaptive multi-scale transform optimization

Country Status (1)

Country Link
CN (1) CN115272170A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115619810A (en) * 2022-12-19 2023-01-17 中国医学科学院北京协和医院 Prostate partition method, system and equipment
CN115619810B (en) * 2022-12-19 2023-10-03 中国医学科学院北京协和医院 Prostate partition segmentation method, system and equipment
CN116596764A (en) * 2023-07-17 2023-08-15 华侨大学 Lightweight image super-resolution method based on transform and convolution interaction
CN116596764B (en) * 2023-07-17 2023-10-31 华侨大学 Lightweight image super-resolution method based on transform and convolution interaction
CN117131348A (en) * 2023-10-27 2023-11-28 深圳中科保泰科技有限公司 Data quality analysis method and system based on differential convolution characteristics
CN117131348B (en) * 2023-10-27 2024-02-09 深圳中科保泰科技有限公司 Data quality analysis method and system based on differential convolution characteristics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination