CN114596318A - Breast cancer magnetic resonance imaging focus segmentation method based on Transformer

Breast cancer magnetic resonance imaging focus segmentation method based on Transformer

Info

Publication number
CN114596318A
CN114596318A CN202210277852.XA
Authority
CN
China
Prior art keywords
transformer
breast cancer
encoder
transbc
model
Prior art date
Legal status
Withdrawn
Application number
CN202210277852.XA
Other languages
Chinese (zh)
Inventor
邵叶秦
许昌炎
桑子江
盛美红
Current Assignee
Nantong University
Original Assignee
Nantong University
Priority date
Filing date
Publication date
Application filed by Nantong University filed Critical Nantong University
Priority to CN202210277852.XA priority Critical patent/CN114596318A/en
Publication of CN114596318A publication Critical patent/CN114596318A/en
Withdrawn legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10072 Tomographic images
    • G06T 2207/10088 Magnetic resonance imaging [MRI]
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30004 Biomedical image processing
    • G06T 2207/30068 Mammography; Breast

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Magnetic Resonance Imaging Apparatus (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a Transformer-based breast cancer magnetic resonance imaging lesion segmentation method, relating to the fields of intelligent healthcare and deep learning. The technical key points are: constructing TransBC, an MRI lesion segmentation model that combines a Transformer with 3D convolution, whose network is an encoder-decoder structure; the encoder-decoder structure is divided into a down-sampling stage and an up-sampling stage, where the down-sampling stage is a CNN encoder used to extract feature representations at different levels, and the up-sampling stage uses a Transformer encoder to repeatedly extract long-range dependencies from the high-resolution feature maps, supplementing and correcting the low-resolution CNN features. The core of the model is to encode the high-resolution feature maps with a Transformer and extract long-range dependencies to supplement and correct the low-resolution CNN features. The model handles lesion edges more accurately and also segments difficult samples with uneven gray values inside the lesion more effectively.

Description

Breast cancer magnetic resonance imaging focus segmentation method based on Transformer
Technical Field
The invention relates to the technical fields of intelligent healthcare and deep learning, and in particular to a Transformer-based breast cancer magnetic resonance imaging lesion segmentation method.
Background
At present, breast cancer has become a leading threat to women's health. According to the latest 2020 data from the International Agency for Research on Cancer (IARC), the number of new breast cancer cases reached 2.26 million, compared with 2.2 million for lung cancer; breast cancer has thus officially overtaken lung cancer as the most common cancer worldwide. Various imaging examinations can be used to effectively assess and diagnose the condition. In practice, dynamic contrast-enhanced MRI (DCE-MRI) of the breast has the best diagnostic efficacy, with particular advantages in finding microscopic lesions, multicentric and multifocal disease, and in evaluating the extent of lesions. With the rise and development of deep learning, researchers hope to segment medical images automatically with AI to assist physicians in diagnosis. Compared with traditional machine learning methods, convolutional networks have clear advantages in extracting deep features: the weight sharing of the convolution operation brings translation invariance of the features, and the nature of the convolution operator brings good local sensitivity.
At present, convolutional neural networks (CNNs) have become the standard method for medical image segmentation tasks. The U-Net model made fully convolutional networks and encoder-decoder architectures a new paradigm. However, the advantages of the convolution operation also bring inherent drawbacks, namely a limited receptive field and a fixed spatial inductive bias, so that it cannot capture global-context relationships. The self-attention mechanism in the Transformer can dynamically adjust the receptive field according to the input content and is better suited than convolution to modeling long-range dependencies. However, recently proposed Transformer-based medical image segmentation methods simply treat the Transformer as an auxiliary module and do not effectively combine the self-attention mechanism with convolution.
Disclosure of Invention
The invention aims to solve the problems and provides a method for segmenting a breast cancer magnetic resonance imaging focus based on a Transformer.
In order to achieve the purpose, the invention adopts the following technical scheme:
a Transformer-based breast cancer magnetic resonance imaging lesion segmentation method, characterized in that a TransBC is constructed, wherein TransBC is an MRI lesion segmentation model based on a Transformer combined with 3D convolution, and the network of TransBC is an encoder-decoder structure; the encoder-decoder structure is divided into a down-sampling stage and an up-sampling stage,
the down-sampling stage is a CNN encoder used to extract feature representations at different levels;
and the up-sampling stage uses a Transformer encoder to repeatedly extract long-range dependencies from the high-resolution feature maps, supplementing and correcting the low-resolution CNN features.
Preferably, it comprises the following steps:
s1: collecting dynamic contrast-enhanced breast MRI (DCE-MRI) data and preprocessing the data;
s2: constructing a TransBC network;
s3: constructing an encoder of a TransBC network, wherein the encoder of the TransBC network comprises a bottleneck module and a down-sampling module;
s4: constructing a decoder of a TransBC network, wherein the decoder of the TransBC network comprises a Transformer module, a feature fusion module and an up-sampling module;
s5: and training and testing the TransBC network by using the training set and the testing set obtained in the step S1.
Preferably, the preprocessing in S1 includes the following steps: collecting the patient breast cancer MRI data provided by a hospital, resampling the MRI images so that the spatial spacing is 1 mm, then cropping the MRI images so that the cropped images have a uniform size of (64, 64, 64); after the data preprocessing in S1 is completed, the collected data are divided into a training set and a testing set.
Preferably, the S3 includes the following steps:
3-1: using a CNN encoder F_CNN(·);
3-1-1: constructing a bottleneck block, wherein the bottleneck block is designed using the classical residual structure from ResNet;
3-1-2: constructing a down-sampling block, wherein the down-sampling block is composed of 3D convolutional layers;
3-1-3: setting the activation function of the convolution operations in the encoder F_CNN(·) to the ReLU function, defined as out(in) = max(0, in); the convolution kernel size is set to 2 × 2 and the stride to 2.
3-2: an input image x, with height H, width W, depth D and C channels, is passed through F_CNN(·); the resulting feature maps are f_l = F_CNN(x), where l indexes the encoder level.
preferably, the S4 includes the following steps:
4-1: constructing a Transformer module;
4-2: designing a feature fusion module by referring to the CBAM;
4-3: constructing an up-sampling module that progressively restores the low-resolution feature maps to the original size.
Preferably, said 4-1 comprises the steps of:
4-1-1: determining the input of a Transformer module;
4-1-2: the input of the Transformer module is a 3D image block x ∈ R^(H×W×D×C), where H, W, D and C respectively denote its height, width, depth and number of channels;
4-1-3: adding position codes and using learnable position codes;
4-1-4: the Transformer encoder comprises a multi-head self-attention block and a multi-layer perceptron block, wherein the self-attention block is responsible for completing the computation of query-key-value attention.
Preferably, the 4-1-2 comprises the following steps: partitioning the picture, partitioning the feature map x along three dimensions of width, height and depth, and stacking the blocks; a scaling strategy for the block side lengths is then used.
Preferably, the step S5 includes the steps of:
5-1: determining the basic architecture of the TransBC network, and initializing the connection weights of each network component, the number of residual units, the number of convolutional layers, the learning rate, the training step size, the optimizer, the number of iterations and the training batch size;
5-2: encoder F for inputting training set divided by S1 into TransBC networkCNN(. to obtain a down-sampled output XS
5-3: decoding the down-sampling result by using a Transformer module, a characteristic fusion module and an up-sampling module of a decoder part to obtain a model output value XU
5-4: evaluating the accuracy of model segmentation by adopting Dice, IoU and accuracy;
5-5: training the model according to the number of iterations set in step 5-1, and verifying the segmentation effect of the model with the test set.
Preferably, in the 5-4:
the formula of Dice is:
Dice = 2|GT ∩ Pred| / (|GT| + |Pred|)
wherein GT represents the gold-standard binary image manually labeled by an expert, and Pred is the model prediction result. The value of Dice lies in [0, 1]; the closer Dice is to 1, the higher the overlap with the gold standard;
the formula of IoU is:
IoU = |GT ∩ Pred| / |GT ∪ Pred|
like Dice, IoU measures the overlap between the network prediction image and the gold standard;
the formula of accuracy is:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
wherein TP represents true positives; TN represents true negatives; FP and FN represent false positives and false negatives.
The application also provides an MRI lesion segmentation model constructed using the above Transformer-based breast cancer magnetic resonance imaging lesion segmentation method, wherein the MRI lesion segmentation model is a 3D medical image segmentation model based on a Transformer combined with 3D convolution, and the network of the MRI lesion segmentation model adopts an encoder-decoder structure; the encoder-decoder is divided into a down-sampling stage and an up-sampling stage.
The Transformer-based breast cancer magnetic resonance imaging lesion segmentation method differs from prior-art Transformer-based medical image segmentation methods, which simply treat the Transformer as an auxiliary module: the TransBC in this method effectively exploits the information extracted by the CNN and the Transformer, respectively. The network continues to use an encoder-decoder structure, uses a CNN to extract feature representations at different levels, and uses a Transformer encoder to repeatedly extract long-range dependencies from the feature maps. Meanwhile, a fusion module that can make full use of the Transformer features and the CNN features is designed to extend the skip connections in the classical encoder-decoder structure. The core of the method is to encode the high-resolution feature maps with a Transformer and extract long-range dependencies to supplement and correct the low-resolution CNN features, so that complex lesions and lesion edges are handled more accurately.
Drawings
FIG. 1 is a schematic diagram illustrating steps of a method for segmenting a breast cancer magnetic resonance imaging lesion based on a Transformer according to the present invention;
FIG. 2 is a flowchart of a method for segmenting a lesion of breast cancer by magnetic resonance imaging based on a Transformer according to the present invention;
FIG. 3 is a model structure diagram of a method for segmenting a breast cancer magnetic resonance imaging lesion based on a Transformer according to the present invention;
FIG. 4 is a block diagram of a Transformer-based magnetic resonance imaging lesion segmentation method for breast cancer according to the present invention;
FIG. 5 is a structural diagram of the feature fusion module of the Transformer-based breast cancer MRI lesion segmentation method according to the present invention;
FIG. 6 shows the lesion-edge segmentation results of the Transformer-based breast cancer MRI lesion segmentation method according to the present invention. FIG. 6 shows the predictive power of the model for the edge slices of a malignant mass. The first row corresponds to the 1st, 2nd, 3rd, 4th, 5th, 6th, 39th, 40th, 41st, 42nd and 43rd coronal slices of the lesion, the second row shows the model segmentation results, and the last row shows the doctor-annotated labels (ground truth) corresponding to the slices.
FIG. 7 shows the segmentation results of the Transformer-based breast cancer MRI lesion segmentation method of the present invention for a complex lesion. FIG. 7 illustrates the predictive power of the model for lesion slices with non-uniform internal gray values and alternating bright and dark regions. The structure of FIG. 7 is the same as that of FIG. 6. The first row corresponds to the 28th to 37th coronal slices of the mass, which are its central part.
FIG. 8 shows the experimental results of the Transformer-based breast cancer MRI lesion segmentation method of the present invention on the test set.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those specifically described herein, and thus the present invention is not limited to the specific embodiments of the present disclosure.
A Transformer-based breast cancer magnetic resonance imaging lesion segmentation method, referring to FIG. 1 and FIG. 2, comprises the following steps:
S1: collecting dynamic contrast-enhanced breast MRI (DCE-MRI) data and preprocessing the data;
specifically, in one embodiment, each patient contains 4 MRI images, dynamic first, second, fourth and sixth phases, respectively, wherein the first phase of the dynamic scan is a mask before contrast enhancement is effective, contrast is effective from the second phase of the scan sequence, and then the next phase of the dynamic scan is performed at intervals (90 seconds to 120 seconds). And then preprocessing the data to reduce the redundancy and complexity of the data.
After preprocessing, the MRI data are divided into a training set and a testing set at a fixed ratio. In one embodiment, the data are dynamic contrast-enhanced breast MRI data of 200 patients provided by a hospital, of which 95 are patients with benign tumors and 105 are patients with malignant tumors.
The preprocessing step collects the patient breast cancer MRI data provided by the hospital, D_BC = {id_i, img_i1, img_i2, img_i4, img_i6, seg_i}, where id_i is the identifier of the i-th patient, img_i1, img_i2, img_i4 and img_i6 correspond to the first-, second-, fourth- and sixth-phase MR images of the dynamic scan of the i-th patient, and seg_i is the labeled binary image of the lesion in the MR images. Meanwhile, to reduce errors caused by measurement conversion, in one embodiment the MRI images are resampled so that the spatial spacing is 1 mm. Considering the differences between patients and the computational cost of Transformer encoding of 3D medical images, in one embodiment a cropping operation is performed on the MRI images, and the cropped images are uniformly sized (64, 64, 64). After data preprocessing, D_BC is divided into a training set and a testing set at a fixed ratio.
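For illustration only, the following is a minimal sketch of the resampling and cropping step described above. It assumes SimpleITK and NumPy as the processing libraries and a hypothetical file path; neither the libraries nor the exact cropping strategy are specified in the present application.

```python
# Illustrative sketch (assumed libraries: SimpleITK, NumPy; hypothetical file path).
# Resample an MR volume to 1 mm spacing and crop/pad it to (64, 64, 64).
import SimpleITK as sitk
import numpy as np

def resample_to_1mm(image: sitk.Image) -> sitk.Image:
    """Resample a 3D MR image so that its voxel spacing is 1 mm in every direction."""
    old_spacing = image.GetSpacing()
    old_size = image.GetSize()
    new_spacing = (1.0, 1.0, 1.0)
    new_size = [int(round(sz * sp)) for sz, sp in zip(old_size, old_spacing)]
    return sitk.Resample(image, new_size, sitk.Transform(), sitk.sitkLinear,
                         image.GetOrigin(), new_spacing, image.GetDirection(),
                         0.0, image.GetPixelID())

def crop_to_64(volume: np.ndarray) -> np.ndarray:
    """Crop (or zero-pad) a 3D array to a (64, 64, 64) region around its centre."""
    out = np.zeros((64, 64, 64), dtype=volume.dtype)
    starts = [max((s - 64) // 2, 0) for s in volume.shape]
    crop = volume[starts[0]:starts[0] + 64,
                  starts[1]:starts[1] + 64,
                  starts[2]:starts[2] + 64]
    out[:crop.shape[0], :crop.shape[1], :crop.shape[2]] = crop
    return out

img = resample_to_1mm(sitk.ReadImage("patient_001_phase2.nii.gz"))  # hypothetical path
vol = crop_to_64(sitk.GetArrayFromImage(img))
```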
S2: constructing an MRI focus segmentation model-TransBC network based on Transformer combined with 3D convolution, wherein the network adopts an encoder-decoder structure; a classical encoder-decoder is divided into a downsampling stage and an upsampling stage.
S3: constructing an encoder of a TransBC network;
the encoder of the transBC network comprises a bottleneck module and a down-sampling module.
The specific steps are as follows:
3-1: using CNN encoders FCNN(·),
Specifically, in one embodiment, the step 3-1 comprises the following steps:
3-1-1: constructing a bottleneck block:
the bottleneck block is designed using the classical residual structure in ResNet.
Specifically, the bottleneck block comprises three repeated stages of a 3D convolutional layer, batch normalization and a ReLU layer, with a short skip-connection design adopted in the last stage.
Compared with 2D convolution, 3D convolution performs feature extraction on 3-dimensional spatial data. The convolution formula is generally defined as X_l = f(W_l * X_{l-1}; b_l), where X_{l-1} and X_l are the input and output of the l-th convolutional layer, and W_l and b_l are the convolution kernel parameters and the bias term of the l-th layer;
the residual structure is introduced to avoid the reduction of the precision of the model along with the increase of the convolution layer number, and the residual unit operation is defined as Zl=Zl-1+F(Zl-1;θl) Wherein Z isl-1And ZlInput and output of the l-th layer residual operation layer, thetalIs the set of parameters in the layer i residual operation.
3-1-2: a downsample block is constructed.
The down-sampling block is composed of 3D convolutional layers; the convolution operation formula is as given in step 3-1-1;
3-1-3: setting an encoder FCNNThe activation function of the convolution operation in (-) is the ReLU function, which is defined as: out (in) max (0, in); the convolution kernel size is set to 2 x 2, step size is 2.
3-2: inputting pictures
Figure BDA0003556742570000091
Through FCNNThe formula of the characteristic diagram after the operation is as follows:
Figure BDA0003556742570000092
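For illustration only, the following sketch assembles the bottleneck and down-sampling blocks into a CNN encoder F_CNN(·). The number of encoder levels, the channel widths and the use of 4 input channels (one per dynamic phase) are assumptions of this sketch; it reuses the BottleneckBlock from the sketch above.

```python
# Illustrative CNN encoder F_CNN(.): at each level a bottleneck block is followed by a
# strided Conv3d (kernel 2, stride 2) for down-sampling. Channel widths, level count and
# the 4 input channels (one per dynamic phase) are assumptions of this sketch.
import torch
import torch.nn as nn

class DownsampleBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.down = nn.Sequential(
            nn.Conv3d(in_ch, out_ch, kernel_size=2, stride=2),
            nn.ReLU(inplace=True))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(x)

class CNNEncoder(nn.Module):
    def __init__(self, in_ch: int = 4, widths=(32, 64, 128, 256)):
        super().__init__()
        self.stem = nn.Conv3d(in_ch, widths[0], kernel_size=3, padding=1)
        self.levels = nn.ModuleList()
        for i, w in enumerate(widths):
            nxt = widths[min(i + 1, len(widths) - 1)]
            self.levels.append(nn.ModuleDict({
                "bottleneck": BottleneckBlock(w),     # sketched above
                "down": DownsampleBlock(w, nxt)}))

    def forward(self, x: torch.Tensor):
        feats = []                                    # multi-level features f_l for the decoder
        x = self.stem(x)
        for level in self.levels:
            x = level["bottleneck"](x)
            feats.append(x)
            x = level["down"](x)
        return feats, x
```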
s4: a decoder of the TransBC network is constructed.
The decoder of the TransBC network comprises a Transformer module, a feature fusion module and an upsampling module, and the S4 comprises the following steps:
4-1: constructing a Transformer module:
referring to fig. 4, fig. 4 is a structure of a Transformer module for capturing long-distance dependency of features and performing global information correction and spatial information complementation on a feature map extracted from CNN encoder branches.
The specific steps of the step 4-1 are as follows:
4-1-1: determining the input of a Transformer module:
TransBC uses a Transformer module at each skip-connection stage.
Assume the Transformer encoder is F_Tran(·) and the feature fusion module is F_Fusion(·); each fused feature f_fus is then given by:
f_fus = F_Fusion(f_{l+1}, F_Tran(f_l))
where f_l and f_{l+1} are the feature maps of adjacent levels. In one embodiment, the input of the Transformer module is not f_l but the preceding layer f_{l-1}; compared with f_l, f_{l-1} has undergone one fewer convolution operation and therefore contains more spatial detail information;
4-1-2: the input of the Transformer module is a 3D image block x ∈ R^(H×W×D×C), where H, W, D and C respectively denote the height, width, depth and number of channels.
The size of the 3D image is (H, W, D, C). In one embodiment, to satisfy the input requirement of the Transformer encoder, the image needs to be partitioned into blocks, i.e. serialized. The feature map x is partitioned along the width, height and depth dimensions, and the blocks are stacked. Compared with using convolutional layers for block partitioning, this processing reduces information loss.
In one embodiment, to reduce the computational load of the Transformer encoder, a scaling strategy for the block side length is used. Through experimental comparison, and considering the balance between time and space complexity, the side length of a small cube in the present application is 1/8 of the side length of the original cube, so the number of blocks is 8 × 8 × 8 = 512.
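For illustration only, the following PyTorch sketch partitions a feature map into non-overlapping cubes whose side length is 1/8 of the original side length and stacks them as a token sequence of 8 × 8 × 8 = 512 blocks, as described above. The (B, C, H, W, D) tensor layout is an assumption.

```python
# Illustrative 3D block partition (PyTorch assumed, (B, C, H, W, D) layout assumed):
# split the feature map into cubes whose side length is 1/8 of the original side length
# and stack them as a sequence of 8*8*8 = 512 tokens.
import torch

def partition_blocks(x: torch.Tensor, splits: int = 8) -> torch.Tensor:
    b, c, h, w, d = x.shape
    ah, aw, ad = h // splits, w // splits, d // splits             # block side lengths
    x = x.reshape(b, c, splits, ah, splits, aw, splits, ad)
    x = x.permute(0, 2, 4, 6, 3, 5, 7, 1)                          # (B, 8, 8, 8, ah, aw, ad, C)
    return x.reshape(b, splits ** 3, ah * aw * ad * c)             # (B, 512, a^3 * C)

tokens = partition_blocks(torch.randn(2, 32, 16, 16, 16))          # toy example: (2, 512, 256)
```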
4-1-3: in order to preserve the spatial information of the image sequence, a position code is added, a learnable position code is used, the dimension of which is consistent with the dimension in 4-1-2:
Figure BDA0003556742570000111
wherein the content of the first and second substances,
Figure BDA0003556742570000112
for the purpose of embedding the projection in a block,
Figure BDA0003556742570000113
for position coding, a represents the side length of the block, n is the number of the blocks, and b is the length of the embedded vector;
4-1-4: the transform encoder includes a multi-headed self-attention block and a multi-layered perceptron block.
The multi-head self-attention block is responsible for completing the calculation of query-key-value attention, and vectors Q, K and V come from the same input. The calculation formula is as follows,
Figure BDA0003556742570000114
the specific calculation process can be disassembled into the following steps:
(1) Calculate the similarity between Q and K by dot product; K is transposed to satisfy the matrix multiplication rule. The result is then normalized, i.e. divided by √d_k, where d_k denotes the length of the vector K.
(2) The similarity is converted into a probability distribution by the softmax function.
(3) The probability distribution is multiplied with V, so that, through the attention mechanism, the encoded output of each input image sequence incorporates the encoded information of the other image sequences.
The multi-head self-attention block and the multi-layer perceptron block form one encoding layer of the Transformer encoder, and the Transformer encoder generally comprises N such encoding layers. The encoding formulas are:
z'_i = MSA(z_{i-1}) + z_{i-1}
z_i = MLP(z'_i) + z'_i,  i = 1, …, N
where MSA denotes the multi-head self-attention block and MLP the multi-layer perceptron block.
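For illustration only, the following PyTorch sketch stacks N encoding layers, each consisting of a multi-head self-attention block and a multi-layer perceptron block with residual connections. The use of layer normalization, the number of heads and the MLP expansion ratio are assumptions not specified in the present application.

```python
# Illustrative Transformer encoder (PyTorch assumed): N encoding layers, each a multi-head
# self-attention block followed by a multi-layer perceptron block, with residual connections.
# Layer normalisation, head count and MLP ratio are assumptions of this sketch.
import torch
import torch.nn as nn

class EncoderLayer(nn.Module):
    def __init__(self, dim: int, heads: int = 8, mlp_ratio: int = 4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * mlp_ratio), nn.GELU(),
            nn.Linear(dim * mlp_ratio, dim))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        q = self.norm1(z)
        attn_out, _ = self.attn(q, q, q)      # Q, K, V all come from the same input
        z = z + attn_out                      # z'_i = MSA(z_{i-1}) + z_{i-1}
        z = z + self.mlp(self.norm2(z))       # z_i  = MLP(z'_i) + z'_i
        return z

class TransformerEncoder(nn.Module):
    def __init__(self, dim: int, depth: int = 4):
        super().__init__()
        self.layers = nn.ModuleList(EncoderLayer(dim) for _ in range(depth))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            z = layer(z)
        return z
```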
4-1-5: the dimension of the output characteristic diagram of the Transformer encoder is (N, D), and the dimension can be changed into a dimension by using a related dimension adjusting function in a deep learning framework
Figure BDA0003556742570000122
4-2: the feature fusion module is designed by referring to CBAM, and the structure is shown in FIG. 5.
The design of the feature fusion module mainly considers two aspects: the difference between the CNN and Transformer encoding modes, and the difference in resolution between the CNN feature map and the Transformer feature map.
The inputs of the fusion module come from the CNN encoder module and the Transformer module, respectively. The structure of the fusion module is shown in FIG. 5. The local feature l and the global feature g are added element-wise and fed into a bottleneck structure block; from the output of the bottleneck block, the importance of the features is computed from the channel and the spatial-position perspectives. The operation of the feature fusion module is as follows:
f = bottleneck(l + g)
ch_cof = avg_pooling(f) + max_pooling(f)
sp_cof = softmax(f)
ff = f · ch_cof + f · sp_cof.
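For illustration only, the following PyTorch sketch follows the four operations listed above. The choice of pooling operators, the axis of the softmax and the reuse of the bottleneck block sketched earlier are assumptions.

```python
# Illustrative feature fusion module (PyTorch assumed): f = bottleneck(l + g), followed by
# channel weighting (avg + max pooling) and spatial weighting (softmax over positions),
# ff = f*ch_cof + f*sp_cof. The softmax axis and the reuse of BottleneckBlock are assumptions.
import torch
import torch.nn as nn

class FusionModule(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.bottleneck = BottleneckBlock(channels)   # sketched above
        self.avg_pool = nn.AdaptiveAvgPool3d(1)
        self.max_pool = nn.AdaptiveMaxPool3d(1)

    def forward(self, local_feat: torch.Tensor, global_feat: torch.Tensor) -> torch.Tensor:
        f = self.bottleneck(local_feat + global_feat)              # f = bottleneck(l + g)
        ch_cof = self.avg_pool(f) + self.max_pool(f)               # channel importance (B, C, 1, 1, 1)
        b, c = f.shape[:2]
        sp_cof = torch.softmax(f.reshape(b, c, -1), dim=-1).reshape_as(f)  # spatial importance
        return f * ch_cof + f * sp_cof                             # ff = f*ch_cof + f*sp_cof
```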
4-3: constructing an up-sampling module that progressively restores the low-resolution feature maps to the original size.
The up-sampling process comprises several repeated up-sampling blocks, each consisting of a 2× up-sampler, a 3D convolutional layer with kernel size 3 × 3 and a ReLU layer. The up-sampler uses interpolation: on the basis of the original image voxels, new voxel values are inserted between voxel points using a trilinear interpolation algorithm. For a feature map x of size (W, H, D), after 2× up-sampling the up-sampled x' has size (2W, 2H, 2D).
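For illustration only, the following PyTorch sketch of one up-sampling block combines a 2× trilinear up-sampler, a 3D convolution and a ReLU layer, as described above; padding and channel choices are assumptions of this sketch.

```python
# Illustrative up-sampling block (PyTorch assumed): 2x trilinear up-sampler, a 3D convolution
# and a ReLU layer; kernel padding and channel choices are assumptions of this sketch.
import torch
import torch.nn as nn

class UpsampleBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="trilinear", align_corners=False),
            nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.block(x)   # (B, C, W, H, D) -> (B, out_ch, 2W, 2H, 2D)
```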
S5: and training and testing the TransBC network by using the training set and the testing set obtained in the step S1.
5-1: determining a basic architecture of a TransBC network, and initializing connection weight, residual error unit quantity, convolution layer quantity, learning rate, training step length, an optimizer, iteration times and training batches of each component of the network;
5-2: inputting the training set divided in the step 1) into an encoder F of a TransBC networkCNN(. C) to obtain a down-sampled output XS
5-3: decoding the down-sampling result by using a Transformer module, a feature fusion module and an up-sampling module of a decoder part to obtain a model output value XU
5-4: evaluating the accuracy of model segmentation by adopting Dice, IoU and accuracy;
The formula of Dice is:
Dice = 2|GT ∩ Pred| / (|GT| + |Pred|)
wherein GT represents the gold-standard binary image manually labeled by an expert, and Pred is the model prediction result. The value of Dice lies in [0, 1]; the closer Dice is to 1, the higher the overlap with the gold standard.
The formula of IoU is:
IoU = |GT ∩ Pred| / |GT ∪ Pred|
Like Dice, IoU measures the overlap between the network prediction image and the gold standard.
The formula of accuracy is:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
wherein TP represents true positives; TN represents true negatives; FP and FN represent false positives and false negatives. A higher accuracy indicates that correctly predicted voxels account for a larger proportion of all voxels.
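For illustration only, the following NumPy sketch computes Dice, IoU and accuracy for binary prediction and gold-standard volumes according to the formulas above; the epsilon terms are added only to avoid division by zero.

```python
# Illustrative metric computation (NumPy assumed) for binary volumes `pred` and `gt`
# of identical shape, following the Dice, IoU and accuracy formulas above.
import numpy as np

def dice(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    inter = np.logical_and(pred == 1, gt == 1).sum()
    return float(2.0 * inter / ((pred == 1).sum() + (gt == 1).sum() + eps))

def iou(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    inter = np.logical_and(pred == 1, gt == 1).sum()
    union = np.logical_or(pred == 1, gt == 1).sum()
    return float(inter / (union + eps))

def accuracy(pred: np.ndarray, gt: np.ndarray) -> float:
    tp = np.logical_and(pred == 1, gt == 1).sum()
    tn = np.logical_and(pred == 0, gt == 0).sum()
    return float((tp + tn) / pred.size)
```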
5-5: and (5) training the model according to the iteration times set in the step (5-1), and verifying the segmentation effect of the model by using the test set.
The Transformer-based breast cancer magnetic resonance imaging lesion segmentation method addresses the facts that breast cancer magnetic resonance imaging (MRI) often exhibits blurred lesion boundaries and uneven gray values inside the lesion, and that conventional convolutional networks suffer from spatial inductive bias and a limited receptive field during processing. It therefore proposes TransBC, a 3D medical image segmentation model combining a Transformer with convolution operations. The network follows the classical encoder-decoder architecture: a CNN encoder extracts feature representations at different levels in the down-sampling stage, and in the up-sampling stage a Transformer encoder repeatedly extracts long-range dependencies from the high-resolution feature maps to supplement and correct the low-resolution CNN features. The core of the model is to encode the high-resolution feature maps with a Transformer and extract long-range dependencies to supplement and correct the low-resolution CNN features. Experimental results on the breast cancer dataset also show that the model handles lesion edges more accurately and achieves better segmentation on difficult samples with uneven gray values inside the lesion.
The above description is only an example of the present invention, and is not intended to limit the present invention, and it is obvious to those skilled in the art that various modifications and variations can be made in the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.

Claims (11)

1. A Transformer-based lesion segmentation method for breast cancer magnetic resonance imaging, characterized by: constructing a TransBC, wherein TransBC is an MRI lesion segmentation model based on a Transformer combined with 3D convolution, and the network of TransBC is an encoder-decoder structure; the encoder-decoder structure is divided into a down-sampling stage and an up-sampling stage,
the down-sampling stage is a CNN encoder used to extract feature representations at different levels;
and the up-sampling stage uses a Transformer encoder to repeatedly extract long-range dependencies from high-resolution feature maps, supplementing and correcting the low-resolution CNN features; in the up-sampling stage, the resolution of the feature maps is progressively restored to the original size by applying the up-sampler multiple times, and the output of the network is the label map of the medical image.
2. The Transformer-based lesion segmentation method for breast cancer magnetic resonance imaging according to claim 1, wherein: which comprises the following steps:
s1: collecting dynamic contrast-enhanced breast MRI (DCE-MRI) data and preprocessing the data;
s2: constructing a TransBC network;
s3: constructing an encoder of a TransBC network, wherein the encoder of the TransBC network comprises a bottleneck module and a down-sampling module;
s4: constructing a decoder of a TransBC network, wherein the decoder of the TransBC network comprises a Transformer module, a feature fusion module and an up-sampling module;
s5: and training and testing the TransBC network by using the training set and the testing set obtained in the step S1.
3. The Transformer-based lesion segmentation method for breast cancer magnetic resonance imaging according to claim 2, wherein: the preprocessing in S1 includes the following steps: collecting the patient breast cancer MRI data provided by a hospital, resampling the MRI images so that the spatial spacing is 1 mm, then cropping the MRI images so that the cropped images have a uniform size of (64, 64, 64); after the data preprocessing in S1 is completed, the collected data are divided into a training set and a testing set.
4. The Transformer-based lesion segmentation method for breast cancer magnetic resonance imaging according to claim 2, wherein: the S3 includes the following steps:
3-1: using a CNN encoder F_CNN(·);
3-1-1: constructing a bottleneck block, wherein the bottleneck block is designed using the classical residual structure from ResNet;
3-1-2: constructing a down-sampling block, wherein the down-sampling block is composed of 3D convolution layers;
3-1-3: setting the activation function of the convolution operations in the encoder F_CNN(·) to the ReLU function, defined as out(in) = max(0, in); the convolution kernel size is set to 2 × 2 and the stride to 2;
3-2: an input image x, with height H, width W, depth D and C channels, is passed through F_CNN(·); the resulting feature maps are f_l = F_CNN(x).
5. the Transformer-based lesion segmentation method for breast cancer magnetic resonance imaging according to claim 2, wherein: the S4 includes the following steps:
4-1: constructing a Transformer module;
4-2: designing a feature fusion module by referring to the CBAM;
4-3: constructing an up-sampling module that progressively restores the low-resolution feature maps to the original size.
6. The Transformer-based lesion segmentation method for breast cancer magnetic resonance imaging according to claim 5, wherein: the 4-1 comprises the following steps:
4-1-1: determining the input of a Transformer module;
4-1-2: the input of the transform module is a 3D picture block
Figure FDA0003556742560000024
Wherein H, W, D and C respectively represent the height, width, depth and channel number of the optical fiber;
4-1-3: adding position codes and using learnable position codes;
4-1-4: the Transformer encoder comprises a multi-head self-attention block and a multi-layer perceptron block, wherein the self-attention block is responsible for completing the calculation of query-key-value attention.
7. The Transformer-based lesion segmentation method for breast cancer magnetic resonance imaging according to claim 2, wherein: the 4-1-2 comprises the following steps: partitioning the picture, partitioning the feature map x along three dimensions of width, height and depth, and stacking the blocks; a scaling strategy for the block side lengths is then used.
8. The Transformer-based lesion segmentation method for breast cancer magnetic resonance imaging according to claim 2, wherein: the step of S5 comprises the following steps:
5-1: determining the basic architecture of the TransBC network, and initializing the connection weights of each network component, the number of residual units, the number of convolutional layers, the learning rate, the training step size, the optimizer, the number of iterations and the training batch size;
5-2: encoder F for inputting training set divided by S1 into TransBC networkCNN(. to obtain a down-sampled output XS
5-3: decoding the down-sampling result by using a Transformer module, a feature fusion module and an up-sampling module of a decoder part to obtain a model output value XU
5-4: evaluating the accuracy of model segmentation by adopting Dice, IoU and accuracy;
5-5: training the model according to the number of iterations set in step 5-1, and verifying the segmentation effect of the model with the test set.
9. The Transformer-based lesion segmentation method for breast cancer magnetic resonance imaging according to claim 2, wherein: in the 5-4:
the formula of Dice is:
Dice = 2|GT ∩ Pred| / (|GT| + |Pred|)
wherein GT represents the gold-standard binary image manually labeled by an expert, and Pred is the model prediction result;
IoU is given by the following formula:
IoU = |GT ∩ Pred| / |GT ∪ Pred|
like Dice, IoU measures the overlap between the network prediction image and the gold standard;
the accuracy is formulated as:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
wherein TP represents true positives; TN represents true negatives; FP and FN represent false positives and false negatives.
10. The Transformer-based lesion segmentation method for breast cancer magnetic resonance imaging according to claim 2, wherein: in the 5-5:
to verify the segmentation performance of the model on breast cancer MRI, the images need to be processed to satisfy the input requirements of the model. The softmax function is applied to the output label map of the model, and the threshold is set to 0.5: if a value in the label map is greater than the threshold it is set to 1, and if it is less than 0.5 it is set to 0. After this processing, the label map corresponds voxel-by-voxel to the MRI image, where a voxel value of 0 represents non-lesion and a voxel value of 1 represents lesion.
11. An MRI lesion segmentation model, characterized in that: it is constructed using the Transformer-based breast cancer magnetic resonance imaging lesion segmentation method of any one of claims 1-9, wherein the MRI lesion segmentation model is a 3D medical image segmentation model based on a Transformer combined with 3D convolution, and the network of the MRI lesion segmentation model adopts an encoder-decoder structure; the encoder-decoder is divided into a down-sampling stage and an up-sampling stage.
CN202210277852.XA 2022-03-21 2022-03-21 Breast cancer magnetic resonance imaging focus segmentation method based on Transformer Withdrawn CN114596318A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210277852.XA CN114596318A (en) 2022-03-21 2022-03-21 Breast cancer magnetic resonance imaging focus segmentation method based on Transformer

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210277852.XA CN114596318A (en) 2022-03-21 2022-03-21 Breast cancer magnetic resonance imaging focus segmentation method based on Transformer

Publications (1)

Publication Number Publication Date
CN114596318A true CN114596318A (en) 2022-06-07

Family

ID=81819682

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210277852.XA Withdrawn CN114596318A (en) 2022-03-21 2022-03-21 Breast cancer magnetic resonance imaging focus segmentation method based on Transformer

Country Status (1)

Country Link
CN (1) CN114596318A (en)


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115489127A (en) * 2022-10-28 2022-12-20 四川大学华西医院 Bone space microstructure prosthesis construction method based on deep learning
CN115953781A (en) * 2023-03-14 2023-04-11 武汉昊博科技有限公司 Mammary gland artificial intelligence analysis system and method based on thermal chromatography image
CN115953781B (en) * 2023-03-14 2023-06-13 武汉昊博科技有限公司 Mammary gland artificial intelligence analysis system and method based on thermal tomography
CN116309650A (en) * 2023-05-22 2023-06-23 湖南大学 Medical image segmentation method and system based on double-branch embedded attention mechanism
CN116664590A (en) * 2023-08-02 2023-08-29 中日友好医院(中日友好临床医学研究所) Automatic segmentation method and device based on dynamic contrast enhancement magnetic resonance image
CN116664590B (en) * 2023-08-02 2023-10-13 中日友好医院(中日友好临床医学研究所) Automatic segmentation method and device based on dynamic contrast enhancement magnetic resonance image

Similar Documents

Publication Publication Date Title
CN114596318A (en) Breast cancer magnetic resonance imaging focus segmentation method based on Transformer
CN107610194B (en) Magnetic resonance image super-resolution reconstruction method based on multi-scale fusion CNN
CN111784671B (en) Pathological image focus region detection method based on multi-scale deep learning
CN112258526B (en) CT kidney region cascade segmentation method based on dual attention mechanism
CN114494296A (en) Brain glioma segmentation method and system based on fusion of Unet and Transformer
CN114037714B (en) 3D MR and TRUS image segmentation method for prostate system puncture
CN112862805B (en) Automatic auditory neuroma image segmentation method and system
CN116739985A (en) Pulmonary CT image segmentation method based on transducer and convolutional neural network
CN114663440A (en) Fundus image focus segmentation method based on deep learning
CN112132878A (en) End-to-end brain nuclear magnetic resonance image registration method based on convolutional neural network
CN115471470A (en) Esophageal cancer CT image segmentation method
CN115908800A (en) Medical image segmentation method
CN117274599A (en) Brain magnetic resonance segmentation method and system based on combined double-task self-encoder
Shan et al. SCA-Net: A spatial and channel attention network for medical image segmentation
CN116823850A (en) Cardiac MRI segmentation method and system based on U-Net and transducer fusion improvement
CN115809998A (en) Based on E 2 Glioma MRI data segmentation method based on C-Transformer network
CN117078941A (en) Cardiac MRI segmentation method based on context cascade attention
CN113256657B (en) Efficient medical image segmentation method and system, terminal and medium
CN112990359B (en) Image data processing method, device, computer and storage medium
CN117333750A (en) Spatial registration and local global multi-scale multi-modal medical image fusion method
CN116051609B (en) Unsupervised medical image registration method based on band-limited deformation Fourier network
CN116433654A (en) Improved U-Net network spine integral segmentation method
CN115984560A (en) Image segmentation method based on CNN and Transformer
CN116309679A (en) MLP-like medical image segmentation method suitable for multiple modes
Wang et al. Multi-scale hierarchical transformer structure for 3d medical image segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20220607