CN113487560A - Brain tumor segmentation method and device based on spatial feature attention mechanism - Google Patents
- Publication number
- CN113487560A (application number CN202110749496.2A)
- Authority
- CN
- China
- Prior art keywords
- feature
- attention
- spatial
- layer
- feature map
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Links
- 208000003174 Brain Neoplasms Diseases 0.000 title claims abstract description 90
- 230000011218 segmentation Effects 0.000 title claims abstract description 39
- 238000000034 method Methods 0.000 title claims abstract description 33
- 230000007246 mechanism Effects 0.000 title claims abstract description 20
- 238000012549 training Methods 0.000 claims abstract description 28
- 210000004556 brain Anatomy 0.000 claims abstract description 18
- 238000012360 testing method Methods 0.000 claims abstract description 17
- 238000005070 sampling Methods 0.000 claims description 25
- 206010028980 Neoplasm Diseases 0.000 description 17
- 238000010586 diagram Methods 0.000 description 11
- 238000003709 image segmentation Methods 0.000 description 6
- 230000035945 sensitivity Effects 0.000 description 5
- 230000000694 effects Effects 0.000 description 2
- 238000002474 experimental method Methods 0.000 description 2
- 230000006870 function Effects 0.000 description 2
- 238000002595 magnetic resonance imaging Methods 0.000 description 2
- 239000013610 patient sample Substances 0.000 description 2
- 208000032612 Glial tumor Diseases 0.000 description 1
- 206010018338 Glioma Diseases 0.000 description 1
- 206010072360 Peritumoural oedema Diseases 0.000 description 1
- 230000006978 adaptation Effects 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 201000011510 cancer Diseases 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 238000013527 convolutional neural network Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000018109 developmental process Effects 0.000 description 1
- 238000003745 diagnosis Methods 0.000 description 1
- 238000002059 diagnostic imaging Methods 0.000 description 1
- 201000010099 disease Diseases 0.000 description 1
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 238000010191 image analysis Methods 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 238000010801 machine learning Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 230000001338 necrotic effect Effects 0.000 description 1
- 238000010606 normalization Methods 0.000 description 1
- 238000011176 pooling Methods 0.000 description 1
- 239000000523 sample Substances 0.000 description 1
- 210000003625 skull Anatomy 0.000 description 1
- 230000004083 survival effect Effects 0.000 description 1
- 238000002560 therapeutic procedure Methods 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10088—Magnetic resonance imaging [MRI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30016—Brain
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30096—Tumor; Lesion
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Biomedical Technology (AREA)
- Medical Informatics (AREA)
- Public Health (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Software Systems (AREA)
- Radiology & Medical Imaging (AREA)
- Computing Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Molecular Biology (AREA)
- Evolutionary Biology (AREA)
- Computational Linguistics (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Mathematical Physics (AREA)
- Quality & Reliability (AREA)
- Databases & Information Systems (AREA)
- Pathology (AREA)
- Biophysics (AREA)
- Epidemiology (AREA)
- Primary Health Care (AREA)
- Magnetic Resonance Imaging Apparatus (AREA)
Abstract
The application discloses a brain tumor segmentation method and device based on a spatial feature attention mechanism, the method comprising the following steps: acquiring a test set and a training set of four-modality brain MRI sequences; constructing a spatial feature attention U-shaped network; training the spatial feature attention U-shaped network with the training set; and inputting the test set into the trained spatial feature attention U-shaped network to obtain a brain tumor segmentation result.
Description
Technical Field
The invention relates to the technical field of image segmentation, in particular to a brain tumor segmentation method and device based on a spatial feature attention mechanism.
Background
Medical image segmentation is a necessary prerequisite for the development of healthcare systems, in particular for disease diagnosis and therapy planning. In various medical image segmentation tasks, U-shaped structures (also known as U-Net) have become the de facto standard and have enjoyed great success.
However, in brain tumor segmentation, the variation in the structural shape, location and discrimination difficulty of tumors makes it hard for U-Net to obtain good results. To address this problem, many approaches combine an attention mechanism with convolutional neural networks. However, most existing methods, such as Attention U-Net, TransUNet and TransBTS, apply attention only along the spatial dimensions to improve target localization accuracy, neglecting the feature (channel) dimension, which is important for the model to identify what a tumor is and is a key avenue for improving model performance.
Disclosure of Invention
The embodiments of the application aim to provide a brain tumor segmentation method and device based on a spatial feature attention mechanism, so as to solve the problem that U-Net has difficulty obtaining good results due to the variation in the structural shape, location and discrimination difficulty of tumors.
According to a first aspect of embodiments of the present application, there is provided a brain tumor segmentation method based on a spatial feature attention mechanism, including:
acquiring a test set and a training set of four-modality brain MRI sequences;
constructing a spatial feature attention U-type network;
training the spatial feature attention U-type network by using the training set;
inputting the test set into a trained spatial feature attention U-shaped network to obtain a brain tumor segmentation result;
wherein the spatial feature attention U-type network is based on a 3D U-type network and is combined with a spatial feature attention module;
the 3D U-shaped network consists of an encoder and a decoder with skip connections between them; the encoder is used for extracting a feature map of the brain tumor from the four-modality brain MRI sequence and reducing its size, and the decoder is used for restoring the reduced feature map, locating the brain tumor in the feature map and segmenting the brain tumor;
the spatial feature attention module is composed of a spatial attention submodule and a feature attention submodule. The feature attention submodule generates a feature attention map from the feature map of the brain tumor; the feature attention map is multiplied element-wise with the feature map of the brain tumor to obtain a first feature map, and the first feature map is added to the feature map of the brain tumor to obtain a second feature map. The spatial attention submodule generates a spatial attention map from the second feature map; the spatial attention map is multiplied element-wise with the second feature map to highlight the spatial positions of salient features, yielding a third feature map, and the third feature map is added to the second feature map to obtain the output of the spatial feature attention module.
Furthermore, the encoder is composed of n sequentially connected encoding blocks with a down-sampling layer arranged between every two encoding blocks; the encoding blocks are used for extracting the feature map of the brain tumor, and the down-sampling layer halves each spatial dimension of the feature map while keeping the number of channels unchanged.
Further, one of the spatial feature attention modules is inserted between two adjacent encoding blocks and before a down-sampling layer between the encoding blocks.
Further, the spatial attention sub-module includes: a MaxPooling layer, an AvgPooling layer, a connection operation layer, a down-sampling layer, an up-sampling layer and a Sigmoid layer, wherein the MaxPooling layer and the AvgPooling layer are connected in parallel and then connected in series, in sequence, with the connection operation layer, the down-sampling layer, the up-sampling layer and the Sigmoid layer.
Further, the feature attention sub-module includes: a MaxPooling layer, an AvgPooling layer, a connection operation layer, a Transformer layer, a 3D convolution layer and a Sigmoid layer, wherein the MaxPooling layer and the AvgPooling layer are connected in parallel and then connected in series, in sequence, with the connection operation layer, the Transformer layer, the 3D convolution layer and the Sigmoid layer.
Further, the decoder includes: n decoding blocks and a 3D convolution layer, wherein the n decoding blocks are connected in reverse order and followed by the 3D convolution layer, with an up-sampling layer arranged between every two adjacent decoding blocks. The up-sampling layer doubles each spatial dimension of the feature map while keeping the number of channels unchanged, and the 3D convolution layer adjusts the number of channels of the first decoding block's output to match the number of brain tumor segmentation classes.
Further, the nth decoding block receives the output of the encoder, and the rest of the decoding blocks respectively receive the output of the spatial feature attention module.
According to a second aspect of the embodiments of the present application, there is provided a brain tumor segmentation apparatus based on a spatial feature attention mechanism, including:
the acquisition module is used for acquiring a test set and a training set of four-modality brain MRI sequences;
the building module is used for building a spatial feature attention U-shaped network;
the training module is used for training the spatial feature attention U-shaped network by utilizing the training set;
the output module is used for inputting the test set into the trained spatial feature attention U-shaped network to obtain a brain tumor segmentation result;
wherein the spatial feature attention U-type network is based on a 3D U-type network and is combined with a spatial feature attention module;
the 3D U-shaped network consists of an encoder and a decoder with skip connections between them; the encoder is used for extracting a feature map of the brain tumor from the four-modality brain MRI sequence and reducing its size, and the decoder is used for restoring the reduced feature map, locating the brain tumor in the feature map and segmenting the brain tumor;
the spatial feature attention module is composed of a spatial attention submodule and a feature attention submodule. The feature attention submodule generates a feature attention map from the feature map of the brain tumor; the feature attention map is multiplied element-wise with the feature map of the brain tumor to obtain a first feature map, and the first feature map is added to the feature map of the brain tumor to obtain a second feature map. The spatial attention submodule generates a spatial attention map from the second feature map; the spatial attention map is multiplied element-wise with the second feature map to highlight the spatial positions of salient features, yielding a third feature map, and the third feature map is added to the second feature map to obtain the output of the spatial feature attention module.
According to a third aspect of embodiments of the present application, there is provided an electronic apparatus, including:
one or more processors;
a memory for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to the first aspect.
According to a fourth aspect of embodiments herein, there is provided a computer readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the method according to the first aspect.
The technical scheme provided by the embodiment of the application can have the following beneficial effects:
according to the embodiment, the four-mode brain MRI sequence is input into the trained spatial feature attention U-shaped network, and the brain tumor segmentation result is obtained. Although the U-shaped structure (also known as U-Net) has become a de facto standard in various medical image segmentation tasks and has enjoyed great success. However, in the segmentation of brain tumors, due to the difference in the structural shape, location and resolution difficulty of the tumor, it is difficult to obtain better results with U-Net. How to make the model better judge the brain tumor position and improve the model identification accuracy rate is very important for a good model. The spatial feature attention module is designed to solve the problem and consists of a spatial attention submodule and a feature attention submodule, the feature attention submodule can help a model to pay attention to features related to tumor recognition and ignore irrelevant features so as to improve the accuracy of the model in judging the brain tumor, the spatial attention submodule can help the model to pay attention to the spatial position of the tumor features so as to improve the positioning accuracy of the model in judging the brain tumor, the combination of the two can well solve the problem that due to the difference of structural shape, position and resolution difficulty of the brain tumor, U-Net is difficult to obtain a better result, the performance of 3D U-Net on a Dice coefficient and sensitivity is greatly improved, and experiments on a BraTs2020 data set also prove the point.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
Fig. 1 is a flow chart illustrating a brain tumor segmentation method based on a spatial feature attention mechanism according to an exemplary embodiment.
Fig. 2 is a diagram illustrating a spatial feature attention U-type network architecture according to an exemplary embodiment.
Fig. 3 is a block diagram of an encoder shown in accordance with an example embodiment.
Fig. 4 is a block diagram of a decoder according to an exemplary embodiment.
FIG. 5 is a block diagram illustrating a spatial feature attention module in accordance with an exemplary embodiment.
FIG. 6 is a block diagram illustrating a feature attention module in accordance with one exemplary embodiment.
FIG. 7 is a block diagram illustrating a spatial attention module in accordance with an exemplary embodiment.
Fig. 8 is a block diagram of a coding block one shown in accordance with an example embodiment.
Fig. 9 is a block diagram illustrating encoding blocks two through five and decoding blocks one through five according to an example embodiment.
Fig. 10 is a block diagram illustrating a brain tumor segmentation apparatus based on a spatial feature attention mechanism according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
Fig. 1 is a flow chart illustrating a brain tumor segmentation method based on a spatial feature attention mechanism according to an exemplary embodiment. The embodiment of the invention provides a brain tumor segmentation method based on a spatial feature attention mechanism, which comprises the following steps:
step S11, acquiring a test set and a training set of four-modality brain MRI sequences;
step S12, constructing a spatial feature attention U-shaped network;
step S13, training the spatial feature attention U-shaped network by using the training set;
step S14, inputting the test set into the trained spatial feature attention U-shaped network to obtain a brain tumor segmentation result;
wherein the spatial feature attention U-shaped network is based on a 3D U-shaped network combined with spatial feature attention modules; the 3D U-shaped network consists of an encoder and a decoder with skip connections between them, the encoder being used for extracting a feature map of the brain tumor from the four-modality brain MRI sequence and reducing its size, and the decoder being used for restoring the reduced feature map, locating the brain tumor in the feature map and segmenting the brain tumor;
the spatial feature attention module is composed of a spatial attention submodule and a feature attention submodule. The feature attention submodule generates a feature attention map from the feature map of the brain tumor; the feature attention map is multiplied element-wise with the feature map of the brain tumor to obtain a first feature map, and the first feature map is added to the feature map of the brain tumor to obtain a second feature map. The spatial attention submodule generates a spatial attention map from the second feature map; the spatial attention map is multiplied element-wise with the second feature map to highlight the spatial positions of salient features, yielding a third feature map, and the third feature map is added to the second feature map to obtain the output of the spatial feature attention module.
According to the above embodiment, the four-modality brain MRI sequence is input into the trained spatial feature attention U-shaped network to obtain the brain tumor segmentation result. Although the U-shaped structure (also known as U-Net) has become the de facto standard in various medical image segmentation tasks and has enjoyed great success, in brain tumor segmentation the variation in the structural shape, location and discrimination difficulty of tumors makes it hard for U-Net to obtain good results. Enabling the model to better localize the brain tumor and improving its recognition accuracy are therefore essential. The spatial feature attention module is designed to solve this problem and consists of a spatial attention submodule and a feature attention submodule. The feature attention submodule helps the model attend to features relevant to tumor recognition and ignore irrelevant ones, improving the accuracy with which the model identifies the brain tumor; the spatial attention submodule helps the model attend to the spatial positions of tumor features, improving localization accuracy. Their combination addresses the difficulty U-Net has with the varying structural shape, location and discrimination difficulty of brain tumors, and greatly improves the performance of 3D U-Net on the Dice coefficient and sensitivity.
In a specific implementation of step S11, a test set and a training set of four-modality brain MRI sequences are acquired;
in the specific implementation of step S12, a spatial feature attention U-type network is constructed; wherein the spatial feature attention U-type network is substantially framed by a 3D U-type network and incorporates a spatial feature attention module.
Fig. 2 is a diagram illustrating a spatial feature attention U-shaped network architecture according to an exemplary embodiment. Referring to fig. 2, the 3D U-shaped network is composed of an encoder and a decoder connected by skip connections; the encoder is used for extracting a feature map of the brain tumor from the four-modality brain MRI sequence and reducing its size, and the decoder is used for restoring the reduced feature map, locating the brain tumor in the feature map and performing brain tumor segmentation.
Specifically, the encoder is composed of n sequentially connected encoding blocks with a down-sampling layer arranged between every two encoding blocks; the encoding blocks are used for extracting the feature map of the brain tumor, and the down-sampling layer halves each spatial dimension of the feature map while keeping the number of channels unchanged.
Further, each coding block consists of two coding sub-blocks used for extracting feature maps related to the brain tumor; a Dropout layer is arranged between the two coding sub-blocks of the first coding block, which randomly discards some features so as to alleviate overfitting of the model. Each coding sub-block consists of a GroupNorm layer, a ReLU layer and a 3D convolution layer in sequence.
The implementation of the encoder will be described in detail below, taking as an example that the encoder consists of 5 sequentially connected encoding blocks.
(1) The implementation of the first coding block is shown in fig. 8. The first coding block is composed of a GroupNorm layer, a Dropout layer, a ReLU layer and a 3D convolution layer in sequence. The role of the Dropout layer is to randomly discard some features so as to mitigate overfitting of the network.
(2) The implementation of coding blocks two to five is shown in fig. 9. Coding blocks two to five are each composed of a GroupNorm layer, a ReLU layer and a 3D convolution layer in sequence.
(3) Implementation of the down-sampling layer. The down-sampling layer is implemented using a 3D convolution with a kernel size of 2 × 2 × 2 and a stride of 2.
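As an illustrative sketch (not the claimed embodiment itself), the coding sub-block and the down-sampling layer described above can be written in PyTorch roughly as follows; the channel counts and the GroupNorm group number are assumptions for illustration:

```python
import torch
import torch.nn as nn

class EncodingSubBlock(nn.Module):
    """GroupNorm -> ReLU -> 3D convolution, as described for a coding sub-block."""
    def __init__(self, in_ch, out_ch, groups=8):
        super().__init__()
        self.block = nn.Sequential(
            nn.GroupNorm(groups, in_ch),
            nn.ReLU(inplace=True),
            # kernel 3, padding 1: spatial size is preserved inside the block
            nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return self.block(x)

def downsample_layer(channels):
    # 3D convolution with kernel 2 x 2 x 2 and stride 2: halves every spatial
    # dimension while keeping the channel count unchanged
    return nn.Conv3d(channels, channels, kernel_size=2, stride=2)
```

Stacking two such sub-blocks gives one coding block; the strided convolution between blocks plays the role of the down-sampling layer.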
The decoder includes: n decoding blocks and a 3D convolution layer, wherein the n decoding blocks are connected in reverse order and followed by the 3D convolution layer, with an up-sampling layer arranged between every two adjacent decoding blocks. The up-sampling layer doubles each spatial dimension of the feature map while keeping the number of channels unchanged, and the 3D convolution layer adjusts the number of channels of the first decoding block's output to match the number of brain tumor segmentation classes.
Each decoding block consists of two decoding sub-blocks, and each decoding sub-block consists of a GroupNorm layer, a ReLU layer and a 3D convolution layer in sequence. The nth decoding block receives the output of the encoder, and the remaining decoding blocks respectively receive the outputs of the spatial feature attention modules.
The implementation of the decoder will be described in detail below, taking as an example a decoder consisting of 5 decoding blocks and a 3D convolution layer, where the 5 decoding blocks are connected in reverse order and followed by the 3D convolution layer, with an up-sampling layer arranged between every two adjacent decoding blocks.
(1) The implementation of decoding blocks five to one is shown in fig. 9. Decoding blocks five to one are each composed of a GroupNorm layer, a ReLU layer and a 3D convolution layer in sequence.
(2) Implementation of the up-sampling layer. The up-sampling layer is implemented using a 3D deconvolution with a kernel size of 2 × 2 × 2 and a stride of 2.
(3) The decoding blocks and the up-sampling layers are connected as shown in fig. 4 to obtain the decoder.
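A minimal PyTorch sketch of the up-sampling layer and the final channel-adjusting convolution of the decoder (a sketch under the stated kernel/stride settings; the 1 × 1 × 1 kernel of the final convolution is an assumption, since the text only states that it adjusts the channel count):

```python
import torch
import torch.nn as nn

def upsample_layer(channels):
    # 3D deconvolution with kernel 2 x 2 x 2 and stride 2: doubles every
    # spatial dimension while keeping the channel count unchanged
    return nn.ConvTranspose3d(channels, channels, kernel_size=2, stride=2)

def segmentation_head(in_channels, num_classes):
    # final 3D convolution of the decoder: maps the first decoding block's
    # output channels to the number of segmentation classes
    return nn.Conv3d(in_channels, num_classes, kernel_size=1)
```

With stride-2 deconvolutions, five decoding stages exactly undo the five stride-2 down-samplings of the encoder, restoring the input resolution.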
Referring to fig. 3, one spatial feature attention module is inserted between each pair of adjacent coding blocks, before the down-sampling layer between them. The spatial feature attention module is composed of a spatial attention submodule and a feature attention submodule. The feature attention submodule generates a feature attention map from the feature map of the brain tumor; the feature attention map is multiplied element-wise with the feature map of the brain tumor to obtain a first feature map, and the first feature map is added to the feature map of the brain tumor to obtain a second feature map. The spatial attention submodule generates a spatial attention map from the second feature map; the spatial attention map is multiplied element-wise with the second feature map to highlight the spatial positions of salient features, yielding a third feature map, and the third feature map is added to the second feature map to obtain the output of the spatial feature attention module.
The implementation of the spatial feature attention module is described in detail below with reference to the drawings, as shown in fig. 5. The spatial feature attention module consists of a feature (channel) attention submodule (CAM) and a spatial attention submodule (SAM).
Assume the input to the spatial feature attention module is X ∈ ℝ^(C×H×W×D) and the output is Y ∈ ℝ^(C×H×W×D), where C, H, W and D denote the number of channels, height, width and depth, respectively. The operation of the spatial feature attention module can be represented by the following equations:

X′ = X ⊕ (CAM(X) ⊗ X)

Y = X′ ⊕ (SAM(X′) ⊗ X′)

where X′ denotes the intermediate result, ⊕ denotes element-wise addition, and ⊗ denotes element-wise multiplication.
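The module's combining operation described above can be sketched as plain tensor arithmetic; `cam` and `sam` here stand for any feature- and spatial-attention callables that return broadcastable attention maps (a sketch, not the full sub-module implementations):

```python
import torch

def sfa_forward(x, cam, sam):
    # X' = X (+) (CAM(X) (*) X), then Y = X' (+) (SAM(X') (*) X'),
    # where (+) is element-wise addition and (*) element-wise multiplication;
    # broadcasting expands the attention maps over the remaining dimensions
    xp = x + cam(x) * x
    return xp + sam(xp) * xp
```

The residual additions mean that even a near-zero attention map cannot erase the incoming features, which keeps gradients flowing through the module.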
The feature attention submodule is implemented as shown in fig. 6. The feature attention submodule is composed of a MaxPooling layer and an AvgPooling layer arranged in parallel, followed in sequence by a connection operation layer, a Transformer layer, a 3D convolution layer and a Sigmoid layer. The MaxPooling and AvgPooling operations here are applied over the spatial dimensions, i.e. height, width and depth. Assume the input to the feature attention submodule is X ∈ ℝ^(C×H×W×D) and the output is R ∈ ℝ^(C×1×1×1). The operation of the feature attention submodule can be represented by:
R=Sigmoid(Wc(Tf([MaxPooling(X),AvgPooling(X)])))
Tf(Z)=FFN(LN(Z′))+Z′
Z′=MHA(LN(Z))+Z
where Tf denotes the Transformer layer, Wc the weights of the 3D convolution layer, LN layer normalization, FFN the feed-forward network, MHA multi-head attention, and [·, ·] the concatenation operation; Z denotes the input of the Transformer layer and Z′ the intermediate result of the Transformer layer.
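A minimal PyTorch sketch of this sub-module follows. The text does not specify how the two pooled descriptors are tokenised for the Transformer layer; here they are treated as a two-token sequence of dimension C (an assumption), and the head count and feed-forward width are likewise illustrative:

```python
import torch
import torch.nn as nn

class FeatureAttention(nn.Module):
    """Sketch of the feature (channel) attention sub-module: spatial max/avg
    pooling, concatenation, a pre-norm Transformer layer (LN before MHA/FFN,
    matching the equations), a 3D convolution and a Sigmoid."""
    def __init__(self, channels, heads=4):
        super().__init__()
        self.tf = nn.TransformerEncoderLayer(
            d_model=channels, nhead=heads, dim_feedforward=2 * channels,
            batch_first=True, norm_first=True)
        self.conv = nn.Conv3d(2 * channels, channels, kernel_size=1)

    def forward(self, x):                   # x: (B, C, H, W, D)
        b, c = x.shape[:2]
        mx = torch.amax(x, dim=(2, 3, 4))   # MaxPooling over space -> (B, C)
        av = torch.mean(x, dim=(2, 3, 4))   # AvgPooling over space -> (B, C)
        z = torch.stack([mx, av], dim=1)    # two tokens of dimension C
        z = self.tf(z)                      # Transformer layer
        z = z.reshape(b, 2 * c, 1, 1, 1)    # concatenation of both tokens
        return torch.sigmoid(self.conv(z))  # (B, C, 1, 1, 1) attention map
```

The output broadcasts over H, W and D when multiplied with the input feature map, weighting each channel globally.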
The implementation of the spatial attention submodule is shown in fig. 7. The spatial attention submodule is composed of a MaxPooling layer and an AvgPooling layer arranged in parallel, followed in sequence by a connection operation layer, a down-sampling layer, an up-sampling layer and a Sigmoid layer. The down-sampling layer here is implemented using a 3D convolution with a kernel size of 2 × 2 × 2 and a stride of 2, and the up-sampling layer using a 3D deconvolution with a kernel size of 2 × 2 × 2 and a stride of 2. The MaxPooling and AvgPooling operations here are applied over the channel dimension. Assume the input to the spatial attention submodule is X ∈ ℝ^(C×H×W×D) and the output is R ∈ ℝ^(1×H×W×D). The operation of the spatial attention submodule can be represented by:
R=Sigmoid(Upsample(Ws([MaxPooling(X),AvgPooling(X)])))
where Ws denotes the weights of the 3D convolutional layer and Upsample denotes the upsampling layer.
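A corresponding PyTorch sketch of this submodule is given below. The hidden channel count of the strided convolution is an assumption; the spatial extent must be even for the stride-2 down/up pair to restore the original resolution:

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Sketch of the spatial attention submodule: channel-wise Max/Avg
    pooling in parallel, concatenation, a strided 3D convolution Ws
    (downsampling), a 3D deconvolution (upsampling) and a Sigmoid.
    Kernel 2x2x2 with stride 2 follows the description; 'hidden' is
    an illustrative assumption."""

    def __init__(self, hidden=8):
        super().__init__()
        self.down = nn.Conv3d(2, hidden, kernel_size=2, stride=2)
        self.up = nn.ConvTranspose3d(hidden, 1, kernel_size=2, stride=2)

    def forward(self, x):                        # x: (B, C, H, W, D)
        mx = torch.amax(x, dim=1, keepdim=True)  # MaxPooling over channels
        av = torch.mean(x, dim=1, keepdim=True)  # AvgPooling over channels
        z = torch.cat([mx, av], dim=1)           # (B, 2, H, W, D)
        return torch.sigmoid(self.up(self.down(z)))  # map (B, 1, H, W, D)
```

The single-channel output broadcasts over the channel dimension when multiplied with the feature map, weighting each spatial position rather than each channel.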
In a specific implementation of step S13, the spatial feature attention U-type network is trained with the training set.
specifically, after the spatial feature attention U-type network is constructed, a Pytrch framework is used for realizing the model, and an NVIDIA RTX 2080Ti GPU is used for training the model. The loss function adopts a Dice loss function, and the optimizer adopts an Adam optimizer. The initial learning rate was set to 0.0001, the attenuation factor was 0.5, and the attenuation tolerance was 20. An image enhancement technique is employed that (i) the image is scaled with a probability of 0.25, the scaling factor being between 0.9 and 1.1; (ii) randomly flipping the coronal and sagittal planes of the image with a probability of 0.5; (iii) randomly translating the image with a probability of 0.1; (iv) noise and blurred images were added to the image with a probability of 0.25. After image enhancement, the image will be input to the network, cropped from 240 × 240 × 155 voxel resolution to 112 × 144 × 96 voxel resolution.
In the specific implementation of step S14, the test set is input into the trained spatial feature attention U-type network to obtain the brain tumor segmentation result.
In order to verify the effectiveness of the method provided by the embodiment of the invention, BraTS2020 is selected as the data set and the method is compared with the existing 3D U-Net, Cascade U-Net and Attention U-Net. The evaluation indices are the Dice coefficient and sensitivity. (For 3D U-Net, see: Çiçek Ö, Abdulkadir A, Lienkamp S S, et al. 3D U-Net: learning dense volumetric segmentation from sparse annotation[C]. Springer, Cham, 2016: 424-432. For Cascade U-Net, see: Jiang Z, Ding C, Liu M, et al. Two-stage cascaded U-Net: 1st place solution to BraTS challenge 2019 segmentation task[C]. Springer, Cham, 2019: 231-241. For Attention U-Net, see: Schlemper J, Oktay O, Schaap M, et al. Attention gated networks: Learning to leverage salient regions in medical images[J]. Medical Image Analysis, 2019, 53: 197-207.)
The BraTS2020 dataset is briefly introduced here (see: Menze B H, Jakab A, Bauer S, et al. The multimodal brain tumor image segmentation benchmark (BRATS)[J]. IEEE Transactions on Medical Imaging, 2014, 34(10): 1993-2024; Bakas S, Akbari H, Sotiras A, et al. Advancing the cancer genome atlas glioma MRI collections with expert segmentation labels and radiomic features[J]. Scientific Data, 2017, 4(1): 1-13; Bakas S, Reyes M, Jakab A, et al. Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BRATS challenge[J]. arXiv preprint arXiv:1811.02629, 2018). It includes 369 patient samples for training and 125 patient samples for testing. Each sample consists of brain MRI scans in four modalities: T1, T1ce, T2 and FLAIR. Each modality has a voxel resolution of 240 × 240 × 155, is registered to the same T1 anatomical template (SRI24), interpolated to the same 1 mm³ resolution, and skull-stripped. The annotations comprise 4 classes: background (label 0), enhancing tumor (ET, label 4), peritumoral edema (ED, label 2), and necrotic and non-enhancing tumor core (NCR/NET, label 1). The patent evaluates the Dice and sensitivity indices on three tumor substructure regions: the enhancing tumor region (ET, label 4), the tumor core region (TC, labels 1 and 4), and the whole tumor region (WT, labels 1, 2 and 4).
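The mapping from BraTS annotation labels to the three evaluated regions can be expressed compactly (the function name is illustrative):

```python
import numpy as np

def brats_regions(seg):
    """Maps a BraTS label volume (0 background, 1 NCR/NET, 2 ED, 4 ET)
    to binary masks for the three evaluated regions:
    ET (label 4), TC (labels 1 and 4), WT (labels 1, 2 and 4)."""
    et = seg == 4
    tc = np.isin(seg, (1, 4))
    wt = np.isin(seg, (1, 2, 4))
    return et, tc, wt
```

Note that the regions are nested (ET ⊆ TC ⊆ WT), so each voxel labeled 4 contributes to all three masks.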
Experiments were performed on the BraTS2020 dataset and the results are shown in Table 1. The experimental results show that the method provided by the embodiment of the invention achieves a significant performance improvement.
Table 1 compares the Dice coefficients and sensitivities of the method provided in the embodiment of the present invention with those of 3D U-Net, Cascade U-Net and Attention U-Net on the three tumor structure regions ET, WT and TC on the BraTS2020 dataset.
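The two evaluation indices can be computed per region from binary prediction and ground-truth masks; a minimal sketch (the function name is illustrative, and degenerate empty masks are not handled):

```python
import numpy as np

def dice_and_sensitivity(pred, gt):
    """Dice = 2*TP / (|pred| + |gt|); sensitivity = TP / |gt|,
    for binary masks of a single tumor region."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    dice = 2.0 * tp / (pred.sum() + gt.sum())
    sens = tp / gt.sum()
    return dice, sens
```

Sensitivity rewards recovering all tumor voxels, while Dice also penalizes false positives, which is why both are reported.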
Corresponding to the foregoing embodiment of brain tumor segmentation based on spatial feature attention mechanism, the present application also provides an embodiment of a brain tumor segmentation apparatus based on spatial feature attention mechanism.
Fig. 10 is a block diagram illustrating a brain tumor segmentation apparatus based on a spatial feature attention mechanism according to an exemplary embodiment. Referring to FIG. 10, the apparatus includes:
An obtaining module 11, configured to obtain a test set and a training set of a four-modality brain MRI sequence;
the building module 12 is used for building a spatial feature attention U-type network;
the training module 13 is configured to train the spatial feature attention U-type network by using the training set;
the output module 14 is used for inputting the test set into the trained spatial feature attention U-shaped network to obtain a brain tumor segmentation result;
wherein the spatial feature attention U-type network is based on a 3D U-type network and is combined with a spatial feature attention module;
the 3D U type network consists of an encoder and a decoder, wherein a jump connection is formed between the encoder and the decoder, the encoder is used for extracting a feature map of a brain tumor from the four-modality brain MRI sequence and reducing the feature map, and the decoder is used for restoring the reduced feature map, locating the position of the brain tumor from the feature map and segmenting the brain tumor;
the spatial feature attention module is composed of a spatial attention submodule and a feature attention submodule, the feature attention submodule is used for generating a feature attention map according to the feature map of the brain tumor, the feature attention map is multiplied element-wise with the feature map of the brain tumor to obtain a first feature map, the first feature map is added to the feature map of the brain tumor to obtain a second feature map, the spatial attention submodule is used for generating a spatial attention map according to the second feature map, the spatial attention map is multiplied element-wise with the second feature map to highlight the spatial positions of salient features and obtain a third feature map, and the third feature map is added to the second feature map to obtain the output of the spatial feature attention module.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the application. One of ordinary skill in the art can understand and implement it without inventive effort.
Correspondingly, the present application also provides an electronic device, comprising: one or more processors; a memory for storing one or more programs; when executed by the one or more processors, cause the one or more processors to implement a method for brain tumor segmentation based on spatial feature attention mechanism as described above.
Accordingly, the present application also provides a computer readable storage medium having stored thereon computer instructions, wherein the instructions, when executed by a processor, implement a brain tumor segmentation method based on spatial feature attention mechanism as described above.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.
Claims (10)
1. A brain tumor segmentation method based on a spatial feature attention mechanism is characterized by comprising the following steps:
acquiring a test set and a training set of a four-mode brain MRI sequence;
constructing a spatial feature attention U-type network;
training the spatial feature attention U-type network by using the training set;
inputting the test set into a trained spatial feature attention U-shaped network to obtain a brain tumor segmentation result;
wherein the spatial feature attention U-type network is based on a 3D U-type network and is combined with a spatial feature attention module;
the 3D U type network consists of an encoder and a decoder, wherein a jump connection is formed between the encoder and the decoder, the encoder is used for extracting a feature map of a brain tumor from the four-modality brain MRI sequence and reducing the feature map, and the decoder is used for restoring the reduced feature map, locating the position of the brain tumor from the feature map and segmenting the brain tumor;
the spatial feature attention module is composed of a spatial attention submodule and a feature attention submodule, the feature attention submodule is used for generating a feature attention map according to the feature map of the brain tumor, the feature attention map is multiplied element-wise with the feature map of the brain tumor to obtain a first feature map, the first feature map is added to the feature map of the brain tumor to obtain a second feature map, the spatial attention submodule is used for generating a spatial attention map according to the second feature map, the spatial attention map is multiplied element-wise with the second feature map to highlight the spatial positions of salient features and obtain a third feature map, and the third feature map is added to the second feature map to obtain the output of the spatial feature attention module.
2. The method according to claim 1, wherein the encoder is composed of n sequentially connected encoding blocks, and a down-sampling layer is arranged between every two encoding blocks, the encoding blocks are used for extracting the feature map of the brain tumor, and the down-sampling layer is used for reducing the dimension size of each feature map by half under the condition that the number of feature map channels is kept unchanged.
3. The method of claim 2, wherein one of the spatial feature attention modules is inserted between two adjacent coding blocks and before a downsampling layer between the coding blocks.
4. The method of claim 1, wherein the spatial attention submodule comprises: a MaxPooling layer, an AvgPooling layer, a concatenation layer, a downsampling layer, an upsampling layer and a Sigmoid layer, wherein the MaxPooling layer and the AvgPooling layer are connected in parallel and then sequentially connected in series with the concatenation layer, the downsampling layer, the upsampling layer and the Sigmoid layer.
5. The method of claim 1, wherein the feature attention submodule comprises: a MaxPooling layer, an AvgPooling layer, a concatenation layer, a Transformer layer, a 3D convolutional layer and a Sigmoid layer, wherein the MaxPooling layer and the AvgPooling layer are connected in parallel and then sequentially connected in series with the concatenation layer, the Transformer layer, the 3D convolutional layer and the Sigmoid layer.
6. The method of claim 1, wherein the decoder comprises: n decoding blocks and a 3D convolutional layer, wherein the n decoding blocks are connected in reverse order and followed by the 3D convolutional layer, an upsampling layer is arranged between every two adjacent decoding blocks, the upsampling layer is used for doubling the size of each feature dimension while keeping the number of feature channels unchanged, and the 3D convolutional layer is used for adjusting the number of channels of the output features of the first decoding block to be consistent with the number of classes in the brain tumor segmentation result.
7. The method of claim 6, wherein the nth decoding block receives the output of the encoder, and the rest of the decoding blocks respectively receive the output of the spatial feature attention module.
8. A brain tumor segmentation device based on a spatial feature attention mechanism, comprising:
the acquisition module is used for acquiring a test set and a training set of a four-mode brain MRI sequence;
the building module is used for building a spatial feature attention U-shaped network;
the training module is used for training the spatial feature attention U-shaped network by utilizing the training set;
the output module is used for inputting the test set into the trained spatial characteristic attention U-shaped network to obtain a brain tumor segmentation result;
wherein the spatial feature attention U-type network is based on a 3D U-type network and is combined with a spatial feature attention module;
the 3D U type network consists of an encoder and a decoder, wherein a jump connection is formed between the encoder and the decoder, the encoder is used for extracting a feature map of a brain tumor from the four-modality brain MRI sequence and reducing the feature map, and the decoder is used for restoring the reduced feature map, locating the position of the brain tumor from the feature map and segmenting the brain tumor;
the spatial feature attention module is composed of a spatial attention submodule and a feature attention submodule, the feature attention submodule is used for generating a feature attention map according to the feature map of the brain tumor, the feature attention map is multiplied element-wise with the feature map of the brain tumor to obtain a first feature map, the first feature map is added to the feature map of the brain tumor to obtain a second feature map, the spatial attention submodule is used for generating a spatial attention map according to the second feature map, the spatial attention map is multiplied element-wise with the second feature map to highlight the spatial positions of salient features and obtain a third feature map, and the third feature map is added to the second feature map to obtain the output of the spatial feature attention module.
9. An electronic device, comprising:
one or more processors;
a memory for storing one or more programs;
when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
10. A computer-readable storage medium having stored thereon computer instructions, which when executed by a processor, perform the steps of the method according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110749496.2A CN113487560A (en) | 2021-07-02 | 2021-07-02 | Brain tumor segmentation method and device based on spatial feature attention mechanism |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113487560A true CN113487560A (en) | 2021-10-08 |
Family
ID=77939482
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110749496.2A Withdrawn CN113487560A (en) | 2021-07-02 | 2021-07-02 | Brain tumor segmentation method and device based on spatial feature attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113487560A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113821659A (en) * | 2021-11-25 | 2021-12-21 | 北京中科开迪软件有限公司 | Optical disk library image storage method and system and image retrieval method and system |
CN113821659B (en) * | 2021-11-25 | 2022-04-08 | 北京中科开迪软件有限公司 | Optical disk library image storage method and system and image retrieval method and system |
CN116363111A (en) * | 2023-04-06 | 2023-06-30 | 哈尔滨市科佳通用机电股份有限公司 | Method for identifying clamping fault of guide rod of railway wagon manual brake |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112489061B (en) | Deep learning intestinal polyp segmentation method based on multi-scale information and parallel attention mechanism | |
Zhou et al. | A multi-modality fusion network based on attention mechanism for brain tumor segmentation | |
CN112767417B (en) | Multi-modal image segmentation method based on cascaded U-Net network | |
CN113487560A (en) | Brain tumor segmentation method and device based on spatial feature attention mechanism | |
CN112017192B (en) | Glandular cell image segmentation method and glandular cell image segmentation system based on improved U-Net network | |
CN111583285A (en) | Liver image semantic segmentation method based on edge attention strategy | |
CN113436173B (en) | Abdominal multi-organ segmentation modeling and segmentation method and system based on edge perception | |
CN117078941B (en) | Cardiac MRI segmentation method based on context cascade attention | |
WO2020234349A1 (en) | Sampling latent variables to generate multiple segmentations of an image | |
WO2024011835A1 (en) | Image processing method and apparatus, device, and readable storage medium | |
CN116596846A (en) | Image segmentation method, image segmentation model construction method, device and medium | |
CN114519719A (en) | Brain tumor MR image segmentation method | |
CN111091575A (en) | Medical image segmentation method based on reinforcement learning method | |
CN116152650A (en) | Marine organism detection method based on CNN and Transformer bidirectional collaborative guidance network | |
CN117422715B (en) | Global information-based breast ultrasonic tumor lesion area detection method | |
WO2024104035A1 (en) | Long short-term memory self-attention model-based three-dimensional medical image segmentation method and system | |
Zhang et al. | 3d cross-scale feature transformer network for brain mr image super-resolution | |
CN115984296B (en) | Medical image segmentation method and system applying multi-attention mechanism | |
Yuan et al. | FM-Unet: Biomedical image segmentation based on feedback mechanism Unet | |
CN110570417B (en) | Pulmonary nodule classification device and image processing equipment | |
CN115115656A (en) | Cerebrovascular segmentation method and neural network segmentation model based on multi-center TOF-MRA image | |
Jianjian et al. | MCSC-UTNet: Honeycomb lung segmentation algorithm based on Separable Vision Transformer and context feature fusion | |
CN114581459A (en) | Improved 3D U-Net model-based segmentation method for image region of interest of preschool child lung | |
CN113744284A (en) | Brain tumor image region segmentation method and device, neural network and electronic equipment | |
Zhou et al. | Shape-Scale Co-Awareness Network for 3D Brain Tumor Segmentation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | ||
Application publication date: 20211008 |