CN114332463A - MR brain tumor image instance segmentation method, device, equipment and storage medium - Google Patents


Info

Publication number
CN114332463A
CN114332463A
Authority
CN
China
Prior art keywords
network
feature map
brain tumor
convolution
tumor image
Prior art date
Legal status
Pending
Application number
CN202111671947.1A
Other languages
Chinese (zh)
Inventor
刘薇
何进
姜立
陈科
王英
李敬东
Current Assignee
Chengdu Vocational and Technical College of Industry
Original Assignee
Chengdu Vocational and Technical College of Industry
Priority date
Filing date
Publication date
Application filed by Chengdu Vocational and Technical College of Industry filed Critical Chengdu Vocational and Technical College of Industry
Priority to CN202111671947.1A
Publication of CN114332463A

Landscapes

  • Magnetic Resonance Imaging Apparatus (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of computer vision, and discloses a method, a device, equipment and a storage medium for instance segmentation of MR brain tumor images. On the basis of the existing Mask RCNN instance segmentation model and the feature pyramid network FPN, a Bottom-up Path Augmentation is added to the FPN, and a convolutional block attention module CBAM is introduced both in the Top-down up-sampling path of the FPN and in the down-sampling path of the Bottom-up Path Augmentation. This better handles the variation of tumor shape and size caused by individual differences and change over time; at the same time, by fusing low-resolution and high-resolution feature maps, the detection of small targets is effectively improved, the miss rate for small tumors is reduced, and early tumor screening is aided, so that segmentation performance is improved and universality is better.

Description

MR brain tumor image instance segmentation method, device, equipment and storage medium
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a method, a device, equipment and a storage medium for instance segmentation of MR brain tumor images.
Background
Magnetic Resonance Imaging (MRI) is a non-invasive imaging technique that can characterize the mesoscopic properties of an organ or tumor without intervention. MRI provides good spatial anatomical information, offers good soft-tissue contrast, and can produce tomographic images in any orientation without reconstruction; the resulting multi-sequence, multi-modal images supply rich information for determining the nature of lesions, help doctors and researchers understand the pathophysiology of brain diseases, and assist in imaging diagnosis, prognosis and tumor treatment. In the clinical setting, accurate detection and segmentation of lesions is of great significance for assessing a patient's disease and formulating a treatment plan.
However, MR brain tumor images present the following objective difficulties: (1) complexity of the brain tissue structure: an MRI brain image contains normal tissues such as cerebral cortex, cerebrospinal fluid, white matter and gray matter, as well as pathological tissues such as enhancing tumor, non-enhancing tumor, edema and necrotic regions inside the tumor; some of these structures appear highly similar and have low contrast, making them hard to distinguish; (2) multi-scale: tumors vary in shape and size, and both change over time; (3) unfixed spatial position: the tumor region is not fixed, and there are individual differences; (4) multiple targets: an image may contain multiple targets or multiple substructures, which are easy to miss; (5) class imbalance: the lesion region occupies only a small part of the MR image.
These factors mean that accurate segmentation of MR brain tumors still faces great challenges. At present, most mainstream segmentation algorithms have a single function: they can address only one of the above factors and generalize poorly. Some researchers use U-Net or its improved network structures to segment different tissues, organs or lesions in medical images; the encoder-decoder structure with skip connections effectively improves segmentation accuracy. For the problem of segmenting the multiple sub-regions of glioma, others integrate the segmentation of multiple substructures into one multi-task network and train the task branches successively in a curriculum-learning manner. There are also methods that improve the efficiency of brain tumor segmentation by making the model lightweight while maintaining accuracy. When tumors are far apart, most models can segment only some of the targets; they lack a global field of view and easily miss detections. Mainstream models also perform poorly on multi-scale tumor detection: small tumors have a high miss rate, and large tumors are difficult to segment completely. Therefore, given the objective complexity of MR brain tumor images, it is necessary to design a segmentation model and scheme with better performance and universality.
A Convolutional Neural Network (CNN) stacks multiple convolutional layers, pooling layers and fully connected layers and convolves the image with learned kernel parameters, forming a feature-learning model with strong robustness and adaptivity. Compared with manual segmentation and conventional segmentation methods, the greatest advantage of applying a CNN to brain tumor segmentation may be its ability to extract the complex features of tumors in brain MRI images. The CNN feature-detection layers avoid explicit feature extraction: features are learned implicitly from the training data, which can greatly improve brain tumor segmentation accuracy. Hence some researchers began to apply CNNs to brain tumor segmentation; for example, Pereira et al. designed a CNN-based method for automatically segmenting MRI brain tumors.
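As a toy illustration of the kernel operation described above (not code from the patent; the image and identity kernel are invented for the example), a minimal "valid" 2D convolution can be sketched in pure Python:

```python
# Minimal "valid" 2D convolution in pure Python: the kernel operation
# that CNN layers stack to extract image features.
def conv2d(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            acc = 0.0
            for dy in range(kh):
                for dx in range(kw):
                    acc += image[y + dy][x + dx] * kernel[dy][dx]
            row.append(acc)
        out.append(row)
    return out

# A 3x3 kernel over a 4x4 image yields a 2x2 feature map.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[0, 0, 0],
          [0, 1, 0],
          [0, 0, 0]]  # identity kernel: picks the centre pixel of each window
feature_map = conv2d(image, kernel)
```

A real CNN layer would learn the kernel weights by back-propagation; here they are fixed only to make the sliding-window arithmetic visible.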
Building on CNNs, the RCNN neural network applied convolutional networks to object detection: a Selective Search region-proposal method extracts region proposals (RP) from the image, image features are then extracted automatically, and finally object classification and localization are performed. However, because each proposal must pass through the CNN for feature extraction, a large amount of feature-file storage space and running time is consumed. SPP-Net improved on this: in the feature-extraction stage, a spatial pyramid structure is added before the fully connected layer to accommodate feature maps of different sizes; spatial pyramid pooling extracts fixed-length feature vectors, features are extracted from the whole image only once, speed is improved, storage is reduced, and shared convolution was proposed for the first time. This method still has drawbacks: the classifier is an SVM, so the network cannot be trained end to end, and the staged training procedure is complex. Fast RCNN therefore replaced the SVM with softmax for multi-class prediction, achieving end-to-end training. But this improvement also has a weakness: region proposals are still produced by Selective Search, an algorithm that runs only on the CPU, so this stage wastes a lot of time.
To solve the problem that generating region proposals consumes large computing resources and makes detection too slow, Faster RCNN was proposed: an RPN (Region Proposal Network) replaces Selective Search for generating proposals, and the region-proposal network shares the convolutional features of the whole image with the detection network, so region proposal consumes essentially no extra computation time. Faster RCNN, however, has its own problem: after the down-sampled Conv4 output reaches the RPN, the RPN uses a single high-level feature map for object classification and bounding-box (bbox) regression. If only one scale of feature map is used, the semantic representation of that level may not be strong; if a later level is chosen, the semantics are strong but the position information may be insufficient, and small objects, which carry little pixel information to begin with, are easily lost during down-sampling.
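The RPN described above regresses bounding boxes from anchors laid over the feature map. As a hedged illustration (the helper name, scales and ratios are assumptions for the example, not from the patent), anchor generation at one feature-map location can be sketched as:

```python
# Hypothetical sketch: anchors of several scales and aspect ratios
# centred at one feature-map location, as (x1, y1, x2, y2) boxes.
def make_anchors(cx, cy, scales, ratios):
    anchors = []
    for s in scales:
        for r in ratios:
            w = s * (r ** 0.5)   # width grows with sqrt(aspect ratio)
            h = s / (r ** 0.5)   # height shrinks correspondingly
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors

# Two square anchors centred at (16, 16).
anchors = make_anchors(16, 16, scales=[32, 64], ratios=[1.0])
```

In a full RPN this grid of anchors is scored for objectness and refined by the bbox regression head; the sketch only shows the geometry.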
To handle objects of obviously different sizes, the classical method is multi-scale enhancement with an image pyramid, but this brings a large computational cost. The Feature Pyramid Network (FPN) structure was therefore proposed: it takes a feature pyramid as the basic structure and makes predictions on the feature maps of each level separately, solving the multi-scale problem in object detection at only a small time cost. Meanwhile, Mask RCNN, a network model integrating object detection and instance segmentation, was developed: on the basis of Faster RCNN, a fully convolutional segmentation sub-network is added, so a segmentation task is added to the original detection task. Mask RCNN can locate small targets and segment them; it performs instance segmentation, which must not only classify at the pixel level but also separate different individuals of the same class. Compared with semantic segmentation, this places higher demands on the algorithm, since instance segmentation must distinguish similar objects more finely. However, the long path between the highest-level features and the lower-level features makes it difficult for Mask RCNN to access accurate localization information.
Current two-stage detection and instance segmentation models such as Mask RCNN consist of three parts: a backbone network, a neck network and a head network, where the classical optimization methods for the neck include FPN, PANet, BiFPN and so on. The low-level features of the network, having been down-sampled fewer times, contain more target-position information but less semantic information, which does not favor target recognition; the high-level features contain more semantic information but localize targets less accurately. To address this, the neck of Mask RCNN adopts the feature pyramid network FPN, which adds and fuses high- and low-level features in a top-down manner so that the network adapts to images of different scales, predicts independently on the multi-scale feature maps, and improves the detection and segmentation of small targets. On this basis, some researchers add a Bottom-up Path Augmentation to the FPN, which increases the utilization of low-level information during segmentation; others cascade a weighted bidirectional feature pyramid network BiFPN in the neck, which, compared with PANet, adds cross-layer links, allows deeper fusion of image features and improves segmentation accuracy, but reduces segmentation efficiency.
Thus the three neck optimization modes, FPN, PANet and BiFPN, each have advantages and disadvantages: FPN is efficient but not accurate enough; PANet strikes a balance between segmentation accuracy and efficiency; BiFPN is accurate but inefficient. How to provide an MR brain tumor image instance segmentation scheme with both higher accuracy and higher efficiency on the basis of the Mask RCNN network model is therefore a subject in urgent need of research by those skilled in the art.
Disclosure of Invention
To overcome the limited segmentation accuracy and efficiency of existing MR brain tumor instance segmentation techniques, the invention aims to provide a novel MR brain tumor image instance segmentation method, device, computer equipment and computer-readable storage medium.
In a first aspect, the present invention provides an MR brain tumor image instance segmentation method, comprising:
acquiring a brain tumor public data training set, wherein the brain tumor public data training set comprises a plurality of MR brain tumor image samples;
respectively carrying out gray scale data normalization processing on each MR brain tumor image sample in the multiple MR brain tumor image samples to obtain corresponding sample images;
sending all the sample images into a Mask RCNN network model integrating target detection and instance segmentation for training, and obtaining a trained MR brain tumor image instance segmentation model, wherein the Mask RCNN network model consists of a backbone network, a neck network and a head network, the backbone network adopts a residual network, the neck network adopts the feature pyramid network FPN, and the head network comprises a prediction-box classification network, a bounding-box bbox regression network and a mask branch network;
the residual error network is used for outputting six first feature maps corresponding to the sample image after the sample image is input;
the Top-down path of the feature pyramid network FPN is designed as: introducing a first convolution attention module in each of the prediction layers from the third prediction layer P3 to the seventh prediction layer P7 to perform important feature weighting in both spatial and channel dimensions using a channel-first-space strategy such that the ith first feature map F among the six first feature maps is inputiThen, the ith first feature map F is obtainediCorresponding first refined new feature map FiWherein i is a natural number and is e [3, 7 ]];
The residual network is further configured, through different numbers of down-samplings, such that the first refined new feature map F_i″, after 1×1 dimension reduction and 2× up-sampling, is fused by element-wise addition with another first refined new feature map F_{i-1}″ that is output by the next prediction layer and likewise reduced in dimension by a 1×1 convolution, and a 3×3 convolution is applied to the fusion result to obtain a second feature map P_{i-1}″;
A Bottom-up Path Augmentation is added to the feature pyramid network FPN and is designed as follows: first, a second convolution attention module is introduced into each prediction layer, so as to perform important-feature weighting in both the spatial and the channel dimension with a channel-first, space-second strategy, such that when a second feature map P_{i-1}″ is input, a second refined new feature map N_{i-1}″ corresponding to the second feature map P_{i-1}″ is obtained; then the following bottom-up iteration is carried out to obtain new second refined new feature maps for prediction-box classification, bounding-box bbox regression and mask-branch generation: the lower-level second refined new feature map N_i″ undergoes a 3×3 convolution and a down-sampling with stride 2, the result is fused by element-wise addition with the laterally connected higher-level second feature map P_{i+1}″, and finally a 3×3 convolution is applied to the fusion result to obtain a new second refined new feature map N_{i+1}″;
and importing the MR image to be processed into the MR brain tumor image instance segmentation model, and outputting an instance segmentation result.
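The fusion steps above only add feature maps whose spatial sizes match, so the pyramid levels must halve resolution level by level. A minimal sketch of that size bookkeeping (the input size of 256 is an illustrative assumption; the level range [3, 7] follows the patent):

```python
# Size bookkeeping for the top-down fusion: level i of the pyramid has
# spatial size input_size / 2^i, so a map up-sampled x2 from level i
# matches level i-1 for element-wise addition.
input_size = 256
level_size = {i: input_size // (2 ** i) for i in range(3, 8)}  # P3..P7

def upsampled_matches(i):
    # F_i'' up-sampled x2 should match the size of level i-1
    return level_size[i] * 2 == level_size[i - 1]

checks = [upsampled_matches(i) for i in range(4, 8)]
```

The same invariant, read in the other direction, is what makes the stride-2 down-sampling in the bottom-up path line up with the next-higher level.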
Based on the above invention, an attention-based MR brain tumor instance segmentation scheme is provided: on the basis of the existing Mask RCNN instance segmentation model and the feature pyramid network FPN, a Bottom-up Path Augmentation is added to the FPN, and a convolutional block attention module CBAM is introduced both in the Top-down up-sampling path of the FPN and in the down-sampling path of the Bottom-up Path Augmentation. This better handles the variation of tumor shape and size caused by individual differences and change over time; at the same time, by fusing low-resolution and high-resolution feature maps, the detection of small targets is effectively improved, the miss rate for small tumors is reduced, and early tumor screening is aided. Adding the bottom-up augmentation path on top of the FPN allows the accurate localization information in the low-level features to be exploited, and because the attention mechanism CBAM is introduced on the augmentation path, better attention coefficients over the tumor feature maps are formed, the tumor's feature information in the high-dimensional non-linear space is better preserved, the model is more sensitive to the spatial position of the target, and the effectiveness of multi-target detection and segmentation is improved. Meanwhile, the scheme balances segmentation efficiency and accuracy: experimental analysis shows that its segmentation accuracy is higher than that of PANet and on a par with a three-level cascaded BiFPN, while its efficiency is higher than BiFPN's with a lower time cost; segmentation performance is thus improved and universality is better.
In addition, because CBAM is a lightweight, general-purpose module, its overhead is negligible and its impact on segmentation efficiency is small, so the algorithm remains efficient while model performance improves. And because target detection and instance segmentation are integrated, and the bbox makes region segmentation and classification more accurate, the output class probabilities have higher reference value for clinical use; compared with pure semantic segmentation, the scheme is therefore better suited to detecting and segmenting multiple targets and small targets, and the MR brain tumor instance segmentation model has good universality.
In one possible design, the brain tumor public data training set adopts the training subset of the brain tumor public data set BRATS2018, wherein each MR brain tumor image sample in the plurality of MR brain tumor image samples has the following 4 modalities: a T1 modality, a T2 modality, a T1ce modality and a Flair modality.
In one possible design, respectively performing gray-scale data normalization processing on each MR brain tumor image sample in the plurality of MR brain tumor image samples to obtain corresponding sample images includes: normalizing each MR brain tumor image sample to zero mean and unit standard deviation.
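The zero-mean, unit-standard-deviation normalization can be sketched as a per-sample z-score over a flat list of grey values (the guard for constant images is an added assumption, not from the patent):

```python
# Per-sample z-score normalisation: shift to zero mean, scale to unit
# standard deviation.
import math

def zscore(values):
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(var) or 1.0  # guard: leave constant images unscaled
    return [(v - mean) / std for v in values]

normalised = zscore([10.0, 20.0, 30.0])
```

A real pipeline would apply this per modality over the whole 3D volume, typically computing statistics over brain voxels only; the sketch shows just the arithmetic.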
In one possible design, respectively performing gray-scale data normalization processing on each MR brain tumor image sample in the plurality of MR brain tumor image samples to obtain corresponding sample images further comprises:
performing data augmentation processing on the normalized MR brain tumor image samples in the following ways: applying a random intensity offset to each channel, and applying random rotation, mirror flipping, elastic deformation and/or scaling to the three-axis data.
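Two of the listed augmentations, mirror flipping and a per-channel random intensity offset, can be sketched as follows (the ±0.1 offset range is an illustrative assumption; rotation, elastic deformation and scaling are omitted for brevity):

```python
# Two simple augmentations on a 2D image stored as nested lists.
import random

def mirror_flip(image):
    # horizontal mirror: reverse each row
    return [row[::-1] for row in image]

def intensity_offset(image, rng, max_offset=0.1):
    # add one random offset to every pixel of the channel
    off = rng.uniform(-max_offset, max_offset)
    return [[v + off for v in row] for row in image]

img = [[1.0, 2.0], [3.0, 4.0]]
flipped = mirror_flip(img)
shifted = intensity_offset(img, random.Random(0))
```

In practice these transforms are applied identically to the image and its segmentation mask so the labels stay aligned.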
In one possible design, the residual network adopts a ResNet101 network structure containing seven groups of convolutional layers;
in the ResNet101 network structure, the first group of convolutional layers consists of 64 convolution kernels of size 7 × 7, followed by a max-pooling layer;
in the ResNet101 network structure, the second group of convolutional layers comprises convolutions with stride 1, whose kernels are, respectively, 64 of size 1 × 1, 64 of size 3 × 3 and 256 of size 1 × 1;
in the ResNet101 network structure, the third group of convolutional layers likewise comprises convolutions with stride 1, whose kernels are, respectively, 128 of size 1 × 1, 128 of size 3 × 3 and 512 of size 1 × 1;
in the ResNet101 network structure, the fourth group of convolutional layers likewise comprises convolutions with stride 1, whose kernels are, respectively, 256 of size 1 × 1, 256 of size 3 × 3 and 1024 of size 1 × 1;
in the ResNet101 network structure, the fifth group of convolutional layers likewise comprises convolutions with stride 1, whose kernels are, respectively, 512 of size 1 × 1, 512 of size 3 × 3 and 2048 of size 1 × 1;
in the ResNet101 network structure, the sixth group of convolutional layers likewise comprises convolutions with stride 1, whose kernels are, respectively, 1024 of size 1 × 1, 1024 of size 3 × 3 and 4096 of size 1 × 1;
in the ResNet101 network structure, the seventh group of convolutional layers likewise comprises convolutions with stride 1, whose kernels are, respectively, 2048 of size 1 × 1, 2048 of size 3 × 3 and 8192 of size 1 × 1.
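The groups above follow the residual bottleneck pattern 1×1 reduce → 3×3 → 1×1 expand. A quick check (reading each group as a bottleneck width and an output channel count) confirms the stated expansion is uniformly 4×:

```python
# Channel bookkeeping for the bottleneck groups listed above:
# (bottleneck width of the 1x1/3x3 convs, output channels of the final 1x1).
groups = [(64, 256), (128, 512), (256, 1024), (512, 2048),
          (1024, 4096), (2048, 8192)]
expansions = [out // width for width, out in groups]
```

This is only arithmetic on the numbers already given in the text; it is not an implementation of the network.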
In one possible design, inputting the i-th first feature map F_i among the six first feature maps and obtaining the first refined new feature map F_i″ corresponding to the i-th first feature map F_i includes:
feeding the first feature map F_i into a channel attention submodule of the first convolution attention module, so as to obtain, in a learning manner, a first weight coefficient M_c(F_i) in the channel attention dimension:
M_c(F_i) = σ(MLP(AvgPool(F_i)) + MLP(MaxPool(F_i))) = σ(W_1(W_0(F_avg^c)) + W_1(W_0(F_max^c)))
wherein σ() denotes the normalization function sigmoid(), used to continuously update the weight coefficient M_c(F_i) according to the propagation feedback of the neural network so as to guide the model in selecting tumor features from the lower layers; AvgPool() denotes the average-pooling function, MaxPool() the max-pooling function, and MLP() the multi-layer perceptron; F_avg^c denotes the feature map obtained after average-pooling the first feature map F_i, and F_max^c the feature map obtained after max-pooling it; W_0 and W_1 denote the weights of the multi-layer perceptron MLP(), with W_0 ∈ R^(C/r×C) and W_1 ∈ R^(C×C/r), where R denotes the real numbers, C/r is the number of neurons in the first layer of the MLP, r is a preset reduction rate and C is the number of neurons in the second layer; the activation function of the MLP is the linear rectification function ReLU;
performing element-wise multiplication of the first weight coefficient M_c(F_i) with the first feature map F_i to obtain a first new feature map F_i′;
feeding the first new feature map F_i′ into a spatial attention submodule of the first convolution attention module, so as to obtain a second weight coefficient M_s(F_i′) in the spatial attention dimension:
M_s(F_i′) = σ(Conv_7×7([F_avg^s; F_max^s]))
wherein Conv_7×7() denotes a convolution operation with a filter of size 7 × 7; F_avg^s denotes the feature map obtained after average-pooling the first new feature map F_i′, with F_avg^s ∈ R^(1×H×W), and F_max^s the feature map obtained after max-pooling it, with F_max^s ∈ R^(1×H×W);
performing convolution operation processing of the second weight coefficient M_s(F_i′) with the first new feature map F_i′ to obtain the first refined new feature map F_i″ corresponding to the i-th first feature map F_i.
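The channel-then-spatial weighting above can be sketched in pure Python on a tiny C×H×W tensor. This is a structural illustration only: the shared MLP and the 7×7 convolution are collapsed to identity mappings (an explicit simplification, not the patent's design), so only the pooling, sigmoid and element-wise reweighting steps are shown.

```python
# Structural sketch of channel-first, space-second attention (CBAM-style)
# on a tensor stored as C nested H x W lists.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(x):
    # one scalar weight per channel from avg- and max-pooled descriptors
    weights = []
    for ch in x:
        flat = [v for row in ch for v in row]
        avg = sum(flat) / len(flat)
        mx = max(flat)
        weights.append(sigmoid(avg + mx))   # shared MLP taken as identity
    return [[[v * w for v in row] for row in ch]
            for ch, w in zip(x, weights)]

def spatial_attention(x):
    # one weight per spatial position from cross-channel avg and max
    h, w = len(x[0]), len(x[0][0])
    att = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [ch[i][j] for ch in x]
            avg = sum(vals) / len(vals)
            mx = max(vals)
            att[i][j] = sigmoid(avg + mx)   # 7x7 conv taken as identity
    return [[[x[c][i][j] * att[i][j] for j in range(w)] for i in range(h)]
            for c in range(len(x))]

x = [[[1.0, 0.0], [0.0, 0.0]],   # channel 0
     [[0.0, 0.0], [0.0, 2.0]]]   # channel 1
refined = spatial_attention(channel_attention(x))
```

In the real module the pooled descriptors pass through the learned MLP (channel branch) and the concatenated avg/max maps through a learned 7×7 convolution (spatial branch) before the sigmoid; the data flow, however, is exactly as sketched.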
In one possible embodiment, inputting the second feature map P_{i-1}″ and obtaining the second refined new feature map N_{i-1}″ corresponding to the second feature map P_{i-1}″ includes:
feeding the second feature map P_{i-1}″ into a channel attention submodule of the second convolution attention module, so as to obtain, in a learning manner, a third weight coefficient M_c(P_{i-1}″) in the channel attention dimension;
performing element-wise multiplication of the third weight coefficient M_c(P_{i-1}″) with the second feature map P_{i-1}″ to obtain a second new feature map P′_{i-1};
feeding the second new feature map P′_{i-1} into a spatial attention submodule of the second convolution attention module, so as to obtain a fourth weight coefficient M_s(P′_{i-1}) in the spatial attention dimension;
performing convolution operation processing of the fourth weight coefficient M_s(P′_{i-1}) with the second new feature map P′_{i-1} to obtain the second refined new feature map N_{i-1}″ corresponding to the second feature map P_{i-1}″.
In a second aspect, the invention provides an MR brain tumor image instance segmentation device, which comprises a training set acquisition unit, a preprocessing unit, a model training unit and a model application unit that are sequentially in communication connection;
the training set acquisition unit is used for acquiring a brain tumor public data training set, wherein the brain tumor public data training set comprises a plurality of MR brain tumor image samples;
the preprocessing unit is used for respectively carrying out gray scale data normalization processing on each MR brain tumor image sample in the MR brain tumor image samples to obtain corresponding sample images;
the model training unit is used for sending all the sample images into a Mask RCNN network model integrating target detection and instance segmentation for training, and obtaining a trained MR brain tumor image instance segmentation model, wherein the Mask RCNN network model consists of a backbone network, a neck network and a head network, the backbone network adopts a residual network, the neck network adopts the Feature Pyramid Network (FPN), and the head network comprises a prediction-box classification network, a bounding-box bbox regression network and a mask branch network;
the residual error network is used for outputting six first feature maps corresponding to the sample image after the sample image is input;
the Top-down path of the feature pyramid network FPN is designed as follows: a first convolution attention module is introduced into each of the prediction layers from the third prediction layer P3 to the seventh prediction layer P7, so as to perform important-feature weighting in both the spatial and the channel dimension with a channel-first, space-second strategy, such that when the i-th first feature map F_i among the six first feature maps is input, a first refined new feature map F_i″ corresponding to the i-th first feature map F_i is obtained, wherein i is a natural number and i ∈ [3, 7];
The residual network is further configured, through different numbers of down-samplings, such that the first refined new feature map F_i″, after 1×1 dimension reduction and 2× up-sampling, is fused by element-wise addition with another first refined new feature map F_{i-1}″ that is output by the next prediction layer and likewise reduced in dimension by a 1×1 convolution, and a 3×3 convolution is applied to the fusion result to obtain a second feature map P_{i-1}″;
A Bottom-up Path Augmentation is added to the feature pyramid network FPN and is designed as follows: first, a second convolution attention module is introduced into each prediction layer, so as to perform important-feature weighting in both the spatial and the channel dimension with a channel-first, space-second strategy, such that when a second feature map P_{i-1}″ is input, a second refined new feature map N_{i-1}″ corresponding to the second feature map P_{i-1}″ is obtained; then the following bottom-up iteration is carried out to obtain new second refined new feature maps for prediction-box classification, bounding-box bbox regression and mask-branch generation: the lower-level second refined new feature map N_i″ undergoes a 3×3 convolution and a down-sampling with stride 2, the result is fused by element-wise addition with the laterally connected higher-level second feature map P_{i+1}″, and finally a 3×3 convolution is applied to the fusion result to obtain a new second refined new feature map N_{i+1}″;
the model application unit is used for importing the MR image to be processed into the MR brain tumor image instance segmentation model and outputting an instance segmentation result.
In a third aspect, the present invention provides a computer device comprising a memory, a processor and a transceiver which are sequentially in communication connection, wherein the memory is used for storing a computer program, the transceiver is used for transceiving information, and the processor is used for reading the computer program and executing the MR brain tumor image instance segmentation method according to the first aspect or any possible design of the first aspect.
In a fourth aspect, the present invention provides a computer-readable storage medium having stored thereon instructions which, when run on a computer, cause the computer to perform the MR brain tumor image instance segmentation method according to the first aspect or any possible design of the first aspect.
In a fifth aspect, the invention provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the MR brain tumor image instance segmentation method according to the first aspect or any possible design of the first aspect.
Drawings
To more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of an example segmentation method of an MR brain tumor image provided by the present invention.
FIG. 2 is a schematic diagram of the operation process of the convolution attention module CBAM provided by the present invention.
FIG. 3 is a schematic diagram of a partial network structure of the CBAM-FPN model provided by the present invention.
Fig. 4 is a schematic structural diagram of an example segmentation device for an MR brain tumor image provided by the present invention.
Fig. 5 is a schematic structural diagram of a computer device provided by the present invention.
Detailed Description
The invention is further described with reference to the following figures and specific embodiments. It should be noted that the description of the embodiments is provided to help understanding of the present invention, but the present invention is not limited thereto. Specific structural and functional details disclosed herein are merely representative of exemplary embodiments of the invention. This invention may, however, be embodied in many alternate forms and should not be construed as limited to the embodiments set forth herein.
It will be understood that, although the terms first, second, etc. may be used herein to describe various objects, these objects should not be limited by these terms. These terms are only used to distinguish one object from another. For example, a first object may be referred to as a second object, and similarly, a second object may be referred to as a first object, without departing from the scope of example embodiments of the present invention.
It should be understood that the term "and/or", as it may appear herein, merely describes an association relationship between associated objects, meaning that three relationships may exist; e.g., "A and/or B" may mean: A exists alone, B exists alone, or A and B exist simultaneously. The term "/and", as it may appear herein, describes another association relationship, meaning that two relationships may exist; e.g., "A /and B" may mean: A exists alone, or A and B exist simultaneously. In addition, the character "/", as it may appear herein, generally means that the associated objects before and after it are in an "or" relationship.
As shown in figs. 1 to 3, the MR brain tumor image instance segmentation method provided in the first aspect of this embodiment may be, but is not limited to being, executed by a computer device with certain computing resources, for example an electronic device such as a personal computer (PC — a multipurpose computer whose size, price and performance make it suitable for personal use; desktop computers, notebook computers, small notebook computers, tablet computers, ultrabooks and the like all belong to this category), a smart phone, a personal digital assistant (PDA) or a wearable device. As shown in fig. 1, the MR brain tumor image instance segmentation method may include, but is not limited to, the following steps S1 to S4.
S1, obtaining a brain tumor public data training set, wherein the brain tumor public data training set comprises a plurality of MR brain tumor image samples.
In step S1, specifically, the brain tumor public data training set may adopt the training subset of the brain tumor public data set BRATS2018, in which each MR brain tumor image sample among the plurality of MR brain tumor image samples has the following 4 modalities: a T1 modality, a T2 modality, a T1ec modality, and a Flair modality. The four modalities can be understood as four different dimensions of information of the magnetic resonance image, and the image of each sequence has shape (155, 240, 240). In addition, the brain tumor public data set BRATS2018 contains 285 cases in total and has three subsets, namely a training subset, a test subset and a leaderboard subset.
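The four modality volumes of one case can, for example, be stacked into a single multi-channel input. A minimal NumPy sketch — the array names and the use of random data in place of real BRATS2018 volumes are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

# One synthetic volume per modality; BRATS2018 stores each sequence
# with shape (155, 240, 240) = (slices, height, width).
shape = (155, 240, 240)
t1    = np.random.rand(*shape).astype(np.float32)
t2    = np.random.rand(*shape).astype(np.float32)
t1ec  = np.random.rand(*shape).astype(np.float32)
flair = np.random.rand(*shape).astype(np.float32)

# Stack the T1, T2, T1ec and Flair modalities along a new leading axis,
# giving one multi-channel sample of shape (4, 155, 240, 240).
sample = np.stack([t1, t2, t1ec, flair], axis=0)
```

This per-case 4-channel layout is a common convention for multi-modal MR input; the patent itself does not prescribe a storage order for the modalities.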
And S2, respectively carrying out gray scale data normalization processing on each MR brain tumor image sample in the multiple MR brain tumor image samples to obtain corresponding sample images.
In step S2, although the brain tumor public data set BRATS2018 has already been registered, resampled to 1 × 1 × 1 mm and skull-stripped, for the segmentation task it is still necessary to perform gray-scale data normalization before the data are sent to the subsequent model for training. That is, performing gray-scale data normalization on each MR brain tumor image sample among the plurality of MR brain tumor image samples to obtain the corresponding sample images specifically includes: normalizing the MR brain tumor image samples to zero mean and unit standard deviation, which is a conventional preprocessing step. In addition, in order to improve the target detection and segmentation capability of the subsequently trained model, data augmentation also needs to be performed on the normalized MR brain tumor image samples; that is, the step of performing gray-scale data normalization on each MR brain tumor image sample to obtain the corresponding sample images further includes performing data augmentation on the normalized samples as follows: applying a random intensity offset to each channel, and applying random rotation, mirror flipping, elastic deformation and/or scaling to the three-axis data, all of which are conventional preprocessing operations.
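The zero-mean/unit-standard-deviation normalization and two of the augmentations mentioned above (intensity offset and mirror flipping) can be sketched as follows; the function names, the offset range and the flip probability are illustrative assumptions, and rotation, elastic deformation and scaling are omitted for brevity:

```python
import numpy as np

def zscore_normalize(volume):
    """Normalize an MR volume to zero mean and unit standard deviation."""
    volume = volume.astype(np.float32)
    return (volume - volume.mean()) / (volume.std() + 1e-8)

def augment(volume, rng):
    """Random intensity offset plus random mirror flips along the three axes."""
    out = volume + rng.uniform(-0.1, 0.1)   # random intensity offset
    for axis in range(3):                   # mirror flip each axis with p = 0.5
        if rng.random() < 0.5:
            out = np.flip(out, axis=axis)
    return out

rng = np.random.default_rng(0)
vol = rng.normal(loc=100.0, scale=20.0, size=(32, 64, 64))  # synthetic stand-in volume
norm = zscore_normalize(vol)
aug = augment(norm, rng)
```

Normalization is applied per volume here; per-channel or brain-mask-restricted statistics are equally common choices that the patent text does not pin down.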
S3. Sending all the sample images into a Mask RCNN network model integrating target detection and instance segmentation for training, to obtain a trained MR brain tumor image instance segmentation model, wherein the Mask RCNN network model is composed of a backbone network, a neck network and a head network; the backbone network adopts a residual network, the neck network adopts a feature pyramid network FPN, and the head network comprises a prediction-box classification network, a bounding-box bbox regression network and a Mask branch network.
In step S3, the Mask RCNN network model is an existing network model with the following basic structure: a backbone network, a neck network and a head network. The residual network is used to output, after a sample image is input, six first feature maps corresponding to the sample image; in detail, a ResNet101 network structure comprising seven groups of convolution layers can be adopted. In this ResNet101 network structure, the first group of convolution layers consists of 64 convolution kernels of size 7 × 7, followed by a max pooling layer; the second group of convolution layers comprises convolutions with step size 1, whose kernels are, respectively, 64 of size 1 × 1, 64 of size 3 × 3 and 256 of size 1 × 1; the third group likewise comprises convolutions with step size 1, whose kernels are, respectively, 128 of size 1 × 1, 128 of size 3 × 3 and 512 of size 1 × 1; the fourth group likewise comprises convolutions with step size 1, whose kernels are, respectively, 256 of size 1 × 1, 256 of size 3 × 3 and 1024 of size 1 × 1; the fifth group likewise comprises convolutions with step size 1, whose kernels are, respectively, 512 of size 1 × 1, 512 of size 3 × 3 and 2048 of size 1 × 1; the sixth group likewise comprises convolutions with step size 1, whose kernels are, respectively, 1024 of size 1 × 1, 1024 of size 3 × 3 and 4096 of size 1 × 1; the seventh group likewise comprises convolutions with step size 1, whose kernels are, respectively, 2048 of size 1 × 1, 2048 of size 3 × 3 and 8192 of size 1 × 1.
In addition, the feature pyramid network FPN has been introduced in the background, and the prediction box classification network, the bounding box bbox regression network and the Mask branch network are all conventional configurations of head networks in the existing Mask RCNN network model.
This embodiment mainly optimizes the design of the neck network. First, the Top-down path of the feature pyramid network FPN is designed as follows: a first convolution attention module is introduced into each prediction layer from the third prediction layer P3 to the seventh prediction layer P7, so as to perform important-feature weighting in both the spatial (Spatial) and channel (Channel) dimensions using a channel-first-then-spatial strategy, such that after the i-th first feature map F_i among the six first feature maps is input, the first refined new feature map F_i″ corresponding to F_i is obtained, where i is a natural number and i ∈ [3, 7]. The Top-down path is an up-sampling process over the high-level feature maps (the up-sampling ratio between adjacent layers is 2); the network structure from the third prediction layer P3 to the seventh prediction layer P7 is a conventional configuration of the feature pyramid network FPN. The first convolution attention module is a simple and effective attention module for feed-forward convolutional neural networks: given an intermediate feature map, it sequentially infers attention maps along two independent dimensions (namely channel and space) and then multiplies the attention maps with the input feature map for adaptive feature refinement. Since CBAM (Convolutional Block Attention Module) is a lightweight general-purpose module, its overhead is negligible; it can be seamlessly integrated into any CNN architecture and trained end-to-end together with the underlying CNN.
Specifically, as shown in fig. 2, after the i-th first feature map F_i among the six first feature maps is input, obtaining the first refined new feature map F_i″ corresponding to F_i includes, but is not limited to, the following steps S311 to S314.
S311. Send the first feature map F_i into the channel attention submodule of the first convolution attention module, so as to obtain by learning a first weight coefficient M_c^i in the channel attention dimension:

M_c^i = σ( MLP(AvgPool(F_i)) + MLP(MaxPool(F_i)) ) = σ( W_1(W_0(F_avg^i)) + W_1(W_0(F_max^i)) )

where σ() denotes the sigmoid normalization function, used to continuously update the weight coefficient M_c^i according to the propagation feedback of the neural network so as to guide the model in selecting tumor features from the lower layers; AvgPool() denotes the average pooling function, MaxPool() denotes the max pooling function, and MLP() denotes the multilayer perceptron; F_avg^i denotes the feature map obtained after average pooling of the first feature map F_i, and F_max^i denotes the feature map obtained after max pooling of F_i; W_0 and W_1 denote the weights of the multilayer perceptron MLP(), with W_0 ∈ R^(C/r × C) and W_1 ∈ R^(C × C/r), where R denotes the real numbers, C/r is the number of neurons in the first layer of the MLP, r is a preset reduction rate, and C is the number of neurons in the second layer of the MLP; the activation function of the MLP is the linear rectification function ReLU.

In step S311, the first weight coefficient M_c^i is equivalent to an attention map in the channel attention dimension and serves to promote important features and suppress features that are unimportant to the current task.
S312. Perform element-wise multiplication of the first weight coefficient M_c^i and the first feature map F_i to obtain the first new feature map F_i′.

In step S312, the specific formula is:

F_i′ = M_c^i ⊗ F_i

where ⊗ denotes element-wise (unit) multiplication.
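Steps S311–S312 — channel attention followed by element-wise reweighting — can be sketched in NumPy for a feature map of shape (C, H, W). The shared two-layer MLP with reduction rate r and ReLU activation follows the standard CBAM formulation; the random weight initialization and the specific sizes are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w0, w1):
    """CBAM channel attention: sigma(MLP(AvgPool(F)) + MLP(MaxPool(F)))."""
    avg = feat.mean(axis=(1, 2))        # F_avg: (C,), spatial average pooling
    mx  = feat.max(axis=(1, 2))         # F_max: (C,), spatial max pooling
    def mlp(v):                         # shared MLP with ReLU between the layers
        return w1 @ np.maximum(w0 @ v, 0.0)
    return sigmoid(mlp(avg) + mlp(mx))  # M_c: (C,), each entry in (0, 1)

rng = np.random.default_rng(1)
C, H, W, r = 16, 8, 8, 4
feat = rng.normal(size=(C, H, W))
w0 = rng.normal(scale=0.1, size=(C // r, C))  # W0 in R^(C/r x C)
w1 = rng.normal(scale=0.1, size=(C, C // r))  # W1 in R^(C x C/r)

m_c = channel_attention(feat, w0, w1)
reweighted = m_c[:, None, None] * feat        # step S312: unit multiplication
```

Broadcasting the (C,) attention vector over the spatial axes realizes the per-channel reweighting without an explicit loop.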
S313. Send the first new feature map F_i′ into the spatial attention submodule of the first convolution attention module, so as to obtain a second weight coefficient M_s^i in the spatial attention dimension:

M_s^i = σ( Conv_7×7( [F′_avg^i ; F′_max^i] ) )

where Conv_7×7() denotes a convolution operation with filter size 7 × 7; F′_avg^i denotes the feature map obtained after average pooling of F_i′, with F′_avg^i ∈ R^(1×H×W); and F′_max^i denotes the feature map obtained after max pooling of F_i′, with F′_max^i ∈ R^(1×H×W).
In step S313, the second weight coefficient M_s^i is equivalent to an attention map in the spatial attention dimension.
S314. Perform element-wise multiplication of the second weight coefficient M_s^i and the first new feature map F_i′ to obtain the first refined new feature map F_i″ corresponding to the i-th first feature map F_i.

In step S314, the specific formula is:

F_i″ = M_s^i ⊗ F_i′

where ⊗ denotes element-wise (unit) multiplication.
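Steps S313–S314 can be sketched similarly: channel-wise average and max pooling give two (H, W) maps, which are stacked and passed through a 7 × 7 convolution and a sigmoid, and the resulting spatial map reweights every channel. The convolution is implemented directly with zero padding; the kernel values are an illustrative assumption:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(feat, kernel):
    """CBAM spatial attention: sigma(Conv7x7([AvgPool(F'); MaxPool(F')]))."""
    avg = feat.mean(axis=0)                # (H, W), pooled over channels
    mx  = feat.max(axis=0)                 # (H, W)
    stacked = np.stack([avg, mx], axis=0)  # (2, H, W)
    k = kernel.shape[-1]                   # 7
    pad = k // 2
    padded = np.pad(stacked, ((0, 0), (pad, pad), (pad, pad)))
    h, w = avg.shape
    out = np.zeros((h, w))
    for i in range(h):                     # naive 2-input / 1-output 7x7 convolution
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)                    # M_s: (H, W), each entry in (0, 1)

rng = np.random.default_rng(2)
feat = rng.normal(size=(16, 8, 8))               # F', e.g. the output of step S312
kernel = rng.normal(scale=0.05, size=(2, 7, 7))  # 7x7 filter over the 2 pooled maps
m_s = spatial_attention(feat, kernel)
refined = m_s[None, :, :] * feat                 # step S314: F'' = M_s (x) F'
```

The naive double loop keeps the arithmetic explicit; a real implementation would use a framework convolution instead.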
After the first refined new feature map F_i″ is obtained, the maps produced at the different down-sampling stages of the residual network are fused as follows: the first refined new feature map F_i″, after 1 × 1 dimension reduction and 2× up-sampling, is element-wise added to another first refined new feature map F_{i-1}″ that is output by the next prediction layer down and has likewise undergone 1 × 1 dimension reduction, and a 3 × 3 convolution is then applied to the sum to obtain the second feature map P″_{i-1}. The specific formula for this process is:

P″_{i-1} = Conv_3×3( Up_×2( Conv_1×1( F_i″ ) ) ⊕ Conv_1×1( F_{i-1}″ ) )

where ⊕ denotes element-wise (unit) addition and Up_×2() denotes 2× up-sampling.
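The top-down merge just described — 1 × 1 reduction, 2× up-sampling, element-wise addition with the map from the next layer down, then a 3 × 3 convolution — can be sketched as follows. Nearest-neighbour up-sampling, a shared reduction weight and a box-filter stand-in for the learned 3 × 3 convolution are illustrative assumptions:

```python
import numpy as np

def conv1x1(feat, w):
    """1x1 convolution = per-pixel channel mixing: (C_out, C_in) applied to (C_in, H, W)."""
    return np.tensordot(w, feat, axes=([1], [0]))

def upsample2x(feat):
    """Nearest-neighbour 2x spatial up-sampling."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def conv3x3_smooth(feat):
    """Stand-in for the learned 3x3 convolution: a 3x3 box filter per channel."""
    c, h, w = feat.shape
    padded = np.pad(feat, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(feat)
    for di in range(3):
        for dj in range(3):
            out += padded[:, di:di + h, dj:dj + w]
    return out / 9.0

rng = np.random.default_rng(3)
f_hi = rng.normal(size=(256, 8, 8))    # F''_i    : higher level, lower resolution
f_lo = rng.normal(size=(256, 16, 16))  # F''_(i-1): one prediction layer down
w_red = rng.normal(scale=0.1, size=(64, 256))  # 1x1 reduction weights (shared here for brevity)

# P''_(i-1) = Conv3x3( Up2x(Conv1x1(F''_i)) (+) Conv1x1(F''_(i-1)) )
p_lo = conv3x3_smooth(upsample2x(conv1x1(f_hi, w_red)) + conv1x1(f_lo, w_red))
```

The element-wise addition requires matched shapes, which is exactly why the 1 × 1 reduction and the 2× up-sampling precede the fusion.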
In this embodiment, as a further part of the optimized design of the neck network, a Bottom-up Path Augmentation is added to the feature pyramid network FPN, designed as follows: a second convolution attention module is introduced into each prediction layer so that, using a channel-first-then-spatial strategy, important-feature weighting is performed in both the spatial and channel dimensions, such that after a second feature map P″_{i-1} is input, the second refined new feature map N″_i corresponding to P″_{i-1} is obtained; then the following bottom-up iteration is performed to obtain new second refined feature maps for prediction-box classification, bounding-box bbox regression and mask-branch generation: the lower-layer second refined new feature map N″_i undergoes a 3 × 3 convolution and a down-sampling with step size 2, the result is element-wise added to the higher-layer second feature map P″_{i+1} passed in through the lateral connection, and finally a 3 × 3 convolution is applied to the sum to obtain the new second refined feature map N″_{i+1}.
The second convolution attention module is likewise a simple and effective attention module for feed-forward convolutional neural networks. With reference to the aforementioned steps S311 to S314, after the second feature map P″_{i-1} is input, obtaining the second refined new feature map N″_i corresponding to P″_{i-1} specifically includes steps S321 to S324:

S321. Send the second feature map P″_{i-1} into the channel attention submodule of the second convolution attention module, so as to obtain by learning a third weight coefficient in the channel attention dimension.

S322. Perform element-wise multiplication of the third weight coefficient and the second feature map P″_{i-1} to obtain a second new feature map P′_{i-1}.

S323. Send the second new feature map P′_{i-1} into the spatial attention submodule of the second convolution attention module, so as to obtain a fourth weight coefficient in the spatial attention dimension.

S324. Perform element-wise multiplication of the fourth weight coefficient and the second new feature map P′_{i-1} to obtain the second refined new feature map N″_i corresponding to the second feature map P″_{i-1}.

For details of steps S321 to S324, reference may be made to steps S311 to S314, which are not repeated here. In addition, as shown in fig. 3, the specific process of obtaining the new second refined feature map N″_{i+1} can be expressed by the iterative formula:

N″_{i+1} = Conv_3×3( Conv_3×3,s=2( N″_i ) ⊕ P″_{i+1} )

where ⊕ denotes element-wise (unit) addition and Conv_3×3,s=2() denotes a 3 × 3 convolution with step size 2.
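The bottom-up iteration — down-sample the lower refined map with stride 2, add it element-wise to the higher lateral map, then smooth with a 3 × 3 convolution — can be sketched as follows. A 2 × 2 average pooling stands in for the learned stride-2 3 × 3 convolution, and the final 3 × 3 convolution is omitted since it does not change the shape; both are illustrative assumptions:

```python
import numpy as np

def downsample2x(feat):
    """Stand-in for the stride-2 3x3 convolution: 2x2 average pooling."""
    c, h, w = feat.shape
    return feat.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

rng = np.random.default_rng(4)
n_i  = rng.normal(size=(64, 32, 32))   # N''_i    : lower-level refined map
p_up = rng.normal(size=(64, 16, 16))   # P''_(i+1): higher-level lateral map

# N''_(i+1) = Conv3x3( DownSample(N''_i) (+) P''_(i+1) )
n_next = downsample2x(n_i) + p_up
```

Each iteration halves the spatial resolution of the propagated map so that it matches the next lateral connection, exactly mirroring (in reverse) the 2× up-sampling of the top-down path.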
after all the sample images are sent into the improved Mask RCNN network model, the Mask RCNN network model can be trained, and an MR brain tumor image example segmentation model integrating target detection and example segmentation is obtained. In addition, in order to verify the MR brain tumor image example segmentation model, the image samples can be sent into the MR brain tumor image example segmentation model for example segmentation correctness verification after being subjected to gray data normalization processing aiming at the test subset in the brain tumor public data set BRATS 2018.
Illustratively, this embodiment also adopts the following experimental conditions. (1) Hardware environment — CPU: Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20 GHz; memory: 24 GB; graphics card: TITAN RTX with 24 GB RAM. (2) Software environment — OS: Ubuntu 16.04.6 LTS; development language: Python; basic development framework: PyTorch. A comparison experiment is performed between the optimized model (named the CBAM-FPN model in this application) and several existing models (Mask RCNN, PANet, BiFPN, etc.): the backbone networks of all compared models use the ResNet101 network, and the head networks all use the same head as Mask RCNN; the accuracy indexes use Dice (Dice similarity coefficient) and AP (average precision); the efficiency indexes use the number of parameters and FLOPs (floating-point operations); and, following the BRATS2018 division of brain tumor structure, the models are evaluated on three tumor regions, namely the whole tumor region (WT), the tumor core region (TC) and the enhancing tumor region (ET). The results and their analysis are shown in tables 1 to 3.
TABLE 1 comparison of Performance indicators on brain tumor segmentation
[Table 1 is reproduced as an image in the original publication; its data are not recoverable here.]
TABLE 2 comparison of the detection results of different brain tumors
[Table 2 is reproduced as an image in the original publication; its data are not recoverable here.]
TABLE 3 Comparison of parameters and computing power of different segmentation models

Model | Parameters (M) | FLOPs (B)
Mask RCNN | 43.21 | 126.33
PANet | 52.37 | 162.92
BiFPN (3-cascade) | 67.02 | 262.31
CBAM-FPN | 54.80 | 173.75
As can be seen from table 2, the CBAM-FPN model proposed in this embodiment has certain advantages in small-target detection; and as can be seen from table 3, under the same backbone network and head network, the parameter count of the CBAM-FPN model lies between those of PANet and BiFPN (3-cascade), and its computation cost is roughly on a par with that of PANet while remaining clearly lower than that of BiFPN (3-cascade).
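The Dice similarity coefficient used as the accuracy index above measures the overlap between a predicted mask A and a ground-truth mask B as 2|A ∩ B| / (|A| + |B|). A minimal sketch for binary masks (the small epsilon guarding against empty masks is an illustrative implementation detail):

```python
import numpy as np

def dice(pred, target, eps=1e-8):
    """Dice similarity coefficient for binary masks: 2|A n B| / (|A| + |B|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return 2.0 * inter / (pred.sum() + target.sum() + eps)

# Two overlapping 16-pixel squares: their intersection is 4 pixels,
# so Dice = 2 * 4 / (16 + 16) = 0.25.
a = np.zeros((8, 8), dtype=bool); a[2:6, 2:6] = True
b = np.zeros((8, 8), dtype=bool); b[4:8, 4:8] = True
```

Identical masks give a Dice of 1, disjoint masks give 0, which is why the coefficient is well suited to region-overlap evaluation such as WT/TC/ET scoring.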
And S4, importing the MR image to be processed into the MR brain tumor image example segmentation model, and outputting an example segmentation result.
In step S4, the MR image to be processed is also sent to the MR brain tumor image example segmentation model for example segmentation after the gray scale data normalization processing is performed, so as to obtain an example segmentation result.
Thus, based on the MR brain tumor image instance segmentation method described in the foregoing steps S1 to S4, an attention-based MR brain tumor instance segmentation scheme is provided. That is, on the basis of the existing Mask RCNN instance segmentation model and the feature pyramid network FPN, a Bottom-up Path Augmentation is added to the FPN, and the convolutional attention mechanism CBAM is then introduced, respectively, into the up-sampling path of the Top-down path and the down-sampling path of the Bottom-up Path Augmentation. This better addresses the variation in tumor shape and size caused by individual differences and changes over time; at the same time, the low-resolution feature maps and the high-resolution feature maps are fused, which can effectively improve the detection of small targets, reduce the missed-detection rate of small-size tumors, and help the early screening of tumors. Moreover, because a Bottom-up augmentation path is added on the basis of the FPN, the accurate localization information of the low-level features is exploited, and introducing the attention mechanism CBAM on the augmentation path forms better attention coefficients over the tumor feature maps, so that the feature information of the tumor in a high-dimensional nonlinear space is better preserved, the spatial position of the target is handled more sensitively, and the effectiveness of multi-target detection and segmentation is improved. Meanwhile, the scheme of the application balances segmentation efficiency and segmentation precision: experimental analysis shows that its segmentation precision is higher than that of PANet and comparable to that of the three-level cascaded BiFPN, while its efficiency is higher and its time cost lower than those of BiFPN, thereby improving segmentation performance with better universality.
In addition, because CBAM is a lightweight general-purpose module, its overhead is negligible and its influence on segmentation efficiency is small, so the algorithm remains efficient while the model performance is improved. Furthermore, because target detection and instance segmentation are integrated, and the bounding box bbox makes region segmentation and classification judgment more accurate, the output classification probabilities have higher reference value for clinical purposes; compared with pure semantic segmentation, the method is therefore better suited to the detection and segmentation of multiple targets and small targets, and the MR brain tumor instance segmentation model has good universality.
As shown in fig. 4, a second aspect of the present embodiment provides a virtual device for implementing the MR brain tumor image example segmentation method according to the first aspect, including a training set obtaining unit, a preprocessing unit, a model training unit, and a model application unit, which are sequentially connected in a communication manner;
the training set acquisition unit is used for acquiring a brain tumor public data training set, wherein the brain tumor public data training set comprises a plurality of MR brain tumor image samples;
the preprocessing unit is used for respectively carrying out gray scale data normalization processing on each MR brain tumor image sample in the MR brain tumor image samples to obtain corresponding sample images;
the model training unit is used for sending all the sample images into a Mask RCNN network model integrating target detection and example segmentation to be trained, and obtaining a trained MR brain tumor image example segmentation model, wherein the Mask RCNN network model is composed of a backbone network, a neck portion tack network and a head network, the backbone network adopts a residual error network, the tack network adopts a Feature Pyramid Network (FPN), and the head network comprises a prediction frame classification network, a frame ox regression network and a Mask branch network;
the residual error network is used for outputting six first feature maps corresponding to the sample image after the sample image is input;
the Top-down path of the feature pyramid network FPN is designed as: introducing a first convolution attention module in each of the prediction layers from the third prediction layer P3 to the seventh prediction layer P7 to perform important feature weighting in both spatial and channel dimensions using a channel-first-space strategy such that the ith first feature map F among the six first feature maps is inputiThen, the ith first feature map F is obtainediCorresponding first refined new feature map FiWherein i is a natural number and is e [3, 7 ]];
the residual network is further used so that, across the different down-sampling stages, the first refined new feature map F_i″, after 1 × 1 dimension reduction and 2× up-sampling, is element-wise added to another first refined new feature map F_{i-1}″ that is output by the next prediction layer down and has likewise undergone 1 × 1 dimension reduction, and a 3 × 3 convolution is applied to the sum to obtain the second feature map P″_{i-1};
a Bottom-up Path Augmentation is added to the feature pyramid network FPN, designed as follows: first, a second convolution attention module is introduced into each prediction layer so that, using a channel-first-then-spatial strategy, important-feature weighting is performed in both the spatial and channel dimensions, such that after a second feature map P″_{i-1} is input, the second refined new feature map N″_i corresponding to P″_{i-1} is obtained; then the following bottom-up iteration is performed to obtain new second refined feature maps for prediction-box classification, bounding-box bbox regression and mask-branch generation: the lower-layer second refined new feature map N″_i undergoes a 3 × 3 convolution and a down-sampling with step size 2, the result is element-wise added to the higher-layer second feature map P″_{i+1} passed in through the lateral connection, and finally a 3 × 3 convolution is applied to the sum to obtain the new second refined feature map N″_{i+1};
The model application unit is used for importing the MR image to be processed into the MR brain tumor image example segmentation model and outputting an example segmentation result.
For the working process, working details and technical effects of the foregoing apparatus provided in the second aspect of this embodiment, reference may be made to the MR brain tumor image segmentation method described in the first aspect, which is not described herein again.
As shown in fig. 5, a third aspect of the present embodiment provides a computer device for executing the MR brain tumor image example segmentation method according to the first aspect, which includes a memory, a processor and a transceiver, which are sequentially and communicatively connected, wherein the memory is used for storing a computer program, the transceiver is used for transceiving information, and the processor is used for reading the computer program to execute the MR brain tumor image example segmentation method according to the first aspect. For example, the Memory may include, but is not limited to, a Random-Access Memory (RAM), a Read-Only Memory (ROM), a Flash Memory (Flash Memory), a First-in First-out (FIFO), and/or a First-in Last-out (FILO), and the like; the processor may be, but is not limited to, a microprocessor of the model number STM32F105 family. In addition, the computer device may also include, but is not limited to, a power module, a display screen, and other necessary components.
For the working process, working details and technical effects of the foregoing computer device provided in the third aspect of this embodiment, reference may be made to the MR brain tumor image segmentation method in the first aspect, which is not described herein again.
A fourth aspect of the present embodiment provides a computer-readable storage medium storing instructions for a method for MR brain tumor image instance segmentation according to the first aspect, i.e., the computer-readable storage medium has instructions stored thereon, which when executed on a computer, perform the method for MR brain tumor image instance segmentation according to the first aspect. The computer-readable storage medium refers to a carrier for storing data, and may include, but is not limited to, a computer-readable storage medium such as a floppy disk, an optical disk, a hard disk, a flash Memory, a flash disk and/or a Memory Stick (Memory Stick), and the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
For the working process, working details and technical effects of the foregoing computer-readable storage medium provided in the fourth aspect of this embodiment, reference may be made to the MR brain tumor image segmentation method in the first aspect, which is not described herein again.
A fifth aspect of the present embodiments provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the MR brain tumor image instance segmentation method according to the first aspect. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable devices.
Finally, it should be noted that the present invention is not limited to the above alternative embodiments, and anyone may derive various other forms of products in light of the present invention. The above detailed description should not be taken as limiting the scope of protection of the present invention, which is defined by the claims; the description is to be interpreted accordingly.

Claims (10)

1. An MR brain tumor image example segmentation method is characterized by comprising the following steps:
acquiring a brain tumor public data training set, wherein the brain tumor public data training set comprises a plurality of MR brain tumor image samples;
respectively carrying out gray scale data normalization processing on each MR brain tumor image sample in the multiple MR brain tumor image samples to obtain corresponding sample images;
sending all the sample images into a Mask RCNN network model integrating target detection and instance segmentation for training, to obtain a trained MR brain tumor image instance segmentation model, wherein the Mask RCNN network model is composed of a backbone network, a neck network and a head network; the backbone network adopts a residual network, the neck network adopts a feature pyramid network FPN, and the head network comprises a prediction-box classification network, a bounding-box bbox regression network and a Mask branch network;
the residual error network is used for outputting six first feature maps corresponding to the sample image after the sample image is input;
the Top-down path of the feature pyramid network FPN is designed as: introducing a first convolution attention module in each of the prediction layers from the third prediction layer P3 to the seventh prediction layer P7 to perform important feature weighting in both spatial and channel dimensions using a channel-first-space strategy such that the ith first feature map F among the six first feature maps is inputiThen, the ith first feature map F is obtainediCorresponding first refined new feature map FiWherein i is a natural number and is e [3, 7 ]];
the residual network obtains the six first feature maps through different down-sampling factors; the first refined new feature map Fi″, after 1×1 dimensionality-reduction and 2× up-sampling processing, is fused by element-wise addition with another first refined new feature map Fi-1″ that is output by the next prediction layer and has undergone 1×1 dimensionality reduction, and the element-wise addition result is subjected to a 3×3 convolution to obtain a second feature map Pi-1″;
a Bottom-up Path Augmentation is added to the feature pyramid network FPN and designed as follows: first, a second convolution attention module is introduced into each prediction layer, so as to perform important-feature weighting in both the spatial and the channel dimension using the channel-first-then-space strategy, such that, when a second feature map Pi-1″ is input, a second refined new feature map Ni″ corresponding to the second feature map Pi-1″ is obtained; then the following bottom-up iterative processing is carried out to obtain new second refined new feature maps used for prediction-box classification, bounding-box (bbox) regression and mask-branch generation: the second refined new feature map Ni″ of the lower layer is subjected to a 3×3 convolution with a down-sampling step size of 2, the processing result is fused by element-wise addition with the second feature map Pi+1″ of the upper layer transmitted through the lateral connection, and finally the element-wise addition result is subjected to a 3×3 convolution to obtain a new second refined new feature map Ni+1″;
And importing the MR image to be processed into the MR brain tumor image example segmentation model, and outputting an example segmentation result.
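Purely as an illustrative sketch (not part of the claims), the 2× up-sampling and element-wise addition fusion recited in the top-down path above can be expressed in NumPy; all function names below are hypothetical, and the 1×1 dimensionality-reduction and trailing 3×3 convolution of the claim are omitted:

```python
import numpy as np

def upsample2x_nearest(x):
    # 2x nearest-neighbour up-sampling of a (C, H, W) feature map
    return x.repeat(2, axis=1).repeat(2, axis=2)

def topdown_fuse(f_hi_res, f_lo_res):
    # element-wise addition fusion of a higher-resolution refined map (e.g.
    # the 1x1-reduced Fi-1'') with the up-sampled lower-resolution map Fi'';
    # the 3x3 convolution applied afterwards in the claim is not sketched here
    return f_hi_res + upsample2x_nearest(f_lo_res)
```

The element-wise addition keeps channel counts unchanged, which is why the claim reduces both inputs to the same dimensionality with 1×1 convolutions before fusing.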
2. The MR brain tumor image instance segmentation method according to claim 1, wherein the brain tumor public data training set adopts the training subset of the brain tumor public data set BRATS2018, and each MR brain tumor image sample in the plurality of MR brain tumor image samples has the following four modalities: a T1 modality, a T2 modality, a T1ce (contrast-enhanced) modality, and a Flair modality.
3. The MR brain tumor image instance segmentation method according to claim 1, wherein performing gray-scale data normalization processing on each MR brain tumor image sample in the plurality of MR brain tumor image samples to obtain a corresponding sample image comprises: normalizing each MR brain tumor image sample to have zero mean and unit standard deviation.
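Purely as an illustrative sketch (not part of the claims), the zero-mean, unit-standard-deviation normalization described above amounts to a z-score over the sample's gray values; the function name and the epsilon guard below are hypothetical additions:

```python
import numpy as np

def normalize_sample(sample, eps=1e-8):
    # zero-mean, unit-standard-deviation gray-scale normalization of one sample;
    # eps guards against division by zero on constant images (an assumption,
    # not stated in the claim)
    sample = np.asarray(sample, dtype=float)
    return (sample - sample.mean()) / (sample.std() + eps)
```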
4. The MR brain tumor image instance segmentation method according to claim 3, wherein performing gray-scale data normalization processing on each MR brain tumor image sample in the plurality of MR brain tumor image samples to obtain a corresponding sample image further comprises:
performing data augmentation processing on the normalized MR brain tumor image samples in the following manner: applying a random intensity offset to each channel, and applying random rotation processing, mirror flipping processing, elastic deformation processing and/or scaling processing to the three-axis data.
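Purely as an illustrative sketch (not part of the claims), two of the augmentations listed above — the per-channel random intensity offset and the random mirror flipping along the three spatial axes — can be expressed in NumPy; the offset range and function names are hypothetical, and random rotation, elastic deformation and scaling are omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(vol, max_shift=0.1):
    # vol: (modalities, X, Y, Z). Apply a random intensity offset per channel
    # (the +-0.1 range is an assumed value, not stated in the claim)
    vol = vol + rng.uniform(-max_shift, max_shift, size=(vol.shape[0], 1, 1, 1))
    # random mirror flip along each of the three spatial axes
    for ax in (1, 2, 3):
        if rng.random() < 0.5:
            vol = np.flip(vol, axis=ax)
    return vol
```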
5. The MR brain tumor image instance segmentation method according to claim 1, wherein the residual network adopts a ResNet101 network structure comprising seven groups of convolutional layers;
in the ResNet101 network structure, the first group of convolutional layers uses 64 convolution kernels of size 7 × 7 and is followed by a maximum pooling layer;
in the ResNet101 network structure, the second group of convolutional layers comprises convolutions with a step size of 1, whose kernels are 64 of size 1 × 1, 64 of size 3 × 3, and 256 of size 1 × 1, respectively;
in the ResNet101 network structure, the third group of convolutional layers also comprises convolutions with a step size of 1, whose kernels are 128 of size 1 × 1, 128 of size 3 × 3, and 512 of size 1 × 1, respectively;
in the ResNet101 network structure, the fourth group of convolutional layers also comprises convolutions with a step size of 1, whose kernels are 256 of size 1 × 1, 256 of size 3 × 3, and 1024 of size 1 × 1, respectively;
in the ResNet101 network structure, the fifth group of convolutional layers also comprises convolutions with a step size of 1, whose kernels are 512 of size 1 × 1, 512 of size 3 × 3, and 2048 of size 1 × 1, respectively;
in the ResNet101 network structure, the sixth group of convolutional layers also comprises convolutions with a step size of 1, whose kernels are 1024 of size 1 × 1, 1024 of size 3 × 3, and 4096 of size 1 × 1, respectively;
in the ResNet101 network structure, the seventh group of convolutional layers also comprises convolutions with a step size of 1, whose kernels are 2048 of size 1 × 1, 2048 of size 3 × 3, and 8192 of size 1 × 1, respectively.
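Purely as an illustrative sketch (not part of the claims), each group above is a 1×1 → 3×3 → 1×1 bottleneck; the hypothetical helper below counts its weights (biases omitted), which makes the cost of the listed kernel configurations easy to compare:

```python
def bottleneck_params(in_ch, mid_ch, out_ch):
    # weight count of one 1x1 -> 3x3 -> 1x1 bottleneck with in_ch input
    # channels, mid_ch intermediate channels and out_ch output channels
    return (in_ch * mid_ch * 1 * 1      # 1x1 reduction
            + mid_ch * mid_ch * 3 * 3   # 3x3 convolution
            + mid_ch * out_ch * 1 * 1)  # 1x1 expansion
```

For example, a 64/64/256 bottleneck fed 256 channels (as in the second group) has 256·64 + 64·64·9 + 64·256 = 69,632 weights.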
6. The MR brain tumor image instance segmentation method according to claim 1, wherein inputting the i-th first feature map Fi among the six first feature maps and obtaining the first refined new feature map Fi″ corresponding to the i-th first feature map Fi comprises:
sending the first feature map Fi into a channel attention submodule of the first convolution attention module, so as to obtain in a learned manner a first weight coefficient Mc(Fi) in the channel attention dimension:
Mc(Fi) = σ(MLP(AvgPool(Fi)) + MLP(MaxPool(Fi))) = σ(W1(W0(F_avg^c)) + W1(W0(F_max^c)))
wherein σ() denotes the normalization function sigmoid(), used to continuously update the weight coefficient Mc(Fi) according to the propagation feedback of the neural network, so as to guide the model in selecting tumor features from the lower layers; AvgPool() denotes the average pooling function, MaxPool() denotes the maximum pooling function, and MLP() denotes the multi-layer perceptron; F_avg^c denotes the feature map obtained after average pooling of the first feature map Fi, and F_max^c denotes the feature map obtained after maximum pooling of the first feature map Fi; W0 and W1 denote the weights of the multi-layer perceptron MLP(), with W0 ∈ R^(C/r×C) and W1 ∈ R^(C×C/r), wherein R denotes the real numbers, C/r denotes the number of neurons in the first layer of the multi-layer perceptron MLP(), r denotes a preset reduction rate, and C denotes the number of neurons in the second layer of the multi-layer perceptron MLP(); the activation function of the multi-layer perceptron MLP() adopts the linear rectification function ReLU;
performing element-wise multiplication of the first weight coefficient Mc(Fi) and the first feature map Fi to obtain a first new feature map Fi′;
sending the first new feature map Fi′ into a spatial attention submodule of the first convolution attention module, so as to obtain a second weight coefficient Ms(Fi′) in the spatial attention dimension:
Ms(Fi′) = σ(Conv7×7([AvgPool(Fi′); MaxPool(Fi′)])) = σ(Conv7×7([F′_avg^s; F′_max^s]))
wherein Conv7×7() denotes a convolution operation with a filter size of 7 × 7; F′_avg^s denotes the feature map obtained after average pooling of the first new feature map Fi′, with F′_avg^s ∈ R^(1×H×W); F′_max^s denotes the feature map obtained after maximum pooling of the first new feature map Fi′, with F′_max^s ∈ R^(1×H×W);
performing convolution operation processing on the second weight coefficient Ms(Fi′) and the first new feature map Fi′ to obtain the first refined new feature map Fi″ corresponding to the i-th first feature map Fi.
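Purely as an illustrative sketch (not part of the claims), the channel-first-then-space weighting described above can be expressed in NumPy. All names are hypothetical; the learned weights W0, W1 and the 7×7 filters are supplied as plain arrays, and — as in standard CBAM — the spatial weight is applied here by element-wise multiplication, standing in for the claim's "convolution operation processing":

```python
import numpy as np

def sigmoid(x):
    # the normalization function sigma() of the claim
    return 1.0 / (1.0 + np.exp(-x))

def conv2d_same(img, kernel):
    # naive 'same'-padded 2-D correlation, standing in for Conv7x7()
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img, dtype=float)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            out[y, x] = np.sum(padded[y:y + kh, x:x + kw] * kernel)
    return out

def channel_attention(f, w0, w1):
    # Mc(F) = sigmoid(MLP(AvgPool(F)) + MLP(MaxPool(F))), f: (C, H, W)
    avg, mx = f.mean(axis=(1, 2)), f.max(axis=(1, 2))
    mlp = lambda v: w1 @ np.maximum(w0 @ v, 0.0)  # two-layer MLP with ReLU
    return sigmoid(mlp(avg) + mlp(mx))            # shape (C,)

def spatial_attention(f, k_avg, k_max):
    # Ms(F') = sigmoid(Conv7x7([AvgPool(F'); MaxPool(F')])), pooling over channels
    avg, mx = f.mean(axis=0), f.max(axis=0)
    return sigmoid(conv2d_same(avg, k_avg) + conv2d_same(mx, k_max))

def cbam_refine(f, w0, w1, k_avg, k_max):
    # channel-first, then spatial weighting: Fi -> Fi' -> Fi''
    f1 = channel_attention(f, w0, w1)[:, None, None] * f
    return spatial_attention(f1, k_avg, k_max)[None, :, :] * f1
```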
7. The MR brain tumor image instance segmentation method according to claim 1, wherein inputting the second feature map Pi-1″ and obtaining the second refined new feature map Ni″ corresponding to the second feature map Pi-1″ comprises:
sending the second feature map Pi-1″ into a channel attention submodule of the second convolution attention module, so as to obtain in a learned manner a third weight coefficient Mc(Pi-1″) in the channel attention dimension;
performing element-wise multiplication of the third weight coefficient Mc(Pi-1″) and the second feature map Pi-1″ to obtain a second new feature map Pi-1‴;
sending the second new feature map Pi-1‴ into a spatial attention submodule of the second convolution attention module to obtain a fourth weight coefficient Ms(Pi-1‴) in the spatial attention dimension;
performing convolution operation processing on the fourth weight coefficient Ms(Pi-1‴) and the second new feature map Pi-1‴ to obtain the second refined new feature map Ni″ corresponding to the second feature map Pi-1″.
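Purely as an illustrative sketch (not part of the claims), one step of the bottom-up iteration recited in claim 1 — down-sampling the lower-layer refined map and fusing it by element-wise addition with the upper-layer map from the lateral connection — can be expressed in NumPy; the names are hypothetical, and stride-2 sub-sampling stands in for the claimed 3×3 convolution with step size 2 (the final 3×3 convolution is likewise omitted):

```python
import numpy as np

def downsample2x(x):
    # stride-2 sub-sampling of a (C, H, W) map, standing in for the claimed
    # 3x3 convolution with a step size of 2
    return x[:, ::2, ::2]

def bottomup_step(n_lower, p_upper):
    # element-wise addition fusion of the down-sampled lower-layer refined map
    # Ni'' with the upper-layer map Pi+1'' from the lateral connection
    return downsample2x(n_lower) + p_upper
```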
8. An MR brain tumor image instance segmentation device, characterized by comprising a training set acquisition unit, a preprocessing unit, a model training unit and a model application unit which are sequentially in communication connection;
the training set acquisition unit is used for acquiring a brain tumor public data training set, wherein the brain tumor public data training set comprises a plurality of MR brain tumor image samples;
the preprocessing unit is used for respectively performing gray-scale data normalization processing on each MR brain tumor image sample in the plurality of MR brain tumor image samples to obtain corresponding sample images;
the model training unit is used for sending all the sample images into a Mask RCNN network model integrating target detection and instance segmentation for training, so as to obtain a trained MR brain tumor image instance segmentation model, wherein the Mask RCNN network model consists of a backbone network, a neck network and a head network, the backbone network adopts a residual network, the neck network adopts a feature pyramid network FPN, and the head network comprises a prediction-box classification network, a bounding-box (bbox) regression network and a mask branch network;
the residual network is used for outputting six first feature maps corresponding to the sample image after the sample image is input;
the Top-down path of the feature pyramid network FPN is designed as follows: a first convolution attention module is introduced into each of the prediction layers from the third prediction layer P3 to the seventh prediction layer P7, so as to perform important-feature weighting in both the spatial and the channel dimension using a channel-first-then-space strategy, such that, when the i-th first feature map Fi among the six first feature maps is input, a first refined new feature map Fi″ corresponding to the i-th first feature map Fi is obtained, wherein i is a natural number and i ∈ [3, 7];
the residual network obtains the six first feature maps through different down-sampling factors; the first refined new feature map Fi″, after 1×1 dimensionality-reduction and 2× up-sampling processing, is fused by element-wise addition with another first refined new feature map Fi-1″ that is output by the next prediction layer and has undergone 1×1 dimensionality reduction, and the element-wise addition result is subjected to a 3×3 convolution to obtain a second feature map Pi-1″;
a Bottom-up Path Augmentation is added to the feature pyramid network FPN and designed as follows: first, a second convolution attention module is introduced into each prediction layer, so as to perform important-feature weighting in both the spatial and the channel dimension using the channel-first-then-space strategy, such that, when a second feature map Pi-1″ is input, a second refined new feature map Ni″ corresponding to the second feature map Pi-1″ is obtained; then the following bottom-up iterative processing is carried out to obtain new second refined new feature maps used for prediction-box classification, bounding-box (bbox) regression and mask-branch generation: the second refined new feature map Ni″ of the lower layer is subjected to a 3×3 convolution with a down-sampling step size of 2, the processing result is fused by element-wise addition with the second feature map Pi+1″ of the upper layer transmitted through the lateral connection, and finally the element-wise addition result is subjected to a 3×3 convolution to obtain a new second refined new feature map Ni+1″;
the model application unit is used for importing the MR image to be processed into the MR brain tumor image instance segmentation model and outputting an instance segmentation result.
9. A computer device, characterized by comprising a memory, a processor and a transceiver which are sequentially in communication connection, wherein the memory is used for storing a computer program, the transceiver is used for transceiving information, and the processor is used for reading the computer program and executing the MR brain tumor image instance segmentation method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized by having stored thereon instructions which, when run on a computer, perform the MR brain tumor image instance segmentation method according to any one of claims 1 to 7.
CN202111671947.1A 2021-12-31 2021-12-31 MR brain tumor image example segmentation method, device, equipment and storage medium Pending CN114332463A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111671947.1A CN114332463A (en) 2021-12-31 2021-12-31 MR brain tumor image example segmentation method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114332463A true CN114332463A (en) 2022-04-12

Family

ID=81020891

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111671947.1A Pending CN114332463A (en) 2021-12-31 2021-12-31 MR brain tumor image example segmentation method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114332463A (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111161275A (en) * 2018-11-08 2020-05-15 腾讯科技(深圳)有限公司 Method and device for segmenting target object in medical image and electronic equipment
CN111723845A (en) * 2020-05-19 2020-09-29 浙江工业大学 Cell image segmentation method based on Mask contour
CN112785609A (en) * 2021-02-07 2021-05-11 重庆邮电大学 CBCT tooth segmentation method based on deep learning
CN113378813A (en) * 2021-05-28 2021-09-10 陕西大智慧医疗科技股份有限公司 Modeling and target detection method and device based on attention balance feature pyramid
CN113379773A (en) * 2021-05-28 2021-09-10 陕西大智慧医疗科技股份有限公司 Dual attention mechanism-based segmentation model establishing and segmenting method and device


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
SANGHYUN WOO: "CBAM:Convolutional Block Attention Module", 《COMPUTER VISION - ECCV 2018》 *
SHU LIU: "Path Aggregation Network for Instance Segmentation", 《2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION》 *
XUAN NIE: "Attention Mask R-CNN for Ship Detection and Segmentation From Remote Sensing Images", 《IEEE ACCESS》 *
GUO QIFAN: "Multi-scale Feature Fusion Network Based on Feature Pyramids", 《JOURNAL OF ENGINEERING MATHEMATICS》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117036967A (en) * 2023-10-08 2023-11-10 江西师范大学 Remote sensing image description method for channel attention of non-visual perception area
CN117036967B (en) * 2023-10-08 2024-01-19 江西师范大学 Remote sensing image description method for channel attention of non-visual perception area

Similar Documents

Publication Publication Date Title
Cai et al. A review of the application of deep learning in medical image classification and segmentation
JP7143008B2 (en) Medical image detection method and device based on deep learning, electronic device and computer program
Khouloud et al. W-net and inception residual network for skin lesion segmentation and classification
Janghel et al. Deep convolution neural network based system for early diagnosis of Alzheimer's disease
Yap et al. Deep learning in diabetic foot ulcers detection: A comprehensive evaluation
Wu et al. Discrimination and conversion prediction of mild cognitive impairment using convolutional neural networks
Ding et al. FTransCNN: Fusing Transformer and a CNN based on fuzzy logic for uncertain medical image segmentation
Moujahid et al. Convolutional neural networks for multimodal brain MRI images segmentation: A comparative study
CN109949304B (en) Training and acquiring method of image detection learning network, image detection device and medium
CN115375711A (en) Image segmentation method of global context attention network based on multi-scale fusion
Qian et al. Skin lesion classification using CNNs with grouping of multi-scale attention and class-specific loss weighting
CN114549538A (en) Brain tumor medical image segmentation method based on spatial information and characteristic channel
Huang et al. DS-UNeXt: depthwise separable convolution network with large convolutional kernel for medical image segmentation
Jiang et al. MFI-Net: A multi-resolution fusion input network for retinal vessel segmentation
Song et al. Dual-branch network via pseudo-label training for thyroid nodule detection in ultrasound image
Nie et al. Recent advances in diagnosis of skin lesions using dermoscopic images based on deep learning
Abbas et al. Enhanced Skin Disease Diagnosis through Convolutional Neural Networks and Data Augmentation Techniques
CN114663445A (en) Three-dimensional heart image segmentation method based on multi-scale edge perception
Wang et al. RFPNet: Reorganizing feature pyramid networks for medical image segmentation
CN114332463A (en) MR brain tumor image example segmentation method, device, equipment and storage medium
Wu et al. Continuous refinement-based digital pathology image assistance scheme in medical decision-making systems
Zhuang et al. Class attention to regions of lesion for imbalanced medical image recognition
Zhou et al. MCFA-UNet: multiscale cascaded feature attention U-Net for liver segmentation
Saber et al. Multi-center, multi-vendor, and multi-disease cardiac image segmentation using scale-independent multi-gate UNET
Shao et al. FCG-Net: an innovative full-scale connected network for thyroid nodule segmentation in ultrasound images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220412