CN116645336A - MRI brain image gland pituitary segmentation method - Google Patents
- Publication number
- CN116645336A (application number CN202310522974.5A)
- Authority
- CN
- China
- Prior art keywords
- pituitary
- dimensional
- image
- mask
- cross
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 230000011218 segmentation Effects 0.000 title claims abstract description 59
- 230000001817 pituitary effect Effects 0.000 title claims abstract description 53
- 238000000034 method Methods 0.000 title claims abstract description 38
- 210000004556 brain Anatomy 0.000 title claims abstract description 20
- 210000004907 gland Anatomy 0.000 title description 2
- 206010062767 Hypophysitis Diseases 0.000 claims abstract description 55
- 210000003635 pituitary gland Anatomy 0.000 claims abstract description 55
- 238000007781 pre-processing Methods 0.000 claims abstract description 5
- 238000004891 communication Methods 0.000 claims description 9
- 238000003709 image segmentation Methods 0.000 claims description 8
- 238000004364 calculation method Methods 0.000 claims description 7
- 238000013527 convolutional neural network Methods 0.000 description 10
- 238000012549 training Methods 0.000 description 7
- 238000013135 deep learning Methods 0.000 description 3
- 201000010099 disease Diseases 0.000 description 3
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 3
- 230000006870 function Effects 0.000 description 3
- 238000013136 deep learning model Methods 0.000 description 2
- 238000000605 extraction Methods 0.000 description 2
- 230000004927 fusion Effects 0.000 description 2
- 230000009467 reduction Effects 0.000 description 2
- 238000005481 NMR spectroscopy Methods 0.000 description 1
- 230000006978 adaptation Effects 0.000 description 1
- 230000003044 adaptive effect Effects 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 238000002790 cross-validation Methods 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 238000003384 imaging method Methods 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 230000004807 localization Effects 0.000 description 1
- 230000014759 maintenance of location Effects 0.000 description 1
- 230000000877 morphologic effect Effects 0.000 description 1
- 238000010606 normalization Methods 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 210000000056 organ Anatomy 0.000 description 1
- 230000008569 process Effects 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 238000011158 quantitative evaluation Methods 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 238000010200 validation analysis Methods 0.000 description 1
- 230000000007 visual effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H30/00—ICT specially adapted for the handling or processing of medical images
- G16H30/20—ICT specially adapted for the handling or processing of medical images for handling medical images, e.g. DICOM, HL7 or PACS
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H30/00—ICT specially adapted for the handling or processing of medical images
- G16H30/40—ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10088—Magnetic resonance imaging [MRI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20021—Dividing image into blocks, subimages or windows
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30016—Brain
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Medical Informatics (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Radiology & Medical Imaging (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Data Mining & Analysis (AREA)
- Public Health (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Primary Health Care (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Epidemiology (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Computer Graphics (AREA)
- Quality & Reliability (AREA)
- Geometry (AREA)
- Magnetic Resonance Imaging Apparatus (AREA)
- Image Analysis (AREA)
Abstract
A method of MRI brain image adenohypophysis segmentation, comprising: preprocessing an MRI brain image acquired from a hospital by converting the original DICOM-format MRI data into a three-dimensional NIfTI-format image using ITK-SNAP software; cropping the three-dimensional NIfTI image and dividing it into sub-blocks; inputting the sub-blocks into VT-UNet to obtain segmentation results, then calculating the 3D bounding box of the pituitary gland and applying the bounding-box coordinates to the NIfTI image to obtain the pituitary region, thereby completing pituitary localization; using the output of the pituitary localization stage to generate 2D slices along the cross-sectional direction as the input of the next stage; and finally outputting two-dimensional pituitary labels and stacking all two-dimensional slices along the cross-sectional direction to reconstruct a three-dimensional image, thereby obtaining the final three-dimensional pituitary gland label. The method is fast, accurate, robust, efficient, and generalizes well.
Description
Technical Field
The invention belongs to the technical field of clinical pituitary gland segmentation, and particularly relates to an MRI brain image pituitary gland segmentation method.
Background
Magnetic resonance imaging (MRI) is currently accepted as the best imaging modality for displaying the morphological characteristics of the pituitary gland, because it is radiation-free and offers high soft-tissue contrast.
Existing pituitary segmentation practice is mainly manual, and at present there is no method dedicated to fully automatic pituitary segmentation. Manual pituitary segmentation, however, is time-consuming, subjective, and inaccurate. In recent years deep learning has developed rapidly, and convolutional neural networks (CNNs) have achieved great success in medical image segmentation. U-Net and its variants, whose symmetric encoder-decoder structure improves detail retention, form the dominant architecture for medical image segmentation and are widely used for organ and tissue segmentation. ViT (Vision Transformer) is a new trend in computer vision: its global attention models long-range dependencies, provides a new line of thought, and is gradually being applied to medical image segmentation. However, neither CNNs nor Transformers have yet been applied to pituitary segmentation. Although CNNs have strong feature-extraction capability, there is still much room for improvement in how well they retain global information; Transformers lead in natural language processing but, in computer vision, still have drawbacks in terms of data requirements and model generalization.
Disclosure of Invention
To overcome the above technical problems, the invention aims to provide an MRI brain image adenohypophysis segmentation method that is fast, accurate, robust, efficient, and generalizes well.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
an MRI brain image adenohypophysis segmentation method comprises the following steps:
step 1: positioning of the pituitary gland:
preprocessing the MRI brain image acquired from a hospital: the original DICOM-format MRI data are converted into a three-dimensional NIfTI-format image using ITK-SNAP software;
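For readers who prefer a scripted conversion, the following Python sketch shows one way to perform the DICOM-to-NIfTI step. The patent itself performs this conversion in ITK-SNAP, so the use of SimpleITK here is an assumption, not part of the claimed method.

```python
import SimpleITK as sitk

def dicom_dir_to_nifti(dicom_dir: str, out_path: str) -> None:
    """Read a DICOM series from a directory and write it as a 3D NIfTI volume."""
    reader = sitk.ImageSeriesReader()
    file_names = reader.GetGDCMSeriesFileNames(dicom_dir)  # slice files, sorted
    reader.SetFileNames(file_names)
    volume = reader.Execute()          # 3D image; spacing/orientation preserved
    sitk.WriteImage(volume, out_path)  # e.g. "brain_t1.nii.gz"
```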
cropping the three-dimensional NIfTI image to a size of 80 × 80 × 11 and dividing it into 49 sub-blocks of size 32 × 32 × 11; then inputting the 49 sub-blocks into VT-UNet to obtain 49 segmentation results, and recombining the 49 outputs into a mask of size 80 × 80 × 11 according to the partitioning scheme; finally, calculating the 3D bounding box of the pituitary mask, expanding the bounding box to 32 voxels in the coronal and sagittal directions and by two slices forward and backward in the cross-sectional direction so that it fully contains the pituitary region, and applying the bounding-box coordinates to the 80 × 80 × 11 NIfTI image to obtain the pituitary region, thereby completing pituitary localization;
step 2: pituitary segmentation:
using the output of the pituitary localization stage, 2D slices (32 × 32) are generated along the cross-sectional direction as the input of this stage;
finally, the two-dimensional pituitary labels are output, and all two-dimensional slices are stacked along the cross-sectional direction to reconstruct a three-dimensional image, thereby obtaining the final three-dimensional pituitary gland label.
In step 1, the MRI brain images are acquired with a Siemens Skyra 3.0T superconducting magnetic resonance scanner; the pituitary scan uses the following sequences and parameters, with a total scan time of about 16 minutes: sagittal T1WI sequence (TR = 641 ms, TE = 9.9 ms) and sagittal T2WI sequence (TR = 3800 ms, TE = 75 ms).
In step 1, a sliding window of size 32 × 32 × 11 is first defined and initially placed at the upper-right corner; the window then slides leftward or downward. The sliding stride must be chosen with care: too small a stride makes adjacent image blocks contain a large amount of repeated information and wastes training resources, while too large a stride yields too little training data. Weighing both considerations, a stride of 8 units per slide is chosen. The sub-blocks are extracted with this sliding window, finally yielding 49 sub-blocks, each of size 32 × 32 × 11; the position of each sub-block is recorded before division.
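The following Python sketch illustrates this sliding-window partitioning; it assumes a NumPy volume of size 80 × 80 × 11, and the helper names are illustrative rather than taken from the patent.

```python
import numpy as np

def extract_patches(volume: np.ndarray, win: int = 32, stride: int = 8):
    """Split an 80x80x11 volume into 49 overlapping 32x32x11 sub-blocks."""
    assert volume.shape == (80, 80, 11)
    patches, positions = [], []
    for r in range(0, volume.shape[0] - win + 1, stride):      # 7 row starts
        for c in range(0, volume.shape[1] - win + 1, stride):  # 7 column starts
            patches.append(volume[r:r + win, c:c + win, :])
            positions.append((r, c))  # position recorded before division
    return patches, positions         # 7 x 7 = 49 patches of shape (32, 32, 11)
```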
The 49 sub-blocks are then input into VT-UNet to obtain 49 segmentation results, where the segmentation result of each sub-block is a mask of size 32 × 32 × 11:
mask_i = VTUnet(patch_i),  i ∈ (1, ..., 49)    (1)
where VTUnet denotes VT-UNet, a three-dimensional volumetric Transformer model suited to medical image segmentation; patch_i denotes the i-th sub-block; and mask_i denotes the segmentation result corresponding to the i-th sub-block;
the 49 outputs are recombined into a mask of size 80 × 80 × 11 according to the partitioning scheme. Because the merged 80 × 80 × 11 mask contains many discrete points, which would increase the subsequent computation, the discrete points are removed by keeping only the largest connected component; the retained main part is the pituitary mask:
ademask = f(mask)    (2)
where f denotes the largest-connected-component operation, mask denotes the 80 × 80 × 11 mask, and ademask denotes the pituitary mask obtained by keeping the largest connected component;
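A Python sketch of the mask recombination and of equation (2) using SciPy follows. How overlapping sub-block predictions are merged is not stated in the patent, so the logical-OR merge below is an assumption.

```python
import numpy as np
from scipy import ndimage

def merge_patch_masks(patch_masks, positions, shape=(80, 80, 11), win=32):
    """Recombine the 49 sub-block masks into one 80x80x11 mask.
    Overlapping predictions are OR-ed (an assumption)."""
    merged = np.zeros(shape, dtype=bool)
    for m, (r, c) in zip(patch_masks, positions):
        merged[r:r + win, c:c + win, :] |= m.astype(bool)
    return merged

def largest_connected_component(mask):
    """Equation (2): ademask = f(mask), keeping only the largest 3D region."""
    labels, n = ndimage.label(mask)  # label connected 3D regions
    if n == 0:
        return mask                  # empty mask: nothing to keep
    sizes = ndimage.sum(mask, labels, index=range(1, n + 1))
    return labels == (int(np.argmax(sizes)) + 1)
```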
finally, the 3D bounding box of the pituitary mask is calculated, expanded to size 32 in the coronal and sagittal directions and by two slices forward and backward in the cross-sectional direction so that it fully contains the pituitary region, and the bounding-box coordinates are applied to the 80 × 80 × 11 NIfTI image to obtain the pituitary region, thereby completing pituitary localization;
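A sketch of the bounding-box computation and expansion is given below. Centering the widened 32-voxel extent on the original box is an assumption, since the patent does not specify how the expansion is aligned.

```python
import numpy as np

def pituitary_bbox(ademask, target=32, pad_z=2):
    """3D bounding box of the pituitary mask, widened to `target` voxels in the
    coronal/sagittal axes and padded `pad_z` slices on the cross-sectional axis."""
    coords = np.argwhere(ademask)
    lo, hi = coords.min(axis=0), coords.max(axis=0)

    def widen(l, h, size, limit):
        center = (l + h) // 2
        start = int(max(0, min(center - size // 2, limit - size)))
        return start, start + size  # half-open interval of length `size`

    r0, r1 = widen(lo[0], hi[0], target, ademask.shape[0])
    c0, c1 = widen(lo[1], hi[1], target, ademask.shape[1])
    z0 = int(max(0, lo[2] - pad_z))
    z1 = int(min(ademask.shape[2], hi[2] + pad_z + 1))
    return r0, r1, c0, c1, z0, z1

# Applying the box to the 80x80x11 NIfTI volume yields the pituitary region:
# region = volume[r0:r1, c0:c1, z0:z1]   # coronal/sagittal extent 32 x 32
```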
in step 2, using the output of the pituitary localization stage, 2D slices (32 × 32) are generated along the cross-sectional direction as the input of this stage;
x_i = X[a, b, i],  i ∈ (1, ..., N)    (3)
where X denotes the pituitary region obtained by localization, x_i denotes the i-th slice of X along the cross-sectional direction, a denotes the coronal direction of X, b denotes the sagittal direction of X, i indexes the cross-sectional direction of X, and N denotes the number of cross-sectional layers of X; both a and b are 32;
finally, the two-dimensional pituitary labels are output, and all two-dimensional slices are stacked along the cross-sectional direction to reconstruct a three-dimensional image, yielding the final three-dimensional pituitary label:
X[a, b, i] = x_i,  i ∈ (1, ..., N)    (4)
where X denotes the three-dimensional label reconstructed along the cross-sectional direction, x_i denotes the i-th two-dimensional label, a denotes the coronal direction of X, b denotes the sagittal direction of X, i indexes the cross-sectional direction of X, and N denotes the number of two-dimensional labels x_i.
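Equations (3) and (4) amount to unstacking and restacking the region along the cross-sectional axis; a minimal NumPy sketch:

```python
import numpy as np

def to_axial_slices(region):
    """Equation (3): x_i = X[a, b, i] - unstack the region into N 2D slices."""
    return [region[:, :, i] for i in range(region.shape[2])]

def from_axial_labels(labels_2d):
    """Equation (4): X[a, b, i] = x_i - restack N 2D labels into a 3D label."""
    return np.stack(labels_2d, axis=2)
```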
The beneficial effects of the invention are as follows.
The invention adopts a two-stage segmentation method based on deep learning, giving fast segmentation and accurate results (Dice = 0.9013). The dataset uses brain MRI images of different diseases and different sequences, so robustness and generalization are good. Quantitative and qualitative evaluations of the pituitary segmentation are shown in Table 1 and Fig. 4. In Table 1, the segmentation indexes of the method of this patent are higher than those of conventional CNN models, ViT (Vision Transformer) models, and hybrid (CNN + Transformer) models, showing higher segmentation accuracy. Fig. 4 shows that the method of this patent achieves a better segmentation effect than the other methods.
Description of the drawings:
FIG. 1 is a schematic diagram of the image partitioning into sub-blocks in the pituitary localization of the present invention.
FIG. 2 is a schematic diagram of the adenohypophysis segmentation network (PIT-Former) according to the present invention.
FIG. 3 is an expansion diagram of the pituitary bounding box produced in the pituitary localization stage of the present invention.
FIG. 4 is a qualitative evaluation of pituitary segmentation according to the present invention: pituitary segmentation examples from different methods on three randomly selected patients. a1-a2, b1-b2, and c1-c2 denote the T1WI and T2WI images of the three patients, respectively; PIT-Former is the method presented herein. The regions outlined in white denote the labels.
Detailed Description
The following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings. Apparently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
An MRI brain image adenohypophysis segmentation method comprises the following steps:
step 1: positioning of the pituitary gland:
preprocessing the MRI brain image acquired from a hospital: the original DICOM-format MRI data are converted into a three-dimensional NIfTI-format image using ITK-SNAP software;
cropping the three-dimensional NIfTI image to a size of 80 × 80 × 11 and dividing it into 49 sub-blocks of size 32 × 32 × 11; then inputting the 49 sub-blocks into VT-UNet to obtain 49 segmentation results, and recombining the 49 outputs into a mask of size 80 × 80 × 11 according to the partitioning scheme; finally, calculating the 3D bounding box of the pituitary mask, expanding the bounding box to 32 voxels in the coronal and sagittal directions and by two slices forward and backward in the cross-sectional direction so that it fully contains the pituitary region, and applying the bounding-box coordinates to the 80 × 80 × 11 NIfTI image to obtain the pituitary region, thereby completing pituitary localization;
The specific expansion method is shown in Fig. 3, where the inner frame denotes the box before expansion and the outer frame the box after expansion.
Step 2: pituitary segmentation:
The purpose of pituitary segmentation is to accurately segment the pituitary gland on the specific sections determined in the pituitary localization stage; the segmentation uses PIT-Former (Fig. 2). PIT-Former is a two-dimensional network fusing CNN and Transformer;
using the output of the pituitary localization stage, 2D slices (32 × 32) are generated along the cross-sectional direction as the input of this stage;
finally, the two-dimensional pituitary labels are output, and all two-dimensional slices are stacked along the cross-sectional direction to reconstruct a three-dimensional image, thereby obtaining the final three-dimensional pituitary gland label.
In step 1, the MRI brain images are acquired with a Siemens Skyra 3.0T superconducting magnetic resonance scanner; the pituitary scan uses the following sequences and parameters, with a total scan time of about 16 minutes: sagittal T1WI sequence (TR = 641 ms, TE = 9.9 ms) and sagittal T2WI sequence (TR = 3800 ms, TE = 75 ms).
In step 1, a sliding window of size 32 × 32 × 11 is first defined and initially fixed at the upper-right corner (shown in white in Fig. 1); the window then slides leftward or downward. The sliding stride must be chosen with care: too small a stride makes adjacent image blocks contain a large amount of repeated information and wastes training resources, while too large a stride yields too little training data. Weighing both considerations, a stride of 8 units per slide is chosen. The sub-blocks are extracted with this sliding window, finally yielding 49 sub-blocks, each of size 32 × 32 × 11; the position of each sub-block is recorded before division.
The 49 sub-blocks are then input into VT-UNet to obtain 49 segmentation results, where the segmentation result of each sub-block is a mask of size 32 × 32 × 11:
mask_i = VTUnet(patch_i),  i ∈ (1, ..., 49)    (1)
where VTUnet denotes VT-UNet, a three-dimensional volumetric Transformer model suited to medical image segmentation; patch_i denotes the i-th sub-block; and mask_i denotes the segmentation result corresponding to the i-th sub-block;
the 49 outputs are recombined into a mask of size 80 × 80 × 11 according to the partitioning scheme. Because the merged 80 × 80 × 11 mask contains many discrete points, which would increase the subsequent computation, the discrete points are removed by keeping only the largest connected component; the retained main part is the pituitary mask:
ademask = f(mask)    (2)
where f denotes the largest-connected-component operation, mask denotes the 80 × 80 × 11 mask, and ademask denotes the pituitary mask obtained by keeping the largest connected component;
finally, the 3D bounding box of the pituitary mask is calculated, expanded to size 32 in the coronal and sagittal directions and by two slices forward and backward in the cross-sectional direction so that it fully contains the pituitary region, and the bounding-box coordinates are applied to the 80 × 80 × 11 NIfTI image to obtain the pituitary region, thereby completing pituitary localization;
in step 2, using the output of the pituitary localization stage, 2D slices (32 × 32) are generated along the cross-sectional direction as the input of this stage;
x_i = X[a, b, i],  i ∈ (1, ..., N)    (3)
where X denotes the pituitary region obtained by localization, x_i denotes the i-th slice of X along the cross-sectional direction, a denotes the coronal direction of X, b denotes the sagittal direction of X, i indexes the cross-sectional direction of X, and N denotes the number of cross-sectional layers of X; both a and b are 32;
finally, the two-dimensional pituitary labels are output, and all two-dimensional slices are stacked along the cross-sectional direction to reconstruct a three-dimensional image, yielding the final three-dimensional pituitary label:
X[a, b, i] = x_i,  i ∈ (1, ..., N)    (4)
where X denotes the three-dimensional label reconstructed along the cross-sectional direction, x_i denotes the i-th two-dimensional label, a denotes the coronal direction of X, b denotes the sagittal direction of X, i indexes the cross-sectional direction of X, and N denotes the number of two-dimensional labels x_i.
Examples:
one MRI brain image acquired was cropped to a size of 80 x 11. The image is divided into 49 subblocks with the size of 32 x 11 and is input to the VT-Unet. The 49 outputs were then combined into a mask of size 80 x 11 according to the partitioning method, and finally the mask was taken to the maximum connected area to obtain the final pituitary mask and the 3D bounding box of the pituitary mask was calculated. The coronal and sagittal planes of the bounding box are expanded to a 32 x 32 size and two layers anteriorly and posteriorly in cross-section as the final pituitary positioning output. The output is used to generate 2D slices (32 x 32) from the axial direction as input to the pituitary segmentation stage. Finally, stacking all two-dimensional slices output by the pituitary segmentation in a third dimension to reconstruct a three-dimensional image, and obtaining a final three-dimensional pituitary gland label;
a novel two-stage pituitary segmentation method based on deep learning. The two-stage pituitary segmentation method generally includes two steps, pituitary localization and pituitary segmentation.
As shown in Fig. 1: to give the proposed adenohypophysis segmentation method good generalization, the collected data comprise three-dimensional brain images of two sequences (T1WI and T2WI) from three groups (dwarfism group, precocious puberty group, and normal group). Training on data from several different conditions improves the robustness of the model. Because the pituitary occupies only a very small proportion of the volume, and to reduce the amount of computation and improve efficiency, the original data are cropped to 80 × 80 × 11 during preprocessing; compared with the original image, this reduces the model's computation and speeds up network convergence. The preprocessed brain MRI image is divided into 49 sub-blocks of size 32 × 32 × 11, which are input into the adenohypophysis localization network; the output is 49 sub-masks of size 32 × 32 × 11. The 49 sub-masks are recombined into a mask of size 80 × 80 × 11 according to the partitioning scheme, the largest connected component of the mask is taken to obtain the final pituitary mask, and the 3D bounding box of the pituitary mask is calculated. The coronal and sagittal extents of the bounding box are expanded to 32 × 32 and by two slices forward and backward in the cross-sectional direction as the final output of pituitary localization. The resulting two-dimensional slices are then input into the pituitary segmentation model for pituitary segmentation.
The pituitary segmentation network (PIT-Former) is a two-dimensional U-Net-architecture network fusing CNN and Transformer. The network comprises 4 downsampling layers and 4 upsampling layers. ECA blocks are used to capture local cross-channel interaction while avoiding dimensionality reduction during feature extraction, to guarantee both efficiency and effectiveness. A channel-wise cross-fusion Transformer (CCT block) better fuses the encoder features, narrowing the semantic gap to improve segmentation performance. A channel-wise cross-attention module (CCA block), which likewise avoids dimensionality reduction, guides the channel-wise filtering of the Transformer features and the decoder features, eliminating ambiguity and facilitating feature extraction. The input to the network is a two-dimensional slice of size 32 × 32; after each downsampling the number of feature channels doubles while the spatial size shrinks, and after the fourth downsampling the feature map is 4 × 4. The final feature map is then obtained through four upsampling stages with CCA blocks, and a Softmax operation produces the final output: a two-dimensional mask of size 32 × 32.
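For illustration, the ECA idea referenced above (local cross-channel interaction via a 1D convolution, with no dimensionality reduction) can be sketched in PyTorch as follows; the kernel size and the exact way the block is wired into PIT-Former are assumptions, not taken from the patent.

```python
import torch
import torch.nn as nn

class ECABlock(nn.Module):
    """Efficient Channel Attention: re-weights channels using a local 1D
    convolution over the channel descriptor, avoiding dimensionality reduction."""
    def __init__(self, k: int = 3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):                       # x: (B, C, H, W)
        y = self.pool(x)                        # (B, C, 1, 1) channel descriptor
        y = y.squeeze(-1).transpose(-1, -2)     # (B, 1, C)
        y = self.conv(y)                        # local cross-channel interaction
        y = torch.sigmoid(y).transpose(-1, -2).unsqueeze(-1)  # (B, C, 1, 1)
        return x * y                            # channel re-weighting
```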
Normalization scales the data to [0, 1] by subtracting the minimum and dividing by the range (maximum minus minimum). The training data are augmented with rotation, translation, and flipping to reduce overfitting and improve the robustness of the model. In addition, Adam, an optimizer based on adaptive estimates of low-order moments, is used for optimization. The deep learning model is trained end-to-end with the training and validation data, with a batch size of 16 and a learning rate of 1e-4, implemented in Python 3.7.0 on a GeForce RTX 2080 Ti (11 GB) GPU.
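A minimal sketch of the min-max normalization and the quoted training configuration; PITFormer is a hypothetical placeholder class name, not an identifier from the patent.

```python
import numpy as np

def min_max_normalize(x: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Scale intensities to [0, 1]: subtract the minimum, divide by the range."""
    return (x - x.min()) / (x.max() - x.min() + eps)

# Quoted training configuration (model class is a placeholder):
# import torch
# model = PITFormer()
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
# batch_size = 16
```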
In addition, the pixel-level Dice coefficient (DC) is applied to the final feature map for the loss-function calculation. DC is a statistic measuring the degree of spatial overlap between two samples, ranging from 0 (no spatial overlap) to 1 (complete spatial overlap):

DC(A, B) = 2|A ∩ B| / (|A| + |B|)

where A denotes the manual label and B denotes the label automatically segmented by the deep learning model.
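A direct NumPy rendering of the Dice coefficient defined above:

```python
import numpy as np

def dice_coefficient(a: np.ndarray, b: np.ndarray, eps: float = 1e-6) -> float:
    """DC(A, B) = 2|A ∩ B| / (|A| + |B|); 0 = no overlap, 1 = perfect overlap."""
    a, b = a.astype(bool), b.astype(bool)
    inter = np.logical_and(a, b).sum()
    return float((2.0 * inter + eps) / (a.sum() + b.sum() + eps))
```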
Finally, the segmentation result is reconstructed into a three-dimensional image to obtain the final pituitary segmentation result.
TABLE 1
Table 1 is a quantitative evaluation of the pituitary segmentation according to the present invention. To verify the performance of the PIT-Former network, 7 image segmentation methods were compared on the basis of 5 metrics including Dice, HD, MD, and ASSD. Model performance was tested with 10-fold cross-validation.
Claims (4)
1. An MRI brain image adenohypophysis segmentation method, characterized by comprising the following steps:
step 1: positioning of the pituitary gland:
preprocessing the MRI brain image acquired from a hospital: the original DICOM-format MRI data are converted into a three-dimensional NIfTI-format image using ITK-SNAP software;
cropping the three-dimensional NIfTI image to a size of 80 × 80 × 11 and dividing it into 49 sub-blocks of size 32 × 32 × 11; then inputting the 49 sub-blocks into VT-UNet to obtain 49 segmentation results, and recombining the 49 outputs into a mask of size 80 × 80 × 11 according to the partitioning scheme; finally, calculating the 3D bounding box of the pituitary mask, expanding the bounding box to 32 voxels in the coronal and sagittal directions and by two slices forward and backward in the cross-sectional direction so that it fully contains the pituitary region, and applying the bounding-box coordinates to the 80 × 80 × 11 NIfTI image to obtain the pituitary region, thereby completing pituitary localization;
step 2: pituitary segmentation:
using the output of the pituitary localization stage, 2D slices (32 × 32) are generated along the cross-sectional direction as the input of this stage;
finally, the two-dimensional pituitary labels are output, and all two-dimensional slices are stacked along the cross-sectional direction to reconstruct a three-dimensional image, thereby obtaining the final three-dimensional pituitary gland label.
2. The method of claim 1, wherein in step 1 the MRI brain image is acquired with a Siemens Skyra 3.0T superconducting magnetic resonance scanner, and the pituitary scan uses the following sequences and parameters, with a total scan time of about 16 minutes: sagittal T1WI sequence (TR = 641 ms, TE = 9.9 ms) and sagittal T2WI sequence (TR = 3800 ms, TE = 75 ms).
3. The method of claim 1, wherein in step 1 a sliding window of size 32 × 32 × 11 is first defined and initially fixed at the upper-right corner; the window then slides leftward or downward by 8 units each time; the sub-blocks are extracted with the sliding window, finally yielding 49 sub-blocks, each of size 32 × 32 × 11, and the position of each sub-block is recorded before division;
the 49 sub-blocks are then input into VT-UNet to obtain 49 segmentation results, where the segmentation result of each sub-block is a mask of size 32 × 32 × 11:
mask_i = VTUnet(patch_i),  i ∈ (1, ..., 49)    (1)
where VTUnet denotes VT-UNet, a three-dimensional volumetric Transformer model suited to medical image segmentation; patch_i denotes the i-th sub-block; and mask_i denotes the segmentation result corresponding to the i-th sub-block;
the 49 outputs are recombined into a mask of size 80 × 80 × 11 according to the partitioning scheme; because the merged 80 × 80 × 11 mask contains many discrete points, which would increase the subsequent computation, the discrete points are removed by keeping only the largest connected component, and the retained main part is the pituitary mask:
ademask = f(mask)    (2)
where f denotes the largest-connected-component operation, mask denotes the 80 × 80 × 11 mask, and ademask denotes the pituitary mask obtained by keeping the largest connected component;
finally, the 3D bounding box of the pituitary mask is calculated, expanded to size 32 in the coronal and sagittal directions and by two slices forward and backward in the cross-sectional direction so that it fully contains the pituitary region, and the bounding-box coordinates are applied to the 80 × 80 × 11 NIfTI image to obtain the pituitary region, thereby completing the pituitary localization.
4. The MRI brain image adenohypophysis segmentation method according to claim 1, wherein in step 2, using the output of the pituitary localization of the previous stage, 2D slices (32 × 32) are generated along the cross-sectional direction as the input of this stage;
x_i = X[a, b, i],  i ∈ (1, ..., N)    (3)
where X denotes the pituitary region obtained by localization, x_i denotes the i-th slice of X along the cross-sectional direction, a denotes the coronal direction of X, b denotes the sagittal direction of X, i indexes the cross-sectional direction of X, and N denotes the number of cross-sectional layers of X; both a and b are 32;
finally, the two-dimensional pituitary labels are output, and all two-dimensional slices are stacked along the cross-sectional direction to reconstruct a three-dimensional image, yielding the final three-dimensional pituitary label:
X[a, b, i] = x_i,  i ∈ (1, ..., N)    (4)
where X denotes the three-dimensional label reconstructed along the cross-sectional direction, x_i denotes the i-th two-dimensional label, a denotes the coronal direction of X, b denotes the sagittal direction of X, i indexes the cross-sectional direction of X, and N denotes the number of two-dimensional labels x_i.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310522974.5A CN116645336B (en) | 2023-05-10 | 2023-05-10 | MRI brain image gland pituitary segmentation method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310522974.5A CN116645336B (en) | 2023-05-10 | 2023-05-10 | MRI brain image gland pituitary segmentation method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116645336A true CN116645336A (en) | 2023-08-25 |
CN116645336B CN116645336B (en) | 2024-05-07 |
Family
ID=87622202
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310522974.5A Active CN116645336B (en) | 2023-05-10 | 2023-05-10 | MRI brain image gland pituitary segmentation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116645336B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017210690A1 (en) * | 2016-06-03 | 2017-12-07 | Lu Le | Spatial aggregation of holistically-nested convolutional neural networks for automated organ localization and segmentation in 3d medical scans |
CN107464250A (en) * | 2017-07-03 | 2017-12-12 | 深圳市第二人民医院 | Tumor of breast automatic division method based on three-dimensional MRI image |
CN114066843A (en) * | 2021-11-12 | 2022-02-18 | 烟台大学 | CT brain image segmentation and hematoma measurement method |
CN114581459A (en) * | 2022-02-08 | 2022-06-03 | 浙江大学 | Improved 3D U-Net model-based segmentation method for image region of interest of preschool child lung |
CN114972378A (en) * | 2022-05-24 | 2022-08-30 | 南昌航空大学 | Brain tumor MRI image segmentation method based on mask attention mechanism |
CN115331071A (en) * | 2022-07-19 | 2022-11-11 | 长沙理工大学 | Tuberculous meningoencephalitis prediction method and system based on multi-scale feature map |
US20220370033A1 (en) * | 2021-05-05 | 2022-11-24 | Board Of Trustees Of Southern Illinois University | Three-dimensional modeling and assessment of cardiac tissue |
EP4141790A1 (en) * | 2021-08-30 | 2023-03-01 | Siemens Healthcare GmbH | Method, device and system for automated segmentation of prostate in medical images for tumor detection |
WO2023060944A1 (en) * | 2021-10-11 | 2023-04-20 | 浙江大学 | Liver ct image segmentation system and algorithm based on hybrid supervised learning |
CN116071383A (en) * | 2023-02-23 | 2023-05-05 | 重庆邮电大学 | Hippocampus subzone segmentation method and system based on ultra-high field magnetic resonance image reconstruction |
Non-Patent Citations (2)
Title |
---|
HIMASHI PEIRIS et al.: "A Robust Volumetric Transformer for Accurate 3D Tumor Segmentation", MICCAI 2022, pages 162-171 *
QI Yongjun et al.: "Pulmonary nodule detection method based on deep hybrid convolutional model", Journal of Computer Applications, vol. 40, no. 10, 10 October 2020 (2020-10-10), pages 2904-2909 *
Also Published As
Publication number | Publication date |
---|---|
CN116645336B (en) | 2024-05-07 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
20240625 | TR01 | Transfer of patent right | Patentee after: Shandong Zhongjia Yingrui Medical Technology Co., Ltd., Yingrui Industrial Park, Building 5, No. 66 Feilong Road, Laishan District, Yantai City, Shandong Province, 264003, China. Patentee before: Yantai University, No. 30 Qingquan Road, Laishan District, Yantai City, Shandong Province, China.