CN116798605A

CN116798605A - Alzheimer's disease auxiliary diagnosis method based on nuclear magnetic resonance image

Info

Publication number: CN116798605A
Application number: CN202310771372.3A
Authority: CN
Inventors: 胡振涛; 李艳阳; 杨浩然; 程聪聪; 王正; 陈鸿宇; 蒋涛; 王凯歌; 刘先省; 吴振辉
Original assignee: Henan University
Current assignee: Henan University
Priority date: 2023-06-27
Filing date: 2023-06-27
Publication date: 2023-09-22

Abstract

The invention relates to the technical field of image classification, and in particular to an Alzheimer's disease auxiliary diagnosis method based on nuclear magnetic resonance images. The method comprises the following steps: acquiring a nuclear magnetic resonance image of the brain structure of an object to be analyzed; and inputting that image into a trained convolutional neural network model to obtain a classification result. The convolutional neural network model is trained as follows: the nuclear magnetic resonance sample images are sliced to obtain corresponding slice groups; a new convolutional layer is added to the initial convolutional neural network model, and the slice groups are input into the model with the new convolutional layer added to obtain plane features; and a shift window attention mechanism is introduced into the Transformer encoder module to establish spatial connections between the plane features, thereby obtaining the trained convolutional neural network model. The invention improves the precision of feature extraction from brain nuclear magnetic resonance images.

Description

Alzheimer's disease auxiliary diagnosis method based on nuclear magnetic resonance image

Technical Field

The invention relates to the technical field of image classification, in particular to an Alzheimer's disease auxiliary diagnosis method based on nuclear magnetic resonance images.

Background

Alzheimer's disease is a progressive neurological disorder that leads to loss of memory and thinking ability, a decline in the patient's cognitive ability, and ultimately dementia. Neuroimaging visualizes brain anatomy and can support the diagnosis of many neurological disorders, particularly Alzheimer's disease, which is characterized by brain atrophy. Deep learning models can discover hidden representations in medical images, find connections between image regions, and identify disease-related patterns, and they have been applied successfully to medical images such as structural nuclear magnetic resonance images.

Patients with Alzheimer's disease can be divided into different subtypes, and the brain degeneration patterns of the subtypes differ. The human brain consists of many regions responsible for different functions, such as the hippocampus (creating new memories), the hypothalamus (regulating daily diet), the amygdala (emotional experience and expression), and the cerebellum (coordinating language and motor functions). Brain atrophy is not evenly distributed but is localized in different brain regions, and the atrophied regions are not identical across subtypes of Alzheimer's disease. Identification therefore requires an accurate grasp of the patient's local lesion characteristics.

Traditional machine learning algorithms require regions of interest to be selected manually for analysis. Such methods depend on manual labeling by experts, and the preselected regions of interest may not include all the potentially useful information for distinguishing Alzheimer's disease. Deep-learning-based auxiliary diagnosis of Alzheimer's disease learns features dynamically from the original medical image data and has achieved great success in large-scale, high-dimensional medical image analysis. However, existing deep learning methods fuse input features indiscriminately, which can make it difficult to extract local fine-grained features effectively, because the model may need extensive iterative training on data before spatially neighboring features become strongly linked. In addition, medical image datasets are very limited, and indiscriminate fusion mixes the input features with a large amount of background semantic information irrelevant to local lesions. As a result, the trained deep model cannot effectively extract and integrate local fine-grained features, which lowers the accuracy of subsequent analysis results.

Disclosure of Invention

To solve the problem that existing methods cannot effectively extract and integrate local fine-grained features when extracting features from brain nuclear magnetic resonance images, the invention provides an Alzheimer's disease auxiliary diagnosis method based on nuclear magnetic resonance images. The technical scheme adopted is as follows:

the invention provides an Alzheimer's disease auxiliary diagnosis method based on nuclear magnetic resonance images, which comprises the following steps:

acquiring a nuclear magnetic resonance image of a brain structure of an object to be analyzed; inputting the nuclear magnetic resonance image of the brain structure of the object to be analyzed into a trained convolutional neural network model to obtain a classification result;

the training process of the convolutional neural network model is as follows: acquiring a training set formed by nuclear magnetic resonance sample images of brain structures of subjects of different categories; slicing the nuclear magnetic resonance sample images in the training set to obtain the slice group corresponding to each nuclear magnetic resonance sample image; adding a new convolutional layer to the initial convolutional neural network model, and inputting the slice groups into the convolutional neural network model with the new convolutional layer added to obtain plane features; introducing a shift window attention mechanism into a Transformer encoder module, and establishing spatial connections between the plane features based on the shift window attention mechanism to obtain the trained convolutional neural network model.

Preferably, the inputting the slice group into the convolutional neural network model with the new convolutional layer to obtain the plane feature includes:

performing dimension transformation on the slices in the slice group, and performing dimension expansion on the slices of the single channel by using convolution to obtain a three-channel feature map;

and mapping the feature map into feature vectors by using a convolution layer of the convolution neural network model added with a new convolution layer, and marking the feature vectors as plane features.

Preferably, introducing a shift window attention mechanism into the Transformer encoder module and establishing spatial connections between the plane features based on the shift window attention mechanism comprises:

position embedding is carried out on each characteristic vector to obtain a vector with embedded positions;

constructing an initial vector sequence corresponding to each nuclear magnetic resonance sample image based on the embedded vector corresponding to each nuclear magnetic resonance sample image;

executing a shift window attention mechanism on the initial vector sequence, dividing the initial vector sequence into a plurality of windows, executing multi-head self-attention in each window, and carrying out feature fusion on vectors in the windows;

performing a cyclic shift on the initial vector sequence, re-dividing the shifted vector sequence into windows, performing self-attention within each newly divided window, and performing a masked self-attention operation on the vectors in the last window; performing a reverse cyclic shift on the shifted vector sequence to recover the original vector sequence;

and averaging vectors corresponding to the same nuclear magnetic resonance sample image to obtain corresponding classification vectors, and sending the classification vectors into a multi-layer perceptron layer to obtain an output result.

Preferably, slicing the nuclear magnetic resonance sample images in the training set to obtain the slice group corresponding to each nuclear magnetic resonance sample image comprises:

slicing each nuclear magnetic resonance sample image in the training set in the axial plane, starting from the middle slice and proceeding toward both ends of the axis, to obtain the slice group corresponding to each nuclear magnetic resonance sample image.

Preferably, the acquisition of nuclear magnetic resonance sample images of brain structures of subjects of different classes comprises:

and respectively normalizing the nuclear magnetic resonance images of the brain structures of the subjects in different categories into MNI152 standard space, performing skull stripping on the normalized nuclear magnetic resonance images, performing unified N4 bias field correction on the nuclear magnetic resonance images after skull stripping, and taking the corrected nuclear magnetic resonance images as nuclear magnetic resonance sample images of the brain structures of the subjects in corresponding categories.

The invention has at least the following beneficial effects:

By introducing the shift window attention mechanism into the Transformer encoder module, the invention focuses attention effectively on small spatial regions, which captures local fine-grained features better and integrates the local features step by step. This improves the generalization capability of the convolutional neural network model, makes the classification result for the nuclear magnetic resonance image of the brain structure of the object to be analyzed more accurate, and assists the doctor more effectively in diagnosing the brain condition of the object to be analyzed.

Drawings

To illustrate the embodiments of the invention, or the technical solutions and advantages of the prior art, more clearly, the drawings used in the description are briefly introduced below. The drawings described below show only some embodiments of the invention; a person skilled in the art can obtain other drawings from them without inventive effort.

Fig. 1 is a flowchart of the Alzheimer's disease auxiliary diagnosis method based on nuclear magnetic resonance images according to an embodiment of the invention;

Fig. 2 is an original nuclear magnetic resonance image, where a is the axial plane, b is the coronal plane, and c is the sagittal plane of the original nuclear magnetic resonance image;

Fig. 3 is a spatially normalized nuclear magnetic resonance image, where d is the axial plane, e is the coronal plane, and f is the sagittal plane of the spatially normalized nuclear magnetic resonance image;

Fig. 4 is a nuclear magnetic resonance image after skull stripping, where g is the axial plane, h is the coronal plane, and i is the sagittal plane of the nuclear magnetic resonance image after skull stripping;

Fig. 5 is a nuclear magnetic resonance image after bias field correction, where j is the axial plane, k is the coronal plane, and m is the sagittal plane of the nuclear magnetic resonance image after bias field correction;

Fig. 6 is a schematic diagram of the shift window attention mechanism.

Detailed Description

In order to further describe the technical means and effects adopted by the invention to achieve the preset aim, the following is a detailed description of an auxiliary diagnosis method for Alzheimer's disease based on nuclear magnetic resonance image according to the invention with reference to the accompanying drawings and the preferred embodiment.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

The following specifically describes a specific scheme of the Alzheimer's disease auxiliary diagnosis method based on nuclear magnetic resonance images.

An embodiment of an auxiliary diagnosis method for Alzheimer's disease based on nuclear magnetic resonance images:

When the brain condition of an object to be analyzed is assessed, a nuclear magnetic resonance image of the object's brain structure is generally acquired first; a doctor then analyzes each region of the acquired image and judges the brain condition of the object to be analyzed. Obtaining the result in this way places a heavy workload on the doctor, and misjudgments are likely to occur.

The embodiment provides an auxiliary diagnosis method for Alzheimer's disease based on nuclear magnetic resonance images, as shown in FIG. 1, comprising the following steps:

acquiring a nuclear magnetic resonance image of a brain structure of an object to be analyzed; and inputting the nuclear magnetic resonance image of the brain structure of the object to be analyzed into the trained convolutional neural network model to obtain a classification result.

The present embodiment is to classify a nuclear magnetic resonance image, and thus first take a nuclear magnetic resonance image of a brain structure of an object to be analyzed. In this embodiment, the convolutional neural network model is used to classify the nmr image of the brain structure of the object to be analyzed, so that the convolutional neural network model needs to be trained first to obtain the trained convolutional neural network model, and then the nmr image of the brain structure of the object to be analyzed is classified to obtain the classification result.

According to clinical diagnosis records, nuclear magnetic resonance images of brain structures are retrieved by condition from the Alzheimer's disease clinical diagnosis database ADNI: 2890 images of normal subjects, 1412 images of patients with mild cognitive impairment, and 508 images of patients with Alzheimer's disease, obtained from 287, 410, and 263 subjects respectively. Each image obtained from the database is preprocessed. The original nuclear magnetic resonance image is shown in Fig. 2, where a is the axial plane, b is the coronal plane, and c is the sagittal plane of the original nuclear magnetic resonance image. The preprocessing of the nuclear magnetic resonance images comprises the following steps:

1. The structural nuclear magnetic resonance images are normalized into MNI152 standard space using FMRIB Software Library tools. After spatial normalization, the nuclear magnetic resonance images of all brain structures have a size of 182 × 218 × 182 and a spatial resolution of 1 × 1 × 1 mm³, and each voxel has the same layer thickness and origin coordinates. The spatially normalized nuclear magnetic resonance image is shown in Fig. 3, where d is the axial plane, e is the coronal plane, and f is the sagittal plane of the spatially normalized nuclear magnetic resonance image.

2. Skull stripping is performed on the spatially normalized nuclear magnetic resonance images using the skull-stripping tool in the FMRIB Software Library. The nuclear magnetic resonance image after skull stripping is shown in Fig. 4, where g is the axial plane, h is the coronal plane, and i is the sagittal plane of the nuclear magnetic resonance image after skull stripping.

3. Unified N4 bias field correction is performed on the skull-stripped nuclear magnetic resonance images using Advanced Normalization Tools. The nuclear magnetic resonance image after bias field correction is shown in Fig. 5, where j is the axial plane, k is the coronal plane, and m is the sagittal plane of the nuclear magnetic resonance image after bias field correction.

Each preprocessed nuclear magnetic resonance image is recorded as a nuclear magnetic resonance sample image. A dataset is constructed from all of the nuclear magnetic resonance sample images; it contains three categories of sample images, and each category contains a plurality of sample images. The sample images in the dataset are randomly divided into a training set, a validation set, and a test set; the division ratio in this embodiment is 7:1.5:1.5.
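The random 7:1.5:1.5 division can be sketched as follows. This is a minimal illustration and not the embodiment's code: the function name `split_dataset`, the fixed seed, and the integer identifiers standing in for sample images are all assumptions.

```python
import random

def split_dataset(items, ratios=(0.7, 0.15, 0.15), seed=0):
    """Randomly split items into training, validation, and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)      # seed is a hypothetical choice
    n_train = round(len(shuffled) * ratios[0])
    n_val = round(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]          # remainder goes to the test set
    return train, val, test

# Hypothetical identifiers standing in for the 2890 + 1412 + 508 sample images.
image_ids = list(range(2890 + 1412 + 508))
train, val, test = split_dataset(image_ids)
print(len(train), len(val), len(test))
```

Because the remainder is assigned to the test set, the three parts always cover the whole dataset without overlap, even when the ratios do not divide the image count evenly.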

In this embodiment, the initial convolutional neural network is first trained with the nuclear magnetic resonance sample images in the training set. Specifically, a slicing operation is performed on each nuclear magnetic resonance sample image in the training set to obtain the slice group corresponding to that image. The slices in a slice group are ordered strictly by their position along the axial direction of the original nuclear magnetic resonance sample image. The slices are resized using a bilinear interpolation algorithm, and each resized slice has dimensions 112×112×1.
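A slice group centred on the middle axial slice and kept in positional order could be selected as in the sketch below. The text does not state how many slices form a group, so `num_slices=96` is a hypothetical value, and the depth of 182 assumes the axial axis of the 182 × 218 × 182 normalized volume.

```python
def axial_slice_indices(depth, num_slices):
    """Return axial indices centred on the middle slice, in positional order."""
    if not 0 < num_slices <= depth:
        raise ValueError("num_slices must be between 1 and depth")
    middle = depth // 2
    start = middle - num_slices // 2
    # Clamp so the whole group stays inside the volume.
    start = max(0, min(start, depth - num_slices))
    return list(range(start, start + num_slices))

indices = axial_slice_indices(depth=182, num_slices=96)
print(indices[0], indices[-1])  # 43 138
```

Returning a sorted range keeps the slice serial numbers aligned with their axial positions, which the later position embedding relies on.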

Next, in this embodiment, a new convolutional layer is added to the initial convolutional neural network model, and the convolutional layers of the model with the new convolutional layer added are used to map the feature map into a feature vector. It should be noted that the initial convolutional neural network model is an existing mainstream convolutional neural network model, namely AlexNet, VGGNet, ResNet, or GoogLeNet. In this embodiment, a convolutional neural network with VGGNet-16 as the main framework is used to extract plane features from two-dimensional slices at different positions of the three-dimensional nuclear magnetic resonance sample image. Dimension transformation is performed on the slices in the slice group, and the single-channel slices are expanded by convolution to obtain three-channel feature maps; the feature maps are then mapped into feature vectors by the convolutional layers of the model with the new convolutional layer added, and the feature vectors are recorded as plane features.

The method specifically comprises the following steps:

1. Dimension transformation is performed on the slices in the slice group, and the single-channel slices in the sample are expanded by convolution into three-channel feature maps with dimensions 112×112×3.

2. Feature extraction is performed using a convolutional layer with VGGNet-16 as the main framework. The method specifically comprises the following steps:

(1) Two convolution operations with input dimension 3, output dimension 64, and kernel size 3×3, each followed by ReLU activation, then one max pooling operation with a 2×2 pooling window; the output feature map has dimensions 56×56×64;

(2) Two convolution operations with input dimension 64, output dimension 128, and kernel size 3×3, each followed by ReLU activation, then one max pooling operation with a 2×2 pooling window; the output feature map has dimensions 28×28×128;

(3) Three convolution operations with input dimension 128, output dimension 256, and kernel size 3×3, each followed by ReLU activation, then one max pooling operation with a 2×2 pooling window; the output feature map has dimensions 14×14×256;

(4) Three convolution operations with input dimension 256, output dimension 512, and kernel size 3×3, each followed by ReLU activation, then one max pooling operation with a 2×2 pooling window; the output feature map has dimensions 7×7×512;

(5) Three convolution operations with input dimension 512, output dimension 512, and kernel size 3×3, each followed by ReLU activation, then one max pooling operation with a 2×2 pooling window; the output feature map has dimensions 3×3×512.

3. The feature map is mapped into a feature vector by a convolution operation with input dimension 512, output dimension 256, and kernel size 3×3; the mapped feature vector has dimensions 256×1.
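The spatial sizes in the steps above can be checked with a short shape trace. The sketch assumes that the 3×3 convolutions inside the five blocks use a padding of 1 (as in the standard VGG-16 design) and that the added mapping convolution uses no padding; neither padding value is stated in the text, and the channel counts simply follow each block's output dimension.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

def trace_shapes(size=112):
    """Return (height, width, channels) after each stage of the extractor."""
    shapes = []
    # (number of 3x3 convolutions, output channels) for the five blocks.
    blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]
    for n_convs, channels in blocks:
        for _ in range(n_convs):
            size = conv_out(size, kernel=3, padding=1)  # assumed padding of 1
        size = conv_out(size, kernel=2, stride=2)       # 2x2 max pooling
        shapes.append((size, size, channels))
    # Added mapping layer: 3x3 convolution, 512 -> 256, assumed no padding.
    size = conv_out(size, kernel=3)
    shapes.append((size, size, 256))
    return shapes

for shape in trace_shapes():
    print(shape)
```

Under these assumptions a 112×112 slice shrinks to 3×3 after the fifth pooling step, so a single unpadded 3×3 convolution collapses it to one 256-dimensional vector per slice.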

The mapped feature vectors are recorded as plane features. This embodiment then uses a Transformer encoder to establish spatial connections between these plane features. Specifically, the N C-dimensional vectors are denoted $X \in \mathbb{R}^{N \times C}$, and position embedding is applied to $X$:

$$X_{PE} = X + PE$$

where $X_{PE}$ is the vector sequence after position embedding and $PE$ is the position vector. The specific calculation formula of PE is as follows:

where pos is the position of the current vector in the sequence (that is, its index among the vectors), i is the index of an element within a single vector, sin() is the sine function, and cos() is the cosine function.
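The formula for PE is not reproduced in this text. The variables it defines (pos, i, sine, and cosine) match the standard sinusoidal position encoding of the Transformer, in which $PE(pos, 2i) = \sin(pos / 10000^{2i/C})$ and $PE(pos, 2i+1) = \cos(pos / 10000^{2i/C})$. The sketch below assumes that form; the base of 10000 and the function names are assumptions, not values confirmed by the source.

```python
import math

def sinusoidal_position_encoding(num_vectors, dim, base=10000.0):
    """Return a num_vectors x dim table of sinusoidal position encodings."""
    table = []
    for pos in range(num_vectors):
        row = []
        for k in range(dim):
            # Elements 2i and 2i+1 share one frequency (assumed standard form).
            angle = pos / base ** (2 * (k // 2) / dim)
            row.append(math.sin(angle) if k % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

def add_position(x, pe):
    """Compute X_PE = X + PE element by element."""
    return [[a + b for a, b in zip(x_row, pe_row)] for x_row, pe_row in zip(x, pe)]

pe = sinusoidal_position_encoding(num_vectors=4, dim=6)
print(pe[0])  # [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
```

Each slice vector receives a distinct offset that depends only on its position in the slice group, which lets the attention layers distinguish neighbouring slices from distant ones.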

As shown in fig. 6, which is a schematic diagram of a shift window attention mechanism, the shift window attention mechanism is performed on the corresponding vector sequence in each sample, the vector sequence is divided into a plurality of windows, and multi-head self-attention (MSA) within the window, that is, feature fusion between vectors within the window, is performed in each window. The feature extraction module of this embodiment includes two consecutive W-MSA (Window-MSA) and SW-MSA (Shifted Window-MSA) modules. The method for calculating the multi-head attention in the window is as follows:

(1) Generate the query matrix of each head for the vector group within the window:

$$Q_1, Q_2, \ldots, Q_h \leftarrow \mathrm{split}(X_W W^Q)$$

(2) Generate the key matrix of each head for the vector group within the window:

$$K_1, K_2, \ldots, K_h \leftarrow \mathrm{split}(X_W W^K)$$

(3) Generate the value matrix of each head for the vector group within the window:

$$V_1, V_2, \ldots, V_h \leftarrow \mathrm{split}(X_W W^V)$$

(4) Calculate the attention score:

where split() returns a one-dimensional array indexed from zero; $Q_1$, $Q_2$, and $Q_h$ are the query matrices of the 1st, 2nd, and h-th attention heads; $X_W$ is the set of feature vectors within the window; $W^Q$ is the query linear transformation matrix; $K_1$, $K_2$, and $K_h$ are the key matrices of the 1st, 2nd, and h-th attention heads; $W^K$ is the key linear transformation matrix; $V_1$, $V_2$, and $V_h$ are the value matrices of the 1st, 2nd, and h-th attention heads; $W^V$ is the value linear transformation matrix; the attention score is the output of the i-th head of multi-head self-attention; $Q_i$ is the query matrix of the i-th head; $K_i^{T}$ is the transpose of the key matrix of the i-th head; $V_i$ is the value matrix of the i-th head; and softmax() is the activation function.
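The attention-score formula itself is not reproduced in this text. The symbols it names (the query matrix, the transposed key matrix, the value matrix, and softmax) match scaled dot-product attention, $\mathrm{softmax}(Q_i K_i^{T} / \sqrt{d}) V_i$. The sketch below assumes that form; the $1/\sqrt{d}$ scaling and the column-wise behaviour of `split_heads` are assumptions carried over from the standard Transformer, not details confirmed by the source.

```python
import math

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_head(q, k, v):
    """softmax(Q K^T / sqrt(d)) V for one head; q, k, v are lists of rows."""
    d = len(q[0])
    out = []
    for q_row in q:
        scores = [sum(a * b for a, b in zip(q_row, k_row)) / math.sqrt(d)
                  for k_row in k]
        weights = softmax(scores)
        out.append([sum(w * v_row[j] for w, v_row in zip(weights, v))
                    for j in range(len(v[0]))])
    return out

def split_heads(matrix, num_heads):
    """Split the columns of a projected matrix into num_heads equal parts."""
    width = len(matrix[0]) // num_heads
    return [[row[h * width:(h + 1) * width] for row in matrix]
            for h in range(num_heads)]

# Two vectors in one window; zero keys give uniform attention weights.
q = [[1.0, 0.0], [0.0, 1.0]]
k = [[0.0, 0.0], [0.0, 0.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
print(attention_head(q, k, v))  # [[2.0, 3.0], [2.0, 3.0]]
```

With uniform weights each output row is the mean of the value rows, which is the simplest case of the feature fusion between vectors in a window that the text describes.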

A cyclic shift is then performed on the original vector sequence (the shift length is half the window size), the shifted vector sequence is re-divided into windows, and self-attention is performed within each newly divided window.
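The shift and re-division can be sketched on a one-dimensional sequence of slice vectors. Integers stand in for the vectors, and the leftward shift direction is an assumption; the text only fixes the shift length at half the window size. The reverse shift that later restores the original order is included to show the round trip.

```python
def cyclic_shift(seq, shift):
    """Rotate the sequence left by shift positions (direction is assumed)."""
    shift %= len(seq)
    return seq[shift:] + seq[:shift]

def partition_windows(seq, window_size):
    """Split the sequence into consecutive windows of window_size items."""
    return [seq[i:i + window_size] for i in range(0, len(seq), window_size)]

tokens = list(range(8))          # stand-ins for 8 slice feature vectors
window_size = 4
shifted = cyclic_shift(tokens, window_size // 2)
print(partition_windows(tokens, window_size))   # [[0, 1, 2, 3], [4, 5, 6, 7]]
print(partition_windows(shifted, window_size))  # [[2, 3, 4, 5], [6, 7, 0, 1]]
restored = cyclic_shift(shifted, -(window_size // 2))
print(restored == tokens)                       # True
```

After the shift, the first window spans the boundary between the two original windows, so information can flow across it. The last window mixes vectors from the tail and the head of the sequence, which are not spatial neighbours.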

A masked-MSA (self-attention with a mask) operation is performed on the vectors in the last window. The masked-MSA is calculated as follows:

(1) Generate the MASK matrix:

$$\mathrm{MASK} \leftarrow \mathrm{zeros}(C, C)$$

(2) Set the correlation value of the head tokens to the tail tokens to minus infinity:

(3) Set the correlation value of the tail tokens to the head tokens to minus infinity:

(4) Use the softmax activation function so that the output at the minus-infinity positions is 0:

where MASK is the mask matrix, zeros() is the zero-matrix generation function, and w is the number of vectors in the window.
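The formulas for steps (2) to (4) are not reproduced in this text, so the sketch below is one plausible reading rather than the embodiment's definition. It assumes a w×w mask for the last window, that the first half of that window holds tail tokens of the sequence and the second half holds head tokens, and that the mask is added to the attention scores before softmax.

```python
import math

NEG_INF = float("-inf")

def build_mask(w):
    """w x w mask for the last window (assumed layout: tail half, head half)."""
    half = w // 2
    mask = [[0.0] * w for _ in range(w)]
    for i in range(w):
        for j in range(w):
            if (i < half) != (j < half):   # pair drawn from different groups
                mask[i][j] = NEG_INF
    return mask

def masked_softmax(scores, mask_row):
    """Softmax of scores + mask; masked positions receive weight 0."""
    masked = [s + m for s, m in zip(scores, mask_row)]
    peak = max(masked)
    exps = [math.exp(v - peak) for v in masked]   # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]

mask = build_mask(4)
print(masked_softmax([0.5, 0.5, 0.5, 0.5], mask[0]))  # [0.5, 0.5, 0.0, 0.0]
```

Every row keeps its diagonal entry unmasked, so the softmax never divides by zero, and tokens attend only to tokens from their own end of the sequence.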

A reverse cyclic shift is then performed on the vector sequence to restore the original vector sequence.

The vectors from the same structural nuclear magnetic resonance sample image are averaged to obtain the final classification vector, and the classification vector is fed into the final multi-layer perceptron layer to obtain the output.
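The averaging step amounts to mean pooling over the slice vectors of one sample image. The sketch below shows only that pooling; the vectors are hypothetical values, and the multi-layer perceptron that follows is omitted.

```python
def mean_pool(vectors):
    """Average a list of equal-length vectors element by element."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

# Three hypothetical 4-dimensional slice vectors from one sample image.
slice_vectors = [[1.0, 2.0, 3.0, 4.0],
                 [3.0, 2.0, 1.0, 0.0],
                 [2.0, 2.0, 2.0, 2.0]]
print(mean_pool(slice_vectors))  # [2.0, 2.0, 2.0, 2.0]
```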

With this method, spatial connections are established between the plane features by the shift window attention mechanism in the Transformer encoder module, the training of the convolutional neural network model is completed, and the trained convolutional neural network model is obtained.

In this embodiment, the window sizes of the four shift window attention modules are 8, 16, 32, and 96 in sequence, and the multi-head self-attention numbers are 8, 16, 32 respectively. In addition, the slices in the samples are randomly rotated and mirror flipped for data augmentation, with a rotation angle range of (-15, 15). For the task of classifying Alzheimer's disease patients and normal subjects, the model parameter learning rate is set to 0.0001 and 100 rounds of training are performed. For the task of classifying Alzheimer's disease patients, mild cognitive impairment patients, and normal subjects, the model parameter learning rate is set to 0.00001 and 100 rounds of training are performed.

By introducing a shift window attention mechanism into the Transformer encoder module, the method of this embodiment focuses attention effectively on small spatial regions and so captures local fine-grained features better, and the attention windows, which grow layer by layer, integrate the local features step by step, which improves the generalization capability of the model.

In this embodiment, the development language is Python 3.7, the deep learning framework is PyTorch, and the operating system is Ubuntu 16.04. The experimental server is configured with 64 GB of RAM, an 11th-generation Intel i7-11700K processor, and an NVIDIA 3090 graphics card with 24 GB of video memory, and CUDA is used to accelerate training.

For the task of classifying Alzheimer's disease patients and normal subjects, the accuracy is 93.56%, the sensitivity is 93.81%, and the specificity is 93.31%. For the task of classifying Alzheimer's disease patients and mild cognitive impairment patients, the accuracy is 82.09%, the sensitivity is 86.96%, and the specificity is 71.43%. For the task of classifying mild cognitive impairment patients and normal subjects, the accuracy is 79.09%, the sensitivity is 79.82%, and the specificity is 78.17%.
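Accuracy, sensitivity, and specificity are standard confusion-matrix ratios. The sketch below shows the usual definitions for a binary task. The counts are hypothetical and are not taken from the experiments, and which class counts as positive in each task is an assumption.

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)   # share of positive cases correctly detected
    specificity = tn / (tn + fp)   # share of negative cases correctly cleared
    return accuracy, sensitivity, specificity

# Hypothetical counts, not taken from the patent's experiments.
acc, sens, spec = binary_metrics(tp=45, fn=5, tn=90, fp=10)
print(f"{acc:.2%} {sens:.2%} {spec:.2%}")  # 90.00% 90.00% 90.00%
```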

Next, in this embodiment, a nmr image of a brain structure of an object to be analyzed is input into a trained convolutional neural network model, and a classification result corresponding to the nmr image of the brain structure of the object to be analyzed is obtained.

After the classification result corresponding to the nuclear magnetic resonance image of the brain structure of the object to be analyzed is obtained, the doctor diagnoses the brain condition of the object to be analyzed in combination with that classification result, which improves both diagnostic efficiency and the accuracy of the diagnosis.

By introducing the shift window attention mechanism into the Transformer encoder module, this embodiment focuses attention effectively on small spatial regions, which captures local fine-grained features better and integrates the local features step by step. This improves the generalization capability of the convolutional neural network model, makes the classification result for the nuclear magnetic resonance image of the brain structure of the object to be analyzed more accurate, and assists the doctor more effectively in diagnosing the brain condition of the object to be analyzed.

The foregoing is a preferred embodiment of the present invention, and it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments may be modified or some or all of the technical features may be replaced with other technical features, which do not depart from the scope of the technical scheme of the embodiments of the present invention.

Claims

1. An auxiliary diagnosis method for Alzheimer's disease based on nuclear magnetic resonance images is characterized by comprising the following steps:

2. The Alzheimer's disease auxiliary diagnosis method based on nuclear magnetic resonance images according to claim 1, wherein inputting the slice group into the convolutional neural network model with the new convolutional layer added to obtain the plane features comprises:

3. The Alzheimer's disease auxiliary diagnosis method based on nuclear magnetic resonance images according to claim 2, wherein introducing a shift window attention mechanism into the Transformer encoder module and establishing spatial connections between the plane features based on the shift window attention mechanism comprises:

4. The Alzheimer's disease auxiliary diagnosis method based on nuclear magnetic resonance images according to claim 1, wherein slicing the nuclear magnetic resonance sample images in the training set to obtain the slice group corresponding to each nuclear magnetic resonance sample image comprises:

5. The Alzheimer's disease auxiliary diagnosis method based on nuclear magnetic resonance images according to claim 1, wherein the acquisition of nuclear magnetic resonance sample images of brain structures of subjects of different categories comprises: