CN113781385B

CN113781385B - Combined attention-seeking convolution method for automatic classification of brain medical images

Info

Publication number: CN113781385B
Application number: CN202110299393.0A
Authority: CN
Inventors: 朱旗; 张耕
Original assignee: Nanjing University of Aeronautics and Astronautics
Current assignee: Nanjing University of Aeronautics and Astronautics
Priority date: 2021-03-19
Filing date: 2021-03-19
Publication date: 2024-03-08
Anticipated expiration: 2041-03-19
Also published as: CN113781385A

Abstract

The invention discloses a medical image automatic identification technology based on a graph convolution neural network. Based on brain diffusion tensor imaging and functional magnetic resonance imaging, a machine learning automatic diagnosis method capable of fusing two modal data and reasonably retaining and utilizing brain map structural information is designed. The current brain image classification lacks an effective method for fusing multi-mode images, and has poor learning effect on the image structure information in the data. The invention discloses a medical image automatic identification technology based on fusion diffusion tensor imaging of a graph convolution neural network and a functional magnetic resonance image. Specifically, we first aggregate neighbor information between brain region nodes on a brain diffusion Zhang Liangcheng image and then extract brain region joint attention. Then, a graph convolution operation is performed on the brain functional magnetic resonance image by taking the attention of the brain area as a reference. Finally, the characteristics are input into a multi-layer perceptron to be automatically identified and classified.

Description

Combined attention-seeking convolution method for automatic classification of brain medical images

Technical field: machine learning field, in particular automatic brain medical image recognition technology

The background technology is as follows: the technology is applied to the field of automatic identification of medical images by machine learning. Scientists in the machine learning field have proposed a number of novel technical models based thereon for implementing automatic recognition analysis of brain medical images. The graph convolution neural network technology is a deep neural network technology suitable for graph structure data analysis processing. Researchers in the field of deep learning solve the problem of processing graph structure data by designing a graph convolutional neural network framework.

Disclosure of Invention

Object of the Invention

The automatic identification technology of the medical image has important auxiliary effect on the work of doctors, and the invention and the application of a good technology in the related field can greatly improve the diagnosis level of hospitals. The existing automatic brain image recognition method has the defects of poor robustness and insufficient interpretation. The existing machine learning technology model cannot be well adapted to the graph structural characteristics of a brain function network, and lacks a multi-modal framework for organically fusing Functional Magnetic Resonance (FMRI) and Diffusion Tensor Imaging (DTI).

In order to solve the problems, a technical scheme is explored, which can organically integrate FMRI and DTI of the brain and can perfectly adapt to the structural characteristics of a brain network diagram, and the method has robustness and good interpretability.

Technical proposal

In order to achieve the above purpose, the technical scheme adopted by the invention comprises the following four steps:

firstly, a higher-order functional brain network is built based on FMRI, a higher-order functional brain network is built by using FMRI signal vectors of brain regions, and the fully-connected brain network built based on sparse representation in all brain regions can better reflect higher-order relations and potential hidden information in brain regions.

Secondly, a structural network for extracting brain region attention information is constructed based on DTI, in order to extract brain network attention weight, we use DTI structural data as edges between nodes, and FMRI as node feature vectors to construct a brain network diagram.

And (III) calculating a first-order second-order third-order attention score of the brain region through node aggregation, extracting joint attention on a DTI mode and an FMRI mode and using the joint attention on pooling of the GCN. This method of calculating attention does not require setting additional parameters, so that the calculation is very convenient. The attention score extracted by the above formula is then applied to the pooling layer of the GCN. We measure the intensity of the functional activity of each edge on the structural network represented by DTI, and further evaluate the attention score for each brain region node. If the attention score of a node on the structural network is relatively high we can understand that this brain region has relatively much functional activity through edges and other brain regions in the DTI structural network.

And fourthly, performing graph convolution and pooling operation, and classifying the output characteristics by using a multi-layer perceptron to obtain a final recognition result. And sorting the nodes by combining joint attention scores learned in two modes of DTI and FMRI, screening out unimportant nodes, and reserving nodes with high attention. In the downsampling of the GCN, a pooling layer adopts a mode of discarding nodes layer by layer to improve the fusion efficiency of higher-order neighbors of the nodes, and the method breaks through the structural integrity of a brain network graph, so that the lower network lacks the capability of fusing all the nodes. Therefore, a readout layer is added after each pooling layer of the GCN, the readout layer completes one time of aggregation of global information on the graph nodes of the current layer, and the result is fed back to the multi-layer perceptron for final classification.

Drawings

Construction of the higher-order functional brain network diagram of FIG. 1

FIG. 2 DTI structured brain network map construction

FIG. 3 illustrates aggregation of node attention

FIG. 4 GCN network structure diagram

Detailed Description

The implementation mode of the technical scheme is specifically introduced as follows:

the work firstly builds a higher-order brain network by using a sparse representation method, and the higher-order brain network is used as side information in a brain network diagram. And uses the signal vector of the original brain region FMRI time sequence as node information, thereby defining the complete graph structure. Different previous brain network classification diagnosis methods and frameworks, the method works by constructing a high-order graph functional brain network by using brain FMRI data, constructing a structural brain network graph by taking DTI structural data as edges between nodes and FMRI as node feature vectors, and extracting attention weight of brain regions.

Construction of higher-order functional brain networks based on FMRI

We use x= [ X ] ₁ ，x ₂ ，...，x ₉₀ ]∈R ^240×90 To represent the feature vectors of the nodes in the brain network graph, i.e. FMRI data of individual samples in the sample set, where 90 is the number of brain regions corresponding to each sample, x _i FMRI characteristic of the ith brain region of the sample. 240 is the length of the brain region feature vector.

The brain region FMRI signal vector is used for constructing a high-order functional brain network, and the fully-connected brain network constructed based on sparse representation in all brain regions can better embody the high-order relation and potential hidden information in the brain region function.

In the expression A _i Representative of the classical corresponding to the ith brain region, is composed of the characteristics of other N-1 brain regions, A _i The midbrain region corresponds to the element value of column i being zero. E (E) _i Is 90 x 1, representing an indication vector of the dictionary representation corresponding to the ith brain region, here representing the weight of the edge between brain region node i and other nodes. By calculating the indication vectors of the sparse representation corresponding to all brain areas, an indication vector matrix can be obtained:

E＝{E ₁ ，E ₂ ，......，E ₉₀ }

e is the adjacency matrix of the graph and represents the edges of the higher-order functional brain network graph. Since E is constructed using sparse representation, the value of many elements in the matrix will be zero, which means that there is no edge connection between the corresponding nodes. We use figure 1 to illustrate the construction of a high-level functional brain network graph. We name this network map as a high-level functional brain network map.

(II) constructing structural network for extracting brain region attention information based on DTI

Meanwhile, in order to extract the attention weight of the brain network, we construct a brain network graph by taking DTI structure data as edges between nodes and FMRI as a node feature vector. As shown in fig. 2, the edge matrix of the brain network diagram constructed by using DTI structure informationThe characteristic of the node is X= [ X ] as in the functional brain network diagram ₁ ，x ₂ ，...，x ₉₀ ]. We name this network map as a structural brain network map. The purpose of constructing a structural brain network graph is to extract the attention weight of brain region nodes.

(III) calculating first order second order third order attention score of brain region by node aggregation

This section describes how we draw joint attention and direct it on the DTI and FMRI modalitiesFor use in the pooling of GCNs. Firstly, in the first two steps we define a DTI structure brain network diagram taking FMRI signal vector as node characteristic, the node characteristic of the diagram is represented by X, and the edge of the diagram is represented byAnd (3) representing. We take the node joint attention score Z by aggregating the nodes.

Wherein the method comprises the steps of<x _i ，x _j >The inner product of the node i and the node j represents the correlation calculation of the characteristics of the two nodes, and the correlation calculation reflects the functional connection of the brain interval on the side of the brain network diagram. This means that we measure the intensity of the functional activity of each edge on the structural network represented by DTI, and thus evaluate the attention score for each brain region node. If the attention score of a node on the structural network is relatively high we can understand that this brain region has relatively much functional activity through edges and other brain regions in the DTI structural network. Layer 1 aggregated attention score we use Z ^l To indicate, the calculation mode of the first+1 layer is as follows:

this method of calculating attention does not require setting additional parameters, so that the calculation is very convenient. The attention score extracted by the above formula is then applied to the pooling layer of the GCN. Screening of nodes we used the topK mechanism

(IV) performing graph convolution and pooling operations and classifying the output features by using a multi-layer perceptron to obtain a final recognition result

The pooling mechanism of TopK is to imitate the thought of the maximum pooling operation in CNN and screen out the most valuable information. In the work, nodes are sequenced by combining joint attention scores learned in two modes of DTI and FMRI, unimportant nodes are screened out, and nodes with high attention are reserved.

After determining the sub-graph division and the corresponding adjacency matrix, we need to integrate and extract the information of the non-grid graph, and we define the aggregation method of node attention in the last step. We have first introduced the node aggregation approach of the convolutional layer.

h＝GNN(X，E)

Specifically, the method can be written as follows:

where W is a parameter to be trained, is->Is a degree matrix of (2). Layer 1+1 is defined as

Because the pooling layer adopts a topK method in the downsampling of the GCN, the fusion efficiency of the higher-order neighbors of the nodes is improved in a layer-by-layer node discarding mode, the method breaks the structural integrity of the brain network graph, and the lower network lacks the capability of fusing all the nodes. Therefore, a readout layer is added after each pooling layer of the GCN, the readout layer completes one time of aggregation of global information on the graph nodes of the current layer, and the result is fed back to the multi-layer perceptron for final classification. The readout layer is calculated as follows:

the above equation represents the result of stitching together the global average pooling and the global maximum pooling to get the readout layer, ||represents the stitching operation. Maximum pooling is a method commonly used in neural networks to extract features and reduce the impact of unwanted information, and average pooling is used as a complement to maximum pooling to preserve background information, overfitting due to non-uniformity of feature data divisions. The readout layer structures of the layers are added to give a full-view representation:

the readout layer functions like the global pooling of the convolutional layers commonly used in CNN network models, and it is their common feature to obtain global expressions by one-time aggregation of global inputs. And similar to global pooling in CNN, the readout layer may also employ common operations such as summing, averaging, maximizing, etc. The GCN network eventually containing the readout layer is shown in FIG. 4

And aggregating neighbor nodes on the DTI structure brain network diagram to obtain joint attention on DTI and FMRI modes, then pooling unimportant brain region nodes by using a topK according to the value of the attention, and in order to learn global information of each layer, setting a reading layer behind each pooled layer to aggregate the information of the global nodes, and finally adding and transmitting the results of the reading layers of each layer to a multi-layer perceptron for classification.

Claims

1. The joint attention seeking convolution method for automatic classification of brain medical images is characterized by comprising the following steps of: the method comprises the following steps:

constructing a high-order functional brain network based on FMRI, constructing the high-order functional brain network by using correlation among FMRI signal vectors of a brain region by using a sparse representation method, and taking an original vector as node characteristics of network nodes after constructing the brain network by using the sparse representation method;

with X= [ X ] ₁ ,x ₂ ,…,x ₉₀ ]∈R ^240×90 To represent the feature vector of the nodes in the brain network diagram, wherein 90 is the number of brain areas corresponding to each sample, and x _i The ith brain region representing the sample240 is the length of the brain region feature vector, and the mathematical expression of the fully connected brain network constructed based on sparse representation is as follows:

in the expression A _i Representative of the classical corresponding to the ith brain region, is composed of the characteristics of other N-1 brain regions, A _i The element value of the midbrain region corresponding to the ith column is zero, E _i A 90×1 vector representing an indication vector of a dictionary representation corresponding to the ith brain region, where the indication vector represents the weight of an edge between the brain region node i and other nodes;

by calculating the indication vectors of the sparse representation corresponding to all brain areas, an indication vector matrix can be obtained:

E＝{E ₁ ,E ₂ ,……,E ₉₀ }

e is an adjacent matrix of the graph and represents the edges of the higher-order functional brain network graph, and because E is constructed by using sparse representation, the value of partial elements existing in the matrix is zero, which represents that no edge connection exists between the corresponding nodes;

secondly, constructing a structural network for extracting brain region attention information based on DTI, and constructing a brain network diagram by taking DTI structural data as edges between nodes and FMRI as node feature vectors in order to extract brain network attention weights;

thirdly, calculating a first-order second-order third-order attention score of the brain region through node aggregation, extracting joint attention on a DTI mode and an FMRI mode and using the joint attention on pooling of the GCN;

the extraction method of the attention of the brain region node in the step comprises the step of calculating the importance degree of the brain region by using a graph convolution node aggregation mode, and the specific steps are as follows:

defining a DTI structure brain network diagram taking FMRI signal vector as node characteristic, wherein the node characteristic of the diagram is represented by X, and the side of the diagram is represented byRepresentation, use ofThe node joint attention score Z is obtained by aggregating the nodes, and the expression is as follows:

wherein the method comprises the steps of<x _i ,x _j >The inner product of the node i and the node j represents the correlation calculation of the characteristics of the two nodes, and the functional connection between brains on the side of a brain network diagram is reflected; this means that the intensity of the functional activity of each edge is measured on the structural network represented by DTI, and thus the attention score is evaluated for each brain region node;

if the attention score of a node on the structural network is high, it is understood that this brain region has performed relatively more functional activities through edges and other brain regions in the DTI structural network;

the attention score of the first layer aggregate is Z ^l To indicate, the calculation mode of the first+1 layer is as follows:

this method of calculating attention does not require setting additional parameters, and the attention score extracted by the above formula is applied to the pooling layer of the GCN;

fourthly, performing graph rolling and pooling operation, classifying output characteristics by using a multi-layer perceptron to obtain a final recognition result, sorting nodes by combining joint attention scores learned by two modes of DTI and FMRI, screening unimportant nodes by using a TopK strategy, and reserving nodes with high attention;

in the process of learning brain structure and function information and extracting features, the DTI and FMRI two-mode data are used simultaneously, and the function network and the structure network are fused in the graph convolution neural network to improve the performance of automatic diagnosis.