CN114510966B - End-to-end brain causal network construction method based on graph neural network - Google Patents
End-to-end brain causal network construction method based on graph neural network
- Publication number: CN114510966B
- Application number: CN202210040667.9A
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F18/253: Pattern recognition; fusion techniques of extracted features
- G06N3/04: Neural networks; architecture, e.g. interconnection topology
- G06N3/08: Neural networks; learning methods
- G06F2218/08: Feature extraction in pattern recognition adapted for signal processing
Abstract
The invention discloses an end-to-end brain causal network construction method based on a graph neural network, belonging to the field of electroencephalogram information processing. The invention designs a multi-layer perceptron with adjacent k-layer feature fusion for multi-dimensional feature extraction, and on this basis designs a graph neural network for direct mining of brain causality. A vector autoregressive model is then used to generate multivariate sequences with real electroencephalogram characteristics together with their causal supervision information, and the neural network model is trained in a supervised manner. With the trained neural network model, causal relationships in electroencephalogram data can be mined and a causal network constructed. Comparative experiments against Granger causal analysis, the representative traditional method, show that the invention has significant advantages in capturing causal network topology and causal relationship strength at low signal-to-noise ratio. The invention offers a new perspective for breaking free of the constraints of traditional model-driven assumptions and for directly mining deep brain causal network mechanisms in a data-driven way.
Description
Technical Field
The invention belongs to the field of electroencephalogram information processing, and in particular relates to an end-to-end brain causal network construction method based on a graph neural network.
Background
Brain networks are of great significance for studying the interaction among brain areas, and a causal network describes the causal influence characteristics between brain regions. To characterize causal relationships in brain activity, various methods have been proposed; they can be broadly divided into model-driven methods, represented by Granger Causality Analysis (GCA), and a small number of non-parametric estimation methods. However, methods based on model-driven assumptions depend on the reliability of those assumptions, suffer when the assumptions do not match reality, and struggle to characterize the nature of brain causal network mechanisms. Research on non-parametric estimation remains limited, mostly describing shallow dependency relationships at the level of low-dimensional signal features, with notable deficiencies in deep mining of high-dimensional features.
In recent years, the remarkable information mining capability of deep learning has attracted increasing attention, and deep neural network methods such as long short-term memory networks have gradually been applied to causal network analysis. However, current implementations mainly follow a model-driven approach: a neural network is trained, for instance in the manner of an autoregressive model, to predict a time series or to approximate a traditional model assumption. Measurement of causality then follows two main branches: first, adopting the idea of Granger-style causal analysis and defining causality between nodes based on prediction errors; second, defining causality between nodes by analyzing intermediate parameters of the trained neural network. In both branches, the input-to-output mapping learned by the network is from a historical time series to a future time series, which is essentially time-series prediction, so brain causal network connectivity is estimated only indirectly through model-driven reasoning rather than directly, end-to-end. Moreover, because most such models must be retrained on each individual sample, they cannot adapt to the complex noise environment of brain signals, and are therefore limited in model generalization, computational cost, and so on.
The brain is a complex system whose activity carries rich high-dimensional graph feature information. Graph neural networks focus on the structural characteristics of graphs, possess strong graph-structure modeling and expression capability, and are one of the important technologies for mining information from graph-structured (non-Euclidean) data. Electroencephalogram signals offer high temporal resolution, non-invasiveness, ease of operation and other advantages, and are an important imaging technique for describing dynamic causal brain activity. Building on causal network construction from electroencephalogram signals, the invention therefore proposes a robust end-to-end brain causal network construction method based on a graph neural network (GNN-C): under the multi-lead electroencephalogram (EEG) scenario, it directly learns the mapping between a multivariate time series and a brain causal network, and directly mines the brain causal network in a data-driven way.
Disclosure of Invention
To break through the limitations of traditional brain causal network analysis methods, which are constrained by model-driven assumptions and mine information insufficiently, the invention realizes direct data-driven mining of the interaction among multi-lead electroencephalogram signals at low signal-to-noise ratio and constructs a reliable brain causal network. The invention provides an end-to-end brain causal network construction method based on a graph neural network (GNN-C). A multi-layer perceptron with adjacent k-layer feature fusion is innovatively designed for deep feature mining; using a graph neural network, a reliable model is trained by supervised learning for direct estimation of brain causal network patterns, avoiding the model-driven assumptions relied on by traditional methods and achieving direct mining of causal relationships in the brain.
The technical scheme of the invention is as follows: an end-to-end brain causal network construction method based on a graph neural network comprises the following steps:
step S1: n-dimensional simulation electroencephalogram signal X= { X with time sequence causality and electroencephalogram signal characteristic is constructed by adopting vector autoregressive model 1 (t),x 2 (t),...,x N (T) }, t=1, 2,..t as training input to the neural network model, T representing the total number of samples and deriving a causal network matrix corresponding to the multivariate signalTraining tags as models, wherein +.>A real matrix representing n×n dimensions over a real field, N being the number of leads;
step S2: designing a graph neural network model of a multi-layer perceptron based on adjacent k-layer feature fusion;
step S3: add noise of different signal-to-noise ratios to the simulated electroencephalogram signal obtained in step S1, input the noisy signals into the graph neural network built in step S2 for training, and stop training once the loss function converges; the loss function of the graph neural network adopts:
where n is the number of samples in each training batch, and Y_i and Ŷ_i denote the predefined causal network and the causal network estimated by the model, respectively; the self-loop connection weights of both Y_i and Ŷ_i are set to 0;
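The loss formula itself is not legible in the source; below is a minimal sketch assuming a mean-squared-error form over a batch of causal network matrices, with self-loops zeroed on both sides as the text specifies. The function name and batch shapes are illustrative assumptions:

```python
import numpy as np

def causal_net_loss(Y_true, Y_pred):
    """Batch loss between predefined and model-estimated causal networks.

    The exact formula is not given in the extracted text; a mean-squared-error
    form is assumed here. The self-loop (diagonal) weights of both the labels
    and the predictions are zeroed before comparison, as the patent specifies.
    Inputs have shape (batch, N, N).
    """
    Y_true = np.array(Y_true, dtype=float).copy()
    Y_pred = np.array(Y_pred, dtype=float).copy()
    idx = np.arange(Y_true.shape[1])
    Y_true[:, idx, idx] = 0.0   # zero self-loop weights
    Y_pred[:, idx, idx] = 0.0
    return np.mean((Y_true - Y_pred) ** 2)

# Toy batch of two 3-lead networks with a single edge 0 -> 1
batch_true = np.zeros((2, 3, 3)); batch_true[:, 0, 1] = 1.0
batch_pred = np.zeros((2, 3, 3)); batch_pred[:, 0, 1] = 0.5
loss = causal_net_loss(batch_true, batch_pred)
```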
step S4: saving model parameters of the graph neural network when the loss function is the lowest in training;
step S5: load the saved network model parameters, input the electroencephalogram signal whose N-dimensional causal network is to be estimated into the trained graph neural network model, and obtain the causal network output.
Further, constructing the simulated electroencephalogram signal and its causal network in step S1 comprises the following flow:
step S11: defining an ith order state space system matrix of input data as:
where the first block is the M×(N−M)-dimensional zero matrix; the second block is an i-th order state-space system matrix formed by keeping the largest α% of the entries of a randomly generated zero-mean Gaussian matrix and setting the other elements to 0, each retained element specifying a causal relationship between a pair of signals; the third block is the i-th order parameter estimation matrix obtained from the real electroencephalogram signal X_1 by the least squares method; M < N, and P is the total order;
step S12: check whether the eigenvalues of the system matrix reconstructed from K_i, i = 1, ..., P, satisfy the requirement that its spectral radius is smaller than 1; if so, go to step S13, otherwise return to step S11;
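The stability check of step S12 can be sketched by assembling the companion (state-space) matrix from the lag matrices and testing its spectral radius against 1; the function name and array shapes are illustrative:

```python
import numpy as np

def var_is_stable(K):
    """Step-S12-style check: the spectral radius of the companion matrix
    built from the lag matrices K_1..K_P must be smaller than 1.

    K: array of shape (P, N, N).
    """
    P, N, _ = K.shape
    companion = np.zeros((P * N, P * N))
    companion[:N, :] = np.concatenate(list(K), axis=1)  # top row: [K_1 ... K_P]
    if P > 1:
        companion[N:, :-N] = np.eye((P - 1) * N)        # sub-diagonal identities
    rho = np.abs(np.linalg.eigvals(companion)).max()    # spectral radius
    return bool(rho < 1.0)

stable = var_is_stable(np.array([[[0.5]], [[0.2]]]))  # x(t)=0.5x(t-1)+0.2x(t-2)
unstable = var_is_stable(np.array([[[1.1]]]))         # explosive AR(1)
```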
step S13: obtain the synthesized electroencephalogram signal X_2 according to the iterative formula of the vector autoregressive model, X_2(t) = Σ_{i=1}^{P} K_i X_2(t−i) + ε(t), where ε is zero-mean Gaussian white noise;
step S14: splice the real electroencephalogram signal X_1 and the synthesized electroencephalogram signal X_2 along the row dimension to obtain the simulated electroencephalogram signal X; the corresponding causal network label is Y = |Σ_i K_i|.
Further, the step S2 includes the steps of:
step S21, input the input signal into a multi-layer perceptron with adjacent k-layer feature fusion to separately extract the multi-lead signal features;
step S22, input the features output in step S21 into a node-edge convergence layer to reconstruct the multi-lead feature information into edge information describing the pairwise interaction of signals between leads;
step S23, input the edge information output in step S22 into a multi-layer perceptron with adjacent k-layer feature fusion to separately extract the information features of all potential connecting edges;
step S24, input the features output in step S23 into an edge-node convergence layer, reconstructing the interactive edge information into aggregated features for each node;
step S25, input the features output in step S24 into a multi-layer perceptron with adjacent k-layer feature fusion to separately extract deep node information features;
step S26, input the features output in step S25 into a node-edge convergence layer and integrate the connection information to obtain integrated features;
step S27, splice the information features output in step S23 with the integrated features output in step S26, and input the result into a fully-connected network;
step S28, the features output by the fully-connected network in step S27 pass through a ReLU non-linear mapping, f(x) = max(0, x), where f(x) is the activation output of the neuron and x its activation input, yielding the weights of the sparse causal network connection matrix; an output of 0 indicates that the two nodes are not connected, a non-zero output indicates that they are connected, and a higher score indicates a stronger connection; reshaping the edge connection score vector then gives the weight matrix of inter-node connections, i.e. the causal network.
Further, the multi-layer perceptron with adjacent k-layer feature fusion described in steps S21, S23 and S25 consists of Z fully-connected layers and 1 batch normalization layer. The flow of feature information through this structure is defined as follows: for layers z ∈ {1, 2, ..., Z}, when z ≤ k the input of the current layer comes only from the output of the previous layer; when z > k the input of the current layer is the fused information of the output features of the fully-connected layers (z−1), (z−2), ..., (z−k), where k is a positive integer in the range (0, Z); finally, the output of the Z-th fully-connected layer passes through a batch normalization layer.
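A minimal sketch of this forward flow, assuming element-wise summation as the (unspecified) fusion operator, equal layer widths so that summation is well-defined, and plain standardization in place of batch normalization; the function names and dimensions are illustrative:

```python
import numpy as np

def elu(x, alpha=1.0):
    """ELU activation used by the perceptron layers."""
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def k_fusion_mlp(x, weights, biases, k=2):
    """Forward pass of an MLP with adjacent k-layer feature fusion.

    For layer z <= k the input is simply the previous layer's output;
    for z > k it is the fusion (assumed: sum) of the outputs of layers
    z-1, ..., z-k. The final output is standardized, standing in for the
    batch normalization layer of the patent.
    """
    outs = []
    h = x
    for z, (W, b) in enumerate(zip(weights, biases), start=1):
        if z > k:
            h = sum(outs[-k:])       # fuse the previous k layer outputs
        h = elu(h @ W + b)
        outs.append(h)
    h = outs[-1]
    return (h - h.mean(axis=0)) / (h.std(axis=0) + 1e-5)

rng = np.random.default_rng(0)
Z, H = 4, 8                          # 4 fully-connected layers of width 8
Ws = [rng.standard_normal((8, H)) * 0.1] + \
     [rng.standard_normal((H, H)) * 0.1 for _ in range(Z - 1)]
bs = [np.zeros(H) for _ in range(Z)]
out = k_fusion_mlp(rng.standard_normal((16, 8)), Ws, bs, k=2)
```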
Further, the node-edge convergence layers in steps S22 and S26 are constructed as follows: build the position coding matrices S and R of the sending node and the receiving node under the condition that all network nodes are connected, and perform matrix multiplication of S and R with the layer input V: (SV)_{ij} and (RV)_{ij}, 0 ≤ i ≤ N×N, 0 ≤ j ≤ H, where H is the number of output neurons of the multi-layer perceptron with adjacent k-layer feature fusion; this takes out the feature information corresponding to the sending and receiving nodes. The feature information of the sending node and of the receiving node is then spliced to obtain the edge feature information connecting the two nodes.
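The position-coding construction can be sketched as follows, assuming one row per directed node pair under full connectivity (N×N pairs, self-pairs included) and one-hot rows; names are illustrative:

```python
import numpy as np

def sender_receiver_encodings(N):
    """Position-coding matrices for the node-edge convergence layer.

    One row per potential directed edge (N*N under full connectivity),
    one column per node. Row e of S (resp. R) one-hot encodes the
    sending (resp. receiving) node of edge e = (s, r).
    """
    S = np.zeros((N * N, N))
    R = np.zeros((N * N, N))
    for s in range(N):
        for r in range(N):
            e = s * N + r
            S[e, s] = 1.0
            R[e, r] = 1.0
    return S, R

N, H = 5, 8
V = np.arange(N * H, dtype=float).reshape(N, H)  # toy per-node features
S, R = sender_receiver_encodings(N)
# S @ V picks out the sender's features for each edge, R @ V the receiver's;
# splicing (concatenating) them gives the edge features, shape (N*N, 2H).
E = np.concatenate([S @ V, R @ V], axis=1)
```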
Further, the edge-node convergence layer in step S24 multiplies the transpose R^T of the receiving-node position coding matrix (under full connection of the network nodes) with the layer input signal U: (R^T U)_{ij}, 0 ≤ i ≤ N, 0 ≤ j ≤ 2H, and divides by the number of electrode leads to obtain the average aggregation information of each node.
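A toy illustration of this averaging step, reusing a receiver position-coding matrix R of the assumed one-hot form from the node-edge layer; the shapes are illustrative:

```python
import numpy as np

# Edge-node convergence (step S24): R^T @ U sums, over all incoming edges,
# the edge features U of each receiving node; dividing by the number of
# leads N gives the average aggregation information described in the patent.
N, H2 = 5, 16
R = np.zeros((N * N, N))
for s in range(N):
    for r in range(N):
        R[s * N + r, r] = 1.0     # row e one-hot encodes the receiver of edge e

U = np.ones((N * N, H2))          # toy edge features, one row per edge
node_feats = (R.T @ U) / N        # shape (N, H2); each node averages N edges
```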
The beneficial effects of the invention are as follows: the invention provides an end-to-end brain causal network estimation method using a graph neural network built on multi-layer perceptrons with adjacent k-layer feature fusion, which learns the multidimensional mapping between a multivariate time series and a causal network by supervised learning. Exploiting the strong data mining capability of deep learning, it directly mines the causal relationships between signals in a data-driven way and constructs a brain causal network, avoiding the constraints of model-driven assumptions and the insufficient information mining of traditional methods. The method can construct brain networks from multi-lead electroencephalogram signals at low signal-to-noise ratio, and the model need not be retrained for new electroencephalogram input, giving lower computational cost and better generalization. Compared with the traditional method at low signal-to-noise ratio, it also offers more noise-robust estimation, higher accuracy and lower relative network error.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention.
Fig. 2 is a schematic diagram of a neural network structure of a multi-layer perceptron based on adjacent k-layer feature fusion.
FIG. 3 is a schematic diagram of a multi-layer perceptron of adjacent k-layer feature fusion.
FIG. 4 is a bar graph of the average connection estimation accuracy, recall and connection-strength relative error indices of the causal networks reconstructed, at different signal-to-noise ratios, by the graph neural network based on adjacent k-layer feature fusion and by Granger causal analysis.
FIG. 5 is a schematic diagram of the causal networks reconstructed, at different signal-to-noise ratios, by the graph neural network based on the multi-layer perceptron with adjacent k-layer feature fusion and by Granger causal analysis.
Detailed Description
Examples of embodiments of the present invention are further described below with reference to the accompanying drawings.
Referring to fig. 1, the invention provides an end-to-end brain causal network estimation method based on a graph neural network, which is implemented by the following steps:
step S1: constructing a time sequence causality simulation brain electrical signal with brain electrical signal characteristics by adopting a vector autoregressive model, taking the simulation brain electrical signal as the input of the model, and obtaining a predefined causality network as a label for calculating a loss function;
step S2: and designing a graph neural network model of the multi-layer perceptron based on the adjacent k-layer feature fusion. The designed graph neural network model is shown in fig. 2, and the structure of the multi-layer perceptron with adjacent k-layer feature fusion is shown in fig. 3;
step S3: adding noise with different signal-to-noise ratios to the simulated electroencephalogram signal obtained in step S1, inputting it into the graph neural network established in step S2 for training, and stopping training once the loss function converges;
step S4: saving model parameters of the graph neural network when the loss function is the lowest in training;
step S5: and loading the saved network model parameters, and inputting the brain electrical signals of the N-dimensional causal network to be estimated into a trained graph neural network to obtain causal network output.
In this embodiment, the step S1 specifically includes the following steps:
step S11: defining an ith order state space system matrix of input data as:
where the first block is the M×(N−M)-dimensional zero matrix; the second block is the i-th order state-space system matrix assigning the causal relationships among signals, whose elements are formed by keeping the largest α% of the values of a randomly generated zero-mean Gaussian matrix (the remaining elements are set to 0); the third block is the i-th order parameter estimation matrix obtained from the real electroencephalogram signal X_1 by the least squares method.
Step S12: checking the product by K i I.e. 1, if the P reconstructed system matrix eigenvalue meets the requirement of discrete state space iteration stability, turning to step S13, otherwise turning to step S11;
step S13: obtain the synthesized electroencephalogram signal X_2 by the iterative formula X_2(t) = Σ_{i=1}^{P} K_i X_2(t−i) + ε(t), where ε is zero-mean Gaussian white noise;
step S14: splice the real electroencephalogram signal X_1 and the synthesized electroencephalogram signal X_2 along the row dimension to obtain the simulated electroencephalogram signal X, with causal network label Y = |Σ_i K_i|.
In this embodiment, the real electroencephalogram signals used to generate the simulated EEG were acquired as follows: EEG signals were recorded from 17 subjects in the resting state using an ASA-Lab amplifier (ANT Neuro), with a device sampling rate of 500 Hz and the CPz and AFz electrodes as reference and ground, respectively. The collected EEG signals were preprocessed by: [−200 ms, 0 ms] baseline correction, 0.5–30 Hz band-pass filtering, downsampling to 256 Hz, sliding-time-window slicing (512 samples per slice, no overlap), and artifact removal with a voltage threshold of 75 microvolts.
In the present embodiment, the parameters for step S1 are as follows: N = 5, M = 3, Y = 512, α = 20, P = 3.
In this embodiment, the step S2 includes the following steps:
s21, inputting the multi-lead signal to a multi-layer perceptron adjacent to k-layer feature fusion to separate and extract the multi-lead signal features;
step S22, inputting the characteristics output in the step S21 into a node-edge convergence layer to reconstruct multi-lead characteristic information into side information of interaction of signals between every two leads;
step S23, inputting the features output in the step S22 into a multi-layer perceptron with adjacent k-layer feature fusion to separate and extract information features of all potential connecting sides;
step S24, inputting the characteristics output in the step S23 into an edge-node convergence layer, and reconstructing the interactive edge information into the characteristic information of each converged node;
step S25, inputting the features output in the step S24 into a multi-layer perceptron integrated with adjacent k-layer features to separate and extract deep node information features;
step S26, inputting the characteristics output in the step S25 into a node-edge convergence layer for connection information integration;
step S27, splicing the features output in the step S23 and the features output in the step S26, and inputting the spliced features into a fully-connected network;
step S28, the features output in step S27 pass through a ReLU non-linear mapping, f(x) = max(0, x), where f(x) is the activation output of the neuron and x its activation input, to obtain the weights of the sparse causal network connection matrix. Reshaping the edge connection score vector gives the weight matrix of inter-node connections, i.e. the causal network.
In this embodiment, the structure of the multi-layer perceptron with adjacent k-layer feature fusion is a multi-layer perceptron with adjacent 2-layer feature fusion, composed of 4 fully-connected layers and 1 batch normalization layer. The output features of the first fully-connected layer are the input of the second; the input of the third layer is the fused information of the output features of the first and second layers; the input of the fourth layer is the fused information of the output features of the second and third layers. Finally, the output of the fourth fully-connected layer passes through a batch normalization layer. Each layer has 64 neurons, and all neuron activations are the ELU activation function, f(x) = x for x > 0 and f(x) = e^x − 1 for x ≤ 0, where f(x) is the activation output of the neuron and x the activation input;
in step S27, the fully-connected network includes 1 input layer, 4 hidden layers and 1 output layer, the number of neurons in the hidden layers is 64, the number of neurons in the output layer is 1, and the activation functions of the neurons are all ELU activation functions.
In this embodiment, the node-edge convergence layers in steps S22 and S26 build the position coding matrices S and R of the sending and receiving nodes under the condition that all network nodes are connected, and multiply S and R with the layer input V: (SV)_{ij} or (RV)_{ij}, 0 ≤ i ≤ N×N, 0 ≤ j ≤ H, where H is the number of output neurons of the multi-layer perceptron with adjacent k-layer feature fusion, so as to take out the corresponding feature information of the sending and receiving nodes; the feature information of the sending and receiving nodes is then spliced to obtain the edge feature information connecting the two nodes, where N is 5.
In the present embodiment, the edge-node convergence layer in step S24 multiplies the transpose R^T of the receiving-node position coding matrix (with all network nodes connected) with the layer input signal U: (R^T U)_{ij}, 0 ≤ i ≤ N, 0 ≤ j ≤ 2H; dividing by the number of electrode leads gives the average aggregation information of each node, where N is 5.
In this embodiment, the training of the neural network in step S3 uses the following loss function:
where n is the number of samples in each training batch, set to 4000; Y_i and Ŷ_i are the predefined causal network and the causal network estimated by the model, respectively, with the self-loop connection weights of both set to 0. Training uses the ADAM optimization algorithm with an initial learning rate of 0.006, and the learning rate is updated during training according to the following adjustment strategy:
where j represents the iteration number; the maximum number of training iterations is set to 800.
In this embodiment, the causal network estimation performance of the graph neural network is tested as follows: using the flow described in step S1, 100 simulated 5-lead electroencephalogram signals and their corresponding causal networks are generated; under each signal-to-noise ratio condition (−5 dB, 0 dB, 5 dB, 10 dB), a noisy environment is simulated 100 times to produce noisy signals; causal network connection matrices are then estimated from the generated test data with GNN-C and GCA, respectively; finally, method performance is quantified by the average connection estimation accuracy, recall and connection-strength relative error over the 100 noise tests of the 100 simulated datasets, and the capture of the causal network topology is observed. The index formulas for a single causal network are as follows:
accuracy rate: Accuracy = (TP + TN) / (TP + TN + FP + FN)
recall rate: Recall = TP / (TP + FN)
where TP is the number of edges for which the model correctly judges two nodes as connected with an information flow direction consistent with the predefined causal network, TN the number of edges for which the model correctly judges two nodes as not connected, FP the number of edges for which the model incorrectly judges two nodes as connected or with an information flow direction inconsistent with the predefined causal network, and FN the number of edges for which the model incorrectly judges two nodes as not connected; TP + TN + FP + FN equals the total number of possible edges among all 5 leads.
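Since the index formulas are not legible in the extracted text, here is a sketch assuming the standard confusion-matrix definitions, counting off-diagonal entries of the true and estimated causal matrices as edges; the function name and tolerance are illustrative:

```python
import numpy as np

def connection_metrics(Y_true, Y_pred, tol=1e-8):
    """Edge-level accuracy and recall for a directed causal network.

    The patent's formulas are given as images; the standard forms
        accuracy = (TP + TN) / (TP + TN + FP + FN),  recall = TP / (TP + FN)
    are assumed here. Self-loops (diagonal entries) are ignored.
    """
    mask = ~np.eye(Y_true.shape[0], dtype=bool)   # off-diagonal entries only
    t = np.abs(Y_true[mask]) > tol                # true edges
    p = np.abs(Y_pred[mask]) > tol                # predicted edges
    TP = np.sum(t & p)
    TN = np.sum(~t & ~p)
    FP = np.sum(~t & p)
    FN = np.sum(t & ~p)
    accuracy = (TP + TN) / (TP + TN + FP + FN)
    recall = TP / max(TP + FN, 1)
    return accuracy, recall

Y = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]], float)        # true network
Yh = np.array([[0, 0.9, 0.2], [0, 0, 0.8], [0, 0, 0]], float) # one false edge
acc, rec = connection_metrics(Y, Yh)
```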
relative error: RE_k = ||Ŷ_k − Y||₂ / ||Y||₂
where Ŷ_k denotes the causal network estimated by a method at a signal-to-noise ratio of k dB, Y denotes the predefined causal network, and ||·||₂ is the matrix 2-norm; finally, the averages of these indices over the 100 noise tests of the 100 samples are defined as the average connection estimation accuracy, recall and connection-strength relative error.
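The relative error can be sketched directly from this description, assuming the form ||Ŷ_k − Y||₂ / ||Y||₂ with the matrix 2-norm (largest singular value) named in the surrounding text:

```python
import numpy as np

def relative_error(Y_hat, Y):
    """Connection-strength relative error between the estimated causal
    network Y_hat and the predefined causal network Y, using the matrix
    2-norm (largest singular value). The exact formula is an image in the
    source, so this form is an assumption consistent with the text."""
    return np.linalg.norm(Y_hat - Y, 2) / np.linalg.norm(Y, 2)

Y = np.array([[0.0, 1.0], [0.0, 0.0]])
err_perfect = relative_error(Y, Y)        # exact recovery gives zero error
err_half = relative_error(0.5 * Y, Y)     # halving all strengths
```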
Figure 4 presents the performance differences between the proposed GNN-C and GCA at each signal-to-noise ratio, including average accuracy, recall and relative error ("x" indicates a significant difference between the corresponding GNN-C and GCA indices, p < 0.05). Fig. 5 shows one sample randomly selected from the 100 samples, for which the topology recovery over the 100 noise tests is observed. An arrowed edge in the figure indicates a causal relationship with unidirectional information flow, and an edge without an arrow indicates bidirectional information flow. The thicker an edge, the more often the method estimated that causal information-flow direction between the two nodes across the 100 reconstruction estimates. As shown in figs. 4 and 5, at every signal-to-noise ratio the average connection accuracy and recall of GNN-C are significantly better than those of the GCA method, GNN-C produces fewer false connections, and its robustly recovered structure agrees more closely with the predefined topology pattern. This shows that the GNN-C method captures brain causal network topologies better than GCA. Meanwhile, as can be seen from fig. 4, the relative error of the network connection strengths estimated by GNN-C is significantly lower than that of GCA, indicating that GNN-C not only better captures the topology pattern of the network but also estimates its causal connection strength values more accurately.
In conclusion, under low signal-to-noise ratio conditions, the proposed method, which directly estimates the brain causal network in a data-driven manner with a graph neural network, is more robust than the conventional method both in mining the causal relationships of information flow direction and in evaluating the interaction strength between signals, and can construct a reliable brain causal network.
The foregoing description is only of the preferred embodiments of the invention, and all changes and modifications that come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
Claims (2)
1. An end-to-end brain causal network construction method based on a graph neural network comprises the following steps:
step S1: n-dimensional simulation electroencephalogram signal X= { X with time sequence causality and electroencephalogram signal characteristic is constructed by adopting vector autoregressive model 1 (t),x 2 (t),...,x N (T) }, t=1, 2,..t as training input to the neural network model, T representing the total number of samples and deriving a causal network matrix corresponding to the multivariate signalTraining tags as models, wherein +.>A real matrix representing n×n dimensions over a real field, N being the number of leads;
step S2: designing a graph neural network model of a multi-layer perceptron based on adjacent k-layer feature fusion;
step S21, inputting the input signal into a multi-layer perceptron with adjacent k-layer feature fusion to separate and extract multi-lead signal features;
step S22, inputting the characteristics output in the step S21 into a node-edge convergence layer to reconstruct multi-lead characteristic information into side information of interaction of signals between every two leads;
step S23, inputting the side information output in the step S22 into a multi-layer perceptron with adjacent k-layer feature fusion to separate and extract the information features of all potential connecting sides;
step S24, inputting the characteristics output in the step S23 into an edge-node convergence layer, and reconstructing the characteristics from the interactive edge information into the characteristics of each convergence node;
step S25, inputting the features output in the step S24 into a multi-layer perceptron integrated with adjacent k-layer features to separate and extract deep node information features;
step S26, inputting the characteristics output in the step S25 into a node-edge convergence layer, and integrating connection information to obtain integrated characteristics;
step S27, the information features output in the step S23 are spliced with the integration features output in the step S26, and then the information features are input into a fully-connected network;
step S28, nonlinearly mapping the features output by the fully connected network in step S27 through a ReLU function f(x) = max(0, x), where f(x) denotes the activation output of a neuron and x denotes its activation input, to obtain the weights of the sparse causal network connection matrix; an output of 0 indicates that the two nodes are not connected, a non-zero output indicates that the two nodes are connected, and a higher score indicates a stronger connection between the two nodes; reshaping the edge connection score vector yields the weight matrix of inter-node connections, namely the causal network;
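This mapping step can be sketched as follows; the patent specifies only the ReLU f(x) = max(0, x) and the reshape, so the function and variable names below are illustrative assumptions.

```python
import numpy as np

def edges_to_causal_matrix(edge_scores, n_leads):
    """Step S28 sketch: apply ReLU to the edge-score vector produced by the
    fully connected network, then reshape the N*N scores into an N x N
    causal network matrix. A zero entry means 'no connection'; a larger
    positive entry means a stronger causal connection."""
    weights = np.maximum(0.0, edge_scores)    # f(x) = max(0, x), sparsifies
    return weights.reshape(n_leads, n_leads)  # causal connection matrix
```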
step S3: adding noise of different signal-to-noise ratios to the simulated electroencephalogram signal obtained in step S1, inputting the noisy signals into the graph neural network established in step S2 for training, and stopping training once the loss function converges; the loss function of the graph neural network adopts:
where n is the number of samples in each training batch, and Y_i and Ŷ_i are the predefined causal network and the model-estimated causal network, respectively, both with their self-loop connection weights set to 0;
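The text elides the loss formula itself; a batch mean squared error between Y_i and Ŷ_i, with self-loops zeroed as the claim states, is one loss consistent with the surrounding description. The choice of MSE and the names below are assumptions, not the patent's confirmed formula.

```python
import numpy as np

def batch_loss(y_true, y_pred):
    """Assumed MSE-style loss: mean over the batch of the squared
    Frobenius norm of (Y_i - Y_hat_i), with all self-loop connection
    weights set to 0 first, as the claim requires."""
    y_true = np.array(y_true, dtype=float).copy()
    y_pred = np.array(y_pred, dtype=float).copy()
    idx = np.arange(y_true.shape[-1])
    y_true[:, idx, idx] = 0.0     # zero self-loops in the labels
    y_pred[:, idx, idx] = 0.0     # and in the model estimates
    n = y_true.shape[0]           # samples per training batch
    return float(np.sum((y_true - y_pred) ** 2) / n)
```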
step S4: saving model parameters of the graph neural network when the loss function is the lowest in training;
step S5: loading the saved network model parameters, and inputting the N-dimensional electroencephalogram signal whose causal network is to be estimated into the trained graph neural network model to obtain the causal network output;
the multi-layer perceptron with adjacent k-layer feature fusion described in steps S21, S23 and S25 is structured as Z fully connected layers and 1 batch normalization layer; the transmission flow of feature information in this structure is defined as follows: for the z-th of the Z fully connected layers, z = 1, 2, ..., Z, when z < k the current layer's input comes only from the output of the previous layer, and when z > k the current layer's input is the fusion of the output features of the (z−1)-th, (z−2)-th, ..., (z−k)-th fully connected layers, where k is a positive integer in the range (0, Z); finally, the output of the Z-th fully connected layer passes through the batch normalization layer;
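The fusion rule above can be sketched as follows. This is a NumPy sketch under stated assumptions: equal layer widths so that fusion by element-wise summation is well-defined, ReLU activations, the z = k boundary treated as "previous layer only" (the claim leaves it ambiguous), and the final batch normalization omitted for brevity.

```python
import numpy as np

def fused_mlp_forward(x, weights, biases, k):
    """Forward pass of the adjacent-k-layer feature-fusion MLP
    (steps S21/S23/S25). outputs[z] holds the output of layer z,
    with outputs[0] being the network input."""
    outputs = [x]
    Z = len(weights)
    for z in range(1, Z + 1):
        if z <= k:
            inp = outputs[-1]            # only the previous layer's output
        else:
            inp = sum(outputs[-k:])      # fuse the last k layers' outputs
        h = inp @ weights[z - 1] + biases[z - 1]
        outputs.append(np.maximum(0.0, h))  # ReLU activation (assumed)
    return outputs[-1]
```

With identity weights, zero biases and k = 2, the third layer's input is the sum of the first two layers' outputs, doubling the signal.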
the node-edge convergence layers in steps S22 and S26 are structured as follows: under the assumption that all network nodes are connected to one another, position coding matrices S and R of the sending nodes and the receiving nodes are constructed; S and R are each matrix-multiplied with the layer input V, giving (SV)_ij and (RV)_ij with 0 ≤ i ≤ N×N and 0 ≤ j ≤ H, where H is the number of output neurons of the multi-layer perceptron with adjacent k-layer feature fusion, so that the feature information corresponding to the sending and receiving nodes is extracted; the feature information of the sending node and that of the receiving node are then concatenated to obtain the edge feature information connecting the two;
the edge-node convergence layer in step S24 is structured as follows: the transpose R^T of the position coding matrix of the receiving nodes under the all-nodes-connected assumption is matrix-multiplied with the layer input U, giving (R^T U)_ij with 0 ≤ i ≤ N and 0 ≤ j ≤ 2H, and the result is divided by the number of electrode leads to obtain the average aggregation information of each node.
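Both convergence layers can be sketched with one-hot position coding matrices over all N² ordered node pairs. This is a NumPy sketch; the function names and the one-hot construction are illustrative assumptions consistent with the claim's dimensions.

```python
import numpy as np

def build_encoders(n):
    """Position coding matrices S, R of shape (N^2, N) for a fully
    connected directed graph: row p of S (resp. R) one-hot selects the
    sending (resp. receiving) node of the p-th ordered node pair."""
    send = np.repeat(np.arange(n), n)   # sender index of each pair
    recv = np.tile(np.arange(n), n)     # receiver index of each pair
    return np.eye(n)[send], np.eye(n)[recv]

def node_to_edge(V, S, R):
    """Steps S22/S26: gather sender and receiver features via SV and RV,
    then concatenate into per-edge features of shape (N^2, 2H)."""
    return np.concatenate([S @ V, R @ V], axis=1)

def edge_to_node(U, R, n):
    """Step S24: aggregate edge features back onto receiving nodes via
    R^T U, divided by the number of leads N for average aggregation."""
    return (R.T @ U) / n
```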
2. The end-to-end brain causal network construction method based on a graph neural network according to claim 1, wherein the process of constructing the simulated electroencephalogram signal and its causal network in step S1 comprises the following steps:
step S11: defining an ith order state space system matrix of input data as:
wherein one block of the matrix is a zero matrix of dimension M×(N−M); a second block is formed by retaining the largest α% of the values of a randomly generated zero-mean Gaussian-distributed matrix and setting the other elements to 0, each element of this block specifying the causal relationship between a pair of signals; a third block is the i-th order parameter estimation matrix of the real electroencephalogram signal X₁ obtained by the least squares method, where M < N and P is the total order;
step S12: checking whether the eigenvalues of the system matrix reconstructed from K_i, i = 1, ..., P, satisfy a spectral radius smaller than 1; if so, proceeding to step S13, otherwise returning to step S11;
step S13: obtaining the synthesized electroencephalogram signal X₂ according to the iterative formula X₂(t) = ∑_{i=1}^{P} K_i X₂(t−i) + ε,
wherein ε is zero-mean white noise subject to a Gaussian distribution;
step S14: splicing the real electroencephalogram signal X₁ and the synthesized electroencephalogram signal X₂ along the row dimension to obtain the simulated electroencephalogram signal X, whose corresponding causal network is Y = |∑_i K_i|.
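The stability check and generation loop of steps S12 and S13 can be sketched as follows. This is a NumPy sketch: the companion-matrix form of the spectral-radius test and unit-variance Gaussian noise are standard VAR conventions assumed here, and the α% sparsification of the K_i is left to the caller.

```python
import numpy as np

def simulate_var(K, T, seed=0):
    """Steps S12-S13 sketch: verify that the VAR system given by the
    coefficient matrices K = [K_1, ..., K_P] has spectral radius < 1
    (otherwise the K_i must be redrawn, step S11), then iterate
    x(t) = sum_i K_i x(t - i) + eps with Gaussian white noise eps."""
    P, N = len(K), K[0].shape[0]
    top = np.hstack(K)                   # first block row of companion matrix
    if P > 1:
        shift = np.hstack([np.eye(N * (P - 1)), np.zeros((N * (P - 1), N))])
        companion = np.vstack([top, shift])
    else:
        companion = top
    if np.max(np.abs(np.linalg.eigvals(companion))) >= 1.0:
        raise ValueError("spectral radius >= 1: redraw the K_i (step S11)")
    rng = np.random.default_rng(seed)
    X = np.zeros((N, T + P))             # P zero columns as warm-up history
    for t in range(P, T + P):
        X[:, t] = sum(K[i] @ X[:, t - i - 1] for i in range(P))
        X[:, t] += rng.normal(size=N)    # zero-mean Gaussian white noise
    return X[:, P:]                      # N x T synthesized signal
```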
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210040667.9A CN114510966B (en) | 2022-01-14 | 2022-01-14 | End-to-end brain causal network construction method based on graph neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114510966A CN114510966A (en) | 2022-05-17 |
CN114510966B true CN114510966B (en) | 2023-04-28 |
Family
ID=81549240
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210040667.9A Active CN114510966B (en) | 2022-01-14 | 2022-01-14 | End-to-end brain causal network construction method based on graph neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114510966B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115221976A (en) * | 2022-08-18 | 2022-10-21 | 抖音视界有限公司 | Model training method and device based on graph neural network |
CN117273086B (en) * | 2023-11-17 | 2024-03-08 | 支付宝(杭州)信息技术有限公司 | Method and device for multi-party joint training of graph neural network |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110801228A (en) * | 2019-10-31 | 2020-02-18 | 郑州轻工业学院 | Brain effect connection measurement method based on neural network prediction |
WO2020151144A1 (en) * | 2019-01-24 | 2020-07-30 | 五邑大学 | Generalized consistency-based fatigue classification method for constructing brain function network and relevant vector machine |
WO2020253690A1 (en) * | 2019-06-17 | 2020-12-24 | 浙江大学 | Deep learning beam domain channel estimation method based on approximate message passing algorithm |
CN112989920A (en) * | 2020-12-28 | 2021-06-18 | 华东理工大学 | Electroencephalogram emotion classification system based on frame-level feature distillation neural network |
CN113392731A (en) * | 2021-05-31 | 2021-09-14 | 浙江工业大学 | Modulated signal classification method and system based on graph neural network |
Family Cites Families (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4861538B2 (en) * | 2010-04-28 | 2012-01-25 | パナソニック株式会社 | EEG measurement apparatus, electrical noise estimation method, and computer program for executing electrical noise estimation method |
CN107126193B (en) * | 2017-04-20 | 2020-02-28 | 杭州电子科技大学 | Multivariate causal relationship analysis method based on hysteresis order self-adaptive selection |
US11150875B2 (en) * | 2018-09-27 | 2021-10-19 | Microsoft Technology Licensing, Llc | Automated content editor |
CN110175580B (en) * | 2019-05-29 | 2020-10-30 | 复旦大学 | Video behavior identification method based on time sequence causal convolutional network |
CN111063442B (en) * | 2019-11-28 | 2022-07-15 | 南京邮电大学 | Brain disease process prediction method and system based on weak supervision multitask matrix completion |
CN111067514B (en) * | 2020-01-08 | 2021-05-28 | 燕山大学 | Multi-channel electroencephalogram coupling analysis method based on multi-scale multivariable transfer entropy |
CN113384277B (en) * | 2020-02-26 | 2022-09-20 | 京东方科技集团股份有限公司 | Electrocardiogram data classification method and classification system |
WO2021226778A1 (en) * | 2020-05-11 | 2021-11-18 | 浙江大学 | Epileptic electroencephalogram recognition system based on hierarchical graph convolutional neural network, terminal, and storage medium |
CN111544017A (en) * | 2020-05-25 | 2020-08-18 | 五邑大学 | GPDC graph convolution neural network-based fatigue detection method and device and storage medium |
CN111882825B (en) * | 2020-06-18 | 2021-05-28 | 闽江学院 | Fatigue prediction method and device based on electroencephalogram-like wave data |
US20220004875A1 (en) * | 2020-07-02 | 2022-01-06 | Mitsubishi Electric Research Laboratories, Inc. | Automated Construction of Neural Network Architecture with Bayesian Graph Exploration |
CN111984783B (en) * | 2020-08-28 | 2024-04-02 | 达闼机器人股份有限公司 | Training method of text generation model, text generation method and related equipment |
CN112386227B (en) * | 2020-11-09 | 2021-07-02 | 电子科技大学 | Causal network analysis method for multi-scale time series physiological signals |
CN112966735B (en) * | 2020-11-20 | 2023-09-12 | 扬州大学 | Method for fusing supervision multi-set related features based on spectrum reconstruction |
CN112784913B (en) * | 2021-01-29 | 2023-07-25 | 湖南大学 | MiRNA-disease association prediction method and device based on fusion of multi-view information of graphic neural network |
CN113128552B (en) * | 2021-03-02 | 2024-02-02 | 杭州电子科技大学 | Electroencephalogram emotion recognition method based on depth separable causal graph convolution network |
CN113158793B (en) * | 2021-03-15 | 2023-04-07 | 东北电力大学 | Multi-class motor imagery electroencephalogram signal identification method based on multi-feature fusion |
CN113313232B (en) * | 2021-05-19 | 2023-02-14 | 华南理工大学 | Functional brain network classification method based on pre-training and graph neural network |
CN113349801A (en) * | 2021-06-21 | 2021-09-07 | 西安电子科技大学 | Imaginary speech electroencephalogram signal decoding method based on convolutional neural network |
CN113806546B (en) * | 2021-09-30 | 2024-04-05 | 中国人民解放军国防科技大学 | Graph neural network countermeasure method and system based on collaborative training |
CN113919578B (en) * | 2021-10-18 | 2022-10-04 | 四川大学 | Personal user short-term load prediction method based on graph neural network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114510966B (en) | End-to-end brain causal network construction method based on graph neural network | |
CN109814523B (en) | CNN-LSTM deep learning method and multi-attribute time sequence data-based fault diagnosis method | |
CN110647900B (en) | Intelligent safety situation prediction method, device and system based on deep neural network | |
CN113609955B (en) | Three-phase inverter parameter identification method based on deep learning and digital twin | |
CN112015153B (en) | System and method for detecting abnormity of sterile filling production line | |
CN115769228A (en) | Automatic neural network structure constructed by Bayesian graph exploration | |
CN112132430B (en) | Reliability evaluation method and system for distributed state sensor of power distribution main equipment | |
CN110957015A (en) | Missing value filling method for electronic medical record data | |
CN111815806B (en) | Method for preprocessing flight parameter data based on wild value elimination and feature extraction | |
CN102487343A (en) | Diagnosis and prediction method for hidden faults of satellite communication system | |
CN114118586A (en) | Motor fault prediction method and system based on CNN-Bi LSTM | |
CN112528804A (en) | Electromyographic signal noise reduction and classification method based on generation countermeasure network | |
CN114983439A (en) | Brain state identification method fusing impulse neural network and binary dynamic network | |
CN115759461A (en) | Internet of things-oriented multivariate time sequence prediction method and system | |
CN113995417A (en) | Electrocardiosignal abnormity prediction method and system based on LSTM self-encoder | |
CN115046766A (en) | Small sample bearing fault diagnosis method based on two-dimensional gray image self-adaptive subspace | |
CN116094758B (en) | Large-scale network flow acquisition method and system | |
CN117009900A (en) | Internet of things signal anomaly detection method and system based on graph neural network | |
CN116797204A (en) | Primary air quantity fault early warning method for coal mill based on wavelet decomposition and reconstruction and TCN-GRU-Self-Attention | |
CN116955947A (en) | Rainfall-induced power transmission line slope displacement nonlinear prediction method and system | |
CN116361640A (en) | Multi-variable time sequence anomaly detection method based on hierarchical attention network | |
Thomas | The data as the model: Interpreting permanent downhole gauge data without knowing the reservoir model | |
CN114818823A (en) | Electroencephalogram channel selection method based on squeezing and activation graph convolution neural network | |
CN117633444A (en) | Sensor abnormal intelligent recovery method and device based on association network | |
CN116153069B (en) | Traffic flow model and data fusion driven traffic state estimation method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||