CN114089834A - Electroencephalogram identification method based on time-channel cascade Transformer network - Google Patents
- Publication number
- CN114089834A (application number CN202111614470.3A)
- Authority
- CN
- China
- Prior art keywords
- time
- module
- layer
- output
- channel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06F3/015—Input arrangements based on nervous system activity detection, e.g. brain waves [EEG] detection, electromyograms [EMG] detection, electrodermal response detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/08—Feature extraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/12—Classification; Matching
Abstract
The invention discloses an electroencephalogram identification method based on a time-channel cascade Transformer network. The invention comprises the following steps: 1: acquiring imagined English-character electroencephalogram data and constructing a preprocessing module; 2: constructing a time module of the time-channel cascade Transformer network, wherein the input data of the time module is the preprocessed electroencephalogram data and the output is the extracted time features; 3: constructing an electroencephalogram channel module of the time-channel cascade Transformer network, wherein the input data of the electroencephalogram channel module is the time features and the output is the extracted space-time fusion features; 4: constructing a classification module of the time-channel cascade Transformer network, wherein the input of the classification module is the space-time fusion features and the output is the classification result. The invention can effectively improve the identification accuracy of imagined-character electroencephalogram signals.
Description
Technical Field
The invention belongs to the field of brain-computer interfaces and deep learning, and particularly relates to an electroencephalogram identification method based on a time-channel cascade Transformer network.
Background
A Brain-Computer Interface (BCI) is a cross-disciplinary technology involving neuroscience, signal processing, pattern recognition and other fields. It recognizes human intention by detecting brain neural activity and converts that activity into commands for driving external equipment, replacing the limbs or speech organs of the human body to realize communication with, and control over, the external environment; it is a novel mode of human-computer interaction. With the rapid development of the related fields, brain-computer interface technology and theoretical research have made remarkable progress, attracted wide international attention, and become one of the research hotspots in biomedical engineering, computer technology, communication and other fields.
In order to realize high-resolution activity mapping of neuronal networks, the potential activity of brain neurons at corresponding positions of the cerebral cortex can be recorded over long periods on a sub-millisecond time scale by implanting a group of closely arranged microelectrode arrays (MEA) in the brain. A key advantage of the MEA is the ability to simultaneously record from and stimulate neurons at multiple sites. Recording the signals of target neuron clusters with a high-sensitivity, high-stability electrode arrangement yields a higher signal-to-noise ratio and good temporal and spatial resolution, and ensures that the precise firing times and waveforms of neuronal action potentials are recorded over a large range with high precision, laying a solid foundation for fully extracting neural information and reading out the activity of cranial nerve networks.
Patients with nervous system diseases cannot communicate effectively with the outside world because of nerve damage in some part of the body. Existing research shows that implanting a microelectrode array into a damaged brain area can restore lost functions through electrical stimulation, such as mechanical arm control, imagined typing, imagined speech control and evoked touch, so electroencephalogram signal recognition based on mental imagery is an important development direction of BCI. Through analysis of motor-imagery electroencephalogram signals, the neural activity of the human brain during the imagination process can be identified, so that the thoughts and intentions of patients with impaired mobility can be transmitted to the outside, further realizing neural decoding. Research on imagery electroencephalogram signal identification can therefore promote the exploration of brain cognition and the rehabilitation of cerebral diseases, and has important research value and practical significance for novel human-computer interaction. The invention mainly addresses the task of recognizing imagined-character electroencephalogram signals in the field of mental imagery.
In recent years, with the wide application of deep learning in research fields such as computer vision and natural language processing, neural networks have shown a strong capability for processing nonlinear, high-dimensional data and have therefore also been applied to the data analysis of brain-computer interfaces. An invasive BCI, which records the signals of target neuron clusters with a high-sensitivity, high-stability electrode arrangement, has a higher signal-to-noise ratio and good temporal and spatial resolution. Conventional algorithms focus only on electroencephalogram features along the time sequence when identifying electroencephalogram signals and rarely consider the importance of different feature channels; methods such as the Convolutional Neural Network (CNN) depend on the selection of convolution kernels, and the Recurrent Neural Network (RNN) cannot process sequences in parallel and attends only to previous records and the current state. Therefore, the invention provides an electroencephalogram identification method based on a time-channel cascade Transformer network, which uses a self-attention mechanism to extract features from the time dimension and the electroencephalogram channel dimension, and further extracts electroencephalogram channel feature information by fusion in a residual cascade mode. The invention can effectively improve the recognition performance of imagined-character electroencephalogram signals.
Disclosure of Invention
The invention provides an electroencephalogram identification method based on a time-channel cascade Transformer network, which can effectively improve the identification accuracy of imagined-character electroencephalogram signals.
The technical scheme adopted by the invention to solve the technical problem comprises the following steps, as shown in figure 1:
step S1: acquiring imagined English-character electroencephalogram data, constructing a preprocessing module, and performing data preprocessing with a time alignment technique, wherein the input of the preprocessing module is the imagined English-character electroencephalogram data and its output is the preprocessed electroencephalogram data;
step S2: constructing the time module of the time-channel cascade Transformer network, wherein the input data of the time module is the preprocessed electroencephalogram data output by step S1 and its output is the extracted time features;
step S3: constructing the electroencephalogram channel module of the time-channel cascade Transformer network, wherein the input data of the electroencephalogram channel module is the time features output by step S2 and its output is the extracted space-time fusion features;
step S4: constructing the classification module of the time-channel cascade Transformer network, wherein the input of the classification module is the space-time fusion features output by step S3 and its output is the classification result.
The invention has the following beneficial effects:
The invention provides an electroencephalogram identification method based on a time-channel cascade Transformer network; a network block diagram of the method is shown in figure 2. The time-channel cascade Transformer network comprises a time module, an electroencephalogram channel module and a classification module, which are respectively used for acquiring the time features, the space-time fusion features and the classification result. Meanwhile, the time-channel cascade Transformer network adopts random position coding and adds classification identification bits so as to realize high-precision classification of imagined-character electroencephalograms.
Drawings
FIG. 1 is a schematic flow chart of the main steps of the present invention
FIG. 2 is a block diagram of a time-channel cascaded Transformer network
FIG. 3 is an internal structure diagram of a time-channel cascaded Transformer network module
Detailed Description
The invention is further illustrated by the following figures and examples.
As shown in figs. 1-3, an electroencephalogram identification method based on a time-channel cascade Transformer network includes the following steps:
step S1: acquiring imagined English-character electroencephalogram data, constructing a preprocessing module, and performing data preprocessing with a time alignment technique, wherein the input of the preprocessing module is the imagined English-character electroencephalogram data and its output is the preprocessed electroencephalogram data;
step S2: constructing the time module of the time-channel cascade Transformer network, wherein the input data of the time module is the preprocessed electroencephalogram data output by step S1 and its output is the extracted time features;
step S3: constructing the electroencephalogram channel module of the time-channel cascade Transformer network, wherein the input data of the electroencephalogram channel module is the time features output by step S2 and its output is the extracted space-time fusion features;
step S4: constructing the classification module of the time-channel cascade Transformer network, wherein the input of the classification module is the space-time fusion features output by step S3 and its output is the classification result.
The step S1 includes:
the method for acquiring the brain electrical data of the English characters by the aid of the ideological imagery is a mature technology, and the size of the data is 201 × 192 generally.
The preprocessing module adopts a time alignment technique (F.R. Willett, D.T. Avansino, L.R. Hochberg, et al., High-performance brain-to-text communication via handwriting), a mature technique, to eliminate inconsistencies in the writing speed of the imagined characters; the size of the output data is consistent with that of the input data.
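The patent's actual alignment follows the cited Willett et al. method; purely as a loose, non-authoritative illustration of what resampling trials to the fixed 201 × 192 size could look like, each channel of a trial can be stretched or compressed to the target length by linear interpolation (the function name and the interpolation choice are assumptions, not the patent's method):

```python
import numpy as np

def resample_to_length(trial, target_len=201):
    """Illustration only: stretch/compress a (T, 192) trial to (target_len, 192)
    by per-channel linear interpolation. The patent's real alignment is the
    technique of Willett et al., not this function."""
    t_old = np.linspace(0.0, 1.0, trial.shape[0])
    t_new = np.linspace(0.0, 1.0, target_len)
    # Interpolate each of the 192 electrode channels onto the common time grid.
    return np.stack(
        [np.interp(t_new, t_old, trial[:, ch]) for ch in range(trial.shape[1])],
        axis=1,
    )

aligned = resample_to_length(np.random.randn(150, 192))
print(aligned.shape)  # (201, 192)
```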
The step S2 includes:
the time module sequentially comprises the following structures: the preprocessed data output in step S1 is used as input feature- > position coding layer- > multi-head self-attention mechanism module- > residual connecting layer- > LN regularization layer- > feed-forward network- > residual connecting layer- > LN regularization layer- > output time feature i. The time module executes 2 times in a co-loop manner, and the time characteristic I- > multi-head self-attention mechanism module- > residual error connecting layer- > LN regularization layer- > feedforward network- > residual error connecting layer- > LN regularization layer- > output time characteristic II for the first time.
Referring to fig. 3, the position coding layer adopts a random position coding method, used to learn the positional relationships of the electroencephalogram data in the time dimension during network training. The position coding layer outputs a random-number matrix with the same shape as its input data, which is added to the input data to serve as the input of the multi-head self-attention mechanism module.
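As a non-authoritative PyTorch sketch of the time-module pipeline above (the patent publishes no code), the random position code can be a learnable parameter added to the input, followed by two encoder passes of attention + residual + LN + FFN. Here d_model is set to 192 so the output matches the 201 × 192 time-feature size stated later; note that standard `nn.MultiheadAttention` splits the embedding across heads, whereas the patent assigns each head a full 128-dimensional subspace, so this is a simplification:

```python
import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    """One pass of the time module: MHA -> residual+LN -> FFN -> residual+LN."""
    def __init__(self, d_model, n_heads=16, d_ff=512):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln1 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                                 nn.Linear(d_ff, d_model))
        self.ln2 = nn.LayerNorm(d_model)

    def forward(self, x):
        a, _ = self.attn(x, x, x)           # multi-head self-attention
        x = self.ln1(x + a)                 # residual connection + LN
        return self.ln2(x + self.ffn(x))    # feed-forward + residual + LN

class TimeModule(nn.Module):
    """Random (learnable) position code, then two encoder passes (step S2)."""
    def __init__(self, seq_len=201, d_model=192, n_passes=2):
        super().__init__()
        self.pos = nn.Parameter(torch.randn(1, seq_len, d_model))  # random position code
        self.blocks = nn.ModuleList(EncoderBlock(d_model) for _ in range(n_passes))

    def forward(self, x):                   # x: (batch, 201, 192)
        x = x + self.pos                    # add position code to input
        for blk in self.blocks:
            x = blk(x)
        return x

tm = TimeModule()
feat = tm(torch.randn(2, 201, 192))
print(feat.shape)  # torch.Size([2, 201, 192])
```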
The multi-head self-attention mechanism module maps the input features to different subspaces and computes an attention vector in each subspace by dot-product operation. Finally, the attention vectors computed in all the subspaces are concatenated and mapped back to the original input space to obtain the final attention vector as output. This allows the model to learn relevant information in different representation subspaces. The expression of the multi-head self-attention mechanism module is as follows (1):

$$\mathrm{MultiHead}(Q,K,V)=\mathrm{Concat}(\mathrm{head}_1,\dots,\mathrm{head}_h)W^{O},\qquad \mathrm{head}_i=\mathrm{Attention}(QW_i^{Q},\,KW_i^{K},\,VW_i^{V}) \tag{1}$$

where $i$ indexes the different subspaces; the query vector $Q$, key vector $K$ and value vector $V$ are the inputs of the multi-head self-attention module; $W_i^{Q}$, $W_i^{K}$ and $W_i^{V}$ are the mapping matrices of $Q$, $K$ and $V$ in the different subspaces; and $W^{O}$ is the output mapping matrix applied to the concatenation of all subspace outputs. The attention vector in each independent subspace is computed as follows: the query vector $Q$ is dot-multiplied with the key vector $K$, the result is divided by the square root of the key dimension, $\sqrt{d_k}$, to obtain the score matrix of the query vector $Q$; the score matrix is normalized with the Softmax function, which has good perceptual properties, to obtain the weight matrix; and this weight matrix is multiplied by the value vector $V$ to obtain the attention vector of one subspace, as in the following formula (2):

$$\mathrm{Attention}(Q,K,V)=\mathrm{Softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V \tag{2}$$
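The scaled dot-product attention of formula (2) can be sketched numerically; this is the standard form, with nothing patent-specific assumed beyond the 128-dimensional subspaces:

```python
import torch

def attention(Q, K, V):
    """Scaled dot-product attention, formula (2): Softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.size(-1)
    scores = Q @ K.transpose(-2, -1) / d_k ** 0.5   # score matrix of Q
    weights = torch.softmax(scores, dim=-1)         # rows normalized to sum to 1
    return weights @ V                              # weighted sum of value vectors

# One subspace with d_q = d_k = d_v = 128, as in the embodiment.
Q = torch.randn(4, 128); K = torch.randn(4, 128); V = torch.randn(4, 128)
out = attention(Q, K, V)
print(out.shape)  # torch.Size([4, 128])
```

With all-zero scores the weights are uniform, so attending over all-ones values returns exactly ones, which is a quick sanity check on the normalization.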
for the data set adopted by the invention, the dimension d of the parameter matrix of Q, K, Vq,dkAnd d andvare all 128, the number of heads is 16, dmodelIs 256. We transform the query vector Q from d by a linear transformationmodelDimension mapping as dqHead, from the key vector KmodelDimension mapping as dkHead, vector of values V from dmodelDimension mapping as dvHead. By implicitly increasing the number of attention heads without reducing the hidden dimension assigned to each attention head, global features can be efficiently extracted and classification accuracy can be improved. The residual connecting layer is mainly used for residual connection, and the method effectively solves the problems of gradient disappearance and gradient explosion; the LN regularization layer is mainly used for normalizing input data; the Feed-Forward Network module (FFN) is composed of two layers of Feed-Forward neural networks, wherein the first layer of Feed-Forward neural Network is used for inputting features from dmodelThe dimensionality is mapped into 512 dimensionality, the activation function is a GELU function, and the second-layer feedforward neural network is mapped back to d from the 512 dimensionalitymodelDimension, no activation function is used. The expression of the feedforward network is as follows (3):
$$\mathrm{FFN}(x)=\max(0,\;xW_1+b_1)W_2+b_2 \tag{3}$$

where $W_1$ and $W_2$ are randomly initialized weight matrices and $b_1$ and $b_2$ are randomly initialized biases.
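Note that formula (3) as written uses $\max(0,\cdot)$ (ReLU) while the description names GELU as the first layer's activation; both readings are sketched below as assumptions, with the dimensions from the text ($d_{model}=256$, hidden width 512):

```python
import torch
import torch.nn as nn

d_model, d_ff = 256, 512

# As formula (3) is written: max(0, xW1 + b1) W2 + b2, i.e. a ReLU between the layers.
ffn_relu = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))

# As the prose describes: GELU after the first layer, no activation after the second.
ffn_gelu = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))

x = torch.randn(201, d_model)        # one sequence of 201 feature vectors
print(ffn_gelu(x).shape)  # torch.Size([201, 256])
```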
The interior of the time module of the time-channel cascade Transformer relies entirely on the attention mechanism for modeling. The multi-head attention mechanism solves the long-distance dependency problem of the Recurrent Neural Network (RNN) and its variants, attends to the importance of each time step of the electroencephalogram signal within the global sequence, effectively captures global context information, and has better feature-extraction capability. This structure avoids the way a CNN stacks convolutional layers to obtain global information, allowing the model to achieve good performance.
The step S3 includes:
and constructing a time-channel cascade Transformer electroencephalogram channel module, wherein input data of the electroencephalogram channel module are time characteristics, the size of the time-channel cascade Transformer electroencephalogram channel module is 201 x 192, and output of the electroencephalogram channel module is extracted space-time fusion characteristics, and the size of the time-channel cascade Transformer electroencephalogram channel module is 196 x 256.
The time features output by step S2 and the preprocessed electroencephalogram data output by step S1 are fused in a residual cascade mode; the size of the fused features is 201 × 192.
the electroencephalogram channel module structure based on time-channel cascade Transformer sequentially comprises the following steps: the time characteristic output in the step S2 is used as an input characteristic- > characteristic cascade layer- > real layer- > Linear layer- > classification identification bit layer (performing splicing class token operation) - > multi-head self-attention mechanism module- > residual error connecting layer- > LN regularization layer- > feedforward network- > residual error connecting layer- > LN regularization layer- > output space-time fusion characteristic i, and the module locally circulates for 2 times; and during the second local circulation, the space-time fusion characteristic I- > a multi-head self-attention mechanism module- > a residual error connecting layer- > an LN regularization layer- > a feedforward network- > a residual error connecting layer- > an LN regularization layer- > outputs a space-time fusion characteristic II.
The feature cascade layer connects the time features output by step S2 with the preprocessed electroencephalogram data output by step S1 through a residual operation, giving fused features of size 201 × 192; the Rearrange layer performs a dimensional transformation, converting the fused features from 201 × 192 to 192 × 201; the Linear layer performs feature mapping by a linear operation, mapping the fused features from 201 dimensions to 256 dimensions; the classification identification bit layer adopts the classification-identification-bit (class token) method of the paper (Dosovitskiy A, Beyer L, Kolesnikov A, et al., An image is worth 16x16 words: Transformers for image recognition at scale), a mature technique. Its function is that, during training, the information of the whole channel sequence can be acquired through the attention mechanism, effectively avoiding the influence of any single original sequence node. The four identification bits are concatenated in front of the 192-channel dimension of the fused features, transforming the fused features to 196 × 256. The subsequent structure is kept consistent with the time module of step S2. The spatial features of the electroencephalogram data are further extracted by the channel module, finally yielding the space-time fusion features.
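The front end of the channel module described above (residual cascade, Rearrange, Linear, class-token concatenation) can be sketched in PyTorch; the class name and the learnable-parameter implementation of the four class tokens are assumptions, while the shape progression 201 × 192 -> 192 × 201 -> 192 × 256 -> 196 × 256 follows the description:

```python
import torch
import torch.nn as nn

class ChannelFrontEnd(nn.Module):
    """Residual cascade + Rearrange + Linear + class tokens (shapes per the description)."""
    def __init__(self, t=201, c=192, d_model=256, n_tokens=4):
        super().__init__()
        self.proj = nn.Linear(t, d_model)                           # 201 -> 256
        self.cls = nn.Parameter(torch.randn(1, n_tokens, d_model))  # 4 class tokens

    def forward(self, time_feat, raw):
        fused = time_feat + raw                 # residual cascade: (B, 201, 192)
        fused = fused.transpose(1, 2)           # Rearrange:        (B, 192, 201)
        fused = self.proj(fused)                # Linear:           (B, 192, 256)
        cls = self.cls.expand(fused.size(0), -1, -1)
        return torch.cat([cls, fused], dim=1)   # class tokens in front: (B, 196, 256)

m = ChannelFrontEnd()
x = torch.randn(2, 201, 192)                    # time features and preprocessed data
print(m(x, x).shape)  # torch.Size([2, 196, 256])
```

The 196 × 256 output then passes through the same attention/FFN blocks as the time module.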
The step S4 includes:
and constructing a time-channel cascade Transformer classification module, wherein the input of the classification module is the space-time fusion characteristics output by the step S3, the size of the classification module is 196 × 256, and the output of the classification module is a classification result.
The structure of the classification module based on the time-channel cascade Transformer sequentially comprises: the space-time fusion features output by step S3 as input features -> interception of the classification identification bits -> Reduce layer -> LN regularization layer -> Linear layers (3 layers) -> output classification result.
Four classification identification bits of size 4 × 256 are intercepted from the space-time fusion features obtained in step S3; the Reduce layer reduces the 4 × 256 bits to 1 × 256; the first Linear layer maps the input features from 256 dimensions to 256 × 2 dimensions, the second layer maps them from 256 × 2 dimensions back to 256 dimensions, and the third layer maps them from 256 dimensions to 26 dimensions for classification. The activation function used between the layers is the GELU function. The classification module outputs the corresponding classification label, which is compared with the true label to compute the loss function. The invention adopts the cross-entropy loss function, with the specific formula:

$$L=-\frac{1}{M}\sum_{m=1}^{M}\sum_{n=1}^{N} y_{m,n}\log p_{m,n}$$

where $M$ is the number of trials, $N$ is the number of categories, $y_{m,n}$ is the true label of the $m$-th trial for class $n$, and $p_{m,n}$ is the predicted probability of class $n$ for trial $m$. In specific use, Adam, which converges quickly, is used as the optimizer, with the initial learning rate set to 7e-5 and a batch size of 8.
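A hedged PyTorch sketch of this classification head and training step follows; the mean used for the Reduce layer is an assumption (the patent only states 4 × 256 is reduced to 1 × 256), while the 256 -> 512 -> 256 -> 26 MLP, GELU activations, cross-entropy loss, Adam optimizer and learning rate 7e-5 are taken from the description:

```python
import torch
import torch.nn as nn

class ClassifierHead(nn.Module):
    """Take the 4 class tokens, reduce to 1x256, then a 3-layer MLP to 26 classes."""
    def __init__(self, d_model=256, n_tokens=4, n_classes=26):
        super().__init__()
        self.n_tokens = n_tokens
        self.ln = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, d_model * 2), nn.GELU(),   # 256 -> 512
            nn.Linear(d_model * 2, d_model), nn.GELU(),   # 512 -> 256
            nn.Linear(d_model, n_classes))                # 256 -> 26 characters

    def forward(self, x):                        # x: (B, 196, 256)
        tokens = x[:, :self.n_tokens].mean(1)    # Reduce: 4x256 -> 1x256 (mean assumed)
        return self.mlp(self.ln(tokens))         # logits over the 26 classes

head = ClassifierHead()
logits = head(torch.randn(8, 196, 256))          # batch size 8, as in the description
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 26, (8,)))
opt = torch.optim.Adam(head.parameters(), lr=7e-5)
loss.backward()
opt.step()
print(logits.shape)  # torch.Size([8, 26])
```

`nn.CrossEntropyLoss` applies Softmax internally, matching the stated loss formula on the predicted class probabilities.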
The comparative results are shown in Table 1.
Table 1: accuracy improvement of the proposed method compared with existing methods
Claims (10)
1. An electroencephalogram identification method based on a time-channel cascade Transformer network, characterized by comprising the following steps:
step S1: acquiring imagined English-character electroencephalogram data, constructing a preprocessing module, and performing data preprocessing with a time alignment technique, wherein the input of the preprocessing module is the imagined English-character electroencephalogram data and its output is the preprocessed electroencephalogram data;
step S2: constructing a time module of the time-channel cascade Transformer network, wherein the input data of the time module is the preprocessed electroencephalogram data output in the step S1, and the output of the time module is the extracted time characteristics;
step S3: constructing the electroencephalogram channel module of the time-channel cascade Transformer network, wherein the input data of the electroencephalogram channel module is the time features output by step S2 and its output is the extracted space-time fusion features;
step S4: and constructing a classification module of the time-channel cascade Transformer network, wherein the input of the classification module is the space-time fusion characteristics output by the step S3, and the output of the classification module is the classification result.
2. The electroencephalogram identification method based on the time-channel cascade Transformer network according to claim 1, characterized in that the preprocessing module adopts a time alignment technique that makes the size of the output data consistent with the size of the input data.
3. The electroencephalogram identification method based on the time-channel cascade Transformer network according to claim 1, characterized in that the time module sequentially has the following structure: the preprocessed data output by step S1 as input features -> position coding layer -> multi-head self-attention mechanism module -> residual connecting layer -> LN regularization layer -> feed-forward network -> residual connecting layer -> LN regularization layer -> output time features I; the time module executes twice in a loop, and on the second pass, time features I -> multi-head self-attention mechanism module -> residual connecting layer -> LN regularization layer -> feed-forward network -> residual connecting layer -> LN regularization layer -> output time features II.
4. The electroencephalogram identification method based on the time-channel cascade transducer network, according to claim 1, characterized in that a multi-head self-attention mechanism module maps input features to different subspaces, and then performs point multiplication operation on all the subspaces to calculate attention vectors; finally, splicing and mapping the attention vectors calculated in all the subspaces to an original input space to obtain a final attention vector as output; the expression of the multi-head self-attention mechanism module is as follows (1):
wherein,representing different subspaces, the query vector Q, the key vector K, and the value vector V as inputs to a multi-headed self-attention module,for the mapping matrix of Q in different subspaces,for the mapping matrix of K in the different subspaces,mapping matrix for V in different subspaces, WOFrom all subspacesSplicing to obtain the finished product; the calculation mode of the attention vector on the independent subspace is as follows in sequence: firstly, the query vector Q and the key vector K are subjected to dot product operation, and then the dot product operation is divided by the dimensional square root of the key vector KObtaining a fractional matrix of the query vector Q, wherein the Softmax function has good perception capability, normalizing by using the fractional matrix to obtain a weight matrix, and multiplying by a value vector V to obtain an attention vector of a subspace, wherein the expression is as the following formula (2):
wherein the parameter-matrix dimensions d_q, d_k and d_v of Q, K and V are all 128, the number of heads is 16, and d_model is 256; through linear transformation, the query vector Q is mapped from d_model dimensions to d_q × head dimensions, the key vector K from d_model dimensions to d_k × head dimensions, and the value vector V from d_model dimensions to d_v × head dimensions.
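A numpy sketch of formulas (1) and (2) with the dimensions stated in claim 4 — d_model = 256, heads = 16, d_q = d_k = d_v = 128, so each of Q, K, V is mapped from 256 to 128 × 16 = 2048 dimensions across the subspaces. The weight matrices are random stand-ins for the trained parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, heads, d_k = 256, 16, 128
seq_len = 201  # length of the time-feature sequence

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, Wq, Wk, Wv, Wo):
    # Project the input into each subspace: (heads, seq_len, d_k)
    Q = np.einsum('td,hdk->htk', x, Wq)
    K = np.einsum('td,hdk->htk', x, Wk)
    V = np.einsum('td,hdk->htk', x, Wv)
    # Formula (2): scaled dot-product attention per subspace
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)
    attn = softmax(scores) @ V                       # (heads, seq_len, d_k)
    # Formula (1): concatenate all heads, map back to the input space via W^O
    concat = attn.transpose(1, 0, 2).reshape(seq_len, heads * d_k)
    return concat @ Wo

x  = rng.standard_normal((seq_len, d_model))
Wq = rng.standard_normal((heads, d_model, d_k)) * 0.02  # W_i^Q
Wk = rng.standard_normal((heads, d_model, d_k)) * 0.02  # W_i^K
Wv = rng.standard_normal((heads, d_model, d_k)) * 0.02  # W_i^V
Wo = rng.standard_normal((heads * d_k, d_model)) * 0.02  # W^O

out = multi_head_attention(x, Wq, Wk, Wv, Wo)
print(out.shape)  # (201, 256)
```

The output keeps the sequence length and returns to d_model = 256, which is what lets the residual connection of claim 3 add it back onto the block input.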
5. The electroencephalogram identification method based on the time-channel cascade Transformer network according to claim 3 or 4, characterized in that the input data of the electroencephalogram channel module is the time feature of size 201 × 192, and the output of the electroencephalogram channel module is the extracted space-time fusion feature of size 196 × 256.
6. The electroencephalogram identification method based on the time-channel cascade Transformer network according to claim 5, characterized in that the time features output in step S2 and the preprocessed electroencephalogram data output in step S1 are fused in a residual cascade manner, and the size of the fused feature is 201 × 192.
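A minimal numpy illustration of the residual fusion in claim 6, assuming (as claim 8's "residual operation" suggests) that the cascade is element-wise addition of the two equally-sized arrays; random data stands in for the real features:

```python
import numpy as np

rng = np.random.default_rng(0)
preprocessed  = rng.standard_normal((201, 192))  # step S1 output
time_features = rng.standard_normal((201, 192))  # step S2 output

# Residual cascade: element-wise addition preserves the 201 x 192 size
fused = preprocessed + time_features
print(fused.shape)  # (201, 192)
```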
7. The electroencephalogram identification method based on the time-channel cascade Transformer network according to claim 5, characterized in that the electroencephalogram channel module has the following structure in sequence: the time feature output in step S2 as input feature -> feature cascade layer -> Rearrange layer -> Linear layer -> classification identification bit layer -> multi-head self-attention mechanism module -> residual connection layer -> LN regularization layer -> feed-forward network -> residual connection layer -> LN regularization layer -> output space-time fusion feature I, with the module looping locally twice; on the second local loop: space-time fusion feature I -> multi-head self-attention mechanism module -> residual connection layer -> LN regularization layer -> feed-forward network -> residual connection layer -> LN regularization layer -> output space-time fusion feature II.
8. The electroencephalogram identification method based on the time-channel cascade Transformer network according to claim 5, characterized in that the feature cascade layer connects the time features output in step S2 with the preprocessed electroencephalogram data output in step S1 through a residual operation, the size of the resulting fused feature being 201 × 192; the Rearrange layer performs a dimensional transformation, converting the fused feature from 201 × 192 to 192 × 201; the Linear layer performs feature mapping through a linear operation, mapping the fused feature from 201 dimensions to 256 dimensions; the classification identification bit layer acquires information over the whole channel sequence: four identification bits are spliced in front of the 192 dimension of the fused feature, transforming it to 196 × 256; the subsequent structure is consistent with the time module in step S2; the channel module further extracts the spatial features of the electroencephalogram data, finally obtaining the space-time fusion features.
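The front end of the channel module in claim 8 is a sequence of shape manipulations that can be checked with a few lines of numpy; random arrays and a random projection stand in for the real fused features and the trained Linear layer:

```python
import numpy as np

rng = np.random.default_rng(0)

fused = rng.standard_normal((201, 192))     # fused features from claim 6
x = fused.T                                 # Rearrange: 201 x 192 -> 192 x 201
W = rng.standard_normal((201, 256)) * 0.02  # Linear layer: 201 -> 256
x = x @ W                                   # (192, 256)
cls = rng.standard_normal((4, 256)) * 0.02  # four classification identification bits
x = np.concatenate([cls, x], axis=0)        # spliced in front: (196, 256)
print(x.shape)  # (196, 256)
```

After this front end, the 196 × 256 sequence goes through the same encoder blocks as the time module, which is why the channel-module output in claim 5 is 196 × 256.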
9. The electroencephalogram identification method based on the time-channel cascade Transformer network according to claim 7 or 8, characterized in that the input of the classification module is the space-time fusion feature output in step S3, of size 196 × 256, and the output of the classification module is the classification result;
the structure of the classification module based on the time-channel cascade Transformer is, in sequence: the space-time fusion feature output in step S3 as input feature -> capture classification flag -> Reduce layer -> LN regularization layer -> Linear layers (3 layers) -> output classification result.
10. The electroencephalogram identification method based on the time-channel cascade Transformer network according to claim 9, characterized in that four classification identification bits of size 4 × 256 are intercepted from the space-time fusion features obtained in step S3; the Reduce layer reduces the 4 × 256 bits to 1 × 256; the Linear layers perform feature mapping through linear operations: the first Linear layer maps the input feature from 256 dimensions to 256 × 2 dimensions, the second Linear layer maps the 256 × 2 dimensions back to 256 dimensions, and the third Linear layer maps the 256 dimensions to 26 dimensions for classification; a GELU activation function is used between the layers; the classification module outputs the corresponding classification label, and the loss function is calculated by comparing the classification label with the real label, adopting the cross-entropy loss function, the specific formula being:

L = -Σ_i y_i log(ŷ_i)

wherein y_i is the real label and ŷ_i is the predicted probability of class i.
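The classification head of claims 9 and 10 can be sketched as follows. The weights are random stand-ins for the trained parameters, and the Reduce operation is assumed to be a mean over the four identification bits (the claim only states that 4 × 256 is reduced to 1 × 256):

```python
import numpy as np

rng = np.random.default_rng(0)

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

features = rng.standard_normal((196, 256))  # space-time fusion features, step S3
cls_bits = features[:4]                     # capture the 4 x 256 identification bits
z = cls_bits.mean(axis=0, keepdims=True)    # Reduce (assumed mean): 4x256 -> 1x256

W1 = rng.standard_normal((256, 512)) * 0.02  # Linear 1: 256 -> 256*2
W2 = rng.standard_normal((512, 256)) * 0.02  # Linear 2: 512 -> 256
W3 = rng.standard_normal((256, 26))  * 0.02  # Linear 3: 256 -> 26 classes
logits = gelu(gelu(z @ W1) @ W2) @ W3        # GELU between layers; (1, 26)

# Cross-entropy loss against a (hypothetical) true label
label = 7
log_probs = logits - np.log(np.exp(logits).sum(-1, keepdims=True))
loss = -log_probs[0, label]
print(logits.shape)  # (1, 26)
```

The 26-way output matches the 26 classification dimensions named in the claim; with random weights the loss is close to log(26), as expected for an untrained classifier.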
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111614470.3A CN114089834B (en) | 2021-12-27 | 2021-12-27 | Electroencephalogram identification method based on time-channel cascade Transformer network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114089834A true CN114089834A (en) | 2022-02-25 |
CN114089834B CN114089834B (en) | 2024-07-12 |
Family
ID=80308061
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111614470.3A Active CN114089834B (en) | 2021-12-27 | 2021-12-27 | Electroencephalogram identification method based on time-channel cascade Transformer network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114089834B (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110309797A (en) * | 2019-07-05 | 2019-10-08 | 齐鲁工业大学 | Merge the Mental imagery recognition methods and system of CNN-BiLSTM model and probability cooperation |
CN111681636A (en) * | 2020-06-16 | 2020-09-18 | 深圳市华创技术有限公司 | Technical term sound generation method based on brain-computer interface, medical system and terminal |
Non-Patent Citations (2)
Title |
---|
CHAO LI: "Hierarchical Attention-Based Temporal Convolutional Networks for EEG-Based Emotion Recognition", ICASSP 2021, 13 May 2021 (2021-05-13), pages 1-5 * |
YI AN: "Leveraging spatial-temporal convolutional features for EEG-based emotion recognition", Elsevier, 29 June 2021 (2021-06-29), pages 1-8 * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114767120A (en) * | 2022-04-25 | 2022-07-22 | 上海韶脑传感技术有限公司 | Deep learning-based selection method for motor imagery electroencephalogram channels of unilateral limb patients |
CN114767120B (en) * | 2022-04-25 | 2024-05-10 | 上海韶脑传感技术有限公司 | Single-side limb patient motor imagery electroencephalogram channel selection method based on deep learning |
CN115169227A (en) * | 2022-07-04 | 2022-10-11 | 四川大学 | Design concept generation network construction method and concept scheme automatic generation method |
CN115192045A (en) * | 2022-09-16 | 2022-10-18 | 季华实验室 | Destination identification/wheelchair control method, device, electronic device and storage medium |
CN116127364A (en) * | 2023-04-12 | 2023-05-16 | 上海术理智能科技有限公司 | Integrated transducer-based motor imagery decoding method and system |
CN116502069A (en) * | 2023-06-25 | 2023-07-28 | 四川大学 | Haptic time sequence signal identification method based on deep learning |
CN116502069B (en) * | 2023-06-25 | 2023-09-12 | 四川大学 | Haptic time sequence signal identification method based on deep learning |
CN117272002A (en) * | 2023-11-23 | 2023-12-22 | 中国电建集团西北勘测设计研究院有限公司 | Solar radiation amount estimation method and device, electronic equipment and storage medium |
CN117272002B (en) * | 2023-11-23 | 2024-02-20 | 中国电建集团西北勘测设计研究院有限公司 | Solar radiation amount estimation method and device, electronic equipment and storage medium |
CN118490232A (en) * | 2024-07-17 | 2024-08-16 | 东北电力大学 | Brain depression diagnosis method based on multi-frequency domain decomposition |
CN118490232B (en) * | 2024-07-17 | 2024-09-10 | 东北电力大学 | Brain depression diagnosis method based on multi-frequency domain decomposition |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114089834B (en) | Electroencephalogram identification method based on time-channel cascade Transformer network | |
Aneja et al. | Transfer learning using CNN for handwritten devanagari character recognition | |
CN108491077B (en) | Surface electromyographic signal gesture recognition method based on multi-stream divide-and-conquer convolutional neural network | |
Xiao et al. | An efficient temporal network with dual self-distillation for electroencephalography signal classification | |
CN112861604B (en) | Myoelectric action recognition and control method irrelevant to user | |
CN114176607B (en) | Electroencephalogram signal classification method based on vision transducer | |
CN112488205A (en) | Neural network image classification and identification method based on optimized KPCA algorithm | |
CN114298216A (en) | Electroencephalogram vision classification method based on time-frequency domain fusion Transformer | |
CN115381466A (en) | Motor imagery electroencephalogram signal classification method based on AE and Transformer | |
Ma et al. | A novel hybrid CNN-transformer model for EEG motor imagery classification | |
US20230101539A1 (en) | Physiological electric signal classification processing method and apparatus, computer device and storage medium | |
CN115349860A (en) | Multi-modal emotion recognition method, system, device and medium | |
CN114209342A (en) | Electroencephalogram signal motor imagery classification method based on space-time characteristics | |
CN112932504B (en) | Dipole imaging and identifying method | |
Li et al. | A novel motor imagery EEG recognition method based on deep learning | |
CN111967326B (en) | Gait recognition method based on lightweight multi-scale feature extraction | |
CN113180695A (en) | Brain-computer interface signal classification method, system, device and storage medium | |
CN110432899B (en) | Electroencephalogram signal identification method based on depth stacking support matrix machine | |
Gao et al. | Chinese fingerspelling sign language recognition using a nine-layer convolutional neural network | |
Sun et al. | Training-free deep generative networks for compressed sensing of neural action potentials | |
Wang et al. | Calibration-Free Transfer Learning for EEG-Based Cross-Subject Motor Imagery Classification | |
Liu et al. | Spatial-temporal convolutional attention for mapping functional brain networks | |
CN116458896A (en) | Electrocardiogram classification method and device based on time sequence feature diagram and attention mechanism | |
CN114936583A (en) | Teacher-student model-based two-step field self-adaptive cross-user electromyogram pattern recognition method | |
CN114944002B (en) | Text description-assisted gesture-aware facial expression recognition method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||