CN115105094A - Attention and 3D dense connection neural network-based motor imagery classification method - Google Patents
Attention and 3D dense connection neural network-based motor imagery classification method
- Publication number
- CN115105094A (application number CN202210832540.0A)
- Authority
- CN
- China
- Prior art keywords
- space
- time
- data set
- attention
- characteristic diagram
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- A61B5/24—Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
- A61B5/316—Modalities, i.e. specific diagnostic methods
- A61B5/369—Electroencephalography [EEG]
- A61B5/168—Evaluating attention deficit, hyperactivity
- A61B5/372—Analysis of electroencephalograms
- A61B5/374—Detecting the frequency distribution of signals, e.g. detecting delta, theta, alpha, beta or gamma waves
- A61B5/7235—Details of waveform analysis
- A61B5/7264—Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
- A61B5/7267—Classification of physiological signals or data involving training the classification device
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The application is suitable for the technical field of electroencephalogram signal classification, and provides a motor imagery classification method based on attention and a 3D dense connection neural network, which comprises the following steps: acquiring an electroencephalogram signal data set, and constructing three-dimensional representation data according to the electroencephalogram signal data set; inputting the three-dimensional representation data into a space-spectrum-time attention module, and dynamically capturing the characteristics of different electroencephalogram signal channels, frequency bands and times to obtain space-spectrum-time information; inputting the space-spectrum-time information into a 3D dense connection neural network, and obtaining two gradient flow characteristics by utilizing a cross-stage structure; and fusing the two gradient flow characteristics by using a characteristic fusion strategy to obtain a characteristic classification model. Human intention can thus be accurately recognized from low signal-to-noise-ratio, non-stationary brain signals, improving the classification performance of the motor imagery classification task.
Description
Technical Field
The application belongs to the technical field of electroencephalogram signal classification, and particularly relates to a motor imagery classification method based on attention and a 3D dense connection neural network.
Background
The brain governs people's high-level neural activities such as consciousness, language, motion, vision, hearing, and emotional expression. With the rapid development of computer processing capability and signal analysis technology, the Brain-Computer Interface (BCI) provides a new research approach for analyzing the brain's thinking models and the formation of consciousness. As a new type of human-computer interaction technology, BCI accomplishes efficient communication between the human brain and a computer by analyzing and decoding Electroencephalogram (EEG) data. Motor imagery is a common paradigm in BCI because it is an autonomous interaction mode. Specifically, when the subject performs a motor imagery task, the BCI system is able to process the extracted scalp EEG signal features in real time, decode the imagined movement type, and generate control commands, thereby controlling external devices (drones, wheelchairs, mobile robots, etc.).
However, accurately decoding human intent is a challenge due to the low signal-to-noise ratio and non-stationarity of electroencephalogram signals. Thus, there is still considerable room for improvement in the implementation and application of motor imagery BCI systems, including their accuracy, interpretability, and availability as online systems. Meanwhile, the motor imagery BCI system is based on the following fact: when a subject or patient imagines moving any body part, the corresponding brain area responsible for producing the actual movement is activated.
At present, a great deal of research at home and abroad is dedicated to feature extraction and classification methods for electroencephalogram signals. Common Spatial Pattern (CSP) is the most typical feature extraction method; it uses the simultaneous diagonalization of covariance matrices to find an optimal set of spatial filters for projection, so as to maximize the variance difference between the two classes of signals. Most existing classification methods based on neural networks ignore the complementarity among electroencephalogram features, which limits the classification capability of the models to a certain extent. Meanwhile, we find that researchers mainly focus on increasing the depth of the network to improve classification accuracy, which ignores the complexity of the algorithm.
Disclosure of Invention
The embodiment of the application provides a motor imagery classification method based on attention and a 3D dense connection neural network, so that human intentions are accurately recognized from low signal-to-noise-ratio, non-stationary brain signals, and the classification effect of the motor imagery classification task is improved.
In a first aspect, an embodiment of the present application provides a motor imagery classification method based on attention and a 3D dense connection neural network, including:
acquiring an electroencephalogram signal data set, and constructing three-dimensional representation data according to the electroencephalogram signal data set;
inputting the three-dimensional representation data into a space-frequency spectrum-time attention module, and dynamically capturing the characteristics of different electroencephalogram signal channels, frequency bands and time to obtain space-frequency spectrum-time information;
inputting the space-frequency spectrum-time information into the 3D dense connection neural network, and obtaining two gradient flow characteristics by using a cross-stage structure;
and fusing the two gradient flow characteristics by using a characteristic fusion strategy to obtain a characteristic classification model.
Optionally, the acquiring the electroencephalogram signal data set, and constructing three-dimensional characterization data according to the electroencephalogram signal data set includes:
dividing the electroencephalogram signal data set into a training data set and a verification data set according to a preset proportion;
respectively extracting, by using short-time Fourier transform, spectral features in the 0-30 Hz frequency band in the training data set and the verification data set and time sequence features within the 20 s of executing the motor imagery task;
and combining the frequency spectrum characteristics and the time sequence characteristics extracted from the same time in each electroencephalogram signal channel to obtain three-dimensional representation data of the training data set and three-dimensional representation data of the verification data set.
Optionally, the extracting, by using short-time Fourier transform, spectral features in the 0-30 Hz frequency band in the training data set and the verification data set and time series features within the 20 s of executing the motor imagery task includes:
selecting four non-overlapping frequency bands from the frequency bands of 0-30Hz, and acquiring frequency spectrum characteristics and time sequence characteristics in the training data set and the verification data set according to different time sequences of each frequency band and all electroencephalogram signal channels.
Optionally, the obtaining the three-dimensional representation data of the training data set and the three-dimensional representation data of the verification data set by combining the spectral feature and the time series feature extracted from each electroencephalogram signal channel at the same time includes:
respectively converting the frequency spectrum characteristic and the time sequence characteristic into a 2D map according to an electroencephalogram signal channel;
performing cubic spline interpolation on the 2D map;
and stacking all the 2D maps by taking the frequency band length B and the time sequence quantity T as the length of the three-dimensional characteristic graph respectively to obtain the three-dimensional characteristic data of the training data set and the three-dimensional characteristic data of the verification data set.
Optionally, the inputting the three-dimensional representation data into a space-spectrum-time attention module, and dynamically capturing characteristics of different electroencephalogram signal channels, frequency bands, and time to obtain space-spectrum-time information includes:
inputting the space-frequency spectrum characteristics in the three-dimensional representation data into a first convolution layer to obtain a characteristic map M1;
performing a channel global pooling operation on the feature map M1;
constructing a space-spectrum attention module, and respectively leading the feature map M1 subjected to pooling into two pooling layers to obtain a feature map A11 and a feature map A12;
reshaping the characteristic map A11 and the characteristic map A12 to obtain a high-resolution characteristic map R11 and a high-resolution characteristic map R12;
obtaining a spectrum attention matrix according to the characteristic diagram R11 and the softmax function; obtaining a spatial attention matrix according to the characteristic diagram R12 and the softmax function;
and obtaining a space-frequency spectrum characteristic diagram according to the frequency spectrum attention matrix and the space attention matrix.
Optionally, the inputting the three-dimensional representation data into a space-spectrum-time attention module, and dynamically capturing characteristics of different electroencephalogram signal channels, frequency bands, and time to obtain space-spectrum-time information includes:
inputting the space-time characteristics in the three-dimensional representation data into a second convolution layer to obtain a characteristic diagram M2;
performing time domain global average pooling operation on the feature map M2 to obtain a feature map A21 and a feature map A22;
reshaping the characteristic map A21 and the characteristic map A22 to obtain a high-resolution characteristic map R21 and a high-resolution characteristic map R22;
obtaining a time attention matrix according to the characteristic diagram R21 and the softmax function; obtaining a spatial attention matrix according to the characteristic diagram R22 and the softmax function;
and obtaining a space-time characteristic diagram according to the space attention matrix and the time attention matrix.
Optionally, the inputting the spatio-spectral-temporal information into the 3D dense-connected neural network, and obtaining two gradient flow characteristics by using a cross-phase structure includes:
dividing the space-frequency spectrum characteristic diagram and the space-time characteristic diagram into two parts respectively;
respectively inputting a part of the space-frequency spectrum characteristic diagram and a part of the space-time characteristic diagram into a three-dimensional dense block, and outputting the processed space-frequency spectrum characteristic diagram and the space-time characteristic diagram to a transition layer by the three-dimensional dense block;
inputting the another portion of the spatio-spectral feature map and the another portion of the spatio-temporal feature map into a transition layer.
Optionally, the fusing the two gradient flow features by using a feature fusion strategy to obtain a feature classification model, including:
the transition layer outputs the processed space-frequency spectrum characteristic diagram and the space-time characteristic diagram to the fusion layer, and outputs the space-frequency spectrum characteristic diagram of the other part and the space-time characteristic diagram of the other part to the fusion layer;
and the fusion layer fuses the processed space-frequency spectrum characteristic diagram, the space-frequency spectrum characteristic diagram of the other part, the processed space-time characteristic diagram and the space-time characteristic diagram of the other part to obtain a characteristic classification model.
Optionally, the method further includes:
inputting the training data set into the feature classification model for training, and verifying the feature classification model by using the verification data set after training for a preset number of times;
and if the verification result is over-fitted, retraining the feature classification model.
Compared with the prior art, the embodiment of the application has the advantages that:
the application aims at the problem that the existing classification method ignores the complementary characteristics of electroencephalogram signals and causes the classification accuracy to be limited, and provides a motor imagery classification method based on attention and a 3D dense connection neural network under the condition that a complex model structure and fussy parameter training are not needed, and the method realizes multi-classification of motor imagery from two angles: firstly, extracting the most discriminative features of the electroencephalogram signals through a space-frequency spectrum-time attention mechanism, and capturing the complex relation in data so as to reserve all channel features related to motor imagery to the maximum extent; and secondly, introducing a 3D dense connection neural network, and segmenting the obtained electroencephalogram characteristic diagram by utilizing a designed cross-stage structure, thereby reducing gradient loss and enhancing network learning ability. The method and the device realize the four-classification of the motor imagery electroencephalogram signals and have high classification precision.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
Fig. 1 is a schematic flowchart of a motor imagery classification method based on attention and a 3D dense connection neural network according to an embodiment of the present application;
FIG. 2 is an overall framework diagram of a motor imagery classification method based on attention and 3D dense connected neural networks;
FIG. 3 is a schematic diagram of the construction of three-dimensional characterization data;
FIG. 4 is a flow diagram of a spatio-spectral-temporal attention module of the design;
FIG. 5 is a block diagram of a 3D densely connected neural network;
FIG. 6 is a schematic view of a visualization of the extracted optimal features;
FIG. 7 is a graph of the four-class confusion matrix results obtained by the present application;
fig. 8 is a graph comparing the results of ablation experiments.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in the specification of this application and the appended claims, the term "if" may be interpreted contextually as "when" or "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
As shown in fig. 1 and 2, the motor imagery classification method based on attention and 3D dense-connected neural networks includes steps S101 to S104.
Step S101, acquiring an electroencephalogram signal data set, and constructing three-dimensional representation data according to the electroencephalogram signal data set.
Specifically, the electroencephalogram signal data set can be obtained by querying an existing database on the Internet or by running an experiment. For example, the electroencephalogram data set in the present application may use Dataset 2a of the fourth Brain-Computer Interface Competition (released by the Berlin team, Germany), which records EEG signals of 9 subjects at a sampling frequency of 250 Hz, each subject performing 576 trials (144 trials per task, four motor imagery tasks in total).
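As a rough illustration of the data layout such a dataset loads into, a NumPy sketch follows; the channel count and epoch length here are assumptions for illustration, not the exact arrays of the competition files:

```python
import numpy as np

FS = 250            # sampling frequency (Hz), per the dataset description
N_CHANNELS = 22     # assumed EEG channel count for illustration
N_TRIALS = 576      # 144 trials x 4 motor imagery classes per subject
TRIAL_SECONDS = 4   # hypothetical epoch length

# Placeholder arrays with the layout such a dataset would typically load into:
# (trials, channels, samples per trial), plus one class label per trial.
eeg = np.zeros((N_TRIALS, N_CHANNELS, FS * TRIAL_SECONDS))
labels = np.repeat(np.arange(4), N_TRIALS // 4)
```

Each of the four motor imagery classes then contributes 144 labeled trials.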
And after the electroencephalogram signal data set is obtained, three-dimensional representation data are constructed according to the electroencephalogram signal data set.
Exemplary step S101 includes steps S1011 to S1013.
Step S1011, dividing the EEG signal data set into a training data set and a verification data set according to a preset proportion.
Specifically, the electroencephalogram signal data set is divided into a training data set and a verification data set according to a preset proportion, for example at a ratio of 4:1.
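A minimal NumPy sketch of such a 4:1 split; the shuffling and the `split_dataset` helper are illustrative assumptions, not the patent's exact procedure:

```python
import numpy as np

def split_dataset(data, labels, ratio=0.8, seed=0):
    """Shuffle trials, then split into training and validation sets (4:1 by default)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(data))
    cut = int(len(data) * ratio)
    train_idx, val_idx = idx[:cut], idx[cut:]
    return (data[train_idx], labels[train_idx]), (data[val_idx], labels[val_idx])

# 576 trials of 22-channel EEG, 2 s at 250 Hz (toy shapes)
trials = np.zeros((576, 22, 500))
labels = np.repeat(np.arange(4), 144)        # four motor imagery classes
(train_x, train_y), (val_x, val_y) = split_dataset(trials, labels)
```

With 576 trials, a 4:1 split yields 460 training and 116 validation trials.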
Step S1012, respectively extracting, by using short-time Fourier transform, spectral features in the 0-30 Hz frequency band and time series features within the 20 s of the motor imagery task, from the training data set and the validation data set.
Specifically, data preprocessing is first performed on the training data set and the verification data set. As shown in fig. 3, frequency band information and time window information closely related to the motor imagery category are considered, and according to the experimental paradigm shown in fig. 3a and 3b, short-time Fourier transform is used to extract spectral features in the 0-30 Hz frequency band and time series features in the 20 s time window of executing the motor imagery task.
Illustratively, time series characteristics and spectrum characteristics of a training data set and a verification data set are obtained from a preprocessed training data set and the verification data set in four non-overlapping frequency bands (1-4 Hz, 4-7 Hz, 8-13 Hz and 14-30 Hz) closely related to motor imagery and different time sequences of all electroencephalogram signal channels.
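The band-wise spectral features can be sketched with a plain FFT in NumPy; this is a simplification, since the patent uses a short-time Fourier transform over sliding windows, while the toy `band_powers` helper below computes mean power per band over a single window:

```python
import numpy as np

BANDS = [(1, 4), (4, 7), (8, 13), (14, 30)]   # the four non-overlapping bands (Hz)

def band_powers(signal, fs=250):
    """Mean spectral power of one channel in each motor-imagery band."""
    spec = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return np.array([spec[(freqs >= lo) & (freqs < hi)].mean() for lo, hi in BANDS])

fs = 250
t = np.arange(fs * 2) / fs                     # 2 s window
x = np.sin(2 * np.pi * 10 * t)                 # 10 Hz tone, inside the 8-13 Hz band
powers = band_powers(x, fs)
```

A pure 10 Hz tone concentrates its power in the third (8-13 Hz) band, as expected.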
Let X ∈ R^(N×P) denote the raw electroencephalogram signal data set (training data set and validation data set), where N is the number of electroencephalogram signal channels and P is the number of samples per channel. Let X_T ∈ R^(N×T) and X_S ∈ R^(N×B) denote, respectively, the temporal and spectral features of the selected electroencephalogram signal data set samples, where T is the number of timestamps obtained from each sample and B is the number of frequency bands extracted from the data set.
And S1013, combining the frequency spectrum characteristics and the time sequence characteristics extracted at the same time from each electroencephalogram signal channel to obtain three-dimensional representation data of the training data set and three-dimensional representation data of the verification data set.
Specifically, according to the three-dimensional representation of the electroencephalogram signal corresponding to the brain channel in fig. 3c, the spectral features and the time series features of the same time window extracted from different electroencephalogram signal channels are combined. The representation method not only reserves the time domain information and the frequency spectrum information of the electroencephalogram signal data set, but also reserves the spatial information of the sampled electroencephalogram signal data set.
Illustratively, according to the positions of the electroencephalogram signal channels, the spectral features X_S ∈ R^(N×B) and the temporal features X_T ∈ R^(N×T) of the different channels are respectively converted into 2D maps.
After the two-dimensional maps are obtained, cubic spline interpolation is applied to preserve the detail of the feature maps.
Then, taking the frequency band length B and the time sequence number T respectively as the depth of the three-dimensional feature map, the 2D maps are stacked to obtain the three-dimensional representations of the electroencephalogram signal data set (the training data set and the verification data set), X_SS ∈ R^(H×W×B) and X_ST ∈ R^(H×W×T).
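A sketch of the stacking step in NumPy; the grid size, electrode coordinates, and `to_3d_representation` helper are illustrative assumptions, and the patent additionally applies cubic spline interpolation to each 2D map:

```python
import numpy as np

def to_3d_representation(spectral, temporal, positions, grid=(9, 9)):
    """Place per-channel features on a 2D scalp grid, then stack along band/time axes.

    spectral: (N, B) band features, temporal: (N, T) time features,
    positions: (N, 2) grid coordinates of each electrode.
    Returns X_SS of shape (H, W, B) and X_ST of shape (H, W, T).
    """
    H, W = grid
    B = spectral.shape[1]
    T = temporal.shape[1]
    x_ss = np.zeros((H, W, B))
    x_st = np.zeros((H, W, T))
    for n, (r, c) in enumerate(positions):
        x_ss[r, c, :] = spectral[n]
        x_st[r, c, :] = temporal[n]
    return x_ss, x_st

positions = np.array([[0, 0], [4, 4], [8, 8]])          # toy 3-electrode layout
x_ss, x_st = to_3d_representation(np.ones((3, 4)), np.ones((3, 6)), positions)
```

Grid cells with no electrode stay zero; interpolation would fill them in the full pipeline.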
and S102, inputting the three-dimensional representation data into a space-spectrum-time attention module, and dynamically capturing the characteristics of different electroencephalogram signal channels, frequency bands and time to obtain space-spectrum-time information.
In particular, the spatio-spectral-temporal attention module includes a spatio-spectral attention module and a spatio-temporal attention module.
Based on the inter-individual variability of the period and frequency band characteristics of motor imagery electroencephalogram signals, the space-spectrum features are input into the first convolution layer to obtain the feature map M1.
A channel-based global pooling operation is performed on the feature map M1 to reduce the computational cost: A_avg = F_nAvg(X), where A_avg denotes the channel-based global pooling feature map, formed by collapsing the original signal input X along its N channels, and F_nAvg(·) denotes the channel-based global pooling function.
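Channel-based global pooling reduces a multi-channel map to a single 2D profile by averaging; a one-line NumPy sketch, where the (H, W, C) layout is an assumption for illustration:

```python
import numpy as np

def channel_global_pool(x):
    """Average a (H, W, C) feature map over its channel axis, giving (H, W)."""
    return x.mean(axis=-1)

m1 = np.arange(2 * 2 * 3, dtype=float).reshape(2, 2, 3)
a_avg = channel_global_pool(m1)
```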
A space-spectrum attention module is constructed, and the pooled feature map M1 is fed into two pooling layers, namely a band-wise global average pooling layer for reducing the frequency band dimension and a spatial global average pooling layer for reducing the spatial dimension, to obtain the feature map A11 and the feature map A12.
where A_B ∈ R^(H×W) represents the spectral feature distribution, A_S^b represents the spatial feature distribution of the b-th frequency band, F_bAvg and F_sAvg denote the band-wise and spatial global average pooling functions, and B denotes the frequency band length.
And (4) reshaping the characteristic diagram A11 and the characteristic diagram A12 to obtain a high-resolution characteristic diagram R11 and a high-resolution characteristic diagram R12, and correspondingly halving the characteristic dimension.
A spectral attention matrix S ∈ R^(H×W×1) is obtained from the feature map R11 and the softmax function:
S = softmax(λA_B + γ)
where λ and γ are learnable parameters.
A spatial attention matrix S_1 ∈ R^(1×1×B) is obtained from the feature map R12 through a fully connected layer with a softmax activation and learnable parameters λ_1 and γ_1. The softmax activation function is defined as softmax(z_i) = exp(z_i) / Σ_j exp(z_j). Spatial and spectral prediction signals are thereby obtained, respectively.
Finally, the spatio-spectral feature map X_SS is obtained by applying the spectral attention matrix S and the spatial attention matrix S_1 to the pooled feature map through the element-wise multiplication operation ⊙.
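The attention weighting described above can be illustrated with a minimal, framework-free sketch; `softmax`, `lam`, and `gamma` stand in for the softmax function and the learnable parameters λ and γ, and all values are arbitrary (this is not the patented implementation):

```python
import math

def softmax(xs):
    m = max(xs)                            # shift for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_weight(features, lam=1.0, gamma=0.0):
    """Re-weight features element-wise by softmax(lam * f + gamma)."""
    weights = softmax([lam * f + gamma for f in features])
    return [w * f for w, f in zip(weights, features)]

weights = softmax([0.5, 1.5, 1.0])
assert abs(sum(weights) - 1.0) < 1e-9      # attention weights sum to one
weighted = attention_weight([0.5, 1.5, 1.0])
```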
The spatio-temporal features in the three-dimensional characterization data are input into the second convolution layer to obtain a feature map M2. Then, a time-domain global average pooling operation and a spatial-domain global average pooling operation are performed on the feature map M2 to obtain a feature map A21 and a feature map A22, respectively.
The temporal global average pooling matrix A_T is defined as

A_T = F_tAvg(M2)

where A_T ∈ R^{H×W} denotes the temporal feature distribution and F_tAvg denotes the temporal global average pooling function. Correspondingly, the spatial feature distribution of the t-th time stamp over all electroencephalogram signal channels is obtained through F_dAvg, the spatial global average pooling function.
And reshaping the characteristic map A21 and the characteristic map A22 to obtain a high-resolution characteristic map R21 and a high-resolution characteristic map R22.
The feature map R21 is passed through the softmax function to obtain a temporal self-attention matrix T ∈ R^{H×W×1}. A spatial attention matrix S_2 ∈ R^{1×1×T} is obtained by applying a normalized feature matrix function to R22:

S_2 = softmax(λ_2·R22 + γ_2)

where λ_2 and γ_2 are learnable parameters.
Based on these projections, the spatio-temporal feature map X_ST is obtained by weighting the pooled feature map with the temporal attention matrix T and the spatial attention matrix S_2 through element-wise multiplication.
The construction of the spatio-spectral-temporal attention module is thus completed, as shown in FIG. 4.
In step S103, the spatio-spectral-temporal information is input into the 3D dense connection neural network, and two gradient flow features are obtained by using a cross-stage structure.
Specifically, based on the three-dimensional characterization data constructed from the electroencephalogram signal data set in step S101, the spatio-spectral feature map X_SS and the spatio-temporal feature map X_ST are each divided into two parts by the cross-stage structure, yielding the two gradient flow features.
In step S104, the two gradient flow features are fused by using a feature fusion strategy to obtain a feature classification model.
Specifically, one part of the spatio-spectral feature map and one part of the spatio-temporal feature map are input into the three-dimensional dense blocks, which output the processed feature maps to the transition layer, while the other part of each feature map is input directly into the transition layer. The spatio-spectral feature map X_SS and the spatio-temporal feature map X_ST are each split into two parts to increase the number of gradient paths.
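The cross-stage split can be sketched as follows; the channel list and the 50/50 split ratio are illustrative assumptions, not values from the patent:

```python
def cross_stage_split(feature_channels, ratio=0.5):
    """Split a feature map along its channel axis into two gradient paths:
    one routed through the 3D dense blocks, the other connected directly
    to the transition stage."""
    cut = int(len(feature_channels) * ratio)
    dense_path = feature_channels[:cut]      # processed by dense blocks
    shortcut_path = feature_channels[cut:]   # direct connection to transition
    return dense_path, shortcut_path

channels = ["c0", "c1", "c2", "c3"]
dense_part, shortcut_part = cross_stage_split(channels)
print(dense_part, shortcut_part)  # ['c0', 'c1'] ['c2', 'c3']
```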
Illustratively, as shown in FIGS. 5 and 6, the spatio-temporal feature map X_ST is taken as an example. X_ST is divided into two parts, X'_ST and X''_ST.

One part, X'_ST, is input into the first three-dimensional dense block, while the other part, X''_ST, is connected directly to the transition layer of the last stage. The number of three-dimensional dense blocks determines the depth of the network. Each three-dimensional dense block comprises k dense layers, and the input of the (l+1)-th dense layer is formed by concatenating the output of the l-th dense layer with the input of the l-th dense layer. Thus, the l-th dense layer of a three-dimensional dense block receives as input the concatenation of the feature maps produced by all preceding convolutional layers:
X_l = H_l([X_0, X_1, X_2, …, X_{l-1}])
where [X_0, X_1, X_2, …, X_{l-1}] denotes the concatenation of the feature maps generated by all previous three-dimensional convolutional layers, and H_l is a composite function of successive operations containing bottleneck layers (consisting of a spatial convolutional layer of size 3×3×1 and a temporal convolutional layer of size 1×1×3), which reduces to some extent the expensive computational cost and memory requirements of conventional three-dimensional convolution.
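The dense connectivity rule X_l = H_l([X_0, X_1, …, X_{l-1}]) can be illustrated with a toy sketch in which H_l is stubbed as a plain sum so that the growing concatenated input is visible (the real H_l is the composite convolutional function described above):

```python
def dense_block(x0, num_layers):
    """Toy dense block: each layer consumes the 'concatenation' of every
    earlier feature map. Feature maps are plain numbers and H_l is a sum,
    standing in for channel concatenation followed by convolution."""
    features = [x0]                   # running list of all produced maps
    for _ in range(num_layers):
        concatenated = sum(features)  # stand-in for [X_0, ..., X_{l-1}] + H_l
        features.append(concatenated)
    return features

outs = dense_block(1, 3)
print(outs)  # [1, 1, 2, 4] -- each layer sees every earlier output
```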
The output of the first three-dimensional dense block is fed into a partial transition layer, whose output is fed into a new three-dimensional dense block; this process is repeated to improve the compactness of the model.

The output of the partial transition layer corresponding to the last three-dimensional dense block is connected to the transition layer of the last stage, which comprises a batch normalization layer and a convolutional layer.

The output of the transition layer is fed into an average pooling layer, which further reduces the number of output feature maps; the application of multiple transition layers improves the compactness of the model and thus the learning ability of the network.
The features of the two gradient flows are fused by the feature fusion strategy, and the feature map obtained by feature classification is compressed using Maxout. Categorical cross-entropy is used as the loss function:

L = -Σ_{i=1}^{M} y_i log(p_i)

where M denotes the number of classes, y_i is the label indicator over the class indices (0, 1, 2, 3), and p_i denotes the predicted probability. A feature classification model can thus be obtained:
Y_classification = F(X_SS, X_ST)

where F denotes the mapping function and Y_classification denotes the motor imagery classification result.
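The categorical cross-entropy loss above can be sketched in plain Python; the one-hot label and probability values are illustrative:

```python
import math

def categorical_cross_entropy(y_true, p_pred, eps=1e-12):
    """L = -sum_i y_i * log(p_i) over the M classes.
    y_true: one-hot label list; p_pred: predicted class probabilities;
    eps guards against log(0)."""
    return -sum(y * math.log(max(p, eps)) for y, p in zip(y_true, p_pred))

# a confident, correct four-class prediction incurs a small loss
loss = categorical_cross_entropy([0, 0, 1, 0], [0.1, 0.1, 0.7, 0.1])
print(round(loss, 4))  # 0.3567, i.e. -log(0.7)
```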
Steps S101 to S104 set out a motor imagery classification method based on attention and a 3D dense connection neural network, which realizes motor imagery multi-classification from two perspectives. First, the most discriminative electroencephalogram features are extracted through a spatio-spectral-temporal attention mechanism that captures the complex relations in the data, thereby preserving to the greatest extent all channel features relevant to motor imagery. Second, a 3D dense connection neural network is introduced, and the obtained electroencephalogram feature maps are segmented by the designed cross-stage structure, thereby reducing gradient loss and enhancing the learning ability of the network. The method realizes four-class classification of motor imagery electroencephalogram signals with high classification accuracy.
In order to verify the effect of the feature classification model created in the present application, after step S104, the method may further include:
the training data set is input into the feature classification model for training, and after a preset number of times (for example, 10 times) of training, the feature classification model is verified by using the verification data set. And if the verification result shows overfitting, retraining the feature classification model until the overfitting does not appear in the verification.
The test set in the electroencephalogram signal data set is input into the feature classification model for classification, and the results are compared with the baseline methods FBCSP, Deep ConvNet, EEGNet and M3D CNN to test the classification accuracy of the feature classification model.
Illustratively, an experiment is first run for each subject: the subject's electroencephalogram data set is taken and divided into a training set and a test set at a ratio of 4:1. The training parameters of the feature classification model are then set: an adaptive moment estimation (Adam) optimizer is used to minimize the categorical cross-entropy loss function, the learning rate is 0.0001, and the number of attention modules is set to 4. The Dropout rate, the batch normalization constant, and the weight decay rate are set to 0.5, 10^-5, and 0.1, respectively.
Based on the brain-computer interface competition four-class motor imagery dataset, the classification results were compared with those of the FBCSP, Deep ConvNet, EEGNet and M3D CNN methods, as shown in Table 1, where the bold numbers mark the optimal classification accuracy for each subject. As the table shows, the feature classification model of the present application achieves a higher average classification accuracy, reaching 84.45%.
TABLE 1
The validity of the features extracted by the proposed method is examined through a confusion matrix; the experimental results are given in FIG. 7, where the values on the diagonal of the confusion matrix are the correctly predicted samples for each motor imagery classification task.
The effectiveness of the different modules in the model is further verified through ablation experiments, which comprise: adding the attention mechanism to a CNN network, and deleting the attention architecture and applying only the 3D DCSPNet.

FIG. 8 shows the results of the ablation experiments. First, the SST-attention architecture can adaptively extract spatio-spectral-temporal features, realizing the complementarity of different features and improving classification accuracy. Second, the three-dimensional DCSPNet model can enhance the learning of electroencephalogram features at different layers. The experimental results show that the motor imagery classification method based on attention and the 3D dense connection neural network has good classification performance and strong generalization capability.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.
Claims (9)
1. A motor imagery classification method based on attention and 3D dense connection neural networks is characterized by comprising the following steps:
acquiring an electroencephalogram signal data set, and constructing three-dimensional representation data according to the electroencephalogram signal data set;
inputting the three-dimensional representation data into a space-frequency spectrum-time attention module, and dynamically capturing the characteristics of different electroencephalogram signal channels, frequency bands and time to obtain space-frequency spectrum-time information;
inputting the space-frequency spectrum-time information into the 3D dense connection neural network, and obtaining two gradient flow characteristics by using a cross-stage structure;
and fusing the two gradient flow characteristics by using a characteristic fusion strategy to obtain a characteristic classification model.
2. The motor imagery classification method based on attention and 3D dense-connected neural network of claim 1, wherein said acquiring a brain electrical signal data set and constructing three-dimensional characterization data from the brain electrical signal data set comprises:
dividing the electroencephalogram signal data set into a training data set and a verification data set according to a preset proportion;
respectively extracting spectral features in a 0-30Hz frequency band in the training data set and the verification data set and time sequence features in a 20s motor imagery task by using short-time Fourier transform;
and combining the frequency spectrum characteristics and the time sequence characteristics extracted from the same time in each electroencephalogram signal channel to obtain three-dimensional representation data of the training data set and three-dimensional representation data of the verification data set.
3. The method for classifying motor imagery based on attention and 3D dense-connected neural networks according to claim 2, wherein the extracting spectral features in the 0-30Hz frequency band and time series features in the 20s motor imagery task in the training data set and the validation data set respectively using short-time Fourier transform comprises:
selecting four non-overlapping frequency bands from the frequency bands of 0-30Hz, and acquiring frequency spectrum characteristics and time sequence characteristics in the training data set and the verification data set according to different time sequences of each frequency band and all electroencephalogram signal channels.
4. The method for classifying motor imagery based on attention and 3D dense connected neural network according to claim 3, wherein said combining the spectral features and the time series features extracted from the same time in each brain electrical signal channel to obtain three-dimensional characterization data of the training data set and three-dimensional characterization data of the validation data set comprises:
respectively converting the frequency spectrum characteristic and the time sequence characteristic into a 2D map according to an electroencephalogram signal channel;
performing cubic spline interpolation on the 2D map;
and stacking all the 2D maps by taking the frequency band length B and the time sequence number T as the lengths of the three-dimensional feature maps respectively to obtain the three-dimensional characterization data of the training data set and the three-dimensional characterization data of the verification data set.
5. The method for classifying motor imagery based on attention and 3D dense-connected neural network according to claim 4, wherein the inputting the three-dimensional representation data into a spatio-spectral-temporal attention module for dynamically capturing the features of different EEG signal channels, frequency bands and time to obtain spatio-spectral-temporal information comprises:
inputting the space-frequency spectrum characteristics in the three-dimensional representation data into a first convolution layer to obtain a characteristic map M1;
performing a channel global pooling operation on the feature map M1;
constructing a space-spectrum attention module, and respectively leading the feature map M1 subjected to pooling into two pooling layers to obtain a feature map A11 and a feature map A12;
reshaping the characteristic map A11 and the characteristic map A12 to obtain a high-resolution characteristic map R11 and a high-resolution characteristic map R12;
obtaining a spectrum attention matrix according to the characteristic diagram R11 and the softmax function; obtaining a spatial attention matrix according to the characteristic diagram R12 and the softmax function;
and obtaining a space-spectrum characteristic diagram according to the spectrum attention matrix and the space attention matrix.
6. The method for classifying motor imagery based on attention and 3D dense-connected neural network according to claim 5, wherein the inputting the three-dimensional representation data into a spatio-spectral-temporal attention module for dynamically capturing features of different brain electrical signal channels, frequency bands and time to obtain spatio-spectral-temporal information comprises:
inputting the space-time characteristics in the three-dimensional representation data into a second convolution layer to obtain a characteristic map M2;
performing a time domain global average pooling operation and a spatial domain global average pooling operation on the feature map M2 to obtain a feature map A21 and a feature map A22, respectively;
reshaping the characteristic map A21 and the characteristic map A22 to obtain a high-resolution characteristic map R21 and a high-resolution characteristic map R22;
obtaining a time attention matrix according to the characteristic diagram R21 and the softmax function; obtaining a spatial attention matrix according to the characteristic diagram R22 and the softmax function;
and obtaining a space-time characteristic diagram according to the space attention matrix and the time attention matrix.
7. The method for classifying motor imagery based on attention and 3D dense-connected neural network of claim 6, wherein the inputting the spatio-spectral-temporal information into the 3D dense-connected neural network, two gradient flow features are obtained by using a cross-phase structure, comprising:
dividing the space-frequency spectrum characteristic diagram and the space-time characteristic diagram into two parts respectively;
respectively inputting a part of the space-frequency spectrum characteristic diagram and a part of the space-time characteristic diagram into a three-dimensional dense block, and outputting the processed space-frequency spectrum characteristic diagram and the space-time characteristic diagram to a transition layer by the three-dimensional dense block;
inputting the another portion of the spatio-spectral feature map and the another portion of the spatio-temporal feature map into a transition layer.
8. The method for classifying motor imagery based on attention and 3D dense-connected neural network according to claim 7, wherein the fusing the two gradient flow features using a feature fusion strategy to obtain a feature classification model comprises:
the transition layer outputs the processed space-frequency spectrum characteristic diagram and the space-time characteristic diagram to the fusion layer, and outputs the space-frequency spectrum characteristic diagram of the other part and the space-time characteristic diagram of the other part to the fusion layer;
and the fusion layer fuses the processed space-frequency spectrum characteristic diagram, the space-frequency spectrum characteristic diagram of the other part, the processed space-time characteristic diagram and the space-time characteristic diagram of the other part to obtain a characteristic classification model.
9. The method of classifying motor imagery based on attention and 3D dense-connected neural networks according to claim 8, wherein the method further comprises:
inputting the training data set into the feature classification model for training, and verifying the feature classification model by using the verification data set after training for a preset number of times;
and if the verification result is over-fitted, retraining the feature classification model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210832540.0A CN115105094B (en) | 2022-07-15 | Motor imagery classification method based on attention and 3D dense connection neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210832540.0A CN115105094B (en) | 2022-07-15 | Motor imagery classification method based on attention and 3D dense connection neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115105094A true CN115105094A (en) | 2022-09-27 |
CN115105094B CN115105094B (en) | 2024-06-25 |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115381467A (en) * | 2022-10-31 | 2022-11-25 | 浙江浙大西投脑机智能科技有限公司 | Attention mechanism-based time-frequency information dynamic fusion decoding method and device |
CN116595455A (en) * | 2023-05-30 | 2023-08-15 | 江南大学 | Motor imagery electroencephalogram signal classification method and system based on space-time frequency feature extraction |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7299088B1 (en) * | 2002-06-02 | 2007-11-20 | Nitish V Thakor | Apparatus and methods for brain rhythm analysis |
CN102866775A (en) * | 2012-09-04 | 2013-01-09 | 同济大学 | System and method for controlling brain computer interface (BCI) based on multimode fusion |
CN104586387A (en) * | 2015-01-19 | 2015-05-06 | 秦皇岛市惠斯安普医学系统有限公司 | Method for extracting and fusing time, frequency and space domain multi-parameter electroencephalogram characters |
CN111265212A (en) * | 2019-12-23 | 2020-06-12 | 北京无线电测量研究所 | Motor imagery electroencephalogram signal classification method and closed-loop training test interaction system |
CN112257658A (en) * | 2020-11-11 | 2021-01-22 | 微医云(杭州)控股有限公司 | Electroencephalogram signal processing method and device, electronic equipment and storage medium |
US20210255706A1 (en) * | 2020-02-18 | 2021-08-19 | Korea University Research And Business Foundation | Brain-machine interface based intention determination device and method using virtual environment |
US20210267474A1 (en) * | 2020-03-02 | 2021-09-02 | Wuyi University | Training method, and classification method and system for eeg pattern classification model |
CN113887513A (en) * | 2021-10-28 | 2022-01-04 | 重庆邮电大学 | Motor imagery electroencephalogram signal classification method based on parallel CNN-transform neural network |
Non-Patent Citations (2)
Title |
---|
TIANJUN LIU, DELING YANG: "A three-branch 3D convolutional neural network for EEG-based different hand movement stages classification", Scientific Reports, 31 December 2021 (2021-12-31) * |
ZHANG Wei et al.: "Motor imagery EEG signal classification method based on attention mechanism and deep learning", Journal of Nanjing University (Natural Science), vol. 58, no. 1, 1 January 2022 (2022-01-01) * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115381467A (en) * | 2022-10-31 | 2022-11-25 | 浙江浙大西投脑机智能科技有限公司 | Attention mechanism-based time-frequency information dynamic fusion decoding method and device |
CN116595455A (en) * | 2023-05-30 | 2023-08-15 | 江南大学 | Motor imagery electroencephalogram signal classification method and system based on space-time frequency feature extraction |
CN116595455B (en) * | 2023-05-30 | 2023-11-10 | 江南大学 | Motor imagery electroencephalogram signal classification method and system based on space-time frequency feature extraction |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110353675B (en) | Electroencephalogram signal emotion recognition method and device based on picture generation | |
CN110515456B (en) | Electroencephalogram signal emotion distinguishing method and device based on attention mechanism | |
CN109890043B (en) | Wireless signal noise reduction method based on generative countermeasure network | |
CN110390952B (en) | City sound event classification method based on dual-feature 2-DenseNet parallel connection | |
CN108960299B (en) | Method for identifying multi-class motor imagery electroencephalogram signals | |
CN111539331B (en) | Visual image reconstruction system based on brain-computer interface | |
CN111461201B (en) | Sensor data classification method based on phase space reconstruction | |
CN110399846A (en) | A kind of gesture identification method based on multichannel electromyography signal correlation | |
CN112244878B (en) | Method for identifying key frequency band image sequence by using parallel multi-module CNN and LSTM | |
WO2022183966A1 (en) | Electroencephalogram signal classification method and apparatus, device, storage medium and program product | |
CN116049639B (en) | Selective migration learning method and device for electroencephalogram signals and storage medium | |
CN112597824A (en) | Behavior recognition method and device, electronic equipment and storage medium | |
CN108717520A (en) | A kind of pedestrian recognition methods and device again | |
CN117372782A (en) | Small sample image classification method based on frequency domain analysis | |
CN113627391B (en) | Cross-mode electroencephalogram signal identification method considering individual difference | |
CN114578967A (en) | Emotion recognition method and system based on electroencephalogram signals | |
CN111968669B (en) | Multi-element mixed sound signal separation method and device | |
CN117150346A (en) | EEG-based motor imagery electroencephalogram classification method, device, equipment and medium | |
CN115105094A (en) | Attention and 3D dense connection neural network-based motor imagery classification method | |
CN115105094B (en) | Motor imagery classification method based on attention and 3D dense connection neural network | |
CN116091763A (en) | Apple leaf disease image semantic segmentation system, segmentation method, device and medium | |
CN113111919B (en) | Hyperspectral image classification method based on depth high resolution | |
CN114997229A (en) | Electroencephalogram signal-based motor imagery classification method and system | |
Chen et al. | HRTF Representation with Convolutional Auto-encoder | |
Zeng et al. | Application of local histogram clipping equalization image enhancement in bearing fault diagnosis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |