CN113269048A

CN113269048A - Motor imagery electroencephalogram signal classification method based on deep learning and mixed noise data enhancement

Info

Publication number: CN113269048A
Application number: CN202110474878.9A
Authority: CN
Inventors: 王丹; 陈佳明; 杜金莲; 许晴
Original assignee: Beijing University of Technology
Current assignee: Beijing University of Technology
Priority date: 2021-04-29
Filing date: 2021-04-29
Publication date: 2021-08-17
Anticipated expiration: 2041-04-29
Also published as: CN113269048B

Abstract

The invention discloses a motor imagery electroencephalogram signal classification method based on deep learning and mixed noise data enhancement. To address the low signal-to-noise ratio and small sample size of electroencephalogram signals, an empirical mode decomposition method is combined with a white noise data enhancement method to provide a mixed noise data enhancement method based on empirical mode decomposition. The quality of the generated samples is improved by extracting the main information of the original signal and mixing it with white noise, so that a classifier with higher accuracy and stability can be trained. The idea of a filter bank is combined with a shallow neural network to provide a lightweight, fast-converging FB-Sinc-ShallowNet method, which improves the classification accuracy of the deep learning method. The electroencephalogram signals are preprocessed with the Euclidean alignment method, which reduces the differences among electroencephalogram signals acquired at different times, lowers the classification difficulty, and improves the classification accuracy. The method can improve the prediction accuracy and stability of classification models for motor imagery electroencephalogram signals.

Description

Motor imagery electroencephalogram signal classification method based on deep learning and mixed noise data enhancement

Technical Field

The invention belongs to the field of computer software, and relates to an electroencephalogram signal classification method for recognizing motor imagery limb parts based on a deep learning and mixed noise data enhancement method.

Background

In recent years, research on brain-computer interfaces (BCI) has attracted wide attention from researchers at home and abroad. A brain-computer interface is a system that enables communication between the brain and a machine through biological signals emitted by the brain, such as electroencephalogram signals. It does not depend on the brain's usual output pathways, such as peripheral nerves and muscles. It is mainly applied in fields such as rehabilitation medicine, where it can assist the treatment and rehabilitation of disabled people and provide a new assistive communication and control technology for patients with severe neuromuscular diseases. Brain-computer interfaces fall into two categories, invasive and non-invasive. Invasive brain-computer interfaces offer high signal resolution and signal-to-noise ratio, but they require costly surgery and regular medical examination and carry high risk. Non-invasive brain-computer interfaces have a relatively low signal-to-noise ratio, but their cost and risk are much lower than those of invasive ones, so they have received much wider attention. Non-invasive brain-computer interfaces take signals such as electroencephalography (EEG), magnetoencephalography (MEG), and functional near-infrared spectroscopy (fNIRS) as input. The electroencephalogram signal, which reflects the overall electrophysiological activity of brain neurons at the scalp, is the most widely used signal in non-invasive brain-computer interfaces. Non-invasive brain-computer interfaces based on electroencephalogram signals are divided into active and reactive types. The active type is mainly the brain-computer interface based on motor imagery (MI).
The brain-computer interface based on motor imagery is a paradigm that needs no external stimulation, is actively regulated by the user, and can reflect the intention of autonomous movement. It is one of the most important and most widely studied paradigms. Research on motor imagery brain-computer interfaces mainly concerns the classification of motor imagery electroencephalogram signals. This classification takes the electroencephalogram signals generated while imagining the movement of a specific body part as input data and uses a classification model to determine the class of the signals from their characteristics, thereby identifying the body part that the subject imagines moving. At present, the main problems in motor imagery electroencephalogram classification are the low signal-to-noise ratio of the signals and the small sample size.

To address the low signal-to-noise ratio and the difficulty of classification, researchers use machine learning and deep learning methods to extract and classify electroencephalogram features. Among machine learning methods, the Common Spatial Pattern (CSP), the Filter Bank Common Spatial Pattern (FBCSP), and Riemannian geometry-based methods achieve relatively high classification accuracy, which nevertheless still needs to be improved. In recent years, researchers have found that deep learning methods can achieve better classification results than machine learning methods. Schirrmeister et al. proposed the DeepConvNet (deep convolutional network) and ShallowConvNet (shallow convolutional network) methods and found by comparison that a lightweight shallow neural network can better decode the features of motor imagery electroencephalogram signals. Lawhern et al. proposed the EEGNet method, combining the ideas of FBCSP and shallow neural networks, obtained a classification accuracy (66%) similar to that of the FBCSP method on a public data set, and also applied the method to other paradigms. Wu et al. proposed the Multiscale Filter Bank Convolutional Neural Network (MSFBCNN) method, which combines the filter bank idea of the FBCSP method with the ShallowConvNet method to improve on the accuracy of ShallowConvNet. Borra et al., taking interpretability and light weight into account, proposed the Sinc-ShallowNet (shallow network based on the sinc function) method and obtained a classification accuracy of 72.8% on a public data set. These studies show that the ideas of light weight and filter banks play a key role in deep learning-based classification of motor imagery electroencephalogram signals.

To address the small sample size, data enhancement methods commonly used in deep learning can be used to expand the sample size of motor imagery electroencephalogram data, so as to meet the data requirements of deep learning methods and improve their accuracy and stability. The data enhancement methods commonly used in the field of motor imagery brain-computer interfaces fall into 3 types: noise addition, windowing, and generative adversarial networks (GAN). Generative adversarial networks are commonly used to enhance electroencephalogram signals in time-frequency representation. Windowing is suitable for processing raw electroencephalogram signals; it works better when combined with a deeper neural network (such as DeepConvNet), while combining it with a shallow neural network brings no obvious benefit and can even reduce accuracy. Noise addition is mainly used in combination with shallow neural networks and can be applied to raw electroencephalogram signals or to signals in time-frequency representation. Considering the network structures and data characteristics commonly combined with these 3 methods, noise addition is the suitable data enhancement method for a shallow neural network that processes raw electroencephalogram signals.

Based on the above analysis of existing methods, the invention provides a mixed noise data enhancement method based on empirical mode decomposition and an FB-Sinc-ShallowNet (Filter Bank Sinc-ShallowNet) method, so as to address the low signal-to-noise ratio and small sample size of electroencephalogram signals. Through comparative experiments and ablation experiments, the invention verifies that the method can effectively improve the classification accuracy of motor imagery electroencephalogram signals.

Summary of the Invention

First, a mixed noise data enhancement method based on empirical mode decomposition (EMD) is provided. The method adds the main information of the original signal, extracted by the EMD method, to Gaussian white noise to form mixed noise, and adds the mixed noise to the original signal to synthesize new training samples, thereby realizing data enhancement.

Second, an FB-Sinc-ShallowNet method is provided to improve the classification accuracy.

Third, the data are preprocessed with the Euclidean Alignment (EA) method to further improve the classification accuracy.

The invention has the following advantages:

(1) The method uses empirical mode decomposition to extract the components most correlated with the original signal and mixes them with white noise of a specific signal-to-noise ratio to generate new data for data enhancement. The effect is better than that of new samples generated by directly adding white noise.

(2) The FB-Sinc-ShallowNet method combines the ideas of light weight and a filter bank. It performs band-pass filtering and feature extraction on the input signals by learning 4 filter banks with different frequency bands, so that a classification model with high accuracy is trained at a high convergence speed.

(3) The electroencephalogram data are preprocessed with Euclidean alignment, which reduces the differences among electroencephalogram samples acquired at different times and lowers the classification difficulty.

1. A motor imagery electroencephalogram signal classification method based on deep learning and mixed noise data enhancement is characterized by comprising the following steps:

step 1, dividing an electroencephalogram signal data set into a training set, a verification set and a test set, and performing Euclidean alignment preprocessing on the data of the 3 sets respectively;

step 2, performing mixed noise data enhancement based on empirical mode decomposition on the training set data, and expanding the number of training set samples to 2 times the original number;

step 3, inputting the expanded training set and the verification set into the FB-Sinc-ShallowNet method to train a model;

step 4, classifying the data of the test set, and identifying the limb part of the motor imagery of the subject.

2. The classification method according to claim 1, characterized in that the samples are augmented using a mixed noise data enhancement method based on empirical mode decomposition, in particular:

Empirical Mode Decomposition (EMD) is performed on the original signal to obtain the most relevant IMF (Intrinsic Mode Function) components; the IMF energy is estimated and a threshold is calculated to extract the main information of the signal; the main information is mixed with white noise and the signal-to-noise ratio of the mixed noise is adjusted; the mixed noise is then added to the original signal to generate a high-quality sample.

The specific steps are as follows:

step 1: EMD and IMF component screening: performing empirical mode decomposition on the original signal to obtain a plurality of IMF components, retaining the IMF components whose correlation coefficient is greater than or equal to 0.1, and discarding the other IMF components, thereby screening out the IMF components most correlated with the original signal;

step 2: estimating the energy of the IMF components: assuming that the first IMF component contains most of the noise in the original signal, and estimating the noise energy in the other IMF components with the first IMF component as a reference;

step 3: filtering the IMF components based on an adaptive threshold: dividing each IMF component into a plurality of intervals, each interval being bounded by the adjacent zero points on the left and right of an extreme point; arranging the intervals in ascending order of their extreme values; accumulating the energy in that order until the accumulated energy reaches the noise energy estimate (critical value) of the IMF component; taking the extreme value before the critical value is reached as a threshold; retaining the signals of the intervals whose extreme value is greater than or equal to the threshold and filtering out the signals of the other intervals, thereby extracting the effective components of the original signal;

step 4: generating mixed noise: generating a white noise sequence with a specific signal-to-noise ratio based on the effective components of the original signal, and summing the white noise sequence with the effective components of the original signal to obtain the mixed noise;

step 5: generating a new sample: summing the mixed noise and the original signal to obtain a new sample, and using the new sample together with the original data as training set data.

3. The classification method according to claim 1, characterized in that the samples are preprocessed using a Euclidean alignment method, in particular:

step 1: the data of one subject are denoted X and contain N samples; calculating the arithmetic mean of the covariance matrices of all samples;

step 2: calculating the inverse square root of the arithmetic mean matrix to obtain a transformation matrix;

step 3: multiplying each sample by the transformation matrix to realize Euclidean alignment.

4. The classification method according to claim 1, characterized in that the FB-Sinc-ShallowNet (Filter Bank Sinc-ShallowNet) method is used for classification, specifically:

the input signal is divided into 4 parts, which are sent into 4 branches of the shallow neural network respectively, and each branch comprises a SincConv layer, a depthwise convolutional layer and a pooling layer. The outputs of the 4 branches are finally merged into one group of feature maps, regularized by a dropout layer, converted into a one-dimensional vector by a Flatten layer, and then classified by a SoftMax function. The specific steps are as follows:

step 1: dividing the input signal into 4 parts, and sending them into the 4 branches;

step 2: each branch uses a SincConv layer (sinc convolution layer) to compute the time-domain convolution of the input signal with an FIR (Finite Impulse Response) filter;

step 3: each branch uses a batch normalization layer to rescale the features so as to speed up network training;

step 4: each branch uses a depthwise convolutional layer, which computes the convolution result of each channel separately, to extract spatial-domain information;

step 5: each branch uses a batch normalization layer to rescale the features so as to speed up network training;

step 6: each branch applies a nonlinear operation using an ELU activation function;

step 7: each branch uses an average pooling layer to compress the time-domain information of the feature maps;

step 8: merging the 4 groups of feature maps and regularizing them with a dropout layer, which randomly discards half of the neuron features;

step 9: compressing all the feature maps into a one-dimensional vector, and inputting the vector into a SoftMax function for classification.

Drawings

FIG. 1 is a general flow chart of the invention

FIG. 2 is a flow chart of the mixed noise data enhancement method based on empirical mode decomposition

FIG. 3 is a flow chart of the FB-Sinc-ShallowNet method

FIG. 4 is a schematic diagram of the experimental timing of the data set

Detailed Description

The invention combines an empirical mode decomposition method with a white noise data enhancement method to provide a mixed noise data enhancement method based on empirical mode decomposition. It improves the quality of the generated samples by extracting the main information of the original signal and mixing it with white noise, so that a classifier with higher accuracy and stability is trained. In addition, it further improves the classification accuracy with the lightweight, fast-converging FB-Sinc-ShallowNet method, providing an efficient and stable deep learning method for the classification of motor imagery electroencephalogram signals.

As shown in FIG. 1, the invention comprises the following steps.

Step one, dividing the electroencephalogram signal data set into a training set, a verification set and a test set, and performing Euclidean alignment preprocessing on the data of the 3 sets respectively.

Step two, performing mixed noise data enhancement based on empirical mode decomposition on the training set data, and expanding the number of training set samples to 2 times the original number.

Step three, inputting the expanded training set and the verification set into the FB-Sinc-ShallowNet method to train the model.

Step four, classifying the data of the test set, and identifying the limb part of the motor imagery of the subject.

Before describing the specific method in detail, the problem to be solved by the invention and the symbols involved are defined. Motor imagery electroencephalogram data consist of a group of multi-channel amplitude time series and event markers. After preprocessing, samples and labels can be extracted from the multi-channel amplitude time series. The samples and labels of the electroencephalogram signals are expressed as a matrix X of dimension N × E × T and a vector y of length N, respectively, where N is the number of samples, E is the number of channels, and T is the number of sampling points. The value of sample i on channel e at time t is denoted X_i,e(t), and its label is denoted y_i, where i ∈ {1, 2, …, N}, e ∈ {1, 2, …, E}, and t ∈ {1, 2, …, T}. The problem to be solved in the classification of motor imagery electroencephalogram signals is as follows: given the training samples and labels of a group of electroencephalogram signals, train a model f to predict the class y_i,pred of a sample X_i whose label is unknown.

First, mixed noise data enhancement method based on empirical mode decomposition

To address the small number of samples N, noise-adding data enhancement in the field of motor imagery brain-computer interfaces adds white noise with a mean of 0 to the electroencephalogram signal. However, because electroencephalogram signals have a low signal-to-noise ratio, samples generated by directly adding noise have an even lower signal-to-noise ratio, and the data enhancement effect is limited. In order to improve the quality of the generated samples, the invention provides a mixed noise data enhancement method based on Empirical Mode Decomposition (EMD). The method performs EMD on the original signal to obtain the most relevant IMF (Intrinsic Mode Function) components, estimates the IMF energy and calculates a threshold to extract the main information of the signal, mixes the main information with white noise and adjusts the signal-to-noise ratio of the mixed noise, and then adds the mixed noise to the original signal to generate high-quality samples, as shown in FIG. 2. The method comprises 4 steps: EMD and IMF component screening, IMF energy estimation, IMF component filtering based on adaptive thresholds, and mixed noise generation. These 4 steps are explained below.

1.1 EMD and IMF component screening

The original signal is first subjected to empirical mode decomposition. Denote the signal X_i,e of sample i on channel e as S. It can be decomposed by equation 1 into J IMF components and a remainder r, where J is the number of IMF components produced when the iteration ends. An IMF is a function that satisfies the following 2 conditions: (1) over the whole signal, the number of extreme points and the number of zero points must be equal or differ by at most 1; (2) at any point, the mean of the upper envelope defined by the local maxima and the lower envelope defined by the local minima is 0. An IMF component is a component generated during empirical mode decomposition that satisfies the IMF conditions. The signal S, the IMF components generated by the decomposition, and the remainder r satisfy:

S = IMF_1 + IMF_2 + … + IMF_J + r (1)

Then, the correlation coefficient between the signal S and each IMF component is calculated. The IMF components whose correlation coefficient is greater than or equal to 0.1 (J′ in total) are retained and the other IMF components are discarded, so that the IMF components most correlated with the original signal are screened out. The correlation coefficient is defined as:

corr(S, IMF_j) = cov(S, IMF_j) / (σ(S) · σ(IMF_j)) (2)

where cov(·,·) denotes the covariance and σ(·) the standard deviation over the T sampling points.
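As an illustration only, the screening rule of section 1.1 might be sketched in Python with NumPy as follows. The function name `screen_imfs` and the toy signals are assumptions made for this example. The IMF components are assumed to have been computed beforehand by some EMD implementation, and the correlation coefficient is assumed to be the ordinary Pearson coefficient.

```python
import numpy as np

def screen_imfs(signal, imfs, min_corr=0.1):
    """Keep the IMF components whose correlation with `signal` is >= min_corr.

    signal: array of shape (T,)
    imfs:   array-like of shape (J, T), one IMF component per row
    Returns the retained components (J', T) and all J correlation coefficients.
    """
    imfs = np.asarray(imfs, dtype=float)
    corrs = np.array([np.corrcoef(signal, imf)[0, 1] for imf in imfs])
    keep = corrs >= min_corr
    return imfs[keep], corrs

# Toy stand-ins for IMF components (hypothetical data, not real EEG):
t = np.arange(1000) / 1000.0
slow = np.sin(2 * np.pi * 5 * t)           # dominant component
fast = 0.01 * np.sin(2 * np.pi * 40 * t)   # weak component, correlation about 0.01
signal = slow + fast
kept, corrs = screen_imfs(signal, [fast, slow])
```

In this toy case only the dominant component passes the 0.1 threshold, which mirrors the intent of keeping the components that carry most of the original signal.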

1.2 estimating the energy of the IMF component

Estimating the energy of the IMF components is a technique commonly used for signal denoising. It assumes that the first IMF component contains most of the noise in the original signal S, and the noise energy in the other IMF components is estimated with the first IMF component as a reference. The energy of the first IMF component is estimated by equation 3.

The energy of the remaining IMF components is estimated by equation 4, where H denotes the Hurst exponent, and β_H and ρ_H are parameters that vary with H. In practice H = 0.5, in which case β_H = 0.719 and ρ_H = 0.201.

1.3 Filtering IMF Components based on adaptive thresholds

After the energy of each IMF component is estimated, each IMF component is divided into a plurality of sections by taking sections divided by adjacent zero points on the left side and the right side of each extreme point as a unit, the threshold of the extreme value corresponding to the energy of each IMF component is calculated in a self-adaptive mode, signals of the sections with the extreme values larger than or equal to the threshold are reserved, signals of other sections are filtered, and therefore effective components in the original signals are extracted. The method comprises the following steps:

(1) Calculating the threshold of the extreme values

The j′-th IMF component is divided into P intervals, which are arranged in ascending order of their extreme values. The extreme value and the corresponding energy of each sorted interval are recorded. The energies are then accumulated in ascending order, and the accumulation stops when equation 5 is satisfied, that is, when the accumulated energy reaches the noise energy estimate of the component.

The absolute value of the extreme value of the q-th interval, at which the accumulation stops, is taken as the threshold T_j′ (equation 6).

(2) Filtering the IMF components

The p-th interval of the j′-th IMF component is denoted IMF_j′,p (p ∈ {1, 2, …, P}) and its extreme value e_j′,p (p ∈ {1, 2, …, P}). The signal within each interval is filtered according to the rule shown in equation 7: the signal of an interval whose extreme value is greater than or equal to the threshold T_j′ is retained, and the signal of every other interval is filtered out.

After adaptive threshold filtering has been applied to all J′ IMF components, the filtered components are summed to obtain the signal S′, as shown in formula 8, where IMF_j′ denotes the j′-th IMF component.
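The interval-based thresholding of section 1.3 can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the invention's exact procedure: the energy of an interval is taken as the sum of its squared samples, filtered-out intervals are set to zero, the threshold is the extreme value of the interval at which the accumulated energy first reaches the noise estimate, and the noise energy estimate is passed in as `noise_energy` because equations 3 and 4 are not reproduced in this text. The function name `threshold_imf` is hypothetical.

```python
import numpy as np

def threshold_imf(imf, noise_energy):
    """Filter one IMF component interval by interval.

    Intervals are delimited by sign changes (zero crossings). Returns the
    filtered component and the threshold that was used.
    """
    imf = np.asarray(imf, dtype=float)
    sign = np.sign(imf)
    change = np.flatnonzero(sign[1:] != sign[:-1]) + 1
    bounds = np.concatenate(([0], change, [len(imf)]))
    segments = list(zip(bounds[:-1], bounds[1:]))

    # Absolute extreme value and energy (assumed: sum of squares) per interval.
    peaks = np.array([np.max(np.abs(imf[a:b])) for a, b in segments])
    energies = np.array([np.sum(imf[a:b] ** 2) for a, b in segments])

    # Accumulate energy from the smallest extreme value upwards and stop
    # at the first interval where the running sum reaches noise_energy.
    order = np.argsort(peaks)
    cumulative = np.cumsum(energies[order])
    q = min(np.searchsorted(cumulative, noise_energy), len(order) - 1)
    threshold = peaks[order[q]]

    filtered = np.zeros_like(imf)
    for (a, b), peak in zip(segments, peaks):
        if peak >= threshold:
            filtered[a:b] = imf[a:b]   # keep intervals at or above the threshold
    return filtered, threshold

# Toy component: two small lobes (amplitude 0.1) and two large lobes (amplitude 1).
base = np.sin(np.pi * (np.arange(10) + 0.5) / 10)
toy_imf = np.concatenate([0.1 * base, -1.0 * base, 0.1 * base, -1.0 * base])
filtered, threshold = threshold_imf(toy_imf, noise_energy=0.5)
```

With the toy component, the two small lobes are removed and the two large lobes are kept unchanged.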

1.4 generating Mixed noise

After the main information S′ of the original signal has been extracted, a white noise sequence with a specific signal-to-noise ratio is generated based on S′. Let the signal-to-noise ratio be s dB; the energy E_noise of the white noise is calculated by equation 9.

Then, a vector a = [a_1, a_2, …, a_T] is created from a white noise sequence of length T with mean 0 and standard deviation std. Each element of a is multiplied by E_noise to obtain the vector a′. Finally, the signal S′ and the vector a′ are summed to obtain the mixed noise a_gen. The signal-to-noise ratio s and the standard deviation std need to be tuned experimentally; the invention found the best effect with std = 0.02 and s = 1.

After the mixed noise is generated, it is added to the original signal. For the original signal X_i,e of channel e of sample i, the new signal generated after adding the mixed noise is X′_i,e = X_i,e + a_gen. The new signals of all channels are combined to form a new sample X′, and X′ together with the original data X is used as training set data to train the model. This realizes data enhancement and further improves the accuracy and generalization ability of the model.
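The sample generation described in section 1.4 can be illustrated with the short NumPy sketch below. It is an assumption-laden sketch: the scaling factor E_noise is taken as an input `e_noise` because equation 9 is not reproduced in this text, the main information S′ is assumed to be available for every sample and channel as an array `S_main` with the same shape as X, and the function names are hypothetical.

```python
import numpy as np

def add_mixed_noise(x, s_main, e_noise, std=0.02, rng=None):
    """Return x + a_gen, where a_gen = s_main + e_noise * white_noise."""
    rng = np.random.default_rng() if rng is None else rng
    a = rng.normal(loc=0.0, scale=std, size=np.shape(x))  # white noise, mean 0
    a_gen = s_main + e_noise * a                          # mixed noise
    return x + a_gen

def augment_training_set(X, y, S_main, e_noise, std=0.02, seed=0):
    """Double the training set: original samples plus one noisy copy each."""
    rng = np.random.default_rng(seed)
    X_new = add_mixed_noise(X, S_main, e_noise, std=std, rng=rng)
    return np.concatenate([X, X_new], axis=0), np.concatenate([y, y], axis=0)

# Toy data of shape (N, E, T) = (3, 2, 50); values are placeholders.
X = np.ones((3, 2, 50))
y = np.array([0, 1, 2])
S_main = 0.5 * np.ones_like(X)
X_aug, y_aug = augment_training_set(X, y, S_main, e_noise=1.0)
```

The augmented set has 2N samples, and each generated sample keeps the label of the sample it was derived from.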

Second, FB-Sinc-ShallowNet method

To address the low signal-to-noise ratio of electroencephalogram signals and the difficulty of classifying them, the invention combines the ideas of light weight and a filter bank to provide the FB-Sinc-ShallowNet (Filter Bank Sinc-ShallowNet) method. On the basis of the Sinc-ShallowNet method, it learns several groups of band-pass filters and extracts the features of the electroencephalogram signals from different frequency bands, thereby further improving the classification accuracy of the model. The network structure of the method is shown in FIG. 3. The input signal is divided into 4 parts, which are sent into 4 branches of the shallow neural network respectively, and each branch comprises a SincConv layer, a depthwise convolutional layer and a pooling layer. The outputs of the 4 branches are finally merged into one group of feature maps, regularized by a dropout layer, converted into a one-dimensional vector by a Flatten layer, and then classified by a SoftMax function. Each layer is described in detail below.

2.1 SincConv layer

The SincConv layer computes the time-domain convolution of the input signal with an FIR (Finite Impulse Response) filter. It can extract more meaningful features from the input signal with fewer parameters and has the advantage of fast convergence. Unlike a conventional time-domain convolutional layer, the SincConv layer takes the lower frequency limit f_low and the upper frequency limit f_high as its trainable parameters and constructs the convolution kernel from them with the sinc function. The SincConv layer is defined as:

y[T] = x[T] * h[T, f_low, f_high] (10)

where x[T] represents a signal of length T, y[T] represents the filtered signal of length T, and h[T, f_low, f_high] represents the convolution kernel, which is calculated as shown in formula 11.

h[T, f_low, f_high] = 2f_high sinc(2πf_high T) − 2f_low sinc(2πf_low T) (11)

where sinc(·) represents the sinc function, defined as:

sinc(x) = sin(x) / x (12)

The convolution kernel defined by equation 11 is obtained by the inverse Fourier transform of the band-pass filter defined by equation 13.

H(f, f_low, f_high) = rect(f / (2f_high)) − rect(f / (2f_low)) (13)

where rect(·) represents the rectangular function in the frequency domain, whose phase is linear. In order to reduce the stopband ripple and make the transition smoother, so as to construct a more effective band-pass filter, a Hamming window function is applied to the kernel function, as shown in equation 14.

h_w[T, f_low, f_high] = h[T, f_low, f_high] · w[T], w[T] = 0.54 − 0.46 cos(2πT / L) (14)

where L represents the length of the time-domain convolution kernel.

The initialization parameters of the SincConv layer determine the frequency band range of the band-pass filters it learns. Denote the lower and upper bounds of the initial frequency range of f_low and f_high as f_lower and f_upper, respectively; then f_lower ≤ f_low and f_high < f_upper. In the study of Borra et al., f_lower = 4 Hz and f_upper = 38 Hz, i.e. the frequency interval is [4, 38). In order to better extract the information of different rhythms, the invention divides the original signal into 4 parts and inputs them into 4 groups of SincConv layers with different initial frequency bands, namely [4, 8), [8, 14), [14, 30) and [30, 40), so that the time-domain features of different rhythms are extracted in finer frequency bands and the classification accuracy of the model is further improved. For convenience of programming, the invention implements the one-dimensional time-domain convolution as a two-dimensional convolution with a kernel size of 1 × L, so the SincConv layer is labeled "SincConv2D" in FIG. 3.
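To make the kernel of formula 11 concrete, the following NumPy sketch builds one Hamming-windowed band-pass kernel for each of the four initial frequency bands. It is illustrative only: the kernel length of 65 taps and the sampling rate of 250 Hz are assumed values chosen for the example, the function names are hypothetical, and NumPy's built-in `np.hamming` window is used in place of the window formula of the text. In a trained network f_low and f_high would be updated by gradient descent, which is not shown here.

```python
import numpy as np

def sinc_bandpass_kernel(f_low, f_high, length, fs):
    """Band-pass FIR kernel 2*f_high*sinc(2*pi*f_high*t) - 2*f_low*sinc(2*pi*f_low*t).

    t is measured in seconds and centred on the middle tap, so the kernel is
    symmetric (linear phase). A Hamming window smooths the truncation.
    """
    t = (np.arange(length) - (length - 1) / 2) / fs
    # np.sinc(x) is sin(pi*x)/(pi*x), hence sin(2*pi*f*t)/(2*pi*f*t) == np.sinc(2*f*t).
    h = 2 * f_high * np.sinc(2 * f_high * t) - 2 * f_low * np.sinc(2 * f_low * t)
    return h * np.hamming(length)

def gain(h, f, fs):
    """Magnitude of the kernel's frequency response at frequency f (Hz)."""
    n = np.arange(len(h))
    return abs(np.sum(h * np.exp(-2j * np.pi * f * n / fs)))

FS = 250            # assumed sampling rate in Hz
KERNEL_LENGTH = 65  # assumed kernel length L
BANDS = [(4, 8), (8, 14), (14, 30), (30, 40)]  # initial bands of the four branches
kernels = [sinc_bandpass_kernel(lo, hi, KERNEL_LENGTH, FS) for lo, hi in BANDS]

# Filtering one channel with the first branch's kernel (placeholder data):
x = np.random.default_rng(0).normal(size=1000)
y_filtered = np.convolve(x, kernels[0], mode="same")
```

Each kernel passes frequencies near the centre of its band far more strongly than frequencies well outside it, which can be checked with `gain`.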

The SincConv layer is followed by a batch normalization layer (Batch Normalization), which rescales the features to increase the network training speed. This layer normalizes the feature maps within a mini-batch to a mean of 0 and a variance of 1 and alleviates the "internal covariate shift" phenomenon, so that the change in distribution during training is reduced and the training of the model is accelerated. In research on motor imagery brain-computer interfaces, a batch normalization layer is usually added after the time-domain convolution and the spatial-domain convolution, so the invention also adds a batch normalization layer after the SincConv layer and sets its parameters to m = 0.99 and ε = 1e-3 to improve stability, where m is the momentum and ε is a very small number that prevents the denominator from being 0.

2.2 Depthwise convolutional and pooling layers

In each branch of the shallow network, after the SincConv layer has extracted the time-domain information and the batch normalization layer has been applied, the feature maps are input to the depthwise convolutional layer (DepthwiseConv2D) to extract spatial-domain information. Unlike an ordinary convolutional layer, which combines the results of multiple channels, this layer computes the convolution result of each channel separately. The convolution kernel size is E × 1 (E = 22 in FIG. 3), so the features of all electrode channels are computed at once; the same design is used in EEGNet. In addition, to keep the training stable, the weights of this layer are constrained by a maximum norm, with a maximum norm of 1.
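A depthwise spatial convolution with an E × 1 kernel reduces to a weighted sum over the electrode axis, computed independently for every temporal feature map. The NumPy sketch below illustrates this. The sizes (8 temporal maps, depth multiplier 2, 50 time points) are assumed purely for the example, the function names are hypothetical, and the maximum norm constraint is assumed to act on the kernel weights, as in common Keras-style implementations.

```python
import numpy as np

def depthwise_spatial_conv(feature_maps, kernels):
    """Depthwise convolution over the electrode axis.

    feature_maps: (F, E, T)  F temporal feature maps, E electrodes, T time points
    kernels:      (F, D, E)  D spatial filters per feature map, each of size E x 1
    Returns an array of shape (F * D, T); feature maps are never mixed together.
    """
    out = np.einsum("fde,fet->fdt", kernels, feature_maps)
    return out.reshape(-1, feature_maps.shape[-1])

def max_norm(kernels, max_value=1.0):
    """Rescale every spatial filter whose L2 norm exceeds max_value."""
    norms = np.linalg.norm(kernels, axis=-1, keepdims=True)
    scale = np.minimum(1.0, max_value / np.maximum(norms, 1e-12))
    return kernels * scale

F, D, E, T = 8, 2, 22, 50             # assumed sizes; E = 22 as in FIG. 3
maps = np.ones((F, E, T))             # placeholder feature maps
raw_kernels = np.ones((F, D, E))      # placeholder weights with norm sqrt(22)
constrained = max_norm(raw_kernels)   # every filter now has norm 1
spatial_out = depthwise_spatial_conv(maps, constrained)
```

Because each output map depends on exactly one input map, the layer needs only F × D × E weights, which is part of what keeps the network lightweight.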

After the depthwise convolutional layer, there is a batch normalization layer and an ELU (Exponential Linear Unit) activation function layer. The former is used to accelerate training, with the same parameter settings as the previous batch normalization layer; the latter is used to improve the classification performance of the model. In research on motor imagery brain-computer interfaces, ELU has shown better performance than other activation functions. Its definition is shown in formula 15.

ELU(x) = x, if x > 0; ELU(x) = α(exp(x) − 1), if x ≤ 0 (15)

where the parameter α is set to 1. After the activation function, the time-domain information of the feature maps is compressed by an average pooling layer to reduce the number of parameters. The pooling size is set to 1 × 109 with a stride of 1 × 23, meaning that temporal features spanning about 0.5 second are extracted with a step of about 0.1 second.

2.3 Branch merging and Classification

After feature extraction and compression of four branches, 4 groups of feature maps are merged. Each branch has 64 1 × 15 feature maps, and 256 1 × 15 feature maps are obtained after combination. After the information of 4 branches is integrated, regularization is performed by a discard layer. The discarding layer simplifies the model structure by randomly discarding part of the neurons, and prevents the overfitting phenomenon. The present invention sets the discard rate of the discard layer to 0.5, i.e., randomly discards half of the neuron features.
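The shape bookkeeping of the pooling, merging and flattening stages can be checked with a few lines of Python. This is only an arithmetic sketch: it assumes average pooling without padding, and the helper name `pooled_length` is hypothetical.

```python
def pooled_length(t_in, pool=109, stride=23):
    """Output length of a 1 x pool pooling window moved with the given stride (no padding)."""
    return (t_in - pool) // stride + 1

BRANCHES = 4
MAPS_PER_BRANCH = 64
POOLED_LEN = 15

merged_maps = BRANCHES * MAPS_PER_BRANCH      # feature maps after merging
flattened_size = merged_maps * POOLED_LEN     # length of the one-dimensional vector

# Temporal input lengths for which the pooling layer yields 15 points per map.
valid_inputs = [t for t in range(109, 1200) if pooled_length(t) == POOLED_LEN]
```

Under these assumptions the merged output has 256 maps, the flattened vector has 3840 elements, and a pooled length of 15 corresponds to 431 to 453 time points entering the pooling layer.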

And finally, compressing all the feature maps into a one-dimensional vector, and inputting the vector into a SoftMax function for classification. Meanwhile, the maximum norm constraint is added to the last layer to regularize the model (the maximum norm is 0.5) so as to prevent an overfitting phenomenon.

Three, Euclidean alignment preprocessing method

The Euclidean Alignment (EA) method is a transfer learning method proposed by He et al. It applies a uniform matrix transformation to the electroencephalogram signals of each subject so that the mean of their covariance matrices becomes the identity matrix, thereby reducing the differences among subjects and improving the results of cross-subject experiments (in which a model is trained on the electroencephalogram signals of other subjects to predict the classes of a new subject's samples). In the within-subject study of the invention, the training data and the test data come from the same subject but were collected at different times, and the Euclidean alignment method can mitigate the influence of this time difference on the electroencephalogram features. Therefore, the invention preprocesses the training data and the test data with the Euclidean alignment method to further improve the classification performance of the model.

The Euclidean alignment method needs only the sample data, not the sample labels, when calculating the transformation matrix. Let the data of one subject be X, containing N samples X_1, ..., X_N. The arithmetic mean of the covariance matrices of all samples is calculated as shown in formula 16.

$$\bar{R} = \frac{1}{N}\sum_{i=1}^{N} X_i X_i^{T} \qquad (16)$$

Then the matrix inverse square root of the arithmetic mean matrix (denoted $\bar{R}$) is calculated to obtain the transformation matrix, and each sample $X_i$ is left-multiplied by the transformation matrix to realize Euclidean alignment, as shown in formula 17.

$$\tilde{X}_i = \bar{R}^{-1/2} X_i \qquad (17)$$

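The arithmetic mean of the covariance matrices of the aligned samples $\tilde{X}_i$ is then the identity matrix, as shown in formula 18:

$$\frac{1}{N}\sum_{i=1}^{N} \tilde{X}_i \tilde{X}_i^{T} = \bar{R}^{-1/2}\left(\frac{1}{N}\sum_{i=1}^{N} X_i X_i^{T}\right)\bar{R}^{-1/2} = \bar{R}^{-1/2}\,\bar{R}\,\bar{R}^{-1/2} = I \qquad (18)$$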
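The three alignment steps can be checked numerically with a small NumPy sketch. It assumes trials are stored as an array of shape (N, channels, time) and that the transformation is the matrix inverse square root of the mean covariance, computed here by eigendecomposition; this is the form for which the aligned mean covariance equals the identity. The data are synthetic stand-ins.

```python
import numpy as np

def euclidean_alignment(X):
    """Align trials X of shape (N, channels, time); labels are not needed."""
    # Formula 16: arithmetic mean of the per-trial covariance matrices X_i X_i^T
    R = np.mean(X @ X.transpose(0, 2, 1), axis=0)
    # Matrix inverse square root via eigendecomposition (R is symmetric positive definite)
    eigvals, eigvecs = np.linalg.eigh(R)
    R_inv_sqrt = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    # Formula 17: left-multiply every trial by the transformation matrix
    return R_inv_sqrt @ X

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 22, 500))   # synthetic stand-in: 40 trials, 22 electrodes
X_aligned = euclidean_alignment(X)

# Formula 18: the mean covariance of the aligned trials should be the identity
mean_cov = np.mean(X_aligned @ X_aligned.transpose(0, 2, 1), axis=0)
print(np.allclose(mean_cov, np.eye(22)))
```

Because the function uses no labels, it can be applied separately to the training, validation and test sets.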

Experiments and results:

To verify the effect of the proposed method, the invention performs experiments on the public motor imagery data set BCI Competition IV Dataset IIa. The proposed method and existing methods such as DeepConvNet, ShallowConvNet, EEGNet and Sinc-ShallowNet are implemented in code, and the effect of the proposed method is verified through comparative experiments. The effectiveness of the three inventive contributions of the proposed method is also verified through an ablation study on them. The experimental procedures and results are described in detail below.

1. Data set and preprocessing method

The experiment uses the BCI Competition IV Dataset IIa. This is a public data set from the fourth BCI Competition, containing 4 classes of motor imagery electroencephalogram signals (left hand, right hand, feet and tongue) from 9 subjects. The signals were collected from 22 electrodes and comprise 576 trials (i.e., 576 samples) per subject. These 576 samples were collected on two days, and each day's experiment is counted as 1 session. Each session contains 4 classes of samples, with 72 samples per class. All samples are labeled (i.e., marked with the body part whose motor imagery the sample corresponds to). The timing of a trial is shown in fig. 4. In each trial, a fixation cross is displayed on the screen, a cue is given at 2 seconds, and the subject then performs motor imagery starting at 3 seconds and lasting 3 seconds. Because the electroencephalogram features of different subjects differ greatly, the classification accuracy must be calculated separately for each subject, and the mean of the accuracies over all subjects is used as the performance index of the model.

To allow a fairer comparison with existing methods, the present invention uses the same data preprocessing commonly used by them. First, the raw EEG signals are band-pass filtered by a 3rd-order Butterworth filter with a 4-40 Hz passband to retain the required frequency band. The filtered signals are then standardized using an exponential moving average (decay factor set to 0.999) to reduce the effect of differences in value range on the model. Finally, the data from 0.5 second to 2.5 seconds after the cue is displayed are extracted as a sample. Each sample is a 1 × 22 × 500 matrix, as shown at the top of fig. 3. After preprocessing, each session contains 288 samples. As in other existing methods, the model is trained on the data of the 1st session and tested on the data of the 2nd session.
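The standardization and trial extraction steps can be sketched as follows. The sketch assumes the input has already been band-pass filtered, a sampling rate of 250 Hz (consistent with 500 samples in 2 seconds), and one common form of exponential moving standardization in which a running mean and variance are updated with weight 0.001 per new sample. The exact update rule, the `eps` floor and the cue position are illustrative assumptions.

```python
import numpy as np

FS = 250  # Hz; assumed, consistent with 500 samples per 2-second window

def exp_moving_standardize(x, decay=0.999, eps=1e-4):
    """Standardize each channel of x (channels, time) with running mean/variance."""
    out = np.empty_like(x, dtype=float)
    mean = x[:, 0].astype(float)
    var = np.zeros(x.shape[0])
    for t in range(x.shape[1]):
        mean = decay * mean + (1 - decay) * x[:, t]
        var = decay * var + (1 - decay) * (x[:, t] - mean) ** 2
        out[:, t] = (x[:, t] - mean) / np.maximum(np.sqrt(var), eps)
    return out

def extract_trial(x, cue_index, start_s=0.5, stop_s=2.5):
    """Cut the window from 0.5 s to 2.5 s after the cue; returns (1, channels, samples)."""
    start = cue_index + int(round(start_s * FS))
    stop = cue_index + int(round(stop_s * FS))
    return x[np.newaxis, :, start:stop]

rng = np.random.default_rng(0)
recording = rng.standard_normal((22, 10 * FS)) * 5 + 3   # synthetic filtered EEG
standardized = exp_moving_standardize(recording)
trial = extract_trial(standardized, cue_index=2 * FS)    # hypothetical cue at 2 s
print(trial.shape)   # (1, 22, 500)
```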

2. Model training process

Before training starts, the preprocessed data are divided. The 288 samples of the 2nd session are used as the test set; 80% of the 288 samples of the 1st session (about 230) are used as the training set and the remaining 20% (about 58) as the validation set. After the division, Euclidean alignment preprocessing is performed on each of the three sets. Then, mixed noise data enhancement based on empirical mode decomposition is applied to the training set, expanding it to 460 samples, after which training starts.

In the experiment, all methods are trained with the cross-entropy loss function and the Adam optimizer; the learning rate is set to 0.001 and Adam's default values are used for its other parameters. The batch size is set to 64 for all methods. The parameters of the SincConv layers of Sinc-ShallowNet and of the proposed method are initialized as described in the detailed description, and the remaining parameters are initialized using Xavier uniform initialization.
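The sample counts above follow from simple arithmetic, sketched below. The random shuffling and the seed are assumptions (the text does not say how the 80/20 split is drawn), as is the reading that augmentation generates one new sample per original training sample.

```python
import numpy as np

def split_session(n_samples=288, train_fraction=0.8, seed=0):
    """Return index arrays for the training and validation parts of session 1."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)          # random split is an assumption
    n_train = int(n_samples * train_fraction)   # int(230.4) = 230
    return order[:n_train], order[n_train:]

train_idx, val_idx = split_session()
n_augmented = 2 * len(train_idx)   # originals plus one generated sample each (assumed)
print(len(train_idx), len(val_idx), n_augmented)   # 230 58 460
```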

The training of the model is divided into 2 stages, a setting similar to that of existing methods. In the first stage, the maximum number of iterations is set to 800, and training ends early when the validation-set loss reaches its minimum, to prevent overfitting and save training time. In the second stage, the validation set is merged into the training set for further training; training ends early when the validation-set loss falls below the training-set loss of the first stage, and the maximum number of iterations is again set to 800. The model with the lowest validation-set loss during the second stage is recorded and used to predict the test-set samples, and the test-set accuracy is recorded. This training and testing procedure is carried out separately for each of the 9 subjects to obtain 9 test-set accuracies, and their mean is recorded as the final model accuracy. The significance of the difference between the mean accuracy of the proposed method and that of each existing method is tested with a one-sided Wilcoxon signed-rank test, and the results of the multiple comparisons are corrected with the False Discovery Rate (FDR) method at α = 0.05 to reduce the error introduced by repeated comparisons.
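The two-stage schedule can be sketched independently of any deep learning framework. In this sketch `train_epoch` and `eval_loss` are hypothetical callables supplied by the caller, and `patience` is an assumed way of deciding that the validation loss has reached its minimum; the text does not specify that criterion. Saving and restoring the best model is omitted for brevity.

```python
def two_stage_training(train_epoch, eval_loss, max_epochs=800, patience=100):
    """train_epoch(stage) runs one epoch; eval_loss(split) returns the current
    loss on "train" or "valid". Both are hypothetical callables."""
    # Stage 1: train on the training set and track the best validation loss
    best_valid, best_epoch = float("inf"), 0
    train_loss_at_best = float("inf")
    for epoch in range(1, max_epochs + 1):
        train_epoch(stage=1)
        valid = eval_loss("valid")
        if valid < best_valid:
            best_valid, best_epoch = valid, epoch
            train_loss_at_best = eval_loss("train")
        elif epoch - best_epoch >= patience:
            break   # validation loss has stopped improving (assumed criterion)

    # Stage 2: train on training + validation data until the validation loss
    # drops below the stage-1 training loss, or the epoch limit is reached
    stage2_epochs = 0
    for epoch in range(1, max_epochs + 1):
        train_epoch(stage=2)
        stage2_epochs = epoch
        if eval_loss("valid") < train_loss_at_best:
            break
    return best_epoch, stage2_epochs
```

A caller would wrap its own model so that `train_epoch(stage=2)` draws batches from the merged training and validation data.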

3. Results of the experiment

(1) Comparative experiment

Comparative experiments were carried out on the proposed method and the existing methods, and the results are shown in table 1. Table 1 shows that the mean accuracy of the FB-Sinc-ShallowNet method provided by the invention is higher than that of the existing methods, and that the p-values of the Wilcoxon signed-rank tests against all comparison methods are less than 0.05, so the differences are significant and the proposed method performs better than the comparison methods. The standard deviation of the proposed method is also the lowest of all methods, indicating that the proposed method is the most stable.

Table 1 Comparative experiment results

(2) Ablation experiment

To verify the validity of the three inventive contributions of the proposed method, the invention performed 3 sets of ablation experiments, using the following methods for model training and testing: (1) FB-Sinc-ShallowNet without mixed noise data enhancement based on empirical mode decomposition; (2) FB-Sinc-ShallowNet with mixed noise data enhancement based on empirical mode decomposition but without Euclidean alignment preprocessing; (3) no filter bank structure, i.e., the Sinc-ShallowNet method combined with mixed noise data enhancement based on empirical mode decomposition and Euclidean alignment preprocessing. The results are shown in table 2. As can be seen from table 2, removing any one of the three contributions gives a mean accuracy lower than that of the complete proposed method, and the p-values of the Wilcoxon signed-rank tests are all less than 0.05, so the differences are significant, indicating that all three contributions are effective and indispensable. The standard deviation of the accuracy is also lowest for the complete method, indicating that the method is most stable when all three contributions are included.

Table 2 Ablation experiment results

Claims

step 1, dividing an electroencephalogram signal data set into a training set, a validation set and a test set; performing Euclidean alignment preprocessing on the data of the 3 sets respectively;

step 4, classifying the data of the test set, and identifying the limb part of the motor imagery of the subject;

the mixed noise data enhancement method based on empirical mode decomposition is used to expand the samples, with the following specific steps:

step 1): EMD and IMF component screening: performing empirical mode decomposition on an original signal to obtain a plurality of IMF components, and reserving the IMF components with the correlation coefficient more than or equal to 0.1 so as to screen out the IMF components most correlated with the original signal;

step 2): estimate the energy of the IMF component: assuming that the first IMF component contains most of noise in the original signal, and estimating noise energy values in other IMF components by taking the first IMF component as a reference;

step 3): filtering the IMF components based on an adaptive threshold: taking intervals divided by adjacent zero points on the left side and the right side of each extreme point as a unit, dividing each IMF component into a plurality of intervals, arranging the intervals from small to large according to the extreme values, sequentially calculating the accumulated sum of energy from small to large until the accumulated energy reaches the noise energy critical value of the IMF component, screening the divided intervals by taking the extreme value before reaching the critical value as a threshold, reserving signals of the intervals with the extreme values being more than or equal to the threshold, and filtering signals of other intervals, thereby extracting effective components in the original signals;

step 4): generating mixed noise: generating a white noise sequence with a specific signal-to-noise ratio based on the effective components of the original signal, and summing the white noise sequence with the effective components of the original signal to obtain mixed noise;

step 5): generating a new sample: and summing the mixed noise and the original signal to obtain a new sample, and using the new sample and the original data as training set data.
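Steps 1), 4) and 5) above can be illustrated with a small NumPy sketch for a single channel. It assumes the IMF components and the effective component (the output of steps 2) and 3)) are already available as arrays, uses the Pearson correlation coefficient for screening, draws Gaussian white noise, and takes a signal-to-noise ratio of 5 dB as a purely hypothetical value. The summations follow the wording of steps 4) and 5) literally.

```python
import numpy as np

def screen_imfs(signal, imfs, min_corr=0.1):
    """Step 1): keep IMFs whose correlation coefficient with the signal is >= min_corr."""
    return [imf for imf in imfs if np.corrcoef(signal, imf)[0, 1] >= min_corr]

def white_noise_at_snr(reference, snr_db, rng):
    """Gaussian white noise whose power gives the requested SNR relative to `reference`."""
    noise_power = np.mean(reference ** 2) / (10 ** (snr_db / 10))
    return rng.standard_normal(reference.shape) * np.sqrt(noise_power)

def augment(signal, effective, snr_db=5.0, seed=0):
    """Steps 4)-5): mixed noise = effective component + white noise;
    new sample = original signal + mixed noise."""
    rng = np.random.default_rng(seed)
    mixed_noise = effective + white_noise_at_snr(effective, snr_db, rng)
    return signal + mixed_noise

# Toy usage with synthetic components standing in for real IMFs
t = np.linspace(0, 1, 1000, endpoint=False)
a, b, c = np.sin(2 * np.pi * 5 * t), np.sin(2 * np.pi * 40 * t), np.cos(2 * np.pi * 13 * t)
signal = a + b
kept = screen_imfs(signal, [a, b, c])   # c is uncorrelated with the signal
new_sample = augment(signal, effective=a)
print(len(kept), new_sample.shape)
```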

2. The classification method according to claim 1, characterized in that the samples are preprocessed using a Euclidean alignment method, in particular:

step 1: letting the data of one subject be X, comprising N samples; calculating the arithmetic mean of the covariance matrices of all samples;

3. The classification method according to claim 1, characterized in that the classification is performed using the FB-Sinc-ShallowNet method, specifically:

the input signal is divided into 4 parts and respectively sent into 4 branches of a shallow neural network, and each branch comprises a SincConv layer, a depthwise convolutional layer and a pooling layer; the outputs of the 4 branches are finally merged into a feature map, regularized through a dropout layer (Dropout), converted into a one-dimensional vector through a Flatten layer, and classified through a SoftMax function; the method comprises the following specific steps:

step 2: each branch calculates the time domain convolution of the input signal and the FIR filter by using a SincConv layer;

step 5: adjusting the feature weight value of each branch by using a batch normalization layer;