Disclosure of Invention
The invention aims to provide a motor imagery EEG decoding method based on spatial-feature time-frequency transformation, in order to solve the above problems.
The technical scheme of the invention is as follows: a motor imagery EEG decoding method based on spatial-feature time-frequency transformation uses a Butterworth band-pass filter to obtain the target frequency band of the EEG signal, applies the CSP algorithm to spatially filter the target band, normalizes the data, obtains the time-frequency energy map of the signal through the continuous wavelet transform, feeds the time-frequency map to a convolutional neural network, and finally obtains the classification result of the signal.
The method comprises the following specific steps:
Step1: a Butterworth band-pass filter with a pass band of 8-30 Hz and order 5 is designed to obtain the motor imagery EEG signals of the alpha and beta bands.
Step2: the 8-30 Hz signals filtered in Step1 are spatially filtered using the common spatial pattern (CSP) algorithm and then normalized.
Step3: the continuous wavelet transform (CWT) is applied to the EEG data processed in Step2 to obtain the corresponding time-frequency energy map. The mother wavelet is the Morlet wavelet, the center frequency is 0.8125, and the length of the scale sequence is 125.
Step4: a convolutional neural network (CNN) consisting of convolution-pooling layers and fully connected layers is designed. The time-frequency energy map from Step3 is used as its input; the CNN automatically extracts the features of the time-frequency map, and the parameters of each layer are updated through back propagation to complete the learning and training of the network.
Step5: the performance of the convolutional neural network model trained in Step4 is evaluated on the test set, verifying the effectiveness of spatial filtering and time-frequency transformation for motor imagery EEG feature extraction.
Step1 specifically comprises the following: in MI-EEG data, the μ rhythm (8-13 Hz) and β rhythm (17-30 Hz) exhibit increases or decreases in energy at the associated electrodes under different motor imagery conditions. In the data preprocessing stage, a Butterworth band-pass filter is applied to the raw data to obtain the 8-30 Hz motor imagery EEG signals.
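As a concrete illustration of this preprocessing step, the band-pass filtering can be sketched with SciPy. This is a minimal sketch under stated assumptions: the array layout (channels × samples), the 250 Hz sampling rate, the variable names, and the zero-phase `filtfilt` choice are illustrative, not details fixed by the disclosure.

```python
# Sketch of the Step1 filtering: 5th-order Butterworth band-pass, 8-30 Hz.
# Assumes `eeg` has shape (n_channels, n_samples) sampled at 250 Hz.
import numpy as np
from scipy.signal import butter, filtfilt

def bandpass_8_30(eeg, fs=250.0, order=5):
    """5th-order Butterworth band-pass (8-30 Hz), zero-phase."""
    nyq = fs / 2.0
    b, a = butter(order, [8.0 / nyq, 30.0 / nyq], btype="band")
    # filtfilt runs the filter forward and backward, avoiding phase distortion
    return filtfilt(b, a, eeg, axis=-1)

# Toy check: a 10 Hz component (in band) plus a 50 Hz component (out of band)
fs = 250.0
t = np.arange(0, 4, 1 / fs)
x = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 50 * t)
y = bandpass_8_30(x[None, :], fs)[0]
```

After filtering, the 10 Hz component should pass essentially unchanged while the 50 Hz component is strongly attenuated.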
The Step2 specifically comprises the following steps:
Spatial filtering is suited to multichannel signal data: it exploits the spatial correlation of EEG signals to suppress noise and localize cortical neural activity. The common spatial pattern (CSP) algorithm is widely used for spatial-domain feature extraction from two-class motor imagery EEG data, extracting the spatial distribution components of each class from the multichannel recordings.
Let X_1 and X_2 be the two classes of signals in the MI-EEG data. Their normalized covariance matrices are:

R_1 = X_1 X_1^T / trace(X_1 X_1^T) (1)
R_2 = X_2 X_2^T / trace(X_2 X_2^T) (2)

In formulas (1) and (2), X_1^T and X_2^T denote the transposes of X_1 and X_2, and trace(·) denotes the sum of the diagonal elements of a matrix. The composite covariance matrix is then calculated:

R = R̄_1 + R̄_2 (3)

where R̄_1 is the mean covariance matrix of X_1 and R̄_2 is the mean covariance matrix of X_2.
Performing eigenvalue decomposition according to the mixed space covariance matrix obtained in the formula (3):
R=UλU T (4)
where U is the eigenvector matrix of R and λ is the diagonal matrix of the eigenvalues of R, sorted in descending order. The whitening matrix is then calculated:

P = λ^(−1/2) U^T (5)
Applying the whitening matrix P from formula (5) to the normalized covariance matrices of formulas (1) and (2) yields:

S_1 = P R_1 P^T (6)
S_2 = P R_2 P^T (7)
Principal component decomposition is then performed on S_1 and S_2:

S_1 = B_1 λ_1 B_1^T (8)
S_2 = B_2 λ_2 B_2^T (9)

In formulas (8) and (9), the eigenvector matrices of S_1 and S_2 are equal, namely:

B_1 = B_2 = B (10)
At the same time, the corresponding diagonal eigenvalue matrices λ_1 and λ_2 sum to the identity matrix I:

λ_1 + λ_2 = I (11)
The projection matrix is determined by the whitening matrix from formula (5) and the common eigenvector matrix from formula (10):

W = B^T P (12)
The projection matrix W is the desired spatial filter. Finally, the spatially filtered signal is obtained by formula (13):

S = X W^T (13)

where the signal X has dimensions T×N, T is the number of sampling points, N is the number of channels, W^T is N×N, and the spatially filtered signal S is T×N.
After spatial-domain filtering, the data must be normalized. Normalization centers the data and eliminates the adverse effect of sample outliers; in the model classification stage it also markedly speeds up gradient-descent optimization and improves classifier performance. In this step, the z-score is used to normalize each channel of each set of data:

z = (x − μ) / σ

where μ is the sample mean of each set of data and σ is its standard deviation.
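The CSP derivation of formulas (1)-(13) can be sketched in NumPy as follows. This is a hedged sketch, not the exact implementation of the disclosure: the trial shapes (channels × samples), the synthetic data, and the helper names are illustrative assumptions.

```python
# Minimal NumPy sketch of the CSP spatial filter, formulas (1)-(13).
# Each trial is assumed to be an array of shape (n_channels, n_samples).
import numpy as np

def csp_filter(trials_1, trials_2):
    """Return the projection matrix W of formula (12) for two trial lists."""
    def mean_norm_cov(trials):
        # formulas (1)/(2): trace-normalized covariance, averaged over trials
        covs = [X @ X.T / np.trace(X @ X.T) for X in trials]
        return np.mean(covs, axis=0)

    R1 = mean_norm_cov(trials_1)
    R2 = mean_norm_cov(trials_2)
    R = R1 + R2                        # formula (3)
    lam, U = np.linalg.eigh(R)         # formula (4) (eigh returns ascending order)
    P = np.diag(lam ** -0.5) @ U.T     # whitening matrix, formula (5)
    S1 = P @ R1 @ P.T                  # formula (6)
    _, B = np.linalg.eigh(S1)          # formulas (8)-(10): S2 shares the same B
    return B.T @ P                     # formula (12)

rng = np.random.default_rng(0)
trials_1 = [rng.standard_normal((3, 200)) for _ in range(10)]
trials_2 = [rng.standard_normal((3, 200)) * np.array([[2.0], [1.0], [0.5]])
            for _ in range(10)]
W = csp_filter(trials_1, trials_2)
S = trials_1[0].T @ W.T               # formula (13): (T x N) output
```

A useful sanity check of the construction is the identity W (R̄_1 + R̄_2) W^T = I, which follows directly from the whitening step.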
The Step3 specifically comprises the following steps:
The CSP algorithm extracts the spatial features of the signal; the continuous wavelet transform is then used to convert the spatially filtered data into a two-dimensional time-frequency energy map. In contrast to the wavelet transform, the Fourier transform captures only the frequency-domain features of a signal, while the short-time Fourier transform loses part of the temporal features because of its fixed window length. The wavelet transform is a localized time-frequency analysis that progressively refines the signal at multiple scales through translation and dilation, giving high frequency resolution at low frequencies and high time resolution at high frequencies; it can focus on arbitrary details of the signal and thus overcomes the shortcomings of the Fourier and short-time Fourier transforms. The invention selects the Morlet wavelet as the mother wavelet of the continuous wavelet transform, whose expression is:

ψ(t) = e^(−t²/2) cos(ωt) (14)

where ω represents the center frequency and t represents time.
The continuous wavelet transform obtains a series of center frequencies by selecting one center frequency and scaling it, and obtains a family of basis functions over different time intervals by time shifting:

ψ_{α,β}(t) = (1/√α) ψ((t − β)/α) (15)

In formula (15), α is the scale factor and β is the time-shift factor.
Each basis function is integrated against a segment of the original signal; the frequency at which the resulting extremum occurs is the frequency contained in that segment:

W_f(α, β) = ∫ f(t) ψ_{α,β}(t) dt (16)

In formula (16), f(t) is the given time-domain signal and ψ_{α,β}(t) is the family of Morlet wavelet functions.
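The transform of formulas (14)-(16) can be sketched directly in NumPy. This is a hedged illustration rather than the exact procedure of the disclosure: a real Morlet wavelet with base frequency w = 5 is assumed, and the scale grid, support width, and L2-style normalization are illustrative choices.

```python
# Direct NumPy sketch of a continuous wavelet transform with a real
# Morlet wavelet, producing a time-frequency energy map.
import numpy as np

def morlet(t, w=5.0):
    """Real Morlet mother wavelet, cf. formula (14)."""
    return np.exp(-t**2 / 2.0) * np.cos(w * t)

def cwt_energy(signal, scales, w=5.0):
    """Energy map |W_f(alpha, beta)|^2 over scales, cf. formulas (15)-(16)."""
    out = np.empty((len(scales), len(signal)))
    for i, a in enumerate(scales):
        t = np.arange(-4 * a, 4 * a + 1)            # finite wavelet support
        psi = morlet(t / a, w) / np.sqrt(a)         # scaled basis, formula (15)
        out[i] = np.convolve(signal, psi, mode="same")  # inner product, (16)
    return out ** 2

fs = 250.0
t = np.arange(0, 2, 1 / fs)
sig = np.sin(2 * np.pi * 10 * t)                    # pure 10 Hz tone
scales = np.arange(2, 41)
energy = cwt_energy(sig, scales)
```

For a 10 Hz tone at 250 Hz sampling, the scale with maximum energy should fall near a ≈ w·fs/(2π·10) ≈ 20, i.e. the map localizes the tone's frequency as described above.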
The Step4 specifically comprises the following steps:
A convolutional neural network consisting of 4 convolutional layers, 2 pooling layers, and 2 fully connected layers is constructed to extract the features of the different classes of time-frequency maps and perform motor imagery EEG decoding. The CNN mimics the biological visual perception mechanism; it is a feedforward neural network with convolutional computation and a deep structure. Owing to its end-to-end nature, the CNN is widely used in computer vision, natural language processing, speech recognition, and text classification.
Let X_i, i = 1, …, n, be the n training samples. The convolutional layer receives the input data, extracts sample features, and produces a series of feature maps:

feature_map = conv(X_i, KS, KN) (17)

In formula (17), X_i is a training sample, KS is the size of the convolution kernel, KN is the number of convolution kernels, and conv(·) is the defined convolutional layer.
The activation function makes the output of each network layer nonlinear. In general the ReLU is adopted: it converges quickly during training and its gradient is simple to compute. Formula (18) gives the expression of the ReLU:

f(x) = max(0, x) (18)
After convolution, activation, and pooling, the feature maps are fed into a classifier composed of fully connected layers. Each fully connected layer consists of several neurons, and the neurons of each layer are independent. Formula (19) gives the expression of a single neuron:
y = Σ_i w_i x_i + b (19)

where w_i are the weights and b is the bias.
Likewise, each neuron in the fully connected layers requires an activation function; however, the last layer is activated not by a ReLU but by a softmax function:

p_i = e^{z_i} / Σ_j e^{z_j} (20)

In formula (20), p_i is the probability of class i and z_i is the network output from which class i is computed.
Cross entropy is used as the loss function, Adam is selected as the optimizer, and the learning rate is set to 3×10^−4. The cross-entropy loss function is defined as:

L = −(1/N) Σ_{n=1}^{N} Σ_{i=1}^{k} y_i log(p_i) (21)

In formula (21), y_i is the true label of class i, p_i is the probability of class i, N is the total number of training samples, and k is the number of classes.
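The softmax of formula (20) and the cross-entropy loss of formula (21) can be sketched in a few lines of NumPy; the sample shapes and one-hot encoding are illustrative assumptions.

```python
# NumPy sketch of formula (20) (softmax) and formula (21) (cross entropy).
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(y_true, p):
    """Mean cross-entropy over N samples with one-hot labels, formula (21)."""
    return -np.mean(np.sum(y_true * np.log(p + 1e-12), axis=-1))

logits = np.array([[2.0, 0.5], [0.1, 1.9]])   # two samples, two classes
p = softmax(logits)
y = np.array([[1.0, 0.0], [0.0, 1.0]])        # one-hot labels
loss = cross_entropy(y, p)
```

Each row of `p` sums to 1, and the loss decreases as the predicted probability of the true class approaches 1, which is what drives the back-propagation update described in Step4.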
The Step5 specifically comprises the following steps:
The public EEG dataset BCI Competition 2008 IV-2b is used as training and test data, so no ethical issues are involved. The dataset comprises 5 sessions with a total of 720 left- and right-hand motor imagery EEG trials. The first three sessions are used as training data and the last two as test data to evaluate performance. The final performance of the deep model is judged by comparison with traditional machine learning models and feature-processing methods.
The beneficial effects of the invention are as follows: the motor imagery EEG decoding method based on spatial-feature time-frequency transformation extracts the global and local features of the spatial, time, and frequency domains from EEG signals of low spatial resolution, low signal-to-noise ratio, and non-stationarity. Unlike traditional spatial feature extraction, the common spatial pattern (CSP) is used here only to spatially filter the EEG signals of the motor imagery rhythm bands; the log-variance features of the decomposed signals are not extracted. After CSP spatial filtering, signals of different classes show greater spatial differences. At the same time, the time-frequency energy map of the signal is obtained using the continuous wavelet transform. The time-frequency map contains time, frequency, and energy information, giving high-resolution frequency information in the low band and high-resolution time information in the high band, and its image form is convenient as input to a convolutional neural network. Convolutional neural networks are widely used in computer vision, image recognition, and speech classification because of their excellent performance; their end-to-end nature automatically learns data features, updates network parameters, and completes classification and recognition tasks, making them well suited to complex EEG signal processing. The proposed scheme can analyze EEG physiological signals and lays the groundwork for future real-time brain-computer interface systems.
Experimental results show that the proposed feature extraction strategy captures the global features of EEG signals and improves the classification accuracy of the convolutional neural network. Its effectiveness was verified by comparison with other traditional analysis methods.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
1. Time-frequency-spatial feature extraction strategy combining the common spatial pattern and the continuous wavelet transform.
1.1 MI-EEG preprocessing and spatial-domain feature extraction of signals.
The brain-computer interface (BCI) is a means of communication between a user and a computer that does not depend on the normal neural pathways of the brain and muscles. According to the electroencephalogram (EEG) acquisition method, BCIs can be classified into three types: non-implantable, implantable, and partially implantable. Non-implantable BCIs convert the EEG recording into control commands to control external devices; they are convenient to operate and widely applied. A typical non-implantable BCI usually consists of three parts: EEG signal acquisition, signal decoding, and external device control. In this process, signal decoding is the critical step that ensures proper operation of the whole system.
Fig. 1 shows the motor imagery EEG decoding method based on spatial-feature time-frequency transformation. Numerous experiments have shown that when the human brain performs a motor imagery task, the sensorimotor area produces corresponding potential changes, known as event-related desynchronization (ERD) and event-related synchronization (ERS). This phenomenon manifests as significant energy changes in the μ (8-13 Hz) and β (17-30 Hz) rhythms of the C3, Cz, and C4 electrode channels; Fig. 2 shows the raw three-channel MI-EEG data acquired. The Butterworth filter has a maximally flat frequency response in the passband, with the amplitude decaying gradually to zero in the stopband. The key purpose of the invention is to design an effective feature representation so that the decoding model can acquire local and global features covering time, frequency, and space from a short motor imagery period.
As shown in Fig. 3, in the data preprocessing stage the 5th-order Butterworth band-pass filter acquires the signals of the alpha and beta bands well without distorting them. Spatial filtering exploits the spatial correlation of EEG signals to suppress noise and localize cortical neural activity, and is widely used for multichannel signal data. The common spatial pattern (CSP) is a common feature extraction method for motor imagery EEG signals, particularly for two-class multichannel EEG data. Its basic principle is to use matrix diagonalization to find a set of optimal spatial filters for projection, maximizing the variance difference between the two classes of signals and thereby obtaining feature vectors with higher discriminability.
Let X_1 and X_2 be the two classes of signals in the MI-EEG data. Their normalized covariance matrices are:

R_1 = X_1 X_1^T / trace(X_1 X_1^T) (1)
R_2 = X_2 X_2^T / trace(X_2 X_2^T) (2)

In formulas (1) and (2), X_1^T and X_2^T denote the transposes of X_1 and X_2, and trace(·) denotes the sum of the diagonal elements of a matrix. The composite covariance matrix is then calculated:

R = R̄_1 + R̄_2 (3)

where R̄_1 is the mean covariance matrix of X_1 and R̄_2 is the mean covariance matrix of X_2.
performing eigenvalue decomposition according to the mixed space covariance matrix obtained in the formula (3):
R=UλU T (4)
where U is the eigenvector matrix of R and λ is the diagonal matrix of the eigenvalues of R, sorted in descending order. The whitening matrix is then calculated:

P = λ^(−1/2) U^T (5)
Applying the whitening matrix P from formula (5) to the normalized covariance matrices of formulas (1) and (2) yields:

S_1 = P R_1 P^T (6)
S_2 = P R_2 P^T (7)
Principal component decomposition is then performed on S_1 and S_2:

S_1 = B_1 λ_1 B_1^T (8)
S_2 = B_2 λ_2 B_2^T (9)

In formulas (8) and (9), the eigenvector matrices of S_1 and S_2 are equal, namely:

B_1 = B_2 = B (10)
At the same time, the corresponding diagonal eigenvalue matrices λ_1 and λ_2 sum to the identity matrix I:

λ_1 + λ_2 = I (11)
The projection matrix is determined by the whitening matrix from formula (5) and the common eigenvector matrix from formula (10):

W = B^T P (12)
The projection matrix W is the desired spatial filter. Finally, the spatially filtered signal is obtained by formula (13):

S = X W^T (13)

where the signal X has dimensions T×N, T is the number of sampling points, N is the number of channels, W^T is N×N, and the spatially filtered signal S is T×N.
After spatial-domain filtering, the data must be normalized. Normalization centers the data and eliminates the adverse effect of sample outliers; in the model classification stage it also markedly speeds up gradient-descent optimization and improves classifier performance. In this step, the z-score is used to normalize each channel of each set of data:

z = (x − μ) / σ

where μ is the sample mean of each set of data and σ is its standard deviation.
Fig. 4 shows the EEG signal after CSP spatial filtering. The CSP algorithm extracts the spatial features of the signals, so that the two classes exhibit greater differences in their spatial components.
1.2 Time-frequency feature extraction based on the continuous wavelet transform
The wavelet transform is one of the most powerful tools in signal analysis. It inherits and develops the localization idea of the short-time Fourier transform while overcoming the drawback of a window size that does not change with frequency: it provides a "time-frequency" window that varies with frequency, making it an ideal tool for time-frequency analysis and processing of signals, and it has been called a "mathematical microscope". Wavelet transforms include the continuous wavelet transform (CWT) and the discrete wavelet transform (DWT), which have shown great potential in signal denoising, edge detection, and image processing. The wavelet transform is a localized time-frequency analysis that progressively refines the signal at multiple scales through translation and dilation, giving high frequency resolution at low frequencies and high time resolution at high frequencies; it can focus on arbitrary details of the signal and thus overcomes the shortcomings of the Fourier and short-time Fourier transforms. The invention selects the Morlet wavelet as the mother wavelet of the continuous wavelet transform, whose expression is:

ψ(t) = e^(−t²/2) cos(ωt) (14)

In formula (14), ω represents the center frequency and t represents time.
The continuous wavelet transform obtains a series of center frequencies by selecting one center frequency and scaling it, and obtains a family of basis functions over different time intervals by time shifting:

ψ_{α,β}(t) = (1/√α) ψ((t − β)/α) (15)

In formula (15), α is the scale factor and β is the time-shift factor.
Each basis function is integrated against a segment of the original signal; the frequency at which the resulting extremum occurs is the frequency contained in that segment:

W_f(α, β) = ∫ f(t) ψ_{α,β}(t) dt (16)

In formula (16), f(t) is the given time-domain signal and ψ_{α,β}(t) is the family of Morlet wavelet functions.
On the basis of CSP spatial filtering, the invention uses the CWT to convert the two-dimensional time-series signal into a three-dimensional time-frequency map containing time, frequency, and energy.
As shown in Fig. 5, this embodiment compares the time-frequency energy maps obtained by applying the continuous wavelet transform directly, without CSP spatial filtering, and after CSP filtering. The comparison shows that after CSP spatial filtering the redundancy of the data is reduced and the important features become more salient, which improves the training speed, classification accuracy, and robustness of the classifier.
2. Model building strategy based on deep learning
2.1 Construction of the convolutional neural network and EEG decoding
Inspired by the structure of human brain neurons, deep neural networks have made great progress on regression and prediction problems. The convolutional neural network is one such model: a feedforward neural network with convolutional computation and a deep structure. Its end-to-end architecture and automatic feature extraction minimize the interference of redundant information and improve classification performance. As in computer vision, in EEG decoding the lower layers of a convolutional neural network learn general feature representations of the data while the higher layers learn task-specific representations. The network can therefore learn the relevant features from a large number of samples, avoiding a complex hand-crafted feature extraction process and allowing personnel outside the neuro-medical field to analyze EEG data.
As shown in Fig. 6, the invention builds a convolutional neural network consisting of 4 convolutional layers, 2 pooling layers, and 2 fully connected layers to extract the features of the different classes of time-frequency maps and perform motor imagery EEG decoding. Dropout layers are added between some network layers, which effectively suppresses overfitting, improves model accuracy, and accelerates training convergence. Let X_i, i = 1, …, n, be the n training samples. The convolutional layer receives the input data, extracts sample features, and produces a series of feature maps:
feature_map = conv(X_i, KS, KN) (17)

In formula (17), X_i is a training sample, KS is the size of the convolution kernel, KN is the number of convolution kernels, and conv(·) is the defined convolutional layer.
The activation function makes the output of each network layer nonlinear. In general the ReLU is adopted: it converges quickly during training and its gradient is simple to compute. Formula (18) gives the expression of the ReLU:

f(x) = max(0, x) (18)
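A toy NumPy sketch can make the feature-map computation of formula (17) and the ReLU of formula (18) concrete, together with a 2×2 max pooling step. The input size, kernel count, and random kernel values are illustrative assumptions, not the layer configuration of Fig. 6.

```python
# Toy forward pass of one conv -> ReLU -> 2x2 max-pool stage in NumPy.
import numpy as np

def conv2d(img, kernels):
    """Valid 2-D convolution of one image with KN kernels of size KS x KS."""
    KN, KS, _ = kernels.shape
    H, W = img.shape
    out = np.empty((KN, H - KS + 1, W - KS + 1))
    for k in range(KN):
        for i in range(H - KS + 1):
            for j in range(W - KS + 1):
                out[k, i, j] = np.sum(img[i:i+KS, j:j+KS] * kernels[k])
    return out                                   # formula (17): KN feature maps

def relu(x):
    return np.maximum(0.0, x)                    # formula (18)

def max_pool2(x):
    KN, H, W = x.shape
    return x[:, :H//2*2, :W//2*2].reshape(KN, H//2, 2, W//2, 2).max(axis=(2, 4))

rng = np.random.default_rng(0)
img = rng.standard_normal((16, 16))              # stand-in for a time-frequency map
kernels = rng.standard_normal((4, 3, 3))         # KN = 4 kernels, KS = 3
fmap = max_pool2(relu(conv2d(img, kernels)))
```

A 16×16 input with 3×3 valid kernels gives 14×14 maps, halved to 7×7 by pooling, illustrating how each stage reduces spatial size while the channel dimension KN carries the learned features.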
After convolution, activation, and pooling, the feature maps are fed into a classifier composed of fully connected layers for recognition and classification. Each fully connected layer consists of several neurons, and the neurons of each layer are independent. Formula (19) gives the expression of a single neuron:
y = Σ_i w_i x_i + b (19)

where w_i are the weights and b is the bias. Likewise, each neuron in the fully connected layers requires an activation function; however, the last layer is activated not by a ReLU but by a softmax function:

p_i = e^{z_i} / Σ_j e^{z_j} (20)

In formula (20), p_i is the probability of class i and z_i is the network output from which class i is computed.
Cross entropy is used as the loss function and Adam is selected as the optimizer, which effectively prevents vanishing gradients and improves model performance. The optimizer learning rate is set to 3×10^−4, and the cross-entropy loss function is defined as:

L = −(1/N) Σ_{n=1}^{N} Σ_{i=1}^{k} y_i log(p_i) (21)

In formula (21), y_i is the true label of class i, p_i is the probability of class i, N is the total number of training samples, and k is the number of classes.
3. Experiment and evaluation
3.1 Experimental data set acquisition
In this example, the experimental results are evaluated on the public EEG dataset BCI Competition 2008 IV-2b, so no ethical issues are involved. The data were collected by the brain-computer interface laboratory at Graz University of Technology; the scalp electrode placement followed the international 10-20 system, with a sampling frequency of 250 Hz. The dataset contains EEG data from 9 volunteers, each with normal or corrected vision and no other physiological defects. Each volunteer participated in 5 recording sessions: the first two were recorded without feedback and the last three with online feedback. Electrooculography was recorded for about 5 minutes before each session to evaluate the influence of EOG. In total, the training data contain 200 motor imagery recordings for each of the left and right hands, and the test data contain 160 recordings for each hand.
Because the acquisition paradigms of the 5 sessions differ, the data are uniformly processed in the preprocessing stage: the segment from 1 s to 4 s of each motor imagery event record, 3 s in total, is extracted as the target data. Since EEG signals are strongly subject-specific, with the physiological differences between individuals producing different EEG expressions, the tests must use data from a single volunteer. The proposed scheme is evaluated on the EEG data of volunteer 4, whose data comprise 420 training-set records and 320 test-set records.
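The epoching described above (the 1 s - 4 s window after each event at 250 Hz) can be sketched as follows; the continuous-recording layout, event list, and function name are illustrative assumptions rather than the dataset's actual file format.

```python
# Sketch of extracting 3 s epochs (1 s - 4 s after each event) at 250 Hz.
# Assumes `raw` has shape (n_channels, n_samples) and events are sample indices.
import numpy as np

FS = 250  # sampling frequency in Hz

def extract_epochs(raw, event_samples, fs=FS, tmin=1.0, tmax=4.0):
    """Return an array of shape (n_events, n_channels, (tmax - tmin) * fs)."""
    n0, n1 = int(tmin * fs), int(tmax * fs)
    return np.stack([raw[:, s + n0 : s + n1] for s in event_samples])

raw = np.random.randn(3, 10 * FS)   # 3 channels, 10 s of continuous data
events = [0, 1250]                  # event onsets, in samples
epochs = extract_epochs(raw, events)
```

Each epoch is 3 s × 250 Hz = 750 samples long, matching the "3 s of target data" stated above; a library such as MNE provides the same operation via its `Epochs` class.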
3.2 Comparative analysis of experimental results
To verify the decoding performance of the proposed method on motor imagery EEG data, some important parameter variations during model training are first analyzed, and the method is then compared with other approaches. The experimental platform is based on Python 3.7, using the deep learning toolbox TensorFlow and the EEG analysis toolbox MNE.
The number of training epochs, the test accuracy, the evolution of the loss function, and whether the training curve converges are important performance indicators of a convolutional neural network model. Fig. 7 shows the training-test accuracy curves of the model in this embodiment. After 200 epochs the training curve essentially fits the data: the training accuracy reaches 99.75% and the maximum test accuracy reaches 97.22%, with both curves converging quickly and remaining stable. Fig. 8 shows the change of the training-test loss function: for both training and test data, the loss drops rapidly within the first 20 epochs and converges and stabilizes after 25 epochs. The experimental results show that the model has high classification accuracy and strong robustness.
Meanwhile, to further verify the effectiveness of the scheme, it is compared with other motor imagery EEG decoding methods. Table 1 compares the results of an LDA linear classifier with different feature representations. Traditional machine learning models have proven useful for motor imagery EEG decoding, but owing to the peculiarities of EEG signals they have clear limitations in accuracy and feature extraction. The feature extraction method is an important factor in improving the decoding accuracy of a machine learning model: features that are salient and highly variable between classes are easier for the model to distinguish. Deep learning, thanks to its end-to-end nature, automatically learns feature representations of the data and shows excellent performance on recognition problems.
Table 1: comparison of LDA and CNN classification accuracy under different characteristics
4. Summary
The brain-computer interface is one of the most promising current research fields, but its development is hindered by factors such as the non-stationarity of EEG signals, low signal-to-noise ratio, and the difficulty of data acquisition. EEG signal decoding is the most critical step in a brain-computer interface system, and currently faces two main challenges: 1. how to design a scientific feature representation that extracts effective features from signals of low spatial resolution and low signal-to-noise ratio; 2. how to construct a high-performance decoding network so that the model can correctly decode the EEG information. For feature extraction, traditional signal analysis methods obtain EEG features of higher signal-to-noise ratio by solving and analyzing the power spectral density of the signal or by averaging superimposed event-related potentials. Because EEG signals are temporal, such traditional feature extraction is insufficient to capture all the features of the signal and may lose deep features and time-related information. This also constrains the performance of machine learning classification models, leaving traditional decoding models with low classification accuracy and poor robustness. The invention provides a motor imagery EEG decoding method based on spatial-feature time-frequency transformation, which acquires feature information covering the three dimensions of time, frequency, and space from the raw EEG signals.
The spatial components of the signal are first extracted by the CSP algorithm, and the continuous wavelet transform then converts the two-dimensional time-series signal into a three-dimensional time-frequency energy map. A convolutional neural network is built with deep learning to recognize and classify the time-frequency maps. Experiments prove that the extracted spatial and time-frequency features improve the accuracy and robustness of the classification network.
The convolutional neural network model built by the invention achieves good performance in the final tests and offers a new approach for subsequent real-time brain-computer interface signal decoding. Its application to EEG analysis may provide a new alternative to classical spectral feature analysis of bioelectric signals. Furthermore, this work may contribute to further research and practical application of MI-EEG recognition.
While the present invention has been described in detail with reference to the drawings, the present invention is not limited to the above embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art.