CN113408444B - Event-related potential signal classification method based on CNN-SVM

Info

Publication number: CN113408444B
Application number: CN202110705855.4A
Authority: CN
Inventors: 谢俊; 于鸿伟; 何柳诗; 张焕卿; 李敏; 徐光华
Original assignee: Xian Jiaotong University
Current assignee: Xian Jiaotong University
Priority date: 2021-06-24
Filing date: 2021-06-24
Publication date: 2022-12-09
Anticipated expiration: 2041-06-24
Also published as: CN113408444A

Abstract

An event-related potential signal classification method based on a CNN-SVM comprises the following steps: first, band-pass filtering the collected electroencephalogram signals; then building a labeled data set from the electroencephalogram signals and dividing it into a training set, a validation set and a test set; inputting the training set into the designed convolutional neural network model for training, and selecting the optimal network parameters using the validation set; inputting the training set, the validation set and the test set into the trained convolutional neural network and outputting the features of the network's down-sampling layer, wherein the features corresponding to the training set and the validation set are the training features and the features corresponding to the test set are the test features; training a support vector machine model with the training features, and classifying the test features with the trained support vector machine model to obtain the identification result of the event-related potential signal. The invention achieves accurate identification of event-related potential signals and improves the practical value of brain-computer interface systems.

Description

Event-related potential signal classification method based on CNN-SVM

Technical Field

The invention relates to the technical field of event-related potential brain-computer interfaces, and in particular to an event-related potential signal classification method based on a CNN-SVM (convolutional neural network and support vector machine).

Background

A brain-computer interface (BCI) is a technology that lets the brain interact directly with the outside world without relying on peripheral nerve pathways. Owing to advantages such as low cost, high temporal resolution and good safety, it is widely applied in fields such as language communication, environmental control and motor function rehabilitation. Brain-computer interface applications based on electroencephalogram signals can use various types of EEG signal, including steady-state visual evoked potentials (SSVEP) and event-related potentials (ERP).

ERP is a transient brain response evoked by a series of specific stimuli and reflects how the brain processes physical stimuli. ERP consists mainly of "exogenous components" such as P1, N1 and P2, which are readily influenced by the physical characteristics of the stimulus, and "endogenous components" such as N2 and P3 (or P300), which are not. The most studied and most widely used potential in clinical practice is the P300 potential, whose peak appears about 300 milliseconds after the relevant event; most ERP tests detect whether a P300 potential is present.

Traditional ERP signal classification generally extracts frequency-domain or time-frequency-domain features from the electroencephalogram signals by hand and then performs supervised classification on the extracted features. However, to identify ERP signals effectively, traditional methods generally have to raise the signal-to-noise ratio by superimposing the signal waveforms many times. Compared with traditional feature extraction, deep learning can automatically mine deeper features of the signal and avoid information loss. But because ERP signals are weak, vary greatly between individuals, and come with small electroencephalogram sample sizes, overfitting often occurs during deep learning, which limits the engineering application of ERP-based brain-computer interfaces.

Disclosure of Invention

In order to overcome the defects of the prior art, the invention aims to provide an event-related potential signal classification method based on a CNN-SVM, which can effectively improve the accuracy of ERP signal classification.

In order to achieve the purpose, the invention adopts the technical scheme that:

an event-related potential signal classification method based on a CNN-SVM comprises the following steps:

step 1: measuring electrodes are placed over the vertex and occipital areas of the user's head, a reference electrode is placed on one earlobe, and a ground electrode is placed on the forehead; the electroencephalogram signals measured by the electrodes are amplified, converted from analog to digital, and sent to a computer;

step 2: a character matrix of x rows by y columns is presented on a computer display screen; randomly highlighting one row or one column of the character matrix constitutes one stimulus, and each of the x rows and y columns must be highlighted once, giving x + y stimuli in total;

step 3: the user focuses on a specified character, and the x + y randomly highlighted rows and columns include those containing that character; when a row or column containing the character is highlighted, the user is asked to respond to the stimulus, which generates a P300 signal; when a row or column not containing the character is highlighted, the user does not respond, and a non-P300 signal is generated;

step 4: the collected P300 signal samples and non-P300 signal samples are filtered by a band-pass filter and made into a labeled data set, and the data set is divided into a training set, a validation set and a test set;

step 5: a convolutional neural network model for ERP signal identification is constructed;

step 6: the convolutional neural network model is trained: the training set is input into the constructed convolutional neural network model for network training, and the optimal parameters of the convolutional neural network model are selected using the validation set;

step 7: the training set, the validation set and the test set are input into the trained convolutional neural network, and the features of the network's down-sampling layer are output; the features corresponding to the training set and the validation set are the training features, and the features corresponding to the test set are the test features;

step 8: a support vector machine model is trained with the training features, and the optimal parameter values are obtained by grid search;

step 9: the test features are input into the trained support vector machine model to identify and classify the ERP signals, and the classification performance of the hybrid convolutional neural network and support vector machine model is evaluated; the hybrid model is used for online identification of ERP signals.

The convolutional neural network model in the step 5 specifically includes:

the first layer of the convolutional neural network model is the input layer l₁, which receives the raw multi-channel ERP signal; the input sample matrix has size n × m, where n is the number of channels and m is the data sampling length;

the second and third layers of the convolutional neural network model are convolutional layers; the second-layer convolutional layer l₂ performs time-domain convolution on the input ERP signal, and the third-layer convolutional layer l₃ performs spatial-domain convolution on the output of the previous layer;

the fourth layer of the convolutional neural network model is the down-sampling layer l₄, which reduces the dimensionality of the output of the previous layer; the down-sampling operation uses average pooling or max pooling, and the rectified linear unit (ReLU) is used as the activation function;

the fifth layer of the convolutional neural network model is the output layer l₅, and the network parameters are updated using a cross-entropy loss function.

In step 8, the support vector machine model uses an RBF kernel function, and the optimal parameters C and gamma are selected automatically by grid search with cross-validation.
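For reference, the RBF (Gaussian) kernel named in step 8 has the standard form below. The symbols are the conventional ones rather than notation taken from the patent: x_i and x_j stand for two down-sampling-layer feature vectors, gamma sets the kernel width, and C (not shown) is the soft-margin penalty that the grid search tunes together with gamma.

```latex
K(\mathbf{x}_i, \mathbf{x}_j) = \exp\!\left(-\gamma \,\lVert \mathbf{x}_i - \mathbf{x}_j \rVert^{2}\right), \qquad \gamma > 0
```

A larger gamma makes the kernel more local, so each support vector influences a smaller neighbourhood; a larger C penalizes training errors more heavily.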

Compared with the prior art, the invention has the following beneficial effects:

The method takes the filtered multi-channel raw electroencephalogram signal as input and, because the ERP signal has both time-domain and spatial-domain characteristics, applies a separated spatio-temporal convolution in which the time-domain convolution precedes the spatial-domain convolution, which enables preliminary identification of the ERP signal; a support vector machine replaces the fully connected layer of the convolutional neural network for classification, which effectively reduces overfitting and improves identification accuracy, gives a clear advantage in identifying ERP signals with small sample sizes, and improves the application performance of brain-computer interface systems.

Drawings

FIG. 1 is a flow chart of a method according to an embodiment of the present invention.

FIG. 2 is a diagram of exemplary stimulation according to an embodiment of the present invention, wherein (a) is a P300 character spelling matrix and (b) is a stimulation flow chart.

Fig. 3 is a schematic structural diagram of a convolutional neural network model constructed according to an embodiment of the present invention.

FIG. 4 is a confusion matrix chart of the accuracy of the P300 signal identification according to the embodiment of the present invention.

FIG. 5 is a graph comparing the results of the method of the present invention with those of the conventional convolutional neural network method, wherein (a), (b) and (c) show the P300 detection accuracy of the different methods for users S1, S2 and S3, respectively.

Detailed Description

The present invention will be described in further detail with reference to the following examples and the accompanying drawings.

As shown in fig. 1, an event-related potential signal classification method based on CNN-SVM includes the following steps:

step 1: measuring electrodes are placed at the FCz, C1, Cz, C2, Pz and POz positions over the vertex and occipital areas of the user's head, a reference electrode is placed at the earlobe position A1 or A2 on one side, and a ground electrode is placed at the forehead position Fpz; the electroencephalogram signals measured by the electrodes are amplified, converted from analog to digital, and sent to a computer;

step 2: as shown in FIG. 2 (a), a 6 × 6 character matrix consisting of the 26 English letters, 9 digits and an underscore is presented on a computer display screen; the user's task is to focus, one character at a time, on the characters of a word specified by the researcher; all rows and columns of the character matrix are highlighted continuously in random order at a rate of 5.7 Hz; randomly highlighting one row or one column of the character matrix constitutes one stimulus, and each of the 6 rows and 6 columns must be highlighted once, giving 12 stimuli in total; the specific timing is shown in FIG. 2 (b): each row or column of the character matrix is highlighted for 100 ms, followed by a pause of 75 ms; the highlighting is repeated 15 times for each character, and after the 15 repetitions there is a rest period of 2.5 seconds to inform the user that the character has been spelled and to let the user focus on the next character;
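The stimulation schedule of step 2 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patent's stimulus software: it reads "highlighted 15 times" as 15 repetitions of the full set of 12 row/column highlights, and the function names `flash_sequence`, `character_schedule` and `is_target` are hypothetical.

```python
import random

FLASH_S = 0.100        # highlight duration from the description (s)
PAUSE_S = 0.075        # pause after each highlight (s)
N_ROWS, N_COLS = 6, 6  # 6 x 6 character matrix
N_REPETITIONS = 15     # assumed: 15 repetitions of the 12 highlights


def flash_sequence(n_rows=N_ROWS, n_cols=N_COLS, rng=None):
    """One repetition: every row and every column exactly once, in random order."""
    rng = rng or random.Random()
    stimuli = [("row", r) for r in range(n_rows)] + [("col", c) for c in range(n_cols)]
    rng.shuffle(stimuli)
    return stimuli


def character_schedule(n_repetitions=N_REPETITIONS, rng=None):
    """Return (onset_seconds, kind, index) for all highlights of one character."""
    rng = rng or random.Random()
    schedule, t = [], 0.0
    for _ in range(n_repetitions):
        for kind, index in flash_sequence(rng=rng):
            schedule.append((t, kind, index))
            t += FLASH_S + PAUSE_S  # 175 ms between onsets, about 5.7 Hz
    return schedule


def is_target(kind, index, target_row, target_col):
    """A highlight should elicit a P300 only if it contains the attended character."""
    return (kind == "row" and index == target_row) or (kind == "col" and index == target_col)


schedule = character_schedule(rng=random.Random(0))
n_targets = sum(is_target(kind, idx, 2, 3) for _, kind, idx in schedule)
```

Under this reading, each repetition contains 2 target highlights (one row, one column) and 10 non-target highlights, so the P300 and non-P300 classes are imbalanced 1:5.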

step 3: the user focuses on the specified character, and the 12 randomly highlighted rows and columns include those containing that character; when a row or column containing the character is highlighted, the user is asked to respond to the stimulus, which generates a P300 signal; when a row or column not containing the character is highlighted, the user does not respond, and a non-P300 signal is generated;

step 4: the acquired P300 signal samples and non-P300 signal samples are band-pass filtered from 0.1 Hz to 20 Hz with a Butterworth band-pass filter and made into a labeled data set; the labels of the data set use one-hot encoding, and the data set is divided randomly, with 70% as the training set, 15% as the validation set and 15% as the test set;
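Step 4 can be sketched with NumPy and SciPy as follows. Several details are assumptions not given in the text: the filter order (4), zero-phase filtering with `sosfiltfilt`, filtering each 1-second epoch separately, and the helper names. The data are synthetic stand-ins with the 6 × 240 epoch shape used later in step 5.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 240.0          # sampling rate from the description (Hz)
BAND = (0.1, 20.0)  # pass band from the description (Hz)


def bandpass(epochs, fs=FS, band=BAND, order=4):
    """Zero-phase Butterworth band-pass along the last (time) axis."""
    sos = butter(order, band, btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, epochs, axis=-1)


def one_hot(labels, n_classes=2):
    """Map integer labels (0 = non-P300, 1 = P300) to one-hot rows."""
    return np.eye(n_classes, dtype=np.float32)[labels]


def split_70_15_15(x, y, seed=0):
    """Random 70% / 15% / 15% split into training, validation and test sets."""
    idx = np.random.default_rng(seed).permutation(len(x))
    n_train = int(round(0.70 * len(x)))
    n_val = int(round(0.15 * len(x)))
    train, val, test = np.split(idx, [n_train, n_train + n_val])
    return (x[train], y[train]), (x[val], y[val]), (x[test], y[test])


# Synthetic stand-in data: 200 epochs, 6 channels, 240 samples (1 s at 240 Hz).
rng = np.random.default_rng(0)
epochs = rng.standard_normal((200, 6, 240))
labels = rng.integers(0, 2, size=200)

filtered = bandpass(epochs)
(train_x, train_y), (val_x, val_y), (test_x, test_y) = split_70_15_15(filtered, one_hot(labels))
```

With a 0.1 Hz lower cutoff, a 1-second epoch is short relative to the filter's time constant, so in practice the filter might instead be applied to the continuous recording before epoching; the text does not say which was done.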

step 5: a convolutional neural network model for ERP signal identification is constructed, as shown in FIG. 3, specifically:

the first layer of the convolutional neural network model is the input layer l₁, which receives the raw multi-channel ERP signal; the input sample matrix has size 6 × 240, i.e., 6 channels by 1 second of data at a sampling rate of 240 Hz;

the second and third layers of the convolutional neural network model are convolutional layers; the second-layer convolutional layer l₂ has 6 one-dimensional convolution kernels and performs time-domain convolution on the input ERP signal; the third-layer convolutional layer l₃ has 12 one-dimensional convolution kernels and performs spatial-domain convolution on the output of the previous layer;

the fourth layer of the convolutional neural network model is the down-sampling layer l₄, which uses 12 kernels of size 1 × 6 to reduce the dimensionality of the output of the third-layer convolutional layer l₃; the down-sampling operation uses max pooling, the rectified linear unit (ReLU) is used as the activation function, the stride equals the kernel size, and Dropout is used to prevent overfitting;

the fifth layer of the convolutional neural network model is the output layer l₅; the output layer l₅ has 2 nodes, corresponding to the P300 signal and the non-P300 signal of the binary classification problem, and the network parameters are updated using a cross-entropy loss function;
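The five-layer structure of step 5 might look like the PyTorch sketch below. It is one plausible reading, not the patented implementation: the temporal kernel length (25 samples, with padding that preserves the 240-sample length) is a hypothetical choice because the text does not state it, the spatial kernel is assumed to span all 6 electrodes, and the "12 kernels of size 1 × 6" in the down-sampling layer are interpreted as 1 × 6 max pooling with stride 6 applied to the 12 feature maps. The class name `ErpCnn` is illustrative.

```python
import torch
from torch import nn


class ErpCnn(nn.Module):
    """Temporal conv -> spatial conv -> max pooling -> 2-node output."""

    def __init__(self, n_channels=6, n_samples=240, time_kernel=25, dropout=0.6):
        super().__init__()
        self.features = nn.Sequential(
            # l2: 6 temporal kernels slide along time within each electrode.
            nn.Conv2d(1, 6, kernel_size=(1, time_kernel), padding=(0, time_kernel // 2)),
            # l3: 12 spatial kernels combine all electrodes at each time point.
            nn.Conv2d(6, 12, kernel_size=(n_channels, 1)),
            nn.ReLU(),
            # l4: max pooling with a 1 x 6 window and a stride equal to the window.
            nn.MaxPool2d(kernel_size=(1, 6), stride=(1, 6)),
            nn.Flatten(),
        )
        self.dropout = nn.Dropout(dropout)
        # l5: 2 output nodes (non-P300, P300); softmax is folded into the loss.
        self.classifier = nn.Linear(12 * (n_samples // 6), 2)

    def forward(self, x):
        # x has shape (batch, 1, n_channels, n_samples)
        return self.classifier(self.dropout(self.features(x)))


model = ErpCnn()
batch = torch.zeros(4, 1, 6, 240)
pooled = model.features(batch)  # down-sampling-layer features, 12 * 40 = 480 per sample
logits = model(batch)           # one score per class
```

Keeping the pooled output behind a separate `features` attribute makes it easy to reuse the same tensor later as input to a support vector machine.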

step 6: the convolutional neural network model is trained: the training set is input into the constructed convolutional neural network model for network training, and the optimal parameters of the convolutional neural network model are selected using the validation set;

the network is trained in mini-batches with a batch size of 64 samples; the Dropout ratio is set to 0.6, the weights are optimized with the Adam stochastic gradient descent method, and the learning rate is set to 0.001;
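A training loop with the stated hyperparameters (batch size 64, Adam, learning rate 0.001, cross-entropy loss) could be sketched as follows. The text does not say how the validation set selects the "optimal parameters"; keeping the weights from the epoch with the best validation accuracy is an assumption here, as are the epoch count and the function name `train_cnn`. The small linear model and random tensors are stand-ins so that the block runs on its own.

```python
import copy

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train_cnn(model, train_set, val_set, epochs=20, batch_size=64, lr=1e-3):
    """Mini-batch Adam training; restores the weights with the best validation accuracy."""
    loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()  # expects integer class labels here
    best_acc, best_state = -1.0, copy.deepcopy(model.state_dict())
    val_x, val_y = val_set.tensors
    for _ in range(epochs):
        model.train()  # enables Dropout
        for x, y in loader:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
        model.eval()  # disables Dropout for validation
        with torch.no_grad():
            acc = (model(val_x).argmax(dim=1) == val_y).float().mean().item()
        if acc > best_acc:
            best_acc, best_state = acc, copy.deepcopy(model.state_dict())
    model.load_state_dict(best_state)
    return best_acc


# Stand-in model and synthetic data, only to show the call.
torch.manual_seed(0)
x = torch.randn(128, 1, 6, 240)
y = torch.randint(0, 2, (128,))
stand_in = nn.Sequential(nn.Flatten(), nn.Dropout(0.6), nn.Linear(6 * 240, 2))
best = train_cnn(stand_in, TensorDataset(x[:96], y[:96]), TensorDataset(x[96:], y[96:]), epochs=2)
```

If the labels are stored one-hot, as in step 4, they can be converted to class indices with `argmax` before being passed to this loss.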

step 7: the training set, the validation set and the test set are input into the trained convolutional neural network, and the features of the network's down-sampling layer are output; the features derived from the training set and the validation set are used as the training features, and the features derived from the test set are used as the test features;
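Step 7 amounts to running each data split through the trained network up to the down-sampling layer and saving the result. A minimal sketch, assuming the trained layers up to and including pooling are available as a single module; the untrained `extractor` below is only a stand-in with the same shapes, and `extract_features` is a hypothetical helper name.

```python
import numpy as np
import torch
from torch import nn


@torch.no_grad()
def extract_features(feature_extractor, x, batch_size=256):
    """Return flattened down-sampling-layer outputs as a NumPy array."""
    feature_extractor.eval()  # make sure Dropout (if any) is inactive
    chunks = [
        feature_extractor(x[i:i + batch_size]).flatten(1)
        for i in range(0, len(x), batch_size)
    ]
    return torch.cat(chunks).cpu().numpy()


# Stand-in for the trained layers up to the down-sampling layer.
extractor = nn.Sequential(
    nn.Conv2d(1, 12, kernel_size=(6, 1)),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=(1, 6), stride=(1, 6)),
)

train_x, val_x, test_x = (torch.randn(n, 1, 6, 240) for n in (140, 30, 30))

# Training features come from the training AND validation sets; test features from the test set.
training_features = np.concatenate(
    [extract_features(extractor, train_x), extract_features(extractor, val_x)]
)
test_features = extract_features(extractor, test_x)
```

Merging the validation features into the training features gives the support vector machine more samples, which matters when the data set is small.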

step 8: the support vector machine model is trained with the training features; an RBF kernel function is used, the one-vs-one algorithm is used for classification, the value ranges of C and gamma are [value ranges missing in the source text], and the optimal parameter values are selected automatically by grid search with cross-validation;
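The grid search of step 8 can be sketched with scikit-learn. The search grid below is purely hypothetical, because the value ranges of C and gamma are not available in this text; the 5-fold cross-validation and the helper name `fit_svm` are also assumptions, and the features are synthetic stand-ins with 480 dimensions.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Hypothetical grid: the actual ranges of C and gamma are not given in the text.
PARAM_GRID = {"C": [0.1, 1, 10, 100], "gamma": [1e-3, 1e-2, 1e-1]}


def fit_svm(train_feats, train_labels, val_feats, val_labels, cv=5):
    """Grid-search an RBF SVM on the combined training and validation features."""
    feats = np.concatenate([train_feats, val_feats])
    labels = np.concatenate([train_labels, val_labels])
    # "ovo" mirrors the one-vs-one setting in the text; it has no effect for 2 classes.
    svm = SVC(kernel="rbf", decision_function_shape="ovo")
    search = GridSearchCV(svm, PARAM_GRID, cv=cv)
    search.fit(feats, labels)
    return search


# Synthetic stand-in features: 170 training/validation samples, 30 test samples.
rng = np.random.default_rng(0)
labels = np.arange(170) % 2
feats = rng.standard_normal((170, 480)) + 0.5 * labels[:, None]

search = fit_svm(feats[:140], labels[:140], feats[140:], labels[140:])
test_pred = search.predict(rng.standard_normal((30, 480)))
```

`GridSearchCV` refits the best parameter combination on all supplied features by default, so `search` can be used directly as the trained classifier for the test features.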

step 9: the test features are input into the trained support vector machine model to identify and classify the ERP signals, and the classification performance of the hybrid convolutional neural network and support vector machine model is evaluated; the hybrid model can further be used for online identification of ERP signals.

The method was tested in experiments on three users (S1-S3). During the experiments the electroencephalogram signals were recorded synchronously and displayed in real time, so that the state of the users could be checked, actions such as blinking and body movement could be avoided, and the quality of the electroencephalogram data could be ensured. FIG. 4 is the confusion matrix of the recognition accuracy for user S1 computed with the hybrid convolutional neural network and support vector machine model of the invention; the rows in the figure represent the predicted labels and the columns represent the actual labels. The figure shows that the method of the invention (CNN-SVM) classifies P300 signals with an accuracy of 93.5% and non-P300 signals with an accuracy of 94.3%; the recognition rate exceeds 90% for both classes, which indicates that the method classifies both types of signal well.
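Per-class accuracies of the kind read from FIG. 4 can be computed from predicted and actual labels as sketched below. The layout follows the description (rows are predicted labels, columns are actual labels); the per-class figure is assumed to be the fraction of each actual class that is classified correctly, and the label arrays are toy values for illustration, not experimental data.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes=2):
    """Count matrix with predicted labels as rows and actual labels as columns."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[p, t] += 1
    return cm


def per_class_accuracy(cm):
    """Fraction of each actual class (column) that was predicted correctly."""
    return np.diag(cm) / cm.sum(axis=0)


# Toy labels (1 = P300, 0 = non-P300); not data from the patent.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 1])

cm = confusion_matrix(y_true, y_pred)
acc = per_class_accuracy(cm)  # acc[1] is the P300 accuracy, acc[0] the non-P300 accuracy
```

Reporting the two classes separately is useful here because non-P300 epochs outnumber P300 epochs, so a single overall accuracy would be dominated by the majority class.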

In addition, to verify the validity of the method of the invention (CNN-SVM), it was compared with classical ERP signal recognition methods, namely stepwise linear discriminant analysis (SWLDA) and Bayesian linear discriminant analysis (BLDA); the results are shown in Table 1. The first row of the table lists the classification methods compared. Rows 2-4 list the P300 detection accuracy of the different classification methods. The values are given in percent (%), and the value in bold is the highest accuracy in its row.

TABLE 1. P300 detection accuracy for different classification algorithms

As can be seen from Table 1, the method of the invention (CNN-SVM) has the highest classification accuracy of all the classification methods. Because it uses the convolutional neural network as the feature extractor, it can automatically mine deeper features of the signal and avoid information loss, unlike methods that extract frequency-domain or time-frequency-domain features from the electroencephalogram signal by hand, and its classification results are clearly better than those of the other methods. These results show that the method of the invention (CNN-SVM) substantially improves detection accuracy compared with the classical ERP signal identification methods.

Referring to FIG. 5, panels (a)-(c) compare the P300 detection accuracy of the method of the invention (CNN-SVM) with that of the conventional convolutional neural network method (CNN). Improving character spelling accuracy with fewer repetitions raises the information transfer rate and therefore the communication speed between the human brain and the computer, so the P300 detection accuracy was analyzed for different numbers of repetitions k ∈ [1, 15]. Overall, the method of the invention (CNN-SVM) achieves better recognition accuracy than the conventional convolutional neural network method, with an improvement of 4.36% over the convolutional neural network alone. This is because the multilayer perceptron is an empirical risk minimization method, which is prone to overfitting when classifying small samples, whereas the support vector machine is a structural risk minimization method, which tends to generalize better than the multilayer perceptron. With only 4 repetitions of the character highlighting, the method of the invention (CNN-SVM) stably reaches a recognition accuracy above 90%, which gives it high practical value.
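The link between the number of repetitions k and the communication speed can be made concrete with the timing from step 2 (100 ms highlight, 75 ms pause, 12 highlights per repetition, 2.5 s rest per character). The sketch below also includes the widely used Wolpaw definition of bits per selection, which is an assumption here because the text does not say how the information transfer rate is defined. Its input p is the character spelling accuracy, which is not the same quantity as the P300 detection accuracy reported above, so no rate is computed from the reported figures.

```python
import math

FLASH_S, PAUSE_S, REST_S = 0.100, 0.075, 2.5  # timing from the description (s)
N_STIMULI = 12     # 6 rows + 6 columns per repetition
N_CHARACTERS = 36  # 6 x 6 character matrix


def seconds_per_character(k):
    """Time to spell one character with k repetitions, including the rest period."""
    return k * N_STIMULI * (FLASH_S + PAUSE_S) + REST_S


def bits_per_selection(p, n=N_CHARACTERS):
    """Wolpaw bits per selection for spelling accuracy p, with 0 < p <= 1."""
    if p >= 1.0:
        return math.log2(n)
    return math.log2(n) + p * math.log2(p) + (1 - p) * math.log2((1 - p) / (n - 1))


def bits_per_minute(p, k):
    """Information transfer rate for spelling accuracy p at k repetitions."""
    return bits_per_selection(p) * 60.0 / seconds_per_character(k)


t4 = seconds_per_character(4)    # 10.9 s per character
t15 = seconds_per_character(15)  # 34.0 s per character
```

Going from 15 repetitions to 4 cuts the time per character from 34.0 s to 10.9 s, which is why reaching a usable accuracy at small k is emphasized.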

Claims

1. An event-related potential signal classification method based on a CNN-SVM is characterized by comprising the following steps of:

step 3: the user focuses on a specified character, and the x + y randomly highlighted rows and columns include those containing that character; when a row or column containing the character is highlighted, the user is asked to respond to the stimulus, which generates a P300 signal; when a row or column not containing the character is highlighted, the user does not respond, and a non-P300 signal is generated;

step 4: the acquired P300 signal samples and non-P300 signal samples are filtered by a band-pass filter and made into a labeled data set, and the data set is divided into a training set, a validation set and a test set;

step 7: the training set, the validation set and the test set are input into the trained convolutional neural network, and the features of the network's down-sampling layer are output, wherein the features corresponding to the training set and the validation set are the training features, and the features corresponding to the test set are the test features;

step 8: a support vector machine model is trained with the training features, and the optimal parameter values are obtained by grid search;

step 9: the test features are input into the trained support vector machine model to identify and classify the ERP signals, and the classification performance of the hybrid convolutional neural network and support vector machine model is evaluated, wherein the hybrid model is used for online identification of ERP signals.

2. The event-related potential signal classification method based on the CNN-SVM as claimed in claim 1, wherein the convolutional neural network model in step 5 specifically comprises:

the second and third layers of the convolutional neural network model are convolutional layers; the second-layer convolutional layer l₂ performs time-domain convolution on the input ERP signal, and the third-layer convolutional layer l₃ performs spatial-domain convolution on the output of the previous layer;

3. The event-related potential signal classification method based on the CNN-SVM as claimed in claim 1, wherein: in step 8, the support vector machine model uses an RBF kernel function, and the optimal parameters C and gamma are selected automatically by grid search with cross-validation.