CN115964655A

CN115964655A - Method for monitoring error-related potential in brain-computer interface based on mutual information quantity

Info

Publication number: CN115964655A
Application number: CN202310041182.6A
Authority: CN
Inventors: 曹天傲; 王启松; 刘丹; 汤泓; 陶琳; 钟小聪; 孙金玮
Original assignee: Harbin Institute of Technology
Current assignee: Harbin Institute of Technology
Priority date: 2023-01-13
Filing date: 2023-01-13
Publication date: 2023-04-14

Abstract

A method for monitoring error-related potentials in a brain-computer interface based on mutual information relates to methods for identifying error-related potentials and aims to solve the problem that existing brain-computer interfaces have a poor recognition rate for error-related potential signals. In the method, the original electroencephalogram signal is preprocessed and time-domain features are extracted by non-overlapping sliding-window analysis; frequency-domain features are extracted with the Welch method; the time-domain and frequency-domain features are combined, mutual information is used as the measure between the features and the correct and error classes, the mutual information between each feature and the classes is calculated, the features are ranked, and the top-ranked features are selected; an initial error-related potential classification is performed with a least squares support vector machine, leave-one-out cross-validation is performed on the samples, and an individual optimal model is obtained and retained to give the final error-related potential classification accuracy. The method has the advantage of improving the recognition accuracy of error-related potentials.

Description

Method for monitoring error-related potential in brain-computer interface based on mutual information quantity

Technical Field

The invention relates to a method for identifying error-related potentials.

Background

Brain-computer interface (BCI) technology decodes human intention and transmits it to external equipment, opening a promising path for effective communication between humans and intelligent machines. In recent years, BCI technology has been widely applied in human-computer interaction (HCI) tasks such as teaching demonstrations, technical performances, robotic device control, and specific industrial operations. In teaching, BCI can enhance the attention and memory of students. Some interactive games control a virtual object according to the user's intention, for example moving a cursor to a specified target. Intelligent equipment such as robots can complete specific tasks according to human instructions. Industrial exoskeleton mechanical arms can assist in processing, carrying, loading and unloading heavy objects and can perform repetitive fine operations, which greatly reduces the physical effort of workers and markedly improves production efficiency. Rehabilitation robots driven by the patient's electroencephalographic and electromyographic signals can take the place of a rehabilitation therapist in assisting the affected limb through high-intensity repetitive rehabilitation training. Besides executing the intended task through a preset program, whether the controlled object correctly performs the operation corresponding to the user's intention also affects the effectiveness of HCI.

The controlled virtual object or external device can generally move to the specified target or perform the desired operation according to the user's intention, which means that its machine intelligence is gradually approaching human biological intelligence. Sometimes, however, the virtual object or external device moves to the wrong target or performs an unexpected, erroneous operation, which means that machine intelligence has not yet reached the level of human biological intelligence. Such errors can interrupt teaching, industrial production and rehabilitation training and greatly degrade HCI performance. In hazardous operations, unexpected errors may even cause industrial accidents; in rehabilitation training, erroneous actions of the rehabilitation robot may even cause secondary injury to the patient. Furthermore, users tend to become careless during long tasks and through excessive reliance on external devices, which also hinders further execution of human-computer interaction tasks. It is therefore necessary to detect errors and abort them in time.

Engaging a supervisor to remind the operator of erroneously executed actions does help the human-computer interaction task run smoothly, but the user remains in a passive mindset and fully active participation cannot be achieved; moreover, repeated reminders easily distract the user and increase labor cost. To solve these problems, the user needs to detect observed errors in the HCI task in real time so that corrective intervention can be made promptly. The error-related potential (ErrP) meets this need. It has been demonstrated that when a human observes an erroneous action, an ErrP signal is produced unconsciously in the brain, without any prior training and regardless of the type of error. The ErrP signal is a normal human reaction when something external acts against the person's own intention. Its frequency content is mainly concentrated at 1-10 Hz, it tends to be stable, and it plays an important role in supervising specific tasks. ErrP-based brain-computer interface error monitoring can markedly improve the user's initiative and confidence in active participation, and lets the user concentrate better while the intended task is executed. As in a traditional BCI, the ErrP signal usually needs to be detected on its own and can be regarded as a single-source input to the BCI system: the current error is found by detecting the ErrP signal, and corrective measures are taken to prevent the erroneous operation from continuing, so that human biological intelligence plays its part in the brain-computer interface. The machine intelligence of the brain-computer interface and human biological intelligence are thus effectively combined into hybrid enhanced intelligence, which markedly improves the user experience in human-computer interaction tasks.

With the emergence of machine learning technology and powerful computing resources, the classification rate of neural signals has improved markedly, and integrating an ErrP-based automatic error-correction mechanism into BCI is expected to improve it further. However, the currently low ErrP classification rate poses a significant challenge. Because of it, some human-computer interaction tasks, such as the P300 speller, improve little when ErrP is integrated, and participants even find the auto-correction strategy more confusing and unpredictable. The resulting drop in overall performance leads some users to prefer the P300 speller without correction, because they see no benefit, and confidence in using ErrP-BCI gradually declines. It is therefore desirable to provide an algorithm for processing error-related potential signals that improves the recognition rate of ErrP signals and monitors brain-computer interface errors effectively.

Disclosure of Invention

The invention aims to solve the problem that the existing brain-computer interface has poor signal identification rate of error-related potentials, and provides a method for monitoring error-related potentials in the brain-computer interface based on mutual information quantity.

The invention relates to a method for monitoring error-related potential in a brain-computer interface based on mutual information content, which comprises the following steps:

step one, preprocessing an original electroencephalogram signal to obtain an original error-related potential signal;

step two, performing non-overlapping sliding-window analysis on the original error-related potential signal obtained in step one, extracting the mean absolute value of the signal in each sliding window, and taking the mean absolute values as the time-domain features of the original error-related potential signal;

step three, extracting features from the original error-related potential signal obtained in step one by using the Welch method to obtain the frequency-domain features of the original error-related potential signal;

step four, combining the time-domain features obtained in step two with the frequency-domain features obtained in step three to obtain combined features;

step five, using mutual information as the measure between the combined features of step four and the correct and error classes, calculating the mutual information between the combined features and the classes, sorting the features in descending order of mutual information, and obtaining and retaining an optimal model; and, on the basis of the optimal model, classifying the error-related potentials by using a least squares support vector machine.

The beneficial effects of the invention are as follows: the application provides a novel brain-computer interface error monitoring method based on the error-related potential (ErrP), in which the time-domain and frequency-domain features of the user's preprocessed real-time ErrP are extracted and combined, the mutual information between the features and the classes is calculated and ranked, features with larger mutual information are selected preferentially, classification is performed with a least squares support vector machine, leave-one-out cross-validation is performed on the samples, and an individual optimal model is obtained and retained. The method screens and optimizes the features, improves the recognition accuracy of ErrP, provides direction and strategy for subsequent human intervention, and brings human biological intelligence into play. The machine intelligence of the brain-computer interface and human biological intelligence are effectively combined into hybrid enhanced intelligence, making the human-computer interaction process more user-friendly.

Drawings

FIG. 1 is a flowchart of the method for monitoring error-related potentials in a brain-computer interface based on mutual information according to a first embodiment;

FIG. 2 is a schematic diagram of the process of screening the optimal number of features based on mutual information in the first embodiment;

FIG. 3 is a diagram of the cursor movement-direction error observed by a subject in the first embodiment;

FIG. 4 is a Venn diagram of mutual information in the first embodiment;

FIG. 5 is a diagram of the error-related potential waveforms of the subjects and the grand average waveform in the first embodiment;

FIG. 6 is a comparison of the classification accuracy of the FCz channel of each subject based on MAV, the Welch power spectrum, and the combination of the two features in the first embodiment;

FIG. 7 is a comparison of the classification accuracy of the Cz channel of each subject based on MAV, the Welch power spectrum, and the combination of the two features in the first embodiment.

Detailed Description

The present embodiment is described with reference to fig. 1 to 7, and the method for monitoring an error-related potential in a brain-computer interface based on mutual information amount according to the present embodiment includes the following steps:

step four, combining the time domain characteristics obtained in the step two with the frequency domain characteristics obtained in the step three to obtain combined characteristics;

In this embodiment, a public database is used: data set 22, "Monitoring error-related potentials", of the database of the EU project on the future of brain-machine interaction, Horizon 2020 (BNCI Horizon 2020). The subject sits in front of a computer screen and is asked to observe whether a cursor moves towards the correct target while the subject's electroencephalogram signals are recorded. A moving cursor (a gray square) is displayed on the screen. A black square to the left or right of the cursor marks the target position, and a dashed square marks the previous position of the cursor, as shown in FIG. 3. In each trial, the cursor moves horizontally according to the position of the target. Each trial lasts approximately 2000 milliseconds. Once the target is reached, the cursor stays in place and a new target position appears. During the experiment, the user is asked to monitor the performance of the system and to assess whether it is working properly. To investigate the signal elicited by erroneous actions, in each trial the cursor moves in the wrong direction (i.e., away from the target position) with a certain probability. An ErrP occurs when the subject notices that the cursor has moved in the wrong direction. The data set contains 64-channel ErrP signals covering the whole cerebral cortex, recorded from 6 subjects according to the international 10-20 system at a sampling rate of 512 Hz. The invention selects only the data with an error probability of 20% and analyzes the ErrP signals of each subject individually. Each subject was recorded in two sessions, separated by a number of days that differed between subjects. The invention uses the data of the first session for training and building the individual optimal model, and the data of the second session for testing.
Because the original data set is large and the class proportions are unbalanced, the randperm function is used to randomly permute the majority-class samples without repetition, and as many majority-class samples as there are minority-class samples are drawn at random, so that the data are balanced.
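The balancing step described above can be sketched as follows. This is a minimal illustration only: it assumes NumPy instead of the randperm function named in the text, and the function name `balance_classes`, the seed, and the toy data are hypothetical.

```python
import numpy as np

def balance_classes(X, y, seed=0):
    """Randomly undersample so every class has as many trials as the smallest class."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    keep = []
    for c in classes:
        idx = np.flatnonzero(y == c)
        # permutation() shuffles without repetition, in the spirit of randperm
        keep.append(rng.permutation(idx)[:n_min])
    keep = np.sort(np.concatenate(keep))
    return X[keep], y[keep]

# Toy data (hypothetical): 80 "correct" trials (label 0), 20 "error" trials (label 1)
X = np.arange(100 * 3, dtype=float).reshape(100, 3)
y = np.array([0] * 80 + [1] * 20)
Xb, yb = balance_classes(X, y)
```

After the call, `Xb` holds 20 trials of each class; which majority-class trials survive depends on the seed.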

In a preferred embodiment, the specific method for preprocessing the original electroencephalogram signal in step one to obtain the original error-related potential signal is as follows: the original electroencephalogram signal is processed sequentially by a common average reference method, median filtering, FIR band-pass filtering, independent component analysis, anti-aliasing filtering, down-sampling, and sliding-window segmentation; wherein the common average reference method is: subtracting the mean of the original 64-channel EEG signals to remove the noise common to all electrodes, and then taking the signals of the 9 central electrode channels over the central cortical area, where the error-related potential is located;

the median filtering method uses a median filter with a window length of 201 to remove baseline drift;

the FIR band-pass filtering method is: band-pass filtering the data with a 1-10 Hz finite impulse response (FIR) filter.

In the present embodiment, the average of the 64 channel signals is first subtracted by the Common Average Reference (CAR) method, which removes the noise common to all electrodes and thus acts as a spatial filter. Since error-related activity is associated with the central cortical region, 9 central electrode channels (FC1, FCz, FC2, C1, Cz, C2, CP1, CPz, CP2) are selected for further analysis, which also reduces dimensionality. This channel selection also helps to exclude electrodes that may be contaminated by eye-blink and muscle artifacts.
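A minimal sketch of the common average reference and the channel selection, assuming the recording is held as a NumPy array of shape (channels, samples). The channel ordering, the placeholder labels, and the function name `car_and_select` are illustrative assumptions, not part of the data set.

```python
import numpy as np

# The nine central channels named in the text
CENTRAL = ["FC1", "FCz", "FC2", "C1", "Cz", "C2", "CP1", "CPz", "CP2"]

def car_and_select(eeg, labels, keep=CENTRAL):
    """eeg: (n_channels, n_samples). Subtract the across-channel mean, keep selected rows."""
    referenced = eeg - eeg.mean(axis=0, keepdims=True)   # common average reference
    rows = [labels.index(name) for name in keep]
    return referenced[rows]

rng = np.random.default_rng(0)
# Hypothetical label list: the 9 central channels plus 55 placeholder names
labels = CENTRAL + [f"ch{i}" for i in range(55)]
common_noise = rng.normal(size=(1, 1024))                # noise shared by all electrodes
eeg = rng.normal(size=(64, 1024)) + 5.0 * common_noise
central = car_and_select(eeg, labels)
```

Because the shared noise term is identical on every channel, subtracting the channel mean cancels it before the nine rows are picked out.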

A median filter with a window length of 201 is used to remove baseline drift; the data are then band-pass filtered with a 1-10 Hz Finite Impulse Response (FIR) filter, using a Hamming window of length 50 as the window function. Finally, ocular artifacts are removed by Independent Component Analysis (ICA) and the ErrP signal is reconstructed. The FastICA algorithm proposed by Hyvärinen, also called the fixed-point algorithm, is a fast iterative optimization algorithm that works in batch mode.
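The baseline removal and band-pass steps might look like the sketch below, which uses SciPy. Two points are assumptions rather than statements of the text: the "window length 50" is read as the number of FIR taps, and the filter is applied forward and backward (zero phase) with `filtfilt`. The ICA step is not shown.

```python
import numpy as np
from scipy.signal import medfilt, firwin, filtfilt

FS = 512          # sampling rate of the data set (Hz)
NUM_TAPS = 50     # assumption: "window length 50" taken as the FIR tap count

def remove_baseline(x, kernel=201):
    """Subtract a running median (odd kernel length) to suppress slow drift."""
    return x - medfilt(x, kernel_size=kernel)

def bandpass_1_10(x, fs=FS, numtaps=NUM_TAPS):
    """Hamming-windowed 1-10 Hz FIR band-pass; zero-phase application is a choice made here."""
    taps = firwin(numtaps, [1.0, 10.0], pass_zero=False, window="hamming", fs=fs)
    return filtfilt(taps, [1.0], x)

t = np.arange(4 * FS) / FS
drift = 20.0 * t                        # slow linear baseline drift
wave = np.sin(2 * np.pi * 5.0 * t)      # 5 Hz component inside the pass band
clean = bandpass_1_10(remove_baseline(drift + wave))
```

With so few taps at 512 Hz the transition bands are wide, so this filter is only a coarse approximation of an ideal 1-10 Hz pass band.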

Before down-sampling, the reconstructed ErrP is filtered with an order-82 FIR filter to prevent frequency aliasing. After down-sampling, a fixed-length ErrP time window in each trial is selected for analysis. A non-overlapping sliding-window analysis is applied to this fixed-length signal, and the Mean Absolute Value (MAV) of each window is extracted as a time-domain feature.
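A hedged sketch of the anti-aliasing, down-sampling and epoching steps. The target rate of 128 Hz and the window from -50 ms to 700 ms are the values given in the experimental results of this document; the use of `scipy.signal.decimate`, the zero-phase setting, the onset index, and the function names are assumptions made for illustration.

```python
import numpy as np
from scipy.signal import decimate

FS_IN, FS_OUT = 512, 128      # rates given in the document
Q = FS_IN // FS_OUT           # decimation factor 4

def downsample(x):
    """Order-82 FIR low-pass followed by keeping every 4th sample."""
    return decimate(x, Q, n=82, ftype="fir", zero_phase=True)

def epoch(x, onset, fs=FS_OUT, tmin=-0.05, tmax=0.70):
    """Cut the window [tmin, tmax) seconds around a feedback onset (sample index)."""
    start = onset + int(round(tmin * fs))
    stop = onset + int(round(tmax * fs))
    return x[..., start:stop]

x = np.random.default_rng(0).normal(size=(9, 4 * FS_IN))   # 9 channels, 4 s of toy data
x128 = downsample(x)
trial = epoch(x128, onset=128)                              # hypothetical onset at 1 s
```

At 128 Hz this window spans 96 samples per channel.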

In a preferred embodiment, the specific method for performing non-overlapping sliding window analysis on the original error-related potential signal in the step two and extracting the average absolute value of each sliding window signal is as follows:

let $x_u$ be the u-th sample point of the original error-related potential signal x, and N be the signal length (here, the number of samples in one sliding window); the mean absolute value reflects how far the valid signal samples deviate from the origin (zero); the length of the sliding window is 7, each trial has 12 sliding windows, and 12 time-domain features are obtained per channel; with the 9 selected central electrode channels, 9 × 12 = 108-dimensional features are extracted from each experimental sample;

the mean absolute value is expressed as shown in (1):

$$\mathrm{MAV} = \frac{1}{N}\sum_{u=1}^{N}\left|x_u\right| \tag{1}$$

where MAV is the mean absolute value.
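The MAV feature extraction can be illustrated with a short NumPy sketch, assuming a trial stored as an array of shape (channels, samples). The function name `mav_features` and the random toy trial are hypothetical; the window length 7 and the 12 windows per trial are taken from the text.

```python
import numpy as np

def mav_features(trial, win_len=7, n_windows=12):
    """trial: (n_channels, n_samples). Returns n_channels * n_windows mean absolute values."""
    n_ch = trial.shape[0]
    used = trial[:, : win_len * n_windows]                 # samples covered by the windows
    windows = used.reshape(n_ch, n_windows, win_len)       # non-overlapping windows
    return np.abs(windows).mean(axis=2).reshape(-1)

trial = np.random.default_rng(0).normal(size=(9, 96))      # toy trial, 9 channels
feats = mav_features(trial)                                # 9 x 12 = 108 features
```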

In this embodiment, an overlapping sliding-window analysis is also applied to the fixed-length signal. Since ErrP is concentrated in 1-10 Hz, which covers the delta (0.5-4 Hz), theta (4-8 Hz) and alpha1 (8-10 Hz) bands, the Welch power spectral features of each of these 3 bands are first extracted and classified separately; the Welch power spectral features of the whole 1-10 Hz band are then extracted and classified, and the contributions of the frequency-domain features from the different bands to the single-channel, single-feature classification rate are compared. A Hamming window is used as the window function, with a window length of 64 and an overlap length of 32.

In a preferred embodiment, the specific method for acquiring the frequency domain characteristics of the original error-related potential signal in step three is as follows:

the original error-related potential signal is first written as data x(n) of length N, n = 0, 1, …, N−1, and divided into L segments of M samples each; m is the index of the m-th sample point within a segment, and the t-th segment of data is expressed as:

$$x_t(m) = x(m + tM - M),\quad 0 \le m \le M-1,\ 1 \le t \le L \tag{2}$$

wherein $x_t(m)$ is the t-th segment of data;

a window function w(m) is applied to each segment of data to obtain the periodogram of each segment, the periodogram of the t-th segment being:

$$I_t(\omega) = \frac{1}{MU}\left|\sum_{m=0}^{M-1} x_t(m)\,w(m)\,e^{-j\omega m}\right|^2 \tag{3}$$

$$U = \frac{1}{M}\sum_{m=0}^{M-1} w^2(m) \tag{4}$$

wherein $I_t(\omega)$ is the periodogram of the t-th segment of data; U is a normalization factor; j is the imaginary unit; ω is the frequency;

the periodograms of the segments are regarded as mutually independent, and the power spectral density is:

$$P_{xx}(e^{j\omega}) = \frac{1}{L}\sum_{t=1}^{L} I_t(\omega) \tag{5}$$

wherein $P_{xx}(e^{j\omega})$ is the power spectral density; L is the total number of segments of the data x(n);

257 features are available in each test run, and 9 central electrode channels are preferentially selected for analysis, so that 9 x 257=2313 dimensional features are extracted from each experimental sample;

the power spectral density is a frequency domain characteristic.
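As a rough illustration of the frequency-domain features, the sketch below calls `scipy.signal.welch` with the Hamming window, segment length 64 and overlap 32 mentioned in this embodiment. The FFT length of 512 is an assumption chosen because it yields 257 one-sided frequency bins; the 128 Hz rate is the down-sampled rate given in the experimental results, and the toy trial is random data.

```python
import numpy as np
from scipy.signal import welch

FS = 128  # sampling rate after down-sampling (Hz)

def welch_features(trial, fs=FS):
    """trial: (n_channels, n_samples). Returns frequencies and per-channel PSD estimates."""
    f, pxx = welch(trial, fs=fs, window="hamming",
                   nperseg=64, noverlap=32,
                   nfft=512,          # assumption: gives 512/2 + 1 = 257 bins
                   axis=-1)
    return f, pxx

trial = np.random.default_rng(0).normal(size=(9, 96))   # toy trial, 9 channels
f, pxx = welch_features(trial)
features = pxx.reshape(-1)                               # 9 x 257 = 2313 values
band = (f >= 1.0) & (f <= 10.0)                          # mask for the 1-10 Hz band
```

The `band` mask shows one way to restrict the spectrum to a sub-band such as 1-10 Hz before classification.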

In this embodiment, leave-one-out cross-validation is first performed with the LS-SVM on the MAV feature vector and the Welch power spectrum feature vector of each channel separately, and the single-channel, single-feature ErrP classification accuracies are obtained and compared. The MAV and power spectrum features are then combined, so that each trial has 9 × (12 + 257) = 2421 features; leave-one-out cross-validation with the LS-SVM is performed again, and the single-channel, dual-feature classification rates are compared one by one.

The Support Vector Machine (SVM) is a pattern recognition method developed on the basis of the Statistical Learning Theory (SLT) established by Vapnik et al. in the 1990s. Like SLT, the SVM method is a statistical learning method designed for small samples; it effectively overcomes shortcomings of neural network methods such as difficult convergence, unstable solutions and poor generalization, and it can still achieve a good classification result when the number of training samples is small. SVMs are now widely used in pattern recognition, signal processing, communication, and other fields.

In classification learning, traditional pattern recognition methods reduce the dimensionality of the data, whereas the SVM does the opposite: it maps the samples in the feature space into a high-dimensional space and searches for the optimal separating hyperplane of the samples in that space.

In the preferred embodiment, the band classification in the fifth step includes a delta band, a theta band and an alpha1 band;

the range of the delta frequency band is 0.5Hz-4Hz;

the theta frequency range is 4Hz-8Hz;

the range of the alpha1 frequency band is 8Hz-10Hz.

In a preferred embodiment, the specific method for classifying the combined features in the fifth step is as follows: classifying the combined features by using an optimal classification function;

let the training set be $(x_i, y_i)$, i = 1, 2, …, n, containing n training samples, where $x_i$ represents the input of the training model and $y_i$ represents the output of the training model; the training set has a linear discriminant function $g(x_i) = w^T x_i + b$, where w is the normal vector of the hyperplane and b is the intercept of the linear discriminant function; the problem of finding the optimal classification surface is expressed as: under the constraint $y_i(w^T x_i + b) - 1 \ge 0$, minimize $\|w\|^2/2$;

the original Lagrange function is defined as shown in (6):

$$L(w,b,\alpha) = \frac{1}{2}w^T w - \sum_{i=1}^{n}\alpha_i\left[y_i(w^T x_i + b) - 1\right] \tag{6}$$

where L(w, b, α) is the value of the original Lagrange function, $\alpha_i$ is a Lagrange coefficient with $\alpha_i \ge 0$; w is the weight vector and $w^T$ is its transpose; b is the intercept of the linear discriminant function;

setting the partial derivatives of the above formula with respect to w and b to zero converts the problem into the dual quadratic programming problem: under the constraints

$$\sum_{i=1}^{n}\alpha_i y_i = 0$$

and $\alpha_i \ge 0$, find the maximum of the optimization function; to distinguish $w^T$ from w, $w^T$ is written as $\sum_{i=1}^{n}\alpha_i y_i x_i^T$ and w is written as $\sum_{j=1}^{n}\alpha_j y_j x_j$, where $\alpha_j$ is another Lagrange coefficient distinct from $\alpha_i$, $y_j$ is the output of another training sample distinct from $y_i$, and $x_j$ is the input of another training sample distinct from $x_i$:

$$Q(\alpha) = \sum_{i=1}^{n}\alpha_i - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j y_i y_j\, x_i^T x_j \tag{7}$$

wherein Q(α) is the optimization function;

the above formula has a unique optimal solution under the inequality constraints, which satisfies:

$$\alpha_i\left[y_i(w^T x_i + b) - 1\right] = 0,\quad i = 1, 2, \dots, n \tag{8}$$

if $\alpha_i^*$ is the optimal solution of equation (7), then $w^* = \sum_{i=1}^{n}\alpha_i^* y_i x_i$; for most samples $\alpha_i^*$ is 0, and the samples whose $\alpha_i^*$ is not 0 are the support vectors;

the optimal classification function is then:

$$f(x) = \operatorname{sgn}\left(\sum_{i=1}^{n}\alpha_i^* y_i\, x_i^T x + b^*\right) \tag{9}$$

wherein $b^*$ is the optimal solution of the intercept of the linear discriminant function;

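in the non-linear case, a mapping $\varphi(\cdot)$ is used to map the linearly inseparable samples $x_i$ into a high-dimensional space, where they become linearly separable. The optimization function is then:

$$Q(\alpha) = \sum_{i=1}^{n}\alpha_i - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j y_i y_j \left\langle \varphi(x_i), \varphi(x_j) \right\rangle \tag{10}$$

wherein $\langle \varphi(x_i), \varphi(x_j) \rangle$ is the inner product of $\varphi(x_i)$ and $\varphi(x_j)$;

thus, the optimal classification function is:

$$f(x) = \operatorname{sgn}\left(\sum_{i=1}^{n}\alpha_i^* y_i \left\langle \varphi(x_i), \varphi(x) \right\rangle + b^*\right) \tag{11}$$

when the classification is carried out in the high-dimensional space, a kernel function is used to reduce the dimensionality of the computation of the optimal classification function;

the kernel function K(·) is defined as:

$$K(x_i, x_j) = \left\langle \varphi(x_i), \varphi(x_j) \right\rangle \tag{12}$$

the optimization function transforms to:

$$Q(\alpha) = \sum_{i=1}^{n}\alpha_i - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j y_i y_j K(x_i, x_j) \tag{13}$$

the corresponding optimal classification function is:

$$f(x) = \operatorname{sgn}\left(\sum_{i=1}^{n}\alpha_i^* y_i K(x_i, x) + b^*\right) \tag{14}$$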

In a preferred embodiment, the kernel function includes:

(a) Polynomial kernel function

$$K(x_i, x_j) = \left(x_i^T x_j + 1\right)^q \tag{15}$$

In this case, the SVM is a q-order polynomial classifier;

(b) Radial basis kernel function

$$K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2\mu^2}\right) \tag{16}$$

$\|x_i - x_j\|^2$ can be regarded as the squared Euclidean distance between two feature vectors, and μ is a free parameter. In this case, the SVM is a radial basis function classifier; it differs from the conventional radial basis function method in that the center of each basis function corresponds to a support vector;

(c) Sigmoid kernel function

$$K(x_i, x_j) = \tanh\left(\beta_0\, x_i^T x_j + \beta_1\right) \tag{17}$$

wherein $\beta_0$ is the slope and $\beta_1$ is the intercept.

In addition, the kernel functions include a numerical radial kernel function, a fourier series, a spline function, and a B-spline function.
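The three kernel families named above can be written compactly in NumPy. This is a sketch under stated assumptions: the polynomial and sigmoid kernels are taken in their common textbook forms, (x_i^T x_j + 1)^q and tanh(β₀ x_i^T x_j + β₁), and the default parameter values are arbitrary placeholders.

```python
import numpy as np

def poly_kernel(xi, xj, q=2):
    """Polynomial kernel of order q (common form, assumed here)."""
    return (xi @ xj + 1.0) ** q

def rbf_kernel(xi, xj, mu=1.0):
    """Radial basis kernel exp(-||xi - xj||^2 / (2 mu^2)), mu being the free parameter."""
    d2 = np.sum((xi - xj) ** 2)          # squared Euclidean distance
    return np.exp(-d2 / (2.0 * mu ** 2))

def sigmoid_kernel(xi, xj, beta0=0.1, beta1=0.0):
    """Sigmoid kernel tanh(beta0 * <xi, xj> + beta1) (common form, assumed here)."""
    return np.tanh(beta0 * (xi @ xj) + beta1)

a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
k_poly = poly_kernel(a, a)      # (1 + 1)^2
k_rbf = rbf_kernel(a, b)        # exp(-2 / 2)
k_sig = sigmoid_kernel(a, b)    # tanh(0)
```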

In a preferred embodiment, the specific process of using the mutual information as the measure between the combined features and the optimal classification in the step six is as follows:

mutual information measures the degree of association between the feature items and the classes, and the mutual information of two discrete variables X and Y is defined as follows:

$$I(X;Y) = \sum_{x \in X}\sum_{y \in Y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)} \tag{18}$$

wherein p(x, y) is the joint probability distribution function of X and Y, and p(x) and p(y) are the marginal probability distribution functions of X and Y, respectively; the mutual information of two random variables is a measure of the interdependence between the variables; mutual information is equivalently expressed as:

$$I(X;Y) = H(X) - H(X \mid Y) = H(Y) - H(Y \mid X) = H(X) + H(Y) - H(X,Y) \tag{19}$$

wherein H(X) and H(Y) are the marginal entropies, H(X|Y) and H(Y|X) are the conditional entropies, and H(X, Y) is the joint entropy of X and Y; these relationships are shown by the Venn diagram in FIG. 4;
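To make the definition concrete, here is a small NumPy sketch that estimates the mutual information between one feature and the class labels and then ranks features by it. The text defines mutual information for discrete variables and does not say how continuous features are discretized; the equal-width histogram with 8 bins, the function names, and the synthetic data are assumptions.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """I(X;Y) in nats between a feature x (discretized into bins) and class labels y."""
    edges = np.histogram_bin_edges(x, bins=bins)
    xd = np.digitize(x, edges[1:-1])                 # bin index in 0 .. bins-1
    classes, yd = np.unique(y, return_inverse=True)
    joint = np.zeros((bins, classes.size))
    np.add.at(joint, (xd, yd), 1.0)                  # joint histogram of (bin, class)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)              # marginal of X
    py = pxy.sum(axis=0, keepdims=True)              # marginal of Y
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def rank_features(X, y, bins=8):
    """Return feature indices sorted by descending mutual information, plus the values."""
    mi = np.array([mutual_information(X[:, j], y, bins) for j in range(X.shape[1])])
    return np.argsort(mi)[::-1], mi

rng = np.random.default_rng(0)
y = np.repeat([0, 1], 50)
X = np.column_stack([rng.normal(size=100),           # uninformative feature
                     y + 0.1 * rng.normal(size=100), # feature that tracks the class
                     rng.normal(size=100)])          # uninformative feature
order, mi = rank_features(X, y)
```

In this toy example the class-tracking column is ranked first. A histogram estimate is biased upward for small samples, so the absolute values should be read with care.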

The mutual information between each feature and the classes is then calculated and sorted in descending order, features with larger mutual information are retained preferentially, and the single-channel joint classification accuracy is obtained after leave-one-out cross-validation with the LS-SVM. Finally, the channels with better classification results are selected, their features are combined and screened by mutual information, and the individual optimal model is retained after leave-one-out cross-validation with the LS-SVM, giving the final ErrP classification accuracy. Specifically, the N features are arranged in descending order of mutual information; for each k, the 1st to k-th features are selected and classified with the least squares support vector machine on the basis of the individual optimal model, leave-one-out cross-validation is performed on the samples, and the value of k at which the classification accuracy $A_k$ is highest is retained, as shown in FIG. 2. This helps to issue the corresponding instruction, stop the erroneous movement of the cursor, and enable timely intervention.

In a preferred embodiment, the specific method for obtaining and retaining the optimal model in the sixth step is as follows:

specifically, the N features are arranged in descending order of mutual information; for each k, the 1st to k-th features are selected and classified with the least squares support vector machine on the basis of the individual optimal model, leave-one-out cross-validation is performed on the samples, and the value of k at which the classification accuracy $A_k$ is highest is retained, as shown in FIG. 2; this helps to issue the corresponding instruction, stop the erroneous movement of the cursor, and enable timely intervention.
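The search for the best k can be sketched as a loop over feature counts with leave-one-out cross-validation. Note the substitutions: to keep the example short, a nearest-centroid classifier stands in for the least squares support vector machine, the feature columns are assumed to be already sorted by descending mutual information, and all names and the synthetic data are hypothetical.

```python
import numpy as np

def nearest_centroid_predict(X_train, y_train, x_test):
    """Stand-in classifier (NOT the LS-SVM of the text): label of the closest class mean."""
    classes = np.unique(y_train)
    centroids = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    return classes[np.argmin(np.linalg.norm(centroids - x_test, axis=1))]

def loo_accuracy(X, y, predict=nearest_centroid_predict):
    """Leave-one-out cross-validation: each sample is tested once on a model of the rest."""
    n = len(y)
    hits = 0
    for i in range(n):
        mask = np.arange(n) != i
        hits += int(predict(X[mask], y[mask], X[i]) == y[i])
    return hits / n

def best_k(X_ranked, y, k_max):
    """X_ranked: columns already sorted by descending mutual information."""
    acc = [loo_accuracy(X_ranked[:, :k], y) for k in range(1, k_max + 1)]   # A_1 .. A_kmax
    return int(np.argmax(acc)) + 1, acc

rng = np.random.default_rng(0)
y = np.repeat([-1, 1], 20)
informative = y[:, None] * 1.0 + 0.5 * rng.normal(size=(40, 2))   # two useful features
noise = rng.normal(size=(40, 8))                                   # eight redundant ones
X_ranked = np.hstack([informative, noise])
k_star, acc = best_k(X_ranked, y, k_max=10)
```

`k_star` is the smallest k that reaches the highest leave-one-out accuracy; swapping `predict` for an LS-SVM routine would bring the sketch closer to the procedure described in the text.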

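In a preferred embodiment, the specific method for classifying the error-related potential by using the least squares support vector machine in step five is as follows: the least squares support vector machine uses a least-squares (sum of squared errors) loss function, which replaces the quadratic programming problem of the classic support vector machine algorithm with the solution of a system of linear equations;

the objective function of the least squares support vector machine algorithm is:

$$\min_{w,b,e}\ J(w,e) = \frac{1}{2}w^T w + \frac{\gamma}{2}\sum_{i=1}^{n} e_i^2 \tag{20}$$

constraint conditions:

$$y_i = w^T\varphi(x_i) + b + e_i,\quad i = 1, 2, \dots, n \tag{21}$$

wherein w is the weight vector; γ is the regularization parameter; b is the intercept of the linear discriminant function; $e_i$ is the error;

the Lagrange function is updated as:

$$L(w,b,e,\alpha) = J(w,e) - \sum_{i=1}^{n}\alpha_i\left[w^T\varphi(x_i) + b + e_i - y_i\right] \tag{22}$$

taking the partial derivatives with respect to w, b, $e_i$ and $\alpha_i$ and setting them to zero gives:

$$\frac{\partial L}{\partial w} = 0 \Rightarrow w = \sum_{i=1}^{n}\alpha_i\varphi(x_i),\qquad \frac{\partial L}{\partial b} = 0 \Rightarrow \sum_{i=1}^{n}\alpha_i = 0,\qquad \frac{\partial L}{\partial e_i} = 0 \Rightarrow \alpha_i = \gamma e_i,\qquad \frac{\partial L}{\partial \alpha_i} = 0 \Rightarrow w^T\varphi(x_i) + b + e_i - y_i = 0 \tag{23}$$

rearranging formula (23) gives:

$$\begin{bmatrix} 0 & l^T \\ l & \Omega + \gamma^{-1} I \end{bmatrix}\begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix} \tag{24}$$

wherein $l = [1, 1, \dots, 1]^T$, $\Omega_{ij} = K(x_i, x_j)$, i, j = 1, 2, …, n, I is the identity matrix, $\alpha = [\alpha_1, \alpha_2, \dots, \alpha_n]^T$, $y = [y_1, y_2, \dots, y_n]^T$;

let

$$A = \Omega + \gamma^{-1} I \tag{25}$$

solving the matrix equation, with $b = \dfrac{l^T A^{-1} y}{l^T A^{-1} l}$ following from $l^T\alpha = 0$, gives:

$$\alpha = A^{-1}(y - b\,l) \tag{26}$$

for an unknown sample x, the predicted value of the least squares support vector machine is:

$$\hat{y}(x) = \operatorname{sgn}\left(\sum_{i=1}^{n}\alpha_i K(x, x_i) + b\right) \tag{27}$$

for each classification, a kernel function is first selected; the optimal regularization parameter γ and the optimal kernel parameter σ are then trained using the training data and the corresponding labels, after which the Lagrange coefficients $\alpha_i$ and the intercept b of the linear discriminant function are trained; an individual optimal model is generated from the Lagrange coefficients $\alpha_i$ and the intercept b, the individual optimal model is obtained and retained, and the error-related potentials are classified by the individual optimal model.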
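A compact sketch of a least squares support vector machine trained by solving a linear system, in line with the relation α = A⁻¹(y − b·l) given in the text. Several details are assumptions: the RBF kernel with parameter `sigma`, the form A = Ω + I/γ with Ω the kernel matrix, the closed-form intercept derived from the constraint that the coefficients sum to zero, the use of the sign of the output as the class label, and fixed rather than tuned values of γ and σ.

```python
import numpy as np

def rbf_gram(A, B, sigma):
    """Kernel matrix K[i, j] = exp(-||A_i - B_j||^2 / (2 sigma^2))."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Solve the LS-SVM linear system for the coefficients alpha and the intercept b."""
    n = len(y)
    A = rbf_gram(X, X, sigma) + np.eye(n) / gamma     # assumed form of the matrix A
    ones = np.ones(n)
    Ainv_y = np.linalg.solve(A, y)
    Ainv_1 = np.linalg.solve(A, ones)
    b = (ones @ Ainv_y) / (ones @ Ainv_1)             # enforces sum(alpha) = 0
    alpha = Ainv_y - b * Ainv_1                       # alpha = A^{-1} (y - b l)
    return alpha, b

def lssvm_predict(X_train, alpha, b, X_new, sigma=1.0):
    """Class label as the sign of the kernel expansion plus intercept."""
    return np.sign(rbf_gram(X_new, X_train, sigma) @ alpha + b)

rng = np.random.default_rng(0)
y = np.repeat([-1.0, 1.0], 20)
X = y[:, None] * 2.0 + 0.3 * rng.normal(size=(40, 2))   # two well-separated toy clusters
alpha, b = lssvm_train(X, y)
pred = lssvm_predict(X, alpha, b, np.array([[2.0, 2.0], [-2.0, -2.0]]))
```

In practice γ and σ would be chosen by a search such as the cross-validation described in the text rather than fixed as here.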

The experimental results are as follows:

Taking the FCz channel as an example, FIG. 5 shows the error-related potential waveforms of all subjects and the grand average waveform. The thin lines represent the ErrPs of the six subjects and the thick line is the grand average waveform. The ErrP is strongly time-locked: about 200 ms after the error is observed, a first positive peak appears, followed by a large negative deflection near 255 ms and a second, large positive peak at about 340 ms.

After the reconstructed ErrP is down-sampled to 128 Hz, the ErrP signal segment between -50 ms and 700 ms in each trial is selected for analysis. When the model is trained, an RBF kernel function is selected, the optimal regularization parameter γ and the optimal kernel parameter σ are trained using the training data and the corresponding labels, and the Lagrange coefficients α and the intercept b of the linear discriminant function are then trained. An individual optimal model is formed from these four key parameters and used to predict the labels of the test data.

The comparison shows that Welch power spectrum features extracted from the delta band, the theta band, and the whole 1-10 Hz band give higher classification accuracy than those extracted from the alpha1 band. It was also found that, of the 9 channels, the FCz and Cz channels have the higher classification rates. Further analysis was therefore carried out with the two central electrode channels FCz and Cz. FIG. 6 and FIG. 7 show the recognition rates of FCz and Cz, respectively, for the single-channel mean absolute value, the single-channel Welch power spectrum, and the combination of the two features.

The classification performance of the FCz channel is generally better than that of the Cz channel, and the classification accuracy based on the time-domain MAV feature is higher than that based on the frequency-domain Welch power spectrum feature. Comparing single features with the feature combination shows that a larger number of features is not necessarily more beneficial to classification: some features are redundant and can even lower the classification rate.

The time-domain and frequency-domain features of FCz and Cz are then combined for dual-channel, dual-feature joint classification, giving 2 × (12 + 257) = 538 features per trial. Finally, the mutual information between each feature and the category is calculated and the features are sorted in descending order of mutual information; the first k features with the largest mutual information are classified with a least squares support vector machine for each candidate k in turn, leave-one-out cross-validation is performed on the samples, and the k value giving the highest classification accuracy is retained. The k values of the six subjects were 8, 11, 7, 6, 8 and 9, respectively, and the results are shown in Table 1. As with the single-channel time-frequency feature combination, simply stacking multi-channel features does not significantly improve the classification rate; on the contrary, it significantly prolongs the running time and increases the amount and complexity of computation. By contrast, the features ranked and screened by mutual information clearly improve the joint classification rate of the ErrP signals and aid ErrP identification.
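The ranking and top-k search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: mutual information is estimated with a simple equal-width histogram (the estimator is not specified in the text), and a nearest-class-mean classifier stands in for the least squares support vector machine to keep the example short. All function names are hypothetical, and each class is assumed to contain at least two samples.

```python
import math
from collections import Counter


def mi_feature_label(values, labels, n_bins=4):
    """Histogram estimate (bits) of I(feature; label) with equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # constant feature: one bin
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
    n = len(values)
    joint, pb, pl = Counter(zip(bins, labels)), Counter(bins), Counter(labels)
    return sum(c / n * math.log2(c * n / (pb[b] * pl[l]))
               for (b, l), c in joint.items())


def rank_features(X, y):
    """Feature indices sorted by decreasing mutual information with y."""
    n_feat = len(X[0])
    scores = [mi_feature_label([row[f] for row in X], y) for f in range(n_feat)]
    return sorted(range(n_feat), key=lambda f: scores[f], reverse=True)


def loo_accuracy(X, y, feats):
    """Leave-one-out accuracy of a nearest-class-mean stand-in classifier."""
    correct = 0
    for i in range(len(X)):
        best_label, best_dist = None, float("inf")
        for label in sorted(set(y)):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            mean = [sum(r[f] for r in rows) / len(rows) for f in feats]
            dist = sum((X[i][f] - m) ** 2 for f, m in zip(feats, mean))
            if dist < best_dist:
                best_label, best_dist = label, dist
        correct += best_label == y[i]
    return correct / len(X)


def select_top_k(X, y):
    """Try k = 1..n_features on the ranked list; keep the best k (smallest on ties)."""
    order = rank_features(X, y)
    results = [(loo_accuracy(X, y, order[:k]), -k) for k in range(1, len(order) + 1)]
    acc, neg_k = max(results)
    return order[:-neg_k], acc


# Toy data: feature 0 separates the classes, feature 1 is noise, feature 2 is constant
y = [0, 0, 0, 0, 1, 1, 1, 1]
X = [[0.1, 5, 1], [0.2, 1, 1], [0.0, 4, 1], [0.3, 2, 1],
     [1.1, 5, 1], [0.9, 1, 1], [1.0, 4, 1], [1.2, 2, 1]]
selected, accuracy = select_top_k(X, y)
```

Breaking ties in favor of the smaller k is a design choice of this sketch; it reflects the observation above that additional features add computation without necessarily improving accuracy.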

Table 1. Classification accuracy of each subject: dual-channel dual-feature combination compared with features screened by mutual information


The above description covers only specific embodiments of the present invention, and the protection scope of the present invention is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims

1. A monitoring method for error-related potential in a brain-computer interface based on mutual information quantity is characterized by comprising the following steps:

2. The method for monitoring the error-related potential in the brain-computer interface based on mutual information quantity according to claim 1, wherein the specific method for preprocessing the original electroencephalogram signal in step one to obtain the original error-related potential signal comprises: processing the original electroencephalogram signal by sequentially applying a common average reference method, median filtering, FIR band-pass filtering, independent component analysis, anti-aliasing filtering, down-sampling and a sliding window; wherein the common average reference method is: removing the noise common to the electrodes by means of the mean value of the original 64-channel EEG signals, to obtain the signals of the 9 central electrode channels of the central cortical area for the error-related potential signal;

3. The method for monitoring the error-related potential in the brain-computer interface based on mutual information according to claim 1, wherein the specific method for performing non-overlapping sliding window analysis on the original error-related potential signal and extracting the average absolute value of each sliding window signal in the second step is as follows:

let $x_u$ be the u-th sample point of the original error-related potential signal x and N the signal length; the mean absolute value is then given by (1):

$$\mathrm{MAV} = \frac{1}{N}\sum_{u=1}^{N}\left|x_u\right| \tag{1}$$

where MAV is the mean absolute value.
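The mean absolute value over non-overlapping sliding windows can be sketched as follows. The function name `window_mav` is hypothetical, and the window length of 8 samples is an assumption chosen so that a 96-sample epoch yields 12 values per channel, matching the 12 time-domain features mentioned in the description.

```python
def window_mav(signal, window_len):
    """Mean absolute value of each non-overlapping window of `signal`."""
    if window_len <= 0:
        raise ValueError("window_len must be positive")
    n_windows = len(signal) // window_len  # a trailing partial window is dropped
    mav = []
    for k in range(n_windows):
        window = signal[k * window_len:(k + 1) * window_len]
        mav.append(sum(abs(v) for v in window) / window_len)
    return mav


# Synthetic 96-sample epoch with alternating sign, split into 12 windows of 8 samples
epoch = [(-1) ** u * (u % 8) for u in range(96)]
features = window_mav(epoch, 8)
```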

4. The method for monitoring the error-related potential in the brain-computer interface based on the mutual information content as claimed in claim 3, wherein the specific method for obtaining the frequency domain characteristics of the original error-related potential signal in the third step is:

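$$x_t(m) = x(m + tM - M),\quad 0 \le m \le M,\ 1 \le t \le L \tag{2}$$

wherein $x_t(m)$ is the t-th segment of data;

a window function w(m) is applied to each segment of data to obtain the periodogram of each segment, the periodogram of the t-th segment being:

$$I_t(\omega) = \frac{1}{U}\left|\sum_{m} x_t(m)\,w(m)\,e^{-j\omega m}\right|^{2}$$

wherein $I_t(\omega)$ is the periodogram of the t-th segment of data; U is a normalization factor; j is the imaginary unit; ω is the frequency; the sum runs over the samples m of the segment;

the power spectral density is the average of the periodograms of all segments:

$$P_{xx}(e^{j\omega}) = \frac{1}{L}\sum_{t=1}^{L} I_t(\omega)$$

wherein $P_{xx}(e^{j\omega})$ is the power spectral density; L is the total number of segments of the data x(n);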

the power spectral density is a frequency domain characteristic.
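A minimal pure-Python sketch of the segment-averaged periodogram is given below. It is written under several assumptions that the claim does not fix: a Hann window, non-overlapping segments of `seg_len` samples, and the common normalization in which each periodogram is divided by the segment length times the mean squared window value. The function name and the parameters `seg_len` and `n_freq` are hypothetical.

```python
import cmath
import math


def welch_psd(x, seg_len, n_freq):
    """Average of windowed periodograms over non-overlapping segments.

    Returns n_freq values on a grid from 0 to pi rad/sample.
    """
    n_seg = len(x) // seg_len
    if n_seg == 0:
        raise ValueError("signal shorter than one segment")
    # Hann window (an assumption; the text only says "a window function w(m)")
    w = [0.5 - 0.5 * math.cos(2 * math.pi * m / (seg_len - 1)) for m in range(seg_len)]
    u = sum(v * v for v in w) / seg_len  # mean squared window value
    psd = [0.0] * n_freq
    for t in range(n_seg):
        seg = x[t * seg_len:(t + 1) * seg_len]
        for k in range(n_freq):
            omega = math.pi * k / (n_freq - 1)
            dft = sum(seg[m] * w[m] * cmath.exp(-1j * omega * m) for m in range(seg_len))
            psd[k] += abs(dft) ** 2 / (seg_len * u)  # periodogram of segment t
    return [p / n_seg for p in psd]  # average over the segments


# Test tone at pi/4 rad/sample, which lies on grid point k = 4 when n_freq = 17
signal = [math.cos(math.pi / 4 * n) for n in range(128)]
psd = welch_psd(signal, seg_len=32, n_freq=17)
peak_bin = psd.index(max(psd))
```

The direct summation is quadratic in the segment length and is only meant to mirror the formulas; an FFT-based routine would normally be used on real recordings.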

5. The method for monitoring the error-related potential in the brain-computer interface based on mutual information according to claim 4, wherein the frequency band classification in step five comprises a delta frequency band, a theta frequency band and an alpha1 frequency band;

the range of the delta frequency band is 0.5 Hz-4 Hz;

the range of the theta frequency band is 4 Hz-8 Hz;

the range of the alpha1 frequency band is 8 Hz-10 Hz.
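The band limits above can be applied to a power spectrum as in the following sketch. The names `BANDS_HZ`, `band_of` and `band_power` are hypothetical, and treating each band as a half-open interval (so that a shared boundary such as 4 Hz belongs to the upper band) is an assumption; the claim does not say how boundary frequencies are assigned.

```python
BANDS_HZ = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha1": (8.0, 10.0),
}


def band_of(freq_hz):
    """Name of the band containing freq_hz, or None if it lies outside all bands."""
    for name, (low, high) in BANDS_HZ.items():
        if low <= freq_hz < high:  # half-open: a shared boundary goes to the upper band
            return name
    return None


def band_power(freqs_hz, psd, band):
    """Sum of the PSD values whose frequency falls inside the named band."""
    return sum(p for f, p in zip(freqs_hz, psd) if band_of(f) == band)


# Flat toy spectrum sampled every 0.5 Hz from 0 Hz to 12 Hz
freqs = [0.5 * k for k in range(25)]
flat_psd = [1.0] * 25
powers = {name: band_power(freqs, flat_psd, name) for name in BANDS_HZ}
```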

6. The method for monitoring the error-related potential in the brain-computer interface based on the mutual information quantity according to claim 5, wherein the concrete method for classifying the combined features in the fifth step is as follows: classifying the combined features by using an optimal classification function;

let the training set be $(x_i, y_i)$, $i = 1, 2, \ldots, n$, with n training samples, wherein $x_i$ represents the input of the training model and $y_i$ represents the output of the training model; the training set has a linear discriminant function $g(x_i) = w^T x_i + b$, wherein w is the normal vector of the hyperplane and b is the intercept of the linear discriminant function; the problem of finding the optimal classification surface is expressed as minimizing $\|w\|^2/2$ under the constraint $y_i(w^T x_i + b) - 1 \ge 0$;

the original Lagrange function is defined as shown in (6):

$$L(w, b, \alpha) = \frac{1}{2} w^T w - \sum_{i=1}^{n} \alpha_i \left[ y_i (w^T x_i + b) - 1 \right] \tag{6}$$

where $L(w, b, \alpha)$ is the value of the original Lagrange function; $\alpha_i$ is a Lagrange coefficient, with $\alpha_i \ge 0$; w is the weight vector and $w^T$ is the transpose of the weight vector; b is the intercept of the linear discriminant function;

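under the constraints $\sum_{i=1}^{n} \alpha_i y_i = 0$ and $\alpha_i \ge 0$, the maximum of the optimization function is sought; to distinguish $w^T$ from w, $w^T$ is written as $\sum_{i=1}^{n} \alpha_i y_i x_i^T$ and w is written as $\sum_{j=1}^{n} \alpha_j y_j x_j$, wherein $\alpha_j$ is another Lagrange coefficient distinct from $\alpha_i$, $y_j$ is the output of another training model distinct from $y_i$, and $x_j$ is the input of another training model distinct from $x_i$:

$$Q(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j\, x_i^T x_j \tag{7}$$

wherein $Q(\alpha)$ is the optimization function;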

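$$\alpha_i \left[ y_i (w^T x_i + b) - 1 \right] = 0,\quad i = 1, 2, \ldots, n \tag{8}$$

if $\alpha_i^*$ is the optimal solution of equation (7), then $w^* = \sum_{i=1}^{n} \alpha_i^* y_i x_i$; for most samples $\alpha_i^*$ is 0, and the samples whose $\alpha_i^*$ is different from 0 are the support vectors;

the optimal classification function is then:

$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{n} \alpha_i^* y_i\, x_i^T x + b \right)$$

in the nonlinear case, a nonlinear mapping $\varphi(\cdot)$ is used to map the samples into a high-dimensional feature space, and the optimization function becomes:

$$Q(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \left\langle \varphi(x_i), \varphi(x_j) \right\rangle$$

wherein $\langle \varphi(x_i), \varphi(x_j) \rangle$ is the inner product of $\varphi(x_i)$ and $\varphi(x_j)$;

thus, the optimal classification function is:

$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{n} \alpha_i^* y_i \left\langle \varphi(x_i), \varphi(x) \right\rangle + b \right)$$

the kernel function K(·) is defined as:

$$K(x_i, x_j) = \left\langle \varphi(x_i), \varphi(x_j) \right\rangle$$

the optimization function transforms to:

$$Q(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j\, K(x_i, x_j)$$

the corresponding optimal classification function is:

$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{n} \alpha_i^* y_i\, K(x_i, x) + b \right)$$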

7. The method for monitoring error-related potentials in a brain-computer interface based on mutual information according to claim 6, wherein the kernel function comprises:

(a) Polynomial kernel function

$$K(x_i, x_j) = \left( x_i^T x_j + 1 \right)^q$$

in which case the SVM is a q-order polynomial classifier, q being the order of the polynomial classifier;

(b) Radial basis kernel function

$$K(x_i, x_j) = \exp\left( -\frac{\|x_i - x_j\|^2}{2\mu^2} \right) \tag{16}$$

wherein $\|x_i - x_j\|^2$ can be regarded as the squared Euclidean distance between the two feature vectors and μ is a free parameter; in this case the SVM is a radial basis function classifier, which differs from the existing radial basis function method in that the center of each basis function corresponds to a support vector;

(c) Sigmoid kernel function

$$K(x_i, x_j) = \tanh\left( \beta_0\, x_i^T x_j + \beta_1 \right)$$

wherein $\beta_0$ is the slope and $\beta_1$ is the intercept.
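The three kernel functions can be written out as in the sketch below. The radial basis kernel follows formula (16); the polynomial form (inner product plus one, raised to the power q) and the hyperbolic-tangent form of the sigmoid kernel are the commonly used definitions and are assumed here, as are all function names.

```python
import math


def dot(a, b):
    """Inner product of two equal-length feature vectors."""
    return sum(p * q for p, q in zip(a, b))


def polynomial_kernel(xi, xj, q):
    """(xi . xj + 1)^q, a q-order polynomial kernel (assumed common form)."""
    return (dot(xi, xj) + 1.0) ** q


def rbf_kernel(xi, xj, mu):
    """exp(-||xi - xj||^2 / (2 mu^2)), as in formula (16)."""
    sq_dist = sum((p - q) ** 2 for p, q in zip(xi, xj))
    return math.exp(-sq_dist / (2.0 * mu ** 2))


def sigmoid_kernel(xi, xj, beta0, beta1):
    """tanh(beta0 * xi . xj + beta1), with slope beta0 and intercept beta1."""
    return math.tanh(beta0 * dot(xi, xj) + beta1)


a, b = [1.0, 2.0], [3.0, 4.0]
k_poly = polynomial_kernel(a, b, 2)
k_rbf = rbf_kernel(a, b, 1.0)
k_sig = sigmoid_kernel(a, b, 0.1, 0.0)
```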

8. The method for monitoring the potential related to the error in the brain-computer interface based on mutual information according to claim 7, wherein the specific process of using the mutual information as the measure between the combined features and the optimal classification in the fifth step is as follows:

$$I(X;Y) = \sum_{y \in Y} \sum_{x \in X} p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)}$$

wherein p(x, y) is the joint probability distribution function of X and Y, and p(x) and p(y) are the marginal probability distribution functions of X and Y, respectively; the mutual information of two random variables is a measure of the interdependence between the variables; mutual information is equivalently expressed as:

$$\begin{aligned} I(X;Y) &= H(X) - H(X \mid Y) = H(Y) - H(Y \mid X) \\ &= H(X) + H(Y) - H(X, Y) \\ &= H(X, Y) - H(X \mid Y) - H(Y \mid X) \end{aligned} \tag{19}$$

where H(X) and H(Y) are the marginal entropies, H(X|Y) and H(Y|X) are the conditional entropies, and H(X, Y) is the joint entropy of X and Y.
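As a small numerical check of these relations, the sketch below computes the mutual information of a discrete joint distribution directly from its definition and again as H(X) + H(Y) - H(X, Y). The 2 × 2 joint table is an illustrative assumption, not data from the experiments, and base-2 logarithms (bits) are used.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(joint):
    """I(X;Y) in bits from a joint probability table joint[x][y]."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:  # zero-probability cells contribute nothing
                mi += pxy * math.log2(pxy / (px[i] * py[j]))
    return mi


# Illustrative joint distribution of a binary feature X and a binary class Y
joint = [[0.4, 0.1],
         [0.1, 0.4]]
px = [sum(row) for row in joint]
py = [sum(col) for col in zip(*joint)]
h_joint = entropy([p for row in joint for p in row])

mi_direct = mutual_information(joint)
mi_from_entropies = entropy(px) + entropy(py) - h_joint
```

Both routes give the same value up to floating-point rounding, and an independent table such as four cells of 0.25 gives zero.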

9. The method for monitoring the error-related potential in a brain-computer interface based on mutual information according to claim 8, wherein the specific method for classifying the error-related potential using a least squares support vector machine in step five is: the least squares support vector machine adopts a least-squares linear system as the loss function, and replaces the quadratic programming of the classical support vector machine algorithm with the solution of a set of linear equations;

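the optimization objective and the constraint conditions are as follows:

$$\min_{w, b, e}\ J(w, e) = \frac{1}{2} w^T w + \frac{\gamma}{2} \sum_{i=1}^{n} e_i^2 \quad \text{s.t.}\quad y_i = w^T \varphi(x_i) + b + e_i,\quad i = 1, 2, \ldots, n$$

wherein $e_i$ is the error variable of the i-th sample and γ is the regularization parameter;

the Lagrange function is updated as:

$$L(w, b, e, \alpha) = \frac{1}{2} w^T w + \frac{\gamma}{2} \sum_{i=1}^{n} e_i^2 - \sum_{i=1}^{n} \alpha_i \left[ w^T \varphi(x_i) + b + e_i - y_i \right]$$

taking the partial derivatives with respect to w, b, $e_i$ and $\alpha_i$ respectively and setting them to zero gives:

$$\begin{cases} \dfrac{\partial L}{\partial w} = 0 \ \Rightarrow\ w = \sum_{i=1}^{n} \alpha_i \varphi(x_i) \\ \dfrac{\partial L}{\partial b} = 0 \ \Rightarrow\ \sum_{i=1}^{n} \alpha_i = 0 \\ \dfrac{\partial L}{\partial e_i} = 0 \ \Rightarrow\ \alpha_i = \gamma e_i \\ \dfrac{\partial L}{\partial \alpha_i} = 0 \ \Rightarrow\ w^T \varphi(x_i) + b + e_i - y_i = 0 \end{cases} \tag{23}$$

rearranging formula (23) gives:

$$\begin{bmatrix} 0 & l^T \\ l & \Omega + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix}$$

wherein $l = [1, 1, \ldots, 1]^T$, $\Omega_{ij} = K(x_i, x_j)$, $i, j = 1, 2, \ldots, n$, I is the identity matrix, $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_n]^T$, $y = [y_1, y_2, \ldots, y_n]^T$;

let $A = \Omega + \gamma^{-1} I$; solving the matrix equation gives:

$$b = \frac{l^T A^{-1} y}{l^T A^{-1} l}$$

$$\alpha = A^{-1}(y - b\,l) \tag{26}$$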

in each classification, a kernel function is first selected; the optimal regularization parameter γ and the optimal kernel parameter σ are then determined from the training data and the corresponding samples, and the Lagrange coefficients $\alpha_i$ and the intercept b of the linear discriminant function are further trained; an individual optimal model is generated from the Lagrange coefficients $\alpha_i$ and the intercept b of the linear discriminant function, the individual optimal model is acquired and retained, and the error-related potential is classified by the individual optimal model.
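The training step can be illustrated with the following pure-Python sketch, which solves the linear system of the least squares support vector machine for the intercept b and the Lagrange coefficients with an RBF kernel. It assumes the function-estimation form of the constraints that leads to formula (26), fixed values of γ and σ (no parameter search is shown), class labels coded as -1 and +1, and a toy one-dimensional data set; the function names and the small Gaussian-elimination solver are hypothetical helpers.

```python
import math


def rbf(a, b, sigma):
    """RBF kernel exp(-||a - b||^2 / (2 sigma^2))."""
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / (2 * sigma ** 2))


def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (aug[r][n] - tail) / aug[r][r]
    return sol


def lssvm_train(X, y, gamma, sigma):
    """Solve [[0, l^T], [l, Omega + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    system = [[0.0] + [1.0] * n]
    for i in range(n):
        system.append([1.0] + [rbf(X[i], X[j], sigma) + (1.0 / gamma if i == j else 0.0)
                               for j in range(n)])
    sol = solve(system, [0.0] + [float(v) for v in y])
    return sol[0], sol[1:]  # intercept b, Lagrange coefficients alpha


def lssvm_predict(x, X, alpha, b, sigma):
    """Sign of sum_i alpha_i K(x_i, x) + b."""
    score = sum(a * rbf(xi, x, sigma) for a, xi in zip(alpha, X)) + b
    return 1 if score >= 0 else -1


# Toy one-dimensional training set with two well-separated classes
X_train = [[0.0], [0.2], [0.1], [1.0], [1.2], [1.1]]
y_train = [-1, -1, -1, 1, 1, 1]
b, alpha = lssvm_train(X_train, y_train, gamma=10.0, sigma=0.5)
train_pred = [lssvm_predict(x, X_train, alpha, b, 0.5) for x in X_train]
```

In practice γ and σ would be chosen per subject, for example by the leave-one-out search mentioned in the description, and a numerical library would replace the hand-written solver.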