CN113486794B

CN113486794B - Motor imagery electroencephalogram signal classification method based on hybrid model

Info

Publication number: CN113486794B
Application number: CN202110763496.8A
Authority: CN
Inventors: 付荣荣; 向艺凡; 王世伟; 李威帅
Original assignee: Yanshan University
Current assignee: Yanshan University
Priority date: 2021-07-06
Filing date: 2021-07-06
Publication date: 2022-04-15
Anticipated expiration: 2041-07-06
Also published as: CN113486794A

Abstract

The invention provides a motor imagery electroencephalogram (EEG) signal classification method based on a hybrid model, comprising the following steps: S1, data extraction: collecting two groups of data from a database as an EEG database; S2, data reconstruction: normalizing the EEG data in the EEG database and storing them in a four-dimensional tensor structure; S3, feature optimization: optimizing the EEG data with the ODV-CSSD algorithm to obtain the optimal EEG features; and S4, forming interpretable clusters using prior information and establishing decision boundaries to obtain the final class attribution results. The method addresses the problem that the common spatial pattern is overly sensitive to abnormal data and therefore prone to overfitting, and it provides a new feature optimization algorithm, ODV-CSSD.

Description

Motor imagery electroencephalogram signal classification method based on hybrid model

Technical Field

The invention relates to the field of electroencephalogram signal classification, and in particular to a motor imagery electroencephalogram signal classification method based on a hybrid model.

Background

The human brain is composed of interconnected neurons. An electroencephalogram (EEG) records the changes in the electrical signals generated by neuronal activity, and is an overall reflection of the electrophysiological activity of brain neurons at the surface of the cerebral cortex or scalp. Brain-computer interface (BCI) technology converts such electrical signals into control commands, thereby providing a communication path between the brain and external devices (e.g., BCI wheelchairs, prosthetics, robotic arms).

The core of BCI technology is the adaptive control relationship between the human brain and the machine, i.e. finding suitable signal processing and information conversion algorithms that turn EEG signals into commands a computer can identify accurately. Raw EEG signals inevitably contain electro-oculographic, electromyographic and power-line interference. This noise greatly increases the computational load and complexity of EEG signal processing and degrades its precision, so in a BCI system, effective noise removal and the extraction of highly separable EEG features are very important for accurately decoding brain activity.

At present, a great deal of research at home and abroad is devoted to extracting and classifying EEG features in the spatial domain, where a spatial filter is designed to improve the quality of the EEG signals and extract features. The common spatial pattern (CSP) is a representative algorithm: it finds a set of spatial filters that maximize the variance of one class while minimizing the variance of the other, and it is widely used in EEG signal processing.

Disclosure of Invention

To overcome the shortcomings of the prior art, the invention provides a motor imagery EEG signal classification method based on a hybrid model. The method classifies motor imagery EEG signals with a hybrid discrimination model, improves the quality of the EEG signals, and selects the optimal number of EEG features, which helps to mitigate the overfitting that easily arises with small training sets and thereby improves the classification result.

To achieve this purpose, the invention discloses the following technical solution:

Specifically, the invention provides a motor imagery EEG signal classification method based on a hybrid model, which comprises the following steps:

S1, data extraction: collecting two groups of data from a database as an EEG database;

S2, data reconstruction: normalizing the EEG data in the EEG database and storing them in a four-dimensional tensor structure;

S3, feature optimization: optimizing the EEG data with the ODV-CSSD algorithm to obtain the optimal EEG features, which specifically comprises the following substeps:

S31, let $X_i$ denote the multi-channel EEG signal under a motor imagery task, where i = 1, 2, with 1 denoting left-hand motor imagery and 2 denoting right-hand motor imagery. The normalized spatial covariance matrix $R_i$ is:

$$R_i = \frac{X_i X_i^{T}}{\operatorname{trace}(X_i X_i^{T})}$$

where $X_i^{T}$ is the transpose of $X_i$ and trace(·) is the sum of the diagonal elements of a matrix;

S32, averaging the covariance matrices of multiple trials of EEG data under the same task to obtain the average spatial covariance matrix:

$$\bar{R}_i = \frac{1}{N}\sum_{n=1}^{N} R_{in}$$

where N is the number of trials of each class and $R_{in}$ is the normalized spatial covariance matrix of the n-th trial of class i;

S33, obtaining the eigenvalues and eigenvectors according to the following formula:

where W is the eigenvector matrix and Λ is the diagonal matrix of eigenvalues;

S34, filtering the EEG signal through the W matrix using the following formula:

$$Z_{C\times T} = W_{C\times C}\,R_{C\times T}$$

where C is the number of EEG channels, T is the number of samples in one trial, and $Z_{C\times T}$ is the filtered matrix;

S35, extracting the feature vector $F_p$ from the filtered matrix $Z_{C\times T}$:

$$F_p = \log\left(\frac{\operatorname{var}(Z_p)}{\sum_{i=1}^{2m}\operatorname{var}(Z_i)}\right)$$

where the matrix $Z_p$ contains the first m rows and the last m rows of Z, and $\operatorname{var}(Z_i)$ is the variance of its i-th row;

S36, setting the following optimal discriminant criterion and searching for the optimal discriminant vectors:

$$R(d) = \frac{(d^{T}\Delta)^{2}}{d^{T}A\,d}$$

where $A = cW_1 + (1-c)W_2$ with $0 \le c \le 1$, d is an L-dimensional column vector onto which the data are projected, $d^{T}$ is the transpose of d, and $W_i$ is the within-class scatter of the i-th class, calculated as:

$$W_i = \sum_{j=1}^{N_i}(x_{ij}-\mu_i)(x_{ij}-\mu_i)^{T}$$

Δ is the difference between the estimated means, $\Delta = \mu_1 - \mu_2$, where $\mu_i$ is the mean of class i, calculated as:

$$\mu_i = \frac{1}{N_i}\sum_{j=1}^{N_i}x_{ij}$$

where $x_{ij}$ is the j-th sample vector of class i and $N_i$ is the number of samples in class i;

the first optimal projection direction is:

$$d_1 = \alpha_1 A^{-1}\Delta$$

where $\alpha_1$ is a normalizing constant chosen such that $d_1^{T}d_1 = 1$, that is,

$$\alpha_1^{2} = \left[\Delta^{T}(A^{-1})^{2}\Delta\right]^{-1}$$

The second optimal projection direction is obtained by maximizing R(d) subject to the constraint that $d_1$ and $d_2$ are orthogonal, i.e.

$$d_2^{T}d_1 = 0$$

Solving gives the second optimal projection direction:

$$d_2 = \alpha_2\left[A^{-1} - b_1(A^{-1})^{2}\right]\Delta$$

where $\alpha_2$ normalizes $d_2$ in the same way and the constant $b_1$ is:

$$b_1 = \frac{\Delta^{T}(A^{-1})^{2}\Delta}{\Delta^{T}(A^{-1})^{3}\Delta}$$

S4, forming interpretable clusters using prior information and establishing decision boundaries to obtain the final class attribution results.
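As a check on the constant $b_1$ used in the feature optimization step, the orthogonality constraint can be worked through directly. The short derivation below is a sketch that assumes A is symmetric (it is a weighted sum of scatter matrices) and invertible, and it uses the form $d_2 = \alpha_2[A^{-1} - b_1(A^{-1})^{2}]\Delta$ stated in the claims.

```latex
\begin{aligned}
d_1^{T} d_2 &= \alpha_1\alpha_2\,\Delta^{T}A^{-1}\left[A^{-1}-b_1 (A^{-1})^{2}\right]\Delta \\
            &= \alpha_1\alpha_2\left[\Delta^{T}(A^{-1})^{2}\Delta - b_1\,\Delta^{T}(A^{-1})^{3}\Delta\right] = 0 \\
\Longrightarrow\quad b_1 &= \frac{\Delta^{T}(A^{-1})^{2}\Delta}{\Delta^{T}(A^{-1})^{3}\Delta}
\end{aligned}
```

Because the criterion R(d) does not depend on the length of d, the constants $\alpha_1$ and $\alpha_2$ only fix the scale of the two directions and do not affect this condition.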

Preferably, step S2 specifically comprises slicing the acquired EEG data and storing the processed data in a four-dimensional tensor whose four dimensions are the number of samples, the number of channels, the number of event triggers, and the category.

Preferably, in step S4, the interpretable clusters are formed from the prior information by a DRMM algorithm, and decision boundaries are established to obtain the final class attribution results.

Preferably, the implementation of the DRMM algorithm in step S4 comprises the following substeps:

S41, let X denote the features generated by the rule, and let y contain the cluster structure. The features are normalized so that each feature has zero mean and unit variance. Let K be the number of clusters, and let $Z_n$ indicate, as a K-dimensional vector, the cluster attribution of the n-th sample; for example, $Z_{nk} = 1$ denotes that the n-th sample belongs to the k-th cluster. The vector

$$t_{kd} = (t_{kd}^{-},\ t_{kd}^{+})^{T}$$

represents the decision boundary of cluster k in dimension d, where $t_{kd}^{-}$ is the lower boundary, $t_{kd}^{+}$ is the upper boundary, and $(\cdot)^{T}$ denotes the transpose. All samples belonging to the k-th cluster are assumed to lie within the corresponding decision rectangle, i.e. every $x_{nd}$ of a sample belonging to the cluster satisfies

$$t_{kd}^{-} \le x_{nd} \le t_{kd}^{+}$$

Suppose that all decision boundaries obey the prior distribution $p(t_{kd})$:

K = 2

where

and

represent the position of the decision boundary of the prior rule; if there is no prior rule, then

$\alpha_t$ and $\beta_t$ are positive parameters controlling the balance between these terms;

S42, in the decision matrix

a parameter $Z_{nk}$ is defined to represent the cluster index:

$$Z_{nk} = \prod_{d=1}^{D} f(x_{nd}-t_{kd}^{-})\,f(t_{kd}^{+}-x_{nd})$$

Since the invention reduces the features to two dimensions, D = 2. f(t) is an indicator function of the condition t ≥ 0:

$$f(t) = \begin{cases}1, & t \ge 0\\ 0, & t < 0\end{cases}$$

The product acts as a logical AND operator, so if $x_n$ is within the decision rectangle, $Z_{nk} = 1$; otherwise $Z_{nk} = 0$. The non-differentiability of f(t) makes it difficult to optimize the decision rectangle, so g(t) is used instead of f(t), where g(t) is defined as follows:

$$g(t) = \frac{1}{1+\exp(-a\,t)}$$

where a is a positive parameter that sets the steepness of the soft function, so that a larger a brings g(t) closer to f(t). Substituting g(t) for f(t) defines a new variable $\gamma_{nk}$:

$$\gamma_{nk} = \prod_{d=1}^{D} g(x_{nd}-t_{kd}^{-})\,g(t_{kd}^{+}-x_{nd})$$

Since 0 < g(t) < 1 for all real values of t, it follows that 0 < $\gamma_{nk}$ < 1. For the k-th cluster, if sample $x_n$ lies within the rectangular decision boundary, $\gamma_{nk} \approx 1$; otherwise $\gamma_{nk} \approx 0$. Hence $\gamma_{nk}$ can serve as a soft cluster index;

S43, suppose $y_n$ obeys a Gaussian mixture distribution:

$$p(y_n \mid z_n) = \prod_{k=1}^{K}\mathcal{N}(y_n \mid \mu_k, \Sigma_k)^{z_{nk}}$$

where $\mu_k$ and $\Sigma_k$ are the mean vector and covariance matrix of the k-th cluster;

the joint probability of the DRMM is:

where T denotes all of the rectangular decision boundaries and Θ denotes all parameters;

given the parameters {T, Θ} and the observed variables {Φ, X, Y}, the posterior probability of the latent variable Z is calculated:

where const denotes a constant that does not depend on Z. Given the posterior distribution of Z, the expectation of the joint log-probability is calculated as follows:

where

is the mean vector of

D is the extracted feature dimension; K is the number of clusters; and $\pi_{nk}$ is a K-dimensional vector representing the expected values of the posterior distribution.

The optimal values of T and Θ that maximize W(T, Θ) are found using the following formula:

Compute

and, for each cluster k ∈ {1, 2, …, K}, $\{\mu_k, \Sigma_k\}$;

The optimal value is calculated as follows:

$\{\mu_k, \Sigma_k\}$ is calculated as follows:

Compared with the prior art, the invention has the following beneficial effects:

The invention provides a novel motor imagery EEG signal classification method that addresses the problem that the common spatial pattern is overly sensitive to abnormal data and prone to overfitting. Feature selection and optimization are carried out by the newly proposed ODV-CSSD algorithm, which optimizes the features from two angles. First, it finds a set of optimal feature vectors and projects onto all of them to construct an optimal feature space; features found in this space reduce the influence of abnormal data. Second, it searches for and decides on the optimal number of features, which prevents the features from containing redundant information. The invention thus extracts and classifies left-hand and right-hand motor imagery EEG signals and effectively improves the recognition accuracy.

Drawings

FIG. 1 is a schematic flow diagram of the present invention;

FIG. 2 is a schematic of a reconstructed data structure;

FIG. 3 is a schematic of the clustering results for the extracted two-dimensional optimal features;

FIG. 4 is a ROC curve for a two-dimensional optimal feature;

FIG. 5 is an overall flow chart of the present invention.

Detailed Description

Exemplary embodiments, features and aspects of the present invention will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.

The core of the invention is a new spatial filtering method for processing EEG signals. A hybrid discrimination model is used to classify motor imagery EEG signals, the quality of the EEG signals is improved, and the optimal number of EEG features is selected to help mitigate the overfitting that easily arises with small training sets, thereby improving the classification result.

The invention provides a motor imagery EEG signal classification method based on a hybrid model, which comprises the following steps:

S1, establishing an EEG database: EEG data are either collected with 59 EEG electrodes densely distributed over the sensorimotor region of the brain or taken directly from existing EEG data.

S2, data reconstruction: the EEG data are normalized and stored in a four-dimensional tensor structure, shown in FIG. 2, whose four dimensions are the number of samples, the number of channels, the number of event triggers, and the category.

S3, feature optimization: the optimal EEG features are obtained with the ODV-CSSD algorithm.

S4, forming interpretable clusters using prior information and establishing decision boundaries to obtain the final class attribution result, as shown in FIG. 3.

In an embodiment of the present invention, step S1 constructs the input data from two different sources: subjects A, B, C and D are MI-EEG data from the German team in the fourth brain-computer interface competition; subjects E, F and G were recorded under the same paradigm as the brain-computer interface competition, using 59 EEG electrodes densely distributed over the sensorimotor region.

In an embodiment of the present invention, step S2 slices the acquired EEG data and stores the processed data in a four-dimensional tensor whose dimensions are the number of samples, the number of channels (i.e. the number of electrodes), the number of event triggers, and the category.
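The slicing step can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the continuous recording is taken to be a samples × channels array, every class is assumed to have the same number of trials, and the function name `build_tensor`, the window length `win` and the synthetic data are hypothetical, not taken from the original method.

```python
import numpy as np

def build_tensor(eeg, onsets, labels, win):
    """Slice continuous EEG (samples x channels) into a 4-D tensor.

    Returns an array of shape (win, channels, trials_per_class, classes).
    Assumes every class has the same number of trials.
    """
    per_class = []
    for c in np.unique(labels):
        starts = onsets[labels == c]
        # Each epoch is (win, channels); stack trials along a new last axis.
        epochs = np.stack([eeg[s:s + win] for s in starts], axis=-1)
        per_class.append(epochs)
    # Stack the classes along a fourth axis.
    return np.stack(per_class, axis=-1)

rng = np.random.default_rng(0)
eeg = rng.standard_normal((1000, 59))      # synthetic 59-channel recording
onsets = np.array([0, 200, 400, 600])      # hypothetical event trigger positions
labels = np.array([1, 2, 1, 2])            # 1 = left hand, 2 = right hand
tensor = build_tensor(eeg, onsets, labels, win=100)
print(tensor.shape)
```

Indexing `tensor[:, :, n, c]` then returns the samples × channels epoch of trial `n` in class `c`.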

In one embodiment of the invention, in step S3, obtaining the optimal electroencephalogram features is achieved by using an ODV-CSSD algorithm.

In one embodiment of the invention, the specific steps of calculating the optimal electroencephalogram characteristic by adopting the ODV-CSSD method are as follows:

Let $X_i$ (i = 1, 2, where 1 denotes left-hand motor imagery and 2 denotes right-hand motor imagery) denote the multi-channel EEG signal under a motor imagery task. The normalized spatial covariance matrix $R_i$ is:

$$R_i = \frac{X_i X_i^{T}}{\operatorname{trace}(X_i X_i^{T})}$$

where $X_i^{T}$ is the transpose of $X_i$ and trace(·) is the sum of the diagonal elements of a matrix.

The covariance matrices of multiple trials under the same task are averaged to obtain the average spatial covariance matrix:

$$\bar{R}_i = \frac{1}{N}\sum_{n=1}^{N} R_{in}$$

where N is the number of trials per class and $R_{in}$ is the normalized spatial covariance matrix of the n-th trial of class i.
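A minimal sketch of these two formulas, assuming each trial is stored as a channels × samples array; the function names and the random test data are illustrative only.

```python
import numpy as np

def normalized_cov(x):
    """R = X X^T / trace(X X^T) for one trial x of shape (channels, samples)."""
    c = x @ x.T
    return c / np.trace(c)

def mean_cov(trials):
    """Average the normalized covariances over trials of shape (N, channels, samples)."""
    return np.mean([normalized_cov(x) for x in trials], axis=0)

rng = np.random.default_rng(0)
left = rng.standard_normal((20, 4, 250))   # 20 synthetic trials, 4 channels
r_left = mean_cov(left)
print(r_left.shape, np.trace(r_left))
```

Because every per-trial matrix is divided by its own trace, the averaged matrix also has unit trace, which makes trials with different overall signal power comparable.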

The eigenvalues and eigenvectors are obtained according to the following formula:

where W is the eigenvector matrix and Λ is the diagonal matrix of eigenvalues. The EEG signal is filtered through the W matrix using:

$$Z_{C\times T} = W_{C\times C}\,R_{C\times T}$$

where C is the number of EEG channels and T is the number of samples in one trial.

The feature vector $F_p$ (whose dimension does not exceed the number of channels) is extracted from the filtered matrix $Z_{C\times T}$:

$$F_p = \log\left(\frac{\operatorname{var}(Z_p)}{\sum_{i=1}^{2m}\operatorname{var}(Z_i)}\right)$$

where the matrix $Z_p$ comprises the first m rows and the last m rows of Z, and $\operatorname{var}(Z_i)$ is the variance of its i-th row.
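The filtering and feature extraction can be sketched as below. The eigen-decomposition formula is not reproduced in the text, so this sketch assumes the conventional CSP construction (whiten the composite covariance, then diagonalize the whitened class-1 covariance); the function names, the helper `avg_cov` and the synthetic data are likewise assumptions for illustration.

```python
import numpy as np

def csp_filters(r1, r2):
    """Conventional CSP (assumed here): whiten r1 + r2, then diagonalize whitened r1."""
    evals, evecs = np.linalg.eigh(r1 + r2)
    whiten = np.diag(evals ** -0.5) @ evecs.T
    psi, b = np.linalg.eigh(whiten @ r1 @ whiten.T)
    order = np.argsort(psi)[::-1]           # largest class-1 variance first
    return b[:, order].T @ whiten           # rows are spatial filters

def log_var_features(w, x, m):
    """F_p = log(var(Z_p) / sum_i var(Z_i)) over the first and last m rows of Z = W X."""
    z = w @ x
    zp = np.vstack([z[:m], z[-m:]])
    v = np.var(zp, axis=1)
    return np.log(v / v.sum())

def avg_cov(trials):
    """Trace-normalized covariance averaged over trials (N, channels, samples)."""
    return np.mean([x @ x.T / np.trace(x @ x.T) for x in trials], axis=0)

rng = np.random.default_rng(1)
scale = np.array([3.0, 1.0, 1.0, 0.3])[:, None]
left = rng.standard_normal((30, 4, 200)) * scale         # strong on channel 0
right = rng.standard_normal((30, 4, 200)) * scale[::-1]  # strong on channel 3
r1, r2 = avg_cov(left), avg_cov(right)
w = csp_filters(r1, r2)
f_left = log_var_features(w, left[0], m=1)
f_right = log_var_features(w, right[0], m=1)
print(f_left, f_right)
```

With m = 1 each trial yields a two-element feature vector: the first entry is larger for the synthetic "left" trials and the second is larger for the "right" trials, which is the separation that the later discriminant projection builds on.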

To find the optimal discriminant vectors, the following optimal discriminant criterion is set:

$$R(d) = \frac{(d^{T}\Delta)^{2}}{d^{T}A\,d}$$

where $A = cW_1 + (1-c)W_2$ with $0 \le c \le 1$. d is an L-dimensional column vector onto which the data are projected, and $d^{T}$ is the transpose of d. $W_i$ is the within-class scatter of the i-th class, calculated as:

$$W_i = \sum_{j=1}^{N_i}(x_{ij}-\mu_i)(x_{ij}-\mu_i)^{T}$$

Δ is the difference between the estimated means, $\Delta = \mu_1 - \mu_2$, where $\mu_i$ is the mean of class i, calculated as:

$$\mu_i = \frac{1}{N_i}\sum_{j=1}^{N_i}x_{ij}$$

where $x_{ij}$ is the j-th sample vector of the i-th class and $N_i$ is the number of samples of the i-th class. Note that the generalized ratio R is independent of the magnitude of the vector d, i.e. R(d) = R(αd).

The first optimal projection direction is:

$$d_1 = \alpha_1 A^{-1}\Delta$$

where $\alpha_1$ is a normalizing constant chosen such that $d_1^{T}d_1 = 1$, that is,

$$\alpha_1^{2} = \left[\Delta^{T}(A^{-1})^{2}\Delta\right]^{-1}$$

The second optimal projection direction is obtained by maximizing R subject to the constraint that $d_2$ is orthogonal to $d_1$, i.e.

$$d_2^{T}d_1 = 0$$

Solving gives the second optimal projection direction:

$$d_2 = \alpha_2\left[A^{-1} - b_1(A^{-1})^{2}\right]\Delta$$

where $\alpha_2$ normalizes $d_2$ in the same way and the constant $b_1$ is:

$$b_1 = \frac{\Delta^{T}(A^{-1})^{2}\Delta}{\Delta^{T}(A^{-1})^{3}\Delta}$$

CSP is used to extract features from the raw EEG signals, and the optimal EEG features are obtained after projection onto the optimal discriminant vectors. The number of optimal EEG features can also be controlled so as to reach the optimal dimensionality.
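The two projection directions can be computed in a few lines. The sketch below assumes the within-class scatter is the unnormalized sum of outer products and takes c = 0.5 as in the claims; the function name and the synthetic feature data are hypothetical.

```python
import numpy as np

def optimal_discriminant_vectors(x1, x2, c=0.5):
    """Return unit vectors d1, d2 for class samples x1, x2 of shape (N_i, L)."""
    mu1, mu2 = x1.mean(axis=0), x2.mean(axis=0)
    w1 = (x1 - mu1).T @ (x1 - mu1)            # within-class scatter W_1
    w2 = (x2 - mu2).T @ (x2 - mu2)            # within-class scatter W_2
    a_inv = np.linalg.inv(c * w1 + (1.0 - c) * w2)
    delta = mu1 - mu2
    u = a_inv @ delta                         # A^-1 delta
    v = a_inv @ u                             # A^-2 delta
    b1 = (delta @ v) / (delta @ (a_inv @ v))  # ratio of the A^-2 and A^-3 quadratic forms
    d1 = u / np.linalg.norm(u)                # alpha_1 normalizes d1 to unit length
    d2 = u - b1 * v
    d2 = d2 / np.linalg.norm(d2)              # alpha_2 normalizes d2 to unit length
    return d1, d2

rng = np.random.default_rng(2)
x1 = rng.standard_normal((40, 4)) + np.array([1.0, 0.5, 0.0, 0.0])
x2 = rng.standard_normal((40, 4))
d1, d2 = optimal_discriminant_vectors(x1, x2)
# Project every sample onto the two directions to get two-dimensional features.
features = np.vstack([x1, x2]) @ np.column_stack([d1, d2])
print(features.shape, float(d1 @ d2))
```

Projecting onto `d1` and `d2` gives the two-dimensional feature space used by the clustering step; keeping only `d1` or adding further orthogonal directions is one way to vary the number of features.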

In one embodiment of the present invention, in step S4, interpretable clusters are formed using a priori information through a DRMM algorithm.

In an embodiment of the present invention, the DRMM algorithm is implemented as follows:

Let X denote the features generated by the rule, and let y contain the cluster structure. The features are normalized so that each feature has zero mean and unit variance. Let K be the number of clusters, and let $Z_n$ indicate, as a K-dimensional vector, the cluster attribution of the n-th sample; for example, $Z_{nk} = 1$ indicates that the n-th sample belongs to the k-th cluster.
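The normalization to zero mean and unit variance can be sketched as follows, assuming the features are stored as a samples × features array; the function name and data are illustrative.

```python
import numpy as np

def standardize(features):
    """Scale each column (feature) to zero mean and unit variance."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    return (features - mean) / std

rng = np.random.default_rng(3)
raw = rng.normal(loc=5.0, scale=2.0, size=(100, 2))   # synthetic 2-D features
z = standardize(raw)
print(z.mean(axis=0), z.std(axis=0))
```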

The vector

$$t_{kd} = (t_{kd}^{-},\ t_{kd}^{+})^{T}$$

represents the decision boundary of cluster k in dimension d, where $t_{kd}^{-}$ is the lower boundary and $t_{kd}^{+}$ is the upper boundary. All samples belonging to the k-th cluster are assumed to lie within the corresponding decision rectangle, i.e. all samples belonging to the cluster satisfy

$$t_{kd}^{-} \le x_{nd} \le t_{kd}^{+}$$

Suppose that all decision boundaries obey the prior distribution $p(t_{kd})$:

where

and

indicate the position of the decision boundary of the prior rule. If there is no prior rule, then

$\alpha_t$ and $\beta_t$ are positive parameters that control the balance between these terms.

In the decision matrix

a parameter $Z_{nk}$ is defined to represent the cluster index:

$$Z_{nk} = \prod_{d=1}^{D} f(x_{nd}-t_{kd}^{-})\,f(t_{kd}^{+}-x_{nd})$$

where f(t) is an indicator function of the condition t ≥ 0, i.e. f(t) = 1 when t ≥ 0 and f(t) = 0 otherwise, so the product can be regarded as a logical AND operator. Thus, if $x_n$ is inside the decision rectangle, $Z_{nk} = 1$; otherwise $Z_{nk} = 0$. The non-differentiability of f(t) makes it difficult to optimize the decision rectangle, so g(t) is substituted for f(t), defined as follows:

$$g(t) = \frac{1}{1+\exp(-a\,t)}$$

where a is a positive parameter that sets the "steepness" of the soft function, so that a larger a brings g(t) closer to f(t). Replacing f(t) with g(t) defines a new variable $\gamma_{nk}$:

$$\gamma_{nk} = \prod_{d=1}^{D} g(x_{nd}-t_{kd}^{-})\,g(t_{kd}^{+}-x_{nd})$$

Since 0 < g(t) < 1, it follows that 0 < $\gamma_{nk}$ < 1 for all real values of t. For the k-th cluster, if sample $x_n$ is within the rectangular decision boundary, $\gamma_{nk} \approx 1$; otherwise $\gamma_{nk} \approx 0$. Therefore $\gamma_{nk}$ can serve as a soft cluster index.

For $\gamma_{nk}$ to be used as a soft cluster index, the following two conditions must be satisfied:

$x_n$ lies inside the decision rectangle of the k-th cluster, i.e. $\gamma_{nk} \approx 1$;

$x_n$ lies outside all other decision rectangles, i.e. $\gamma_{nj} \approx 0$ for all j ≠ k.
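The soft cluster index can be illustrated numerically. The sketch below assumes that g(t) is the logistic function with steepness a, which matches the stated properties (0 < g(t) < 1, approaching the step function as a grows) but is not confirmed by the text; the rectangle bounds, the value a = 10 and the function name are hypothetical.

```python
import numpy as np

def soft_membership(x, lower, upper, a=10.0):
    """gamma[n, k] = prod_d g(x_nd - lower_kd) * g(upper_kd - x_nd).

    x: (N, D) samples; lower, upper: (K, D) rectangle bounds.
    g is assumed to be the logistic function with steepness a.
    """
    def g(t):
        return 1.0 / (1.0 + np.exp(-a * t))
    above_lower = g(x[:, None, :] - lower[None, :, :])
    below_upper = g(upper[None, :, :] - x[:, None, :])
    return np.prod(above_lower * below_upper, axis=2)

x = np.array([[-1.0, -1.0], [1.0, 1.0]])           # two samples, D = 2
lower = np.array([[-2.0, -2.0], [0.0, 0.0]])       # rectangles of clusters 0 and 1
upper = np.array([[0.0, 0.0], [2.0, 2.0]])
gamma = soft_membership(x, lower, upper)
print(np.round(gamma, 3))
```

Each sample sits well inside exactly one rectangle, so its row of `gamma` is close to 1 for that cluster and close to 0 for the other, which are the two conditions listed above.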

Suppose $y_n$ follows a Gaussian mixture distribution:

$$p(y_n \mid z_n) = \prod_{k=1}^{K}\mathcal{N}(y_n \mid \mu_k, \Sigma_k)^{z_{nk}}$$

where $\mu_k$ and $\Sigma_k$ are the mean vector and covariance matrix of the k-th cluster.

The joint probability of the DRMM is:

where T denotes all of the rectangular decision boundaries and Θ denotes all parameters.

Given the parameters {T, Θ} and the observed variables {Φ, X, Y}, the posterior probability of the latent variable Z is calculated:

where const denotes a constant that does not depend on Z. With the posterior distribution of Z known, the expectation of the joint log-probability is calculated by:

The optimal values of T and Θ that maximize W(T, Θ) are found using the following formula:

Compute

and, for each cluster k ∈ {1, 2, …, K}, $\{\mu_k, \Sigma_k\}$.

The optimal value is calculated as follows:

$\{\mu_k, \Sigma_k\}$ is calculated as follows:

The result of clustering on the extracted two-dimensional optimal EEG features is shown in FIG. 3. After learning the optimal features, the DRMM algorithm gives a predicted class label and, at the same time, finds a rectangular decision rule for each cluster. The rectangular decision rules make the differences between clusters visually interpretable. Receiver operating characteristic (ROC) curves are plotted for the labels predicted by the DRMM after optimal feature learning, and the corresponding AUC values are calculated, as shown in FIG. 4. Subjects A and B are brain-computer interface competition data and subjects F and G are self-collected data. The AUC values of the three subjects other than subject G are all greater than 0.96, indicating good performance.
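For reference, the AUC of a set of predicted scores can be computed without plotting the curve, using its interpretation as the probability that a positive sample is scored above a negative one. The labels and scores below are made-up illustrative values, not results from the experiments, and the function name is hypothetical.

```python
import numpy as np

def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

labels = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8])
print(roc_auc(labels, scores))  # 0.75
```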

To demonstrate the performance of the algorithm of the invention, the DRMM clustering results are compared with those of fuzzy C-means (FCM) and K-means clustering, as shown in Table 1, where bold numbers mark the best recognition accuracy for each subject. As the table shows, the recognition accuracy of the DRMM is at least 0.91 and the DRMM outperforms FCM and K-means for all subjects, which verifies the effectiveness of the feature optimization algorithm.
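A K-means baseline of the kind used in such a comparison can be sketched as below. This is an illustrative implementation on synthetic two-dimensional features, not the experimental code: the farthest-point initialization, the function names and the data are assumptions, and accuracy is scored up to a swap of the two cluster labels.

```python
import numpy as np

def kmeans(x, k=2, iters=50):
    """Plain K-means with a deterministic farthest-point initialization."""
    centers = [x[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(x - c, axis=1) for c in centers], axis=0)
        centers.append(x[d.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        assign = dist.argmin(axis=1)
        # Keep the old center if a cluster happens to be empty.
        centers = np.array([x[assign == j].mean(axis=0) if np.any(assign == j)
                            else centers[j] for j in range(k)])
    return assign

def two_cluster_accuracy(true, pred):
    """Accuracy for two clusters, allowing the cluster labels to be swapped."""
    acc = np.mean(true == pred)
    return max(acc, 1.0 - acc)

rng = np.random.default_rng(0)
a = rng.normal(-3.0, 0.5, size=(50, 2))    # synthetic class-0 features
b = rng.normal(3.0, 0.5, size=(50, 2))     # synthetic class-1 features
x = np.vstack([a, b])
true = np.repeat([0, 1], 50)
pred = kmeans(x)
print(two_cluster_accuracy(true, pred))
```

Unlike the rectangular rules found by the DRMM, the K-means assignment comes with no axis-aligned decision boundaries, so it provides an accuracy figure but not an interpretable rule.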

The overall flow chart of the present invention is shown in fig. 5.

Finally, it should be noted that: the above-mentioned embodiments are only used for illustrating the technical solution of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A motor imagery electroencephalogram signal classification method based on a hybrid model, characterized in that it comprises the following steps:

S31, let $X_i$ denote the multi-channel EEG signal under a motor imagery task stored in a database, where i = 1, 2, with 1 denoting left-hand motor imagery and 2 denoting right-hand motor imagery; the normalized spatial covariance matrix $R_i$ is:

$$R_i = \frac{X_i X_i^{T}}{\operatorname{trace}(X_i X_i^{T})}$$

where $X_i$ is the EEG signal, $X_i^{T}$ is the transpose of $X_i$, and trace(·) is the sum of the diagonal elements of a matrix;

$$\bar{R}_1 = \frac{1}{N}\sum_{n=1}^{N} R_{1n}, \qquad \bar{R}_2 = \frac{1}{N}\sum_{n=1}^{N} R_{2n}$$

where $\bar{R}_1$ is the average spatial covariance matrix under left-hand motor imagery, $\bar{R}_2$ is the average spatial covariance matrix under right-hand motor imagery, N is the number of trials of each class, and $R_{1n}$ and $R_{2n}$ are the normalized spatial covariance matrices of the n-th trial under the left-hand and right-hand motor imagery tasks respectively;

where W is the eigenvector matrix, Λ is the diagonal matrix of eigenvalues, and $W^{T}$ is the transpose of W;

S34, filtering the EEG signal through the W matrix using the following formula:

$$Z_{C\times T} = W_{C\times C}\,R_{C\times T}$$

where C is the number of channels with which the EEG signal is acquired, T is the number of samples in one trial, and $Z_{C\times T}$ is the filtered matrix;

S35, selecting the first m rows and the last m rows of the filtered matrix $Z_{C\times T}$ to form $Z_p$, where m is the number of pairs of spatial filters and p = 2m, and extracting the feature vector $F_p$ from the matrix $Z_p$:

$$F_p = \log\left(\frac{\operatorname{var}(Z_p)}{\sum_{i=1}^{2m}\operatorname{var}(Z_i)}\right)$$

where the matrix $Z_p$ comprises the first m rows and the last m rows of $Z_{C\times T}$, m is the number of pairs of spatial filters, i.e. m pairs of spatial filters are selected, and i is the summation index, meaning that $\operatorname{var}(Z_1), \ldots, \operatorname{var}(Z_{2m})$ are computed and summed;

S36, setting the following optimal discriminant criterion R(d) and searching for the optimal discriminant vectors:

$$R(d) = \frac{(d^{T}\Delta)^{2}}{d^{T}A\,d}$$

where $A = cW_1 + (1-c)W_2$ with $0 \le c \le 1$, d is the column vector onto which the data are projected, $d^{T}$ is the transpose of d, and $W_1$ and $W_2$ are the within-class scatters of class 1 and class 2 respectively, calculated as:

$$W_1 = \sum_{j=1}^{N_1}(x_{1j}-\mu_1)(x_{1j}-\mu_1)^{T}, \qquad W_2 = \sum_{j=1}^{N_2}(x_{2j}-\mu_2)(x_{2j}-\mu_2)^{T}$$

where c is the weight between $W_1$ and $W_2$; since the two classes of trials contribute equally, c = 0.5. Δ is the difference between the estimated means, $\Delta = \mu_1 - \mu_2$, where $\mu_i$ is the mean of class i, calculated as:

$$\mu_i = \frac{1}{N_i}\sum_{j=1}^{N_i}x_{ij}$$

where $x_{ij} = (x_{ij1}, x_{ij2}, \ldots, x_{ijL})^{T}$ is the j-th sample vector of class i, L is the number of vector elements, and $N_i$ is the number of samples in class i;

the first optimal projection direction $d_1$ is calculated as:

$$d_1 = \alpha_1 A^{-1}\Delta$$

where $\alpha_1$ is a normalizing constant chosen such that $d_1^{T}d_1 = 1$, that is,

$$\alpha_1^{2} = \left[\Delta^{T}(A^{-1})^{2}\Delta\right]^{-1}$$

The second optimal projection direction is obtained by maximizing R(d) subject to the constraint that $d_1$ and $d_2$ are orthogonal, i.e.

$$d_2^{T}d_1 = 0$$

Solving gives the second optimal projection direction:

$$d_2 = \alpha_2\left[A^{-1} - b_1(A^{-1})^{2}\right]\Delta$$

where $\alpha_2$ plays the same role as $\alpha_1$, and $b_1$ is:

$$b_1 = \frac{\Delta^{T}(A^{-1})^{2}\Delta}{\Delta^{T}(A^{-1})^{3}\Delta}$$

2. The hybrid model-based motor imagery electroencephalogram signal classification method of claim 1, wherein: the step S2 is specifically to slice the acquired EEG data, store the processed EEG data in a structure of a four-dimensional tensor, where four dimensions of the four-dimensional tensor are the sampling number, the channel number, the event triggering number and the category.

3. The hybrid model-based motor imagery electroencephalogram signal classification method of claim 1, wherein: in step S4, the interpretable clusters are formed from the prior information by a DRMM algorithm, and decision boundaries are established to obtain the final class attribution results.

4. The hybrid model-based motor imagery electroencephalogram signal classification method of claim 3, wherein: the implementation process of the DRMM algorithm in step S4 includes the following sub-steps:

S41, let X denote the features generated by the rule and let $y_n$ contain the cluster structure; the features are normalized so that each feature has zero mean and unit variance; let k be the index of a cluster and let $Z_{nk}$ indicate whether the n-th sample belongs to the k-th cluster; the vector

$$t_{kd} = (t_{kd}^{-},\ t_{kd}^{+})^{T}$$

represents the decision boundary of cluster k in dimension d, where $t_{kd}^{-}$ is the lower boundary, $t_{kd}^{+}$ is the upper boundary, and $(\cdot)^{T}$ denotes the transpose. All samples belonging to the k-th cluster are assumed to lie within the corresponding decision rectangle, i.e. every $x_{nd}$ of a sample belonging to the cluster satisfies

$$t_{kd}^{-} \le x_{nd} \le t_{kd}^{+}$$

Suppose that all decision boundaries obey the prior distribution $p(t_{kd})$:

where

and

indicate the position of the decision boundary of the prior rule, and $\alpha_t$ and $\beta_t$ are positive parameters controlling the balance between these terms;

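S42, in the decision matrix

a parameter $Z_{nk}$ is defined to represent the cluster index:

$$Z_{nk} = \prod_{d=1}^{D} f(x_{nd}-t_{kd}^{-})\,f(t_{kd}^{+}-x_{nd})$$

where D = 2 and f(t) is an indicator function of the condition t ≥ 0:

$$f(t) = \begin{cases}1, & t \ge 0\\ 0, & t < 0\end{cases}$$

The product acts as a logical AND operator, so if $x_n$ is within the decision rectangle, $Z_{nk} = 1$; otherwise $Z_{nk} = 0$. The non-differentiability of f(t) makes it difficult to optimize the decision rectangle, so g(t) is used instead of f(t), where g(t) is defined as follows:

$$g(t) = \frac{1}{1+\exp(-a\,t)}$$

where a is a positive steepness parameter, and substituting g(t) for f(t) defines the variable

$$\gamma_{nk} = \prod_{d=1}^{D} g(x_{nd}-t_{kd}^{-})\,g(t_{kd}^{+}-x_{nd})$$

Since 0 < g(t) < 1 for all real values of t, it follows that 0 < $\gamma_{nk}$ < 1. For the k-th cluster, if sample $x_n$ lies within the rectangular decision boundary, $\gamma_{nk} \approx 1$; otherwise $\gamma_{nk} \approx 0$. Hence $\gamma_{nk}$ can be used as a soft cluster index;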

S43, suppose $y_n$ obeys a Gaussian mixture distribution:

$$p(y_n \mid z_n) = \prod_{k=1}^{K}\mathcal{N}(y_n \mid \mu_k, \Sigma_k)^{z_{nk}}$$

where $p(y_n \mid z_n)$ is the conditional probability, and $\mu_k$ and $\Sigma_k$ are the mean vector and covariance matrix of the k-th cluster;

the joint probability of the DRMM is:

where T denotes all of the rectangular decision boundaries and Θ denotes all parameters; the random variable $\phi_n$ equals 1 if the feature is within the decision rectangle and 0 otherwise; Z is the latent variable, {X, Y} are the observed variables, and N is the number of samples;

where $x_n$ is the n-th feature and const denotes a constant that does not depend on Z; given the posterior distribution of Z, the expected value of the joint log-probability W(T, Θ) is calculated by the following equation: