CN103077318A

CN103077318A - Classifying method based on sparse measurement

Info

Publication number: CN103077318A
Application number: CN2013100176108A
Authority: CN
Inventors: 徐鹏; 李沛洋; 张锐; 田春阳; 郭兰锦; 尧德中
Original assignee: University of Electronic Science and Technology of China
Current assignee: University of Electronic Science and Technology of China
Priority date: 2013-01-17
Filing date: 2013-01-17
Publication date: 2013-05-01

Abstract

The invention discloses a classification method based on sparse measurement. The method comprises the following steps: computing a between-class scatter matrix and a within-class scatter matrix; decomposing the resulting matrices; converting the Fisher criterion into an L1-norm form; and estimating the projection vector that maximizes the objective function. Existing linear discriminant analysis (LDA) based on an L2-norm measure function undesirably amplifies noise such as outliers. By adopting an L1-norm measure function when constructing the discriminant analysis, the disclosed method overcomes the sensitivity of existing L2-norm LDA to outliers and isolated noise points, thereby obtaining robust classification and recognition performance and improving, to a certain extent, the stability of a brain-computer interface (BCI) system.

Description

A classification method based on sparse measurement

Technical field

The invention belongs to the technical field of biomedical information, and specifically relates to pattern classification methods in the field of brain-computer interfaces.

Background technology

A brain-computer interface (BCI) is a channel for direct communication and control between the human brain and a computer or other external electronic device (Wolpaw JR, Birbaumer N, McFarland DJ, Pfurtscheller G, Vaughan TM (2002) Brain-computer interfaces for communication and control. Clin Neurophysiol 113, 767-791). BCI research involves many disciplines, such as neuroscience, signal acquisition, signal processing, pattern recognition and control theory, and the cross-development of these disciplines has advanced BCI research. Basic theory and clinical application research on BCI fall within brain science and neural engineering, and many authoritative international institutions regard them as one of the frontiers and focal points of brain and neuroscience research in the 21st century.

A BCI system comprises three parts: signal acquisition, data processing, and peripheral devices and interfaces. In data processing, pattern classification of the EEG signal is the key to whether the BCI system can translate brain information. Many pattern classification methods are used in brain-computer interfaces; among them, linear discriminant analysis (LDA) is widely used in BCI pattern recognition because of its simple structure and low computational complexity.

The basic idea of linear discriminant analysis is to project samples from a high-dimensional space onto a low-dimensional feature space so that, in the new space, the projected samples have maximum between-class scatter and minimum within-class scatter. The feature space satisfying this condition is the subspace spanned by the vectors w that maximize formula (1):

J(w) = \frac{w^{T} S_{B} w}{w^{T} S_{W} w} \qquad (1)

S_{B} = \sum_{i=1}^{c} n_{i} (μ_{i} - μ)(μ_{i} - μ)^{T} \qquad (2)

S_{W} = \sum_{i=1}^{c} \sum_{x_{k} \in \text{class } i} (μ_{i} - x_{k})(μ_{i} - x_{k})^{T} \qquad (3)

μ_{i} = \frac{1}{n_{i}} \sum_{x \in \text{class } i} x \qquad (4)

μ = \frac{1}{m} \sum_{i=1}^{m} x_{i} \qquad (5)

Here m is the total number of samples, c is the total number of classes, and n_i is the number of samples in class i, so that n_1 + n_2 + ... + n_c = m. x_k denotes the k-th sample in the training set, "class i" indicates that the sample belongs to class i, μ_i is the mean of the samples in class i, and μ is the mean of all samples. w is an m × 1 column vector and w^T is its transpose; S_B and S_W are the between-class scatter matrix and the within-class scatter matrix respectively, and both are m × m matrices.
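As an illustration of formulas (2) to (5), the following minimal NumPy sketch computes both scatter matrices. The function name `scatter_matrices`, the row-per-sample layout of `X`, and the variable `d` for the feature dimension are assumptions made for this example and do not come from the patent text.

```python
import numpy as np

def scatter_matrices(X, y):
    """Return (S_B, S_W) for samples X (one row per sample) and labels y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mu = X.mean(axis=0)                   # overall mean, formula (5)
    d = X.shape[1]                        # feature dimension (assumed name)
    S_B = np.zeros((d, d))
    S_W = np.zeros((d, d))
    for label in np.unique(y):
        Xi = X[y == label]                # samples of one class
        n_i = Xi.shape[0]
        mu_i = Xi.mean(axis=0)            # class mean, formula (4)
        diff = (mu_i - mu)[:, None]
        S_B += n_i * (diff @ diff.T)      # formula (2)
        centered = Xi - mu_i
        S_W += centered.T @ centered      # formula (3)
    return S_B, S_W

# Tiny hand-checkable example: two classes with two samples each.
X = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 4.0], [6.0, 4.0]])
y = np.array([0, 0, 1, 1])
S_B, S_W = scatter_matrices(X, y)
```

A useful sanity check is that S_B + S_W equals the total scatter of the data about the overall mean.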

By a suitable linear transformation and the introduction of a Lagrange multiplier λ, formula (1) can be expressed as:

L(w, λ) = w^{T} S_{B} w - λ (w^{T} S_{W} w - 1) \qquad (6)

Taking the partial derivative of both sides of formula (6) with respect to w gives (up to a constant factor of 2):

\frac{\partial L(w, λ)}{\partial w} = S_{B} w - λ S_{W} w \qquad (7)

Setting it to 0 gives:

S_{B} w = λ S_{W} w \qquad (8)

The generalized eigenvector w of formula (8) is the projection vector obtained by LDA.
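For reference, formula (8) can be solved numerically as sketched below. This is only an illustrative baseline: the function name `lda_direction` and the small `ridge` term added to S_W (so that the linear solve stays well defined when S_W is singular) are assumptions of this example, not part of the described method.

```python
import numpy as np

def lda_direction(S_B, S_W, ridge=1e-8):
    """Leading generalized eigenvector of S_B w = lambda * S_W w."""
    d = S_W.shape[0]
    # Assumption: a tiny ridge keeps S_W invertible for the solve.
    M = np.linalg.solve(S_W + ridge * np.eye(d), S_B)
    eigvals, eigvecs = np.linalg.eig(M)
    k = np.argmax(eigvals.real)           # largest generalized eigenvalue
    w = eigvecs[:, k].real
    return w / np.linalg.norm(w)          # unit length; the sign is arbitrary

S_B = np.array([[4.0, 0.0], [0.0, 0.0]])
S_W = np.array([[1.0, 0.0], [0.0, 2.0]])
w = lda_direction(S_B, S_W)
lam = (w @ S_B @ w) / (w @ S_W @ w)       # Rayleigh quotient, formula (1)
```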

LDA is essentially defined in terms of the L2 norm. In EEG research, however, the dimension of the EEG samples collected from the EEG amplifier is usually much larger than the number of samples, and the signals are highly random; this problem is especially prominent in the brain-computer interface field. Although solutions produced by L2-norm modeling are smooth, they are not sparse, and they are easily affected by outliers and isolated noise points caused during experiments by loose electrodes, eye blinks and operating errors. These can even make the signal matrix singular, leading to a wrong estimate of the projection w and ultimately degrading classification performance. Such interference also makes EEG signals harder to read and hinders their subsequent analysis.

Summary of the invention

The object of the invention is to propose a classification method based on sparse measurement in order to solve the above problems of existing L2-norm-based linear discriminant analysis.

The technical solution adopted by the invention is a classification method based on sparse measurement, which comprises the following steps:

S1. Compute the between-class scatter matrix S_B and the within-class scatter matrix S_W from the EEG training samples;

S2. From the between-class scatter matrix S_B and the within-class scatter matrix S_W obtained in step S1, construct the corresponding φ_B and φ_W, where φ_B and φ_W satisfy S_B = φ_B φ_B^T and S_W = φ_W φ_W^T respectively, and T denotes matrix transposition;

S3. Convert the Fisher criterion into an L1-norm form and map this form into log space to obtain the objective function J^*(w), where w is the projection vector; then estimate the projection vector w that maximizes the objective function.
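The text does not spell out the L1-norm form or the log-space objective. One reading that is consistent with the factorization S_B = φ_B φ_B^T, the sign terms P_i and the column notation φ_B(:, i) used in the embodiment is sketched below. The exact form, and the symbol J_{L1}, are assumptions of this sketch rather than the patent's own statement.

```latex
% Assumed L1-norm counterpart of the Fisher criterion
J_{L1}(w) = \frac{\lVert w^{T} \varphi_{B} \rVert_{1}}{\lVert w^{T} \varphi_{W} \rVert_{1}}
          = \frac{\sum_{i=1}^{m} \lvert w^{T} \varphi_{B}(:, i) \rvert}
                 {\sum_{j=1}^{m} \lvert w^{T} \varphi_{W}(:, j) \rvert}

% Assumed log-space objective
J^{*}(w) = \log \sum_{i=1}^{m} \lvert w^{T} \varphi_{B}(:, i) \rvert
         - \log \sum_{j=1}^{m} \lvert w^{T} \varphi_{W}(:, j) \rvert
```

Under this reading, replacing each L1 norm by a squared L2 norm recovers the ratio in formula (1).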

Further, the method also comprises step S4: project the test samples onto the space spanned by the vector w and classify them.

The beneficial effects of the invention are as follows. Existing LDA based on an L2-norm measure function undesirably amplifies noise such as outliers. The method of the invention adopts an L1-norm measure function when constructing the discriminant analysis, which overcomes the sensitivity of existing L2-norm LDA to outliers and isolated noise points, yields robust classification and recognition performance, and improves the stability of the BCI system to a certain extent. Experiments show that the proposed method converges in a short time and thus meets the real-time requirements of online brain-computer interface systems.

Description of drawings

Fig. 1 is a schematic flowchart of the method of the invention.

Fig. 2 is a schematic diagram of the Gaussian data simulation in an embodiment of the invention.

Fig. 3 is a comparison of colorectal cancer gene recognition results in an embodiment of the invention.

Embodiment

The invention is further described below with reference to the drawings and specific embodiments.

Because S_B and S_W are real symmetric matrices, each can in essence be written as the product of a matrix and its own transpose, so formula (1) can also be expressed as:

J(w) = \frac{w^{T} φ_{B} φ_{B}^{T} w}{w^{T} φ_{W} φ_{W}^{T} w} \qquad (9)

The corresponding φ_B and φ_W can be found from S_B and S_W by a suitable matrix decomposition.

The method of the invention can therefore be described as follows:

S1. To construct the between-class and within-class scatter matrices, first compute the sample mean μ_i of each class and the overall sample mean μ from the training samples, then compute the between-class scatter matrix S_B and the within-class scatter matrix S_W according to formulas (2) and (3);

S2. From the between-class scatter matrix S_B and the within-class scatter matrix S_W obtained in step S1, construct the corresponding φ_B and φ_W, where φ_B and φ_W satisfy S_B = φ_B φ_B^T and S_W = φ_W φ_W^T respectively, and T denotes matrix transposition;

Here singular value decomposition (SVD) can be used to construct the corresponding φ_B and φ_W. For S_B the formula is as follows, and S_W is treated in the same way:

SVD(S_{B}) = U_{B} Σ_{B} V_{B} = U_{B} Σ_{B} U_{B}^{T} = U_{B} \sqrt{Σ_{B}} \sqrt{Σ_{B}} U_{B}^{T} = φ_{B} φ_{B}^{T} \qquad (10)

Because S_B and S_W are real symmetric matrices, the left singular matrix U and the right singular matrix V obtained from the decomposition differ only by a transpose, and each is of size m × m. Σ_B = diag(σ_B1, σ_B2, ..., σ_Bm) and Σ_W = diag(σ_W1, σ_W2, ..., σ_Wm) are the diagonal matrices formed by the singular values of S_B and S_W respectively, with σ_B1 ≥ σ_B2 ≥ ... ≥ σ_Bm ≥ 0 and σ_W1 ≥ σ_W2 ≥ ... ≥ σ_Wm ≥ 0;

φ_B = U_B \sqrt{Σ_B} and φ_W = U_W \sqrt{Σ_W} are m × m matrices.
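The factorization in formula (10) can be checked numerically with a short sketch. The helper name `sqrt_factor` is hypothetical, and the example assumes the input is symmetric positive semidefinite, as scatter matrices are.

```python
import numpy as np

def sqrt_factor(S):
    """Return phi with phi @ phi.T equal to S (S symmetric, positive semidefinite)."""
    U, sigma, _ = np.linalg.svd(S)        # singular values come back in descending order
    return U @ np.diag(np.sqrt(sigma))    # phi = U * sqrt(Sigma), as in formula (10)

S_B = np.array([[16.0, 16.0], [16.0, 16.0]])   # rank-one scatter matrix
phi_B = sqrt_factor(S_B)
reconstruction = phi_B @ phi_B.T
```

The same helper would be applied to S_W to obtain φ_W.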

S3. Convert the Fisher criterion into an L1-norm form and map this form into log space to obtain the objective function J^*(w), where w is the projection vector; then estimate the projection vector w that maximizes the objective function.
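A minimal sketch of evaluating such an objective follows. It assumes the log-space objective has the form log Σ_i |w^T φ_B(:, i)| − log Σ_j |w^T φ_W(:, j)|, which the text does not state explicitly; the function name `l1_log_objective` and the small `eps` guard inside the logarithms are likewise assumptions of this example.

```python
import numpy as np

def l1_log_objective(w, phi_B, phi_W, eps=1e-12):
    """Assumed objective: log of L1 between-class term minus log of L1 within-class term."""
    between = np.sum(np.abs(w @ phi_B))   # sum_i |w^T phi_B(:, i)|
    within = np.sum(np.abs(w @ phi_W))    # sum_j |w^T phi_W(:, j)|
    return np.log(between + eps) - np.log(within + eps)

phi_B = np.array([[2.0, 0.0], [0.0, 1.0]])
phi_W = np.eye(2)
value = l1_log_objective(np.array([1.0, 0.0]), phi_B, phi_W)
scaled = l1_log_objective(np.array([3.0, 0.0]), phi_B, phi_W)
```

Under this assumed form the objective does not change when w is rescaled, so only the direction of w matters.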

The above is the process of training the projection vector w. As a preferred scheme, a test step is also included here: S4. project the test samples onto the space spanned by the vector w and classify them; specifically, the nearest-neighbor rule can be used to estimate the class of each test sample.
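Step S4 might look like the sketch below, which projects training and test samples onto w and assigns each test sample the label of the closest projected training sample. The function name `nearest_neighbor_predict`, the row-per-sample layout and the use of the absolute difference of the one-dimensional projections as the distance are assumptions of this example.

```python
import numpy as np

def nearest_neighbor_predict(w, X_train, y_train, X_test):
    """1-nearest-neighbor classification in the 1-D space spanned by w."""
    z_train = X_train @ w                 # projected training samples
    z_test = X_test @ w                   # projected test samples
    # For each test projection, index of the closest training projection.
    idx = np.argmin(np.abs(z_test[:, None] - z_train[None, :]), axis=1)
    return y_train[idx]

w = np.array([1.0, 0.0])
X_train = np.array([[0.0, 5.0], [1.0, -3.0], [4.0, 2.0], [5.0, 0.0]])
y_train = np.array([0, 0, 1, 1])
X_test = np.array([[0.4, 9.0], [4.6, -9.0]])
pred = nearest_neighbor_predict(w, X_train, y_train, X_test)
```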

In step S3, the projection vector w that maximizes the objective function is estimated as follows:

S31. Differentiate the objective function J^*(w) with respect to w to obtain the corresponding gradient dw, in which the sign term P_i is defined as

P_{i} = \begin{cases} 1, & w^{T} φ_{B}(:, i) \geq 0 \\ -1, & w^{T} φ_{B}(:, i) < 0 \end{cases}

φ_B(:, i) and φ_W(:, j) denote the i-th column vector of φ_B and the j-th column vector of φ_W respectively, and m is the number of column vectors making up φ_B and φ_W;
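The gradient can be sketched as follows, again assuming the objective J^*(w) = log Σ_i |w^T φ_B(:, i)| − log Σ_j |w^T φ_W(:, j)|. The sign vector `Q` for φ_W mirrors P_i and is a hypothetical name; the finite-difference helper is included only to check the analytic gradient.

```python
import numpy as np

def objective(w, phi_B, phi_W):
    return np.log(np.sum(np.abs(w @ phi_B))) - np.log(np.sum(np.abs(w @ phi_W)))

def gradient(w, phi_B, phi_W):
    """Analytic (sub)gradient of the assumed objective with respect to w."""
    pb = w @ phi_B
    pw = w @ phi_W
    P = np.where(pb >= 0, 1.0, -1.0)      # sign terms P_i from step S31
    Q = np.where(pw >= 0, 1.0, -1.0)      # assumed analogue for phi_W
    return phi_B @ P / np.sum(np.abs(pb)) - phi_W @ Q / np.sum(np.abs(pw))

def numerical_gradient(w, phi_B, phi_W, h=1e-6):
    """Central finite differences, used only as a check."""
    basis = np.eye(w.size)
    return np.array([(objective(w + h * e, phi_B, phi_W)
                      - objective(w - h * e, phi_B, phi_W)) / (2 * h) for e in basis])

phi_B = np.array([[2.0, 0.0, 1.0], [0.0, 1.0, -1.0], [1.0, 0.0, 3.0]])
phi_W = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 1.0], [0.0, -1.0, 1.0]])
w = np.array([1.0, -2.0, 0.5])
dw = gradient(w, phi_B, phi_W)
dw_check = numerical_gradient(w, phi_B, phi_W)
```

Because the assumed objective is invariant to the scale of w, the gradient is orthogonal to w wherever it is defined.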

S32. Estimate the target vector w using the gradient descent method. The estimation proceeds as follows.

The update is expressed as:

w(t+1) = w(t) - δ · w(t)^{T} · dw

Here t = 0, 1, ..., n is the iteration index; w(t) is an m × 1 vector holding the estimate obtained at the t-th iteration, and w(t+1) is the estimate obtained at the (t+1)-th iteration, which is a linear combination of w(t). The initial vector w(0) ∈ R^m can be set to any non-zero vector. δ is the iteration step size; the value that maximizes J^*(w(t+1)) can be chosen as the δ of the current iteration, which ensures fast convergence of the algorithm. The iteration stops when ||J^*(w(t+1)) - J^*(w(t))|| < α, where α is a preset threshold. In this embodiment, α ≤ 1e-5.
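The iteration can be illustrated with the sketch below. It is not the update rule printed above: as a labeled substitution, it uses a plain gradient-ascent step w + δ·dw followed by renormalization, picks δ from a fixed list of candidate steps so that J^* is largest, and stops once the improvement falls below α. The assumed objective, the candidate step list and all function names are assumptions of this example.

```python
import numpy as np

def objective(w, phi_B, phi_W):
    return np.log(np.sum(np.abs(w @ phi_B))) - np.log(np.sum(np.abs(w @ phi_W)))

def gradient(w, phi_B, phi_W):
    pb, pw = w @ phi_B, w @ phi_W
    P = np.where(pb >= 0, 1.0, -1.0)
    Q = np.where(pw >= 0, 1.0, -1.0)
    return phi_B @ P / np.sum(np.abs(pb)) - phi_W @ Q / np.sum(np.abs(pw))

def estimate_w(phi_B, phi_W, w0, alpha=1e-5, max_iter=500,
               steps=(1.0, 0.3, 0.1, 0.03, 0.01, 0.003, 0.001)):
    """Gradient ascent on the assumed J*, with the step chosen to maximize J*."""
    w = w0 / np.linalg.norm(w0)           # J* is scale invariant, so normalizing is harmless
    J = objective(w, phi_B, phi_W)
    for _ in range(max_iter):
        dw = gradient(w, phi_B, phi_W)
        candidates = [w + s * dw for s in steps]
        candidates = [c / np.linalg.norm(c) for c in candidates]
        values = [objective(c, phi_B, phi_W) for c in candidates]
        best = int(np.argmax(values))     # delta giving the largest J*(w(t+1))
        improvement = values[best] - J
        if improvement > 0:
            w, J = candidates[best], values[best]
        if improvement < alpha:           # stopping rule with threshold alpha
            break
    return w, J

phi_B = np.diag([3.0, 1.0])
phi_W = np.eye(2)
w_hat, J_hat = estimate_w(phi_B, phi_W, w0=np.array([1.0, 1.0]))
```

In this toy problem the assumed objective is bounded above by log 3, which is attained when w points along the first axis, so the returned value can be compared against that bound.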

To verify the feasibility and effectiveness of this scheme, the method of the invention (L1_LDA) was compared with traditional LDA in three groups of experiments. The first group is a simulation that uses two classes of Gaussian data with different means and standard deviations and introduces outliers; the second group uses data from the international brain-computer interface competition to verify the effect of the scheme in brain-computer interfaces; the third group uses the colorectal cancer data of Princeton University to verify that the scheme can be extended to other fields.

First group of experiment:

Data A are drawn from a Gaussian distribution with mean vector [3; 3] and standard deviation vector [0.5; 0.5]. The mean vectors of data B are listed in Table 1:

Table 1

Data B are drawn from Gaussian distributions with standard deviation vector [0.5; 0.5] (data B gradually approach data A along the diagonal). The outliers are drawn from a Gaussian distribution with mean vector [7.25; 7.25] and standard deviation vector [0.5; 0.5], and the number of outliers is 10. The data distribution and projections in the experiment are shown in Fig. 2, where: (a) shows the scatter plot of the data after the outliers are introduced, with '*' and 'o' marking data from the two sample sets; after the outliers are introduced, part of the data is shifted and becomes abnormal. (b) shows the projection of the test-set samples by traditional LDA; because of the outliers, part of the data is mapped by the LDA projection vector far from the cluster centers, which increases the probability of misclassification. (c) shows the projection of the test samples by LDA with the L1-norm constraint; because the L1 norm suppresses the outliers, the two classes of data are concentrated near their respective cluster centers after projection, which improves the stability of the classification.

The classification accuracy is shown in Table 2, where bold marks the better result and "*" marks a significant difference (p < 0.05).

Table 2

Second group of experiment:

The experimental data A come from the third international brain-computer interface competition (subject IDs aa, al, av, aw and ay); details are shown in Table 3:

Table 3

The 3rd group of experiment:

Experiment description: the data are the colorectal cancer tissue data of Princeton University, comprising 62 colorectal tissue samples from different subjects, each with 2000 gene expression values; 40 of the tissue samples are positive and 22 are negative. 5-fold cross-validation is used, and the average classification accuracy over 50 runs is shown in Fig. 3.

The above experimental results all show that the method of the invention is feasible, and in some cases its advantage over traditional LDA is particularly evident. Overall, for both the simulation and the real experimental data, the results verify the effectiveness and feasibility of the method, which is of theoretical and practical significance for improving the stability of EEG research systems and extracting brain function information more effectively. The experiments also show that the method performs well not only in the brain-computer interface field but also for pattern recognition in other fields.

Those of ordinary skill in the art will appreciate that the embodiments described here are intended to help the reader understand the principle of the invention, and that the scope of protection of the invention is not limited to these particular statements and embodiments. Based on the technical teachings disclosed by the invention, those of ordinary skill in the art can make various other specific variations and combinations that do not depart from the essence of the invention, and these variations and combinations remain within the scope of protection of the invention.

Claims

1. A classification method based on sparse measurement, comprising the following steps:

S2. From the between-class scatter matrix S_B and the within-class scatter matrix S_W obtained in step S1, construct the corresponding φ_B and φ_W, wherein φ_B and φ_W satisfy S_B = φ_B φ_B^T and S_W = φ_W φ_W^T respectively;

the L1-norm form is mapped to log space and converted into the objective function J^*(w).

2. The classification method based on sparse measurement according to claim 1, characterized in that it further comprises step S4: projecting the test samples onto the space spanned by the vector w and classifying them.

3. The classification method based on sparse measurement according to claim 1 or 2, characterized in that step S2 specifically uses singular value decomposition to construct φ_B and φ_W.

4. The classification method based on sparse measurement according to claim 1 or 2, characterized in that, in step S3, the projection vector w that maximizes the objective function is estimated as follows:

S31. the objective function J^*(w) is differentiated with respect to w to obtain the corresponding gradient dw, wherein

P_{i} = \begin{cases} 1, & w^{T} φ_{B}(:, i) \geq 0 \\ -1, & w^{T} φ_{B}(:, i) < 0 \end{cases}

φ_B(:, i) and φ_W(:, j) denote the i-th column vector of φ_B and the j-th column vector of φ_W respectively, and m is the number of column vectors making up φ_B and φ_W;

S32. the target vector w is estimated using the gradient descent method.

5. The classification method based on sparse measurement according to claim 4, characterized in that, in step S32, the target vector w is estimated using the gradient descent method as follows:

the update is expressed as:

w(t+1) = w(t) - δ · w(t)^{T} · dw

wherein t is the iteration index; w(t) is an m × 1 vector holding the estimate obtained at the t-th iteration, and w(t+1) is the estimate obtained at the (t+1)-th iteration; the initial vector w(0) ∈ R^m is set to any non-zero vector; δ is the iteration step size, and the value that maximizes J^*(w(t+1)) is chosen as the δ of the current iteration; the iteration stops when ||J^*(w(t+1)) - J^*(w(t))|| < α, where α is a preset threshold.