CN107491419A - Linear discriminant analysis method with bilinear low-rank subspace

Linear discriminant analysis method with bilinear low-rank subspace

Info

Publication number
CN107491419A
Authority
CN
China
Prior art keywords
matrix
discriminant analysis
low-rank
loss function
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710793137.0A
Other languages
Chinese (zh)
Other versions
CN107491419B (en)
Inventor
何小海 (He Xiaohai)
苏婕 (Su Jie)
卿粼波 (Qing Linbo)
周文 (Zhou Wen)
周文一 (Zhou Wenyi)
王正勇 (Wang Zhengyong)
吴晓红 (Wu Xiaohong)
熊淑华 (Xiong Shuhua)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date: 2017-09-06
Publication date: 2017-12-19
Application filed by Sichuan University
Priority to CN201710793137.0A
Publication of CN107491419A
Application granted
Publication of CN107491419B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Monitoring And Testing Of Nuclear Reactors (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a linear discriminant analysis method with a bilinear low-rank subspace, comprising the following steps: for a matrix data sample set, keep each sample in its original matrix form as input data; build a matrix-based discriminant analysis model and construct the loss function with a matrix-based least squares method; introduce a nuclear norm regularization term to impose a low-rank constraint on the set of mapping matrices; and solve the loss function within the alternating direction method of multipliers (ADMM) framework to find the bilinear low-rank subspace of the matrix sample set, thereby reducing its dimensionality. Compared with other methods, the linear discriminant analysis method with a bilinear low-rank subspace of the present invention achieves a clear improvement in dimensionality reduction on matrix sample sets, with good overall performance on the evaluation indices.

Description

Linear discriminant analysis method with bilinear low-rank subspace
Technical field
The present invention provides a linear discriminant analysis method with a bilinear low-rank subspace, and relates to the technical fields of machine learning, data dimensionality reduction, and data mining.
Background technology
With the rapid development of big data, related technologies such as data mining, machine learning and artificial intelligence have been widely applied. People's requirements on data have become increasingly complex: beyond storing data, there is a growing demand for exploring the structural correlations hidden behind the data, which leads to complex model structures and inefficient model training. Capturing the correlations within data is therefore essential for reducing data redundancy and computational complexity.
An intuitive way to reduce data redundancy is to capture the structural correlations among data through dimensionality reduction. PCA and LDA are two common dimensionality reduction methods, widely used for feature extraction in pattern recognition, such as Eigenfaces and Fisherfaces, and for signal structure correlation analysis in signal processing; PCA and LDA have achieved great success in dimensionality reduction.
Both PCA and LDA perform eigenvalue decomposition of the covariance matrix or scatter matrix, so both can be cast as generalized eigenvalue decomposition problems. Classical PCA and LDA represent each input data sample as a vector or scalar and build their models on these vectors to obtain the mapping matrix. However, data such as images and electroencephalograms (EEG) are themselves in matrix form; representing them as vectors makes the data dimensionality very high. In that case, the eigenvalue decomposition in classical PCA and LDA becomes extremely difficult and incurs a huge amount of computation. For LDA, an undersampling (small-sample-size) problem may even arise, making the eigenvalue problem unsolvable; moreover, representing such matrix data in vector form destroys its original internal row-column structure information.
To overcome the above drawbacks, researchers have proposed several dimensionality reduction methods based on data matrices, such as IMPCA, 2DPCA, (2D)^2PCA, 2DLDA, 2D-LDA and BDCA. However, these component analysis methods still require eigenvalue decomposition, whose computational cost is very high. Since, under certain conditions, linear discriminant analysis is equivalent to linear regression, the loss function of linear discriminant analysis can be constructed with the least squares method, avoiding the huge amount of computation caused by eigenvalue decomposition.
Summary of the invention
To solve the above problems, the present invention provides a more efficient Linear Discriminant Analysis with Bilinear Low-Rank Subspace (BLR-LDA). For a matrix sample data set, the invention builds a matrix-based discriminant analysis model, constructs the loss function with a least squares method based on a specific indicator matrix, and imposes a low-rank constraint on the set of mapping matrices by introducing a nuclear norm regularization term, so as to capture the structural correlations inside the input sample matrices and achieve a better discriminant analysis effect.
The present invention achieves the above purpose through the following technical solution:
A linear discriminant analysis method with a bilinear low-rank subspace comprises the following steps:
(1) For a matrix data sample set, take each sample in its original matrix form as input data.
(2) Build a matrix-based discriminant analysis model, and construct the loss function with a matrix-based least squares method to perform matrix discriminant analysis on the matrix sample set, so as to find its bilinear subspace.
(3) Impose a low-rank constraint on the set of mapping matrices by adding a nuclear norm regularization term to the loss function, so as to capture the row-column correlations inside the matrix data samples.
(4) Assume that the mapping matrices in the set are mutually independent, converting the original loss function into k sub-loss functions to be solved.
(5) Solve for the mapping matrices with the alternating direction method of multipliers (ADMM), mapping the matrix data sample set to the low-rank subspace.
Brief description of the drawings
Fig. 1 is a schematic framework diagram of classical LDA.
Fig. 2 is a schematic framework diagram of the BLR-LDA of the present invention.
Detailed description of the embodiments
The invention is further described below with reference to the accompanying drawings:
Fig. 1 is a schematic framework diagram of classical LDA, which comprises the following steps:
(1) Input the n data samples characterized in matrix form, and convert each data sample matrix into vector form:

x_i = vec(X_i) ∈ R^d, d = p × q, i = 1, ..., n

where X_i ∈ R^{p×q} denotes the i-th input data sample matrix and x_i denotes its corresponding vector form. The n data vectors then constitute the data matrix X = [x_1, x_2, ..., x_n] ∈ R^{d×n}.
(2) The mapping matrix W ∈ R^{d×k} maps the original data sample set, represented as vectors, to a low-dimensional space; the operation X^T W yields the mapped data vectors.
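As an illustration of the two steps above, a minimal NumPy sketch of the vectorize-and-project pipeline (the variable names and the random stand-in for W are illustrative, not part of the patent):

    import numpy as np

    p, q, n, k = 32, 32, 100, 5        # p x q samples, n samples, k projection directions
    d = p * q
    samples = np.random.randn(n, p, q) # n data samples in their original matrix form

    # Step (1): vectorize each p x q sample into the data matrix X of shape (d, n)
    X = samples.reshape(n, d).T

    # Step (2): map to the low-dimensional space through X^T W;
    # here W is a random placeholder for the mapping matrix learned by LDA
    W = np.random.randn(d, k)
    mapped = X.T @ W                   # mapped data, shape (n, k)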
Fig. 2 is BLR-LDA block schematic illustrations of the present invention.Comprise the following steps:
(1) Input the n data samples X_i characterized in matrix form so as to retain their structural information. Correspondingly, each column of the mapping matrix W in classical LDA corresponds to a mapping matrix W_j ∈ R^{p×q}, with d = p × q, j = 1, ..., k. The mapping matrix W of classical LDA is then represented by a new set of mapping matrices {W_j}.
(2) Under certain conditions, classical LDA is equivalent to linear regression, which allows classical LDA to be converted directly into a least squares problem, avoiding the huge computation caused by eigenvalue decomposition. Classical LDA can be expressed as follows:

min_W ||X^T W - Y||_F^2

where Y ∈ R^{n×k} is the centered indicator matrix, defined (as is common in least-squares LDA) as follows:

Y_ij = sqrt(n/n_j) - sqrt(n_j/n), if sample i belongs to class j
Y_ij = -sqrt(n_j/n), otherwise

where n is the total number of samples and n_j is the number of samples in class j.
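A short sketch of this indicator matrix for integer class labels 0, ..., k-1 (the helper name centered_indicator is ours):

    import numpy as np

    def centered_indicator(labels, k):
        # Y[i, j] = sqrt(n/n_j) - sqrt(n_j/n) if sample i is in class j,
        #           -sqrt(n_j/n)             otherwise
        n = len(labels)
        Y = np.zeros((n, k))
        for j in range(k):
            n_j = np.sum(labels == j)
            Y[:, j] = -np.sqrt(n_j / n)
            Y[labels == j, j] = np.sqrt(n / n_j) - np.sqrt(n_j / n)
        return Y

    labels = np.array([0, 0, 1, 1, 1, 2])
    Y = centered_indicator(labels, 3)
    print(Y.sum(axis=0))               # each column sums to 0: Y is centered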
(3) Based on the above relation, each column of the mapping matrix W = [w_1, ..., w_k] is obtained from the following least squares problem:

w_j = argmin_{w_j} Σ_{i=1}^n (Y_ij - x_i^T w_j)^2

where x_i ∈ R^d is the i-th data sample vector and w_j is the j-th column of the mapping matrix W in classical LDA.
(4) To retain the structural information of the data matrices, we convert the data sample vector x_i ∈ R^d back into the data sample matrix X_i ∈ R^{p×q}; correspondingly, the j-th column w_j of the projection matrix in classical LDA is converted into the matrix form W_j ∈ R^{p×q}. The term x_i^T w_j in the formula above is therefore expressed as follows:

x_i^T w_j = tr(W_j^T X_i)
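The identity x_i^T w_j = tr(W_j^T X_i) is a direct consequence of vectorization, as a quick numerical check (illustrative only) confirms:

    import numpy as np

    p, q = 4, 3
    X_i = np.random.randn(p, q)        # data sample in matrix form
    W_j = np.random.randn(p, q)        # mapping matrix in matrix form
    x_i = X_i.reshape(-1)              # vectorized sample
    w_j = W_j.reshape(-1)              # vectorized mapping vector

    print(np.allclose(x_i @ w_j, np.trace(W_j.T @ X_i)))   # True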
(5) To capture the structural information of the data sample matrices, we apply nuclear norm regularization to each W_j separately. The loss function of the mapping-matrix optimization problem is specified as follows:

L(W_(1,2,...,k)) = (1/2) Σ_{j=1}^k Σ_{i=1}^n (Y_ij - tr(W_j^T X_i))^2 + τ Σ_{j=1}^k ||W_j||_*

where τ is a hyperparameter obtained by cross validation.
(6) In the BLR-LDA framework, we assume that the matrices W_j are mutually independent. The above optimization problem can therefore be expressed as k sub-loss functions:

L(W_j) = (1/2) Σ_{i=1}^n (Y_ij - tr(W_j^T X_i))^2 + τ ||W_j||_*

We solve this optimization problem with the alternating direction method of multipliers (ADMM), splitting it into two subproblems in W_j and an auxiliary variable S_j. The formula above can be written equivalently as:

min_{W_j, S_j} F(W_j) + G(S_j)
s.t. W_j - S_j = 0

where F(W_j) = (1/2) Σ_{i=1}^n (Y_ij - tr(W_j^T X_i))^2 and G(S_j) = τ ||S_j||_*.
(7) The above problem is solved with the augmented Lagrangian method:

L_ρ(W_j, S_j, Λ_j) = F(W_j) + G(S_j) + ⟨Λ_j, W_j - S_j⟩ + (ρ/2) ||W_j - S_j||_F^2

where Λ_j is the Lagrange multiplier matrix and ρ > 0 is a hyperparameter.
First, the subproblem in the auxiliary variable S_j is solved as follows:

S_j^(t) = argmin_{S_j} τ ||S_j||_* + (ρ/2) ||W_j^(t) + Λ_j^(t)/ρ - S_j||_F^2

The optimal solution S_j^(t) at the t-th iteration is obtained from the following analytical expression:

S_j^(t) = (1/ρ) U S_τ(Σ) V^T

where ρ W_j^(t) + Λ_j^(t) = U Σ V^T is the singular value decomposition. For any τ > 0, S_τ(·) is the singular value thresholding operator, with S_τ(Σ)_ii = max(Σ_ii - τ, 0).
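A sketch of this S_j update via singular value thresholding, under the expression above (the helper name svt_update is ours):

    import numpy as np

    def svt_update(W_j, Lambda_j, tau, rho):
        # S_j = (1/rho) * U * S_tau(Sigma) * V^T, with rho*W_j + Lambda_j = U Sigma V^T
        U, sigma, Vt = np.linalg.svd(rho * W_j + Lambda_j, full_matrices=False)
        sigma_thr = np.maximum(sigma - tau, 0.0)   # S_tau(Sigma)_ii = max(Sigma_ii - tau, 0)
        return (U * sigma_thr) @ Vt / rho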
Then we solve the second subproblem in W_j, whose expression is as follows:

W_j^(t+1) = argmin_{W_j} F(W_j) + ⟨Λ_j^(t), W_j - S_j^(t)⟩ + (ρ/2) ||W_j - S_j^(t)||_F^2

This is a well-defined convex optimization problem, which we solve by gradient descent, updating W_j iteratively in the following form:

W_j ← W_j - α ∂L_ρ/∂W_j

where α > 0 is the learning rate, a hyperparameter. The partial derivative with respect to W_j is as follows:

∂L_ρ/∂W_j = -Σ_{i=1}^n (Y_ij - tr(W_j^T X_i)) X_i + Λ_j + ρ (W_j - S_j)
In addition, the Lagrange multiplier Λ_j is updated with the following single gradient step:

Λ_j^(t+1) = Λ_j^(t) + ρ (W_j^(t+1) - S_j^(t))
(8) Using the ADMM algorithm, W_j, S_j and Λ_j are updated in alternating iterations until convergence, which yields an optimal set of mapping matrices {W_j} and maps the matrix sample set into the bilinear low-rank subspace.
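Putting steps (5)-(8) together, a minimal sketch of the ADMM iteration for a single W_j, assuming samples Xs of shape (n, p, q) and the j-th indicator column y_j; the fixed iteration count, step size and names are illustrative, not prescribed by the patent:

    import numpy as np

    def blr_lda_admm(Xs, y_j, tau=1.0, rho=1.0, alpha=1e-3, n_iter=500):
        # Alternately update W_j, S_j and Lambda_j until convergence
        n, p, q = Xs.shape
        W = np.zeros((p, q)); S = np.zeros((p, q)); Lam = np.zeros((p, q))
        for _ in range(n_iter):
            # S_j update: singular value thresholding of rho*W + Lambda
            U, sig, Vt = np.linalg.svd(rho * W + Lam, full_matrices=False)
            S = (U * np.maximum(sig - tau, 0.0)) @ Vt / rho
            # W_j update: one gradient step on the augmented Lagrangian
            resid = y_j - np.einsum('pq,npq->n', W, Xs)       # Y_ij - tr(W^T X_i)
            grad = -np.einsum('n,npq->pq', resid, Xs) + Lam + rho * (W - S)
            W = W - alpha * grad
            # Lambda_j update: single dual gradient step
            Lam = Lam + rho * (W - S)
        return W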

Claims (4)

  1. A linear discriminant analysis method with a bilinear low-rank subspace, characterized by comprising the following steps:
    Step 1: for a set of n matrix data samples {X_i | X_i ∈ R^{p×q}, p × q = d, 1 ≤ i ≤ n} (e.g., image data or EEG data), take each data sample in its original matrix form as input data;
    Step 2: build a matrix-based discriminant analysis model and construct the loss function L(W_(1,2,...,k)) with a matrix-based least squares method; solve L(W_(1,2,...,k)) iteratively to obtain the mapping matrix set W_j ∈ R^{p×q} (j = 1, 2, ..., k), so as to find the bilinear subspace of the matrix sample set and thereby reduce its dimensionality;
    Step 3: add the nuclear norm regularization term ||W_j||_* to the loss function L(W_(1,2,...,k)) to impose a low-rank constraint on each mapping matrix W_j, so as to capture the row-column correlation inside the matrix data samples;
    Step 4: assume that the k matrices W_j are mutually independent, converting the problem of solving the loss function L(W_(1,2,...,k)) into the problem of solving k sub-loss functions L(W_j), so that the k functions L(W_j) can be solved separately;
    Step 5: since the sub-loss function L(W_j) is non-smooth, solve L(W_j) within the alternating direction method of multipliers (ADMM) framework, converting the original optimization problem into two sub-optimization problems; the mapping matrix set is obtained after iteration, so that the original matrix sample set is mapped to the low-rank subspace, achieving the purpose of data dimensionality reduction.
  2. The method according to claim 1, wherein the matrix sample set is input to the matrix discriminant analysis model entirely in matrix form, so that the row-column relational structure information of the original data samples is retained.
  3. The method according to claim 2, wherein the matrix-based least squares method is employed in constructing the loss function of the matrix discriminant analysis model, avoiding the high computational cost caused by eigenvalue decomposition in conventional linear discriminant analysis, and also avoiding the undersampling problem of conventional linear discriminant analysis methods.
  4. The method according to claim 3, wherein a nuclear norm low-rank constraint is introduced on each W_j in the loss function L(W_(1,2,...,k)), so that the model can capture the row-column correlation information of the original matrix data samples.
CN201710793137.0A 2017-09-06 2017-09-06 Linear discriminant analysis method with bilinear low-rank subspace Active CN107491419B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710793137.0A CN107491419B (en) 2017-09-06 2017-09-06 Linear discriminant analysis method with bilinear low-rank subspace

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710793137.0A CN107491419B (en) 2017-09-06 2017-09-06 Linear discriminant analysis method with bilinear low-rank subspace

Publications (2)

Publication Number Publication Date
CN107491419A (en) 2017-12-19
CN107491419B CN107491419B (en) 2020-06-23

Family

ID=60652213

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710793137.0A Active CN107491419B (en) 2017-09-06 2017-09-06 Linear discriminant analysis method with bilinear low-rank subspace

Country Status (1)

Country Link
CN (1) CN107491419B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120254077A1 (en) * 2011-03-31 2012-10-04 Fatih Porikli Data Driven Frequency Mapping for Kernels Used in Support Vector Machines
CN104463246A (en) * 2014-12-08 2015-03-25 天津大学 Manifold-based linear regression learning method
CN105389343A (en) * 2015-10-23 2016-03-09 北京工业大学 Vectorized dimension reduction method


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨晓敏 (Yang Xiaomin) et al.: "Support vector pre-selection based on projection center distance", Journal of Sichuan University (Natural Science Edition) (《四川大学学报(自然科学版)》) *

Also Published As

Publication number Publication date
CN107491419B (en) 2020-06-23

Similar Documents

Publication Publication Date Title
Wang et al. Deepvid: Deep visual interpretation and diagnosis for image classifiers via knowledge distillation
WO2022041678A1 (en) Remote sensing image feature extraction method employing tensor collaborative graph-based discriminant analysis
CN104268593B (en) The face identification method of many rarefaction representations under a kind of Small Sample Size
Boughida et al. A novel approach for facial expression recognition based on Gabor filters and genetic algorithm
CN105631479B (en) Depth convolutional network image labeling method and device based on non-equilibrium study
CN108304357B (en) Chinese character library automatic generation method based on font manifold
CN107203787B (en) Unsupervised regularization matrix decomposition feature selection method
CN107122809A (en) Neural network characteristics learning method based on image own coding
CN106203356B (en) A kind of face identification method based on convolutional network feature extraction
CN103400143A (en) Data subspace clustering method based on multiple view angles
CN114398961A (en) Visual question-answering method based on multi-mode depth feature fusion and model thereof
CN106295694A (en) Face recognition method for iterative re-constrained group sparse representation classification
CN109376787B (en) Manifold learning network and computer vision image set classification method based on manifold learning network
CN106682606A Face recognition method and security verification apparatus
CN105868796A Design method of linear discriminant sparse representation classifier based on kernel space
CN108491863A (en) Color image processing method based on Non-negative Matrix Factorization and convolutional neural networks
CN115050064A (en) Face living body detection method, device, equipment and medium
US20230101539A1 (en) Physiological electric signal classification processing method and apparatus, computer device and storage medium
CN117690178B (en) Face image recognition method and system based on computer vision
CN115658886A (en) Intelligent liver cancer staging method, system and medium based on semantic text
Wang et al. CWC-transformer: a visual transformer approach for compressed whole slide image classification
CN106803105A (en) A kind of image classification method based on rarefaction representation dictionary learning
CN116758379B (en) Image processing method, device, equipment and storage medium
CN109447147A (en) The image clustering method decomposed based on the sparse matrix of depths of digraph
Li et al. Locally-enriched cross-reconstruction for few-shot fine-grained image classification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant