CN108805155A

CN108805155A - Learn the semisupervised classification method of incidence matrix and Laplace regularization least square simultaneously

Info

Publication number: CN108805155A
Application number: CN201810233453.7A
Authority: CN
Inventors: 王迪; 张磊; 张笑钦; 古楠楠; 叶修梓
Original assignee: Cangnan Institute Of Cangnan
Current assignee: Cangnan Institute Of Cangnan
Priority date: 2018-03-21
Filing date: 2018-03-21
Publication date: 2018-11-13

Abstract

The invention discloses a kind of while learning the semisupervised classification method of incidence matrix and Laplace regularization least square, mainly include the following steps that：Conjunctive model that is a kind of while learning incidence matrix and Laplace regularization least square is established according to training sample first；Secondly optimization is iterated to each class variable in model using block coordinate descent；It finally uses Laplace regularization least square grader to obtain the soft label of sample, and chooses classification of that the maximum dimension of element in label vector as sample.The sparse of sample is effectively merged by the invention from problem of representation and Laplace regularization least square grader, and is realized optimization while sample incidence matrix and Laplace regularization least square grader in learning process and mutually improved.The present invention has explicit classifier functions, the problem of so as to effectively handle external sample.Relative to other semisupervised classification methods, this method has more accurate classification accuracy rate, there is good application prospect.

Description

Learn the semisupervised classification of incidence matrix and Laplace regularization least square simultaneously Method

Technical field

The present invention relates to CRT technology fields, are specifically related to a kind of while learning incidence matrix (Affinity ) and Laplace regularization least square (Laplacian Regularized Least Square, Lap-RLS) Matrix Semisupervised classification method.

Background technology

In practical application, the performance of sorting technique depends in training sample the number for having exemplar, however It is extremely difficult, costly and time-consuming that being obtained in actual life, which has exemplar, and needs a large amount of effort of domain experts. On the other hand, the fast development for having benefited from data sampling techniques and computer hardware technology has obtained a large amount of unlabeled exemplars It is very easy to, so that utilize a small amount of semi-supervised learning for thering are exemplar and a large amount of unlabeled exemplars to be trained (Semi-Supervised Learning, SSL) becomes the research hotspot in CRT technology and machine learning field, existing It has been widely used in the fields such as image classification, recognition of face.

The neighbouring sample in mutually similar cluster or data manifold have larger possibility possess identical label this Under assuming that, the semi-supervised learning (Graph based Semi-Supervised Learning, G-SSL) based on figure causes extensively The close attention of big researcher.Its core concept is to utilize given part labels and the consistency being associated in pairs between sample It predicts the label of unlabeled exemplars, that is, passes through the process that pairs of associated data propagates label.Generally speaking, half prison based on figure It superintends and directs learning algorithm and mainly solves two critical problems：(1) incidence matrix between sample is constructed；(2) unlabeled exemplars are predicted Label.

Being associated with this concept is suggested in the weight matrix in defining datagram.Weight matrix is used for indicating The matrix of similarity between sample.Zhou et al. constructs adjacent map based on Euclidean distance to choose k Neighbor Points, then passes through Thermonuclear (heat kernel) calculates edge weights matrix.The linear neighborhood patches of overlappings of Wang et al. a series of approaches Entire figure, the edge weights of each patch are calculated using neighborhood linear projection.Liu et al. people proposes effective anchor figure, pass through by Each sample is expressed as it adjacent to the linear combination of anchor point to construct association.In order to avoid neighborhood graph parameter select permeability and obtain Derived from figure is adapted to, scholars propose rarefaction representation (Sparse Represention, SR) and low-rank representation (Low-Rank Representation, LRR) etc. the overall situation from representation method, main thought be calculate each sample under other samples Sparse or low-rank representation coefficient, then constructs incidence matrix with coefficient is indicated.The overall situation can be obtained preferably from representation method The global structure of data is an effective tool of semi-supervised learning.

After obtaining incidence matrix, then unlabeled exemplars can be estimated by different forecasting mechanisms based on incidence matrix Label, such as Gaussian field and harmonic function (Gaussian Fields and Harmonic Functions, GFHF), part With globally consistent inquiry learning (Learning with Local and Global Consistency, LLGC), manifold regularization (Manifold Regularization, MR), Markov random walk (Markov Random Walks, MRW), special mark Label propagate (Special Label Propagation, SLP), the straight thruster of spectrogram (Spectral Graph Transducer, SGT) etc..

The semi-supervised learning method based on figure is in classification task although show excellent performance above, they It is usually first to construct incidence matrix in training process, is then based on incidence matrix by certain forecasting mechanism come sample estimates mark Label, i.e., the construction of incidence matrix and Tag Estimation mechanism individually carry out in two steps, cannot make full use of incidence matrix in this way It is potentially contacted between sample label.

In order to overcome this disadvantage, Li et al. people that the two independent processes are merged into a combined optimization model and are asked Solution referred to as learns by oneself semi-supervised learning (Self-Taught Semi-Supervised Learning, STSSL).It can be simultaneously The label for learning incidence matrix and unknown sample the advantage is that the label that can make full use of given label and prediction comes Constantly improve incidence matrix, to further increase the precision that label propagates (Label Propagation, LP).However this Method belongs to transductive learning method, i.e. not explicit decision function predicts the label of unlabeled exemplars, therefore cannot be effective Ground handles the problem of external sample.In addition to this, label transmission method used in optimization process is also not suitable for having multiple The data set of miscellaneous structure.

Invention content

The purpose of the invention is to overcome shortcoming and defect of the existing technology, and provides a kind of while learning to be associated with The semisupervised classification method of matrix and Laplace regularization least square.This method is by the sparse from problem of representation and drawing of sample This regularization least square grader of pula is effectively merged, and establishes the Laplace regularization least square of self-study (Self-taught Laplacian Regularized Least Square, ST-LapRLS) model, and in learning process Optimization and mutually improvement while realizing sample incidence matrix and Laplace regularization least square grader.It is prior It is that the invention has explicit classifier functions, so as to effectively handle external sample problem.

5. to achieve the above object, the technical scheme is that this method includes：

S1:The joint mould of incidence matrix and Laplace regularization least square is built while learnt according to training sample Type；

S2:Optimization, Zhi Daoshou are iterated to each class variable in the conjunctive model using block gradient descent algorithm It holds back；

S3:With the soft label of Laplace regularization least square classifier calculated sample to be sorted, and choose soft label That maximum dimension of element is as the classification belonging to sample to be sorted in vector.

Further setting is that the step S1 includes the following steps：

S11：By sparse from indicating to build the incidence matrix between training sample, even some sample can be with other several If a sample rarefaction representation, then it represents that there is larger relevances between this sample and these samples；

S12:Laplce's figure relationship between sample label is built, i.e. the prodigious sample of relevance should have similar Label；

S13：Sparse by sample is effectively embedded into from problem of representation in Laplace regularization least square grader, is built Conjunctive model that is vertical while learning incidence matrix and Laplace regularization least square.

It includes following sub-step that further setting, which is the step S2,：

S21:By the conjunctive model of foundation be decomposed into Laplace regularization least square grader subproblem and It is sparse to indicate subproblem certainly, and Optimization Solution in turn is carried out to two sub-problems by way of iteration；

S22:For Laplace regularization least square grader subproblem, with the method that gradient is zero to grader Coefficient matrix variable carry out Analytical Solution；

S23:For sparse from relevant subproblem is indicated, in order to eliminate the sparse coefficient variation of expression certainly in optimization process Correlation, introduce auxiliary variable, and with alternating direction Multiplier Algorithm solve.

It includes following sub-step that further setting, which is the step S3,：

S31：With the soft label of Laplace regularization least square classifier calculated sample to be sorted；

S32：That maximum dimension of element in soft label vector is found out, and using the dimension as belonging to sample to be sorted Classification.

The beneficial effects of the invention are as follows：

1, the present invention proposes a kind of completely new and general semisupervised classification method, to arbitrary classification data type (such as recognition of face, object classification) is all suitable for.Specifically, estimated label is made full use of to go to generate good data pass Join matrix, good incidence matrix can further improve the performance of grader again, i.e. grader and incidence matrix is in learning process In can achieve the effect that mutually promote.

2, grader proposed by the present invention is explicit, it can easily handle external sample problem.

3, the soft label that the present invention is obtained by Laplace regularization least square grader is compared to label propagation side The hard label that method obtains is more suitable for the manifold in semi-supervised learning it is assumed that therefore the present invention can handle the classification times of complex data Business.

4, the present invention quickly and effectively solves built semisupervised classification model by block gradient descent algorithm.Especially Ground uses gradient to be solved for zero method the coefficient matrix variable in Laplace regularization least square grader Analysis solution；For the sparse correlated variables from problem of representation, the present invention proposes to be iterated solution with alternating direction Multiplier Algorithm.

Description of the drawings

In order to more clearly explain the embodiment of the invention or the technical proposal in the existing technology, to embodiment or will show below There is attached drawing needed in technology description to be briefly described, it should be apparent that, the accompanying drawings in the following description is only this Some embodiments of invention, for those of ordinary skill in the art, without having to pay creative labor, according to These attached drawings obtain other attached drawings and still fall within scope of the invention.

Fig. 1 flow charts of the method for the present invention；

Fig. 2 is the overall flow service chart of the present invention

Specific implementation mode

To make the object, technical solutions and advantages of the present invention clearer, the present invention is made into one below in conjunction with attached drawing Step ground detailed description.

As shown in Figure 1 to Figure 2, it is in the embodiment of the present invention, the present invention is a kind of while learning incidence matrix and La Pula The semisupervised classification method of this regularization least square, the hardware and programming language of method carrying out practically of the invention are simultaneously unlimited System, being write with any language can complete, and other operating modes repeat no more thus.

The embodiment of the present invention is using the calculating with Intel Xeon-E5 central processing units and 16G byte of memory Machine is used in combination Matlab language to work out while learning the semisupervised classification of incidence matrix and Laplace regularization least square Working procedure, the method for realizing the present invention.

Learn incidence matrix while of the invention and the semisupervised classification method of Laplace regularization least square is main Foundation, block coordinate including conjunctive model decline optimization algorithm and three steps of classification policy of sample.

Before introducing specific steps, the meaning of the symbol to be used below introduction of the present invention.

Give a data set for including C class samplesWithout loss of generality, false If preceding l element is to have exemplar in data set S, remaining u element is unlabeled exemplars.There are exemplar and no label Sample is formed by matrix and is denoted as X respectively_lAnd X_u, i.e. X_l=[x₁,…,x_l], X_u=[x_l+1,…,x_l+u].Correspondingly, Y_l= [y₁,…,y_l] and Y_u=[y_l+1,…,y_l+u] it is known label matrix and Unknown Label matrix, wherein label y respectively_iIt is a C 0-1 vectors are tieed up, i.e., if x_iBelong to jth class sample, then y_iJ-th of position on element value be 1, the element in other positions Value is 0.Whole samples are formed by matrix and are denoted as X, i.e. X=[X_l,X_u], label matrix Y, i.e. Y=[Y_l,Y_u]。||A||₁ Representing matrix A'sNorm, i.e., | | A | |₁=∑ ∑ | α_ij|。

For the classification problem for including C class samples, it is as follows：

S1 builds while learning the conjunctive model of incidence matrix and Laplace regularization least square.It includes mainly：

A) incidence matrix between training sample is built.Classical is sparse as follows from problem of representation：

Wherein E is error matrix, and A is sparse from representing matrix, and the diagonal entry of diag (A)=0 representing matrixes A is all Zero, this be in order to avoid there is trivial solution, i.e., sample with its own come linear expression.

The optimal solution of note problem 1. isWhereinIt is sample x_iIt is sparse under other samples

Indicate coefficient.If x_iThe other sample rarefaction representations of energy, i.e.,Then illustrate sample x_iAnd sample ThisBetween there is larger relevances, therefore the incidence matrix between sample can be by rarefaction representation coefficient come structure It makes, i.e.,

W=(| A |+| A^T|)/2 ②

B) Laplce's figure relationship between sample label is built.The big sample of relevance should have similar label, Therefore Laplce's figure relationship of construction sample label is

Wherein L is Laplacian Matrix, and L=D-W, D are diagonal matrix, and diagonal entry is

C) the sparse of sample is combined from problem of representation with Laplace regularization least square grader, establishes and learns by oneself Laplace regularization least square model

S.t.X=XA+E, diag (A)=0

Herein, β is a positive parameter,It is nuclear matrix, B is Laplce's canonical Change least squared classified device coefficient matrix,Be a preceding l element it is the diagonal matrix that 1 remaining element is 0,WhereinIt is a null vector.

S2 is iterated optimization, Zhi Daoshou using block gradient descent algorithm to each variable in the conjunctive model that is proposed It holds back.It includes mainly：

A) optimization problem is decomposed into Laplace regularization least square grader subproblem and sparse expression certainly Problem, and Optimization Solution in turn is carried out to following two subproblems by way of iteration

In 5., L^(t)It is by A^(t)The Laplacian Matrix induced；In 6., Y^(t+1)=B^(t+1)K, i.e. grader institute The soft label matrix of the sample of prediction.

B) fixation is sparse from representing matrix A^(t)With error matrix E^(t), solve Laplace regularization least square grader Coefficient matrix B^(t+1).Since 5. sub- optimization problem is convex function about variable B and can continuously lead, therefore gradient is asked to variable B and is enabled It is zero, and available analytic solutions are：

Wherein I is unit matrix.

C) Laplace regularization least square grader coefficient matrix B is given^(t+1), solve sparse from representing matrix A^(t ⁺¹⁾With error matrix E^(t+1).Due to

Define matrixThen 6. sub- optimization problem can be re-written as

Wherein⊙ is that the Hadmard of matrix is accumulated.Optimizing to eliminate the sparse coefficient variation of expression certainly Correlation in journey introduces auxiliary variable Z, while for the sake of symbol simplicity, the subscript in variable being removed, then optimization problem 8. being equivalent to：

Its Augmented Lagrangian Functions is

Wherein Λ⁽¹⁾, Λ⁽²⁾Indicate Lagrangian Matrix Multiplier,<,>The inner product of representing matrix.With alternating direction Multiplier Algorithm to optimization problem 9. in each variable be iterated solution, be as follows：

Sub- optimization problem about A is solved to

WhereinS_η() is that threshold operator meets S_η(x)=sgn (x) max (| x |-η, 0).

Sub- optimization problem about Z is solved to

It is zero to seek gradient about Z to above formula and enable it, and the analytic solutions that variable Z can be obtained are

Sub- optimization problem about E is solved to

Update Lagrange multiplier matrix

Λ_κ+1 ⁽¹⁾=Λ_κ ⁽¹⁾+μ(X-XZ_κ+1-E_κ+1),

Λ_κ+1 ⁽²⁾=Λ_κ ⁽²⁾+μ(Z_κ+1-A_κ+1+diag(A_κ+1))。

S3 proposes a kind of method differentiating sample class label, includes mainly specifically：

A) for need differentiate classification sample z, sample z is gone out by Laplace regularization least square classifier calculated Soft label, i.e.,

y_z=F (z)=Bk_z,

Wherein k_z=[k (x₁,z),k(x₂,z),…,k(x_l+u,z)]^T。

B) that maximum dimension of element in soft label vector is found out, and using the dimension as the classification belonging to sample, i.e.,

One of ordinary skill in the art will appreciate that implement the method for the above embodiments be can be with Relevant hardware is instructed to complete by program, the program can be stored in a computer read/write memory medium, The storage medium, such as ROM/RAM, disk, CD.

The above disclosure is only the preferred embodiments of the present invention, cannot limit the right model of the present invention with this certainly It encloses, therefore equivalent changes made in accordance with the claims of the present invention, is still within the scope of the present invention.

Claims

1. semisupervised classification method that is a kind of while learning incidence matrix and Laplace regularization least square, feature exist In this method includes：

S1:The conjunctive model of incidence matrix and Laplace regularization least square is built while learnt according to training sample；

S2:Optimization is iterated to each class variable in the conjunctive model using block gradient descent algorithm, until convergence；

S3:With the soft label of Laplace regularization least square classifier calculated sample to be sorted, and choose soft label vector That maximum dimension of middle element is as the classification belonging to sample to be sorted.

2. semisupervised classification method according to claim 1, it is characterised in that：The step S1 includes the following steps：

S11：The incidence matrix between training sample is built by sparse expression certainly, even some sample can use other several samples If this rarefaction representation, then it represents that there is larger relevances between this sample and these samples；

S12:Laplce's figure relationship between sample label is built, i.e. the prodigious sample of relevance there should be similar mark Label；

S13：Sparse by sample is effectively embedded into from problem of representation in Laplace regularization least square grader, is established same When learn incidence matrix and Laplace regularization least square conjunctive model.

3. semisupervised classification method according to claim 1, it is characterised in that：The step S2 includes following sub-step Suddenly：

S21:The conjunctive model of foundation is decomposed into Laplace regularization least square grader subproblem and sparse From expression subproblem, and Optimization Solution in turn is carried out to two sub-problems by way of iteration；

S22:For Laplace regularization least square grader subproblem, with the method that gradient is zero to grader system Matrix number variable carries out Analytical Solution；

S23:For sparse from relevant subproblem is indicated, in order to eliminate the sparse phase from expression coefficient variation in optimization process Guan Xing introduces auxiliary variable, and is solved with alternating direction Multiplier Algorithm.

4. semisupervised classification method according to claim 1, it is characterised in that：The step S3 includes following sub-step Suddenly：

S32：That maximum dimension of element in soft label vector is found out, and using the dimension as the class belonging to sample to be sorted Not.