CN108805155A - Semi-supervised classification method that simultaneously learns the affinity matrix and Laplacian regularized least squares - Google Patents
- Publication number
- CN108805155A CN201810233453.7A
- Authority
- CN
- China
- Prior art keywords
- sample
- square
- laplacian regularization
- classifier
- affinity matrix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2155—Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/2431—Multiple classes
Abstract
The invention discloses a semi-supervised classification method that simultaneously learns the affinity matrix and a Laplacian regularized least squares classifier. The method mainly comprises the following steps: first, a joint model that simultaneously learns the affinity matrix and Laplacian regularized least squares is established from the training samples; second, each block of variables in the model is iteratively optimized with the block coordinate descent algorithm; finally, the Laplacian regularized least squares classifier produces soft labels for the samples, and the dimension with the largest element in each soft label vector is taken as the class of the sample. The invention effectively couples the sparse self-representation problem of the samples with the Laplacian regularized least squares classifier, so that the sample affinity matrix and the classifier are optimized simultaneously and improve each other during learning. The invention has an explicit classifier function and can therefore effectively handle the out-of-sample problem. Compared with other semi-supervised classification methods, this method achieves higher classification accuracy and has good application prospects.
Description
Technical field
The present invention relates to the field of pattern recognition, and specifically to a semi-supervised classification method that simultaneously learns the affinity matrix (Affinity Matrix) and Laplacian regularized least squares (Laplacian Regularized Least Square, LapRLS).
Background art
In practical applications, the performance of a classification method depends on the number of labeled samples in the training set. However, obtaining labeled samples in real life is difficult, expensive, and time-consuming, and requires considerable effort from domain experts. On the other hand, thanks to the rapid development of data acquisition techniques and computer hardware, large quantities of unlabeled samples have become very easy to obtain. Consequently, semi-supervised learning (SSL), which trains with a small number of labeled samples and a large number of unlabeled samples, has become a research hotspot in pattern recognition and machine learning, and has been widely applied in fields such as image classification and face recognition.
Under the assumption that neighboring samples in the same cluster or on the same data manifold are very likely to share the same label, graph-based semi-supervised learning (Graph-based Semi-Supervised Learning, G-SSL) has attracted wide attention from researchers. Its core idea is to predict the labels of the unlabeled samples from the given partial labels and the pairwise affinity consistency between samples, i.e., to propagate labels through pairwise associated data. In general, graph-based semi-supervised learning algorithms must solve two key problems: (1) constructing the affinity matrix between samples; (2) predicting the labels of the unlabeled samples.
The notion of affinity was first introduced as the weight matrix that defines a data graph; the weight matrix encodes the similarity between samples. Zhou et al. construct a neighborhood graph based on Euclidean distance by choosing k nearest neighbors, and then compute the edge weight matrix with a heat kernel. Wang et al. approximate the whole graph with a series of overlapping linear neighborhood patches, the edge weights of each patch being computed by neighborhood linear projection. Liu et al. propose the efficient anchor graph, which constructs affinities by expressing each sample as a linear combination of its neighboring anchor points. To avoid the parameter-selection problem of neighborhood graphs and to obtain adaptive graphs, researchers have proposed global self-representation methods such as sparse representation (Sparse Representation, SR) and low-rank representation (Low-Rank Representation, LRR); their main idea is to compute the sparse or low-rank representation coefficients of each sample with respect to the other samples, and then construct the affinity matrix from the representation coefficients. Global self-representation methods capture the global structure of the data well and are an effective tool for semi-supervised learning.
Once the affinity matrix is obtained, the labels of the unlabeled samples can be estimated from it by different prediction mechanisms, such as Gaussian fields and harmonic functions (Gaussian Fields and Harmonic Functions, GFHF), learning with local and global consistency (Learning with Local and Global Consistency, LLGC), manifold regularization (Manifold Regularization, MR), Markov random walks (Markov Random Walks, MRW), special label propagation (Special Label Propagation, SLP), and the spectral graph transducer (Spectral Graph Transducer, SGT).
Although the graph-based semi-supervised learning methods above show excellent performance in classification tasks, their training usually first constructs the affinity matrix and then estimates the sample labels from it by some prediction mechanism; that is, affinity-matrix construction and label prediction are carried out separately in two steps, which cannot fully exploit the latent connection between the affinity matrix and the sample labels. To overcome this drawback, Li et al. merged the two independent processes into one joint optimization model, called self-taught semi-supervised learning (Self-Taught Semi-Supervised Learning, STSSL). It learns the affinity matrix and the labels of the unknown samples simultaneously; its advantage is that both the given labels and the predicted labels can be fully used to continually improve the affinity matrix, thereby further increasing the accuracy of label propagation (Label Propagation, LP). However, this method is transductive, i.e., it has no explicit decision function for predicting the labels of unlabeled samples, and therefore cannot effectively handle the out-of-sample problem. Moreover, the label propagation used in its optimization is not well suited to data sets with complex structure.
Summary of the invention
The purpose of the invention is to overcome the above shortcomings and defects of the prior art by providing a semi-supervised classification method that simultaneously learns the affinity matrix and Laplacian regularized least squares. The method effectively couples the sparse self-representation problem of the samples with the Laplacian regularized least squares classifier, establishes a self-taught Laplacian regularized least squares (Self-taught Laplacian Regularized Least Square, ST-LapRLS) model, and, during learning, achieves simultaneous optimization and mutual improvement of the sample affinity matrix and the Laplacian regularized least squares classifier. More importantly, the invention has an explicit classifier function and can therefore effectively handle the out-of-sample problem.
To achieve the above object, the technical solution of the present invention is that the method comprises:
S1: building, from the training samples, the joint model that simultaneously learns the affinity matrix and Laplacian regularized least squares;
S2: iteratively optimizing each block of variables in the joint model with the block coordinate descent algorithm until convergence;
S3: computing the soft label of the sample to be classified with the Laplacian regularized least squares classifier, and taking the dimension with the largest element in the soft label vector as the class of the sample to be classified.
In a further arrangement, step S1 comprises the following steps:
S11: constructing the affinity matrix between training samples by sparse self-representation, i.e., if a sample can be sparsely represented by several other samples, the sample has strong affinity with those samples;
S12: constructing the graph-Laplacian relation between sample labels, i.e., samples with strong affinity should have similar labels;
S13: embedding the sparse self-representation problem of the samples into the Laplacian regularized least squares classifier, thereby establishing the joint model that simultaneously learns the affinity matrix and Laplacian regularized least squares.
In a further arrangement, step S2 comprises the following sub-steps:
S21: decomposing the established joint model into a Laplacian regularized least squares classifier subproblem and a sparse self-representation subproblem, and solving the two subproblems alternately by iteration;
S22: for the Laplacian regularized least squares classifier subproblem, solving for the classifier coefficient matrix analytically by setting the gradient to zero;
S23: for the sparse self-representation subproblem, introducing an auxiliary variable to remove the coupling of the sparse self-representation coefficients during optimization, and solving with the alternating direction method of multipliers.
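The alternation described in S21-S23 can be sketched as follows. The patent's embodiment is written in Matlab; this illustrative sketch uses Python/NumPy, the function and parameter names are ours, and the affinity update in step (ii) is a simple soft-label-agreement stand-in for the ADMM-based sparse self-representation solver, not that solver itself. The classifier step uses a standard LapRLS-type closed form.

```python
import numpy as np

def gaussian_kernel(X):
    """K_ij = exp(-||x_i - x_j||^2) for the columns x_i of X."""
    sq = np.sum(X ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X.T @ X
    return np.exp(-np.maximum(d2, 0.0))

def st_laprls_sketch(X, Y, labeled, n_iter=5, beta=0.1, gamma=0.01):
    """Skeleton of S21-S23: alternate (i) the closed-form classifier
    update (S22) and (ii) an affinity update. The affinity here is
    soft-label agreement -- a runnable placeholder for the patent's
    sparse self-representation step (S23)."""
    n = X.shape[1]
    K = gaussian_kernel(X)
    U = np.diag(labeled.astype(float))       # selects labeled samples
    A = np.zeros((n, n))
    for _ in range(n_iter):
        W = (np.abs(A) + np.abs(A.T)) / 2.0  # affinity (eq. 2)
        L = np.diag(W.sum(axis=1)) - W       # graph Laplacian (eq. 3)
        # (i) classifier coefficients, gradient set to zero
        B = Y @ U @ np.linalg.inv(K @ U + gamma * np.eye(n) + beta * K @ L)
        soft = B @ K                         # soft labels, C x n
        # (ii) placeholder affinity from soft-label agreement
        A = np.abs(soft.T @ soft)
        np.fill_diagonal(A, 0.0)
        A /= A.max() + 1e-12
    return B, K

# Two 1-D clusters; samples 0 and 2 are labeled (classes 0 and 1)
X = np.array([[0.0, 0.1, 1.0, 1.1]])
Y = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
B, K = st_laprls_sketch(X, Y, labeled=np.array([1, 0, 1, 0]))
pred = np.argmax(B @ K, axis=0)              # classes for all four samples
```

On this toy data the two unlabeled samples inherit the class of their near neighbor, which is exactly the mutual-improvement behavior the alternation is designed to produce.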
In a further arrangement, step S3 comprises the following sub-steps:
S31: computing the soft label of the sample to be classified with the Laplacian regularized least squares classifier;
S32: finding the dimension with the largest element in the soft label vector, and taking that dimension as the class of the sample to be classified.
The beneficial effects of the invention are:
1. The present invention proposes a new and general semi-supervised classification method that is applicable to arbitrary classification data (e.g., face recognition, object classification). Specifically, the estimated labels are fully exploited to generate a good data affinity matrix, and a good affinity matrix in turn further improves the performance of the classifier, i.e., the classifier and the affinity matrix promote each other during learning.
2. The classifier proposed by the present invention is explicit, so it can easily handle the out-of-sample problem.
3. The soft labels obtained by the Laplacian regularized least squares classifier fit the manifold assumption of semi-supervised learning better than the hard labels obtained by label propagation, so the present invention can handle classification tasks on complex data.
4. The present invention solves the constructed semi-supervised classification model quickly and effectively with the block coordinate descent algorithm. In particular, the coefficient matrix of the Laplacian regularized least squares classifier is solved analytically by setting the gradient to zero; for the variables of the sparse self-representation problem, the invention iterates with the alternating direction method of multipliers.
Description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings obtained from these drawings without creative labor still fall within the scope of the present invention.
Fig. 1 is a flow chart of the method of the present invention;
Fig. 2 is the overall operating flow chart of the present invention.
Detailed description
To make the object, technical solution, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings.
As shown in Fig. 1 and Fig. 2, in the embodiment of the present invention, the invention is a semi-supervised classification method that simultaneously learns the affinity matrix and Laplacian regularized least squares. The hardware and programming language with which the method is run are not restricted; it can be implemented in any language, so other operating modes are not described again here.
The embodiment of the present invention uses a computer with an Intel Xeon E5 central processing unit and 16 GB of memory, on which a working program that simultaneously learns the affinity matrix and Laplacian regularized least squares is written in the Matlab language, to realize the method of the present invention.
The semi-supervised classification method of the invention that simultaneously learns the affinity matrix and Laplacian regularized least squares mainly comprises three steps: establishing the joint model, the block coordinate descent optimization algorithm, and the classification strategy for samples.
Before the specific steps are introduced, the notation used below is defined.
Given a data set S = {x_1, x_2, ..., x_{l+u}} containing C classes of samples. Without loss of generality, assume that the first l elements of S are labeled samples and the remaining u elements are unlabeled samples. The matrices formed by the labeled and the unlabeled samples are denoted X_l and X_u respectively, i.e., X_l = [x_1, ..., x_l] and X_u = [x_{l+1}, ..., x_{l+u}]. Correspondingly, Y_l = [y_1, ..., y_l] and Y_u = [y_{l+1}, ..., y_{l+u}] are the known and the unknown label matrix respectively, where each label y_i is a C-dimensional 0-1 vector: if x_i belongs to the j-th class, the element in the j-th position of y_i is 1 and the elements in all other positions are 0. The matrix formed by all samples is denoted X, i.e., X = [X_l, X_u], and the label matrix is Y = [Y_l, Y_u]. ||A||_1 denotes the ℓ1-norm of a matrix A, i.e., ||A||_1 = Σ_i Σ_j |a_ij|.
For a classification problem containing C classes of samples, the specific steps are as follows:
S1: build the joint model that simultaneously learns the affinity matrix and Laplacian regularized least squares. This mainly includes:
a) Constructing the affinity matrix between training samples. The classical sparse self-representation problem is

min_{A,E} ||A||_1 + λ2||E||_1  s.t.  X = XA + E, diag(A) = 0  ①

where λ2 > 0 is a trade-off parameter, E is the error matrix, A is the sparse self-representation matrix, and diag(A) = 0 forces all diagonal elements of A to be zero, which avoids the trivial solution in which each sample is linearly represented by itself.
Denote the optimal solution of problem ① by A* = [a*_1, ..., a*_{l+u}], where a*_i is the sparse representation coefficient vector of sample x_i over the other samples. If x_i can be sparsely represented by the other samples, i.e., x_i ≈ X a*_i, then sample x_i has strong affinity with the samples corresponding to the nonzero coefficients. The affinity matrix between samples can therefore be constructed from the sparse representation coefficients, i.e.,

W = (|A| + |A^T|)/2  ②
b) Constructing the graph-Laplacian relation between sample labels. Samples with strong affinity should have similar labels, so the graph-Laplacian relation of the sample labels is constructed as

tr(Y L Y^T) = (1/2) Σ_{i,j} W_ij ||y_i − y_j||²  ③

where L is the graph Laplacian matrix, L = D − W, and D is the diagonal degree matrix with diagonal elements d_ii = Σ_j W_ij.
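In NumPy, the affinity and Laplacian construction of ② and ③ reads as follows (an illustration, not code from the patent; the function name is ours):

```python
import numpy as np

def affinity_and_laplacian(A):
    """Equations 2 and 3: symmetric affinity W = (|A| + |A^T|)/2 and
    unnormalized graph Laplacian L = D - W, where D is the diagonal
    degree matrix with d_ii = sum_j W_ij."""
    W = (np.abs(A) + np.abs(A.T)) / 2.0
    L = np.diag(W.sum(axis=1)) - W
    return W, L

# Arbitrary self-representation coefficients for three samples
A = np.array([[0.0, 0.5, 0.0],
              [0.2, 0.0, 0.1],
              [0.0, 0.4, 0.0]])
W, L = affinity_and_laplacian(A)
```

W is symmetric by construction and every row of L sums to zero, which is what makes tr(Y L Y^T) a weighted sum of squared label differences.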
c) Combining the sparse self-representation problem of the samples with the Laplacian regularized least squares classifier establishes the self-taught Laplacian regularized least squares model

min_{B,A,E} ||(Y − BK)U||_F² + γ tr(BKB^T) + β tr(BK L_A (BK)^T) + λ1||A||_1 + λ2||E||_1  ④
s.t.  X = XA + E, diag(A) = 0

Here β, γ, λ1, and λ2 are positive trade-off parameters, K is the kernel matrix with entries K_ij = k(x_i, x_j), B is the coefficient matrix of the Laplacian regularized least squares classifier, L_A is the graph Laplacian induced by A through ② and ③, and U is the diagonal matrix whose first l diagonal elements are 1 and whose remaining elements are 0, i.e., U = diag([1_l; 0_u]), where 0_u is a zero vector.
S2: iteratively optimize each variable in the proposed joint model with the block coordinate descent algorithm until convergence. This mainly includes:
a) Decomposing the optimization problem into the Laplacian regularized least squares classifier subproblem and the sparse self-representation subproblem, and solving the two subproblems alternately by iteration:

B^(t+1) = argmin_B ||(Y − BK)U||_F² + γ tr(BKB^T) + β tr(BK L^(t) (BK)^T)  ⑤

(A^(t+1), E^(t+1)) = argmin_{A,E} β tr(Y^(t+1) L_A (Y^(t+1))^T) + λ1||A||_1 + λ2||E||_1  s.t.  X = XA + E, diag(A) = 0  ⑥

In ⑤, L^(t) is the Laplacian matrix induced by A^(t); in ⑥, Y^(t+1) = B^(t+1)K, i.e., the soft label matrix of the samples predicted by the classifier.
b) Fixing the sparse self-representation matrix A^(t) and the error matrix E^(t), solve for the coefficient matrix B^(t+1) of the Laplacian regularized least squares classifier. Since subproblem ⑤ is convex and continuously differentiable in the variable B, taking the gradient with respect to B and setting it to zero gives the analytic solution

B^(t+1) = Y U (K U + γ I + β K L^(t))^{-1}  ⑦

where I is the identity matrix.
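A NumPy sketch of this closed-form update follows; the precise form of ⑦ and the regularization-weight names are our reconstruction from the surrounding text, not a verbatim copy of the patent:

```python
import numpy as np

def classifier_update(K, Y, U, L, beta=0.1, gamma=0.01):
    """Step S22: setting the gradient of
       ||(Y - B K) U||_F^2 + gamma*tr(B K B^T) + beta*tr(B K L K B^T)
    to zero in B gives B (K U K + gamma*K + beta*K L K) = Y U K,
    i.e. B = Y U (K U + gamma*I + beta*K L)^{-1}."""
    n = K.shape[0]
    return Y @ U @ np.linalg.inv(K @ U + gamma * np.eye(n) + beta * K @ L)

# Tiny worked example: 3 samples (first 2 labeled), 2 classes
X = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])
K = X.T @ X + np.eye(3)            # a positive-definite linear kernel
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.5],
              [0.0, 0.5, 0.0]])
L = np.diag(W.sum(axis=1)) - W
U = np.diag([1.0, 1.0, 0.0])
Y = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
B = classifier_update(K, Y, U, L)
```

The returned B satisfies the stationarity condition B (KUK + γK + βKLK) = YUK, which is the defining property of the gradient-zero step.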
c) Given the classifier coefficient matrix B^(t+1), solve for the sparse self-representation matrix A^(t+1) and the error matrix E^(t+1). Since

tr(Y L_A Y^T) = (1/2) Σ_{i,j} W_ij ||y_i − y_j||²  with  W = (|A| + |A^T|)/2,

define the matrix Q with entries q_ij = λ1 + (β/2)||y_i^(t+1) − y_j^(t+1)||²; subproblem ⑥ can then be rewritten as

min_{A,E} ||Q ⊙ A||_1 + λ2||E||_1  s.t.  X = XA + E, diag(A) = 0  ⑧

where ⊙ is the Hadamard (element-wise) product of matrices. To remove the coupling of the sparse self-representation coefficients during optimization, introduce an auxiliary variable Z and, for notational simplicity, drop the iteration superscripts; optimization problem ⑧ is then equivalent to

min_{A,Z,E} ||Q ⊙ A||_1 + λ2||E||_1  s.t.  X = XZ + E, Z = A − diag(A)  ⑨
Its augmented Lagrangian function is

L_μ(A, Z, E, Λ^(1), Λ^(2)) = ||Q ⊙ A||_1 + λ2||E||_1 + ⟨Λ^(1), X − XZ − E⟩ + ⟨Λ^(2), Z − A + diag(A)⟩ + (μ/2)(||X − XZ − E||_F² + ||Z − A + diag(A)||_F²)

where Λ^(1) and Λ^(2) are the Lagrange multiplier matrices, ⟨·,·⟩ denotes the matrix inner product, and μ > 0 is the penalty parameter. Each variable of problem ⑨ is solved iteratively with the alternating direction method of multipliers, as follows.
The subproblem in A is solved by

A_{κ+1} = S_{Q/μ}(Z_κ + Λ_κ^(2)/μ), followed by setting diag(A_{κ+1}) = 0,

where S_η(·) is the soft-thresholding operator, S_η(x) = sgn(x)·max(|x| − η, 0), applied element-wise with threshold q_ij/μ for entry (i, j).
The subproblem in Z is a smooth quadratic; setting its gradient with respect to Z to zero yields the analytic solution

Z_{κ+1} = (X^T X + I)^{-1} (X^T(X − E_κ + Λ_κ^(1)/μ) + A_{κ+1} − Λ_κ^(2)/μ)

where the term A_{κ+1} − diag(A_{κ+1}) has been simplified to A_{κ+1} because its diagonal is already zero.
The subproblem in E is solved by E_{κ+1} = S_{λ2/μ}(X − XZ_{κ+1} + Λ_κ^(1)/μ).
Finally, update the Lagrange multiplier matrices:
Λ_{κ+1}^(1) = Λ_κ^(1) + μ(X − XZ_{κ+1} − E_{κ+1}),
Λ_{κ+1}^(2) = Λ_κ^(2) + μ(Z_{κ+1} − A_{κ+1} + diag(A_{κ+1})).
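One full ADMM pass, as reconstructed above, can be sketched in NumPy as follows. The function names, the fixed penalty μ, and the choice of weight matrix Q are ours; Q holds the label-induced thresholds (a matrix of ones recovers the plain ℓ1 penalty of problem ①):

```python
import numpy as np

def soft_threshold(X, eta):
    """Element-wise S_eta(x) = sgn(x) * max(|x| - eta, 0); eta may be
    a scalar or a matrix of per-entry thresholds."""
    return np.sign(X) * np.maximum(np.abs(X) - eta, 0.0)

def admm_step(X, Z, E, L1, L2, Q, mu, lam):
    """One ADMM pass for the (reconstructed) problem 9:
       min ||Q .* A||_1 + lam*||E||_1  s.t.  X = XZ + E, Z = A - diag(A)."""
    # A-update: weighted soft-thresholding, then enforce diag(A) = 0
    A = soft_threshold(Z + L2 / mu, Q / mu)
    np.fill_diagonal(A, 0.0)
    # Z-update: gradient of the augmented Lagrangian set to zero
    n = X.shape[1]
    Z = np.linalg.solve(X.T @ X + np.eye(n),
                        X.T @ (X - E + L1 / mu) + A - L2 / mu)
    # E-update: soft-threshold the reconstruction residual
    E = soft_threshold(X - X @ Z + L1 / mu, lam / mu)
    # multiplier updates (the last two equations of step S2)
    L1 = L1 + mu * (X - X @ Z - E)
    L2 = L2 + mu * (Z - A)          # diag(A) is already zero
    return A, Z, E, L1, L2

# Tiny run: three 2-dimensional samples as columns of X
X = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.1]])
Q = np.ones((3, 3))                 # plain l1 (no label weights)
Z = np.zeros((3, 3)); E = np.zeros_like(X)
L1 = np.zeros_like(X); L2 = np.zeros((3, 3))
for _ in range(100):
    A, Z, E, L1, L2 = admm_step(X, Z, E, L1, L2, Q, mu=1.0, lam=1.0)
```

The multiplier updates drive the constraint residual X − XZ − E toward zero over the iterations, which is the usual ADMM feasibility behavior.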
S3: a method for determining the class label of a sample, which mainly includes:
a) For a sample z whose class is to be determined, compute the soft label of z with the Laplacian regularized least squares classifier, i.e.,

y_z = F(z) = B k_z,

where k_z = [k(x_1, z), k(x_2, z), ..., k(x_{l+u}, z)]^T.
b) Find the dimension with the largest element in the soft label vector and take that dimension as the class of the sample, i.e.,

class(z) = argmax_{1≤j≤C} (y_z)_j.
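A NumPy sketch of step S3 follows; the Gaussian kernel k(x, z) = exp(−||x − z||²) is an illustrative choice (any kernel consistent with training can be used), and the function name is ours:

```python
import numpy as np

def classify(B, X_train, z):
    """Step S3: soft label y_z = B k_z with
    k_z = [k(x_1, z), ..., k(x_{l+u}, z)]^T, then take the index of
    the largest entry of y_z as the class."""
    k_z = np.exp(-np.sum((X_train - z[:, None]) ** 2, axis=0))
    y_z = B @ k_z                    # soft label vector of length C
    return int(np.argmax(y_z)), y_z

# Toy check: two 1-D training samples (columns), identity "classifier"
X_train = np.array([[0.0, 1.0]])
B = np.eye(2)
label_near_0, _ = classify(B, X_train, np.array([0.1]))
label_near_1, _ = classify(B, X_train, np.array([0.9]))
```

Because the classifier is an explicit function of z, this step applies unchanged to samples outside the training set, which is the out-of-sample property claimed above.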
Those of ordinary skill in the art will appreciate that all or part of the steps of the methods of the above embodiments can be carried out by a program instructing the relevant hardware; the program can be stored in a computer-readable storage medium such as a ROM/RAM, a magnetic disk, or an optical disc.
The above disclosure is only a preferred embodiment of the present invention and certainly cannot limit the scope of the claims of the present invention; equivalent changes made according to the claims of the present invention therefore still fall within the scope of the present invention.
Claims (4)
1. A semi-supervised classification method that simultaneously learns the affinity matrix and Laplacian regularized least squares, characterized in that the method comprises:
S1: building, from the training samples, a joint model that simultaneously learns the affinity matrix and Laplacian regularized least squares;
S2: iteratively optimizing each block of variables in the joint model with the block coordinate descent algorithm until convergence;
S3: computing the soft label of a sample to be classified with the Laplacian regularized least squares classifier, and taking the dimension with the largest element in the soft label vector as the class of the sample to be classified.
2. The semi-supervised classification method according to claim 1, characterized in that step S1 comprises the following steps:
S11: constructing the affinity matrix between training samples by sparse self-representation, i.e., if a sample can be sparsely represented by several other samples, the sample has strong affinity with those samples;
S12: constructing the graph-Laplacian relation between sample labels, i.e., samples with strong affinity should have similar labels;
S13: embedding the sparse self-representation problem of the samples into the Laplacian regularized least squares classifier, thereby establishing the joint model that simultaneously learns the affinity matrix and Laplacian regularized least squares.
3. The semi-supervised classification method according to claim 1, characterized in that step S2 comprises the following sub-steps:
S21: decomposing the established joint model into a Laplacian regularized least squares classifier subproblem and a sparse self-representation subproblem, and solving the two subproblems alternately by iteration;
S22: for the Laplacian regularized least squares classifier subproblem, solving for the classifier coefficient matrix analytically by setting the gradient to zero;
S23: for the sparse self-representation subproblem, introducing an auxiliary variable to remove the coupling of the sparse self-representation coefficients during optimization, and solving with the alternating direction method of multipliers.
4. The semi-supervised classification method according to claim 1, characterized in that step S3 comprises the following sub-steps:
S31: computing the soft label of the sample to be classified with the Laplacian regularized least squares classifier;
S32: finding the dimension with the largest element in the soft label vector, and taking that dimension as the class of the sample to be classified.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810233453.7A CN108805155A (en) | 2018-03-21 | 2018-03-21 | Learn the semisupervised classification method of incidence matrix and Laplace regularization least square simultaneously |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108805155A true CN108805155A (en) | 2018-11-13 |
Family
ID=64095254
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810233453.7A Pending CN108805155A (en) | 2018-03-21 | 2018-03-21 | Learn the semisupervised classification method of incidence matrix and Laplace regularization least square simultaneously |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108805155A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110363178A (en) * | 2019-07-23 | 2019-10-22 | 上海黑塞智能科技有限公司 | The airborne laser point cloud classification method being embedded in based on part and global depth feature |
CN110363178B (en) * | 2019-07-23 | 2021-10-15 | 上海黑塞智能科技有限公司 | Airborne laser point cloud classification method based on local and global depth feature embedding |
CN111160398A (en) * | 2019-12-06 | 2020-05-15 | 重庆邮电大学 | Missing label multi-label classification method based on example level and label level association |
CN111160398B (en) * | 2019-12-06 | 2022-08-23 | 重庆邮电大学 | Missing label multi-label classification method based on example level and label level association |
CN112801162A (en) * | 2021-01-22 | 2021-05-14 | 之江实验室 | Adaptive soft label regularization method based on image attribute prior |
CN115240863A (en) * | 2022-08-11 | 2022-10-25 | 合肥工业大学 | Alzheimer disease classification method and system for data loss scene |
CN115240863B (en) * | 2022-08-11 | 2023-05-09 | 合肥工业大学 | Alzheimer's disease classification method and system for data loss scene |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20181113 |