CN110348287A - Unsupervised feature selection method and device based on dictionary and sample similarity graph - Google Patents
Unsupervised feature selection method and device based on dictionary and sample similarity graph
- Publication number
- CN110348287A (application CN201910441208.XA)
- Authority
- CN
- China
- Prior art keywords
- dictionary
- matrix
- objective function
- sample
- space
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/19—Recognition using electronic means
- G06V30/192—Recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references
- G06V30/194—References adjustable by an adaptive method, e.g. learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/08—Feature extraction
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Databases & Information Systems (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The present invention discloses an unsupervised feature selection method and system based on dictionary and sample similarity graph learning. In this method and system, a new dictionary basis space W is generated from the given raw data matrix X, and the l_{2,1} norm is used to impose row sparsity on W, which in turn measures the importance of features. Compared with previous low-level representation methods in the original feature space, the dictionary and sample similarity graph learning model disclosed by the present invention captures higher-level and more abstract representations, and has broad application prospects.
Description
Technical field
The present invention relates to the fields of signal processing and data analysis, and in particular to an unsupervised feature selection method and device based on dictionary and sample similarity graph learning.
Background technique
With the rapid development of sensors and Internet media, high-dimensional data analysis and big data have become a challenging and unavoidable problem. High-dimensional data are used in numerous applications such as computer vision, machine learning, pattern recognition and medical analysis. Although data are typically represented as high-dimensional feature vectors, for learning tasks only a very small but unknown subset of the features is important and discriminative. Directly processing such high-dimensional data not only significantly increases the demand on computing resources, but also degrades the performance of many existing algorithms due to the curse of dimensionality. Feature selection is generally regarded as an important process for reducing the number of irrelevant or redundant features so as to improve the performance of learning algorithms.
According to whether the label information of the sample data is utilized, feature selection methods can generally be divided into three classes: supervised feature selection, unsupervised feature selection and semi-supervised feature selection. In general, unsupervised feature selection methods can be summarized into three kinds, namely filter, wrapper and embedded methods. Compared with the other two kinds, embedded methods usually construct a learning model to output the subset of features; they can achieve excellent performance in many respects and have developed rapidly. Data representations based on embedded methods have been widely applied, and various models have been proposed. These methods assume that the original data can be represented by performing a reconstruction function on the selected feature subset, and then add a special regularization term on the coefficient matrix that characterizes the feature selection. Among them, the most commonly used algorithm model is the self-representation algorithm, which assumes that each feature can be expressed as a linear combination of its correlated features. In addition, to capture the local structure of the data, the graph Laplacian is usually embedded into embedded UFS (unsupervised feature selection) methods. However, traditional embedded methods suffer from at least two problems. First, most existing algorithms perform feature selection on the original feature space; the noise and redundant features of the original feature space often lead to inaccurate results, and higher-level, more abstract representations cannot be exploited. Second, traditional UFS methods usually construct the similarity matrix and perform feature selection separately; in the subsequent process, the similarity matrix obtained from the original data therefore remains unchanged, but the noisy samples and features inevitably contained in real data make the similarity matrix unreliable. Such an unreliable similarity matrix may destroy the local manifold structure of the original data, so that the result cannot be optimal.
Summary of the invention
The technical problem to be solved by the present invention is, in view of the drawbacks of the prior art, to provide an unsupervised feature selection method and system based on dictionary and sample similarity graph learning (DGL-UFS). Compared with other unsupervised feature selection algorithms such as LS, Baseline, SCUFS and RJGSC, the DGL-UFS method selects features on the dictionary basis space rather than on the original feature space, and automatically learns a similarity matrix so as to preserve the local manifold structure of the data.
The technical solution adopted by the present invention to solve the technical problem is to construct an unsupervised feature selection method based on dictionary and sample similarity graph learning, comprising the following steps:
S1. Given a raw data matrix X = {x_1, x_2, …, x_n} = {f_1; f_2; …; f_d} ∈ R^{d×n}, where:
n is the number of samples and d is the number of feature vectors; for i ∈ {1, …, n}, x_i ∈ R^{d×1} denotes the i-th sample; for j ∈ {1, …, d}, f_j ∈ R^{1×n} denotes the j-th feature vector;
S2. Learn a dictionary D ∈ R^{d×m} with m basis vectors, and use the dictionary D to reconstruct the raw data matrix X given in step S1, obtaining a new dictionary basis space W; the l_{2,1} norm is used to impose row sparsity on W, so that the corresponding basis vectors are selected from the dictionary D to represent each sample;
S3. Using the Laplacian constraint model based on a graph model, define a constraint regularization term;
S4. Combining the dictionary basis space W obtained in step S2 with the constraint regularization term defined in step S3, perform feature selection in the dictionary basis space W, specifically: construct the DGL-UFS model objective function and solve it to obtain W; an alternating iterative optimization strategy is adopted, in which the multiple variables in the objective function constrain each other, and W is finally computed.
Further, the raw data matrix X is reconstructed, and the resulting reconstruction coefficient matrix W ∈ R^{m×n} is the newly generated dictionary space; the objective function for learning the basis vectors and the new dictionary basis space can be expressed as follows:
where ||·||_F^2 denotes the square of the Frobenius norm of a matrix; β is a positive balance parameter; the i-th column of the dictionary D is denoted d_i, and the constraint ||d_i||_2 ≤ 1 keeps the atoms of the learnt dictionary space compact; ||W||_{2,1} denotes imposing row sparsity on the dictionary basis space W via the l_{2,1} norm; m is a positive integer greater than 1.
Further, the constraint regularization term defined in step S3 is expressed as follows:
where S_ij is the similarity measure between data points x_i and x_j; L = A − S is the graph Laplacian matrix, A is a diagonal matrix with A_ii = Σ_j S_ij, and S is the similarity matrix; w_i and w_j are the i-th and j-th columns of W.
Further, the DGL-UFS model objective function is defined as follows:
s.t. ||d_i||_2 ≤ 1, S^T 1 = 1, S ≥ 0, i = 1, 2, …, m
where ||·||_F^2 denotes the square of the Frobenius norm of a matrix; α is a positive constant that balances the graph regularization term, and β is a positive balance parameter; L = A − S is the graph Laplacian matrix, A is a diagonal matrix, and S is the similarity matrix; ||W||_{2,1} denotes imposing row sparsity on the dictionary basis space W via the l_{2,1} norm; the i-th column of the dictionary D is denoted d_i, and the constraint ||d_i||_2 ≤ 1 keeps the atoms of the learnt dictionary space compact; m is a positive integer greater than 1.
Further, in step S4, the specific steps for solving the objective function by the alternating iterative optimization strategy are as follows:
S51. Initialize the dictionary basis D and the similarity matrix S;
S52. Fix the parameters D and S, and convert the objective function as follows:
the above objective function is solved with an iteratively reweighted least squares (IRLS) algorithm to update the parameter W;
S53. Fix the parameters W and S, and convert the objective function as follows:
using the ADMM (Alternating Direction Method of Multipliers) algorithm, a variable matrix is introduced into the objective function to update the parameter D;
S54. Fix the parameters W and D, and convert the objective function as follows:
since the similarity vector of each sample is independent, for the i-th sample the current objective function is expressed as follows:
by introducing a Lagrange multiplier, the similarity vector s_i of the current i-th sample is obtained; further, the similarity vector of each sample is updated in turn with the same method, realizing the update of the parameter S;
S55. Repeat steps S52-S54 until the objective function converges, and output the parameters S, D and W at convergence.
An unsupervised feature selection device based on dictionary and sample similarity graph learning proposed by the present invention comprises the following modules:
a data setup module, for giving a raw data matrix X = {x_1, x_2, …, x_n} = {f_1; f_2; …; f_d} ∈ R^{d×n}, where:
n is the number of samples and d is the number of feature vectors; for i ∈ {1, …, n}, x_i ∈ R^{d×1} denotes the i-th sample; for j ∈ {1, …, d}, f_j ∈ R^{1×n} denotes the j-th feature vector;
a space reconstruction module, for learning a dictionary D ∈ R^{d×m} with m basis vectors and using the dictionary D to reconstruct the raw data matrix X given by the data setup module, obtaining a new dictionary basis space W; the l_{2,1} norm is used to impose row sparsity on W, so that the corresponding basis vectors are selected from the dictionary D to represent each sample;
a constraint regularization module, for defining a constraint regularization term using the Laplacian constraint model based on a graph model;
a feature selection module, for combining the dictionary basis space W obtained by the space reconstruction module with the constraint regularization term defined by the constraint regularization module to perform feature selection in the dictionary basis space W, specifically: constructing the DGL-UFS model objective function and solving it to obtain W; an alternating iterative optimization strategy is adopted, in which the multiple variables in the objective function constrain each other, and W is finally computed.
Further, the raw data matrix X is reconstructed, and the resulting reconstruction coefficient matrix W ∈ R^{m×n} is the newly generated dictionary space; the objective function for learning the basis vectors and the new dictionary basis space can be expressed as follows:
where ||·||_F^2 denotes the square of the Frobenius norm of a matrix; β is a positive balance parameter; the i-th column of the dictionary D is denoted d_i, and the constraint ||d_i||_2 ≤ 1 keeps the atoms of the learnt dictionary space compact; ||W||_{2,1} denotes imposing row sparsity on the dictionary basis space W via the l_{2,1} norm; m is a positive integer greater than 1.
Further, the constraint regularization term defined in the constraint regularization module is expressed as follows:
where S_ij is the similarity measure between data points x_i and x_j; L = A − S is the graph Laplacian matrix, A is a diagonal matrix with A_ii = Σ_j S_ij, and S is the similarity matrix; w_i and w_j are the i-th and j-th columns of W.
Further, the DGL-UFS model objective function is defined as follows:
s.t. ||d_i||_2 ≤ 1, S^T 1 = 1, S ≥ 0, i = 1, 2, …, m;
where ||·||_F^2 denotes the square of the Frobenius norm of a matrix; α is a positive constant that balances the graph regularization term, and β is a positive balance parameter; L = A − S is the graph Laplacian matrix, A is a diagonal matrix, and S is the similarity matrix; ||W||_{2,1} denotes imposing row sparsity on the dictionary basis space W via the l_{2,1} norm; the i-th column of the dictionary D is denoted d_i, and the constraint ||d_i||_2 ≤ 1 keeps the atoms of the learnt dictionary space compact; m is a positive integer greater than 1.
Further, in the feature selection module, the specific steps for solving the objective function by the alternating iterative optimization strategy are as follows:
S51. Initialize the dictionary basis D and the similarity matrix S;
S52. Fix the parameters D and S, and convert the objective function as follows:
the above objective function is solved with an iteratively reweighted least squares (IRLS) algorithm to update the parameter W;
S53. Fix the parameters W and S, and convert the objective function as follows:
using the ADMM (Alternating Direction Method of Multipliers) algorithm, a variable matrix is introduced into the objective function to update the parameter D;
S54. Fix the parameters W and D, and convert the objective function as follows:
since the similarity vector of each sample is independent, for the i-th sample the current objective function is expressed as follows:
by introducing a Lagrange multiplier, the similarity vector s_i of the current i-th sample is obtained; further, the similarity vector of each sample is updated in turn with the same method, realizing the update of the parameter S;
S55. Repeat steps S52-S54 until the objective function converges, and output the parameters S, D and W at convergence.
In the unsupervised feature selection method and system based on dictionary and sample similarity graph learning of the present invention, since W = {w_1, w_2, …, w_n} is a new space of X, the row sparsity imposed on W by the l_{2,1} norm in the new dictionary basis space can be used to measure the importance of features. Compared with previous low-level representation methods in the original feature space, DGL-UFS captures higher-level and more abstract representations.
Brief description of the drawings
The present invention will be further explained below with reference to the accompanying drawings and embodiments, in which:
Fig. 1 is a flowchart of the unsupervised feature selection method of the present invention;
Fig. 2 is a schematic structural diagram of the unsupervised feature selection device disclosed by the present invention;
Fig. 3 shows the iterative algorithm disclosed by the present invention for solving the dictionary basis space W;
Fig. 4 shows the algorithm disclosed by the present invention for solving the objective function with the iterative optimization algorithm;
Fig. 5 shows the statistics of the eight databases;
Fig. 6 shows the clustering results (ACC% ± std%) of different feature selection methods on each database;
Fig. 7 shows the clustering results (NMI% ± std%) of different feature selection methods on each database;
Fig. 8 shows the ACC values of different methods for different numbers of selected features on different data sets;
Fig. 9 shows the NMI values of different methods for different numbers of selected features on different data sets;
Fig. 10 shows the ACC values of the present invention when the value of β is varied while keeping parameter α = 1;
Fig. 11 shows the NMI values of the present invention when the value of β is varied while keeping parameter α = 1;
Fig. 12 shows the ACC values of the present invention when the value of α is varied while keeping parameter β = 1;
Fig. 13 shows the NMI values of the present invention when the value of α is varied while keeping parameter β = 1;
Fig. 14 shows the convergence curves of Algorithm 1 and Algorithm 2 on the Yale data set;
Fig. 15 shows the convergence curves of Algorithm 1 and Algorithm 2 on the COIL20 data set;
Fig. 16 shows the convergence curves of Algorithm 1 and Algorithm 2 on the Isolet data set;
Fig. 17 shows the convergence curves of Algorithm 1 and Algorithm 2 on the Prostate_GE data set;
Fig. 18 shows the convergence curves of Algorithm 1 and Algorithm 2 on the Lung data set;
Fig. 19 shows the convergence curves of Algorithm 1 and Algorithm 2 on the SRBCT data set.
Specific embodiment
For a clearer understanding of the technical features, objects and effects of the present invention, specific embodiments of the present invention are now described in detail with reference to the accompanying drawings.
Referring to Fig. 1, which is the flowchart of the unsupervised feature selection method of the present invention, the unsupervised feature selection method based on dictionary and sample similarity graph learning disclosed by the present invention comprises the following steps:
S1. Given a raw data matrix X = {x_1, x_2, …, x_n} = {f_1; f_2; …; f_d} ∈ R^{d×n}, where:
n is the number of samples with i ∈ {1, …, n}, and d is the number of features with j ∈ {1, …, d}; x_i ∈ R^{d×1} denotes the i-th sample, and f_j ∈ R^{1×n} denotes the j-th feature vector;
S2. Learn a dictionary D ∈ R^{d×m} with m basis vectors, and use the dictionary D to reconstruct the raw data matrix X given in step S1, obtaining a new dictionary basis space W; the l_{2,1} norm is used to impose row sparsity on W, so that the corresponding basis vectors are selected from the dictionary D to represent each sample;
In fact, Yang et al. have confirmed that the number of bases affects the quality of W. In order for the algorithm model disclosed by the present invention to output feature importance, in this embodiment the number of bases is set equal to the dimension of the features. The objective function for learning the basis vectors and the basis space can be expressed as follows:
where ||·||_F^2 denotes the square of the Frobenius norm of a matrix; β is a positive balance parameter; the i-th column of the dictionary D is denoted d_i, and the constraint ||d_i||_2 ≤ 1 keeps the atoms of the learnt dictionary space compact; ||W||_{2,1} denotes imposing row sparsity on the dictionary basis space W via the l_{2,1} norm; m is a positive integer greater than 1.
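The formula itself is not reproduced in this text; from the surrounding definitions, the basis-learning objective can be sketched in standard notation as (a reconstruction, not the patent's original typesetting):

```latex
\min_{D, W}\; \| X - D W \|_F^2 + \beta \| W \|_{2,1}
\quad \text{s.t.}\; \| d_i \|_2 \le 1,\; i = 1, \dots, m,
\qquad \text{with}\quad \| W \|_{2,1} = \sum_{i=1}^{m} \| w^{i} \|_2,
```

where w^i denotes the i-th row of W.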
In this embodiment, the l_{2,1} norm is used instead of the l_1 norm of traditional dictionary learning models, for two main reasons. First, the l_1 norm induces element-wise sparsity, which cannot reflect the importance of different features; in contrast, the l_{2,1} norm imposes row sparsity on W, so that the regularization measures distances along the feature dimension. Second, forcing row sparsity on W selects closely related bases from D to represent each sample, which also preserves the local properties of the original data.
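As an illustration of how row sparsity yields feature importance, a minimal sketch (the variable names and the toy W are hypothetical, not from the patent):

```python
import numpy as np

def feature_scores(W):
    """l2 norm of each row of W; a larger norm marks a more important feature."""
    return np.linalg.norm(W, ord=2, axis=1)

def l21_norm(W):
    """The l2,1 norm: the sum of the l2 norms of the rows."""
    return feature_scores(W).sum()

# Toy m x n coefficient matrix over the dictionary basis; with m = d
# (one atom per feature), row i scores feature i.
W = np.array([[3.0, 4.0],   # row norm 5 -> important feature
              [0.0, 0.0],   # row norm 0 -> pruned feature
              [0.6, 0.8]])  # row norm 1
scores = feature_scores(W)
top = np.argsort(scores)[::-1]  # ranking, most important feature first
```

Zeroed-out rows of W correspond to features that the dictionary never uses, which is exactly the selection signal the text describes.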
S3. Using the Laplacian constraint model based on a graph model, preserve the local geometric relations within the raw data matrix X, and define the constraint regularization term:
where S_ij is the similarity measure between data points x_i and x_j; L = A − S is the graph Laplacian matrix, A is a diagonal matrix with A_ii = Σ_j S_ij, and S is the similarity matrix; w_i and w_j are the i-th and j-th columns of W.
In previous research, the similarity between data samples is usually computed with a certain distance measure (such as a Gaussian kernel function). However, in the subsequent process, the similarity matrix obtained from the original data is predefined and fixed, and the noisy samples and features inevitably contained in real data make the similarity matrix unreliable. We attempt to learn the similarity matrix automatically during the optimization process, rather than using a precomputed similarity matrix.
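The graph regularizer can be understood through the standard identity Tr(W L W^T) = (1/2) Σ_ij S_ij ||w_i − w_j||_2^2 over the columns of W; a small numerical check of this identity on toy data (not the patent's data):

```python
import numpy as np

def laplacian(S):
    """Graph Laplacian L = A - S with A diagonal, A_ii = sum_j S_ij."""
    return np.diag(S.sum(axis=1)) - S

def graph_reg(W, S):
    """Tr(W L W^T): small when similar samples have similar codes."""
    return np.trace(W @ laplacian(S) @ W.T)

rng = np.random.default_rng(0)
n, m = 5, 3
S = rng.random((n, n)); S = (S + S.T) / 2   # symmetric toy similarities
W = rng.random((m, n))                      # columns = per-sample codes
pairwise = 0.5 * sum(S[i, j] * np.sum((W[:, i] - W[:, j]) ** 2)
                     for i in range(n) for j in range(n))
```

So minimizing the term pulls the representations of strongly similar samples together, which is how the local geometry of X is preserved.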
S4. Combining the dictionary basis space W obtained in step S2 with the constraint regularization term defined in step S3, perform feature selection in the dictionary basis space W, specifically: construct the DGL-UFS model objective function and solve it to obtain W; an alternating iterative optimization strategy is adopted, in which the multiple variables in the objective function constrain each other, and W is finally computed.
The DGL-UFS model objective function is defined as follows:
s.t. ||d_i||_2 ≤ 1, S^T 1 = 1, S ≥ 0, i = 1, 2, …, m;
where α is a positive constant that balances the graph regularization term.
Since W = {w_1, w_2, …, w_n} is a new space of X, the row sparsity imposed on W by the l_{2,1} norm in the new dictionary basis space can be used to measure the importance of features. Compared with previous low-level representation methods in the original feature space, DGL-UFS captures higher-level and more abstract representations.
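Combining the reconstruction term, the graph regularizer of step S3 and the row-sparsity penalty, the full DGL-UFS objective can be sketched as (again a reconstruction from the surrounding definitions, not the patent's typesetting):

```latex
\min_{D, W, S}\; \| X - D W \|_F^2
+ \alpha \,\mathrm{Tr}\!\left( W L W^{\top} \right)
+ \beta \| W \|_{2,1}
\quad \text{s.t.}\; \| d_i \|_2 \le 1,\; S^{\top}\mathbf{1} = \mathbf{1},\; S \ge 0,
```

with L = A − S as defined above, so that D, W and S are learnt jointly.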
Referring to Fig. 2, which is the schematic structural diagram of the unsupervised feature selection device disclosed by the present invention, the device comprises a data setup module L1, a space reconstruction module L2, a constraint regularization module L3 and a feature selection module L4, wherein:
the data setup module L1 is for giving a raw data matrix X = {x_1, x_2, …, x_n} = {f_1; f_2; …; f_d} ∈ R^{d×n}, where:
n is the number of samples with i ∈ {1, …, n}, and d is the number of features with j ∈ {1, …, d}; x_i ∈ R^{d×1} denotes the i-th sample, and f_j ∈ R^{1×n} denotes the j-th feature vector;
the space reconstruction module L2 is for learning a dictionary D ∈ R^{d×m} with m basis vectors and using the dictionary D to reconstruct the raw data matrix X given by the data setup module, obtaining a new dictionary basis space W; the l_{2,1} norm is used to impose row sparsity on W, so that the corresponding basis vectors are selected from the dictionary D to represent each sample;
the constraint regularization module L3 is for preserving the local geometric relations within the raw data matrix X using the Laplacian constraint model based on a graph model, and defining the constraint regularization term;
the feature selection module L4 is for combining the dictionary basis space W obtained by the space reconstruction module with the constraint regularization term defined by the constraint regularization module to perform feature selection in the dictionary basis space W, specifically: constructing the DGL-UFS model objective function and solving it to obtain W; an alternating iterative optimization strategy is adopted, in which the multiple variables in the objective function constrain each other, and W is finally computed.
Referring to Fig. 3 and Fig. 4, which show the iterative algorithm disclosed by the present invention for solving the dictionary basis space W and the algorithm for solving the objective function with the iterative optimization algorithm, the specific steps for solving the objective function by the alternating iterative optimization strategy are as follows:
(1) Fix D and S, update W:
With D and S fixed, solving for W converts the objective function as follows:
Due to the l_{2,1} norm regularization of W, the above formula is convex but non-smooth; in this embodiment an iteratively reweighted least squares (IRLS) algorithm is used to solve it. Suppose the current estimate of W in the W-subproblem is W^t; define a diagonal weight matrix G^t whose i-th diagonal element is g_ii^t = 1/(2||w^{t,i}||_2), where w^{t,i} is the i-th row of W^t. Then W^{t+1} is updated by solving the following weighted least-squares problem:
Taking the derivative of the above formula with respect to W and setting the result to zero, we obtain:
(D^T D + βG)W + αWL = D^T X
The above equation is a Sylvester equation; since D^T D + βG is strictly positive definite, it has a stable solution.
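A hedged sketch of this W-update as a Sylvester solve, using SciPy's Bartels-Stewart-based solver; the random data, the uniform S and the identity IRLS weights G are illustrative stand-ins, not values from the patent:

```python
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(1)
d, m, n = 6, 6, 10          # m = d, as the embodiment sets
alpha, beta = 1.0, 1.0
X = rng.random((d, n))
D = rng.random((d, m))
G = np.eye(m)                    # IRLS diagonal weights (first iteration guess)
S = np.full((n, n), 1.0 / n)     # uniform toy similarities
L = np.diag(S.sum(axis=1)) - S   # graph Laplacian L = A - S

# (D^T D + beta*G) W + W (alpha*L) = D^T X  is  A W + W B = C
A = D.T @ D + beta * G
B = alpha * L
C = D.T @ X
W = solve_sylvester(A, B, C)
residual = np.linalg.norm(A @ W + W @ B - C)
```

Since A is positive definite and B is positive semidefinite, their spectra cannot cancel, which is the stability condition the text invokes.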
(2) Fix W and S, update D:
With W and S fixed, solving for D converts the objective function as follows:
To solve the above formula, in this embodiment the ADMM (Alternating Direction Method of Multipliers) algorithm is used; the optimal solution of D is obtained by introducing a variable matrix, giving:
The optimal D is then obtained through the following iterative steps:
Y^{(t+1)} = Y^{(t)} + D^{(t+1)} − H^{(t+1)}
where Y is the Lagrange multiplier, κ is a penalty parameter, and t denotes the iteration number.
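The D-update can be sketched as follows, assuming the standard ADMM splitting in which the unit-ball constraint ||d_i||_2 ≤ 1 is carried by the auxiliary variable H with scaled dual Y, matching the update Y ← Y + D − H quoted above (kappa and the iteration count are illustrative choices, not values from the patent):

```python
import numpy as np

def update_D(X, W, kappa=1.0, iters=100):
    """ADMM sketch for min ||X - D W||_F^2 s.t. unit-norm columns of D."""
    d, _ = X.shape
    m = W.shape[0]
    D = np.zeros((d, m)); H = np.zeros((d, m)); Y = np.zeros((d, m))
    XWt = X @ W.T
    inv = np.linalg.inv(2 * W @ W.T + kappa * np.eye(m))
    for _ in range(iters):
        D = (2 * XWt + kappa * (H - Y)) @ inv      # unconstrained LS step
        V = D + Y
        norms = np.maximum(np.linalg.norm(V, axis=0), 1.0)
        H = V / norms                              # project columns to unit ball
        Y = Y + D - H                              # dual update, as in the text
    return H

rng = np.random.default_rng(2)
X = rng.random((5, 8)); W = rng.random((4, 8))
D = update_D(X, W)
```

The projection step is what keeps the learnt atoms compact, as the constraint in the objective requires.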
(3) Fix W and D, update S:
With W and D fixed, solving for S converts the objective function as follows:
Since the similarity vector of each sample is independent, for the i-th sample we have:
s.t. 1^T s_i = 1, 0 ≤ s_ij ≤ 1
For ease of notation, in this embodiment a matrix M is introduced, with vector m_i ∈ R^{n×1}; the above formula can then be written as:
By introducing a Lagrange multiplier, s_i is obtained.
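The core building block of such a Lagrangian/KKT solution is Euclidean projection onto the probability simplex {s : 1^T s = 1, s ≥ 0}; a standard sort-based projection sketch is shown here as an assumption, since the patent's exact closed form is not reproduced:

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {s : sum(s) = 1, s >= 0}."""
    u = np.sort(v)[::-1]                       # sort descending
    css = np.cumsum(u)
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u * k > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)       # shared Lagrange shift
    return np.maximum(v - theta, 0.0)          # KKT: clip negatives to zero

s = project_simplex(np.array([0.5, 0.9, -0.2]))
```

The Lagrange multiplier theta is exactly the shift that makes the clipped vector sum to one, mirroring the constraint 1^T s_i = 1 above.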
In the concrete implementation, D is initialized with a random matrix, and S is initialized with the traditional Gaussian kernel function:
where N_k(x_i) denotes the set of k nearest neighbors of x_i, and σ is a width parameter (in the initialization step, k = 5 and σ = 1).
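The initialization of S can be sketched with the usual k-nearest-neighbor Gaussian kernel; the exact kernel expression exp(−||x_i − x_j||^2 / (2σ^2)) is assumed from context, and k is reduced here only to suit the toy data:

```python
import numpy as np

def init_similarity(X, k=2, sigma=1.0):
    """Gaussian similarities restricted to each sample's k nearest neighbors."""
    n = X.shape[1]                                        # columns are samples
    d2 = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)  # squared distances
    S = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]                 # skip the sample itself
        S[i, nbrs] = np.exp(-d2[i, nbrs] / (2 * sigma ** 2))
    return (S + S.T) / 2                                  # symmetrize

# Two tight pairs of samples: (0,1) near the origin, (2,3) far away.
X = np.array([[0.0, 0.1, 5.0, 5.1],
              [0.0, 0.0, 0.0, 0.0]])
S = init_similarity(X, k=1)
```

Only neighboring pairs get nonzero similarity, which is the sparse starting graph that the alternating optimization then refines.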
In order to fully verify the effectiveness of the DGL-UFS method of the present invention, the performance of the DGL-UFS method is first tested on six common public databases (Yale, Isolet, COIL20, Prostate_GE, Lung, SRBCT), and compared with the following seven currently popular unsupervised feature selection algorithms:
(1) Baseline: all original features are used.
(2) LS: Laplacian Score feature selection, which selects the features that best conform to the Laplacian matrix.
(3) MCFS: Multi-Cluster Feature Selection, which uses the l_1 norm to formalize the feature selection process as a spectral-information regression problem.
(4) SOGFS: an unsupervised feature selection method that performs feature selection and local structure learning simultaneously, so that the similarity matrix can be determined adaptively.
(5) CDL-FS: a new coupled analysis-synthesis dictionary learning method for unsupervised feature selection, which uses a synthesis dictionary to reconstruct samples and an analysis dictionary to analytically code the samples and assign them probabilities.
(6) SCUFS: subspace-clustering-guided unsupervised feature selection, which learns the cluster labels of the training samples by representation-based subspace clustering and selects the features that best preserve the cluster labels.
(7) RJGSC: a new unsupervised feature selection via a joint graph sparse coding model, which first extracts the bases of the training data with some dictionary learning method, and then projects the original data to generate a new representation in the basis space.
In the experiments, the DGL-UFS method and the other seven unsupervised feature selection methods are compared on six public databases. The six databases include a face image database (Yale), a speech signal database (Isolet), an object image database (COIL20) and three biological databases (Prostate_GE, Lung and SRBCT). The statistics of these databases are shown in Fig. 5.
Similar to previous unsupervised feature selection methods, K-means clustering is also performed with the features selected by the present invention, using two widely used evaluation criteria, namely clustering accuracy (ACC) and normalized mutual information (NMI). The larger the ACC and NMI values, the better the performance of the method. Suppose q_i is the clustering result and p_i is the true label; then ACC is defined as follows:
ACC = (1/n) Σ_{i=1}^{n} δ(p_i, map(q_i))
where δ(x, y) = 1 if x = y and δ(x, y) = 0 otherwise; map(q_i) is the best mapping function, whose role is to match the cluster labels obtained by clustering with the true labels of the samples through the Kuhn-Munkres algorithm.
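A sketch of ACC as described, with the best mapping computed by the Kuhn-Munkres (Hungarian) algorithm via SciPy; the label arrays are toy examples:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_acc(true, pred):
    """ACC: fraction correct under the best cluster-to-label mapping."""
    true = np.asarray(true); pred = np.asarray(pred)
    k = max(true.max(), pred.max()) + 1
    cost = np.zeros((k, k), dtype=int)
    for t, p in zip(true, pred):
        cost[p, t] += 1                        # co-occurrence counts
    rows, cols = linear_sum_assignment(-cost)  # Kuhn-Munkres, maximizing matches
    mapping = dict(zip(rows, cols))
    return np.mean([mapping[p] == t for t, p in zip(true, pred)])

# A pure relabeling of the clusters scores a perfect 1.0.
acc = clustering_acc([0, 0, 1, 1, 2, 2], [1, 1, 0, 0, 2, 2])
```

The mapping step is what makes ACC invariant to how the clustering algorithm happens to number its clusters.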
Given two variables P and Q, NMI is defined as:
NMI(P, Q) = I(P, Q) / sqrt(H(P)H(Q))
where H(P) and H(Q) denote the entropies of P and Q respectively, and I(P, Q) denotes the mutual information between P and Q. P is the clustering result of the input samples and Q is their true labels. NMI reflects the degree of consistency between the clustering result and the true labels.
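A sketch of NMI from the quantities named here, using the common normalization I(P,Q)/sqrt(H(P)H(Q)) (an assumption; other normalizations exist in the literature):

```python
import numpy as np

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))

def mutual_info(P, Q):
    P, Q = np.asarray(P), np.asarray(Q)
    mi = 0.0
    for a in np.unique(P):
        for b in np.unique(Q):
            pab = np.mean((P == a) & (Q == b))   # joint probability
            if pab > 0:
                pa, pb = np.mean(P == a), np.mean(Q == b)
                mi += pab * np.log(pab / (pa * pb))
    return mi

def nmi(P, Q):
    return mutual_info(P, Q) / np.sqrt(entropy(P) * entropy(Q))

score = nmi([0, 0, 1, 1], [1, 1, 0, 0])  # a relabeling: full agreement
```

Like ACC, NMI is invariant to a permutation of the cluster labels, but it needs no explicit mapping step.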
The parameters of the DGL-UFS algorithm and the other comparison methods are configured in the experiments: for LS, MCFS, SOGFS, SCUFS and RJGSC, the neighborhood parameter is set to k = 5 for all databases. In order to compare the different methods fairly, in this embodiment a grid search strategy is used to tune the parameters of all methods over {10^-3, 10^-2, 10^-1, 1, 10, 10^2, 10^3}. Since the optimal number of selected features is unknown, for all databases a grid search strategy is used to set the number of selected features over {20, 30, …, 90, 100}.
After the different feature selection algorithms complete feature selection, the K-means algorithm is used to cluster the low-dimensional features selected by them. Considering that the performance of K-means clustering is affected by initialization, the experiment is repeated 20 times with different random initializations, and the average values are finally recorded.
Result analysis:
Fig. 6 and Fig. 7 give the ACC and NMI values of the different methods on the six databases. It can be seen that the performance of the proposed DGL-UFS is consistently better than the other state-of-the-art methods, which shows that the method proposed by the present invention can capture higher-level and more abstract representations than the low-level original ones. Specifically, compared with RJGSC (which performs dictionary basis learning and feature selection independently), it achieves an improvement of more than 5% on most data sets (such as Yale, Isolet, Prostate_GE and SRBCT).
In order to verify the influence of feature selection on the clustering results, Fig. 8 and Fig. 9 illustrate the performance of all methods on the different databases when the number of selected features varies. The ACC and NMI values are related to the number of selected features in the different data sets. It can be seen that within a certain feature range the performance of this method is consistently better than that of the other methods. It is worth noting that when the number of features is small, the ACC value of the DGL-UFS method is higher than that of the Baseline method, which proves the advantage of our feature selection method.
Parameter sensitivity analysis:
The present invention contains two balance parameters (α and β). To study the sensitivity of the invention to these parameters, we fix one of them and vary the other. The specific procedure is as follows: first, α = 1 is fixed and the value of β is varied; the resulting ACC and NMI values on the different databases are shown in Fig. 10 and Fig. 11. Second, β = 1 is fixed and the value of α is varied; the resulting ACC and NMI values on the different databases are shown in Fig. 12 and Fig. 13.
It can be seen that, in most cases, when the number of selected features is fixed, the performance of DGL-UFS is stable over different values of α and β. For the data set Prostate_GE, NMI is somewhat sensitive to β, and the best NMI value is obtained when β = 1; for the data set SRBCT, the result is stable when β > 1. DGL-UFS is insensitive to α except on the data set SRBCT; when α > 0.01 and the number of selected features is greater than 80, the NMI on SRBCT remains stable.
Time complexity analysis of the DGL-UFS algorithm:
During the iterative optimization used to solve the objective function, the time is mainly spent on three parts: solving W, solving D and solving S. For the W-update, the classical method for the Sylvester equation is the Bartels-Stewart algorithm, whose complexity is O(m^3); if T1 is the number of iterations of Algorithm 1, the complexity of each iteration of Algorithm 2 is O(T1·m^3). For the D-update, if T2 is the number of iterations of the ADMM algorithm, then, since m = d, the complexity of each iteration of Algorithm 2 is O(T2·(d^2·n + d^3)). For the S-update, the main cost is computing the distance matrix M, whose time complexity is O(n^2).
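For small problems, a Sylvester equation of the form A X + X B = C (the shape arising in the W-update) can be checked with a naive vectorized solver built on the Kronecker identity vec(A X + X B) = (I ⊗ A + B^T ⊗ I) vec(X). This sketch costs O((mn)^3), far more than the O(m^3) Bartels-Stewart algorithm mentioned above, and is only a dependency-free illustration with hypothetical matrices:

```python
import numpy as np

def solve_sylvester_naive(A, B, C):
    """Solve A @ X + X @ B = C by vectorisation; compact but much slower
    than Bartels-Stewart (as implemented in scipy.linalg.solve_sylvester)."""
    m, n = C.shape
    K = np.kron(np.eye(n), A) + np.kron(B.T, np.eye(m))
    x = np.linalg.solve(K, C.flatten(order="F"))  # column-major vec()
    return x.reshape((m, n), order="F")
```

The solution is unique whenever A and -B share no eigenvalue, which holds in the regularized W-subproblem.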
Convergence analysis of DGL-UFS:
For solving W, this embodiment uses an iteratively reweighted least squares algorithm, and in each iteration an approximate solution is obtained from the Sylvester equation. For solving D, the ADMM algorithm guarantees that the optimization of D converges to the optimal solution. For solving S, based on the Lagrangian function and the KKT conditions, the optimization of S also converges to the optimal solution.
The convergence curves of the objective function on the different data sets are shown in Fig. 14 to Fig. 19. It can be seen that Algorithm 1 converges quickly, almost within 5 iterations, and the objective value of Algorithm 2 also decreases substantially within the first 10 iterations.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the invention is not limited to the above-mentioned specific embodiments, which are merely illustrative rather than restrictive. Inspired by the present invention, those skilled in the art can also make many further forms without departing from the purpose of the invention and the scope protected by the claims, all of which fall within the protection of the present invention.
Claims (10)
1. An unsupervised feature selection method based on dictionary and sample similarity graph learning, characterized by comprising the following steps:
S1, a raw data matrix X = {x1, x2, ..., xn} = {f1; f2; ...; fd} ∈ R^(d×n) is given, wherein:
n is the number of samples and d is the number of feature vectors; for i ∈ n, xi ∈ R^(d×1) denotes the i-th sample; for j ∈ d, fj ∈ R^(d×1) is the j-th feature vector;
S2, a dictionary D ∈ R^(d×m) with m base vectors is learned, and the raw data matrix X given in step S1 is reconstructed using the dictionary D to obtain a new dictionary base space W, wherein the l2,1 norm is used to impose row sparsity on W, so that the corresponding base vectors are selected from the dictionary D to represent each sample;
S3, a constraint regularization term is defined using a graph-model-based Laplacian constraint model;
S4, combining the dictionary base space W obtained in step S2 with the constraint regularization term defined in step S3, feature selection is carried out in the dictionary base space W, specifically: the objective function of the DGL-UFS model is constructed and solved to obtain W, wherein an iterative optimization strategy is used, the multiple variables in the objective function constrain one another, and W is finally calculated.
2. The unsupervised feature selection method according to claim 1, characterized in that the raw data matrix X is reconstructed, and the resulting reconstruction coefficient matrix W ∈ R^(m×n) is the newly generated dictionary space, wherein the objective function for learning the base vectors and the new dictionary base space can be expressed as:
min_{D,W} ||X - DW||_F^2 + β||W||_{2,1}, s.t. ||di||_2 ≤ 1, i = 1, 2, ..., m;
wherein ||·||_F^2 denotes the square of the Frobenius norm of the matrix "·"; β is a positive balance parameter; the i-th column of the dictionary D is denoted di, and the constraint ||di||_2 ≤ 1 keeps the atoms in the learned dictionary space compact; ||W||_{2,1} denotes imposing row sparsity on the dictionary base space W with the l2,1 norm; m is a positive integer greater than 1.
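As an illustrative aside (not part of the claims), the l2,1 norm used above is simply the sum of the Euclidean norms of the rows of W; rows driven to zero by this penalty correspond to dictionary atoms that are never selected:

```python
import numpy as np

def l21_norm(W):
    """||W||_{2,1}: sum of the l2 norms of the rows of W.
    A zero row contributes nothing, which is how the penalty
    induces row sparsity over the dictionary base space."""
    return float(np.sqrt((W ** 2).sum(axis=1)).sum())
```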
3. The unsupervised feature selection method according to claim 1, characterized in that the constraint regularization term defined in step S3 is expressed as:
Σ_{i,j} Sij ||wi - wj||_2^2 = 2·tr(W L W^T);
wherein Sij is the similarity measure between the data points xi and xj; L = A - S is the graph Laplacian matrix, A is a diagonal matrix with Aii = Σ_j Sij, and S is the similarity matrix; wi (or wj) is the i-th (or j-th) column of W.
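The standard identity behind this graph-Laplacian regularization term (valid for a symmetric similarity matrix S) can be verified numerically; this is an illustrative check with random matrices, taking wi as the i-th column of W as in the claim:

```python
import numpy as np

def reg_pairwise(W, S):
    """sum_ij S_ij * ||w_i - w_j||_2^2 over all sample pairs."""
    n = S.shape[0]
    return float(sum(S[i, j] * np.sum((W[:, i] - W[:, j]) ** 2)
                     for i in range(n) for j in range(n)))

def reg_trace(W, S):
    """2 * tr(W L W^T) with L = A - S and A diagonal, A_ii = sum_j S_ij."""
    L = np.diag(S.sum(axis=1)) - S
    return float(2.0 * np.trace(W @ L @ W.T))
```

Both routines compute the same quantity: samples that are similar (large Sij) are penalized for having distant representations wi and wj.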
4. The unsupervised feature selection method according to claim 1, 2 or 3, characterized in that the objective function of the defined DGL-UFS model is:
min_{D,W,S} ||X - DW||_F^2 + α·tr(W L W^T) + β||W||_{2,1},
s.t. ||di||_2 ≤ 1, S^T·1 = 1, S ≥ 0, i = 1, 2, ..., m;
wherein ||·||_F^2 denotes the square of the Frobenius norm of the matrix "·"; α is a positive constant balancing the graph regularization term and β is a positive balance parameter; L = A - S is the graph Laplacian matrix, A is a diagonal matrix, and S is the similarity matrix; ||W||_{2,1} denotes imposing row sparsity on the dictionary base space W with the l2,1 norm; the i-th column of the dictionary D is denoted di, and the constraint ||di||_2 ≤ 1 keeps the atoms in the learned dictionary space compact; m is a positive integer greater than 1.
5. The unsupervised feature selection method according to claim 4, characterized in that, in step S4, the specific steps of solving the objective function by the alternating iterative optimization strategy are:
S51, the dictionary base D and the similarity matrix S are initialized;
S52, with the parameters D and S fixed, the objective function is converted into:
min_W ||X - DW||_F^2 + α·tr(W L W^T) + β||W||_{2,1};
the parameter W is then updated by solving the above objective function with an iteratively reweighted least squares algorithm;
S53, with the parameters W and S fixed, the objective function is converted into:
min_D ||X - DW||_F^2, s.t. ||di||_2 ≤ 1;
the parameter D is then updated using the ADMM (Alternating Direction Method of Multipliers) algorithm, which introduces an auxiliary variable matrix into the objective function;
S54, with the parameters W and D fixed, the objective function is converted into the subproblem in S; since the similarity vector of each sample is independent, a separate objective function is obtained for the i-th sample, and the similarity vector si of the current i-th sample is obtained by introducing a Lagrange multiplier; further, the same method is applied to every sample in turn to update its similarity vector, realizing the update of the parameter S;
S55, steps S52 to S54 are repeated, and when the objective function converges, the parameters S, D and W at convergence are output.
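The alternating scheme of steps S51 to S55 can be sketched as below. Every update here is a deliberately simplified stand-in (one reweighted ridge step instead of the full IRLS/Sylvester solve for W, a normalized least-squares fit instead of ADMM for D, inverse-distance weights instead of the Lagrangian/KKT closed form for S, and the graph term omitted from the W-step), so this shows only the loop structure, not the claimed method:

```python
import numpy as np

def dgl_ufs_sketch(X, m, beta=1.0, n_iter=5, seed=0):
    """Simplified alternating updates of W, D, S (see caveats above).
    X is d x n (features x samples); returns (W, D, S)."""
    rng = np.random.default_rng(seed)
    d, n = X.shape
    D = rng.standard_normal((d, m))
    D /= np.linalg.norm(D, axis=0)              # enforce ||d_i||_2 <= 1
    W = 0.1 * rng.standard_normal((m, n))
    S = np.full((n, n), 1.0 / n)                # rows sum to 1, S >= 0
    for _ in range(n_iter):
        # S52 (stand-in): one reweighted ridge step toward the
        # l2,1-regularised least-squares solution for W.
        row = np.sqrt((W ** 2).sum(axis=1)) + 1e-6
        Lam = np.diag(1.0 / (2.0 * row))
        W = np.linalg.solve(D.T @ D + beta * Lam + 1e-6 * np.eye(m), D.T @ X)
        # S53 (stand-in): least-squares fit of D, then renormalise the
        # atoms so that ||d_i||_2 <= 1 still holds.
        D = X @ W.T @ np.linalg.pinv(W @ W.T + 1e-6 * np.eye(m))
        D /= np.maximum(np.linalg.norm(D, axis=0), 1.0)
        # S54 (stand-in): similarity from distances between columns of W,
        # normalised so each row of S sums to 1.
        dist = ((W[:, :, None] - W[:, None, :]) ** 2).sum(axis=0)
        sim = 1.0 / (1.0 + dist)
        S = sim / sim.sum(axis=1, keepdims=True)
    return W, D, S
```

After the loop, features would be ranked by the row norms of W, the larger rows indicating the selected base vectors.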
6. An unsupervised feature selection device based on dictionary and sample similarity graph learning, characterized in that the device comprises the following modules:
a data setup module, for giving a raw data matrix X = {x1, x2, ..., xn} = {f1; f2; ...; fd} ∈ R^(d×n), wherein:
n is the number of samples and d is the number of feature vectors; for i ∈ n, xi ∈ R^(d×1) denotes the i-th sample; for j ∈ d, fj ∈ R^(d×1) is the j-th feature vector;
a space reconstruction module, for learning a dictionary D ∈ R^(d×m) with m base vectors and reconstructing the raw data matrix X given by the data setup module using the dictionary D, to obtain a new dictionary base space W, wherein the l2,1 norm is used to impose row sparsity on W, so that the corresponding base vectors are selected from the dictionary D to represent each sample;
a constraint regularization module, for defining a constraint regularization term using a graph-model-based Laplacian constraint model;
a feature selection module, for carrying out feature selection in the dictionary base space W by combining the dictionary base space W obtained by the space reconstruction module with the constraint regularization term defined by the constraint regularization module, specifically: the objective function of the DGL-UFS model is constructed and solved to obtain W, wherein an iterative optimization strategy is used, the multiple variables in the objective function constrain one another, and W is finally calculated.
7. The unsupervised feature selection device according to claim 6, characterized in that the raw data matrix X is reconstructed, and the resulting reconstruction coefficient matrix W ∈ R^(m×n) is the newly generated dictionary space, wherein the objective function for learning the base vectors and the new dictionary base space can be expressed as:
min_{D,W} ||X - DW||_F^2 + β||W||_{2,1}, s.t. ||di||_2 ≤ 1, i = 1, 2, ..., m;
wherein ||·||_F^2 denotes the square of the Frobenius norm of the matrix "·"; β is a positive balance parameter; the i-th column of the dictionary D is denoted di, and the constraint ||di||_2 ≤ 1 keeps the atoms in the learned dictionary space compact; ||W||_{2,1} denotes imposing row sparsity on the dictionary base space W with the l2,1 norm; m is a positive integer greater than 1.
8. The unsupervised feature selection device according to claim 6, characterized in that the constraint regularization term defined in the constraint regularization module is expressed as:
Σ_{i,j} Sij ||wi - wj||_2^2 = 2·tr(W L W^T);
wherein Sij is the similarity measure between the data points xi and xj; L = A - S is the graph Laplacian matrix, A is a diagonal matrix with Aii = Σ_j Sij, and S is the similarity matrix; wi (or wj) is the i-th (or j-th) column of W.
9. The unsupervised feature selection device according to claim 6, 7 or 8, characterized in that the objective function of the defined DGL-UFS model is:
min_{D,W,S} ||X - DW||_F^2 + α·tr(W L W^T) + β||W||_{2,1},
s.t. ||di||_2 ≤ 1, S^T·1 = 1, S ≥ 0, i = 1, 2, ..., m;
wherein ||·||_F^2 denotes the square of the Frobenius norm of the matrix "·"; α is a positive constant balancing the graph regularization term and β is a positive balance parameter; L = A - S is the graph Laplacian matrix, A is a diagonal matrix, and S is the similarity matrix; ||W||_{2,1} denotes imposing row sparsity on the dictionary base space W with the l2,1 norm; the i-th column of the dictionary D is denoted di, and the constraint ||di||_2 ≤ 1 keeps the atoms in the learned dictionary space compact; m is a positive integer greater than 1.
10. The unsupervised feature selection device according to claim 9, characterized in that, in the feature selection module, the specific steps of solving the objective function by the alternating iterative optimization strategy are:
S51, the dictionary base D and the similarity matrix S are initialized;
S52, with the parameters D and S fixed, the objective function is converted into:
min_W ||X - DW||_F^2 + α·tr(W L W^T) + β||W||_{2,1};
the parameter W is then updated by solving the above objective function with an iteratively reweighted least squares algorithm;
S53, with the parameters W and S fixed, the objective function is converted into:
min_D ||X - DW||_F^2, s.t. ||di||_2 ≤ 1;
the parameter D is then updated using the ADMM (Alternating Direction Method of Multipliers) algorithm, which introduces an auxiliary variable matrix into the objective function;
S54, with the parameters W and D fixed, the objective function is converted into the subproblem in S; since the similarity vector of each sample is independent, a separate objective function is obtained for the i-th sample, and the similarity vector si of the current i-th sample is obtained by introducing a Lagrange multiplier; further, the same method is applied to every sample in turn to update its similarity vector, realizing the update of the parameter S;
S55, steps S52 to S54 are repeated, and when the objective function converges, the parameters S, D and W at convergence are output.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910441208.XA CN110348287A (en) | 2019-05-24 | 2019-05-24 | A kind of unsupervised feature selection approach and device based on dictionary and sample similar diagram |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110348287A true CN110348287A (en) | 2019-10-18 |
Family
ID=68174629
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910441208.XA Pending CN110348287A (en) | 2019-05-24 | 2019-05-24 | A kind of unsupervised feature selection approach and device based on dictionary and sample similar diagram |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110348287A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111027636A (en) * | 2019-12-18 | 2020-04-17 | 山东师范大学 | Unsupervised feature selection method and system based on multi-label learning |
CN111813982A (en) * | 2020-07-23 | 2020-10-23 | 中原工学院 | Data processing method and device based on subspace clustering algorithm of spectral clustering |
CN113486980A (en) * | 2021-07-29 | 2021-10-08 | 中南大学 | Aluminum electrolysis cell condition identification method, system and equipment based on self-expression dictionary pair |
WO2022166362A1 (en) * | 2021-02-03 | 2022-08-11 | 浙江师范大学 | Unsupervised feature selection method based on latent space learning and manifold constraints |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110348287A (en) | A kind of unsupervised feature selection approach and device based on dictionary and sample similar diagram | |
Sidi et al. | Unsupervised co-segmentation of a set of shapes via descriptor-space spectral clustering | |
CN108427921A (en) | A kind of face identification method based on convolutional neural networks | |
CN108171122A (en) | The sorting technique of high-spectrum remote sensing based on full convolutional network | |
Tang et al. | Tensor multi-elastic kernel self-paced learning for time series clustering | |
CN108021930B (en) | Self-adaptive multi-view image classification method and system | |
JP2002513979A (en) | System, method, and computer program product for representing proximity data in multidimensional space | |
CN107203787A (en) | A kind of unsupervised regularization matrix characteristics of decomposition system of selection | |
CN112949740B (en) | Small sample image classification method based on multilevel measurement | |
Dai et al. | Metric imitation by manifold transfer for efficient vision applications | |
CN109325513A (en) | A kind of image classification network training method based on magnanimity list class single image | |
Bhute et al. | Content based image indexing and retrieval | |
CN111695455B (en) | Low-resolution face recognition method based on coupling discrimination manifold alignment | |
CN111444802B (en) | Face recognition method and device and intelligent terminal | |
CN110096991A (en) | A kind of sign Language Recognition Method based on convolutional neural networks | |
CN112232374A (en) | Irrelevant label filtering method based on depth feature clustering and semantic measurement | |
Xu et al. | A novel image feature extraction algorithm based on the fusion AutoEncoder and CNN | |
Zhao et al. | Hyperspectral target detection method based on nonlocal self-similarity and rank-1 tensor | |
Zhang et al. | SaltISCG: Interactive salt segmentation method based on CNN and graph cut | |
CN114898167A (en) | Multi-view subspace clustering method and system based on inter-view difference detection | |
Afzal et al. | Discriminative feature abstraction by deep L2 hypersphere embedding for 3D mesh CNNs | |
CN109993208A (en) | A kind of clustering processing method having noise image | |
CN116229179A (en) | Dual-relaxation image classification method based on width learning system | |
Ji et al. | Hyperspectral image classification based on unsupervised regularization | |
Wang et al. | Conscience online learning: an efficient approach for robust kernel-based clustering |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20191018 |