CN108536750A - Image-feature binary-coding representation method based on point-pair relation learning and reconstruction - Google Patents
- Publication number: CN108536750A
- Application number: CN201810203371.8A
- Authority: CN (China)
- Legal status: Granted
Classifications
- G: PHYSICS; G06: COMPUTING, CALCULATING OR COUNTING; G06F: ELECTRIC DIGITAL DATA PROCESSING; G06F18/00: Pattern recognition; G06F18/20: Analysing; G06F18/23: Clustering techniques
- G06F18/2323: Non-hierarchical techniques based on graph theory, e.g. minimum spanning trees [MST] or graph cuts
- G06F18/28: Determining representative reference patterns, e.g. by averaging or distorting; generating dictionaries
Abstract
The invention discloses an image-feature binary-coding representation method based on point-pair relation learning and reconstruction, comprising: Step 1, converting the data into a dictionary representation and solving a constrained problem to obtain the coefficient matrix of the representation; Step 2, constructing the weight matrix of a graph from the coefficient matrix obtained in Step 1 and partitioning the dictionary items into k groups on this graph by spectral clustering; Step 3, for a given new sample, computing its reconstruction residual in every dictionary-item group and choosing the group with the smallest residual as the optimal dictionary-item group for linear representation, thereby completing the learning of point-pair relations; Step 4, solving an optimal model to learn binary codes that preserve the point-pair relations, thereby realizing the reconstruction of point-pair relations.
Description
Technical field
The invention belongs to the field of image-feature coding, and in particular relates to an image-feature binary-coding representation method based on point-pair relation learning and reconstruction.
Background technology
With the rapid development of the Internet information age, the total amount of image data is also growing rapidly. In image retrieval, a user supplies a query image, and images similar to it must be retrieved from a large-scale database and returned ranked by similarity. The most basic approach for this scenario is: first, extract features from the query image and from every database image; then, compute the distance between the query image and each database image under some metric (e.g. Euclidean distance); finally, sort the database images by distance and return the closest ones as the retrieval result. In practice this method requires features that describe image content with high discriminative power, and such features are typically high-dimensional. High-dimensional features place a heavy demand on database storage, and distance computation over high-dimensional real-valued features is inefficient; as the database grows, the common distance computations become the performance bottleneck. Image features based on binary coding can solve both problems well and improve the efficiency of large-scale image retrieval.
In practical applications, data often carry no labels, and unsupervised binary-coding feature representation and learning algorithms are usually suitable for this case without semantic labels. In the unsupervised setting, the traditional feature-similarity representation based on Euclidean distance is too simple to give satisfactory results. As data scale and data complexity grow, how to learn compact binary-coding features with strong representation ability is an important problem in large-scale data retrieval.
Invention content
Objective of the invention: Aiming at the tight coupling between the hash function and the loss function of the optimization objective in existing unsupervised code-learning processes, the invention proposes a loosely coupled image-feature binary-coding representation method based on point-pair relation learning and reconstruction, which improves the performance and accuracy of image retrieval and effectively solves the problem of fast and accurate image search over data with hash-based binary codes.
The method constructed by the invention is intended to use the means of machine learning and machine vision to address the high feature dimensionality and low retrieval efficiency found in traditional image-retrieval technology, as well as the tight coupling between the hash function and the optimization loss in existing unsupervised code learning, thereby improving image-retrieval performance and accuracy.
Technical solution: The invention discloses an image-feature binary-coding representation method based on point-pair relation learning and reconstruction, an automated solution that effectively computes binary codes for images in the absence of labels, specifically comprising the following steps:
Step 1, solving the coefficient matrix: express the data as a linear combination of a dictionary and a coefficient matrix, and obtain the coefficient matrix by solving a constrained model on it;
Step 2, dictionary partition: construct the weight matrix of a graph from the coefficient matrix obtained in Step 1, and partition the dictionary items into k_2 groups on this graph by spectral clustering;
Step 3, feature representation: for a given new sample, compute its reconstruction residual in every dictionary-item group, then choose the group with the smallest residual as the optimal dictionary-item group for linear representation, completing the learning of point-pair relations;
Step 4, point-pair relation reconstruction: solve an optimal model to learn binary codes that preserve the point-pair relations, realizing the reconstruction of point-pair relations.
Step 1 includes:
Step 1-1, given a matrix D = [x_1, x_2, ..., x_n] ∈ R^(d×n) containing n image data, where each column of D is one image datum, x_n is the n-th image datum, each image datum has dimension d, and every element of the matrix belongs to the set of real numbers R, the matrix D can be self-expressed as D = DC, where C ∈ R^(n×n) is the coefficient matrix, each column of which gives the reconstruction coefficients of one original datum on the dictionary; the dictionary is the data matrix D itself. Applying to the coefficient matrix a low-rank constraint, which describes the global structure of the original data, and a sparsity constraint, which describes the local structure of the original data, yields the following grouping-dictionary model:
min_C ||C||_* + λ||C||_1  s.t. D = DC,
where ||·||_* and ||·||_1 denote the matrix nuclear norm and the l_1 norm respectively, used as convex surrogates of the low-rank constraint and of the sparsity constraint on the coefficient matrix, s.t. denotes the constraint, and λ is a balance parameter weighing the importance of sparsity against low rank;
Step 1-2, to solve the model of Step 1-1, introduce an auxiliary variable J, giving the following model:
min_{C,J} ||C||_* + λ||J||_1  s.t. D = DC, C = J.
The augmented Lagrangian function L(C, J, Y_1, Y_2, μ) of this model is:
L = ||C||_* + λ||J||_1 + <Y_1, D - DC> + <Y_2, C - J> + (μ/2)(||D - DC||_F^2 + ||C - J||_F^2),
where Y_1 and Y_2 are the Lagrange multipliers and μ is the penalty parameter introduced for solving the model. The model is solved by iteratively updating each of the variables C, J, Y_1, Y_2, μ; the update rules are as follows:
J_{k+1} = sign(C_{k+1} + Y_{2,k}/μ_k) · max(|C_{k+1} + Y_{2,k}/μ_k| - λ/μ_k, 0) (element-wise)
Y_{1,k+1} = Y_{1,k} + μ_k(D - DC_{k+1})
Y_{2,k+1} = Y_{2,k} + μ_k(C_{k+1} - J_{k+1})
μ_{k+1} = min(μ_max, ρμ_k)
where C_{k+1} is the value of C after the (k+1)-th iteration, obtained with the singular-value shrinkage operator Θ; J_k, μ_k, Y_{1,k} and Y_{2,k} are the values of J, μ, Y_1 and Y_2 after the k-th iteration; μ_max is the maximum value set for μ; ρ is the growth factor of μ; and Θ denotes the singular-value shrinkage operator, Θ_ε(A) = U S_ε(Σ) V^T for the singular value decomposition A = UΣV^T, where S_ε soft-thresholds the singular values by ε. The learned coefficient matrix C is obtained through this iteration.
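The iteration of Step 1-2 can be sketched in NumPy as follows. The C-update below uses a linearized proximal step (gradient of the smooth augmented-Lagrangian terms followed by singular-value shrinkage), which is a common form for this kind of model and is assumed here; the J, Y_1, Y_2 and μ updates follow the rules above. All function and parameter names are illustrative.

```python
import numpy as np

def svt(A, tau):
    """Singular-value shrinkage operator Theta_tau: soft-threshold the singular values."""
    U, sig, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * np.maximum(sig - tau, 0.0)) @ Vt

def solve_coefficients(D, lam=0.1, mu=1e-2, mu_max=1e6, rho=1.5, iters=200):
    """Inexact-ALM sketch for  min ||C||_* + lam*||J||_1  s.t. D = DC, C = J.
    The C-step is an assumed linearized proximal update; the other updates
    follow the patent's rules."""
    n = D.shape[1]
    C = np.zeros((n, n)); J = np.zeros((n, n))
    Y1 = np.zeros_like(D); Y2 = np.zeros((n, n))
    eta = np.linalg.norm(D, 2) ** 2 + 1.0       # Lipschitz-type step-size constant
    for _ in range(iters):
        # C-step: gradient of the smooth ALM terms, then singular-value shrinkage.
        grad = -D.T @ (Y1 + mu * (D - D @ C)) + Y2 + mu * (C - J)
        C = svt(C - grad / (eta * mu), 1.0 / (eta * mu))
        # J-step: element-wise soft-thresholding (shrinkage), as in the patent.
        G = C + Y2 / mu
        J = np.sign(G) * np.maximum(np.abs(G) - lam / mu, 0.0)
        # Multiplier and penalty updates, exactly the patent's rules.
        Y1 = Y1 + mu * (D - D @ C)
        Y2 = Y2 + mu * (C - J)
        mu = min(mu_max, rho * mu)
    return C
```

As the penalty μ grows, the shrinkage thresholds 1/(ημ) and λ/μ shrink, so the iteration increasingly enforces the constraints D = DC and C = J.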
Step 2 includes:
Step 2-1, form from the coefficient matrix C obtained in Step 1 the weight matrix W of a graph, computed as follows:
W = (C + C^T)/2;
Step 2-2, construct the n × n diagonal matrix D_w, defined as follows:
D_w = diag(d_1, ..., d_n), where d_i is defined as d_i = Σ_j w_ij,
and w_ij is the entry in the i-th row and j-th column of the matrix W;
Step 2-3, define the Laplacian matrix L = D_w - W and, from L and D_w, compute D_w^(-1/2) L D_w^(-1/2);
Step 2-4, compute the eigenvectors f corresponding to the k_1 smallest eigenvalues of D_w^(-1/2) L D_w^(-1/2) and normalize them (reference: Linear Algebra (6th edition), Department of Mathematics, Tongji University, Higher Education Press), finally forming the n × k_1 feature matrix F;
Step 2-5, treat each row of the feature matrix F as a k_1-dimensional sample, n samples in total, and cluster them with a clustering method (reference: Machine Learning, Zhou Zhihua, Tsinghua University Press) into k_2 clusters; the cluster to which the i-th row belongs is the cluster of the original x_i (i.e. the i-th datum of the matrix D in Step 1-1). This finally yields the cluster partition c, by which the dictionary items are divided into k_2 groups.
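Steps 2-1 through 2-5 can be sketched as follows. Taking the absolute value of W (so edge weights are non-negative) and the farthest-point seeding of the cluster centers are assumptions of this sketch, not details given by the patent:

```python
import numpy as np

def partition_dictionary(C, k1, k2):
    """Steps 2-1 to 2-5: spectral partition of the dictionary items from C."""
    W = (C + C.T) / 2.0                    # step 2-1: symmetric weight matrix
    W = np.abs(W)                          # assumption: use magnitudes as edge weights
    d = W.sum(axis=1)                      # step 2-2: degrees d_i = sum_j w_ij
    inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.diag(d) - W                     # step 2-3: Laplacian L = D_w - W
    Ln = inv_sqrt[:, None] * L * inv_sqrt[None, :]   # D_w^{-1/2} L D_w^{-1/2}
    _, vecs = np.linalg.eigh(Ln)           # eigh returns ascending eigenvalues
    F = vecs[:, :k1]                       # step 2-4: k1 smallest eigenvectors
    F = F / np.maximum(np.linalg.norm(F, axis=1, keepdims=True), 1e-12)
    # Step 2-5: cluster the rows of F into k2 groups (tiny Lloyd's k-means;
    # farthest-point seeding keeps the sketch deterministic).
    centers = [F[0]]
    for _ in range(1, k2):
        dist = ((F[:, None, :] - np.array(centers)[None]) ** 2).sum(-1).min(axis=1)
        centers.append(F[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(50):
        labels = ((F[:, None, :] - centers[None]) ** 2).sum(-1).argmin(axis=1)
        for j in range(k2):
            if np.any(labels == j):
                centers[j] = F[labels == j].mean(axis=0)
    return labels
```

On a coefficient matrix with a clear block structure, the partition recovers the blocks as dictionary-item groups.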
Step 3 includes:
Step 3-1, given a new sample x', compute the reconstruction coefficients z_i of x' with respect to the entire dictionary D:
z_i = (D^T D + αI)^(-1) D^T x',
where α is a balance parameter and I is the identity matrix;
Step 3-2, compute the normalized residual of x' for each dictionary-item group separately; the normalized residual r_k(x') of the k-th dictionary-item group for x' is computed by the following formula:
r_k(x') = ||x' - D_k φ_k(z_i)||_2 / ||φ_k(z_i)||_2,
where φ_k(z_i) is the part of the coefficients z_i whose corresponding dictionary items belong to the k-th dictionary group, and D_k denotes the dictionary formed by the data of the k-th dictionary-item group;
Step 3-3, after the reconstruction coefficients z_i of x' and its normalized residuals in all dictionary-item groups have been computed, select the dictionary-item group with the smallest computed reconstruction residual as the optimal dictionary-item group, and linearly represent x' over the optimal group, reconstructing x' by sparse coding (realized by the LAE (Local Anchor Embedding) algorithm; reference: "Large Graph Construction for Scalable Semi-Supervised Learning", Wei Liu, Junfeng He, Shih-Fu Chang), thereby completing the learning of point-pair relations.
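A minimal sketch of the group selection in Steps 3-1 to 3-3 (the final LAE reconstruction over the chosen group is omitted). The exact form of the normalized residual is an assumption in the SRC style, since the patent's formula is given only as an equation image:

```python
import numpy as np

def best_group(D, labels, x, alpha=1e-2):
    """Steps 3-1/3-2/3-3: ridge coefficients over the whole dictionary, then the
    dictionary-item group with the smallest normalized reconstruction residual."""
    n = D.shape[1]
    z = np.linalg.solve(D.T @ D + alpha * np.eye(n), D.T @ x)   # step 3-1
    best, best_r = None, np.inf
    for k in np.unique(labels):
        mask = labels == k
        zk = z[mask]                                  # coefficients phi_k(z) of group k
        # Normalized residual (assumed SRC-style form): error of reconstructing x
        # from group k alone, scaled by that group's coefficient energy.
        r = np.linalg.norm(x - D[:, mask] @ zk) / max(np.linalg.norm(zk), 1e-12)
        if r < best_r:
            best, best_r = k, r
    return best
```

A sample lying in the span of one group's dictionary items is assigned to that group, since the other groups receive near-zero coefficients and thus large normalized residuals.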
Step 4 includes:
solving the following model to learn optimal binary codes that preserve the point-pair relations:
min_{B, W_rec, μ_rec, s} ||Z_rec - s ⊙ (W_rec^T B) - μ_rec 1^T||_F^2
s.t. W_rec W_rec^T = I,
where W_rec is a linear projection matrix; Z_rec is the data matrix (i.e. the reconstructed data obtained in Step 3-3 by representing the data with sparse coding); B ∈ {-1, 1}^(c×n) is a binary matrix, each column of which is the binary code of the corresponding datum in Z_rec; μ_rec is an offset parameter; s is a scaling parameter; ||·||_F denotes the F-norm of a matrix; and ⊙ denotes element-wise multiplication. The model is convex, and each unknown parameter μ_rec, B, W_rec, s is optimized by iterative updates until the objective value converges.
The iterative updating of each unknown parameter until the objective value converges specifically includes:
Step 4-1, randomly initialize B and W_rec:
W_r is a matrix sampled from the standard normal distribution and used to initialize W_rec; W_r = UΣV^T denotes the singular value decomposition of W_r: assuming W_r is an m × n matrix, U is an m × m matrix, Σ is an m × n diagonal matrix, and V is an n × n matrix (reference: Linear Algebra (6th edition), Department of Mathematics, Tongji University, Higher Education Press); c is the length of the binary code, i.e. its number of bits, e.g. 16 or 32;
initialize μ_rec = 0 and s = 1;
Step 4-2, start a new round of iteration and increase the iteration count by one; if the iteration count is less than or equal to T, perform Step 4-3, otherwise perform Step 4-4;
Step 4-3, update B:
let G = W_rec(Z_rec - μ_rec 1^T),
compute B = sign(G);
update W_rec:
let M = (Z_rec - μ_rec 1^T)B^T,
compute SVD(M) = UΣV^T and set W_rec = VU^T;
SVD(·) here denotes taking the singular value decomposition of the object in parentheses; singular value decomposition (Singular Value Decomposition) is an important matrix decomposition in linear algebra and belongs to the prior art (reference: The Beauty of Mathematics, Wu Jun, Posts and Telecom Press);
sign(·) denotes applying the sign function element-wise to the object in parentheses; the meaning of the sign function is: sign(x) = 1 for x > 0, sign(x) = 0 for x = 0, and sign(x) = -1 for x < 0;
update μ_rec:
μ_rec = column_mean(Z_rec - s ⊙ W_rec^T B);
update s:
s = trace(W_rec(Z_rec - μ_rec 1^T)B^T) / (c·n), where c·n = trace(B^T B) because the entries of B are ±1;
column_mean(·) averages the matrix in parentheses along the direction of its rows to obtain a column vector, and trace(·) computes the trace of the matrix in parentheses: in linear algebra, the sum of the elements on the main diagonal (from upper left to lower right) of an n × n matrix A is called the trace of A, generally denoted tr(A) (reference: Linear Algebra (6th edition), Department of Mathematics, Tongji University, Higher Education Press);
end the current round of iteration and return to Step 4-2;
Step 4-4, the optimal binary-coding matrix B finally learned through the iterative computation completes the reconstruction of the point-pair relations.
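Under the model of Step 4, each subproblem has a standard closed-form solution: a sign step for B, an orthogonal-Procrustes step for W_rec, a row-mean for μ_rec and a trace ratio for s. The sketch below uses these assumed closed forms, since the patent's own update formulas appear only as equation images:

```python
import numpy as np

def learn_codes(Z, c, T=40, seed=0):
    """Step 4 sketch: alternate updates of B, W, mu, s for
    min ||Z - s*(W^T B) - mu 1^T||_F^2  s.t. W W^T = I."""
    d, n = Z.shape
    rng = np.random.default_rng(seed)
    U, _, Vt = np.linalg.svd(rng.standard_normal((c, d)), full_matrices=False)
    W = U @ Vt                                   # row-orthonormal init: W W^T = I
    mu = np.zeros((d, 1))
    s = 1.0
    B = np.sign(W @ Z)
    B[B == 0] = 1
    for _ in range(T):
        Zt = Z - mu                              # centred data
        B = np.sign(W @ Zt)                      # B-step: sign of the projection
        B[B == 0] = 1
        U, _, Vt = np.linalg.svd(Zt @ B.T, full_matrices=False)
        W = (U @ Vt).T                           # W-step: orthogonal Procrustes
        mu = (Z - s * (W.T @ B)).mean(axis=1, keepdims=True)  # column_mean
        Zt = Z - mu
        s = np.trace(B @ Zt.T @ W.T) / (c * n)   # s-step: tr(W Zt B^T)/tr(B^T B)
    err = np.linalg.norm(Z - s * (W.T @ B) - mu)
    return B, W, mu, s, err
```

Each step solves its own subproblem exactly given the other variables, which is why the per-iteration error reported in the embodiment decreases monotonically.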
Aiming at the high feature dimensionality and low retrieval efficiency found in traditional image-retrieval technology, the invention converts image data without semantic labels into low-dimensional binary-coding features through a model based on machine learning and machine vision, thereby improving retrieval speed and reducing storage and coding time in applications such as image retrieval. The method mainly comprises four steps: solving the coefficient matrix, dictionary partition, feature representation, and point-pair relation reconstruction. The coefficient-matrix step converts the data into a dictionary representation and solves a constrained problem for the coefficient matrix of the representation; the dictionary-partition step constructs the weight matrix of a graph from the obtained coefficient matrix and realizes the dictionary partition by spectral clustering; the feature-representation step chooses the optimal dictionary-item group with the smallest reconstruction residual to linearly represent new data; the point-pair relation-reconstruction step solves an optimal model to learn binary codes that preserve the point-pair relations, thereby realizing their reconstruction. The obtained binary codes can be used in applications such as image retrieval. Based on machine learning and machine vision, the invention designs an image-feature binary-coding representation method based on point-pair relation learning and reconstruction that has low optimization complexity, reduces coding time, and can be used to improve the performance and accuracy of image retrieval.
With the above technical solution, the invention has the following advantages: compared with the high storage cost and low search efficiency caused by the high feature dimensionality of general image retrieval, the method reduces memory requirements and improves computational efficiency; and, against the tight coupling between the hash function and the loss function of the optimization objective in existing unsupervised code-learning processes, it provides a loosely coupled image binary-coding feature-learning framework whose optimization model is convex and has low optimization complexity, finally achieving the goal of improving image-retrieval performance and accuracy.
Description of the drawings
The invention is further illustrated below with reference to the accompanying drawings and specific embodiments, from which the above and other advantages of the invention will become clearer.
Fig. 1 is the workflow of the image-feature binary-coding representation method based on point-pair relation learning and reconstruction of the embodiment of the invention;
Fig. 2 is the flow chart of the dictionary-partition step of the embodiment of the invention;
Fig. 3 is the flow chart of the point-pair relation-reconstruction step of the embodiment of the invention;
Fig. 4 is an example of retrieving the best matches on the CIFAR-10 database with eight-bit binary codes: the leftmost picture is the query image, the pictures to its right are the retrieval results, and red frames mark erroneous results;
Fig. 5 shows the ten imported data.
Specific implementation mode
The invention is further described below with reference to the accompanying drawings and an embodiment.
As shown in Fig. 1, Fig. 2 and Fig. 3, the workflow of the image-feature binary-coding representation method based on point-pair relation learning and reconstruction constructed by the invention is roughly divided into the following stages: the first stage is the point-pair relation-learning stage, comprising solving the coefficient matrix, dictionary partition and feature representation; the second stage is the point-pair relation-reconstruction stage, mainly the iterative computation of the optimal binary codes. The specific construction steps of the method in the embodiment of the invention are as follows:
Step 1, solving the coefficient matrix: express the data as a linear combination of a dictionary and a coefficient matrix, and obtain the coefficient matrix by solving a constrained model on it;
Step 2, dictionary partition: construct the weight matrix of a graph from the coefficient matrix obtained in Step 1, and partition the dictionary items into k_2 groups on this graph by spectral clustering;
Step 3, feature representation: for a given new sample, compute its reconstruction residual in every dictionary-item group, then choose the group with the smallest residual as the optimal dictionary-item group for linear representation, completing the learning of point-pair relations;
Step 4, point-pair relation reconstruction: solve an optimal model to learn binary codes that preserve the point-pair relations, realizing the reconstruction of point-pair relations.
Step 1 includes:
Step 1-1, given a matrix D = [x_1, x_2, ..., x_n] ∈ R^(d×n) containing n image data, where each column of D is one image datum, x_n is the n-th image datum, each image datum has dimension d, and every element of the matrix belongs to the set of real numbers R, the matrix D can be self-expressed as D = DC, where C ∈ R^(n×n) is the coefficient matrix, each column of which gives the reconstruction coefficients of one original datum on the dictionary; the dictionary is the data matrix D itself. Applying to the coefficient matrix a low-rank constraint, which describes the global structure of the original data, and a sparsity constraint, which describes the local structure of the original data, yields the following grouping-dictionary model:
min_C ||C||_* + λ||C||_1  s.t. D = DC,
where ||·||_* and ||·||_1 denote the matrix nuclear norm and the l_1 norm respectively, used as convex surrogates of the low-rank constraint and of the sparsity constraint on the coefficient matrix, s.t. denotes the constraint, and λ is a balance parameter weighing the importance of sparsity against low rank;
Step 1-2, to solve the model of Step 1-1, introduce an auxiliary variable J, giving the following model:
min_{C,J} ||C||_* + λ||J||_1  s.t. D = DC, C = J.
The augmented Lagrangian function L(C, J, Y_1, Y_2, μ) of this model is:
L = ||C||_* + λ||J||_1 + <Y_1, D - DC> + <Y_2, C - J> + (μ/2)(||D - DC||_F^2 + ||C - J||_F^2),
where Y_1 and Y_2 are the Lagrange multipliers and μ is the penalty parameter introduced for solving the model. The model is solved by iteratively updating each of the variables C, J, Y_1, Y_2, μ; the update rules are as follows:
J_{k+1} = sign(C_{k+1} + Y_{2,k}/μ_k) · max(|C_{k+1} + Y_{2,k}/μ_k| - λ/μ_k, 0) (element-wise)
Y_{1,k+1} = Y_{1,k} + μ_k(D - DC_{k+1})
Y_{2,k+1} = Y_{2,k} + μ_k(C_{k+1} - J_{k+1})
μ_{k+1} = min(μ_max, ρμ_k)
where C_{k+1} is the value of C after the (k+1)-th iteration, obtained with the singular-value shrinkage operator Θ; J_k, μ_k, Y_{1,k} and Y_{2,k} are the values of J, μ, Y_1 and Y_2 after the k-th iteration; μ_max is the maximum value set for μ; ρ is the growth factor of μ; and Θ denotes the singular-value shrinkage operator, Θ_ε(A) = U S_ε(Σ) V^T for the singular value decomposition A = UΣV^T, where S_ε soft-thresholds the singular values by ε. The learned coefficient matrix C is obtained through this iteration.
Step 2 includes:
Step 2-1, form from the coefficient matrix C obtained in Step 1 the weight matrix W of a graph, computed as follows:
W = (C + C^T)/2;
Step 2-2, construct the n × n diagonal matrix D_w, defined as follows:
D_w = diag(d_1, ..., d_n), where d_i is defined as d_i = Σ_j w_ij,
and w_ij is the entry in the i-th row and j-th column of the matrix W;
Step 2-3, define the Laplacian matrix L = D_w - W and, from L and D_w, compute D_w^(-1/2) L D_w^(-1/2);
Step 2-4, compute the eigenvectors f corresponding to the k_1 smallest eigenvalues of D_w^(-1/2) L D_w^(-1/2) and normalize them (reference: Linear Algebra (6th edition), Department of Mathematics, Tongji University, Higher Education Press), finally forming the n × k_1 feature matrix F;
Step 2-5, treat each row of the feature matrix F as a k_1-dimensional sample, n samples in total, and cluster them with a clustering method (reference: Machine Learning, Zhou Zhihua, Tsinghua University Press) into k_2 clusters; the cluster to which the i-th row belongs is the cluster of the original x_i (i.e. the i-th datum of the matrix D in Step 1-1). This finally yields the cluster partition c, by which the dictionary items are divided into k_2 groups.
Step 3 includes:
Step 3-1, given a new sample x', compute the reconstruction coefficients z_i of x' with respect to the entire dictionary D:
z_i = (D^T D + αI)^(-1) D^T x',
where α is a balance parameter and I is the identity matrix;
Step 3-2, compute the normalized residual of x' for each dictionary-item group separately; the normalized residual r_k(x') of the k-th dictionary-item group for x' is computed by the following formula:
r_k(x') = ||x' - D_k φ_k(z_i)||_2 / ||φ_k(z_i)||_2,
where φ_k(z_i) is the part of the coefficients z_i whose corresponding dictionary items belong to the k-th dictionary group, and D_k denotes the dictionary formed by the data of the k-th dictionary-item group;
Step 3-3, after the reconstruction coefficients z_i of x' and its normalized residuals in all dictionary-item groups have been computed, select the dictionary-item group with the smallest computed reconstruction residual as the optimal dictionary-item group, and linearly represent x' over the optimal group, reconstructing x' by sparse coding (realized by the LAE (Local Anchor Embedding) algorithm; reference: "Large Graph Construction for Scalable Semi-Supervised Learning", Wei Liu, Junfeng He, Shih-Fu Chang), thereby completing the learning of point-pair relations.
Step 4 includes:
solving the following model to learn optimal binary codes that preserve the point-pair relations:
min_{B, W_rec, μ_rec, s} ||Z_rec - s ⊙ (W_rec^T B) - μ_rec 1^T||_F^2
s.t. W_rec W_rec^T = I,
where W_rec is a linear projection matrix; Z_rec is the data matrix (i.e. the reconstructed data obtained in Step 3-3 by representing the data with sparse coding); B ∈ {-1, 1}^(c×n) is a binary matrix, each column of which is the binary code of the corresponding datum in Z_rec; μ_rec is an offset parameter; s is a scaling parameter; ||·||_F denotes the F-norm of a matrix; and ⊙ denotes element-wise multiplication. The model is convex, and each unknown parameter μ_rec, B, W_rec, s is optimized by iterative updates until the objective value converges.
The iterative updating of each unknown parameter until the objective value converges specifically includes:
Step 4-1, randomly initialize B and W_rec: W_rec is initialized from a matrix W_r sampled from the standard normal distribution via its singular value decomposition W_r = UΣV^T;
initialize μ_rec = 0 and s = 1;
Step 4-2, start a new round of iteration and increase the iteration count by one; if the iteration count is less than or equal to T, perform Step 4-3, otherwise perform Step 4-4;
Step 4-3, update B:
let G = W_rec(Z_rec - μ_rec 1^T),
compute B = sign(G);
update W_rec:
let M = (Z_rec - μ_rec 1^T)B^T,
compute SVD(M) = UΣV^T and set W_rec = VU^T;
update μ_rec:
μ_rec = column_mean(Z_rec - s ⊙ W_rec^T B);
update s:
s = trace(W_rec(Z_rec - μ_rec 1^T)B^T) / (c·n);
end the current round of iteration and return to Step 4-2;
Step 4-4, the optimal binary-coding matrix B finally learned through the iterative computation completes the reconstruction of the point-pair relations.
Embodiment
The embodiment comprises the following parts:
Ten data are imported, each a 784-dimensional image datum. Fig. 5 shows the pictures of the ten data.
After the point-pair relation-learning steps (solving the coefficient matrix, dictionary partition and feature representation), the reconstructed representation of the originally imported data is obtained, i.e. the learning of point-pair relations is completed.
The reconstructed representation of these ten data, i.e. the input data of the following point-pair relation-reconstruction step, is as follows:
0 0 0 0 0 1 0 0
0 0 0 0 0 0 1 0
0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 1
0.0397 0 0 0 0.9603 0 0 0
0.8445 0 0 0 0 0.1555 0 0
0 0 1 0 0 0 0 0
0 0 0 1 0 0 0 0
0.8739 0 0 0 0 0 0.1261 0
0 0.0534 0 0 0.9466 0 0 0
The next step is point-pair relation reconstruction, which iteratively updates the variables μ, B, W and s.
For these ten data, the error after each iterative update can be computed as ||Z_rec - s ⊙ W_rec^T B - μ_rec 1^T||_F.
The errors of the successive iterations are:
9.9504, 6.6467, 6.2349, 6.1016, 1.8460, 1.7460, 1.4298, 1.1086, 1.0954, 1.0954, 1.0763, 1.0600, 1.0597, 1.0498, 1.0427, 1.0427, 1.0427, 1.0378, 1.0343, 1.0343;
It can be seen that the iteration error keeps decreasing, showing that the iterative updates are effective.
Finally the parameter values after iteration are obtained, including μ, B, W and s, where B is the desired binary code:
μ: 0 0 0 0 0 0 0 0
s: 1
W:
0.526380491743404, -0.703854357886537, -0.124134278972108, 0.202417906030414, -0.325885663271724, 0.149605265605609, -0.0364373516095352, -0.203025641935073,
-0.320148832680204, -0.296889358189778, 0.187789441006430, 0.0556224552702578, 0.0543180195432500, -0.700932457499529, -0.337914469723198, -0.403186999011160,
-0.614824643836743, -0.150284503746103, -0.384163762219469, 0.589286622028627, -0.218319098164965, 0.134054680569501, 0.0609955279354016, 0.187644253112132,
-0.299070612273578, -0.225930941875870, -0.488501528062168, -0.609111515373163, 0.137953228767313, 0.325901566927202, -0.263695188929306, -0.234698359561226,
0.000636095155782121, 0.0187859733491997, 0.295608689874399, 0.0392233814298232, -0.178144690543473, 0.227581141177460, -0.818998382462197, 0.395520738197212,
-0.246529543586242, -0.169769446463417, 0.251859151075042, -0.469358402701577, -0.673284555188396, -0.130367042685072, 0.299342831948313, 0.258374510947432,
0.255611415159194, -0.0684087178891663, -0.533049300100546, -0.119796699797562, 0.0948826691189857, -0.526146514585480, -0.0854341580367617, 0.581685804423631,
-0.163894212872911, -0.555842782524204, 0.357982096845213, -0.0383476153902449, 0.574546274335455, 0.128493460417319, 0.208727206163476, 0.379966747912398;
The data of the binary code B are shown in Table 1:
Table 1
One | Two | Three | Four | Five | Six | Seven | Eight | Nine | Ten |
0 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 0 |
0 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 |
0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 0 |
1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
0 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 |
0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 |
1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 |
The first through tenth columns correspond in turn to the ten original data items in Table 1. It can be seen that the second and third original data items both depict the digit 1, and the binary codes generated for them are likewise similar, whereas the binary codes they generate differ markedly from those of the other data items.
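The similarity claim above can be checked directly from Table 1 by treating each column as an 8-bit code and comparing Hamming distances. A minimal sketch (the matrix below is simply Table 1 re-typed; the helper name `hamming` is ours):

```python
import numpy as np

# The 8 x 10 binary code matrix B from Table 1 (columns One..Ten, one row per bit)
B = np.array([
    [0, 1, 1, 0, 1, 1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 0, 0, 1, 0],
    [1, 1, 1, 0, 0, 0, 0, 1, 0, 1],
    [0, 1, 1, 1, 0, 0, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 1, 0, 1, 0, 0],
])

def hamming(i, j):
    # number of differing bits between the codes of data items i and j (0-indexed)
    return int(np.sum(B[:, i] != B[:, j]))

# items two and three (both images of the digit 1) versus everything else
d_23 = hamming(1, 2)
d_others = [hamming(1, j) for j in range(10) if j not in (1, 2)]
```

Running this, `d_23` is smaller than every entry of `d_others`, consistent with the observation that similar inputs receive similar codes.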
Fig. 4 shows an example in which the present invention retrieves the best matches on the CIFAR-10 database using eight-bit binary codes. The leftmost picture is the query image and the pictures to its right are the retrieval results, where "Ours" denotes the retrieval results of the present invention and bold frames mark incorrect retrievals. As can be seen from Fig. 4, the method of the present invention achieves higher accuracy than the other methods and retrieves more correct results.
The present invention provides an image feature binary coding representation method based on point-to-point relation learning and reconstruction. There are many specific ways to implement this technical solution, and the above is only a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art can make several improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention. Any component not specified in this embodiment can be implemented with the prior art.
Claims (6)
1. An image feature binary coding representation method based on point-to-point relation learning and reconstruction, characterized in that it comprises the following steps:
Step 1, solving a coefficient matrix: expressing the image data as a linear combination of a dictionary and a coefficient matrix, and obtaining the coefficient matrix by solving a constrained model on the coefficient matrix;
Step 2, partitioning the dictionary: constructing the weight matrix of a graph from the coefficient matrix obtained in step 1, and dividing the dictionary items into k2 groups by spectral clustering on this graph;
Step 3, feature representation: given a new sample, computing its reconstruction residual over all dictionary item groups, then selecting the optimal dictionary item group corresponding to the minimum reconstruction residual for linear representation, thereby completing the learning of point-to-point relations;
Step 4, reconstructing the point-to-point relations: solving an optimization model to learn the optimal binary codes that preserve the point-to-point relations, thereby realizing the reconstruction of the point-to-point relations.
2. The method according to claim 1, wherein step 1 comprises:
Step 1-1: a matrix D = [x1, x2, ..., xn] ∈ R^(d×n) containing n image data is given, where each column of D represents one image datum, xn denotes the n-th image datum, and the dimension of each image datum is d; the matrix D can be expressed by itself as D = DC, the value of each element of the matrix belonging to the set of real numbers R, where C ∈ R^(n×n) is the coefficient matrix, each column of which gives the reconstruction coefficients of the corresponding original datum on the dictionary, the dictionary being the data matrix D itself; applying to the coefficient matrix a low-rank constraint, which describes the global structure of the original data, and a sparsity constraint, which describes the local structure of the original data, yields the following model for grouping the dictionary:
min_C ||C||_* + λ||C||_1  s.t. D = DC, C ≥ 0
where ||·||_* and ||·||_1 denote the matrix nuclear norm and the l1 norm respectively, used as approximations of the low-rank constraint and the sparsity constraint on the coefficient matrix, s.t. denotes the constraints, and λ is a balance parameter weighing the importance of sparsity against low-rankness;
Step 1-2: to solve the model in step 1-1, an auxiliary variable J is introduced, giving the model:
min_{C,J} ||C||_* + λ||J||_1  s.t. D = DC, C = J, J ≥ 0
the augmented Lagrangian function L(C, J, Y1, Y2, μ) of this model is:
L(C, J, Y1, Y2, μ) = ||C||_* + λ||J||_1 + ⟨Y1, D − DC⟩ + ⟨Y2, C − J⟩ + (μ/2)(||D − DC||_F² + ||C − J||_F²)
where Y1 and Y2 are Lagrange multipliers and μ is a penalty parameter introduced for solving the model; the model is solved by iteratively updating each of the variables C, J, Y1, Y2 and μ, with the update rules:
J_{k+1} = max(|C_{k+1} + Y_{2,k}/μ_k| − λ/μ_k, 0)
Y_{1,k+1} = Y_{1,k} + μ_k(D − DC_{k+1})
Y_{2,k+1} = Y_{2,k} + μ_k(C_{k+1} − J_{k+1})
μ_{k+1} = min(μ_max, ρ·μ_k)
where C_{k+1} denotes the value of the variable C after the (k+1)-th iteration, J_k the value of J after the k-th iteration, μ_k the value of μ after the k-th iteration, Y_{1,k} and Y_{2,k} the values of Y1 and Y2 after the k-th iteration, μ_max the maximum value allowed for μ, ρ the growth factor of μ, and Θ the singular value shrinkage operator used in the C-update; the learned coefficient matrix C is obtained through the iterations.
3. The method according to claim 2, wherein step 2 comprises:
Step 2-1: the coefficient matrix C obtained in step 1 is used to construct the weight matrix W of a graph, computed as follows:
W = (C + C^T)/2;
Step 2-2: an n×n diagonal matrix D_w is constructed, defined as D_w = diag(d1, ..., dn), where di is defined as di = Σ_j wij, and wij denotes the entry in the i-th row and j-th column of the matrix W;
Step 2-3: the Laplacian matrix L = D_w − W is defined, and D_w^(−1/2)·L·D_w^(−1/2) is computed from L and D_w;
Step 2-4: the eigenvectors f corresponding to the k1 smallest eigenvalues of D_w^(−1/2)·L·D_w^(−1/2) are computed and normalized, finally forming the n×k1 eigenvector matrix F;
Step 2-5: each row of F is taken as a k1-dimensional sample, giving n samples in total, which are clustered with a clustering method into k2 clusters; the cluster to which the i-th row belongs is the cluster of the original xi; this finally yields the cluster partition c, thereby dividing the dictionary items into k2 groups.
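Steps 2-1 through 2-5 amount to normalized spectral clustering. A minimal self-contained sketch, with a plain Lloyd k-means and a deterministic farthest-point initialization standing in for "a clustering method" (both are illustrative choices, not the patent's):

```python
import numpy as np

def spectral_partition(C, k1, k2, iters=50):
    # Step 2-1: symmetric weight matrix of the graph
    W = (C + C.T) / 2
    # Step 2-2: degree matrix D_w = diag(d_i), d_i = sum_j w_ij
    d = W.sum(axis=1)
    # Step 2-3: normalized Laplacian D_w^{-1/2} (D_w - W) D_w^{-1/2}
    d_is = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L_sym = d_is[:, None] * (np.diag(d) - W) * d_is[None, :]
    # Step 2-4: eigenvectors of the k1 smallest eigenvalues, row-normalized into F
    _, vecs = np.linalg.eigh(L_sym)
    F = vecs[:, :k1]
    F = F / np.maximum(np.linalg.norm(F, axis=1, keepdims=True), 1e-12)
    # Step 2-5: cluster the n rows of F into k2 groups (Lloyd k-means,
    # farthest-point initialization for determinism)
    centers = [F[0]]
    for _ in range(k2 - 1):
        dists = np.min(((F[:, None] - np.array(centers)[None]) ** 2).sum(-1), axis=1)
        centers.append(F[int(np.argmax(dists))])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(((F[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k2):
            if np.any(labels == j):
                centers[j] = F[labels == j].mean(axis=0)
    return labels
```

On a coefficient matrix with two clean blocks, the partition recovers the blocks, which is the behavior step 2 relies on for grouping the dictionary items.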
4. The method according to claim 3, wherein step 3 comprises:
Step 3-1: given a new sample x′, compute the reconstruction coefficients zi of x′ with respect to the entire dictionary D:
zi = (D^T·D + αI)^(−1)·D^T·x′,
where α is a balance parameter and I is the identity matrix;
Step 3-2: compute, for each dictionary item group, the normalized residual with respect to x′: the normalized residual rk(x′) of the k-th dictionary item group for x′ is computed from φ_{k_d}(zi), the part of the coefficients zi whose corresponding dictionary items belong to the k-th dictionary group, and from Dk, the dictionary formed by the data of the k-th dictionary item group;
Step 3-3: after the reconstruction coefficients zi of x′ and its normalized residuals over all dictionary item groups have been computed, select the dictionary item group with the minimum computed reconstruction residual as the optimal dictionary item group, and linearly represent x′ over the optimal dictionary item group, reconstructing x′ through sparse coding, thereby completing the learning of point-to-point relations.
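Steps 3-1 to 3-3 can be sketched as follows. The exact residual formula is not reproduced in the extracted patent text, so the sketch substitutes the common normalized reconstruction residual ||x′ − Dk·zk|| / ||zk|| as an assumption; `best_group` and `groups` are illustrative names:

```python
import numpy as np

def best_group(x_new, D, groups, alpha=0.1):
    """Select the dictionary-item group with the smallest normalized residual.

    groups: list of column-index arrays, one per dictionary-item group
            (the partition produced in step 2).
    """
    n = D.shape[1]
    # Step 3-1: ridge-regression coefficients of x' over the whole dictionary
    z = np.linalg.solve(D.T @ D + alpha * np.eye(n), D.T @ x_new)
    residuals = []
    for idx in groups:
        z_k = z[idx]                                  # coefficients belonging to group k
        recon = D[:, idx] @ z_k                       # reconstruction by group k only
        r = np.linalg.norm(x_new - recon) / max(np.linalg.norm(z_k), 1e-12)
        residuals.append(r)
    # Step 3-3: the group with the minimum residual is the optimal group
    return int(np.argmin(residuals)), residuals
```

For a sample lying in the span of one group's dictionary items, that group wins, which is what the selection step requires.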
5. The method according to claim 4, wherein step 4 comprises:
solving the following model to learn the optimal binary codes and thereby preserve the point-to-point relations:
min_{B, W_rec, μ_rec, s} ||Z_rec − s ⊙ (W_rec^T·B) − μ_rec||_F²  s.t. W_rec·W_rec^T = I, B ∈ {−1,1}^(c×n)
where W_rec is a linear projection matrix, Z_rec is the data matrix, B ∈ {−1,1}^(c×n) is a binary matrix each column of which is the binary code of the corresponding datum in Z_rec, μ_rec is an offset parameter, s is a scaling parameter, ||·||_F denotes the matrix Frobenius norm, and ⊙ denotes element-wise multiplication; the model is non-convex and is optimized by iteratively updating each of the unknown parameters μ_rec, B, W_rec and s until the objective function value converges.
6. The method according to claim 5, wherein iteratively updating each unknown parameter to perform the optimization until the objective function value converges specifically comprises:
Step 4-1: randomly initialize B and W_rec:
W_r is a matrix sampled at random from the standard normal distribution and used to initialize W_rec; W_r = UΣV^T denotes the singular value decomposition of the matrix W_r: assuming W_r is an m×n matrix, then after the decomposition U is an m×m matrix, Σ is an m×n diagonal matrix, and V is an n×n matrix; c denotes the length of the binary code;
initialize μ_rec = 0, s = 1;
Step 4-2: start a new round of iteration and increment the iteration count by one; if the iteration count is less than or equal to T, perform step 4-3, otherwise perform step 4-4;
Step 4-3: update B;
update W_rec: perform a singular value decomposition to obtain U and V, compute R = VU^T, and set W_rec = W_rec·R;
update μ_rec: μ_rec = column_mean(Z_rec − s ⊙ W_rec^T·B);
update s;
end the current round of iteration and return to step 4-2;
Step 4-4: through the iterative computation, the optimal binary code matrix B finally learned completes the reconstruction of the point-to-point relations.
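The alternation of steps 4-1 through 4-4 can be sketched as below. The closed-form B- and s-updates are not reproduced in the extracted text, so this sketch fills them in with the natural least-squares choices for the model ||Z_rec − s·W_rec^T·B − μ_rec||_F²; the Procrustes-style R = VU^T rotation and the column-mean offset update follow the claim. This is an illustrative reading under those assumptions, not the patent's exact algorithm:

```python
import numpy as np

def learn_binary_codes(Z, c, T=30, seed=0):
    rng = np.random.default_rng(seed)
    d, n = Z.shape
    # Step 4-1: W_rec initialized from a standard-normal random matrix via SVD,
    # so that its rows are orthonormal (W W^T = I, which requires c <= d)
    U0, _, Vt0 = np.linalg.svd(rng.standard_normal((c, d)), full_matrices=False)
    W = U0 @ Vt0
    mu = np.zeros((d, 1))          # offset mu_rec (one entry per feature)
    s = 1.0                        # scale
    B = np.ones((c, n))
    for _ in range(T):             # steps 4-2 / 4-3
        Zc = Z - mu
        # B-update (assumed): exact minimizer for fixed W, mu and s > 0
        B = np.where(W @ Zc >= 0, 1.0, -1.0)
        # W-update via orthogonal Procrustes (the claim's R = V U^T rotation step)
        U, _, Vt = np.linalg.svd(B @ Zc.T, full_matrices=False)
        W = U @ Vt
        # s-update (assumed): least-squares scale
        s = float(np.sum(Zc * (W.T @ B)) / (c * n))
        # mu-update, as in the claim: column-wise mean of Z - s (.) W^T B
        mu = (Z - s * (W.T @ B)).mean(axis=1, keepdims=True)
    return B, W, mu, s
```

Each update is an exact coordinate minimizer, so the reconstruction error is non-increasing over the sweeps; the test checks the orthogonality constraint, the binary constraint, and that the fit improves on doing nothing.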
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810203371.8A CN108536750B (en) | 2018-03-13 | 2018-03-13 | Image feature binary coding representation method based on point-to-point relation learning and reconstruction |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108536750A true CN108536750A (en) | 2018-09-14 |
CN108536750B CN108536750B (en) | 2022-03-18 |
Family
ID=63484445
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810203371.8A Active CN108536750B (en) | 2018-03-13 | 2018-03-13 | Image feature binary coding representation method based on point-to-point relation learning and reconstruction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108536750B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103093239A (en) * | 2013-01-18 | 2013-05-08 | 上海交通大学 | Mapping method fusing dot pairs and neighborhood information |
US20130236090A1 (en) * | 2012-03-12 | 2013-09-12 | Fatih Porikli | Learning Dictionaries with Clustered Atoms |
CN103489203A (en) * | 2013-01-31 | 2014-01-01 | 清华大学 | Image coding method and system based on dictionary learning |
CN105469096A (en) * | 2015-11-18 | 2016-04-06 | 南京大学 | Feature bag image retrieval method based on Hash binary code |
Non-Patent Citations (2)
Title |
---|
XIANG-YANG WANG et al.: "An image retrieval scheme with relevance feedback using feature reconstruction and SVM reclassification", Neurocomputing *
HUANG Xiujie et al.: "Optimized vector of locally aggregated descriptors image retrieval algorithm based on minimum reconstruction error", Journal of Computer Applications *
Also Published As
Publication number | Publication date |
---|---|
CN108536750B (en) | 2022-03-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Rolet et al. | Fast dictionary learning with a smoothed Wasserstein loss | |
Song et al. | Auto-encoder based data clustering | |
CN109871454B (en) | Robust discrete supervision cross-media hash retrieval method | |
CN110222218B (en) | Image retrieval method based on multi-scale NetVLAD and depth hash | |
CN109766469A (en) | A kind of image search method based on the study optimization of depth Hash | |
CN109284411B (en) | Discretization image binary coding method based on supervised hypergraph | |
CN107329954B (en) | Topic detection method based on document content and mutual relation | |
CN109685093A (en) | Unsupervised adaptive features select method | |
CN111062428A (en) | Hyperspectral image clustering method, system and equipment | |
CN112488301A (en) | Food inversion method based on multitask learning and attention mechanism | |
CN114693923A (en) | Three-dimensional point cloud semantic segmentation method based on context and attention | |
US20230350913A1 (en) | Mapping of unlabeled data onto a target schema via semantic type detection | |
Denisova et al. | EM clustering algorithm modification using multivariate hierarchical histogram in the case of undefined cluster number | |
CN110309333B (en) | Depth hash image retrieval method based on cosine measurement | |
Remil et al. | Data‐Driven Sparse Priors of 3D Shapes | |
CN108536750A (en) | Image feature binary coding representation method based on point-to-point relation learning and reconstruction | |
CN116229061A (en) | Semantic segmentation method and system based on image generation | |
CN112925994B (en) | Group recommendation method, system and equipment based on local and global information fusion | |
CN112925934B (en) | Similar image retrieval method, system, device and medium based on Hash coding | |
CN112487231B (en) | Automatic image labeling method based on double-image regularization constraint and dictionary learning | |
CN111275201A (en) | Sub-graph division based distributed implementation method for semi-supervised learning of graph | |
CN111724221A (en) | Method, system, electronic device and storage medium for determining commodity matching information | |
Wang et al. | Hyperspectral image compressed processing: Evolutionary multi-objective optimization sparse decomposition | |
CN117056550B (en) | Long-tail image retrieval method, system, equipment and storage medium | |
CN114462406B (en) | Method for acquiring first-appearing aviation keywords based on multi-head self-attention model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||