CN108520279A - A semi-supervised hyperspectral image dimensionality reduction method based on local sparse embedding - Google Patents

A semi-supervised hyperspectral image dimensionality reduction method based on local sparse embedding

Info

Publication number
CN108520279A
CN108520279A
Authority
CN
China
Prior art keywords
matrix
sample
semi-supervised
sparse
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810326062.XA
Other languages
Chinese (zh)
Inventor
黄冬梅
张明华
张晓桐
邹亚晴
李永兰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Maritime University
Shanghai Ocean University
Original Assignee
Shanghai Maritime University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Maritime University filed Critical Shanghai Maritime University
Priority to CN201810326062.XA priority Critical patent/CN108520279A/en
Publication of CN108520279A publication Critical patent/CN108520279A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/211: Selection of the most significant subset of features
    • G06F 18/24: Classification techniques
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2413: Classification techniques relating to the classification model, based on distances to training or reference patterns
    • G06F 18/24147: Distances to closest patterns, e.g. nearest neighbour classification
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/20: Image preprocessing
    • G06V 10/30: Noise filtering
    • G06V 10/40: Extraction of image or video features
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V 10/513: Sparse representations
    • G06V 10/58: Extraction of image or video features relating to hyperspectral data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Processing (AREA)

Abstract

The present invention relates to a semi-supervised hyperspectral image dimensionality reduction method based on local sparse embedding, comprising the following steps. Step S1: let the high-dimensional space R^D contain a data set X = {x_1, x_2, ..., x_l, x_{l+1}, ..., x_{l+u}}, l + u = N, where the first l samples X_l are labelled, the class labels run over c classes with N_i samples in class i, i = 1, 2, ..., c, and the last u samples X_u are unlabelled. Step S2: build the sparse coefficient matrix S by sparse representation. Step S3: construct the projection matrix W with the semi-supervised local sparse embedding projection algorithm. Step S4: obtain the low-dimensional subspace Y = W^T X = {y_1, y_2, ..., y_N} from the projection matrix W. The advantage of the method is that applying semi-supervised local-sparse-embedding dimensionality reduction to a hyperspectral image exploits the label information of the data while preserving its local characteristics and reducing image noise, thereby improving the classification accuracy of the image.

Description

A semi-supervised hyperspectral image dimensionality reduction method based on local sparse embedding
Technical field
The present invention relates to the field of hyperspectral image dimensionality reduction, and specifically to a semi-supervised hyperspectral image dimensionality reduction method based on local sparse embedding.
Background technology
Hyperspectral remote-sensing images have high spatial resolution and continuous ground-object spectral curves, giving them strong capability for terrain classification and recognition; they are widely used in geological prospecting, medical detection, life science, forensic identification, military reconnaissance, environmental monitoring, precision agriculture, and other fields. A hyperspectral image is a remote-sensing image acquired by an imaging spectrometer; as the spatial and spectral resolution of imaging spectrometers improves, the volume of hyperspectral remote-sensing data, which carries two spatial dimensions plus one spectral dimension, grows exponentially. This rapid expansion of data volume not only makes storage and transmission difficult, it also complicates data processing, reduces processing efficiency, and can even lead to the curse of dimensionality. Dimensionality reduction finds a low-dimensional representation of high-dimensional data that retains the important information of the samples, and is a key preprocessing step for hyperspectral image classification. It preserves the useful information of the data, greatly reduces the data volume, and effectively avoids the curse of dimensionality, making the data representation simpler and clearer and benefiting subsequent classification.
Depending on whether the class labels of the sample data are used, traditional dimensionality reduction methods fall into three categories: supervised, unsupervised, and semi-supervised. Unsupervised methods use the spatial relationships of the data to preserve its manifold structure; representative algorithms include principal component analysis (PCA) and locality preserving projections (LPP). Applied to hyperspectral images, unsupervised reduction cannot guarantee that the distances between sample data are preserved after projection from high to low dimension. Supervised reduction projects the data with a discriminative projection; representative algorithms include locality sensitive discriminant analysis (LSDA) and neighborhood preserving embedding (NPE). For hyperspectral images, labelling requires considerable manpower and material resources, and labelled data are hard to obtain. Semi-supervised reduction can exploit the distribution information and structural information of the data together with a small amount of label information to boost performance; representative algorithms include semi-supervised locality preserving projections (SSLPP) and semi-supervised dimensionality reduction based on pairwise constraint propagation (SSDR-PCP).
Although the above algorithms achieve good results in some applications, they all share one drawback: they are not robust to noise.
In the hyperspectral field, because hyperspectral images have many, highly redundant bands, label information must be introduced to improve classification accuracy, yet labels are costly and hard to obtain. Although existing traditional semi-supervised dimensionality reduction algorithms use the semi-supervised idea, they consider only part of the properties of the data: they neither guarantee the local characteristics of the data nor account for the influence of noise in the image data. A semi-supervised dimensionality reduction method based on local sparse embedding is therefore worth studying.
Chinese patent document CN201310565426.7, filed 2013-11-15, entitled "A facial emotion recognition method based on a local sparse representation classifier", discloses a facial emotion recognition method characterized by the steps of: acquiring facial expression images; constructing feature vectors of the facial expression images with the Gabor wavelet transform; selecting features with the feature selection algorithm MFCS; and recognizing the emotion category with a local sparse representation classifier.
The method of the above patent document has the advantages of high emotion-recognition accuracy, fast recognition, and insensitivity to facial variation. However, it does not disclose a dimensionality reduction method that both uses the label information of samples and considers sparse embedding of the local information of the sample data, one that can minimize the distance between same-class samples, maximize the distance between different-class samples, preserve the extraction of the local information of the data, and reduce interference from image noise so as to improve the classification accuracy of the samples.
In summary, in high dimensions samples of different classes may partially overlap or lie close together. Although LPP has the locality preserving property, it does not use the label information of the samples, may project different classes together, and thus yields unsatisfactory classification accuracy. The semi-supervised idea uses class label information but does not consider neighborhood information, so it cannot reflect the local structure of the data set well. When dimensionality reduction is currently applied to hyperspectral images, the extracted features are affected by noise, seriously degrading the classification result. Studies show that the local information of the data is beneficial for both dimensionality reduction and classification, yet traditional LPP considers only the local information of the image data and does not filter image noise. What is needed is a semi-supervised hyperspectral image dimensionality reduction method based on local sparse embedding that both uses the label information of samples and considers sparse embedding of the local information of the sample data, one that minimizes the distance between same-class samples, maximizes the distance between different-class samples, preserves the extraction of the local information of the data, and reduces interference from image noise so as to improve classification accuracy; no such method has yet been reported.
Invention content
The purpose of the present invention is to address the deficiencies of the prior art by providing a semi-supervised hyperspectral image dimensionality reduction method based on local sparse embedding that both uses the label information of the samples and considers sparse embedding of the local information of the sample data; it can minimize the distance between same-class samples, maximize the distance between different-class samples, preserve the extraction of the local information of the data, and reduce the interference of image noise, thereby improving the classification accuracy of the samples.
Another object of the present invention is to provide a technical route for applying the semi-supervised hyperspectral image dimensionality reduction method based on local sparse embedding.
To achieve the above object, the technical solution adopted by the present invention is as follows:
A semi-supervised hyperspectral image dimensionality reduction method based on local sparse embedding, comprising the following steps:
Step S1. Let the high-dimensional space R^D contain a data set X = {x_1, x_2, ..., x_l, x_{l+1}, ..., x_{l+u}}, l + u = N, where the first l samples X_l are labelled, the class labels run over c classes with N_i samples in class i, i = 1, 2, ..., c, and the last u samples X_u are unlabelled;
Step S2. Build the sparse coefficient matrix S by sparse representation;
Step S3. Construct the projection matrix W with the semi-supervised local sparse embedding projection algorithm;
Step S4. Obtain the low-dimensional subspace Y = W^T X = {y_1, y_2, ..., y_N} from the projection matrix W.
As a preferred technical scheme, step S2 specifically includes the following steps:
Step S21. Express any sample x_i as a linear combination of the other samples:
x_i = s_{i,1} x_1 + ... + s_{i,i-1} x_{i-1} + s_{i,i+1} x_{i+1} + ... + s_{i,n} x_n
where s_i = [s_{i,1}, ..., s_{i,i-1}, 0, s_{i,i+1}, ..., s_{i,n}]^T is the coefficient vector and s_{i,j} is the reconstruction coefficient of sample x_i;
Step S22. Build the mathematical model:
min ||s_i||_0   s.t. x_i = X s_i
where ||s_i||_0 denotes the l0 norm of s_i and measures the sparsity of s_i;
Step S23. Convert the l0-norm minimization into an l1-norm minimization:
min ||s_i||_1   s.t. x_i = X s_i, 1 = 1^T s_i
where 1 denotes the all-ones vector and ||s_i||_1 denotes the l1 norm of s_i;
Step S24. Obtain the sparse coefficient matrix S = [s_1, s_2, ..., s_n] from the formula in step S23.
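The l1 model of step S23 is a linear program and can be solved sample by sample with a generic LP solver. The patent does not name a solver, so the following is only a minimal sketch under that assumption, using SciPy's `linprog` and the usual split s = u - v with u, v >= 0:

```python
import numpy as np
from scipy.optimize import linprog

def sparse_code(X, i):
    """l1-minimal coefficients s_i with x_i = X s_i, 1 = 1^T s_i, s_{i,i} = 0.

    X is a (D, n) matrix whose columns are the samples. The l1 objective is
    linearised by splitting s = u - v with u, v >= 0, so min sum(u + v) is an LP.
    """
    D, n = X.shape
    x = X[:, i]
    idx = [j for j in range(n) if j != i]      # exclude x_i from its own dictionary
    A = X[:, idx]
    m = n - 1
    c = np.ones(2 * m)                         # minimise sum(u) + sum(v) = ||s||_1
    A_eq = np.vstack([
        np.hstack([A, -A]),                    # equality constraint X s = x_i
        np.hstack([np.ones(m), -np.ones(m)])[None, :],  # affine constraint 1^T s = 1
    ])
    b_eq = np.concatenate([x, [1.0]])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (2 * m), method="highs")
    s = np.zeros(n)
    s[idx] = res.x[:m] - res.x[m:]
    return s
```

Stacking the per-sample solutions column by column then gives the sparse coefficient matrix S of step S24.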
As a preferred technical scheme, step S3 specifically includes the following steps:
Step S31. Improve the semi-supervised similarity weight matrix Q, where q_ij is an element of the improved semi-supervised similarity weight matrix Q, a_ij = exp(-||x_i - x_j||^2 / σ), σ is the square of the mean Euclidean distance over all sample pairs, J(x_i) is the set of k nearest neighbours of x_i, and k is the neighbourhood parameter; a_ij is the local weight, (1 + a_ij) the within-class discriminant weight, and (1 - a_ij) the between-class discriminant weight;
Step S32. Compute the diagonal matrix D of the similarity weight matrix from the improved semi-supervised similarity weight matrix and, with the sparse coefficient matrix S, compute the Laplacian matrix L*; use L* to construct the objective function of the semi-supervised local sparse embedding projection algorithm:
min_W tr(W^T X L* X^T W)
where L* = D - Q S^T - S Q + S D S^T, S = [s_1, s_2, ..., s_n], the matrix D is the diagonal matrix of Q, and its diagonal elements are D_ii = Σ_j q_ij;
Step S33. Introduce the constraint W^T X X^T W = I, where I is the identity matrix, and with the method of Lagrange multipliers convert the overall objective function of step S32 into:
F(W) = W^T X L* X^T W - λ (W^T X X^T W - I)
Differentiating this formula with respect to W and setting ∂F(W)/∂W = 0 gives:
X L* X^T W = λ X X^T W
where λ is a generalized eigenvalue; solving for the eigenvectors corresponding to the first a largest eigenvalues yields the projection matrix W = [w_1, w_2, ..., w_a].
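Step S33 reduces to a symmetric-definite generalized eigenproblem, which standard linear-algebra libraries solve directly. A sketch under stated assumptions: L* is symmetrized before use, a small ridge keeps X X^T numerically positive definite, and, following the patent's wording, the eigenvectors of the a largest eigenvalues are kept:

```python
import numpy as np
from scipy.linalg import eigh

def projection_matrix(X, L_star, a, reg=1e-8):
    """Solve X L* X^T w = lam X X^T w and return W = [w_1, ..., w_a] of shape (D, a)."""
    A = X @ L_star @ X.T
    A = 0.5 * (A + A.T)                 # symmetrise for the eigh solver
    B = X @ X.T + reg * np.eye(X.shape[0])  # ridge keeps B positive definite
    vals, vecs = eigh(A, B)             # eigenvalues in ascending order
    return vecs[:, -a:][:, ::-1]        # the a largest eigenvalues, largest first

def reduce_dim(X, W):
    """Step S4: low-dimensional subspace Y = W^T X."""
    return W.T @ X
```

`eigh` returns B-orthonormal eigenvectors, so the columns of W satisfy W^T (X X^T) W = I, matching the constraint of step S33.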
To achieve the above second object, the technical solution adopted by the present invention is as follows:
A technical route for the semi-supervised hyperspectral image dimensionality reduction based on local sparse embedding described in at least one of the above embodiments, specifically as follows:
First step: given a hyperspectral image, extract samples of different classes {x_1, x_2, ..., x_N} from the image according to the expert-annotated interpretation map and its legend, as the input data for the hyperspectral image dimensionality reduction of the present invention;
Second step: apply simple preprocessing to the raw data obtained in the first step;
Third step: build the sparse coefficient matrix S by sparse representation;
Fourth step: compute the Euclidean distance between sample pairs, dist = ||x_i - x_j||;
Fifth step: using the Euclidean distances from the fourth step and the labelling applied to the samples in the second step, solve the five label/neighbourhood cases separately and obtain the similarity weight matrix Q between sample points;
Sixth step: from the similarity weight matrix Q obtained in the fifth step, compute the diagonal matrix D;
Seventh step: substitute the values obtained in the third and sixth steps into L* = D - Q S^T - S Q + S D S^T;
Eighth step: substitute the value obtained in the seventh step into the objective
min_W tr(W^T X L* X^T W)
where L* = D - Q S^T - S Q + S D S^T, S = [s_1, s_2, ..., s_n], and the matrix D is the diagonal matrix of Q with diagonal elements D_ii = Σ_j q_ij;
Ninth step: set the target dimension of the sample data and solve X L* X^T W = λ X X^T W for the transition matrix W = (w_1, w_2, ..., w_d);
Tenth step: compute y_i = W^T x_i to obtain the low-dimensional sample data Y = {y_1, y_2, ..., y_N};
Eleventh step: add labels to a small number of samples after dimensionality reduction as training data, combine them with the majority of unlabelled samples as test data, classify the reduced samples with a nearest-neighbour classifier, compare the classification results with those of the known principal component analysis (PCA) and locality preserving projections (LPP), and compute the classification accuracy;
As a preferred technical scheme, with 50 labels, d = 15, and k = 24, the classification accuracy is highest.
As a preferred technical scheme, the classification accuracy is computed as nErr = sum(class ~= classLabel); rate = 1 - nErr/length(class), where class is the set of test-sample labels found by the nearest-neighbour algorithm, classLabel is the set of test-sample labels known before dimensionality reduction, nErr is the number of errors, and rate is the recognition rate.
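The MATLAB-style accuracy formulas (sum(class ~= classLabel), 1 - nErr/length(class)) translate directly. A small sketch, assuming samples are stored as columns and a plain 1-nearest-neighbour classifier as in the eleventh step:

```python
import numpy as np

def nn_classify(Y_train, labels_train, Y_test):
    """Predict a label for each column of Y_test by 1-nearest neighbour."""
    preds = []
    for y in Y_test.T:
        d = np.linalg.norm(Y_train.T - y, axis=1)   # distances to all training columns
        preds.append(labels_train[np.argmin(d)])
    return np.array(preds)

def recognition_rate(class_pred, classLabel):
    """rate = 1 - nErr / length(class), with nErr the number of mismatches."""
    nErr = int(np.sum(class_pred != classLabel))
    return 1.0 - nErr / len(class_pred)
```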
As a preferred technical scheme, preprocessing includes: labelling the sample data, X = {x_1, x_2, ..., x_l, x_{l+1}, ..., x_{l+u}}, and data normalization.
The invention has the advantages that:
1. Once put into application, the present invention can achieve the following technical effect: by applying semi-supervised local-sparse-embedding dimensionality reduction to a hyperspectral image, this scheme both uses the label information of the data and preserves its local characteristics while reducing image noise, thereby improving the classification accuracy of the image.
2. The proposed algorithm makes full use of the locality preserving property of the LPP algorithm, introducing a small amount of labelled and a large amount of unlabelled sample data into the LPP similarity weight matrix, and treats the different combinations of neighbourhood information and label information between samples separately, namely: 1) the similarity between neighbouring sample points with the same label; 2) the similarity between neighbouring sample points with different labels; 3) the similarity between a labelled and an unlabelled neighbouring sample point; 4) the similarity between two unlabelled neighbouring sample points;
3. The semi-supervised hyperspectral image dimensionality reduction algorithm based on local sparse embedding of the present invention is suitable for processing hyperspectral images with many redundant bands, strong spectral correlation, huge data volume, and high dimension; dimensionality reduction greatly lowers the computational complexity, reduces the discrimination errors caused by redundant information, and improves the classification performance of the image, whereas traditional hyperspectral images are costly to store and transmit, and handling high-dimensional data in practical applications also brings many problems;
4. The sparse matrix is used to match and remove noise, effectively filtering out noise information during learning, so the method is robust to noise;
5. The present invention proposes a dimensionality reduction method that both uses the label information of the samples and considers sparse embedding of the local information of the sample data; it can minimize the distance between same-class samples, maximize the distance between different-class samples, preserve the extraction of the local information of the data, and reduce the interference of image noise, thereby improving the classification accuracy of the samples;
6. Through sparse representation, the original data are expressed with as few basis signal vectors as possible while the reconstruction error is simultaneously minimized;
7. The improved semi-supervised similarity weight matrix has the following advantages: (1) 0 ≤ a_ij ≤ 1, 1 ≤ 1 + a_ij ≤ 2, and 0 ≤ 1 - a_ij ≤ 1, so the between-class and within-class discriminant weights compress noise into a bounded range; (2) from a_ij = exp(-||x_i - x_j||^2 / σ) (σ being the square of the mean Euclidean distance between all data pairs) it follows that as the Euclidean distance shrinks, the local weight grows, the between-class discriminant weight shrinks, and the within-class discriminant weight grows, so that the similarity weight of same-class labelled samples becomes larger while the weight of differently-labelled samples is correspondingly reduced.
Description of the drawings
Attached drawing 1 is rarefaction representation schematic diagram.
2 dimensionality reduction technology route schematic diagram of attached drawing.
Attached drawing 3 is that number of tags is 20labels and the result of PCA, LPP carry out contrast schematic diagram.
Attached drawing 4 is that number of tags is 30labels and the result of PCA, LPP carry out contrast schematic diagram.
Attached drawing 5 is that number of tags is 40labels and the result of PCA, LPP carry out contrast schematic diagram.
Attached drawing 6 is that number of tags is 50labels and the result of PCA, LPP carry out contrast schematic diagram.
Detailed description of the embodiments
The specific embodiments provided by the present invention are elaborated below in conjunction with the accompanying drawings.
The present invention proposes an improved algorithm: the semi-supervised hyperspectral image dimensionality reduction algorithm based on local sparse embedding. The key of the algorithm is to exploit the local structure preservation of LPP while introducing label information and reducing noise, and to construct a new projection objective function so as to obtain a projection matrix based on the ISWSSFE sparse representation. The method extracts local discriminant information more fully and filters noise effectively, so that hyperspectral image classification accuracy is clearly improved.
Let the high-dimensional space R^D contain a data set X = {x_1, x_2, ..., x_l, x_{l+1}, ..., x_{l+u}}, l + u = N, where the first l samples X_l are labelled, the class labels run over c classes with N_i samples in class i, i = 1, 2, ..., c, and the last u samples X_u are unlabelled.
1. Sparse representation
The goal of sparse representation is to express the original data with as few basis signal vectors as possible while simultaneously minimizing the reconstruction error. Any sample x_i can be written as a linear combination of the other samples (excluding itself):
x_i = s_{i,1} x_1 + ... + s_{i,i-1} x_{i-1} + s_{i,i+1} x_{i+1} + ... + s_{i,n} x_n   (1)
where s_i = [s_{i,1}, ..., s_{i,i-1}, 0, s_{i,i+1}, ..., s_{i,n}]^T is the coefficient vector and s_{i,j} is the reconstruction coefficient of sample x_i.
The mathematical model is:
min ||s_i||_0   s.t. x_i = X s_i   (2)
||s_i||_0 denotes the l0 norm of s_i and measures its sparsity, but l0-norm minimization is an NP-hard problem and difficult to solve. The l0 minimization can therefore be converted into an l1 minimization:
min ||s_i||_1   s.t. x_i = X s_i, 1 = 1^T s_i   (3)
where 1 denotes the all-ones vector and ||s_i||_1 denotes the l1 norm of s_i. Solving formula (3) yields the sparse coefficient matrix S = [s_1, s_2, ..., s_n].
2. The semi-supervised local sparse embedding projection algorithm (ISWSSFE)
LPP is an unsupervised locality-preserving dimensionality reduction algorithm; no label information enters its similarity weight matrix, so it is very sensitive to noise. The present invention uses the label information of the samples and proposes an improved semi-supervised similarity weight matrix Q (formula (4)), where q_ij is an element of Q, a_ij = exp(-||x_i - x_j||^2 / σ), σ is the square of the mean Euclidean distance over all sample pairs, J(x_i) is the set of k nearest neighbours of x_i, and k is the neighbourhood parameter. a_ij is the local weight; (1 + a_ij) serves as the within-class discriminant weight and (1 - a_ij) as the between-class discriminant weight. The improved semi-supervised similarity weights take the class labels of the samples into account and reflect both the local neighbourhood structure and the label information of the sample data.
Advantages of the improved semi-supervised similarity weight matrix: (1) 0 ≤ a_ij ≤ 1, 1 ≤ 1 + a_ij ≤ 2, and 0 ≤ 1 - a_ij ≤ 1, so the between-class and within-class discriminant weights compress noise into a bounded range; (2) from a_ij = exp(-||x_i - x_j||^2 / σ) (σ being the square of the mean Euclidean distance between all data pairs) it follows that as the Euclidean distance shrinks, the local weight grows, the between-class discriminant weight shrinks, and the within-class discriminant weight grows, so that the similarity weight of same-class labelled samples becomes larger while the weight of differently-labelled samples is correspondingly reduced.
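The text fixes the ingredients of Q (the k-neighbourhood J(x_i), the local weight a_ij, and the weights 1 + a_ij and 1 - a_ij), but the five-case definition of formula (4) itself is not reproduced here. The following sketch is therefore an assumption: same-label neighbours get 1 + a_ij, different-label neighbours get 1 - a_ij, pairs with at least one unlabelled sample get a_ij, everything outside the neighbourhood is zero, and the matrix is symmetrised at the end. `labels` uses -1 for unlabelled samples:

```python
import numpy as np

def similarity_matrix(X, labels, k):
    """Hypothetical sketch of the improved semi-supervised weight matrix Q.

    X: (D, n) data, columns are samples; labels[i] == -1 marks an unlabelled
    sample. The five-case rule assumed here is NOT quoted from formula (4).
    """
    n = X.shape[1]
    d2 = np.array([[np.sum((X[:, i] - X[:, j]) ** 2) for j in range(n)]
                   for i in range(n)])
    # sigma: square of the mean pairwise Euclidean distance
    sigma = np.mean(np.sqrt(d2[np.triu_indices(n, 1)])) ** 2
    A = np.exp(-d2 / sigma)                     # local weights a_ij
    Q = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]       # k nearest neighbours of x_i
        for j in nbrs:
            if labels[i] >= 0 and labels[j] >= 0:
                Q[i, j] = 1 + A[i, j] if labels[i] == labels[j] else 1 - A[i, j]
            else:
                Q[i, j] = A[i, j]
    return np.maximum(Q, Q.T)                   # symmetrise (an assumption)
```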
Based on the improved semi-supervised similarity weight matrix, an improved semi-supervised locality objective function is given, so that the projection matrix W of LPP keeps the class information of the samples while guaranteeing locality preservation, and uses sparse representation to filter the noise of the image:
min_W tr(W^T X L* X^T W)   (5)
where L* = D - Q S^T - S Q + S D S^T and S = [s_1, s_2, ..., s_n]. The matrix D is the diagonal matrix of Q, with diagonal elements D_ii = Σ_j q_ij.
Introduce the constraint W^T X X^T W = I, where I is the identity matrix.
With the method of Lagrange multipliers, the overall objective function (5) is converted into:
F(W) = W^T X L* X^T W - λ (W^T X X^T W - I)   (6)
Differentiating (6) with respect to W gives:
∂F(W)/∂W = 2 X L* X^T W - 2 λ X X^T W   (7)
Setting ∂F(W)/∂W = 0 converts (7) into (8):
X L* X^T W = λ X X^T W   (8)
Finding the projection matrix W is in fact a generalized eigenvalue and eigenvector problem, where λ is a generalized eigenvalue; the eigenvectors corresponding to the first a largest eigenvalues form the projection matrix W = [w_1, w_2, ..., w_a], and the low-dimensional data after projection are:
Y = W^T X   (9)
Input: sample data set X = {x_1, x_2, ..., x_N} and neighbourhood parameter k
Output: dimensionality reduction matrix W = [w_1, w_2, ..., w_a] and low-dimensional subspace Y = W^T X = {y_1, y_2, ..., y_N}
The algorithm is implemented as follows:
(1) construct the improved semi-supervised similarity weight matrix Q from a_ij = exp(-||x_i - x_j||^2 / σ);
(2) from the improved semi-supervised similarity weight matrix, compute its diagonal matrix D;
(3) construct the sparse coefficient matrix S according to formula (3) and from it compute the Laplacian matrix L*;
(4) use the Laplacian matrix L* to construct the objective function (5) of the semi-supervised local sparse embedding projection algorithm;
(5) solve the generalized eigen-equation according to formula (8); the eigenvectors corresponding to the first a largest eigenvalues form the projection matrix W;
(6) obtain the low-dimensional subspace Y = W^T X = {y_1, y_2, ..., y_N} according to formula (9).
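Steps (2) and (3) combine Q, its diagonal matrix D, and the sparse coefficient matrix S into the Laplacian L*. A minimal sketch of that combination; note that with S = I the expression collapses to 2(D - Q), twice the ordinary graph Laplacian, which serves as a quick sanity check:

```python
import numpy as np

def laplacian_star(Q, S):
    """L* = D - Q S^T - S Q + S D S^T with D = diag(sum_j q_ij)."""
    D = np.diag(Q.sum(axis=1))        # diagonal matrix of Q, D_ii = sum_j q_ij
    return D - Q @ S.T - S @ Q + S @ D @ S.T
```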
The present invention proposes the semi-supervised hyperspectral image dimensionality reduction algorithm based on local sparse embedding (ISWSSFE). The Pavia University hyperspectral image is processed with 20 labels, 30 labels, 40 labels, and 50 labels respectively, and the results are compared with those of PCA and LPP, where d is the dimension and k the neighbourhood parameter (k is 24, d ranges from 5 to 50). The results are shown in Figures 3-6; from the line charts for 20, 30, 40, and 50 labels it can be seen that with 50 labels, d = 15, and k = 24, the recognition rate of the samples reaches 93.56%.
Figure 2 outlines the technical route of the hyperspectral image dimensionality reduction based on the present technology.
The first step, existing high spectrum image, the interpretation figure marked according to expert refers to legend, from this width Different classes of sample data { x is extracted in image1, x2, ..., xN, the input number as high-spectrum image dimensionality reduction of the present invention According to.
The initial data (103 * 1800 samples of dimension) obtained in the first step is carried out simple data and located in advance by second step Reason.Pretreatment includes:Sample data adds category X={ x1, x2..., xl, xl+1..., xl+u, data normalization processing.
Third walks, and the sparse coefficient matrix S (1800*1800) of training sample is acquired according to formula (3).
4th step solves sample data Euclidean distance (1800*1800) dist=between them | | xi-xj||。
5th step, by the 4th step solve Euclidean distance, according in second step to sample add category information the case where, It is divided into 5 kinds of situations to solve, finally obtains the similar weight matrix Q (1800*1800) between sample point as shown in formula (4).
6th step, the similar weight matrix Q acquired by the 5th step according to similar weight matrix Q and acquire diagonal matrix D。
7th step, the value acquired by third step and the 6th step bring L into*=D-QST-SQ+SDST, acquire
8th step brings the value that the 7th step acquires into formula (5).
9th step, setting sample data dimension d (value range is 5~50), and according to formula (8), find out transition matrix W =(w1, w2..., wd) (103*d dimensions).
In the tenth step, yi = W^T × xi is computed, giving the low-dimensional sample data Y = {y1, y2, ..., yN} (d × 1800).
In the eleventh step, the few labeled samples after dimensionality reduction serve as training data, combined with the majority of unlabeled samples as test data; the reduced samples are classified with a nearest-neighbor classifier and the classification results are compared with those of the known methods (PCA, LPP). The classification accuracy is calculated with the formulas nErr = sum(class ~= classLabel); rate = 1 − nErr/length(class), where class is the test-sample labels found by the nearest-neighbor algorithm, classLabel is the test-sample labels known before dimensionality reduction, nErr is the number of errors, and rate is the recognition rate.
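The MATLAB-style accuracy formula above can be written as an equivalent Python sketch with a 1-nearest-neighbour classifier (the function names and toy data are our own):

```python
import numpy as np

def nn_classify(Y_train, labels_train, Y_test):
    """Columns of Y_train / Y_test are low-dimensional samples."""
    d2 = ((Y_test[:, None, :] - Y_train[:, :, None]) ** 2).sum(axis=0)
    return labels_train[np.argmin(d2, axis=0)]

def recognition_rate(pred, true):
    n_err = np.sum(pred != true)                 # nErr = sum(class ~= classLabel)
    return 1 - n_err / len(true)                 # rate = 1 - nErr/length(class)

Y_train = np.array([[0.0, 0.0, 5.0, 5.0],
                    [0.0, 1.0, 5.0, 6.0]])       # two samples per class
labels_train = np.array([0, 0, 1, 1])
Y_test = np.array([[0.2, 4.9],
                   [0.1, 5.1]])
pred = nn_classify(Y_train, labels_train, Y_test)
rate = recognition_rate(pred, np.array([0, 1]))  # expected: 1.0
```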
In the twelfth step, from the line charts for 20, 30, 40 and 50 labels it can be found that when the number of labels is 50, d = 15 and k = 24, the recognition rate of the samples reaches 93.56%.
Once put into application, the present invention can achieve the following technical effects. By performing semi-supervised dimensionality reduction with local sparse embedding on the hyperspectral image, this scheme both exploits the label information of the data and preserves its local characteristics while reducing the noise of the image, thereby improving the classification accuracy of the image. The proposed algorithm makes full use of the locality-preserving property of the LPP algorithm; its similarity weight matrix incorporates a small amount of labeled sample data and a large amount of unlabeled sample data, and separately considers the different combinations of neighborhood information and label information between samples, namely: 1) the similarity between same-label neighboring sample points; 2) the similarity between different-label neighboring sample points; 3) the similarity between labeled and unlabeled neighboring sample points; 4) the similarity between unlabeled neighboring sample points. The semi-supervised hyperspectral image dimensionality-reduction algorithm based on local sparse embedding of the present invention is suited to processing hyperspectral images with many redundant bands, strong spectral correlation, huge data volume and high dimensionality; dimensionality reduction greatly lowers the computational complexity, reduces the discrimination errors caused by redundant information, and improves the classification performance of the image, whereas traditional hyperspectral images are not only costly to store and transmit, but handling their high-dimensional data in practical applications also brings many problems. The sparse matrix is used to fit the noise, effectively filtering out noise information during learning and making the method robust to noise.

The present invention proposes a dimensionality-reduction method based on sparse embedding of the sample data that uses both the label information of the samples and their local information: it minimizes the distance between samples of the same class, maximizes the distance between samples of different classes, preserves the local information of the extracted data, and reduces the interference of image noise, thereby improving the classification accuracy of the samples. Through sparse representation, the original data are represented with as few columns of the basis-signal matrix as possible while at the same time keeping the reconstruction error as small as possible. The improved semi-supervised similarity weight matrix has the following advantages: (1) since 0 ≤ a_ij ≤ 1, 1 ≤ (1 + a_ij) ≤ 2 and 0 ≤ (1 − a_ij) ≤ 1, the between-class and within-class discriminant weights are compressed into a fixed range; (2) from a_ij = exp(−||xi − xj||^2/σ) (σ is the square of the mean Euclidean distance between all data pairs), as the Euclidean distance decreases the local weight a_ij increases, the between-class discriminant weight decreases and the within-class discriminant weight increases, so that the similarity weights of same-label samples become larger while the weights of different-label samples are correspondingly reduced.
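A quick numeric check of property (1) above: a_ij = exp(−||xi − xj||^2/σ) always lies in (0, 1], so (1 + a_ij) lies in (1, 2] and (1 − a_ij) in [0, 1), i.e. the discriminant weights are indeed compressed into a fixed range regardless of the data scale. The synthetic data below are our own.

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(10, 50))
d2 = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)  # squared distances
sigma = np.sqrt(d2)[np.triu_indices(50, 1)].mean() ** 2  # squared mean distance
a = np.exp(-d2 / sigma)
# 0 < a <= 1, so 1 < 1 + a <= 2 and 0 <= 1 - a < 1 for every pair
```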
The above is only a preferred embodiment of the present invention. It should be noted that persons of ordinary skill in the art may make further improvements and additions without departing from the method of the present invention, and such improvements and additions shall also be regarded as falling within the protection scope of the present invention.

Claims (7)

1. A semi-supervised hyperspectral image dimensionality-reduction method based on local sparse embedding, characterized in that the method comprises the following steps:
Step S1. Let the high-dimensional space R^D contain a data set X = {x1, x2, ..., xl, xl+1, ..., xl+u}, with l + u = N, where the first l samples X_l are labeled samples, the number of class labels is c, the number of samples in class i is N_i, i = 1, 2, ..., c, and the last u samples X_u are unlabeled samples;
Step S2. Construct the sparse coefficient matrix S via sparse representation;
Step S3. Construct the projection matrix W based on the semi-supervised local sparse embedding projection algorithm;
Step S4. Obtain the low-dimensional subspace Y = W^T X = {y1, y2, ..., yN} from the projection matrix W.
2. The semi-supervised hyperspectral image dimensionality-reduction method based on local sparse embedding according to claim 1, characterized in that step S2 specifically comprises the following steps:
Step S21. Express any datum xi as a linear combination:
xi = s_{i,1}x1 + ... + s_{i,i-1}x_{i-1} + s_{i,i+1}x_{i+1} + ... + s_{i,n}x_n
where s_i = [s_{i,1}, ..., s_{i,i-1}, 0, s_{i,i+1}, ..., s_{i,n}]^T is the coefficient vector and s_{i,j} denotes the reconstruction coefficient of sample xi;
Step S22. Build the following mathematical model:
min_{s_i} ||s_i||_0, s.t. x_i = X s_i
where ||s_i||_0 denotes the l0-norm of s_i, which measures the sparsity of s_i;
Step S23. Convert the l0-norm minimization problem into an l1-norm minimization problem:
min_{s_i} ||s_i||_1, s.t. x_i = X s_i, 1 = 1^T s_i
where 1 denotes the all-ones vector and ||s_i||_1 denotes the l1-norm of s_i;
Step S24. Obtain the sparse coefficient matrix S = [s1, s2, ..., sn] from the formulation in step S23.
3. The semi-supervised hyperspectral image dimensionality-reduction method based on local sparse embedding according to claim 1, characterized in that step S3 specifically comprises the following steps:
Step S31. Improve the semi-supervised similarity weight matrix;
where q_ij is an element of the improved semi-supervised similarity weight matrix Q, a_ij = exp(−||xi − xj||^2/σ), σ is the square of the mean Euclidean distance between all sample pairs, J(xi) is the k-nearest neighborhood of xi, and k is the neighborhood parameter; a_ij is the local weight, (1 + a_ij) is taken as the within-class discriminant weight, and (1 − a_ij) as the between-class discriminant weight;
Step S32. From the improved semi-supervised similarity weight matrix, compute the diagonal matrix D of the similarity weight matrix, and from the sparse coefficient matrix S compute the Laplacian matrix L*; use the Laplacian matrix L* to construct the objective function of the semi-supervised local sparse embedding projection algorithm, based on the quadratic form:
W^T X L* X^T W
where L* = D − QS^T − SQ + SDS^T, S = [s1, s2, ..., sn]; the matrix D is the diagonal matrix of Q, with diagonal elements D_ii = Σ_j q_ij;
Step S33. Introduce the constraint W^T X X^T W = I, where I is the identity matrix, and convert the overall objective function of step S32 into the following form with the Lagrange multiplier method:
F(W) = W^T X L* X^T W − λ(W^T X X^T W − I)
Differentiating this formula with respect to W and setting ∂F(W)/∂W = 0 gives:
X L* X^T W = λ X X^T W
where λ is a generalized eigenvalue; by solving for the eigenvectors corresponding to the first a largest eigenvalues, the projection matrix W = [w1, w2, ..., wa] is formed.
4. A technical route using the semi-supervised hyperspectral image dimensionality-reduction method based on local sparse embedding according to any one of claims 1 to 3, specifically as follows:
In the first step, given an existing hyperspectral image, sample data of different classes {x1, x2, ..., xN} are extracted from the image according to the expert-annotated interpretation map, as the input data for hyperspectral image dimensionality reduction;
In the second step, the raw data obtained in the first step undergo simple data pre-processing;
In the third step, the sparse coefficient matrix S is constructed via sparse representation;
In the fourth step, the pairwise Euclidean distances between samples are computed: dist = ||xi − xj||;
In the fifth step, using the Euclidean distances solved in the fourth step and the label information attached to the samples in the second step, five cases are distinguished, finally yielding the similarity weight matrix Q between sample points;
In the sixth step, the diagonal matrix D is computed from the similarity weight matrix Q obtained in the fifth step;
In the seventh step, the values obtained in the third and sixth steps are substituted into L* = D − QS^T − SQ + SDS^T;
In the eighth step, the value obtained in the seventh step is substituted into the objective function based on the quadratic form W^T X L* X^T W,
where L* = D − QS^T − SQ + SDS^T, S = [s1, s2, ..., sn], the matrix D is the diagonal matrix of Q, and its diagonal elements are D_ii = Σ_j q_ij;
In the ninth step, the sample data dimension is set and the projection matrix W = (w1, w2, ..., wd) is found according to the formula X L* X^T W = λ X X^T W;
In the tenth step, yi = W^T × xi is computed, giving the low-dimensional sample data Y = {y1, y2, ..., yN};
In the eleventh step, the few labeled samples after dimensionality reduction serve as training data, combined with the majority of unlabeled samples as test data; the reduced samples are classified with a nearest-neighbor classifier, the classification results are compared with those of the known principal component analysis PCA and locality preserving projections LPP, and the classification accuracy is computed.
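The technical route above can be sketched end-to-end on synthetic data. Two deliberate, labeled simplifications: the l1 sparse-coding step is replaced by a least-squares fit over each sample's k nearest neighbours (a cheap stand-in, not the patent's formula), and the five-case weight matrix follows the assumed reading described in the specification; all names, parameters and data are our own.

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(6)
n_per, n_bands, k, d = 30, 10, 5, 2
# two Gaussian classes; about 20% of the samples carry labels (-1 = unlabeled)
X = np.hstack([rng.normal(0.0, 1.0, (n_bands, n_per)),
               rng.normal(3.0, 1.0, (n_bands, n_per))])
true = np.repeat([0, 1], n_per)
labels = np.where(rng.random(2 * n_per) < 0.2, true, -1)
n = X.shape[1]

# second step: normalization
X = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

# fourth step: pairwise distances, local weights a_ij
d2 = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)
sigma = np.sqrt(d2)[np.triu_indices(n, 1)].mean() ** 2
a = np.exp(-d2 / sigma)
order = np.argsort(d2, axis=1)

# third step (simplified): least-squares coefficients over k neighbours
S = np.zeros((n, n))
for i in range(n):
    nb = order[i, 1:k + 1]
    S[nb, i] = np.linalg.lstsq(X[:, nb], X[:, i], rcond=None)[0]

# fifth step: five-case similarity weight matrix Q
nn = np.zeros((n, n), dtype=bool)
nn[np.arange(n)[:, None], order[:, 1:k + 1]] = True
nn |= nn.T
both = (labels[:, None] >= 0) & (labels[None, :] >= 0)
same = both & (labels[:, None] == labels[None, :])
Q = np.where(nn, np.where(same, 1 + a, np.where(both, 1 - a, a)), 0.0)

# sixth to ninth steps: D, L*, and the generalized eigenproblem
Dg = np.diag(Q.sum(axis=1))
L_star = Dg - Q @ S.T - S @ Q + S @ Dg @ S.T
L_star = (L_star + L_star.T) / 2
A = X @ L_star @ X.T
B = X @ X.T + 1e-8 * np.eye(n_bands)      # ridge is our numerical safeguard
_, vecs = eigh(A, B)
W = vecs[:, ::-1][:, :d]                  # d largest, as in the claim text

Y = W.T @ X                               # tenth step: low-dimensional data

# eleventh step: nearest-neighbour classification of the unlabeled samples
tr, te = labels >= 0, labels < 0
d2y = ((Y[:, te][:, None, :] - Y[:, tr][:, :, None]) ** 2).sum(axis=0)
pred = labels[tr][np.argmin(d2y, axis=0)]
rate = 1 - np.sum(pred != true[te]) / te.sum()
```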
5. The technical route of semi-supervised hyperspectral image dimensionality reduction based on local sparse embedding according to claim 4, characterized in that the number of labels is chosen as 50, with d = 15 and k = 24, as the parameters for computing the classification accuracy.
6. The technical route of semi-supervised hyperspectral image dimensionality reduction based on local sparse embedding according to claim 4, characterized in that the classification accuracy is computed as follows:
nErr = sum(class ~= classLabel); rate = 1 − nErr/length(class);
where class is the test-sample labels found by the nearest-neighbor algorithm, classLabel is the test-sample labels known before dimensionality reduction, nErr is the number of errors, and rate is the recognition rate.
7. The technical route of semi-supervised hyperspectral image dimensionality reduction based on local sparse embedding according to claim 4, characterized in that the pre-processing includes: attaching class labels to the sample data, X = {x1, x2, ..., xl, xl+1, ..., xl+u}, and data normalization.
CN201810326062.XA 2018-04-12 2018-04-12 A semi-supervised hyperspectral image dimensionality-reduction method based on local sparse embedding Pending CN108520279A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810326062.XA CN108520279A (en) 2018-04-12 2018-04-12 A kind of semi-supervised dimension reduction method of high spectrum image of the sparse insertion in part


Publications (1)

Publication Number Publication Date
CN108520279A true CN108520279A (en) 2018-09-11

Family

ID=63432420



Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110689064A (en) * 2019-09-23 2020-01-14 云南电网有限责任公司电力科学研究院 Image semi-supervised classification method and device and computer readable storage medium
CN111611963A (en) * 2020-05-29 2020-09-01 扬州大学 Face recognition method based on neighbor preserving canonical correlation analysis
CN111783615A (en) * 2020-06-28 2020-10-16 南京工程学院 Non-constrained face recognition method based on weighted block tensor sparse graph mapping
CN112101381A (en) * 2020-08-30 2020-12-18 西南电子技术研究所(中国电子科技集团公司第十研究所) Tensor collaborative drawing discriminant analysis remote sensing image feature extraction method
CN112836671A (en) * 2021-02-26 2021-05-25 西北工业大学 Data dimension reduction method based on maximization ratio and linear discriminant analysis
CN112944104A (en) * 2021-03-03 2021-06-11 杭州申昊科技股份有限公司 Pipeline robot for detecting defects and control method and control system thereof

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103593676A (en) * 2013-11-29 2014-02-19 重庆大学 High-spectral remote-sensing image classification method based on semi-supervision sparse discriminant embedding


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhang Xiaotao: "Fault dimensionality-reduction identification based on a semi-supervised PCA-LPP manifold learning algorithm", Journal of Central South University *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20180911