CN104463229A - Hyperspectral data supervised classification method based on correlation coefficient redundancy

Info

Publication number: CN104463229A
Application number: CN201410840648.XA
Authority: CN
Inventors: 张淼; 张晔; 沈毅
Original assignee: Harbin Institute of Technology
Current assignee: Harbin Institute of Technology
Priority date: 2014-12-30
Filing date: 2014-12-30
Publication date: 2015-03-25
Anticipated expiration: 2034-12-30
Also published as: CN104463229B

Abstract

The invention discloses a hyperspectral data supervised classification method based on correlation coefficient redundancy and belongs to the field of remote sensing image information processing. The method comprises the following steps: first, the training sample set required for supervised classification is automatically screened using the correlation coefficient redundancy; second, the parameter of the SVM kernel function is optimized; third, binary classification tasks on the hyperspectral remote sensing image are completed with the SVM classifier algorithm; fourth, the multi-class classification task is realized with a one-to-one strategy. The method assists the automatic screening of training samples by calculating the correlation coefficient redundancy of the sample set chosen in each single selection, and uses the discarded samples to optimize the classifier parameter automatically. As a result, the classification accuracy of the SVM classifier algorithm is effectively improved and time consumption is reduced by reducing the number of support vectors, which makes the method more practical for high-accuracy classification of hyperspectral remote sensing images.

Description

Hyperspectral data supervised classification method based on correlation coefficient redundancy

Technical field

The invention belongs to the field of remote sensing image information processing and relates to a hyperspectral data supervised classification method that optimizes both the training data and the classifier parameters.

Background art

Classification of hyperspectral remote sensing images is an important part of hyperspectral remote sensing information processing. Depending on whether prior knowledge of the classes is used, classification can be divided into supervised and unsupervised classification. Supervised classification is more accurate than unsupervised classification and is therefore suited to fine-grained classification of remote sensing images. The notion of classification differs between applications. In hyperspectral image classification research, for supervised classification the researcher must first select a set of representative pixels for each class as training samples. Because hyperspectral imaging equipment emphasizes spectral resolution, spatial resolution is often sacrificed to some extent, so selecting suitable training samples by visual inspection depends heavily on the analyst's experience. Other information, such as land data or existing maps, is also often needed to select representative training samples for each class. In supervised classification, samples of the same class must be homogeneous while still covering a certain range of variance, so in practice multiple training regions or sets have to be selected. If the within-class variance is large, selecting training regions is laborious, and there is no certainty that the selected training samples are fully suitable for classifying the image. Selecting and screening training samples is therefore a task that relies heavily on the researcher's judgment and is also very time-consuming.

The support vector machine (SVM) algorithm is a classification method developed on the basis of statistical learning theory. It has distinct advantages in small-sample, nonlinear and high-dimensional pattern classification, and has therefore consistently given good results on hyperspectral data with a large number of bands (typically 100 to 1000). SVM evolved from the linear classifier by introducing the structural risk minimization principle, optimization theory and kernel methods. It is the newest and most widely applied method in statistical learning, and it can effectively overcome the Hughes phenomenon in hyperspectral data classification (classification accuracy decreasing as the number of bands increases). In addition, SVM can operate directly on high-dimensional data without dimensionality reduction, so classifying with all bands preserves the sufficiency and integrity of the spectral information. For this reason, many high-accuracy classification tasks use the full band data and run on high-performance workstations.

Summary of the invention

To solve the problem that training samples are difficult to screen effectively in hyperspectral classification, the invention provides a hyperspectral data supervised classification method based on correlation coefficient redundancy. The method assists the automatic screening of training samples by calculating the correlation coefficient redundancy of the sample set chosen in each single selection, and uses the discarded samples to optimize the classifier parameter automatically. This effectively improves the classification accuracy of the SVM classifier algorithm and reduces time consumption by reducing the number of support vectors, making the method more practical for high-accuracy classification of hyperspectral remote sensing images.

The invention addresses the concrete tasks faced by hyperspectral remote sensing image analysts and provides a classification method that assists training sample screening and supplies parameter optimization. To screen out defective or suboptimal pixels from the training samples, it proposes the correlation coefficient redundancy, which analyzes the amount of nonlinear correlation information among multiple input variables. The M pixels preselected from the E × 1 pixel line segment sampler can thus be evaluated as a whole in a single pass, ensuring that the selected pixels maximize the overall amount of useful classification information. The discarded training samples are in turn put to use in the parameter selection of the SVM classifier. Finally, the SVM classifier algorithm automatically classifies the test samples of interest to the analyst, giving a fast and accurate supervised classification scheme. The concrete steps are as follows:

Step 1: Automatically screen the training sample set required for supervised classification using the correlation coefficient redundancy:

1) For the captured hyperspectral remote sensing image, Row and Column denote the width and length of the image and B denotes its number of bands. The image analyst selects training samples with an E × 1 line segment sampler;

2) The manually selected training samples are screened automatically. The number of pixels retained each time is set to M, with M < E, so E − M pixels of the training samples are deleted;

3) The correlation coefficient redundancy of the M pixels P_1, ..., P_M is calculated for every possible selection, and the selection with the largest correlation coefficient redundancy gives the M pixels retained from the E pixels;

4) The E pixels selected this time are all marked with one determined class label Class^*. The M retained pixels are added to the training sample set as data pairs (P_1^*, Class^*), ..., and the remaining E − M pixels are added to the discarded sample set, also as data pairs;

5) If more training samples need to be selected, return to 1); otherwise proceed to Step 2.

Step 2: Optimize the parameter of the SVM kernel function:

1) In view of the characteristics of the different SVM kernel functions, the radial basis function (RBF), which has a symmetric inner product, is selected as the kernel function K(P_i, P) of the support vector machine. Its width parameter σ is searched by traversing the values σ ∈ {0.01, 0.1, 1, 10};

2) Optimization scheme: first, one class of pixels sharing the same class label is selected from the training sample set obtained in Step 1 and given the new class label −1, while all remaining pixels in the training sample set are given the new class label 1. The pixels in the discarded sample set are relabeled in the same way. This constructs a standard binary classification problem, which is classified with the SVM algorithm based on the RBF kernel function for each traversed value σ ∈ {0.01, 0.1, 1, 10}, and the classification accuracy for that class is calculated. Then the same processing is applied to every class in the training sample set (because it pits one class against all other classes, it is also called the one-to-many strategy), giving the classification accuracy for all classes. Averaging these accuracies gives the average classification accuracy;

3) The σ value with the highest average classification accuracy is selected as the parameter of the SVM classifier algorithm.

Step 3: Complete the binary classification tasks on the hyperspectral remote sensing image with the SVM classifier algorithm:

Pixels of two classes h and k are selected from the training sample set as training data to train the SVM classifier with the known parameter σ. The class of each of the L test samples P_t (1 ≤ t ≤ L) is then estimated, giving the class estimates of all test samples:

f_{h,k}(P_t), 1 ≤ t ≤ L;

If the training sample set contains only two classes, Step 4 is unnecessary and f_{h,k}(P_t) completes the class decision for all test samples. Otherwise, another pair of different classes is selected and Step 3 is repeated until all pairwise class combinations have been computed.

Step 4: Realize the multi-class classification task based on the one-to-one strategy:

Using the results of Step 3, the class label of each test sample is finally decided by the voting principle of the one-to-one strategy.

Compared with the prior art, the invention has the following advantages:

1) The invention takes into account the low spatial resolution of hyperspectral images and automatically screens the output of the line segment sampler commonly used by experts when determining the training set. The screening can be executed automatically by the computer after the analyst completes each E × 1 pixel sampling, so in practice it brings no computational burden. After the analyst has chosen all training samples, the method also uses the discarded training samples to determine a suitable SVM classifier parameter automatically. The method as a whole is therefore highly practical.

2) The training sample screening based on correlation coefficient redundancy proposed by the invention can effectively analyze the overall nonlinear correlation information under the joint action of multiple variables. It evaluates well the nonlinear correlation that arises from the large spatial distance between the pixels at the two ends of the line segment sampler, and uses this amount of information to assess the useful classification information of the variables as a whole, making the subsequent classifier computation more efficient and the classification results more accurate.

3) The invention takes into account the different characteristics of the one-to-many and one-to-one strategies. The former is better suited to the parameter determination stage, where training samples are few and the numbers of training samples per class are relatively balanced, and it objectively evaluates the classifier's performance on a single class. The latter is suited to multi-class problems with a large volume of test samples. The one-to-many strategy is therefore applied to parameter optimization and the one-to-one strategy to the final classification of test samples; combining the two improves the method's classification efficiency and performance.

4) The invention proposes a hyperspectral training sample screening scheme based on correlation coefficient redundancy and combines it with the SVM algorithm to form a hyperspectral data supervised classification method. Compared with the conventional SVM classification algorithm, it achieves high-accuracy classification while reducing computation time.

Brief description of the drawings

Fig. 1 is a flow chart of the invention;

Fig. 2 is the classification result scatter plot of the method of the invention;

Fig. 3 is the classification result scatter plot of the standard SVM method.

Embodiments

The technical solution of the invention is further described below with reference to the drawings, but is not limited thereto. Any modification of or equivalent substitution for the technical solution of the invention that does not depart from its spirit and scope shall fall within the protection scope of the invention.

Embodiment 1: As shown in Fig. 1, the hyperspectral data supervised classification method based on correlation coefficient redundancy provided by this embodiment consists of four steps, as follows:

Step 1: Automatically screen the training sample set required for supervised classification using the correlation coefficient redundancy.

1) For the captured hyperspectral remote sensing image, Row and Column denote the width and length of the image and B denotes its number of bands. The image analyst selects training samples with an E × 1 line segment sampler. For the subsequent screening to be meaningful, E ≥ 2 is required. Each selection thus yields training samples of E pixels, represented by vectors P_m (m = 1, ..., E), each of dimension B. Because each component of P_m holds the information of one band for that pixel, P_m is also called a pixel vector, or simply a pixel;

5) If more training samples need to be selected, return to 1); otherwise proceed to the steps below.

The correlation coefficient redundancy (CCR) used in the above steps is calculated as follows:

Before the correlation coefficient redundancy among multiple variables can be calculated, the nonlinear correlation coefficient (NCC) between each pair of variables must be computed. Consider two variables X and Y, each with B elements (the dimension of the vector), and let g be the number of states a variable can take. g must be smaller than the number of distinct values among the B elements; otherwise some states would be empty, producing singular computations. The states are assigned as follows. First, the elements of X and of Y are each sorted in ascending order. Then the first B/g values are assigned to the first state, the next B/g values to the second state, and so on; the minimum and maximum values of each state are called the state thresholds. Finally, the element pairs (X(1), Y(1)), (X(2), Y(2)), ..., (X(B), Y(B)) of X and Y are placed into a g × g two-dimensional state grid according to these state thresholds.

After this processing, the probability of any state of X or Y is p_i = 1/g, and the joint probability of X and Y is p_ij = n_ij / B, where n_ij is the number of element pairs in cell (i, j) of the two-dimensional state grid. The nonlinear correlation coefficient (NCC) is defined as:

NCC(X, Y) = H(X) + H(Y) - H(X, Y);

H(X, Y) = -\sum_{i=1}^{g} \sum_{j=1}^{g} p_{ij} \log_{g} p_{ij};

H(X) = H(Y) = -\sum_{i=1}^{g} p_{i} \log_{g} p_{i}.

Since p_i = 1/g, NCC simplifies to:

NCC(X, Y) = 2 + \sum_{i=1}^{g} \sum_{j=1}^{g} p_{ij} \log_{g} p_{ij}.
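The simplification rests on the marginal entropies: with equal-frequency states every p_i equals 1/g, so each marginal entropy is exactly 1. Written out, together with the two limiting cases this implies (a sketch derived from the definitions above, not an additional result stated in the source):

```latex
H(X) = -\sum_{i=1}^{g} \frac{1}{g} \log_{g} \frac{1}{g} = \sum_{i=1}^{g} \frac{1}{g} = 1,
\qquad H(Y) = 1
\;\Longrightarrow\; NCC(X, Y) = 2 - H(X, Y).

% Pairs confined to g cells, one per row and column of the grid:
%   p_{ij} = 1/g in those cells, H(X, Y) = 1, NCC(X, Y) = 1.
% Pairs spread evenly over all g^2 cells:
%   p_{ij} = 1/g^2, H(X, Y) = 2, NCC(X, Y) = 0.
```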

The distribution of the B element pairs (X(1), Y(1)), (X(2), Y(2)), ..., (X(B), Y(B)) of X and Y in the g × g two-dimensional state grid captures the general statistical correlation between the two variables, and can therefore measure the nonlinear correlation between them.
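The NCC computation described above can be sketched in plain Python as follows. The function names `assign_states` and `ncc` are hypothetical, states are numbered from 0, and ties are broken by element position, which is an assumption the text does not address. The sketch also assumes g is small enough that no state is empty.

```python
import math
from collections import Counter

def assign_states(values, g):
    """Rank-based equal-frequency binning: the smallest B/g values
    get state 0, the next B/g values state 1, and so on."""
    B = len(values)
    order = sorted(range(B), key=lambda i: values[i])  # indices by ascending value
    states = [0] * B
    for rank, idx in enumerate(order):
        states[idx] = rank * g // B
    return states

def ncc(x, y, g):
    """Simplified form NCC = 2 + sum_ij p_ij * log_g(p_ij)."""
    B = len(x)
    sx, sy = assign_states(x, g), assign_states(y, g)
    counts = Counter(zip(sx, sy))  # n_ij for the occupied cells of the g x g grid
    total = 0.0
    for n in counts.values():      # empty cells contribute nothing
        p = n / B
        total += p * math.log(p, g)
    return 2.0 + total
```

Because the states come from ranks rather than raw values, any strictly monotonic relation between X and Y, increasing or decreasing, gives an NCC of 1 in this sketch.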

For multiple variables, such as the M pixels P_1, ..., P_M involved in the invention, the degree of general correlation between every pair of variables can be described by the nonlinear correlation coefficient. The nonlinear correlation coefficient matrix of the M variables under study can therefore be written as:

R = \{ r_{uv} \}_{1 \leq u \leq M, \, 1 \leq v \leq M}, \quad r_{uv} = NCC(P_{u}, P_{v}).

Here P_u and P_v are any two of the M pixels P_1, ..., P_M, with 1 ≤ u ≤ M and 1 ≤ v ≤ M, and NCC(P_u, P_v) is the nonlinear correlation coefficient between the pixel variables P_u and P_v. Because a variable is completely correlated with itself, with nonlinear correlation coefficient 1, the diagonal elements of R are 1. The other elements satisfy 0 ≤ r_uv ≤ 1 (u ≠ v, u ≤ M, v ≤ M) and represent the degree of correlation between the u-th and v-th variables. When all variables are mutually uncorrelated, R is the identity matrix and the correlation among the variables is weakest. When all variables are completely correlated with each other, every element of R equals 1 and the correlation is strongest. The general correlation among the M variables is thus contained in R. To measure it quantitatively, we propose the concept of correlation coefficient redundancy:

CCR(P_{1}, \ldots, P_{M}) = \sum_{i=1}^{M} \left( \frac{\lambda_{i}}{M} \right)^{2}.

Here λ_i are the eigenvalues of the nonlinear correlation coefficient matrix R. Because R is only M × M, solving its characteristic equation requires little computation, so the training sample screening added by the invention brings no significant computational burden.
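A minimal sketch of the CCR and of the screening step it supports is given below. It relies on one identity that the source does not state: for a symmetric matrix, the sum of squared eigenvalues equals the sum of squared entries, so the CCR can be evaluated without an eigenvalue solver. Under that identity the CCR is 1/M when R is the identity matrix and 1 when every entry of R is 1. The names `ccr` and `select_pixels` are hypothetical, and `ncc` stands for any caller-supplied function returning the nonlinear correlation coefficient of two pixel vectors.

```python
from itertools import combinations

def ccr(R):
    """CCR = sum_i (lambda_i / M)^2 for a symmetric M x M matrix R.
    Uses sum_i lambda_i^2 = sum of squared entries of R (symmetric case)."""
    M = len(R)
    return sum(r * r for row in R for r in row) / (M * M)

def select_pixels(pixels, M, ncc):
    """Evaluate every M-subset of the E sampled pixels and return
    (indices, score) of the subset with the largest CCR."""
    best = None
    for subset in combinations(range(len(pixels)), M):
        R = [[1.0 if u == v else ncc(pixels[u], pixels[v]) for v in subset]
             for u in subset]
        score = ccr(R)
        if best is None or score > best[1]:
            best = (subset, score)
    return best
```

The pixels whose indices are not returned would go to the discarded sample set. The exhaustive loop stays cheap because E is small: a 5 × 1 sampler with M = 4 gives only 5 subsets.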

The strength of hyperspectral imaging equipment is its very high spectral resolution, typically about 10 nm, which to some extent causes the weakness of lower image resolution. The direct consequence is poor spatial resolution per pixel: each pixel covers a relatively large ground area. When selecting training samples, pixels in a contiguous region are therefore often chosen in the form of a line segment. A line segment selector differs from a circular (or approximately square) selector: the latter guarantees that the spatial distance between pixels in the region does not exceed the selector diameter, whereas the former can place the pixels at its two ends at a larger spatial distance. With this increased spatial distance, screening cannot rely on linear correlation information alone as the evaluation measure, which is why we designed the hyperspectral training sample screening based on correlation coefficient redundancy. The algorithm relies on the nonlinear correlation coefficient, which measures both linear and nonlinear correlation information, extends it to multi-variable input through matrix operations, and requires little computation.

Step 2: Optimize the parameter of the SVM kernel function.

1) In view of the characteristics of the different SVM kernel functions, the radial basis function (RBF), which has a symmetric inner product, is selected as the kernel function of the support vector machine. The RBF kernel is applicable to low-dimensional, high-dimensional, small-sample and large-sample cases alike and has a wide domain of convergence, making it an ideal basis function for classification. The RBF kernel is written K(P_i, P), where P is the vector of the input test pixel, P_i is the vector of a pixel in the training sample set, and σ is the width parameter controlling the radial range of the function. Considering the computational load that is acceptable in practice, σ is searched by traversing the values σ ∈ {0.01, 0.1, 1, 10};

2) The SVM algorithm is trained with the training sample set obtained in the preceding step as training data. Because the SVM classifier is a binary classifier, i.e. it can only distinguish two classes at a time, the following optimization scheme is adopted:

First, one class of pixels sharing the same class label is selected from the training sample set and given the new class label −1, while all remaining pixels in the training sample set are given the new class label 1. The pixels in the discarded sample set are relabeled in the same way. This constructs a standard binary classification problem, which can be classified with the SVM algorithm based on the RBF kernel function. The rate of correct classification over all test samples (here the discarded samples), i.e. the classification accuracy for that class, can then be calculated. This way of classifying is also called one-to-many strategy SVM classification;

Then the same processing is applied to every class in the training sample set, giving the classification accuracy for all classes. Averaging these accuracies gives the average classification accuracy.

Because the training sample set and the discarded sample set are obtained beforehand, changing σ yields different average classification accuracies, which can be used to evaluate each value of σ.
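The parameter search just described can be sketched as a small grid search. The names `relabel`, `select_sigma` and `evaluate` are hypothetical. In particular, `evaluate(sigma, train_pairs, test_pairs)` is assumed to train an RBF-kernel SVM on the relabeled training set and return its accuracy on the relabeled discarded set; the sketch leaves that routine to the caller.

```python
def relabel(pairs, target):
    """One-to-many relabeling: the target class becomes -1, every other class +1."""
    return [(pixel, -1 if cls == target else 1) for pixel, cls in pairs]

def select_sigma(train, discarded, evaluate, sigmas=(0.01, 0.1, 1, 10)):
    """train, discarded: lists of (pixel, class) pairs.
    Returns (sigma, mean_accuracy) for the sigma with the highest
    accuracy averaged over all classes."""
    classes = sorted({cls for _, cls in train})
    best_sigma, best_acc = None, -1.0
    for sigma in sigmas:
        accs = [evaluate(sigma, relabel(train, c), relabel(discarded, c))
                for c in classes]
        mean_acc = sum(accs) / len(accs)
        if mean_acc > best_acc:  # ties keep the earlier (smaller) sigma
            best_sigma, best_acc = sigma, mean_acc
    return best_sigma, best_acc
```

Keeping the SVM behind a callable makes the search logic independent of any particular SVM library, which the source does not name.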

The SVM classifier algorithm used in the above steps is as follows:

With the RBF kernel function specified, the classifier can be expressed as:

f(P) = \mathrm{sgn} \left( \sum_{i=1}^{N} y_{i} \alpha_{i} K(P_{i}, P) + b \right).

Here sgn(·) is the sign function, α = (α_1, α_2, ..., α_N) are the Lagrange multipliers, P is the vector of the input test pixel, P_i is the vector of a pixel in the training sample set, N is the total number of training samples, b is the threshold, and y_i ∈ {−1, 1} is the new class label, i.e. the numerical value actually substituted for each of the two classes during SVM computation.

By adding the constraint of maximizing the class margin, SVM converts the above problem into the dual problem of a convex quadratic program and solves it for α and b; the training samples corresponding to nonzero entries of α are the support vectors. This completes classifier training, and substituting a test sample into the classifier expression determines its class. Classifier training also involves a penalty factor that corrects for samples violating the margin. For image applications in the hyperspectral field, however, many experiments have shown that this factor has very little effect on classification results, so the invention does not traverse the penalty factor during parameter optimization.
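Once α and b are known, evaluating the classifier is a short computation, sketched below. The exact RBF expression is not preserved in this text, so the common form exp(−‖P − P_i‖² / (2σ²)) is used here as an assumption. The function names `rbf_kernel` and `decide` are hypothetical, and sgn(0) is mapped to +1 by convention.

```python
import math

def rbf_kernel(p_i, p, sigma):
    """Assumed RBF form exp(-||p - p_i||^2 / (2 * sigma^2));
    the source's exact expression is not available."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(p_i, p))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def decide(p, support_vectors, labels, alphas, b, sigma):
    """f(P) = sgn(sum_i y_i * alpha_i * K(P_i, P) + b), summed over
    the support vectors only (all other alphas are zero)."""
    s = b
    for sv, y, a in zip(support_vectors, labels, alphas):
        s += y * a * rbf_kernel(sv, p, sigma)
    return 1 if s >= 0 else -1
```

The loop runs once per support vector for every test pixel, so fewer support vectors translate directly into less classification time.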

Step 3: Complete the binary classification tasks on the hyperspectral remote sensing image with the SVM classifier algorithm.

Because the SVM classifier is a binary classifier, pixels of two classes h and k are selected from the training sample set as training data to train the SVM classifier with the known parameter σ. The class of each of the L test samples P_t, 1 ≤ t ≤ L, is then estimated, giving the class estimates of all test samples:

f_{h,k}(P_t), 1 ≤ t ≤ L.

If the training sample set contains only two classes, the subsequent step is unnecessary and f_{h,k}(P_t) completes the class decision for all test samples. Otherwise, another pair of different classes is selected and Step 3 is repeated until all pairwise class combinations have been computed.

Step 4: Realize the multi-class classification task based on the one-to-one strategy.

The one-to-one strategy constructs a classifier for every pair of classes and runs these classifiers in parallel; the final class of a test sample is decided by voting. This strategy makes each SVM's discrimination task easy and performs very well in training time.

Before the decision, the score function F_h(P_t), 1 ≤ t ≤ L, of test sample P_t for each class must be calculated. This function accumulates the positive and negative scores of the sub-classifiers from Step 3:

F_{h}(P_{t}) = \sum_{k=1, k \neq h}^{w} f_{h,k}(P_{t}), \quad 1 \leq t \leq L.

Here w is the total number of classes in the multi-class task, i.e. the number of classes contained in the training sample set.

The final decision of the one-to-one strategy follows the voting principle, and the final class label of each test sample is obtained from:

h^{*} = \underset{h = 1, \ldots, w}{\arg\max} \, F_{h}(P_{t}), \quad 1 \leq t \leq L.
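The score accumulation and the final vote can be sketched as below for a single test sample. Two conventions are assumptions, since the text does not fix them: an output of +1 from f_{h,k} is taken to favor class h, and f_{k,h} is taken as −f_{h,k}, so each pair needs to be stored only once. The function name `vote` is hypothetical and classes are numbered from 0.

```python
def vote(pairwise, num_classes):
    """pairwise maps (h, k) with h < k to f_{h,k}(P_t) in {-1, +1}.
    Returns (winning class index, list of scores F_h)."""
    scores = [0] * num_classes
    for (h, k), f in pairwise.items():
        scores[h] += f   # contribution of f_{h,k} to F_h
        scores[k] -= f   # contribution of f_{k,h} = -f_{h,k} to F_k
    winner = max(range(num_classes), key=lambda h: scores[h])
    return winner, scores
```

With w classes each score lies between −(w − 1) and w − 1, and a class that wins all of its pairwise contests reaches the maximum.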

Embodiment 2: This embodiment uses a standard hyperspectral image data set, the PaviaU data set. The data were captured with the ROSIS hyperspectral imager over the University of Pavia in northern Italy. The data set contains 103 bands with a single-band image size of 610 × 340, and researchers have produced a ground truth reference map from field surveys, so experiments on this data set can assess the actual classification accuracy of a classifier algorithm. The computer used for the experiments was configured with an Intel i5 2.5 GHz processor and 4 GB of memory.

Experiment 1: Classify the PaviaU data set with the method of the invention.

We chose the four classes with relatively many pixels in the PaviaU image for the experiment: Meadows, Asphalt, Trees and Painted metal sheets. Training samples were selected with a 5 × 1 line segment sampler, retaining 4 pixels each time, which gives 5 possible combinations for each screening. Each of the 4 classes was sampled 20 times, i.e. 100 training pixels were selected per class, of which 80 per class were retained after screening. The test samples of the 4 classes comprise 18549, 6531, 2964 and 1245 pixels respectively.

Perform Step 1: Automatically screen the training sample set required for supervised classification using the correlation coefficient redundancy.

For automatic screening, training samples are selected with a 5 × 1 line segment sampler and the number of pixels retained each time is set to 4, so 1 pixel is deleted automatically. After 20 independent sampling operations on each of the 4 classes, the resulting training sample set is:

{(P_1, Class_1), (P_2, Class_2), ..., (P_320, Class_320)}.

The discarded sample set is:

{(P_321, Class_321), (P_322, Class_322), ..., (P_400, Class_400)}.

Perform Step 2: Optimize the parameter of the SVM kernel function.

The values in {0.01, 0.1, 1, 10} are substituted one by one for the width parameter σ of the kernel function. For each class in the training sample set, a binary SVM is computed against all remaining classes and tested on the whole discarded sample set, and the classification accuracy for that class is recorded. This gives the classification accuracy of each of the four classes, and their mean is the average classification accuracy. The average classification accuracy reaches its maximum at σ = 0.1, so the kernel function parameter is set to 0.1.

Perform Step 3: Complete the binary classification tasks on the hyperspectral remote sensing image with the SVM classifier algorithm.

Each pair of classes in the training sample set {(P_1, Class_1), (P_2, Class_2), ..., (P_320, Class_320)} is used in turn as training data to train an SVM, which then classifies all test samples; the results are stored for use in the later step.

Because the experiment uses 4 classes in total, Step 3 is performed 6 times in total, once for each pair of classes.
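The number of binary classifiers follows from counting unordered class pairs: w classes give w(w − 1)/2 pairs. A short sketch enumerating them for the four classes of this experiment (the function name `class_pairs` is hypothetical):

```python
from itertools import combinations

def class_pairs(classes):
    """All unordered pairs (h, k) of distinct classes;
    w classes give w * (w - 1) / 2 pairs."""
    return list(combinations(classes, 2))

pairs = class_pairs(["Meadows", "Asphalt", "Trees", "Painted metal sheets"])
print(len(pairs))  # 6
```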

Perform Step 4: Realize the multi-class classification task based on the one-to-one strategy.

Using the results of the previous step, the class label of each test sample is finally decided by the voting principle of the one-to-one strategy.

Because the PaviaU data come with a ground truth reference map, we calculated the classification accuracy of each class against it. We also recorded the time consumed by each step and the total number of support vectors used by all binary classifiers.

Experiment 2: A comparison experiment for the classification method proposed by the invention.

The standard one-to-one strategy SVM classifier algorithm is used for comparison, with the width parameter set directly to the value σ = 0.1 found in Experiment 1. The comparison also uses the training samples selected in Experiment 1, i.e. 100 training pixels for each of the 4 classes, but without screening. The test samples are identical to those of Experiment 1. The test samples of the 4 classes are classified pixel by pixel and the classification accuracy of each class is obtained. The classification time and the total number of support vectors in Experiment 2 were also recorded.

Fig. 2 and Fig. 3 are the classification result scatter plots of Experiment 1 and Experiment 2 respectively. White points are misclassified pixels, black regions are correctly classified pixels, and gray regions are background, i.e. regions not chosen as training or test samples. A close look at the two figures shows that Experiment 1 has clearly fewer misclassified pixels. The classification accuracies are listed in Table 1. Compared with the conventional SVM method, the method of the invention improves the classification accuracy of all 4 classes, and the average classification accuracy improves by 0.96 percentage points.

Table 1: Classification accuracy of Experiment 1 and Experiment 2

	Experiment 1	Experiment 2
Class 1 classification accuracy (%)	97.29	96.51
Class 2 classification accuracy (%)	98.91	98.87
Class 3 classification accuracy (%)	87.13	85.53
Class 4 classification accuracy (%)	87.41	85.99
Average classification accuracy (%)	92.69	91.73

We have also added up the time loss of two experiments, refer to table 2.Wherein, experiment 1 be consuming timely divided into three parts, i.e. the screening training sample time of step one, the classifier parameters optimal time of step 2, it is consuming time that the tactful one to one SVM of step 3 and step 4 completes many classification task; Test 2 to be then only tactful SVM one to one to complete many classification task consuming time.The inventive method can effectively reduce the classification time, decreases 4.01 seconds with regard to this embodiment, though all included in steps totally consuming time on also have the advantage of 2.55 seconds.

Table 2. Running time statistics of Experiment 1 and Experiment 2

In addition, we counted the total number of support vectors in the two experiments. All binary classifiers of Experiment 1 produce 870 support vectors in total, while all binary classifiers of Experiment 2 produce 1102 support vectors in total, so the former uses 232 fewer support vectors than the latter. Thus, through a fast and effective automatic screening of the training samples, the method of the invention allows the subsequent SVM classifiers to obtain a reduced set of support vectors during training, which not only increases the classification accuracy but also directly reduces the computation time needed to classify the test samples.

Claims

1. A hyperspectral data supervised classification method based on correlation coefficient redundancy, characterized in that the steps of the method are as follows:

2) The manually selected training samples are screened automatically; the number of pixels retained each time is set to M, with M < E, so that E - M training-sample pixels are deleted;

4) The E pixels selected this time are uniformly assigned one determined category label Class^*; the M retained pixels are added to the training sample set in the form of data pairs (P_1^*, Class^*), ..., and the remaining E - M pixels are likewise added, in the form of data pairs, to the abandoned sample set;

5) If more training samples need to be selected, return to 1); otherwise proceed to Step 2;

Step 2: optimize the parameter of the SVM kernel function:

A. From the training sample set obtained in Step 1, select the pixels of one category, which share the same category label, and assign them the new category label -1; all remaining pixels in the training sample set are assigned the new category label 1. The same processing is applied to the pixels in the abandoned sample set, which constructs a standard two-class problem. This problem is classified with the SVM algorithm based on the RBF kernel function, and the classification accuracy for this category is computed, wherein σ is the width parameter of the RBF kernel and the following values of σ are tried by traversal: σ ∈ {0.01, 0.1, 1, 10};

B. In the same way, apply the above processing to each category in the training sample set, thereby obtaining the classification accuracy for all categories;

C. Average the classification accuracies of all categories to obtain the average classification accuracy;

D. Select the σ value that yields the maximum average classification accuracy as the parameter of the SVM classifier algorithm;
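
Steps A to D amount to a small grid search in which the abandoned sample set serves as validation data. The sketch below illustrates that loop under several assumptions: the Gaussian form of the RBF kernel is assumed because the kernel formula is not reproduced in this text, `kernel_mean_classifier` is a hypothetical stand-in for SVM training, the accuracy is measured over the whole relabeled abandoned set, and all function names and toy pixels are illustrative only.

```python
import math

SIGMA_GRID = (0.01, 0.1, 1, 10)  # candidate values listed in step A


def rbf(x, y, sigma):
    # Assumed Gaussian RBF form; the source text does not give the formula.
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2 * sigma ** 2))


def kernel_mean_classifier(train, sigma):
    # Hypothetical stand-in for SVM training: compares the mean kernel
    # similarity of a pixel to the +1 samples and to the -1 samples.
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == -1]

    def predict(x):
        s_pos = sum(rbf(x, p, sigma) for p in pos) / len(pos)
        s_neg = sum(rbf(x, n, sigma) for n in neg) / len(neg)
        return 1 if s_pos >= s_neg else -1

    return predict


def relabel(samples, target):
    # Step A: the chosen category becomes -1, every other category becomes +1.
    return [(x, -1 if c == target else 1) for x, c in samples]


def select_sigma(train_set, abandoned_set, fit=kernel_mean_classifier):
    # Requires at least two categories in train_set.
    categories = sorted({c for _, c in train_set})
    mean_acc = {}
    for sigma in SIGMA_GRID:
        accs = []
        for target in categories:  # steps A and B: one binary problem per category
            predict = fit(relabel(train_set, target), sigma)
            val = relabel(abandoned_set, target)
            accs.append(sum(predict(x) == y for x, y in val) / len(val))
        mean_acc[sigma] = sum(accs) / len(accs)  # step C: average over categories
    best = max(mean_acc, key=mean_acc.get)  # step D: sigma with the highest average
    return best, mean_acc


# Toy two-band "pixels" with category labels 1..3 (made-up values).
train_set = [((0.0, 0.0), 1), ((0.1, 0.0), 1), ((1.0, 1.0), 2),
             ((1.1, 1.0), 2), ((2.0, 0.0), 3), ((2.1, 0.0), 3)]
abandoned_set = [((0.05, 0.05), 1), ((1.05, 0.95), 2), ((2.05, 0.05), 3)]

best_sigma, mean_acc = select_sigma(train_set, abandoned_set)
print(best_sigma, mean_acc)
```

Passing a different `fit` callable, for example a wrapper around a real SVM trainer, leaves the search loop unchanged.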

From the training sample set, select the pixels of two categories h and k as training data to train the SVM classifier with the known parameter σ, and perform classification estimation on all L test samples P_t, 1 ≤ t ≤ L, thereby obtaining the classification estimates of all test samples:

f_{h,k}(P_t), 1 ≤ t ≤ L;

If the training sample set contains only two categories, Step 4 is not needed, and the category of every test sample can be decided from f_{h,k}(P_t). Otherwise, another pair of different categories is selected and Step 3 is repeated until all pairwise category combinations have been computed;

Step 4: realize the multi-class classification task based on the one-versus-one strategy:

2. The hyperspectral data supervised classification method based on correlation coefficient redundancy according to claim 1, characterized in that E ≥ 2.

3. The hyperspectral data supervised classification method based on correlation coefficient redundancy according to claim 1, characterized in that the correlation coefficient redundancy is defined as:

CCR(P_{1}, \ldots, P_{M}) = \sum_{i=1}^{M} \left( \frac{λ_{i}}{M} \right)^{2},

where λ_{i} are the eigenvalues of the nonlinear correlation coefficient matrix R.
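
As an illustration of this definition, the sketch below evaluates CCR for a given matrix R. It assumes R is real and symmetric, in which case the sum of squared eigenvalues equals the sum of squared matrix entries, so no eigendecomposition is needed. How R is built from the pixels is not covered here; the function name `ccr` and the example matrices are hypothetical.

```python
def ccr(R):
    """Correlation coefficient redundancy of a symmetric M x M matrix R."""
    M = len(R)
    # For a real symmetric matrix, sum(lambda_i ** 2) = trace(R * R),
    # which is the sum of the squared entries of R.
    sum_sq_eigs = sum(R[i][j] ** 2 for i in range(M) for j in range(M))
    return sum_sq_eigs / M ** 2


# Two limiting cases for M = 4 (made-up matrices).
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
all_ones = [[1.0] * 4 for _ in range(4)]

print(ccr(identity))  # mutually uncorrelated pixels: every eigenvalue is 1
print(ccr(all_ones))  # fully redundant pixels: eigenvalues are M, 0, ..., 0
```

Under this definition the value ranges from 1/M for mutually uncorrelated pixels up to 1 for fully redundant pixels.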

4. The hyperspectral data supervised classification method based on correlation coefficient redundancy according to claim 1, characterized in that the SVM classifier is expressed as:

f(P) = \left( \sum_{i=1}^{N} y_{i} α_{i} K(P_{i}, P) + b \right);
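
A minimal sketch of evaluating this expression is given below. It assumes a Gaussian RBF kernel, whose exact form is not reproduced in this text, and uses a hypothetical already-trained model with made-up support vectors, labels y_i, coefficients α_i and bias b; the function names are illustrative.

```python
import math


def rbf_kernel(p, q, sigma):
    # Assumed Gaussian RBF form with width parameter sigma.
    d2 = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.exp(-d2 / (2 * sigma ** 2))


def svm_decision(P, support_vectors, labels, alphas, b, sigma):
    """Evaluate f(P) = sum_i y_i * alpha_i * K(P_i, P) + b."""
    return sum(
        y * a * rbf_kernel(sv, P, sigma)
        for sv, y, a in zip(support_vectors, labels, alphas)
    ) + b


# Hypothetical trained model with one support vector per category.
svs = [(0.0,), (2.0,)]
ys = [1, -1]
alphas = [1.0, 1.0]

for point in [(0.0,), (1.0,), (2.0,)]:
    print(point, svm_decision(point, svs, ys, alphas, b=0.0, sigma=1.0))
```

Each test pixel costs one kernel evaluation per support vector, which is why a smaller set of support vectors shortens the classification time.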

5. The hyperspectral data supervised classification method based on correlation coefficient redundancy according to claim 1, characterized in that the final decision of the one-versus-one strategy follows the voting principle, and the final category label of each test sample is obtained according to the following formula:

h^{*} = \underset{h = 1, \ldots, w}{\arg\max} \, F_{h}(P_{t});

where F_{h}(P_{t}) is the score function of each category, 1 ≤ t ≤ L, and w is the total number of categories in the multi-class classification task, that is, the number of categories contained in the training sample set.

6. The hyperspectral data supervised classification method based on correlation coefficient redundancy according to claim 5, characterized in that the score function F_{h}(P_{t}) of each category is computed as follows:

F_{h}(P_{t}) = \sum_{k = 1, k \neq h}^{w} f_{h,k}(P_{t}), \quad 1 \leq t \leq L.
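
The voting rule of claims 5 and 6 can be sketched as follows. The sketch assumes that each pairwise output f_{h,k}(P_t) is +1 when the pixel is assigned to category h and -1 when it is assigned to category k, and that f_{k,h} = -f_{h,k}; neither convention is stated in the text. The pairwise classifiers here are hypothetical nearest-centre rules on a one-dimensional feature, standing in for trained SVMs.

```python
from itertools import combinations


def one_vs_one_label(P_t, categories, pairwise):
    """Return (h*, scores) with h* = argmax_h F_h(P_t).

    pairwise[(h, k)] with h < k is a callable returning +1 when the pixel is
    assigned to h and -1 when it is assigned to k (assumed convention).
    """
    scores = {h: 0 for h in categories}
    for h, k in combinations(sorted(categories), 2):
        vote = pairwise[(h, k)](P_t)
        scores[h] += vote  # adds f_{h,k}(P_t) to F_h
        scores[k] -= vote  # adds f_{k,h}(P_t) = -f_{h,k}(P_t) to F_k
    best = max(scores, key=scores.get)
    return best, scores


def make_pair(h, k):
    # Hypothetical pairwise classifier: category c is centred at the value c.
    return lambda x: 1 if abs(x - h) <= abs(x - k) else -1


cats = [1, 2, 3, 4]
pairwise = {(h, k): make_pair(h, k) for h, k in combinations(cats, 2)}

label, scores = one_vs_one_label(2.9, cats, pairwise)
print(label, scores)
```

With w = 4 categories, as in the embodiment, this requires w(w - 1)/2 = 6 pairwise classifiers.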