CN108446735A - A feature selection method based on differential-evolution-optimized neighborhood component analysis - Google Patents
A feature selection method based on differential-evolution-optimized neighborhood component analysis
- Publication number
- CN108446735A (application CN201810233510.1A)
- Authority
- CN
- China
- Prior art keywords
- feature
- vector
- population
- formula
- nca
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2135—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
Abstract
The present invention discloses a feature selection method based on differential-evolution-optimized neighborhood component analysis (NCA), which aims to solve the problem of how to improve the NCA algorithm from the perspective of its optimization procedure so as to obtain optimal feature weight coefficients. The method of the present invention uses a differential evolution algorithm to optimize the objective function of the NCA algorithm and thereby obtains globally optimal feature weight coefficients. Compared with the traditional NCA method, optimizing the NCA objective function with differential evolution ensures that the final weight coefficient vector is a globally optimal result rather than a local optimum. In addition, the method of the present invention differs from traditional NCA in that it never considers an objective function containing a regularization parameter, so there is no need to determine the size of such a parameter. The invented method can therefore be regarded as an improvement strategy applied to the traditional NCA method for classification-oriented feature selection.
Description
Technical field
The present invention relates to a feature selection method, and in particular to a feature selection method based on differential-evolution-optimized neighborhood component analysis.
Background art
In recent years, data mining methods have found wide application across all industries, and both theoretical and applied research on them has received extensive attention. In industrial informatization, the financial field, and the internet industry, considerable manpower and material resources have been invested in research on data mining and machine learning. Feature selection occupies an important position in data mining and machine learning. Although it is not itself a specific data mining or machine learning algorithm, feature selection can significantly improve the performance of subsequent data mining algorithms, and its positive effect is especially obvious when modeling high-dimensional data. For the classification models commonly used in pattern recognition, the model input is typically high-dimensional sample data, while the model output is the class label of each sample. Under the premise of applying the same classification algorithm, whether or not feature selection is performed on the input data makes a marked difference in classification accuracy, because building the classification model after feature selection removes the negative influence of many interfering features and thereby improves the precision of the model.
Many researchers have proposed solutions to the feature selection problem for different objects and different settings. Among them, neighborhood component analysis (Neighborhood Component Analysis, NCA) is a relatively novel feature selection algorithm that can be used specifically for feature selection before classification modeling. NCA optimizes the leave-one-out classification accuracy of the 1-nearest-neighbor method and thereby obtains a weight coefficient for each input feature; features whose weight coefficients are close to 0 are useless and can be removed. However, the optimization procedure that the traditional NCA method uses to solve for the feature weight coefficients easily falls into local optima, and the weight coefficients are also prone to overfitting. Although the degree of overfitting can be adjusted by introducing a regularization parameter, at present that parameter can only be chosen by means of cross-validation. The traditional NCA algorithm therefore needs further study and improvement.
Summary of the invention
The main technical problem to be solved by the present invention is how to improve the NCA algorithm from the perspective of its optimization procedure so as to obtain optimal feature weight coefficients. Specifically, the method of the present invention uses a differential evolution algorithm to optimize the objective function of the NCA algorithm, thereby obtaining globally optimal feature weight coefficients.
The technical solution adopted by the present invention to solve the above technical problem is a feature selection method based on differential-evolution-optimized neighborhood component analysis, comprising the following steps:
(1) Collect the sample data sets X_1, X_2, ..., X_C corresponding to the C application classes y_1, y_2, ..., y_C, where C denotes the total number of classes and the c-th data set X_c ∈ R^{N_c×m} contains N_c samples of m features, c = 1, 2, ..., C.
(2) Stack the data sets X_1, X_2, ..., X_C into a matrix X ∈ R^{N×m} and standardize X column by column to obtain X = [x_1, x_2, ..., x_N]^T ∈ R^{N×m}, so as to eliminate the influence of each feature's dimension, where N = N_1 + N_2 + ... + N_C and x_i ∈ R^{m×1} denotes the i-th sample.
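The column-wise standardization of step (2) can be sketched as follows. This is a minimal z-score example; the patent does not name the exact standardization scheme, so z-scoring is assumed, and the toy matrix is illustrative only:

```python
import numpy as np

# Toy data standing in for X: N = 4 samples, m = 3 features with very
# different scales (values are illustrative, not from the patent).
X = np.array([[1.0, 10.0, 100.0],
              [2.0, 20.0, 200.0],
              [3.0, 30.0, 300.0],
              [4.0, 40.0, 400.0]])

# Column-wise z-score standardization: subtract each feature's mean and
# divide by its standard deviation, so each feature's unit no longer matters.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
```

After this step every column has mean 0 and standard deviation 1, so no single feature dominates the distance computation of step (5) merely because of its units.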
(3) Set the parameters of the differential evolution algorithm: population size nP = 6m, scaling factor Z = 0.6, maximum number of iterations I_max ≥ 2000, and crossover probability p = 0.1.
(4) Randomly initialize an m × nP matrix W = [w_1, w_2, ..., w_nP], then set the iteration counter iter = 0 and k = 1.
(5) Take the k-th column of W as the individual w_k ∈ R^{m×1}, then compute the distance d_ij between any two samples x_i and x_j of X ∈ R^{N×m} according to d_ij = w_k^T |x_i − x_j|, where |x_i − x_j| takes the absolute value of every element of the vector x_i − x_j, and the subscripts i, j = 1, 2, ..., N.
(6) Compute the probability p_ij that x_i selects x_j as its reference point according to the following formula:
p_ij = exp(−d_ij) / Σ_{l≠i} exp(−d_il), with p_ii = 0 (1)
(7) Compute the objective function f_k of the k-th individual w_k according to f_k = Σ_i Σ_j z_ij p_ij, where z_ij is a binary indicator that takes the value 1 only when x_i and x_j belong to the same class.
(8) Check whether k < nP holds. If so, set k = k + 1 and return to step (5); if not, obtain the objective function vector F = [f_1, f_2, ..., f_nP], find the maximum value f_best in F and its corresponding individual w_best, and proceed to step (9).
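Steps (5)–(7) can be sketched as follows. This is a sketch, not the patent's code: the softmax form of p_ij is the standard NCA choice, assumed here because the formula image of step (6) is not reproduced in the text, and the toy data at the bottom is illustrative:

```python
import numpy as np

def nca_objective(w, X, y):
    """Objective f_k of steps (5)-(7) for one weight vector w:
    f = sum_i sum_j z_ij * p_ij, the expected number of correctly
    classified samples under stochastic nearest-neighbor selection."""
    # Step (5): weighted L1 distances d_ij = w^T |x_i - x_j|.
    diff = np.abs(X[:, None, :] - X[None, :, :])   # shape (N, N, m)
    d = diff @ w                                   # shape (N, N)
    # Step (6): p_ij -- probability that sample i picks sample j as its
    # reference point; p_ii = 0 enforces the leave-one-out rule.
    K = np.exp(-d)
    np.fill_diagonal(K, 0.0)
    p = K / K.sum(axis=1, keepdims=True)
    # Step (7): z_ij = 1 only when x_i and x_j share a class label.
    z = (y[:, None] == y[None, :]).astype(float)
    return float((z * p).sum())

# A weight that stresses a discriminative feature should score higher than
# an all-zero weight (zero weights make every reference point equally likely).
X = np.array([[0.0], [0.1], [5.0], [5.1]])
y = np.array([1, 1, 2, 2])
f_informative = nca_objective(np.array([1.0]), X, y)
f_zero = nca_objective(np.array([0.0]), X, y)
```

Note that f is bounded by N, the number of samples, since each row of p sums to 1; maximizing f therefore drives weights toward values that place same-class samples close together.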
(9) Generate a mutation vector v_k for each individual according to the following formula:
v_k = w_k + Z(w_best − w_k) + Z(w_a − w_b) (2)
where the subscripts a and b are two mutually different integers randomly drawn from the interval [1, nP].
(10) Correct the mutation vector v_k element-wise according to the following formula, where v_{k,n} denotes the n-th element of v_k, n = 1, 2, ..., m:
(11) Generate a trial vector u_k ∈ R^{m×1} according to the following formula:
u_{k,n} = v_{k,n} if rand_n < p, and u_{k,n} = w_{k,n} otherwise,
where u_{k,n} and w_{k,n} are the n-th elements of u_k and w_k respectively, each element of the vector rand ∈ R^{m×1} is a random number uniformly distributed between 0 and 1, and rand_n is the n-th element of rand.
(12) Update the individual w_k according to the following formula:
w_k = u_k if h(u_k) > f_k, and w_k is kept unchanged otherwise,
where h(u_k) denotes the objective function value obtained with u_k in place of w_k.
(13) Repeat steps (9)–(12) until all individuals have been updated, yielding a new matrix W, then set iter = iter + 1.
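One generation of the update loop in steps (9)–(13) can be sketched as follows. The mutation rule is formula (2); the crossover and selection rules are the standard DE binomial forms, assumed here because the corresponding formula images are not reproduced; the [0, 1] clipping in the correction step is likewise an assumption, and the quadratic toy objective merely stands in for the NCA objective:

```python
import numpy as np

rng = np.random.default_rng(0)

def de_generation(W, f, objective, Z=0.6, p=0.1):
    """One pass over the population (steps (9)-(13), maximization).
    W: (m, nP) population matrix; f: (nP,) objective values."""
    m, nP = W.shape
    w_best = W[:, np.argmax(f)].copy()
    for k in range(nP):
        # Two mutually different random indices a, b (step (9));
        # excluding k as well is a common DE convention.
        a, b = rng.choice([i for i in range(nP) if i != k], size=2, replace=False)
        # Mutation, formula (2): v_k = w_k + Z(w_best - w_k) + Z(w_a - w_b).
        v = W[:, k] + Z * (w_best - W[:, k]) + Z * (W[:, a] - W[:, b])
        # Correction (step (10)); clipping to [0, 1] is an assumed bound.
        v = np.clip(v, 0.0, 1.0)
        # Crossover (step (11)): take v_{k,n} where rand_n < p, else w_{k,n}.
        u = np.where(rng.random(m) < p, v, W[:, k])
        # Selection (step (12)): keep the trial vector only if it improves f_k.
        h_u = objective(u)
        if h_u > f[k]:
            W[:, k], f[k] = u, h_u
    return W, f

# Toy maximization target with optimum at w = 0.5 (illustrative only).
objective = lambda w: -float(np.sum((w - 0.5) ** 2))
m, nP = 3, 18
W = rng.random((m, nP))
f = np.array([objective(W[:, k]) for k in range(nP)])
f_start = f.max()
for _ in range(30):
    W, f = de_generation(W, f, objective)
```

Because selection only accepts improvements, the best objective value is non-decreasing across generations, which is what makes the final w_best the incumbent optimum when the iteration budget runs out.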
(14) Check whether iter > I_max holds. If not, return to step (5) and continue; if so, output the individual w_best corresponding to the maximum objective function value f_best, whose elements are the weight coefficients of the respective features.
(15) According to the magnitude of each element of w_best ∈ R^{m×1}, remove the features corresponding to elements close to 0; the remaining features are the result of feature selection.
Compared with conventional methods, the advantages of the method of the present invention are as follows. First, the method uses a differential evolution algorithm to optimize the objective function of the NCA algorithm, ensuring that the final weight coefficient vector is a globally optimal result rather than a local optimum. Second, unlike traditional NCA, the method never considers an objective function containing a regularization parameter, so there is no need to determine the size of such a parameter. The invented method can therefore be regarded as an improvement strategy applied to the traditional NCA method for classification-oriented feature selection.
Description of the drawings
Fig. 1 is a flowchart of the implementation of the method of the present invention.
Fig. 2 is a schematic diagram of the feature selection result of the method of the present invention.
Specific embodiments
The method of the present invention is described in detail below with reference to the accompanying drawings and a specific implementation case.
As shown in Fig. 1, the present invention discloses a feature selection method based on differential-evolution-optimized neighborhood component analysis. A two-class numerical case is designed below to verify the effectiveness of the method of the present invention.
A 500 × 20 data set X whose entries are uniformly distributed on the interval [0, 1] is generated randomly. The samples in X that satisfy the condition X_3 · X_9 / X_15 < 0.4 are given the class label y_1 = 1, and all other samples, which do not satisfy the condition, are given the class label y_2 = 2.
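The synthetic data set of this case can be generated as follows. This is a sketch: the random seed is arbitrary, and since the text numbers features from 1, the 0-based column indices are 2, 8 and 14:

```python
import numpy as np

rng = np.random.default_rng(42)  # seed is arbitrary, not from the patent

# 500 x 20 matrix with entries uniformly distributed on [0, 1].
X = rng.random((500, 20))

# Label rule of the case: y = 1 where X_3 * X_9 / X_15 < 0.4, else y = 2
# (1-based feature numbers 3, 9, 15 -> 0-based columns 2, 8, 14).
y = np.where(X[:, 2] * X[:, 8] / X[:, 14] < 0.4, 1, 2)
```

Only columns 3, 9 and 15 determine the label, so a correct feature selector should return exactly those three features.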
(1) The training data set above consists of two classes of sample data, and the feature selection result ought to pick out the features corresponding to columns 3, 9 and 15 of the data set X. The method of the present invention is now applied as follows.
(2) Standardize X column by column to obtain X = [x_1, x_2, ..., x_500]^T ∈ R^{500×20}, so as to eliminate the influence of each feature's dimension.
(3) Set the parameters of the differential evolution algorithm: population size nP = 120 (i.e., 6m with m = 20), scaling factor Z = 0.6, maximum number of iterations I_max = 2000, and crossover probability p = 0.1.
(4) Randomly initialize an m × nP matrix W = [w_1, w_2, ..., w_120], then set the iteration counter iter = 0 and k = 1.
(5) Take the k-th column of W as the individual w_k ∈ R^{m×1}, then compute the distance d_ij between any two samples x_i and x_j according to d_ij = w_k^T |x_i − x_j|.
(6) Compute the probability p_ij that x_i selects x_j as its reference point.
(7) Compute the objective function f_k of the k-th individual w_k according to f_k = Σ_i Σ_j z_ij p_ij.
(8) Check whether k < 120 holds. If so, set k = k + 1 and return to step (5); if not, obtain the objective function vector F = [f_1, f_2, ..., f_120], find the maximum value f_best in F and its corresponding individual w_best, and proceed to step (9).
(9) Generate a mutation vector v_k for each individual.
(10) Correct the mutation vector v_k.
(11) Generate a trial vector u_k ∈ R^{m×1}.
(12) Update the individual w_k.
(13) Repeat steps (9)–(12) until all individuals have been updated, yielding a new matrix W, then set iter = iter + 1.
(14) Check whether iter > I_max holds. If not, return to step (5) and continue; if so, output the individual w_best corresponding to the maximum objective function value f_best, whose elements are the weight coefficients of the respective features.
(15) According to the magnitude of each element of w_best ∈ R^{20×1}, remove the features corresponding to elements close to 0; the remaining features are the result of feature selection.
As shown in Fig. 2, which gives the scatter plot of each feature's weight coefficient, the method of the present invention correctly selects the relevant features.
The implementation case above only illustrates a specific embodiment of the present invention and is not intended to limit the invention. Any modification made to the present invention within the spirit of the invention and the scope of the claims falls within the protection scope of the present invention.
Claims (1)
1. A feature selection method based on differential-evolution-optimized neighborhood component analysis, characterized by comprising the following steps:
Step (1): collect the sample data sets X_1, X_2, ..., X_C corresponding to the C application classes y_1, y_2, ..., y_C, where C denotes the total number of classes and the c-th data set X_c ∈ R^{N_c×m} contains N_c samples of m features, c = 1, 2, ..., C;
Step (2): stack the data sets X_1, X_2, ..., X_C into a matrix X ∈ R^{N×m} and standardize X column by column to obtain X = [x_1, x_2, ..., x_N]^T ∈ R^{N×m}, so as to eliminate the influence of each feature's dimension, where N = N_1 + N_2 + ... + N_C, x_i ∈ R^{m×1} denotes the i-th sample, and the superscript T denotes the transpose of a matrix or vector;
Step (3): set the parameters of the differential evolution algorithm: population size nP = 6m, scaling factor Z = 0.6, maximum number of iterations I_max ≥ 2000, and crossover probability p = 0.1;
Step (4): randomly initialize an m × nP matrix W = [w_1, w_2, ..., w_nP], then set the iteration counter iter = 0 and k = 1;
Step (5): take the k-th column of W as the individual w_k ∈ R^{m×1}, then compute the distance d_ij between any two samples x_i and x_j of X ∈ R^{N×m} according to d_ij = w_k^T |x_i − x_j|, where |x_i − x_j| takes the absolute value of every element of the vector x_i − x_j, and the subscripts i, j = 1, 2, ..., N;
Step (6): compute the probability p_ij that x_i selects x_j as its reference point according to the following formula:
p_ij = exp(−d_ij) / Σ_{l≠i} exp(−d_il), with p_ii = 0 (1)
Step (7): compute the neighborhood-component-analysis objective function f_k of the k-th individual w_k according to f_k = Σ_i Σ_j z_ij p_ij, where z_ij is a binary indicator that takes the value 1 only when x_i and x_j belong to the same class;
Step (8): check whether k < nP holds; if so, set k = k + 1 and return to step (5); if not, obtain the objective function vector F = [f_1, f_2, ..., f_nP], find the maximum value f_best in F and its corresponding individual w_best, and proceed to step (9);
Step (9): generate a mutation vector v_k for each individual according to the following formula:
v_k = w_k + Z(w_best − w_k) + Z(w_a − w_b) (2)
where the subscripts a and b are two mutually different integers randomly drawn from the interval [1, nP];
Step (10): correct the mutation vector v_k element-wise according to the following formula, where v_{k,n} denotes the n-th element of v_k, n = 1, 2, ..., m:
Step (11): generate a trial vector u_k ∈ R^{m×1} according to the following formula:
u_{k,n} = v_{k,n} if rand_n < p, and u_{k,n} = w_{k,n} otherwise,
where u_{k,n} and w_{k,n} are the n-th elements of u_k and w_k respectively, each element of the vector rand ∈ R^{m×1} is a random number uniformly distributed between 0 and 1, and rand_n is the n-th element of rand;
Step (12): update the individual w_k according to the following formula:
w_k = u_k if h(u_k) > f_k, and w_k is kept unchanged otherwise,
where h(u_k) denotes the objective function value obtained with u_k in place of w_k;
Step (13): repeat steps (9)–(12) until all individuals have been updated, yielding a new matrix W, then set iter = iter + 1;
Step (14): check whether iter > I_max holds; if not, return to step (5) and continue; if so, output the individual w_best corresponding to the maximum objective function value f_best, whose elements are the weight coefficients of the respective features;
Step (15): according to the magnitude of each element of w_best ∈ R^{m×1}, remove the features corresponding to elements close to 0; the remaining features are the result of feature selection.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810233510.1A CN108446735A (en) | 2018-03-06 | 2018-03-06 | A kind of feature selection approach optimizing neighbour's constituent analysis based on differential evolution |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108446735A true CN108446735A (en) | 2018-08-24 |
Family
ID=63196015
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810233510.1A Withdrawn CN108446735A (en) | 2018-03-06 | 2018-03-06 | A kind of feature selection approach optimizing neighbour's constituent analysis based on differential evolution |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108446735A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109407649A (en) * | 2018-10-09 | 2019-03-01 | 宁波大学 | A kind of fault type matching process based on fault signature variables choice |
CN109636487A (en) * | 2019-01-14 | 2019-04-16 | 平安科技(深圳)有限公司 | Advertisement sending method, server, computer equipment and storage medium |
CN109636487B (en) * | 2019-01-14 | 2023-09-29 | 平安科技(深圳)有限公司 | Advertisement pushing method, server, computer device and storage medium |
CN113191616A (en) * | 2021-04-18 | 2021-07-30 | 宁波大学科学技术学院 | Polypropylene product quality abnormity detection method based on double-layer correlation characteristic analysis |
CN113191616B (en) * | 2021-04-18 | 2023-01-24 | 宁波大学科学技术学院 | Polypropylene product quality abnormity detection method based on double-layer correlation characteristic analysis |
CN113177608A (en) * | 2021-05-21 | 2021-07-27 | 河南大学 | Neighbor model feature selection method and device for incomplete data |
CN113177608B (en) * | 2021-05-21 | 2023-09-05 | 河南大学 | Neighbor model feature selection method and device for incomplete data |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WW01 | Invention patent application withdrawn after publication | |

Application publication date: 20180824