CN109447178A - An SVM classification method based on a mixed kernel function - Google Patents

An SVM classification method based on a mixed kernel function

Info

Publication number
CN109447178A
CN109447178A (application CN201811343420.4A)
Authority
CN
China
Prior art keywords
kernel function
mixed
support vector
sample
function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811343420.4A
Other languages
Chinese (zh)
Inventor
朱芳
陈得宝
纵海宝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huaibei Normal University
Original Assignee
Huaibei Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huaibei Normal University filed Critical Huaibei Normal University
Priority to CN201811343420.4A priority Critical patent/CN109447178A/en
Publication of CN109447178A publication Critical patent/CN109447178A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines

Abstract

The invention discloses an SVM classification method based on a mixed kernel function. One: collect a data set, analyze each sample record in the collected data set, distinguish the different attributes of the samples, and determine the input and output samples. Two: select and construct the kernel function by mixing an exponential kernel function with a radial basis kernel function. Three: optimize the parameters of the mixed kernel function. Four: select the C-SVC model and establish a support vector classification model based on the novel mixed kernel function. Five: perform classification prediction with the established support vector classification model. The invention makes full use of the global character of the exponential kernel and the local character of the radial basis kernel, and optimizes the model parameters with a particle swarm optimization algorithm with Gaussian mutation, improving the overall performance of the support vector machine. The learning and generalization performance of the global exponential kernel exceeds that of other single kernel functions, and the SVM built on the novel mixed kernel clearly outperforms SVMs built on other mixed kernel functions.

Description

An SVM classification method based on a mixed kernel function
Technical field
The present invention relates to SVM classification methods, and in particular to an SVM classification method based on a mixed kernel function.
Background technique
SVM (Support Vector Machine) refers to the support vector machine, a widely used discrimination method. In machine learning it is a supervised learning model, commonly applied to pattern recognition, classification, and regression analysis. The SVM method maps the sample space through a nonlinear mapping p into a high-dimensional, possibly infinite-dimensional, feature space (a Hilbert space), so that a problem that is nonlinearly separable in the original sample space becomes linearly separable in the feature space. In short, it raises the dimension and linearizes. Raising the dimension means mapping the samples into a higher-dimensional space; in general this increases the complexity of the computation and can even cause the "curse of dimensionality". For problems such as classification and regression, however, a sample set that cannot be processed linearly in the low-dimensional space may well admit a linear partition (or regression) through a linear hyperplane in the high-dimensional feature space. Although raising the dimension normally complicates the computation, the SVM method solves this problem elegantly: it introduces a kernel function, so that the explicit expression of the nonlinear mapping need not be known. Because the linear learning machine is built in the high-dimensional feature space, the computational complexity hardly increases compared with a linear model, and to a certain extent the "curse of dimensionality" is avoided.
At this stage, SVM has been applied in many fields: face detection, fault diagnosis of turbo-generator sets, classification, regression, clustering, time series prediction, system identification, financial engineering, biomedical signal processing, data mining, bioinformatics, text mining, adaptive signal processing, splice site recognition, database learning algorithms based on support vector machines, recognition of similar handwritten characters, SVM function fitting in fractal interpolation, initial alignment of inertial navigation systems, rockburst prediction, defect recognition, computer keyboard user authentication, automatic positioning and extraction of video captions, and speaker verification.
Kernel functions project the original samples from a low-dimensional space into a high-dimensional space, and are one of the important tools with which the support vector machine (SVM) handles nonlinearly inseparable problems: they convert samples from a nonlinearly inseparable state in the original space into an approximately linearly separable state in the high-dimensional space. The type of kernel function determines the inner product of the mapping and the size of the high-dimensional space, and thereby changes the complexity of the sample distribution in that space. If the dimension of the projected space is large, the resulting model is more complex; the empirical risk is small but the confidence interval is large, and overfitting is likely. The converse holds for a small dimension.
For the selection or construction of kernel functions there is at present no unified rule; kernels are generally chosen empirically, and in theory any function satisfying the Mercer condition can serve as a kernel function. It is therefore necessary to design an efficient kernel function that projects the data into a suitable high-dimensional space and yields better classification.
Summary of the invention
It is an object of the invention to overcome the defects of the prior art and to provide an SVM classification method based on a mixed kernel function.
To achieve the above object, the technical scheme adopted by the invention is an SVM classification method based on a mixed kernel function, with the following steps:
One, collect a data set, analyze each sample record in the collected data set, distinguish the different attributes of the samples, and determine the input and output samples;
Two, select and construct the kernel function: mix the exponential global kernel function, which satisfies the Mercer condition, with the radial basis kernel function to obtain the novel mixed kernel function:
K_mix(x, y) = t·K_exp(x, y) + (1 − t)·K_RBF(x, y)
where the parameter t is the proportion of the two kernel functions, σ is the kernel width of the radial basis kernel function, γ is the parameter of the exponential kernel function, and x, y are the input and output sample pairs in the sample set;
Three, optimize the parameters of the novel mixed kernel function;
Four, select the C-SVC model and establish the support vector classification model based on the novel mixed kernel function;
Five, perform classification prediction with the established support vector classification model.
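The mixed kernel of step two can be sketched in Python. The patent's formula images are not reproduced in this text, so the concrete forms below are assumptions: the radial basis kernel is taken in its standard form exp(−‖x−y‖²/(2σ²)), the exponential kernel as exp(−γ‖x−y‖), and t weights the two as a convex combination.

```python
import numpy as np

def mixed_kernel(X, Y, t=0.5, sigma=1.0, gamma=1.0):
    """Mixed kernel K = t*K_exp + (1 - t)*K_rbf on the rows of X and Y.

    Assumed concrete forms (the patent's formula images are not shown here):
        K_rbf(x, y) = exp(-||x - y||^2 / (2 * sigma^2))
        K_exp(x, y) = exp(-gamma * ||x - y||)
    """
    # Pairwise squared Euclidean distances between rows of X and rows of Y
    sq = (np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    sq = np.maximum(sq, 0.0)  # guard against tiny negative values from rounding
    k_rbf = np.exp(-sq / (2.0 * sigma**2))
    k_exp = np.exp(-gamma * np.sqrt(sq))
    return t * k_exp + (1.0 - t) * k_rbf
```

Because both summands are valid kernels and t ∈ [0, 1], the resulting Gram matrix stays symmetric and positive semi-definite; such a callable can be passed directly to an SVM implementation that accepts custom kernels.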
Further, the optimization uses a particle swarm optimization algorithm with Gaussian mutation to optimize the three parameters σ, γ, t of the novel mixed kernel function.
Further, the steps of the particle swarm optimization algorithm with Gaussian mutation are as follows:
1. Set the parameters: the population size, the maximum number of iterations, the local search coefficient C1, the global search coefficient C2 and the inertia weight ω, and limit the particle search region and the swarm flying speed;
2. Initialize the position X_i and velocity V_i of each particle (σ, γ, t);
3. Compute the fitness value of each particle;
4. Determine the global extremum and the individual extrema;
5. Iterative search: judge whether the maximum classification accuracy has been reached; if so, go to step 8, otherwise continue;
6. Update the velocity and position of each particle according to
V_i ← ω·V_i + C1·r1·(P_i − X_i) + C2·r2·(P_g − X_i) and X_i ← X_i + V_i,
where P_i is the best position found by particle i, P_g is the best position found by the swarm, and r1, r2 are uniform random numbers in [0, 1];
7. Randomly select one particle and apply Gaussian mutation to it to keep the population diverse, then return to step 3;
8. Output the optimal solution.
The beneficial effects of the method are as follows. The novel mixed kernel function, built from the exponential global kernel satisfying the Mercer condition and the radial basis kernel, has good extrapolation ability, so that even data points far apart influence the kernel value, as well as strong learning ability, so that nearby local information also has influence. The particle swarm optimization algorithm with Gaussian mutation simultaneously optimizes the parameters of the support vector classification model based on the novel mixed kernel, giving a higher classification accuracy than other common mixed kernel functions.
Detailed description of the invention
The present invention is further elaborated below with embodiments and with reference to the accompanying drawings.
Fig. 1 is the output characteristic curve of the RBF kernel of the present invention;
Fig. 2 is the output characteristic curve of the exponential kernel of the present invention;
Fig. 3 is the curve of the mixed kernel function of the present invention at the test point (γ = 0.2);
Fig. 4 is the curve of the mixed kernel function of the present invention at the test point (σ = 0.2).
Specific embodiment
Embodiment 1
An SVM classification method based on a mixed kernel function, with the following steps:
One, collect a data set, analyze each sample record in the collected data set, distinguish the different attributes of the samples, and determine the input and output samples;
Two, select and construct the kernel function: mix the exponential kernel function with the radial basis kernel function;
Three, optimize the parameters of the mixed kernel function with the particle swarm optimization algorithm with Gaussian mutation;
Four, select the C-SVC model and establish the support vector classification model based on the novel mixed kernel function;
Five, perform classification prediction with the established support vector classification model.
The novel mixed kernel function is generated as follows.
The concept of a kernel function: suppose m training samples {x_1, x_2, ..., x_m} are given, each x_i corresponding to a feature vector. If there is a function K(x_i, x_j) satisfying K(x_i, x_j) = Φ(x_i)ᵀΦ(x_j) and satisfying the Mercer condition, then K(x_i, x_j) is called a kernel function, where Φ(x) is some nonlinear mapping function. The Mercer condition states that for any g(x) ∈ L²(Rⁿ) with g ≠ 0, one has ∫∫ K(x_i, x_j) g(x_i) g(x_j) dx_i dx_j ≥ 0. Equivalently, K(x_i, x_j) is a valid kernel function for the training samples {x_1, x_2, ..., x_m} if and only if the corresponding kernel matrix is symmetric and positive semi-definite.
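The finite-sample form of the Mercer condition, as stated above, can be checked numerically: the Gram matrix on the training samples must be symmetric and positive semi-definite. A minimal sketch, using the exponential kernel form exp(−γ‖x−y‖) as an assumption (the patent's formula image is not reproduced here):

```python
import numpy as np

def is_valid_kernel_matrix(K, tol=1e-8):
    """Finite-sample Mercer check: K must be symmetric and positive semi-definite."""
    if not np.allclose(K, K.T, atol=tol):
        return False
    # For a symmetric matrix, PSD is equivalent to all eigenvalues being >= 0
    return bool(np.all(np.linalg.eigvalsh(K) >= -tol))

# Gram matrix of the (assumed) exponential kernel exp(-gamma*||x - y||)
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))
dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
K = np.exp(-0.7 * dist)
print(is_valid_kernel_matrix(K))  # True: the exponential Gram matrix passes
```

The same check rejects a symmetric matrix with a negative eigenvalue, which is what distinguishes an arbitrary similarity matrix from a valid kernel matrix.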
Design of the mixed kernel function. According to their characteristics, kernel functions can be divided into two broad classes: global kernel functions and local kernel functions. Global kernel functions have a global character: their interpolation ability is weaker, but they are good at extracting global properties of the samples; examples are the linear and polynomial kernel functions. Local kernel functions have a local character: their local learning ability is higher than their global learning ability. The local kernel function used in this method is the radial basis kernel function (Radial Basis Function, RBF):
K_RBF(x, y) = exp(−‖x − y‖² / (2σ²))
Fig. 1 shows the output curves of the RBF kernel when the test point is 0.2 and σ takes 0.1, 0.2, 0.5, 0.8 and 1; it is a typical local kernel function. As the figure shows, the kernel has a large influence in the region near the test point, while its output approaches 0 as the distance from the test point grows.
A global kernel function of exponential form, which satisfies the Mercer condition and is therefore a valid kernel function, is used:
K_exp(x, y) = exp(−γ‖x − y‖)    (3)
Proof: for the m training samples {x_1, x_2, ..., x_m}, substituting any two samples x_i and x_j into formula (3) gives K(x_i, x_j) = exp(−γ‖x_i − x_j‖) = exp(−γ‖x_j − x_i‖) = K(x_j, x_i).
Hence the corresponding m × m matrix K is a symmetric matrix.
Suppose each training sample has k attribute features, and let Φ_k(x) denote the k-th attribute value of the mapping function Φ(x) that maps the original-space sample x into the high-dimensional space. Then for any vector z,
zᵀKz = Σ_i Σ_j z_i z_j Φ(x_i)ᵀΦ(x_j) = ‖Σ_i z_i Φ(x_i)‖² ≥ 0.
Therefore the kernel matrix K obtained on the training set is positive semi-definite (K ⪰ 0).
In summary, the exponential kernel matrix K is symmetric and positive semi-definite, so the Mercer condition is satisfied; the exponential kernel function is a valid kernel function and can be used for kernel learning. Fig. 2 shows the output curves of the exponential kernel when the test point is again 0.2 and γ takes 0.1, 0.2, 0.5, 0.8 and 1. As the figure shows, this kernel has good extrapolation ability: even data points far apart influence the kernel value. Although the output characteristic of the exponential kernel is similar to that of the polynomial kernel, its prototype has only one parameter γ, whereas the polynomial kernel has three parameters γ, r and d to select, so the dependence on parameters is reduced.
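The contrasting tails described for Figs. 1 and 2 can be reproduced numerically. This is a minimal sketch under the assumed standard forms exp(−(x−y)²/(2σ²)) for the RBF kernel and exp(−γ|x−y|) for the exponential kernel, since the formula images are not reproduced in this text:

```python
import numpy as np

# Assumed kernel forms (the patent's formula images are not reproduced here):
#   RBF, local:          exp(-(x - y)^2 / (2 * sigma^2))
#   exponential, global: exp(-gamma * |x - y|)
def k_rbf(x, y, sigma):
    return np.exp(-(x - y) ** 2 / (2.0 * sigma ** 2))

def k_exp(x, y, gamma):
    return np.exp(-gamma * np.abs(x - y))

test_point = 0.2
far_point = -1.0  # a point far from the test point, as at the edge of Figs. 1-2
rbf_far = k_rbf(far_point, test_point, sigma=0.2)
exp_far = k_exp(far_point, test_point, gamma=0.2)
# The RBF response is essentially zero far from the test point (local behaviour),
# while the exponential kernel still responds strongly (global behaviour).
print(rbf_far, exp_far)
```

The heavier tail of the exponential kernel is what gives it the extrapolation ability the text attributes to global kernels.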
By RBF kernel function in conjunction with above-mentioned exponential kernel functions, for the advantages of both giving full play to, formed of the invention novel Mixed kernel function.
According to the definition of kernel function, if K1、K2It is kernel function, then there is α > 0, β > 0 is for arbitrary vector z, zTα K1Z > 0, zTβK2Z > 0.Therefore, zTαK1z+zTβk2Z=zT(αK1+βK2) z > 0, (α, β > 0) to get arrive α K1+βK2Nuclear moment Battle array is positive definite namely α K1+βK2It is kernel function.Formula is as follows:
In formula, parameter t is ratio shared by two kinds of kernel functions, generally takes 0≤t≤1.Combination nova kernel function is in test point Output when 0.2 is illustrated in fig. 3 shown below.
Kernel parameter optimization based on PSO. The novel kernel function has three parameters (σ of the RBF kernel, γ of the new exponential kernel, and the kernel proportion t) that need to be optimized; their values directly affect the result of the SVM model. A particle swarm optimization algorithm (Particle Swarm Optimization, PSO), which is structurally simple and easy to implement, is used to optimize the model parameters.
The present invention uses a support vector machine with penalty parameter C (C-SVC), so the parameter C also needs to be optimized.
The specific steps of the algorithm are as follows:
Step 1: parameter setting. The population size is set to 20 and the maximum number of iterations to 200; the local search coefficient is c1 = 1.5 and, to drive the particles toward the global optimum, the global search coefficient is set to c2 = 1.7; the inertia weight ω is 1; the particle search regions are C ∈ [0.1, 100], σ ∈ [0.1, 100], γ ∈ [0.01, 10], t ∈ [0.01, 1]; the swarm velocity ranges are [−0.6·C_max, 0.6·C_max], [−0.6·σ_max, 0.6·σ_max], [−0.6·γ_max, 0.6·γ_max], [−0.6·t_max, 0.6·t_max];
Step 2: initialize the position and velocity of each particle (C, σ, γ, t): position X_i = (X_max − X_min)·rand + X_min; velocity V_i = V_max·rands;
Step 3: compute the fitness value of each particle, taking the classification accuracy on the training set as the fitness function;
Step 4: keep the best position P_i found by each individual up to the current time and the best position P_g of the current generation of the swarm;
Step 5: iterative search: judge whether the maximum classification accuracy has been reached; if so, go to Step 8, otherwise execute Steps 6-7;
Step 6: update the velocity and position of each particle according to
V_i ← ω·V_i + c1·r1·(P_i − X_i) + c2·r2·(P_g − X_i), X_i ← X_i + V_i,
where r1, r2 are uniform random numbers in [0, 1];
Step 7: randomly select one particle and apply the Gaussian mutation X_i = X_i × g, g ∈ N(0, 1), to keep the population diverse, then return to Step 3;
Step 8: output the optimal solution.
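Steps 1-8 can be sketched as a compact maximization routine. This is a minimal sketch, not the patent's implementation: the true fitness function is the SVM training accuracy over (C, σ, γ, t), which is replaced here by a simple stand-in peak so the block is self-contained; the velocity limits, coefficients, and the X_i × g mutation follow the steps above.

```python
import numpy as np

def pso_gaussian_mutation(fitness, lo, hi, n_particles=20, n_iter=200,
                          c1=1.5, c2=1.7, w=1.0, seed=0):
    """Minimal PSO with Gaussian mutation (Steps 1-8 above); maximizes `fitness`."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(lo, dtype=float), np.asarray(hi, dtype=float)
    vmax = 0.6 * hi                               # Step 1: velocity limits
    X = (hi - lo) * rng.random((n_particles, lo.size)) + lo    # Step 2: positions
    V = vmax * (2.0 * rng.random((n_particles, lo.size)) - 1)  # Step 2: velocities
    pfit = np.array([fitness(x) for x in X])      # Step 3: fitness values
    pbest = X.copy()                              # Step 4: individual best positions
    gbest = pbest[np.argmax(pfit)].copy()         # Step 4: global best position
    gfit = pfit.max()
    for _ in range(n_iter):                       # Step 5: iterate to the budget
        r1, r2 = rng.random((2, n_particles, lo.size))
        V = w * V + c1 * r1 * (pbest - X) + c2 * r2 * (gbest - X)  # Step 6
        V = np.clip(V, -vmax, vmax)
        X = np.clip(X + V, lo, hi)
        k = rng.integers(n_particles)             # Step 7: mutate one particle
        X[k] = np.clip(X[k] * rng.standard_normal(lo.size), lo, hi)
        fit = np.array([fitness(x) for x in X])   # back to Step 3
        better = fit > pfit
        pbest[better], pfit[better] = X[better], fit[better]
        if pfit.max() > gfit:
            gfit = pfit.max()
            gbest = pbest[np.argmax(pfit)].copy()
    return gbest, gfit                            # Step 8: optimal solution

# Stand-in fitness: in the patent it is the SVM training accuracy as a function
# of (C, sigma, gamma, t); here a simple peak at (0.5, 0.5) for illustration.
best, best_fit = pso_gaussian_mutation(lambda p: -np.sum((p - 0.5) ** 2),
                                       lo=[0.01, 0.01], hi=[1.0, 1.0])
```

The multiplicative mutation X_i × g occasionally throws a particle across the search box, which is what keeps the population diverse when the swarm has already converged.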
Embodiment 2
The Wine data set records the chemical composition analysis of wines of three different varieties grown in the same region of Italy; the data set has 178 samples, each with 13 attributes.
To verify the validity of the novel mixed kernel function of the present invention, support vector classification models based on the mixed kernel function are established on the Wine data set under 10 random groupings, the parameters involved are optimized with PSO, and classification prediction experiments are carried out. The random groupings reflect the adaptability of the novel mixed kernel function.
The results obtained are compared with the classification results of support vector classification models constructed with other kernel functions, as shown in Table 1 below.
Table 1. Classification accuracy of different kernel functions on random groupings of the Wine data set
As Table 1 shows, across the random divisions of the Wine data set into training and test sets, the training-set and test-set classification accuracies of the novel mixed kernel function are generally higher than those of the other kernel functions. The novel mixed kernel function thus gives full play to the locality of the RBF kernel and the global character of the exponential kernel; its learning and generalization abilities are both higher than those of the other kernel functions, which proves the validity of the novel mixed kernel function of the present invention.
Embodiment 3
This embodiment is a verification example on the Iris data set. To compare the SVM classification effect of single and mixed kernel functions more comprehensively, the experimental subject is the Iris data set provided by the UCI database website.
The Iris data set contains 150 samples divided into 3 classes, each sample containing 4 attributes. The present invention uses 50% of each class of the data set as the training set and the other 50% as the test set. The parameters are optimized by PSO, SVM classification models are established, and the classification accuracies on the test set are shown in Table 2 below.
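The Iris experiment above can be sketched end to end. This is an illustrative sketch, not the patent's experiment: the mixed-kernel forms are assumed as before (the formula images are missing), and fixed parameter values stand in for the PSO-optimized (C, σ, γ, t); scikit-learn's `SVC` accepts such a kernel callable directly.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def mixed_kernel(X, Y, t=0.3, sigma=1.0, gamma=0.5):
    """Assumed mixed kernel: t*exp(-gamma*||x-y||) + (1-t)*exp(-||x-y||^2/(2*sigma^2))."""
    sq = (np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    sq = np.maximum(sq, 0.0)
    return t * np.exp(-gamma * np.sqrt(sq)) + (1.0 - t) * np.exp(-sq / (2.0 * sigma**2))

iris = load_iris()
# 50% of each class for training and 50% for testing, as in the text
X_tr, X_te, y_tr, y_te = train_test_split(iris.data, iris.target, test_size=0.5,
                                          stratify=iris.target, random_state=0)
# Fixed illustrative parameters; the patent tunes (C, sigma, gamma, t) with PSO
clf = SVC(C=10.0, kernel=mixed_kernel)
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```

Swapping the kernel callable for `kernel="rbf"` or `kernel="linear"` reproduces the single-kernel baselines that Table 2 compares against.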
Table 2. Classification results of different kernel functions on the Iris data set
As Table 2 shows, compared with single kernel functions, the kernel function proposed by the invention achieves a higher classification accuracy than the linear, polynomial and RBF kernels; among mixed kernel functions, the classification effect of the novel mixed kernel function of the invention is better than that of the other mixed kernels, showing that an SVM constructed with the novel mixed kernel function of the invention outperforms SVMs built with other mixed kernel functions.
This method constructs a new mixed kernel function that balances learning and generalization ability. The experimental results show that its classification accuracy on the data sets is higher than that of single kernel functions and of other mixed kernel functions. Classifying with an SVM model based on the novel mixed kernel function of the invention therefore yields better classification predictions.
Specific embodiments of the present invention have been described above. It is to be understood that the invention is not limited to the above particular implementations; those skilled in the art can make various deformations or amendments within the scope of the claims without affecting the substantive content of the invention.

Claims (3)

1. An SVM classification method based on a mixed kernel function, with the following steps:
One, collect a data set, analyze each sample record in the collected data set, distinguish the different attributes of the samples, and determine the input and output samples;
Two, select and construct the kernel function: mix the exponential global kernel function, which satisfies the Mercer condition, with the radial basis kernel function to obtain the novel mixed kernel function:
K_mix(x, y) = t·K_exp(x, y) + (1 − t)·K_RBF(x, y)
where the parameter t is the proportion of the two kernel functions, σ is the kernel width of the radial basis kernel function, γ is the parameter of the exponential kernel function, and x, y are the input and output sample pairs in the sample set;
Three, optimize the parameters of the novel mixed kernel function;
Four, select the C-SVC model and establish the support vector classification model based on the novel mixed kernel function;
Five, perform classification prediction with the established support vector classification model.
2. The SVM classification method based on a mixed kernel function according to claim 1, characterized in that the optimization uses a particle swarm optimization algorithm with Gaussian mutation to optimize the three parameters σ, γ, t of the novel mixed kernel function.
3. The SVM classification method based on a mixed kernel function according to claim 2, characterized in that the steps of the particle swarm optimization algorithm with Gaussian mutation are as follows:
Step 1: set the parameters: the population size, the maximum number of iterations, the local search coefficient C1, the global search coefficient C2 and the inertia weight ω, and limit the particle search region and the swarm flying speed;
Step 2: initialize the position X_i and velocity V_i of each particle (σ, γ, t);
Step 3: compute the fitness value of each particle;
Step 4: determine the global extremum and the individual extrema;
Step 5: iterative search: judge whether the maximum classification accuracy has been reached; if so, go to step 8, otherwise continue;
Step 6: update the velocity and position of each particle according to V_i ← ω·V_i + C1·r1·(P_i − X_i) + C2·r2·(P_g − X_i) and X_i ← X_i + V_i;
Step 7: randomly select one particle and apply Gaussian mutation to it to keep the population diverse, then return to step 3;
Step 8: output the optimal solution.
CN201811343420.4A 2018-11-13 2018-11-13 A kind of svm classifier method based on mixed kernel function Pending CN109447178A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811343420.4A CN109447178A (en) 2018-11-13 2018-11-13 A kind of svm classifier method based on mixed kernel function

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811343420.4A CN109447178A (en) 2018-11-13 2018-11-13 A kind of svm classifier method based on mixed kernel function

Publications (1)

Publication Number Publication Date
CN109447178A true CN109447178A (en) 2019-03-08

Family

ID=65552081

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811343420.4A Pending CN109447178A (en) 2018-11-13 2018-11-13 A kind of svm classifier method based on mixed kernel function

Country Status (1)

Country Link
CN (1) CN109447178A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111192453A (en) * 2019-12-30 2020-05-22 深圳市麦谷科技有限公司 Short-term traffic flow prediction method and system based on Bayesian optimization
CN111340061A (en) * 2020-01-18 2020-06-26 中国人民解放军国防科技大学 Multi-mode data fusion and classification method based on SVM model parameter optimization
CN112861242A (en) * 2021-03-17 2021-05-28 辽宁工程技术大学 non-Gaussian fluctuating wind speed prediction method based on hybrid intelligent algorithm

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106874935A (en) * 2017-01-16 2017-06-20 衢州学院 SVMs parameter selection method based on the fusion of multi-kernel function self adaptation
CN108615555A (en) * 2018-04-26 2018-10-02 山东师范大学 Colorectal cancer prediction technique and device based on marker gene and mixed kernel function SVM



Similar Documents

Publication Publication Date Title
Wu et al. Robust latent factor analysis for precise representation of high-dimensional and sparse data
Aburomman et al. A novel SVM-kNN-PSO ensemble method for intrusion detection system
Mukhopadhyay et al. Survey of multiobjective evolutionary algorithms for data mining: Part II
Vens et al. Random forest based feature induction
Patel et al. Efficient classification of data using decision tree
Liu et al. Mining adversarial patterns via regularized loss minimization
WO2021109464A1 (en) Personalized teaching resource recommendation method for large-scale users
CN109447178A (en) A kind of svm classifier method based on mixed kernel function
CN112199536A (en) Cross-modality-based rapid multi-label image classification method and system
Campos et al. An unsupervised boosting strategy for outlier detection ensembles
CN103258210A (en) High-definition image classification method based on dictionary learning
Chen et al. Fuzzy kernel alignment with application to attribute reduction of heterogeneous data
CN110059756A (en) A kind of multi-tag categorizing system based on multiple-objection optimization
Zhu et al. Collaborative decision-reinforced self-supervision for attributed graph clustering
Xiong et al. Diagnose like a pathologist: Transformer-enabled hierarchical attention-guided multiple instance learning for whole slide image classification
CN110008899A (en) A kind of visible remote sensing image candidate target extracts and classification method
Yang et al. RE-GCN: relation enhanced graph convolutional network for entity alignment in heterogeneous knowledge graphs
Chen et al. Learning to segment object candidates via recursive neural networks
Chen et al. Parameter selection algorithm of DBSCAN based on K‐means two classification algorithm
Biswas et al. ECKM: An improved K-means clustering based on computational geometry
Coppersmith Vertex nomination
Liu et al. A weight-incorporated similarity-based clustering ensemble method
Xia et al. GRRS: Accurate and efficient neighborhood rough set for feature selection
Cao et al. A bootstrapping framework with interactive information modeling for network alignment
Manoju et al. Conductivity based agglomerative spectral clustering for community detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190308