CN113449779B - SVM incremental learning method based on sample distribution density improved KKT condition - Google Patents
SVM incremental learning method based on sample distribution density improved KKT condition
- Publication number
- CN113449779B (application CN202110652246.7A)
- Authority
- CN
- China
- Prior art keywords
- sample
- classifier
- model
- kkt condition
- samples
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
Abstract
The invention discloses an SVM incremental learning method based on a KKT condition improved by sample distribution density, comprising: obtaining the support vector set SV_0 of a classifier and the standard KKT condition I of the classifier; constructing the improved KKT condition I of the SVM classifier Model_old; judging whether the samples in the newly added sample set B satisfy the standard KKT condition I of the SVM classifier Model_old; performing a secondary judgment on the samples in the newly added sample set B to decide whether they satisfy the improved KKT condition I of the SVM classifier Model_old; training the classifier Model_1 from the candidate support vector set SV_1; computing the improved KKT condition II satisfied by classifier Model_1; training the classifier with SV_0 ∪ SV_1 ∪ SV_add and outputting the updated classifier Model_2. The improved KKT condition based on sample distribution density enables effective screening of newly added samples when the sample distribution is unbalanced, and improves the generalization capability of the SVM incremental learning algorithm.
Description
Technical Field
The invention belongs to the technical field of machine learning and relates to an SVM incremental learning method based on a KKT condition improved by sample distribution density. The method is particularly suited to SVM incremental learning when the sample distribution is uneven, and can be used for online updating of an SVM classifier under automatic incremental learning.
Background
The support vector machine (Support Vector Machine, SVM) is a machine-learning pattern-recognition classification algorithm proposed by Vapnik in the 1990s, reference [Vapnik V. Statistical Learning Theory. New York: John Wiley & Sons, Inc., 1998], which performs well on classification tasks with small samples and high-dimensional features. The conventional SVM algorithm is a batch learning method: it assumes that all training samples can be obtained at once before training, and the learning process terminates once training is complete. In practical applications, however, the training samples usually cannot be obtained all at once; they arrive gradually over time, and the information contained in new training samples may change over time. The classifier therefore needs the ability to continuously learn useful knowledge from these sample data, so that it can be updated online as new samples arrive.
How to learn useful knowledge from newly added sample data while ensuring that the updated model retains good classification performance is an important problem to be solved. It can be addressed by incremental learning, which retains the important information of the newly added samples. The idea of incremental learning can be summarized as follows: starting from the original knowledge base, only the changes caused by the newly added data are merged into it. This saves a significant amount of training time and memory after new sample data are added.
The SVM incremental learning algorithm proposed by Syed et al. is an early and fairly classical algorithm; its basic idea is to retain only the support vectors of the original classification model and perform incremental training together with the newly added data samples, reference [Syed N A, Liu H, Sung K. Incremental learning with support vector machines. Proc. Int. Joint Conference on Artificial Intelligence, 1999]. The algorithm does not screen the newly added samples, so samples that contribute nothing to classification accuracy are also trained on, which makes incremental learning inefficient and harms classification accuracy. Researchers later introduced the Karush-Kuhn-Tucker (KKT) conditions to screen newly added samples, yielding SVM incremental learning algorithms based on the KKT condition. In 2014, Zhang Canlin et al. introduced a misclassification-error-driven idea for new samples on top of the KKT condition and proposed a new SVM incremental learning algorithm, reference [Zhang Canlin, Yao Minghai, Tong Xiaolong, et al.]. SVM incremental learning algorithms based on the KKT condition can screen newly added samples but generalize poorly. When the newly added samples are unevenly distributed, the existing KKT-based SVM incremental learning methods cannot adaptively screen newly added samples whose distribution densities differ markedly, and under unbalanced sample distribution the classification accuracy of the SVM classifier updated by these methods is notably lower.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an SVM incremental learning method based on a KKT condition improved by sample distribution density, so as to solve the problems of existing SVM incremental learning algorithms under unbalanced sample distribution: poor learning ability on newly added samples, low generalization ability of the classifier, and low classification accuracy.
The method of the invention updates the classifier by judging the KKT condition of the newly added samples and pre-selecting a set of possible support vectors SV. Analysing how the number of non-support vectors near the classification boundary varies with sample distribution density shows that the higher the density, the more non-support vectors lie near the classification boundary and the more likely those samples are to become support vectors of the new classifier; the lower the density, the fewer non-support vectors lie near the boundary and the less likely such samples are to become support vectors. If a fixed-value bias parameter is introduced to improve the KKT condition, then under unbalanced sample distribution the newly added samples of a low-density sample set are learned markedly less thoroughly than those of a high-density sample set.
Building on the KKT-based SVM incremental learning method, the invention adaptively computes the bias parameter from the sample distribution density to improve the KKT condition. The improved KKT condition automatically screens the candidate support vector set SV from the positive and negative sample sets of the newly added samples, strengthening the learning of low-density samples while balancing the learning of high-density samples. By making full use of the historical training results, fast incremental learning of the newly added samples is achieved, improving the generalization ability and classification accuracy of the classifier.
The SVM incremental learning method of the invention, based on a KKT condition improved by sample distribution density, comprises the following steps:
1) training an SVM classifier Model_old from the original sample set A, and obtaining the support vector set SV_0 and the standard KKT condition I of classifier Model_old;
2) calculating the sample distribution densities of the positive and negative samples in the original sample set A, calculating the bias parameters for the positive and negative samples from these densities, adding the bias parameters to the standard KKT condition I, and constructing the improved KKT condition I of the SVM classifier Model_old, adaptively optimized for the positive and negative samples;
3) judging whether all samples in the newly added sample set B satisfy the standard KKT condition I of the SVM classifier Model_old; if they all do, outputting the original SVM classifier Model_old as the required model and ending; otherwise, putting the samples in B that violate the standard KKT condition I into set B1 and the samples that satisfy it into set B2;
4) judging whether the samples in the newly added sample set B satisfy the improved KKT condition I of the SVM classifier Model_old, and putting the samples that satisfy it into the candidate support vector set SV_1, defined as possible support vector samples;
5) training the classifier Model_1 from the candidate support vector set SV_1 to obtain the standard KKT condition II of Model_1, then using the support vector set SV_1 to calculate the bias parameters for the positive and negative samples and obtain the improved KKT condition II after adding them;
6) judging whether set B2 satisfies the improved KKT condition II; if B2 is empty or all of its samples satisfy the improved KKT condition II, outputting classifier Model_1 as the updated model and ending; otherwise, putting the samples that do not satisfy the improved KKT condition II into the supplementary support vector set SV_add;
7) training the classifier Model_2 on the set SV_0 ∪ SV_1 ∪ SV_add and outputting the updated classifier Model_2, where SV_0 is the support vector set of the original classifier and SV_add is the supplementary support vector set.
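The seven steps above can be sketched as a single control-flow function. The sketch below is a hypothetical Python rendering of steps 3) to 7): the `train` argument stands in for the patent's LIBSVM training calls, and the three predicate arguments stand in for the standard and improved KKT condition tests, which are not fixed here.

```python
def incremental_update(train, sv0, B, std_kkt_I, imp_kkt_I, make_imp_kkt_II):
    """Control flow of steps 3)-7); train and the KKT predicates are
    hypothetical stand-ins for LIBSVM training and condition tests."""
    # Step 3: split B by the standard KKT condition I of Model_old.
    B1 = [s for s in B if not std_kkt_I(s)]   # violators
    B2 = [s for s in B if std_kkt_I(s)]       # satisfiers
    if not B1:                                # old model already fits B
        return "Model_old", list(sv0)
    # Step 4: candidate support vectors under improved KKT condition I.
    sv1 = [s for s in B if imp_kkt_I(s)]
    # Step 5: train Model_1 on the candidates and derive condition II.
    model1 = train(sv1)
    imp_kkt_II = make_imp_kkt_II(model1, sv1)
    # Step 6: B2 samples violating condition II become supplements.
    sv_add = [s for s in B2 if not imp_kkt_II(s)]
    if not sv_add:
        return model1, sv1
    # Step 7: final classifier from SV_0 ∪ SV_1 ∪ SV_add (order-preserving union).
    merged = list(dict.fromkeys(list(sv0) + sv1 + sv_add))
    return train(merged), merged
```

With toy stand-ins (e.g. `train = lambda s: ("svm", tuple(sorted(s)))` and threshold predicates on scalar "samples"), the function exercises both early exits and the final retraining path.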
Further, in step 1), the LIBSVM toolkit is used to train the SVM classifier Model_old from the original sample set, and the standard KKT condition I is expressed as follows:
In solving for the optimal hyperplane of the SVM classifier, the defining condition is 0 ≤ α_i ≤ C, and the standard KKT condition is abbreviated as y_i f(x_i) ≤ 1, which characterises the samples that violate the KKT conditions of the trained classifier (a newly added sample with y_i f(x_i) > 1 satisfies them); wherein C is the penalty factor, α = (α_1, α_2, ..., α_n)^T is the Lagrangian multiplier vector, T denotes the matrix transpose, y_i ∈ {+1, −1} is the sample label, and f(x_i) is the distance of sample x_i from the optimal hyperplane.
Further, in step 2), the bias parameters for the positive and negative samples are calculated from the sample distribution densities and the improved KKT condition I of classifier Model_old is constructed, specifically as follows:
21) let the original training set T = {(x_i, y_i)} be the original sample set, with positive sample set T_+ and negative sample set T_-; the class centers of the positive and negative samples in the sample set are c_+ and c_- respectively, and the distances of the positive and negative samples from their class centers are computed accordingly;
22) calculating the sample distribution densities ξ_+ and ξ_- of the positive and negative sample sets from these class-center distances;
23) calculating the bias parameters δ_+ and δ_- for the positive and negative samples:
δ_+ = 1 − ξ_+, δ_- = 1 − ξ_-;
24) calculating the improved KKT condition I of the SVM classifier Model_old:
|y_i f(x_i)| ≤ 1 + δ = 1 + (1 − ξ), ξ ∈ [ξ_+, ξ_-]
wherein N_+ denotes the number of positive samples, N_- the number of negative samples, and max(·) the maximum of the class-center distances.
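The computations of sub-steps 21) to 24) can be sketched in Python. The patent states only that ξ_+ and ξ_- depend on the class-center distances, on N_+ and N_-, and on max(·), so the concrete density formula below (mean distance to the class center normalised by the maximum distance) is an assumption for illustration; the bias δ = 1 − ξ and the band test |y_i f(x_i)| ≤ 1 + δ follow the text directly.

```python
import math

def class_center(samples):
    # centroid c of one class; samples are coordinate tuples
    n, dim = len(samples), len(samples[0])
    return tuple(sum(s[d] for s in samples) / n for d in range(dim))

def distribution_density(samples):
    # ASSUMED form of xi: mean distance to the class center divided by
    # the maximum distance max(.); the patent's exact density formula
    # is not reproduced here
    c = class_center(samples)
    dists = [math.dist(s, c) for s in samples]
    m = max(dists)
    return 1.0 if m == 0 else sum(dists) / (len(dists) * m)

def bias_parameter(xi):
    # sub-step 23): delta = 1 - xi
    return 1.0 - xi

def satisfies_improved_kkt_I(y_fx, xi):
    # sub-step 24): |y_i f(x_i)| <= 1 + delta = 1 + (1 - xi)
    return abs(y_fx) <= 1.0 + bias_parameter(xi)
```

Note how a sparse class (low ξ) yields a large δ and hence a wider candidate band, which is exactly the adaptive behaviour the method aims for.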
Further, in step 3), each sample vector s_j in the newly added sample set B = {s_1, s_2, ..., s_j, ..., s_M} is judged; if all s_j satisfy the standard KKT condition I, classifier Model_old is output and the process ends; otherwise the newly added sample set B is divided into two sets B1 and B2, where the sample vectors satisfying the standard KKT condition I of the SVM classifier Model_old are placed in set B2 and the sample vectors violating it are placed in set B1.
Further, in step 5), the LIBSVM toolbox is used to train classifier Model_1 from the candidate support vector set SV_1; the sample distribution densities of the positive and negative samples in SV_1 are calculated, the bias parameters for the positive and negative samples are computed from these densities and added to the standard KKT condition II, constructing the improved KKT condition II of the SVM classifier Model_1, adaptively optimized for the positive and negative samples.
Further, in step 7), the support vector set SV_0 of the original classifier, the possible support vector set SV_1 selected from the newly added sample set B, and the supplementary support vector set SV_add are merged, and the LIBSVM toolbox is used to train classifier Model_2 on SV_0 ∪ SV_1 ∪ SV_add; this classifier Model_2 is the classifier model sought.
The invention has the beneficial effects that:
1. The invention recognizes that existing SVM incremental learning methods based on an improved KKT condition usually adopt a fixed bias parameter to improve the KKT condition for selecting newly added samples, without considering the special cases of markedly unbalanced sample distribution or a small sample set, which affect that selection and leave the classifier with lower classification accuracy after the incremental update.
2. The invention provides a new SVM incremental learning method with an improved KKT condition, in which the sample distribution density is introduced to adaptively compute the bias parameter of the improved KKT condition. The improved KKT condition can thus select the candidate support vector set SV from the newly added samples more effectively when the sample distribution is unbalanced or the sample set is small, improving the classification accuracy of the classifier.
3. The invention improves the learning efficiency on incremental samples: for the same target classification accuracy, fewer incremental learning rounds are needed, reducing the time consumed by online updating of the classifier.
Drawings
FIG. 1 is a schematic diagram of the method of the present invention.
Fig. 2 is an ROC graph of the classification results after 10 incremental learning rounds on the abalone dataset.
FIG. 3 is an ROC graph of the classification results after 10 incremental learning rounds on the Waveform dataset.
FIG. 4 is a graph showing how the classification accuracy on the Waveform dataset varies with the number of increments.
Detailed Description
The invention will be further described with reference to the examples and accompanying drawings, which are not intended to limit the scope of the invention.
Referring to fig. 1, the SVM incremental learning method based on sample distribution density improving KKT condition of the present invention comprises the following steps:
1) Training an SVM classifier Model_old from the original sample set A, and obtaining the support vector set SV_0 and the standard KKT condition I of classifier Model_old;
In step 1), the LIBSVM toolkit is used to train the SVM classifier Model_old from the original sample set, and the standard KKT condition I is expressed as follows:
In solving for the optimal hyperplane of the SVM classifier, the defining condition is 0 ≤ α_i ≤ C, and the standard KKT condition is abbreviated as y_i f(x_i) ≤ 1, which characterises the samples that violate the KKT conditions of the trained classifier (a newly added sample with y_i f(x_i) > 1 satisfies them); wherein C is the penalty factor, α = (α_1, α_2, ..., α_n)^T is the Lagrangian multiplier vector, T denotes the matrix transpose, y_i ∈ {+1, −1} is the sample label, and f(x_i) is the distance of sample x_i from the optimal hyperplane.
2) Calculating the sample distribution densities of the positive and negative samples in the original sample set A, calculating the bias parameters for the positive and negative samples from these densities, adding the bias parameters to the standard KKT condition I, and constructing the improved KKT condition I of the SVM classifier Model_old, adaptively optimized for the positive and negative samples;
In step 2), the bias parameters for the positive and negative samples are calculated from the sample distribution densities and the improved KKT condition I of classifier Model_old is constructed, specifically as follows:
21) let the original training set T = {(x_i, y_i)} be the original sample set, with positive sample set T_+ and negative sample set T_-; the class centers of the positive and negative samples in the sample set are c_+ and c_- respectively, and the distances of the positive and negative samples from their class centers are computed accordingly;
22) calculating the sample distribution densities ξ_+ and ξ_- of the positive and negative sample sets from these class-center distances;
23) calculating the bias parameters δ_+ and δ_- for the positive and negative samples:
δ_+ = 1 − ξ_+, δ_- = 1 − ξ_-;
24) calculating the improved KKT condition I of the SVM classifier Model_old:
|y_i f(x_i)| ≤ 1 + δ = 1 + (1 − ξ), ξ ∈ [ξ_+, ξ_-]
wherein N_+ denotes the number of positive samples, N_- the number of negative samples, and max(·) the maximum of the class-center distances.
3) Judging whether all samples in the newly added sample set B satisfy the standard KKT condition I of the SVM classifier Model_old; if they all do, outputting the original SVM classifier Model_old as the required model and ending; otherwise, putting the samples in B that violate the standard KKT condition I into set B1 and the samples that satisfy it into set B2;
In step 3), each sample vector s_j in the newly added sample set B = {s_1, s_2, ..., s_j, ..., s_M} is judged; if all s_j satisfy the standard KKT condition I, classifier Model_old is output and the process ends; otherwise the newly added sample set B is divided into two sets B1 and B2, where the sample vectors satisfying the standard KKT condition I of the SVM classifier Model_old are placed in set B2 and the sample vectors violating it are placed in set B1.
4) Judging whether the samples in the newly added sample set B satisfy the improved KKT condition I of the SVM classifier Model_old, and putting the samples that satisfy it into the candidate support vector set SV_1, defined as possible support vector samples.
5) Training the classifier Model_1 from the candidate support vector set SV_1 to obtain the standard KKT condition II of Model_1, then using the support vector set SV_1 to calculate the bias parameters for the positive and negative samples and obtain the improved KKT condition II after adding them;
In step 5), the LIBSVM toolbox is used to train classifier Model_1 from the candidate support vector set SV_1; the sample distribution densities of the positive and negative samples in SV_1 are calculated, the bias parameters for the positive and negative samples are computed from these densities and added to the standard KKT condition II, constructing the improved KKT condition II of the SVM classifier Model_1, adaptively optimized for the positive and negative samples.
6) Judging whether set B2 satisfies the improved KKT condition II; if B2 is empty or all of its samples satisfy the improved KKT condition II, outputting classifier Model_1 as the updated model and ending; otherwise, putting the samples that do not satisfy the improved KKT condition II into the supplementary support vector set SV_add.
7) Training the classifier Model_2 on the set SV_0 ∪ SV_1 ∪ SV_add and outputting the updated classifier Model_2, where SV_0 is the support vector set of the original classifier and SV_add is the supplementary support vector set;
In step 7), the support vector set SV_0 of the original classifier, the possible support vector set SV_1 selected from the newly added sample set B, and the supplementary support vector set SV_add are merged, and the LIBSVM toolbox is used to train classifier Model_2 on SV_0 ∪ SV_1 ∪ SV_add; this classifier Model_2 is the classifier model sought.
The data adopted in the embodiment of the invention are the abalone and Waveform datasets from the UCI standard classification datasets. The abalone classification dataset contains 4177 instances with 8 attributes each; the Waveform classification dataset contains 5000 instances with 21 attributes each. The preparatory work is as follows: the 4177 instances of the abalone classification dataset were randomly divided into 10 equal parts, of which 1 part served as the original sample set, 1 part as the test sample set, and the remaining 8 parts as the incremental sample sets for 8 increments. The same splitting method was applied to divide the Waveform classification dataset into 10 equal parts. The SVM training function used in the program is the libsvmtrain function of the LIBSVM toolbox. The incremental learning SVM algorithm of the invention is compared with other methods; the simulation results are shown in FIG. 2 and FIG. 3.
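The tenfold preparation described above can be reproduced with a short helper. The instance counts come from the text; the random seed and the handling of the seven leftover abalone instances (4177 is not divisible by 10) are implementation choices:

```python
import random

def split_for_increments(n_instances, n_parts=10, seed=0):
    # shuffle indices and cut into equal parts: part 0 = original sample
    # set, part 1 = test set, parts 2..9 = the eight incremental sets
    idx = list(range(n_instances))
    random.Random(seed).shuffle(idx)
    size = n_instances // n_parts
    parts = [idx[i * size:(i + 1) * size] for i in range(n_parts)]
    parts[-1].extend(idx[n_parts * size:])  # leftovers join the last part
    return parts
```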
As shown in fig. 2 and 3, the receiver operating characteristic (ROC) curves of the classification results on the UCI standard datasets abalone and Waveform are plotted. The ordinate of an ROC curve, the true positive rate (TPR), is the proportion of correctly judged positive samples among all actually positive samples; the abscissa, the false positive rate (FPR), is the proportion of samples wrongly judged positive among all actually negative samples. The area under the curve is the AUC (Area Under Curve) index: the higher the AUC value, the better the classifier performance. The algorithm of the invention appears in the figures as the improved KKT-ISVM curve; the other two methods are the KKT-based incremental SVM algorithm (KKT-ISVM) and the combined reserved set incremental SVM algorithm (Combined Reserved Set Incremental Support Vector Machine, CRS-ISVM). FIG. 4 shows how the classification accuracy of the different methods on the Waveform dataset varies with the number of incremental learning rounds; the higher the classification accuracy, the better the classifier performance.
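The TPR, FPR and AUC quantities defined here can be computed directly from decision scores. The sketch below assumes labels in {+1, −1} and handles tied scores naively:

```python
def roc_curve(scores, labels):
    # sweep the threshold from high to low, accumulating TPR = TP/P on
    # the ordinate and FPR = FP/N on the abscissa
    ranked = sorted(zip(scores, labels), reverse=True)
    P = sum(1 for _, y in ranked if y == 1)
    N = len(ranked) - P
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, y in ranked:
        if y == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / N, tp / P))
    return points

def auc(points):
    # trapezoidal area under the ROC curve; 1.0 means perfect ranking
    return sum((x2 - x1) * (y1 + y2) / 2.0
               for (x1, y1), (x2, y2) in zip(points, points[1:]))
```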
From the ROC curves of repeated incremental learning in the two simulation comparison experiments, and from the AUC indices given by the areas under the curves, the ROC curve of the improved KKT incremental SVM algorithm (improved KKT-ISVM) lies closest to the upper left among the compared methods and its AUC index is the highest; a higher AUC indicates higher classification accuracy and better classifier performance, demonstrating that the method of the invention is effective and its performance advanced.
The present invention has been described in terms of the preferred embodiments thereof, and it should be understood by those skilled in the art that various modifications can be made without departing from the principles of the invention, and such modifications should also be considered as being within the scope of the invention.
Claims (4)
1. An SVM incremental learning method based on a KKT condition improved by sample distribution density, characterized by comprising the following steps:
1) training an SVM classifier Model_old from the original sample set A, and obtaining the support vector set SV_0 and the standard KKT condition I of classifier Model_old;
2) calculating the sample distribution densities of the positive and negative samples in the original sample set A, calculating the bias parameters for the positive and negative samples from these densities, adding the bias parameters to the standard KKT condition I, and constructing the improved KKT condition I of the SVM classifier Model_old, adaptively optimized for the positive and negative samples;
3) judging whether all samples in the newly added sample set B satisfy the standard KKT condition I of the SVM classifier Model_old; if they all do, outputting the original SVM classifier Model_old as the required model and ending; otherwise, putting the samples in B that violate the standard KKT condition I into set B1 and the samples that satisfy it into set B2;
4) judging whether the samples in the newly added sample set B satisfy the improved KKT condition I of the SVM classifier Model_old, and putting the samples that satisfy it into the candidate support vector set SV_1;
5) training the classifier Model_1 from the candidate support vector set SV_1 to obtain the standard KKT condition II of Model_1, then using the support vector set SV_1 to calculate the bias parameters for the positive and negative samples and obtain the improved KKT condition II after adding them;
6) judging whether set B2 satisfies the improved KKT condition II; if B2 is empty or all of its samples satisfy the improved KKT condition II, outputting classifier Model_1 as the updated model and ending; otherwise, putting the samples that do not satisfy the improved KKT condition II into the supplementary support vector set SV_add;
7) training the classifier Model_2 on the set SV_0 ∪ SV_1 ∪ SV_add and outputting the updated classifier Model_2, where SV_0 is the support vector set of the original classifier and SV_add is the supplementary support vector set;
the step 2) is based on a sample distribution densitometerCalculating bias parameters under positive and negative samples, and constructing classifier Model old The method for improving KKT condition I is specifically as follows:
21 Assumed original training setFor the original sample set, the positive sample is +.>Negative sample is +.>The class centers of positive and negative samples in the sample set are c respectively + And c - ,/>And->Representing the distance between the positive and negative samples and the center of the class;
22 Calculating the sample distribution density xi of the positive and negative sample sets + With xi - :
23 Calculating the bias parameter delta under positive and negative samples + ,δ - :
δ + =1-ξ + ,δ - =1-ξ - ;
24 Calculating Model of SVM classifier old Is defined in KKT condition I:
|y i f(x i )|≤1+δ=1+(1-ξ),ξ∈[ξ + ,ξ - ]
wherein N is + Represents the positive number of samples, N-represents the negative number of samples, and max (·) represents the distance maximum.
2. The SVM incremental learning method based on a KKT condition improved by sample distribution density according to claim 1, characterized in that in said step 3), each sample vector s_j in the newly added sample set B = {s_1, s_2, ..., s_j, ..., s_M} is judged; if all s_j satisfy the standard KKT condition I, classifier Model_old is output and the process ends; otherwise the newly added sample set B is divided into two sets B1 and B2, where the sample vectors satisfying the standard KKT condition I of the SVM classifier Model_old are placed in set B2 and the sample vectors violating it are placed in set B1.
3. The SVM incremental learning method based on a KKT condition improved by sample distribution density according to claim 1, characterized in that in said step 5) the LIBSVM toolbox is used to train classifier Model_1 from the candidate support vector set SV_1; the sample distribution densities of the positive and negative samples in SV_1 are calculated, the bias parameters for the positive and negative samples are computed from these densities and added to the standard KKT condition II, constructing the improved KKT condition II of the SVM classifier Model_1, adaptively optimized for the positive and negative samples.
4. The SVM incremental learning method based on the sample distribution density improved KKT condition of claim 1, wherein in said step 7), the support vector set SV 0 of the original classifier, the possible support vector set SV 1 selected from the newly added sample set B, and the supplemental support vector set SV add are merged, and the LIBSVM toolbox is used to train the classifier Model 2 on SV 0 ∪ SV 1 ∪ SV add ; the classifier Model 2 is the classifier model sought.
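The merge-and-retrain step of claim 4 can be sketched with scikit-learn's `SVC`, which is itself built on LIBSVM (the patent names the LIBSVM toolbox directly); the set names and the linear kernel here are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC  # scikit-learn's SVC wraps LIBSVM

def train_merged_classifier(sv_sets, label_sets, **svc_params):
    """Train Model_2 on the union SV_0 ∪ SV_1 ∪ SV_add (claim 4, step 7).

    sv_sets / label_sets are sequences of arrays, one pair per support
    vector set; svc_params are passed straight to LIBSVM via SVC.
    """
    X = np.vstack(sv_sets)            # merge the support vector sets
    y = np.concatenate(label_sets)    # and their labels, in the same order
    return SVC(**svc_params).fit(X, y)
```

The returned model is the final classifier of the incremental step; only support vector sets are retrained, not the full original training set, which is where the method saves work.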
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110652246.7A CN113449779B (en) | 2021-06-11 | 2021-06-11 | SVM incremental learning method based on sample distribution density improved KKT condition |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113449779A CN113449779A (en) | 2021-09-28 |
CN113449779B true CN113449779B (en) | 2024-04-16 |
Family
ID=77811441
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110652246.7A Active CN113449779B (en) | 2021-06-11 | 2021-06-11 | SVM incremental learning method based on sample distribution density improved KKT condition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113449779B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101944122A (en) * | 2010-09-17 | 2011-01-12 | 浙江工商大学 | Incremental learning-fused support vector machine multi-class classification method |
CN109190719A (en) * | 2018-11-30 | 2019-01-11 | 长沙理工大学 | Support vector machines learning method, device, equipment and computer readable storage medium |
CN111160457A (en) * | 2019-12-27 | 2020-05-15 | 南京航空航天大学 | Turboshaft engine fault detection method based on soft class extreme learning machine |
US10970650B1 (en) * | 2020-05-18 | 2021-04-06 | King Abdulaziz University | AUC-maximized high-accuracy classifier for imbalanced datasets |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||