Online soft-margin kernel learning algorithm based on step-size control
Technical Field
The invention belongs to the field of data mining and machine learning, relates to a method for data mining and data processing, and particularly relates to an online soft-margin kernel learning algorithm (OSKL) based on step-size control.
Background
The classification problem is a classic problem in the field of data mining and machine learning. Traditional classification methods based on batch processing first collect data, build a learning model on the collected data, and select an optimization algorithm to solve the model and obtain a classifier. With the rapid development of technologies such as e-commerce, social media, the mobile internet, and the internet of things, more and more application scenarios require real-time processing of large-scale data streams. Traditional batch classification methods suffer from high computational complexity and low model-update efficiency when processing large-scale data streams. Online learning follows the basic framework of point-by-point learning: data are learned one point at a time through a dynamically updated model, and the computational complexity of a single model update is only O(1). Online learning therefore offers low computational complexity, high model-update efficiency, and strong real-time performance, and is a natural tool for processing and analyzing data stream problems. In addition, erroneous labels are inevitable in large-scale labeled data, and such labels can seriously affect the construction and performance of a classifier. It is therefore desirable to design a data stream mining algorithm with fault tolerance.
Disclosure of Invention
The invention aims to provide an online soft-margin kernel learning algorithm based on step-size control, directed at the problems that existing batch classification methods cannot efficiently handle data stream classification and that existing online learning algorithms cannot suppress the influence of noise. The algorithm reduces model storage space, effectively controls the influence of noise, remarkably improves model-update efficiency, and meets the real-time requirements of practical application problems.
According to an embodiment of the present invention, an online soft-margin kernel learning algorithm based on step-size control is provided, which includes the following steps:
(I) initializing model parameters, a decision function, and a model kernel function.
And (II) collecting the data stream, and predicting the class label of the data stream sample by using a classification decision function.
And (III) acquiring a sample real label, appointing a loss function, and calculating a sample loss value.
And (IV) calculating the updating step size of the decision function of the classifier.
And (V) updating a classifier decision function based on the basic framework of the online gradient descent algorithm.
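The five steps above can be sketched end to end in Python. The sketch below is illustrative, under our reading of the steps detailed later (hinge loss, capped step τ_t = min(C, l_t / k(x_t, x_t)), Gaussian kernel with bandwidth equal to the input dimension); class and method names are ours, not the claimed implementation:

```python
import math

def gaussian_kernel(xi, xj):
    # Gaussian kernel k(xi, xj) = exp(-||xi - xj||^2 / d);
    # the bandwidth d = dim(x) follows the embodiment described later
    d = len(xi)
    return math.exp(-sum((a - b) ** 2 for a, b in zip(xi, xj)) / d)

class OSKL:
    """Sketch of the online soft-margin kernel learner (steps I-V)."""

    def __init__(self, C=0.05):
        self.C = C        # soft-margin threshold: caps the update step
        self.sv = []      # support vectors x_i
        self.coef = []    # coefficients tau_i * y_i

    def decision(self, x):
        # f_{t-1}(x) = sum_i tau_i y_i k(x_i, x); initially f_0 = 0
        return sum(c * gaussian_kernel(s, x) for s, c in zip(self.sv, self.coef))

    def predict(self, x):
        # step II: predict the class label with the current decision function
        return 1 if self.decision(x) >= 0 else -1

    def update(self, x, y):
        # step III: hinge loss l_t = max(0, 1 - y_t f_{t-1}(x_t))
        loss = max(0.0, 1.0 - y * self.decision(x))
        if loss > 0.0:
            # step IV: capped step tau_t = min(C, l_t / k(x_t, x_t))
            tau = min(self.C, loss / gaussian_kernel(x, x))
            # step V: f_t = f_{t-1} + tau_t * y_t * k(x_t, .)
            self.sv.append(list(x))
            self.coef.append(tau * y)
```

On a separable toy stream, the learner converges to the correct labels after a few passes while each update touches at most one new support vector.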
In the learning algorithm according to the embodiment of the present invention, in step (one), the specific steps of model initialization are:
determining a training sample set and a test sample set, initializing the model threshold parameter C, initializing the binary classification decision function as f_0 = 0, and designating a Gaussian kernel function as the model kernel function k(·, ·).
In the learning algorithm according to the embodiment of the present invention, in the step (two), the specific steps of predicting the class label of the data stream sample by using the classification decision function are as follows:
the data stream is collected sample by sample in the form {(x_t, y_t)}, t = 1, 2, …, where x_t denotes the t-th sample input and y_t the t-th sample output (class label). The decision function f_{t-1} is used to predict the label of the t-th sample in the data stream:

ŷ_t = sign(f_{t-1}(x_t)).
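A minimal sketch of the label prediction, assuming ŷ_t = sign(f_{t-1}(x_t)) with ties assigned to +1 (a convention of ours, not stated in the embodiment):

```python
def predict_label(f_value):
    # y_hat_t = sign(f_{t-1}(x_t)); f_value is the current decision function
    # evaluated at x_t, and a value of exactly 0 is mapped to +1 here
    return 1 if f_value >= 0 else -1
```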
in the learning algorithm according to the embodiment of the present invention, in step (three), the specific operation flow for calculating the sample loss is as follows:
the most common hinge loss of the binary classification problem is designated as the loss function, and the hinge loss of the sample point (x_t, y_t) is computed as:

l_t = max(0, 1 − y_t f_{t-1}(x_t)).
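This loss is the standard hinge loss of binary classification; a minimal sketch of its computation (function name ours):

```python
def hinge_loss(f_value, y):
    # l_t = max(0, 1 - y_t * f_{t-1}(x_t));
    # zero exactly when x_t is classified correctly with margin >= 1
    return max(0.0, 1.0 - y * f_value)
```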
in the learning algorithm according to the embodiment of the present invention, in step (four), the specific operation flow for calculating the update step τ_t is as follows. The update step τ_t is determined based on the following two considerations: first, achieving correct classification of the current sample point x_t with the highest possible confidence, i.e., reaching zero loss (l_t = 0); second, ensuring the stability of the algorithm as far as possible, i.e., reducing the fluctuation of the decision function during updating. The optimal step τ_t is the solution of the following optimization problem:

min_f (1/2) ‖f − f_{t-1}‖²  subject to  l(f; (x_t, y_t)) = 0.
On the other hand, large-scale sampled data inevitably contain a large amount of mislabeled data, and erroneous labels can seriously affect the construction of the decision function and the performance of the corresponding classifier. To this end, we introduce the soft-margin threshold parameter C to control the update step, requiring τ_t ≤ C, so that the influence of mislabeled data on the model is limited and the stability of the classifier is ensured. Based on the hinge loss l_t of the sample point (x_t, y_t) calculated in step (three) and the step-size control parameter C, the update step τ_t is determined as:

τ_t = min{C, l_t / k(x_t, x_t)}.
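A minimal sketch of the capped update step; the rule τ_t = min(C, l_t / k(x_t, x_t)) is the standard soft-margin passive-aggressive step consistent with the constraint τ_t ≤ C above (function and argument names ours):

```python
def update_step(loss, k_tt, C):
    # tau_t = min(C, l_t / k(x_t, x_t)): large enough to move toward zero loss,
    # but capped by the soft-margin threshold C so that a single mislabeled
    # point cannot force an arbitrarily large change to the decision function
    return min(C, loss / k_tt)
```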
in the learning algorithm according to the embodiment of the present invention, in step (five), the specific operation flow for updating the classifier decision function is as follows: based on the update step τ_t calculated in step (four), under the basic framework of the online gradient descent algorithm, the decision function is updated as

f_t = f_{t-1} + τ_t y_t k(x_t, ·),

obtaining the new decision function f_t.
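In a kernel-expansion representation of the decision function, this update only appends one term; a minimal sketch (names ours):

```python
def apply_update(support, coefs, x_t, y_t, tau_t):
    # f_t = f_{t-1} + tau_t * y_t * k(x_t, .): with the expansion
    # f_t(x) = sum_i coefs[i] * k(support[i], x), the update appends one pair
    if tau_t > 0.0:
        support.append(x_t)
        coefs.append(tau_t * y_t)
    return support, coefs
```

A zero step (loss already zero, or a capped-out mislabeled point with C = 0) leaves the expansion unchanged, which is what keeps the model storage small.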
The invention relates to an online soft-margin kernel learning algorithm based on step-size control. An online kernel learning classifier is established by introducing the hinge loss function, a Gaussian kernel function, and the soft-margin threshold parameter C, realizing online prediction for data streams. The soft-margin threshold parameter makes the update of the classifier decision function smoother and more robust. Compared with the classical online learning algorithms Kernel Perceptron and Pegasos, the proposed algorithm OSKL significantly improves classification accuracy. The online classification algorithm OSKL can flexibly handle classification problems in data stream scenarios, and compared with the traditional static classification mode based on batch processing, it greatly reduces computational complexity and model running time.
Drawings
FIG. 1 is a schematic diagram of the online soft-margin kernel learning algorithm based on step-size control
FIG. 2 is a schematic diagram showing the comparison of classification accuracy of the three algorithms on the benchmark data sets
FIG. 3 is a schematic diagram showing the comparison of average test classification accuracy of the three algorithms on the noisy-label data set ijcnn
FIG. 4 is a schematic diagram showing the comparison of average test classification accuracy of the three algorithms on the noisy-label data set codrna
FIG. 5 is a schematic diagram showing the comparison of average test classification accuracy of the three algorithms on the noisy-label data set eegeye
Detailed Description
The specific steps of the present invention are explained below with reference to the drawings.
The first embodiment is as follows: the online classification experiments on the original benchmark data sets ijcnn, codrna, and eegeye are taken as an example. FIG. 1 is a schematic diagram of the online soft-margin kernel learning algorithm based on step-size control according to an embodiment of the present invention; the online learning algorithm includes the following steps:
Step one: initialize model parameters, the decision function, and the model kernel function. The specific steps are as follows:
initializing the model threshold parameter C = 0.05 and the binary classification decision function f_0 = 0, and designating a Gaussian kernel function as the model kernel function, i.e., k(x_i, x_j) = exp(−‖x_i − x_j‖² / d), where d is taken as the dimension of the sample input x.
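The kernel of this embodiment can be written down directly; a minimal sketch with the bandwidth d taken as the input dimension, as stated above:

```python
import math

def gaussian_kernel(xi, xj):
    # k(xi, xj) = exp(-||xi - xj||^2 / d), with d = dim(x) as in the embodiment;
    # note k(x, x) = 1 for every x, so the step cap becomes tau_t = min(C, l_t)
    d = len(xi)
    sq = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-sq / d)
```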
Step two: collect the data stream and predict the class label of the data stream sample using the classification decision function. The specific steps are as follows: the data stream is collected sample by sample in the form {(x_t, y_t)}, t = 1, 2, …, where x_t denotes the t-th sample input and y_t the t-th sample output (class label). The decision function f_{t-1} is used to predict the label of the t-th sample in the data stream:

ŷ_t = sign(f_{t-1}(x_t)).
step three: obtainingAnd (4) a sample real label appoints a loss function and calculates a sample loss value. The method comprises the following specific steps: the sample point (x) is calculated by assigning the most common change function of the two-classification problem as the loss functiontYt) loss of change:
step four: and calculating the updating step length of the decision function of the classifier. The method comprises the following specific steps: introducing soft interval threshold parameter to control update step length tautC is less than or equal to C, so that the influence of error label data on the model is limited, and the stability of the classifier is ensured. Based on the sample points (x) calculated in step (three)t,yt) 1 change loss ltAnd step size control parameter C, determining the updating step size tau of the t steptComprises the following steps:
step five: and updating a decision function of the classifier based on a basic framework of an online gradient descent algorithm. The method comprises the following specific steps: based on the update step τ calculated in step (IV)tUnder the basic framework of the online gradient descent algorithm, a decision function f is settPerform the update
Obtaining a new decision function ft。
FIG. 2 compares the average online test accuracy of the online learning algorithm of the present invention with that of the existing online learning algorithms Kernel Perceptron and Pegasos on the benchmark data sets ijcnn, codrna, and eegeye. As can be seen from FIG. 2, the average test accuracy of the online learning algorithm of the present invention on these three benchmark data sets is better than that of the other two methods.
Example two: on the basis of the original benchmark data sets ijcnn, codrna, and eegeye, noisy labels are added, and the online classifier is trained on the data sets containing noisy labels. Unlike the first embodiment, 30% of each data set is randomly selected as the test set, and noisy labels are added to the remaining data to construct the training set. Specifically, we take the sample index modulo 20, modulo 10, or modulo 5, and multiply the labels of the sample points whose remainder is 0 by −1 to obtain the noisy-label data.
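The noise-injection procedure can be sketched as follows (0-based indexing is our assumption; the embodiment only states that labels at indices with remainder 0 are multiplied by −1):

```python
def add_label_noise(labels, mod):
    # flip the label of every sample whose index is divisible by `mod`;
    # mod = 20, 10, 5 give noise rates of 5%, 10%, and 20% respectively
    return [(-y if i % mod == 0 else y) for i, y in enumerate(labels)]
```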
FIGS. 3-5 show the average classification performance (average test classification accuracy, ACA) on the noise-free test sets (the original 30% of each data set) of the online classifiers Kernel Perceptron, Pegasos, and OSKL trained on the noisy-label data sets ijcnn, codrna, and eegeye. The experimental results show that the classification accuracy of all three algorithms decreases as the training-sample noise increases from mod 20 to mod 10 to mod 5, but the proposed OSKL algorithm effectively controls the influence of noise, and its classification performance is clearly higher than that of the online classifiers Kernel Perceptron and Pegasos.
The above-described embodiments are intended to illustrate rather than limit the invention; any modifications and variations of the present invention within the spirit and scope of the claims are possible.