CN113591913B

CN113591913B - Picture classification method and device supporting incremental learning

Info

Publication number: CN113591913B
Application number: CN202110716563.0A
Authority: CN
Inventors: 段黎婷; 刘惠义
Original assignee: Hohai University HHU
Current assignee: Hohai University HHU
Priority date: 2021-06-28
Filing date: 2021-06-28
Publication date: 2024-03-29
Anticipated expiration: 2041-06-28
Also published as: CN113591913A

Abstract

The invention discloses a picture classification method and device supporting incremental learning. The method comprises the following steps: selecting training picture samples in a training set, inputting the training picture samples into a convolutional neural network, and selecting typical picture samples through the convolutional neural network; updating the representative memory with the typical picture samples; performing data augmentation on the data of the representative memory and of the incremental task to construct a test set; inputting the test picture samples in the test set into the convolutional neural network, and extracting feature vectors through the convolutional neural network; and inputting the feature vectors of all the test picture samples into a trained classifier, and incrementally outputting a picture classification result through the classifier. The invention can learn picture features and classify pictures online, thereby improving the classification accuracy.

Description

Picture classification method and device supporting incremental learning

Technical Field

The invention relates to a picture classification method and device supporting incremental learning, and belongs to the technical field of image classification.

Background

Most traditional machine learning is offline batch learning, which cannot learn data characteristics dynamically and requires a large amount of memory to store historical data. Incremental learning aims to develop algorithms that never stop learning but continuously update the model parameters over time. In recent years, with the rapid growth of big data, incremental learning has become a popular research direction.

Many research results have been published in this field in recent years. Early incremental learning algorithms were largely based on feed-forward neural networks. Parekh R. et al. proposed a method for incremental learning that adjusts the structure of a multi-layer perceptron (MLP) neural network: when the knowledge contained in a new sample falls outside what the MLP network has already learned, the new sample is learned incrementally by adding hidden-layer units. However, when the number of new samples is large, incremental learning by weight adjustment and structure adjustment becomes difficult. The passive-aggressive (PA) learning algorithm takes the confidence of the prediction into account, i.e. the margin between the sample and the current decision boundary, and uses this information to assist the model update, so that the learning step size on each sample depends on the confidence with which that sample is classified. Confidence-weighted (CW) learning is a second-order algorithm that extends the PA algorithm. CW always forces the current sample to be classified correctly, a constraint that makes it prone to overfitting on noisy data.

Disclosure of Invention

The invention aims to overcome the defects in the prior art and provides a picture classification method and device supporting incremental learning, which can be used for classifying pictures on line and remarkably improving the classification accuracy.

In order to achieve the above purpose, the invention is realized by adopting the following technical scheme:

in a first aspect, the present invention provides a method for classifying pictures, including:

selecting training picture samples in a training set, inputting the training picture samples into a convolutional neural network, and selecting typical picture samples through the convolutional neural network;

updating the representative memory with the typical picture samples;

performing data augmentation on the data of the representative memory and of the incremental task to construct a test set;

inputting the test picture samples in the test set into the convolutional neural network, and extracting feature vectors through the convolutional neural network;

and inputting the feature vectors of all the test picture samples into a trained classifier, and incrementally outputting a picture classification result through the classifier.

Preferably, selecting the typical picture samples through the convolutional neural network includes: the convolutional neural network outputs posterior probabilities for those input training picture samples that it identifies correctly, and typical picture samples are selected from these training picture samples in descending order of posterior probability.

Preferably, the convolutional neural network includes: 8 convolution layers, each using 3×3 convolution kernels and 64 nodes; MaxPooling layers added after the 2nd, 4th, 6th and 8th convolution layers, each with a pool size of 2×2 and a stride of 2; and a Dropout layer with a dropout probability of 0.5 added after the last MaxPooling layer.

Preferably, extracting the feature vectors through the convolutional neural network includes:

carrying out convolution processing on an input test picture sample through a convolution layer to obtain image characteristics;

normalizing the image features through batch normalization;

carrying out pooling treatment on the normalized image features through a MaxPooling layer;

and activating the pooled image features through a ReLU activation function to obtain feature vectors.

Preferably, the classifier comprises a regular dual average algorithm RDA and a soft confidence weighting algorithm SCW, wherein the regular dual average algorithm RDA is used for improving the feature extraction capability of the stream data, and the soft confidence weighting algorithm SCW is used for outputting classification results.

Preferably, the expression of the optimization problem of the soft confidence weighting algorithm SCW is as follows:

where N(μ, Σ) denotes the Gaussian distribution, with mean vector μ and covariance matrix Σ, that the classifier obeys; D_KL denotes the divergence between the new Gaussian distribution N(μ_{t+1}, Σ_{t+1}) and the current Gaussian distribution N(μ_t, Σ_t); ℓ^φ is the loss function; C is a parameter trading off passiveness against aggressiveness; and (x_t, y_t) denotes the input feature vector and its label;

the closed-form solution is as follows:

where α_t and β_t are coefficients, expressed as follows:

where m_t = y_t(μ_t · x_t); φ = Φ^{-1}(η); Φ denotes the cumulative distribution function of the normal distribution; Φ^{-1}(η) is the inverse function of Φ, which is used to generate random variables obeying that distribution; and η ∈ [0, 1].
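The expressions referred to above are not reproduced in this text. For orientation only, the standard SCW-I formulation from the online-learning literature is consistent with the symbols defined above and is sketched below. Whether the invention uses exactly this variant is an assumption, and the auxiliary symbols υ_t, u_t, ψ and ζ come from that standard formulation rather than from this document.

```latex
\begin{aligned}
(\mu_{t+1}, \Sigma_{t+1}) &= \arg\min_{\mu,\Sigma}\;
  D_{KL}\big(N(\mu,\Sigma)\,\|\,N(\mu_t,\Sigma_t)\big)
  + C\,\ell^{\phi}\big(N(\mu,\Sigma);(x_t,y_t)\big) \\
\ell^{\phi}\big(N(\mu,\Sigma);(x_t,y_t)\big) &=
  \max\!\big(0,\; \phi\sqrt{x_t^{\top}\Sigma x_t} - y_t\,(\mu\cdot x_t)\big) \\[4pt]
\mu_{t+1} &= \mu_t + \alpha_t\, y_t\, \Sigma_t x_t \\
\Sigma_{t+1} &= \Sigma_t - \beta_t\, \Sigma_t x_t x_t^{\top} \Sigma_t \\[4pt]
\alpha_t &= \min\!\Big\{C,\; \max\!\Big\{0,\;
  \tfrac{1}{\upsilon_t \zeta}\Big(-m_t\psi
  + \sqrt{\tfrac{m_t^{2}\phi^{4}}{4} + \upsilon_t\phi^{2}\zeta}\Big)\Big\}\Big\} \\
\beta_t &= \frac{\alpha_t\phi}{\sqrt{u_t} + \upsilon_t\alpha_t\phi} \\
u_t &= \tfrac{1}{4}\Big(-\alpha_t\upsilon_t\phi
  + \sqrt{\alpha_t^{2}\upsilon_t^{2}\phi^{2} + 4\upsilon_t}\Big)^{2} \\
\upsilon_t &= x_t^{\top}\Sigma_t x_t,\quad
m_t = y_t(\mu_t\cdot x_t),\quad
\phi = \Phi^{-1}(\eta),\quad
\psi = 1 + \tfrac{\phi^{2}}{2},\quad
\zeta = 1 + \phi^{2}
\end{aligned}
```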

Preferably, the expression of the weight update strategy of the regular dual average algorithm RDA is as follows:

where ⟨G^(t), W⟩ denotes the inner product of the averaged gradient G^(t) and the weight W; ψ(W) is the regularization term, ψ(W) = λ‖W‖₁, where λ denotes the learning rate; h(W) is an auxiliary strongly convex function; β^(t) is a non-negative and non-decreasing sequence; and t denotes the time step.
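The update expression itself is not reproduced in this text. As a hedged illustration, the sketch below implements the well-known closed-form step of ℓ1-regularized dual averaging, which minimizes ⟨G^(t), W⟩ + λ‖W‖₁ + (β^(t)/t)·h(W) under two assumptions that this document does not state: h(W) = ½‖W‖² and β^(t) = γ√t. The function name `rda_l1_step` and the parameters `lam` and `gamma` are hypothetical.

```python
import math


def rda_l1_step(avg_grad, grad, t, lam=0.1, gamma=1.0):
    """One l1-RDA step at time step t (t starts at 1).

    Assumes h(W) = 0.5 * ||W||^2 and beta_t = gamma * sqrt(t); both are
    illustrative choices, not taken from the patent text.
    Returns (updated running-average gradient, new weight vector).
    """
    # Running average of all gradients seen so far.
    new_avg = [a + (g - a) / t for a, g in zip(avg_grad, grad)]
    scale = math.sqrt(t) / gamma
    weights = []
    for g_bar in new_avg:
        # Soft-threshold the averaged gradient by lam, then flip its sign.
        shrunk = max(abs(g_bar) - lam, 0.0)
        if shrunk > 0.0:
            weights.append(-scale * math.copysign(shrunk, g_bar))
        else:
            weights.append(0.0)  # coordinate is truncated to exactly zero
    return new_avg, weights


avg = [0.0, 0.0, 0.0]
avg, w1 = rda_l1_step(avg, [1.0, -0.05, 0.0], t=1)
avg, w2 = rda_l1_step(avg, [0.5, 0.05, -1.0], t=2)
```

Coordinates whose averaged gradient stays within λ in magnitude are set exactly to zero, which is the sparsity-inducing behaviour usually associated with this kind of update.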

Preferably, the training of the classifier includes:

selecting training picture samples in a training set, inputting the training picture samples into a convolutional neural network, and extracting feature vectors through the convolutional neural network;

the classifier outputs a prediction label according to the feature vector of the input training picture sample;

judging whether the predicted label is consistent with the actual label of the training picture sample,

if yes, the classifier is kept unchanged, and if not, parameters of the classifier are updated.
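The training rule above (keep the classifier when the prediction is right, update it when the prediction is wrong) can be sketched as a mistake-driven online loop. The classifier below is a plain perceptron used purely as a hypothetical stand-in so that the example is self-contained; it is not the RDA/SCW classifier of the invention, and the names `PerceptronStandIn` and `train_online` are assumptions.

```python
class PerceptronStandIn:
    """Hypothetical stand-in classifier with labels in {-1, +1}."""

    def __init__(self, dim):
        self.w = [0.0] * dim

    def predict(self, x):
        score = sum(wi * xi for wi, xi in zip(self.w, x))
        return 1 if score >= 0 else -1

    def update(self, x, y):
        # Perceptron rule; the invention would apply its own update here.
        self.w = [wi + y * xi for wi, xi in zip(self.w, x)]


def train_online(classifier, features, labels):
    """Process samples one by one and return how many updates were made."""
    updates = 0
    for x, y in zip(features, labels):
        if classifier.predict(x) != y:   # predicted label differs from actual
            classifier.update(x, y)      # update the classifier parameters
            updates += 1
        # otherwise the classifier is kept unchanged
    return updates


features = [[1.0, 0.5], [-1.0, -0.5], [0.8, 0.1], [-0.7, -0.2]]
labels = [1, -1, 1, -1]
clf = PerceptronStandIn(dim=2)
first_pass = train_online(clf, features, labels)
second_pass = train_online(clf, features, labels)
```

In this toy run the stand-in needs one update on the first pass and none on the second, since every sample is then already classified correctly.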

In a second aspect, the present invention provides a picture classification apparatus supporting incremental learning, including a processor and a storage medium;

the storage medium is used for storing instructions;

the processor is operative according to the instructions to perform the steps of the method according to any one of the preceding claims.

Compared with the prior art, the invention has the beneficial effects that:

the image classification method and device supporting incremental learning provided by the invention can be used for classifying images on line, and learning can be carried out according to the incremental images, so that the classification accuracy is improved.

Drawings

Fig. 1 is a flowchart of a picture classification method supporting incremental learning according to an embodiment of the present invention.

Detailed Description

The invention is further described below with reference to the accompanying drawings. The following examples are only for more clearly illustrating the technical aspects of the present invention, and are not intended to limit the scope of the present invention.

Embodiment one:

As shown in Fig. 1, an embodiment of the present invention provides a method for classifying pictures, including the following steps:

Step 1, selecting training picture samples in a training set, inputting the training picture samples into a convolutional neural network, and selecting typical picture samples through the convolutional neural network.

Step 2, updating the representative memory with the typical picture samples.

Step 3, performing data augmentation on the data of the representative memory and of the incremental task to construct a test set.

Step 4, inputting the test picture samples in the test set into the convolutional neural network, and extracting feature vectors through the convolutional neural network.

Step 5, inputting the feature vectors of all the test picture samples into a trained classifier, and incrementally outputting a picture classification result through the classifier.
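The text does not say how the representative memory is updated in Step 2. The sketch below shows one possible policy, purely as an assumption: the memory has a fixed total capacity shared equally among all classes seen so far, and within each class the samples with the highest posterior probability are kept. The function name `update_memory` and the `(sample id, posterior)` record layout are hypothetical.

```python
from typing import Dict, List, Tuple

Sample = Tuple[str, float]  # (sample id, posterior probability); hypothetical layout


def update_memory(
    memory: Dict[int, List[Sample]],
    new_typical: Dict[int, List[Sample]],
    capacity: int,
) -> Dict[int, List[Sample]]:
    """Merge new typical samples into the memory under a fixed capacity."""
    classes = sorted(set(memory) | set(new_typical))
    if not classes:
        return {}
    per_class = capacity // len(classes)  # assumed: equal share per class
    updated = {}
    for c in classes:
        merged = memory.get(c, []) + new_typical.get(c, [])
        merged.sort(key=lambda s: -s[1])  # highest posterior first
        updated[c] = merged[:per_class]
    return updated


memory = {0: [("a", 0.99), ("b", 0.90)], 1: [("c", 0.95), ("d", 0.80)]}
incoming = {2: [("e", 0.97), ("f", 0.60), ("g", 0.91)]}  # samples of a new class
memory = update_memory(memory, incoming, capacity=6)
```

With a capacity of 6 and three classes, each class keeps two samples, so the new class contributes "e" and "g" while the old classes retain their existing entries.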

Take the MNIST handwritten-digit data set as an example. MNIST is a classic data set in the machine learning field, consisting of 60,000 training samples and 10,000 test samples, each of which is a 28×28-pixel grayscale picture of a handwritten digit. In the experiment, the 10 classes of samples for the digits 0 to 9 are used, and 1,000 pictures are selected from each class as the initial training set.

The selecting of the typical picture sample through the convolutional neural network comprises the following steps:

Step 1.1, the convolutional neural network outputs posterior probabilities for those input training picture samples that it identifies correctly;

Step 1.2, typical picture samples are selected from these training picture samples in descending order of posterior probability.
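Steps 1.1 and 1.2 can be sketched as a filter followed by a sort. The example assumes the network's softmax outputs are already available as a list of per-class probabilities; the function name `select_typical_samples` and the parameter `k` (how many typical samples to keep) are hypothetical.

```python
from typing import List, Sequence


def select_typical_samples(
    posteriors: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int,
) -> List[int]:
    """Return indices of up to k correctly classified samples, ordered by
    the posterior probability of their true class, from high to low."""
    candidates = []
    for idx, (probs, label) in enumerate(zip(posteriors, labels)):
        predicted = max(range(len(probs)), key=lambda c: probs[c])
        if predicted == label:  # keep only correctly identified samples
            candidates.append((probs[label], idx))
    # Sort by posterior, descending; ties are broken by original index.
    candidates.sort(key=lambda item: (-item[0], item[1]))
    return [idx for _, idx in candidates[:k]]


posteriors = [
    [0.90, 0.10],  # sample 0: label 0, predicted 0, confident
    [0.40, 0.60],  # sample 1: label 0, predicted 1, misclassified
    [0.20, 0.80],  # sample 2: label 1, predicted 1
    [0.45, 0.55],  # sample 3: label 1, predicted 1, low confidence
]
labels = [0, 0, 1, 1]
typical = select_typical_samples(posteriors, labels, k=2)
print(typical)  # [0, 2]
```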

Extracting feature vectors through the convolutional neural network includes:

Step 4.1, performing convolution on an input test picture sample through a convolution layer to obtain image features;

Step 4.2, normalizing the image features through batch normalization;

Step 4.3, pooling the normalized image features through a MaxPooling layer;

Step 4.4, activating the pooled image features through the ReLU activation function to obtain feature vectors.
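The order of operations in Steps 4.1 to 4.4 (convolution, batch normalization, max pooling, ReLU) can be illustrated with a small NumPy sketch. It is a simplification under stated assumptions: a single grayscale image, a single random 3×3 kernel in place of learned weights, zero padding, and normalization over one feature map standing in for statistics computed over a batch. All function names are hypothetical.

```python
import numpy as np


def conv2d_same(image, kernel):
    """Cross-correlation with zero padding (output size equals input size)."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out


def batch_norm(features, eps=1e-5):
    """Normalize to zero mean and unit variance (no learned scale or shift)."""
    return (features - features.mean()) / np.sqrt(features.var() + eps)


def max_pool_2x2(features):
    """2x2 max pooling with stride 2; odd trailing rows/columns are dropped."""
    h, w = features.shape[0] // 2 * 2, features.shape[1] // 2 * 2
    blocks = features[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))


def relu(features):
    return np.maximum(features, 0.0)


rng = np.random.default_rng(0)
image = rng.random((28, 28))          # stand-in for a 28x28 grayscale picture
kernel = rng.standard_normal((3, 3))  # stand-in for one learned 3x3 kernel
features = relu(max_pool_2x2(batch_norm(conv2d_same(image, kernel))))
feature_vector = features.ravel()     # flattened 14x14 map
```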

The convolutional neural network comprises: 8 convolution layers, each using 3×3 convolution kernels and 64 nodes; MaxPooling layers added after the 2nd, 4th, 6th and 8th convolution layers, each with a pool size of 2×2 and a stride of 2; and a Dropout layer with a dropout probability of 0.5 added after the last MaxPooling layer.
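As a quick check of what this architecture implies for a 28×28 MNIST picture, the arithmetic below tracks the spatial size through the four pooling stages and counts the convolution parameters. It assumes "same" padding and stride 1 for the convolutions, floor division in the pooling, and a single input channel; none of these is stated in the text, and batch-normalization parameters are ignored.

```python
def feature_shape(height, width, n_conv=8, filters=64, pool_after=(2, 4, 6, 8)):
    """Track the spatial size through the convolution/pooling stack."""
    for layer in range(1, n_conv + 1):
        # A 3x3 convolution with 'same' padding leaves height/width unchanged.
        if layer in pool_after:
            height, width = height // 2, width // 2  # 2x2 pool, stride 2
    return height, width, filters


def conv_parameter_count(in_channels=1, n_conv=8, filters=64, kernel=3):
    """Weights plus biases of the convolution layers only."""
    total = 0
    channels = in_channels
    for _ in range(n_conv):
        total += kernel * kernel * channels * filters + filters
        channels = filters
    return total


shape = feature_shape(28, 28)      # 28 -> 14 -> 7 -> 3 -> 1
n_params = conv_parameter_count()
print(shape, n_params)             # (1, 1, 64) 259136
```

Under these assumptions the extracted feature vector has 64 components, which is the input dimension the downstream classifier would see.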

The classifier comprises a regular dual average algorithm RDA and a soft confidence weighting algorithm SCW, wherein the regular dual average algorithm RDA is used for improving the feature extraction capacity of the stream data, and the soft confidence weighting algorithm SCW is used for outputting classification results.

The expression of the optimization problem of the soft confidence weighting algorithm SCW is as follows:

the closed-form solution is as follows:

where m_t = y_t(μ_t · x_t); φ = Φ^{-1}(η); Φ denotes the cumulative distribution function of the normal distribution; Φ^{-1}(η) is the inverse function of Φ, which is used to generate random variables obeying that distribution; and η ∈ [0, 1].
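A minimal sketch of one SCW update follows. Because the expressions are not reproduced in this text, it assumes the standard SCW-I closed form from the online-learning literature, binary labels in {−1, +1}, and η strictly between 0.5 and 1. The class name `SCW1`, the defaults `C=1.0` and `eta=0.9`, and the auxiliary quantities `v`, `psi` and `zeta` are assumptions; a ten-class task such as MNIST would additionally need a multi-class scheme (for example one-vs-rest), which is not shown.

```python
import math
from statistics import NormalDist

import numpy as np


class SCW1:
    """Binary SCW-I classifier (standard formulation, assumed here)."""

    def __init__(self, dim, C=1.0, eta=0.9):
        self.mu = np.zeros(dim)      # mean vector of the weight distribution
        self.sigma = np.eye(dim)     # covariance matrix
        self.C = C
        self.phi = NormalDist().inv_cdf(eta)   # phi = Phi^{-1}(eta)
        self.psi = 1.0 + self.phi ** 2 / 2.0
        self.zeta = 1.0 + self.phi ** 2

    def predict(self, x):
        return 1 if self.mu @ x >= 0 else -1

    def update(self, x, y):
        """Apply one update; return True if the parameters changed."""
        v = float(x @ self.sigma @ x)      # confidence term x^T Sigma x
        m = float(y * (self.mu @ x))       # margin m_t = y_t (mu_t . x_t)
        loss = max(0.0, self.phi * math.sqrt(v) - m)
        if loss == 0.0 or v == 0.0:
            return False                   # passive: nothing to correct
        phi, psi, zeta = self.phi, self.psi, self.zeta
        alpha = (-m * psi + math.sqrt(m * m * phi ** 4 / 4.0
                                      + v * phi ** 2 * zeta)) / (v * zeta)
        alpha = min(self.C, max(0.0, alpha))
        u = 0.25 * (-alpha * v * phi
                    + math.sqrt(alpha ** 2 * v ** 2 * phi ** 2 + 4.0 * v)) ** 2
        beta = alpha * phi / (math.sqrt(u) + v * alpha * phi)
        sx = self.sigma @ x
        self.mu = self.mu + alpha * y * sx
        self.sigma = self.sigma - beta * np.outer(sx, sx)
        return True


# Toy feature vectors whose first coordinate decides the label.
X = np.array([[1.0, 0.2], [-1.0, -0.1], [0.8, -0.3], [-0.9, 0.4]])
Y = [1, -1, 1, -1]
model = SCW1(dim=2)
for _ in range(3):
    for x, y in zip(X, Y):
        model.update(x, y)
```

The covariance shrinks along directions that have been observed, so later updates along those directions become smaller; the constant `C` caps how aggressive any single update can be.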

Training of the classifier includes:

Step 5.1, selecting training picture samples in a training set, inputting the training picture samples into the convolutional neural network, and extracting feature vectors through the convolutional neural network;

Step 5.2, the classifier outputs a prediction label according to the feature vector of the input training picture sample;

Step 5.3, judging whether the predicted label is consistent with the actual label of the training picture sample; if yes, the classifier is kept unchanged, and if not, the parameters of the classifier are updated.

Compared with conventional incremental learning algorithms such as the regular dual average algorithm and the soft confidence weighting algorithm, the invention improves classification accuracy and model convergence speed to a certain extent, and is a highly practical method for classifying pictures online.

Embodiment two:

The embodiment of the invention provides a picture classification device supporting incremental learning, which comprises a processor and a storage medium; the storage medium is used for storing instructions; the processor operates according to the instructions to perform the steps of any of the methods according to Embodiment one.

It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.

Claims

1. A picture classification method, comprising:

updating the representative memory by a typical picture sample;

inputting the feature vectors of all the test picture samples into a trained classifier, and outputting a picture classification result through the increment of the classifier;

the classifier comprises a regular dual average algorithm RDA and a soft confidence weighting algorithm SCW, wherein the regular dual average algorithm RDA is used for improving the feature extraction capacity of stream data, and the soft confidence weighting algorithm SCW is used for outputting classification results;

the closed-form solution is as follows:

where m_t = y_t(μ_t · x_t); φ = Φ^{-1}(η); Φ denotes the cumulative distribution function of the normal distribution; Φ^{-1}(η) is the inverse function of Φ, which is used to generate random variables obeying that distribution; and η ∈ [0, 1];

The expression of the weight update strategy of the regular dual average algorithm RDA is as follows:

2. The method of claim 1, wherein the selecting a typical picture sample via a convolutional neural network comprises:

the convolutional neural network outputs posterior probability according to training picture samples which can be correctly identified in the input training picture samples, and typical picture samples are selected from the training picture samples according to the sequence of the posterior probability from high to low.

3. A method of classifying pictures according to claim 2, wherein said convolutional neural network comprises: 8 convolution layers, each using 3×3 convolution kernels and 64 nodes; MaxPooling layers added after the 2nd, 4th, 6th and 8th convolution layers, each with a pool size of 2×2 and a stride of 2; and a Dropout layer with a dropout probability of 0.5 added after the last MaxPooling layer.

4. A method of classifying pictures according to claim 3, wherein said extracting feature vectors through the convolutional neural network comprises:

normalizing the image features through batch normalization;

5. A method of classifying pictures according to claim 1, wherein the training of the classifier comprises:

6. The picture classifying device is characterized by comprising a processor and a storage medium;

the storage medium is used for storing instructions;

the processor being operative according to the instructions to perform the steps of the method according to any one of claims 1-5.