CN107729992B - Deep learning method based on back propagation - Google Patents

Deep learning method based on back propagation

Info

Publication number
CN107729992B
CN107729992B (application CN201711029714.5A)
Authority
CN
China
Prior art keywords
image
training
network
deep learning
training set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711029714.5A
Other languages
Chinese (zh)
Other versions
CN107729992A (en)
Inventor
王好谦
安王鹏
方璐
戴琼海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Weilai Media Technology Research Institute
Shenzhen Graduate School Tsinghua University
Original Assignee
Shenzhen Weilai Media Technology Research Institute
Shenzhen Graduate School Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Weilai Media Technology Research Institute, Shenzhen Graduate School Tsinghua University filed Critical Shenzhen Weilai Media Technology Research Institute
Priority to CN201711029714.5A
Publication of CN107729992A
Application granted
Publication of CN107729992B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/084: Backpropagation, e.g. using gradient descent
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/20: Image preprocessing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a deep learning method based on back propagation, which comprises the following steps: S1: preparing a training set; S2: inputting the training set into a convolutional neural network to obtain the network output; S3: calculating the distance between the network output and the true values in the training set to obtain a cross entropy objective function; S4: judging, according to the cross entropy objective function, whether the accuracy of the network output in step S2 has improved; if so, executing step S5, and if not, ending the training; S5: updating the weights of the convolutional neural network using a sine exponential learning rate, loading the updated weights into the convolutional neural network of step S2, and repeating steps S2 to S4. The deep learning method provided by the invention can greatly accelerate the training of a classification network.

Description

Deep learning method based on back propagation
Technical Field
The invention relates to the field of computer vision and image processing, in particular to a deep learning method based on back propagation.
Background
Image classification belongs to the field of computer vision and image processing and is one of the core problems in computer vision. It has many practical applications and can be specialized into problems such as face localization and pedestrian localization, so it is an image processing problem of a fundamental nature with important academic and industrial research value. The goal of image classification is, given a fixed set of classification labels, to select a label from that set for an input image and assign it to the image. Recognizing a visual concept such as "cat" is extremely simple for a human, yet from the perspective of computer vision algorithms it poses major challenges: 1. viewpoint change: the camera can present the same object from multiple angles; 2. scale change: the visible size of an object often varies (not only in pictures, but also in the real world); 3. deformation: the shapes of many things are not fixed and can change greatly; 4. occlusion: the target object may be occluded, and sometimes only a small part of it (possibly as little as a few pixels) is visible; 5. illumination conditions: at the pixel level, the influence of illumination is very large; 6. background clutter: objects may blend into the background, making them hard to identify; 7. intra-class variation: objects of the same class, such as chairs, include many different instances, each with its own shape, and shape varies greatly from individual to individual. In the face of all the above variations and their combinations, a good image classification model should maintain stable classification conclusions while remaining sufficiently sensitive to inter-class differences.
Currently, mainstream image classification methods fall into three main categories: methods based on the nearest neighbor classifier (KNN), methods based on the support vector machine, and methods based on deep learning. The KNN-based method is simple and clear in concept and requires no training at all, but it is time-consuming when the model is deployed, since the input picture must be compared with every picture in the training set. The computational complexity of support-vector-machine-based methods depends on the number of support vectors rather than the dimensionality of the sample space, which to some extent avoids the "curse of dimensionality"; however, such methods offer weak interpretability of the high-dimensional mapping of the kernel function, depend on the experience of the experimenter, and are sensitive to missing data. Deep-learning-based methods learn a large amount of prior knowledge from the training set and can automatically extract features from the data, so features no longer need to be selected manually; deep neural networks have achieved breakthrough results in computer vision, natural language processing, speech recognition, and other fields. However, in use, such neural networks have many hyper-parameters, require long debugging, and generally train slowly, which to some extent limits the wide application of deep learning to image classification.
The above background disclosure is only intended to assist understanding of the concept and technical solution of the present invention and does not necessarily belong to the prior art of the present patent application; in the absence of clear evidence that the above content was disclosed before the filing date of the present patent application, it should not be used to evaluate the novelty and inventive step of the present application.
Disclosure of Invention
In order to solve the technical problem, the invention provides a deep learning method based on back propagation, which can greatly accelerate the training of a classification network.
In order to achieve the purpose, the invention adopts the following technical scheme:
the invention discloses a deep learning method based on back propagation, which comprises the following steps:
S1: preparing a training set;
S2: inputting the training set into a convolutional neural network to obtain network output;
S3: calculating the distance between the network output and the true values in the training set to obtain a cross entropy objective function;
S4: judging, according to the cross entropy objective function, whether the accuracy of the network output in step S2 has improved; if so, executing step S5, and if not, ending the training;
S5: updating the weights of the convolutional neural network using a sine exponential learning rate, loading the updated weights into the convolutional neural network of step S2, and repeating steps S2 to S4.
Preferably, step S1 includes: enhancing the natural image public data set to obtain an enhanced data set serving as the training set.
Preferably, the enhancement processing on the natural image public data set in step S1 specifically includes: rotating an image, cropping an image, scaling an image, contrast transformation, changing the size of an image, and adding noise to an image.
Preferably, the contrast transformation specifically comprises: in the HSV color space of the image, keeping the hue H unchanged and performing an exponential operation on the saturation S and brightness V components of each pixel, wherein the exponential factor is between 0.25 and 4.
Preferably, step S1 further includes using an image interpolation method to unify the enhanced data set to a size of 224 × 224 as the training set.
Preferably, a wide residual network is selected as the convolutional neural network in step S2, serving as the model to be learned.
Preferably, the sine exponential learning rate in step S5 is:
[Equation image in the original: the sine exponential learning rate lr(t), defined in terms of the quantities below.]
In the formula, lr0 denotes the initial learning rate, t denotes the current iteration number during network training, α is the attenuation coefficient, β is the oscillation coefficient, and T denotes the number of iterations required for one pass of the training set through the network.
Compared with the prior art, the invention has the following beneficial effects: the deep learning method based on back propagation disclosed by the invention optimizes the convolutional neural network by adopting the sine exponential learning rate as the back propagation parameter, which greatly accelerates the training of the classification network and lets the neural network acquire the ability to extract features accurately more quickly. The exponential term continuously reduces the learning rate, while the sine term makes the learning rate oscillate within a certain range, so that the deep learning classification network can be trained faster.
In a further scheme, the parameters of the model are updated with the specific sine exponential learning rate, so that the deep learning model can train the classification network at 2 to 5 times the usual speed.
Drawings
FIG. 1 is a flow chart of the deep learning method based on back propagation according to a preferred embodiment of the present invention;
FIG. 2 is a graph of the sine exponential learning rate of the preferred embodiment of the present invention.
Detailed Description
The invention will be further described with reference to the accompanying drawings and preferred embodiments.
As shown in fig. 1, the deep learning method based on back propagation of the preferred embodiment of the present invention includes the following steps:
S1: preparing a training set;
in this embodiment, a natural Image common data set (e.g., Image Net data set) is used as a training set, and in order to enrich the Image training set, Image features of the Image Net data set are better extracted, a model is generalized (model overfitting is prevented), and the data set is subjected to Image enhancement to obtain a larger data set.
The image enhancement includes the following operations:
1. rotating the image by a random angle to change the orientation of the image content;
2. cropping the image, i.e., randomly extracting a region within the image;
3. scaling transformation, i.e., scaling the image by a certain ratio;
4. contrast transformation, i.e., changing the saturation S and brightness V components while keeping the hue H unchanged in the HSV color space of the image; in a specific embodiment, an exponential operation with an exponent between 0.25 and 4 is applied to the S and V components of each pixel to increase the illumination variation;
5. changing the size of the image, i.e., resizing the original image with a certain probability using the bilinear interpolation method of OpenCV;
6. adding noise to the image; in a specific embodiment, Gaussian noise drawn from a standard normal distribution is added.
In a specific embodiment, the image interpolation method of OpenCV is used to unify all enhanced data to a size of 224 × 224, forming the training set that is input to the network.
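As a concrete illustration of the six enhancement operations and the size unification above, the following Python sketch uses OpenCV and NumPy. It is a minimal sketch only: the rotation range, crop ratio, scale range, and noise amplitude are assumptions chosen for illustration and are not values given in the patent; only the HSV exponent range of 0.25 to 4, the 224 × 224 output size, bilinear interpolation, and standard-normal noise come from the text.

```python
import cv2
import numpy as np

def contrast_transform(img_bgr, rng=np.random):
    # HSV contrast transform: keep hue H, raise S and V to a random power in [0.25, 4]
    hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV).astype(np.float32)
    h, s, v = cv2.split(hsv)
    gamma = rng.uniform(0.25, 4.0)
    s = np.clip(255.0 * (s / 255.0) ** gamma, 0, 255)
    v = np.clip(255.0 * (v / 255.0) ** gamma, 0, 255)
    hsv = cv2.merge([h, s, v]).astype(np.uint8)
    return cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)

def augment(img_bgr, rng=np.random):
    rows, cols = img_bgr.shape[:2]
    # 1. random rotation about the image centre (angle range is an assumption)
    angle = rng.uniform(-15, 15)
    M = cv2.getRotationMatrix2D((cols / 2.0, rows / 2.0), angle, 1.0)
    img = cv2.warpAffine(img_bgr, M, (cols, rows))
    # 2. random crop keeping 87.5% of each dimension (ratio is an assumption)
    ch, cw = int(rows * 0.875), int(cols * 0.875)
    y0, x0 = rng.randint(0, rows - ch + 1), rng.randint(0, cols - cw + 1)
    img = img[y0:y0 + ch, x0:x0 + cw]
    # 3. random scaling (range is an assumption)
    scale = rng.uniform(0.8, 1.2)
    img = cv2.resize(img, None, fx=scale, fy=scale, interpolation=cv2.INTER_LINEAR)
    # 4. HSV contrast transform with exponent in [0.25, 4]
    img = contrast_transform(img, rng)
    # 6. additive noise from a standard normal distribution (amplitude is an assumption)
    noise = 10.0 * rng.standard_normal(img.shape)
    img = np.clip(img.astype(np.float32) + noise, 0, 255).astype(np.uint8)
    # 5. and final unification: resize to 224 x 224 with bilinear interpolation
    return cv2.resize(img, (224, 224), interpolation=cv2.INTER_LINEAR)
```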
S2: inputting the preprocessed data set serving as a training set into a convolutional neural network to obtain network output;
in a specific embodiment, a wide residual network is selected as the model to be learned, and the parameters of the model are initialized from a Gaussian distribution.
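A hedged sketch of this step follows. The patent does not name a specific wide-residual-network variant or initialization parameters, so the torchvision model wide_resnet50_2 and the standard deviation of 0.01 below are assumptions used only to illustrate selecting a wide residual network and initializing its weights from a Gaussian distribution.

```python
import torch.nn as nn
from torchvision.models import wide_resnet50_2  # one possible wide residual network (assumption)

def gaussian_init(module, std=0.01):
    # initialize convolutional and fully connected weights from a zero-mean Gaussian
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        nn.init.normal_(module.weight, mean=0.0, std=std)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

model = wide_resnet50_2(num_classes=1000)  # ImageNet-scale label set
model.apply(gaussian_init)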
S3: calculating a cross entropy objective function that represents the distance between the network output of step S2 and the true values in the data set of step S1 (class labels may be provided with the data set);
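A minimal sketch of this objective, assuming the common formulation in which the network output is a vector of raw class scores (logits) and the true values are integer class labels stored with the data set; PyTorch's built-in cross-entropy is used here purely for illustration.

```python
import torch
import torch.nn.functional as F

def cross_entropy_objective(logits, labels):
    # distance between the network output and the ground-truth labels:
    # negative log-likelihood of the true class under softmax(logits)
    return F.cross_entropy(logits, labels)

# toy example: a batch of 8 outputs over 1000 classes
logits = torch.randn(8, 1000)
labels = torch.randint(0, 1000, (8,))
loss = cross_entropy_objective(logits, labels)
```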
S4: judging, according to the cross entropy objective function of step S3, whether the accuracy of the network output in step S2 has improved; if so, executing step S5, and if not, ending the training;
S5: combining the oscillation characteristic of the sine function with the decreasing characteristic of the exponential function, the weights of the convolutional neural network are updated using the sine exponential learning rate; the updated weights are loaded into the convolutional neural network of step S2, and steps S2 to S4 are repeated.
As shown in fig. 2, the sine exponential learning rate is:
[Equation image in the original: the sine exponential learning rate lr(t), defined in terms of the quantities below.]
In the formula, lr0 denotes the initial learning rate, t denotes the current iteration number during network training, α is the attenuation coefficient, β is the oscillation coefficient, and T denotes the number of samples in the training set divided by the number of samples per iteration, i.e., the number of iterations required for one pass of the training set through the network.
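The exact closed form of the learning rate appears only as an image in the original document, so the expression below is an assumption: it combines an exponential decay governed by the attenuation coefficient α with a sine oscillation governed by the oscillation coefficient β, using the variables defined above, but the precise formula in the patent may differ.

```python
import math

def sine_exponential_lr(t, lr0=0.1, alpha=0.9, beta=0.1, T=1000):
    # t   : current training iteration
    # T   : iterations per pass over the training set (samples / batch size)
    # lr0 : initial learning rate; alpha: attenuation; beta: oscillation
    decay = lr0 * alpha ** (t / T)                              # monotonically decreasing envelope
    oscillation = 1.0 + beta * math.sin(2.0 * math.pi * t / T)  # bounded oscillation around the envelope
    return decay * oscillation
```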
The deep learning method based on back propagation of the preferred embodiment of the invention thus proceeds as follows: an image public data set is first preprocessed to obtain an enhanced data set, which serves as the training set of the wide residual network; the objective function of the training model is then set to the cross entropy objective function, and the optimization method continuously reduces this objective so that the recognition capability of the model keeps improving. The optimization method is based on back propagation and trains the model with the sine exponential learning rate: the gradients of the model weights are repeatedly computed from the cross entropy, and the sine exponential learning rate is used as the back propagation parameter to continuously update the parameters of the model. With the deep learning method based on back propagation of the preferred embodiment, the neural network can be trained rapidly, so that the deep learning model can train the classification network at 2 to 5 times the usual speed.
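The paragraph above summarizes the whole procedure; the following sketch ties the pieces together in a PyTorch-style training loop, reusing the sine_exponential_lr function sketched earlier. The optimizer choice, the default coefficients, and the use of training accuracy for the stopping test in step S4 are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def train(model, loader, epochs=100, lr0=0.1, alpha=0.9, beta=0.1):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr0)
    T = len(loader)                  # iterations for one pass over the training set
    best_acc, t = 0.0, 0
    for epoch in range(epochs):
        correct, total = 0, 0
        for images, labels in loader:
            logits = model(images)                            # S2: forward pass
            loss = F.cross_entropy(logits, labels)            # S3: cross entropy objective
            optimizer.zero_grad()
            loss.backward()                                   # back-propagate gradients
            for group in optimizer.param_groups:              # S5: sine exponential learning rate
                group["lr"] = sine_exponential_lr(t, lr0, alpha, beta, T)
            optimizer.step()                                  # update the network weights
            correct += (logits.argmax(dim=1) == labels).sum().item()
            total += labels.size(0)
            t += 1
        accuracy = correct / total
        if accuracy <= best_acc:                              # S4: accuracy no longer improves
            break
        best_acc = accuracy
    return model
```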
The foregoing is a more detailed description of the invention in connection with specific preferred embodiments, and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several equivalent substitutions or obvious modifications can be made without departing from the spirit of the invention, and all of them are considered to fall within the protection scope of the invention.

Claims (6)

1. A deep learning method based on back propagation is characterized by comprising the following steps:
S1: preparing a training set, wherein a natural image public data set is used as the training set;
S2: inputting the training set into a convolutional neural network to obtain network output;
S3: calculating the distance between the network output and the true values in the training set to obtain a cross entropy objective function;
S4: judging, according to the cross entropy objective function, whether the accuracy of the network output in step S2 has improved; if so, executing step S5, and if not, ending the training;
S5: updating the weights of the convolutional neural network using a sine exponential learning rate, loading the updated weights into the convolutional neural network of step S2, and repeating steps S2 to S4, thereby accelerating network training;
the sine exponential learning rate in step S5 is:
[Equation image in the original: the sine exponential learning rate lr(t), defined in terms of the quantities below.]
In the formula, lr0 denotes the initial learning rate, t denotes the current iteration number during network training, α is the attenuation coefficient, β is the oscillation coefficient, and T denotes the number of iterations required for one pass of the training set through the network.
2. The deep learning method according to claim 1, wherein step S1 includes: enhancing the natural image public data set to obtain an enhanced data set serving as the training set.
3. The deep learning method according to claim 2, wherein the enhancement processing on the natural image public data set in step S1 is specifically selected from the following processing modes: rotating an image, cropping an image, scaling an image, contrast transformation, changing the size of an image, and adding noise to an image.
4. The deep learning method of claim 3, wherein the contrast transformation specifically comprises: in the HSV color space of the image, keeping the hue H unchanged, and performing exponential operation on the saturation S and brightness V components of each pixel, wherein the exponential factor is between 0.25 and 4.
5. The deep learning method of claim 2, wherein step S1 further comprises using an image interpolation method to unify the enhanced data set to a size of 224 × 224 as the training set.
6. The deep learning method of claim 1, wherein a wide residual network is selected as the convolutional neural network in step S2, serving as the model to be learned.
CN201711029714.5A 2017-10-27 2017-10-27 Deep learning method based on back propagation Active CN107729992B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711029714.5A CN107729992B (en) 2017-10-27 2017-10-27 Deep learning method based on back propagation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711029714.5A CN107729992B (en) 2017-10-27 2017-10-27 Deep learning method based on back propagation

Publications (2)

Publication Number Publication Date
CN107729992A CN107729992A (en) 2018-02-23
CN107729992B true CN107729992B (en) 2020-12-29

Family

ID=61203084

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711029714.5A Active CN107729992B (en) 2017-10-27 2017-10-27 Deep learning method based on back propagation

Country Status (1)

Country Link
CN (1) CN107729992B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110110780B (en) * 2019-04-30 2023-04-07 南开大学 Image classification method based on antagonistic neural network and massive noise data
JP7279507B2 (en) * 2019-05-21 2023-05-23 富士通株式会社 Information processing device, information processing program and control method

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104850845A (en) * 2015-05-30 2015-08-19 大连理工大学 Traffic sign recognition method based on asymmetric convolution neural network
CN105184312A (en) * 2015-08-24 2015-12-23 中国科学院自动化研究所 Character detection method and device based on deep learning
CN105631479A (en) * 2015-12-30 2016-06-01 中国科学院自动化研究所 Imbalance-learning-based depth convolution network image marking method and apparatus
CN106127230A (en) * 2016-06-16 2016-11-16 上海海事大学 Image-recognizing method based on human visual perception
CN106296692A (en) * 2016-08-11 2017-01-04 深圳市未来媒体技术研究院 Image significance detection method based on antagonism network
CN106789214A (en) * 2016-12-12 2017-05-31 广东工业大学 It is a kind of based on the just remaining pair network situation awareness method and device of string algorithm
CN106909924A (en) * 2017-02-18 2017-06-30 北京工业大学 A kind of remote sensing image method for quickly retrieving based on depth conspicuousness
CN107180430A (en) * 2017-05-16 2017-09-19 华中科技大学 A kind of deep learning network establishing method and system suitable for semantic segmentation
CN107273517A (en) * 2017-06-21 2017-10-20 复旦大学 Picture and text cross-module state search method based on the embedded study of figure

Also Published As

Publication number Publication date
CN107729992A (en) 2018-02-23

Similar Documents

Publication Publication Date Title
Shan et al. Automatic facial expression recognition based on a deep convolutional-neural-network structure
CN111950453B (en) Random shape text recognition method based on selective attention mechanism
CN108416266B (en) Method for rapidly identifying video behaviors by extracting moving object through optical flow
WO2018052587A1 (en) Method and system for cell image segmentation using multi-stage convolutional neural networks
US9449253B2 (en) Learning painting styles for painterly rendering
CN109035260A (en) A kind of sky areas dividing method, device and convolutional neural networks
CN110136162B (en) Unmanned aerial vehicle visual angle remote sensing target tracking method and device
CN110991349B (en) Lightweight vehicle attribute identification method based on metric learning
CN109753996B (en) Hyperspectral image classification method based on three-dimensional lightweight depth network
CN114511576B (en) Image segmentation method and system of scale self-adaptive feature enhanced deep neural network
CN110135371A (en) A kind of Citrus Huanglongbing pathogen recognition methods and device based on Mixup algorithm
CN113822951A (en) Image processing method, image processing device, electronic equipment and storage medium
CN111339902A (en) Liquid crystal display number identification method and device of digital display instrument
CN107958219A (en) Image scene classification method based on multi-model and Analysis On Multi-scale Features
CN107729992B (en) Deep learning method based on back propagation
CN116310693A (en) Camouflage target detection method based on edge feature fusion and high-order space interaction
CN113205103A (en) Lightweight tattoo detection method
Cortes Plant disease classification using convolutional networks and generative adverserial networks
Heo et al. Automatic sketch colorization using DCGAN
CN113096023A (en) Neural network training method, image processing method and device, and storage medium
CN114581918A (en) Text recognition model training method and device
CN116740362B (en) Attention-based lightweight asymmetric scene semantic segmentation method and system
CN117253071B (en) Semi-supervised target detection method and system based on multistage pseudo tag enhancement
CN117437691A (en) Real-time multi-person abnormal behavior identification method and system based on lightweight network
CN109344852A (en) Image-recognizing method and device, analysis instrument and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant