CN112487938A

CN112487938A - Method for realizing garbage classification by utilizing deep learning algorithm

Info

Publication number: CN112487938A
Application number: CN202011349429.3A
Authority: CN
Inventors: 蔡志成; 庄建军; 彭成磊; 冯源; 李保稷
Original assignee: Nanjing University
Current assignee: Nanjing University
Priority date: 2020-11-26
Filing date: 2020-11-26
Publication date: 2021-03-12

Abstract

A method for realizing garbage classification by using a deep learning algorithm, comprising: 1) applying the lightweight convolutional neural network model SqueezeNet to garbage classification: a single preprocessed garbage picture is input into the trained SqueezeNet model, which produces one of 4 outputs, namely the type of garbage in the picture; 2) extracting garbage picture features with convolution kernels: the convolution kernels of the CNN are convolved with the image to extract image features, and kernels with optimal parameters are obtained through training so that garbage images are identified accurately; 3) training and optimizing the neural network by using the Adam algorithm.

Description

Method for realizing garbage classification by utilizing deep learning algorithm

Technical Field

The invention relates to the technical field of neural network algorithm applications, and in particular to a method that facilitates porting garbage classification technology to embedded platforms.

Background

Garbage classification is a major change to the traditional mode of refuse collection and disposal. It can reduce the volume of garbage to be treated, improve the quality of the living environment, and help address the ever-increasing output of garbage and the steadily worsening state of the environment. In first-tier cities such as Shanghai, garbage classification is gradually becoming part of everyday life, but the actual participation rate is low because classifying garbage is complicated. The invention uses a lightweight neural network algorithm to classify garbage accurately, thereby facilitating the porting of garbage classification technology to embedded platforms.

In recent years the convolutional neural network (CNN) has been widely applied in image processing. Compared with conventional neural network algorithms, it features weight sharing and sparse connections between layers, which makes it well suited to processing large amounts of data. The AlexNet model won the ImageNet competition in 2012; it greatly expanded the depth of neural networks by exploiting the characteristics of CNNs and used GPUs to accelerate neural network computation. SqueezeNet, published in 2017, is a lightweight convolutional neural network with accuracy comparable to AlexNet, but its parameter count after compression is about 50 times smaller.

The technical scheme of an existing method for realizing garbage classification is as follows:

Step 1: adopt an Inception-v3 model pre-trained on the large-scale image classification data set ImageNet.

Step 2: using the network structure and parameters of the pre-trained model, process the input garbage image and extract a 2048-dimensional feature vector from it, thereby realizing feature extraction.

Step 3: complete the garbage image classification by using softmax regression.

Shortcomings of the prior art: existing methods for classifying garbage with deep learning algorithms require a large amount of computation and use many parameters, which makes them unsuitable for embedded applications. In addition, their accuracy is not very high, and recognition errors occur easily. The invention aims to provide a deep learning garbage classification algorithm that requires less computation, uses fewer parameters, is better suited to embedded development and achieves higher accuracy.

Because the Inception-v3 model is used with the final classification layer of the network removed and only the penultimate layer of the CNN, namely the bottleneck layer, trained, the model accuracy is relatively low, with classification accuracy reaching only 95%; moreover, with the Inception-v3 model, overfitting easily occurs when the number of pictures is small.

Disclosure of Invention

The object of the invention is to provide a method for realizing garbage classification by using a deep learning algorithm. In particular, the lightweight CNN model SqueezeNet is used to classify garbage pictures accurately, which helps popularize garbage classification policy and lays a foundation for later deploying the method (algorithm) on embedded or mobile devices so that garbage classification can be automated.

The technical scheme of the invention is a method for realizing garbage classification by using a deep learning algorithm, characterized in that: 1) the lightweight convolutional neural network model SqueezeNet is used for garbage classification: a single preprocessed garbage picture is input into the trained SqueezeNet model to finally obtain one of 4 outputs, namely the type of garbage in the picture; 2) garbage picture features are extracted with convolution kernels: the convolution kernels of the CNN are convolved with the image to extract image features, and kernels with optimal parameters are obtained through training so that garbage images are identified accurately; 3) the neural network is trained and optimized by using the Adam algorithm.

In step 1), the garbage pictures are normalized and randomly cropped, and SqueezeNet is used for the classification task. In step 2), image features are extracted with convolution kernels and the garbage pictures are then identified by the CNN. In step 3), the Adam algorithm is used to optimize model training; Adam is an optimization algorithm that continuously adjusts the learning rate during gradient descent training, thereby speeding up iterative convergence and improving the degree of convergence.

In the overall flow of SqueezeNet, an image first passes through a convolutional layer and then through a plurality of fire layers, which extract features and gradually increase the number of channels; the convolutional layer uses the linear rectification function (ReLU function) as its activation function. The data then passes through an average pooling layer to generate classification values, which are passed to a classification layer that uses the normalized exponential function (softmax function) to perform classification.

Fig. 2 shows this overall flow of SqueezeNet: the image passes through a convolutional layer with ReLU activation, then through multiple fire layers that extract features and gradually increase the number of channels, then through an average pooling layer that generates the classification values, and finally through the softmax classification layer.

Fig. 3 shows the fire layer arrangement of SqueezeNet. The squeeze layer acts as a bottleneck layer: it uses 1 × 1 convolution kernels to reduce the number of channels of its input, which reduces the computation required by the 3 × 3 convolutions of the expand layer. The expand layer in turn replaces part of its 3 × 3 convolution kernels with 1 × 1 kernels, further reducing the amount of computation. In fig. 3 the number of convolution kernels in the expand layer is greater than that in the squeeze layer, which ensures that the number of channels of the image increases after it passes through the fire layer.
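As a rough illustration of why the squeeze bottleneck saves computation, the following sketch counts the weights of a fire layer against a plain 3 × 3 convolution with the same input and output channel counts. The channel numbers (96 in, 16 squeeze, 64 + 64 expand) and the helper names are illustrative assumptions, not values stated in this description, and bias terms are ignored.

```python
def conv_params(in_ch: int, out_ch: int, k: int) -> int:
    """Number of weights in a k x k convolution (bias ignored)."""
    return in_ch * out_ch * k * k


def fire_params(in_ch: int, squeeze: int, expand1x1: int, expand3x3: int) -> int:
    """Weights of a fire layer: 1x1 squeeze, then parallel 1x1 and 3x3 expand."""
    squeeze_w = conv_params(in_ch, squeeze, 1)
    expand_w = conv_params(squeeze, expand1x1, 1) + conv_params(squeeze, expand3x3, 3)
    return squeeze_w + expand_w


# Hypothetical channel counts chosen only for illustration.
IN_CH, SQUEEZE, EXPAND_1X1, EXPAND_3X3 = 96, 16, 64, 64

fire = fire_params(IN_CH, SQUEEZE, EXPAND_1X1, EXPAND_3X3)
plain = conv_params(IN_CH, EXPAND_1X1 + EXPAND_3X3, 3)

print(f"fire layer weights:     {fire}")
print(f"plain 3x3 conv weights: {plain}")
print(f"reduction factor:       {plain / fire:.1f}x")
```

With these assumed channel counts the fire layer needs 11776 weights against 110592 for the plain convolution, because the 3 × 3 kernels only see the 16 squeezed channels instead of all 96 input channels.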

The scheme of the invention could be combined with, or replaced by, traditional machine learning algorithms (SVM, ensemble learning and the like), but those methods require manual feature extraction, involve a large amount of computation and achieve lower accuracy.

Advantageous effects: the method uses convolution kernels to extract features from the pictures, which improves its generalization performance. Because the method adopts the lightweight neural network model SqueezeNet, it achieves high accuracy with a small amount of computation and runs fast, which makes the model easier to implement on embedded or mobile devices. Because the method trains a deep convolutional neural network on the picture set, its accuracy is superior to that of other current garbage classification algorithms, and the efficiency of garbage classification can be greatly improved.

The invention adopts the SqueezeNet method: compared with the classical AlexNet, the parameter count of the algorithm after compression is reduced by a factor of 50 while the accuracy stays essentially the same, which makes the algorithm well suited to running on embedded and mobile devices. The squeeze convolution replaces N × N convolutions with 1 × 1 convolutions and reduces the number of input channels, thereby reducing the number of parameters. The invention overcomes the shortcomings of the Inception-v3 approach, namely its relatively low accuracy (classification accuracy of only 95%) and its tendency to overfit when the number of pictures is small.

Drawings

FIG. 1 is a block diagram of the process of the present method, from preprocessing the original data set, through training the model, to testing the model;

FIG. 2 is an overall flow chart of the neural network SqueezeNet of the present invention;

FIG. 3 is a schematic diagram of the configuration of the fire layer of the neural network SqueezeNet of the present invention.

Detailed Description

Referring to fig. 1, the method consists of three parts, namely data set preprocessing, model training and model testing.

1 data preprocessing

1.1 The original data set contains 3000 pictures, which are divided into six categories (cardboard, glass, metal, paper, plastic, trash).

1.2 Normalization of the original data set. An image of size 224 × 224 is cropped at random from each single picture. The picture is then padded with 4 zero-valued pixels on every side and randomly cropped back to its original size; the result serves as the data set for training and testing the model.

Preprocessing is the processing applied to the data before it is input into the model. The original 400 × 400 color image is resized to 224 × 224 and normalized with mean vector (0.4914, 0.4822, 0.4465) and variance vector (0.2023, 0.1994, 0.2010), which effectively reduces computational complexity and stabilizes model convergence. Then 4 zero-valued pixels are padded on each side of the image, and the image is randomly cropped back to its original size (random cropping means randomly selecting a region of a specified size in the image). The processed data is then input into the SqueezeNet model for training.
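The normalization, zero-padding and random-cropping steps described above can be sketched in plain Python as follows. This is a minimal sketch under several assumptions: the picture is already resized (resizing is omitted), pixels are (R, G, B) tuples scaled to [0, 1], and each channel is divided directly by the corresponding entry of the second vector given in the description. The function names and the tiny 8 × 8 stand-in picture are hypothetical.

```python
import random

MEAN = (0.4914, 0.4822, 0.4465)   # mean vector from the description
SCALE = (0.2023, 0.1994, 0.2010)  # second ("variance") vector from the description


def normalize(image, mean=MEAN, scale=SCALE):
    """Subtract the per-channel mean and divide by the per-channel scale."""
    return [[tuple((p[c] - mean[c]) / scale[c] for c in range(3)) for p in row]
            for row in image]


def zero_pad(image, pad=4):
    """Surround the image with `pad` rows and columns of zero-valued pixels."""
    zero = (0.0, 0.0, 0.0)
    blank_row = [zero] * (len(image[0]) + 2 * pad)
    padded = [list(blank_row) for _ in range(pad)]
    padded += [[zero] * pad + list(row) + [zero] * pad for row in image]
    padded += [list(blank_row) for _ in range(pad)]
    return padded


def random_crop(image, height, width, rng):
    """Cut a randomly positioned height x width window out of the image."""
    top = rng.randint(0, len(image) - height)
    left = rng.randint(0, len(image[0]) - width)
    return [row[left:left + width] for row in image[top:top + height]]


def augment(image, rng, pad=4):
    """Zero-pad by `pad` pixels, then randomly crop back to the original size."""
    return random_crop(zero_pad(image, pad), len(image), len(image[0]), rng)


# Tiny 8 x 8 stand-in for a resized 224 x 224 picture (values in [0, 1]).
rng = random.Random(0)
picture = [[(0.5, 0.5, 0.5)] * 8 for _ in range(8)]
sample = augment(normalize(picture), rng)
print(len(sample), len(sample[0]))
```

Padding followed by a crop of the original size shifts the picture by up to 4 pixels in each direction, so the model sees slightly different views of the same picture across training rounds.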

2 model training

2.1 application of the SqueezeNet model:

ReLU function expression:

$$f(x) = \max(0, x)$$

2.2 The data set is divided into a training set and a test set at a ratio of 7:3. The model is trained for 50 epochs with a learning rate of 0.001 and a weight decay of 10^(-5), and is optimized using the Adam algorithm.
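A minimal sketch of the 7:3 split and the stated hyperparameters is given below. The shuffling, the fixed seed and the names `split_dataset`, `train_ids` and `test_ids` are assumptions made for illustration; the description does not say how the split is drawn.

```python
import random

# Hyperparameters stated in the description.
TRAIN_FRACTION = 0.7
EPOCHS = 50
LEARNING_RATE = 1e-3
WEIGHT_DECAY = 1e-5


def split_dataset(items, train_fraction=TRAIN_FRACTION, seed=0):
    """Shuffle a copy of `items` and split it into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seed is an illustrative assumption
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# 3000 picture indices stand in for the data set described in 1.1.
train_ids, test_ids = split_dataset(range(3000))
print(len(train_ids), len(test_ids))
```

For 3000 pictures this yields 2100 training pictures and 900 test pictures, with no picture appearing in both sets.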

3 model test

The accuracy of the model was checked on the test set, where it reached 99.68%.

The invention normalizes and randomly crops the garbage pictures, uses SqueezeNet for the classification task and uses the Adam algorithm to optimize model training. In summary: the lightweight convolutional neural network model SqueezeNet is applied to garbage classification; image features are extracted with convolution kernels and the garbage pictures are then identified by the CNN; and the neural network is trained and optimized with the Adam algorithm.

Classification using SqueezeNet: image data is input into the trained SqueezeNet model, which performs feature extraction on the input image to obtain a feature vector; the softmax function then gives the probability that the image belongs to each category. The category with the maximum probability is the category to which the input image belongs.

Softmax function expression:

Given an array $E = \{e_1, e_2, \dots, e_N\}$, the softmax value of $e_i$ is:

$$\mathrm{softmax}(e_i) = \frac{\exp(e_i)}{\sum_{j=1}^{N} \exp(e_j)}$$

Meaning of the softmax function: its values can be regarded as the probabilities of the output categories.
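The softmax step and the choice of the most probable category can be sketched as follows. The scores and the placeholder labels are made up for illustration, and subtracting the maximum score before exponentiation is a common numerical-stability measure that the description does not mention.

```python
import math


def softmax(scores):
    """Return the softmax probabilities for a list of scores."""
    peak = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores, labels):
    """Pick the label whose softmax probability is the largest."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical network outputs for one picture; the labels are placeholders.
labels = ["class_0", "class_1", "class_2", "class_3"]
scores = [1.0, 3.0, 0.5, 2.0]
label, prob = predict(scores, labels)
print(label, round(prob, 3))
```

The probabilities always sum to 1, and the category with the largest raw score is also the one with the largest probability, so softmax does not change which category is selected.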

Training process: the image data of the training set is input into the SqueezeNet model, and the model is trained iteratively by gradient descent. Verifying the result after training: because the test set is split off from the original data and takes no part in model training, it is unseen by the model. The model's accuracy is measured on the test set to obtain the test accuracy, which shows whether a good result has been achieved.
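Test accuracy as used here is simply the fraction of test pictures whose predicted category matches the true category. A minimal sketch, with made-up labels for five test pictures and a hypothetical helper name `accuracy`:

```python
def accuracy(predictions, targets):
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)


# Made-up labels for five test pictures, for illustration only.
predicted = ["paper", "glass", "paper", "plastic", "glass"]
actual = ["paper", "glass", "plastic", "plastic", "glass"]
print(f"test accuracy: {accuracy(predicted, actual):.2%}")
```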

Meaning of the terms and symbols: the fire layer, the squeeze layer and the expand layer are all module names in SqueezeNet, and 1 × 1 and 3 × 3 are convolution kernel sizes.

Optimization with the Adam algorithm: Adam is an optimization algorithm that dynamically adjusts the learning rate of each parameter during gradient descent training by using first- and second-moment estimates of the gradient, thereby speeding up iterative convergence and improving the degree of convergence. Result: an accuracy of 99.68% was achieved on a carefully chosen test set.
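The way Adam uses first- and second-moment estimates to scale each parameter's step can be sketched as below. The defaults beta1 = 0.9, beta2 = 0.999 and eps = 1e-8 are the commonly used Adam values and are not stated in this description; applying weight decay as an L2 term added to the gradient is likewise an assumption. The toy quadratic problem is for illustration only.

```python
import math


def adam_step(params, grads, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=1e-5):
    """Apply one Adam update in place and return the new step count."""
    t += 1
    for i, g in enumerate(grads):
        g = g + weight_decay * params[i]           # L2 penalty (assumed form)
        m[i] = beta1 * m[i] + (1 - beta1) * g      # first-moment estimate
        v[i] = beta2 * v[i] + (1 - beta2) * g * g  # second-moment estimate
        m_hat = m[i] / (1 - beta1 ** t)            # bias correction
        v_hat = v[i] / (1 - beta2 ** t)
        params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return t


# Toy problem: minimise f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w, m, v, t = [0.0], [0.0], [0.0], 0
for _ in range(5000):
    t = adam_step(w, [2 * (w[0] - 3)], m, v, t, lr=0.01, weight_decay=0.0)
print(round(w[0], 3))
```

Because the step is the ratio of the two moment estimates, its size is roughly bounded by the learning rate regardless of the raw gradient magnitude, which is what gives each parameter its own effective learning rate.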

The final trained model of the invention achieves an accuracy of 99.68% on a carefully selected test set, which indicates that the invention has strong generalization performance (generalization being the model's ability to predict and recognize unseen samples) and can be applied well in practical scenarios.

Claims

1. A method for realizing garbage classification by using a deep learning algorithm, characterized in that: 1) the lightweight convolutional neural network model SqueezeNet is used for garbage classification: a single preprocessed garbage picture is input into the trained SqueezeNet model to finally obtain one of 4 outputs, namely the type of garbage in the picture; 2) garbage picture features are extracted by using convolution kernels: a convolution operation is performed on the image with the convolution kernels in the CNN to extract image features, and convolution kernels with optimal parameters are obtained through training so as to accurately identify the garbage image; 3) the neural network is trained and optimized by using the Adam algorithm.

2. The method for realizing garbage classification by using the deep learning algorithm as claimed in claim 1, wherein in the overall flow of the SqueezeNet, the image first passes through a convolutional layer and then passes through a plurality of fire layers to extract features and gradually increase the number of channels, and the convolutional layer adopts a linear rectification function (ReLU function) as an activation function; the data is then passed through an average pooling layer to generate classification values, which are then passed to a classification layer using a normalized exponential function (softmax function) for classification operations.

3. The method of claim 1, wherein the Adam algorithm is used to optimize the training of the model.

4. The method for realizing garbage classification by using the deep learning algorithm as claimed in claim 1, wherein in step 1), the fire layers of the SqueezeNet are arranged such that the squeeze layer serves as a bottleneck layer and adopts 1 × 1 convolution kernels to reduce the number of channels of its input so as to reduce the amount of computation of the 3 × 3 convolutions of the expand layer; the expand layer replaces part of its 3 × 3 convolution kernels with 1 × 1 convolution kernels, further reducing the amount of computation; and the number of convolution kernels of the expand layer is larger than that of the squeeze layer, so that the number of channels of the image is guaranteed to increase after it passes through the fire layer.