CN111597899A - Scenic spot ground plastic bottle detection method - Google Patents
- Publication number
- CN111597899A (application CN202010298079.6A)
- Authority
- CN
- China
- Prior art keywords
- plastic bottle
- network
- representing
- model
- ith
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- Health & Medical Sciences (AREA)
- Software Systems (AREA)
- General Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Multimedia (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Image Analysis (AREA)
Abstract
The method for detecting plastic bottles on the ground in scenic spots comprises the following steps: 1) collecting a large number of images from high-altitude scenic spot cameras together with other plastic bottle data sets, labeling the data sets according to on-site management requirements, and determining the one-stage target detection algorithm model to be used; 2) constructing a parameter-adaptive loss function …
Description
Technical Field
The invention belongs to the technical field of image recognition and computer vision, and relates to a scenic spot ground plastic bottle detection method.
Background
At present, tourists discard plastic bottles at will in scenic spots, and staff cannot deal with bottles abandoned on the ground in time. Traditional approaches mainly include: first, continuous patrols of the scenic spot by workers; second, recognition of plastic bottles on the ground by a traditional image algorithm. Handling ground plastic bottles through worker patrols consumes a large amount of manpower, material and financial resources, and because inspections are easily missed during manual rounds, the results are not ideal. Traditional image algorithms generalize poorly and can only detect plastic bottles in images from a camera at a fixed angle and under fixed illumination.
Therefore, recognizing ground plastic bottles in real time with the security cameras already installed in the scenic spot, sending the position information of the bottles to the control center, and notifying on-site workers to deal with them promptly can greatly reduce labor costs and improve the efficiency with which the scenic spot handles ground plastic bottles. A video-based scenic spot ground plastic bottle detection system therefore has good promotion value.
Recognizing ground plastic bottles from the video stream of a security camera places high demands on both the accuracy of the recognition algorithm and the timeliness of the information, so a target detection algorithm based on deep learning is a reasonable choice. Deep-learning target detection algorithms divide into two-stage and one-stage models. Although two-stage models achieve better detection precision, their forward inference is slow and cannot meet the real-time requirement of the business scenario. Conventional one-stage models run in real time but cannot reach the detection precision of two-stage models. Moreover, when detecting targets in an image, a large number of scenic spot background objects are present; although the loss value of each background object is small, background objects far outnumber plastic bottle targets, and conventional detection methods struggle to achieve high recognition accuracy in such complex scenes. A highly adaptive target detection method is therefore urgently needed.
Disclosure of Invention
To overcome the defects of the prior art, the invention provides a scenic spot ground plastic bottle detection method with high recognition accuracy and good adaptability.
The invention improves the loss function in a one-stage target detection algorithm model. The loss function serves as the objective function of the gradient descent process in the convolutional neural network and directly influences the training result, and the quality of the training result directly determines the recognition precision of target detection, so the design of the loss function is particularly important. During training of a one-stage target detection algorithm model, the images contain a large number of scenic spot background objects; although the loss value of each background object is small, background objects far outnumber plastic bottles, so when the loss is computed the many small background loss values overwhelm the plastic bottle target loss values and the model precision drops sharply. Embedding a focal loss function in the detection model improves training precision, but the focal loss contains hyper-parameters that must be set from empirical values and cannot adjust their magnitude automatically according to the predicted class probability.
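For concreteness, the standard focal loss referred to above can be sketched as follows. This is a minimal Python illustration, not part of the patent; the parameter names follow the usual focal loss formulation.

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Standard focal loss for a single binary prediction.

    p     -- predicted probability of the foreground (plastic bottle) class
    y     -- ground-truth label: 1 for plastic bottle, 0 for background
    alpha -- fixed class-balance weight; the hyper-parameter that, as the
             text notes, must be set from empirical values
    gamma -- fixed focusing exponent, also hand-tuned
    """
    if y == 1:
        return -alpha * (1.0 - p) ** gamma * math.log(p)
    return -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)

# A confidently correct background prediction contributes almost nothing,
# while a confidently wrong one is penalized heavily.
easy_bg = focal_loss(0.05, 0)
hard_bg = focal_loss(0.90, 0)
```

The point the paragraph makes is that alpha and gamma stay constant throughout training; they never react to the predicted class probability.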
To address the problems that the focal loss requires manual hyper-parameter tuning during training and that its parameters lack adaptability, the invention proposes a deep learning loss function based on semi-supervised learning.
The method for detecting plastic bottles on the ground in scenic spots comprises the following steps:
Step 1: construct a plastic bottle data set M, a training data set T and a validation data set V; label the number of plastic bottle categories C, the training batch size batch, the number of training batches batches, the learning rate l_rate, and the proportionality coefficient ζ between the training data set T and the validation data set V.
Wherein V ∪ T = M, C ∈ N+, ζ ∈ (0, 1), batches ∈ N+, batch ∈ N+, l_rate ∈ (0, 1); hk and wk represent the height and width of the image, and r represents the number of channels of the image.
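The data set construction of Step 1 can be sketched as follows. This is an illustrative helper, not the patent's code; the random split and the seed are assumptions.

```python
import random

def split_dataset(M, zeta=0.25, seed=0):
    """Split the full data set M into a training set T and a validation
    set V such that |V| / |T| = zeta and V and T together cover M, as
    required by Step 1.  Illustrative helper; the random split and the
    seed are assumptions, not specified by the patent."""
    items = list(M)
    random.Random(seed).shuffle(items)
    n_val = round(len(items) * zeta / (1.0 + zeta))
    return items[n_val:], items[:n_val]   # T, V

T, V = split_dataset(range(10000), zeta=0.25)
```

With 10000 images and ζ = 0.25 this reproduces the 8000/2000 split used later in the embodiment.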
Step 2: determine the one-stage target detection model to be trained; set the depth of the convolutional neural network to L, the set of network convolution-layer kernels G, the network output layer in fully connected form with convolution kernel set A, the set of network feature maps U, the number of grids corresponding to the k-th feature map of the l-th layer network, and the anchor set M. (The formal definitions are given by equations that appear as images in the original and are not reproduced here.)
Wherein the height, width and dimension of the convolution kernels, feature maps and anchors of the l-th layer network are denoted by per-layer symbols, together with the padding size of the l-th layer's convolution kernels and the convolution stride of the l-th layer; f denotes the excitation function of the convolution neurons, θ the selected input features, Λ ∈ N+ the total number of anchors in each layer, ξ ∈ N+ the total number of output-layer nodes, Φ ∈ N+ the total number of feature maps of the l-th layer, and Δ ∈ N+ the total number of convolution kernels of the l-th layer.
Step 3: design the parameter-adaptive focal loss function as follows:
wherein:
the first term denotes the loss function for the confidence with which the j-th anchor in the i-th grid of the l-th layer network classifies image tk as a plastic bottle sample or a scenic spot background sample; similarly, the second term denotes the loss function of the plastic bottle prediction box, the third term denotes the loss function of the plastic bottle class, and λ ∈ Q is a parameter.
The loss functions of the plastic bottle target and the scenic spot background target are respectively expressed as follows:
the probability value of the foreground plastic bottle predicted by the jth anchor point in the ith grid on the ith network is shown, and similarly,representing the corresponding scenic background probability value.Respectively representing the abscissa and the ordinate of the central point of the prediction frame of the jth anchor point in the ith grid on the ith network, and the likeRespectively representing the abscissa and the ordinate of the central point of the plastic bottle sample calibration frame;respectively representing the shortest Euclidean distance from the central point of the prediction frame of the jth anchor point in the ith grid on the ith network to the boundary of the frame, and the same wayRespectively representing the shortest Euclidean distance from the central point of the plastic bottle sample calibration frame to the frame boundary;and (4) representing the predicted plastic bottle category value of the jth anchor point prediction in the ith grid on the ith network. In the same way, the method for preparing the composite material,indicating the nominal status of the class of plastic bottles,indicating that a sample of a plastic bottle was predicted,whether to predict the background sample of the scenic spot is represented, and the specific calculation is as follows:
wherein the parameter α ∈ (0, 1); iouj denotes the overlap ratio between the box of anchor mj in the i-th grid and the plastic bottle calibration box, and miou denotes the maximum overlap ratio.
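Since the equations of Step 3 are rendered as images in the original and are not reproduced in this text, the following Python sketch only illustrates the general idea of parameter-adaptive focal weighting: the weight is derived from the predicted probability instead of a fixed, hand-tuned exponent. It is an assumption-laden illustration, not the patent's exact formula.

```python
import math

def adaptive_focal_loss(p, y, alpha=0.25):
    """Sketch of a parameter-adaptive focal weighting.

    The weighting factor is computed from the predicted probability p
    itself rather than from a fixed exponent, so the down-weighting of
    easy samples adjusts automatically as training progresses.  This is
    an illustration of the general idea only; the patent's exact
    formulas are not reproduced in this text.
    """
    if y == 1:                        # plastic bottle target
        weight = alpha * (1.0 - p)    # small when the model is already sure
        return -weight * math.log(p)
    weight = (1.0 - alpha) * p        # scenic spot background target
    return -weight * math.log(1.0 - p)
```

As with the standard focal loss, easy background samples contribute far less than hard ones, but here no focusing exponent needs manual tuning.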
Step 4: using the loss function of the one-stage target detection algorithm model from Step 3, train the model by gradient descent on the training set until it converges. In the model testing stage, set the alarm time timer; when the system model detects a plastic bottle, its detailed class and position information are automatically recorded and timing starts, and if, after the given time has elapsed, the class and position information detected again are consistent with those recorded before, an alarm is issued.
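The alarm logic of Step 4 can be sketched as follows. The class and method names are illustrative; the patent describes the behavior, not an implementation.

```python
import time

class BottleAlarm:
    """Debounce logic of Step 4: record a detection, start timing, and
    raise an alarm only when the same class and position are detected
    again after `timer` seconds have elapsed.  Class and method names
    are illustrative; the patent describes the behavior, not the code."""

    def __init__(self, timer=180.0):
        self.timer = timer
        self.pending = {}   # (class, position) -> timestamp of first sighting

    def observe(self, cls, pos, now=None):
        """Return True when a detection should trigger an alarm."""
        now = time.time() if now is None else now
        first_seen = self.pending.setdefault((cls, pos), now)
        if now - first_seen >= self.timer:
            del self.pending[(cls, pos)]   # reset after alarming
            return True
        return False
```

With timer = 180 seconds this matches the 3-minute alarm time of the embodiment: a bottle seen once does not alarm, but a consistent re-detection after the interval does.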
The invention has the advantages that it improves the parameter adaptability of the plastic bottle detection model and greatly increases the accuracy of plastic bottle detection.
Drawings
Fig. 1 is a network configuration diagram of the convolutional neural network of the present invention.
Fig. 2 is a diagram of a loss function structure in the convolutional neural network of the present invention.
FIG. 3 is a deployment flowchart of the plastic bottle detection algorithm of the present invention based on the convolutional neural network.
Detailed Description
To better explain the technical solution of the invention, it is further described below through an embodiment with reference to the accompanying drawings.
The method for detecting plastic bottles on the ground in scenic spots comprises the following steps:
Step 1: collect a large amount of plastic bottle image data shot from high positions; construct a plastic bottle data set M of 10000 images, a training data set T of 8000 images and a validation data set V of 2000 images; label C = 5 plastic bottle categories, namely Fanta, Coca-Cola, Mizone, Scream and Nongfu Spring bottles; set the training batch size batch to 4, the number of training batches batches to 1000, the learning rate l_rate to 0.001, and the proportionality coefficient ζ between the training data set T and the validation data set V to 0.25; set the height, width and number of channels of all images consistently, with image height hk and width wk both 416 and number of channels r equal to 3.
Step 2: determine the one-stage target detection model as YOLOv3 and set the depth L of the convolutional neural network to 139. The height, width and dimension of each convolution kernel are given in FIG. 1; the filling size of the convolution kernels defaults to 1, the convolution stride takes its default value, and the excitation function f of the convolution neurons defaults to the Leaky ReLU function. Anchors are shared across the layers, with the anchor set M = {(10, 13), (30, 61), (156, 198)}, i.e. the total number of anchors Λ in each layer is 3. The network output layer adopts a fully connected form with convolution kernel set A = {(1, 1, 30), (1, 1, 30), (1, 1, 30)}, i.e. the total number of output-layer nodes is 3.
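The 30-channel output kernels of set A can be checked arithmetically: a YOLOv3 detection head predicts, per anchor, 4 box offsets, 1 objectness score and one probability per class.

```python
def yolo_output_channels(num_anchors, num_classes):
    """Channel count of a YOLOv3 detection head: each anchor predicts
    4 box offsets, 1 objectness score and one probability per class."""
    return num_anchors * (4 + 1 + num_classes)

# With the embodiment's 3 anchors per layer and C = 5 bottle categories
# this yields the 30-channel 1x1 kernels (1, 1, 30) of set A.
channels = yolo_output_channels(3, 5)
```

So the (1, 1, 30) kernels are consistent with 3 anchors and 5 labeled bottle categories.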
Step 3: as shown in FIG. 2, construct the parameter-adaptive focal loss function LOSS, with the parameter α set to 0.25 and the parameter λ set to 0.5.
Step 4: using the loss function of the one-stage target detection algorithm model from Step 3, train the model by gradient descent on the training set until it converges. As shown in FIG. 3, real-time detection is performed on the video stream of a camera installed in the scenic spot. In the model testing stage the alarm time timer is 3 minutes: when the system model detects a plastic bottle, its detailed class and position information are automatically recorded and timing starts, and if after 3 minutes the class and position information detected again are consistent with those recorded before, an alarm is issued.
While the foregoing has described a preferred embodiment of the invention, it will be appreciated that the invention is not limited to the embodiment described, but is capable of numerous modifications without departing from the basic spirit and scope of the invention as set out in the appended claims.
Claims (1)
1. The scenic spot ground plastic bottle detection method comprises the following steps:
Step 1: construct a plastic bottle data set M, a training data set T and a validation data set V; label the number of plastic bottle categories C, the training batch size batch, the number of training batches batches, the learning rate l_rate, and the proportionality coefficient ζ between the training data set T and the validation data set V;
wherein V ∪ T = M, C ∈ N+, ζ ∈ (0, 1), batches ∈ N+, batch ∈ N+, l_rate ∈ (0, 1); hk and wk represent the height and width of the image, and r represents the number of channels of the image;
Step 2: determine the one-stage target detection model to be trained; set the depth of the convolutional neural network to L, the set of network convolution-layer kernels G, the network output layer in fully connected form with convolution kernel set A, the set of network feature maps U, the number of grids corresponding to the k-th feature map of the l-th layer network, and the anchor set M (the formal definitions are given by equations that appear as images in the original and are not reproduced here);
wherein the height, width and dimension of the convolution kernels, feature maps and anchors of the l-th layer network are denoted by per-layer symbols, together with the padding size of the l-th layer's convolution kernels and the convolution stride of the l-th layer; f denotes the excitation function of the convolution neurons, θ the selected input features, Λ ∈ N+ the total number of anchors in each layer, ξ ∈ N+ the total number of output-layer nodes, Φ ∈ N+ the total number of feature maps of the l-th layer, and Δ ∈ N+ the total number of convolution kernels of the l-th layer;
Step 3: design the parameter-adaptive focal loss function, specifically as follows:
wherein:
the first term denotes the loss function for the confidence with which the j-th anchor in the i-th grid of the l-th layer network classifies image tk as a plastic bottle sample or a scenic spot background sample; similarly, the second term denotes the loss function of the plastic bottle prediction box, the third term denotes the loss function of the plastic bottle class, and λ ∈ Q is a parameter;
the loss functions of the plastic bottle target and the scenic spot background target are respectively expressed as follows:
here one term denotes the probability value of a foreground plastic bottle predicted by the j-th anchor in the i-th grid of the l-th layer network, and similarly its counterpart denotes the corresponding scenic spot background probability value; further terms denote respectively the abscissa and ordinate of the center point of the prediction box of the j-th anchor in the i-th grid of the l-th layer network, and similarly the abscissa and ordinate of the center point of the plastic bottle sample calibration box; the shortest Euclidean distances from the center point of the prediction box of the j-th anchor in the i-th grid of the l-th layer network to the box boundaries, and similarly the shortest Euclidean distances from the center point of the plastic bottle sample calibration box to its boundaries; the plastic bottle class value predicted by the j-th anchor in the i-th grid of the l-th layer network; and, similarly, the labeled class of the plastic bottle, whether a plastic bottle sample is predicted, and whether a scenic spot background sample is predicted, the latter computed as follows:
wherein the parameter α ∈ (0, 1); iouj denotes the overlap ratio between the box of anchor mj in the i-th grid and the plastic bottle calibration box, and miou denotes the maximum overlap ratio;
Step 4: using the loss function of the one-stage target detection algorithm model from Step 3, train the model by gradient descent until it converges; in the system operation stage, the one-stage target detection model extracts network feature values and the anchors are determined by a K-means clustering method; set the alarm time timer, and when the system model detects a plastic bottle, automatically record its detailed class and position information and start timing; if, after the given time has elapsed, the class and position information detected again are consistent with those recorded before, an alarm is issued.
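The anchor determination by K-means mentioned in the claim can be sketched as follows. This is a minimal Python illustration; the 1 - IoU distance used below is the common choice for anchor clustering and is an assumption here, as the claim does not specify the metric.

```python
import random

def kmeans_anchors(boxes, k=3, iters=20, seed=0):
    """Minimal k-means over (width, height) box sizes to determine anchor
    shapes, in the spirit of the claim's K-means anchor determination.
    The 1 - IoU distance (assigning each box to the center of highest
    IoU) is an assumption; the claim does not give the metric."""
    rng = random.Random(seed)
    centers = rng.sample(boxes, k)

    def iou(a, b):
        inter = min(a[0], b[0]) * min(a[1], b[1])
        return inter / (a[0] * a[1] + b[0] * b[1] - inter)

    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for box in boxes:
            # assign each box to the center with the highest IoU
            best = max(range(k), key=lambda i: iou(box, centers[i]))
            groups[best].append(box)
        centers = [
            (sum(b[0] for b in g) / len(g), sum(b[1] for b in g) / len(g))
            if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return sorted(centers)
```

Run over the labeled plastic bottle boxes, such a procedure produces anchor shapes in the spirit of the embodiment's set {(10, 13), (30, 61), (156, 198)}.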
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010298079.6A CN111597899B (en) | 2020-04-16 | 2020-04-16 | Scenic spot ground plastic bottle detection method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111597899A true CN111597899A (en) | 2020-08-28 |
CN111597899B CN111597899B (en) | 2023-08-11 |
Family
ID=72190405
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010298079.6A Active CN111597899B (en) | 2020-04-16 | 2020-04-16 | Scenic spot ground plastic bottle detection method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111597899B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109697459A (en) * | 2018-12-04 | 2019-04-30 | 云南大学 | One kind is towards optical coherence tomography image patch Morphology observation method |
CN109902677A (en) * | 2019-01-30 | 2019-06-18 | 深圳北斗通信科技有限公司 | A kind of vehicle checking method based on deep learning |
CN110135267A (en) * | 2019-04-17 | 2019-08-16 | 电子科技大学 | A kind of subtle object detection method of large scene SAR image |
CN110163187A (en) * | 2019-06-02 | 2019-08-23 | 东北石油大学 | Remote road traffic sign detection recognition methods based on F-RCNN |
CN110287905A (en) * | 2019-06-27 | 2019-09-27 | 浙江工业大学 | A kind of traffic congestion region real-time detection method based on deep learning |
CN110298307A (en) * | 2019-06-27 | 2019-10-01 | 浙江工业大学 | A kind of exception parking real-time detection method based on deep learning |
CN110309765A (en) * | 2019-06-27 | 2019-10-08 | 浙江工业大学 | A kind of video frequency motion target efficient detection method |
-
2020
- 2020-04-16 CN CN202010298079.6A patent/CN111597899B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN111597899B (en) | 2023-08-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112380952B (en) | Power equipment infrared image real-time detection and identification method based on artificial intelligence | |
CN106960195B (en) | Crowd counting method and device based on deep learning | |
CN107609525B (en) | Remote sensing image target detection method for constructing convolutional neural network based on pruning strategy | |
CN105426870B (en) | A kind of face key independent positioning method and device | |
CN111832605B (en) | Training method and device for unsupervised image classification model and electronic equipment | |
CN111079640B (en) | Vehicle type identification method and system based on automatic amplification sample | |
CN111597901A (en) | Illegal billboard monitoring method | |
CN114663346A (en) | Strip steel surface defect detection method based on improved YOLOv5 network | |
CN114023062B (en) | Traffic flow information monitoring method based on deep learning and edge calculation | |
CN111709336B (en) | Expressway pedestrian detection method, equipment and readable storage medium | |
CN112287827A (en) | Complex environment pedestrian mask wearing detection method and system based on intelligent lamp pole | |
CN110276247A (en) | A kind of driving detection method based on YOLOv3-Tiny | |
CN111985325A (en) | Aerial small target rapid identification method in extra-high voltage environment evaluation | |
CN110751209A (en) | Intelligent typhoon intensity determination method integrating depth image classification and retrieval | |
CN111540203B (en) | Method for adjusting green light passing time based on fast-RCNN | |
CN111597902B (en) | Method for monitoring motor vehicle illegal parking | |
CN111126155B (en) | Pedestrian re-identification method for generating countermeasure network based on semantic constraint | |
CN111597900B (en) | Illegal dog walking identification method | |
CN111339950B (en) | Remote sensing image target detection method | |
CN111597899A (en) | Scenic spot ground plastic bottle detection method | |
CN116664545A (en) | Offshore benthos quantitative detection method and system based on deep learning | |
CN116612382A (en) | Urban remote sensing image target detection method and device | |
US20230326183A1 (en) | Data Collection and Classifier Training in Edge Video Devices | |
CN111597897B (en) | High-speed service area parking space recognition method | |
CN114581769A (en) | Method for identifying houses under construction based on unsupervised clustering |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||