CN114022736A - Garbage detection method and device - Google Patents

Garbage detection method and device

Info

Publication number
CN114022736A
CN114022736A
Authority
CN
China
Prior art keywords
garbage
image
training sample
position information
predicted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111332238.0A
Other languages
Chinese (zh)
Inventor
黄笑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CLP Cloud Digital Intelligence Technology Co Ltd
Original Assignee
CLP Cloud Digital Intelligence Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CLP Cloud Digital Intelligence Technology Co Ltd filed Critical CLP Cloud Digital Intelligence Technology Co Ltd
Priority to CN202111332238.0A
Publication of CN114022736A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a garbage detection method, which comprises the following steps: acquiring an image to be recognized, and inputting the image to be recognized into a trained garbage detection model; the garbage detection model extracts features from the image to be recognized to obtain a plurality of feature maps of different scales; the garbage detection model performs garbage detection on the plurality of feature maps respectively to obtain garbage detection results and result confidences corresponding to the plurality of feature maps; and the garbage detection model outputs the garbage detection result of the image to be recognized according to the garbage detection results and result confidences corresponding to the plurality of feature maps. The application therefore uses a garbage detection model to recognize garbage in images, rather than relying on manual identification as in the prior art, thereby avoiding the missed detections, low efficiency and wasted time and labor caused by human error during manual garbage identification, improving the efficiency and accuracy of recognizing garbage from images, and thus improving user experience.

Description

Garbage detection method and device
Technical Field
The application relates to the field of deep learning visual algorithms, in particular to a garbage detection method and device.
Background
Beyond economic development, modern society is increasingly concerned with the quality of the human living environment, which affects every person. In urban life, garbage that is not treated in time not only mars the appearance of a city but also reduces people's quality of life. How to treat garbage promptly is therefore a key problem for urban management departments and sanitation departments.
At present, urban garbage supervision and treatment rely on patrols: sanitation personnel are each responsible for a district, must photograph any large pile of garbage found during patrol, and upload the pictures to a management backend; backend staff then dispatch regional supervisors, who assign cleanup tasks to treatment workers, and the workers report the result back to the backend after treatment. This process is time-consuming, and because tasks are relayed among many parties they easily pile up, consuming a great deal of manpower and material resources. Untimely treatment also provokes citizen complaints, places heavy pressure on municipal and financial management, and greatly reduces citizens' sense of well-being.
With the popularization of city monitoring, cameras now cover almost every corner of a city, and some city-management work already makes use of them. Combining surveillance cameras with urban garbage treatment, and applying deep learning and image processing technology to the detection and management of urban garbage, is an important step toward the smart city.
Disclosure of Invention
The application provides a garbage detection method that improves the efficiency and accuracy of recognizing garbage from images, thereby improving user experience.
In a first aspect, the present application provides a garbage detection method, including:
acquiring an image to be recognized, and inputting the image to be recognized into a trained garbage detection model;
the garbage detection model extracts the features of the image to be identified to obtain a plurality of feature maps with different scales;
the garbage detection model respectively carries out garbage detection on the plurality of feature maps to obtain garbage detection results and result confidence degrees corresponding to the plurality of feature maps;
and the garbage detection model outputs the garbage detection result of the image to be identified according to the respective corresponding garbage detection results and result confidence degrees of the plurality of feature maps.
Optionally, the garbage detection result includes garbage position information and a garbage category; the garbage position information comprises the image to be recognized marked with a garbage area, and the coordinate information of the garbage.
Optionally, the generating manner of the garbage detection model includes:
acquiring a training sample data set; the training sample data set comprises a plurality of groups of training sample data, wherein each group of training sample data comprises a training sample image and a garbage position information label and a garbage category label corresponding to the training sample image;
inputting a training sample image into a preset network model to obtain predicted garbage position information, a predicted garbage category label and a predicted confidence coefficient corresponding to the training sample image;
determining a total loss value according to the predicted garbage position information, the predicted garbage category label and the predicted confidence coefficient corresponding to the training sample, and the garbage position information label and the garbage category label corresponding to the training sample image;
and adjusting model parameters of the network model according to the total loss value until the network model meets training conditions, and taking the network model as the garbage detection model.
Optionally, the determining a total loss value according to the predicted garbage position information, the predicted garbage category label, the predicted confidence degree corresponding to the training sample, and the garbage position information label and the garbage category label corresponding to the training sample image includes:
determining a frame loss value, a category loss value and a confidence coefficient loss value according to the predicted garbage position information, the predicted garbage category label and the predicted confidence coefficient corresponding to the training sample, and the garbage position information label and the garbage category label corresponding to the training sample image;
and determining a total loss value according to the frame loss value, the category loss value and the confidence coefficient loss value.
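As a minimal sketch of the loss combination described above, the total loss can be formed as a weighted sum of the frame (bounding-box) loss value, the category loss value and the confidence loss value. The weighting coefficients below are illustrative assumptions; the application does not specify particular values:

```python
# Hypothetical sketch: combine the three loss terms into a total loss.
# lambda_box, lambda_cls and lambda_conf are illustrative weights,
# not values specified by the application.
def total_loss(box_loss, cls_loss, conf_loss,
               lambda_box=1.0, lambda_cls=1.0, lambda_conf=1.0):
    """Weighted sum of frame (bounding-box), category and confidence losses."""
    return (lambda_box * box_loss
            + lambda_cls * cls_loss
            + lambda_conf * conf_loss)
```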
Optionally, the acquiring the training sample data set includes:
acquiring an original acquisition image;
correcting the original collected image to obtain a corrected image;
acquiring a garbage position information label and a garbage category label corresponding to the corrected image;
and taking the corrected image as a training sample image in training sample data, and taking a garbage position information label and a garbage category label corresponding to the corrected image as a garbage position information label and a garbage category label corresponding to the training sample image.
Optionally, after the step of obtaining a training sample data set, the method further includes:
performing data enhancement processing on the training sample images in the training sample data set to obtain data enhancement images;
taking the data enhanced image and a garbage position information label and a garbage category label corresponding to the data enhanced image as a group of training sample data in the training sample data set;
wherein, the data enhancement processing mode comprises at least one of the following modes: flip transform, random cropping, color dithering, translation transform, scale transform, contrast transform, noise perturbation, rotation transform/reflection transform.
Optionally, after the step of obtaining a training sample data set, the method further includes:
splicing any multiple training sample images in the training sample data set to obtain a spliced training sample image;
and using the spliced training sample image and the garbage position information labels and the garbage category labels corresponding to the training sample images as a group of training sample data in the training sample data set.
Optionally, the method further includes:
and if the garbage detection result of the image to be recognized comprises garbage position information and garbage category, outputting garbage cleaning prompt information.
Optionally, the method further includes:
and if the checking result of the garbage detection result of the image to be recognized is a negative sample, performing model training on the garbage detection model by using the image to be recognized and the corrected garbage detection result of the image to be recognized.
In a second aspect, the present application provides a waste detection device, the device comprising:
the input unit is used for acquiring an image to be recognized and inputting the image to be recognized into a trained garbage detection model;
the first acquisition unit is used for extracting the features of the image to be identified by the garbage detection model to obtain three feature maps with different scales;
the second obtaining unit is used for controlling the garbage detection model to respectively perform garbage detection on the three feature maps to obtain garbage detection results and result confidences corresponding to the three feature maps;
and the third acquisition unit is used for controlling the garbage detection model to output the garbage detection result of the image to be recognized according to the garbage detection results and result confidences corresponding to the three feature maps.
Optionally, the garbage detection result includes garbage position information and a garbage category; the garbage position information comprises the image to be recognized marked with a garbage area, and the coordinate information of the garbage.
Optionally, the apparatus further includes a model generation unit, configured to:
acquiring a training sample data set; the training sample data set comprises a plurality of groups of training sample data, wherein each group of training sample data comprises a training sample image and a garbage position information label and a garbage category label corresponding to the training sample image;
inputting a training sample image into a preset network model to obtain predicted garbage position information, a predicted garbage category label and a predicted confidence coefficient corresponding to the training sample image;
determining a total loss value according to the predicted garbage position information, the predicted garbage category label and the predicted confidence coefficient corresponding to the training sample, and the garbage position information label and the garbage category label corresponding to the training sample image;
and adjusting model parameters of the network model according to the total loss value until the network model meets training conditions, and taking the network model as the garbage detection model.
Optionally, the model generating unit is specifically configured to:
determining a frame loss value, a category loss value and a confidence coefficient loss value according to the predicted garbage position information, the predicted garbage category label and the predicted confidence coefficient corresponding to the training sample, and the garbage position information label and the garbage category label corresponding to the training sample image;
and determining a total loss value according to the frame loss value, the category loss value and the confidence coefficient loss value.
Optionally, the model generating unit is specifically configured to:
acquiring an original acquisition image;
correcting the original collected image to obtain a corrected image;
acquiring a garbage position information label and a garbage category label corresponding to the corrected image;
and taking the corrected image as a training sample image in training sample data, and taking a garbage position information label and a garbage category label corresponding to the corrected image as a garbage position information label and a garbage category label corresponding to the training sample image.
Optionally, the model generating unit is further configured to:
performing data enhancement processing on the training sample images in the training sample data set to obtain data enhancement images;
taking the data enhanced image and a garbage position information label and a garbage category label corresponding to the data enhanced image as a group of training sample data in the training sample data set;
wherein, the data enhancement processing mode comprises at least one of the following modes: flip transform, random cropping, color dithering, translation transform, scale transform, contrast transform, noise perturbation, rotation transform/reflection transform.
Optionally, the model generating unit is further configured to:
splicing any multiple training sample images in the training sample data set to obtain a spliced training sample image;
and using the spliced training sample image and the garbage position information labels and the garbage category labels corresponding to the training sample images as a group of training sample data in the training sample data set.
Optionally, the apparatus further includes a prompting unit:
and if the garbage detection result of the image to be recognized comprises garbage position information and garbage category, outputting garbage cleaning prompt information.
Optionally, the apparatus further comprises a retraining unit:
and if the checking result of the garbage detection result of the image to be recognized is a negative sample, performing model training on the garbage detection model by using the image to be recognized and the corrected garbage detection result of the image to be recognized.
In a third aspect, the present application provides a readable medium comprising executable instructions, which when executed by a processor of an electronic device, perform the method according to any of the first aspect.
In a fourth aspect, the present application provides an electronic device comprising a processor and a memory storing execution instructions, wherein when the processor executes the execution instructions stored in the memory, the processor performs the method according to any one of the first aspect.
According to the technical scheme above, the application provides a garbage detection method. In this embodiment, an image to be recognized can first be obtained and input into a trained garbage detection model; the garbage detection model then performs feature extraction on the image to be recognized to obtain a plurality of feature maps of different scales; next, the garbage detection model performs garbage detection on the plurality of feature maps respectively to obtain garbage detection results and result confidences corresponding to the plurality of feature maps; finally, the garbage detection model outputs the garbage detection result of the image to be recognized according to those per-feature-map detection results and confidences. The garbage detection model can therefore be used to recognize garbage in images automatically, without the manual identification required in the prior art, so that the missed detections, low efficiency, and wasted time and labor caused by operational errors during manual garbage identification can be avoided; the efficiency and accuracy of identifying garbage from images are improved, and the user experience is improved.
Further effects of the above preferred modes will be described below in conjunction with specific embodiments.
Drawings
In order to more clearly illustrate the embodiments or prior art solutions of the present application, the drawings needed for describing the embodiments or prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present application, and that other drawings can be obtained by those skilled in the art without inventive exercise.
Fig. 1 is a schematic flowchart of a garbage detection method according to an embodiment of the present application;
fig. 2 is a schematic flowchart of a garbage detection method according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a garbage detection apparatus according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following embodiments and accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
At present, urban garbage supervision and treatment rely on patrols: sanitation personnel are each responsible for a district, must photograph any large pile of garbage found during patrol, and upload the pictures to a management backend; backend staff then dispatch regional supervisors, who assign cleanup tasks to treatment workers, and the workers report the result back to the backend after treatment. This process is time-consuming, and because tasks are relayed among many parties they easily accumulate, consuming a great deal of manpower and material resources. Untimely treatment also provokes citizen complaints, places heavy pressure on municipal and financial management, and greatly reduces citizens' sense of well-being. Therefore, a new garbage detection method is needed.
In the embodiment of the present application, an image to be recognized can first be obtained and input into a trained garbage detection model; the garbage detection model then performs feature extraction on the image to be recognized to obtain a plurality of feature maps of different scales; next, the garbage detection model performs garbage detection on the plurality of feature maps respectively to obtain garbage detection results and result confidences corresponding to the plurality of feature maps; finally, the garbage detection model outputs the garbage detection result of the image to be recognized according to those per-feature-map detection results and confidences. The garbage detection model can therefore be used to recognize garbage in images automatically, without the manual identification required in the prior art, so that the missed detections, low efficiency, and wasted time and labor caused by operational errors during manual garbage identification can be avoided; the efficiency and accuracy of identifying garbage from images are improved, and the user experience is improved. It should be noted that the embodiment of the present application may be applied to an electronic device (such as a mobile phone, a tablet, or a computer) or to a server. Other embodiments beyond those described above are also possible and are not limited herein.
Various non-limiting embodiments of the present application are described in detail below with reference to the accompanying drawings.
Referring to fig. 1, a garbage detection method according to an embodiment of the present application is shown; in this embodiment, the method may include the following steps:
s101: and acquiring an image to be recognized, and inputting the image to be recognized into the trained garbage detection model.
In this embodiment, an image to be recognized may be obtained first; the image to be recognized can be understood as an image on which garbage recognition needs to be performed. In one implementation, the image to be recognized may be acquired by a camera, for example in real time or at a fixed interval such as every 10 seconds. The camera may be fixed on a single target area (i.e., an area to be monitored for the existence of garbage), or may be rotated to cover a plurality of target areas.
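The periodic acquisition described above (e.g., one frame every 10 seconds) can be timed with a simple elapsed-time check. The function name and interface below are illustrative assumptions, not part of the application:

```python
# Hypothetical timing helper for the fixed-interval capture described above.
def should_capture(last_capture_ts, now_ts, interval_s=10):
    """Return True when at least `interval_s` seconds have elapsed since
    the last capture, meaning a new image to be recognized should be read
    from the camera and sent to the garbage detection model."""
    return now_ts - last_capture_ts >= interval_s
```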
After the image to be recognized is obtained, inputting the image to be recognized into a trained garbage detection model, wherein the garbage detection model can be obtained by training based on a training sample data set; the training sample data set may include a plurality of sets of training sample data, where each set of training sample data includes a training sample image and a garbage detection result corresponding to the training sample image.
S102: and the garbage detection model extracts the features of the image to be identified to obtain a plurality of feature maps with different scales.
In this embodiment, after the garbage detection model acquires an image to be recognized, feature extraction may be performed on the image to be recognized to obtain a plurality of feature maps of different scales.
It should be emphasized that an image typically contains a variety of objects of different sizes. Since it is desirable to detect objects of all sizes at once, the network must be able to detect objects at different scales; the deeper the network, the smaller its feature maps become, and the harder small objects are to detect in the later layers. To address this problem, the garbage detection model may detect garbage on features of several different scales. In one implementation, the garbage detection model may be a YOLOv3 model, which detects garbage on feature maps of 3 different scales, allowing finer-grained features to be detected. The garbage detection model may adopt an FPN (Feature Pyramid Network) structure so that the multiple scales correspond to different precisions: garbage detection is performed separately on feature maps of different depths, and a deeper (smaller) feature map is upsampled and fused with a shallower one, so that low-level and high-level features are combined and the garbage detection accuracy is improved.
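The FPN-style fusion of a deeper (smaller) feature map with a shallower one can be sketched as below. This is a minimal illustration with NumPy arrays of assumed shapes, not the actual YOLOv3 layer sizes or its convolutional processing:

```python
import numpy as np

def fuse_fpn(shallow, deep):
    """FPN-style fusion sketch: nearest-neighbour upsample the deeper
    (spatially smaller) feature map by 2x, then concatenate it with the
    shallower map along the channel axis. Feature maps are (channels,
    height, width) arrays; shapes are illustrative assumptions."""
    up = deep.repeat(2, axis=1).repeat(2, axis=2)  # 2x spatial upsampling
    return np.concatenate([shallow, up], axis=0)   # channel-wise fusion
```

In a real network the fused map would then pass through further convolutions before detection; only the upsample-and-fuse step is shown here.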
S103: and the garbage detection model respectively carries out garbage detection on the plurality of characteristic graphs to obtain garbage detection results and result confidence degrees corresponding to the plurality of characteristic graphs.
After obtaining the plurality of feature maps, the garbage detection model may perform garbage detection on each of them to obtain the garbage detection result and result confidence corresponding to each feature map. It should be noted that the higher the result confidence of a garbage detection result, the more trustworthy that result is, i.e., the higher its expected precision.
The garbage detection result may include garbage position information and a garbage category. The garbage position information may include the image to be recognized with the garbage area marked, together with the coordinate information of the garbage. For example, the garbage area may be marked in the image by a labelling frame, and the coordinate information of the garbage can be understood as the coordinate information of that frame; when the frame is rectangular, the coordinates may be those of at least two diagonally opposite corners, such as the upper-left and lower-right corners, or the upper-right and lower-left corners. The garbage category can be understood as the type of garbage, which may include, for example, waste paper, plastic, glass, metal, cloth, kitchen waste, and hazardous waste. The result confidence may include a confidence corresponding to the garbage position information and a confidence corresponding to the garbage category.
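A garbage detection result of the shape described above could be represented by a simple container such as the following sketch; the class and field names are illustrative assumptions, not identifiers from the application:

```python
from dataclasses import dataclass

# Illustrative container for one garbage detection result: a category,
# the diagonal corners of the rectangular labelling frame, and the
# result confidence. Names are hypothetical, not from the application.
@dataclass
class GarbageDetection:
    category: str                       # e.g. "plastic", "metal", "kitchen waste"
    top_left: tuple                     # (x, y) of the frame's upper-left corner
    bottom_right: tuple                 # (x, y) of the frame's lower-right corner
    confidence: float                   # result confidence, e.g. in [0, 1]
```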
S104: and the garbage detection model outputs the garbage detection result of the image to be identified according to the respective corresponding garbage detection results and result confidence degrees of the plurality of feature maps.
After the garbage detection model obtains the garbage detection results and result confidences corresponding to the feature maps, it outputs the garbage detection result of the image to be recognized according to them. In one implementation, the garbage detection result with the highest result confidence among those corresponding to the plurality of feature maps may be taken as the garbage detection result of the image to be recognized.
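The highest-confidence selection described in that implementation amounts to a simple maximum over per-scale results. In this sketch each element is a (detection, confidence) pair; the representation is an illustrative assumption:

```python
# Hypothetical sketch: pick the garbage detection result with the highest
# result confidence among the per-feature-map results, as in the
# implementation described above.
def select_result(per_scale_results):
    """`per_scale_results` is a list of (detection, confidence) pairs,
    one per feature map; return the detection with the best confidence."""
    best_detection, _best_conf = max(per_scale_results, key=lambda pair: pair[1])
    return best_detection
```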
According to the technical scheme above, the application provides a garbage detection method. In this embodiment, an image to be recognized can first be obtained and input into a trained garbage detection model; the garbage detection model then performs feature extraction on the image to be recognized to obtain a plurality of feature maps of different scales; next, the garbage detection model performs garbage detection on the plurality of feature maps respectively to obtain garbage detection results and result confidences corresponding to the plurality of feature maps; finally, the garbage detection model outputs the garbage detection result of the image to be recognized according to those per-feature-map detection results and confidences. The garbage detection model can therefore be used to recognize garbage in images automatically, without the manual identification required in the prior art, so that the missed detections, low efficiency, and wasted time and labor caused by operational errors during manual garbage identification can be avoided; the efficiency and accuracy of identifying garbage from images are improved, and the user experience is improved.
Next, a manner of generating the garbage detection model will be described. Specifically, in this embodiment, the generation method of the garbage detection model includes the following steps:
step a: and acquiring a training sample data set.
In this embodiment, the training sample data set may include a plurality of sets of training sample data, where each set of training sample data includes a training sample image and a garbage location information tag and a garbage category tag corresponding to the training sample image.
As an example, the originally captured images may be acquired first. For example, images containing garbage (garbage cans, garbage bags, paperboard including cartons, plastic bottles and beer bottles) can be captured from videos shot by a number of fixed cameras, and these images are collected to form a data set.
Then, the original captured images may be corrected to obtain corrected images. Because the acquisition and capture process may introduce distortion, deformation, blur and the like, the data needs to be filtered and corrected. For picture deformation, a graphic image algorithm can be used to process the original captured images, ensuring the quality of the data set and preventing the model from falling short of optimal due to data problems.
Then, a garbage position information label and a garbage category label corresponding to the corrected image may be acquired. The corrected image may be used as a training sample image in the training sample data, and the garbage position information label and garbage category label corresponding to the corrected image may be used as the garbage position information label and garbage category label corresponding to that training sample image. For example, each piece of garbage contained in each picture of the originally captured data set may be framed with a rectangular box and labelled with its garbage category, so as to obtain the garbage categories and garbage coordinate information (the coordinates of the upper-left and lower-right corners of each rectangular box) contained in each picture, thereby forming a labelled data set; the label files in the labelled data set correspond one-to-one to the images in the originally captured image set.
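The labelling scheme just described (one label file per image, each garbage item recorded as a category plus the upper-left and lower-right corners of its rectangle) can be sketched as follows; the `category x1 y1 x2 y2` line format and the `.txt` naming convention are assumptions for illustration, since the text does not fix a file format:

```python
def parse_label_line(line):
    """Parse one annotation line of the assumed form 'category x1 y1 x2 y2',
    where (x1, y1) is the upper-left corner and (x2, y2) the lower-right
    corner of the rectangular box."""
    category, x1, y1, x2, y2 = line.split()
    box = tuple(int(v) for v in (x1, y1, x2, y2))
    if not (box[0] < box[2] and box[1] < box[3]):
        raise ValueError("lower-right corner must lie below and to the "
                         "right of the upper-left corner: %r" % line)
    return {"category": category, "box": box}

def label_path_for(image_path):
    """Label files correspond one-to-one to images: same stem, different
    extension (an assumed naming convention)."""
    stem = image_path.rsplit(".", 1)[0]
    return stem + ".txt"
```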
In one implementation, after the step of obtaining a training sample data set, the method further comprises: performing data enhancement processing on the training sample images in the training sample data set to obtain data enhancement images; and taking the data enhanced image and the garbage position information label and the garbage category label corresponding to the data enhanced image as a group of training sample data in the training sample data set.
Wherein, the data enhancement processing mode comprises at least one of the following modes: flip transform, random cropping, color dithering, translation transform, scale transform, contrast transform, noise perturbation, rotation transform/reflection transform.
It should be noted that, after the training sample images are labeled, data enhancement is performed on the training sample images, so that the number of training sample images can be increased, and the recognition performance and generalization capability of the model can be improved.
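As a minimal illustration of why the labels must be transformed together with the image, the following sketch applies one of the listed enhancements, a flip transform, to an image represented as rows of pixels and to its garbage box; the (x1, y1, x2, y2) corner coordinates follow the labelling step, and the row-of-pixels representation is an assumption for illustration:

```python
def hflip_image(img):
    """Flip transform: reverse each pixel row of the image."""
    return [row[::-1] for row in img]

def hflip_box(box, width):
    """Apply the same horizontal flip to a (x1, y1, x2, y2) garbage box
    so the label still matches the transformed image."""
    x1, y1, x2, y2 = box
    return (width - x2, y1, width - x1, y2)
```

The other listed enhancements (random cropping, scale transform, rotation, and so on) each need an analogous box transformation.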
In one implementation, after the step of obtaining a training sample data set, the method further comprises: splicing any multiple training sample images in the training sample data set to obtain a spliced training sample image; and using the spliced training sample image and the garbage position information labels and the garbage category labels corresponding to the training sample images as a group of training sample data in the training sample data set.
Specifically, in order to enrich the background of the detected objects, multiple training sample pictures can be fused at the same time in this implementation. For example, four training sample images in the training sample data set can be stitched into one training sample image, which is equivalent to increasing the batch size during training and can significantly reduce the need for a large mini-batch.
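The four-image splicing can be sketched as below, with images represented as equally sized grids of pixel values; the 2x2 layout and the coordinate offsets used to relocate the labels into the mosaic are assumptions consistent with the description:

```python
def mosaic4(imgs):
    """Stitch four equally sized images (rows of pixels) into one 2x2
    mosaic: imgs[0] top-left, imgs[1] top-right, imgs[2] bottom-left,
    imgs[3] bottom-right."""
    a, b, c, d = imgs
    top = [ra + rb for ra, rb in zip(a, b)]
    bottom = [rc + rd for rc, rd in zip(c, d)]
    return top + bottom

def shift_box(box, dx, dy):
    """Offset a (x1, y1, x2, y2) label box into mosaic coordinates,
    e.g. dx = dy = 0 for the top-left tile."""
    x1, y1, x2, y2 = box
    return (x1 + dx, y1 + dy, x2 + dx, y2 + dy)
```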
Step b: inputting a training sample image into a preset network model to obtain the corresponding predicted garbage position information, predicted garbage category labels and predicted confidence of the training sample image.
In this embodiment, one or more training sample images may be sequentially input into a preset network model, so as to obtain the predicted garbage position information, the predicted garbage category label, and the prediction confidence corresponding to each training sample image.
It should be noted that, in an implementation manner of this embodiment, the preset network model may be a YOLOv3 model. YOLOv3 adjusts the network structure on the basis of YOLOv1 and YOLOv2 and performs object detection using multi-scale features, replacing softmax with logistic regression for object classification. YOLOv3 has no fully connected layers or pooling layers, so it can accept input images of any size; it mainly consists of 75 convolutional layers, and ResNet residual modules are added to the network to alleviate the gradient problem of deep networks. A ResNet residual module is equivalent to adding a shortcut path to the original CNN structure: the learning process changes from directly learning features to adding something to the features already learned, so as to obtain better features. Thus a complex feature H(x), previously learned independently layer by layer, becomes H(x) = F(x) + x, where x is the feature at the start of the shortcut and F(x), the part that must be added to x, is the residual. The learning target therefore changes from learning complete information to learning a residual, which greatly reduces the difficulty of learning high-quality features. The softmax layer in YOLOv3 is replaced by a structure of a 1x1 convolutional layer followed by a logistic activation function, which can handle multi-label objects.
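The residual structure H(x) = F(x) + x described above can be illustrated with a minimal sketch in which the block output is the learned mapping F(x) added element-wise to the shortcut input x:

```python
def residual_block(x, f):
    """Residual connection: the block learns F(x) and outputs
    H(x) = F(x) + x, so the shortcut carries x through unchanged.
    Here x is a feature vector (list of numbers) and f any mapping
    of that vector; both are illustrative stand-ins for real layers."""
    fx = f(x)
    return [xi + fi for xi, fi in zip(x, fx)]
```

Because the block only has to learn the correction F(x) rather than the whole mapping, a block that learns F(x) = 0 simply passes its input through, which is what eases gradient flow in deep networks.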
When YOLOv3 predicts a preselected box bbox, logistic regression is adopted. Each preselected box comprises five elements, bbox = (x, y, w, h, c), where the first four elements represent the coordinate position and size of the preselected box and the last is a confidence: Pr(object) · IOU(bbox, object) = σ(t0), where Pr(object) · IOU(bbox, object) is the confidence of the prediction box, obtained by applying the σ transformation to the prediction parameter t0, and t0 is the confidence output by the network. σ is the sigmoid function.
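A minimal sketch of the confidence decoding described here: the raw network output t0 for each preselected box is passed through the sigmoid function σ to give the box confidence. The (x, y, w, h, t0) tuple layout is taken from the five-element description above:

```python
import math

def sigmoid(t):
    """The sigma transformation applied to the raw confidence."""
    return 1.0 / (1.0 + math.exp(-t))

def decode_box(raw):
    """Split a raw prediction (x, y, w, h, t0) into the box geometry and
    its confidence Pr(object) * IOU(bbox, object) = sigma(t0)."""
    x, y, w, h, t0 = raw
    return {"box": (x, y, w, h), "conf": sigmoid(t0)}
```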
Step c: determining a total loss value according to the predicted garbage position information, the predicted garbage category label and the predicted confidence coefficient corresponding to the training sample, and the garbage position information label and the garbage category label corresponding to the training sample image;
in this embodiment, the frame loss value lbox (the error of the bbox), the category loss value lcls, and the confidence loss value lobj may be determined according to the predicted garbage position information, the predicted garbage category label and the prediction confidence corresponding to the training sample, and the garbage position information label and the garbage category label corresponding to the training sample image. Specifically, the frame loss value may be determined according to the predicted garbage position information and the garbage position information label, the category loss value may be determined according to the predicted garbage category label and the garbage category label, and the confidence loss value may be determined according to the prediction confidence. The frame loss value (lbox), the category loss value (lcls) and the confidence loss value (lobj) are calculated as follows:

$$lbox=\lambda_{coord}\sum_{i=0}^{S^{2}}\sum_{j=0}^{B}\mathbb{1}_{ij}^{obj}\left[(x_i-\hat{x}_i)^2+(y_i-\hat{y}_i)^2+(w_i-\hat{w}_i)^2+(h_i-\hat{h}_i)^2\right]$$

$$lcls=\lambda_{class}\sum_{i=0}^{S^{2}}\sum_{j=0}^{B}\mathbb{1}_{ij}^{obj}\sum_{c\in classes}p_i(c)\log\hat{p}_i(c)$$

$$lobj=\lambda_{noobj}\sum_{i=0}^{S^{2}}\sum_{j=0}^{B}\mathbb{1}_{ij}^{noobj}(C_i-\hat{C}_i)^2+\lambda_{obj}\sum_{i=0}^{S^{2}}\sum_{j=0}^{B}\mathbb{1}_{ij}^{obj}(C_i-\hat{C}_i)^2$$

wherein $S$ represents the prediction grid scale and $B$ the number of predicted boxes per grid cell; $\mathbb{1}_{ij}^{obj}$ equals 1 if the box predicted at position $i,j$ has a target and 0 otherwise; $\mathbb{1}_{ij}^{noobj}$ equals 1 if the box predicted at position $i,j$ has no target and 0 otherwise; $C$ represents the confidence and $p$ the category probability; $\lambda_{coord}$, $\lambda_{class}$, $\lambda_{noobj}$ and $\lambda_{obj}$ are parameters that exist to increase the loss from bounding boxes and to reduce the loss from confidence predictions for boxes that do not contain objects; $x_i, y_i$ represent the predicted center-point coordinates and $w_i, h_i$ the predicted width and height; $\hat{x}_i, \hat{y}_i$ represent the desired center-point coordinates, $\hat{w}_i, \hat{h}_i$ the desired width and height, and $\hat{C}_i$ the desired confidence. Then, a total loss value may be determined according to the frame loss value, the category loss value, and the confidence loss value. For example, the sum of the frame loss value, the category loss value, and the confidence loss value may be used as the total loss value, i.e., $loss = lbox + lobj + lcls$, where loss is the total loss value.
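The composition of the total loss can be illustrated as follows: `frame_loss` sketches the lbox term for a single responsible box (the λ_coord value of 5.0 is an assumed weighting, since the text does not fix the λ parameters), and `total_loss` is the plain sum loss = lbox + lobj + lcls:

```python
def frame_loss(pred_box, label_box, lambda_coord=5.0):
    """lbox term for one responsible box: squared errors on the center
    point and on the width/height, scaled by lambda_coord (an assumed
    value; the patent leaves the lambda weights unspecified)."""
    (x, y, w, h), (tx, ty, tw, th) = pred_box, label_box
    return lambda_coord * ((x - tx) ** 2 + (y - ty) ** 2
                           + (w - tw) ** 2 + (h - th) ** 2)

def total_loss(lbox, lobj, lcls):
    """The total loss value is the plain sum of the three components."""
    return lbox + lobj + lcls
```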
Step d: and adjusting model parameters of the network model according to the total loss value until the network model meets training conditions, and taking the network model as the garbage detection model.
Wherein the training condition is that the number of training iterations of the model reaches a preset number, for example 500, or that the model parameters of the model have converged.
In addition, in this embodiment, the method for training the network model and adjusting its parameters to obtain the optimal weight model is as follows. The data set is divided into three parts: a training set, a validation set and a test set. The training set is used for model training based on the YOLOv3 network; feeding it into the YOLOv3 model for iterative training yields a number of weight models. The validation set is used to adjust model parameters in time during model training. Because the weights of the network model are initialized randomly when training starts, selecting a large learning rate at that point may make the model unstable (oscillate). A gradual warmup method is therefore adopted to adjust the learning rate during training: starting from a small initial learning rate, the rate is increased a little at each step until the initially configured larger learning rate is reached, and training then continues at that configured rate. In this way the learning rate is small during the first few epochs or steps, the model can slowly stabilize under the warmed-up preliminary learning rate, and the preset learning rate is used for training once the model is relatively stable, so that the model converges faster and the model effect is better. During training, a corresponding weight file is generated according to the network's learning effect, and garbage can be detected in untrained pictures according to this weight file. The weight file effectively stores the learned features; when it is used, untrained pictures are matched against the features in the weight file to obtain a detection result.
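The gradual warmup described above can be sketched as a simple schedule that ramps from a small initial rate to the configured rate and then holds it; the linear ramp and the specific rates are assumptions, since the text only says the rate is increased a little at each step:

```python
def warmup_lr(step, warmup_steps, base_lr, init_lr=1e-6):
    """Gradual warmup: ramp linearly from init_lr to base_lr over
    warmup_steps, then hold base_lr (the initially configured rate).
    All rate values here are illustrative assumptions."""
    if step >= warmup_steps:
        return base_lr
    return init_lr + (base_lr - init_lr) * step / warmup_steps
```

In a training loop this would be evaluated every step (or epoch) and written into the optimizer before the parameter update.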
Training is based on a COCO pre-trained model, so the model's training loss can be reduced quickly and training efficiency improved. Model parameters can be adjusted in time while training and validating, reducing the time cost of verifying results only at the end. The test set is used to test the training effect after model training is finished and to select the optimal weight model.
In one implementation, after S104, the method further includes: and if the garbage detection result of the image to be recognized comprises garbage position information and garbage category, outputting garbage cleaning prompt information.
In this embodiment, if the garbage detection result of the image to be recognized includes garbage position information and a garbage category, the corresponding area in the image to be recognized contains garbage, and garbage cleaning prompt information may be output; for example, a voice prompt may be played, or a cleaning message may pop up in a terminal interface, to prompt a user to clean the garbage. Thus, according to this embodiment, garbage cleaning prompt information is output upon determining that the garbage detection result of the image to be recognized includes garbage position information and a garbage category, so as to prompt a user to clean the garbage. The garbage cleaning prompt information can also be combined with a social governance platform to realize management linkage and an automatic abnormality reporting and early-warning function, which plays an important role in city appearance, improvement of people's livelihood, and other aspects.
Because reporting garbage takes a certain amount of time, the recorded existence time of the garbage can be updated while the garbage persists, and whether the garbage has been dealt with can be judged from this duration. Therefore, in this embodiment, the garbage cleaning prompt information may be output only when garbage is continuously detected in the same area for a continuous period of time (for example, ten minutes).
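The persistence check just described (only prompt when garbage is continuously detected in the same area for, e.g., ten minutes) can be sketched as a small tracker; the region identifiers and the reset-on-absence behaviour are assumptions for illustration:

```python
class GarbagePersistenceTracker:
    """Report a region only after garbage has been seen there continuously
    for at least min_duration seconds (600 s = the ten minutes given as an
    example in the text)."""

    def __init__(self, min_duration=600.0):
        self.min_duration = min_duration
        self.first_seen = {}  # region_id -> timestamp garbage first appeared

    def update(self, region_id, present, now):
        """Record one detection pass; return True when the cleaning prompt
        should be output for this region."""
        if not present:
            # Garbage gone (or never there): reset the continuous run.
            self.first_seen.pop(region_id, None)
            return False
        start = self.first_seen.setdefault(region_id, now)
        return now - start >= self.min_duration
```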
In one implementation, after S104, the method further includes:
if the checking result of the garbage detection result of the image to be recognized is a negative sample (that is, the garbage detection result is inconsistent with the actual garbage content of the image, indicating that the garbage detection model recognized it incorrectly), the garbage detection model can be further trained using the image to be recognized together with the corrected garbage detection result of that image.
As an example, as shown in fig. 2, video shot under a fixed camera may be streamed back to the social governance platform server in real time, and the trained model is used to detect the incoming video in real time; detected garbage is marked in the video with a rectangular box and labelled with its category. The social governance platform server reviews the detection results, and if a detection result is found to be wrong, that negative sample can be collected. When the collected negative samples exceed a certain number, or new categories of foreign matter (garbage) appear, self-optimization can be started automatically, i.e., the collected negative samples are used to continue training the garbage detection model. At this point the training sample data in the data set comes from two sources: the prepared data and the pictures collected automatically by the system. In this way, as time passes, more and more negative-sample feedback is obtained, the garbage detection model undergoes more iterations of online training, the model's effect is gradually optimized, and its accuracy gradually improves.
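The self-optimization trigger (start retraining once the collected negative samples exceed a certain number) can be sketched as a buffer that hands over a batch when a threshold is reached; the threshold value and the clear-after-handover behaviour are assumptions:

```python
class NegativeSampleBuffer:
    """Collect misdetected (negative) samples and signal that
    self-optimization (retraining) should start once the buffer reaches
    `threshold` samples. The threshold of 100 is an assumed value."""

    def __init__(self, threshold=100):
        self.threshold = threshold
        self.samples = []

    def add(self, image, corrected_result):
        """Store one reviewed negative sample. Returns the full batch
        (and clears the buffer) when the threshold is reached, else None."""
        self.samples.append((image, corrected_result))
        if len(self.samples) >= self.threshold:
            batch, self.samples = self.samples, []
            return batch  # hand these to the training routine
        return None
```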
In summary, compared with the prior art, the method has the following advantages:
(1) according to the method and the device, the fixed camera is adopted to shoot the video as a data source, so that manual intervention is reduced, and the condition under the current camera can be accurately detected in real time.
(2) The application can be applied to garbage detection in a social governance platform; after garbage is detected, it can be handled manually on the platform. The application can collect the detected object pictures and the results of manual handling, and convert the results of manual handling into labels for training the model. When the number of collected object pictures reaches a specified threshold, the application can automatically start the model training process. As time passes, more and more manual feedback is obtained, the number of online training iterations increases, the model's effect is gradually optimized, and its accuracy gradually improves.
(3) For garbage treatment, the application is based on YOLOv3 real-time detection; the model has high stability and good detection and recognition performance, can eliminate the influence of external factors, and achieves the goal of "precise governance".
(4) The system is combined with a social management platform, realizes an abnormal automatic reporting and early warning function through management linkage, and has important effects on aspects of city appearance, improvement of livelihood and the like.
Fig. 3 shows a specific embodiment of a garbage detection apparatus according to the present application. The apparatus of this embodiment is a physical apparatus for executing the method of the above embodiment. The technical solution is essentially the same as the above embodiments, and the apparatus in this embodiment includes:
an input unit 301, configured to acquire an image to be recognized and input the image to be recognized into a trained garbage detection model;
a first obtaining unit 302, configured to perform feature extraction on the image to be identified by the garbage detection model to obtain three feature maps with different scales;
a second obtaining unit 303, configured to control the garbage detection model to perform garbage detection on the three feature maps respectively, so as to obtain garbage detection results and result confidence levels corresponding to the three feature maps respectively;
a third obtaining unit 304, configured to control the garbage detection model to output a garbage detection result of the image to be identified according to the garbage detection results and result confidences corresponding to the three feature maps respectively.
Optionally, the garbage detection result includes garbage position information and garbage category; the garbage position information includes the image to be identified marked with the garbage area and the coordinate information of the garbage.
Optionally, the apparatus further includes a model generation unit, configured to:
acquiring a training sample data set; the training sample data set comprises a plurality of groups of training sample data, wherein each group of training sample data comprises a training sample image and a garbage position information label and a garbage category label corresponding to the training sample image;
inputting a training sample image into a preset network model to obtain predicted garbage position information, a predicted garbage category label and a predicted confidence coefficient corresponding to the training sample image;
determining a total loss value according to the predicted garbage position information, the predicted garbage category label and the predicted confidence coefficient corresponding to the training sample, and the garbage position information label and the garbage category label corresponding to the training sample image;
and adjusting model parameters of the network model according to the total loss value until the network model meets training conditions, and taking the network model as the garbage detection model.
Optionally, the model generating unit is specifically configured to:
determining a frame loss value, a category loss value and a confidence coefficient loss value according to the predicted garbage position information, the predicted garbage category label and the predicted confidence coefficient corresponding to the training sample, and the garbage position information label and the garbage category label corresponding to the training sample image;
and determining a total loss value according to the frame loss value, the category loss value and the confidence coefficient loss value.
Optionally, the model generating unit is specifically configured to:
acquiring an original acquisition image;
correcting the original collected image to obtain a corrected image;
acquiring a garbage position information label and a garbage category label corresponding to the corrected image;
and taking the corrected image as a training sample image in training sample data, and taking a garbage position information label and a garbage category label corresponding to the corrected image as a garbage position information label and a garbage category label corresponding to the training sample image.
Optionally, the model generating unit is further configured to:
performing data enhancement processing on the training sample images in the training sample data set to obtain data enhancement images;
taking the data enhanced image and a garbage position information label and a garbage category label corresponding to the data enhanced image as a group of training sample data in the training sample data set;
wherein, the data enhancement processing mode comprises at least one of the following modes: flip transform, random cropping, color dithering, translation transform, scale transform, contrast transform, noise perturbation, rotation transform/reflection transform.
Optionally, the model generating unit is further configured to:
splicing any multiple training sample images in the training sample data set to obtain a spliced training sample image;
and using the spliced training sample image and the garbage position information labels and the garbage category labels corresponding to the training sample images as a group of training sample data in the training sample data set.
Optionally, the apparatus further includes a prompting unit:
and if the garbage detection result of the image to be recognized comprises garbage position information and garbage category, outputting garbage cleaning prompt information.
Optionally, the apparatus further comprises a retraining unit:
and if the checking result of the garbage detection result of the image to be recognized is a negative sample, performing model training on the garbage detection model by using the image to be recognized and the corrected garbage detection result of the image to be recognized.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application. On the hardware level, the electronic device comprises a processor and optionally an internal bus, a network interface and a memory. The Memory may include a Memory, such as a Random-Access Memory (RAM), and may further include a non-volatile Memory, such as at least 1 disk Memory. Of course, the electronic device may also include hardware required for other services.
The processor, the network interface, and the memory may be connected to each other via an internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 4, but this does not indicate only one bus or one type of bus.
And the memory is used for storing the execution instruction. In particular, a computer program that can be executed by executing instructions. The memory may include both memory and non-volatile storage and provides execution instructions and data to the processor.
In a possible implementation manner, the processor reads the corresponding execution instruction from the nonvolatile memory to the memory and then runs the corresponding execution instruction, and may also obtain the corresponding execution instruction from other devices to form the garbage detection apparatus on a logic level. The processor executes the execution instructions stored in the memory, so that the garbage detection method provided by any embodiment of the application is realized through the executed execution instructions.
The method executed by the garbage detection apparatus according to the embodiment shown in fig. 1 of the present application may be applied to or implemented by a processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or instructions in the form of software. The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but also Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components. The various methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in ram, flash memory, rom, prom, or eprom, registers, etc. storage media as is well known in the art. The storage medium is located in a memory, and a processor reads information in the memory and completes the steps of the method in combination with hardware of the processor.
An embodiment of the present application further provides a readable storage medium storing execution instructions. When the stored execution instructions are executed by a processor of an electronic device, the electronic device is able to execute the garbage detection method provided in any embodiment of the present application.
The electronic device described in the foregoing embodiments may be a computer.
It will be apparent to those skilled in the art that embodiments of the present application may be provided as a method or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects.
The embodiments in the present application are described in a progressive manner, and the same and similar parts among the embodiments can be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the apparatus embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (8)

1. A method of garbage detection, the method comprising:
acquiring an image to be recognized, and inputting the image to be recognized into a trained garbage detection model;
the garbage detection model extracts the features of the image to be identified to obtain a plurality of feature maps with different scales;
the garbage detection model respectively carries out garbage detection on the plurality of feature maps to obtain garbage detection results and result confidence degrees corresponding to the plurality of feature maps;
the garbage detection model outputs a garbage detection result of the image to be recognized according to the respective corresponding garbage detection results and result confidence degrees of the plurality of feature maps;
the generation mode of the garbage detection model comprises the following steps:
acquiring a training sample data set; the training sample data set comprises a plurality of groups of training sample data, wherein each group of training sample data comprises a training sample image and a garbage position information label and a garbage category label corresponding to the training sample image;
inputting a training sample image into a preset network model to obtain predicted garbage position information, a predicted garbage category label and a predicted confidence coefficient corresponding to the training sample image;
determining a total loss value according to the predicted garbage position information, the predicted garbage category label and the predicted confidence coefficient corresponding to the training sample, and the garbage position information label and the garbage category label corresponding to the training sample image;
and adjusting model parameters of the network model according to the total loss value until the network model meets training conditions, and taking the network model as the garbage detection model.
2. The method of claim 1, wherein the garbage detection result comprises garbage position information and garbage category; the garbage position information comprises the image to be identified marked with the garbage area and the coordinate information of the garbage.
3. The method according to claim 1 or 2, wherein the determining a total loss value according to the predicted garbage position information, the predicted garbage category label and the prediction confidence corresponding to the training sample, and the garbage position information label and the garbage category label corresponding to the training sample image comprises:
determining a frame loss value lbox, a category loss value lcls and a confidence loss value lobj according to the predicted garbage position information, the predicted garbage category label and the prediction confidence corresponding to the training sample, and the garbage position information label and the garbage category label corresponding to the training sample image:

$$lbox=\lambda_{coord}\sum_{i=0}^{S^{2}}\sum_{j=0}^{B}\mathbb{1}_{ij}^{obj}\left[(x_i-\hat{x}_i)^2+(y_i-\hat{y}_i)^2+(w_i-\hat{w}_i)^2+(h_i-\hat{h}_i)^2\right]$$

$$lcls=\lambda_{class}\sum_{i=0}^{S^{2}}\sum_{j=0}^{B}\mathbb{1}_{ij}^{obj}\sum_{c\in classes}p_i(c)\log\hat{p}_i(c)$$

$$lobj=\lambda_{noobj}\sum_{i=0}^{S^{2}}\sum_{j=0}^{B}\mathbb{1}_{ij}^{noobj}(C_i-\hat{C}_i)^2+\lambda_{obj}\sum_{i=0}^{S^{2}}\sum_{j=0}^{B}\mathbb{1}_{ij}^{obj}(C_i-\hat{C}_i)^2$$

wherein $S$ represents the prediction grid scale and $B$ the number of predicted boxes per grid cell; $\mathbb{1}_{ij}^{obj}$ equals 1 if the box predicted at position $i,j$ has a target and 0 otherwise; $\mathbb{1}_{ij}^{noobj}$ equals 1 if the box predicted at position $i,j$ has no target and 0 otherwise; $C$ represents the confidence and $p$ the category probability; $\lambda_{coord}$, $\lambda_{class}$, $\lambda_{noobj}$ and $\lambda_{obj}$ are parameters that exist to increase the loss from bounding boxes and to reduce the loss from confidence predictions for boxes that do not contain objects; $x_i, y_i$ represent the predicted center-point coordinates and $w_i, h_i$ the predicted width and height; $\hat{x}_i, \hat{y}_i$ represent the desired center-point coordinates, $\hat{w}_i, \hat{h}_i$ the desired width and height, and $\hat{C}_i$ the desired confidence;

determining a total loss value loss according to the frame loss value lbox, the category loss value lcls and the confidence loss value lobj,

$$loss = lbox + lobj + lcls.$$
4. the method of claim 3, wherein the obtaining a training sample data set comprises:
acquiring an original acquisition image;
correcting the original collected image to obtain a corrected image;
acquiring a garbage position information label and a garbage category label corresponding to the corrected image;
and taking the corrected image as a training sample image in training sample data, and taking a garbage position information label and a garbage category label corresponding to the corrected image as a garbage position information label and a garbage category label corresponding to the training sample image.
5. The method according to claim 3, wherein after the step of obtaining a set of training sample data, the method further comprises:
performing data enhancement processing on the training sample images in the training sample data set to obtain data enhancement images;
taking the data enhanced image and a garbage position information label and a garbage category label corresponding to the data enhanced image as a group of training sample data in the training sample data set;
wherein, the data enhancement processing mode comprises at least one of the following modes: flip transform, random cropping, color dithering, translation transform, scale transform, contrast transform, noise perturbation, rotation transform/reflection transform.
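Two of the listed enhancement modes can be sketched as follows. This is an illustrative sketch only: the patent does not specify an implementation, and the function names, box format ([xmin, ymin, xmax, ymax] in pixels) and noise parameters are assumptions for the example. Note that a flip transform must also update the garbage position information labels, while the category labels are unchanged.

```python
import numpy as np

def hflip_with_boxes(image, boxes):
    """Flip transform: mirror the image horizontally and remap box labels.

    boxes is an (N, 4) array of [xmin, ymin, xmax, ymax] in pixels.
    """
    h, w = image.shape[:2]
    flipped = image[:, ::-1].copy()
    new_boxes = boxes.copy().astype(float)
    new_boxes[:, 0] = w - boxes[:, 2]  # new xmin comes from the old xmax
    new_boxes[:, 2] = w - boxes[:, 0]  # new xmax comes from the old xmin
    return flipped, new_boxes

def add_noise(image, sigma=10.0, seed=0):
    """Noise perturbation: additive Gaussian noise, clipped to [0, 255].

    Box and category labels are unaffected by pixel-level noise.
    """
    rng = np.random.default_rng(seed)
    noisy = image.astype(float) + rng.normal(0.0, sigma, image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)
```

Each enhanced image, together with its remapped position labels and unchanged category labels, then forms a new group of training sample data.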
6. The method according to claim 3, wherein after the step of obtaining a set of training sample data, the method further comprises:
splicing any multiple training sample images in the training sample data set to obtain a spliced training sample image;
and using the spliced training sample image and the garbage position information labels and the garbage category labels corresponding to the training sample images as a group of training sample data in the training sample data set.
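The splicing of multiple training sample images can be sketched as a 2x2 mosaic. This is a minimal sketch under stated assumptions: the patent says "any multiple" images, while the example fixes four images, assumes each has already been resized to a common square size, and uses an illustrative function name and box format ([xmin, ymin, xmax, ymax] in pixels).

```python
import numpy as np

def mosaic_stitch(images, boxes_list, labels_list, size=416):
    """Stitch four training sample images into one 2x2 mosaic.

    Returns a (2*size, 2*size, 3) image plus the combined position labels,
    each box shifted by the offset of the quadrant its image was pasted
    into; category labels are simply concatenated.
    """
    canvas = np.zeros((2 * size, 2 * size, 3), dtype=np.uint8)
    offsets = [(0, 0), (0, size), (size, 0), (size, size)]  # (y, x) corners
    all_boxes, all_labels = [], []
    for img, boxes, labels, (oy, ox) in zip(images, boxes_list, labels_list, offsets):
        canvas[oy:oy + size, ox:ox + size] = img
        shifted = boxes.astype(float).copy()
        shifted[:, [0, 2]] += ox  # shift xmin and xmax into the quadrant
        shifted[:, [1, 3]] += oy  # shift ymin and ymax into the quadrant
        all_boxes.append(shifted)
        all_labels.extend(labels)
    return canvas, np.concatenate(all_boxes), all_labels
```

The stitched image and its shifted labels then serve as one group of training sample data in the training sample data set.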
7. The method of claim 1, further comprising:
and if the garbage detection result of the image to be recognized comprises garbage position information and garbage category, outputting garbage cleaning prompt information.
8. The method of claim 1, further comprising:
and if the checking result of the garbage detection result of the image to be recognized is a negative sample, performing model training on the garbage detection model by using the image to be recognized and the corrected garbage detection result of the image to be recognized.
CN202111332238.0A 2021-11-11 2021-11-11 Garbage detection method and device Pending CN114022736A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111332238.0A CN114022736A (en) 2021-11-11 2021-11-11 Garbage detection method and device

Publications (1)

Publication Number Publication Date
CN114022736A true CN114022736A (en) 2022-02-08

Family

ID=80063783

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111332238.0A Pending CN114022736A (en) 2021-11-11 2021-11-11 Garbage detection method and device

Country Status (1)

Country Link
CN (1) CN114022736A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114511041A (en) * 2022-04-01 2022-05-17 北京世纪好未来教育科技有限公司 Model training method, image processing method, device, equipment and storage medium
CN114987721A (en) * 2022-05-26 2022-09-02 南方科技大学 Underwater cleaning method and device, electronic equipment and storage medium
CN115180313A (en) * 2022-06-29 2022-10-14 广东交通职业技术学院 Method, system and device for guiding classified garbage throwing and storage medium
CN117830800A (en) * 2024-03-04 2024-04-05 广州市仪美医用家具科技股份有限公司 Clothing detection and recovery method, system, medium and equipment based on YOLO algorithm


Similar Documents

Publication Publication Date Title
CN114022736A (en) Garbage detection method and device
CN108710847B (en) Scene recognition method and device and electronic equipment
WO2018036276A1 (en) Image quality detection method, device, server and storage medium
CN110598752B (en) Image classification model training method and system for automatically generating training data set
CN111723657B (en) River foreign matter detection method and device based on YOLOv3 and self-optimization
CN112001403B (en) Image contour detection method and system
CN111415338A (en) Method and system for constructing target detection model
CN112364944B (en) Deep learning-based household garbage classification method
CN112651318A (en) Image recognition-based garbage classification method, device and system
CN109492697B (en) Picture detection network training method and picture detection network training device
CN112801911B (en) Method and device for removing text noise in natural image and storage medium
CN111126155A (en) Pedestrian re-identification method for generating confrontation network based on semantic constraint
CN114724246A (en) Dangerous behavior identification method and device
CN114708291A (en) Image processing method, image processing device, electronic equipment and storage medium
CN113837257A (en) Target detection method and device
CN111652242B (en) Image processing method, device, electronic equipment and storage medium
CN117275086A (en) Gesture recognition method, gesture recognition device, computer equipment and storage medium
CN112580581A (en) Target detection method and device and electronic equipment
CN110210314B (en) Face detection method, device, computer equipment and storage medium
CN114708582B (en) AI and RPA-based electric power data intelligent inspection method and device
CN116912872A (en) Drawing identification method, device, equipment and readable storage medium
CN115115552A (en) Image correction model training method, image correction device and computer equipment
CN111127327B (en) Picture inclination detection method and device
CN114332564A (en) Vehicle classification method, apparatus and storage medium
CN114998183A (en) Method for identifying surface defects of recycled aluminum alloy template

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination