CN115761259A - Kitchen waste target detection method and system based on class balance loss function - Google Patents

Publication number
CN115761259A
Authority
CN
China
Prior art keywords
network
class
target detection
samples
loss function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211418560.XA
Other languages
Chinese (zh)
Other versions
CN115761259B (en)
Inventor
方乐缘
唐崎
欧阳立韩
汤琳
梁桥康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University
Original Assignee
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University
Priority to CN202211418560.XA
Publication of CN115761259A
Application granted
Publication of CN115761259B
Legal status: Active
Anticipated expiration

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a kitchen waste target detection method and system based on a class balance loss function. The method comprises: constructing a kitchen waste detection data set; constructing a kitchen waste target detection model comprising a feature extraction network, a feature fusion network, an area suggestion generation network, a RoI Transformer network and a detection head network; constructing an L1 regression loss function and a class balance loss function, training the target detection model on the training set of the kitchen waste detection data set, and, during training, updating the network weights of the model by back-propagating the loss gradient with gradient descent, the loss being computed from the bounding box regression result and classification result probability of each class target together with the preset real bounding box and class label of the target; and acquiring an actual kitchen waste image and detecting it with the trained kitchen waste target detection model to obtain a target detection result. The detection precision of each class of recyclable waste in kitchen waste is effectively improved.

Description

Kitchen waste target detection method and system based on class balance loss function
Technical Field
The invention belongs to the technical field of digital image processing, and particularly relates to a kitchen garbage target detection method and system based on a class balance loss function.
Background
Kitchen waste has a complex composition and is mixed with a large amount of impurities such as glass bottles, metal cans and various plastic products. These impurities hinder the harmless treatment and complete disposal of kitchen waste, so sorting kitchen waste has become an urgent problem. At present, kitchen waste is sorted mainly by hand. Manual sorting is time-consuming, laborious and inefficient, and the bacteria produced by rotting food during sorting seriously affect workers' health. Automatic sorting of kitchen waste is therefore necessary, and research on intelligent kitchen waste detection algorithms is an essential part of realizing it.
In practice, the recyclables in kitchen waste show a severely imbalanced distribution of samples across classes. Some classes, such as irregular soft plastics, account for most of the recyclables and are referred to as majority classes; other classes, such as glass products, make up only a small fraction and are referred to as minority classes. Deep-learning-based target detection methods generally cannot learn the features of minority-class targets effectively, which results in poor detection performance on them. Existing methods for improving minority-class detection performance fall mainly into two types: resampling and re-weighting. Resampling undersamples the majority classes (deleting data), oversamples the minority classes (adding duplicate data), or does both. Re-weighting assigns larger weights to minority classes and smaller weights to majority classes, mainly in the classification loss function used when training the neural network. Oversampling introduces many repeated samples, which slows training and easily causes the model to overfit; undersampling discards large amounts of data and, like oversampling, can also lead to overfitting. Re-weighting is therefore more widely used. However, the class distribution of kitchen waste is extremely imbalanced in practice, so simple re-weighting cannot effectively improve detection performance; moreover, re-weighting reduces the weights of majority-class samples, which impairs the learning of features of hard samples within the majority classes.
Disclosure of Invention
Aiming at the technical problems, the invention provides a kitchen garbage target detection method and system based on a class balance loss function.
The technical scheme adopted by the invention for solving the technical problems is as follows:
A kitchen waste target detection method based on a class balance loss function comprises the following steps:
S100: collecting kitchen waste images and constructing a kitchen waste detection data set;
S200: constructing a kitchen waste target detection model comprising a feature extraction network, a feature fusion network, an area suggestion generation network, a RoI Transformer network and a detection head network connected in sequence, wherein the feature extraction network extracts multi-scale features of a kitchen waste image; the feature fusion network fuses the multi-scale features and sends the fused multi-scale features to the area suggestion generation network; the area suggestion generation network generates candidate boxes from the fused multi-scale features, and positive and negative samples are obtained according to the intersection-over-union (IoU) of each candidate box with the real bounding box; and the RoI Transformer network aligns the candidate box samples, extracts the aligned sample features according to a preset ratio of positive to negative samples, and inputs them into the detection head network to obtain the bounding box regression result and classification result probability of each class target;
S300: constructing an L1 regression loss function and a class balance loss function; training the kitchen waste target detection model on the training set of the kitchen waste detection data set; calculating the loss value of the model from the bounding box regression result and classification result probability of each class target together with the preset real bounding box and class label of the target; and updating the network weights by back-propagating the loss gradient with gradient descent according to the loss value, to obtain the trained kitchen waste target detection model; wherein the class balance loss function comprises an inter-class balance weighting factor based on the number of samples and a weighting factor based on sample difficulty;
S400: acquiring an actual kitchen waste image, and detecting the actual kitchen waste image with the trained kitchen waste target detection model to obtain a target detection result.
Preferably, the feature extraction network comprises a dimension expansion layer, a first feature extraction layer, a second feature extraction layer, a third feature extraction layer and a fourth feature extraction layer. The dimension expansion layer expands the input image to the dimension required as input by the first feature extraction layer. The first feature extraction layer comprises 3 ConvNeXt blocks, the second 3 ConvNeXt blocks, the third 9 ConvNeXt blocks, and the fourth 3 ConvNeXt blocks. The output feature maps of the first, second and third feature extraction layers are each downsampled by a factor of two and then input to the next feature extraction layer. Each ConvNeXt block comprises a depthwise separable convolutional layer with a 7 × 7 kernel, a normalization layer, two 1 × 1 convolutional layers and a GELU activation function.
Preferably, the feature fusion network comprises a first, a second, a third and a fourth feature fusion layer, wherein the i-th feature fusion layer enlarges the feature map output by the i-th feature extraction layer, by nearest-neighbor interpolation upsampling, to the same size as the feature map output by the previous feature extraction layer, and then fuses it with that feature map by addition to obtain the fused feature map L_i output by the i-th feature fusion layer.
Preferably, the area suggestion generation network comprises one 3 × 3 convolution and two 1 × 1 convolutions connected in series, and in S200 the area suggestion generation network generating candidate boxes from the fused multi-scale features and obtaining positive and negative samples according to the intersection-over-union (IoU) of each candidate box with the real bounding box comprises:
after the fused feature map L_i is input to the area suggestion generation network, a series of candidate boxes is generated; a candidate box whose IoU with the real bounding box is greater than 0.7 is defined as a positive sample, and a candidate box whose IoU with the real bounding box is less than 0.3 is defined as a negative sample.
Preferably, the constructing the L1 regression loss function and the class balance loss function in S300 includes:
S310: constructing an L1 regression loss function and a cross entropy classification loss function;
S320: designing an inter-class balance weighting factor based on the number of samples;
S330: designing a weighting factor based on sample difficulty;
S340: constructing the class balance loss function from the cross entropy classification loss function, the inter-class balance weighting factor based on the number of samples and the weighting factor based on sample difficulty.
Preferably, S310 includes:
using the L1 regression loss function:
Figure SMS_1
wherein x is an independent variable of the L1 regression loss function and represents the difference between a real bounding box and a candidate box of the class target;
using a cross-entropy classification loss function:
Figure SMS_2
wherein y is the class label, taking the values 0 and 1 in binary classification; y = 1 means that the classification result is consistent with the real label, and P_1 is the probability that the candidate box contains a target of the class. To simplify the formula, define:
Figure SMS_3
the simplified cross entropy loss function is then:
L_CE(P_t, y) = -log(P_t)
P_t is the classification confidence of each sample for class y, with P_t ∈ (0, 1).
Preferably, S320 includes:
defining the total number of classes in the data set as N, so that the numbers of samples of the class labels in each training batch are [n_1, n_2, n_3, ..., n_N]; the number of samples of class y in each training batch is then n_y.
Defining the sample number normalization factor of class label y by the following formula:
Figure SMS_4
where n_max is the maximum of the per-class sample numbers in each training batch;
defining parameters m and n with 0 < m < n < 1, and setting the weighting factor of class y in each training batch as
Figure SMS_5
with
Figure SMS_6
Then the sample weighting degree of class y in each training batch,
Figure SMS_7
is defined by the following formula:
Figure SMS_8
where δ_max and δ_min are, respectively, the maximum and minimum of the sample number normalization factors of all classes in each training batch;
defining the adaptive inter-class balance weighting factor ω_y of class y in each training batch by the following formula:
Figure SMS_9
where
Figure SMS_10
and n_y are, respectively, the sample weighting degree and the number of samples of class y in each training batch, and the inter-class balance weighting factor is inversely related to both the sample weighting degree
Figure SMS_11
and the number of samples;
finally, the class balance weighting factors of all categories are normalized, so that:
Figure SMS_12
where N is the total number of categories in the dataset and i is the sequence number of each category.
Preferably, S330 is specifically:
defining the sample difficulty weighting factor γ by the following formula:
γ = 2 - α(1 + P_t)
where P_t is the classification confidence of each sample in each training batch, with P_t ∈ (0, 1), and α is the difficulty modulation factor, with α ∈ (0.5, 1].
Preferably, S340 specifically is:
Figure SMS_13
where, in each training batch,
Figure SMS_14
is the sample weighting degree of class y, n_y is the number of samples of class y, α is the difficulty modulation factor, and P_t is the classification confidence of each sample.
A kitchen waste target detection system based on a class balance loss function comprises:
the kitchen waste image acquisition module is used for acquiring kitchen waste images and constructing a kitchen waste detection data set;
the kitchen waste target detection model building module is used for constructing a kitchen waste target detection model comprising a feature extraction network, a feature fusion network, an area suggestion generation network, a RoI Transformer network and a detection head network connected in sequence, wherein the feature extraction network extracts multi-scale features of a kitchen waste image; the feature fusion network fuses the multi-scale features and sends the fused multi-scale features to the area suggestion generation network; the area suggestion generation network generates candidate boxes from the fused multi-scale features, and positive and negative samples are obtained according to the intersection-over-union (IoU) of each candidate box with the real bounding box; and the RoI Transformer network aligns the candidate box samples, extracts the aligned sample features according to a preset ratio of positive to negative samples, and inputs them into the detection head network to obtain the bounding box regression result and classification result probability of each class target;
the kitchen waste target detection model training module is used for constructing an L1 regression loss function and a class balance loss function, training the kitchen waste target detection model on the kitchen waste detection training set, calculating the loss value of the model from the bounding box regression result and classification result probability of each class target together with the preset real bounding box and class label of the target, and updating the network weights by back-propagating the loss gradient with gradient descent according to the loss value, to obtain the trained kitchen waste target detection model;
and the detection module is used for acquiring an actual kitchen waste image and detecting the actual kitchen waste image according to the trained kitchen waste target detection model to obtain a target detection result.
According to the kitchen waste target detection method and system, the class balance loss function applies adaptive inter-class balance weighting to each class in each batch during training, so that the network model focuses on training minority-class samples; at the same time, the sample difficulty weighting factor weights the hard samples of each class to different degrees, so that the features of hard samples are mined. The method provided by the invention significantly improves the average detection precision of minority-class samples in kitchen waste while also slightly improving the average detection precision of majority-class samples.
Drawings
FIG. 1 is a flowchart illustrating a kitchen waste target detection method based on a class balance loss function according to an embodiment of the present invention;
FIG. 2 is a sample of each category in a kitchen waste image dataset according to an embodiment of the present invention;
FIG. 3 is a statistical chart of the number of classes in a training set of kitchen waste image data according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a kitchen waste target detection model according to an embodiment of the present invention;
FIG. 5 is a schematic diagram comparing the detection results of the method of the present invention and existing methods on the kitchen waste image data set, wherein (a) is the detection result of the cross entropy loss method, (b) is the detection result of the focal loss method, (c) is the detection result of the method of the present invention, and (d) is the real label.
Detailed Description
In order to make the technical solutions of the present invention better understood, the present invention is further described in detail below with reference to the accompanying drawings.
In one embodiment, as shown in fig. 1, a kitchen waste target detection method based on a class balance loss function includes the following steps:
S100: collecting kitchen waste images and constructing a kitchen waste detection data set.
Specifically, image data are first acquired at a kitchen waste disposal station; 13871 kitchen waste images are acquired in total.
Secondly, as shown in fig. 2, the recyclable waste in the data set images is classified according to the national waste category definitions, and the recyclable waste targets in the images are then annotated manually with rotated rectangular boxes using labeling software. The recyclable waste classes and their annotation counts are: irregular soft plastic, 80866; regular soft plastic, 46520; hard plastic, 251166; Tetra Pak, 6358; plastic bottle, 5776; glass product, 2698; and metal product, 2361; 7 classes in total.
And finally, randomly dividing the kitchen garbage data into training sets and testing sets according to the proportion of 4. The number of labels of each class of training data is shown in fig. 3.
Further, after the kitchen waste detection data set is constructed, its training set is preprocessed: all images in the training set are scaled to 1024 × 1024 by bilinear interpolation; to prevent overfitting during network training and improve the generalization performance of the network, data augmentation is performed by random flipping about the horizontal axis, the vertical axis and the diagonal, with flip probabilities of 0.25, 0.25 and 0.25 respectively; finally, the images are normalized with the set means (123.676, 116.28, 103.53) and variances (58.395, 57.12, 57.375).
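The flipping and normalization steps can be illustrated with a minimal pure-Python sketch on a tiny image stored as nested lists of RGB tuples. Several points are assumptions not stated in the text: at most one flip is applied per image, the "diagonal" flip is taken to mean flipping in both directions, and the three "variance" values are used as per-channel divisors. The function names are hypothetical, and the bilinear resize to 1024 × 1024 is omitted.

```python
import random

MEAN = (123.676, 116.28, 103.53)  # per-channel means given in the text
SCALE = (58.395, 57.12, 57.375)   # called "variances" in the text; used as divisors (assumption)

def flip_horizontal(img):
    """Mirror left-right by reversing every row."""
    return [row[::-1] for row in img]

def flip_vertical(img):
    """Mirror top-bottom by reversing the row order."""
    return img[::-1]

def flip_diagonal(img):
    """Flip in both directions (assumed meaning of the diagonal flip)."""
    return flip_vertical(flip_horizontal(img))

def random_flip(img, rng, p=0.25):
    """Apply at most one flip; each direction is chosen with probability p."""
    u = rng.random()
    if u < p:
        return flip_horizontal(img)
    if u < 2 * p:
        return flip_vertical(img)
    if u < 3 * p:
        return flip_diagonal(img)
    return img  # remaining probability: no flip

def normalize(img):
    """Subtract the channel mean and divide by the channel scale."""
    return [[tuple((px[c] - MEAN[c]) / SCALE[c] for c in range(3)) for px in row]
            for row in img]

rng = random.Random(0)
image = [[(0, 0, 0), (255, 255, 255)], [(10, 20, 30), (40, 50, 60)]]
augmented = normalize(random_flip(image, rng))
```

With three directions at 0.25 each, one image in four passes through unflipped under these assumptions.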
S200: constructing a kitchen waste target detection model comprising a feature extraction network, a feature fusion network, an area suggestion generation network, a RoI Transformer network and a detection head network connected in sequence, wherein the feature extraction network extracts multi-scale features of a kitchen waste image; the feature fusion network fuses the multi-scale features and sends the fused multi-scale features to the area suggestion generation network; the area suggestion generation network generates candidate boxes from the fused multi-scale features, and positive and negative samples are obtained according to the intersection-over-union (IoU) of each candidate box with the real bounding box; and the RoI Transformer network aligns the candidate box samples, extracts the aligned sample features according to a preset ratio of positive to negative samples, and inputs them into the detection head network to obtain the bounding box regression result and classification result probability of each class target.
Specifically, a structure diagram of the kitchen waste target detection model is shown in fig. 4.
In one embodiment, the feature extraction network comprises a dimension expansion layer, a first feature extraction layer, a second feature extraction layer, a third feature extraction layer and a fourth feature extraction layer connected in sequence. The dimension expansion layer expands the input image to the dimension required as input by the first feature extraction layer. The first feature extraction layer comprises 3 ConvNeXt blocks, the second 3 ConvNeXt blocks, the third 9 ConvNeXt blocks, and the fourth 3 ConvNeXt blocks. The output feature maps of the first, second and third feature extraction layers are each downsampled by a factor of two and then input to the next feature extraction layer. Each ConvNeXt block comprises a depthwise separable convolutional layer with a 7 × 7 kernel, a normalization layer, two 1 × 1 convolutional layers and a GELU activation function.
Specifically, a first feature extraction layer and a second feature extraction layer are used for shallow feature extraction, and a third feature extraction layer and a fourth feature extraction layer are used for deep feature extraction.
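To make the multi-scale structure concrete, the following sketch computes the spatial size of each feature extraction layer's output for a 1024 × 1024 input. The stride of the dimension expansion layer is not given in the text; a stride of 4 is assumed here purely for illustration, and `stage_output_sizes` is a hypothetical helper name.

```python
STAGE_DEPTHS = (3, 3, 9, 3)  # ConvNeXt blocks per feature extraction layer (from the text)

def stage_output_sizes(input_size=1024, stem_stride=4, num_stages=4):
    """Return the side length of each stage's output feature map.

    stem_stride is an assumption; the 2x downsampling between stages is from the text.
    """
    size = input_size // stem_stride
    sizes = []
    for _ in range(num_stages):
        sizes.append(size)
        size //= 2  # 2x downsampling before the next feature extraction layer
    return sizes

sizes = stage_output_sizes()
for depth, size in zip(STAGE_DEPTHS, sizes):
    print(f"{depth} blocks -> {size} x {size}")
```

Under the assumed stem stride, the four stages produce maps of side 256, 128, 64 and 32, which are the multi-scale features passed on for fusion.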
In one embodiment, the feature fusion network comprises a first, a second, a third and a fourth feature fusion layer, wherein the i-th feature fusion layer enlarges the feature map output by the i-th feature extraction layer, by nearest-neighbor interpolation upsampling, to the same size as the feature map output by the previous feature extraction layer, and then fuses it with that feature map by addition to obtain the fused feature map L_i output by the i-th feature fusion layer.
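The upsample-and-add fusion described above can be sketched on single-channel maps stored as nested lists. This is an illustration only: real feature maps have many channels, any channel-matching convolutions are not described in the text and are left out, and the function names are hypothetical.

```python
def upsample_nearest_2x(fmap):
    """Nearest-neighbor 2x upsampling: repeat every value twice along both axes."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def fuse(deeper, shallower):
    """Enlarge the deeper map to the shallower map's size, then add element-wise."""
    up = upsample_nearest_2x(deeper)
    if len(up) != len(shallower) or len(up[0]) != len(shallower[0]):
        raise ValueError("maps must differ in size by exactly a factor of two")
    return [[a + b for a, b in zip(r_up, r_sh)] for r_up, r_sh in zip(up, shallower)]

deeper = [[1, 2]]                              # 1 x 2 map from a deeper layer
shallower = [[10, 10, 10, 10], [10, 10, 10, 10]]  # 2 x 4 map from the previous layer
fused = fuse(deeper, shallower)
```

Addition keeps the spatial resolution of the shallower map while injecting the coarser, more semantic response of the deeper one.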
In one embodiment, the area suggestion generation network comprises one 3 × 3 convolution and two 1 × 1 convolutions connected in series, and in S200 the area suggestion generation network generating candidate boxes from the fused multi-scale features and obtaining positive and negative samples according to the intersection-over-union (IoU) of each candidate box with the real bounding box comprises:
after the fused feature map L_i is input to the area suggestion generation network, a series of candidate boxes is generated; a candidate box whose IoU with the real bounding box is greater than 0.7 is defined as a positive sample, and a candidate box whose IoU with the real bounding box is less than 0.3 is defined as a negative sample.
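The thresholding rule can be sketched as follows. For brevity the sketch uses axis-aligned boxes in (x1, y1, x2, y2) form, whereas the data set is annotated with rotated rectangles, so the IoU computation here is a simplification. Treating boxes whose IoU falls between the two thresholds as ignored is also an assumption; the text does not say what happens to them.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def label_candidate(box, gt_boxes, pos_thr=0.7, neg_thr=0.3):
    """Return 1 (positive), 0 (negative) or -1 (ignored, an assumption)."""
    best = max((iou(box, g) for g in gt_boxes), default=0.0)
    if best > pos_thr:
        return 1
    if best < neg_thr:
        return 0
    return -1

gt = [(0, 0, 10, 10)]
labels = [label_candidate(b, gt) for b in [(0, 0, 10, 10), (20, 20, 30, 30), (0, 0, 10, 5)]]
```

The third candidate covers half of the real box (IoU 0.5), so it falls in the gap between the two thresholds.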
S300: constructing an L1 regression loss function and a class balance loss function; training the kitchen waste target detection model on the training set of the kitchen waste detection data set; calculating the loss value of the model from the bounding box regression result and classification result probability of each class target together with the preset real bounding box and class label of the target; and updating the network weights by back-propagating the loss gradient with gradient descent according to the loss value, to obtain the trained kitchen waste target detection model; wherein the class balance loss function comprises an inter-class balance weighting factor based on the number of samples and a weighting factor based on sample difficulty.
Specifically, the network model training is configured as follows: the maximum number of training epochs is set to 12 and the training batch size to 4, that is, 4 images and their annotation files are read in each training iteration; the optimizer is AdamW, the initial learning rate is 0.0002, and the weight decay is 0.05.
The network model testing is configured as follows: the non-maximum suppression threshold of the area suggestion generation network is set to 0.7, the intersection-over-union (IoU) threshold of the detection head network is set to 0.01, and the score threshold is set to 0.05.
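The settings above can be collected into plain dictionaries, as in the sketch below. The key names are hypothetical and do not come from any particular framework; only the values are taken from the text. The helper shows one plausible use of the score threshold, and whether the comparison is inclusive is an assumption.

```python
# Training settings from the text (key names are illustrative only)
TRAIN_CFG = {
    "max_epochs": 12,
    "batch_size": 4,
    "optimizer": "AdamW",
    "lr": 0.0002,
    "weight_decay": 0.05,
}

# Test-time settings from the text (key names are illustrative only)
TEST_CFG = {
    "proposal_nms_thr": 0.7,  # NMS threshold of the area suggestion generation network
    "head_iou_thr": 0.01,     # IoU threshold of the detection head network
    "score_thr": 0.05,        # minimum classification score to keep a detection
}

def keep_detection(score, cfg=TEST_CFG):
    """Keep a detection if its score reaches the threshold (inclusive by assumption)."""
    return score >= cfg["score_thr"]

kept = [s for s in (0.9, 0.04, 0.05) if keep_detection(s)]
```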
In one embodiment, constructing the L1 regression loss function and the class balance loss function in S300 includes:
S310: constructing an L1 regression loss function and a cross entropy classification loss function;
S320: designing an inter-class balance weighting factor based on the number of samples;
S330: designing a weighting factor based on sample difficulty;
S340: constructing the class balance loss function from the cross entropy classification loss function, the inter-class balance weighting factor based on the number of samples and the weighting factor based on sample difficulty.
In one embodiment, S310 includes:
using the L1 regression loss function:
Figure SMS_15
wherein x is an independent variable of the L1 regression loss function and represents the difference between a real bounding box and a candidate box of the class target;
using a cross-entropy classification loss function:
Figure SMS_16
wherein y is the class label, taking the values 0 and 1 in binary classification; y = 1 means that the classification result is consistent with the real label, and P_1 is the probability that the candidate box contains a target of the class. To simplify the formula, define:
Figure SMS_17
the simplified cross entropy loss function is then:
L_CE(P_t, y) = -log(P_t)
P_t is the classification confidence of each sample for class y, with P_t ∈ (0, 1).
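A small numeric sketch of the simplified cross entropy follows. The piecewise definition of P_t used here (the predicted probability when y = 1, and one minus it otherwise) is the standard binary form and is an assumption, since the defining formula appears only as a figure; the function names are hypothetical.

```python
import math

def p_t(p, y):
    """Confidence assigned to the true label: p if y == 1, else 1 - p (assumed form)."""
    return p if y == 1 else 1.0 - p

def cross_entropy(p, y):
    """Simplified cross entropy L_CE = -log(P_t), for p strictly between 0 and 1."""
    return -math.log(p_t(p, y))

confident_and_right = cross_entropy(0.9, 1)  # small loss
confident_and_wrong = cross_entropy(0.9, 0)  # large loss
```

The loss grows without bound as P_t approaches 0, which is why poorly classified samples dominate the gradient.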
In one embodiment, S320 includes:
defining the total number of classes in the data set as N, so that the numbers of samples of the class labels in each training batch are [n_1, n_2, n_3, ..., n_N]; the number of samples of class y in each training batch is then n_y.
Defining the sample number normalization factor of class label y by the following formula:
Figure SMS_18
where n_max is the maximum of the per-class sample numbers in each training batch;
defining parameters m and n with 0 < m < n < 1, and setting the weighting factor of class y in each training batch as
Figure SMS_19
with
Figure SMS_20
Then the sample weighting degree of class y in each training batch,
Figure SMS_21
is defined by the following formula:
Figure SMS_22
where δ_max and δ_min are, respectively, the maximum and minimum of the sample number normalization factors of all classes in each training batch;
defining the adaptive inter-class balance weighting factor ω_y of class y in each training batch by the following formula:
Figure SMS_23
where
Figure SMS_24
and n_y are, respectively, the sample weighting degree and the number of samples of class y in each training batch, and the inter-class balance weighting factor is inversely related to both the sample weighting degree
Figure SMS_25
and the number of samples;
finally, the class balance weighting factors of all categories are normalized, so that:
Figure SMS_26
wherein N is the total number of the categories in the kitchen garbage detection data set, and i is the serial number of each category.
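The per-batch quantities that these weighting factors start from, the class counts n_1, ..., n_N and their maximum n_max, can be gathered as in the sketch below. It assumes class labels are integers 0 to N-1 and uses a hypothetical function name; the normalization factor and the weighting formulas themselves are given only as figures and are not reproduced here.

```python
from collections import Counter

def batch_class_counts(labels, num_classes):
    """Return ([n_1, ..., n_N], n_max) for the labels of one training batch.

    labels: iterable of integer class ids in 0..num_classes-1 (assumption).
    """
    counts = Counter(labels)
    n = [counts.get(c, 0) for c in range(num_classes)]
    return n, max(n)

# One batch with 7 classes, as in the kitchen waste data set
n, n_max = batch_class_counts([0, 0, 2, 6, 0, 2], num_classes=7)
```

Because the counts are recomputed for every batch, the resulting weights adapt to the classes that actually appear in that batch rather than to fixed data-set-level frequencies.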
In one embodiment, S330 specifically is:
defining the sample difficulty weighting factor γ by the following formula:
γ = 2 - α(1 + P_t)
where P_t is the classification confidence of each sample in each training batch, with P_t ∈ (0, 1), and α is the difficulty modulation factor, with α ∈ (0.5, 1].
Specifically, because the inter-class balance weighting factor reduces the loss contribution of the majority classes, the features of their hard samples may be learned insufficiently; the sample difficulty weighting factor γ alleviates this problem, so that the loss function effectively mines the features of hard samples.
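The formula γ = 2 - α(1 + P_t) can be evaluated directly, as in this sketch. The default α = 0.63 is the value reported for the embodiment; the function name is hypothetical, and how γ enters the final class balance loss is defined by the formula of S340, which is not reproduced here.

```python
def difficulty_weight(p_t, alpha=0.63):
    """Sample difficulty weighting factor: gamma = 2 - alpha * (1 + P_t)."""
    if not 0.0 < p_t < 1.0:
        raise ValueError("P_t must lie strictly between 0 and 1")
    return 2.0 - alpha * (1.0 + p_t)

easy = difficulty_weight(0.9)  # well-classified sample
hard = difficulty_weight(0.1)  # poorly classified sample
```

Since γ decreases linearly in P_t, samples classified with low confidence receive a larger factor than samples classified with high confidence.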
In one embodiment, S340 specifically includes:
Figure SMS_27
where, in each training batch,
Figure SMS_28
is the sample weighting degree of class y, n_y is the number of samples of class y, α is the difficulty modulation factor, and P_t is the classification confidence of each sample.
Specifically, before training, the parameters m and n used by the class balance loss function to calculate the inter-class balance weighting factors of all classes in each training batch must be set as hyper-parameters, with 0 < m < n < 1. In the present invention, m and n are 0.4 and 0.6, respectively. The difficulty modulation factor α used to calculate the sample difficulty weighting factor γ in each training batch is also a hyper-parameter whose optimal value must be found over multiple experiments; here α takes the value 0.63. The constructed network model is then trained iteratively with the training configuration set above to obtain the trained kitchen waste target detection model.
S400: and acquiring an actual kitchen waste image, and detecting the actual kitchen waste image according to the trained kitchen waste target detection model to obtain a target detection result.
The advantages of the invention are illustrated below with reference to one embodiment.
The method of the invention is compared with two existing methods: the cross entropy classification loss function (Cross Entropy Loss) and the focal classification loss function (Focal Loss).
The comparison of the average detection precision results of each category on the test set by the method and the existing method is shown in table 1:
TABLE 1 average detection accuracy result comparison for each category of kitchen garbage test set
Figure SMS_29
As can be seen from table 1, the method of the present invention achieves better quantitative results than the other methods. The visual comparison between the method of the present invention and the other methods is shown in fig. 5, where fig. 5 (a) is the detection result of the cross entropy loss method, fig. 5 (b) is the detection result of the focal loss method, fig. 5 (c) is the detection result of the method of the present invention, and fig. 5 (d) is the real label. Both the quantitative and the visual results show that the method of the present invention has the best detection effect among the compared methods.
In one embodiment, a kitchen waste target detection system based on a class balance loss function includes:
the kitchen waste image acquisition module is used for acquiring kitchen waste images and constructing a kitchen waste detection data set;
the kitchen garbage target detection model building module is used for building a kitchen garbage target detection model, wherein the kitchen garbage target detection model comprises a feature extraction network, a feature fusion network, a region suggestion generation network, a RoI Transformer network and a detection head network which are sequentially connected; the feature extraction network is used for extracting multi-scale features of a kitchen garbage image; the feature fusion network is used for fusing the multi-scale features and sending the fused multi-scale features to the region suggestion generation network; the region suggestion generation network generates candidate frames from the fused multi-scale features, and positive samples and negative samples are obtained according to the intersection-over-union (IoU) of the candidate frames and a real bounding box; and the RoI Transformer network is used for aligning the candidate frame samples, extracting the aligned sample features according to a preset positive and negative sample ratio, and inputting the aligned sample features to the detection head network to obtain bounding box regression results and classification result probabilities of category targets;
the kitchen waste target detection model training module is used for constructing an L1 regression loss function and a class balance loss function, training a kitchen waste target detection model according to a training set in a kitchen waste detection data set, calculating a loss value of the trained kitchen waste target detection model according to a bounding box regression result of a class target, a classification result probability, a preset target real bounding box and a class label, and performing network weight updating by using a gradient descent method to reversely propagate a loss gradient according to the loss value to obtain the trained kitchen waste target detection model;
and the detection module is used for acquiring an actual kitchen waste image and detecting the actual kitchen waste image according to the trained kitchen waste target detection model to obtain a target detection result.
For the specific limitations of the kitchen waste target detection system based on the class balance loss function, reference may be made to the above description of the kitchen waste target detection method based on the class balance loss function; they are not repeated here.
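By way of example, the assignment of positive and negative samples by the region suggestion generation network can be sketched as follows. The thresholds 0.7 and 0.3 are those stated in this document; the axis-aligned box format (x1, y1, x2, y2), the function names, and the choice to ignore candidates that fall between the two thresholds are assumptions made for this sketch.

```python
from typing import Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), axis-aligned (assumed)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_candidate(
    candidate: Box,
    gt_boxes: Sequence[Box],
    pos_thr: float = 0.7,  # IoU above this value: positive sample
    neg_thr: float = 0.3,  # IoU below this value: negative sample
) -> Optional[int]:
    """Return 1 (positive), 0 (negative) or None (ignored, an assumption)."""
    best = max((iou(candidate, gt) for gt in gt_boxes), default=0.0)
    if best > pos_thr:
        return 1
    if best < neg_thr:
        return 0
    return None


gt = [(0.0, 0.0, 10.0, 10.0)]
print(label_candidate((0.0, 0.0, 10.0, 9.0), gt))      # high overlap
print(label_candidate((20.0, 20.0, 30.0, 30.0), gt))   # no overlap
print(label_candidate((0.0, 0.0, 10.0, 5.0), gt))      # between the thresholds
```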
Compared with the prior art, the invention has the following beneficial effects:
The invention provides a kitchen garbage target detection method and system based on a class balance loss function, in which the classification loss function of the target detection network is constructed. In each training batch, the constructed class balance loss function applies adaptive inter-class balance weighting to each class, so that the network model pays attention to training on samples of the minority classes; meanwhile, the sample difficulty weighting factor weights the hard samples of each class to different degrees, so that the features of the hard samples are mined. The method significantly improves the average detection precision for minority-class samples such as plastic bottles, glass products and metal products, while also slightly improving the average detection precision for majority-class samples.
The kitchen waste target detection method and system based on the class balance loss function provided by the invention are described in detail above. The principles and embodiments of the present invention have been described herein using specific examples, which are presented only to assist in understanding the core concepts of the present invention. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.

Claims (10)

1. A kitchen waste target detection method based on a class balance loss function is characterized by comprising the following steps:
S100: collecting kitchen waste images and constructing a kitchen waste detection data set;
S200: constructing a kitchen garbage target detection model, wherein the kitchen garbage target detection model comprises a feature extraction network, a feature fusion network, a region suggestion generation network, a RoI Transformer network and a detection head network which are sequentially connected; the feature extraction network is used for extracting multi-scale features of a kitchen garbage image; the feature fusion network is used for fusing the multi-scale features and sending the fused multi-scale features to the region suggestion generation network; the region suggestion generation network is used for generating candidate frames from the fused multi-scale features, and positive samples and negative samples are obtained according to the intersection-over-union (IoU) of the candidate frames and a real bounding box; and the RoI Transformer network is used for aligning the candidate frame samples, extracting the aligned sample features according to a preset positive and negative sample proportion, and inputting the aligned sample features to the detection head network to obtain a bounding box regression result and a classification result probability of a category target;
S300: constructing an L1 regression loss function and a class balance loss function, training the kitchen garbage target detection model according to a training set in the kitchen garbage detection data set, calculating a loss value of the kitchen garbage target detection model during training according to the bounding box regression result of the category target, the classification result probability, a preset target real bounding box and a class label, and back-propagating the loss gradient according to the loss value by a gradient descent method to update the network weights, thereby obtaining the trained kitchen garbage target detection model; wherein the class balance loss function comprises an inter-class balance weighting factor based on the number of samples and a weighting factor based on sample difficulty;
S400: acquiring an actual kitchen waste image, and detecting the actual kitchen waste image according to the trained kitchen waste target detection model to obtain a target detection result.
2. The method of claim 1, wherein the feature extraction network comprises a dimension expansion layer, a first feature extraction layer, a second feature extraction layer, a third feature extraction layer and a fourth feature extraction layer; the dimension expansion layer expands the input image to the dimension required as input by the first feature extraction layer; the first feature extraction layer comprises 3 ConvNeXt blocks, the second feature extraction layer comprises 3 ConvNeXt blocks, the third feature extraction layer comprises 9 ConvNeXt blocks, and the fourth feature extraction layer comprises 3 ConvNeXt blocks; the output feature maps of the first, second and third feature extraction layers are each down-sampled by a factor of two and then input to the corresponding next feature extraction layer; and each ConvNeXt block comprises a depthwise separable convolution layer with a 7 × 7 convolution kernel, a normalization layer, two 1 × 1 convolution layers and a GELU activation function.
3. The method according to claim 2, wherein the feature fusion module comprises a first feature fusion layer, a second feature fusion layer, a third feature fusion layer and a fourth feature fusion layer, the ith feature fusion layer is used for amplifying the feature map output by the ith feature extraction layer to the same size as the feature map output by the last feature extraction layer by an up-sampling method of nearest neighbor interpolation, and then fusing the feature map with the feature map output by the last feature extraction layer through an adding operation to obtain a fused feature map $L_i$ output by the ith feature fusion layer.
4. The method of claim 3, wherein the area suggestion generation network comprises one 3 × 3 convolution and two 1 × 1 convolutions connected in sequence, and wherein, in S200, the area suggestion generation network generating candidate boxes from the fused multi-scale features and obtaining positive samples and negative samples according to the intersection-over-union (IoU) of the candidate boxes and a real bounding box comprises:
inputting the fused feature map $L_i$ into the area suggestion generation network to generate a series of candidate boxes, wherein candidate boxes whose IoU with the real bounding box is greater than 0.7 are defined as positive samples, and candidate boxes whose IoU with the real bounding box is less than 0.3 are defined as negative samples.
5. The method of claim 3, wherein the L1 regression loss function and the class balance loss function constructed in S300 comprise:
S310: constructing an L1 regression loss function and a cross entropy classification loss function;
S320: designing an inter-class balance weighting factor based on the number of samples;
S330: designing a weighting factor based on sample difficulty;
S340: constructing a class balance loss function from the cross entropy classification loss function, the inter-class balance weighting factor based on the number of samples, and the weighting factor based on sample difficulty.
6. The method of claim 5, wherein S310 comprises:
using the L1 regression loss function:
Figure FDA0003942173450000021
wherein x is an independent variable of the L1 regression loss function and represents the difference between a real bounding box and a candidate box of the class target;
using a cross-entropy classification loss function:
Figure FDA0003942173450000022
wherein y is the class label, taking the value 0 or 1 in binary classification; y = 1 indicates that the classification result is consistent with the real label; and $P_1$ is the probability that the candidate box contains a category target. To simplify the formula, define:
Figure FDA0003942173450000031
the simplified cross entropy loss function is then:
$L_{CE}(P_t, y) = -\log(P_t)$
where $P_t$ is the classification confidence of each sample in class y, and $P_t \in (0, 1)$.
7. The method of claim 5, wherein S320 comprises:
defining the total number of classes of the data set as N, so that the numbers of samples of the class labels in each training batch are $[n_1, n_2, n_3, \ldots, n_N]$, and the number of samples of class y in each training batch is $n_y$;
defining a sample number normalization factor for the class label y, with the formula:
Figure FDA0003942173450000032
where $n_{\max}$ is the maximum of the sample numbers of all classes in each training batch;
defining parameters m and n, wherein 0 < m < n < 1, and setting the weighting factor of class y in each training batch as
Figure FDA0003942173450000033
and
Figure FDA0003942173450000034
then the sample weighting degree of class y in each training batch is defined as
Figure FDA0003942173450000035
with the formula:
Figure FDA0003942173450000036
where $\delta_{\max}$ and $\delta_{\min}$ are respectively the maximum and the minimum of the sample number normalization factors of all classes in each training batch;
defining an adaptive inter-class balance weighting factor $\omega_y$ for class y in each training batch, with the formula:
Figure FDA0003942173450000037
where
Figure FDA0003942173450000038
and $n_y$ are respectively the sample weighting degree and the number of samples of class y in each training batch, and the inter-class balance weighting factor is inversely related to both the sample weighting degree
Figure FDA0003942173450000039
and the number of samples;
finally, the class balance weighting factors of all categories are normalized, so that:
Figure FDA0003942173450000041
wherein N is the total number of the categories in the kitchen garbage data set, and i is the serial number of each category.
8. The method according to claim 5, wherein S330 is specifically:
the sample difficulty weighting factor γ is defined by the following formula:
$\gamma = 2 - \alpha (1 + P_t)$
where $P_t$ is the classification confidence of each sample in each training batch, with $P_t \in (0, 1)$, and α is the difficulty modulation factor, with $\alpha \in (0.5, 1]$.
9. The method according to claim 5, wherein S340 specifically is:
Figure FDA0003942173450000042
where, in each training batch,
Figure FDA0003942173450000043
is the sample weighting degree of class y, $n_y$ is the number of samples of class y, α is the difficulty modulation factor, and $P_t$ is the classification confidence of each sample.
10. A kitchen waste target detection system based on a class balance loss function is characterized by comprising:
the kitchen waste image acquisition module is used for acquiring kitchen waste images and constructing a kitchen waste detection data set;
the kitchen garbage target detection model building module is used for building a kitchen garbage target detection model, wherein the kitchen garbage target detection model comprises a feature extraction network, a feature fusion network, a region suggestion generation network, a RoI Transformer network and a detection head network which are sequentially connected; the feature extraction network is used for extracting multi-scale features of a kitchen garbage image; the feature fusion network is used for fusing the multi-scale features and sending the fused multi-scale features to the region suggestion generation network; the region suggestion generation network is used for generating candidate frames from the fused multi-scale features, and positive samples and negative samples are obtained according to the intersection-over-union (IoU) of the candidate frames and a real bounding box; and the RoI Transformer network is used for aligning the candidate frame samples, extracting the aligned sample features according to a preset positive and negative sample proportion, and inputting the aligned sample features to the detection head network to obtain bounding box regression results and classification result probabilities of category targets;
the kitchen waste target detection model training module is used for constructing an L1 regression loss function and a class balance loss function, training the kitchen waste target detection model according to a training set in the kitchen waste detection data set, calculating a loss value of the trained kitchen waste target detection model according to a bounding box regression result of the class target, a classification result probability, a preset target real bounding box and a class label, and performing network weight updating by using a gradient descent method to reversely propagate a loss gradient according to the loss value to obtain the trained kitchen waste target detection model;
and the detection module is used for acquiring an actual kitchen waste image and detecting the actual kitchen waste image according to the trained kitchen waste target detection model to obtain a target detection result.
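As a non-limiting sketch, the backbone layout recited in claim 2 (block counts 3, 3, 9 and 3, with two-fold down-sampling between consecutive feature extraction layers) implies the following feature-map sizes. The down-sampling factor of the dimension expansion layer is not given in the claims; the value 4 used here is an assumption, and the function name is hypothetical.

```python
from typing import List, Tuple

# Number of ConvNeXt blocks in the four feature extraction layers (claim 2).
STAGE_DEPTHS = (3, 3, 9, 3)


def stage_output_sizes(
    height: int, width: int, stem_stride: int = 4  # stem stride is assumed
) -> List[Tuple[int, int]]:
    """Spatial size of the feature map produced by each feature extraction layer."""
    h, w = height // stem_stride, width // stem_stride
    sizes = []
    for stage_index in range(len(STAGE_DEPTHS)):
        if stage_index > 0:
            # Two-fold down-sampling before entering the next layer.
            h, w = h // 2, w // 2
        sizes.append((h, w))
    return sizes


for depth, size in zip(STAGE_DEPTHS, stage_output_sizes(1024, 1024)):
    print(f"{depth} blocks -> feature map {size[0]} x {size[1]}")
```

These four resolutions are the multi-scale features that the feature fusion network receives.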
CN202211418560.XA 2022-11-14 2022-11-14 Kitchen waste target detection method and system based on class balance loss function Active CN115761259B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211418560.XA CN115761259B (en) 2022-11-14 2022-11-14 Kitchen waste target detection method and system based on class balance loss function

Publications (2)

Publication Number Publication Date
CN115761259A (en) 2023-03-07
CN115761259B (en) 2023-11-24

Family

ID=85370103

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211418560.XA Active CN115761259B (en) 2022-11-14 2022-11-14 Kitchen waste target detection method and system based on class balance loss function

Country Status (1)

Country Link
CN (1) CN115761259B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111091105A (en) * 2019-12-23 2020-05-01 郑州轻工业大学 Remote sensing image target detection method based on new frame regression loss function
WO2020102988A1 (en) * 2018-11-20 2020-05-28 西安电子科技大学 Feature fusion and dense connection based infrared plane target detection method
CN112541532A (en) * 2020-12-07 2021-03-23 长沙理工大学 Target detection method based on dense connection structure
CN113807347A (en) * 2021-08-20 2021-12-17 北京工业大学 Kitchen waste impurity identification method based on target detection technology
CN114510732A (en) * 2022-01-28 2022-05-17 上海大学 Encrypted traffic classification method based on incremental learning
CN115205521A (en) * 2022-08-09 2022-10-18 湖南大学 Kitchen waste detection method based on neural network
CN115272652A (en) * 2022-07-29 2022-11-01 东南大学 Dense object image detection method based on multiple regression and adaptive focus loss

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116777843A (en) * 2023-05-26 2023-09-19 湖南大学 Kitchen waste detection method and system based on dynamic non-maximum suppression
CN116777843B (en) * 2023-05-26 2024-02-27 湖南大学 Kitchen waste detection method and system based on dynamic non-maximum suppression

Also Published As

Publication number Publication date
CN115761259B (en) 2023-11-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant