CN114330542A

CN114330542A - Sample mining method and device based on target detection and storage medium

Info

Publication number: CN114330542A
Application number: CN202111625545.8A
Authority: CN
Inventors: 刘文龙; 曾卓熙; 肖嵘; 王孝宇
Original assignee: Shenzhen Intellifusion Technologies Co Ltd
Current assignee: Shenzhen Intellifusion Technologies Co Ltd
Priority date: 2021-12-27
Filing date: 2021-12-27
Publication date: 2022-04-12

Abstract

The invention discloses a sample mining method, a sample mining device and a storage medium based on target detection, wherein the method comprises the following steps: predicting each unlabeled sample of the unlabeled sample set by adopting a pre-trained target detection model to obtain a prediction result; forming a standard Gaussian heat map and a mask map according to the high-confidence peak center points and the wide-height attributes of all the categories in the heat map in the prediction result; evaluating the heat map in the prediction result according to the mask map and the standard Gaussian heat map to obtain the evaluation result of each unlabeled sample; and screening high-value samples in the unlabeled sample set according to the evaluation result. By implementing the method, high-quality samples are mined based on the existing output characteristics of the target detection model, the labeling cost is reduced, meanwhile, a special scoring module or a scoring loss function does not need to be added to the network, and the workload is greatly reduced.

Description

Sample mining method and device based on target detection and storage medium

Technical Field

The invention relates to the technical field of deep learning, in particular to a sample mining method and device based on target detection and a storage medium.

Background

In recent years, deep learning technology has advanced dramatically and has made an indispensable contribution to technical innovation in fields such as vision and speech. However, deep learning relies heavily on big data: a large supply of data is required to optimize a massive number of parameters so that the model can learn to extract high-quality features. Most deep learning tasks require large, high-quality labeled data sets, and acquiring them consumes a great deal of manpower. How to extract "high-information" data from unlabeled data sets with a minimum number of samples, reducing the labeling workload while maximizing the performance benefit of the model, is therefore a major research hotspot at present.

At present, some evaluation criteria exist for judging the amount of information contained in an image, but these methods are usually applied to two-class or multi-class classification models and are not applicable to target detection models. Because the target detection task focuses more on the information content at the target instance level than on the classification information content of the entire image, there is a need for a targeted high-value sample mining method that determines the information content of an image based on the value scores of the target instances of interest within the image.

Disclosure of Invention

In view of this, embodiments of the present invention provide a sample mining method, apparatus, and storage medium based on target detection, so as to solve the technical problem in the prior art that the existing sample mining method for a two-class or multi-class model cannot be adopted in sample mining for a target detection task.

The technical scheme provided by the invention is as follows:

The first aspect of the embodiments of the present invention provides a sample mining method based on target detection, including: predicting each unlabeled sample of the unlabeled sample set by adopting a pre-trained target detection model to obtain a prediction result; forming a standard Gaussian heat map and a mask map according to the high-confidence peak center points of all the categories in the heat map in the prediction result and their width and height attributes; evaluating the heat map in the prediction result according to the mask map and the standard Gaussian heat map to obtain the evaluation result of each unlabeled sample; and screening the high-value samples in the unlabeled sample set according to the evaluation result.

Optionally, the forming a standard gaussian heat map and a mask map according to the high-confidence peak center points and their wide-height attributes of all categories in the heat map in the prediction result includes: determining a high-confidence central point according to a peak point of the heat map in the prediction result and a preset threshold; forming a standard Gaussian heat map according to the high-confidence central point and the width and height attributes of the high-confidence central point in the prediction result; and forming a mask map according to the relation between the Gaussian value of each position in the standard Gaussian heat map and a preset Gaussian threshold value.

Optionally, the determining a high-confidence center point according to a peak point of a heatmap in the prediction result and a preset threshold includes: obtaining peak points on the heatmap for each category based on the pooling operation; and filtering the peak point according to a preset threshold value to obtain a high-confidence-degree central point.

Optionally, the evaluating the heat map in the prediction result according to the mask map and the standard Gaussian heat map to obtain an evaluation result of each unlabeled sample includes: calculating a first distribution according to the mask map and the heat map of each unlabeled sample; calculating a second distribution according to the mask map and the standard Gaussian heat map; and calculating the evaluation result of each unlabeled sample according to the distance between the first distribution and the second distribution.

Optionally, the screening the high-value samples in the unlabeled sample set according to the evaluation result includes: calculating the average value of the difference values of the first probability and the second probability of all the high-confidence central points to obtain an edge evaluation result; and screening the high-value samples in the unmarked sample set according to the evaluation result and the edge evaluation result.

Optionally, after the screening the high-value samples in the unlabeled sample set according to the evaluation result, the method further includes: updating the unlabeled sample set and the labeled sample set according to the screened high-value samples; and, according to the updated labeled sample set, performing again in sequence the training of the target detection model, the prediction of the unlabeled sample set, the formation of the standard Gaussian heat map and the mask map, the calculation of the evaluation result, and the sorting and selection of the unlabeled samples, until a stopping condition is judged to be met and the training of the target detection model is stopped.

Optionally, the target detection model is a CenterNet target detection model; the evaluation result is calculated by KL divergence, JS divergence or Wasserstein distance.

A second aspect of the embodiments of the present invention provides a sample mining apparatus based on target detection, including: the prediction module is used for predicting each unmarked sample in the unmarked sample set by adopting a pre-trained target detection model to obtain a prediction result; the branch calculation module is used for forming a standard Gaussian heat map and a mask map according to the high-confidence peak value central points and the wide-height attributes of all categories in the heat map in the prediction result; the evaluation module is used for evaluating the heat map in the prediction result according to the mask map and the standard Gaussian heat map to obtain the evaluation result of each unmarked sample; and the screening module is used for screening the high-value samples in the unlabeled sample set according to the evaluation result.

A third aspect of the embodiments of the present invention provides a computer-readable storage medium storing computer instructions, where the computer instructions are configured to cause a computer to execute the target detection-based sample mining method according to the first aspect or any implementation of the first aspect of the embodiments of the present invention.

A fourth aspect of the embodiments of the present invention provides an electronic device, including: a memory and a processor communicatively connected to each other, where the memory stores computer instructions, and the processor executes the computer instructions to perform the target detection-based sample mining method according to the first aspect or any implementation of the first aspect of the embodiments of the present invention.

The technical scheme provided by the invention has the following effects:

According to the sample mining method, device and storage medium based on target detection provided by the embodiments of the invention, a standard Gaussian heat map and a mask map are formed based on the high-confidence peak center points of all categories in the heat map of the target detection model's prediction result and their width and height attributes, and the heat map is evaluated based on the mask map and the standard Gaussian heat map to obtain an evaluation result. The method thus mines high-quality unlabeled samples from the existing output features of the target detection model, which reduces the number of samples that must be labeled and the data labeling cost; no special scoring module or scoring loss function needs to be added to the network, which greatly reduces the workload.

According to the sample mining method based on target detection provided by the embodiment of the invention, the evaluation result of each unlabeled sample is calculated through a Gaussian heat map evaluation strategy, and the edge evaluation result of each unlabeled sample is obtained by calculating the mean of the differences between the first probability and the second probability of all high-confidence center points. High-value samples in the unlabeled sample set are then screened based on both the evaluation result and the edge evaluation result, which greatly improves the accuracy with which the information content of a sample is judged. In addition, a larger number of high-quality samples is obtained through cyclic updating, which satisfies the requirement on the number of samples. The cyclic process also gradually improves the performance of the CenterNet detection network, maximizing the performance benefit of the model while minimizing the data labeling workload.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.

FIG. 1 is a flow diagram of a method of sample mining based on target detection according to an embodiment of the invention;

FIG. 2 is a block diagram of a target detection model in a target detection-based sample mining method according to an embodiment of the present invention;

FIG. 3 is a flow diagram of a method of sample mining based on target detection according to another embodiment of the present invention;

FIG. 4 is a Gaussian heat map of a target center point in a target detection-based sample mining method according to an embodiment of the invention;

FIG. 5 is a flow diagram of a method of sample mining based on target detection according to another embodiment of the present invention;

FIG. 6 is a flow diagram of a method of sample mining based on target detection according to another embodiment of the invention;

FIG. 7 is a block diagram of a sample mining device based on target detection according to an embodiment of the present invention;

FIG. 8 is a schematic structural diagram of a computer-readable storage medium provided in accordance with an embodiment of the present invention;

FIG. 9 is a schematic structural diagram of an electronic device provided in an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

As described in the background, labeling the training samples is particularly important when implementing target detection, but in existing target detection technology the labeled resources are often limited while a large amount of unlabeled data goes unused. Annotating each object in an image takes a significant amount of time and labor, which makes the overall process time-consuming, difficult, and costly. Meanwhile, existing sample mining methods are usually applied to two-class or multi-class classification models and are not applicable to target detection models, because the target detection task focuses more on the amount of information at the target instance level than on the classification information of the entire image.

In view of this, embodiments of the present invention provide a sample mining method based on target detection, so as to solve the technical problems in the prior art that manual labeling of samples is difficult, cost is high, and the existing sample mining method for two-class or multi-class models cannot be adopted.

In accordance with an embodiment of the present invention, a sample mining method based on target detection is provided. It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in a different order.

In this embodiment, a sample mining method based on target detection is provided, which may be used in electronic devices, such as a computer, a mobile phone, a tablet computer, and the like, fig. 1 is a flowchart of a sample mining method based on target detection according to an embodiment of the present invention, and as shown in fig. 1, the method includes the following steps:

Step S101: predicting each unlabeled sample of the unlabeled sample set by adopting a pre-trained target detection model to obtain a prediction result.

Because the sample mining method mainly aims at the sample mining of the target detection task, the information quantity of the sample instance level needs to be acquired. Therefore, the sample value in the unmarked sample set can be mined by adopting a target detection model, and the high-value sample in the unmarked sample set is obtained.

Among target detection models, the CenterNet target detection model has a heat map prediction branch that outputs the center point probability distribution of each category of targets. Observation of these center point heat maps shows that the lower the center energy of a heat map, the lower the reliability of the target's classification and localization. This indicates that the sample contains more targets or a larger amount of information, so that the target detection model cannot accurately determine the category of the targets in it. Such a sample therefore has high information value and is well worth mining. By exploiting this characteristic of the CenterNet target detection model, the information content of unlabeled image data can be given a value score, more unlabeled data with high information value can be mined, and the detection performance of CenterNet can be improved while the amount of data labeling is reduced.

Specifically, when the target detection model is used for sample mining, the CenterNet target detection model may first be trained with the labeled sample set to obtain a pre-trained target detection model, and the pre-trained model may then be used to predict the unlabeled sample set; that is, each unlabeled sample in the unlabeled sample set is input into the pre-trained target detection model to obtain a prediction result for that sample. As shown in fig. 2, when predicting the unlabeled sample set, the CenterNet target detection model obtains the prediction result through a feature extraction module, a multi-scale feature fusion module and a prediction module: the feature extraction module extracts features from the input data, the multi-scale feature fusion module fuses the multi-scale features extracted by the feature extraction module, and the prediction module predicts from the fused multi-scale features to obtain the prediction result.

Step S102: forming a standard Gaussian heat map and a mask map according to the high-confidence peak center points of all the categories in the heat map in the prediction result and their width and height attributes.

When a pre-trained target detection model is adopted to predict each unlabeled sample in the unlabeled sample set, a prediction result comprising three branches is output. Specifically, as shown in fig. 2, the prediction results of the three branches include a center point class heat map prediction branch (heatmap, hm), a center point width and height attribute prediction branch (wh), and a center point downsampling offset error prediction branch (offset); after the prediction results of the three branches are determined, the position of the object detection frame can be determined.

The center point class heat map prediction branch is the core output module of the target detection model; it outputs the center point of each object together with the class to which the object belongs. In principle, the center of the object is predicted as a single point (i.e., that point is set to 1 and all others to 0). In actual training, however, a single point is difficult to learn. Therefore, a two-dimensional Gaussian distribution is computed around the center point of the object, so that the model regresses a whole circular area instead of learning the position of one point. Using the Gaussian distribution makes the algorithm easier to optimize and enlarges the positive sample area, which alleviates the imbalance problem.

The center point width-height attribute prediction branch outputs the width and height of the object box. The center point downsampling offset error prediction branch further corrects the position of the object center point. The output of the heatmap branch has only 1/16 the area of the original image (1/4 of the length and 1/4 of the width), so one pixel on the feature map corresponds to 16 pixels of the original image. To correct the position of the object center point, the output of this branch has the same form as that of the width-height attribute prediction branch, with an X offset and a Y offset output at each pixel.
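How the three branch outputs combine into a detection box can be illustrated with a minimal sketch. The function name `decode_box` is hypothetical, the stride of 4 follows the 1/4 length and width ratio stated above, and the width and height are assumed to be expressed in feature-map units; none of these details is fixed by the text.

```python
def decode_box(cx, cy, offset, wh, stride=4):
    """Map one heat-map peak back to an (x1, y1, x2, y2) box in input-image pixels."""
    # Refine the integer peak location with the predicted sub-pixel offset.
    x = cx + offset[0]
    y = cy + offset[1]
    w, h = wh  # assumption: width/height are given in feature-map units
    # Scale from the 1/4-resolution feature map back to the original image.
    return (
        (x - w / 2) * stride,
        (y - h / 2) * stride,
        (x + w / 2) * stride,
        (y + h / 2) * stride,
    )


# One peak at feature-map cell (10, 20) with its offset and width/height outputs.
box = decode_box(cx=10, cy=20, offset=(0.25, 0.5), wh=(4.0, 6.0))
```

If a given implementation regresses width and height directly in input-image pixels, the scaling by `stride` would apply to the center only.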

Based on the above, hm branch is used to predict different classes of target center heatmaps, and the different classes of target predictions are distributed in different feature channel layers. The wh branch is used to predict the width and height properties of the potential target center point. Therefore, after each unlabeled sample is predicted by adopting a pre-trained target detection model, the heatmap of the category to which each target on the unlabeled sample image belongs is output in the hm branch. For example, in the embodiment, if a total of five categories are predicted for an unlabeled sample, and three targets are included in the unlabeled sample, a heat map of the five categories is output, and a gaussian distribution predicted by each target in the unlabeled sample is displayed on the heat map of each category, that is, three gaussian distributions or three central points are displayed on the heat map of each category. Meanwhile, in the wh branch, the width and height attributes of each center point on the heatmap for each category are output.

Based on the high-confidence peak center points of all classes in the heat maps output by the hm branch and the width and height attributes of each center point output by the wh branch, a standard Gaussian heat map can be formed for each class; it is computed from the width and height attributes. The resulting standard Gaussian heat map serves as the reference for the subsequent evaluation of the heat maps of unlabeled samples. In addition, a mask map can be formed based on the relation between the standard Gaussian heat map and a preset Gaussian threshold, and used as the mask in the subsequent heat map evaluation.

Step S103: evaluating the heat map in the prediction result according to the mask map and the standard Gaussian heat map to obtain the evaluation result of each unlabeled sample.

In one embodiment, when evaluating the heat map of an unlabeled sample by using the standard Gaussian heat map and the mask map, the heat maps of all categories predicted for the unlabeled sample can be taken as a whole, the standard Gaussian heat maps of all corresponding categories can be taken as a whole, and the mask maps of all corresponding categories can be taken as a whole. The mask map is then multiplied by the heat map of each unlabeled sample to obtain a first distribution, the mask map is multiplied by the standard Gaussian heat map to obtain a second distribution, and the distance between the first distribution and the second distribution is calculated to obtain the evaluation result.

Wherein, when calculating the distance between the first distribution and the second distribution, any one of KL divergence, JS divergence or Wasserstein distance can be adopted for calculation. The calculation formulas of the KL divergence, the JS divergence or the Wasserstein distance may refer to the currently adopted calculation formulas, and are not described herein again.
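The masked comparison can be sketched as follows, using only the Python standard library. The text does not say how the masked maps are turned into distributions or in which direction the divergence is taken, so normalizing each masked map to sum to 1 is an assumption made here; the function names and the toy 3x3 maps are illustrative only.

```python
import math


def masked_distribution(heatmap, mask, eps=1e-12):
    """Multiply a heat map by the mask element-wise, flatten, and normalize to sum 1."""
    values = [h * m for row_h, row_m in zip(heatmap, mask) for h, m in zip(row_h, row_m)]
    total = sum(values) + eps
    return [v / total for v in values]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded above by log(2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Toy maps: the predicted heat map is flatter than the standard Gaussian heat map.
predicted = [[0.1, 0.3, 0.1], [0.3, 0.5, 0.3], [0.1, 0.3, 0.1]]
standard = [[0.0, 0.2, 0.0], [0.2, 1.0, 0.2], [0.0, 0.2, 0.0]]
mask = [[1 if g > 0 else 0 for g in row] for row in standard]

first = masked_distribution(predicted, mask)   # first distribution
second = masked_distribution(standard, mask)   # second distribution
score = js_divergence(first, second)           # evaluation result for this sample
```

JS divergence is symmetric, which sidesteps the choice of direction that KL divergence would require; a Wasserstein distance would additionally need a ground metric over pixel positions.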

Step S104: screening the high-value samples in the unlabeled sample set according to the evaluation result. Because the evaluation result is obtained with KL divergence, JS divergence or Wasserstein distance, it is a specific numerical value; the values calculated for the unlabeled samples can therefore be sorted, and a certain number of high-value samples with high scores can be screened out.

Although the high-value samples are obtained mainly by predictive mining with a pre-trained target detection model, they may also be applied to target detection models other than that model. That is, the high-value samples can be used not only with the CenterNet target detection model but also with other target detection methods.

According to the sample mining method based on target detection provided by the embodiment of the invention, a standard Gaussian heat map and a mask map are formed based on the high-confidence peak center points of all categories in the heat map of the target detection model's prediction result and their width and height attributes, and the heat map is evaluated based on the mask map and the standard Gaussian heat map to obtain an evaluation result. The method thus mines high-quality unlabeled samples from the existing output features of the target detection model, which reduces the number of samples that must be labeled and the data labeling cost; no special scoring module or scoring loss function needs to be added to the network, which greatly reduces the workload.

In one embodiment, as shown in fig. 3, a standard gaussian heat map and a mask map are formed according to the high-confidence peak center points of all classes in the heat map in the prediction result, including the following steps:

Step S201: determining a high-confidence center point according to the peak points of the heat map in the prediction result and a preset threshold. Specifically, as described above, when the pre-trained target detection model predicts each unlabeled sample in the unlabeled sample set, the hm branch outputs heat maps of multiple categories for that sample, and these heat maps contain the predicted center points of multiple targets. Some of the center points have particularly low energy and may be false center points, so they can be removed by filtering.

In one embodiment, a pooling operation may be employed to obtain the peak point of each center point in each category heat map; a preset confidence threshold is then applied to filter out all false center points below the threshold, retaining the high-confidence center points. For example, the maxpool operation may be applied to each category heat map, yielding multiple peak points. The preset confidence threshold can be set lower than the confidence threshold conventionally used by CenterNet, so that more low-quality detections are retained in the output.
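The peak extraction can be illustrated with a dependency-free sketch: a position counts as a peak when it equals the maximum of its surrounding window, which is the effect of comparing a heat map with its max-pooled copy. The 3x3 window, the threshold values and the name `find_peaks` are assumptions for illustration; in practice this would typically be a tensor max-pooling call in the training framework.

```python
def find_peaks(heatmap, threshold=0.1, kernel=3):
    """Return (row, col, score) for local maxima of one class heat map at or above threshold."""
    rows, cols = len(heatmap), len(heatmap[0])
    r = kernel // 2
    peaks = []
    for i in range(rows):
        for j in range(cols):
            value = heatmap[i][j]
            if value < threshold:
                continue  # filtered out as a low-energy (possibly false) center point
            # Maximum over the window, clipped at the borders (acts like max pooling).
            window_max = max(
                heatmap[a][b]
                for a in range(max(0, i - r), min(rows, i + r + 1))
                for b in range(max(0, j - r), min(cols, j + r + 1))
            )
            if value == window_max:
                peaks.append((i, j, value))
    return peaks


heatmap = [
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.8, 0.1, 0.0, 0.0],
    [0.0, 0.1, 0.0, 0.05, 0.0],
    [0.0, 0.0, 0.0, 0.3, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
peaks = find_peaks(heatmap, threshold=0.2)
```

Lowering `threshold` keeps weaker peaks such as the one at (3, 3), which matches the idea of deliberately retaining low-quality detections for evaluation. Note that a flat plateau would report every tied position as a peak.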

Step S202: forming a standard Gaussian heat map according to the high-confidence center points and their width and height attributes in the prediction result. Through the filtering operation of step S201, a plurality of high-confidence center points is obtained for each category heat map. The width and height attributes of these high-confidence center points are then read from the width-height attribute branch of the prediction result, and a standard Gaussian heat map is created from them.

When producing the standard Gaussian heat map, the following formula can be used:

$$Y_{xyc} = \exp\left(-\frac{(x - \tilde{p}_x)^2 + (y - \tilde{p}_y)^2}{2\sigma_p^2}\right)$$

where $Y_{xyc}$ represents the Gaussian kernel of the center heat map, exp is the exponential function with the natural constant e as its base, (x, y) is the pixel location of any point in the hm map (the heat map of each class), and $(\tilde{p}_x, \tilde{p}_y)$ is the coordinate of the ground-truth center point. When the pixel position approximately coincides with the ground-truth center point, the output of the Gaussian kernel is approximately 1; when the pixel position is far from the ground-truth center point, the output is close to 0.

$\sigma_p$ is the "adaptive variance of the target scale"; its value is closely related to the size of the target and affects the size and distribution of each point in the generated heat map. When $\sigma_p$ takes a smaller value, the Gaussian kernel output is concentrated at the category peak point, so the "bright spot" in the generated heat map is smaller; when $\sigma_p$ takes a larger value, the "bright spot" diverges toward its surroundings. In this formula, $\sigma_p$ can be obtained by conversion from the width and height attributes of the high-confidence center point. Thus, a standard Gaussian heat map corresponding to each category can be calculated by the above formula.
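A minimal sketch of rendering such a standard Gaussian heat map for one class channel is given below. The conversion from width and height to the Gaussian spread is not spelled out in the text, so `sigma_from_wh` is a purely hypothetical placeholder rule; taking the element-wise maximum where Gaussians overlap is likewise an assumption.

```python
import math


def sigma_from_wh(w, h, ratio=6.0):
    """Hypothetical conversion: larger targets get a wider Gaussian."""
    return max(w, h) / ratio


def standard_gaussian_heatmap(shape, centers):
    """Render one class channel; centers is a list of (cx, cy, w, h) in feature-map units."""
    rows, cols = shape
    heatmap = [[0.0] * cols for _ in range(rows)]
    for cx, cy, w, h in centers:
        sigma = sigma_from_wh(w, h)
        for y in range(rows):
            for x in range(cols):
                g = math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
                # Assumption: where Gaussians overlap, keep the element-wise maximum.
                heatmap[y][x] = max(heatmap[y][x], g)
    return heatmap


# One high-confidence center at (3, 3) with a 6x6 box gives sigma = 1 under the placeholder rule.
gt = standard_gaussian_heatmap((7, 7), [(3, 3, 6.0, 6.0)])
```

The value at the center is 1 and decays with distance, so a smaller box (smaller spread) yields a tighter "bright spot", as described above.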

In addition, the hm branch outputs the heat maps of all current potential target centers. The higher the center energy of a heat map, the higher the confidence in the classification and localization of the target tends to be. Thus, for a fixed target size, more concentrated energy tends to indicate higher confidence in the classification and localization of the target. As shown in FIG. 4, target 1 has the same width and height as target 2, so the adaptive variance of the target scale is the same for both; the standard Gaussian heat map generated by the above formula is shown as GT. It can be seen that the heat map of target 1 is more divergent than that of target 2, indicating that the current detection model is more uncertain about the detection of this target; images containing more targets like target 1 are therefore more valuable for optimizing the current detection model.

Therefore, the generated standard Gaussian heatmap can be used as an evaluation standard for the heatmap of each category, and the amount of information contained in the sample to be marked is judged by comparing the two heatmaps, so that the sample with higher value is screened out.

Step S203: forming a mask map according to the relation between the Gaussian value of each position in the standard Gaussian heat map and a preset Gaussian threshold. After the standard Gaussian heat map is formed, the Gaussian value at each position is compared with the preset Gaussian threshold: when the value is greater than the threshold, the corresponding position is set to 1; otherwise it is set to 0. The preset Gaussian threshold may be 0 or a small value. After every position has been compared and reset, the mask map is obtained from the standard Gaussian heat map. With this mask map, only the Gaussian regions near the high-confidence center points in each class heat map are considered, while other low-confidence background outputs are ignored.
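The thresholding step can be sketched in a few lines. The name `build_mask` and the threshold of 0.1 used in the example are illustrative assumptions.

```python
def build_mask(standard_heatmap, gaussian_threshold=0.0):
    """Set a position to 1 where the standard Gaussian value exceeds the threshold, else 0."""
    return [
        [1 if g > gaussian_threshold else 0 for g in row]
        for row in standard_heatmap
    ]


standard = [
    [0.0, 0.2, 0.0],
    [0.2, 1.0, 0.2],
    [0.0, 0.2, 0.0],
]
mask = build_mask(standard, gaussian_threshold=0.1)
```

One practical consideration: a floating-point Gaussian rarely reaches exactly 0, so with a threshold of 0 the mask covers almost the whole map unless the Gaussian is truncated; a small positive threshold restricts the mask to the neighborhood of each center.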

In one embodiment, as shown in fig. 5, the step of screening the high-value samples in the unlabeled sample set according to the evaluation result includes the following steps:

Step S301: calculating the average value of the differences between the first probability and the second probability of all the high-confidence center points to obtain an edge evaluation result.

Specifically, in addition to the center point heat map evaluation, the information content of a sample can also be mined with an edge (margin) sampling strategy, which considers the differences between the confidence values that the model assigns to the various classes for a predicted target at the same center point. Under the edge sampling strategy, a smaller difference between the probability values of the top two categories at a center point indicates that the current detection model classifies the target at that center point less reliably, so the potential target contains a larger amount of information. An edge sampling strategy can therefore be adopted to determine the edge evaluation result.

The edge evaluation result can be obtained by calculating the average of the differences between the first probability and the second probability over all high-confidence center points, where the first probability is the highest predicted class probability and the second probability is the second-highest predicted class probability. To reduce the amount of calculation, only the top two predicted class probabilities (the first probability and the second probability) of each high-confidence center point are used, and the edge evaluation result can be represented by the following formula:

wherein p is₁Maximum class probability, p, representing the current center point₂Representing the second largest class probability for the current centroid, and N represents the number of all high confidence centroids. The maximum class probability and the second large class probability are specifically expressed as the probabilities of different classes predicted by the same target in the unlabeled sample;therefore, when the edge evaluation result of an unlabeled sample is calculated, the maximum class probability and the second maximum class probability predicted by all targets in the sample to be labeled need to be calculated, and then the difference is calculated through the formula and then the average value is obtained, so that the edge evaluation result of the unlabeled sample is obtained.
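As a rough illustration of step S301, the sketch below averages the gap between the two largest category probabilities over the high-confidence center points. The function name `edge_evaluation`, the `(N, C)` input layout, and the value returned when no center point survives filtering are assumptions for the example, not details given in the embodiment.

```python
import numpy as np

def edge_evaluation(class_probs: np.ndarray) -> float:
    """Mean of (p1 - p2) over N center points.

    class_probs has shape (N, C) with C >= 2: the per-category probabilities
    read from the heat map at each of the N high-confidence center points.
    """
    if class_probs.shape[0] == 0:
        return 0.0  # assumption: a sample with no high-confidence center gets 0
    top2 = np.sort(class_probs, axis=1)[:, -2:]  # two largest per row, ascending
    return float(np.mean(top2[:, 1] - top2[:, 0]))

probs = np.array([[0.9, 0.1, 0.05],   # confident target: gap 0.8
                  [0.5, 0.4, 0.30]])  # ambiguous target: gap 0.1
v2 = edge_evaluation(probs)
print(v2)
```

A small gap at a center point corresponds to the ambiguous case described above, in which the model hesitates between two categories.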

Step S302: screening the high-value samples in the unlabeled sample set according to the evaluation result and the edge evaluation result. Specifically, the evaluation result of each unlabeled sample is calculated through the Gaussian heat map evaluation strategy in step S103, and the edge evaluation result of each unlabeled sample is calculated in step S301. The evaluation result and the edge evaluation result can therefore be combined into a total evaluation result, which is then used to screen the high-value samples in the unlabeled sample set.

Specifically, the total evaluation result of the unlabeled sample can be represented by the following formula:

v＝α·v₁+β·v₂

where v₁ denotes the evaluation result and v₂ denotes the edge evaluation result; α is the weight of the Gaussian heat map evaluation strategy, that is, of the evaluation result, and β is the weight of the edge sampling strategy, that is, of the edge evaluation result. Both weights can be set in advance according to the actual situation.

In an embodiment, the unlabeled samples in the unlabeled sample set are sorted according to the total evaluation result, and a preset number of samples, such as the first K samples, are taken as high-value samples. At this point, although a certain number of high value samples may be determined, the number of samples taken may be small. Therefore, more high-value samples can be obtained in a cyclic updating mode.
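The weighting and ranking described above can be sketched as follows. The helper name `select_top_k` and the toy scores and weights are illustrative assumptions; the embodiment leaves α, β and K to be chosen according to the actual situation.

```python
import numpy as np

def select_top_k(v1, v2, alpha: float, beta: float, k: int):
    """Combine the two per-sample scores as v = alpha*v1 + beta*v2 and
    return the indices of the k samples with the largest v, plus v itself."""
    v = alpha * np.asarray(v1, dtype=float) + beta * np.asarray(v2, dtype=float)
    order = np.argsort(-v, kind="stable")  # descending order of total score
    return order[:k], v

# Toy scores for four unlabeled samples (made-up numbers).
v1 = [0.2, 0.8, 0.5, 0.1]   # Gaussian heat map evaluation results
v2 = [0.3, 0.1, 0.6, 0.9]   # edge evaluation results
top_idx, v = select_top_k(v1, v2, alpha=0.5, beta=0.5, k=2)
print(top_idx, v)
```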

In one embodiment, the cyclic process specifically includes: updating the unlabeled sample set and the labeled sample set according to the screened high-value samples; and, according to the updated labeled sample set, again performing in sequence the training of the target detection model, the prediction on the unlabeled sample set, the formation of the standard Gaussian heat map and the mask map, the calculation of the evaluation result, and the sorting and selection of the unlabeled samples, until it is determined that a stopping condition is met, at which point the training of the target detection model is stopped.

Specifically, after the K high-value samples are determined, they can be labeled and added to the labeled sample set, and at the same time deleted from the unlabeled sample set, so that both sets are updated. The CenterNet target detection model is then retrained with the updated labeled sample set, and the prediction on the unlabeled sample set, the formation of the standard Gaussian heat map and the mask map, the calculation of the evaluation result and the edge evaluation result, and the sorting and selection of the unlabeled samples are performed again, so that a new batch of high-quality samples to be labeled is mined and the sample sets are updated once more. These steps are repeated until the detection performance of CenterNet on the data saturates.
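The cyclic update can be outlined as a generic loop. This is only a sketch under stated assumptions: `train`, `score`, `annotate` and `should_stop` are hypothetical callables standing in for model training, the total evaluation result, manual labeling and the stopping condition, none of which are specified at code level in the text.

```python
from typing import Any, Callable, List, Sequence, Tuple

def mining_loop(
    labeled: Sequence[Any],
    unlabeled: Sequence[Any],
    train: Callable[[List[Any]], Any],        # hypothetical: fit a detector
    score: Callable[[Any, Any], float],       # hypothetical: total evaluation v
    annotate: Callable[[Any], Any],           # hypothetical: manual labeling
    should_stop: Callable[[Any, int], bool],  # hypothetical: stopping condition
    k: int,
) -> Tuple[List[Any], List[Any]]:
    labeled, unlabeled = list(labeled), list(unlabeled)
    rounds = 0
    while unlabeled:
        model = train(labeled)                # retrain on the current labeled set
        if should_stop(model, rounds):
            break
        scores = [score(model, s) for s in unlabeled]
        ranked = sorted(range(len(unlabeled)), key=lambda i: scores[i], reverse=True)
        chosen = ranked[:k]                   # indices of the top-k samples
        labeled.extend(annotate(unlabeled[i]) for i in chosen)
        chosen_set = set(chosen)
        unlabeled = [s for i, s in enumerate(unlabeled) if i not in chosen_set]
        rounds += 1
    return labeled, unlabeled

# Toy run: samples are integers and the score is the sample value itself.
final_labeled, final_unlabeled = mining_loop(
    labeled=[(0, "label")],
    unlabeled=[1, 5, 3, 9, 7],
    train=lambda data: len(data),
    score=lambda model, s: float(s),
    annotate=lambda s: (s, "label"),
    should_stop=lambda model, rounds: rounds >= 2,
    k=2,
)
print(final_labeled, final_unlabeled)
```

In practice the stopping condition would test whether detection performance has saturated, for example on a held-out validation set; the fixed round count here is only a stand-in.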

In one embodiment, as shown in fig. 6, the target detection-based sample mining method is implemented by the following process. A CenterNet target detection network model is trained on the existing labeled sample set. The unlabeled samples of the unlabeled sample set are predicted with the trained CenterNet detection model to obtain a prediction result. A max-pooling operation is applied to the hm prediction branch of the prediction result to obtain the peak points of the heat map of each category. All pseudo center points below a preset confidence threshold are filtered out, and the high-confidence center points are retained. The width and height attributes of all high-confidence center points are read from the output of the wh prediction branch, and a standard Gaussian heat map is made according to these attributes. A mask map is made from the standard Gaussian heat map: if the Gaussian value at a position is greater than 0, the value of the corresponding position in the mask map is set to 1, and otherwise to 0. Based on the mask map, the JS divergence between the hm heat map and the standard Gaussian heat map is calculated as the score of the Gaussian heat map evaluation strategy. The average of the differences between the two highest predicted category probabilities over all high-confidence center points is taken as the edge evaluation result. The score of the Gaussian heat map evaluation strategy and the edge evaluation result are weighted to obtain a total evaluation result. All samples of the unlabeled sample set are sorted in descending order of the total evaluation result v, and the first K unlabeled samples are selected as high-value samples to be labeled.
The K mined high-value samples are labeled manually and added to the labeled sample set, and at the same time removed from the unlabeled sample set. The CenterNet target detection model is retrained with the updated labeled sample set, the above steps are performed again to mine a new batch of high-quality samples to be labeled, and the sample sets are updated again, until the detection performance of CenterNet on the data saturates.
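Two steps of this process, peak extraction by max pooling and the masked JS divergence, can be sketched with NumPy as follows. The 3x3 pooling window, the function names, and the choice to turn each masked map into a distribution by normalizing it to sum to 1 are assumptions made for this illustration; the text above does not fix these details.

```python
import numpy as np

def local_peaks(hm: np.ndarray, conf_threshold: float) -> np.ndarray:
    """hm has shape (C, H, W). Return rows (c, y, x) of points that equal the
    3x3 max-pooled value (local peaks) and exceed the confidence threshold."""
    _, h, w = hm.shape
    padded = np.pad(hm, ((0, 0), (1, 1), (1, 1)), constant_values=-np.inf)
    shifted = [padded[:, dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    pooled = np.max(shifted, axis=0)  # 3x3 max pooling, stride 1
    return np.argwhere((hm == pooled) & (hm > conf_threshold))

def masked_js(hm: np.ndarray, gauss: np.ndarray, mask: np.ndarray, eps: float = 1e-12) -> float:
    """JS divergence between the masked, normalized predicted and standard maps."""
    p = (hm * mask).ravel()
    q = (gauss * mask).ravel()
    p = p / (p.sum() + eps)
    q = q / (q.sum() + eps)
    m = 0.5 * (p + q)

    def kl(a: np.ndarray, b: np.ndarray) -> float:
        nz = a > 0  # terms with a == 0 contribute nothing
        return float(np.sum(a[nz] * np.log(a[nz] / b[nz])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Toy prediction: one strong peak, one weak spurious bump.
hm = np.zeros((1, 5, 5))
hm[0, 2, 2], hm[0, 2, 3], hm[0, 0, 0] = 0.9, 0.5, 0.2
peaks = local_peaks(hm, conf_threshold=0.3)

ys, xs = np.mgrid[0:5, 0:5]
gauss = np.exp(-((xs - 2) ** 2 + (ys - 2) ** 2) / 2.0)[None]
mask = (gauss > 0.1).astype(float)
js_same = masked_js(gauss, gauss, mask)
js_diff = masked_js(hm, gauss, mask)
print(peaks, js_same, js_diff)
```

A prediction that matches the standard Gaussian shape inside the mask gives a divergence near 0, while a distorted response gives a larger value, bounded above by ln 2 for this natural-log form.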

According to the sample mining method based on target detection provided by the embodiment of the invention, high-quality unlabeled samples are mined iteratively from the existing output characteristics of the CenterNet network, so that the number of samples that must be labeled is reduced and the data labeling cost is lowered. At the same time, no dedicated scoring module or scoring loss function needs to be added to the network, which greatly reduces the workload. With this sample mining method, the performance of the CenterNet detection network improves gradually over the iterations, so that the performance gain of the model is maximized while the labeling workload is minimized. In addition, the high-quality labeled data set mined by this method can also be applied to other target detection methods and is not limited to the CenterNet detection algorithm.

An embodiment of the present invention further provides a sample mining apparatus based on target detection, as shown in fig. 7, the apparatus includes:

The prediction module is used for predicting each unlabeled sample in the unlabeled sample set by adopting a pre-trained target detection model to obtain a prediction result. For details, reference is made to the corresponding parts of the above method embodiments, which are not described herein again.

The branch calculation module is used for forming a standard Gaussian heat map and a mask map according to the high-confidence peak center points of all categories in the heat map in the prediction result and their width and height attributes. For details, reference is made to the corresponding parts of the above method embodiments, which are not described herein again.

The evaluation module is used for evaluating the heat map in the prediction result according to the mask map and the standard Gaussian heat map to obtain the evaluation result of each unlabeled sample. For details, reference is made to the corresponding parts of the above method embodiments, which are not described herein again.

The screening module is used for screening the high-value samples in the unlabeled sample set according to the evaluation result. For details, reference is made to the corresponding parts of the above method embodiments, which are not described herein again.
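One possible way to picture how the four modules cooperate is a small container that chains them. The class name `SampleMiningDevice`, its field names and the toy callables are hypothetical and serve only to show the data flow between the modules; they do not describe the actual device.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple

@dataclass
class SampleMiningDevice:
    predict: Callable[[Any], Any]                     # prediction module
    branch_compute: Callable[[Any], Tuple[Any, Any]]  # -> (Gaussian heat map, mask map)
    evaluate: Callable[[Any, Any, Any], float]        # evaluation module
    screen: Callable[[Sequence[float]], List[int]]    # screening module -> indices

    def run(self, unlabeled: Sequence[Any]) -> List[Any]:
        scores = []
        for sample in unlabeled:
            pred = self.predict(sample)
            gauss, mask = self.branch_compute(pred)
            scores.append(self.evaluate(pred, gauss, mask))
        return [unlabeled[i] for i in self.screen(scores)]

# Toy wiring with trivial stand-ins for each module.
device = SampleMiningDevice(
    predict=lambda s: s,
    branch_compute=lambda pred: (pred, 1),
    evaluate=lambda pred, gauss, mask: float(pred * mask),
    screen=lambda scores: sorted(range(len(scores)), key=lambda i: -scores[i])[:2],
)
selected = device.run([3, 1, 4, 1, 5])
print(selected)
```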

According to the sample mining device based on target detection provided by the embodiment of the invention, a standard Gaussian heat map and a mask map are formed from the high-confidence peak center points of all categories in the heat map in the prediction result of the target detection model and their width and height attributes, and the heat map is evaluated against the mask map and the standard Gaussian heat map to obtain an evaluation result. The device thus mines high-quality unlabeled samples from the existing output characteristics of the target detection model, which reduces the number of samples that must be labeled and lowers the data labeling cost; no dedicated scoring module or scoring loss function needs to be added to the network, which greatly reduces the workload.

For a detailed functional description of the sample mining device based on target detection provided by the embodiment of the invention, refer to the description of the sample mining method based on target detection in the above embodiments.

An embodiment of the present invention further provides a storage medium, as shown in fig. 8, on which a computer program 601 is stored; when the instructions of the program are executed by a processor, the steps of the target detection-based sample mining method in the foregoing embodiments are implemented. The storage medium may also store audio and video stream data, feature frame data, interaction request signaling, encrypted data, preset data sizes and the like. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory, a Hard Disk Drive (HDD) or a Solid State Drive (SSD), etc.; the storage medium may also comprise a combination of memories of the kinds described above.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic Disk, an optical Disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory (Flash Memory), a Hard Disk (Hard Disk Drive, abbreviated as HDD) or a Solid State Drive (SSD), etc.; the storage medium may also comprise a combination of memories of the kind described above.

An embodiment of the present invention further provides an electronic device, as shown in fig. 9, the electronic device may include a processor 51 and a memory 52, where the processor 51 and the memory 52 may be connected by a bus or in another manner, and fig. 9 takes the connection by the bus as an example.

The processor 51 may be a Central Processing Unit (CPU). The processor 51 may also be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or a combination thereof.

The memory 52, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as the corresponding program instructions/modules in the embodiments of the present invention. The processor 51 executes various functional applications and data processing of the processor by running non-transitory software programs, instructions and modules stored in the memory 52, namely, implements the target detection-based sample mining method in the above method embodiments.

The memory 52 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created by the processor 51, and the like. Further, the memory 52 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 52 may optionally include memory located remotely from the processor 51, and these remote memories may be connected to the processor 51 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The one or more modules are stored in the memory 52 and, when executed by the processor 51, perform a target detection-based sample mining method as in the embodiment of fig. 1-6.

The details of the electronic device may be understood by referring to the corresponding descriptions and effects in the embodiments shown in fig. 1 to fig. 6, and are not described herein again.

Although the embodiments of the present invention have been described in conjunction with the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope defined by the appended claims.

Claims

1. A sample mining method based on target detection is characterized by comprising the following steps:

predicting each unlabeled sample of the unlabeled sample set by adopting a pre-trained target detection model to obtain a prediction result;

forming a standard Gaussian heat map and a mask map according to the high-confidence peak center points of all categories in the heat map in the prediction result and their width and height attributes;

evaluating the heat map in the prediction result according to the mask map and the standard Gaussian heat map to obtain the evaluation result of each unlabeled sample;

and screening the high-value samples in the unlabeled sample set according to the evaluation result.

2. The target detection-based sample mining method of claim 1, wherein the forming a standard Gaussian heat map and a mask map according to the high-confidence peak center points of all categories in the heat map in the prediction result and their width and height attributes comprises:

determining a high-confidence central point according to a peak point of the heat map in the prediction result and a preset threshold;

forming a standard Gaussian heat map according to the high-confidence central point and the width and height attributes of the high-confidence central point in the prediction result;

and forming a mask map according to the relation between the Gaussian value of each position in the standard Gaussian heat map and a preset Gaussian threshold value.

3. The target detection-based sample mining method of claim 2, wherein the determining a high-confidence central point according to a peak point of the heat map in the prediction result and a preset threshold comprises:

obtaining the peak points on the heat map of each category based on a pooling operation;

and filtering the peak points according to the preset threshold to obtain the high-confidence central points.

4. The target detection-based sample mining method of claim 1, wherein the evaluating the heat map in the prediction result according to the mask map and the standard Gaussian heat map to obtain the evaluation result of each unlabeled sample comprises:

calculating a first distribution according to the mask map and the heat map of each unlabeled sample;

calculating a second distribution according to the mask map and the standard Gaussian heat map;

and calculating the evaluation result of each unlabeled sample according to the distance between the first distribution and the second distribution.

5. The target detection-based sample mining method of claim 1, wherein the screening of high-value samples in the unlabeled sample set according to the evaluation result comprises:

calculating the average of the differences between the first probability and the second probability over all high-confidence central points to obtain an edge evaluation result;

and screening the high-value samples in the unlabeled sample set according to the evaluation result and the edge evaluation result.

6. The target detection-based sample mining method of claim 1, wherein after the screening of the high-value samples in the unlabeled sample set according to the evaluation result, the method further comprises:

updating the unlabeled sample set and the labeled sample set according to the screened high-value samples;

and, according to the updated labeled sample set, again performing in sequence the training of the target detection model, the prediction on the unlabeled sample set, the formation of the standard Gaussian heat map and the mask map, the calculation of the evaluation result, and the sorting and selection of the unlabeled samples, until it is determined that a stopping condition is met, at which point the training of the target detection model is stopped.

7. The target detection-based sample mining method of claim 1, wherein:

the target detection model is a CenterNet target detection model;

the evaluation result is calculated by KL divergence, JS divergence or Wasserstein distance.

8. A sample mining device based on target detection, characterized by comprising:

the prediction module is used for predicting each unlabeled sample in the unlabeled sample set by adopting a pre-trained target detection model to obtain a prediction result;

the branch calculation module is used for forming a standard Gaussian heat map and a mask map according to the high-confidence peak center points of all categories in the heat map in the prediction result and their width and height attributes;

the evaluation module is used for evaluating the heat map in the prediction result according to the mask map and the standard Gaussian heat map to obtain the evaluation result of each unlabeled sample;

and the screening module is used for screening the high-value samples in the unlabeled sample set according to the evaluation result.

9. A computer-readable storage medium storing computer instructions for causing a computer to perform the target detection-based sample mining method of any one of claims 1-7.

10. An electronic device, comprising: a memory and a processor, the memory and the processor being communicatively coupled to each other, the memory storing computer instructions, the processor executing the computer instructions to perform the target detection-based sample mining method of any one of claims 1-7.