WO2023105277A1

WO2023105277A1 - Data sampling method and apparatus, and storage medium

Info

Publication number: WO2023105277A1
Application number: PCT/IB2021/062073
Authority: WO
Inventors: Chunya LIU
Original assignee: Sensetime International Pte. Ltd.
Priority date: 2021-12-09
Filing date: 2021-12-21
Publication date: 2023-06-15
Also published as: AU2021290433A1

Abstract

The disclosure provides a data sampling method and apparatus, and a storage medium. Data label sets corresponding to respective scenes are acquired; based on multiple pieces of label information corresponding to different images in the respective data label sets, the number of times for which each label information category of M label information categories corresponding to the respective data label sets appears is counted, and initial weights of the respective data label sets are determined; based on the negative correlation between the number of times for which the label information category appears and the weights, the initial weight of each scene is adjusted to obtain a sampling weight of the data label set of each scene; and sampling is performed based on the sampling weight.

Description

DATA SAMPLING METHOD AND APPARATUS, AND STORAGE MEDIUM

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is based on and claims priority from Singapore Patent Application No. 10202113665U filed on December 9, 2021, the disclosure of which is incorporated herein by reference in its entirety.

TECHNICAL FIELD

Embodiments of the disclosure relate to the technical field of data sampling, and particularly to a data sampling method and apparatus, and a storage medium.

BACKGROUND

Target detection is an important part of an intelligent video analysis system. In the target detection of some intelligent game scenes, it is desired to perform high-accuracy detection on target objects related to the game.

A traditional detection model is obtained by training on labeled sample data. However, in some game scenes, there are two main problems in sample data collection. Firstly, in the process of sample data collection, the samples of a certain category in a game area are mainly collected, so that the amounts of data of the various categories in each batch of collected data differ by orders of magnitude. Secondly, the amount of data in each scene is different. If sampling is performed by a random sampling method, less sample data is collected for scenes with a smaller data amount. Furthermore, when the model is trained on the samples, sufficient training cannot be implemented on samples in scenes or categories with a smaller data amount, and the detection performance for objects to be detected in those scenes or categories is poor.

SUMMARY

The embodiments of the disclosure provide a data sampling method and apparatus, and a storage medium, which may improve the detection performance of objects to be detected in scenes or categories with less data amount.

The technical solutions of the disclosure are implemented as follows.

The embodiments of the disclosure provide a data sampling method, which may include the following operations.

Data label sets corresponding to respective scenes are acquired.

For each scene, the number of times for which each label information category of M label information categories corresponding to the respective data label sets appears is counted based on multiple pieces of label information corresponding to different images in the respective data label sets, and initial weights of the respective data label sets are determined. Herein, M is an integer greater than or equal to 1.

The initial weight in each scene is adjusted based on the negative correlation between the number of times for which the label information category appears and the weights to obtain a sampling weight of the data label set in each scene.

The different images in the data label set of each scene are sampled based on the sampling weight to obtain sample images. The sample images are used for determining training samples to be used in a process of training a target detection model.

In the above-mentioned solution, the operation of counting, for each scene, the number of times for which each label information category of M label information categories corresponding to the respective data label sets appears based on multiple pieces of label information corresponding to different images in the respective data label sets, and determining initial weights of the respective data label sets may include the following operations.

For each scene, the M label information categories corresponding to each data label set are counted according to the multiple pieces of label information corresponding to each data label set. M numbers of times corresponding to the respective M label information categories are counted in the different images.

The maximum number information of the M numbers of times is determined as the initial weight corresponding to each data label set.

In the above-mentioned solution, the operation of adjusting the initial weight in each scene based on the negative correlation between the number of times for which the label information category appears and the weights to obtain a sampling weight of the data label set in each scene may include the following operations.

Weight proportion information corresponding to each data label set is calculated based on the initial weights corresponding to the respective scenes.

Exponential operation is performed on negated weight proportion information to obtain an intermediate value corresponding to each data label set.

The negative correlation between the number of times for which the label information category appears and the weights is implemented based on negative correlation processing between the intermediate value and a second constant, to calculate the sampling weight corresponding to each data label set. The second constant is a positive integer greater than or equal to 1.

In the above-mentioned solution, the operation of calculating weight proportion information corresponding to each data label set based on the initial weights corresponding to the respective scenes may include the following operations.

The initial weight of each scene is divided by a sum of the initial weights corresponding to the respective scenes to obtain the weight proportion information corresponding to each data label set of each scene.

In the above-mentioned solution, the operation of implementing the negative correlation between the number of times for which the label information category appears and the weights based on negative correlation processing between the intermediate value and the second constant, to calculate the sampling weight corresponding to each data label set may include the following operations.

The intermediate value is added to a preset constant to obtain a secondary intermediate value.

The second constant is divided by the secondary intermediate value to take a reciprocal to implement the negative correlation between the number of times for which the label information category appears and the weights, so as to obtain the sampling weight corresponding to each data label set.

In the above-mentioned solution, the operation of sampling the different images in the data label set of each scene based on the sampling weight to obtain sample images may include the following operations.

The sampling weight corresponding to each scene is divided by the sum of the sampling weights corresponding to the respective scenes to obtain a sampling proportion corresponding to each data label set corresponding to each scene.

The sampling proportion is multiplied by the number of preset total sample images to obtain the sampling number of sample images corresponding to each data label set.

Random sampling is performed, according to the sampling number, on each data label set corresponding to each scene to obtain the sample images.

In the above-mentioned solution, M numbers of times corresponding to the respective M label information categories are constructed in form of a two-dimensional array.

The operation of determining the maximum number information of the M numbers of times as the initial weight corresponding to each data label set may include the following operations.

The maximum number information of the M numbers of times corresponding to M two-dimensional arrays is determined as the initial weight of each data label set.

In the above-mentioned solution, each two-dimensional array in the M two-dimensional arrays may include: numbering information of each corresponding data label set and the numbering information of the corresponding label information category.

The embodiments of the disclosure further provide a data sampling apparatus, which may include a data acquisition unit, a weight determination unit, a weight adjusting unit, and a sampling unit.

The data acquisition unit is configured to acquire data label sets corresponding to respective scenes.

The weight determination unit is configured to, for each scene, count, based on multiple pieces of label information corresponding to different images in the respective data label sets, the number of times for which each label information category of M label information categories corresponding to the respective data label sets appears, and determine initial weights of the respective data label sets. Herein, M is an integer greater than or equal to 1.

The weight adjusting unit is configured to adjust, based on the negative correlation between the number of times for which the label information category appears and the weights, the initial weight in each scene to obtain a sampling weight of the data label set in each scene.

The sampling unit is configured to sample, based on the sampling weight, the different images in the data label set of each scene to obtain sample images. The sample images are used for determining training samples to be used in a process of training a target detection model.

The embodiments of the disclosure further provide a data sampling apparatus, including a memory and a processor. The memory stores a computer program which may run on the processor. When executing the program, the processor implements the steps of the above-mentioned method.

The embodiments of the disclosure further provide a computer readable storage medium, on which a computer program is stored. When executed by a processor, the computer program implements the steps of the above-mentioned method.

In the embodiments of the disclosure, data label sets corresponding to respective scenes are acquired; for each scene, based on multiple pieces of label information corresponding to different images in the respective data label sets, the number of times for which each label information category of M label information categories corresponding to the respective data label sets appears is counted, and the initial weights of the respective data label sets are determined, M being an integer greater than or equal to 1; based on the negative correlation between the number of times for which the label information category appears and the weights, the initial weight of each scene is adjusted to obtain a sampling weight of the data label set of each scene; and sampling is performed on the different images in the data label set of each scene based on the sampling weight to obtain the sample images, the sample images being used for determining training samples to be used in the process of training the target detection model. The initial weight is calculated from the number of times for which each of the label information categories appears, and the initial weight corresponding to each scene is adjusted according to the negative correlation of the number of times for which the corresponding label information category appears to obtain the sampling weight. Therefore, the solution narrows the numerical gap between the sampling weights corresponding to the data label sets with different numbers of times of label information categories, and sample images with smaller number differences may be collected in the data label sets through the corresponding sampling weights, so that the target detection model obtained by training on the sample images can improve the detection performance of objects to be detected in scenes or categories with less data amount.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is an optional flowchart of a data sampling method according to an embodiment of the disclosure.

FIG. 2 is an optional effect diagram of a data sampling method according to an embodiment of the disclosure.

FIG. 3 is an optional effect diagram of a data sampling method according to an embodiment of the disclosure.

FIG. 4 is an optional flowchart of a data sampling method according to an embodiment of the disclosure.

FIG. 5 is an optional flowchart of a data sampling method according to an embodiment of the disclosure.

FIG. 6 is an optional flowchart of a data sampling method according to an embodiment of the disclosure.

FIG. 7 is an optional flowchart of a data sampling method according to an embodiment of the disclosure.

FIG. 8 is an optional flowchart of a data sampling method according to an embodiment of the disclosure.

FIG. 9 is a structural schematic diagram of a data sampling apparatus according to an embodiment of the disclosure.

FIG. 10 is a schematic diagram of a hardware entity of a data sampling apparatus according to an embodiment of the disclosure.

DETAILED DESCRIPTION

For making the objectives, technical solutions, and advantages of the disclosure clearer, the technical solutions of the disclosure will further be described below in combination with the drawings and embodiments in detail. The described embodiments should not be considered as limits to the disclosure. All other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the scope of protection of the disclosure.

"Some embodiments" involved in the following descriptions describes a subset of all possible embodiments. However, it can be understood that "some embodiments" may be the same subset or different subsets of all the possible embodiments, and may be combined without conflicts.

If the similar descriptions of "first/second" appear in the disclosed documents, the following descriptions will be added. Terms "first/second/third" involved in the following descriptions are only for distinguishing similar objects and do not represent a specific sequence of the objects. It can be understood that "first/second/third" may be interchanged to specific sequences or orders if allowed to implement the embodiments of the disclosure described herein in sequences except the illustrated or described ones.

Unless otherwise defined, all technological and scientific terms used in the disclosure have meanings the same as those usually understood by those skilled in the art of the disclosure. The terms used in the disclosure are only adopted to describe the embodiments of the disclosure and not intended to limit the disclosure.

FIG. 1 is an optional flowchart of a data sampling method according to an embodiment of the disclosure, which will be described with reference to the steps shown in FIG. 1.

At S101, data label sets corresponding to respective scenes are acquired.

In the embodiment of the disclosure, a server acquires the data label sets corresponding to respective scenes.

Each scene in the multiple scenes may include: a scene in which a plurality of players play a game. The data label set may include: a plurality of images in corresponding scenes and multiple pieces of label information. Herein, each image may correspond to at least one piece of label information. What is labeled in the label information is the category of an object to be detected in the image, that is, the corresponding label information category. The label information categories of the multiple pieces of label information may be the same or different. The data label set corresponding to the scene in which a plurality of players play a game may include: multiple images corresponding to the scene in which a plurality of players play a game and multiple pieces of label information. The multiple scenes may include: scenes in which different players play games on different occasions.

Exemplarily, reference is made to FIG. 2, which is an optional effect diagram of a data sampling method according to an embodiment of the disclosure. The scene shown in FIG. 2 may be one scene in the multiple scenes. The scene shown in FIG. 2 may be a scene in which player A, player B and player C play a game. The object to be detected herein may be a game currency.

Exemplarily, reference is made to FIG. 3, which is an optional effect diagram of a data sampling method according to an embodiment of the disclosure. FIG. 3 is an image in the data label set corresponding to one scene. Herein, the objects to be detected in the image may include: the left hand 304 of the player D, the right hand 302 of the player D, the left elbow joint 303 of the player D, the right elbow joint 301 of the player D and the game currency 305. The label information corresponding to the image may be: the hands, the elbow joints and the game currency. The label information of the objects to be detected of the left hand 304 of the player D and the right hand 302 of the player D may be "hand". The label information of the left elbow joint 303 of the player D and the right elbow joint 301 of the player D may be "elbow joint". The label information of the game currency 305 may be "game currency".

At S102, for each scene, the number of times for which each label information category of M label information categories corresponding to the respective data label sets appears is counted based on multiple pieces of label information corresponding to different images in the respective data label sets, and the initial weights of the respective data label sets are determined.

In the embodiment of the disclosure, the server counts the number of times for which each label information category of M label information categories corresponding to the respective data label sets appears based on multiple pieces of label information corresponding to different images in the respective data label sets for each scene, and determines the initial weights of the respective data label sets.

In the embodiment of the disclosure, different images in each data label set correspond to multiple pieces of label information. The label information categories of the multiple pieces of label information may be the same and may also be different. The server counts each of M label information categories corresponding to the data label set according to the label information categories of the multiple pieces of label information. Then, the server determines the number of times for which each of M label information categories appears according to the number of label information corresponding to each label information category. Furthermore, the server determines the maximum number among the numbers of times of M label information categories as the initial weight corresponding to the data label set.

Exemplarily, one data label set includes: 5 images and 8 pieces of label information. The 8 pieces of label information herein may respectively be: a game currency, a game currency, a game currency, a mobile phone, a mobile phone, a water cup, a game currency and a mobile phone. The server may divide the 8 pieces of label information into 3 label information categories: the game currency, the mobile phone and the water cup. The server counts that the number of pieces of the label information corresponding to the label information category "game currency" is 4, the number of pieces of the label information corresponding to the label information category "mobile phone" is 3, and the number of pieces of the label information corresponding to the label information category "water cup" is 1. Furthermore, the server may determine that the maximum number "4" among the numbers of times of the label information categories is the initial weight of the data label set.
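
The counting in the example above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `initial_weight` and the representation of a data label set as a flat list of label strings are assumptions made for illustration and do not appear in the disclosure.

```python
from collections import Counter


def initial_weight(labels):
    """Count each label information category and return (counts, initial weight).

    `labels` is assumed to be a non-empty flat list holding every piece of
    label information found in the images of one data label set.
    """
    counts = Counter(labels)              # number of times each category appears
    return counts, max(counts.values())   # maximum count is the initial weight


# The 8 pieces of label information from the example above.
labels = [
    "game currency", "game currency", "game currency", "mobile phone",
    "mobile phone", "water cup", "game currency", "mobile phone",
]
counts, weight = initial_weight(labels)
print(dict(counts))  # per-category counts
print(weight)        # 4
```

With these 8 labels the three categories are counted as 4, 3 and 1, and the initial weight of the data label set is 4, matching the example.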

At S103, the initial weight in each scene is adjusted based on the negative correlation between the number of times for which the label information category appears and the weights to obtain a sampling weight of the data label set in each scene.

In the embodiment of the disclosure, the server adjusts the initial weight in each scene based on the negative correlation between the number of times for which the label information category appears and the weights to obtain a sampling weight of the data label set in each scene.

In the embodiment of the disclosure, the server adjusts the initial weight in each scene, and reduces the larger initial weight to obtain the sampling weight of the scene corresponding to the initial weight. The server increases the smaller initial weight to obtain the sampling weight of the scene corresponding to the initial weight, and obtains the sampling weight of the data label set of each scene.

In the embodiment of the disclosure, the server divides the initial weight of each scene by a sum of the initial weights of the respective scenes. The weight proportion information corresponding to each scene is obtained. The server negates the weight proportion information corresponding to each scene and performs exponential operation to obtain an intermediate value corresponding to each data label set. The server implements negative correlation between the number of times for which the label information category appears and the weights based on negative correlation processing between the intermediate value of each scene and a second constant, to calculate the sampling weight corresponding to the data label set of each scene.

At S104, different images in the data label set of each scene are sampled based on the sampling weights to obtain sample images.

In the embodiment of the disclosure, the server samples the different images in the data label set of each scene based on the sampling weight to obtain the sample images. The sample images are used for determining training samples to be used in a process of training a target detection model.

In the embodiment of the disclosure, the server divides each sampling weight by a sum of the sampling weights corresponding to the respective scenes to obtain a sampling proportion corresponding to each scene. The server performs average sampling on the different images in the data label set of the corresponding scene based on each sampling proportion to obtain the sample images of the corresponding scene.

Exemplarily, the multiple scenes may include three scenes. The data label sets of the three scenes may respectively include: 100 images, 150 images and 200 images. The sampling weights corresponding to the data label sets of the three scenes may be: 3, 4 and 5. The server divides the sampling weight of each scene by the sum of the sampling weights of the three scenes, and may obtain the sampling proportion of each scene. Through calculation, the server may obtain the sampling proportions of the three scenes: 0.25, approximately 0.333 and approximately 0.417, respectively. The server performs average sampling in the data label sets of the three scenes according to the sampling proportions of the three scenes to obtain the sample images corresponding to the three scenes, respectively.
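
The arithmetic of this example, together with the subsequent random sampling, can be sketched as follows. The sketch is illustrative only: the function names, the preset total of 120 sample images, the rounding of the sampling number, the cap at the size of each data label set and the fixed random seed are all assumptions that are not specified in the disclosure.

```python
import random


def sampling_proportions(sampling_weights):
    """Divide each sampling weight by the sum of all sampling weights."""
    total = sum(sampling_weights)
    return [w / total for w in sampling_weights]


def sample_scene_images(scene_images, sampling_weights, total_samples, seed=0):
    """Randomly sample images from each scene according to its proportion.

    Rounding and capping the sampling number are assumptions of this sketch.
    """
    rng = random.Random(seed)
    picked = []
    for images, p in zip(scene_images, sampling_proportions(sampling_weights)):
        k = min(len(images), round(p * total_samples))  # sampling number
        picked.append(rng.sample(images, k))            # sampling without replacement
    return picked


# Three scenes with 100, 150 and 200 images and sampling weights 3, 4 and 5.
scene_images = [
    [f"scene{s}_img{i}" for i in range(n)]
    for s, n in enumerate([100, 150, 200])
]
weights = [3, 4, 5]
proportions = sampling_proportions(weights)
picked = sample_scene_images(scene_images, weights, total_samples=120)
print([round(p, 3) for p in proportions])  # [0.25, 0.333, 0.417]
print([len(p) for p in picked])            # [30, 40, 50]
```

With the assumed preset total of 120 sample images, the three scenes contribute 30, 40 and 50 sample images, respectively.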

In the embodiment of the disclosure, the data label sets corresponding to respective scenes are acquired; for each scene, based on multiple pieces of label information corresponding to different images in the respective data label sets, the number of times for which each label information category of M label information categories corresponding to the respective data label sets appears is counted, and the initial weights of the respective data label sets are determined, M being an integer greater than or equal to 1; based on the negative correlation between the number of times for which the label information category appears and the weights, the initial weight of each scene is adjusted to obtain a sampling weight of the data label set of each scene; and sampling is performed on the different images in the data label set of each scene based on the sampling weight to obtain the sample images, the sample images being used for determining training samples to be used in the process of training the target detection model. The initial weight is calculated from the number of times of the label information categories, and the initial weight corresponding to each scene is adjusted according to the negative correlation of the number of times of the corresponding label information categories to obtain the sampling weight. Therefore, the solution narrows the numerical gap between the sampling weights corresponding to the data label sets with different numbers of times of label information categories, and sample images with smaller number differences may be collected in the data label sets through the corresponding sampling weight, so that the target detection model obtained by training on the sample images can improve the detection performance of objects to be detected in scenes or categories with less data amount.

In some embodiments, referring to FIG. 4, FIG. 4 is an optional flowchart of a data sampling method according to an embodiment of the disclosure. S102 shown in FIG. 1 may be implemented by S105 to S107, which will be described with reference to steps.

At S105, for each scene, M label information categories corresponding to each data label set are counted according to the multiple pieces of label information corresponding to each data label set.

In the embodiment of the disclosure, the server counts, for each scene, the M label information categories corresponding to each data label set according to the multiple pieces of label information corresponding to each data label set.

In the embodiment of the disclosure, the multiple pieces of label information may correspond to M label information categories. The number M of label information categories may be the same as or different from the number of the multiple pieces of label information, and is smaller than or equal to the number of the multiple pieces of label information.

At S106, M numbers of times corresponding to the respective M label information categories are counted in the different images.

In the embodiment of the disclosure, the server counts each of the M numbers of times corresponding to the respective M label information categories in the different images.

In the embodiment of the disclosure, different images of each scene may correspond to multiple pieces of label information. Each image may correspond to at least one piece of label information. The server performs classification according to the label information categories of the multiple pieces of label information corresponding to different images, and determines M label information categories. The server determines the number of label information corresponding to each label information category, further determines the number information corresponding to each label information category, and further determines M numbers of times corresponding to the M label information categories.

At S107, the maximum number information of the M numbers of times is determined as the initial weight corresponding to each data label set.

In the embodiment of the disclosure, the server determines the maximum number information of the M numbers of times as the initial weight corresponding to each data label set.

In the embodiment of the disclosure, M numbers of times corresponding to the respective M label information categories may be constructed in form of a two-dimensional array. Exemplarily, one of the M numbers of times may be constructed with class_number[N][M]. The two-dimensional array includes: numbering information N corresponding to the data label set and numbering information M corresponding to the label information category. The server determines the maximum number indicated in M two-dimensional arrays as the initial weight of the data label set.
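
One possible reading of the class_number[N][M] construction is a table with one row per data label set and one column per label information category. The sketch below follows that reading, which is an assumption, as are the helper name `build_class_number`, the fixed category order and the second, hypothetical data label set.

```python
def build_class_number(label_sets, categories):
    """Build class_number[n][m]: count of category m in data label set n."""
    index = {category: m for m, category in enumerate(categories)}
    class_number = [[0] * len(categories) for _ in label_sets]
    for n, labels in enumerate(label_sets):
        for label in labels:
            class_number[n][index[label]] += 1
    return class_number


categories = ["game currency", "mobile phone", "water cup"]
label_sets = [
    # Data label set 0: the 8 labels from the earlier example.
    ["game currency"] * 4 + ["mobile phone"] * 3 + ["water cup"],
    # Data label set 1: a hypothetical second scene, for illustration only.
    ["game currency"] * 2 + ["water cup"] * 5,
]
class_number = build_class_number(label_sets, categories)

# The maximum count in each row is the initial weight of that data label set.
initial_weights = [max(row) for row in class_number]
print(class_number)     # [[4, 3, 1], [2, 0, 5]]
print(initial_weights)  # [4, 5]
```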

In the embodiment of the disclosure, the server determines the maximum number information of the M numbers of times as the initial weight of the data label set by counting M label information categories in each data label set, and counting M numbers of times corresponding to the M label information categories. The server determines the initial weight through the maximum number information, which may consider the number of images in the corresponding data label set, so that the determined initial weight better fits the number of images in the corresponding data label set, and the subsequent sampling number better fits the number of images in the data label set.

In some embodiments, referring to FIG. 5, FIG. 5 is an optional flowchart of a data sampling method according to an embodiment of the disclosure. S103 shown in FIG. 1 may be implemented by S108 to S110, which will be described with reference to steps.

At S108, based on the respective corresponding initial weights in multiple scenes, the weight proportion information corresponding to each data label set is calculated.

In the embodiment of the disclosure, the server calculates the weight proportion information corresponding to each data label set based on the respective corresponding initial weight in the multiple scenes.

In the embodiment of the disclosure, the server divides the initial weight corresponding to each data label set by a sum of the initial weights of the multiple scenes to obtain the weight proportion information corresponding to each data label set.

Exemplarily, the server may calculate the weight proportion information $list_{scale}$ corresponding to each data label set by formula (1).

$$list_{scale} = \frac{list_{number[i]}}{\sum_{j=1}^{N} list_{number[j]}} \quad (1)$$

where $list_{number[i]}$ is the initial weight of the i-th data label set, determined from the number information of the label information categories in the data label set. N is the number of multiple data label sets, and $\sum_{j=1}^{N} list_{number[j]}$ is the sum of the initial weights of the N data label sets.

The server divides the initial weight $list_{number[i]}$ of the data label set by the sum $\sum_{j=1}^{N} list_{number[j]}$ of the initial weights of the N data label sets to obtain the weight proportion information $list_{scale}$ corresponding to the data label set.
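
The division of each initial weight by the sum of all initial weights can be sketched as below. The function name `weight_proportions`, the zero-sum check and the three example initial weights are assumptions introduced only for illustration.

```python
def weight_proportions(initial_weights):
    """Divide each initial weight by the sum of the N initial weights."""
    total = sum(initial_weights)
    if total == 0:
        # Guard added for this sketch; the disclosure does not discuss this case.
        raise ValueError("the sum of the initial weights must be non-zero")
    return [w / total for w in initial_weights]


# Hypothetical initial weights of N = 3 data label sets.
props = weight_proportions([4, 5, 11])
print(props)  # [0.2, 0.25, 0.55]
```

The resulting proportions are non-negative and sum to 1, so each lies between 0 and 1.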

At S109, exponential operation is performed on negated weight proportion information to obtain an intermediate value corresponding to each data label set.

In the embodiment of the disclosure, the server performs exponential operation on the negated weight proportion information to obtain the intermediate value corresponding to each data label set.

At S110, negative correlation between the number of times for which the label information category appears and the weights is implemented based on negative correlation processing between the intermediate value and a second constant, to calculate the sampling weight corresponding to each data label set.

In the embodiment of the disclosure, the server implements the negative correlation between the number of times for which the label information category appears and the weights based on negative correlation processing between the intermediate value and the second constant, to calculate the sampling weight corresponding to each data label set.

The second constant may be a positive integer greater than or equal to 1.

In the embodiment of the disclosure, the server may also divide the second constant by the intermediate value to obtain the sampling weight corresponding to each data label set.

Exemplarily, the server may calculate the sampling weight by formula (2).

$$\text{sampling weight} = \frac{1}{1 + e^{-list_{scale}}} \quad (2)$$

where e is a constant with a value of approximately 2.718281828459, and the 1 in the numerator is the second constant. The server negates the weight proportion information $list_{scale}$ to get a value. The server calculates the exponent of the value with e as the base to get the intermediate value. The server adds the intermediate value to 1 to get another value, and the server divides 1 by that other value to get the sampling weight.
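
The calculation described for formula (2) can be sketched as follows. The function name `sampling_weight`, the parameter names, the default values of 1 for both constants and the example initial weights 10, 100 and 1000 are assumptions made for this sketch.

```python
import math


def sampling_weight(scale, second_constant=1.0, preset_constant=1.0):
    """Compute second_constant / (preset_constant + e ** (-scale))."""
    intermediate = math.exp(-scale)  # exponential of the negated proportion
    return second_constant / (preset_constant + intermediate)


# Hypothetical initial weights that differ by a factor of 100.
initial_weights = [10, 100, 1000]
total = sum(initial_weights)
scales = [w / total for w in initial_weights]    # weight proportion information
weights = [sampling_weight(s) for s in scales]   # sampling weights
print([round(w, 4) for w in weights])
```

Because each weight proportion lies between 0 and 1, every sampling weight computed this way falls between 0.5 and about 0.731. In this sketch, initial weights that differ by a factor of 100 therefore yield sampling weights whose ratio is below 1.5.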

In the embodiment of the disclosure, the server calculates the weight proportion information corresponding to each data label set. Exponential operation is performed on the negated weight proportion information to obtain the intermediate value corresponding to each data label set. Then, negative correlation between the number of times for which the label information category appears and the weights is implemented based on negative correlation processing between the intermediate value and the second constant, to calculate the sampling weight corresponding to each data label set. In this process, the server calculates a final sampling weight according to the negative correlation between the number of times of the corresponding label information category and the weight, which narrows the size difference between the weights of the data label sets with different numbers of images, and sample images with smaller number differences may be collected in the data label sets through the corresponding sampling weights, so that the target detection model obtained by training the sample images can improve the detection performance of objects to be detected in scenes or categories with less data amount.
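As a purely illustrative calculation, assume three data label sets whose initial weights are 100, 300 and 600. These numbers are hypothetical and do not come from the disclosure. Writing the weight proportion information as list_scale and the sampling weight as list_weights, and taking the second constant as 1, the steps S108 to S110 give, for the three sets in order:

$$
\begin{aligned}
\text{sum of initial weights} &= 100 + 300 + 600 = 1000 \\
\text{list\_scale} &= 0.1,\ 0.3,\ 0.6 \\
e^{-\text{list\_scale}} &\approx 0.9048,\ 0.7408,\ 0.5488 \\
\text{list\_weights} = \frac{1}{1 + e^{-\text{list\_scale}}} &\approx 0.5250,\ 0.5744,\ 0.6457
\end{aligned}
$$

In this hypothetical case the largest initial weight is 6 times the smallest, whereas the largest sampling weight is only about 1.23 times the smallest, which illustrates how the size difference between the weights is narrowed.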

In some embodiments, referring to FIG. 6, FIG. 6 is an optional flowchart of a data sampling method according to an embodiment of the disclosure. S108 shown in FIG. 5 may be implemented by S111, which will be described with reference to steps.

At S111, the initial weight of each scene is divided by the sum of the initial weights corresponding to the respective scenes to obtain the weight proportion information corresponding to each data label set of each scene.

In the embodiment of the disclosure, the server divides the initial weight of each scene by the sum of the initial weights corresponding to multiple scenes to obtain the weight proportion information corresponding to each data label set of each scene.

In the embodiment of the disclosure, the server calculates the weight proportion information of the initial weight of each scene, and performs sampling on a preset total sampling image library according to the weight proportion information, so that the sampling effect is better.

In some embodiments, referring to FIG. 6, FIG. 6 is an optional flowchart of a data sampling method according to an embodiment of the disclosure. S110 shown in FIG. 5 may be implemented by S112 to S113, which will be described with reference to steps.

At S112, the intermediate value is added to a preset constant to obtain a secondary intermediate value.

In the embodiment of the disclosure, the server adds the intermediate value to the preset constant to obtain the secondary intermediate value.

Herein, the preset constant may be any constant.

At S113, the second constant is divided by the secondary intermediate value to take a reciprocal to implement the negative correlation between the number of times for which the label information category appears and the weights, so as to obtain the sampling weight corresponding to each data label set.

In the embodiment of the disclosure, the server divides the second constant by the secondary intermediate value to take a reciprocal to implement the negative correlation between the number of times for which the label information category appears and the weights, so as to obtain the sampling weight corresponding to each data label set.

Herein, the second constant may be the same as and may also be different from the preset constant.

In the embodiment of the disclosure, the server divides the second constant by the secondary intermediate value corresponding to each data label set to obtain the sampling weight corresponding to each data label set.

In the embodiment of the disclosure, the server divides the second constant by the secondary intermediate value to take a reciprocal to obtain a final sampling weight. The use of the characteristics of a sigmoid function guarantees the sampling weight of each data label set to be within a certain range, so that the subsequent sampling effect is better and the difference is smaller.
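The steps S109, S112 and S113 can be sketched in Python as follows. This is a minimal sketch rather than the disclosed implementation: the function name `sampling_weight` and its parameter names are hypothetical, and both constants default to 1, an assumption under which the expression reduces to the sigmoid function mentioned above.

```python
import math


def sampling_weight(weight_proportion, preset_constant=1.0, second_constant=1.0):
    """Hypothetical helper: sampling weight of one data label set."""
    # S109: exponential operation on the negated weight proportion information.
    intermediate = math.exp(-weight_proportion)
    # S112: add the preset constant to obtain the secondary intermediate value.
    secondary_intermediate = intermediate + preset_constant
    # S113: divide the second constant by the secondary intermediate value.
    return second_constant / secondary_intermediate


# Weight proportions are shares of a total, so they lie between 0 and 1.
weights = [sampling_weight(p) for p in (0.0, 0.1, 0.5, 1.0)]
print(weights)
```

With the default constants, a weight proportion of 0 gives a sampling weight of exactly 0.5, and every proportion up to 1 gives a value below 1, so the results stay inside the range [0.5, 1]. Other choices of the two constants change this range.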

In some embodiments, referring to FIG. 7, FIG. 7 is an optional flowchart of a data sampling method according to an embodiment of the disclosure. S104 shown in FIG. 1 may be implemented by S114 to S116, which will be described with reference to steps.

At S114, the sampling weight corresponding to each scene is divided by the sum of the sampling weights corresponding to the respective scenes to obtain a sampling proportion corresponding to each data label set corresponding to each scene.

In the embodiment of the disclosure, the server divides the sampling weight corresponding to each scene by the sum of the sampling weights corresponding to the respective scenes to obtain a sampling proportion corresponding to each data label set corresponding to each scene.

In the embodiment of the disclosure, the server calculates the proportion information of the sampling weight corresponding to each scene in a plurality of sampling weights, that is, the sampling proportion.

At S115, the sampling proportion is multiplied by the number of preset total sample images to obtain the sampling number of sample images corresponding to each data label set.

In the embodiment of the disclosure, the server multiplies the sampling proportion by the number of the preset total sample images to obtain the sampling number of sample images corresponding to each data label set.

Exemplarily, the number of the preset total sample images may be 200. The sampling proportion corresponding to one data label set is 0.2. The server multiplies 0.2 by 200 to obtain a sampling number 40. The server may perform average sampling on the different images in the corresponding data label set according to the sampling number 40 to sample 40 sample images.

At S116, random sampling is performed on each data label set corresponding to each scene according to the sampling number to obtain the sample image.

In the embodiment of the disclosure, the server performs random sampling on each data label set corresponding to each scene according to the sampling number to obtain the sample image.

Exemplarily, the multiple scenes may include three scenes. The data label sets of the three scenes may respectively include: 100 images, 150 images and 200 images. The number of the preset total sample images may be 1000. The sampling weights corresponding to the data label sets of the three scenes may include: 3, 4 and 5. The server divides the sampling weight of each scene by the sum of the sampling weights of the three scenes, which is 12, and may obtain the sampling proportion of each scene. Through calculation, the server may obtain the sampling proportions of the three scenes: approximately 0.25, 0.33 and 0.416, respectively. The server multiplies the sampling proportions of the three scenes by the number of preset total sample images respectively to obtain the sampling numbers corresponding to the three data label sets of the three scenes: 250, 330 and 416. The server performs average sampling on the different images in the corresponding data label sets according to the three sampling numbers to obtain the sample images corresponding to the three data label sets respectively.

In the embodiment of the disclosure, the server performs average sampling on different images of the corresponding data label sets through the calculated sampling numbers, which reduces the difference of the sampling numbers of the data label sets with different numbers of label information categories and different numbers of images.
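The steps S114 to S116 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the function names, the image identifiers and the `seed` parameter are hypothetical, fractional sampling numbers are truncated to integers, and images are drawn with replacement because the example above requests more sample images than a data label set contains, a point the disclosure does not specify.

```python
import random


def sampling_numbers(sampling_weights, total_samples):
    """Hypothetical helper: sampling number for each data label set."""
    total_weight = sum(sampling_weights)
    # S114: sampling proportion of each data label set.
    proportions = [w / total_weight for w in sampling_weights]
    # S115: multiply by the preset total, truncating to an integer (assumption).
    return [int(p * total_samples) for p in proportions]


def sample_images(data_label_sets, sampling_weights, total_samples, seed=None):
    """Hypothetical helper: draw the sample images for every scene."""
    rng = random.Random(seed)
    counts = sampling_numbers(sampling_weights, total_samples)
    # S116: random sampling, with replacement (assumption).
    return [rng.choices(images, k=count)
            for images, count in zip(data_label_sets, counts)]


# Three scenes with 100, 150 and 200 images, as in the example above.
scenes = [[f"scene{s}_img{i}" for i in range(n)]
          for s, n in enumerate((100, 150, 200))]
samples = sample_images(scenes, [3, 4, 5], 1000, seed=0)
print([len(s) for s in samples])  # [250, 333, 416]
```

The sketch yields 250, 333 and 416 rather than 250, 330 and 416 because it multiplies the unrounded proportion 4/12 by 1000 instead of first shortening it to 0.33.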

In some embodiments, referring to FIG. 8, FIG. 8 is an optional flowchart of a data sampling method according to an embodiment of the disclosure. S107 shown in FIG. 4 may be implemented by S101 to S117, which will be described with reference to steps.

At S117, the maximum number information among M numbers of times corresponding to M two-dimensional arrays is determined as the initial weight of each data label set.

In the embodiment of the disclosure, M numbers of times corresponding to the respective M label information categories may be constructed in form of a two-dimensional array. The server determines the maximum number information of the M numbers of times corresponding to the M two-dimensional arrays as the initial weight of each data label set.

Exemplarily, one of the M numbers of times may be constructed with class_number[N][M]. N in the two-dimensional array is the numbering information of the corresponding data label set, and M is the numbering information of the corresponding label information category. The server determines the corresponding maximum number in M two-dimensional arrays as the initial weight of the data label set. Each two-dimensional array corresponds to one piece of number information. The server determines the maximum number information corresponding to the two-dimensional arrays in the M two-dimensional arrays as the initial weight.

In the embodiment of the disclosure, the server constructs M numbers of times through the two-dimensional array which facilitates the calculation of the server and accelerates the calculation efficiency of sampling.

In the embodiment of the disclosure, the server acquires N data label sets used for training. The data is manually labeled, and the label information is the category and coordinates of each object to be detected. Each data label set includes a plurality of images, and each image corresponds to at least one piece of label information. Each piece of label information may be the category and coordinates of the corresponding object to be detected.

The server calculates the number of times for which each label information category in each data label set appears, and uses a two-dimensional array class_number[N][M], where N is the numbering information of a data label list and M is the numbering information of each label information category.

The server calculates the value of the maximum number of times corresponding to the label information category in the data label list as the initial weight of the data label list. Exemplarily, the server may determine the initial weight list_number[i] by formula (3).

$$\text{list\_number}[i] = \max\left(\text{class\_number}[i][j]\right), \quad 0 < i < N,\ 0 < j < M \tag{3}$$

where list_number[i] is the initial weight of the ith data label set, and class_number[i][j] is the number of times of the jth label information category in the ith data label set, N being the number of the data label sets, and M being the number of the label information categories in the corresponding data label sets. The server determines the maximum number information among the numbers of times of the M label information categories corresponding to the ith data label set as the initial weight list_number[i] of the ith data label set.
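The counting step and the maximum in formula (3) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the data layout (each data label set is a list of images, each image a list of label dictionaries with `"category"` and `"box"` keys), the category names and the function names are all hypothetical and only serve to illustrate the class_number[N][M] array.

```python
from collections import Counter


def build_class_number(data_label_sets, categories):
    """class_number[i][j]: times category j appears in data label set i."""
    class_number = []
    for label_set in data_label_sets:
        # Count every piece of label information across all images of the set.
        counts = Counter(
            label["category"]
            for image_labels in label_set
            for label in image_labels
        )
        class_number.append([counts[category] for category in categories])
    return class_number


def initial_weights(class_number):
    """list_number[i]: maximum count among the categories of set i."""
    return [max(row) for row in class_number]


categories = ["category_a", "category_b"]
data_label_sets = [
    [  # data label set 0: two images
        [{"category": "category_a", "box": (0, 0, 10, 10)},
         {"category": "category_a", "box": (5, 5, 20, 20)},
         {"category": "category_b", "box": (1, 2, 3, 4)}],
        [{"category": "category_a", "box": (2, 2, 8, 8)}],
    ],
    [  # data label set 1: one image
        [{"category": "category_b", "box": (0, 0, 4, 4)}],
    ],
]
class_number = build_class_number(data_label_sets, categories)
list_number = initial_weights(class_number)
print(class_number)  # [[3, 1], [0, 1]]
print(list_number)   # [3, 1]
```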

The server reduces the difference between the weight values of each data label set, that is, reduces the weight of the data list with large data amount and increases the weight of the data list with small data amount.

The server may calculate the proportion of the weight of each data label set in the total weight by formula (1).

$$\text{list\_scale} = \frac{\text{list\_number}[i]}{\sum_{k} \text{list\_number}[k]} \tag{1}$$

where list_number[i] is the initial weight of the ith data label set, that is, the largest number of times for which a label information category appears in that data label set. N is the number of multiple data label sets, and $\sum_{k} \text{list\_number}[k]$ is the sum of the initial weights of the N data label sets.

The server divides the initial weight list_number[i] of the data label set by the sum of the initial weights of the N data label sets to obtain the weight proportion information list_scale corresponding to the data label set. The server calculates the sampling weight of each data label set. Here, the sampling weight range of each list is kept in [0.5,1] by use of the characteristics of the sigmoid function.

The server may calculate the sampling weight list_weights by formula (2).

$$\text{list\_weights} = \frac{1}{1 + e^{-\text{list\_scale}}} \tag{2}$$

where e is a constant with a value of 2.718281828459, and 1 is the second constant. The server negates the weight proportion information list_scale to get a value. The server calculates the exponent of the value with e as the base to get the intermediate value. The server adds the intermediate value to 1 to get another value, and divides 1 by that value to get the sampling weight list_weights.

The server assigns the obtained sampling weights list_weights to each data label set for use. A value is assigned to the corresponding data label set, and average sampling is performed.

The corresponding initial weight is calculated by the number of times for which each category of the label information categories appears, and the initial weight corresponding to each scene is calculated according to the negative correlation of the number of times of the corresponding label information category to obtain the sampling weight. Therefore, the solution narrows the numerical gap between the sampling weights corresponding to the data label sets with different numbers of label information categories, and sample images with smaller number differences may be collected in the data label sets through the corresponding sampling weights, so that the target detection model obtained by training the sample images can improve the detection performance of objects to be detected in scenes or categories with less data amount.

FIG. 9 is a structural schematic diagram of a data sampling apparatus according to an embodiment of the disclosure.

In the embodiment of the disclosure, the data sampling apparatus 800 includes: a data acquisition unit 803, a weight determination unit 804, a weight adjusting unit 805 and a sampling unit 806.

The data acquisition unit 803 is configured to acquire data label sets corresponding to respective scenes.

The weight determination unit 804 is configured to, for each scene, count, based on multiple pieces of label information corresponding to different images in the respective data label sets, the number of times for which each label information category of M label information categories corresponding to the respective data label sets appears, and determine initial weights of the respective data label sets. Herein, M is an integer greater than or equal to 1.

The weight adjusting unit 805 is configured to adjust, based on the negative correlation between the number of times for which the label information category appears and the weights, the initial weight in each scene to obtain a sampling weight of the data label set in each scene.

The sampling unit 806 is configured to sample, based on the sampling weight, the different images in the data label set of each scene to obtain sample images. The sample images are used for determining training samples to be used in a process of training a target detection model.

In the embodiment of the disclosure, the weight determination unit 804 in the data sampling apparatus 800 is further configured to count, for each scene, M label information categories corresponding to each data label set according to the multiple pieces of label information corresponding to each data label set; count, in the different images, M numbers of times corresponding to the respective M label information categories; and determine the maximum number information of the M numbers of times as the initial weight corresponding to each data label set.

In the embodiment of the disclosure, the weight adjusting unit 805 in the data sampling apparatus 800 is further configured to, calculate, based on the respective corresponding initial weight in multiple scenes, the weight proportion information corresponding to each data label set; perform exponential operation on the negated weight proportion information to obtain an intermediate value corresponding to each data label set; and implement negative correlation between the number of times for which the label information category appears and the weights based on negative correlation processing between the intermediate value and a second constant, to calculate the sampling weight corresponding to each data label set. The second constant is a positive integer greater than or equal to 1.

In the embodiment of the disclosure, the weight adjusting unit 805 in the data sampling apparatus 800 is further configured to divide the initial weight of each scene by the sum of the initial weights corresponding to multiple scenes to obtain the weight proportion information corresponding to each data label set of each scene.

In the embodiment of the disclosure, the weight adjusting unit 805 in the data sampling apparatus 800 is further configured to add the intermediate value to a preset constant to obtain a secondary intermediate value; divide the second constant by the secondary intermediate value to take a reciprocal to implement the negative correlation between the number of times for which the label information category appears and the weights, so as to obtain the sampling weight corresponding to each data label set.

In the embodiment of the disclosure, the sampling unit 806 in the data sampling apparatus 800 is further configured to divide the sampling weight corresponding to each scene by the sum of the sampling weights corresponding to multiple scenes to obtain a sampling proportion corresponding to each data label set corresponding to each scene; multiply the sampling proportion by the number of preset total sample images to obtain the sampling number of the sample images corresponding to each data label set; and perform random sampling on each data label set corresponding to each scene according to the sampling number to obtain the sample images.

In the embodiment of the disclosure, M numbers of times corresponding to the respective M label information categories are constructed in form of a two-dimensional array.

The weight determination unit 804 of the data sampling apparatus 800 is further configured to determine the maximum number information of the M numbers of times corresponding to M two-dimensional arrays as the initial weight of each data label set.

In the embodiment of the disclosure, each two-dimensional array in the M two-dimensional arrays includes: numbering information of each corresponding data label set and the numbering information of the corresponding label information category.

In the embodiment of the disclosure, the data label sets corresponding to respective scenes are acquired by the data acquisition unit 803. For each scene, based on multiple pieces of label information corresponding to different images in the respective data label sets, the number of times for which each label information category of M label information categories corresponding to the respective data label sets appears is counted by the weight determination unit 804, and the initial weights of the respective data label sets are determined, M being an integer greater than or equal to 1. Based on the negative correlation between the number of times for which the label information category appears and the weights, the initial weight of each scene is adjusted by the weight adjusting unit 805 to obtain a sampling weight of the data label set of each scene. Sampling is performed on the different images in the data label set of each scene by the sampling unit 806 based on the sampling weight to obtain the sample images, and the sample images are used for determining training samples to be used in the process of training the target detection model. The corresponding initial weight is calculated from the number of times for which each label information category appears, and the initial weight corresponding to each scene is adjusted according to the negative correlation with the number of times of the corresponding label information category to obtain the sampling weight. Therefore, the solution narrows the numerical gap between the sampling weights corresponding to the data label sets with different numbers of label information categories, and sample images with smaller number differences may be collected in the data label sets through the corresponding sampling weights, so that the target detection model obtained by training on the sample images can improve the detection performance of objects to be detected in scenes or categories with less data amount.

It is to be noted that, in the embodiment of the disclosure, when being implemented in the form of a software function module and sold or used as an independent product, the data sampling method may also be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of the embodiments of the disclosure substantially or parts making contributions to the related art may be embodied in form of software product, and the computer software product is stored in a storage medium, including a plurality of instructions configured to enable a data sampling apparatus (which may be a personal computer and the like) to execute all or part of the steps of the method in each embodiment of the present disclosure. The storage medium includes: various media capable of storing program codes such as a U disk, a mobile hard disk, a Read Only Memory (ROM), a magnetic disk or an optical disk. By doing so, the embodiments of the present disclosure are not limited to any specific combination of hardware and software.

Correspondingly, the embodiments of the disclosure provide a computer readable storage medium, on which a computer program is stored. When executed by a processor, the computer program implements the steps of the above-mentioned method.

Correspondingly, the embodiments of the disclosure provide a data sampling apparatus, including a memory 802 and a processor 801. The memory 802 stores a computer program which may run on the processor 801. When executing the program, the processor 801 implements the steps of the above-mentioned method.

It is to be noted that the descriptions of the above storage medium and device embodiment are similar to the descriptions of the above method embodiment and have similar beneficial effects as the method embodiment. Technical details undisclosed in the storage medium and device embodiment of the disclosure are understood with reference to the descriptions about the method embodiment of the disclosure.

It is to be noted that FIG. 10 is a schematic diagram of a hardware entity of a data sampling apparatus according to an embodiment of the disclosure. As shown in FIG. 10, the hardware entity of the data sampling apparatus 800 includes: a processor 801 and a memory 802.

The processor 801 generally controls the overall operation of the data sampling apparatus 800.

The memory 802 is configured to store an instruction and an application executable by the processor 801, and may also cache data (for example, image data, audio data, voice communication data and video communication data) to be processed or already processed by the processor 801 and each module in the data sampling apparatus 800, which may be implemented by FLASH or a Random Access Memory (RAM).

It is to be understood that "one embodiment" and "an embodiment" mentioned in the whole specification mean that specific features, structures or characteristics related to the embodiments are included in at least one embodiment of the disclosure. Therefore, "in one embodiment" or "in an embodiment" mentioned throughout the specification does not always refer to the same embodiment. Furthermore, these particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. It is to be understood that, in each embodiment of the disclosure, a magnitude of a sequence number of each process does not mean an execution sequence and the execution sequence of each process should be determined by its function and an internal logic and should not form any limit to an implementation process of the embodiments of the disclosure. The sequence numbers of the embodiments of the disclosure are adopted not to represent superiority-inferiority of the embodiments but only for description.

It is to be noted that the terms "include" and "contain" or any other variant thereof are intended to cover nonexclusive inclusions herein, so that a process, method, object or device including a series of elements not only includes those elements but also includes other elements which are not clearly listed or further includes elements intrinsic to the process, the method, the object or the device. Under the condition of no more limitations, it is not excluded that additional identical elements further exist in the process, method, article or device including elements defined by a sentence "including a

In the several embodiments provided in the disclosure, it should be understood that the disclosed apparatus and method may be implemented in other manners. The device embodiment described above is only schematic, and for example, division of the units is only logic function division, and other division manners may be adopted during practical implementation. For example, multiple units or components may be combined or integrated into another system, or some characteristics may be neglected or not executed. In addition, coupling or direct coupling or communication connection between each displayed or discussed component may be indirect coupling or communication connection, implemented through some interfaces, of the apparatus or the units, and may be electrical and mechanical or adopt other forms.

The units described as separate parts may or may not be physically separated, and parts displayed as units may or may not be physical units, and namely may be located in the same place, or may also be distributed to multiple network units. Part or all of the units may be selected to achieve the purpose of the solutions of the embodiments according to a practical requirement.

In addition, each function unit in each embodiment of the disclosure may be integrated into a processing unit, each unit may also serve as an independent unit and two or more than two units may also be integrated into a unit. The integrated unit may be implemented in a hardware form and may also be implemented in form of hardware and software function unit.

Those of ordinary skill in the art should know that: all or part of the steps of the above-mentioned method embodiment may be implemented by instructing related hardware through a program, the above-mentioned program may be stored in a computer-readable storage medium, and the program is executed to execute the steps of the above-mentioned method embodiment; and the storage medium includes: various media capable of storing program codes such as mobile storage equipment, a ROM, a magnetic disk or an optical disc.

Or, the integrated unit of the present disclosure may also be stored in a computer readable storage medium if being implemented in the form of a software functional module and sold or used as a standalone product. Based on such an understanding, the technical solutions of the embodiments of the disclosure substantially or parts making contributions to the related art may be embodied in form of software product, and the computer software product is stored in a storage medium, including a plurality of instructions configured to enable a computer device (which may be a personal computer, a server, a network apparatus or the like) to execute all or part of the steps of the method in each embodiment of the present disclosure. The above-mentioned storage medium includes: various media capable of storing program codes such as mobile storage equipment, a ROM, a magnetic disk or an optical disc.

The above is only the implementation mode of the present disclosure and not intended to limit the scope of protection of the present disclosure. Any variations or replacements apparent to those skilled in the art within the technical scope disclosed by the present disclosure shall fall within the scope of protection of the present disclosure. Therefore, the scope of protection of the disclosure shall be subject to the scope of protection of the claims.

Claims

1. A data sampling method, comprising: acquiring data label sets corresponding to respective scenes; for each scene, counting, based on multiple pieces of label information corresponding to different images in the respective data label sets, a number of times for which each label information category of M label information categories corresponding to the respective data label sets appears, and determining initial weights of the respective data label sets, M being an integer greater than or equal to 1; adjusting, based on negative correlation between the number of times for which the label information category appears and the weights, the initial weight in each scene to obtain a sampling weight of the data label set in each scene; and sampling, based on the sampling weight, the different images in the data label set of each scene to obtain sample images, the sample images being used for determining training samples to be used in a process of training a target detection model.

2. The data sampling method of claim 1, wherein for each scene, counting, based on multiple pieces of label information corresponding to different images in the respective data label sets, the number of times for which each label information category of M label information categories corresponding to the respective data label sets appears, and determining initial weights of the respective data label sets comprises: for each scene, counting, according to the multiple pieces of label information corresponding to each data label set, the M label information categories corresponding to each data label set; counting M numbers of times corresponding to the respective M label information categories in the different images; and determining maximum number information of the M numbers of times as the initial weight corresponding to each data label set.

3. The data sampling method of claim 1 or 2, wherein adjusting, based on the negative correlation between the number of times for which the label information category appears and the weights, the initial weight in each scene to obtain a sampling weight of the data label set in each scene comprises: calculating, based on the initial weights corresponding to the respective scenes, weight proportion information corresponding to each data label set; performing exponential operation on negated weight proportion information to obtain an intermediate value corresponding to each data label set; implementing, based on negative correlation processing between the intermediate value and a second constant, the negative correlation between the number of times for which the label information category appears and the weights to calculate the sampling weight corresponding to each data label set, the second constant being a positive integer greater than or equal to 1.

4. The data sampling method of claim 3, wherein calculating, based on the initial weights corresponding to the respective scenes, weight proportion information corresponding to each data label set comprises: dividing the initial weight of each scene by a sum of the initial weights corresponding to the respective scenes to obtain the weight proportion information corresponding to each data label set of each scene.

5. The data sampling method of claim 3 or 4, wherein implementing, based on negative correlation processing between the intermediate value and the second constant, the negative correlation between the number of times for which the label information category appears and the weights to calculate the sampling weight corresponding to each data label set comprises: adding the intermediate value to a preset constant to obtain a secondary intermediate value; and implementing the negative correlation between the number of times for which the label information category appears and the weights by dividing the second constant by the secondary intermediate value to obtain a value and taking a reciprocal of the value, to obtain the sampling weight corresponding to each data label set.

6. The data sampling method according to any one of claims 1-5, wherein sampling, based on the sampling weight, the different images in the data label set of each scene to obtain sample images comprises: dividing the sampling weight corresponding to each scene by the sum of the sampling weights corresponding to the respective scenes to obtain a sampling proportion corresponding to each data label set corresponding to each scene; multiplying the sampling proportion by a number of preset total sample images to obtain the sampling number of sample images corresponding to each data label set; and performing, according to the sampling number, random sampling on each data label set corresponding to each scene to obtain the sample image.
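The per-scene sampling of claim 6 might look like the following sketch. The function name `sample_per_scene`, the dictionary layout of `scene_images`, the truncation of fractional counts, and the cap at the number of available images are all assumptions made for illustration; the claim does not specify how fractional sampling numbers are rounded or whether sampling is done with replacement.

```python
import random


def sample_per_scene(scene_images, sampling_weights, total_samples, seed=None):
    """Randomly sample images from each scene in proportion to its sampling weight.

    scene_images maps a (hypothetical) scene name to a list of images;
    sampling_weights lists one weight per scene in the same order.
    """
    rng = random.Random(seed)
    weight_sum = sum(sampling_weights)
    samples = {}
    for (scene, images), weight in zip(scene_images.items(), sampling_weights):
        # Sampling proportion = scene weight / sum of all scene weights.
        proportion = weight / weight_sum
        # Sampling number = proportion * preset total number of sample images.
        # Assumption: fractional counts are truncated.
        count = int(proportion * total_samples)
        # Assumption: sampling without replacement, so cap at what is available.
        count = min(count, len(images))
        samples[scene] = rng.sample(images, count)
    return samples
```

For example, two scenes with sampling weights 1.0 and 3.0 and a preset total of 40 sample images would contribute 10 and 30 images respectively.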

7. The data sampling method of claim 2, wherein M numbers of times corresponding to the respective M label information categories are constructed in the form of a two-dimensional array, wherein determining the maximum number information of the M numbers of times as the initial weight corresponding to each data label set comprises: determining the maximum number information of the M numbers of times corresponding to M two-dimensional arrays as the initial weight of each data label set.

8. The data sampling method of claim 7, wherein each two-dimensional array in the M two-dimensional arrays comprises: numbering information corresponding to each data label set and numbering information corresponding to a label information category.

9. A data sampling apparatus, comprising: a data acquisition unit, configured to acquire data label sets corresponding to respective scenes; a weight determination unit, configured to, for each scene, count, based on multiple pieces of label information corresponding to different images in the respective data label sets, a number of times for which each label information category of M label information categories corresponding to the respective data label sets appears, and determine initial weights of the respective data label sets, M being an integer greater than or equal to 1; a weight adjusting unit, configured to adjust, based on negative correlation between the number of times for which the label information category appears and the weights, an initial weight in each scene to obtain a sampling weight of the data label set in each scene; and a sampling unit, configured to sample, based on the sampling weight, the different images in the data label set of each scene to obtain sample images, the sample images being used for determining training samples to be used in a process of training a target detection model.

10. The data sampling apparatus of claim 9, wherein the weight determination unit is configured to: for each scene, count, according to the multiple pieces of label information corresponding to each data label set, the M label information categories corresponding to each data label set; count M numbers of times corresponding to the respective M label information categories in the different images; and determine maximum number information of the M numbers of times as the initial weight corresponding to each data label set.
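The initial-weight determination performed by the weight determination unit can be sketched as follows. The function name `initial_weight`, the representation of a data label set as a list of per-image label lists, and the example category names are hypothetical and serve only to illustrate the counting step.

```python
from collections import Counter


def initial_weight(labels_per_image):
    """Return the largest per-category count in one scene's data label set.

    labels_per_image is a list with one entry per image, each entry being the
    list of label information categories annotated on that image (an assumed
    layout; the claims do not prescribe a data structure).
    """
    counts = Counter()
    for labels in labels_per_image:
        # Count every appearance of each label information category.
        counts.update(labels)
    # The maximum of the M counts is taken as the scene's initial weight.
    return max(counts.values()) if counts else 0
```

With the hypothetical label set `[["chip", "card"], ["chip"], ["chip", "hand"]]`, the category `chip` appears three times, so the initial weight of that scene is 3.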

11. The data sampling apparatus of claim 9 or 10, wherein the weight adjusting unit is configured to: calculate, based on the initial weights corresponding to the respective scenes, weight proportion information corresponding to each data label set; perform an exponential operation on the negated weight proportion information to obtain an intermediate value corresponding to each data label set; and implement, based on negative correlation processing between the intermediate value and a second constant, the negative correlation between the number of times for which the label information category appears and the weights to calculate the sampling weight corresponding to each data label set, the second constant being a positive integer greater than or equal to 1.

12. The data sampling apparatus of claim 11, wherein the weight adjusting unit is configured to: divide the initial weight of each scene by a sum of the initial weights corresponding to the respective scenes to obtain the weight proportion information corresponding to each data label set of each scene.

13. The data sampling apparatus of claim 11 or 12, wherein the weight adjusting unit is configured to: add the intermediate value to a preset constant to obtain a secondary intermediate value; and implement the negative correlation between the number of times for which the label information category appears and the weights by dividing the second constant by the secondary intermediate value to obtain a value and taking a reciprocal of the value, to obtain the sampling weight corresponding to each data label set.

14. The data sampling apparatus according to any one of claims 9-13, wherein the weight adjusting unit is configured to: divide the sampling weight corresponding to each scene by the sum of the sampling weights corresponding to the respective scenes to obtain a sampling proportion corresponding to each data label set corresponding to each scene; multiply the sampling proportion by a number of preset total sample images to obtain the sampling number of sample images corresponding to each data label set; and perform, according to the sampling number, random sampling on each data label set corresponding to each scene to obtain the sample image.

15. The data sampling apparatus of claim 10, wherein M numbers of times corresponding to the respective M label information categories are constructed in the form of a two-dimensional array, wherein the weight determination unit is configured to: determine the maximum number information of the M numbers of times corresponding to M two-dimensional arrays as the initial weight of each data label set.

16. The data sampling apparatus of claim 15, wherein each two-dimensional array in the M two-dimensional arrays comprises: numbering information corresponding to each data label set and numbering information corresponding to a label information category.

17. A data sampling apparatus, comprising a memory and a processor, wherein the memory stores a processor-executable computer program that, when executed by the processor, implements the steps of the method of any one of claims 1-8.

18. A computer readable storage medium having stored thereon a computer program that, when executed by a processor, implements the steps of the method of any one of claims 1-8.