CN113158911A - Data generation method and device - Google Patents

Publication number
CN113158911A
Authority
CN
China
Prior art keywords
data
hand skeleton
training
variance
encoder
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110448065.2A
Other languages
Chinese (zh)
Inventor
古迎冬
李骊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing HJIMI Technology Co Ltd
Original Assignee
Beijing HJIMI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing HJIMI Technology Co Ltd
Priority to CN202110448065.2A
Publication of CN113158911A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06T5/70
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language

Abstract

The invention provides a data generation method and device. The method comprises: acquiring first depth map data, wherein the first depth map data comprises a first hand skeleton map and labeling data for each joint point; adding noise data to the first hand skeleton map and standardizing it to obtain a second hand skeleton map; and labeling the second hand skeleton map with the labeling data to generate second depth map data. In the invention, Gaussian noise is added to the first hand skeleton map in each first depth map and the result is standardized, which changes part of the data in the original hand skeleton map and yields a second hand skeleton map. Each joint in the second hand skeleton map is similar to the corresponding joint in the first hand skeleton map, and combining the second hand skeleton map with the labeling data generates entirely new second depth map data. New data is therefore produced from limited data without destroying the original hand skeleton map, and the newly generated data is more natural and varied, which meets the diversity required of large training sets.

Description

Data generation method and device
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data generation method and apparatus.
Background
With the rapid development of computer technology, application scenarios that use depth data for 3D gesture interaction must first detect and track the positions of the user's finger joint points. Detecting and tracking each finger joint point remains a difficulty to be overcome.
In the prior art, detection and tracking of finger joint positions generally relies on a deep learning network model trained on a large amount of collected depth data; the trained model then detects and tracks the joint points. However, acquiring the training data usually consumes considerable resources and time, and the labeling of each joint point is subject to some deviation. In particular, when collecting and labeling depth map data of hand joint positions, hand gestures vary widely and every person has individual characteristics, so that even the same gesture is performed differently by different people. The diversity of the training data therefore needs to be increased if the deep learning network model is to adapt to more gesture variations.
Disclosure of Invention
In view of the above, the present invention provides a data generation method that adds new data from limited data without destroying the original hand skeleton map. The newly generated data is more natural and varied, and meets the diversity required of large training sets.
The invention also provides a data generation device which is used for ensuring the realization and the application of the method in practice.
A method of data generation, comprising:
acquiring first depth map data, wherein the first depth map data comprise a three-dimensional first hand skeleton map and labeling data for labeling each joint point in the first hand skeleton map;
adding noise data into each first hand skeleton diagram, and carrying out standardization processing according to a preset standardization rule to obtain a second hand skeleton diagram corresponding to each first hand skeleton diagram;
and labeling the labeling data corresponding to each first hand skeleton map into a second hand skeleton map corresponding to the first hand skeleton map to generate second depth map data.
In the above method, optionally, the adding noise data to each of the first hand skeleton diagrams and performing normalization processing according to a preset normalization rule to obtain a second hand skeleton diagram corresponding to each of the first hand skeleton diagrams includes:
starting a pre-trained encoder, and applying a mean variance calculation module in the encoder to perform encoding calculation on each first hand skeleton diagram to obtain a first mean value and a first variance corresponding to each first hand skeleton diagram;
applying a Gaussian noise module in the encoder, and adding noise data to the first mean and the first variance corresponding to each first hand skeleton map to obtain a second mean and a second variance corresponding to each first hand skeleton map;
normalizing each second mean value and each second variance based on the normalization processing rule to obtain each third mean value and each third variance;
and generating a second hand skeleton map corresponding to each first hand skeleton map based on each third mean value and each third variance.
Optionally, the normalizing, based on the normalization processing rule, the second mean and the second variance to obtain third mean and third variance includes:
determining a preset parameter standard of standardized normal distribution;
and adjusting each second mean value and each second variance according to the parameter standard to obtain each third mean value and each third variance.
In the above method, optionally, the labeling data corresponding to each first hand skeleton map is labeled to a second hand skeleton map corresponding to the first hand skeleton map to generate each second depth map data, and the method includes:
and identifying each joint point in each second hand skeleton map by using a preset generator, inputting the labeling data corresponding to each first hand skeleton map into the generator, and using the generator to label each joint point of the corresponding second hand skeleton map with that labeling data to obtain each second depth map data.
In the above method, optionally, the process of training the encoder includes:
acquiring training data, wherein each training data is a hand skeleton map for training the encoder;
sequentially inputting each training data into the encoder so that the encoder trains on the training data input each time, until the encoder meets a preset convergence condition;
wherein, after each training data is input into the encoder: obtaining a first training result for the currently input training data, the first training result being a training mean and a training variance obtained by calculating the mean and variance of the training data and adding noise data; standardizing the first training result according to the standardization rule to obtain a second training result, and generating sample data corresponding to the training data based on the second training result; calculating a loss function value from the training data and the sample data; judging whether the loss function value meets the preset convergence condition; if not, adjusting model parameters of the mean variance calculation module and the Gaussian noise module in the encoder according to the loss function value; and if so, stopping the input of training data and ending the training of the encoder.
A data generation apparatus, comprising:
a first obtaining unit, configured to obtain first depth map data, where the first depth map data includes a three-dimensional first hand skeleton map and labeling data for labeling each joint point in the first hand skeleton map;
a processing unit, configured to add noise data to each of the first hand skeleton maps and perform normalization processing according to a preset normalization rule to obtain a second hand skeleton map corresponding to each of the first hand skeleton maps;
and a generating unit, configured to label the second hand skeleton map corresponding to each first hand skeleton map with the labeling data corresponding to that first hand skeleton map, to generate second depth map data.
The above apparatus, optionally, the processing unit includes:
the calculation subunit is configured to start a pre-trained encoder, and perform encoding calculation on each first hand skeleton diagram by using a mean variance calculation module in the encoder to obtain a first mean and a first variance corresponding to each first hand skeleton diagram;
a first processing subunit, configured to apply a Gaussian noise module in the encoder and add noise data to the first mean and the first variance corresponding to each first hand skeleton map, to obtain a second mean and a second variance corresponding to each first hand skeleton map;
a second processing subunit, configured to perform normalization processing on each second mean and each second variance based on the normalization processing rule, so as to obtain each third mean and each third variance;
and a first generating subunit, configured to generate a second hand skeleton map corresponding to each first hand skeleton map based on each third mean and each third variance.
The above apparatus, optionally, the second processing subunit includes:
the determining subunit is used for determining a preset parameter standard of the standardized normal distribution;
and the adjusting subunit is configured to adjust each second mean value and each second variance according to the parameter standard to obtain each third mean value and each third variance.
The above apparatus, optionally, the generating unit includes:
a subunit configured to identify each joint point in each second hand skeleton map by using a preset generator, input the labeling data corresponding to each first hand skeleton map into the generator, and use the generator to label each joint point of the corresponding second hand skeleton map with that labeling data to obtain each second depth map data.
The above apparatus, optionally, further comprises:
the second acquisition unit is used for acquiring each training data, and each training data is a hand skeleton diagram for training the encoder;
the training unit is used for sequentially inputting each training data into the encoder so that the encoder trains on the training data input each time, until the encoder meets a preset convergence condition; wherein, after each training data is input into the encoder: obtaining a first training result for the currently input training data, the first training result being a training mean and a training variance obtained by calculating the mean and variance of the training data and adding noise data; standardizing the first training result according to the standardization rule to obtain a second training result, and generating sample data corresponding to the training data based on the second training result; calculating a loss function value from the training data and the sample data; judging whether the loss function value meets the preset convergence condition; if not, adjusting model parameters of the mean variance calculation module and the Gaussian noise module in the encoder according to the loss function value; and if so, stopping the input of training data and ending the training of the encoder.
A storage medium, the storage medium comprising stored instructions, wherein when the instructions are executed, a device in which the storage medium is located is controlled to execute the above data generation method.
An electronic device comprising a memory and one or more instructions, wherein the one or more instructions are stored in the memory and configured to be executed by one or more processors to perform the above data generation method.
Compared with the prior art, the invention has the following advantages:
The invention provides a data generation method, which comprises the following steps: acquiring first depth map data, wherein the first depth map data comprise a three-dimensional first hand skeleton map and labeling data for labeling each joint point in the first hand skeleton map; adding noise data to each first hand skeleton map, and standardizing it according to a preset standardization rule to obtain a second hand skeleton map corresponding to each first hand skeleton map; and labeling the second hand skeleton map corresponding to each first hand skeleton map with the labeling data of that first hand skeleton map to generate second depth map data. In the method provided by the embodiment of the invention, Gaussian noise is added to the first hand skeleton map in each original first depth map and the result is standardized, which changes part of the data in the original hand skeleton map and yields a second hand skeleton map. Each joint in the derived second hand skeleton map is similar to the corresponding joint of the first hand skeleton map but differs in form, and brand-new second depth map data is generated once the labeling data corresponding to the first hand skeleton map is applied. Therefore, by applying the method provided by the embodiment of the invention, new data is added from limited data without destroying the original hand skeleton map, and the newly generated data is more natural and varied, which meets the diversity required of large training sets.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description are only embodiments of the present invention, and those skilled in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a flowchart of a data generation method according to an embodiment of the present invention;
Fig. 2 is another flowchart of a data generation method according to an embodiment of the present invention;
Fig. 3 is a structural diagram of a data generation device according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In this application, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions, and the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The invention is operational with numerous general purpose or special purpose computing device environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multi-processor apparatus, distributed computing environments that include any of the above devices or equipment, and the like.
An embodiment of the present invention provides a data generating method, where the method may be applied to multiple system platforms, an execution subject of the method may be a computer terminal or a processor of various mobile devices, and a flowchart of the method is shown in fig. 1, and specifically includes:
S101: Acquire each first depth map data.
The first depth map data comprises a three-dimensional first hand skeleton map and labeling data for labeling each joint point in the first hand skeleton map.
In the embodiment of the invention, the first hand skeleton map is 3D depth point cloud data that represents the shape of the hand in 3D form. Each joint point of each first hand skeleton map has labeling data, which is the coordinate position of that joint point in the first hand skeleton map to which it belongs.
In the hand skeleton map of the present invention, there may be 21 joint points. If some joint points are occluded or not shown in the first hand skeleton map, the labeling data corresponding to those joint points is (0, 0, 0).
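The data layout described above can be sketched as a small container type. This is only an illustration under stated assumptions: the names `DepthMapData`, `skeleton_map`, `joint_labels`, and `visible_joints` are hypothetical and do not appear in the patent, and the point cloud is reduced to a plain list of coordinates.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point3D = Tuple[float, float, float]
NUM_JOINTS = 21                      # the description says there may be 21 joint points
OCCLUDED: Point3D = (0.0, 0.0, 0.0)  # label used for an occluded or unshown joint


@dataclass
class DepthMapData:
    """Hypothetical container: a 3D hand point cloud plus per-joint labels."""
    skeleton_map: List[Point3D]  # 3D depth point cloud of the hand
    joint_labels: List[Point3D]  # one coordinate label per joint point

    def __post_init__(self) -> None:
        if len(self.joint_labels) != NUM_JOINTS:
            raise ValueError(f"expected {NUM_JOINTS} joint labels")

    def visible_joints(self) -> List[int]:
        """Indices of joints whose label is not the occlusion placeholder."""
        return [i for i, p in enumerate(self.joint_labels) if p != OCCLUDED]


labels = [(float(i + 1), 2.0, 3.0) for i in range(NUM_JOINTS)]
labels[4] = OCCLUDED  # pretend joint 4 is hidden behind another finger
sample = DepthMapData(skeleton_map=[(0.1, 0.2, 0.3)], joint_labels=labels)
print(sample.visible_joints())
```

Treating (0, 0, 0) as a sentinel means a joint that genuinely sits at the origin cannot be distinguished from an occluded one; the sketch simply follows the convention stated in the description.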
S102: noise data is added to each first hand skeleton map, and normalization processing is performed according to a preset normalization rule, so that a second hand skeleton map corresponding to each first hand skeleton map is obtained.
In the embodiment of the present invention, the noise data added to the first hand skeleton map may be additive noise data or multiplicative noise data. Noise data is added to the first hand skeleton map to change its image data, thereby changing the form, size, style, and the like of the hand skeleton in the image. The first hand skeleton map with noise data added is then standardized so that its changed image data conforms to the set standardization rule; the first hand skeleton map after Gaussian noise addition and standardization is the second hand skeleton map.
The normalization rule may be to ensure that the corresponding variance and mean in the hand skeleton diagram satisfy a standard normal distribution.
Each second hand skeleton map is highly similar to its corresponding first hand skeleton map; noise addition and standardization change only attributes such as the size, position, or orientation of the hand skeleton. For example, if the first hand skeleton map shows a hand with five fingers spread open, the second hand skeleton map obtained after noise addition and standardization still shows a hand with five fingers spread open, but the size or position of the hand has changed.
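The difference between additive and multiplicative noise mentioned in S102 can be sketched as follows. This is a minimal illustration, assuming the noise is applied directly and independently to each point coordinate; the function names and the `sigma` parameter are hypothetical, and the patent itself applies noise inside a trained encoder rather than in this direct way.

```python
import random
from typing import List, Tuple

Point3D = Tuple[float, float, float]


def add_additive_noise(points: List[Point3D], sigma: float,
                       rng: random.Random) -> List[Point3D]:
    """Shift every coordinate by an independent zero-mean Gaussian offset."""
    return [(x + rng.gauss(0.0, sigma),
             y + rng.gauss(0.0, sigma),
             z + rng.gauss(0.0, sigma)) for x, y, z in points]


def add_multiplicative_noise(points: List[Point3D], sigma: float,
                             rng: random.Random) -> List[Point3D]:
    """Scale every coordinate by a Gaussian factor centred on 1.0."""
    return [(x * (1.0 + rng.gauss(0.0, sigma)),
             y * (1.0 + rng.gauss(0.0, sigma)),
             z * (1.0 + rng.gauss(0.0, sigma))) for x, y, z in points]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
hand = [(10.0, 20.0, 20.0), (12.0, 24.0, 19.0)]
noisy_add = add_additive_noise(hand, sigma=0.5, rng=rng)
noisy_mul = add_multiplicative_noise(hand, sigma=0.05, rng=rng)
```

Additive noise perturbs all points by a similar absolute amount, while multiplicative noise perturbs points in proportion to their coordinate values, so points far from the origin move more.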
S103: and labeling the labeling data corresponding to each first hand skeleton map into a second hand skeleton map corresponding to the first hand skeleton map to generate second depth map data.
In the embodiment of the invention, after obtaining each second hand skeleton map, each annotation data in each first hand skeleton map is added to the corresponding second hand skeleton map, so as to generate new depth map data, namely, second depth map data.
It can be understood that the labeling data of each joint point in the first hand skeleton map is consistent with the labeling data of the same joint point in the corresponding second hand skeleton map. For example, if the labeling data of the wrist joint point in the first hand skeleton map is (10, 20, 20), the labeling data of the wrist joint point in the corresponding second hand skeleton map is also (10, 20, 20).
In the data generation method provided by the embodiment of the invention, when data needs to be added, each existing first depth map data is acquired, noise data is added to the first hand skeleton map in each depth map data, and the noisy first hand skeleton map is standardized to obtain the corresponding second hand skeleton map. Because the depth map data includes the first hand skeleton map and the labeling data for each of its joint points, once the second hand skeleton map is obtained, the labeling data corresponding to the first hand skeleton map is applied to the second hand skeleton map, and new depth map data is generated from the second hand skeleton map and the labeling data.
By applying the method provided by the embodiment of the invention, new depth map data is added from limited depth map data, and the generated hand skeleton maps are more natural because the original hand skeleton map is not damaged, which meets the diversity required of large training sets.
In the method according to the embodiment of the present invention, based on the content of S102, after obtaining each first depth map data, the first hand skeleton map in each first depth map data is extracted to perform a data generation process, and in the data generation process, noise data needs to be added to the first hand skeleton map and normalization processing needs to be performed. Therefore, referring to fig. 2, in the present invention, the process of adding noise data to each of the first hand skeleton diagrams and performing normalization processing according to a preset normalization rule to obtain a second hand skeleton diagram corresponding to each of the first hand skeleton diagrams may specifically include:
S201: Start the pre-trained encoder, and apply the mean variance calculation module in the encoder to perform encoding calculation on each first hand skeleton map to obtain a first mean and a first variance corresponding to each first hand skeleton map.
In an embodiment of the present invention, the encoder includes a mean variance calculation module and a gaussian noise module, where the mean variance calculation module is configured to encode and calculate each first hand skeleton diagram to obtain a first mean and a first variance corresponding to each first hand skeleton diagram.
S202: Use the Gaussian noise module in the encoder to add noise data to the first mean and the first variance corresponding to each first hand skeleton map, obtaining a second mean and a second variance corresponding to each first hand skeleton map.
In the embodiment of the invention, after the mean variance calculation module outputs the first mean and the first variance, they are input to the Gaussian noise module. The Gaussian noise module allocates reasonable noise data according to the input mean and variance and adds the corresponding noise data to each first mean and each first variance, so that the generated second mean and second variance deviate from the original data.
It is to be understood that the noise data added to the first mean and the first variance may differ from one first hand skeleton map to another.
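One way to picture S202 is the sketch below, which perturbs a (mean, variance) pair with fresh noise on each call. It is an illustration only: the patent does not say how the Gaussian noise module chooses its noise, so the `noise_scale` parameter and the log-normal factor used to keep the variance positive are assumptions of this sketch.

```python
import math
import random
from typing import Tuple


def perturb_mean_variance(mean: float, variance: float, noise_scale: float,
                          rng: random.Random) -> Tuple[float, float]:
    """Return a (second_mean, second_variance) pair that deviates from the input."""
    if variance <= 0.0:
        raise ValueError("variance must be positive")
    # Additive Gaussian offset on the mean.
    second_mean = mean + rng.gauss(0.0, noise_scale)
    # Assumption: a log-normal factor keeps the perturbed variance strictly positive.
    second_variance = variance * math.exp(rng.gauss(0.0, noise_scale))
    return second_mean, second_variance


rng = random.Random(42)  # each call draws new noise, so each map gets different noise
second_mean, second_variance = perturb_mean_variance(10.0, 2.0, 0.1, rng)
```

Adding plain Gaussian noise to the variance could drive it below zero, which is why the sketch perturbs the variance multiplicatively instead.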
S203: and normalizing the second mean values and the second variances based on the normalization processing rule to obtain third mean values and third variances.
In the embodiment of the present invention, the normalization processing rule is to ensure that the corresponding variance and mean in the hand skeleton diagram satisfy the standard normal distribution, and therefore, the normalization processing on the second mean and the second variance actually performs normalized distribution conversion on the second mean and the second variance to obtain the corresponding third mean and third variance.
Specifically, in the method provided in the embodiment of the present invention, after obtaining each second mean value and each second variance, normalizing each second mean value and each second variance based on the normalization processing rule to obtain each third mean value and each third variance, which may specifically include:
determining a preset parameter standard of standardized normal distribution; and adjusting each second mean value and each second variance according to the parameter standard to obtain each third mean value and each third variance.
It can be understood that once noise data is added to the first mean and the first variance, noise has effectively been added to the original first hand skeleton map. To facilitate the subsequent generation of new depth map data, the first mean and the first variance with noise added, that is, the second mean and the second variance, need to be standardized.
In the standardized distribution conversion of the present invention, the target mean is 0 and the target variance is 1, so the parameter standard of the standardized normal distribution is (0, 1). For example, if the second mean obtained after adding the noise data is 10 and the second variance is 2, then 10 is subtracted from every element in the hand skeleton map and the result is divided by the standard deviation √2 (the square root of the variance) to obtain a new hand skeleton map whose mean is 0 and whose variance is 1.
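The conversion to the (0, 1) parameter standard can be checked numerically. The sketch below assumes the hand skeleton map is flattened to a list of scalar values; the toy values are chosen so that their mean is 10 and their population variance is 2. Unit variance is reached by dividing by the standard deviation, that is, the square root of the variance.

```python
import math
import statistics
from typing import List


def standardize(values: List[float]) -> List[float]:
    """Shift and scale values so that their mean is 0 and their variance is 1."""
    mean = statistics.fmean(values)
    std = math.sqrt(statistics.pvariance(values, mu=mean))
    if std == 0.0:
        raise ValueError("cannot standardize constant data")
    return [(v - mean) / std for v in values]


values = [8.0, 10.0, 12.0, 10.0]  # toy data: mean 10, population variance 2
z = standardize(values)
print(statistics.fmean(z), statistics.pvariance(z))
```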
S204: and generating a second hand skeleton map corresponding to each first hand skeleton map based on each third mean value and each third variance.
In the embodiment of the invention, after Gaussian noise is added and normalization processing is carried out, the third mean value and the third variance are combined to generate a new hand skeleton diagram.
In the method provided by the embodiment of the invention, after the first depth map data are obtained, the first hand skeleton map in each first depth map data is input into the encoder, the encoder outputs the first mean value and the first variance corresponding to each first hand skeleton map, and adds noise data into each first mean value and each first variance to obtain the second mean value corresponding to each first mean value and the second variance corresponding to each first variance. And after the second mean values and the second variances are subjected to standardization processing, third mean values and third variances are obtained, and a new hand skeleton diagram is synthesized through the third mean values and the third variances.
Further, after obtaining the respective third mean and third variance, a second hand skeleton map needs to be generated through the respective third mean and third variance. After the second hand skeleton map is generated, labeling data corresponding to each joint point is newly labeled in each second hand skeleton map. Specifically, labeling the labeling data corresponding to each first hand skeleton drawing to a second hand skeleton drawing corresponding to the first hand skeleton drawing to generate each second depth drawing data includes:
identifying each joint point in each second hand skeleton map by using a preset generator, inputting the labeling data corresponding to each first hand skeleton map into the generator, and using the generator to label each joint point of the corresponding second hand skeleton map with that labeling data to obtain each second depth map data.
It can be understood that, because the second hand skeleton map contains added noise relative to its corresponding first hand skeleton map, each joint point in the second hand skeleton map must be located, that is, identified, before it is labeled. For example, the generator first identifies which joint point is a thumb joint point, which is an index finger joint point, and so on. After the labeling data of the corresponding first hand skeleton map is input into the generator, the generator applies each labeling data to the position of the corresponding identified joint point. That is, on receiving the labeling data, the generator determines the joint point to which each labeling data belongs and associates each labeling data with the matching joint point in the second hand skeleton map, using the identification information it obtained for each joint point in advance.
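The association step performed by the generator can be sketched as a lookup keyed by joint identity. This is a simplified illustration: it assumes the joint points of the second hand skeleton map have already been identified by name, and the function `attach_labels` and the joint names are hypothetical. How the generator actually identifies joints is not specified in the description.

```python
from typing import Dict, List, Tuple

Point3D = Tuple[float, float, float]


def attach_labels(identified_joints: List[str],
                  first_labels: Dict[str, Point3D]) -> Dict[str, Point3D]:
    """Give each identified joint the labeling data of the same joint in the first map."""
    missing = [name for name in identified_joints if name not in first_labels]
    if missing:
        raise KeyError(f"no labeling data for joints: {missing}")
    # Labels are carried over unchanged, matching joints by identity.
    return {name: first_labels[name] for name in identified_joints}


first_labels = {
    "wrist": (10.0, 20.0, 20.0),      # example value from the description
    "thumb_tip": (14.0, 31.0, 18.0),  # made-up value for illustration
    "index_tip": (0.0, 0.0, 0.0),     # occluded joint placeholder
}
identified = ["thumb_tip", "wrist", "index_tip"]  # order found in the second map
second_labels = attach_labels(identified, first_labels)
```

Matching by joint identity rather than by list position means the labels stay correct even if the joints are identified in a different order in the second hand skeleton map.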
By applying the method provided by the embodiment of the invention, after the encoder carries out encoding calculation on each first hand skeleton image and noise data is added, the generator synthesizes the data into depth map data, thereby achieving the purpose of increasing the data.
In the method provided by the embodiment of the invention, generating new depth map data requires the encoder to calculate the mean and variance of the hand skeleton map in the depth map data and to add noise data. Before that, the encoder needs to be trained so that its calculation of the mean and variance is accurate and the noise data it adds is reasonable. The process of training the encoder comprises:
acquiring training data, wherein each training data is a hand skeleton map for training the encoder;
sequentially inputting each training data into the encoder so that the encoder trains on the training data input each time, until the encoder meets a preset convergence condition;
wherein, after each training data is input into the encoder: obtaining a first training result for the currently input training data, the first training result being a training mean and a training variance obtained by calculating the mean and variance of the training data and adding noise data; standardizing the first training result according to the standardization rule to obtain a second training result, and generating sample data corresponding to the training data based on the second training result; calculating a loss function value from the training data and the sample data; judging whether the loss function value meets the preset convergence condition; if not, adjusting model parameters of the mean variance calculation module and the Gaussian noise module in the encoder according to the loss function value; and if so, stopping the input of training data and ending the training of the encoder.
In the data generation method provided by the embodiment of the invention, each training data is a hand skeleton diagram, and multiple hand skeleton diagrams are used to train the encoder. During training, the training data are input into the encoder one at a time in a set order. Each time training data is input, the mean variance calculation module in the encoder performs the encoding calculation on the currently input training data to obtain the corresponding mean and variance, and the Gaussian noise module adds noise data to that mean and variance to obtain a training mean and a training variance. The training mean and the training variance are standardized according to the standardization processing rule to obtain a second training result, namely the standardized training mean and training variance. Sample data corresponding to the training data is generated from the second training result; the sample data is the hand skeleton diagram obtained by adding noise to, and standardizing, the currently input training data. A calculation, specifically a KL divergence calculation, is performed on the currently input training data and the obtained sample data to obtain the loss function value corresponding to the currently input training data. Whether the loss function value meets the preset convergence condition is then judged. If not, the model parameters of the mean variance calculation module and the Gaussian noise module in the encoder are adjusted, and mean variance calculation and noise addition continue with the next input training data.
If the loss function value meets the convergence condition, the training of the encoder is determined to be finished, and the input of training data to the encoder stops.
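The control flow of this training procedure (compute a loss, test the convergence condition, then either adjust the parameters or stop) can be sketched as follows. The sketch rests on stated assumptions: the encoder is reduced to two hypothetical parameters `w` and `b`, the loss is taken to be the KL divergence to a standard normal distribution, the update is plain gradient descent, and the noise and standardization steps are omitted for brevity. None of these choices is confirmed by the source.

```python
import math


def kl_loss(mu, var):
    """KL divergence between N(mu, diag(var)) and the standard normal N(0, I)."""
    return 0.5 * sum(m * m + v - math.log(v) - 1.0 for m, v in zip(mu, var))


def train_encoder(training_data, lr=0.5, threshold=1e-3, max_epochs=1000):
    """Feed samples one by one until the loss meets the convergence condition."""
    w, b = 1.0, 0.5  # hypothetical parameters: mean scale and shared log-variance
    loss = float("inf")
    for _ in range(max_epochs):
        for x in training_data:
            mu = [w * xi for xi in x]
            var = [math.exp(b)] * len(x)
            loss = kl_loss(mu, var)
            if loss < threshold:
                # Convergence condition met: stop inputting training data.
                return w, b, loss
            # Otherwise adjust the parameters along the negative gradient of the loss.
            grad_w = sum(m * xi for m, xi in zip(mu, x))
            grad_b = 0.5 * len(x) * (math.exp(b) - 1.0)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b, loss


data = [[0.2, -0.1, 0.4], [0.3, 0.1, -0.2]]  # two toy flattened skeletons
w, b, final_loss = train_encoder(data)
```

The early return mirrors the "if so, finish inputting the training data" branch, while the parameter update mirrors the "if not, adjust model parameters" branch.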
In the present invention, the calculation formula for calculating the loss function value of the training data is as follows:
Figure BDA0003037593540000111
where d is the dimension of the sample data, μ and σ² represent the i-th component of the mean vector and variance vector of the normal distribution, respectively, and i denotes the index number of the current training data.
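The formula image itself is not reproduced in this text. As a hedged reconstruction only, assuming the loss is the usual KL divergence between a normal distribution with mean vector μ and variance vector σ² and the standard normal distribution (an assumption consistent with the symbols defined above, but not confirmed by the source), it would take the form:

```latex
\mathcal{L} = \frac{1}{2}\sum_{i=1}^{d}\left(\mu_i^{2} + \sigma_i^{2} - \log \sigma_i^{2} - 1\right)
```

Under this form the loss is zero exactly when every μ_i is 0 and every σ_i² is 1, that is, when the encoded distribution matches the standard normal distribution.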
By applying the method provided by the embodiment of the invention, the trained encoder can add noise data to the data, and new depth map data can be generated by applying the encoder, so as to meet the need for large and diverse training data.
The specific implementation procedures and derivatives thereof of the above embodiments are within the scope of the present invention.
Corresponding to the method described in fig. 1, an embodiment of the present invention further provides a data generating apparatus for implementing the method in fig. 1. The data generating apparatus provided in the embodiment of the present invention may be applied to a computer terminal or various mobile devices. A schematic structural diagram of the data generating apparatus is shown in fig. 3, and the apparatus specifically includes:
a first obtaining unit 301, configured to obtain first depth map data, where the first depth map data includes a three-dimensional first hand skeleton map and labeling data for labeling each joint point in the first hand skeleton map;
a processing unit 302, configured to add noise data to each of the first hand skeleton diagrams, and perform normalization processing according to a preset normalization rule to obtain a second hand skeleton diagram corresponding to each of the first hand skeleton diagrams;
a generating unit 303, configured to label the labeling data corresponding to each of the first hand skeleton maps onto the second hand skeleton map corresponding to the first hand skeleton map to generate second depth map data.
In the data generating apparatus according to the embodiment of the present invention, when data needs to be added, each of the existing first depth map data is acquired, noise data is added to the first hand skeleton map in each depth map data, and the first hand skeleton map to which the noise data has been added is normalized to obtain a second hand skeleton map corresponding to the first hand skeleton map. The depth map data includes a first hand skeleton map and label data labeled on each joint point of the first hand skeleton map, and after a second hand skeleton map corresponding to the first hand skeleton map is obtained, each label data corresponding to the first hand skeleton map is labeled on the second hand skeleton map, and new depth map data is generated from the second hand skeleton map and each label data.
By applying the device provided by the embodiment of the invention, new depth map data is generated from the limited existing depth map data, and the generated new hand skeleton map looks more natural because the structure of the hand skeleton map is not damaged, thereby meeting the need for large and diverse training data.
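The cooperation of the three units can be sketched as three functions wired together. This is an illustrative simplification under stated assumptions: the `DepthMapData` container and the function names are hypothetical, and the processing step adds Gaussian noise directly to the joint coordinates as a stand-in for the encoder-based noise addition and normalization of the embodiment.

```python
import random
from dataclasses import dataclass


@dataclass
class DepthMapData:
    skeleton: dict  # joint name -> (x, y, z)
    labels: dict    # joint name -> labeling data


def obtain(dataset):
    """First obtaining unit 301: collect the existing first depth map data."""
    return list(dataset)


def process(skeleton, rng, sigma=0.01):
    """Processing unit 302 (simplified stand-in): perturb every coordinate."""
    return {name: tuple(c + rng.gauss(0.0, sigma) for c in xyz)
            for name, xyz in skeleton.items()}


def generate(first, second_skeleton):
    """Generating unit 303: carry the labels of the first map over to the second."""
    return DepthMapData(second_skeleton, dict(first.labels))


def augment(dataset, seed=0):
    rng = random.Random(seed)
    return [generate(item, process(item.skeleton, rng)) for item in obtain(dataset)]


original = DepthMapData(
    skeleton={"thumb_tip": (0.10, 0.50, 0.30), "index_tip": (0.20, 0.70, 0.31)},
    labels={"thumb_tip": "thumb tip", "index_tip": "index finger tip"},
)
new_data = augment([original])
```

Each call to `augment` yields one new item per existing item, so repeating it with different seeds grows the data set while reusing the original labels.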
In the apparatus provided in the embodiment of the present invention, the processing unit 302 includes:
a calculation subunit, configured to start a pre-trained encoder, and perform encoding calculation on each first hand skeleton diagram by using a mean variance calculation module in the encoder to obtain a first mean and a first variance corresponding to each first hand skeleton diagram;
a first processing subunit, configured to apply a Gaussian noise module in the encoder, add noise data to the first mean and the first variance corresponding to each first hand skeleton diagram, and obtain a second mean and a second variance corresponding to each first hand skeleton diagram;
a second processing subunit, configured to perform normalization processing on each second mean and each second variance based on the normalization processing rule, so as to obtain each third mean and each third variance;
and a first generating subunit, configured to generate a second hand skeleton map corresponding to each first hand skeleton map based on each third mean and each third variance.
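One possible reading of this subunit chain is sketched below. It is not the trained encoder of the embodiment: the mean and variance are taken as plain statistics of the flattened joint coordinates, the noise model and its `scale` parameter are hypothetical, and the standardization is folded into the generation step (the coordinates are z-scored and then rescaled to the noisy statistics), so the third mean and third variance are not modeled separately.

```python
import math
import random


def mean_variance(coords):
    """Calculation subunit: first mean and first (population) variance."""
    mean = sum(coords) / len(coords)
    var = sum((c - mean) ** 2 for c in coords) / len(coords)
    return mean, var


def add_noise(mean, var, rng, scale=0.05):
    """First processing subunit: perturb both statistics with Gaussian noise."""
    second_mean = mean + rng.gauss(0.0, scale)
    second_var = var * math.exp(rng.gauss(0.0, scale))  # multiplicative, so it stays positive
    return second_mean, second_var


def generate_skeleton(coords, rng):
    """Standardize the coordinates, then rescale them to the noisy statistics."""
    first_mean, first_var = mean_variance(coords)
    second_mean, second_var = add_noise(first_mean, first_var, rng)
    z = [(c - first_mean) / math.sqrt(first_var) for c in coords]  # mean 0, variance 1
    new_coords = [second_mean + math.sqrt(second_var) * zi for zi in z]
    return new_coords, (second_mean, second_var)


rng = random.Random(0)
first = [0.10, 0.50, 0.30, 0.20, 0.70, 0.31]  # flattened (x, y, z) of two joints
second, (target_mean, target_var) = generate_skeleton(first, rng)
```

Because the transformation is a uniform shift and scale of all coordinates, the relative layout of the joints is preserved, which is one way to keep the generated skeleton plausible.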
In the apparatus provided in the embodiment of the present invention, the second processing subunit includes:
a determining subunit, configured to determine a preset parameter standard of the standardized normal distribution;
and an adjusting subunit, configured to adjust each second mean value and each second variance according to the parameter standard to obtain each third mean value and each third variance.
In the apparatus provided in the embodiment of the present invention, the generating unit 303 includes:
a subunit, configured to identify each joint point in each second hand skeleton diagram by using a preset generator, input the labeling data corresponding to each first hand skeleton diagram into the generator, and label, by the generator, the labeling data corresponding to each first hand skeleton diagram onto the corresponding joint points of the second hand skeleton diagram corresponding to that first hand skeleton diagram, to obtain each second depth map data.
The device provided by the embodiment of the invention further comprises:
a second obtaining unit, configured to obtain each training data, wherein each training data is a hand skeleton diagram for training the encoder;
the training unit is used for sequentially inputting the training data into the encoder so as to enable the encoder to train based on the training data input each time until the encoder meets the preset convergence condition; after each training data is input into the encoder, obtaining a first training result of the training data currently input into the initial encoder, wherein the first training result is a training mean and a training variance obtained after mean and variance calculation is carried out on the training data and noise data is added; standardizing the first training result according to the standardized processing rule to obtain a second training result, and generating sample data corresponding to the training data based on the second training result; calculating the training data and the sample data to obtain a loss function value; judging whether the loss function value meets a preset convergence condition or not; if not, adjusting model parameters of a mean variance calculation module and a Gaussian noise module in the encoder according to the loss function value; and if so, finishing inputting the training data to the encoder and finishing the training of the encoder.
For specific working processes of each unit and sub-unit in the data generating device disclosed in the above embodiment of the present invention, reference may be made to corresponding contents in the data generating method disclosed in the above embodiment of the present invention, and details are not described here again.
The embodiment of the invention also provides a storage medium, which comprises a stored instruction, wherein when the instruction runs, the device where the storage medium is located is controlled to execute the data generation method.
An embodiment of the present invention further provides an electronic device, whose structural diagram is shown in fig. 4. The electronic device specifically includes a memory 401 and one or more instructions 402, where the one or more instructions 402 are stored in the memory 401 and are configured to be executed by one or more processors 403 to perform the following operations:
acquiring first depth map data, wherein the first depth map data comprise a three-dimensional first hand skeleton map and labeling data for labeling each joint point in the first hand skeleton map;
adding noise data into each first hand skeleton diagram, and carrying out standardization processing according to a preset standardization rule to obtain a second hand skeleton diagram corresponding to each first hand skeleton diagram;
and labeling the labeling data corresponding to each first hand skeleton map into a second hand skeleton map corresponding to the first hand skeleton map to generate second depth map data.
The embodiments in the present specification are described in a progressive manner; for the same or similar parts, the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, the apparatus or system embodiments are substantially similar to the method embodiments and are therefore described relatively briefly; for related points, reference may be made to the corresponding descriptions of the method embodiments. The above-described apparatus and system embodiments are only illustrative: the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Those of skill in the art would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both.
To clearly illustrate this interchangeability of hardware and software, various illustrative components and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method of generating data, comprising:
acquiring first depth map data, wherein the first depth map data comprise a three-dimensional first hand skeleton map and labeling data for labeling each joint point in the first hand skeleton map;
adding noise data into each first hand skeleton diagram, and carrying out standardization processing according to a preset standardization rule to obtain a second hand skeleton diagram corresponding to each first hand skeleton diagram;
and labeling the labeling data corresponding to each first hand skeleton map into a second hand skeleton map corresponding to the first hand skeleton map to generate second depth map data.
2. The method according to claim 1, wherein the adding noise data to each of the first hand skeleton maps and performing normalization processing according to a preset normalization rule to obtain a second hand skeleton map corresponding to each of the first hand skeleton maps comprises:
starting a pre-trained encoder, and applying a mean variance calculation module in the encoder to perform encoding calculation on each first hand skeleton diagram to obtain a first mean value and a first variance corresponding to each first hand skeleton diagram;
applying a Gaussian noise module in the encoder, and adding noise data into a first mean value and a first variance corresponding to each first hand skeleton to obtain a second mean value and a second variance corresponding to each first hand skeleton;
normalizing each second mean value and each second variance based on the normalization processing rule to obtain each third mean value and each third variance;
and generating a second hand skeleton map corresponding to each first hand skeleton map based on each third mean value and each third variance.
3. The method according to claim 2, wherein the normalizing each second mean and each second variance based on the normalization processing rule to obtain each third mean and each third variance comprises:
determining a preset parameter standard of standardized normal distribution;
and adjusting each second mean value and each second variance according to the parameter standard to obtain each third mean value and each third variance.
4. The method according to claim 1, wherein the labeling data corresponding to each first hand skeleton map is labeled to a second hand skeleton map corresponding to the first hand skeleton map to generate each second depth map data, and the method comprises:
and marking each joint point in each second hand skeleton diagram by using a preset generator, inputting each marking data corresponding to each first hand skeleton diagram into the generator, and marking each marking data corresponding to each first hand skeleton diagram onto each node of the second hand skeleton diagram corresponding to each first hand skeleton diagram by using the generator to obtain each second depth map data.
5. The method of claim 2, wherein the process of training the encoder comprises:
acquiring training data, wherein the training data are hand skeleton diagrams for training the encoder respectively;
sequentially inputting each training data into the encoder so that the encoder trains based on the training data input each time until the encoder meets the preset convergence condition;
after each training data is input into the encoder, obtaining a first training result of the training data currently input into the initial encoder, wherein the first training result is a training mean and a training variance obtained after mean and variance calculation is carried out on the training data and noise data is added; standardizing the first training result according to the standardized processing rule to obtain a second training result, and generating sample data corresponding to the training data based on the second training result; calculating the training data and the sample data to obtain a loss function value; judging whether the loss function value meets a preset convergence condition or not; if not, adjusting model parameters of a mean variance calculation module and a Gaussian noise module in the encoder according to the loss function value; and if so, finishing inputting the training data to the encoder and finishing the training of the encoder.
6. A data generation apparatus, comprising:
a first obtaining unit, configured to obtain first depth map data, where the first depth map data includes a three-dimensional first hand skeleton map and labeling data for labeling each joint point in the first hand skeleton map;
a processing unit, configured to add noise data to each of the first hand skeleton maps and perform normalization processing according to a preset normalization rule to obtain a second hand skeleton map corresponding to each of the first hand skeleton maps;
and a generating unit, configured to label the labeling data corresponding to each of the first hand skeleton maps onto the second hand skeleton map corresponding to the first hand skeleton map to generate second depth map data.
7. The apparatus of claim 6, wherein the processing unit comprises:
a calculation subunit, configured to start a pre-trained encoder, and perform encoding calculation on each first hand skeleton diagram by using a mean variance calculation module in the encoder to obtain a first mean and a first variance corresponding to each first hand skeleton diagram;
a first processing subunit, configured to apply a Gaussian noise module in the encoder, add noise data to the first mean and the first variance corresponding to each first hand skeleton diagram, and obtain a second mean and a second variance corresponding to each first hand skeleton diagram;
a second processing subunit, configured to perform normalization processing on each second mean and each second variance based on the normalization processing rule, so as to obtain each third mean and each third variance;
and a first generating subunit, configured to generate a second hand skeleton map corresponding to each first hand skeleton map based on each third mean and each third variance.
8. The apparatus of claim 7, wherein the second processing subunit comprises:
the determining subunit is used for determining a preset parameter standard of the standardized normal distribution;
and the adjusting subunit is configured to adjust each second mean value and each second variance according to the parameter standard to obtain each third mean value and each third variance.
9. The apparatus of claim 6, wherein the generating unit comprises:
and marking each joint point in each second hand skeleton diagram by using a preset generator, inputting each marking data corresponding to each first hand skeleton diagram into the generator, and marking each marking data corresponding to each first hand skeleton diagram onto each node of the second hand skeleton diagram corresponding to each first hand skeleton diagram by using the generator to obtain each second depth map data.
10. The apparatus of claim 7, further comprising:
the second acquisition unit is used for acquiring each training data, and each training data is a hand skeleton diagram for training the encoder;
the training unit is used for sequentially inputting the training data into the encoder so as to enable the encoder to train based on the training data input each time until the encoder meets the preset convergence condition; after each training data is input into the encoder, obtaining a first training result of the training data currently input into the initial encoder, wherein the first training result is a training mean and a training variance obtained after mean and variance calculation is carried out on the training data and noise data is added; standardizing the first training result according to the standardized processing rule to obtain a second training result, and generating sample data corresponding to the training data based on the second training result; calculating the training data and the sample data to obtain a loss function value; judging whether the loss function value meets a preset convergence condition or not; if not, adjusting model parameters of a mean variance calculation module and a Gaussian noise module in the encoder according to the loss function value; and if so, finishing inputting the training data to the encoder and finishing the training of the encoder.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110448065.2A CN113158911A (en) 2021-04-25 2021-04-25 Data generation method and device

Publications (1)

Publication Number Publication Date
CN113158911A true CN113158911A (en) 2021-07-23

Family

ID=76870226

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110448065.2A Pending CN113158911A (en) 2021-04-25 2021-04-25 Data generation method and device

Country Status (1)

Country Link
CN (1) CN113158911A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20080047673A (en) * 2006-11-27 2008-05-30 (주)플렛디스 Apparatus for transforming 3d image and the method therefor
CN101964117A (en) * 2010-09-25 2011-02-02 清华大学 Depth map fusion method and device
US20170168586A1 (en) * 2015-12-15 2017-06-15 Purdue Research Foundation Method and System for Hand Pose Detection
US20180096463A1 (en) * 2016-09-30 2018-04-05 Disney Enterprises, Inc. Point cloud noise and outlier removal for image-based 3d reconstruction
CN110188598A (en) * 2019-04-13 2019-08-30 大连理工大学 A kind of real-time hand Attitude estimation method based on MobileNet-v2
CN110232672A (en) * 2019-06-20 2019-09-13 合肥工业大学 A kind of denoising method and system of exercise data
WO2020140798A1 (en) * 2019-01-04 2020-07-09 北京达佳互联信息技术有限公司 Gesture recognition method, device, electronic apparatus, and storage medium
CN111814962A (en) * 2020-07-09 2020-10-23 平安科技(深圳)有限公司 Method and device for acquiring parameters of recognition model, electronic equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHEN, D,等: "A Novel Approach to 3D-DOA Estimation of Stationary EM Signals Using Convolutional Neural Networks", SENSORS, vol. 20, no. 10, pages 2761 *
徐正则,等: "基于深度图像预旋转的手势估计改进方法", 华东师范大学学报(自然科学版), no. 04, pages 124 - 133 *
陈子嫣: "基于机器视觉的远程操控救援系统", 通讯世界, no. 12, pages 298 - 300 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination