WO2015008937A1

WO2015008937A1 - Method and apparatus for generating motion data

Info

Publication number: WO2015008937A1
Application number: PCT/KR2014/005198
Authority: WO
Inventors: 이제희; 양경용; 윤기범; 이경호
Original assignee: 서울대학교산학협력단
Priority date: 2013-07-15
Filing date: 2014-06-13
Publication date: 2015-01-22
Also published as: KR101482419B1

Abstract

A method for generating motion data is disclosed. The method for generating motion data according to one embodiment of the present invention comprises the steps of: acquiring first motion data from a motion database (DB) in which motion data is stored; clustering the first motion data according to a category of body postures; and resampling the first motion data for each cluster, according to a distribution rate of the amount of the clustered motion data set for each cluster.

Description

Method and apparatus for generating motion data

The present invention relates to a method and apparatus for generating motion data. More particularly, the present invention relates to machine learning for recognizing human poses using a machine-learned body part recognition algorithm, and to a method and apparatus capable of generating motion data suited to a given purpose.

Recently, due to the popularization of 3D movies, 3D games, and 3D TVs, interest in stereoscopic content is increasing. Microsoft's Kinect, in particular, is gaining popularity as it applies video content and user interaction technology to games. Kinect is a system based on a probabilistic model. When a classifier trained in advance on a large amount of data receives an arbitrary depth image, Kinect calculates a probability for each pixel and estimates the body part to which the pixel belongs.

In this regard, FIG. 1 schematically illustrates the Kinect's body recognition process. Referring to FIG. 1, motion data is retrieved from a motion database in which posture images of the body are stored, a learning depth image for machine learning is obtained from it (step S20), and the depth image is used to machine-learn an algorithm for body part recognition (step S30). Then, when Kinect is actually used, for example when Kinect receives the user's depth image (test depth image) in real time through a depth camera, the body part recognition algorithm learned through steps S20 and S30 is executed to predict the three-dimensional posture of the body in this depth image.

Existing posture recognition technology used in Kinect has the advantage of high recognition accuracy for general postures in which a person stands upright, but it is difficult to apply to a completely new scene when recognizing various categories of human posture. In particular, in order to recognize easy-to-recognize postures and difficult-to-recognize postures at the same time, the motion data used for machine learning should be secured uniformly across the classifications of human posture.

However, in general, hundreds of thousands to one million depth images are used for machine learning of body part recognition algorithms, so the amount of motion data to be dealt with is large. Moreover, the posture of the body is mathematically defined in tens of dimensions (for example, 60 dimensions), so each item of data is high-dimensional. Distributing data uniformly in each dimension while dealing with a large amount of high-dimensional data is difficult for a person to do directly, which raises the need for a method of obtaining statistically well-distributed motion data according to the classification of human posture.

According to an embodiment of the present invention, a motion data generation method and apparatus may be provided that analyze given motion data, remove duplicate postures, and generate postures that are in short supply based on the existing data, according to a distribution desired by a user.

According to one embodiment of the present invention, since the motion data can be adjusted to have any size, density, and operation range, it is possible to provide a motion data generation method and apparatus capable of flexibly manipulating a large amount of motion data.

According to an embodiment of the present invention, there is provided a method of generating motion data, the method comprising: obtaining first motion data from a motion database (DB) in which motion data is stored; clustering the first motion data according to a posture category of the body; and resampling the first motion data for each cluster according to a distribution ratio of the amount of motion data set for each cluster.

Further, according to an embodiment of the present invention, there is provided a body recognition method for predicting the three-dimensional posture of a body from a depth image of the body, the method comprising: obtaining a learning depth image from the motion data generated by the motion data generation method; and machine-learning a body part recognition algorithm by using the learning depth image.

According to an embodiment of the present invention, there is also provided an apparatus for generating motion data by a motion data generating application, the apparatus comprising a processor and a memory, wherein the application is loaded into the memory and executed under the control of the processor, and the application is configured to perform: a function of obtaining first motion data from a motion database (DB) in which motion data is stored; a function of clustering the first motion data according to a posture category of a body; and a function of resampling the first motion data for each cluster according to a distribution ratio of the amount of motion data set for each cluster.

According to an embodiment of the present invention, a computer-readable recording medium having a program recorded thereon for executing the motion data generating method or the body recognition method on a computer can be provided.

According to an embodiment of the present invention, the posture recognition performance may be improved by analyzing the given motion data and removing the overlapping postures and regenerating the postures based on the existing data according to the distribution desired by the user.

In addition, according to an embodiment of the present invention, since a uniform and large amount of motion data can be generated automatically from a non-uniform or small motion database, the time and effort needed to secure a uniform and large motion database can be saved.

In addition, according to an embodiment of the present invention, since the motion data can be adjusted to have any size, density, and operating range, there is an advantage in that a large amount of motion data can be flexibly manipulated.

FIG. 1 is a flowchart schematically illustrating a human posture recognition process by a conventional Kinect device;

FIG. 2 is a flowchart schematically illustrating a posture recognition process of a person including a method of generating and/or deleting motion data according to an embodiment of the present invention;

FIG. 3 is an exemplary flowchart for performing a resampling method according to an embodiment of the present invention;

FIG. 4 is a view for explaining a method of projecting motion data into a low dimensional space;

FIG. 5 is a diagram for explaining a step of projecting motion data into a two-dimensional space, according to an exemplary embodiment;

FIG. 6 is a diagram for explaining a step of generating motion data in each cell of the two-dimensional space of FIG. 5;

FIG. 7 is a view for explaining a step in which motion data is projected in a two-dimensional space according to a first alternative embodiment;

FIG. 8 is a diagram for explaining a step of generating motion data in each cell of the two-dimensional space of FIG. 7;

FIG. 9 is a diagram for explaining a step in which motion data is projected in a two-dimensional space according to a second alternative embodiment;

FIG. 10 is a diagram for explaining a step of generating motion data in each cell of the two-dimensional space of FIG. 9;

FIGS. 11A and 11B are diagrams for explaining the steps of generating motion data in 75% and 50% of the cells in the two-dimensional space of FIG. 10, respectively;

FIG. 12 is a diagram for explaining posture types of persons classified according to one embodiment;

FIG. 13 is a diagram for explaining the effect according to an embodiment of the present invention; and

FIG. 14 is a block diagram illustrating an example system configuration for generating motion data, according to one embodiment.

Objects, other objects, features and advantages of the present invention will be readily understood through the following preferred embodiments associated with the accompanying drawings. However, the present invention is not limited to the embodiments described herein and may be embodied in other forms. Rather, the embodiments introduced herein are provided so that the disclosure may be made thorough and complete, and to fully convey the spirit of the present invention to those skilled in the art.

Where a component is referred to herein as being on another component, it means that it may be formed directly on the other component or a third component may be interposed therebetween.

Where the terms first, second, etc. are used herein to describe the components, these components should not be limited by these terms. These terms are only used to distinguish one component from another. The embodiments described and illustrated herein also include complementary embodiments thereof.

In this specification, the singular also includes the plural unless specifically stated otherwise in the phrase. As used herein, the words 'comprise' and / or 'comprising' do not exclude the presence or addition of one or more other components.

Hereinafter, the present invention will be described in detail with reference to the accompanying drawings. In describing the specific embodiments below, various specific details are set forth in order to explain the invention in more detail and aid understanding. However, those skilled in the art will understand that the present invention can be practiced without these specific details. It is also noted in advance that, in some cases, matters that are commonly known and not closely related to the invention are not described, in order to avoid confusion in describing the invention.

FIG. 2 is a flowchart schematically illustrating a posture recognition process of a person including a method of generating and/or deleting motion data according to an exemplary embodiment of the present invention.

Referring to FIG. 2, one of the main technical features of the present invention includes a step S10 of obtaining motion data from a motion database DB and then resampling the motion data.

Specifically, before step S10, motion data is first obtained from a motion database DB in which motion data is stored. That is, motion data to be used for machine learning of the body part recognition algorithm is collected. As used herein, the term "motion data" refers to data in which a posture of a human body is captured. However, the inventive concept is not necessarily limited to human postures; according to an alternative embodiment, the term "motion data" can of course be extended to mean motion data of any object capable of movement (e.g., an animal or a robot).

There are many ways to collect motion data, and the present invention is not limited to any particular method. For example, a database storing motion data in advance may be provided from the beginning, motion data published on the Internet may be collected, or motion data may be generated using a motion capture system.

As such, the motion data is collected from the motion database (DB), but it is virtually impossible to obtain all the data necessary for machine learning in this process. In general, the collected motion data is unevenly distributed according to the posture classification of the body. In order to accurately recognize the actual human posture by using the machine learning data, it is necessary to increase the amount of data for insufficient posture classification and to reduce the amount of data for excessively secured posture classification to fit the overall distribution. Therefore, the preferred embodiment of the present invention shown includes the step S10 of resampling motion data.

In one embodiment, “resampling” may include generating new motion data from previously obtained motion data or removing some of the existing motion data. For example, if analysis of the acquired motion data shows that category A has an excessive amount of data and category B has too little, the resampling step removes some of the data in category A and generates new data in category B, so that the ratio of data in categories A and B can be made uniform.

In addition, for such resampling, a step of first classifying (clustering) the previously acquired motion data according to the posture category of the body may be performed, and the motion data may then be resampled according to the distribution ratio of the amount of motion data set for each cluster (that is, for each body posture category).

On the other hand, in one embodiment, the distribution ratio of the amount of motion data allocated to each body posture category has the same value for all clusters, thereby yielding uniform posture data over all categories. However, in alternative embodiments, the distribution ratio of the data amount may be set differently for each category. For example, when the body part recognition algorithm according to the present invention is applied to a posture recognition application for a specific kind of posture (such as yoga or stretching), including more motion data in the category of that posture can improve posture recognition accuracy. Therefore, it will be appreciated that by setting different distribution ratios for the categories according to the application field, a posture recognition application specialized for that field can be created.
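The role of the distribution ratio can be illustrated with a minimal sketch that converts per-category ratios into a target number of motion data per cluster. The function name `target_counts`, the category labels, and the largest-remainder rounding are assumptions made for illustration and are not taken from the embodiment.

```python
def target_counts(total, ratios):
    """Split `total` samples across clusters in proportion to `ratios`.

    `ratios` maps a (hypothetical) category label to a relative weight.
    Equal weights give the uniform embodiment; unequal weights give the
    application-specific embodiment.
    """
    weight_sum = sum(ratios.values())
    if weight_sum <= 0:
        raise ValueError("ratios must sum to a positive value")
    exact = {c: total * w / weight_sum for c, w in ratios.items()}
    # Floor each share, then give the leftover samples to the categories
    # with the largest fractional parts so the counts add up to `total`.
    counts = {c: int(v) for c, v in exact.items()}
    leftover = total - sum(counts.values())
    by_fraction = sorted(exact, key=lambda c: exact[c] - counts[c], reverse=True)
    for c in by_fraction[:leftover]:
        counts[c] += 1
    return counts


# Uniform distribution over three hypothetical categories.
uniform = target_counts(9000, {"sit": 1, "acrobatic": 1, "upright": 1})
# A yoga-oriented application might weight one category more heavily.
weighted = target_counts(10000, {"sit": 1, "acrobatic": 2, "upright": 1})
```

Each cluster would then be resampled up or down toward its target count.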

Referring back to FIG. 2, after the motion data is resampled in step S10, steps S20 and S30 are executed to train the body part recognition algorithm in advance. That is, as described with reference to FIG. 1, a learning depth image is obtained from the motion data obtained by resampling (step S20), and the body part recognition algorithm is machine-learned using the learning depth image (step S30). Then, when the machine-learned body part recognition algorithm is mounted and used in a device such as Kinect, for example when a human depth image is received in real time from a depth camera, the learned body part recognition algorithm is applied to this depth image to predict the three-dimensional posture of the body (step S40).

Hereinafter, an exemplary method of resampling will be described in detail with reference to FIGS. 3 to 6.

FIG. 3 is an exemplary flowchart for performing a resampling method according to an embodiment of the present invention. Referring to FIG. 3, after collecting motion data through a motion database DB, the motion data is clustered in step S110. That is, the collected motion data are clustered (classified) by the body posture category, thereby grouping motion data of similar body postures by cluster.

In one embodiment, the "motion data" at this time may be data that mathematically defines a posture of a person. For example, when a human posture is modeled with 20 joints and three axes (x, y, and z) for each joint, the posture can be expressed as a 60-dimensional value. Clustering motion data of "similar body postures" may mean, for example, classifying into one cluster postures that differ from one another in no more than a predetermined number of joints, or postures that belong to a continuous sequence of motion.
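As a minimal sketch of this representation, a posture given as 20 joints with x, y, and z values can be flattened into a single 60-dimensional vector. The helper names and the use of Euclidean distance as a similarity measure are assumptions for illustration; the embodiment does not prescribe a particular distance.

```python
import math

NUM_JOINTS = 20  # joint count used in the example of the description


def pose_to_vector(joints):
    """Flatten 20 (x, y, z) joint values into one 60-dimensional list."""
    if len(joints) != NUM_JOINTS:
        raise ValueError("expected %d joints" % NUM_JOINTS)
    return [coord for joint in joints for coord in joint]


def pose_distance(a, b):
    """Euclidean distance between two pose vectors (assumed measure)."""
    return math.dist(a, b)


# Two hypothetical postures that differ only in the first joint.
rest = pose_to_vector([(0.0, 0.0, 0.0)] * NUM_JOINTS)
moved = pose_to_vector([(0.3, 0.4, 0.0)] + [(0.0, 0.0, 0.0)] * (NUM_JOINTS - 1))
```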

Many methods for clustering data are known, and one embodiment of the present invention may employ an agglomerative hierarchical clustering algorithm. According to this algorithm, each motion data initially forms a single cluster, and pairs of clusters are then gradually merged to form a hierarchy of clusters. Pairs of clusters are sorted by max-distance (the longest of the distances between members of the two clusters), and the two clusters with the smallest max-distance are examined first for possible merging. If their min-distance (the shortest of the distances between members of the two clusters) is less than a predetermined threshold and their intrinsic dimension does not increase beyond a maximum threshold, they are merged.
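The merging rule described above can be sketched as follows. This is a simplified illustration only: it omits the intrinsic-dimension check, recomputes all pair distances on every pass (which is far too slow for large databases), and the function names and threshold parameter are assumptions.

```python
import math
from itertools import combinations


def max_distance(c1, c2):
    """Longest distance between any member of c1 and any member of c2."""
    return max(math.dist(p, q) for p in c1 for q in c2)


def min_distance(c1, c2):
    """Shortest distance between any member of c1 and any member of c2."""
    return min(math.dist(p, q) for p in c1 for q in c2)


def agglomerate(points, min_dist_threshold):
    """Merge clusters pairwise until no pair passes the merge test."""
    clusters = [[p] for p in points]  # every sample starts as its own cluster
    while len(clusters) > 1:
        # Candidate pairs ordered by max-distance, smallest first.
        pairs = sorted(
            combinations(range(len(clusters)), 2),
            key=lambda ij: max_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        for i, j in pairs:
            if min_distance(clusters[i], clusters[j]) < min_dist_threshold:
                clusters[i] = clusters[i] + clusters[j]
                del clusters[j]  # safe because i < j
                break
        else:
            break  # no pair could be merged
    return clusters


# Two well-separated groups of hypothetical low-dimensional samples.
samples = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
groups = agglomerate(samples, min_dist_threshold=1.0)
```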

After the motion data is clustered by an arbitrary clustering method in step S110, the motion data of each cluster is projected into a low-dimensional space of body postures in step S120. Here, the "low-dimensional space" means a space with fewer dimensions than the space representing a human posture (for example, a 60-dimensional space), and may be, for example, a two- or three-dimensional space. That is, in the projection step S120, the motion data expressed in the high-dimensional space is transformed into a coordinate space to which dimensionality reduction can be applied. As a result, by selecting only the dimensions necessary to represent the data, the motion data can be represented and resampled more easily in a low-dimensional space.

Various methods are also known for reducing the number of dimensions representing data by projecting a high-dimensional space into a low-dimensional space. In an embodiment of the present invention, principal component analysis (PCA) may be used. This method transforms data defined in a multidimensional coordinate system into a new coordinate system of two or more mutually orthogonal vectors called principal components. For example, when transforming into a two-dimensional coordinate system, the principal component axes are selected such that the largest variance of the data lies along the first principal component axis and the second largest variance lies along the second principal component axis.
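A minimal sketch of such a projection is given below. To stay self-contained it finds the principal axes by power iteration with deflation rather than by a library eigen-solver; this is a stand-in chosen for illustration, and all function names and the small three-dimensional sample data are assumptions.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def covariance_matrix(rows):
    """Return (mean, sample covariance) of equally sized row vectors."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[k] for r in rows) / n for k in range(d)]
    centered = [[r[k] - mean[k] for k in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    return mean, cov


def dominant_eigenvector(matrix, iterations=500, seed=0):
    """Power iteration: repeatedly apply the matrix and renormalize."""
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in matrix]
    norm = math.sqrt(dot(v, v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [dot(row, v) for row in matrix]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:  # no variance left in any direction
            break
        v = [x / norm for x in w]
    return v


def principal_axes(rows, n_components=2):
    """Return the data mean and the first n_components principal axes."""
    mean, cov = covariance_matrix(rows)
    axes = []
    for _ in range(n_components):
        v = dominant_eigenvector(cov)
        eigenvalue = dot(v, [dot(row, v) for row in cov])
        axes.append(v)
        # Deflation: remove the variance already explained by this axis.
        cov = [[cov[a][b] - eigenvalue * v[a] * v[b] for b in range(len(v))]
               for a in range(len(v))]
    return mean, axes


def project(row, mean, axes):
    """Coordinates of one sample in the principal-component system."""
    centered = [x - m for x, m in zip(row, mean)]
    return tuple(dot(centered, axis) for axis in axes)


# Hypothetical 3-D data that varies mostly along its first coordinate.
poses = [[float(i), 0.5 * (i % 2), 0.0] for i in range(10)]
mean, axes = principal_axes(poses)
low_dim = [project(p, mean, axes) for p in poses]
```

For real 60-dimensional posture data, an established numerical library would normally be preferable to this hand-written iteration.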

Alternatively, methods other than principal component analysis (PCA) can be used to reduce the number of dimensions of the high-dimensional space. For example, Linear Discriminant Analysis (LDA), Canonical Correlation Analysis (CCA), or the like may be used, and the present invention is not limited to any particular method.

FIG. 4 illustrates, as an example, a case of projecting motion data belonging to any one cluster into a two-dimensional space.

In Fig. 4, the horizontal axis represents, for example, the coordinate axis of the first principal component, and the vertical axis represents the coordinate axis of the second principal component, and each point is motion data. That is, motion data defined in high dimensions is projected on a two-dimensional plane.

This projection step S120 is performed for each cluster, and it will be appreciated that the principal component coordinates selected as a result of the projection step S120 may be the same for all clusters but may differ from cluster to cluster. In addition, although FIG. 4 illustrates the conversion of the high-dimensional space into the two-dimensional space through the projection step S120, the alternative embodiment may convert the motion data into, for example, the three-dimensional or four-dimensional space.

Referring back to FIG. 3, after the projection step S120, a step S130 of stratifying a low dimensional space into a grid of cells is performed for each cluster.

Referring to FIG. 4, in the illustrated embodiment, a "cell" is a square whose side length is r; in general, when extended to n-dimensional space, it is a hypercube whose side length is r. The side length r may be any value and may be defined as r = k · r₀ in one embodiment.

Here, r₀ denotes the average distance between nearest-neighbor data among the motion data distributed in the transformed low-dimensional space (two-dimensional in the embodiment of FIG. 4), and k is a coefficient. The size of the cell, i.e., the side length r, is related to the density of the motion data finally produced by resampling. If the length r is small, denser motion data is generated. An embodiment in which the length r is changed will be described later with reference to FIGS. 7 and 8.
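The relation r = k · r₀ can be sketched as follows, together with the mapping of a projected sample to the grid cell containing it. The function names, the brute-force nearest-neighbor search, and the choice of anchoring the grid at the coordinate origin are assumptions for illustration.

```python
import math


def mean_nearest_neighbor_distance(points):
    """r0: average, over all samples, of the distance to the nearest other sample."""
    total = 0.0
    for i, p in enumerate(points):
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)


def cell_size(points, k=1.0):
    """Side length r = k * r0 of one grid cell."""
    return k * mean_nearest_neighbor_distance(points)


def cell_index(point, r):
    """Integer grid coordinates of the cell containing `point`.

    The grid is assumed to be anchored at the coordinate origin.
    """
    return tuple(math.floor(c / r) for c in point)


# Three hypothetical projected samples spaced one unit apart.
projected = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
r0 = mean_nearest_neighbor_distance(projected)
r = cell_size(projected, k=2.0)
```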

On the other hand, as shown in FIG. 4, the grid of n-dimensional cells adjacent to and centered on an arbitrary motion data is defined as its "neighbor cells" (Nd). For example, in a two-dimensional space such as FIG. 4, the neighbor cells Nd of any motion data (i.e., any one point shown in FIG. 4) form a 3x3 grid surrounding this motion data. If the motion data is represented in three-dimensional space, it will be appreciated that the neighbor cells Nd of any motion data form a 3x3x3 (= 3³) grid surrounding this motion data. Neighbor cells are defined in this way because it is not appropriate to generate motion data in all of the infinitely many cells of the transformed low-dimensional space (the two-dimensional space in FIG. 4); the definition sets the spatial extent within which motion data is generated in the low-dimensional space. In an embodiment of the present invention, motion data is generated only in cells neighboring the previously acquired motion data (i.e., the motion data projected into the low-dimensional space), that is, in the neighbor cells Nd.

Therefore, the size of the neighbor cells Nd is related to the range of the motion data finally produced by resampling and to the diversity of postures. For example, the neighbor cells Nd of any motion data may be defined as the cells within two cells of the motion data, that is, a 5x5 grid. In that case, the spatial range of the motion data generated by resampling becomes wider than in the 3x3 grid embodiment, so the number of motion data generated increases and the variety of body postures increases. An embodiment in which the size of the neighbor cells Nd is changed will be described later with reference to FIGS. 9 and 10.
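The neighbor cells Nd can be enumerated with a short sketch that works for any number of dimensions. The function name `neighborhood` and the `radius` parameter (1 for a 3x3 grid, 2 for a 5x5 grid) are assumptions for illustration.

```python
from itertools import product


def neighborhood(cell, radius=1):
    """All grid cells within `radius` cells of `cell`, including `cell` itself.

    radius=1 yields a 3x3 grid in 2-D and a 3x3x3 grid in 3-D;
    radius=2 yields a 5x5 grid in 2-D.
    """
    offsets = range(-radius, radius + 1)
    return {
        tuple(c + o for c, o in zip(cell, offset))
        for offset in product(offsets, repeat=len(cell))
    }


grid_3x3 = neighborhood((0, 0))
grid_3x3x3 = neighborhood((0, 0, 0))
grid_5x5 = neighborhood((0, 0), radius=2)
```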

When the space is stratified in step S130 according to the cell size and the size of the neighbor cells Nd as described above, the result is as shown in FIG. 5. FIG. 5 illustrates motion data projected into the two-dimensional space when the neighbor cells Nd are set to a 3x3 grid.

Referring back to FIG. 3, after the stratification step S130, some of the existing motion data is deleted for each cluster in steps S140 and S150 or new motion data is generated based on the existing motion data.

In detail, in step S140, duplicate or similar motion data may be deleted in each cell. That is, when a plurality of motion data exist in one cell, all but a predetermined number of them are deleted. Preferably, only one of the plurality of motion data is left in each cell, and the rest are deleted. The criterion for selecting the motion data to be kept in each cell is not particularly limited. Since motion data belonging to one cell are almost the same or very similar to each other, in one embodiment, for example, one data may be selected at random within each cell and the remaining data deleted.
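Step S140 with random selection can be sketched as follows. The function names, the seeded random generator, and the grid anchored at the origin are assumptions for illustration.

```python
import math
import random


def cell_index(point, r):
    """Integer grid coordinates of the cell containing `point` (grid at origin)."""
    return tuple(math.floor(c / r) for c in point)


def keep_one_per_cell(points, r, seed=0):
    """Group samples by cell and keep one randomly chosen sample per cell."""
    rng = random.Random(seed)  # seeded only to make the sketch repeatable
    members = {}
    for p in points:
        members.setdefault(cell_index(p, r), []).append(p)
    return {cell: rng.choice(group) for cell, group in members.items()}


# Two hypothetical samples share cell (0, 0); one lies alone in cell (1, 0).
projected = [(0.1, 0.1), (0.2, 0.3), (1.5, 0.2)]
kept = keep_one_per_cell(projected, r=1.0)
```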

Thereafter, in step S150, some of the existing motion data is deleted and new motion data is generated according to the fill ratio. Here, the fill ratio (input ratio) f is defined by the following equation.

f = (the number of cells that actually have motion data) / (the total number of cells that can have motion data)

The fill ratio f has a value greater than 0 and less than or equal to 1. When f is 1, every cell is filled with one motion data each; when f is 0.5, only half of all cells are filled with motion data.

The generation of new motion data in this step is performed on the neighbor cells Nd of each cell that contains previously acquired motion data (i.e., motion data projected into the low-dimensional space). That is, since each cell in which existing motion data (indicated by a dot) lies already contains one motion data, no new motion data needs to be generated there; in step S150, one motion data is generated in each of the empty cells among the neighbor cells Nd of a cell in which a dot exists.

When the deletion and generation of motion data are performed in steps S140 and S150 as described above, the result may be represented as shown in FIG. 6. FIG. 6 illustrates an example in which each cell is filled with one motion data when the fill ratio f is 1 in the two-dimensional space of FIG. 5. In FIG. 6, the dots represent previously acquired motion data (that is, motion data projected into the low-dimensional space); starting from the state of FIG. 5, only one dot per cell is left and the rest are deleted (S140), so that, as in FIG. 6, only one dot remains in a cell. In FIG. 6, the "x" marks are new motion data generated in the neighbor cells Nd by step S150. As a result, each cell contains one dot or one x mark, that is, one motion data. By executing the resampling method of the present invention in this way, it is possible to obtain motion data having a much more diverse and uniform distribution than the conventional method, which uses only the previously acquired motion data (that is, the dots). In other words, while FIG. 5 shows the distribution of motion data in the conventional method without resampling according to the present invention, FIG. 6 shows the state after resampling, in which motion data overlapping in one cell is deleted and new motion data is generated in empty cells, and it can be seen that a uniform and diverse distribution of motion data is obtained as a whole.
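Steps S140 and S150 for one cluster can be sketched together as follows. Several choices here are assumptions made for illustration and are not stated in the embodiment: the first sample found in a cell is the one kept, a newly generated sample is placed at the center of its cell, the cells to fill or empty are chosen at random, and the mapping of a generated low-dimensional sample back to a full posture is not covered.

```python
import math
import random
from itertools import product


def cell_index(point, r):
    return tuple(math.floor(c / r) for c in point)


def neighborhood(cell, radius=1):
    offsets = range(-radius, radius + 1)
    return {tuple(c + o for c, o in zip(cell, off))
            for off in product(offsets, repeat=len(cell))}


def resample_cluster(points, r, fill_ratio=1.0, radius=1, seed=0):
    """Return ({cell: sample}, achieved fill ratio) for one cluster."""
    if not 0.0 < fill_ratio <= 1.0:
        raise ValueError("fill_ratio must be in (0, 1]")
    rng = random.Random(seed)
    # Step S140: keep one existing sample per occupied cell (here: the first).
    filled = {}
    for p in points:
        filled.setdefault(cell_index(p, r), tuple(p))
    # Cells that may hold data: the neighbor cells Nd of all occupied cells.
    candidates = set()
    for cell in filled:
        candidates |= neighborhood(cell, radius)
    target = max(1, round(fill_ratio * len(candidates)))
    # Step S150: generate samples in empty neighbor cells up to the target.
    empty = sorted(candidates - set(filled))
    rng.shuffle(empty)
    for cell in empty[:max(0, target - len(filled))]:
        filled[cell] = tuple((i + 0.5) * r for i in cell)  # assumed: cell center
    # If existing samples already exceed the target, delete the surplus.
    surplus = len(filled) - target
    if surplus > 0:
        for cell in rng.sample(sorted(filled), surplus):
            del filled[cell]
    return filled, len(filled) / len(candidates)


# One hypothetical sample, 3x3 neighborhood, every cell filled (f = 1).
full, f_full = resample_cluster([(0.5, 0.5)], r=1.0)
# Two distant samples give 18 candidate cells; f = 0.5 fills 9 of them.
half, f_half = resample_cluster([(0.5, 0.5), (10.5, 0.5)], r=1.0, fill_ratio=0.5)
```

The second return value is the fill ratio actually achieved, computed as the number of cells holding motion data divided by the number of cells that can hold motion data.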

Hereinafter, an alternative embodiment of the resampling step will be described with reference to FIGS. 7 to 11.

FIG. 7 is a diagram showing a state after motion data is projected in a two-dimensional space and stratified into a grid of cells in accordance with a first alternative embodiment (S120 and S130). Comparing FIG. 7 with FIG. 5, it can be seen that the distribution of motion data itself is the same, but the length r of one side of the cell is different. That is, the alternative embodiment of FIG. 7 is an embodiment in which the size of the cell is increased by increasing the length r compared to FIG. 5.

FIG. 8 illustrates a state in which motion data is uniformly distributed in each cell of the two-dimensional space of FIG. 7 by the generation and deletion steps S140 and S150 of the motion data. In FIG. 8, a dot means previously acquired motion data (i.e., motion data projected into the low-dimensional space); by performing step S140 in the state of FIG. 7, only one motion data is left in each cell. In FIG. 8, an "x" is new motion data generated in the neighbor cells Nd by step S150. In this manner, executing steps S140 and S150 results in one motion data in every cell. Compared with FIG. 6, the cell size is larger, so the distribution density of motion data is lower than in FIG. 6.

Fig. 9 is a view after steps S120 and S130 are executed in which motion data is projected in a two-dimensional space and stratified into a grid of cells according to a second alternative embodiment. Comparing FIG. 9 with FIG. 5, it can be seen that the motion data itself has the same distribution, but the length r of one side of the cell is cut in half compared to FIG. 5 and the size of the neighboring cell Nd is set to a 5 × 5 grid.

FIG. 10 shows the result of executing the motion data generation and deletion steps S140 and S150 in each cell of the two-dimensional space of FIG. 9. In this embodiment, it is assumed that the fill ratio f is 1. In cells where existing motion data overlap, only one data is left and the rest are deleted, and one motion data is generated in each cell in which no motion data exists. As a result, one motion data exists in every cell. Compared with FIG. 6, the cell size is smaller, so the distribution density of the motion data is higher than in FIG. 6.

FIGS. 11A and 11B are diagrams for explaining the steps of generating motion data in 75% and 50% of the cells in the two-dimensional space of FIG. 9, respectively. FIG. 11A illustrates a case in which the fill ratio f is set to 0.75 and the motion data is resampled, so that motion data exists in only three quarters of the cells. FIG. 11B illustrates a case in which the fill ratio f is set to 0.5 and the motion data is resampled, so that only half of the cells have motion data.

Referring now to FIG. 3 again, step S160 of deleting overlapping motion data from the entire motion data may be further added. The steps S120 to S150 described above are the steps performed individually for each cluster. In other words, motion data belonging to this cluster is projected to a low dimensional space for each cluster, and duplicate data is deleted, and data is generated in empty cells to obtain motion data uniformly distributed for each cluster. However, although motion data are uniformly distributed without overlapping in a cluster, there is a possibility of overlapping with motion data in another cluster. Therefore, in step S160, if there is overlapping motion data among the entire motion data, it is deleted.
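Step S160 can be sketched as a pass over all clusters that drops any posture lying too close to one already kept. The embodiment does not define when two motion data "overlap", so the Euclidean distance test and the `tolerance` parameter are assumptions, as is the function name. The pairwise comparison is quadratic in the number of samples and would need a spatial index for large databases.

```python
import math


def drop_cross_cluster_duplicates(clusters, tolerance):
    """Keep a pose only if it is farther than `tolerance` from every kept pose.

    Clusters are visited in order, so a later cluster loses the duplicate.
    As a side effect, near-identical poses inside one cluster are dropped too.
    """
    kept = []
    result = []
    for cluster in clusters:
        survivors = []
        for pose in cluster:
            if all(math.dist(pose, other) > tolerance for other in kept):
                survivors.append(pose)
                kept.append(pose)
        result.append(survivors)
    return result


# The first pose of the second hypothetical cluster nearly repeats (1.0, 0.0).
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(1.0, 0.001), (5.0, 5.0)]
deduplicated = drop_cross_cluster_duplicates([cluster_a, cluster_b], tolerance=0.01)
```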

Now, effects according to one embodiment of the present invention will be described with reference to FIGS. 12 and 13. First, in order to analyze the experimental results of the present invention, human postures are classified into three types as shown in FIG. 12. This classification is for the analysis of experimental results and may not be the same as the classification used to cluster the categories of human posture for resampling.

Referring to FIG. 12, Type I includes sitting and squatting positions. Specifically, this type includes postures in which the height of the pelvis from the ground is lower than the height of the knee when standing upright, and postures in which body parts other than the feet are in contact with the ground.

Type II includes the acrobatic stand. Specifically, this type covers postures in which the upper body is inclined at least 45 degrees with respect to the vertical axis, postures in which the knee or foot is higher than the pelvis, and postures in which any part of the upper body is in contact with a part of the lower body (for example, a posture in which the hand is placed on the knee). In general, standing postures are the easiest to recognize, but this type corresponds to standing postures that are difficult to recognize.

Type III includes an Upright Stand. This type includes all postures except Type I and Type II and includes, for example, a standing posture with the feet on the ground and no contact between the upper body and the lower body. Type III is easiest to distinguish relative to Type I and Type II.

FIG. 13 is a diagram for explaining an effect according to an embodiment of the present invention. As indicated in the left column of the table, seven training data sets were used in the experiment to machine-learn the body part recognition algorithm: one set for each of Type I, Type II, and Type III; one set for each combination of two of the three types; and one set containing all three types. All data sets were uniformly resampled to approximately the same size (approximately 10,000 frames per set), and a test depth image was then input into the trained body part recognition algorithm as shown in FIG. 2 to predict each part of the body.

The values without parentheses in the results of FIG. 13 are true positive rates for body posture prediction, that is, the ratio of the number of joints determined to be true to the total number of joints. A predicted joint is determined to be true if it lies within 10 cm of the actual joint position.
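The true positive ratio described above can be sketched as follows. Joint positions are assumed to be given in meters, a joint exactly 10 cm away is counted as true, and the function name is hypothetical; none of these details is specified in the description.

```python
import math


def joint_true_positive_ratio(predicted, actual, threshold_m=0.10):
    """Fraction of joints whose prediction lies within the threshold distance."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty joint lists of equal length")
    hits = sum(
        1 for p, a in zip(predicted, actual) if math.dist(p, a) <= threshold_m
    )
    return hits / len(actual)


# Three hypothetical joints: exact, 5 cm off, and 50 cm off.
actual_joints = [(0.0, 0.0, 0.0)] * 3
predicted_joints = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.05), (0.0, 0.0, 0.5)]
ratio = joint_true_positive_ratio(predicted_joints, actual_joints)
```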

In FIG. 13, the values in parentheses are false positives, indicating cases where the predicted joint position is 10 cm or more away from the actual position. The bottom row of FIG. 13 shows the results of an experiment using the existing Kinect device, for comparison with the present invention.
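
The two values reported in FIG. 13 can be illustrated with a short calculation. This is a sketch under stated assumptions: joint positions are given as (x, y, z) tuples in meters, every joint receives exactly one prediction, and the function name is hypothetical. A distance of exactly 10 cm is counted as false here, following the "10 cm or more" wording above.

```python
import math

THRESHOLD_M = 0.10  # 10 cm, expressed in meters (unit choice is an assumption)


def joint_rates(predicted, actual, threshold=THRESHOLD_M):
    """Return (true positive ratio, false positive ratio) over all joints.

    predicted, actual: equal-length sequences of (x, y, z) positions in meters,
    listed in the same joint order.
    """
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty joint lists of equal length")
    # A joint counts as true when its prediction is closer than the threshold.
    hits = sum(1 for p, a in zip(predicted, actual)
               if math.dist(p, a) < threshold)
    total = len(actual)
    return hits / total, (total - hits) / total
```

With four joints whose prediction errors are 0 cm, 5 cm, 20 cm, and 50 cm, this sketch yields a true positive ratio of 0.5 and a false positive ratio of 0.5.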

As can be seen from the table of FIG. 13, every two-type combination of data sets (i.e., the "Type I & Type II" set, the "Type I & Type III" set, and the "Type II & Type III" set) shows consistent results. For example, training on the combination of Type I and Type II data performs well when tested on postures belonging to Type I or Type II, but performs somewhat worse on postures of Type III. The overall performance of a combination of types is better than that of a single-type data set. However, when testing postures belonging to Type I, training on the Type I data set alone performs better than the other combinations. In other words, if only a certain posture is to be predicted, machine learning using only the data of that specific posture can yield better results.

In addition, comparing the present invention with Kinect shows that the present invention is comparable with Kinect in overall performance. In particular, it can be seen that the present invention shows better results in lower body recognition.

Referring to FIG. 14, the motion data generating apparatus 10 according to an embodiment may communicate with the motion database 20 to obtain motion data therefrom. Although not shown, the motion data generating apparatus 10 may also be connected to a motion capture system and receive motion data therefrom, or may receive motion data from any terminal device or server through a network such as the Internet.

In the illustrated embodiment, the motion data generating apparatus 10 may be any terminal device or server capable of performing the resampling algorithm described with reference to FIGS. 2 and 3 and, as illustrated, includes a processor 110, a storage unit 120, and a memory 130. In this configuration, the resampling algorithm may be stored in the storage unit 120 and loaded into the memory 130 and executed under the control of the processor 110. Although not shown in FIG. 14, a body part recognition algorithm that performs machine learning using learning data may also be included in the motion data generating apparatus 10 of FIG. 14.

As described above, applying the motion data generation method according to an embodiment of the present invention makes it possible to generate a large, uniform set of motion data from a non-uniform or small motion database. This saves the time and effort otherwise needed to secure a large, uniform motion database, while improving the prediction performance for body postures.
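
The deletion side of the resampling (stratifying a low-dimensional posture space into a grid of cells and thinning crowded cells, cluster by cluster) can be sketched as follows. This is an illustrative sketch, not the disclosed implementation. It assumes the motion data have already been projected to low-dimensional points, keeps one sample per cell as the predetermined number, uses an equal quota for every cluster, and omits the generation of second motion data entirely. All function and parameter names are hypothetical.

```python
import math
import random
from collections import defaultdict


def thin_by_grid(points, r, keep_per_cell=1):
    """Keep at most keep_per_cell points in each hypercube cell of edge r."""
    cells = defaultdict(list)
    for p in points:
        # Cell index: integer grid coordinate along every dimension.
        key = tuple(math.floor(c / r) for c in p)
        if len(cells[key]) < keep_per_cell:
            cells[key].append(p)
    return [p for kept in cells.values() for p in kept]


def resample_clusters(clusters, r, per_cluster, seed=0):
    """Thin each cluster on the grid, then cap it at an equal quota.

    clusters: dict mapping a cluster name to a list of low-dimensional points.
    Clusters that end up below the quota are left as they are, because
    generating new samples is outside the scope of this sketch.
    """
    rng = random.Random(seed)
    resampled = {}
    for name, points in clusters.items():
        thinned = thin_by_grid(points, r)
        if len(thinned) > per_cluster:
            thinned = rng.sample(thinned, per_cluster)
        resampled[name] = thinned
    return resampled
```

A larger edge length r removes more near-duplicate postures, so r acts as the main control on how aggressively dense regions are thinned.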

As described above, although the present invention has been described by way of limited embodiments and drawings, the present invention is not limited to the above embodiments. Those skilled in the art will understand that various modifications and variations are possible from the above description. Therefore, the scope of the present invention should not be limited to the described embodiments, but should be determined not only by the claims below but also by the equivalents of the claims.

Claims

A method for generating motion data, the method comprising:

obtaining first motion data from a motion database (DB) in which motion data is stored;

clustering the first motion data according to a posture category of a body; and

resampling the first motion data for each cluster according to a distribution ratio of the amount of motion data set for each cluster.

The method of claim 1, wherein resampling the first motion data comprises generating second motion data from the first motion data.

The method of claim 1, wherein the distribution ratio of the amount of motion data has the same value for all clusters.

The method of claim 1, wherein resampling the first motion data for each cluster comprises:

projecting the first motion data into a low-dimensional space of body postures;

stratifying the low-dimensional space into a grid of cells; and

deleting some of the first motion data or generating second motion data based on the first motion data.

The method of claim 4, wherein each cell is a hypercube having the same edge length r in each dimension.

The method of claim 4, wherein deleting some of the first motion data or generating second motion data comprises:

when a plurality of first motion data exist in one cell, deleting the first motion data in that cell except for a predetermined number thereof; and

deleting some of the first motion data or generating second motion data according to a preset fill ratio f,

wherein the fill ratio f is defined by the following equation:

f = (number of cells containing motion data) / (total number of cells in the low-dimensional space)

and the fill ratio f has a value greater than zero and less than or equal to one.
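
As an illustration of the equation above, the fill ratio can be computed from a set of projected points. This is a sketch only: it assumes the low-dimensional space is bounded by an axis-aligned box given by hypothetical `lower` and `upper` corners, that all points lie inside that box, and that at least one point is present (otherwise f would be zero, outside the stated range).

```python
import math


def fill_ratio(points, r, lower, upper):
    """f = occupied cells / total cells for a grid of edge r over [lower, upper)."""
    # Number of cells along each dimension of the bounding box.
    cells_per_dim = [math.ceil((hi - lo) / r) for lo, hi in zip(lower, upper)]
    total_cells = math.prod(cells_per_dim)
    if total_cells <= 0:
        raise ValueError("the bounding box must contain at least one cell")
    # Distinct cells that contain at least one point.
    occupied = {
        tuple(math.floor((c - lo) / r) for c, lo in zip(p, lower))
        for p in points
    }
    return len(occupied) / total_cells


f = fill_ratio([(0.1, 0.1), (0.2, 0.2), (1.5, 0.1)],
               r=1.0, lower=(0.0, 0.0), upper=(2.0, 2.0))
print(f)  # 2 occupied cells out of 4, so 0.5
```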

The method of claim 4, further comprising, after resampling the first motion data for each cluster, deleting overlapping motion data among the first motion data and the second motion data of the entire set of clusters.

A body recognition method for predicting a three-dimensional posture of a body from a depth image of the body, the method comprising:

obtaining a learning depth image from motion data generated by the motion data generation method according to any one of claims 1 to 7; and

machine-training a body part recognition algorithm using the learning depth image.

An apparatus for generating motion data by means of a motion data generation application, the apparatus comprising a processor and a memory,

wherein, when the application is loaded into the memory and executed under the control of the processor, the application performs functions of:

obtaining first motion data from a motion database (DB) in which motion data is stored;

clustering the first motion data according to a posture category of a body; and

resampling the first motion data for each cluster according to a distribution ratio of the amount of motion data set for each cluster.

The apparatus of claim 9, wherein the function of resampling the first motion data for each cluster comprises functions of:

projecting the first motion data into a low-dimensional space of body postures;

stratifying the low-dimensional space into a grid of cells; and

deleting some of the first motion data or generating second motion data based on the first motion data.

The apparatus of claim 10, wherein the function of deleting some of the first motion data or generating second motion data comprises functions of:

when a plurality of first motion data exist in one cell, deleting the first motion data in that cell except for a predetermined number thereof; and

deleting some of the first motion data or generating second motion data according to a preset fill ratio f.

The apparatus of claim 9, wherein, after performing the function of resampling the first motion data for each cluster, the application performs a function of deleting overlapping motion data among the first motion data and the second motion data of the entire set of clusters.

A computer-readable recording medium having recorded thereon a program for executing the method according to any one of claims 1 to 7.

A computer-readable recording medium having recorded thereon a program for executing the method of claim 8 on a computer.