CN107341179B - Standard motion database generation method and device and storage device - Google Patents

Standard motion database generation method and device and storage device Download PDF

Info

Publication number
CN107341179B
CN107341179B CN201710386848.6A CN107341179B
Authority
CN
China
Prior art keywords
standard
human body
image sequence
depth
depth image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710386848.6A
Other languages
Chinese (zh)
Other versions
CN107341179A (en)
Inventor
黄源浩
肖振中
许宏淮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Orbbec Inc
Original Assignee
Shenzhen Orbbec Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Orbbec Co Ltd filed Critical Shenzhen Orbbec Co Ltd
Priority to CN201710386848.6A priority Critical patent/CN107341179B/en
Publication of CN107341179A publication Critical patent/CN107341179A/en
Application granted granted Critical
Publication of CN107341179B publication Critical patent/CN107341179B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/5866Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/194Segmentation; Edge detection involving foreground-background segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a standard motion database generation method, a generation apparatus, and a storage device. The method comprises the steps of: obtaining a depth image sequence containing standard motion actions of a human body; marking each part of the human body according to the depth image sequence; recording standard track information of each part of the body during the standard motion according to the depth image sequence; and saving the standard track information to form a standard motion database. The apparatus includes a depth camera, a processor, and a memory. The storage device stores program data that can be executed to implement the above method. The invention can acquire comprehensive information on the human body during standard motion and accurately distinguish occlusion relationships between the trunk, the limbs, and other body parts, so that more comprehensive and accurate data can be acquired to form a standard motion database, making later evaluation and analysis of human motion more accurate.

Description

Standard motion database generation method and device and storage device
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method and an apparatus for generating a standard motion database, and a storage apparatus.
Background
The depth camera captures the depth information of each pixel in the depth image of the scene, wherein the depth information is the distance from the surface of the scene to the depth camera, and therefore the position information of the scene target can be acquired according to the depth image.
In the training of sports such as ball games and track-and-field events, electronic data are formed by capturing the overall body posture, the motions, and the motion tracks of all parts of the body, and analysis of these data is of great significance for improving the training effect. In the prior art, such data are obtained either by wearing electronic devices capable of tracking a motion trail, or by analyzing and evaluating the motion trail and the human body posture from a 2D image sequence. In the research and practice of the prior art, the inventors of the present invention found that data acquired by wearable trajectory-tracking devices are not comprehensive, being limited to the motion of the areas where the devices are worn, and that wearing devices on each part of the body inevitably affects the motion itself. Moreover, postures with occlusion relationships, such as limbs in front of the trunk, cannot be accurately distinguished in a 2D image sequence. The analysis results are therefore inaccurate, and it is difficult to improve the training effect.
Disclosure of Invention
The invention provides a standard motion database generation method, a generation apparatus, and a storage device, which can solve the problem of inaccurate analysis results in the prior art.
In order to solve the technical problems, the invention adopts a technical scheme that: a standard motion database generation method is provided, which comprises the following steps: acquiring a depth image sequence containing standard motion actions of a human body; marking each part of the human body according to the depth image sequence; recording standard track information of all parts of the body in standard motion according to the depth image sequence; and saving the standard track information to form a standard motion database.
In order to solve the technical problem, the invention adopts another technical scheme: an apparatus for generating a standard motion database is provided, the apparatus comprising a depth camera, a processor, and a memory, both the depth camera and the memory being coupled to the processor. The depth camera is used for acquiring a depth image sequence containing standard human motion actions; the processor is used for marking each part of the human body according to the depth image sequence and recording standard track information of each part of the body during the standard motion according to the depth image sequence; the memory is used for saving the standard track information to form a standard motion database.
In order to solve the technical problem, the invention adopts yet another technical scheme: a storage device is provided which stores program data that can be executed to implement the above method.
The invention has the beneficial effects that: different from the prior art, the present invention processes a depth image sequence, tracks the motion track of each part of the body during the standard motion action of the human body to obtain standard track information, and saves the standard track information to form a standard motion database, which serves as a reference standard for later evaluation of the motion actions of a human body to be evaluated. The invention can acquire comprehensive information on the human body during standard motion and accurately distinguish occlusion relationships between the trunk, the limbs, and other body parts, so that more comprehensive and accurate data can be acquired to form a standard motion database, making later evaluation and analysis of human motion more accurate.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a schematic flow chart diagram illustrating an embodiment of a standard exercise database generation method provided in the present invention;
FIG. 2 is a schematic flow chart of another embodiment of a standard exercise database generation method provided by the present invention;
FIG. 3 is a schematic flow chart of step S22 in FIG. 2;
fig. 4 is a schematic flowchart of recording standard posture information in step S23 in fig. 2;
FIG. 5 is a schematic diagram of a method for generating a standard motion database according to another embodiment of the present invention, wherein the relationship between the center of mass of the knee joint and the spatial position of the center of the human body is shown;
fig. 6 is a schematic structural diagram of an embodiment of an apparatus for generating a standard exercise database according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a schematic flowchart illustrating a method for generating a standard exercise database according to an embodiment of the present invention. The standard motion database generation method shown in fig. 1 includes the steps of:
and S11, acquiring a depth image sequence containing the standard motion action of the human body.
Here, a standard sport action of the human body may be an action performed by a professional athlete, a coach, or the like, such as a squat, a push-up, or a sit-up. A depth image includes not only the pixel information of objects in space but also depth information for each pixel, i.e., the distance from the object in space to the depth camera. The depth image sequence may be obtained by a depth camera and refers to continuous depth images over a time period, i.e., the depth camera tracks and captures the whole motion process of the human body.
S12, labeling each body part of the human body based on the depth image sequence.
Specifically, the various parts of the human body may be the head, the shoulder, the neck, the trunk, the four limbs, the hands, the feet, and the like, as well as the joints of the knee, the elbow, the wrist, the ankle, the hip joint, and the like. By analyzing the depth image sequence, each body part of the human body is identified in the depth image sequence, and is marked.
And S13, recording standard track information of each part of the body in standard motion according to the depth image sequence.
In step S13, the labeled body parts are tracked in the depth image sequence, so that the standard trajectory information of the body parts is acquired and recorded.
And S14, saving the standard track information to form a standard motion database.
In step S14, the standard trajectory information of each part of the human body during the standard motion is stored to form a database, so that during later evaluation the motion data of a human body to be evaluated can be compared with the standard motion data in the database to analyze whether its motion meets the requirements of the standard motion. For example, after comparing the trajectory information of each part of the human body to be evaluated with the standard trajectory information in the standard motion database, it can be determined whether the motion meets the requirements of the standard motion, and adjustment suggestions can further be provided, including the part to adjust and the adjustment direction, for example, that the left hand needs to move further downwards, that the distance between the feet should be reduced, or that the knees should not extend beyond the toes.
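As an illustration of this later-stage comparison (not part of the claimed method; the function names, the per-frame (x, y, z) trajectory format, and the tolerance value are assumptions), a minimal sketch might look like:

```python
# Illustrative later-stage comparison against the standard motion database.
# Trajectories are equal-length lists of (x, y, z) positions per frame;
# names and the tolerance default are assumptions, not from the patent.

def trajectory_deviation(evaluated, standard):
    """Mean Euclidean deviation between two equal-length trajectories."""
    assert len(evaluated) == len(standard)
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(evaluated, standard):
        total += ((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2) ** 0.5
    return total / len(evaluated)

def suggest_adjustment(part, evaluated, standard, tolerance=0.05):
    """Coarse vertical adjustment hint for one body part, or None if close enough."""
    if trajectory_deviation(evaluated, standard) <= tolerance:
        return None
    # Mean vertical offset (standard minus evaluated) gives the direction;
    # a full system would consider all axes and all body parts.
    dy = sum(s[1] - e[1] for e, s in zip(evaluated, standard)) / len(standard)
    return "move %s %s" % (part, "down" if dy < 0 else "up")
```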
Different from the prior art, the present embodiment processes a depth image sequence, tracks the motion track of each part of the body during the standard motion action of the human body to obtain standard track information, and saves the standard track information to form a standard motion database, which serves as a reference standard for later evaluation of the motion actions of a human body to be evaluated. In this way, comprehensive information on the human body during standard motion can be acquired and the occlusion relationships between the trunk, the limbs, and other body parts can be accurately distinguished, so that more comprehensive and accurate data can be acquired to form a standard motion database, making later evaluation and analysis of human motion more accurate.
Referring to fig. 2, fig. 2 is a schematic flowchart illustrating a method for generating a standard exercise database according to another embodiment of the present invention.
And S21, acquiring a depth image sequence containing the standard motion action of the human body.
S22, labeling each body part of the human body based on the depth image sequence.
Specifically, as shown in fig. 3, fig. 3 is a schematic flow chart of step S22 in fig. 2. Step S22 includes:
and S221, removing the background in the depth image series.
For example, one blob (i.e., a connected group of pixels having similar values) may be preliminarily identified in the depth map as the subject's body, and other blobs having significantly different depth values may then be removed. A blob preliminarily identified in this manner must generally have some minimum size. However, a simple Euclidean distance between the pixel coordinates at the edges of the blob does not give an accurate measure of that size, because the size (in pixels) of a blob corresponding to an object of a given physical size increases or decreases as the distance of the object from the device changes.
Thus, to determine the actual size of an object, the (x, y, depth) coordinates of the object are first transformed into "real world" coordinates (xr, yr, depth) using the following formula:
xr = (x − fovx/2) × pixel size × depth / reference depth
yr = (y − fovy/2) × pixel size × depth / reference depth
Here, fovx and fovy are the fields of view (in pixels) of the depth map in the x and y directions. The pixel size is the length that a pixel subtends at a given distance (the reference depth) from the depth camera. The size of the blob may then be determined by taking the Euclidean distance between the real-world coordinates of the edges of the blob.
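A minimal sketch of this transform, assuming fovx, fovy, pixel size, and reference depth are known device parameters:

```python
# Sketch of the pixel-to-real-world transform described above. The concrete
# parameter values used in the test are illustrative assumptions.

def to_real_world(x, y, depth, fovx, fovy, pixel_size, reference_depth):
    """Map pixel coordinates (x, y, depth) to real-world (xr, yr, depth)."""
    xr = (x - fovx / 2.0) * pixel_size * depth / reference_depth
    yr = (y - fovy / 2.0) * pixel_size * depth / reference_depth
    return xr, yr, depth

def real_world_distance(p1, p2):
    """Euclidean distance between two real-world points, e.g. blob edge points."""
    return sum((a - b) ** 2 for a, b in zip(p1, p2)) ** 0.5
```

Measuring blob size on real-world coordinates, rather than raw pixels, keeps the minimum-size test independent of the subject's distance from the camera.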
Thus, the background in the depth image may be removed by identifying a blob having a required minimum size with a minimum average depth value among blobs in the scene. It may be assumed that the blob closest to the depth camera is a human body, that all pixels with a depth greater than the average depth value by at least some threshold are assumed to belong to background objects, and that the depth values of these pixels are set to zero. Wherein, the threshold value can be determined according to actual needs. Furthermore, in some embodiments, pixels having depth values significantly smaller than the average depth value of the blob may also be zeroed out. Alternatively, a maximum depth may be preset so that objects exceeding the maximum depth are ignored.
In some embodiments, the depth value may also be determined dynamically, beyond which objects are removed from the depth map. For this reason, it is assumed that objects in the scene are moving. Thus, any pixel that has not changed in depth for some minimum number of frames is assumed to be a background object. Pixels with depth values greater than the static depth value are considered to belong to the background object and are therefore all zeroed out. Initially, all pixels in the scene may be defined as static, or all pixels in the scene may be defined as non-static. In both cases, the actual depth filter can be dynamically generated as soon as the object starts to move.
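A simplified sketch of the static thresholding variant described above; it treats all valid pixels as one blob when computing the average depth, which is a simplification of the blob-based procedure:

```python
# Hedged sketch of static background removal: pixels much farther than the
# average depth of the foreground are zeroed out. The image is a list of rows
# of depth values, with 0 meaning "no valid depth".

def remove_background(depth_image, threshold):
    valid = [d for row in depth_image for d in row if d > 0]
    avg = sum(valid) / len(valid)  # average depth of the (assumed) body blob
    return [[d if 0 < d <= avg + threshold else 0 for d in row]
            for row in depth_image]
```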
Of course, the background in the depth image may also be removed by other methods known in the art.
S222, acquiring the contour of the human body in the depth image sequence.
After removing the background, the outer contour of the body can be found in the depth map by an edge detection method. In this embodiment, a two-step thresholding mechanism is used to find the contour of the human body:
First, all pixels in the blob in the depth image that correspond to the humanoid form are traversed, and a pixel is marked as a contour position if it has a valid depth value and if the difference in depth value between that pixel and at least one of its four connected neighboring pixels (right, left, above, and below) is greater than a first threshold. (The difference between a valid depth value and a zero value is considered infinite.)
Then, after the above step is completed, the blob is traversed again, and any pixel not yet marked as a contour position is marked as one if there is a contour pixel among its eight connected neighboring pixels and if the difference in depth value between the current pixel and at least one of its remaining neighbors is greater than a second threshold (lower than the first threshold).
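The two-pass scheme above can be sketched as follows; the dict-based depth representation and the function names are illustrative choices, not from the patent:

```python
# Sketch of the two-step (hysteresis-style) contour thresholding: strong edges
# are marked first, then weaker edges adjacent to already-marked contour.
# depth is a dict {(x, y): value}; 0 means invalid, and a valid-vs-zero
# difference counts as infinite, as stated in the text.

def diff(a, b):
    if (a == 0) != (b == 0):
        return float("inf")
    return abs(a - b)

def find_contour(depth, t1, t2):
    four = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    eight = four + [(1, 1), (1, -1), (-1, 1), (-1, -1)]
    contour = set()
    # Pass 1: valid pixels whose depth differs from a 4-neighbor by more than t1.
    for (x, y), d in depth.items():
        if d > 0 and any(diff(d, depth.get((x + dx, y + dy), 0)) > t1
                         for dx, dy in four):
            contour.add((x, y))
    # Pass 2: unmarked valid pixels next to a contour pixel, with an 8-neighbor
    # difference above the lower threshold t2.
    for (x, y), d in depth.items():
        if (x, y) in contour or d == 0:
            continue
        neighbors = [(x + dx, y + dy) for dx, dy in eight]
        if any(n in contour for n in neighbors) and any(
                diff(d, depth.get(n, 0)) > t2 for n in neighbors):
            contour.add((x, y))
    return contour
```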
And S223, identifying the trunk of the human body according to the contour.
After finding the outer contour of the human body, various parts of the body, such as the head, torso, and limbs, are identified.
The depth image is first rotated so that the body contour is in a vertical position. The purpose of this rotation is to simplify the calculations in the following steps by aligning the longitudinal axis of the body with the Y coordinate (vertical) axis. Alternatively, the following calculations may be performed relative to the longitudinal axis of the body without the need to make this rotation, as will be appreciated by those skilled in the art.
The 3D axes of the body may be found prior to identifying various parts of the body. Specifically, finding the 3D axis of the body may employ the following method:
the original depth image is down-sampled (down-sample) into a grid of nodes, where one node is taken n pixels apart in the X-direction and Y-direction. The depth value of each node is calculated based on the depth values in the n × n squares centered on the node. If more than half of the pixels in a block have a value of zero, the corresponding node is set to a value of zero. Otherwise, the node is set to the average of the valid depth values in the nxn square.
This down-sampled depth image may then be further "cleaned up" based on the values of neighboring nodes: if most of the neighbors of a given node have a value of zero, then that node is also set to a value of zero (even if it has a valid depth value after the preceding steps).
Upon completion of the above steps, the vertical axis of the remaining nodes in the down-sampled graph is found. To do this, a linear least-squares fit can be performed to find the line that best fits the nodes. Alternatively, an ellipse may be fitted around the nodes and its major axis found.
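The down-sampling step described above can be sketched as follows (the list-of-rows image representation is an assumption; the neighbor clean-up and the axis fit are omitted for brevity):

```python
# Sketch of down-sampling the depth image into a grid of nodes: each node
# averages the valid depths in its n x n square, and is set to zero when more
# than half of the pixels in the square are zero.

def downsample(depth_image, n):
    rows, cols = len(depth_image), len(depth_image[0])
    nodes = []
    for i in range(0, rows, n):
        node_row = []
        for j in range(0, cols, n):
            square = [depth_image[r][c]
                      for r in range(i, min(i + n, rows))
                      for c in range(j, min(j + n, cols))]
            valid = [d for d in square if d > 0]
            if len(valid) * 2 < len(square):  # more than half zero
                node_row.append(0)
            else:
                node_row.append(sum(valid) / len(valid))
        nodes.append(node_row)
    return nodes
```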
After finding the 3D axis of the body, the torso of the body is identified by measuring the thickness of the body contour in directions parallel and perpendicular to the longitudinal axis. To this end, a bounding box may be defined around the body contour, and the pixel values in this box may then be binarized: pixels with zero depth values are set to 0 and pixels with non-zero depth values are set to 1.
Then, a longitudinal thickness value is calculated for each X value within the box by summing the binary pixel values along the corresponding vertical line, and a transverse thickness value is calculated for each Y value by summing the binary pixel values along the corresponding horizontal line. A threshold is applied to the resulting values to identify along which vertical and horizontal lines the contour is relatively thick.
When the transverse thickness of a certain horizontal area of the outline exceeds an X threshold value and the longitudinal thickness of a certain vertical area exceeds a Y threshold value, the intersection of the horizontal area and the vertical area can be determined as the trunk.
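A sketch of this thickness-profile torso test on a binarized mask (the threshold parameter names and the row/column orientation are assumptions):

```python
# Sketch of torso detection: sum binary pixels along vertical lines (per X,
# "longitudinal" thickness) and horizontal lines (per Y, "transverse"
# thickness), threshold both profiles, and take the intersection of the thick
# bands that also lies on the body.

def find_torso(mask, row_threshold, col_threshold):
    rows, cols = len(mask), len(mask[0])
    # Longitudinal thickness: binary sum along each vertical line (per X value).
    col_thickness = [sum(mask[r][c] for r in range(rows)) for c in range(cols)]
    # Transverse thickness: binary sum along each horizontal line (per Y value).
    row_thickness = [sum(mask[r][c] for c in range(cols)) for r in range(rows)]
    thick_cols = {c for c in range(cols) if col_thickness[c] > col_threshold}
    thick_rows = {r for r in range(rows) if row_thickness[r] > row_threshold}
    # The torso is the intersection of the thick horizontal and vertical bands.
    return {(r, c) for r in thick_rows for c in thick_cols if mask[r][c]}
```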
And S224, identifying each part of the human body according to the trunk.
After the torso is determined, the head and limbs of the body may be identified based on geometric considerations: the arms are the regions connected to the left and right sides of the torso region; the head is the region connected above the torso region; and the legs are the regions connected below the torso region. The upper-left and upper-right corners of the torso region may also be preliminarily identified as the shoulders.
And S225, marking each part of the body.
Each part of the body is marked so that the motion trail of each part can be tracked.
In another embodiment, identifying the body parts of the human body can be further realized by the following three steps:
firstly, segmenting a human body. In the embodiment, a method combining interframe difference and background difference is adopted to segment a moving human body, one frame in an RGBD image is selected as a background frame in advance, a Gaussian model of each pixel point is established, then an interframe difference method is used for carrying out difference processing on two adjacent frames of images, background points and changed regions (the changed regions in the current frame comprise an exposed region and a moving object) are distinguished, then model fitting is carried out on the changed regions and the corresponding regions of the background frame to distinguish the exposed region and the moving object, and finally a shadow is removed from the moving object, so that the moving object without the shadow is segmented. When updating the background, determining the interframe difference as a background point, and updating according to a certain rule; and if the background difference is determined to be the point of the exposed area, updating the background frame at a higher updating rate, and not updating the area corresponding to the moving object. This method can obtain a more ideal segmentation target.
Secondly, the contour is extracted and analyzed. After the binarized image is acquired, the contour is obtained using a classical edge detection algorithm, for example the Canny algorithm. The Canny edge detection operator fully reflects the mathematical characteristics of an optimal edge detector: it has a good signal-to-noise ratio and excellent localization for different types of edges, a low probability of multiple responses to a single edge, and maximum suppression of false edge responses. After the optical-flow segmentation field is obtained by the segmentation algorithm, all moving objects of interest are contained in the segmented regions, so using the Canny operator to extract edges only within those regions both greatly limits background interference and effectively improves the running speed.
Thirdly, the joints are labeled automatically. The moving target is obtained by the difference method, and after the Canny edge detection operator extracts the contour, the human body target is further analyzed using the 2D Ribbon model of Maylor K. Leung and Yee-Hong Yang. The model divides the front of the body into different areas; for example, the body is constructed with 5 U-shaped areas representing the head and the four limbs, respectively.
Thus, by finding the 5 U-shaped body endpoints, the approximate location of the body can be determined. Based on the extracted contour, the required information is extracted by vector contour compression, preserving the most prominent extremity features and compressing the human contour into a fixed shape, e.g., a contour with 8 fixed endpoints, 5 U-shaped points, and 3 inverted-U-shaped points, whose distinct features facilitate calculation on the contour. Here, the contour may be compressed using a distance algorithm on adjacent endpoints of the contour, reducing it to 8 endpoints through an iterative process.
After the compressed contour is obtained, the following algorithm is adopted to automatically label each part of the body:
(1) The U-shaped body endpoints are determined. Given a reference length M, a vector longer than M is considered part of the body contour, and vectors shorter than M are ignored. Starting from some point on the vectorized contour, a vector longer than M is found and denoted Mi; the next such vector Mj is then found, and the included angle from Mi to Mj is compared. If the angle is within a certain range (0 to 90 degrees; a positive angle indicates a convex corner), it is considered a U endpoint, and the two vectors are recorded. This continues until 5 U endpoints are found.
(2) The endpoints of the three inverted U-shapes are determined in the same way as in step (1), except that the included-angle condition is changed from positive to negative.
(3) The positions of the head, hands, and feet are easily obtained from the U and inverted-U endpoints. Each joint point can then be determined according to the physiological shape of the body: the width and length of the trunk are determined from the intersection of the arms with the body and of the head with the legs, respectively; the neck and waist positions lie at 0.75 and 0.3 of the trunk, respectively; the elbows lie at the midpoints between the shoulders and the hands; and the knees lie at the midpoints between the waist and the feet. In this way, the general position of each part of the body can be defined.
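The proportional rules in step (3) can be sketched as follows; measuring the neck (0.75) and waist (0.3) offsets upward from the bottom of the trunk is an assumed interpretation, since the patent does not state the reference end:

```python
# Joint estimation from landmark points using the stated body proportions.
# Positions are (x, y) with y increasing downward; the landmark inputs and the
# trunk-bottom reference for neck/waist are assumptions.

def midpoint(a, b):
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)

def joint_positions(torso_top, torso_bottom, shoulder, hand, foot):
    height = torso_bottom[1] - torso_top[1]
    neck = (torso_top[0], torso_bottom[1] - 0.75 * height)
    waist = (torso_top[0], torso_bottom[1] - 0.3 * height)
    elbow = midpoint(shoulder, hand)   # elbow midway between shoulder and hand
    knee = midpoint(waist, foot)       # knee midway between waist and foot
    return {"neck": neck, "waist": waist, "elbow": elbow, "knee": knee}
```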
And S23, recording standard track information of each part of the body in standard motion according to the depth image sequence, and recording standard posture information of the body when the repeated motion starts or ends in the standard motion.
There are various methods for recording the standard motion trajectories of the parts of the body. One example is the OGHMs (Orthogonal Gaussian-Hermite Moments) method, whose basic principle is to judge whether a pixel belongs to the foreground motion region by comparing the degree of change of the corresponding pixel value between temporally continuous image frames.
The input image sequence is represented by {f(x, y, t) | t = 0, 1, 2, …}, where f(x, y, t) is the image at time t and x, y are the coordinates of a pixel in the image. Let g(t, σ) be a Gaussian function and b_n(t) the product of g(t, σ) and the n-th Hermite polynomial; the n-order OGHM can then be represented as:

M_n(x, y, t) = ∫ f(x, y, t + τ) b_n(τ) dτ        (1)
where the coefficients a_i are determined by the standard deviation σ of the Gaussian function. By the properties of the convolution operation, the n-order OGHM can be viewed as the convolution of a weighted sum of the temporal derivatives of the image sequence function with a Gaussian function. The larger the derivative value at a point, the more the pixel value at that position changes over time, indicating that the point should belong to the motion region; this provides the theoretical basis for the OGHMs method to detect moving objects. In addition, from equation (1), the basis functions of the OGHMs are

b_n(t) = Σ_{i=0}^{n} a_i g^(i)(t, σ)
This is a linear combination of the different-order derivatives of the Gaussian function. Because the Gaussian function itself has the ability to smooth noise, OGHMs can also effectively filter out various types of noise.
For example, the Temporal Difference method extracts the motion region in an image by thresholding the pixel-wise differences between several temporally adjacent frames of a continuous image sequence. Early methods used the difference between two adjacent frames to obtain the moving object: let F_k be the gray-level data of the k-th frame in the image sequence and F_{k+1} the gray-level data of the (k+1)-th frame; the difference image of two temporally adjacent frames is then defined as:

D_k(x, y) = 1 if |F_{k+1}(x, y) − F_k(x, y)| > T, and D_k(x, y) = 0 otherwise
where T is the threshold. If the difference is larger than T, the gray level in that region has changed significantly, i.e., the region is the detected moving-target region.
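The two-frame differencing above can be sketched as:

```python
# Sketch of the two-frame temporal-difference detector: a pixel belongs to the
# moving-target region when the inter-frame gray-level change exceeds T.
# Frames are lists of rows of gray values.

def frame_difference(frame_k, frame_k1, T):
    """Return a binary image: 1 where |F_{k+1} - F_k| > T, else 0."""
    return [[1 if abs(b - a) > T else 0 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_k, frame_k1)]
```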
As another example, the Optical Flow method is based on the assumption that the change in image gray level is due solely to the motion of the object or the background, i.e., the gray levels of the object and the background themselves do not change with time. Motion detection based on optical flow uses the velocity field that a moving object exhibits in the image over time, and estimates the optical flow corresponding to the motion under certain constraint conditions.
For another example, the basic principle of the Background Subtraction method is to first construct a background model image, then difference the current frame image against the background frame image, and detect the moving object by thresholding the difference result. Suppose the background frame image at time t is F_0 and the corresponding current frame image is F_t; the difference between the current frame and the background frame can then be expressed as:

D_t(x, y) = 1 if |F_t(x, y) − F_0(x, y)| > T, and D_t(x, y) = 0 otherwise
If the gray-value difference between corresponding pixels of the current frame image and the background frame image is greater than the threshold, the corresponding value in the resulting binary image is 1, and the region is determined to belong to the moving target.
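A sketch of background subtraction, using a per-pixel mean of earlier frames as one simple background model (the per-pixel Gaussian model mentioned earlier in this document would be a refinement of this):

```python
# Sketch of background subtraction: a background model (here a per-pixel mean
# of several background frames, an assumed simple choice) is differenced
# against the current frame and thresholded into a binary motion mask.

def build_background(frames):
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(f[r][c] for f in frames) / n for c in range(cols)]
            for r in range(rows)]

def background_subtract(background, frame, T):
    return [[1 if abs(f - b) > T else 0 for b, f in zip(row_b, row_f)]
            for row_b, row_f in zip(background, frame)]
```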
In other embodiments, the track information may also be obtained by describing the motion track of each body part with HOG-HOF descriptors (histograms of oriented gradients and histograms of optical flow).
Referring to fig. 4, fig. 4 is a schematic flow chart of recording the standard posture information in step S23 of fig. 2. Recording the standard posture information of the human body at the beginning or end of a repeated action in the standard movement may comprise:
and S231, identifying each part of the human body according to the depth image sequence and determining a human body reference point of the human body.
Specifically, the human body reference point may be the human body centroid or the human body center; the present embodiment describes the invention with the human body center as the human body reference point. Of course, in other embodiments, other specific points of the human body may also be selected as the human body reference point.
The human body center is the geometric center of the human body in the depth image. After the trunk and each body part have been identified, the human body center can be determined from the contour of the whole human body in the depth image, namely as the median of the outer-edge coordinate values of the three-dimensional human body contour.
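Under the definition above, recovering the body center from a binary silhouette mask can be sketched as follows; the mask shape, depth values and function name are hypothetical, and the silhouette edge is taken as the mask pixels with at least one background neighbor:

```python
import numpy as np

def body_center_from_mask(mask, depth):
    """Geometric body center: the median coordinate of the silhouette's
    edge pixels, plus the median depth sampled on that edge."""
    padded = np.pad(mask, 1)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    edge = mask & ~interior                  # boundary of the silhouette
    ys, xs = np.nonzero(edge)
    return np.median(xs), np.median(ys), np.median(depth[ys, xs])

mask = np.zeros((9, 9), dtype=bool)
mask[2:7, 3:6] = True                        # a crude rectangular "body"
depth = np.full((9, 9), 1500.0)              # flat depth (millimeters)
cx, cy, cz = body_center_from_mask(mask, depth)
```

For this symmetric silhouette the median of the edge coordinates lands on the geometric center of the rectangle.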
S232: acquiring the standard relative position relationship between each body part and the human body reference point.
The standard relative position relationship of the present embodiment is the relative position relationship between the centroid of each body part and the human body center for the standard-exercise human body in the posture to be evaluated. In one embodiment, the relative position relationship may be the Euclidean distance and the cosine distance between the centroid of each body part and the human body center, and a standard vector of the standard body posture may be formed from these Euclidean and cosine distances: for example, the Euclidean and cosine distances from the centroid of the head to the center of the body, from the centroid of the hand to the center of the body, and so on.
The euclidean distance and the cosine distance may be calculated as follows:
first, a first coordinate value of a human body center of a standard posture is acquired. In this embodiment, the first coordinate value of the human body center is a coordinate value of the human body center in a camera coordinate system of the depth camera.
For example, in the end posture of the deep squat motion, the first coordinate value of the human body center point A is (x_1, y_1, z_1).
Then, the centroid of each body part and the second coordinate value of each such centroid are acquired.
Specifically, after each body part has been identified, the centroid of each body region can be determined, where the centroid of a region refers to the representative depth or position of that region. For example, a histogram of the depth values within a region may be generated, and the depth value with the highest frequency (or the average of the two or more depth values with the highest frequencies) may be taken as the centroid of the region. Once the centroids of all body parts have been determined, their coordinates in the camera coordinate system can be determined.
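The histogram-mode centroid described above can be sketched as follows; the bin width, region layout and depth values are invented for the example:

```python
import numpy as np

def region_centroid_depth(depth, region_mask, bin_width=10.0):
    """Representative depth of a body-part region: histogram the
    region's depth values and return the center of the most frequent
    bin (the mode), as the description above suggests."""
    values = depth[region_mask]
    bins = np.arange(values.min(), values.max() + 2 * bin_width, bin_width)
    hist, edges = np.histogram(values, bins=bins)
    i = int(np.argmax(hist))
    return 0.5 * (edges[i] + edges[i + 1])  # center of the mode bin

rng = np.random.default_rng(0)
depth = np.full((20, 20), 3000.0)            # background at 3 m
region = np.zeros((20, 20), dtype=bool)
region[5:15, 5:15] = True                    # a body-part region
depth[region] = 1200.0 + rng.normal(0.0, 3.0, size=100)  # part near 1.2 m
d = region_centroid_depth(depth, region)
```

Because the mode of a histogram is insensitive to a few outlying depth samples, it gives a more robust representative depth than a plain mean would for noisy depth maps.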
It is worth mentioning that the centroid in the present invention refers to a centroid obtained by depth image processing, and not to a physical centroid. The centroid of the present invention can be obtained by the centroid method, and can also be obtained by other methods, which is not limited in the present invention.
For example, in the end posture of the deep squat motion, the second coordinate value of the centroid B of the knee joint is (x_2, y_2, z_2).
Finally, the Euclidean distances and cosine distances between the centroid of each body part and the center of the human body are calculated from the first coordinate value and the second coordinate values, forming the standard vector of the standard human body posture.
Cosine distance, also called cosine similarity, measures the difference between two individuals by the cosine of the angle between their vectors in a vector space; the concept is used in machine learning to measure the difference between sample vectors. The cosine distance of two vectors is represented by the cosine of the angle between them.
For example, as shown in fig. 5, fig. 5 is a schematic diagram of the spatial position relationship between the centroid of the knee joint and the center of the human body according to another embodiment of the standard motion database generation method provided by the present invention. After the first coordinate value and the second coordinate value are obtained, the vector of the human body center OA = (x_1, y_1, z_1) and the vector of the knee-joint centroid OB = (x_2, y_2, z_2) can be obtained, where O is the origin of the camera coordinate system.
Specifically, the Euclidean distance between the centroid of the knee joint and the center of the human body is calculated by the following formula:

d_AB = sqrt((x_1 − x_2)^2 + (y_1 − y_2)^2 + (z_1 − z_2)^2)

and the cosine distance between the vectors OA and OB can be calculated by the following formula:

cos θ = (x_1·x_2 + y_1·y_2 + z_1·z_2) / (sqrt(x_1^2 + y_1^2 + z_1^2) · sqrt(x_2^2 + y_2^2 + z_2^2))
The Euclidean distance measures the absolute distance between points in space; for example, d_AB measures the absolute distance between point A and point B and depends directly on the position coordinates of the points. The cosine distance measures the angle between the space vectors and reflects a difference in direction rather than in position.
Specifically, the value range of the cosine distance is [−1, 1]. The larger the cosine of the angle, the smaller the angle between the two vectors; the smaller the cosine, the larger the angle. When the two vectors point in the same direction, the cosine takes its maximum value 1; when they point in exactly opposite directions, it takes its minimum value −1.
Of course, in estimating the human body posture, the Euclidean and cosine distances from the centroids of the hands, feet and other body parts to the center of the human body are usually calculated as well. Finally, the Euclidean and cosine distance values obtained from the centroids of all required body parts to the human body center, in one-to-one correspondence with those body parts, are assembled into the standard vector of the standard posture.
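The assembly of the standard vector from per-part Euclidean and cosine distances can be sketched as follows; the camera-space coordinates, part names and function names are hypothetical:

```python
import numpy as np

def euclidean_and_cosine(center, centroid):
    """Euclidean distance and cosine distance between the body-center
    vector OA and a part-centroid vector OB (camera coordinates)."""
    a, b = np.asarray(center, float), np.asarray(centroid, float)
    d = np.linalg.norm(a - b)
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return d, cos

def standard_vector(center, part_centroids):
    """Concatenate the (Euclidean, cosine) pair of every body part,
    in order, into the standard vector of the posture."""
    return np.array([v for c in part_centroids.values()
                     for v in euclidean_and_cosine(center, c)])

# Hypothetical camera-space coordinates (meters) for a squat end posture.
center = (0.0, 1.0, 2.0)
parts = {"head": (0.0, 1.6, 2.0), "knee": (0.1, 0.4, 2.1)}
vec = standard_vector(center, parts)
```

A vector to be evaluated, built the same way from a recorded posture, can then be compared against this standard vector entry by entry.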
When the human body movement is evaluated at a later stage, a vector to be evaluated can be formed by the same method, and the evaluation is performed by comparing the vector to be evaluated with the standard vector.
It will be appreciated that in other embodiments, the body gesture may also be a body gesture at some point in time during the movement, and is not limited to a body gesture at the beginning or end.
S24: saving the standard track information and the standard relative position relationship to form a standard motion database.
Specifically, in this embodiment, step S24 may store the standard relative position relationship, i.e. the Euclidean distance and cosine distance between the centroid of each body part in the standard posture and the human body reference point, together with the standard vector, at the same time as storing the standard trajectory information. In this way, when the human body motion and posture are evaluated, the trajectory information of the human body to be evaluated can be compared with the standard trajectory information, and the Euclidean and cosine distances between the centroids of the body parts to be evaluated and the human body reference point can be compared with those of the standard posture, i.e. the vector to be evaluated can be compared with the standard vector.
Through the processing of the depth image sequence, this embodiment not only stores the standard track information but also extracts each body part and the human body reference point of the standard human body posture at the beginning or end of the repeated action, and acquires and stores the standard posture information through the relative position relationship between each body part and the human body reference point. The positions of the body parts can therefore be accurately distinguished and an accurate relative position relationship obtained, which improves the accuracy of the posture evaluation result and the training efficiency of each movement.
Referring to fig. 6, fig. 6 is a schematic structural diagram of an apparatus for generating a standard motion database according to an embodiment of the present invention.
Specifically, the apparatus for generating the standard motion database includes a depth camera 10, a processor 11 and a memory 12; the depth camera 10 and the memory 12 are both connected to the processor 11.
Wherein the depth camera 10 is used to acquire a sequence of depth images containing standard motion actions of the human body.
The processor 11 is used for marking each part of the human body according to the depth image sequence; and recording standard track information of all parts of the body in standard motion according to the depth image sequence.
The memory 12 is used for saving standard track information to form a standard motion database.
In this embodiment, the processor 11 is further configured to remove a background in the depth image sequence; acquiring the contour of a human body in a depth image sequence; identifying the trunk of the human body according to the contour; identifying each part of the human body according to the trunk; marking each part of the body.
Optionally, the processor 11 is further configured to record standard posture information of the human body at the beginning or the end of the repetitive motion in the standard exercise; and storing standard posture information.
Optionally, the processor 11 is further configured to identify body parts of the human body according to the depth image sequence and determine a human body reference point of the human body; and acquiring the standard relative position relation between each part of the body and the human body reference point.
The memory 12 is also used to store standard relative positional relationships.
In some embodiments, the standard relative position relationship is the relative position relationship from the centroid of each part of the human body in the standard posture to the human body reference point of the human body; the processor 11 is further configured to obtain a first coordinate value of the human body center in the standard posture; to acquire the centroid of each body part and a second coordinate value of each such centroid; and to calculate the Euclidean distance and the cosine distance between the centroid of each part of the human body in the standard posture and the human body reference point according to the first coordinate value and the second coordinate values, so as to form a standard vector of the standard posture.
The present invention also provides a storage device storing program data that can be executed to implement the standard motion database generation method of any of the above embodiments.
For example, the storage device may be a portable storage medium such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk. It is to be understood that the storage device may also be any other medium that can store program code, such as a server.
In conclusion, the invention can acquire comprehensive information about the human body during standard movement and accurately distinguish the occlusion relationships of the trunk, limbs and other parts of the human body, so that more comprehensive and accurate data can be acquired to form a standard motion database, making the later evaluation and analysis of human body movement more accurate.
The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes performed by the present specification and drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (5)

1. A standard motion database generation method is characterized by comprising the following steps:
acquiring a depth image sequence containing standard motion actions of a human body;
marking each part of the human body according to the depth image sequence;
recording standard track information of all parts of the body in standard motion according to the depth image sequence;
saving the standard trajectory information to form a standard motion database,
wherein the step of recording trajectory information of each part of the body in standard motion according to the depth image sequence further comprises: recording standard posture information of the human body when the repeated action starts or ends in the standard movement;
the step of saving the standard trajectory information to form a standard motion database further comprises: saving the standard posture information,
wherein the step of recording the standard posture information of the human body when the repeated action starts or ends in the standard movement comprises:
recognizing all parts of the human body according to the depth image sequence and determining a human body reference point of the human body;
acquiring a standard relative position relation between each part of the body and the human body reference point;
the step of saving the standard attitude information comprises: the standard relative position relationship is saved and,
the standard relative position relationship is the relative position relationship from the mass center of each part of the human body in the standard posture to the human body reference point of the human body;
the step of obtaining the standard relative position relationship between each part of the body and the human body reference point comprises the following steps:
acquiring a first coordinate value of a human body center of a standard posture;
acquiring the centroid of each part of the body and a second coordinate value of the centroid of each part of the body;
and calculating the Euclidean distance and the cosine distance between the mass center of each part of the human body in the standard posture and the human body reference point according to the first coordinate value and the second coordinate value so as to form a standard vector of the standard posture.
2. The method of claim 1, wherein the step of labeling each body part of the human body from the sequence of depth images comprises:
removing a background in the sequence of depth images;
acquiring the contour of the human body in the depth image sequence;
identifying a torso of the human body according to the contour;
identifying each part of the human body according to the trunk;
marking each part of the body.
3. An apparatus for generating a standard motion database, comprising a depth camera, a processor, and a memory, the depth camera and the memory both being coupled to the processor;
the depth camera is used for acquiring a depth image sequence containing human body standard motion actions;
the processor is used for marking all parts of the human body according to the depth image sequence; recording standard track information of all parts of the body in standard motion according to the depth image sequence;
the memory is used for saving the standard track information to form a standard motion database,
the processor is further used for recording standard posture information of the human body when a repeated action in the standard movement starts or ends, and for saving the standard posture information,
the processor is further used for identifying all parts of the human body according to the depth image sequence and determining a human body reference point of the human body; acquiring a standard relative position relation between each part of the body and the human body reference point;
the memory is further configured to store the standard relative positional relationship,
the standard relative position relationship is the relative position relationship from the mass center of each part of the human body in the standard posture to the human body reference point of the human body;
the processor is further used for obtaining a first coordinate value of the human body center of the standard posture; acquiring the centroid of each part of the body and a second coordinate value of the centroid of each part of the body; and calculating the Euclidean distance and the cosine distance between the centroid of each part of the human body in the standard posture and the human body reference point according to the first coordinate value and the second coordinate values, so as to form a standard vector of the standard posture.
4. The apparatus of claim 3, wherein the processor is further configured to remove a background in the sequence of depth images; acquiring the contour of the human body in the depth image sequence; identifying a torso of the human body according to the contour; identifying each part of the human body according to the trunk; marking each part of the body.
5. A storage device, characterized in that program data are stored, which program data can be executed to implement the method according to any one of claims 1 to 2.
CN201710386848.6A 2017-05-26 2017-05-26 Standard motion database generation method and device and storage device Active CN107341179B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710386848.6A CN107341179B (en) 2017-05-26 2017-05-26 Standard motion database generation method and device and storage device

Publications (2)

Publication Number Publication Date
CN107341179A CN107341179A (en) 2017-11-10
CN107341179B true CN107341179B (en) 2020-09-18

Family

ID=60220186

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710386848.6A Active CN107341179B (en) 2017-05-26 2017-05-26 Standard motion database generation method and device and storage device

Country Status (1)

Country Link
CN (1) CN107341179B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2019168394A (en) * 2018-03-26 2019-10-03 株式会社エクォス・リサーチ Body orientation estimation device and program
CN109635925A (en) * 2018-11-30 2019-04-16 北京首钢自动化信息技术有限公司 A kind of sportsman's supplemental training data capture method, device and electronic equipment
FR3104054B1 (en) * 2019-12-10 2022-03-25 Capsix DEVICE FOR DEFINING A SEQUENCE OF MOVEMENTS ON A GENERIC MODEL
CN112883969B (en) * 2021-03-01 2022-08-26 河海大学 Rainfall intensity detection method based on convolutional neural network

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101657825A (en) * 2006-05-11 2010-02-24 普莱姆传感有限公司 Modeling of humanoid forms from depth maps
KR20130110441A (en) * 2012-03-29 2013-10-10 한국과학기술원 Body gesture recognition method and apparatus
CN103390174A (en) * 2012-05-07 2013-11-13 深圳泰山在线科技有限公司 Physical education assisting system and method based on human body posture recognition
CN104036488A (en) * 2014-05-04 2014-09-10 北方工业大学 Binocular vision-based human body posture and action research method

Also Published As

Publication number Publication date
CN107341179A (en) 2017-11-10

Similar Documents

Publication Publication Date Title
CN107392086B (en) Human body posture assessment device, system and storage device
Liu et al. Gait sequence analysis using frieze patterns
CN102609683B (en) Automatic labeling method for human joint based on monocular video
CN107335192A (en) Move supplemental training method, apparatus and storage device
US9235753B2 (en) Extraction of skeletons from 3D maps
US9330307B2 (en) Learning based estimation of hand and finger pose
CN107341179B (en) Standard motion database generation method and device and storage device
CN105759967B (en) A kind of hand overall situation attitude detecting method based on depth data
KR20130013122A (en) Apparatus and method for detecting object pose
Havasi et al. Detection of gait characteristics for scene registration in video surveillance system
US11450148B2 (en) Movement monitoring system
Yang et al. Multiple marker tracking in a single-camera system for gait analysis
Singh et al. Human body parts measurement using human pose estimation
Behendi et al. Non-invasive performance measurement in combat sports
Senior Real-time articulated human body tracking using silhouette information
Beltrán et al. Automated Human Movement Segmentation by Means of Human Pose Estimation in RGB-D Videos for Climbing Motion Analysis.
KR20150043697A (en) Texture-less object recognition using contour fragment-based features with bisected local regions
Rhee et al. Vehicle tracking using image processing techniques
Zhao et al. A 2.5 D thinning algorithm for human skeleton extraction from a single depth image
Dadgar et al. Improvement of human tracking based on an accurate estimation of feet or head position
Kerdvibulvech et al. 3d human motion analysis for reconstruction and recognition
Althloothi et al. Fitting distal limb segments for accurate skeletonization in human action recognition
Yagi et al. ESTIMATING A RUNNER'S STRIDE LENGTH AND FREQUENCY FROM A RACE VIDEO BY USING GROUND STITCHING
Ooke et al. Transfer Learning of Deep Neural Network Human Pose Estimator by Domain-Specific Data for Video Motion Capturing
Choi et al. Comparative and Improvement Study of 3D Human Pose Estimation Algorithms using Monocular Cameras

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: The joint headquarters building in Nanshan District Guangdong streets in Shenzhen city of Guangdong province 518057 Xuefu Road No. 63 building high-tech zones 11-13

Patentee after: Obi Zhongguang Technology Group Co., Ltd

Address before: The joint headquarters building in Nanshan District Guangdong streets in Shenzhen city of Guangdong province 518057 Xuefu Road No. 63 building high-tech zones 11-13

Patentee before: SHENZHEN ORBBEC Co.,Ltd.