CN112416133A - Hand motion capture method and device, electronic equipment and storage medium

Hand motion capture method and device, electronic equipment and storage medium

Info

Publication number
CN112416133A
CN112416133A (application CN202011376590.XA)
Authority
CN
China
Prior art keywords
hand
motion
current
spatial position
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011376590.XA
Other languages
Chinese (zh)
Other versions
CN112416133B (en)
Inventor
柴金祥
The other inventors have requested that their names not be disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Movu Technology Co Ltd
Mofa Shanghai Information Technology Co Ltd
Original Assignee
Shanghai Movu Technology Co Ltd
Mofa Shanghai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Movu Technology Co Ltd, Mofa Shanghai Information Technology Co Ltd filed Critical Shanghai Movu Technology Co Ltd
Priority to CN202011376590.XA priority Critical patent/CN112416133B/en
Publication of CN112416133A publication Critical patent/CN112416133A/en
Application granted granted Critical
Publication of CN112416133B publication Critical patent/CN112416133B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F 3/014 Hand-worn input/output arrangements, e.g. data gloves
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00 Manipulating 3D models or images for computer graphics
    • G06T 19/006 Mixed reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2200/00 Indexing scheme for image data processing or generation, in general
    • G06T 2200/08 Indexing scheme for image data processing or generation, in general involving all processing steps from image acquisition to 3D model generation

Abstract

The application discloses a hand motion capture method, a hand motion capture device, an electronic device and a storage medium. The hand motion capture method comprises the following steps: determining the spatial position of a first marker object attached to a hand and the label information describing the part of the hand on which the first marker object is located; and adjusting an initial hand pose model using the spatial position of the first marker object and the label information to generate a reference hand pose model corresponding to the hand. With the method and device, the spatial position and label information of the first marker object can be used to generate the reference hand model of the hand, which facilitates subsequent hand motion capture.

Description

Hand motion capture method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a hand motion capture method and apparatus, an electronic device, and a storage medium.
Background
With the rapid development of computer technology, sensor technology and the virtual reality industry, motion capture technology has advanced rapidly and its range of applications has grown ever wider. It plays a particularly important role in fields such as the sports industry, game production, animation production and film and television special effects, forming a novel mode in which art and technology permeate and fuse with each other, and it will become a development trend in the future.
The existing optical capture technology assumes that each bone of the human body is a rigid body and, based on this assumption, reconstructs a skeleton model of the actor. Under this assumption, marker points can only be placed at the skeletal nodes of the human body.
For hand capture, the existing optical capture technology has the following defects: first, it cannot reconstruct a three-dimensional hand model matching the actor's hand, so the positions at which marker points should be placed cannot be defined; second, the finger bones are covered with muscle, and the skeletal nodes are very small, which contradicts the rigid-body assumption. For the above reasons, conventional optical capture technology cannot capture hand motion well.
Disclosure of Invention
In the field of motion capture, capturing hand motion is one of the important challenges, and to date there has been no high-precision hand pose capture scheme. Existing hand capture schemes fall mainly into two categories: inertial capture and optical capture. Inertial capture requires wearing gloves fitted with inertial devices; when the actor moves the fingers, information such as the angular velocity and linear acceleration of each hand joint is acquired and then integrated to reconstruct the absolute position and orientation of each joint of the actor's fingers. The drawback of inertial capture is that the actor's hand pose can only be obtained through integration, and the angular velocities and accelerations obtained by inertial sensors are usually noisy, so the error accumulated by integration grows as the recording time increases. Inertial capture therefore cannot record for long periods. In addition, since the joints of the human hand are small, it is often difficult to wear an inertial sensor on every joint, so information cannot be captured for every finger joint, which reduces capture accuracy.
For optical capture of the body, marker points are placed at all of the body's joints and their three-dimensional coordinates are acquired by multiple infrared cameras; a model of the actor is thereby obtained, and the actor's performance is then captured in real time. At present, gesture capture cannot directly reuse the optical body capture scheme, because a marker point cannot be placed on every joint of the hand as it can on the body. The hand has many joints and each joint is small, so placing a marker point on every joint is unrealistic, and sticking on too many marker points interferes with the hand performance. Under these circumstances, only a subset of marker points can be placed on the hand, which introduces a problem during capture: because the hand has many joints and a high number of degrees of freedom, multiple gestures can satisfy the same actual three-dimensional marker coordinates. For example, with marker points placed on the thumb and index finger, the gesture for the number 5 and the gesture for the number 8 have identical marker positions on the thumb and index finger. Thus the capture result is not controllable (in the above example, the captured output gesture might be 5 or might be 8), and the captured motion sequence may also be incoherent (gesture 5 may suddenly change to gesture 8 during capture).
In view of the above problems, embodiments of the present application provide a hand motion capture method, apparatus, electronic device and storage medium, which are used to solve at least the above-mentioned problems.
The embodiment of the application provides a hand motion capture method, which comprises the following steps: determining the spatial position of a first marker object attached to a hand and the label information describing the part of the hand on which the first marker object is located; and adjusting an initial hand pose model using the spatial position of the first marker object and the label information to generate a reference hand pose model corresponding to the hand.
This embodiment has the advantage that the marker points need not be placed at the skeletal nodes of the hand, nor at any precise fixed position, which saves time. Moreover, a three-dimensional hand model matching the hand can be reconstructed.
Optionally, the initial hand pose model comprises shape parameters for describing a hand shape and hand motion parameters for describing a hand motion.
Optionally, adjusting an initial hand pose model using the spatial position of the first marker object and the label information to generate a reference hand pose model corresponding to the hand includes: setting the hand motion parameters in the initial hand pose model by having the hand make a specific hand motion; with the hand motion parameters determined, acquiring the spatial position and label information of the first marker object while the hand makes the specific hand motion; and adjusting the shape parameters of the initial hand pose model using the hand motion parameters, the spatial position and the label information to generate the reference hand pose model corresponding to the hand.
Optionally, after generating the reference hand pose model corresponding to the hand, the method further includes: in response to a current hand motion made by the hand, acquiring the current spatial position and current label information of the first marker object; and adjusting the reference hand pose model using the current spatial position, the current label information and a prior hand motion model corresponding to prior hand motions, to obtain the current hand pose model of the hand for capturing the current hand motion of the hand.
With this method, when hand motion is captured, the captured result is constrained by the prior hand motion model, so the result is unambiguous and the captured motion sequence is guaranteed to be coherent. The current spatial position and current label information of the first marker object can thus be used to achieve highly accurate hand motion capture.
Optionally, the method further includes: determining the distribution of the first marker objects on the hand; selecting from a preset hand motion library at least one hand motion that satisfies the distribution; if it is determined that the at least one hand motion has no ambiguous hand motion, taking the at least one hand motion as the prior hand motion; and establishing the prior hand motion model using the prior hand motion.
Optionally, acquiring the current label information of the first marker object includes: acquiring the predicted spatial position of the current label information and the current spatial position of the first marker object; when the current spatial position of the first marker object is determined to be within a preset range of the predicted spatial position of the current label information, matching the first marker object with the current label information to obtain a matching relationship, where the preset range is set according to a prediction of the hand's motion trajectory; and determining the current label information corresponding to the first marker object according to the matching relationship.
Optionally, adjusting the reference hand pose model using the current spatial position, the current label information and the prior hand motion model corresponding to the prior hand motion, to obtain the current hand pose model of the hand for capturing its current hand motion, includes: with the shape parameters of the reference hand pose model determined, continuously adjusting the hand motion parameters under the constraint of the prior hand motion model so as to minimize the sum of the distances between the virtual spatial positions of all label information and the spatial positions of the first marker objects corresponding to all label information, thereby obtaining the current hand pose model for capturing the pose of the hand.
Optionally, the method further includes: determining an interactive prop that interacts with the hand; acquiring the prop spatial position and prop label information of the interactive prop through prop marker objects attached to the interactive prop; and adjusting a base prop pose model corresponding to the interactive prop using the prop spatial position and the prop label information to generate a current prop pose model for capturing the motion of the interactive prop.
Optionally, acquiring the prop spatial position and prop label information of the interactive prop through the prop marker objects attached to the interactive prop includes: selecting first prop marker objects from the prop marker objects in a preset selection manner; acquiring the spatial positions of the first prop marker objects and determining them as the prop spatial position; and acquiring the prop label information of the first prop marker objects and determining it as the prop label information.
Optionally, the spatial position comprises coordinate data of the first marker object within a spatial coordinate system corresponding to a capture volume for capturing the hand.
Optionally, the method further includes: fixing the first marker object on a glove worn on the hand to attach it to the hand, or wearing the first marker object directly on the hand as a ring to attach it to the hand.
Optionally, the label information further includes identification information for identifying the hand.
Optionally, determining the spatial position of the first marker object attached to the hand and the label information of the part of the hand on which the first marker object is located further includes: matching the first marker object with the label information to obtain the correspondence between the first marker object and the label information.
The embodiment of the application also provides a hand motion capture method, which comprises the following steps: in response to a hand motion made by a hand, determining the current spatial position of a first marker object attached to the hand and the label information describing the location of the first marker object on the hand; and adjusting the reference hand pose model of the hand using the current spatial position of the first marker object, the label information and a prior hand motion model corresponding to prior hand motions, to obtain the current hand pose model of the hand and thereby capture its current hand motion.
With this method, when hand motion is captured, the captured result is constrained by the prior hand motion model, so the result is unambiguous and the captured motion sequence is guaranteed to be coherent. The current spatial position and current label information of the first marker object can thus be used to achieve highly accurate hand motion capture.
An embodiment of the present application further provides a hand motion capture device, the device comprising: a label information determination unit configured to determine the spatial position of a first marker object attached to a hand and the label information of the part of the hand on which the first marker object is located; and a reference hand pose model generation unit configured to adjust an initial hand pose model using the spatial position of the first marker object and the label information to generate a reference hand pose model corresponding to the hand.
An embodiment of the present application further provides an electronic device, including: one or more processors; a memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for performing the above methods.
Embodiments of the present application also provide a computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a computing device, cause the computing device to perform the method.
The embodiment of the application adopts at least one technical scheme which can achieve the following beneficial effects:
In summary, the hand motion capture method according to the exemplary embodiments of the present application can determine the reference hand pose model corresponding to the hand using only the spatial position and label information of the first marker objects, without constraining the first marker objects to skeletal points; it is therefore more flexible to use, and the reference hand pose model corresponding to the hand can be acquired more flexibly and conveniently. The scheme reconstructs a hand model matching the captured hand by placing a few sparse marker points on the hand, improving the precision with which hand performance is captured. Meanwhile, a hand pose database corresponding to the marker-point placement is built, a prior model of hand poses is trained on this massive three-dimensional hand pose database, and ambiguity is removed during capture by the hand pose prior model, ensuring that the capture result is coherent. The scheme's advantages are that the capture duration is unlimited, setup is simple, and the captured finger motion is coherent, fine-grained and predictable.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a flowchart of steps of a hand motion capture method according to an exemplary embodiment of the present application;
FIG. 2 is a diagram of a calibration operation performed on a plurality of cameras using a marking device carrying second marker objects, according to an exemplary embodiment of the present application;
FIG. 3 is a block diagram of performing calibration operations on multiple cameras according to an exemplary embodiment of the present application;
FIG. 4 is a block diagram of obtaining a spatial position of a first marker object according to an exemplary embodiment of the present application;
FIG. 5 is a diagram of spatial coordinate matching according to an exemplary embodiment of the present application;
FIG. 6 is a block diagram of generating a reference hand pose model of a hand according to an exemplary embodiment of the present application;
FIG. 7 is a block diagram of obtaining a current hand pose model according to an exemplary embodiment of the present application;
FIG. 8 is a block diagram of determining current label information according to an exemplary embodiment of the present application;
FIG. 9 is a block diagram of a hand motion capture device according to an exemplary embodiment of the present application;
FIG. 10 is a schematic illustration of the positions of first marker objects on a human hand during hand motion capture, according to an exemplary embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings. In order to solve the above technical problem, a hand motion capture method of an exemplary embodiment of the present application may determine the spatial position of a first marker object attached to a hand using at least two cameras, and generate a reference hand pose model of the hand using that spatial position. After the hand moves, the hand motion parameters in the reference hand pose model can be determined using the spatial position and label information of the first marker object together with the prior hand motion model, thereby generating a current hand pose model corresponding to the hand's movement and achieving the goal of capturing the pose of the hand.
The hand motion capture method of the exemplary embodiment of the present application may be applied to various fields including, but not limited to, animation fields, sports fields, game productions, and movie productions.
FIG. 1 is a flowchart of steps of a hand motion capture method according to an exemplary embodiment of the present application.
As shown in fig. 1, in step S110, the spatial position of a first marker object attached to a hand and the label information describing the part of the hand on which the first marker object is located are determined.
In implementation, the hand may be the left or right hand of one actor, or the left or right hands of different actors; for example, in a scene where two actors shake hands, capturing the hand motion includes capturing each actor's shaking hand.
A marker object refers to a marker (marker point) whose surface is covered with a special retro-reflective material, such as a spherical marker. The names "first marker object" and "second marker object" are used in this application only for distinction. In practice, a camera may emit infrared light, which is reflected by the marker, and the camera acquires the planar (i.e., two-dimensional) coordinates of the marker. In addition, neither the first marker object nor the second marker object is limited in number: there may be a plurality of each, and each first and second marker object may be processed in the manner described below.
The first marker object may be attached to the hand directly or indirectly. Capturing hand motion requires higher precision than capturing body motion, and if the first marker object is not fixed to the hand, the hand motion cannot be captured accurately.
Thus, in practice, an actor may be provided with a glove of a specific material and the first marker object may be affixed to the glove. The specific material here means a material that does not reflect light; furthermore, so that the first marker object can be attached to the glove, it may also be a hook-surface (hook-and-loop) material. To ensure that the first marker object does not shift to another position as the actor moves, it may be sewn onto the glove. In this manner the first marker object is indirectly attached to the hand.
In addition, in implementation, the first marker object can be directly attached to the hand by wearing it; for example, the first marker object may be made into a ring and worn on the actor's finger.
Further, the spatial position comprises coordinate data of the first marker object within a capture space for capturing the hand. Specifically, to capture the hand pose, a calibration site consisting of a plurality of cameras is first constructed; after the site is constructed, a virtual capture space corresponding to it is determined using the cameras and the second marker objects, and the spatial coordinate system corresponding to the capture space is then determined. The camera calibration operation is described below with reference to fig. 2 and 3.
Fig. 2 is a diagram of performing a calibration operation on a plurality of cameras using a marking device according to an exemplary embodiment of the present application. FIG. 3 is a block diagram of performing calibration operations on multiple cameras according to an exemplary embodiment of the present application.
As shown in fig. 2, these cameras constitute a calibration space. The space may then be calibrated using a calibration device (e.g., a calibration rod) as in fig. 2, on which marker objects (i.e., second marker objects) are disposed; preferably, three marker objects may be disposed on the calibration device.
The field is then swept using the marking device; specifically, the marking device with the second marker objects may be a calibration rod with marker points. In implementation, a user (e.g., a technician) swings the calibration rod in the calibration field, each camera acquires the two-dimensional coordinates of the marker points, and all cameras are calibrated from these two-dimensional coordinates to obtain calibration information, which includes the relative positional relationships between the cameras and the cameras' internal parameters. The calibration site is a real space.
As shown in fig. 2, each camera in fig. 2 may capture an image of a calibration bar including a marker object and calculate calibration information. In an implementation, the cameras in fig. 2 may include at least two cameras.
Specifically, as shown at block 301, stray retro-reflective spots within the capture volume may be excluded. Since some reflective spots in the field are inevitably captured by the cameras, the cameras must be tested to eliminate the reflective spots that would interfere with capture, i.e., to ensure that the cameras capture only the marker objects.
Subsequently, as shown in block 302, a sweep is performed using the calibration device. Three collinear second marker objects may be mounted on the calibration device, with the distances between them known. The calibration device is swung within the calibration space, the cameras capture the planar positions of the three marker points, and the sweep ends once every camera has acquired those planar positions.
Subsequently, as shown in block 303, calibration information for all cameras is determined, wherein the calibration information includes parameter information, relative position and scale information of the cameras. In an implementation, the parameter information includes internal parameters of the camera, including a focal length, a distortion parameter, and the like, and external parameters, which refer to a position and an orientation of the camera.
In implementation, the cameras may photograph the marking device with the second marker objects in the calibration space and acquire images of the device; finally, the scale is determined from the known distances between the second marker objects on the calibration device.
Specifically, after the cameras are calibrated, the calibrated cameras capture the marker points on the calibration rod, the three-dimensional coordinates of the captured marker points are reconstructed in the capture space, and the distances between the reconstructed three-dimensional coordinates are compared with the actual distances between the marker points on the calibration rod to obtain a scale factor used in subsequent computation.
At the same time, as shown in block 304, a set square (with a marker object on each of its three vertices) may be placed in the capture volume to calibrate the ground and thereby determine the ground information. Specifically, an L-shaped triangular rod with a marker point on each corner is placed on the calibration field. The three-dimensional coordinates of the three marker points are reconstructed in the capture space, forming a virtual L-shaped rod there. The right-angle corner of the virtual rod is taken as the origin, the short side as the Z axis and the long side as the X axis; the Y axis is established from the X and Z axes, and the ground information of the capture space is established from the X and Z axes. The origin together with the X, Y and Z axes forms the spatial coordinate system of the capture space, which is a virtual space.
Finally, as shown in block 305, the spatial coordinate system is determined using the calibration information and the ground information determined in block 304. That is, after determining the ground information for the capture volume, a spatial coordinate system for the capture volume based on the ground information may be determined.
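As a concrete illustration of the ground calibration in block 304 and the coordinate-system construction in block 305, the following is a minimal sketch of building the origin and axes from the three reconstructed marker points of the L-shaped rod. It is a sketch under stated assumptions, not the patent's implementation; the function name and numpy representation are illustrative.

```python
import numpy as np

def ground_frame_from_l_bar(corner, short_end, long_end):
    """Build the capture-space coordinate system from the three reconstructed
    markers of the L-shaped calibration bar (hypothetical helper).

    corner    -- 3D point at the right-angle corner (becomes the origin)
    short_end -- 3D point at the end of the short side (defines the Z axis)
    long_end  -- 3D point at the end of the long side (defines the X axis)
    Returns (origin, R), where the columns of R are the X, Y, Z unit axes.
    """
    corner, short_end, long_end = map(np.asarray, (corner, short_end, long_end))
    x = long_end - corner
    x /= np.linalg.norm(x)
    z = short_end - corner
    z /= np.linalg.norm(z)
    # The Y axis is established from X and Z (right-handed: Y = Z x X);
    # Z is then re-orthogonalized in case the bar is not perfectly square.
    y = np.cross(z, x)
    y /= np.linalg.norm(y)
    z = np.cross(x, y)
    R = np.stack([x, y, z], axis=1)
    return corner, R

# Any reconstructed point p can then be expressed in ground coordinates as
# p_ground = R.T @ (p - origin).
```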
After determining the spatial coordinate system of the capture space, the spatial location of the first marker object may be determined. Which will be described in detail below with reference to fig. 4. FIG. 4 is a block diagram of acquiring a spatial position of a first marker object according to an exemplary embodiment of the present application.
As per block 401, the two-dimensional positions of a number of first marker objects may be acquired with a number of cameras. In implementation, each first marker object is photographed by at least two cameras; at least two images of the same first marker object are acquired, and from these images at least two two-dimensional positions of that object are obtained. At block 402, the calibration information of the at least two cameras is obtained. Subsequently, at block 403, at least two rays corresponding to the same first marker object may be generated using the calibration information of the at least two cameras and the corresponding two-dimensional positions.
Subsequently, as shown in block 404, the correspondences between different cameras' detections of the same marker object can be obtained according to various constraint conditions, and a corresponding ray is generated for each two-dimensional position using the camera's parameter information.
Finally, at block 406, after the above correspondences are obtained, the three-dimensional position of the same first marker object may be determined by intersecting the rays generated by the different cameras for that object. That is, the point with the smallest total distance to all the rays is taken as the three-dimensional coordinate of the marker object.
In practice, these rays may not intersect at a single point, and the optimization process shown in block 405 may be employed to make the reconstructed three-dimensional position more stable. In short, the optimization iteratively adjusts the weights of the rays according to the distances between the generated three-dimensional point and each ray, so that the generated point lies closest to the majority of the rays.
The above process determines the spatial position of a single first marker object; when there are multiple first marker objects, the same process may be applied to obtain the spatial coordinates of each.
For better explanation, reference is made to fig. 5. Fig. 5 is a diagram of spatial coordinate matching according to an exemplary embodiment of the present application. As shown in fig. 5, the first marker object produces different images 510 and 520 in different cameras. The two-dimensional position of the first marker object in image 510 is PL, and in image 520 it is PR. The optical center of the camera corresponding to image 510 is OL, and that of the camera corresponding to image 520 is OR. The rays OL-PL and OR-PR thus formed may intersect at a point P, which is the reconstructed spatial position of the first marker object. Fig. 5 may be regarded as the three-dimensional reconstruction of the first marker object's spatial position.
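To make the reconstruction in blocks 403 to 406 concrete, the sketch below finds the least-squares point closest to a bundle of rays, with optional per-ray weights corresponding to the reweighting optimization of block 405. The function name and array layout are assumptions for illustration, not the patent's API.

```python
import numpy as np

def triangulate_point(origins, directions, weights=None):
    """Least-squares point closest to a bundle of rays.

    origins    -- (n, 3) camera optical centers (e.g., OL and OR above)
    directions -- (n, 3) ray directions through the 2D detections
    weights    -- optional per-ray weights for robust reweighting
    Returns the 3D point minimizing the (weighted) sum of squared
    point-to-ray distances.
    """
    origins = np.asarray(origins, dtype=float)
    d = np.asarray(directions, dtype=float)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    if weights is None:
        weights = np.ones(len(d))
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, u, w in zip(origins, d, weights):
        # (I - u u^T) projects onto the plane orthogonal to the ray,
        # so ||(I - u u^T)(p - o)|| is the point-to-ray distance.
        M = np.eye(3) - np.outer(u, u)
        A += w * M
        b += w * M @ o
    return np.linalg.solve(A, b)
```

In the reweighting of block 405, one plausible choice is to recompute `weights` inversely to each ray's distance from the current estimate and triangulate again.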
Subsequently, the label information of the first marker object can be determined; the predefined position on a particular part of the hand to which a marker is attached is referred to as its label information.
The hand includes fingers, namely the thumb, index finger, middle finger, ring finger and little finger, and each finger can be divided into an upper, middle and lower section. Label information may therefore be, for example, "middle section of the little finger".
Preferably, a marker set (markerset) is defined in advance, i.e., the parts of the hand to which the first marker objects are attached are specified in advance. To capture hand motion accurately, the hand motions to be captured may be determined first, and the distribution of the first marker objects determined from those motions. For example, both the motion "five" and the motion "six" extend the thumb and the little finger; therefore, to distinguish these two hand motions, a marker object must be added to the upper section of at least one of the index, middle or ring fingers.
In implementation, the hand can be made to perform a specific motion (for example, five fingers open); the spatial position of each first marker object attached to the hand is then acquired, and the label information of each first marker object is determined according to the preset markerset.
Subsequently, step S120 may be performed, where an initial hand pose model is adjusted by using the spatial position of the first marker object and the tag information, and a reference hand pose model corresponding to the hand is generated. To better describe this step, the following will be described with reference to fig. 6.
Figure 6 is a block diagram of generating a reference hand pose model of a hand according to an exemplary embodiment of the present application.
The technician may acquire a large amount of hand model data via three-dimensional scanning, and the hand pose database in block 610 may include pose data for various hand shapes and/or motions, such as large hands, small hands, left hands, right hands, male hands, female hands, and so on.
At block 620, a low-dimensional hand distribution may be generated using the hand gesture database in block 610. The distribution can be sampled to generate different hand shapes.
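One common way to realize block 620 is principal component analysis over registered hand scans; the patent does not specify the dimensionality-reduction method, so the sketch below is an assumed concrete choice, with all names illustrative.

```python
import numpy as np

def fit_shape_space(hand_meshes, n_components=10):
    """Fit a low-dimensional linear hand-shape distribution from scanned,
    vertex-aligned hand meshes (a sketch of block 620).

    hand_meshes -- (n_hands, n_vertices * 3) flattened registered vertices
    Returns (mean, basis, sigma): a hand is mean + basis @ code, with each
    code dimension roughly distributed as N(0, sigma_k^2).
    """
    X = np.asarray(hand_meshes, dtype=float)
    mean = X.mean(axis=0)
    # SVD-based PCA over the centered scan database
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = Vt[:n_components].T
    sigma = S[:n_components] / np.sqrt(len(X))
    return mean, basis, sigma

def sample_hand(mean, basis, sigma, rng=None):
    # Sampling the low-dimensional distribution yields new plausible shapes
    if rng is None:
        rng = np.random.default_rng()
    code = rng.normal(scale=sigma)
    return mean + basis @ code
```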
At block 630, an initial hand pose model is established, which includes shape parameters describing the hand shape and hand motion parameters describing the hand motion. As shown in equation (1) below, FK denotes the initial hand pose model; α and ρ represent the size and plumpness of the hand, respectively, and pose represents the hand's motion. Since α, ρ and pose are unknown, they must be solved for using the spatial positions of the first marker objects.
At block 640, the spatial positions and label information of the first marker objects are obtained, and at block 650 the initial hand pose model is adjusted using them to generate the reference hand pose model corresponding to the hand.
Optionally, in step S110, determining the spatial position of the first marker object attached to the hand and the tag information of the part of the first marker object on the hand further includes: and matching the first mark object with the label information to obtain the corresponding relation between the first mark object and the label information.
In practice, since α, ρ and pose are unknown, the spatial positions and label information of the first marker objects can be acquired at different time points and/or for different hand motions of the hand, and the parameters in FK are then solved for using these spatial positions and label information. It should be noted that how the label information of a first marker object is determined when its spatial position changes will be explained in detail below with reference to fig. 7 and is not detailed here.
Preferably, the hand motion parameters in the initial hand pose model may be set by having the hand make a specific hand motion (e.g., five fingers open). To make the result more accurate, the hand motion parameters are set to the standard form of that specific motion. With the hand motion parameters determined, the spatial positions and label information of the first marker objects are acquired while the hand holds the specific motion.
Then, the shape parameters of the initial hand pose model are adjusted using the hand motion parameters, the spatial positions and the label information to generate the reference hand pose model corresponding to the hand. That is, the shape parameters in the FK model are continually adjusted according to equation (1) below until equation (1) converges.
In implementation, the initial hand pose model may be adjusted using equation (1) as follows:

$$\min_{\alpha,\,\rho,\,\mathrm{pose},\,Corr}\ \sum_{i}\mathrm{Dis}\Big(\mathrm{handmarker}_i\big(FK(\alpha,\rho,\mathrm{pose})\big),\ \mathrm{Marker}_{Corr(i)}\Big)\qquad(1)$$
In the formula, α and ρ represent the shape parameters of the hand (size and plumpness), and pose represents the motion parameters of the hand; FK denotes the hand pose model, and a virtual hand model performing the hand motion can be reconstructed from α, ρ, pose and the hand pose model.
Corr denotes the matching relationship between label information and first marker objects, i.e., which first marker object the label information i corresponds to (or, equivalently, which label information the first marker object m belongs to). i indexes the label information; handmarker_i denotes the position of the i-th label on the virtual hand model corresponding to the hand, so handmarker_i(FK(α, ρ, pose)) gives the virtual three-dimensional coordinate of the i-th label, and Marker_m denotes the three-dimensional coordinate of the m-th first marker object.
The label information is matched with the first marker objects through the matching relationship Corr, i.e., label information i corresponds to the m-th first marker object. Dis denotes the distance between handmarker_i(FK(α, ρ, pose)) and Marker_m.
Equation (1) states that after the three-dimensional coordinates of the first marker objects are acquired, the label information of each first marker object is determined (Corr in the formula represents the matching between first marker objects and label information), and the variables in equation (1) are then optimized to minimize the sum of the distances between the virtual three-dimensional coordinates of all labels on the virtual hand model and the three-dimensional coordinates of the first marker objects corresponding to those labels, thereby obtaining the virtual hand model corresponding to the hand.
The optimization process of equation (1) is as follows:
1. setting an initial value of the variable. Setting hand marker with initial value for defining marker rsetiI.e. when defining markerset, hand markeriIndicating the position where the label information i is set on the virtual hand model; taking the average value in a hand posture database as an initial value of alpha and rho; the hand position is close to the preset action, and the initial value uses the preset specific action (such as the five fingers are opened).
2. Obtain the matching relationship between label information and first marker objects (Corr in equation (1)). handmarker_i(FK(α, ρ, pose)) gives the virtual three-dimensional coordinates of the labels; the virtual coordinates of all labels form set A, and the three-dimensional coordinates of all first marker objects form set B. Set A is matched against set B.
For example, nearest-neighbor matching may be used: starting from a point a in set A, the closest point to a in the other set B is found to form a match. In practical use, the matching method is not limited to this (other matching methods may be used).
3. Optimize the variables α, ρ, pose and handmarker_i so that the sum of the distances between the virtual three-dimensional coordinates of all labels on the virtual hand model and the three-dimensional coordinates of the first marker objects corresponding to those labels is minimized.
4. Iterate the matching and optimization steps until equation (1) converges or the maximum number of iterations is reached.
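The four steps above amount to an alternating optimization. The sketch below illustrates one possible implementation, assuming a hypothetical `fk` callable standing in for handmarker_i(FK(α, ρ, pose)) and using SciPy for the inner minimization; it is an assumed realization, not the patent's implementation.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial import cKDTree

def calibrate_hand(fk, init_params, markers, max_iters=20):
    """Alternating optimization of equation (1) (a sketch).

    fk          -- hypothetical callable: parameter vector -> (n_labels, 3)
                   virtual label coordinates handmarker_i(FK(alpha, rho, pose))
    init_params -- initial values from step 1 (alpha, rho, pose flattened)
    markers     -- (m, 3) reconstructed coordinates of first marker objects
    """
    params = np.asarray(init_params, dtype=float)
    for _ in range(max_iters):
        # Step 2: nearest-neighbor matching Corr between virtual label
        # coordinates (set A) and reconstructed markers (set B)
        virt = fk(params)
        corr = cKDTree(markers).query(virt)[1]

        # Step 3: adjust the variables to minimize the sum of distances
        def cost(p):
            return np.linalg.norm(fk(p) - markers[corr], axis=1).sum()
        res = minimize(cost, params, method="L-BFGS-B")

        # Step 4: stop on convergence
        if np.allclose(res.x, params, atol=1e-6):
            break
        params = res.x
    return params, corr
```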
In addition, since every hand's shape differs, each hand must undergo the above operations before hand motion capture to determine its reference hand pose model; this is referred to as the calibration process, and it allows each hand to be characterized more accurately. To generate the reference hand pose model more accurately, the starting hand motion may be set to a specific motion. For example, each actor may make a five-fingers-open motion before performing hand motion capture, after which the actor's reference hand pose model is generated.
According to the method, once the reference hand pose model corresponding to the hand is determined, it can be combined with the prior hand motion model described below to generate the current hand pose model corresponding to the hand's current motion, thereby capturing that motion. This process is described in detail below with reference to fig. 7.
After the hand's calibration is completed as described above, capture can be performed. At the start of capture, the hand first makes a specific motion (e.g., five fingers open), and the three-dimensional coordinates of the first marker objects and the matching relationship between label information and first marker objects are acquired.
Specifically, the current spatial positions of the first marker objects are obtained by three-dimensional reconstruction. Acquiring the matching relationship is similar to the calibration process, except that only pose and the matching relationship Corr are optimized, because α, ρ and handmarker_i in equation (1) were acquired during calibration and are kept fixed.
The hand can make various hand motions according to actual requirements; for example, actors can give various performances according to a script. In this case, in response to a hand motion made by the hand, at block 710 the current spatial position and current label information of the first marker object are acquired.
The current spatial position of the first marker object is obtained through three-dimensional reconstruction, and its current label information is obtained through the matching relationship between label information and first marker objects.
The current spatial position of the first marker object is its spatial position after the hand has performed the motion and the marker has moved with it. In implementation, it may be determined as described above with respect to fig. 4.
In case the current spatial position of the first marker object has been determined, the current tag information of the first marker object may be determined. The process of determining the current tag information will be described below in conjunction with fig. 8.
Fig. 8 is a block diagram illustrating determining current label information according to an exemplary embodiment of the present application.
After the hand's calibration is completed according to the operation shown in fig. 7, the capture operation may be performed. During capture, a specific hand motion (for example, five fingers open) may first be made, at which point the matching relationship between label information and first marker objects must also be acquired. In this application, the label information further includes identification information identifying the hand, which may indicate left/right hand, or the left/right hand of a specific actor.
At block 810, the predicted spatial position of current label information i is obtained. In implementation, the next spatial position of label information i may be predicted from the preceding spatial position of the first marker object corresponding to label i, and that predicted position taken as the predicted spatial position; the preceding spatial position is the spatial position, at the previous time (i.e., the previous frame), of the first marker object corresponding to label i. At the previous time, the correspondence between label i and the first marker object was determined, i.e., label i and the first marker object's spatial position are consistent.
The predicted spatial position is the spatial position predicted for current label information i at the current time (i.e., the current frame). In implementation, it may be determined using any prediction method for an object's motion trajectory; the prediction method is not limited here.
At block 820, the current spatial position of first marker object P is obtained. In practice, it may be determined using the methods already described above.
At block 830, it is determined whether the current spatial position of first marker object P is within a preset range of the predicted spatial position of current label information i. If so, the first marker object P and current label information i are matched using the nearest-neighbor method and the match is checked for correctness (the nearest-neighbor method was introduced in the calibration description and is not repeated); if the match is correct, P and label i form a valid matching relationship. If P is not within the preset range, no matching is required. The preset range is set according to the prediction of the object's motion trajectory, and the matching relationship between first marker objects and label information is obtained through this process.
At block 840, the matching relationship between label information i and first marker object P at the current time is determined, and the label information i corresponding to each successfully matched first marker object P is determined from that relationship.
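A minimal sketch of the tracking loop in blocks 810 to 840 follows. It assumes a constant-velocity trajectory prediction, which the patent deliberately leaves open; all names are illustrative.

```python
import numpy as np

def update_matches(prev_positions, velocities, current_markers, radius):
    """Frame-to-frame label tracking (a sketch of blocks 810-840).

    prev_positions  -- (n_labels, 3) marker positions matched to each label i
                       in the previous frame
    velocities      -- (n_labels, 3) per-label velocity estimates
    current_markers -- (m, 3) reconstructed marker positions in this frame
    radius          -- the preset range around each predicted position
    Returns {label_index: marker_index} for labels matched within the range.
    """
    predicted = prev_positions + velocities        # block 810: prediction
    matches = {}
    for i, p in enumerate(predicted):
        dists = np.linalg.norm(current_markers - p, axis=1)
        m = int(np.argmin(dists))                  # nearest-neighbor match
        if dists[m] <= radius:                     # block 830: within range?
            matches[i] = m                         # block 840: valid match
    return matches
```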
In practice, without constraints, hand motions that the hand cannot actually perform may be generated; furthermore, when continuous hand motions are performed, the captured motions should remain coherent and reasonable.
Based on the above considerations, when hand motion is actually captured, the method adds a prior hand motion model to constrain the captured hand motion so that motions the hand cannot make are not generated. The prior hand motion model is a model generated from prior hand motions that satisfy both the hand skeleton and hand motion coherence.
At block 730, the distribution of the first marker object at the hand may be determined.
At block 740, a hand motion library is obtained, which may include a variety of hand motions.
In block 750, at least one hand motion satisfying the distribution may be selected from the hand motion library using the distribution determined in block 730 and taken as the prior hand motions, among which no ambiguous hand motions may exist. An ambiguous hand motion arises when different hand motions place the first marker objects at the same positions on the hand; such different motions are ambiguous with respect to each other. For example, with first marker points only on the thumb and little finger, the motion of extending all five fingers and the motion of extending only the thumb and little finger are mutually ambiguous. In practice, if an ambiguous hand motion exists, additional first marker points are added according to the method described above (see the sketch below).
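The ambiguity test in block 750 can be illustrated as follows: two candidate motions are ambiguous when they leave the attached markers at (near-)identical positions. This is a sketch under assumed data structures, not the patent's algorithm.

```python
import numpy as np

def find_ambiguous_pairs(motion_marker_positions, tol=1e-3):
    """Detect ambiguous hand motions under a given marker distribution.

    motion_marker_positions -- dict mapping a motion name to an (n, 3) array
        of the positions, on that posed hand, of only the attached markers
    Returns pairs of motions whose marker positions coincide, e.g.
    ("five", "six") when markers sit only on the thumb and little finger.
    """
    names = list(motion_marker_positions)
    ambiguous = []
    for a in range(len(names)):
        for b in range(a + 1, len(names)):
            pa = np.asarray(motion_marker_positions[names[a]])
            pb = np.asarray(motion_marker_positions[names[b]])
            # Same marker layout and (near-)identical positions => ambiguity
            if pa.shape == pb.shape and np.allclose(pa, pb, atol=tol):
                ambiguous.append((names[a], names[b]))
    return ambiguous
```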
At block 760, an a priori hand motion model is built using the a priori hand motions determined in block 750.
Next, the reference hand pose model of block 720 is adjusted according to equation (2) below, using the current spatial position and current label information determined at block 710 and the prior hand motion model determined at block 760, to obtain the current hand pose model of the hand for capturing its current hand motion. Note that the building of the prior hand motion model in blocks 730-760 is completed before block 710; it is described here only to explain the acquisition method and does not represent the actual operational flow.
$$\min_{\mathrm{pose}_j,\,Corr}\ \sum_{i}\mathrm{Dis}\Big(\mathrm{handmarker}_i\big(FK(\alpha,\rho,\mathrm{pose}_j)\big),\ \mathrm{Marker}_{Corr(i)}\Big)\ \ \text{subject to}\ \ \mathrm{Prior2}(\mathrm{pose}_j)\ \text{and}\ \mathrm{Prior1}\big(\mathrm{pose}_j\mid\mathrm{pose}_{j-1},\ldots,\mathrm{pose}_{j-k}\big)\qquad(2)$$
In the formula, α and ρ represent the shape parameters of the hand (size and plumpness), and pose represents the motion parameters of the hand. In equation (2), FK denotes the reference hand pose model, and a virtual hand model performing the hand motion can be reconstructed from pose and the reference hand pose model.
Corr denotes the matching relationship between label information and first marker objects, i.e., which first marker object the label information i corresponds to (or, equivalently, which label information the first marker object m belongs to). i indexes the label information; handmarker_i denotes the position of the i-th label on the virtual hand model corresponding to the hand, so handmarker_i(FK(α, ρ, pose)) gives the virtual three-dimensional coordinate of the i-th label, and Marker_m denotes the three-dimensional coordinate of the m-th first marker object.
The label information is matched with the first marker objects through the matching relationship Corr, i.e., label information i corresponds to the m-th first marker object. Dis denotes the distance between handmarker_i(FK(α, ρ, pose)) and Marker_m.
In equation (2), i denotes the i-th label and j denotes the current frame; pose_j denotes the motion parameters of the current frame, pose_{j-1} those of the previous frame, and (pose_{j-1}, ..., pose_{j-k}) those of the k frames preceding the current frame.
Prior2(pose_j) indicates that the acquired hand motion parameters pose must satisfy the preset prior motion model, and Prior1(pose_j | (pose_{j-1}, ..., pose_{j-k})) indicates that the acquired time series of hand motion parameters pose must contain no abrupt changes.
Specifically, the reference hand pose model is the model obtained by the above processing; it consists of shape parameters and hand motion parameters. With the shape parameters already determined, the hand motion parameters in the model can be determined from the current spatial positions and current label information of the first marker objects under the constraint of the prior hand motion model.
In practice, the sum of the distances between the virtual three-dimensional coordinates of all labels on the virtual hand model and the three-dimensional coordinates of the first marker objects corresponding to those labels can be minimized by continuously adjusting the hand motion parameters under the constraint of the prior hand motion model.
Taking gesture "six" as an example: for the first marker objects whose labels are the thumb and little finger, the prior hand motion model corresponding to gesture "six" can be used as a constraint, and the hand motion parameters in the reference hand pose model are obtained by minimizing the sum of (a) the difference between the spatial position (three-dimensional coordinates) of the first marker object corresponding to the thumb label and the virtual spatial position (virtual three-dimensional coordinates) of the thumb label on the reference hand pose model, and (b) the corresponding difference for the little-finger label. The virtual spatial position of each label on the reference hand pose model indicates the contact point at which the corresponding first marker object is attached to the reference hand pose model (i.e., on the virtual hand).
After the hand motion parameters are determined, the current hand pose model can be determined from the shape parameters and the hand motion parameters, capturing the hand's current motion. The current motion conforms to the prior hand motion model and does not necessarily match the real motion exactly, because ambiguous motions were eliminated during selection.
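Putting equation (2) together, a per-frame solve might look like the following sketch, which treats the two priors as log-density penalties with assumed weights w1 and w2; the patent states the constraints but not the exact weighting, and all function names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def capture_frame(fk_pose, pose_init, markers, corr, pose_history,
                  log_prior2, log_prior1, w2=1.0, w1=1.0):
    """Per-frame solve of equation (2) with the priors as penalties (a sketch).

    fk_pose      -- pose -> (n_labels, 3) virtual label coordinates, with the
                    calibrated alpha, rho baked in and held fixed
    corr         -- current label/marker matching from the tracking step
    pose_history -- list of the k previous frames' pose parameters
    log_prior2   -- log-density of the prior hand motion model Prior2(pose_j)
    log_prior1   -- log-density of the temporal-coherence prior
                    Prior1(pose_j | pose_{j-1}, ..., pose_{j-k})
    """
    def cost(pose):
        data = np.linalg.norm(fk_pose(pose) - markers[corr], axis=1).sum()
        # Subtracting the log-priors penalizes implausible and abrupt poses
        return data - w2 * log_prior2(pose) - w1 * log_prior1(pose, pose_history)
    return minimize(cost, pose_init, method="L-BFGS-B").x
```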
Further, during motion capture of the hand, interactive props may also be determined that interact with the hand, for example, a plurality of cameras may be utilized to simultaneously capture the hand of an actor and a basketball that interacts with the actor. It should be noted that the interactive props are not limited in number and type, that is, the interactive props may be single or multiple, and may be of the same type or different types.
In this case, an item marker object may be placed on the interactive item in advance, where the item marker object is the same marker object as the first marker object described above, and is not limited in number. In implementation, a first item marker object may be selected from the item marker objects according to a preset selection manner, where the preset selection manner may be determined by a user according to a requirement.
For example, when a prop is captured, at least three prop marker objects are attached to the prop, and the number of prop marker objects used during calibration must be no less than the number of prop marker objects selected during capture. If a prop marker object is easily confused with a first marker object on the hand, it is removed; if no confusion arises, all prop marker objects can be retained.
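The application does not fix how "easily confused" is tested; one simple criterion is spatial proximity, as in the following hedged sketch, where the 3 cm threshold and all names are assumptions.

import numpy as np

def remove_confusable_prop_markers(prop_markers, hand_markers, min_gap=0.03):
    # Keep only prop marker objects that are at least min_gap (meters)
    # away from every first marker object on the hand, so the two sets
    # are not confused during capture.
    kept = {}
    for tag, pos in prop_markers.items():
        gaps = [np.linalg.norm(np.asarray(pos) - np.asarray(h))
                for h in hand_markers.values()]
        if min(gaps) >= min_gap:
            kept[tag] = pos
    return kept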
Then, the prop spatial position and prop label information of the interactive prop are acquired through the first prop marker object, in the same manner as described above.
Finally, the basic prop posture model is adjusted using the spatial positions and label information of the prop marker objects, generating the current prop posture model and capturing the current posture of the prop. For the hand, the virtual hand model is generated by an algorithm; for a prop, the virtual prop model is made manually. When the prop is calibrated, the virtual prop model is built manually according to the prop's marker point information; when the prop is captured, the prop motion is captured according to the spatial positions and label information of the marker points on the prop, so that the virtual prop model performs the same motions or postures as the real prop.
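The application does not spell out how the current prop posture model is solved; since the prop is rigid and carries at least three labeled markers, one standard way is a Kabsch/Procrustes fit of the rigid transform from the calibrated marker positions on the virtual prop model to the captured positions. The sketch below is such an assumption, not the prescribed algorithm.

import numpy as np

def prop_pose_from_markers(model_pts, observed_pts):
    # model_pts / observed_pts: (N, 3) arrays of marker coordinates in
    # matching label order, N >= 3. Returns rotation R and translation t
    # such that R @ model + t best matches the observed markers.
    P = np.asarray(model_pts, dtype=float)
    Q = np.asarray(observed_pts, dtype=float)
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    return R, t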
In addition, the method can also perform a redirection (retargeting) operation, that is, redirect the current hand posture model to a virtual object according to a preset correspondence. Likewise, when the hand interacts with an interactive prop, the current prop posture model can be redirected to the virtual object according to a preset correspondence.
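In its simplest reading, the redirection is a lookup through the preset correspondence. The sketch below illustrates that reading; the joint names and the dictionary representation are hypothetical.

def retarget(source_pose, correspondence):
    # Map captured joint parameters onto the virtual object's joints via
    # the preset correspondence; joints without a mapping are skipped.
    return {correspondence[j]: params
            for j, params in source_pose.items()
            if j in correspondence}

# Illustrative preset correspondence (hypothetical names):
mapping = {"thumb_tip": "character_thumb_tip",
           "pinky_tip": "character_pinky_tip"}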
In summary, the hand motion capture method according to the exemplary embodiments of the present application can determine the reference hand posture model corresponding to the hand using only the spatial position and tag information of the first marker objects, without constraining the first marker objects or limiting them to skeleton points. It is therefore more flexible to use, and the reference hand posture model can be acquired more conveniently. Furthermore, by determining the shape parameters of the reference hand posture model while the hand performs a specific hand motion, the model can be made to better conform to the shape of the hand. Finally, the current spatial position and current tag information of the first marker objects, together with the prior hand motion model, can be used to obtain the current hand posture model and thus capture the current hand motion, which reduces the difficulty of hand motion capture.
The hand motion capture method can be used in performance animation production and virtual live broadcasting, in particular for generating high-quality three-dimensional animation: the motion and/or posture of a virtual character's hand is generated by capturing the motion and/or posture of a real subject's hand. The method supports fine capture of a single hand as well as of multiple hands; that is, the hand motions of a single virtual character or of multiple virtual characters can be output within the same picture. Interactions between actors' hands, such as handshaking, can also be captured, as can interactions between hands and props, such as playing basketball or fencing, so that the interactions of virtual characters are output according to the interactions between the actors' hands. The method supports both an offline animation generation mode and a real-time online animation generation mode.
In the embodiments of the present application, the terminal and the like may be divided into functional modules according to the above method examples; for example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in hardware or as a software functional module. It should be noted that the division of modules in the embodiments of the present application is schematic and is only one kind of logical function division; other division manners are possible in actual implementation.
Fig. 9 is a block diagram of a hand motion capture device according to an exemplary embodiment of the present application, with each functional module divided in correspondence with each function.
As shown in fig. 9, the hand motion capture device 900 includes a tag information determination unit 910 and a reference hand posture model generating unit 920. The tag information determination unit 910 is configured to determine the spatial position of a first marker object attached to a hand and the tag information of the part of the hand where the first marker object is located; the reference hand posture model generating unit 920 is configured to adjust an initial hand posture model using the spatial position of the first marker object and the tag information, generating a reference hand posture model corresponding to the hand.
Optionally, the initial hand pose model comprises shape parameters for describing a hand shape and hand motion parameters for describing a hand motion.
Optionally, the reference hand posture model generating unit 920 comprises a hand motion parameter setting module, a first marker object information acquiring module, and a reference hand posture model generating module. The hand motion parameter setting module is configured to set the hand motion parameters in the initial hand posture model by having the hand make a specific hand motion; the first marker object information acquiring module is configured to acquire the spatial position and tag information of the first marker objects while the hand performs the specific hand motion, with the hand motion parameters determined; and the reference hand posture model generating module is configured to adjust the shape parameters of the initial hand posture model using the hand motion parameters, the spatial position, and the tag information, generating the reference hand posture model corresponding to the hand.
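Mirroring the pose fit sketched earlier, this calibration step can be read as the same least-squares problem with the roles of the parameters swapped: the hand motion parameters are known from the specific hand motion, and only the shape parameters are adjusted. Again, forward_model and the optimizer choice are assumptions for illustration.

import numpy as np
from scipy.optimize import minimize

def calibrate_shape(observed, forward_model, known_pose, shape_init):
    # With the hand held in the specific hand motion (known_pose),
    # adjust only the shape parameters until the virtual tag positions
    # match the observed positions of the first marker objects.
    def energy(shape):
        virtual = forward_model(shape, known_pose)
        return sum(np.linalg.norm(virtual[tag] - np.asarray(obs))
                   for tag, obs in observed.items())
    return minimize(energy, shape_init, method="L-BFGS-B").x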
Optionally, the first marker object information acquiring module may further acquire the current spatial position and current tag information of the first marker objects in response to a current hand motion made by the hand; and the reference hand posture model generating module is configured to adjust the reference hand posture model using the current spatial position, the current tag information, and a prior hand motion model corresponding to the prior hand motion, obtaining the current hand posture model of the hand for capturing the current hand motion of the hand.
Optionally, the hand motion capture device 900 may further include a distribution determining unit, an at least one hand motion determining unit, and a prior hand motion determining unit. The distribution determining unit is configured to determine the distribution of the first marker objects on the hand using their tag information; the at least one hand motion determining unit is configured to select at least one hand motion satisfying the distribution condition from a preset hand motion library; and the prior hand motion determining unit is configured to determine the prior hand motion using the at least one hand motion when it is determined that the at least one hand motion contains no ambiguous hand motion.
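A hedged sketch of this selection and ambiguity check follows; the library entry format ("required_tags", "pose") and the distance threshold are assumptions introduced for illustration only.

import numpy as np

def select_candidate_motions(marker_tags, motion_library):
    # Keep the library motions whose required marker distribution is
    # covered by the tags actually attached to the hand.
    tags = set(marker_tags)
    return [m for m in motion_library if set(m["required_tags"]) <= tags]

def has_ambiguous_motions(motions, forward_model, shape, tags):
    # Two motions are ambiguous if the attached markers land at nearly
    # the same virtual positions for both, so the markers alone cannot
    # distinguish them.
    sigs = [np.concatenate([forward_model(shape, m["pose"])[t]
                            for t in sorted(tags)]) for m in motions]
    return any(np.linalg.norm(sigs[i] - sigs[j]) < 1e-3
               for i in range(len(sigs)) for j in range(i + 1, len(sigs)))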
Optionally, the first marker object information acquiring module is specifically configured to: obtain the predicted spatial position of the current tag information and the current spatial position of the first marker object; match the first marker object with the current tag information to obtain a matching relation when the current spatial position of the first marker object is determined to be within a preset range of the predicted spatial position of the current tag information, the preset range being set according to the predicted hand motion trajectory of the hand; and determine the current tag information corresponding to the first marker object according to the matching relation.
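A minimal sketch of this matching step follows, assuming Python/NumPy; the 2 cm preset range and the greedy one-to-one assignment are illustrative choices, not requirements of this application.

import numpy as np

def match_tags_to_markers(predicted, detected, preset_range=0.02):
    # predicted: dict tag -> spatial position predicted from the hand's
    #            motion trajectory
    # detected:  dict marker id -> current spatial position of a
    #            first marker object
    matches, used = {}, set()
    for tag, pred in predicted.items():
        best, best_d = None, preset_range
        for mid, pos in detected.items():
            if mid in used:
                continue
            d = np.linalg.norm(np.asarray(pos) - np.asarray(pred))
            if d < best_d:
                best, best_d = mid, d
        if best is not None:
            matches[tag] = best
            used.add(best)
    return matches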
Optionally, the reference hand posture model generating module is specifically configured to, once the shape parameters of the reference hand posture model are determined, obtain the current hand posture model by continuously adjusting the hand motion parameters under the constraint of the prior hand motion model until the sum of the distances between the virtual spatial positions of all the tag information and the spatial positions of the corresponding first marker objects is minimized, so as to capture the posture of the hand.
Optionally, the hand motion capture device 900 further includes an interactive prop determining unit, a prop information determining unit, and a current prop model determining unit. The interactive prop determining unit is configured to determine an interactive prop that interacts with the hand; the prop information determining unit is configured to acquire the prop spatial position and prop label information of the interactive prop through prop marker objects attached to the interactive prop; and the current prop model determining unit is configured to adjust the basic prop posture model corresponding to the interactive prop using the prop spatial position and the prop label information, generating the current prop posture model for capturing the motion of the interactive prop.
Optionally, the prop information determining unit includes a first prop marker object determining unit, a prop spatial position determining unit, and a prop label information determining unit. The first prop marker object determining unit is configured to select a first prop marker object from the prop marker objects according to a preset selection manner; the prop spatial position determining unit is configured to acquire the spatial position of the first prop marker object and determine it as the prop spatial position; and the prop label information determining unit is configured to acquire the prop label information of the first prop marker object and determine it as the prop label information.
Optionally, the spatial position comprises coordinate data of the first marker object within a spatial coordinate system corresponding to a capture volume for capturing the hand.
Optionally, the first marker object is affixed to a glove worn on the hand to effect attachment to the hand, or the first marker object is worn directly on the hand in a ring fashion to effect attachment to the hand.
Optionally, the tag information further includes identification information for identifying the hand.
Optionally, the hand motion capture device 900 further includes a tag information matching unit configured to match the first marker objects with the tag information, obtaining the correspondence between the first marker objects and the tag information.
As shown in fig. 10, in hand motion capture, the first marker objects may be placed at different positions on the hand. The hand of the captured subject wears a special motion capture glove to which the marker points are affixed. FIG. 10 is merely an illustrative example and does not show the motion capture glove; in fact, the first marker objects are in contact with the motion capture glove.
In summary, the hand motion capture device according to the exemplary embodiments of the present application can determine the reference hand posture model corresponding to the hand using only the spatial position and tag information of the first marker objects, without constraining the first marker objects or limiting them to skeleton points. It is therefore more flexible to use, and the reference hand posture model can be acquired more conveniently. Furthermore, by determining the shape parameters of the reference hand posture model while the hand performs a specific hand motion, the model can be made to better conform to the shape of the hand. Finally, the current spatial position and current tag information of the first marker objects, together with the prior hand motion model, can be used to obtain the current hand posture model and thus capture the current hand motion, which reduces the difficulty of hand motion capture.
It should be noted that the steps of the above method may all be performed by the same device, or the method may be performed by different devices. For example, the execution subject of steps 21 and 22 may be device 1, and the execution subject of step 23 may be device 2; alternatively, the execution subject of step 21 may be device 1, and the execution subject of steps 22 and 23 may be device 2; and so on.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable hand motion capture device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable hand motion capture device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable hand motion capture device to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable hand motion capture device to cause a series of operational steps to be performed on the computer or other programmable device to produce a computer implemented process such that the instructions which execute on the computer or other programmable device provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed, or elements inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (17)

1. A hand motion capture method, comprising:
determining the spatial position of a first marker object attached to a hand and tag information of the part of the hand where the first marker object is located;
adjusting an initial hand pose model using the spatial position of the first marker object and the tag information to generate a reference hand pose model corresponding to the hand.
2. The method of claim 1, wherein the initial hand pose model comprises shape parameters for describing a hand shape and hand motion parameters for describing a hand motion.
3. The method of claim 2, wherein the adjusting an initial hand pose model using the spatial position of the first marker object and the tag information, generating a reference hand pose model corresponding to the hand comprises:
setting hand motion parameters in the initial hand pose model by causing the hand to make a specific hand motion;
if the hand motion parameters are determined, acquiring the spatial position and the tag information of a first marking object when the hand makes the specific hand motion;
and adjusting the shape parameters of the initial hand posture model by using the hand motion parameters, the spatial position and the label information to generate the reference hand posture model corresponding to the hand.
4. A method as recited in any of claims 1 to 3, further comprising, after generating the reference hand pose model corresponding to the hand:
responding to the current hand motion made by the hand, and acquiring the current spatial position and the current label information of the first marking object;
and adjusting the reference hand posture model by using the current spatial position, the current label information and a prior hand motion model corresponding to the prior hand motion to obtain the current hand posture model of the hand so as to be used for capturing the current hand motion of the hand.
5. The method of claim 4, further comprising:
determining the distribution of a first marking object on the hand;
selecting at least one hand action meeting the distribution condition from a preset hand action library;
determining the at least one hand motion as the prior hand motion if it is determined that the at least one hand motion contains no ambiguous hand motion;
and establishing the prior hand motion model by using the prior hand motion.
6. The method of claim 4, wherein obtaining current tag information for the first tagged object comprises:
acquiring a predicted spatial position of the current tag information and a current spatial position of the first marker object;
under the condition that the current spatial position of the first marking object is determined to be within a preset range of the predicted spatial position of the current tag information, matching the first marking object with the current tag information to obtain a matching relation, wherein the preset range is a range set according to the prediction of the hand motion track of the hand;
and determining the current label information corresponding to the first marking object according to the matching relation.
7. The method of claim 4, wherein said adapting the reference hand pose model using the current spatial location, the current tag information, and a prior hand motion model corresponding to prior hand motion to obtain the current hand pose model of the hand for capturing the current hand motion of the hand comprises:
under the condition that the shape parameters of the reference hand posture model are determined, the sum of the distances between the virtual spatial positions of all the tag information and the spatial positions of the first mark objects corresponding to all the tag information is minimized by continuously adjusting the hand motion parameters under the constraint of the prior hand motion model, and the current hand posture model is obtained to be used for capturing the posture of the hand.
8. The method of any of claims 1 to 7, further comprising:
determining an interactive prop for performing interaction with the hand;
acquiring a prop space position and prop label information of the interactive prop through a prop mark object attached to the interactive prop;
and adjusting a basic prop posture model corresponding to the interactive prop by using the prop spatial position and the prop label information to generate a current prop posture model for capturing the motion of the interactive prop.
9. The method of claim 8, wherein the acquiring the prop spatial position and prop label information of the interactive prop through the prop marker objects attached to the interactive prop comprises:
selecting a first prop marker object from the prop marker objects according to a preset selection manner;
acquiring the spatial position of the first prop marker object and determining the spatial position of the first prop marker object as the prop spatial position;
acquiring the prop label information of the first prop marker object and determining the prop label information of the first prop marker object as the prop label information.
10. The method of claim 1, wherein the spatial location comprises coordinate data of the first marker object within a corresponding spatial coordinate system of a capture volume used to capture the hand.
11. The method of claim 1, further comprising:
securing a first marking object to a glove worn on the hand to effect attachment to the hand, or,
the first marker object is worn directly on the hand in a ring fashion to enable attachment to the hand.
12. The method of claim 1, wherein the tag information further comprises identification information for identifying the hand.
13. The method of claim 1, wherein determining the spatial location of the first marker object attached to the hand and the tag information of the part of the hand where the first marker object is located further comprises: matching the first marker object with the tag information to obtain the correspondence between the first marker object and the tag information.
14. A hand motion capture method, comprising:
in response to a hand motion made by a hand, determining a current spatial position of a first marker object attached to the hand and label information describing a location of the first marker object on the hand;
and adjusting the reference hand posture model of the hand by using the current spatial position of the first mark object, the label information and the prior hand motion model corresponding to the prior hand motion to obtain the current hand posture model of the hand, thereby capturing the current hand motion of the hand.
15. A hand motion capture device, comprising:
a tag information determination unit configured to determine a spatial position of a first marker object attached to a hand and tag information of the part of the hand where the first marker object is located;
and a reference hand posture model generating unit for adjusting an initial hand posture model by using the spatial position of the first marker object and the tag information, and generating a reference hand posture model corresponding to the hand.
16. An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing the method of any of claims 1-13.
17. A computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a computing device, cause the computing device to perform the method of any of claims 1-13.
CN202011376590.XA 2020-11-30 2020-11-30 Hand motion capture method and device, electronic equipment and storage medium Active CN112416133B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011376590.XA CN112416133B (en) 2020-11-30 2020-11-30 Hand motion capture method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011376590.XA CN112416133B (en) 2020-11-30 2020-11-30 Hand motion capture method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112416133A true CN112416133A (en) 2021-02-26
CN112416133B CN112416133B (en) 2021-10-15

Family

ID=74829036

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011376590.XA Active CN112416133B (en) 2020-11-30 2020-11-30 Hand motion capture method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112416133B (en)

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2006282765A1 (en) * 2005-08-26 2007-03-01 Sony Corporation Motion capture using primary and secondary markers
CN102193631A (en) * 2011-05-05 2011-09-21 上海大学 Wearable three-dimensional gesture interaction system and using method thereof
CN102622591A (en) * 2012-01-12 2012-08-01 北京理工大学 3D (three-dimensional) human posture capturing and simulating system
CN106293078A (en) * 2016-08-02 2017-01-04 福建数博讯信息科技有限公司 Virtual reality exchange method based on photographic head and device
CN107992188A (en) * 2016-10-26 2018-05-04 宏达国际电子股份有限公司 Virtual reality exchange method, device and system
US20190384390A1 (en) * 2018-06-15 2019-12-19 Immersion Corporation Kinesthetically enabled glove
CN109377513A (en) * 2018-09-20 2019-02-22 浙江大学 A kind of global credible estimation method of 3 D human body posture for two views
CN110209285A (en) * 2019-06-19 2019-09-06 哈尔滨拓博科技有限公司 A kind of sand table display systems based on gesture control
CN111433783A (en) * 2019-07-04 2020-07-17 深圳市瑞立视多媒体科技有限公司 Hand model generation method and device, terminal device and hand motion capture method
CN111681281A (en) * 2020-04-16 2020-09-18 北京诺亦腾科技有限公司 Calibration method and device for limb motion capture, electronic equipment and storage medium
CN111539300A (en) * 2020-04-20 2020-08-14 上海曼恒数字技术股份有限公司 Human motion capture method, device, medium and equipment based on IK algorithm

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Ghassem Tofighi; S. Amirhassan Monadjemi; Nasser Ghasem-Aghaee: "Rapid hand posture recognition using Adaptive Histogram Template of Skin and hand edge contour", IEEE *
Liu Tangbo; Yang Rui; Wang Wenwei; He Chu: "Research on Driver Hand Motion Detection Method Based on Pose Estimation", Wanfang Data Knowledge Service Platform *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022111525A1 (en) * 2020-11-30 2022-06-02 魔珐(上海)信息科技有限公司 Posture capturing method and apparatus, electronic device, and storage medium
WO2023279705A1 (en) * 2021-07-07 2023-01-12 上海商汤智能科技有限公司 Live streaming method, apparatus, and system, computer device, storage medium, and program
CN113450398A (en) * 2021-08-31 2021-09-28 北京柏惠维康科技有限公司 Method, device, equipment and readable medium for matching marker in calibration object
CN113450398B (en) * 2021-08-31 2021-11-19 北京柏惠维康科技有限公司 Method, device, equipment and readable medium for matching marker in calibration object

Also Published As

Publication number Publication date
CN112416133B (en) 2021-10-15

Similar Documents

Publication Publication Date Title
CN112416133B (en) Hand motion capture method and device, electronic equipment and storage medium
CN108369643B (en) Method and system for 3D hand skeleton tracking
KR101135186B1 (en) System and method for interactive and real-time augmented reality, and the recording media storing the program performing the said method
TWI467494B (en) Mobile camera localization using depth maps
CN112515661B (en) Posture capturing method and device, electronic equipment and storage medium
JP2020042476A (en) Method and apparatus for acquiring joint position, and method and apparatus for acquiring motion
JP2016103230A (en) Image processor, image processing method and program
KR20150013709A (en) A system for mixing or compositing in real-time, computer generated 3d objects and a video feed from a film camera
Hitomi et al. 3D scanning using RGBD imaging devices: A survey
CN113689578B (en) Human body data set generation method and device
Zhu et al. Video-based outdoor human reconstruction
CN109241841B (en) Method and device for acquiring video human body actions
JP7427188B2 (en) 3D pose acquisition method and device
JP4761670B2 (en) Moving stereo model generation apparatus and method
US11373329B2 (en) Method of generating 3-dimensional model data
Gordon et al. Flex: Extrinsic parameters-free multi-view 3d human motion reconstruction
CN110544278B (en) Rigid body motion capture method and device and AGV pose capture system
WO2020107312A1 (en) Rigid body configuration method and optical motion capturing method
US11080884B2 (en) Point tracking using a trained network
US20230224576A1 (en) System for generating a three-dimensional scene of a physical environment
CN111316323A (en) Abnormal value processing method and device for three-dimensional trajectory data
JP3548652B2 (en) Apparatus and method for restoring object shape
CN113705379A (en) Gesture estimation method and device, storage medium and equipment
US11341703B2 (en) Methods and systems for generating an animation control rig
Hanbyul Joo et al. Panoptic studio: A massively multiview system for social interaction capture

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant