CN114722913A - Attitude detection method and apparatus, electronic device, and computer-readable storage medium - Google Patents


Info

Publication number
CN114722913A
Authority
CN
China
Prior art keywords
sub
coordinate information
dimensional coordinate
key point
target
Prior art date
Legal status
Pending
Application number
CN202210261735.4A
Other languages
Chinese (zh)
Inventor
Jin Tian
Xu Lifeng
Zhao Lulu
Ding Lusheng
Current Assignee
Beijing Eswin Computing Technology Co Ltd
Original Assignee
Beijing Eswin Computing Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Eswin Computing Technology Co Ltd
Priority to CN202210261735.4A
Publication of CN114722913A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the present application provide a posture detection method and apparatus, an electronic device, and a computer-readable storage medium, relating to the field of computer processing. The method comprises: acquiring an image sequence corresponding to a target object and motion data of a target part of the target object, and determining first three-dimensional coordinate information of each key point of the target part in any image in the image sequence; determining a first unit vector characterizing the direction of the target part according to the first three-dimensional coordinate information of each key point; determining a second unit vector characterizing the direction of the target part based on the motion data of the target part; and determining the posture of the target part according to the similarity between the first unit vector and the second unit vector and the first three-dimensional coordinate information of each key point. By combining the first three-dimensional coordinate information of the target part with its motion data, the posture of the target part can be determined accurately, improving the user experience.

Description

Attitude detection method and apparatus, electronic device, and computer-readable storage medium
Technical Field
The present application relates to the field of computer processing, and in particular, to a method and an apparatus for detecting an attitude, an electronic device, and a computer-readable storage medium.
Background
With the development of deep learning, human pose estimation has become increasingly accurate, and more and more applications have appeared. For example, some terminal devices support motion-sensing games, AI (Artificial Intelligence) fitness, and the like, and some live-streaming software supports driving 3D (3-dimensional) virtual characters.
In the related art, deep learning algorithms based on monocular vision can adapt to most scenes, but when certain parts of the body are occluded or the image is blurred by rapid limb movement, the accuracy of the human posture obtained by such algorithms is poor, and obvious jitter may even occur. Although using an inertial sensor can mitigate the loss of accuracy caused by occluded body parts, inertial measurements contain errors, and accumulating them to compute the displacement of the body causes obvious drift. Thus, neither approach on its own can detect the posture of a living body precisely.
Disclosure of Invention
The embodiment of the application provides a posture detection method and device, electronic equipment and a computer readable storage medium, which can accurately determine the posture of a target part. The specific technical scheme is as follows:
according to an aspect of an embodiment of the present application, there is provided a gesture detection method, including:
acquiring an image sequence corresponding to a target object and motion data of a target part of the target object;
determining first three-dimensional coordinate information of each key point of a target part in any image in an image sequence;
determining a first unit vector representing the direction of the target part according to the first three-dimensional coordinate information of each key point;
determining a second unit vector characterizing a direction of the target site based on the motion data of the target site;
and determining the posture of the target part according to the similarity between the first unit vector and the second unit vector and the first three-dimensional coordinate information of each key point.
According to another aspect of embodiments of the present application, there is provided a posture detecting apparatus including:
the image and motion data acquisition module is used for acquiring an image sequence corresponding to the target object and motion data of a target part of the target object;
the three-dimensional coordinate information determining module is used for determining first three-dimensional coordinate information of each key point of a target part in any image in the image sequence;
the first unit vector determining module is used for determining a first unit vector representing the direction of the target part according to the first three-dimensional coordinate information of each key point;
a second unit vector determination module for determining a second unit vector characterizing the direction of the target site based on the motion data of the target site;
and the posture determining module is used for determining the posture of the target part according to the similarity between the first unit vector and the second unit vector and the first three-dimensional coordinate information of each key point.
According to yet another aspect of embodiments of the present application, there is provided an electronic device comprising a memory, a processor and a computer program stored on the memory, the processor executing the computer program to implement the steps of the above-mentioned method.
According to a further aspect of embodiments of the present application, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the above method.
The technical scheme provided by the embodiment of the application has the following beneficial effects:
according to the target object posture detection method, when the posture of the target object is detected, the first three-dimensional coordinate information of each key point of the target part in any image in the image sequence is determined. By combining the first three-dimensional coordinate information of the target portion with the motion data of the target portion, the similarity between the first unit vector determined based on the first three-dimensional coordinate information of the target portion and the second unit vector determined based on the motion data of the target portion, in consideration of the fact that the real-time performance of the direction of the target portion represented by the second unit vector is stronger, whether the attitude information of the target portion cannot be accurately acquired based on the processing of the image due to the fact that the target portion is blocked in the acquired image or the target object moves fast or the like can be determined based on the similarity. Meanwhile, the stability of the position information of each key point obtained by processing the image is better, and based on the determined result of whether the attitude information of the target part can be accurately obtained based on the processing of the image and the first three-dimensional coordinate information of each key point, the attitude of the target part can be accurately determined, the user experience is improved, and the influence on the user perception is avoided under the conditions that the target part is shielded or moves fast and the like.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments of the present application will be briefly described below.
Fig. 1 is a flowchart illustrating a posture detection method provided in an embodiment of the present application;
FIG. 2 shows a schematic diagram of an application scenario suitable for use in embodiments of the present application;
FIG. 3 shows a schematic diagram of another application scenario suitable for use in embodiments of the present application;
fig. 4 is a schematic structural diagram of an attitude detection apparatus according to an embodiment of the present application;
fig. 5 shows a schematic structural diagram of an electronic device to which the embodiment of the present application is applied.
Detailed Description
Embodiments of the present application are described below in conjunction with the drawings in the present application. It should be understood that the embodiments set forth below in connection with the drawings are exemplary descriptions for explaining technical solutions of the embodiments of the present application, and do not limit the technical solutions of the embodiments of the present application.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should be further understood that the terms "comprises" and/or "comprising", when used in this specification in connection with embodiments of the present application, specify the presence of stated features, information, data, steps, operations, elements, and/or components, but do not preclude the presence or addition of other features, information, data, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein indicates at least one of the items it joins; e.g., "A and/or B" may be implemented as "A", as "B", or as "A and B".
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
Based on the foregoing description, some related technologies require a plurality of cameras to obtain depth information, and the key points detected visually must correspond strictly to the wearing positions of the Inertial Measurement Units (IMUs) to obtain the exact position information of each IMU. The camera placement requirements limit the usable scenes. In other related technologies, the camera intrinsic and extrinsic parameters need to be preset and calibrated, which is complex, further limits the usable scenes, and makes such methods unsuitable for scenarios such as digital games and fitness applications.
In view of the above, the present application provides a posture detection method and apparatus, an electronic device, and a computer-readable storage medium, directed at at least one of the above technical problems in the related art. The scheme determines, on the basis of an image corresponding to a target object, first three-dimensional coordinate information of each key point of a target part of the target object and a first unit vector characterizing the direction of the target part. Based on the motion data of the target part, a second unit vector characterizing the direction of the target part is determined. Then, according to the first unit vector, the second unit vector, and the first three-dimensional coordinate information of each key point, the image detection result and the acquired motion data are combined to accurately determine the posture of the target part, improving the user experience.
The posture detection method may be implemented by a posture detection device, which may be a terminal or a server. The terminal may be any electronic device, such as a fitness terminal or a game terminal; the server may be a local server, a cloud server, or a server cluster composed of at least one of the two. The embodiment of the present application is not limited thereto.
The posture detection method can be applied in different scenarios. For example, in fitness or motion-sensing games, accurately acquiring the user's posture and displaying it on the display of the fitness terminal allows the user to adjust their motion in real time according to the displayed posture, which improves the fitness effect or the game experience. As another example, the method can be applied in game production: the postures of motion actors are captured so that artists can better control the postures of game characters, making the characters' movements in the game scene more lifelike and improving the player's experience.
The technical solutions of the embodiments of the present application and the technical effects they produce are described below through several exemplary embodiments. It should be noted that the following embodiments may refer to, draw on, or be combined with one another, and descriptions of the same terms, similar features, similar implementation steps, and the like are not repeated across embodiments.
Fig. 1 shows a flowchart of a posture detection method provided in an embodiment of the present application. As shown in fig. 1, the method may be applied to any electronic device, and specifically includes steps S110 to S150.
Step S110: and acquiring an image sequence corresponding to the target object and motion data of a target part of the target object.
Step S120: first three-dimensional coordinate information of each key point of a target part in any image in the image sequence is determined.
Step S130: and determining a first unit vector representing the direction of the target part according to the first three-dimensional coordinate information of each key point.
Step S140: based on the motion data of the target site, a second unit vector characterizing the direction of the target site is determined.
Step S150: and determining the posture of the target part according to the similarity between the first unit vector and the second unit vector and the first three-dimensional coordinate information of each key point.
According to the posture detection method provided by the embodiment of the application, the first unit vector is obtained by processing the images including the target object, and the second unit vector is obtained directly from the acquired motion data of the target part. Since the direction of the target part represented by the second unit vector reflects the real motion more promptly, the similarity between the first and second unit vectors indicates whether the posture of the target part can be accurately acquired from the acquired images. Meanwhile, the position information of each key point obtained by processing the images is more stable, so the posture of the target part can be determined accurately from this indication together with the first three-dimensional coordinate information of each key point. This improves the user experience and avoids the situation where, with the target part occluded or moving quickly, the posture cannot be acquired accurately from the images and the user's perception suffers.
Alternatively, the target object may be any movable object, for example a human or an animal. The embodiment of the present application describes the posture detection method with a human as the target object. With a human target object, the target part may be a hand, an arm, a leg, a foot, or the like. The target part may also be a portion of a limb, such as the upper arm or the forearm, or a whole limb; where the target part is a whole arm, it may include two sub-parts, the upper arm and the forearm.
In this embodiment of the present application, any image capture device may capture images of the target object to obtain the corresponding image sequence; this is not limited herein. The image capture device may include, but is not limited to, a camera, a video camera, a still camera, or another device with an image capture function (such as a mobile phone or a tablet computer). The image capture device may be monocular, for example a monocular RGB (color) camera.
The manner of obtaining the motion data of the target part is likewise not limited. Optionally, an inertial sensor, for example an IMU bracelet, may be worn on the target part of the target object, and the data measured by the worn IMU bracelet is used as the motion data of the target part.
The specific characteristics of the inertial sensor are not limited and can be chosen according to the actual situation. To acquire more accurate data for the target part, the worn inertial sensor can be a nine-axis IMU, so that the data of the X, Y, and Z axes of the magnetometer, of the accelerometer, and of the gyroscope serve as motion data of the target part in nine degrees of freedom. The accelerometer data indicates acceleration, the gyroscope data indicates angular velocity, and the magnetometer data indicates direction.
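For concreteness, a nine-axis reading can be grouped into the three 3-axis measurements described above. The following minimal sketch is illustrative only; the type and field names are assumptions, not part of this application:

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class ImuSample:
    """One nine-axis IMU reading for a target part (axes follow the sensor frame)."""
    accel: Tuple[float, float, float]  # accelerometer: acceleration along X, Y, Z (m/s^2)
    gyro: Tuple[float, float, float]   # gyroscope: angular velocity about X, Y, Z (rad/s)
    mag: Tuple[float, float, float]    # magnetometer: field strength along X, Y, Z (uT)

# Example: a sample from an IMU bracelet worn on the forearm
sample = ImuSample(accel=(0.1, -9.8, 0.3), gyro=(0.01, 0.0, -0.02), mag=(22.0, 5.0, -40.0))
```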
Before acquiring the motion data of the target portion, the coordinate system corresponding to the inertial sensor may be calibrated based on the position information of the image capturing device, so that the coordinate system corresponding to the inertial sensor is aligned with the coordinate system corresponding to the image capturing device.
Fig. 2 shows a schematic diagram of an application scenario suitable for an embodiment of the present application. As shown in fig. 2, the image capture device is placed vertically, so that the vertical plane in which it lies is the X-Y plane. The target object stands facing the image capture device in a T-pose, the direction facing the device is set as the positive direction, the inertial sensor is initialized, and its initial reading is recorded. In the subsequent real-time acquisition, the difference between the real-time reading and the initial reading is taken as the motion data of the target part. This completes the calibration of the inertial sensor's coordinate system and aligns it with the coordinate system of the image capture device.
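The calibration described above amounts to recording a reference reading while the T-pose is held and subtracting it from later readings. A minimal sketch under that reading of the text, with hypothetical names (`calibrate`, `motion_data`) and flattened 9-element readings:

```python
import numpy as np

def calibrate(initial_samples: np.ndarray) -> np.ndarray:
    """Average the readings recorded while the subject holds the T-pose.

    initial_samples: shape (N, 9) -- N nine-axis readings captured during calibration.
    Returns the (9,) initial reading used as the sensor's zero reference.
    """
    return initial_samples.mean(axis=0)

def motion_data(realtime_sample: np.ndarray, initial: np.ndarray) -> np.ndarray:
    """Motion data of the target part = real-time reading minus the initial reading,
    aligning the sensor's coordinate system with the camera's."""
    return realtime_sample - initial

# Usage: record ~5 s of samples at 30 Hz while the subject holds the T-pose
calib = calibrate(np.random.randn(150, 9))      # placeholder data for illustration
motion = motion_data(np.random.randn(9), calib)
```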
The number and positions of worn inertial sensors are not limited and can be determined according to the actual situation. For example, for a target part, an inertial sensor may be worn at the very middle of the part, i.e., at the same distance from the key points at its two endpoint positions. Where the target part includes multiple sub-parts, one inertial sensor may be worn on each sub-part.
For any image in the image sequence, the first three-dimensional coordinate information of each key point in the image can be determined by any three-dimensional pose detection model. Alternatively, the two-dimensional coordinate information of each key point can first be determined by any two-dimensional pose detection algorithm, and the first three-dimensional coordinate information is then determined based on the images that have a temporal relationship with this image in the sequence. The embodiments of the present application do not limit this.
It should be understood that the coordinate information (i.e., including two-dimensional coordinate information and three-dimensional coordinate information) of all the key points of the target object may be determined by the method provided in the embodiment of the present application, and in the embodiment of the present application, only each key point of the target portion of the target object is taken as an example for description. The coordinate information of each key point may be coordinate information relative to a root key point (root), where the embodiment of the present application does not limit a selection manner of the root key point, and may be one of all key points of the target object, or may be a designated point in a coordinate system corresponding to the image acquisition device. For example, a waist keypoint of the target object may be taken as a root keypoint.
Optionally, the image sequence includes at least two frames of images, and determining first three-dimensional coordinate information of each key point of the target portion in any one of the images in the image sequence includes:
for any image, determining first two-dimensional coordinate information of each key point;
for any image, determining first three-dimensional coordinate information of each key point in the image based on first two-dimensional coordinate information of each key point in at least one frame of image adjacent to the image in the image sequence and the first two-dimensional coordinate information of each key point corresponding to the image.
In this implementation, the first two-dimensional coordinate information of each keypoint relative to the root keypoint may be determined based on any two-dimensional pose detection model, which may be HRNet (a neural network model).
At least one frame adjacent to the image means the neighboring images that have a temporal relationship with it. If the method is applied in a real-time scenario, the at least one adjacent frame may be the 27 frames acquired before the current image.
For any image, a waist key point of the target object can be selected as the root key point through the VideoPose algorithm; at least one frame adjacent to the image in the sequence is determined, and the first three-dimensional coordinate information of each key point relative to the waist key point is determined based on the temporal relationship between the adjacent images and this image, the two-dimensional coordinate information of each key point in the adjacent images, and the first two-dimensional coordinate information of each key point in this image.
In this way, the first three-dimensional coordinate information of each key point in each frame of the image sequence can be determined accurately, realizing real-time position capture of each key point of the target object.
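As one possible reading of this two-stage pipeline, the sketch below buffers 27 frames of 2D keypoints and lifts the newest frame to root-relative 3D coordinates. `detect_2d` and `lift_3d` are placeholder callables standing in for an HRNet-style detector and a VideoPose-style lifter; neither is an API defined by this application:

```python
from collections import deque
from typing import Callable, Optional

import numpy as np

WINDOW = 27  # frames of temporal context, matching the real-time setting above

class PoseLifter:
    """Buffers per-frame 2D keypoints and lifts the latest frame to 3D."""

    def __init__(self,
                 detect_2d: Callable[[np.ndarray], np.ndarray],
                 lift_3d: Callable[[np.ndarray], np.ndarray]):
        self.detect_2d = detect_2d  # image -> (K, 2), e.g. a wrapper around HRNet
        self.lift_3d = lift_3d      # (WINDOW, K, 2) -> (K, 3), VideoPose-style lifter
        self.buffer: deque = deque(maxlen=WINDOW)

    def process(self, image: np.ndarray) -> Optional[np.ndarray]:
        kp2d = self.detect_2d(image)       # first two-dimensional coordinates
        self.buffer.append(kp2d)
        if len(self.buffer) < WINDOW:
            return None                    # not enough temporal context yet
        window = np.stack(self.buffer)     # (27, K, 2)
        kp3d = self.lift_3d(window)        # (K, 3)
        return kp3d - kp3d[0]              # root-relative; index 0 assumed to be the waist/root
```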
Optionally, the initial three-dimensional coordinate information of each key point corresponding to the target portion is determined by:
acquiring an initial image sequence corresponding to the target object in a specified posture;
determining initial two-dimensional coordinate information of each key point for any initial image in the initial image sequence;
for any initial image, determining initial three-dimensional coordinate information of each key point in the initial image based on initial two-dimensional coordinate information of each key point in at least one frame of image adjacent to the initial image in the initial image sequence and the initial two-dimensional coordinate information of each key point corresponding to the initial image;
determining the length of the target part based on the initial three-dimensional coordinate information of each key point corresponding to the target part, wherein the determining comprises the following steps:
determining two target key points which are positioned at the end point position of the target part in all the key points of the target part;
and determining the distance between the two target key points based on the initial three-dimensional coordinate information of the two target key points, and determining the distance between the two target key points as the length of the target part.
In this implementation, the initial image sequence serves as a reference sequence: the three-dimensional coordinate information of the target object in any posture can be calibrated against the initial three-dimensional coordinate information determined from it.
Considering that the target part of the target object is not occluded when the target object is in the T-pose, the specified posture may be set to the T-pose, and the initial image sequence is acquired while the target object holds the T-pose, so as to determine the initial three-dimensional coordinate information of each key point of the target part from it. In the embodiment of the application, the initial three-dimensional coordinate information of each key point can be determined at the same time as the inertial sensor is calibrated.
The initial three-dimensional coordinate information of each key point may be determined based on the above-described manner of determining the first three-dimensional coordinate information of each key point. The method comprises the following specific steps:
optionally, an initial image sequence including 27 frames of images corresponding to the target object may be acquired by the image acquisition device when the target object is in the T-shaped pose, and each image in the initial image sequence is an initial image of the target object in the T-shaped pose. For example, the target object may be controlled to be kept in the T-shaped pose for a preset time period, and within the preset time period, an initial image sequence corresponding to the target object is acquired. The embodiment of the application does not limit the preset time period and can be determined according to actual conditions. For example, in the case where the frequency with which the image capturing apparatus captures images is 30 times/second, the preset time period may be set to 5 seconds.
For any initial image in the initial image sequence, the initial two-dimensional coordinate information of each key point may be determined based on HRNet. For any image in the initial sequence, a waist key point of the target object is selected as the root key point through the VideoPose algorithm, the 27 frames adjacent to the initial image are determined, and the initial three-dimensional coordinate information of each key point relative to the waist key point is determined based on the temporal relationship between the adjacent images and this image, the initial two-dimensional coordinate information of each key point in the adjacent images, and the initial two-dimensional coordinate information of each key point in this image.
A target part may have multiple key points. In the embodiment of the present application, after the initial three-dimensional coordinate information of each key point of the target part is determined, the description takes as an example two target key points located at the endpoint positions of the target part. The distance $L_0 = |ki_1 - ki_2|$ between the initial three-dimensional coordinates $ki_1$ and $ki_2$ of the two target key points can be determined, and this distance $L_0$ is taken as the length of the target part.

When the target object is in any posture, the displacement $k_2 - k_1$ between the first three-dimensional coordinates $k_1$ and $k_2$ of the two key points of the target part can be determined, and the ratio of this displacement to the length $L_0$ of the target part, $\vec{v}_1 = (k_2 - k_1)/L_0$, is determined as the first unit vector characterizing the direction of the target part (i.e., the unit vector corresponding to the target part obtained by processing the image).
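A minimal sketch of these two formulas; the function names and example coordinates are illustrative:

```python
import numpy as np

def segment_length(ki_1: np.ndarray, ki_2: np.ndarray) -> float:
    """Length L0 of the target part from the initial (T-pose) keypoint coordinates."""
    return float(np.linalg.norm(ki_1 - ki_2))

def first_unit_vector(k_1: np.ndarray, k_2: np.ndarray, L0: float) -> np.ndarray:
    """First unit vector: displacement between the current endpoint keypoints divided
    by the fixed segment length L0 (close to unit length when the detected
    endpoint distance is consistent with L0)."""
    return (k_2 - k_1) / L0

# Usage with made-up root-relative coordinates (meters)
L0 = segment_length(np.array([0.2, 0.5, 0.0]), np.array([0.2, 0.2, 0.0]))
v1 = first_unit_vector(np.array([0.2, 0.45, 0.10]), np.array([0.2, 0.18, 0.15]), L0)
```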
In this implementation, a second unit vector $\vec{v}_2$ characterizing the direction of the target part (i.e., the unit vector corresponding to the target part obtained by processing the motion data) may be determined from the motion data of the target part, for example by an algorithm based on an AHRS (Attitude and Heading Reference System).
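The application does not fix a particular AHRS implementation. The sketch below assumes some AHRS filter has already fused the nine-axis data into an orientation quaternion, and that the segment's rest direction in the calibrated frame is known (here assumed to point down); both assumptions are illustrative:

```python
import numpy as np
from scipy.spatial.transform import Rotation

# Direction of the segment in the calibrated T-pose frame (assumed; here: pointing down)
REST_DIR = np.array([0.0, -1.0, 0.0])

def second_unit_vector(quat_xyzw: np.ndarray) -> np.ndarray:
    """Second unit vector: rotate the segment's rest direction by the orientation
    that an AHRS filter estimated from the IMU's nine-axis motion data."""
    v = Rotation.from_quat(quat_xyzw).apply(REST_DIR)
    return v / np.linalg.norm(v)

v2 = second_unit_vector(np.array([0.0, 0.0, 0.7071, 0.7071]))  # 90 deg about Z (example)
```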
It should be understood that, in the ideal case, the first unit vector and the second unit vector corresponding to the same target part should be identical, and the similarity between them should be 1; that is, the direction of the target part should be the same no matter how it is determined. In practice, because the target part may be occluded in the acquired image, or a clear image may not be obtainable while the target object moves quickly, the unit vector obtained by processing the image may be inaccurate. In that case, determining the posture of the target part from the image-derived first three-dimensional coordinate information alone would yield an inaccurate posture.
To avoid this, in the embodiment of the present application, after the first and second unit vectors of the target part and the first three-dimensional coordinate information of each key point are determined, the similarity between the two unit vectors can be used to determine whether the posture information of the target part cannot be accurately acquired from the image (because the target part is occluded in the acquired image, the target object moves quickly, or the like), and the posture of the target part can then be determined with the help of the first three-dimensional coordinate information of each key point.
If the similarity between the first unit vector and the second unit vector is high, the direction of the target part obtained from the image is consistent with the direction obtained from the motion data, and the target part is not occluded in the acquired image. If the similarity is low, the two directions are inconsistent, and the target part is occluded in the acquired image.
When processing an image including the target object, even if some key points of the target part are occluded, the first three-dimensional coordinate information of the occluded key points can be predicted from the first three-dimensional coordinate information of the other, non-occluded key points; that is, when some key points of the target part are occluded, the obtained first three-dimensional coordinate information of the occluded key points may not be accurate enough. Considering that the length of the target part does not change, and that the second unit vector characterizing the direction of the target part, determined from the motion data, is more up to date, the embodiment of the present application further provides the following specific manner of determining the posture of the target part:
optionally, determining the pose of the target portion according to the similarity between the first unit vector and the second unit vector and the first three-dimensional coordinate information of each key point, including:
if the similarity is larger than or equal to a preset threshold, determining the posture of the target part based on the first three-dimensional coordinate information of each key point corresponding to the target part;
if the similarity is smaller than a preset threshold value, determining the length of the target part based on the initial three-dimensional coordinate information of each key point corresponding to the target part, and determining second three-dimensional coordinate information of other key points based on the length of the target part and a second unit vector; and determining the posture of the target part based on the first three-dimensional coordinate information of the first specified key point and the second three-dimensional coordinate information of other key points, wherein the other key points are the key points except the first specified key point in each key point.
In this implementation, the specific value of the preset threshold is not limited; it may be an experimental or empirical value, or be determined according to the actual situation. For example, the preset threshold may be set to 0.7. In the embodiment of the present application, since the moduli of the first and second unit vectors are both 1, the inner product of the two, i.e., the cosine of the angle between them, can be used directly as the similarity. For a first unit vector $\vec{v}_1$ and a second unit vector $\vec{v}_2$, the similarity is $s = \vec{v}_1 \cdot \vec{v}_2$.
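In code, with unit-vector inputs the similarity is just the dot product; a minimal sketch:

```python
import numpy as np

def similarity(v1: np.ndarray, v2: np.ndarray) -> float:
    """Cosine similarity; since both inputs are unit vectors, the inner product
    already equals the cosine of the angle between them."""
    return float(np.dot(v1, v2))

assert abs(similarity(np.array([1.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])) - 1.0) < 1e-9
```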
If the similarity is greater than or equal to the preset threshold, i.e., the similarity between the first and second unit vectors is high as described above, the direction of the target part obtained from the image is consistent with the direction obtained from the motion data, and the target part is not occluded in the acquired image.
Since the position information of each key point obtained by processing the image is more stable, the first three-dimensional coordinate information of each key point obtained from the image can be used as the three-dimensional coordinate information required for determining the posture of the target part; that is, the posture of the target part can be determined directly based on the first three-dimensional coordinate information of each key point corresponding to the target part.
Alternatively, since the direction of the target part obtained from the image is consistent with the direction obtained from the motion data, and the motion-data-based direction is more up to date, the posture of the target part can be determined more accurately by combining the first three-dimensional coordinate information of each key point with the direction obtained from the motion data, yielding a dynamic target part, i.e., a complete representation of the target part's movement.
If the similarity is smaller than the preset threshold, i.e., the similarity between the first and second unit vectors is low as described above, the direction of the target part obtained from the image is inconsistent with the direction obtained from the motion data; the target part may be occluded in the acquired image, or accurate information about it may be unobtainable through the image capture device because the target object moves quickly.
Whether or not the target part is occluded and whether or not the target object moves quickly, the length of the target part does not change, and the direction of the target part determined from the motion data can be measured in real time; that is, the second unit vector characterizing the direction of the target part is more up to date. Therefore, the first three-dimensional coordinate information of each key point of the target part can be corrected based on the length of the target part and its motion data, yielding the corrected coordinates of each key point, i.e., the second three-dimensional coordinate information (the three-dimensional coordinate information ultimately used to determine the posture of the target part).
Alternatively, a key point of the target part with a low likelihood of change may be selected as the first designated key point; its first three-dimensional coordinate information obtained from the image is taken as its second three-dimensional coordinate information, and the second three-dimensional coordinate information of the other key points is determined from the length of the target part and the second unit vector determined from the motion data.
Take the target part to be a lower leg (without distinguishing left from right). The key points at the two endpoint positions of the lower leg are a knee key point and an ankle key point, with first three-dimensional coordinates $k_3$ and $k_4$ respectively. Since the first three-dimensional coordinate $k_3$ of the knee key point is relatively fixed under normal conditions, the knee key point may be set as the first designated key point. Assume the initial three-dimensional coordinates of the knee and ankle key points are $ki_3$ and $ki_4$, so that the length of the lower leg is $L' = |ki_4 - ki_3|$. Let $\vec{v}_1$ be the direction of the lower leg obtained by processing the image, and let $\vec{v}_2$ be the unit vector of the lower leg (pointing from the knee key point to the ankle key point) determined from the motion data. If the similarity $\vec{v}_1 \cdot \vec{v}_2$ is less than the preset threshold, the second three-dimensional coordinate of the ankle key point can be determined as $k_4' = k_3 + L' \vec{v}_2$.
After determining the second three-dimensional coordinate information of the other keypoints of the target portion, the pose of the target portion may be determined based on the first three-dimensional coordinate information of the first specified keypoint and the second three-dimensional coordinate information of the other keypoints.
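A sketch of this correction and the threshold decision, using the lower-leg example; the 0.7 threshold is the example value from the text, and the coordinates are illustrative:

```python
import numpy as np

THRESHOLD = 0.7  # example preset threshold from the text

def corrected_endpoint(k_fixed: np.ndarray, length: float, v2: np.ndarray) -> np.ndarray:
    """Second 3D coordinate of the far keypoint: start at the designated (stable)
    keypoint and extend by the fixed segment length along the IMU direction."""
    return k_fixed + length * v2

def endpoint_for_pose(sim: float, k_fixed, k_far, length, v2):
    """Use the vision result when similarity is high; otherwise correct it."""
    if sim >= THRESHOLD:
        return k_far                            # image-based coordinate is trusted
    return corrected_endpoint(k_fixed, length, v2)

# Lower-leg example: knee is the first designated keypoint
knee = np.array([0.1, 0.0, 0.0]); ankle_img = np.array([0.1, -0.4, 0.05])
ankle = endpoint_for_pose(0.4, knee, ankle_img, 0.42, np.array([0.0, -1.0, 0.0]))
```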
In this way, the posture of the target part is determined according to the similarity between the first and second unit vectors: when the similarity is high, directly from the first three-dimensional coordinate information of each key point obtained from the image; when the similarity is low, from the first three-dimensional coordinate information of the first designated key point, the second unit vector of the target part, and the length of the target part (which does not change). Thus, the posture of the target part can be determined accurately even when it cannot be accurately acquired from the captured images of the target object.
Optionally, the target portion includes at least two sub portions having a connection relationship, the first three-dimensional coordinate information of each keypoint includes the first three-dimensional coordinate information of each keypoint of each sub portion, and the motion data of the target portion includes motion data corresponding to each sub portion;
determining a first unit vector corresponding to the target part according to the first three-dimensional coordinate information of each key point, wherein the determining comprises the following steps:
for each sub-part, determining a first unit vector corresponding to the sub-part according to first three-dimensional coordinate information of a key point of the sub-part;
determining a second unit vector for the target site based on the motion data for the target site, comprising:
for each sub-part, determining a second unit vector of the sub-part according to the motion data of the sub-part;
determining the posture of the target part according to the similarity between the first unit vector and the second unit vector and the first three-dimensional coordinate information of each key point, wherein the method comprises the following steps:
for each sub-part, determining the posture of the sub-part according to the similarity between a first unit vector corresponding to the sub-part and a second unit vector corresponding to the sub-part and the first three-dimensional coordinate information of each key point of the sub-part;
the pose of the target portion is determined based on the poses of the sub-portions of the target portion.
In this implementation, the target site may be divided into at least two sub-sites having a connection relationship based on a connection key point among the key points of the target site.
As shown in fig. 2 and 3, take the target part to be the right arm (the arm on the negative direction of the x-axis). The key point corresponding to the right elbow is the connecting key point of this target part, which divides it into 2 sub-parts: the upper arm of the right arm (hereinafter, the "right upper arm") and the forearm of the right arm (hereinafter, the "right forearm"). Correspondingly, if the target part is the left arm of the target object, the key point corresponding to the left elbow is its connecting key point, dividing it into the "left upper arm" and the "left forearm". As shown in fig. 2 and 3, four inertial sensors may be worn, one at any position on each of the right upper arm, right forearm, left upper arm, and left forearm of the target object.
Referring to the manner described above of determining the initial three-dimensional coordinate information and the first three-dimensional coordinate information of the key points of a target part from the images in an image sequence, as shown in fig. 2, it can be determined that the initial three-dimensional coordinates of the key points of the right upper arm are $ki_{r1}, ki_{r2}$ and those of the right forearm are $ki_{r2}, ki_{r3}$; the initial three-dimensional coordinates of the key points of the left upper arm are $ki_{l1}, ki_{l2}$ and those of the left forearm are $ki_{l2}, ki_{l3}$. As shown in fig. 3, it can be determined that the first three-dimensional coordinates of the key points of the right upper arm are $k_{r1}, k_{r2}$, those of the right forearm are $k_{r2}, k_{r3}$, those of the left upper arm are $k_{l1}, k_{l2}$, and those of the left forearm are $k_{l2}, k_{l3}$.

Referring to the manner described above of determining the length of a target part from the initial three-dimensional coordinates of its key points, and of determining the first unit vector from the first three-dimensional coordinates of its key points, it can be determined that the length of the right upper arm is $L_1 = |ki_{r2} - ki_{r1}|$ with first unit vector $\vec{v}_{1,ru} = (k_{r2} - k_{r1})/L_1$; the length of the right forearm is $L_2 = |ki_{r3} - ki_{r2}|$ with first unit vector $\vec{v}_{1,rf} = (k_{r3} - k_{r2})/L_2$; the length of the left upper arm is $L_3 = |ki_{l2} - ki_{l1}|$ with first unit vector $\vec{v}_{1,lu} = (k_{l2} - k_{l1})/L_3$; and the length of the left forearm is $L_4 = |ki_{l3} - ki_{l2}|$ with first unit vector $\vec{v}_{1,lf} = (k_{l3} - k_{l2})/L_4$.
Referring to the manner described above of determining the second unit vector of the target part from its motion data, the second unit vectors of the right upper arm, right forearm, left upper arm, and left forearm, denoted $\vec{v}_{2,ru}$, $\vec{v}_{2,rf}$, $\vec{v}_{2,lu}$, and $\vec{v}_{2,lf}$, may be determined based on the AHRS algorithm combined with the motion data of each sub-part.
Referring to the manner described above of determining the similarity between the first and second unit vectors of the target part, the similarity corresponding to the right upper arm (i.e., the similarity between the first unit vector and the second unit vector of that sub-part) can be determined as $a = \vec{v}_{1,ru} \cdot \vec{v}_{2,ru}$, the similarity corresponding to the right forearm as $b = \vec{v}_{1,rf} \cdot \vec{v}_{2,rf}$, the similarity corresponding to the left upper arm as $c = \vec{v}_{1,lu} \cdot \vec{v}_{2,lu}$, and the similarity corresponding to the left forearm as $d = \vec{v}_{1,lf} \cdot \vec{v}_{2,lf}$.
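A sketch of these per-sub-part computations as one loop; the keypoint names and dictionary layout are assumptions for illustration:

```python
import numpy as np

# Arm segments as (proximal keypoint, distal keypoint) names; the names are assumptions
SEGMENTS = {
    "right_upper_arm": ("r_shoulder", "r_elbow"),
    "right_forearm":   ("r_elbow",    "r_wrist"),
    "left_upper_arm":  ("l_shoulder", "l_elbow"),
    "left_forearm":    ("l_elbow",    "l_wrist"),
}

def arm_similarities(kp_init: dict, kp_now: dict, imu_dirs: dict) -> dict:
    """For each sub-part: length from the T-pose keypoints, first unit vector from the
    current keypoints, and similarity against the IMU-derived second unit vector."""
    sims = {}
    for name, (p, q) in SEGMENTS.items():
        length = np.linalg.norm(kp_init[q] - kp_init[p])   # L1..L4
        v1 = (kp_now[q] - kp_now[p]) / length              # first unit vector
        sims[name] = float(np.dot(v1, imu_dirs[name]))     # a, b, c, d in the text
    return sims
```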
Because the sub-parts of the target part are connected to one another, the first unit vector, the second unit vector, and their similarity are determined in turn for each connected sub-part; for each sub-part, its posture can be determined according to its similarity and the first three-dimensional coordinate information of its key points. After the postures of the sub-parts are determined, the posture of the target part can be determined from the postures of the sub-parts and the connection relationships between them, making the determined posture of the target part more accurate.
As described above, when processing an image that includes the target object, if some key points of the target part are occluded, the obtained first three-dimensional coordinate information of the occluded key points may not be accurate enough; that is, the first three-dimensional coordinate information of key points in the individual sub-parts may not be accurate enough. Based on the same considerations, namely that the length of each sub-part does not change and that the second unit vector characterizing the direction of each sub-part, determined from its motion data, is more up to date, the embodiment of the present application further provides the following way of determining the posture of the target part:
for each sub-part, determining the posture of the sub-part according to the similarity between the first unit vector corresponding to the sub-part and the second unit vector corresponding to the sub-part and the first three-dimensional coordinate information of each key point of the sub-part, including:
for a first sub-part with the similarity greater than or equal to a corresponding preset threshold, determining the posture of the sub-part based on first three-dimensional coordinate information of each key point of the sub-part;
for a second sub-part with the similarity smaller than a corresponding preset threshold, determining second three-dimensional coordinate information of each key point of the sub-part based on first three-dimensional coordinate information of each key point of the sub-part and a second unit vector corresponding to the sub-part; and determining the posture of the sub-part according to the second three-dimensional coordinate information of each key point of the sub-part.
In this implementation manner, the preset threshold corresponding to each sub-part may be the same or different, and this is not limited in this embodiment of the application. Taking the target part described above as the right arm as an example, considering that the range of motion of the right upper arm is smaller than that of the right forearm, the preset threshold $t_1$ corresponding to the right upper arm may be set greater than the preset threshold $t_2$ corresponding to the right forearm; for example, $t_1$ may be set to 0.8 and $t_2$ to 0.6.
As before, for a sub-part whose similarity is greater than or equal to its preset threshold, the first unit vector obtained by image processing and the second unit vector obtained from the sub-part's motion data agree closely, and the posture of that sub-part (a first sub-part) can be determined directly based on the first three-dimensional coordinate information of its key points obtained by image processing.
As shown in fig. 3, taking the target part as the right arm as an example, after the similarity $a$ corresponding to the right upper arm and the similarity $b$ corresponding to the right forearm are determined, if $a > 0.8$ and $b > 0.6$, the posture of the right upper arm can be determined based on $k_{r1}, k_{r2}$, and the posture of the right forearm based on $k_{r2}, k_{r3}$.
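A sketch of this per-sub-part decision with the example thresholds $t_1 = 0.8$ and $t_2 = 0.6$; the dictionary keys reuse the assumed names above:

```python
# Per-sub-part thresholds: the upper arm moves through a smaller range than the
# forearm, so its threshold is set higher (t1 = 0.8 > t2 = 0.6 in the text).
THRESHOLDS = {"right_upper_arm": 0.8, "right_forearm": 0.6}

def trusted_subparts(sims: dict) -> dict:
    """True for sub-parts whose image-based keypoints can be used directly."""
    return {name: sims[name] >= THRESHOLDS[name] for name in THRESHOLDS}

print(trusted_subparts({"right_upper_arm": 0.85, "right_forearm": 0.55}))
# {'right_upper_arm': True, 'right_forearm': False}
```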
If the similarity of a sub-part is smaller than its preset threshold, the first unit vector obtained from image processing and the second unit vector obtained from the sub-part's motion data disagree; the first three-dimensional coordinate information of that sub-part's key points (a second sub-part) must be corrected, i.e., its second three-dimensional coordinate information must be determined, so that its posture can be determined accurately.
In this way, the posture of each sub-part can be determined accurately.
As described above, since the length of a sub-part does not change, the length of the sub-part may be determined from the initial three-dimensional coordinate information of its key points, and the second three-dimensional coordinate information of its key points may be determined from that length and the sub-part's second unit vector, so that the posture of the sub-part is determined from the second three-dimensional coordinate information of its key points. A specific implementation follows:
optionally, the determining, for the second sub-portion, second three-dimensional coordinate information of each key point of the sub-portion based on the first three-dimensional coordinate information of each key point of the sub-portion and the second unit vector corresponding to the sub-portion includes:
determining the length of the sub-part based on the initial three-dimensional coordinate information of each key point of the sub-part;
if the at least two sub-parts comprise a first sub-part and a second sub-part, taking the first three-dimensional coordinate information of a first connecting key point, among the key points of a target first sub-part having a connection relation with the sub-part, as the second three-dimensional coordinate information of the first connecting key point in the sub-part; and determining the second three-dimensional coordinate information of the other key points in the sub-part, except the first connecting key point, according to the length of the sub-part and the second unit vector corresponding to the sub-part, wherein the target first sub-part is a first sub-part for which the second three-dimensional coordinate information of each corresponding key point has been determined, and the first connecting key point is a common key point of the sub-part and the target first sub-part;
if each of the at least two sub-parts is a second sub-part, determining a second designated key point from among the key points, and determining the second three-dimensional coordinate information of the key points of each sub-part in the following manner:
for the designated second sub-part to which the second designated key point belongs, taking the first three-dimensional coordinate information of the second designated key point as the second three-dimensional coordinate information of the second designated key point, and determining the second three-dimensional coordinate information of the other key points in the sub-part, except the second designated key point, according to the length of the sub-part and the second unit vector corresponding to the sub-part;
and for the other sub-parts except the designated second sub-part, determining the second three-dimensional coordinate information of the other key points in the sub-part, except a second connecting key point, according to the length of the sub-part and the second unit vector corresponding to the sub-part, wherein the target second sub-part is a second sub-part for which the second three-dimensional coordinate information of each corresponding key point has been determined, and the second connecting key point is a common key point of the sub-part and a target second sub-part having a connection relation with the sub-part.
As described above, the posture of a first sub-part may be determined directly based on the first three-dimensional coordinate information of each key point of the first sub-part; that is, the first three-dimensional coordinate information of each key point of the first sub-part may be used directly as the three-dimensional coordinate information of the corresponding key point for determining the posture of the first sub-part.
Because a connection relationship exists between the sub-parts of the target portion, mutually connected sub-parts share a common key point (that is, one key point in the two sub-parts is the same, and that shared key point lies at an end-point position of each of the two sub-parts). If the target portion includes both a first sub-part and a second sub-part, then for a second sub-part having a connection relationship with a first sub-part, the second three-dimensional coordinate information of the key point shared by the two sub-parts can be determined from the first three-dimensional coordinate information of the key points of the first sub-part; it is, in substance, the first three-dimensional coordinate information of that common key point. Taking as an example a second sub-part whose key points lie at its two end-point positions: once the second three-dimensional coordinate information of the key point at one end-point position is obtained, the second three-dimensional coordinate information of the other key point of the second sub-part can be determined based on the length of the second sub-part (i.e., the distance between the key points at its two end-point positions, computed from their initial three-dimensional coordinate information) and the second unit vector of the sub-part.
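The correction step described in this paragraph reduces to a single vector operation. The sketch below assumes, for illustration, that the second unit vector points from the shared (proximal) key point toward the non-shared (distal) key point of the second sub-part:

    import numpy as np

    def corrected_endpoint(shared_kp, length, second_unit_vec):
        # Second 3D coordinates of the non-shared key point: start from the
        # shared key point and move `length` along the IMU-derived direction.
        return np.asarray(shared_kp) + length * np.asarray(second_unit_vec)

    # Example with the right arm: the elbow kr2 is shared with the right big
    # arm, so the corrected wrist position would be
    # kr3_prime = corrected_endpoint(kr2, forearm_length, v_forearm)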
As shown in FIG. 3, taking the target portion described above as the right arm, if a ≥ 0.8 and b < 0.6, the posture of the right big arm can be determined based on kr1 and kr2, and the second three-dimensional coordinate information of the key point of the right forearm that is not shared with the right big arm can then be determined according to the length of the right forearm and the second unit vector corresponding to the right forearm, specifically:
kr3' = kr2 + l2 · v2
where l2 is the length of the right forearm and v2 is the second unit vector corresponding to the right forearm; the posture of the right forearm is then determined based on kr2 and kr3'.
If a < 0.8 and b ≥ 0.6, the posture of the right forearm can be determined based on kr2 and kr3, and the second three-dimensional coordinate information of the key point of the right big arm that is not shared with the right forearm can then be determined according to the length of the right big arm and the second unit vector corresponding to the right big arm, specifically:
kr1' = kr2 - l1 · v1
where l1 is the length of the right big arm and v1 is the second unit vector corresponding to the right big arm (taken as pointing from the shoulder key point toward the elbow key point); the posture of the right big arm is then determined based on kr1' and kr2.
If all the sub-parts of the target portion are second sub-parts, that is, if for every sub-part of the target portion the similarity between the first unit vector obtained by image processing and the second unit vector obtained from the motion data of the sub-part is low, the target portion may be occluded in the image. In this case, in order to reduce errors as much as possible, a key point with a small expected offset may be selected from among the key points as the second designated key point, and the first three-dimensional coordinate information of the second designated key point may be taken as its second three-dimensional coordinate information (i.e., the three-dimensional coordinate information of the second designated key point required for determining the posture of the designated second sub-part to which it belongs). The second three-dimensional coordinate information of the other key points in that sub-part, except the second designated key point, is then determined according to the length of the sub-part and the second unit vector corresponding to the sub-part.
Since the second three-dimensional coordinate information of each key point of the designated second sub-part has then been determined, that sub-part can play the same role as a first sub-part in the procedure above: for each remaining second sub-part having a connection relationship with a sub-part whose key-point coordinates have already been determined, the second three-dimensional coordinate information of its key points is determined from the shared key point, the length of the sub-part, and the second unit vector of the sub-part. In this way, the second three-dimensional coordinate information of all key points in all sub-parts of the target portion can be determined, and thus the postures of all sub-parts of the target portion.
As shown in FIG. 3, taking the target portion described above as the right arm as an example, if a < 0.8 and b < 0.6, it can be determined that the right arm is likely occluded. Since the key point shared by the right big arm and the right shoulder belongs to the shoulder joint and generally does not undergo a large offset, that key point can be used as the second designated key point, and kr1 can be taken as the second three-dimensional coordinate information of the second designated key point. The second three-dimensional coordinate information of the other key point of the right big arm is then determined as
kr2' = kr1 + l1 · v1
and the posture of the right big arm is determined based on kr1 and kr2'. The second three-dimensional coordinate information of the key point of the right forearm that is not shared with the right big arm is further determined as
kr3' = kr2' + l2 · v2
and the posture of the right forearm is determined based on kr2' and kr3', with l1, v1, l2, v2 as defined above.
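As a minimal sketch of this fully occluded case (the function name and the convention that each second unit vector points away from the shoulder are assumptions for illustration):

    import numpy as np

    def rebuild_right_arm(kr1, l1, v1, l2, v2):
        # kr1: shoulder key point, kept as the second designated key point.
        # l1, v1: length and second unit vector of the right big arm.
        # l2, v2: length and second unit vector of the right forearm.
        kr2_prime = np.asarray(kr1) + l1 * np.asarray(v1)  # corrected elbow
        kr3_prime = kr2_prime + l2 * np.asarray(v2)        # corrected wrist
        return kr2_prime, kr3_prime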
In this way, the postures of all the sub-parts of the target portion can be accurately determined, so that the posture of the target portion can be better determined.
It should be noted that, although the above description takes as an example the case where the target portion, or each sub-part, includes two key points, each located at an end-point position of the corresponding part, in practical applications a target portion or a sub-part may include more than two key points. The same approach may then be adopted: an inertial sensor is worn between every two adjacent key points, the second three-dimensional coordinate information of all key points in the target portion is determined based on the first three-dimensional coordinate information of each key point, the length of the segment between every two adjacent key points, the first and second unit vectors corresponding to each such segment, and the connection relationships between the segments, and the posture of the target portion is accurately determined based on the second three-dimensional coordinate information of all key points in the target portion.
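A generic sketch of this chain propagation, under the assumption that each segment's second unit vector points from the key point nearer the trusted anchor toward the next key point, might look as follows:

    import numpy as np

    def propagate_chain(anchor_kp, segment_lengths, segment_unit_vecs):
        # anchor_kp: trusted key point at one end of the chain.
        # segment_lengths[i], segment_unit_vecs[i]: length and second unit
        # vector of the segment between key point i and key point i + 1.
        keypoints = [np.asarray(anchor_kp, dtype=float)]
        for length, unit_vec in zip(segment_lengths, segment_unit_vecs):
            keypoints.append(keypoints[-1] + length * np.asarray(unit_vec))
        return keypoints  # second 3D coordinates of every key point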
Based on the same principle as the posture detection method provided in the embodiments of the application, an embodiment of the application provides a posture detection apparatus. Fig. 4 is a schematic structural diagram of a posture detection apparatus according to an embodiment of the present application; as shown in Fig. 4, the apparatus 40 may include:
an image and motion data acquiring module 401, configured to acquire an image sequence corresponding to a target object and motion data of a target portion of the target object;
a three-dimensional coordinate information determining module 402, configured to determine first three-dimensional coordinate information of each key point of a target portion in any image in an image sequence;
a first unit vector determining module 403, configured to determine, according to the first three-dimensional coordinate information of each key point, a first unit vector representing a direction of the target portion;
a second unit vector determining module 404, configured to determine a second unit vector characterizing the direction of the target portion based on the motion data of the target portion;
and a pose determining module 405, configured to determine a pose of the target portion according to the similarity between the first unit vector and the second unit vector and the first three-dimensional coordinate information of each key point.
Optionally, the pose determining module 405, when determining the pose of the target portion according to the similarity between the first unit vector and the second unit vector and the first three-dimensional coordinate information of each key point, is specifically configured to:
if the similarity is larger than or equal to a preset threshold value, determining the posture of the target part based on the first three-dimensional coordinate information of each key point corresponding to the target part;
if the similarity is smaller than a preset threshold value, determining the length of the target part based on the initial three-dimensional coordinate information of each key point corresponding to the target part, and determining second three-dimensional coordinate information of other key points based on the length of the target part and a second unit vector; and determining the posture of the target part based on the first three-dimensional coordinate information of the first specified key point and the second three-dimensional coordinate information of other key points, wherein the other key points are the key points except the first specified key point in each key point.
Optionally, the target portion includes at least two sub portions having a connection relationship, the first three-dimensional coordinate information of each keypoint includes the first three-dimensional coordinate information of each keypoint of each sub portion, and the motion data of the target portion includes motion data corresponding to each sub portion;
the first unit vector determining module 403, when determining the first unit vector corresponding to the target portion according to the first three-dimensional coordinate information of each key point, is specifically configured to:
for each sub-part, determining a first unit vector corresponding to the sub-part according to first three-dimensional coordinate information of a key point of the sub-part;
the second unit vector determining module 404, when determining the second unit vector of the target portion based on the motion data of the target portion, is specifically configured to:
for each sub-part, determining a second unit vector of the sub-part according to the motion data of the sub-part;
when determining the pose of the target portion according to the similarity between the first unit vector and the second unit vector and the first three-dimensional coordinate information of each key point, the pose determination module 405 is specifically configured to:
for each sub-part, determining the posture of the sub-part according to the similarity between a first unit vector corresponding to the sub-part and a second unit vector corresponding to the sub-part and the first three-dimensional coordinate information of each key point of the sub-part;
the pose of the target portion is determined based on the poses of the sub-portions of the target portion.
Optionally, for each sub-part, when determining the pose of the sub-part according to the similarity between the first unit vector corresponding to the sub-part and the second unit vector corresponding to the sub-part, and the first three-dimensional coordinate information of each key point of the sub-part, the pose determination module 405 is specifically configured to:
for a first sub-part with the similarity greater than or equal to a corresponding preset threshold, determining the posture of the sub-part based on first three-dimensional coordinate information of each key point of the sub-part;
for a second sub-part with the similarity smaller than a corresponding preset threshold, determining second three-dimensional coordinate information of each key point of the sub-part based on first three-dimensional coordinate information of each key point of the sub-part and a second unit vector corresponding to the sub-part; and determining the posture of the sub-part according to the second three-dimensional coordinate information of each key point of the sub-part.
Optionally, the initial three-dimensional coordinate information of each key point includes the initial three-dimensional coordinate information of each key point of each sub-part, and for a second sub-part, the pose determining module 405 is specifically configured to, when determining the second three-dimensional coordinate information of each key point of the sub-part based on the first three-dimensional coordinate information of each key point of the sub-part and the second unit vector corresponding to the sub-part:
determining the length of the sub-part based on the initial three-dimensional coordinate information of each key point of the sub-part;
if the at least two sub-parts comprise a first sub-part and a second sub-part, taking the first three-dimensional coordinate information of a first connecting key point, among the key points of a target first sub-part having a connection relation with the sub-part, as the second three-dimensional coordinate information of the first connecting key point in the sub-part; and determining the second three-dimensional coordinate information of the other key points in the sub-part, except the first connecting key point, according to the length of the sub-part and the second unit vector corresponding to the sub-part, wherein the target first sub-part is a first sub-part for which the second three-dimensional coordinate information of each corresponding key point has been determined, and the first connecting key point is a common key point of the sub-part and the target first sub-part;
if each of the at least two sub-parts is a second sub-part, determining a second designated key point from among the key points, and determining the second three-dimensional coordinate information of the key points of each sub-part in the following manner:
for the designated second sub-part to which the second designated key point belongs, taking the first three-dimensional coordinate information of the second designated key point as the second three-dimensional coordinate information of the second designated key point, and determining the second three-dimensional coordinate information of the other key points in the sub-part, except the second designated key point, according to the length of the sub-part and the second unit vector corresponding to the sub-part;
and for the other sub-parts except the designated second sub-part, determining the second three-dimensional coordinate information of the other key points in the sub-part, except a second connecting key point, according to the length of the sub-part and the second unit vector corresponding to the sub-part, wherein the target second sub-part is a second sub-part for which the second three-dimensional coordinate information of each corresponding key point has been determined, and the second connecting key point is a common key point of the sub-part and a target second sub-part having a connection relation with the sub-part.
Optionally, the image sequence includes at least two frames of images, and the three-dimensional coordinate information determining module 402 is specifically configured to, when determining the first three-dimensional coordinate information of each key point of the target portion in any image in the image sequence:
for any image, determining first two-dimensional coordinate information of each key point;
for any image, determining first three-dimensional coordinate information of each key point in the image based on first two-dimensional coordinate information of each key point in at least one frame of image adjacent to the image in the image sequence and the first two-dimensional coordinate information of each key point corresponding to the image.
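As a rough illustration of this temporal lifting step, the sketch below stacks the 2D key points of the current frame and its adjacent frames and passes them to a lifting model. The function lift_2d_to_3d is a hypothetical placeholder (the embodiment does not specify the model), as are all names here:

    import numpy as np

    def lift_2d_to_3d(temporal_2d):
        # Hypothetical stand-in for a learned 2D-to-3D lifting model
        # (e.g., a temporal-convolution network); returns zeros as a stub.
        _, num_kp, _ = temporal_2d.shape
        return np.zeros((num_kp, 3))

    def first_3d_coords(frames_2d, idx, window=1):
        # frames_2d: list of per-frame arrays of shape (K, 2); the window
        # gathers `window` adjacent frames on each side of frame `idx`.
        lo, hi = max(0, idx - window), min(len(frames_2d), idx + window + 1)
        temporal_input = np.stack(frames_2d[lo:hi])  # shape (T, K, 2)
        return lift_2d_to_3d(temporal_input)         # shape (K, 3)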
Optionally, the initial three-dimensional coordinate information of each key point corresponding to the target portion is determined by:
acquiring an initial image sequence corresponding to the target object in a specified posture,
determining initial two-dimensional coordinate information of each key point for any initial image in the initial image sequence;
for any initial image, determining initial three-dimensional coordinate information of each key point in the initial image based on initial two-dimensional coordinate information of each key point in at least one frame of image adjacent to the initial image in the initial image sequence and initial two-dimensional coordinate information of each key point corresponding to the initial image;
when determining the length of the target portion based on the initial three-dimensional coordinate information of each key point corresponding to the target portion, the pose determination module 405 is specifically configured to:
determining two target key points which are positioned at the end point position of the target part in all the key points of the target part;
and determining the distance between the two target key points based on the initial three-dimensional coordinate information of the two target key points, and determining the distance between the two target key points as the length of the target part.
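The length computation itself is a single Euclidean distance taken from the initial (calibration-pose) coordinates; a minimal sketch, with illustrative names:

    import numpy as np

    def part_length(kp_a_initial, kp_b_initial):
        # Distance between the two end-point key points of the target part,
        # computed from their initial 3D coordinates.
        return float(np.linalg.norm(np.asarray(kp_b_initial) -
                                    np.asarray(kp_a_initial)))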
The apparatus of the embodiment of the present application may execute the method provided by the embodiment of the present application, and the implementation principle is similar, the actions executed by the modules in the apparatus of the embodiments of the present application correspond to the steps in the method of the embodiments of the present application, and for the detailed functional description of the modules of the apparatus, reference may be specifically made to the description in the corresponding method shown in the foregoing, and details are not repeated here.
Based on the same principle as the posture detection method and apparatus provided in the embodiments of the present application, an embodiment of the present application also provides an electronic device (e.g., a server), where the electronic device may include a memory, a processor, and a computer program stored in the memory, and the processor executes the computer program to implement the steps of the method provided in any optional embodiment of the present application.
Optionally, fig. 5 shows a schematic structural diagram of an electronic device to which an embodiment of the present application is applied, and as shown in fig. 5, an electronic device 4000 shown in fig. 5 includes: a processor 4001 and a memory 4003. Processor 4001 is coupled to memory 4003, such as via bus 4002. Optionally, the electronic device 4000 may further include a transceiver 4004, and the transceiver 4004 may be used for data interaction between the electronic device and other electronic devices, such as transmission of data and/or reception of data. In addition, the transceiver 4004 is not limited to one in practical applications, and the structure of the electronic device 4000 is not limited to the embodiment of the present application.
The Processor 4001 may be a CPU (Central Processing Unit), a general-purpose Processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other Programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the present disclosure. The processor 4001 may also be a combination that performs a computing function, for example, a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
Bus 4002 may include a path that carries information between the aforementioned components. The bus 4002 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 4002 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 5, but this is not intended to represent only one bus or type of bus.
The Memory 4003 may be a ROM (Read Only Memory) or other types of static storage devices that can store static information and instructions, a RAM (Random Access Memory) or other types of dynamic storage devices that can store information and instructions, an EEPROM (Electrically Erasable Programmable Read Only Memory), a CD-ROM (Compact Disc Read Only Memory) or other optical Disc storage, optical Disc storage (including Compact Disc, laser Disc, optical Disc, digital versatile Disc, blu-ray Disc, etc.), a magnetic Disc storage medium, other magnetic storage devices, or any other medium that can be used to carry or store a computer program and that can be Read by a computer, without limitation.
The memory 4003 is used for storing computer programs for executing the embodiments of the present application, and execution is controlled by the processor 4001. The processor 4001 is used to execute computer programs stored in the memory 4003 to implement the steps shown in the foregoing method embodiments.
Embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, and when being executed by a processor, the computer program may implement the steps and corresponding contents of the foregoing method embodiments.
Embodiments of the present application further provide a computer program product, which includes a computer program, and when the computer program is executed by a processor, the steps and corresponding contents of the foregoing method embodiments can be implemented.
The terms "first," "second," "third," "fourth," "1," "2," and the like in the description and claims of this application and in the preceding drawings (if any) are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used are interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in other sequences than described or illustrated herein.
It should be understood that, although each operation step is indicated by an arrow in the flowchart of the embodiment of the present application, the implementation order of the steps is not limited to the order indicated by the arrow. In some implementation scenarios of the embodiments of the present application, the implementation steps in the flowcharts may be performed in other sequences as desired, unless explicitly stated otherwise herein. In addition, some or all of the steps in each flowchart may include multiple sub-steps or multiple stages based on an actual implementation scenario. Some or all of these sub-steps or stages may be performed at the same time, or each of these sub-steps or stages may be performed at different times, respectively. In a scenario where execution times are different, an execution sequence of the sub-steps or the phases may be flexibly configured according to requirements, which is not limited in the embodiment of the present application.
The foregoing is only an optional implementation manner of a part of implementation scenarios in the present application, and it should be noted that, for those skilled in the art, other similar implementation means based on the technical idea of the present application are also within the protection scope of the embodiments of the present application without departing from the technical idea of the present application.

Claims (10)

1. An attitude detection method, characterized by comprising:
acquiring an image sequence corresponding to a target object and motion data of a target part of the target object;
determining first three-dimensional coordinate information of each key point of the target part in any image in the image sequence;
determining a first unit vector representing the direction of the target part according to the first three-dimensional coordinate information of each key point;
determining a second unit vector characterizing a direction of the target site based on the motion data of the target site;
and determining the posture of the target part according to the similarity between the first unit vector and the second unit vector and the first three-dimensional coordinate information of each key point.
2. The method according to claim 1, wherein the determining the pose of the target portion according to the similarity between the first unit vector and the second unit vector and the first three-dimensional coordinate information of the respective key points comprises:
if the similarity is larger than or equal to a preset threshold value, determining the posture of the target part based on first three-dimensional coordinate information of each key point corresponding to the target part;
if the similarity is smaller than the preset threshold, determining the length of the target part based on the initial three-dimensional coordinate information of each key point corresponding to the target part, and determining second three-dimensional coordinate information of other key points based on the length of the target part and the second unit vector; and determining the posture of the target part based on the first three-dimensional coordinate information of the first appointed key point and the second three-dimensional coordinate information of other key points, wherein the other key points are key points except the first appointed key point in each key point.
3. The method according to claim 1, wherein the target portion comprises at least two sub portions having a connection relationship, the first three-dimensional coordinate information of the key points comprises first three-dimensional coordinate information of the key points of each of the sub portions, and the motion data of the target portion comprises motion data corresponding to each of the sub portions;
determining a first unit vector corresponding to the target part according to the first three-dimensional coordinate information of each key point, including:
for each sub-part, determining a first unit vector corresponding to the sub-part according to the first three-dimensional coordinate information of each key point of the sub-part;
the determining a second unit vector corresponding to the target portion based on the motion data of the target portion includes:
for each sub-part, determining a second unit vector corresponding to the sub-part based on the motion data corresponding to the sub-part;
the determining the posture of the target part according to the first unit vector, the second unit vector and the first three-dimensional coordinate information of each key point includes:
for each sub-part, determining the posture of the sub-part according to the similarity between a first unit vector corresponding to the sub-part and a second unit vector corresponding to the sub-part and the first three-dimensional coordinate information of each key point of the sub-part;
determining a pose of the target portion based on the poses of the sub-portions of the target portion.
4. The method according to claim 3, wherein for each of the sub-parts, the determining the pose of the sub-part according to the similarity between the first unit vector corresponding to the sub-part and the second unit vector corresponding to the sub-part and the first three-dimensional coordinate information of the key points of the sub-part comprises:
for a first sub-part with the similarity greater than or equal to a corresponding preset threshold, determining the posture of the sub-part based on first three-dimensional coordinate information of each key point of the sub-part;
for a second sub-part with the similarity smaller than a corresponding preset threshold, determining second three-dimensional coordinate information of each key point of the sub-part based on first three-dimensional coordinate information of each key point of the sub-part and a second unit vector corresponding to the sub-part; and determining the posture of the sub-part according to the second three-dimensional coordinate information of each key point of the sub-part.
5. The method according to claim 4, wherein the initial three-dimensional coordinate information of the key points includes initial three-dimensional coordinate information of the key points of each of the sub-portions, and for the second sub-portion, the determining the second three-dimensional coordinate information of the key points of the sub-portion based on the first three-dimensional coordinate information of the key points of the sub-portion and the second unit vector corresponding to the sub-portion includes:
determining the length of the sub-part based on the initial three-dimensional coordinate information of each key point of the sub-part;
if the at least two sub-parts comprise a first sub-part and a second sub-part, taking first three-dimensional coordinate information of a first connecting key point in each key point of a target first sub-part which has a connection relation with the sub-part as second three-dimensional coordinate information of the first connecting key point in the sub-part; determining second three-dimensional coordinate information of other key points in the sub-part, except for the first connecting key point, according to the length of the sub-part and a second unit vector corresponding to the sub-part, wherein the target first sub-part is a first sub-part of which second three-dimensional coordinate information of each corresponding key point is determined, and the first connecting key point is a common key point of the sub-part and the target first sub-part;
if each sub-part of the at least two sub-parts is a second sub-part, determining a second designated key point from among the key points, and determining second three-dimensional coordinate information of each key point of each sub-part by the following method:
for a designated second sub-part to which the second designated key point belongs, taking first three-dimensional coordinate information of the second designated key point as second three-dimensional coordinate information of the second designated key point, and determining second three-dimensional coordinate information of other key points in the sub-part except the second designated key point according to the length of the sub-part and a second unit vector corresponding to the sub-part;
and for other sub-parts of the at least two sub-parts except the designated second sub-part, determining second three-dimensional coordinate information of other key points of the sub-part except a second connecting key point according to the length of the sub-part and a second unit vector corresponding to the sub-part, wherein the target second sub-part is the second sub-part for which the second three-dimensional coordinate information of each corresponding key point is determined, and the second connecting key point is a common key point of the sub-part and a target second sub-part which has a connection relation with the sub-part.
6. The method of claim 1, wherein the image sequence comprises at least two images, and the determining the first three-dimensional coordinate information of each keypoint of the target portion in any image in the image sequence comprises:
for any image, determining first two-dimensional coordinate information of each key point;
and for any image, determining first three-dimensional coordinate information of each key point in the image based on the first two-dimensional coordinate information of each key point in at least one frame of image adjacent to the image in the image sequence and the first two-dimensional coordinate information of each key point corresponding to the image.
7. The method according to claim 2, wherein the initial three-dimensional coordinate information of each key point corresponding to the target portion is determined by:
acquiring an initial image sequence corresponding to the target object in a specified posture,
determining initial two-dimensional coordinate information of each key point for any initial image in the initial image sequence;
for any initial image, determining initial three-dimensional coordinate information of each key point in the initial image based on initial two-dimensional coordinate information of each key point in at least one frame of image adjacent to the initial image in the initial image sequence and initial two-dimensional coordinate information of each key point corresponding to the initial image;
the determining the length of the target part based on the initial three-dimensional coordinate information of each key point corresponding to the target part comprises:
determining two target key points which are positioned at the end point position of the target part in all the key points of the target part;
determining the distance between the two target key points based on the initial three-dimensional coordinate information of the two target key points, and determining the distance between the two target key points as the length of the target part.
8. An attitude detection device characterized by comprising:
the image and motion data acquisition module is used for acquiring an image sequence corresponding to a target object and motion data of a target part of the target object;
the three-dimensional coordinate information determining module is used for determining first three-dimensional coordinate information of each key point of the target part in any image in the image sequence;
a first unit vector determining module, configured to determine, according to the first three-dimensional coordinate information of each key point, a first unit vector representing a direction of the target portion;
a second unit vector determination module, configured to determine a second unit vector characterizing a direction of the target portion based on the motion data of the target portion;
and the attitude determination module is used for determining the attitude of the target part according to the first unit vector, the second unit vector and the first three-dimensional coordinate information of each key point.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory, wherein the processor executes the computer program to perform the steps of the method of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202210261735.4A 2022-03-16 2022-03-16 Attitude detection method and apparatus, electronic device, and computer-readable storage medium Pending CN114722913A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210261735.4A CN114722913A (en) 2022-03-16 2022-03-16 Attitude detection method and apparatus, electronic device, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210261735.4A CN114722913A (en) 2022-03-16 2022-03-16 Attitude detection method and apparatus, electronic device, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
CN114722913A true CN114722913A (en) 2022-07-08

Family

ID=82237867

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210261735.4A Pending CN114722913A (en) 2022-03-16 2022-03-16 Attitude detection method and apparatus, electronic device, and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN114722913A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024078037A1 (en) * 2022-10-12 2024-04-18 华人运通(上海)云计算科技有限公司 Human body pose measurement method, and safe human-computer interaction method based on visual perception
CN115877899A (en) * 2023-02-08 2023-03-31 北京康桥诚品科技有限公司 Method and device for controlling liquid in floating cabin, floating cabin and medium
CN115877899B (en) * 2023-02-08 2023-05-09 北京康桥诚品科技有限公司 Method and device for controlling liquid in floating cabin, floating cabin and medium
CN116524088A (en) * 2023-07-03 2023-08-01 深圳星坊科技有限公司 Jewelry virtual try-on method, jewelry virtual try-on device, computer equipment and storage medium
CN116524088B (en) * 2023-07-03 2023-09-19 深圳星坊科技有限公司 Jewelry virtual try-on method, jewelry virtual try-on device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
US11610331B2 (en) Method and apparatus for generating data for estimating three-dimensional (3D) pose of object included in input image, and prediction model for estimating 3D pose of object
CN114722913A (en) Attitude detection method and apparatus, electronic device, and computer-readable storage medium
CN108364319B (en) Dimension determination method and device, storage medium and equipment
US10839544B2 (en) Information processing apparatus, information processing method, and non-transitory computer readable storage medium
US8696458B2 (en) Motion tracking system and method using camera and non-camera sensors
WO2017164479A1 (en) A device and method for determining a pose of a camera
EP4307233A1 (en) Data processing method and apparatus, and electronic device and computer-readable storage medium
US20100194879A1 (en) Object motion capturing system and method
CN108603749A (en) Information processing unit, information processing method and recording medium
CN109961523B (en) Method, device, system, equipment and storage medium for updating virtual target
KR20140088866A (en) Information processing device, information processing method, and program
US20160210761A1 (en) 3d reconstruction
JP7280385B2 (en) Visual positioning method and related apparatus, equipment and computer readable storage medium
JP2005256232A (en) Method, apparatus and program for displaying 3d data
CN115862124B (en) Line-of-sight estimation method and device, readable storage medium and electronic equipment
TW202314593A (en) Positioning method and equipment, computer-readable storage medium
CN110717937A (en) Image correction method and system, electronic device and storable medium
CN111489376B (en) Method, device, terminal equipment and storage medium for tracking interaction equipment
US20200226787A1 (en) Information processing apparatus, information processing method, and program
CN112712545A (en) Human body part tracking method and human body part tracking system
TWI779332B (en) Augmented reality system and display method for anchor virtual object thereof
Sun et al. Research on combination positioning based on natural features and gyroscopes for AR on mobile phones
TWI811108B (en) Mixed reality processing system and mixed reality processing method
US20230290101A1 (en) Data processing method and apparatus, electronic device, and computer-readable storage medium
US11783492B2 (en) Human body portion tracking method and human body portion tracking system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100176 Room 101, 1f, building 3, yard 18, Kechuang 10th Street, Beijing Economic and Technological Development Zone, Beijing

Applicant after: Beijing ESWIN Computing Technology Co.,Ltd.

Address before: 100176 Room 101, 1f, building 3, yard 18, Kechuang 10th Street, Beijing Economic and Technological Development Zone, Beijing

Applicant before: Beijing yisiwei Computing Technology Co.,Ltd.

CB02 Change of applicant information