CN117095131B - Three-dimensional reconstruction method, equipment and storage medium for object motion key points - Google Patents


Info

Publication number
CN117095131B
CN117095131B (application CN202311333051.1A)
Authority
CN
China
Prior art keywords
key point
preset
target
key
key points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311333051.1A
Other languages
Chinese (zh)
Other versions
CN117095131A (en)
Inventor
苏鹏
李观喜
张磊
覃镇波
梁倬华
Current Assignee
Guangzhou Ziweiyun Technology Co ltd
Original Assignee
Guangzhou Ziweiyun Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Ziweiyun Technology Co ltd filed Critical Guangzhou Ziweiyun Technology Co ltd
Priority to CN202311333051.1A
Publication of CN117095131A
Application granted
Publication of CN117095131B
Legal status: Active
Anticipated expiration

Classifications

    • G06T 17/00: Three dimensional [3D] modelling, e.g. data description of 3D objects (G PHYSICS > G06 COMPUTING; CALCULATING OR COUNTING > G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL)
    • G06N 20/00: Machine learning (G PHYSICS > G06 COMPUTING; CALCULATING OR COUNTING > G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS)
    • G06T 7/70: Determining position or orientation of objects or cameras (under G06T 7/00 Image analysis)
    • G06T 7/85: Stereo camera calibration (under G06T 7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration)
    • Y02T 10/40: Engine management systems (Y02 TECHNOLOGIES FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE > Y02T TRANSPORTATION > Y02T 10/00 Road transport of goods or passengers > Y02T 10/10 Internal combustion engine [ICE] based vehicles)

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Geometry (AREA)
  • Artificial Intelligence (AREA)
  • Computer Graphics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The embodiment of the invention relates to the field of computer vision, and in particular to a three-dimensional reconstruction method for object motion key points, which includes the following steps: acquiring a plurality of target images of a target object at each target motion moment; determining a first 2D key point corresponding to each target image based on a deep learning algorithm; fine-tuning the first 2D key point to obtain a second 2D key point; judging whether the second 2D key point meets a preset adjustment condition; if the second 2D key point meets the preset adjustment condition, taking the second 2D key point that meets the preset adjustment condition as a target 2D key point; and determining, according to the plurality of target 2D key points, the target 3D key points at the target motion moment after three-dimensional reconstruction. The method can improve the accuracy of the target 2D key points and the target 3D key points so that the accuracy meets expectations, giving the overall three-dimensional reconstruction method for motion key points higher acquisition efficiency and accuracy.

Description

Three-dimensional reconstruction method, equipment and storage medium for object motion key points
Technical Field
The embodiment of the invention relates to the field of computer vision, in particular to a three-dimensional reconstruction method, equipment and storage medium for object motion key points.
Background
In fields such as computer vision, virtual reality, augmented reality, and human-computer interaction, there is a need to recognize human actions, for example action classification, behavior recognition, and autonomous driving. Meanwhile, against the background of the rapid development of metaverse-related technology, human-computer interaction technology, virtual digital human driving technology, and the like are increasingly applied in practical engineering. With the human body as the subject of human-computer interaction, the recognition accuracy of action gestures directly affects the interactive user experience.
Existing methods for acquiring three-dimensional human motion key point data mainly rely on vision sensors, virtual reality devices, and similar technologies. However, virtual reality devices require wearables such as gloves and glasses, which are inconvenient for users and limit their use in practical applications.
In existing vision sensor technology, 2D key points of human motion are generally obtained by manual marking or automatic identification, and 3D key point data is then derived from them. Both existing manual marking and automatic identification easily lead to inaccurate 3D key point data, which is detrimental to the three-dimensional reconstruction of a 3D model of human motion.
Disclosure of Invention
In view of the above problems, embodiments of the present invention provide a three-dimensional reconstruction method, apparatus, and storage medium for object motion key points, which are used to solve the problems existing in the prior art.
According to a first aspect of an embodiment of the present invention, there is provided a three-dimensional reconstruction method of a motion key point of an object, the method including:
acquiring a plurality of target images of a target object at each target motion moment;
determining a first 2D key point corresponding to each target image based on a deep learning algorithm;
fine tuning the first 2D key point to obtain a second 2D key point;
judging whether the second 2D key point meets a preset adjustment condition;
if the second 2D key point meets the preset adjustment condition, taking the second 2D key point meeting the preset adjustment condition as a target 2D key point;
and determining target 3D key points at the target motion moment after three-dimensional reconstruction according to the target 2D key points.
In some embodiments, the preset adjustment condition includes a preset re-projection error, and the determining whether the second 2D key point meets the preset adjustment condition further includes:
calculating corresponding candidate 3D key points according to the plurality of second 2D key points;
carrying out reprojection calculation on the candidate 3D key points to obtain reprojection 2D key points corresponding to each second 2D key point;
calculating a distance error between the re-projection 2D key point and the corresponding second 2D key point;
and judging whether the distance error accords with the preset re-projection error.
In some embodiments, after calculating the distance error between the re-projection 2D key point and the corresponding second 2D key point, the determining whether the second 2D key point meets the preset adjustment condition further includes:
and displaying the distance error.
In some embodiments, the preset adjustment condition further includes a preset skeleton condition, and after the determining whether the distance error meets the preset reprojection error, the determining whether the second 2D key point meets the preset adjustment condition further includes:
and if the distance error accords with the preset reprojection error, judging whether the corresponding candidate 3D key points accord with a preset skeleton condition.
In some embodiments, the preset skeleton conditions include a preset skeleton angle, and the determining whether the candidate 3D keypoints meet the preset skeleton conditions further includes:
calculating the association angle between adjacent candidate 3D key point connecting lines;
judging whether the association angle accords with the preset skeleton angle.
In some embodiments, after determining whether the association angle meets the preset skeleton angle, if the second 2D key point meets the preset adjustment condition, the second 2D key point meeting the preset adjustment condition is taken as a target 2D key point, and a target 3D key point at the target motion time after three-dimensional reconstruction is determined according to a plurality of target 2D key points, which further includes:
and if the association angle accords with the preset skeleton angle, taking the corresponding second 2D key point as the target 2D key point, and taking the corresponding candidate 3D key point as the target 3D key point.
In some embodiments, after calculating the association angle between the neighboring candidate 3D keypoint lines, the determining whether the candidate 3D keypoints meet a preset skeleton condition further includes:
and displaying the motion skeleton and/or the association angle formed after connecting the adjacent candidate 3D key points.
In some embodiments, after the determining whether the second 2D keypoint meets the preset adjustment condition, the method further includes:
and if the second 2D key point does not meet the preset adjustment condition, taking the second 2D key point which does not meet the preset adjustment condition as the first 2D key point.
According to a second aspect of embodiments of the present invention, there is provided a computing device comprising: the device comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete communication with each other through the communication bus;
the memory is configured to store at least one executable instruction that causes the processor to perform the operations of the three-dimensional reconstruction method of object motion keypoints according to any one of the above.
According to a third aspect of embodiments of the present invention, there is provided a computer readable storage medium having stored therein at least one executable instruction that, when executed, performs the operations of the three-dimensional reconstruction method of object motion keypoints as described in any of the above.
According to the embodiment of the invention, the first 2D key point is determined by a trained deep learning model, without manual marking, which improves the acquisition efficiency of 2D key point data and reduces the misrecognition rate caused by manual marking; the obtained first 2D key point serves as a coarse localization, ensuring that all 2D key points have a certain precision and good consistency. The first 2D key point is then fine-tuned so that it is offset to obtain a second 2D key point, and whether the obtained second 2D key point meets the preset adjustment condition is judged, thereby improving the accuracy of the target 2D key points and the target 3D key points so that the accuracy meets expectations; the overall three-dimensional reconstruction method for motion key points therefore has higher acquisition efficiency and accuracy.
The foregoing description is only an overview of the technical solutions of the embodiments of the present invention, and may be implemented according to the content of the specification, so that the technical means of the embodiments of the present invention can be more clearly understood, and the following specific embodiments of the present invention are given for clarity and understanding.
Drawings
The drawings are only for purposes of illustrating embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
fig. 1 is a schematic flow chart of a three-dimensional reconstruction method of object motion key points provided by an embodiment of the present invention;
FIG. 2 illustrates a schematic diagram of a computing device provided by some embodiments of the invention;
FIG. 3 illustrates a schematic diagram of the hardware connections of a plurality of cameras to a computing device provided by some embodiments of the invention;
fig. 4 is a schematic diagram of a target object motion key point according to some embodiments of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein.
The inventor finds that, in existing methods for acquiring human motion key point data through a vision sensor, if the 2D key points of human motion are marked manually and the 3D key point data is then derived, the manual marking process is tedious, requires professional annotators, involves a large workload, and is error-prone. If an algorithm is used to automatically identify the 2D key points of human motion in an image, then because moving joints occlude each other in real scenes, the error of the 2D key points corresponding to each camera is large and does not meet the accuracy requirement of the 3D key point data.
The inventor therefore provides a three-dimensional reconstruction method for object motion key points. First, a plurality of target images at each motion moment of an object are processed by a deep learning algorithm to determine the first 2D key point corresponding to each target image, which reduces the workload of manual marking and improves working efficiency. Because moving joints occlude each other in real scenes, the first 2D key point carries a relatively large error; in this case, the first 2D key point is further fine-tuned to obtain a second 2D key point, and the second 2D key point is adjusted until it meets the preset adjustment condition. This improves the precision of the second 2D key point to the preset requirement, makes the obtained target 3D key point more accurate, and benefits the accuracy of three-dimensional modeling of the object motion key points.
Fig. 1 shows a schematic flow chart of the three-dimensional reconstruction method of object motion key points provided by an embodiment of the invention. As shown in fig. 1, the method includes the following steps:
step 110: acquiring a plurality of target images of a target object at each target motion moment;
step 120: determining a first 2D key point corresponding to each target image based on a deep learning algorithm;
step 130: fine tuning the first 2D key points to obtain second 2D key points;
step 140: judging whether the second 2D key point meets a preset adjustment condition;
step 150: if the second 2D key point meets the preset adjustment condition, taking the second 2D key point meeting the preset adjustment condition as a target 2D key point;
step 160: and determining target 3D key points at the target motion moment after three-dimensional reconstruction according to the plurality of target 2D key points.
In step 110, the target object has different motion moments during the motion process, and multiple target images with different angles are captured by the vision sensor at each target motion moment, so that the computing device constructs a three-dimensional model to obtain target 3D key points representing the motion joints.
The target object refers to a movable object that needs to be reconstructed in three dimensions, where the object may be a human body, another animal, a robot, a manipulator, etc.
In this embodiment, a plurality of target images at each target movement moment are obtained by shooting at different angles by a plurality of vision sensors, where the plurality of vision sensors may be a plurality of cameras of the same model, or may be a plurality of cameras of different models, and the cameras may be a common camera or a depth camera, which is not limited herein and is set as required.
Before step 110, intrinsic calibration of each single camera and joint calibration of the multiple cameras need to be performed; the relative position relationship of the multiple cameras is determined through the single-camera intrinsic calibration and the multi-camera pose calibration.
As shown in fig. 3, by way of example, 3 cameras are used, named cameras A, B, and C respectively. These cameras may be connected to a computing device (PC) via a USB or other interface. Each camera may be an ordinary consumer-grade camera or a professional-grade camera, and its resolution may be determined according to actual needs.
Single-camera intrinsic calibration and multi-camera joint calibration are performed as follows.
The process of capturing an image with a camera can be simplified to the pinhole imaging model, and the projection of a scene from three-dimensional space onto a picture can be approximately decomposed into three steps across four coordinate systems. The specific flow is: convert from the world coordinate system to the camera coordinate system, convert from the camera coordinate system to the image coordinate system, and finally convert from the image coordinate system to the pixel coordinate system. Suppose the point P is a point in three-dimensional space, with position P_c = (X_c, Y_c, Z_c) in the camera coordinate system and position P_w = (X_w, Y_w, Z_w) in the world coordinate system. P_w can be transformed into P_c by a transformation matrix, which can be subdivided into a rotation matrix R and a translation vector t, called the extrinsic parameters. The mathematical expression is as follows:

    P_c = R * P_w + t

The coordinates P_c in the camera coordinate system are converted to image coordinates and then to pixel coordinates; this conversion is achieved by a matrix, and the corresponding conversion matrix is called the intrinsic matrix K. With pixel coordinates (u, v) and scale factor s = Z_c:

    s * [u, v, 1]^T = K * [X_c, Y_c, Z_c]^T

The transformations between the four coordinate systems in the camera model are thus all linked by the intrinsic and extrinsic parameters of the camera, in the following mathematical form:

    s * [u, v, 1]^T = K * [R | t] * [X_w, Y_w, Z_w, 1]^T
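The full chain above (world coordinates to camera coordinates to pixel coordinates) can be sketched in a few lines of numpy. The intrinsic values below are illustrative placeholders, not parameters from the patent.

```python
import numpy as np

def project_point(P_w, K, R, t):
    """Project a 3D world point to pixel coordinates via the pinhole model.

    P_w: (3,) world coordinates; K: (3,3) intrinsic matrix;
    R: (3,3) rotation and t: (3,) translation (extrinsic parameters).
    """
    P_c = R @ P_w + t        # world -> camera coordinates
    uvw = K @ P_c            # camera -> homogeneous pixel coordinates
    return uvw[:2] / uvw[2]  # divide by the scale factor s = Z_c

# With an identity pose, a point on the optical axis projects to the
# principal point (c_x, c_y) of the intrinsic matrix.
K = np.array([[800.0,   0.0, 640.0],
              [  0.0, 800.0, 360.0],
              [  0.0,   0.0,   1.0]])
uv = project_point(np.array([0.0, 0.0, 2.0]), K, np.eye(3), np.zeros(3))
```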
and calibrating the cameras A, B, C by using a Zhang Zhengyou calibration method to obtain an internal reference matrix of each camera. Meanwhile, in order to establish the position relation between each camera, it is assumed that the camera coordinates of the camera A are regarded as world coordinate origins, and the pose relation of the cameras in pairs is calibrated according to a binocular vision calibration method, so that the relative position relation of multiple cameras is determined. Taking the rotational translation matrix of cameras B and A as an example, camera coordinates of camera B,/>) Camera coordinates converted into camera A +.>,/>)
In the above-mentioned method, the step of,representing the pose relationship matrix of cameras B and a.
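As a minimal illustration of the coordinate conversion above, the sketch below applies the rotation-translation part of the pairwise pose relation to map a point from camera B's frame into camera A's frame. The pose values are toy numbers chosen for illustration, not calibration results from the patent.

```python
import numpy as np

def to_camera_a(P_b, R_ba, t_ba):
    """Convert a point from camera B's coordinate frame to camera A's frame:
    P_A = R_BA @ P_B + t_BA, the rotation-translation part of the pairwise
    pose matrix T_BA obtained from binocular calibration."""
    return R_ba @ P_b + t_ba

# Toy pose: camera B is rotated 90 degrees about Z and shifted 1 unit along X
# with respect to camera A.
R_ba = np.array([[0.0, -1.0, 0.0],
                 [1.0,  0.0, 0.0],
                 [0.0,  0.0, 1.0]])
t_ba = np.array([1.0, 0.0, 0.0])
P_a = to_camera_a(np.array([1.0, 0.0, 2.0]), R_ba, t_ba)
```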
Similarly, when a number of cameras other than 3 is provided, the multiple cameras are calibrated in the same way; this is not limited herein and is set as required.
In step 120, the computing device determines the first 2D key point using a trained deep learning model, without manual marking, which improves the acquisition efficiency of 2D key point data and reduces the misrecognition rate caused by manual marking; the obtained first 2D key point serves as a coarse localization, ensuring that all 2D key points have a certain precision and good consistency.
The deep learning algorithm is a training model trained on a moving object image, and may be an existing training model or a training model developed by itself, and is not limited herein.
In steps 130 to 160, because the moving joints occlude each other, the first 2D key points obtained in step 120 carry a relatively large error. In this case, the first 2D key point is fine-tuned so that it is offset to obtain a second 2D key point, and whether the obtained second 2D key point meets the preset adjustment condition is then judged, so as to determine whether it meets the preset precision requirement. If the second 2D key point meets the preset adjustment condition, it meets the preset precision requirement; accordingly, the corresponding second 2D key point is taken as the target 2D key point, from which the corresponding target 3D key point can be obtained, yielding more accurate motion key points.
Step 130 may be performed automatically by the processor or may be manually adjusted, and is not limited herein.
When the processor executes step 130, the shielding relationship of the motion joint needs to be considered for fine adjustment, fine adjustment setting can be performed through related formula relationships, and a corresponding fine adjustment training model can be established through manual labeling, so that a second 2D key point is obtained through automatic fine adjustment.
And when the step 130 is manually executed, manually adjusting the shielding relation of the motion joint to obtain a corresponding second 2D key point.
And steps 140 and 150 may be performed by a processor or manually, without limitation, as desired.
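The check-and-adjust cycle of steps 130 to 150 can be sketched as a generic loop. The `meets_condition` and `adjust` callables are hypothetical stand-ins for the preset adjustment condition and the automatic or manual fine-tuning step described above; the scalar "key point" in the toy run is purely illustrative.

```python
def refine_keypoints(keypoints, meets_condition, adjust, max_iters=100):
    """Iteratively adjust key points until the preset adjustment condition
    is met (cf. steps 130-150), or give up after max_iters attempts."""
    for _ in range(max_iters):
        if meets_condition(keypoints):
            return keypoints, True   # accepted as target 2D key points
        keypoints = adjust(keypoints)  # automatic or manual fine-tuning
    return keypoints, False

# Toy run: nudge a scalar stand-in for a key point toward 10 until the
# "condition" (being within 0.5 of 10) is satisfied.
result, ok = refine_keypoints(
    0.0,
    meets_condition=lambda x: abs(x - 10.0) < 0.5,
    adjust=lambda x: x + 1.0,
)
```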
In step 160, the determining of the target 2D key point and the target 3D key point by the computing device may be synchronous or sequential, which is not limited herein and is set according to needs. In some embodiments, the computing device may calculate, in real time, a corresponding 3D keypoint from the adjusted second 2D keypoint, and when determining the target 2D keypoint, the target 3D keypoint exists accordingly. In some embodiments, the computing device may also determine the corresponding target 2D keypoints first and then calculate the corresponding target 3D keypoints.
The preset adjustment condition may be one condition or may be composed of a plurality of conditions, so as to limit the second 2D key point to be finally adjusted to be reasonably expected, so that the target 3D key point obtained according to the target 2D key point can more accurately represent the motion form of the object.
In some embodiments, the preset adjustment condition may be set to a preset reprojection error, in which case, the position control of the corresponding 3D key point is more accurate by determining whether the distance error between the second 2D key point and the corresponding reprojection 2D key point corresponds to the preset reprojection error. In some embodiments, the preset adjustment condition may be set to a preset skeleton angle, in which case, it is determined whether the association angle between the 3D keypoint connection lines corresponding to the adjacent second 2D keypoints meets the preset skeleton angle, so as to adjust the interrelationship of the motion keypoints, so that the motion coordination meets the expectations more. In some embodiments, the preset adjustment condition may also be set to preset the distance between adjacent keypoints, in which case, it is determined whether the adjacent distance between adjacent 3D keypoints meets the preset distance between adjacent keypoints, so that the adjacent distance between 3D keypoints meets the reasonable distance of the object, where the reasonable distance is the preset distance between adjacent keypoints. For example, for the hand joints of the human body, the distances between key points corresponding to the hand joints with different height ratios are different, in this case, when it is determined that the adjacent distance between adjacent 3D key points meets the preset distance between adjacent key points, the position adjustment of the adjacent second 2D key point can be considered to be more consistent with the human body structure, which is more accurate; when the adjacent distance between the adjacent 3D key points is not consistent with the preset adjacent key point distance, the position adjustment of the adjacent second 2D key point is considered unreasonable, and the adjustment needs to be continued.
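The skeleton-angle and adjacent-distance checks described above reduce to simple vector geometry: the association angle at a joint is the angle between the two connecting lines meeting there, and the adjacent key point distance is a norm. A minimal numpy sketch, with toy joint positions rather than real key point data:

```python
import numpy as np

def bone_angle_deg(j_prev, j_mid, j_next):
    """Association angle at joint j_mid between the connecting lines
    j_mid -> j_prev and j_mid -> j_next, in degrees."""
    u = np.asarray(j_prev, float) - np.asarray(j_mid, float)
    v = np.asarray(j_next, float) - np.asarray(j_mid, float)
    cos_a = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0))))

def bone_length(j_a, j_b):
    """Adjacent key point distance, to compare against a preset range."""
    return float(np.linalg.norm(np.asarray(j_a, float) - np.asarray(j_b, float)))

# A right angle at an elbow-like joint, and a unit-length bone.
angle = bone_angle_deg([1, 0, 0], [0, 0, 0], [0, 1, 0])
length = bone_length([0, 0, 0], [0, 1, 0])
```

Either quantity can then be tested against the preset skeleton angle or preset adjacent key point distance to accept or reject the adjusted key points.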
In some embodiments, the preset adjustment conditions may also be set as other conditions for controlling accuracy and/or coordination of movement and/or reasonable relation of the 3D keypoints, which are not limited herein, and are set as required. In some embodiments, the preset adjustment conditions may also include a plurality of conditions, where the plurality of conditions include two or more of the preset re-projection error, the preset skeleton angle, the preset adjacent key point distance, and other conditions, which are not limited herein, and are set as needed.
When steps 110 through 160 are performed by a computing device, the computing device may include one or more processors, which may be central processing units (CPUs), application-specific integrated circuits (ASICs), or one or more integrated circuits configured to implement embodiments of the present invention; this is not limited herein. The one or more processors included in the computing device may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs together with one or more ASICs, without limitation.
In steps 110 to 160, a trained deep learning model is first used to determine the first 2D key point, without manual marking, which improves the acquisition efficiency of 2D key point data and reduces the misrecognition rate caused by manual marking; the obtained first 2D key point serves as a coarse localization, ensuring that all 2D key points have a certain precision and good consistency. The first 2D key point is then fine-tuned so that it is offset to obtain a second 2D key point, and whether the obtained second 2D key point meets the preset adjustment condition is judged, thereby improving the accuracy of the target 2D key points and the target 3D key points so that the accuracy meets expectations; the overall three-dimensional reconstruction method for motion key points therefore has higher acquisition efficiency and accuracy.
In some embodiments, the preset adjustment condition includes a preset re-projection error, and step 140 further includes:
step a01: calculating corresponding candidate 3D key points according to the plurality of second 2D key points;
step a02: carrying out reprojection calculation on the candidate 3D key points to obtain reprojection 2D key points corresponding to each second 2D key point;
step a03: calculating a distance error between the reprojection 2D key point and a corresponding second 2D key point;
step a04: and judging whether the distance error accords with a preset reprojection error.
In step a01, when the object has a plurality of motion key points, each motion key point maps to a corresponding 2D key point in a target image, and the plurality of target images contain the 2D key points formed from different angles. On this basis, the candidate 3D key points that the computing device calculates from the plurality of second 2D key points are candidate motion key points, and the target 3D key points are the corresponding target motion key points.
Candidate 3D key points can be calculated using a mathematical model of triangulation, based on the geometric principle of similar triangles. Suppose the camera optical center is O, a certain point on the surface of the object is P, and its projection on the image plane of the camera is p. Let the projection matrix of the first camera A be P1 and the projection matrix of the second camera B be P2; then the following relationship holds:

    s1 * p1 = P1 * X,   s2 * p2 = P2 * X

where P1 and P2 are 3x4 projection matrices, p1 and p2 are the homogeneous two-dimensional coordinates on the image planes of cameras A and B respectively, X is the homogeneous coordinate of the point in three-dimensional space, and s1, s2 are scale factors. Solving this system with the least square method yields the coordinates of the candidate 3D key point in three-dimensional space, as well as its camera coordinates in each camera.
In step a02 to step a04, the computing device performs the reprojection calculation on the candidate 3D key points, and correspondingly determines whether the distance error between the reprojection 2D key points and the corresponding second 2D key points accords with the preset reprojection error, so as to control the 3D key points after three-dimensional reconstruction to have smaller error with the actual 3D key points, thereby enabling the target 3D key points to be more accurate.
The distance error and the preset re-projection error may be expressed as a pixel distance or as an actual distance; this is not limited herein and is set as required. In this embodiment, the resolution is 1080p and the preset re-projection error is set to 5 pixels or less. In other embodiments, at 1080p the preset re-projection error may instead be set to 3 pixels or less, or a corresponding preset re-projection error may be set for other resolutions.
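The pixel-distance check described above can be sketched directly: reproject the candidate 3D key point through the projection matrix and compare the pixel distance to the observed second 2D key point against the threshold. The identity camera and the 5-pixel threshold value below are used only for illustration (the threshold matching this embodiment's 1080p setting).

```python
import numpy as np

def reprojection_error_px(X, P, uv_observed):
    """Pixel distance between the reprojection of candidate 3D key point X
    (via the 3x4 projection matrix P) and the observed second 2D key point."""
    uvw = P @ np.append(X, 1.0)
    uv_reproj = uvw[:2] / uvw[2]
    return float(np.linalg.norm(uv_reproj - np.asarray(uv_observed, float)))

PRESET_REPROJECTION_ERROR = 5.0  # pixels, as in this embodiment (1080p)

# Toy check: an observation 3 px away from the ideal reprojection passes.
P = np.hstack([np.eye(3), np.zeros((3, 1))])  # identity camera, illustrative
X = np.array([0.0, 0.0, 2.0])                 # reprojects to pixel (0, 0)
err = reprojection_error_px(X, P, uv_observed=[3.0, 0.0])
passes = err <= PRESET_REPROJECTION_ERROR
```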
Step a04 may be performed by a computing device or manually, as desired.
In some embodiments, if the preset adjustment condition includes only the preset re-projection error, and accordingly, after step a04, if the distance error accords with the preset re-projection error, the corresponding second 2D key point is correspondingly used as the target 2D key point, and the corresponding target 3D key point is correspondingly determined.
In some embodiments, if the preset adjustment condition further includes other conditions than the preset reprojection error, correspondingly, after step a04, if the distance error meets the preset reprojection error, executing the other conditions of the preset adjustment condition until the second key point meets all conditions of the preset adjustment condition, correspondingly taking the corresponding second 2D key point as the target 2D key point, and correspondingly determining the corresponding target 3D key point.
In some embodiments, after step a03, step 140 further comprises:
step a05: and displaying the distance error.
In step a05, the computing device displays the distance error so that the user can check it visually and readily track how the adjustment of the second 2D key point is progressing.
When the second 2D key point is adjusted manually, displaying the distance error lets the operator intuitively follow the current adjustment and keep reducing the error until it satisfies the preset reprojection error.
In some embodiments, when the distance error does not satisfy the preset reprojection error, the computing device issues an alert: for example, the distance error text is displayed in red, or an audible alert is sounded, reminding the operator that the current distance error does not satisfy the preset reprojection error.
In some embodiments, the preset adjustment conditions further include preset skeleton conditions, and after step a04, step 140 further includes:
step b01: if the distance error accords with the preset reprojection error, judging whether the corresponding candidate 3D key points accord with the preset skeleton conditions.
In step b01, when the preset adjustment condition includes a plurality of conditions, the computing device further determines whether the corresponding candidate 3D key points satisfy the preset skeleton condition, thereby additionally constraining the second 2D key points and the corresponding candidate 3D key points so that the final target 3D key points are more accurate.
The preset skeleton condition may include a preset skeleton angle and/or a preset distance between adjacent key points and/or other conditions characterizing the skeletal relationship of the motion key points; it is not limited here and is set as required.
In some embodiments, the preset adjustment condition includes only the preset reprojection error and the preset skeleton condition. In that case, after step b01, if the corresponding candidate 3D key points satisfy the preset skeleton condition, the corresponding second 2D key point is taken as the target 2D key point and the corresponding target 3D key point is determined.
In some embodiments, the preset adjustment condition further includes conditions other than the preset reprojection error and the preset skeleton condition. In that case, after step b01, if the corresponding candidate 3D key points satisfy the preset skeleton condition, the remaining conditions of the preset adjustment condition are evaluated; once the second 2D key point satisfies all conditions of the preset adjustment condition, the corresponding second 2D key point is taken as the target 2D key point and the corresponding target 3D key point is determined.
In some embodiments, the preset skeleton conditions include a preset skeleton angle, and step b01 further includes:
step b011: calculating the association angle between adjacent candidate 3D key point connecting lines;
step b012: judging whether the association angle accords with a preset skeleton angle.
In step b011, take three successive candidate 3D key points as an example: the middle candidate 3D key point serves as the reference key point, and connecting it to its two neighbouring candidate 3D key points forms two line segments whose included angle is the association angle. Likewise, for any pair of adjacent key point connecting lines, the included angle between the two adjacent segments is the corresponding association angle.
In step b012, the preset skeleton angle is set according to the motion relationship and is obtained from the interrelationship of all candidate 3D key points.
In steps b011 and b012, for the case where the preset skeleton condition includes a preset skeleton angle, judging whether the association angle satisfies the preset skeleton angle amounts to judging whether the overall skeleton formed by the candidate 3D key points is coherent.
The preset skeleton angle is based on the motion structure of the human body or another object: during motion, the angles between the connecting lines of adjacent joints stay within a reasonable range, and the preset skeleton angle is set accordingly. If the association angle satisfies the preset skeleton angle, the current position of the second 2D key point can therefore be considered reasonable and accurate. If the association angle does not satisfy the preset skeleton angle, the current adjustment of the second 2D key point is considered unsuitable, and adjustment must continue until the association angle satisfies the preset skeleton angle.
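The association-angle check of steps b011 and b012 can be sketched as follows: the angle at the reference key point is the included angle between the segments to its two neighbours. The numeric bounds in the range check are hypothetical, since the patent leaves the preset skeleton angle to be set according to the motion relationship:

```python
import math

def association_angle(prev_pt, ref_pt, next_pt):
    """Association angle (degrees) at the middle (reference) candidate 3D key
    point: the included angle between the segments joining it to its two
    neighbouring candidate 3D key points (steps b011-b012)."""
    v1 = [a - b for a, b in zip(prev_pt, ref_pt)]
    v2 = [a - b for a, b in zip(next_pt, ref_pt)]
    dot = sum(a * b for a, b in zip(v1, v2))
    n1 = math.sqrt(sum(a * a for a in v1))
    n2 = math.sqrt(sum(a * a for a in v2))
    cos_a = max(-1.0, min(1.0, dot / (n1 * n2)))  # clamp for numeric safety
    return math.degrees(math.acos(cos_a))

def within_skeleton_angle(angle_deg, lo_deg, hi_deg):
    # Hypothetical range check: the patent sets the preset skeleton angle
    # according to the motion relationship but fixes no numeric bounds.
    return lo_deg <= angle_deg <= hi_deg
```

For a knee joint, for instance, one would pass the hip, knee, and ankle candidate 3D key points and test the resulting angle against the anatomically reasonable range chosen for that joint.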
In some embodiments, when the second 2D key point is obtained through manual adjustment, the operator adjusts it according to the coordination of the human joint skeleton and manually judges whether the corresponding association angle satisfies the preset skeleton angle.
In some embodiments, after step b012, steps 150 through 160 further comprise:
step c01: and if the association angle accords with the preset skeleton angle, taking the corresponding second 2D key point as a target 2D key point, and taking the corresponding candidate 3D key point as a target 3D key point.
In step c01, the association angles are those between all candidate 3D key points; in this case, only when every association angle satisfies its corresponding preset skeleton angle does the skeleton formed by the overall target 3D key points meet expectations, which makes the target 3D key points better coordinated and the three-dimensional reconstruction of the motion more effective.
At this time, since the association angle satisfies the preset skeleton angle, the candidate 3D key point corresponding to that association angle is the target 3D key point, and the corresponding second 2D key point is the target 2D key point.
In some embodiments, after step b011, step b01 further comprises:
step b013: displaying the motion skeleton formed by connecting adjacent candidate 3D key points and/or the association angles.
In step b013, the computing device displays the motion skeleton formed by connecting adjacent candidate 3D key points, together with the association angles, so that the user can inspect them visually and readily track how the second 2D key points are being adjusted.
When the second 2D key points are adjusted manually, displaying the motion skeleton and/or the association angles lets the operator intuitively follow the current adjustment and continue adjusting until the association angles satisfy the preset skeleton angle and the overall motion skeleton meets expectations.
In some embodiments, when the association angle does not satisfy the preset skeleton angle, the computing device issues an alert: for example, the association angle text is displayed in red, or an audible alert is sounded, reminding the operator that the current association angle does not satisfy the preset skeleton angle.
As shown in fig. 4, the display interface shows the motion form of the target object at each motion moment in real time. Here the target object is a human body; multiple candidate 3D key points are obtained, and connecting each pair of adjacent candidate 3D key points forms the corresponding motion skeleton, which is shown in the display page.
In some embodiments, when the user triggers a view instruction at the angle position corresponding to an association angle of the motion skeleton in the display page, the corresponding association angle is displayed at that position.
In some embodiments, a corresponding distance error and association angle may also be computed for the first 2D key points and displayed accordingly.
In some embodiments, after step 140, the method further comprises:
step 170: and if the second 2D key point does not meet the preset adjustment condition, taking the second 2D key point which does not meet the preset adjustment condition as the first 2D key point.
In step 170, when the second 2D key point does not satisfy the preset adjustment condition, adjustment must continue until it does, ensuring that the accuracy of the final motion key point data meets expectations.
In this case, the current second 2D key point is taken as the first 2D key point and adjusted further to obtain the next second 2D key point. The previous adjustment mode is retained: the adjustment is again performed by the computing device or manually.
When the preset adjustment condition includes a plurality of conditions, the second 2D key point is considered not to satisfy the preset adjustment condition as soon as any one of them is unmet. For example, if the preset adjustment condition includes a preset reprojection error and a preset skeleton angle, the second 2D key point fails the preset adjustment condition whenever it fails either the preset reprojection error or the preset skeleton angle.
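The overall adjust-and-check loop of steps 130 to 170 can be sketched as follows; the `adjust` and `check` callables stand in for the fine-tuning step and the combined preset adjustment conditions, and the whole interface is an illustrative assumption rather than the patent's implementation:

```python
def refine_keypoint(first_2d, adjust, check, max_iters=100):
    """Iterative loop of steps 130-170: fine-tune the first 2D key point into
    a second 2D key point; if it fails the preset adjustment conditions it
    becomes the new first 2D key point and adjustment continues.

    `adjust` and `check` are caller-supplied callables (assumed interface):
    `adjust` performs the fine-tuning; `check` evaluates every preset
    adjustment condition and returns True only if all of them are met.
    """
    kp = first_2d
    for _ in range(max_iters):
        kp = adjust(kp)   # fine tuning (step 130) -> second 2D key point
        if check(kp):     # all preset adjustment conditions met (step 140)
            return kp     # target 2D key point (step 150)
        # else: kp is taken as the first 2D key point again (step 170)
    raise RuntimeError("no acceptable key point within max_iters")
```

Because `check` is an all-of predicate, a single failing condition (reprojection error or skeleton angle) is enough to send the key point back for another adjustment round, matching the behaviour described above.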
FIG. 2 illustrates a schematic diagram of a computing device according to an embodiment of the present invention, and the embodiment of the present invention is not limited to a specific implementation of the computing device.
As shown in fig. 2, the computing device may include: a processor 202, a communication interface (Communications Interface) 204, a memory 206, and a communication bus 208.
Wherein: processor 202, communication interface 204, and memory 206 communicate with each other via communication bus 208. A communication interface 204 for communicating with network elements of other devices, such as clients or other servers. The processor 202 is configured to execute the program 210, and may specifically perform the relevant steps in the embodiment of the three-dimensional reconstruction method for the motion key points of the object.
In particular, program 210 may include program code comprising computer-executable instructions. The execution of the method steps of each module, such as the monitoring module, the creation module, the encryption module, etc., is implemented by the program 210.
The processor 202 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The one or more processors included in the computing device may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs together with one or more ASICs.
A memory 206 for storing a program 210. The memory 206 may comprise high-speed RAM memory or may further comprise non-volatile memory (non-volatile memory), such as at least one disk memory.
Embodiments of the invention also provide computer readable storage media storing at least one executable instruction which, when executed, performs the operations of the three-dimensional reconstruction method for object motion key points according to any of the above embodiments.
The algorithms or displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. The required structure for a construction of such a system is apparent from the description above. In addition, embodiments of the present invention are not directed to any particular programming language. It will be appreciated that the teachings of the present invention described herein may be implemented in a variety of programming languages, and the above description of specific languages is provided for disclosure of enablement and best mode of the present invention.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the above description of exemplary embodiments of the invention, various features of the embodiments of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed as reflecting the intention that: i.e., the claimed invention requires more features than are expressly recited in each claim.
Those skilled in the art will appreciate that the modules in the apparatus of the embodiments may be adaptively changed and disposed in one or more apparatuses different from the embodiments. The modules or units or components of the embodiments may be combined into one module or unit or component, and they may be divided into a plurality of sub-modules or sub-units or sub-components. Any combination of all features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be used in combination, except insofar as at least some of such features and/or processes or units are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. do not denote any order. These words may be interpreted as names. The steps in the above embodiments should not be construed as limiting the order of execution unless specifically stated.

Claims (6)

1. A method for three-dimensional reconstruction of object motion key points, the method comprising:
acquiring a plurality of target images of a target object at each target motion moment;
determining a first 2D key point corresponding to each target image based on a deep learning algorithm;
fine tuning the first 2D key point to obtain a second 2D key point;
judging whether the second 2D key point accords with a preset adjusting condition or not;
if the second 2D key point meets the preset adjustment condition, taking the second 2D key point meeting the preset adjustment condition as a target 2D key point;
if the second 2D key point does not meet the preset adjustment condition, taking the second 2D key point which does not meet the preset adjustment condition as the first 2D key point; the preset adjustment condition includes a preset reprojection error, and the judging whether the second 2D key point meets the preset adjustment condition further includes:
calculating corresponding candidate 3D key points according to the plurality of second 2D key points;
carrying out reprojection calculation on the candidate 3D key points to obtain reprojection 2D key points corresponding to each second 2D key point;
calculating a distance error between the re-projection 2D key point and the corresponding second 2D key point;
judging whether the distance error accords with the preset re-projection error or not;
the preset adjustment condition further includes a preset skeleton condition, after the distance error is determined whether to meet a preset reprojection error, the step of determining whether the second 2D key point meets the preset adjustment condition further includes:
if the distance error accords with the preset reprojection error, judging whether the corresponding candidate 3D key points accord with preset skeleton conditions or not;
the preset skeleton conditions comprise preset skeleton angles, and the judging of whether the candidate 3D key points meet the preset skeleton conditions or not further comprises:
calculating the association angle between adjacent candidate 3D key point connecting lines;
judging whether the association angle accords with the preset skeleton angle or not;
and determining target 3D key points at the target motion moment after three-dimensional reconstruction according to the target 2D key points.
2. The method for three-dimensional reconstruction of object motion keypoints according to claim 1, wherein,
after calculating the distance error between the 2D key point of the reprojection and the corresponding second 2D key point, the preset adjustment condition includes a preset reprojection error, and the determining whether the second 2D key point meets the preset adjustment condition further includes:
and displaying the distance error.
3. The method for three-dimensional reconstruction of object motion key points according to claim 1, wherein after the determining whether the association angle meets the preset skeleton angle, if the second 2D key point meets the preset adjustment condition, the second 2D key point meeting the preset adjustment condition is taken as a target 2D key point, and a target 3D key point at the target motion moment after three-dimensional reconstruction is determined according to a plurality of target 2D key points, further comprising:
and if the association angle accords with the preset skeleton angle, taking the corresponding second 2D key point as the target 2D key point, and taking the corresponding candidate 3D key point as the target 3D key point.
4. The method for three-dimensional reconstruction of object motion keypoints according to claim 1, wherein,
after calculating the association angle between the adjacent candidate 3D keypoint connecting lines, the determining whether the corresponding candidate 3D keypoints meet the preset skeleton condition further includes:
and displaying the motion skeleton and/or the association angle formed after connecting the adjacent candidate 3D key points.
5. A computing device, comprising: the device comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete communication with each other through the communication bus;
the memory is configured to store at least one executable instruction that causes the processor to perform the operations of the three-dimensional reconstruction method of object motion keypoints of any one of claims 1-4.
6. A computer readable storage medium having stored therein at least one executable instruction that, when executed, performs the operations of the three-dimensional reconstruction method of object motion keypoints of any of claims 1-4.
CN202311333051.1A 2023-10-16 2023-10-16 Three-dimensional reconstruction method, equipment and storage medium for object motion key points Active CN117095131B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311333051.1A CN117095131B (en) 2023-10-16 2023-10-16 Three-dimensional reconstruction method, equipment and storage medium for object motion key points

Publications (2)

Publication Number Publication Date
CN117095131A CN117095131A (en) 2023-11-21
CN117095131B true CN117095131B (en) 2024-02-06

Family

ID=88771952

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311333051.1A Active CN117095131B (en) 2023-10-16 2023-10-16 Three-dimensional reconstruction method, equipment and storage medium for object motion key points

Country Status (1)

Country Link
CN (1) CN117095131B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111695628A (en) * 2020-06-11 2020-09-22 北京百度网讯科技有限公司 Key point marking method and device, electronic equipment and storage medium
CN112767300A (en) * 2019-10-18 2021-05-07 宏达国际电子股份有限公司 Method for automatically generating labeling data of hand and method for calculating skeleton length
CN113393563A (en) * 2021-05-26 2021-09-14 杭州易现先进科技有限公司 Method, system, electronic device and storage medium for automatically labeling key points
CN113705379A (en) * 2021-08-11 2021-11-26 广州虎牙科技有限公司 Gesture estimation method and device, storage medium and equipment
CN114066814A (en) * 2021-10-19 2022-02-18 杭州易现先进科技有限公司 Gesture 3D key point detection method of AR device and electronic device
CN114201985A (en) * 2020-08-31 2022-03-18 魔门塔(苏州)科技有限公司 Method and device for detecting key points of human body
CN114638921A (en) * 2022-05-19 2022-06-17 深圳元象信息科技有限公司 Motion capture method, terminal device, and storage medium
WO2023016271A1 (en) * 2021-08-13 2023-02-16 北京迈格威科技有限公司 Attitude determining method, electronic device, and readable storage medium
CN115862149A (en) * 2022-12-30 2023-03-28 广州紫为云科技有限公司 Method and system for generating 3D human skeleton key point data set
CN116092178A (en) * 2022-11-25 2023-05-09 东南大学 Gesture recognition and tracking method and system for mobile terminal
CN116152907A (en) * 2021-11-23 2023-05-23 广州视源电子科技股份有限公司 Gesture reconstruction method and device and electronic equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A Transformer-based 3D human pose estimation method; Wang Yuping et al.; Journal of Graphics (《图学学报》); Vol. 44, No. 1; pp. 139-145 *


Similar Documents

Publication Publication Date Title
CN111783820B (en) Image labeling method and device
CN111062873B (en) Parallax image splicing and visualization method based on multiple pairs of binocular cameras
CN107564069B (en) Method and device for determining calibration parameters and computer readable storage medium
CN107223269B (en) Three-dimensional scene positioning method and device
US10922844B2 (en) Image positioning method and system thereof
CN112652016B (en) Point cloud prediction model generation method, pose estimation method and pose estimation device
CN111028155B (en) Parallax image splicing method based on multiple pairs of binocular cameras
CN109032348B (en) Intelligent manufacturing method and equipment based on augmented reality
CN108038886B (en) Binocular camera system calibration method and device and automobile
GB2580691A (en) Depth estimation
EP3330928A1 (en) Image generation device, image generation system, and image generation method
CN113361365B (en) Positioning method, positioning device, positioning equipment and storage medium
CN113538587A (en) Camera coordinate transformation method, terminal and storage medium
CN116109684B (en) Online video monitoring two-dimensional and three-dimensional data mapping method and device for variable electric field station
CN113256718A (en) Positioning method and device, equipment and storage medium
CN113379815A (en) Three-dimensional reconstruction method and device based on RGB camera and laser sensor and server
CN112184815A (en) Method and device for determining position and posture of panoramic image in three-dimensional model
EP3940636A2 (en) Method for acquiring three-dimensional perception information based on external parameters of roadside camera, and roadside device
CN115830135A (en) Image processing method and device and electronic equipment
CN115345942A (en) Space calibration method and device, computer equipment and storage medium
CN113129346B (en) Depth information acquisition method and device, electronic equipment and storage medium
CN113793392A (en) Camera parameter calibration method and device
CN113601510A (en) Robot movement control method, device, system and equipment based on binocular vision
CN117095131B (en) Three-dimensional reconstruction method, equipment and storage medium for object motion key points
CN111866467A (en) Method and device for determining three-dimensional coverage space of monitoring video and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant