CN112116653B - Object posture estimation method for multiple RGB pictures - Google Patents

Info

Publication number
CN112116653B
Authority
CN
China
Prior art keywords
rgb
dimensional
objects
pictures
rgb pictures
Prior art date
Legal status
Active
Application number
CN202011316344.5A
Other languages
Chinese (zh)
Other versions
CN112116653A (en)
Inventor
Zhang Jianchi (张键驰)
Jia Kui (贾奎)
Guo Qingda (郭清达)
Chen Ke (陈轲)
Current Assignee
Cross Dimension (Shenzhen) Intelligent Digital Technology Co., Ltd.
Original Assignee
South China University of Technology SCUT
Priority date
Filing date
Publication date
Application filed by South China University of Technology (SCUT)
Priority to CN202011316344.5A
Publication of CN112116653A
Application granted
Publication of CN112116653B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

The invention belongs to the field of three-dimensional computer vision and discloses an object posture estimation method for a plurality of RGB pictures, which comprises the following steps: performing target detection on each RGB picture to obtain two-dimensional bounding boxes of the target objects; interpolating and filtering the detection results to remove falsely detected bounding boxes and add missed ones; inputting all target objects into an object posture estimation network based on monocular RGB pictures to estimate the object class contained in each RGB picture and the three-dimensional coordinates and orientation of each object; calculating the correspondence between objects in every two RGB pictures and the camera rotation matrix between them; constructing and optimizing an object pose graph; and projecting the three-dimensional coordinates, orientation and three-dimensional model of each object from three dimensions to two dimensions to obtain new two-dimensional bounding boxes. Compared with existing methods, the method better handles occlusion and improves estimation accuracy.

Description

Object posture estimation method for multiple RGB pictures
Technical Field
The invention belongs to the technical field of three-dimensional computer vision, and particularly relates to an object posture estimation method for a plurality of RGB pictures.
Background
Smart manufacturing is gradually becoming an area of major concern. Compared with traditional manufacturing, smart manufacturing widely adopts artificial intelligence technology, and robots are expected to learn perception, planning and decision-making so that they can take over repetitive labor from humans. One of the key technologies involved is object posture estimation. Object posture estimation is an important component of many practical applications, from robot navigation and manipulation to augmented reality, and has significant theoretical research value and application value.
With the development of deep learning and three-dimensional computer vision, object posture estimation has made great progress in both theory and application, with approaches such as template matching, edge matching and matching based on 3D models; however, existing methods still have limitations. Accurately estimating object postures under stacking or severe occlusion remains an open problem, and estimating the posture of a target object from multiple viewing angles using multiple RGB pictures is a technical route for solving it.
Disclosure of Invention
Aiming at the technical problem that existing methods struggle to accurately estimate object postures under stacking or severe occlusion, the invention provides an object posture estimation method for a plurality of RGB pictures that exploits the relations among multiple RGB pictures captured from different viewing angles, thereby improving the accuracy of object posture estimation under stacking and severe occlusion.
The technical scheme provided by the invention is as follows: a method for estimating object postures of a plurality of RGB pictures comprises the following steps:
S1, selecting a plurality of RGB pictures from the same video, and performing target detection on each RGB picture to obtain the two-dimensional bounding boxes of the target objects contained in each RGB picture as the target detection result of that RGB picture;
S2, interpolating and filtering the target detection result of each RGB picture, removing falsely detected two-dimensional bounding boxes and adding missed two-dimensional bounding boxes;
S3, inputting all target objects in each RGB picture into an object posture estimation network based on monocular RGB pictures, and estimating the object class contained in each RGB picture and the three-dimensional coordinates and three-dimensional orientation of each object;
S4, calculating the correspondence between objects in every two RGB pictures and the camera rotation matrix between every two RGB pictures according to the estimation result of step S3;
S5, constructing an object pose graph using the correspondences between objects in the RGB pictures, the camera rotation matrices, and the three-dimensional coordinates and three-dimensional orientations of the objects, and optimizing the object pose graph;
S6, after the poses of all objects in each RGB picture are obtained, projecting from three dimensions to two dimensions using the three-dimensional coordinates, three-dimensional orientation and three-dimensional model corresponding to each object to obtain new two-dimensional bounding boxes;
S7, replacing the two-dimensional bounding boxes acquired in step S1 with the new two-dimensional bounding boxes of each RGB picture, and repeating steps S2 to S6 until the number of repetitions reaches a set threshold.
Compared with the prior art, the invention has the following beneficial effects:
the object posture estimation method based on the multiple RGB pictures well solves the problem of occlusion by utilizing the relation among the multiple RGB pictures under multiple different visual angles, and improves the accuracy of the object posture estimation technology under the conditions of stacking and serious occlusion.
Drawings
FIG. 1 is a flow chart of an object pose estimation method for a plurality of RGB pictures according to an embodiment of the present invention;
FIG. 2 is a flow chart of applying the random sample consensus (RANSAC) algorithm to calculate the relative camera rotation matrix between two RGB pictures;
FIG. 3 is a schematic diagram of the optimization effect of the object pose graph optimization algorithm.
Detailed Description
The following describes the technical solution of the present invention in detail with reference to the examples and fig. 1 to 3, but the embodiments of the present invention are not limited thereto.
Examples
As shown in fig. 1, the method for estimating the pose of an object based on a plurality of RGB pictures and a three-dimensional model according to the present invention includes the steps of:
s1, target detection: selecting a plurality of RGB pictures from the same video frame, and respectively carrying out target detection on each RGB picture to obtain a two-dimensional boundary frame of a target object contained in each RGB picture as a target detection result of each RGB picture.
Specifically, each RGB picture is input into a target detection algorithm, which may be any target detection algorithm in the prior art; this embodiment adopts the Fast R-CNN target detection method. The obtained target detection result, namely the two-dimensional bounding box of a target object, is represented as [x, y, w, h], where (x, y) is the pixel position of the center point of the object, and w and h are the width and height of the object in the RGB picture, respectively.
Furthermore, the existing training data set or the existing target detection network can be utilized to perform transfer learning on the target detection algorithm, so that the accuracy of target detection is improved.
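Purely as an illustration of step S1 (not text from the patent), the sketch below uses torchvision's pre-trained Faster R-CNN as a stand-in detector and converts its corner-format boxes into the [x, y, w, h] centre format used above; the detector choice, the function name detect_objects and the 0.5 score threshold are assumptions made for the example.

    # Illustrative stand-in for step S1: off-the-shelf detector, centre-format boxes.
    import torch
    import torchvision
    from torchvision.transforms.functional import to_tensor

    detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    detector.eval()

    def detect_objects(rgb_image, score_threshold=0.5):
        """Return one [class_id, x, y, w, h] per detection, (x, y) being the box centre in pixels."""
        with torch.no_grad():
            output = detector([to_tensor(rgb_image)])[0]
        boxes = []
        for box, label, score in zip(output["boxes"], output["labels"], output["scores"]):
            if score < score_threshold:
                continue
            x1, y1, x2, y2 = box.tolist()
            boxes.append([int(label), (x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1])
        return boxes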
S2, interpolation and filtering: the target detection result of each RGB picture is interpolated and filtered with a voting mechanism, removing falsely detected two-dimensional bounding boxes and adding missed two-dimensional bounding boxes.
According to the principle of scene content consistency, the object classes and object counts contained in all the RGB pictures are essentially the same. Based on this property, the target detection results of the RGB pictures are interpolated and filtered: falsely detected two-dimensional bounding boxes are removed and missed two-dimensional bounding boxes are recovered.
Further, the strategy for interpolating and filtering the target detection results in this step is as follows: if the target detection results of not less than half of the RGB pictures contain an object of a certain class, all the RGB pictures can be considered to contain that object, so a two-dimensional bounding box is added at the corresponding position of each RGB picture in which the object was not detected; given two pictures in which the object was detected with boxes [x1, y1, w1, h1] and [x2, y2, w2, h2], the corresponding position can be represented as [(x1+x2)/2, (y1+y2)/2, (w1+w2)/2, (h1+h2)/2]. If an object of a certain class is not detected in the target detection results of more than half of the RGB pictures, none of the RGB pictures is considered to contain that object, so the two-dimensional bounding box can be deleted at the corresponding position in each RGB picture in which the object was detected.
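The following sketch illustrates one possible reading of this voting strategy (an assumption-laden example, not the patent's code): at most one instance of each object class per picture is assumed, detections are stored as a dict per picture, and missed boxes are filled with the average of the boxes detected in the other pictures.

    # Sketch of the step-S2 voting filter under the assumptions stated above.
    from collections import defaultdict

    def vote_filter(detections_per_view):
        """detections_per_view: one dict {class_id: [x, y, w, h]} per RGB picture."""
        n_views = len(detections_per_view)
        votes = defaultdict(int)
        for det in detections_per_view:
            for cls in det:
                votes[cls] += 1

        filtered = [dict(det) for det in detections_per_view]
        for cls, count in votes.items():
            boxes = [det[cls] for det in detections_per_view if cls in det]
            if 2 * count >= n_views:                      # detected in at least half of the views
                mean_box = [sum(vals) / len(vals) for vals in zip(*boxes)]
                for det in filtered:
                    det.setdefault(cls, mean_box)         # interpolate the missed bounding box
            else:                                         # detected in fewer than half of the views
                for det in filtered:
                    det.pop(cls, None)                    # remove the false detections
        return filtered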
S3, object posture estimation based on monocular RGB pictures: all target objects in each RGB picture are input into an object posture estimation network based on monocular RGB pictures, so as to estimate the object class c contained in each RGB picture and the three-dimensional coordinates (x, y, z) and three-dimensional orientation (qw, qx, qy, qz) of each object.
The monocular RGB object posture estimation network can be any object posture estimation method based on a single RGB picture. This embodiment adopts a keypoint-voting-based method improved upon the typical PVNet network: only the RGB channels of the PVNet network are used and no depth image is required, only the RGB picture serves as input, and the network comprises 12 convolutional layers and two connection layers.
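For orientation only, the snippet below sketches the input/output contract assumed of such a network in the later steps; it is a small placeholder model, not the modified PVNet of this embodiment, and its layer sizes are arbitrary.

    # Placeholder pose-network interface: RGB crop in; class scores, translation
    # (x, y, z) and unit quaternion (qw, qx, qy, qz) out.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MonoPoseNet(nn.Module):
        def __init__(self, num_classes):
            super().__init__()
            self.backbone = nn.Sequential(
                nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            self.fc_class = nn.Linear(128, num_classes)
            self.fc_pose = nn.Linear(128, 7)              # 3 translation + 4 quaternion values

        def forward(self, crop):                          # crop: (B, 3, H, W) RGB patch
            feat = self.backbone(crop).flatten(1)
            logits = self.fc_class(feat)
            pose = self.fc_pose(feat)
            t, q = pose[:, :3], F.normalize(pose[:, 3:], dim=1)   # unit quaternion
            return logits, t, q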
S4, camera rotation matrix estimation: for any two RGB pictures, a random sample consensus (RANSAC) algorithm is used to calculate the correspondence between the objects in the two RGB pictures and the camera rotation matrix between them according to the estimation result of step S3; the flow is shown in Fig. 2.
The strategy adopted by the RANSAC algorithm is as follows: for any two RGB pictures, two objects are randomly selected from each picture according to the object pose estimation results obtained in step S3, under the assumption that the selected objects correspond across the two pictures and share the same object class c, so that the objects selected from the two RGB pictures form two pairs of objects with pairwise consistent categories. The relative camera transformation between the two RGB pictures is calculated from the two pairs of objects with pairwise consistent categories; the reciprocal of the sum of the pose deviations of all objects in the two RGB pictures under this relative camera transformation is taken as the confidence; and the relative camera transformation with the highest confidence, together with its object correspondence, is taken as the camera rotation matrix between the two RGB pictures and the correspondence between them.
Because the object poses estimated from any two RGB pictures deviate from each other, i.e. there are deviations in both the three-dimensional coordinates and the three-dimensional orientation, the method sums the three-dimensional coordinate deviations and the three-dimensional orientation deviations separately, adds the two sums to obtain the final deviation sum, and takes the reciprocal of the final deviation sum as the confidence. In the present invention, "objects with pairwise consistent categories" means: the images of the same object taken from two different camera viewing angles constitute a pair of objects with consistent categories.
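A simplified RANSAC-style loop corresponding to this strategy is sketched below for illustration; it assumes object poses given as 4x4 homogeneous matrices in each camera's frame, measures the orientation deviation with a Frobenius norm, and ignores object symmetry, none of which are choices taken from the patent text.

    # Sketch of the step-S4 hypothesise-and-score loop under the stated assumptions.
    import random
    import numpy as np

    def pose_deviation(P_a, P_b):
        """Translation deviation plus orientation deviation between two 4x4 poses."""
        dt = np.linalg.norm(P_a[:3, 3] - P_b[:3, 3])
        dR = np.linalg.norm(P_a[:3, :3] - P_b[:3, :3], ord="fro")
        return dt + dR

    def ransac_relative_transform(objs_i, objs_j, iterations=100):
        """objs_i, objs_j: lists of (class_id, 4x4 pose in camera frame)."""
        best_T, best_conf = None, 0.0
        for _ in range(iterations):
            cls, P_i = random.choice(objs_i)
            candidates = [P for c, P in objs_j if c == cls]
            if not candidates:
                continue
            P_j = random.choice(candidates)
            T_ij = P_i @ np.linalg.inv(P_j)               # hypothesised relative camera transform
            total = sum(pose_deviation(P_a, T_ij @ P_b)
                        for ca, P_a in objs_i for cb, P_b in objs_j if ca == cb)
            confidence = 1.0 / (total + 1e-9)             # reciprocal of the summed pose deviation
            if confidence > best_conf:
                best_T, best_conf = T_ij, confidence
        return best_T, best_conf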
Specifically, for every two RGB pictures, the method selects two pairs of objects with pairwise consistent categories from the two pictures, denoted (m, n) and (m', n'). Suppose objects of categories c1 and c2 are selected from one picture and objects of categories c1 and c2 are selected from the other picture; the two objects of category c1 form one pair with pairwise consistent categories, and the two objects of category c2 form the other pair. One pair of objects with pairwise consistent categories is used to establish the camera rotation matrix T_ij (also called the camera transformation matrix) between camera view i and camera view j. In order to resolve the pose ambiguity caused by object symmetry, T_ij is calculated as

    T_ij = P_i^m · S* · (P_j^n)^(-1)

where the right-hand side is a matrix product; S* is the symmetric transformation matrix that best aligns the target object within its set of symmetry transformations; i denotes one camera view, m denotes the target object under camera view i, and P_i^m denotes the pose of target object m under camera view i; j denotes the other camera view, n denotes the target object under camera view j, P_j^n denotes the pose of target object n under camera view j, and (P_j^n)^(-1) denotes the inverse of P_j^n. It is assumed that m and n correspond to the same object, i.e. they form a pair of objects with pairwise consistent categories.
After T_ij is calculated, the method uses the other pair of objects with pairwise consistent categories, (m', n'), to verify the camera rotation matrix:

    e = ‖ T_ij - P_i^m' · (P_j^n')^(-1) ‖

where m' denotes the second target object under camera view i and P_i^m' denotes its pose, n' denotes the second target object under camera view j and P_j^n' denotes its pose, and (P_j^n')^(-1) denotes the inverse of P_j^n'. When the deviation e is smaller than a preset range, the calculated camera rotation matrix T_ij is considered correct.
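The snippet below shows one way to implement the two formulas above (a sketch under assumptions: poses are 4x4 homogeneous matrices, candidate_symmetries is a user-supplied list of the object model's symmetry transforms including the identity, and the 0.1 tolerance is an arbitrary placeholder).

    # Establish T_ij from the first corresponding pair and verify it with the second pair.
    import numpy as np

    def establish_T(P_i_m, P_j_n, P_i_m2, P_j_n2, candidate_symmetries):
        """Enumerate symmetry candidates S, form T = P_i_m @ S @ inv(P_j_n), and keep
        the T with the smallest verification residual on the second pair."""
        best_T, best_res = None, np.inf
        for S in candidate_symmetries:
            T = P_i_m @ S @ np.linalg.inv(P_j_n)
            res = np.linalg.norm(T - P_i_m2 @ np.linalg.inv(P_j_n2))   # verification residual e
            if res < best_res:
                best_T, best_res = T, res
        return best_T, best_res

    # Usage: accept T only when the verification residual is below a preset range.
    # T, e = establish_T(P_i_m, P_j_n, P_i_m2, P_j_n2, [np.eye(4)])
    # accepted = e < 0.1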
S5, pose graph construction and optimization: an object pose graph is constructed using the correspondences between objects in the multiple RGB pictures, the camera rotation matrices, and the estimated three-dimensional coordinates and three-dimensional orientations of the objects, and the constructed object pose graph is optimized by bundle adjustment.
For multiple RGB pictures, only the camera moves while the objects stay still, so after eliminating the influence of the camera rotation transformation, the three-dimensional coordinates and three-dimensional orientation of the same object imaged from two different camera viewing angles do not change; in other words, the pose of the same object relative to the world coordinate system is the same across the RGB pictures. From the previous step, the correspondences between objects in the individual pictures are known, and inconsistent objects have been removed. Therefore, a unique and consistent scene model can be recovered by applying the relative rotation transformations to the objects and cameras and then performing a global joint optimization.
Specifically, the object pose graph constructed by the method takes the three-dimensional coordinates and three-dimensional orientation of each object class estimated in each RGB picture as vertices and the camera rotation matrix between each two RGB pictures as edges; with the minimum global pose consistency deviation as the optimization target, the camera rotation matrix between each two RGB pictures and the three-dimensional coordinates and three-dimensional orientation of each object in each RGB picture are optimized by bundle adjustment.
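As a generic illustration of such a joint refinement (not the patent's exact bundle adjustment), the sketch below parameterises each camera pose and each world-frame object pose by a rotation vector and a translation, and minimises the discrepancy between the per-view pose estimates from step S3 and the poses predicted by composing camera and object poses; scipy is assumed for the nonlinear least squares, and all names are introduced here for the example.

    # Joint refinement of camera poses and object poses against per-view observations.
    import numpy as np
    from scipy.optimize import least_squares
    from scipy.spatial.transform import Rotation

    def residuals(params, n_views, n_objects, observations):
        """observations: list of (view_k, object_o, R_obs (3x3), t_obs (3,)) from step S3."""
        cam_rv = params[:3 * n_views].reshape(n_views, 3)
        cam_t = params[3 * n_views:6 * n_views].reshape(n_views, 3)
        obj_rv = params[6 * n_views:6 * n_views + 3 * n_objects].reshape(n_objects, 3)
        obj_t = params[6 * n_views + 3 * n_objects:].reshape(n_objects, 3)
        res = []
        for k, o, R_obs, t_obs in observations:
            R_cam = Rotation.from_rotvec(cam_rv[k]).as_matrix()
            R_obj = Rotation.from_rotvec(obj_rv[o]).as_matrix()
            R_pred = R_cam @ R_obj                        # object orientation seen from view k
            t_pred = R_cam @ obj_t[o] + cam_t[k]          # object position seen from view k
            res.extend(t_pred - t_obs)                    # translation residual
            res.extend(Rotation.from_matrix(R_pred.T @ R_obs).as_rotvec())  # rotation residual
        return np.asarray(res)

    def optimise_pose_graph(cam_rv, cam_t, obj_rv, obj_t, observations):
        x0 = np.concatenate([cam_rv.ravel(), cam_t.ravel(), obj_rv.ravel(), obj_t.ravel()])
        sol = least_squares(residuals, x0, args=(len(cam_rv), len(obj_rv), observations))
        return sol.x                                      # refined camera and object parameters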
As shown in Fig. 3, after the bundle adjustment optimization, the position and orientation of an object remain unchanged when the shooting angle of the camera changes, i.e., the three-dimensional coordinates and three-dimensional orientation of the object stay fixed. In Fig. 3, image 1, image 2 and image 3 represent three shooting angles of the camera, R denotes a three-dimensional rotation matrix and t a three-dimensional translation vector; R1 and t1, R2 and t2, and R3 and t3 form the three camera poses, and the middle rectangular frame at each shooting angle represents the picture of the object being shot. Vertex X1 of the object corresponds to the points p_{1,1}, p_{1,2} and p_{1,3} in the three pictures, respectively.
S6, updating the two-dimensional bounding boxes: after the poses of all objects in each RGB picture are obtained, the three-dimensional coordinates, three-dimensional orientation and three-dimensional model corresponding to each object are projected from three dimensions to two dimensions to obtain new, tighter two-dimensional bounding boxes. The three-dimensional model may be a generic one.
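A minimal sketch of this projection step follows (assumptions: a pinhole camera with known 3x3 intrinsic matrix K, the object model given as an N x 3 array of points, and the optimised pose given as rotation R and translation t in the camera frame).

    # Project the posed 3D model into the image and take the bounding box of the projection.
    import numpy as np

    def project_to_bbox(model_points, R, t, K):
        """Return a new [x, y, w, h] bounding box from the projected model points."""
        cam_pts = (R @ model_points.T).T + t              # model points in camera coordinates
        uv = (K @ cam_pts.T).T
        uv = uv[:, :2] / uv[:, 2:3]                       # perspective division
        x1, y1 = uv.min(axis=0)
        x2, y2 = uv.max(axis=0)
        return [(x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1]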
S7, pose estimation iteration: and (4) replacing the original two-dimensional bounding box acquired in the step (S1) with a new two-dimensional bounding box of each RGB picture, and repeating the steps (S2-S6) until the repetition number reaches a certain set threshold value.
Furthermore, the invention does not strictly limit the threshold on the number of iterations; any number of repetitions may be used, and the method is applicable to any set of two or more RGB pictures.
The invention discloses an object posture estimation method for a plurality of RGB pictures. Through the steps of target detection, two-dimensional bounding box filtering and interpolation, object posture estimation based on monocular RGB pictures, camera rotation matrix estimation, pose graph construction and optimization, two-dimensional bounding box updating and pose estimation iteration, the method improves the accuracy and robustness of pose estimation under stacking or severe occlusion, effectively addresses the problems of existing methods, and can be widely applied in the technical field of three-dimensional computer vision.

Claims (9)

1. An object posture estimation method for a plurality of RGB pictures, characterized by comprising the following steps:
S1, selecting a plurality of RGB pictures from the same video, and performing target detection on each RGB picture to obtain the two-dimensional bounding boxes of the target objects contained in each RGB picture as the target detection result of that RGB picture;
S2, interpolating and filtering the target detection result of each RGB picture, removing falsely detected two-dimensional bounding boxes and adding missed two-dimensional bounding boxes;
S3, inputting all target objects in each RGB picture into an object posture estimation network based on monocular RGB pictures, and estimating the object class contained in each RGB picture and the three-dimensional coordinates and three-dimensional orientation of each object;
S4, calculating the correspondence between objects in every two RGB pictures and the camera rotation matrix between every two RGB pictures according to the estimation result of step S3;
S5, constructing an object pose graph using the correspondences between objects in the RGB pictures, the camera rotation matrices, and the three-dimensional coordinates and three-dimensional orientations of the objects, and optimizing the object pose graph;
S6, after the poses of all objects in each RGB picture are obtained, projecting from three dimensions to two dimensions using the three-dimensional coordinates, three-dimensional orientation and three-dimensional model corresponding to each object to obtain new two-dimensional bounding boxes;
S7, replacing the two-dimensional bounding boxes acquired in step S1 with the new two-dimensional bounding boxes of each RGB picture, and repeating steps S2 to S6 until the number of repetitions reaches a set threshold;
wherein step S4 is implemented with a random sample consensus (RANSAC) algorithm whose strategy is as follows:
for any two RGB pictures, two pairs of objects with pairwise consistent categories are randomly selected from the two RGB pictures according to the object pose estimation results obtained in step S3; the relative camera transformation between the two RGB pictures is calculated from the two pairs of objects with pairwise consistent categories; the reciprocal of the sum of the pose deviations of all objects in the two RGB pictures under the relative camera transformation is taken as the confidence; and the relative camera transformation with the highest confidence, together with its object correspondence, is taken as the camera rotation matrix between the two RGB pictures and the correspondence between them.
2. The object pose estimation method according to claim 1, wherein the strategy of step S2 for interpolating and filtering the target detection result is: if the target detection results of not less than half of the RGB pictures contain an object of a certain class, all the RGB pictures are considered to contain that object, and a two-dimensional bounding box is added at the corresponding position of each RGB picture in which the object was not detected; if an object of a certain class is not detected in the target detection results of more than half of the RGB pictures, none of the RGB pictures is considered to contain that object, and the two-dimensional bounding box is deleted at the corresponding position in each RGB picture in which the object was detected.
3. The object pose estimation method according to claim 1, wherein the object posture estimation network for monocular RGB pictures in step S3 is a keypoint-voting-based method improved upon the PVNet network, using only the RGB channels of the PVNet network and no depth image.
4. The object pose estimation method according to claim 1, wherein, in the random sample consensus algorithm of step S4, the three-dimensional coordinate deviations and the three-dimensional orientation deviations of the object poses estimated from any two RGB pictures are summed separately, the two sums are added to obtain the final deviation sum, and the reciprocal of the final deviation sum is taken as the confidence.
5. The object pose estimation method according to claim 1, wherein, in the random sample consensus algorithm of step S4, the camera rotation matrix is established using one pair of objects with pairwise consistent categories and verified using the other pair of objects with pairwise consistent categories.
6. The object pose estimation method of claim 5, wherein the camera rotation matrix T_ij established with a pair of objects having pairwise consistent categories is:

    T_ij = P_i^m · S* · (P_j^n)^(-1)

where the right-hand side is a matrix product; S* is the symmetric transformation matrix that best aligns the target object within its set of symmetry transformations; i denotes one camera view, m denotes the target object under camera view i, and P_i^m denotes the pose of target object m under camera view i; j denotes the other camera view, n denotes the target object under camera view j, P_j^n denotes the pose of target object n under camera view j, and (P_j^n)^(-1) denotes the inverse of P_j^n; objects m and n form a pair of objects having pairwise consistent categories.
7. The object pose estimation method according to claim 1, wherein step S5 optimizes the constructed object pose graph by using bundle adjustment.
8. The object pose estimation method according to claim 1, wherein the object pose graph constructed in step S5 takes the three-dimensional coordinates and three-dimensional orientation of each object class estimated in each RGB picture as vertices and the camera rotation matrix between each two RGB pictures as edges; with the minimum global pose consistency deviation as the optimization target, the camera rotation matrix between each two RGB pictures and the three-dimensional coordinates and three-dimensional orientation of each object in each RGB picture are optimized.
9. The object pose estimation method according to claim 1, wherein step S1 employs a Fast R-CNN target detection method.
CN202011316344.5A 2020-11-23 2020-11-23 Object posture estimation method for multiple RGB pictures Active CN112116653B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011316344.5A CN112116653B (en) 2020-11-23 2020-11-23 Object posture estimation method for multiple RGB pictures

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011316344.5A CN112116653B (en) 2020-11-23 2020-11-23 Object posture estimation method for multiple RGB pictures

Publications (2)

Publication Number Publication Date
CN112116653A 2020-12-22
CN112116653B 2021-03-30

Family

ID=73794490

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011316344.5A Active CN112116653B (en) 2020-11-23 2020-11-23 Object posture estimation method for multiple RGB pictures

Country Status (1)

Country Link
CN (1) CN112116653B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023015409A1 (en) * 2021-08-09 2023-02-16 百果园技术(新加坡)有限公司 Object pose detection method and apparatus, computer device, and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110020611A (en) * 2019-03-17 2019-07-16 浙江大学 A kind of more human action method for catching based on three-dimensional hypothesis space clustering
CN110119148A (en) * 2019-05-14 2019-08-13 深圳大学 A kind of six-degree-of-freedom posture estimation method, device and computer readable storage medium
CN111340867A (en) * 2020-02-26 2020-06-26 清华大学 Depth estimation method and device for image frame, electronic equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018208791A1 (en) * 2017-05-08 2018-11-15 Aquifi, Inc. Systems and methods for inspection and defect detection using 3-d scanning
CN108830150B (en) * 2018-05-07 2019-05-28 山东师范大学 One kind being based on 3 D human body Attitude estimation method and device
CN110533721B (en) * 2019-08-27 2022-04-08 杭州师范大学 Indoor target object 6D attitude estimation method based on enhanced self-encoder
CN111932678B (en) * 2020-08-13 2021-05-14 北京未澜科技有限公司 Multi-view real-time human motion, gesture, expression and texture reconstruction system

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110020611A (en) * 2019-03-17 2019-07-16 浙江大学 A kind of more human action method for catching based on three-dimensional hypothesis space clustering
CN110119148A (en) * 2019-05-14 2019-08-13 深圳大学 A kind of six-degree-of-freedom posture estimation method, device and computer readable storage medium
CN111340867A (en) * 2020-02-26 2020-06-26 清华大学 Depth estimation method and device for image frame, electronic equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Deep Mesh Reconstruction from Single RGB Images via Topology Modification Networks";Junyi Pan et al;《2019 IEEE/CVF International Conference on Computer Vision (ICCV)》;20191231;第9963-9972页 *
"应用摄像机位姿估计的点云初始配准";郭清达等;《光学精密工程》;20170630;第25卷(第6期);第1635-1644页 *

Also Published As

Publication number Publication date
CN112116653A (en) 2020-12-22

Similar Documents

Publication Publication Date Title
CN111563415B (en) Binocular vision-based three-dimensional target detection system and method
CN110853075B (en) Visual tracking positioning method based on dense point cloud and synthetic view
US6671399B1 (en) Fast epipolar line adjustment of stereo pairs
CN109461180A (en) A kind of method for reconstructing three-dimensional scene based on deep learning
CN106940704A (en) A kind of localization method and device based on grating map
CN109887030A (en) Texture-free metal parts image position and posture detection method based on the sparse template of CAD
KR102206108B1 (en) A point cloud registration method based on RGB-D camera for shooting volumetric objects
CN107767339B (en) Binocular stereo image splicing method
CN111998862B (en) BNN-based dense binocular SLAM method
CN112015275A (en) Digital twin AR interaction method and system
CN109325995B (en) Low-resolution multi-view hand reconstruction method based on hand parameter model
Feng et al. Deep depth estimation on 360 images with a double quaternion loss
CN112116653B (en) Object posture estimation method for multiple RGB pictures
JP7178803B2 (en) Information processing device, information processing device control method and program
CN113313740B (en) Disparity map and surface normal vector joint learning method based on plane continuity
CN116681839B (en) Live three-dimensional target reconstruction and singulation method based on improved NeRF
Wang et al. Perspective 3-D Euclidean reconstruction with varying camera parameters
KR102375135B1 (en) Apparatus and Method for Cailbrating Carmeras Loaction of Muti View Using Spherical Object
Kitt et al. Trinocular optical flow estimation for intelligent vehicle applications
CN113242419A (en) 2D-to-3D method and system based on static building
Liu et al. Binocular depth estimation using convolutional neural network with Siamese branches
Xu et al. Study on the method of SLAM initialization for monocular vision
CN112633300B (en) Multi-dimensional interactive image feature parameter extraction and matching method
CN112581494B (en) Binocular scene flow calculation method based on pyramid block matching
CN117593618B (en) Point cloud generation method based on nerve radiation field and depth map

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240221

Address after: 510641 Industrial Building, Wushan South China University of Technology, Tianhe District, Guangzhou City, Guangdong Province

Patentee after: Guangzhou South China University of Technology Asset Management Co.,Ltd.

Country or region after: China

Address before: 510640 No. five, 381 mountain road, Guangzhou, Guangdong, Tianhe District

Patentee before: SOUTH CHINA University OF TECHNOLOGY

Country or region before: China

TR01 Transfer of patent right

Effective date of registration: 20240410

Address after: 518057, Building 4, 512, Software Industry Base, No. 19, 17, and 18 Haitian Road, Binhai Community, Yuehai Street, Nanshan District, Shenzhen City, Guangdong Province

Patentee after: Cross dimension (Shenzhen) Intelligent Digital Technology Co.,Ltd.

Country or region after: China

Address before: 510641 Industrial Building, Wushan South China University of Technology, Tianhe District, Guangzhou City, Guangdong Province

Patentee before: Guangzhou South China University of Technology Asset Management Co.,Ltd.

Country or region before: China